Difference between revisions of "IARPA/Pattern"


Latest revision as of 22:04, 14 March 2011

Regex Expressions:

  • alternation: |
  • option: ?
  • Kleene-star: *
  • plus: +
  • start assertion: ^
  • end assertion: $
  • matching group: ()
  • non-matching group: (?:)
  • named group: (<name>:)

Token Expressions:

  • string: takes a case-insensitive regular expression
  • stringcs: takes a case-sensitive regular expression
  • lemma: takes a case-insensitive regular expression for the lemma
  • pos: takes a case-insensitive regular expression for the pos tag
  • chunk: takes a case-insensitive regular expression for the chunk tag
  • type: takes a case-insensitive string for any type that spans the token
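
As a rough illustration of the list above, the sketch below shows one way a single token expression could be checked against a token. The `Token` class, the `token_matches` helper, and the rule that the regular expression must cover the whole field value are assumptions made for this example, not the pattern library's actual API. The `type` expression is left out because it depends on annotations that span tokens.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical token record with the fields named in the list above."""
    string: str
    lemma: str
    pos: str
    chunk: str


def token_matches(token: Token, field: str, pattern: str) -> bool:
    """Check one token expression such as <pos="JJ"> against one token."""
    # stringcs reads the same text as string but compares case-sensitively.
    case_sensitive = field == "stringcs"
    attribute = "string" if case_sensitive else field
    flags = 0 if case_sensitive else re.IGNORECASE
    # Assumption: the pattern has to match the entire field value.
    return re.fullmatch(pattern, getattr(token, attribute), flags) is not None


president = Token(string="President", lemma="president", pos="NNP", chunk="B-NP")
print(token_matches(president, "string", "president"))
print(token_matches(president, "stringcs", "president"))
```

Under these assumptions the first call succeeds and the second fails, which is the difference between `string` and `stringcs`.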

Examples:

 <string="an?|the">? <pos="JJ">* <pos="NNP">+ <pos="NN">+ <pos="NNP">+
 The incredible U.S. president Barack Obama
 famed UW professor Oren Etzioni
 <pos="NNP">+ <stringcs="president">+ <pos="NNP">+
 U.S. president Barack Obama
 not: U.S. President Barack Obama
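
The examples above can be traced with a small matcher. This is a minimal sketch, not the pattern library's implementation: it handles only a flat sequence of token expressions with an optional `?`, `*`, or `+` suffix (no groups, alternation between tokens, or assertions), it assumes each regular expression must cover the whole field value, and the names `parse_pattern`, `token_ok`, and `match_from` as well as the part-of-speech tags given to the sample tokens are hypothetical.

```python
import re

# One token expression: <field="regex"> followed by an optional quantifier.
ELEMENT = re.compile(r'<(\w+)="([^"]*)">([?*+]?)')


def parse_pattern(pattern):
    """Split a pattern into (field, regex, quantifier) triples."""
    return ELEMENT.findall(pattern)


def token_ok(token, field, regex):
    """Test one token (a dict of field values) against one token expression."""
    if field == "stringcs":
        return re.fullmatch(regex, token["string"]) is not None
    return re.fullmatch(regex, token[field], re.IGNORECASE) is not None


def match_from(elements, tokens, start=0):
    """Return the end index of a greedy match beginning at start, or None."""
    if not elements:
        return start
    (field, regex, quantifier), rest = elements[0], elements[1:]
    minimum = 1 if quantifier in ("", "+") else 0
    maximum = 1 if quantifier in ("", "?") else len(tokens) - start
    # Count how many consecutive tokens this element could consume.
    count = 0
    while (count < maximum and start + count < len(tokens)
           and token_ok(tokens[start + count], field, regex)):
        count += 1
    # Try the longest run first, then back off one token at a time.
    for used in range(count, minimum - 1, -1):
        end = match_from(rest, tokens, start + used)
        if end is not None:
            return end
    return None


def tag(pairs):
    """Build tokens from (string, pos) pairs; the tags are illustrative."""
    return [{"string": s, "pos": p} for s, p in pairs]


pattern = parse_pattern('<pos="NNP">+ <stringcs="president">+ <pos="NNP">+')
lower = tag([("U.S.", "NNP"), ("president", "NN"),
             ("Barack", "NNP"), ("Obama", "NNP")])
upper = tag([("U.S.", "NNP"), ("President", "NNP"),
             ("Barack", "NNP"), ("Obama", "NNP")])
print(match_from(pattern, lower))
print(match_from(pattern, upper))
```

With these sample tags, the lowercase "president" sequence matches through its last token, while the capitalized "President" sequence yields no match because `stringcs` compares case-sensitively.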