Difference between revisions of "Iarpa"

From Knowitall
(Quality of Extractions)
Line 4:
  == Quality of Extractions ==
- * Too short relations (2 characters)
+ === TODO ===
  * Too long relations (7 words)
  * some args are just special symbols (e.g., ")
+ * remove relations that break database constraints (i.e. token > 64 characters, these can probably be more strict)
  * Pronoun Resolution
+ === DONE ===
+ * Too short relations (2 characters)

  == Speed ==

Revision as of 00:05, 11 January 2011

Domain Recognizers

Rule Learning

Quality of Extractions

TODO

  • Too long relations (7 words)
  • Some arguments are just special symbols (e.g., ")
  • Remove relations that break database constraints (i.e., a token longer than 64 characters; these limits could probably be stricter)
  • Pronoun Resolution

DONE

  • Too short relations (2 characters)
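
The filters listed above could be combined into a single check along the following lines. This is only a sketch, not the project's actual code: the `Extraction` type, the `rejection_reason` function, and the constant names are hypothetical, and it assumes that "2 characters" and "7 words" mean a relation of 2 characters or fewer is too short and one of 7 words or more is too long. Pronoun resolution is not covered.

```python
from typing import NamedTuple, Optional


class Extraction(NamedTuple):
    """Hypothetical (arg1, relation, arg2) triple; not the project's real type."""
    arg1: str
    rel: str
    arg2: str


# Assumed readings of the thresholds in the list above.
MAX_SHORT_REL_CHARS = 2   # relations this short (or shorter) are rejected
MIN_LONG_REL_WORDS = 7    # relations with this many words (or more) are rejected
MAX_TOKEN_CHARS = 64      # database constraint on token length


def has_alphanumeric(text: str) -> bool:
    """True if the text contains at least one letter or digit."""
    return any(ch.isalnum() for ch in text)


def tokens_fit(text: str) -> bool:
    """True if every whitespace-separated token fits the database limit."""
    return all(len(tok) <= MAX_TOKEN_CHARS for tok in text.split())


def rejection_reason(ex: Extraction) -> Optional[str]:
    """Return why an extraction should be dropped, or None to keep it."""
    rel = ex.rel.strip()
    if len(rel) <= MAX_SHORT_REL_CHARS:
        return "relation too short"
    if len(rel.split()) >= MIN_LONG_REL_WORDS:
        return "relation too long"
    if not (has_alphanumeric(ex.arg1) and has_alphanumeric(ex.arg2)):
        return "argument has no letters or digits"
    if not all(tokens_fit(part) for part in ex):
        return "token exceeds database limit"
    return None
```

Returning a reason rather than a plain boolean makes it easier to count how many extractions each rule removes, which helps when deciding whether a limit such as the 64-character one can be made stricter.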

Speed

  • Compile to native code.
  • Compare NLP libraries.

Conversion of Docs to Text

  • Detect headlines and handle separately
  • Sentences with only single letters
  • Detect bullet points
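
The three conversion checks above might start from simple line-level heuristics like the ones below. Everything here is an assumption made for illustration: the `classify` function, its labels, the bullet markers recognised, and the rule that a headline is a short line starting with an uppercase character and lacking sentence-final punctuation. A real converter would likely need layout information as well.

```python
import re

# Assumed bullet markers: a bullet character, hyphen, asterisk, or "1." / "1)".
BULLET_RE = re.compile(r"^\s*(?:[•\-*]|\d+[.)])\s+")
# Runs of letters only (digits and underscores excluded).
WORD_RE = re.compile(r"[^\W\d_]+")


def is_bullet(line: str) -> bool:
    """True if the line starts with one of the assumed bullet markers."""
    return BULLET_RE.match(line) is not None


def only_single_letters(line: str) -> bool:
    """True if the line has words and every word is a single letter."""
    words = WORD_RE.findall(line)
    return bool(words) and all(len(w) == 1 for w in words)


def is_headline(line: str, max_words: int = 10) -> bool:
    """Heuristic: short, starts uppercase, no sentence-final punctuation."""
    text = line.strip()
    if not text or text.endswith((".", "!", "?")):
        return False
    return len(text.split()) <= max_words and text[0].isupper()


def classify(line: str) -> str:
    """Label a line as blank, bullet, single_letters, headline, or text."""
    if not line.strip():
        return "blank"
    if is_bullet(line):
        return "bullet"
    if only_single_letters(line):
        return "single_letters"
    if is_headline(line):
        return "headline"
    return "text"
```

Lines labelled `headline` or `bullet` can then be routed to separate handling, and `single_letters` lines dropped, before sentences are passed to the extractor.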