Difference between revisions of "Iarpa"

From Knowitall
Line 1:
  == Domain Recognizers ==
  == Rule Learning ==
  == Quality of Extractions ==
  === TODO ===
- * Too long relations (7 words)
+ == Engineering Tasks ==
- * some args are just special symbols (e.g., ")
+ * Handle capital sentences.
- * remove relations that break database constraints (i.e. token > 64 characters, these can probably be more strict)
+ * Handle geotagging and other oddities of the classified data.
+ * Address efficiency issues.
+
+ == Research Tasks ==
  * Pronoun Resolution
+
  === DONE ===
  * Too short relations (2 characters)

Revision as of 19:17, 4 April 2011

Quality of Extractions

TODO

Engineering Tasks

  • Handle sentences written entirely in capital letters.
  • Handle geotagging and other oddities of the classified data.
  • Address efficiency issues.
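
One possible starting point for the capital-sentences task, reading it as sentences written entirely in upper case, is sketched below. The function names `is_all_caps` and `normalize_case` and the `min_letters` threshold are assumptions made for illustration; nothing here reflects how the project actually handles such input.

```python
def is_all_caps(sentence, min_letters=4):
    """Return True if the sentence has at least `min_letters` cased
    letters and every one of them is uppercase.

    The threshold (an assumed value) keeps short tokens such as
    "U.S." from being flagged as an all-caps sentence.
    """
    cased = [ch for ch in sentence if ch.isupper() or ch.islower()]
    return len(cased) >= min_letters and all(ch.isupper() for ch in cased)


def normalize_case(sentence):
    """Lowercase an all-caps sentence and capitalize its first character.

    Sentences that are not all caps are returned unchanged.
    """
    if not is_all_caps(sentence):
        return sentence
    lowered = sentence.lower()
    return lowered[:1].upper() + lowered[1:]
```

Naive lowercasing like this loses the capitalization of proper nouns and acronyms ("FBI" becomes "Fbi"), so a real solution would probably need a truecasing step or a dictionary of known names.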

Research Tasks

  • Pronoun Resolution

DONE

  • Too short relations (2 characters)
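
A filter for too-short relations might look like the following minimal sketch. The `Extraction` tuple, the function names, and the reading of "2 characters" as "drop relations of two characters or fewer" are all assumptions; the original note does not say how the filter was implemented.

```python
from typing import NamedTuple


class Extraction(NamedTuple):
    """Hypothetical container for one (arg1, relation, arg2) triple."""
    arg1: str
    relation: str
    arg2: str


# Assumption: relations of 2 characters or fewer count as "too short".
MIN_RELATION_CHARS = 3


def has_long_enough_relation(extraction, min_chars=MIN_RELATION_CHARS):
    """True if the relation phrase has at least `min_chars` characters
    after surrounding whitespace is stripped."""
    return len(extraction.relation.strip()) >= min_chars


def filter_short_relations(extractions, min_chars=MIN_RELATION_CHARS):
    """Keep only extractions whose relation phrase is long enough."""
    return [e for e in extractions if has_long_enough_relation(e, min_chars)]
```

For example, a relation such as "is in" passes the check, while "of" and "'s" are dropped.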

Tools

Speed

  • Compile to native code.
  • Compare NLP libraries.
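
A comparison of NLP libraries for speed could start from a small timing harness like the sketch below. The two tokenizers are placeholder stand-ins written for this example, not real library calls; in practice each entry in `candidates` would wrap the library function being evaluated.

```python
import re
import time


def whitespace_tokenize(text):
    """Placeholder candidate: split on whitespace."""
    return text.split()


def regex_tokenize(text):
    """Placeholder candidate: split words from punctuation."""
    return re.findall(r"\w+|[^\w\s]", text)


def benchmark(candidates, sentences, repeats=3):
    """Time each candidate over all sentences and return the best
    (lowest) wall-clock time in seconds per candidate name."""
    results = {}
    for name, func in candidates.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            for sentence in sentences:
                func(sentence)
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results
```

Taking the best of several runs reduces noise from other processes; the corpus passed in should be representative of the real input.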

Conversion of Docs to Text

  • Detect headlines and handle them separately.
  • Handle sentences that consist only of single letters.
  • Detect bullet points.
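
The three conversion items above could be prototyped with simple line-level heuristics, as in this sketch. Every rule here (the bullet pattern, the 12-word limit, the capitalization ratio) is an assumption chosen for illustration and would need tuning against the actual documents.

```python
import re

# Assumed bullet markers: "•", "*", "-" or a number followed by "." or ")".
BULLET_RE = re.compile(r"^\s*(?:[•*\-]|\d+[.)])\s+")


def is_bullet(line):
    """True if the line starts with one of the assumed bullet markers."""
    return bool(BULLET_RE.match(line))


def is_single_letter_sentence(line):
    """True if every alphabetic token in the line is one letter long."""
    tokens = re.findall(r"[^\W\d_]+", line)
    return bool(tokens) and all(len(t) == 1 for t in tokens)


def is_headline(line, max_words=12):
    """Heuristic: short, no sentence-final punctuation, and at least
    half of the words start with an uppercase letter."""
    text = line.strip()
    if not text or text[-1] in ".!?":
        return False
    words = text.split()
    if len(words) > max_words:
        return False
    capitalized = sum(1 for w in words if w[:1].isupper())
    return capitalized * 2 >= len(words)


def classify_line(line):
    """Label a line as 'blank', 'bullet', 'single_letters',
    'headline' or 'text', checking the rules in that order."""
    if not line.strip():
        return "blank"
    if is_bullet(line):
        return "bullet"
    if is_single_letter_sentence(line):
        return "single_letters"
    if is_headline(line):
        return "headline"
    return "text"
```

Lines labelled `headline` or `bullet` could then be routed around the sentence-level extractor, and `single_letters` lines dropped or re-joined, depending on what the source document looks like.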