Difference between revisions of "Iarpa"

From Knowitall
(Conversion of Docs to Text)
(Engineering Tasks)
Line 1:
  == TODO ==
  === Engineering Tasks ===
- * Handle capital sentences.
+ * Handle capitalized/allcaps/nocap sentences.
+ * Oddity of data where person lastname is in parentheses.
  * Handle geotagging and other oddities of the classified data.
  * Address efficiency issues.

Revision as of 21:11, 6 April 2011

TODO

Engineering Tasks

  • Handle capitalized/allcaps/nocap sentences.
  • Handle the oddity in the data where a person's last name appears in parentheses.
  • Handle geotagging and other oddities of the classified data.
  • Address efficiency issues.
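
As a rough illustration of the first task, a sentence's casing could be labeled before choosing how to process it. This is only a sketch: the function name classify_casing and the four labels are assumptions made for illustration, not part of any existing code in the project.

```python
def classify_casing(sentence: str) -> str:
    """Label a sentence as 'allcaps', 'nocap', 'capitalized', or 'mixed'.

    Hypothetical helper: the labels mirror the TODO item
    (capitalized/allcaps/nocap); 'mixed' covers ordinary sentence case.
    """
    # Keep only tokens that contain at least one letter.
    words = [w for w in sentence.split() if any(c.isalpha() for c in w)]
    if not words:
        return "mixed"

    letters = [c for w in words for c in w if c.isalpha()]
    if all(c.isupper() for c in letters):
        return "allcaps"
    if all(c.islower() for c in letters):
        return "nocap"

    # First alphabetic character of every word.
    initials = [next(c for c in w if c.isalpha()) for w in words]
    if all(c.isupper() for c in initials):
        return "capitalized"
    return "mixed"


if __name__ == "__main__":
    for s in ("NUCLEAR MATERIAL SEIZED",
              "nuclear material seized",
              "Nuclear Material Seized",
              "Police seized nuclear material."):
        print(classify_casing(s), "<-", s)
```

A downstream step could then, for example, lowercase and re-case 'allcaps' input before tagging; which normalization works best is left open here.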

Research Tasks

  • Pronoun Resolution

Tools

Speed

  • Compile to native code.
  • Compare NLP libraries.
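
One possible way to compare libraries on speed is a small timing harness that runs each candidate over the same sentences. The sketch below assumes each library can be wrapped in a callable that takes a sentence; time_pipelines and the two stand-in tokenizers are hypothetical and only show the shape of the comparison.

```python
import re
import time
from typing import Callable, Dict, List


def time_pipelines(pipelines: Dict[str, Callable[[str], object]],
                   sentences: List[str],
                   repeats: int = 3) -> Dict[str, float]:
    """Return the best wall-clock time (seconds) per pipeline over all sentences."""
    results: Dict[str, float] = {}
    for name, fn in pipelines.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            for s in sentences:
                fn(s)
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


if __name__ == "__main__":
    # Stand-ins for real NLP libraries; replace with wrappers around the actual tools.
    candidates = {
        "whitespace_split": str.split,
        "regex_tokens": lambda s: re.findall(r"\w+|[^\w\s]", s),
    }
    corpus = ["Nuclear material was seized at the border."] * 1000
    for name, seconds in time_pipelines(candidates, corpus).items():
        print(f"{name}: {seconds:.4f}s")
```

Taking the best of several repeats reduces noise from other processes; accuracy would still need to be compared separately.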

Conversion of Docs to Text

  • Capitalized sentences such as "Nuclear Material Seized"
  • Detect headlines and handle separately
  • Sentences consisting only of single letters
  • Detect bullet points
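
The conversion checks listed above could be prototyped as a simple per-line classifier. This is a heuristic sketch under stated assumptions: the bullet markers, the 12-word headline limit, and the names classify_line and looks_like_headline are all illustrative choices, not taken from the project.

```python
import re

# Assumed bullet markers: "-", "*", "•", or a number followed by "." or ")".
BULLET_RE = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def looks_like_headline(text: str, max_words: int = 12) -> bool:
    """Heuristic: short, no sentence-final punctuation, every word capitalized."""
    words = [w for w in text.split() if any(c.isalpha() for c in w)]
    if not words or len(words) > max_words:
        return False
    if text.rstrip()[-1] in ".!?":
        return False
    return all(next(c for c in w if c.isalpha()).isupper() for w in words)


def classify_line(line: str) -> str:
    """Return 'blank', 'bullet', 'single_letters', 'headline', or 'text'."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if BULLET_RE.match(line):
        return "bullet"
    tokens = stripped.split()
    # Lines such as "N U C L E A R": every token is one alphabetic character.
    if all(len(t) == 1 and t.isalpha() for t in tokens):
        return "single_letters"
    if looks_like_headline(stripped):
        return "headline"
    return "text"


if __name__ == "__main__":
    for sample in ("  • Detect bullet points",
                   "Nuclear Material Seized",
                   "A B C",
                   "Police seized the material."):
        print(classify_line(sample), "<-", repr(sample))
```

Lines labeled 'headline' or 'bullet' could then be routed to separate handling instead of the regular sentence pipeline.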