Difference between revisions of "Iarpa"
From Knowitall
Revision as of 21:11, 6 April 2011
TODO
Engineering Tasks
- Handle capitalized, all-caps, and no-caps sentences.
- Handle the data oddity where a person's last name appears in parentheses.
- Handle geotagging and other oddities of the classified data.
- Address efficiency issues.
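
As a rough illustration of the sentence-casing task, the sketch below labels a sentence by its casing so that downstream steps could normalize it before parsing. The function name `classify_casing` and the four labels are assumptions made for this example and are not part of the project code.

```python
def classify_casing(sentence: str) -> str:
    """Return 'allcaps', 'nocap', 'capitalized', or 'mixed' (hypothetical labels)."""
    # Only tokens that contain a letter carry casing information.
    words = [w for w in sentence.split() if any(c.isalpha() for c in w)]
    if not words:
        return "mixed"

    letters = [c for w in words for c in w if c.isalpha()]
    if all(c.isupper() for c in letters):
        return "allcaps"
    if all(c.islower() for c in letters):
        return "nocap"

    def starts_upper(word: str) -> bool:
        # Look at the first alphabetic character, skipping quotes or brackets.
        first = next(c for c in word if c.isalpha())
        return first.isupper()

    # Headline-style: more than one word, each starting with an uppercase letter.
    if len(words) > 1 and all(starts_upper(w) for w in words):
        return "capitalized"
    return "mixed"


if __name__ == "__main__":
    for s in ["NUCLEAR MATERIAL SEIZED",
              "nuclear material seized",
              "Nuclear Material Seized",
              "Police seized nuclear material."]:
        print(s, "->", classify_casing(s))
```

Ordinary sentences fall into the 'mixed' bucket and would be passed through unchanged; the other three labels mark input that probably needs recasing first.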
Research Tasks
- Pronoun Resolution
Tools
Speed
- Compile to native code.
- Compare NLP libraries.
Conversion of Docs to Text
- Capitalized sentences such as "Nuclear Material Seized"
- Detect headlines and handle separately
- Sentences consisting only of single letters
- Detect bullet points
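
The conversion items above could be approached with simple line-level heuristics. The following is a minimal sketch, assuming the converted text arrives one line at a time; the bullet markers, the 12-word headline limit, and all function names are assumptions chosen for illustration, not observed properties of the data.

```python
import re

# Assumed bullet markers: "-", "*", "•", or a number followed by "." or ")".
BULLET_RE = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def is_bullet(line: str) -> bool:
    """True if the line starts with one of the assumed bullet markers."""
    return bool(BULLET_RE.match(line))


def is_single_letters(line: str) -> bool:
    """True if every token is a single letter once punctuation is stripped."""
    tokens = [t.strip(".,;:()") for t in line.split()]
    return bool(tokens) and all(len(t) == 1 and t.isalpha() for t in tokens)


def is_headline(line: str, max_words: int = 12) -> bool:
    """Heuristic: short, no sentence-final punctuation, every word capitalized."""
    words = [w for w in line.split() if any(c.isalpha() for c in w)]
    if not words or len(words) > max_words:
        return False
    if line.rstrip().endswith((".", "!", "?")):
        return False
    return all(next(c for c in w if c.isalpha()).isupper() for w in words)


def classify_line(line: str) -> str:
    """Return 'bullet', 'single_letters', 'headline', or 'text'."""
    if is_bullet(line):
        return "bullet"
    if is_single_letters(line):
        return "single_letters"
    if is_headline(line):
        return "headline"
    return "text"
```

Checking bullets first keeps a line such as "- Nuclear Material Seized" from being treated as a headline; whether that ordering suits the real documents would need to be verified against the data.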