Difference between revisions of "Iarpa"
From Knowitall
Revision as of 21:12, 6 April 2011
TODO
Engineering Tasks
- Handle capitalized/allcaps/nocap sentences.
- Handle the data oddity where a person's last name appears in parentheses.
- Handle geotagging and other oddities of the classified data.
- Address efficiency issues.
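As an illustration of the capitalized/allcaps/nocap item above, here is a minimal sketch of how a sentence's casing style might be detected before further processing. The function name `case_style` and the four labels are assumptions made for this example; they do not come from the project code.

```python
def first_alpha(word):
    """Return the first alphabetic character of a word."""
    return next(c for c in word if c.isalpha())


def case_style(sentence):
    """Label a sentence as allcaps, nocap, capitalized, mixed, or none.

    Hypothetical helper: the labels and rules are illustrative only.
    """
    # Ignore tokens with no letters (numbers, bare punctuation).
    words = [w for w in sentence.split() if any(c.isalpha() for c in w)]
    if not words:
        return "none"
    if all(w.isupper() for w in words):
        return "allcaps"
    if all(w.islower() for w in words):
        return "nocap"
    # Every word starts with an uppercase letter (headline-style casing).
    if len(words) > 1 and all(first_alpha(w).isupper() for w in words):
        return "capitalized"
    return "mixed"
```

A sentence labeled `allcaps`, `nocap`, or `capitalized` carries no reliable case signal, so a downstream step could treat it differently from ordinary `mixed` text.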
Research Tasks
- Pronoun Resolution
Tools
Speed
- Compile to native code.
- Compare NLP libraries.
Conversion of Docs to Text
- Detect headlines and handle separately
- Handle sentences that consist only of single letters
- Detect bullet points
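The three conversion items above could be prototyped with simple line-level heuristics. The sketch below is one possible approach, not the project's implementation: the name `classify_line`, the bullet pattern, and the eight-token headline limit are all assumptions chosen for illustration.

```python
import re

# Assumed bullet markers: "-", "*", the bullet character, or "1." / "1)".
BULLET_RE = re.compile(r"^\s*(?:[-*\u2022]|\d+[.)])\s+")


def classify_line(line):
    """Label one line of converted text with a rough structural type."""
    text = line.strip()
    if not text:
        return "blank"
    if BULLET_RE.match(line):
        return "bullet"
    tokens = text.split()
    # Every token is at most one character once punctuation is trimmed.
    if all(len(t.strip(".,;:")) <= 1 for t in tokens):
        return "single_letters"
    # Short, no sentence-final punctuation, and all-caps or title-cased.
    if (len(tokens) <= 8 and text[-1] not in ".?!"
            and (text.isupper() or text.istitle())):
        return "headline"
    return "sentence"
```

Lines labeled `headline`, `bullet`, or `single_letters` can then be routed to separate handling instead of being passed on as ordinary sentences.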