Document-level Open IE

Goals

  • Extend sentence-based Open IE extractors to incorporate document-level reasoning, such as:
    • Coreference
    • Entity Linking
    • NER
    • Rules implemented for TAC 2013 Entity Linking
  • Define necessary data structures and interfaces by Oct-9 (done)
  • Preliminary end-to-end system evaluation by Nov-11 (done)
  • Quantitatively determine how much this adds to present Open IE

Work Log

11-22

Trained and evaluated a linear classifier for best-mentions (instances of a rule application or best-mention resolution), which provides 95% precision at 90% yield over news data. Features include the rule type (person, organization, location), whether coreference info was used, and the ambiguity of the given mention. A rough sketch of the feature representation and scoring appears after the to-do list below. To do from here:

  • Polish features:
    • Include coreference info when deciding candidates for a rule (currently coreference is only considered after applying rules)
    • Improve ambiguity measures: return a value that indicates the prominence of the chosen mention relative to the other ambiguous mentions
    • Improve the location ambiguity measure: fix a technical issue reading the Tipster gazetteer, and consider city, stateOrProvince, and country ambiguity separately.
    • Debug an issue where names are resolved to something that matches only a prefix (e.g. Steven Miller -> Steven Tyler)
  • Produce a formal evaluation
    • How much does this help "Open IE"?
      • How many extractions get annotated with additional, useful information?
      • How many more links get found as a result of best mentions? Coref? Are they higher confidence links?
      • How much does it increase running time to do Document-level processing with/without Coref?
  • Code cleanup, packaging, and release.
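
As a concrete illustration of the classifier described above, here is a minimal Scala sketch of turning a best-mention candidate into a feature vector and scoring it with a linear model and a logistic output. All names (BestMention, featureVector, the weight values) are hypothetical placeholders, not the actual implementation.

object BestMentionClassifierSketch {

  sealed trait RuleType
  case object PersonRule extends RuleType
  case object OrganizationRule extends RuleType
  case object LocationRule extends RuleType

  // A candidate substitution produced by a rule application or best-mention resolution.
  case class BestMention(
    ruleType: RuleType,   // person, organization, or location rule
    usedCoref: Boolean,   // was coreference info used to find this mention?
    ambiguity: Int        // how many other candidate mentions it competes with
  )

  // Features mirror the ones listed above: rule type, coref usage, and mention ambiguity.
  def featureVector(m: BestMention): Array[Double] = Array(
    1.0,                                              // bias
    if (m.ruleType == PersonRule) 1.0 else 0.0,
    if (m.ruleType == OrganizationRule) 1.0 else 0.0,
    if (m.ruleType == LocationRule) 1.0 else 0.0,
    if (m.usedCoref) 1.0 else 0.0,
    1.0 / (1.0 + m.ambiguity)                         // less ambiguous -> closer to 1.0
  )

  // Illustrative weights only; in practice these are learned from the annotated news data.
  val weights = Array(-0.5, 1.2, 0.8, 0.6, -0.3, 2.0)

  // Logistic score in [0, 1], usable as a confidence for the substitution.
  def confidence(m: BestMention): Double = {
    val dot = featureVector(m).zip(weights).map { case (f, w) => f * w }.sum
    1.0 / (1.0 + math.exp(-dot))
  }

  def main(args: Array[String]): Unit = {
    val m = BestMention(PersonRule, usedCoref = false, ambiguity = 2)
    println(f"confidence = ${confidence(m)}%.3f")
  }
}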


11-12

Implemented serialization to allow extractor input to be saved to disk after pre-processing, saving us from redoing with each run:

  • Parsing
  • Chunking
  • Stemming
  • Coref
  • Sentence-level Open IE
  • NER tagging

This saves roughly 3 minutes per run (over 20 docs) and will greatly speed up development.
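
As a rough illustration of the caching step (PreprocessedDoc, its fields, and the cache path below are placeholders, not the real extractor-input classes), the pre-processed annotations can be written to disk with plain Java serialization and reloaded on later runs:

import java.io._

// Stand-in for the pre-processed extractor input (parses, chunk tags, coref, NER, etc.).
case class PreprocessedDoc(
  docId: String,
  parses: Seq[String],
  corefClusters: Seq[Seq[String]],
  nerTags: Seq[String]
) extends Serializable

object PreprocessCache {

  private def cacheFile(docId: String) = new File(s"cache/$docId.ser")

  def save(doc: PreprocessedDoc): Unit = {
    cacheFile(doc.docId).getParentFile.mkdirs()
    val out = new ObjectOutputStream(new FileOutputStream(cacheFile(doc.docId)))
    try out.writeObject(doc) finally out.close()
  }

  def load(docId: String): Option[PreprocessedDoc] = {
    val f = cacheFile(docId)
    if (!f.exists()) None
    else {
      val in = new ObjectInputStream(new FileInputStream(f))
      try Some(in.readObject().asInstanceOf[PreprocessedDoc]) finally in.close()
    }
  }

  // Run the (expensive) pre-processing only on a cache miss.
  def getOrCompute(docId: String)(preprocess: => PreprocessedDoc): PreprocessedDoc =
    load(docId).getOrElse {
      val doc = preprocess
      save(doc)
      doc
    }
}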

Started refactoring and cleaning up rules. Next step: get all substitution rules "on equal footing" programmatically so that a classifier can be built to rank them.

11-8

Finished annotating data and discussed results. String substitution rules need to be tightened up, and a confidence measure over them would help greatly.

Extraction-level stats

From 20 documents, there were 528 total extractions in all runs.

Rules-diff:
  • 206 extractions in diff
  • 75 baseline better
  • 99 rule-based system better
  • 33 bad extractions (neither better)

Coref-diff:
  • 280 extractions in diff
  • 105 baseline better
  • 115 coref+rule-based system better
  • 59 bad extractions (neither better)

I took a closer look at the "baseline better" cases to see where we were getting it wrong:

Rule-based system:
  • 49 strange string errors (e.g. "CDC" -> "CIENCE SLIGHTED IN")
  • 16 location errors (e.g. "Washington" [DC] -> "Washington, Georgia")
  • 8 entity disambiguation errors (e.g. "he" [Scott Peterson] => "Laci Peterson")
  • 1 incorrect link (e.g. "the theory" linked to "Theory" in FreeBase)
  • 75 total

Coref+rule-based system:
  • 49 strange string errors
  • 11 location errors
  • 13 entity disambiguation errors
  • 17 incorrect links
  • 6 coref errors (e.g. "make it clear that" -> "make the CDC clear that")
  • 105 total

Approximate running times over 20 documents:
  • Baseline: 45 sec
  • Rules: 45 sec
  • Rules+Coref: 230 sec

11-4

  • Released system output for evaluation:
    • "Rules" configuration, using rule-based best-mention disambiguation, NO Coref.
    • "Coref" configuration, using coref-assisted rule-based best-mention disambiguation. Entity Linking context also extended via coreference.
    • Entity Linking output, showing differences in entity links between each system configuration (and the baseline).

Next: Stephen, John, and I will annotate the output and analyze performance.

10-25

Met with Stephen, John, and Michael. Items:

  • Create a (very simple) webapp for the doc extractor
  • Clean up arguments before submitting them to the linker.
  • Replace best-mention substrings rather than substituting best mentions for the entire argument (see the sketch after this list).
  • Reformat evaluation output to show only extractions that have been annotated with additional info (diff)
  • Evaluate difference in linker performance with/without document-level info.
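
For the substring-replacement item above, here is a tiny, hypothetical sketch of the intent: splice the best mention into the matched span of the argument rather than overwriting the whole argument string.

object SubstringSubstitutionSketch {

  // Replace only the span of the original mention inside the argument.
  def replaceSubstring(arg: String, mention: String, bestMention: String): String = {
    val i = arg.indexOf(mention)
    if (i < 0) arg
    else arg.substring(0, i) + bestMention + arg.substring(i + mention.length)
  }

  def main(args: Array[String]): Unit = {
    val arg = "the CDC director"
    // Whole-argument substitution would discard "the ... director";
    // substring substitution preserves the surrounding words:
    println(replaceSubstring(arg, "CDC", "Centers for Disease Control and Prevention"))
    // -> "the Centers for Disease Control and Prevention director"
  }
}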

10-18

Met with Stephen and John. Discussed:

  • Evaluation systems:
    • Baseline sentence extractor with entity linker, no coreference
    • Full system with best-mention finding rules
    • Full system without coreference.
  • Evaluation data:
    • Sample of 20-30 documents from TAC 2013.
    • Moving away from a QA/query-based approach, since the queries/questions would bias evaluation of the document extractor.
    • Instead, we will evaluate all extractions (or a uniform sample of them).
  • Evaluation criteria:
    • Extractions are "correct" if their arguments are as unambiguous as possible given the document text.
    • Measure precision/yield using this metric and compare systems (a small sketch of the bookkeeping follows this list).
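
A minimal sketch of the precision/yield bookkeeping under this criterion (the annotation record and field names are made up for illustration):

object PrecisionYieldSketch {

  case class AnnotatedExtraction(correct: Boolean)

  // precision = correct / total annotated; yield = number of correct extractions.
  def precisionAndYield(extractions: Seq[AnnotatedExtraction]): (Double, Int) = {
    val numCorrect = extractions.count(_.correct)
    val precision =
      if (extractions.isEmpty) 0.0 else numCorrect.toDouble / extractions.size
    (precision, numCorrect)
  }

  def main(args: Array[String]): Unit = {
    val sample = Seq.fill(18)(AnnotatedExtraction(true)) ++
                 Seq.fill(2)(AnnotatedExtraction(false))
    val (p, y) = precisionAndYield(sample)   // "y" because "yield" is a reserved word
    println(f"precision = $p%.2f, yield = $y")
  }
}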

10-17

Completed: Integrated sentence-level Open IE and Freebase Linker, test run OK.

Next Goals:

  • Integrate best-mention finding rules.
    • First: Drop in code "as-is"
    • After: Factor out NER tagging, coref components
  • Fix issues with tracking character offsets
    • Offsets are not properly computed for Open IE extractions
    • Find a good way to retrieve document metadata by character offset (one possible approach is sketched after this list).
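
One possible approach to the offset lookup (an assumption, not the eventual fix) is to index metadata spans by their start offset in a TreeMap and find the covering span with floorEntry:

import java.util.TreeMap

object OffsetMetadataSketch {

  // A metadata region of the document, e.g. headline, dateline, or body.
  case class MetaSpan(start: Int, end: Int, label: String)

  class OffsetIndex(spans: Seq[MetaSpan]) {
    private val byStart = new TreeMap[Integer, MetaSpan]()
    spans.foreach(s => byStart.put(s.start, s))

    // Return the span covering the given character offset, if any.
    def metadataAt(offset: Int): Option[MetaSpan] = {
      val entry = byStart.floorEntry(offset)
      Option(entry).map(_.getValue).filter(s => offset < s.end)
    }
  }

  def main(args: Array[String]): Unit = {
    val index = new OffsetIndex(Seq(
      MetaSpan(0, 40, "headline"),
      MetaSpan(40, 500, "body")
    ))
    println(index.metadataAt(25))   // Some(MetaSpan(0,40,headline))
    println(index.metadataAt(600))  // None
  }
}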

10-9

Short-term goal: define the necessary interfaces and data structures by 10-11. A rough sketch of these interfaces appears at the end of this entry.

  • Implemented interfaces for:
    • Document
    • Sentence
    • Extraction
    • Argument/Relation
    • Coreference Mention
    • Coreference Cluster
    • Entity Link
  • Discussed interfaces at length with John and Michael
    • Interfaces to be incorporated into generic NLP tool library (nlptools):
      • Document
      • Sentence
      • CorefResolver
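
To make the above concrete, here is a rough Scala sketch of what these interfaces might look like. The member names are illustrative guesses, not the actual nlptools definitions.

object InterfaceSketch {

  trait Document {
    def text: String
    def sentences: Seq[Sentence]
  }

  trait Sentence {
    def text: String
    def offset: Int                  // character offset into the document
  }

  case class Argument(text: String, offset: Int)
  case class Relation(text: String, offset: Int)

  // A binary extraction over a sentence.
  case class Extraction(arg1: Argument, rel: Relation, arg2: Argument)

  case class Mention(text: String, offset: Int)
  case class CorefCluster(mentions: Seq[Mention], best: Mention)

  // A link from a mention to a knowledge-base entity.
  case class EntityLink(mention: Mention, entityId: String, score: Double)

  // Coreference resolution as a document-level component.
  trait CorefResolver {
    def resolve(doc: Document): Seq[CorefCluster]
  }
}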