Vulcan/MeetingNotes/Aug16 2013
Latest revision as of 21:44, 16 August 2013
Notes
- Greg will own the evidence finder.
  - Walk through the system architecture with Greg.
- Axiomatic representations can be limiting. Figure out how to allow for entailment-type matching within Tuffy.
  - Store sentences with axioms.
  - Figure out how to do procedural escapes in Tuffy.
  - By mid next week, estimate when we will have a system that works on one or five examples.
- Get a knowledge spec:
  - isa, partOf, etc.
  - Don't reimplement. Find existing resources.
- Stephen to take the lead on the definition extractor.
  - Send Stephen literature and other material on definition processing.
- What are the research problems?
  - Definition extractor.
  - Reading rules from text.
  - Abductive reasoning.
  - Procedural escapes for textual matching.
Agenda
- Update
- System architecture
- Plan for Greg
  - Process text collections (definitions, study guides, etc.) with Open IE and import the results into Solr.
  - Convert WordNet and CNC to Tuffy axiom format and import them into Postgres.
  - Convert scored assertions into a format that Vulcan's evaluation framework accepts.
  Long-term plan: Greg will be responsible for the inference (online) components, and Niranjan will focus on the offline components (generating axioms and rules) and on experimentation.
- Experiment/Evaluation plan
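The WordNet conversion item in the plan for Greg could start from something like the sketch below. It is only an illustration under assumptions: the notes do not define the axiom format, so the output here is Alchemy-style ground atoms with quoted constants (the evidence syntax Tuffy is understood to share with Alchemy), and `to_constant`, `to_evidence`, and the sample `isa` pairs are hypothetical. Reading WordNet itself and loading the result into Postgres are left out.

```python
from typing import Iterable, List, Tuple


def to_constant(term: str) -> str:
    """Quote a term so it can be used as a constant in an evidence line.

    Assumption: quoted strings are accepted as constants; embedded double
    quotes are replaced so the atom stays well-formed.
    """
    return '"' + term.replace('"', "'") + '"'


def to_evidence(relation: str, pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Render (arg1, arg2) pairs as ground atoms, one per evidence line."""
    return [f"{relation}({to_constant(a)}, {to_constant(b)})" for a, b in pairs]


# Hypothetical hyponym/hypernym pairs standing in for a WordNet export.
ISA_PAIRS = [("dog", "canine"), ("canine", "mammal")]

evidence_lines = to_evidence("isa", ISA_PAIRS)
for line in evidence_lines:
    print(line)
```

The same helper would cover `partOf` and the other relations in the knowledge spec by changing the relation name.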
Update
- System development (details on architecture and status are on the SystemStatus page)
- 1. Online inference components: implemented.
  - Proposition generator -- Extracts tuples from the input sentence and converts them into a proposition.
  - Evidence finder -- Tuple matching over Open IE Clueweb data.
  - MLN inference -- A wrapper around Tuffy's MLN inferencer.
- 2. Offline components -- axiom and rule generation -- NOT implemented.
- 3. Planning to use the Tuffy MLN inference system directly.
Why Tuffy and not Jena or another inference engine? Why not Alchemy?
- Inference engines such as Jena/OWLim don't directly support multiple inference paths. The community's response is to suggest Datalog/Prolog implementations.
- Tuffy supports the MLN capabilities of Alchemy but is orders of magnitude faster (what takes 6 hours in Alchemy takes 2 minutes in Tuffy).
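To make the flow between the three online components concrete, here is a toy sketch of how they might hand data to each other. Everything in it is a stand-in chosen for illustration: the real proposition generator uses Open IE extraction rather than whitespace splitting, the real evidence finder matches against Clueweb tuples, and the scoring step below is a simple agreement ratio, not MLN inference in Tuffy. All names (`OpenIETuple`, `generate_proposition`, `find_evidence`, `score`, `run_pipeline`, `CORPUS`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OpenIETuple:
    arg1: str
    rel: str
    arg2: str


def generate_proposition(sentence: str) -> OpenIETuple:
    """Stand-in proposition generator: naive 'arg1 rel arg2...' split."""
    words = sentence.rstrip(".").lower().split()
    if len(words) < 3:
        raise ValueError("need at least three words to form a tuple")
    return OpenIETuple(words[0], words[1], " ".join(words[2:]))


def find_evidence(prop: OpenIETuple, corpus: List[OpenIETuple]) -> List[OpenIETuple]:
    """Stand-in evidence finder: same relation and at least one shared argument."""
    return [
        t for t in corpus
        if t.rel == prop.rel and (t.arg1 == prop.arg1 or t.arg2 == prop.arg2)
    ]


def score(prop: OpenIETuple, evidence: List[OpenIETuple]) -> float:
    """Placeholder for inference: share of evidence tuples matching both arguments."""
    if not evidence:
        return 0.0
    full = sum(1 for t in evidence if t.arg1 == prop.arg1 and t.arg2 == prop.arg2)
    return full / len(evidence)


def run_pipeline(sentence: str, corpus: List[OpenIETuple]) -> float:
    """Chain the three stages: proposition, evidence, score."""
    prop = generate_proposition(sentence)
    return score(prop, find_evidence(prop, corpus))


# Hypothetical miniature tuple store.
CORPUS = [
    OpenIETuple("plants", "produce", "oxygen"),
    OpenIETuple("plants", "produce", "seeds"),
    OpenIETuple("animals", "produce", "carbon dioxide"),
]

print(run_pipeline("Plants produce oxygen.", CORPUS))
```

Keeping each stage behind its own function mirrors the ownership split in the notes: the evidence finder can be replaced without touching proposition generation or inference.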
- Experiments and Evaluation
Not ready to do evaluation yet, but here are some useful details.
- 1. Framework: Vulcan already has a good evaluation interface set up. We will use this for starters. (Example output from the evaluation framework: http://homes.cs.washington.edu/~niranjan/vulcan/example-results.html)
- 2. Data: Training/test splits were set up by Vulcan. The questions cover 4th through 12th grade and AP exams.
Training = 474 questions.
Test = 290 questions.
Training data distribution and Vulcan's current performance:
| Grade | All questions | Multiple-choice, non-diagram (MC-ND) | Vulcan performance on MC-ND |
| --- | --- | --- | --- |
| 4th grade | 249 | 108 | 55.09% |
| 8th grade | 476 | 125 | 55.07% |
| 12th grade | 446 | 160 | 25.83% |
| AP | 116 | 81 | 45.68% |
| All | 1287 | 474 | |
- 3. Method: Input sentences that correspond to each assertion. Score assertions using our system and submit to Vulcan's web interface.
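The training-data table leaves the overall MC-ND performance blank. As a rough sanity check, the per-grade figures can be combined into a question-weighted mean. This assumes each percentage is an average score over that grade's MC-ND questions, which the notes do not state, so the result is an estimate rather than a number reported by Vulcan. The dictionary and function names are illustrative.

```python
# (MC-ND question count, Vulcan performance in percent), copied from the table.
MC_ND = {
    "4th grade": (108, 55.09),
    "8th grade": (125, 55.07),
    "12th grade": (160, 25.83),
    "AP": (81, 45.68),
}


def total_questions(rows: dict) -> int:
    """Sum the MC-ND question counts across grades."""
    return sum(count for count, _ in rows.values())


def weighted_performance(rows: dict) -> float:
    """Question-weighted mean of the per-grade percentages."""
    weighted_sum = sum(count * pct for count, pct in rows.values())
    return weighted_sum / total_questions(rows)


total = total_questions(MC_ND)        # matches the 474 training questions
overall = weighted_performance(MC_ND)
print(f"{total} questions, estimated overall MC-ND performance {overall:.2f}%")
```

Under that assumption the estimate comes out at roughly 43.6%, which gives a single baseline figure to compare against once our system's scores are submitted.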