Difference between revisions of "Vulcan/MeetingNotes/Aug16 2013"
From Knowitall
Revision as of 18:32, 16 August 2013
Update
- System development (Details on architecture and status)
- 1. Online inference components implemented.
- Proposition generator -- Extract tuples from input sentence and convert into a proposition.
- Evidence finder -- Tuple matching over Open IE Clueweb data.
- MLN Inference -- A wrapper around Tuffy's MLN inferencer.
- 2. Offline components -- axioms and rule generation -- NOT implemented.
- 3. Planning to use Tuffy MLN Inference system directly.
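The three online components above can be pictured as a simple pipeline. The following is only a sketch in Python under stated assumptions: the function names, the `Triple` type, the toy extractor, and the exact-match evidence lookup are all hypothetical stand-ins, and `infer` is a placeholder for the Tuffy wrapper, not its actual interface.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # hypothetical (arg1, relation, arg2) tuple


def normalize(t: Triple) -> Triple:
    """Lower-case and trim each part so that matching ignores case and spacing."""
    return tuple(part.strip().lower() for part in t)


def generate_proposition(sentence: str,
                         extract: Callable[[str], List[Triple]]) -> List[Triple]:
    """Proposition generator: extract tuples from a sentence."""
    return [normalize(t) for t in extract(sentence)]


def find_evidence(proposition: List[Triple], corpus: List[Triple]) -> List[Triple]:
    """Evidence finder: exact tuple lookup, standing in for Open IE matching."""
    wanted = set(proposition)
    return [t for t in corpus if normalize(t) in wanted]


def infer(proposition: List[Triple], evidence: List[Triple]) -> float:
    """Placeholder for MLN inference: fraction of tuples that have evidence."""
    if not proposition:
        return 0.0
    supported = {normalize(t) for t in evidence}
    return sum(t in supported for t in proposition) / len(proposition)


def toy_extract(sentence: str) -> List[Triple]:
    """Toy extractor (assumption): first word, second word, rest of sentence."""
    words = sentence.rstrip(".").split()
    return [(words[0], words[1], " ".join(words[2:]))] if len(words) >= 3 else []


def run_pipeline(sentence: str, corpus: List[Triple],
                 extract: Callable[[str], List[Triple]] = toy_extract) -> float:
    proposition = generate_proposition(sentence, extract)
    return infer(proposition, find_evidence(proposition, corpus))


# Invented two-tuple corpus, for illustration only.
corpus = [("Iron", "conducts", "electricity"), ("wood", "floats", "on water")]
score = run_pipeline("Iron conducts electricity.", corpus)
print(score)
```

Keeping each stage behind a plain function makes it easy to swap the placeholder `infer` for a real inferencer later without touching the other two stages.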
Why Tuffy and not Jena or another inference engine? Why not Alchemy?
- Inference engines such as Jena/OWLim don't directly support multiple inference paths. Community's response is to suggest Datalog/prolog implementations.
- Tuffy supports MLN capabilities in Alchemy but is orders of magnitude faster (what takes 6 hours in Alchemy takes 2 minutes in Tuffy).
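For reference, this is what an MLN inferencer computes, shown as a brute-force sketch in Python. It is not Tuffy's algorithm or API: it enumerates every possible world, which only works for a handful of ground atoms. The atoms, the rule, and the weight 1.5 are invented for illustration.

```python
import itertools
import math
from typing import Callable, Dict, List, Tuple

World = Dict[str, bool]
Formula = Tuple[float, Callable[[World], bool]]  # (weight, ground formula)


def marginal(query: str, atoms: List[str], formulas: List[Formula],
             evidence: World) -> float:
    """P(query is true | evidence), enumerating all worlds that agree with the evidence."""
    free = [a for a in atoms if a not in evidence]
    total = 0.0
    query_mass = 0.0
    for values in itertools.product([False, True], repeat=len(free)):
        world = dict(evidence)
        world.update(zip(free, values))
        # Unnormalized world weight: exp of the summed weights of satisfied formulas.
        weight = math.exp(sum(w for w, holds in formulas if holds(world)))
        total += weight
        if world[query]:
            query_mass += weight
    return query_mass / total


# Hypothetical ground program: Metal(Iron) => Conducts(Iron), with weight 1.5.
atoms = ["Metal(Iron)", "Conducts(Iron)"]
formulas = [(1.5, lambda w: (not w["Metal(Iron)"]) or w["Conducts(Iron)"])]
p = marginal("Conducts(Iron)", atoms, formulas, {"Metal(Iron)": True})
print(p)
```

With a single soft rule and the antecedent given as evidence, the marginal reduces to the logistic function of the rule weight. Enumeration is exponential in the number of free atoms, which is why practical systems rely on grounding and sampling strategies instead.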
- Experiments and Evaluation
Not ready to evaluate yet, but here are some useful details.
- 1. Framework: Vulcan has a good evaluation interface set up. We will use this for starters. (Example output from the evaluation framework.)
- 2. Data: Training/Test splits set up by Vulcan. The questions cover 4th through 12th grade and AP exams.
Training = 474 questions.
Test = 290 questions.
Training data distribution and Vulcan's current performance:
Grade        All Questions   #Mult.Choice and Non-diag. (MC-ND)   Vulcan Performance on MC-ND
4th grade    249             108                                  55.09%
8th grade    476             125                                  55.07%
12th grade   446             160                                  25.83%
AP           116             81                                   45.68%
All          1287            474
- 3. Method: Input sentences that correspond to each assertion. Score assertions using our system and submit to Vulcan's web interface.
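The table above gives no overall performance figure for the "All" row. A rough one can be derived by weighting each grade's reported percentage by its MC-ND question count. The Python sketch below does that, assuming the percentages are per-question averages (the notes do not say how they are computed), so the result is only an estimate.

```python
# (grade, MC-ND question count, reported performance in percent), from the table above
rows = [
    ("4th grade", 108, 55.09),
    ("8th grade", 125, 55.07),
    ("12th grade", 160, 25.83),
    ("AP", 81, 45.68),
]


def weighted_average(rows):
    """Average the percentages, weighting each grade by its question count."""
    total = sum(count for _, count, _ in rows)
    return sum(count * pct for _, count, pct in rows) / total


total_questions = sum(count for _, count, _ in rows)
overall = weighted_average(rows)
print(f"{total_questions} MC-ND questions, weighted performance {overall:.2f}%")
```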
- Design questions.
- 1. Why not use MLN directly? Why use a backward chained inferencer (such as Jena) as an intermediate step?
- Looks like a separate backward-chained inferencer won't be necessary.
- Tuffy, an MLN implementation, uses KBMC (knowledge-based model construction) to scale MLN inference. Details [1]
- 2.
- Analysis
- 1. Selected 10 propositions that are single Open IE tuples as starting targets.
- 2. Wrote down steps involved in verifying these propositions.
Agenda
To Do (Copied over from previous week)
- System building
- 1. Implement "template matching" using the ClueWeb corpus. Pending
- URL for Open IE backend is available.
- For an assertion A, find sentences that have high overlap. Generate regex patterns for the proposition. Score sentences by how well they match the regex patterns.
- 2. Continue system building.
- Create a derivation scorer stub. This will be replaced with an MLN or a BLP scorer. Done.
- Test with iron nail example.
- 3. Jena API doesn't readily support multiple derivations.
- Ask Jena community to find out if this is possible. Done. Not possible.
- OWLIM as replacement. Done. Doesn't look promising. No response from community.
- 4. Try out Tuffy MLN implementation. Done.
- Use output of iron nail example.
- If easy to use, write wrappers around Tuffy to hook into our system.
- 5. Write evaluation code. Vulcan has a good interface set up.
- Check with Peter.
- 6. Create a system architecture page with a figure and overview of the main components.
Created a System status page instead.
- Created a figure. Added it to system design document.
- Need to create a wiki page for system architecture and overview.
- Experiments Pending
- 1. Run template matching approach as a baseline.
- 2. Run inference system as a baseline.
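The template matching baseline (item 1 under System building) could be sketched as follows. This is a minimal Python illustration under assumptions, not the planned implementation: the overlap measure, the two regex templates, the three-word gap, and the 0.5 overlap threshold are all hypothetical choices, and the example sentences are invented.

```python
import re
from typing import List, Tuple


def token_overlap(assertion: str, sentence: str) -> float:
    """Fraction of the assertion's word types that also occur in the sentence."""
    a = set(re.findall(r"\w+", assertion.lower()))
    s = set(re.findall(r"\w+", sentence.lower()))
    return len(a & s) / len(a) if a else 0.0


def patterns_for(triple: Tuple[str, str, str]) -> List[re.Pattern]:
    """Build a strict and a loose regex template from an (arg1, rel, arg2) tuple."""
    arg1, rel, arg2 = (re.escape(part) for part in triple)
    gap = r"(?:\W+\w+){0,3}?\W+"  # allow up to three intervening words
    return [
        re.compile(rf"\b{arg1}{gap}{rel}{gap}{arg2}\b", re.I),  # all three, in order
        re.compile(rf"\b{arg1}\b.*\b{arg2}\b", re.I),           # both arguments, in order
    ]


def rank_sentences(assertion: str, triple: Tuple[str, str, str],
                   sentences: List[str], min_overlap: float = 0.5):
    """Keep high-overlap sentences and score each by the share of templates it matches."""
    patterns = patterns_for(triple)
    scored = []
    for sentence in sentences:
        if token_overlap(assertion, sentence) < min_overlap:
            continue
        matched = sum(1 for p in patterns if p.search(sentence))
        scored.append((matched / len(patterns), sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


sentences = [
    "An iron nail readily conducts electricity.",
    "Electricity is not conducted by wood, unlike iron.",
    "Wood floats on water.",
]
ranked = rank_sentences("Iron conducts electricity",
                        ("iron", "conducts", "electricity"), sentences)
for score, sentence in ranked:
    print(score, sentence)
```

The overlap filter is cheap and prunes unrelated sentences before any regex runs. The second sentence shows why overlap alone is not enough: it shares two of the three words but matches neither template.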