Difference between revisions of "Vulcan/MeetingNotes/Aug16 2013"

Latest revision as of 21:44, 16 August 2013

Notes

Greg will own evidence finder.

Walk through the system architecture with Greg.

Axiomatic representations can be limiting. Figure out how to allow for entailment type matching within Tuffy.

Store sentences with axioms.

Figure out how to do procedural escapes in Tuffy.

By the middle of next week, estimate when we will have a system that works on one or five examples.

Get a knowledge spec:

isa, partOf, etc.

Don't reimplement. Find resources.

Stephen to take lead on definition extractor.

Send Stephen literature and other material for definition processing.

What are the research problems?

Definitions extractor

Reading rules from text.

Abductive reasoning

Procedural escapes for textual matching.

Agenda

Update
System architecture
Plan for Greg

Process text collections (definitions, study guides, etc.) using Open IE and import them into Solr.

Convert WordNet and CNC to Tuffy axiom format and import them into Postgres.

Convert scored assertions into a format accepted by Vulcan's evaluation framework.

Long-term plan: Greg will be responsible for the online (inference) components, and Niranjan will focus on the offline components (generating axioms and rules) and on experimentation.

Experiment/Evaluation plan
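A minimal sketch of the WordNet-to-Tuffy conversion item in the plan for Greg, to make the target concrete. It turns relation triples into ground atoms for an evidence file. Assumptions, none of them from the notes: the triples are made-up illustrations, the helper names `to_constant` and `to_evidence_atom` are hypothetical, and the output assumes an Alchemy-style ground-atom syntax with double-quoted string constants, which should be checked against the Tuffy documentation before use.

```python
import re

# Hypothetical relation triples; a real run would read these from WordNet.
TRIPLES = [
    ("isa", "iron nail", "nail"),
    ("partOf", "leaf", "plant"),
]


def to_constant(term: str) -> str:
    """Render a term as a double-quoted string constant (assumed syntax)."""
    cleaned = re.sub(r"\s+", " ", term.strip()).replace('"', "'")
    return f'"{cleaned}"'


def to_evidence_atom(relation: str, arg1: str, arg2: str) -> str:
    """Build one ground atom, e.g. isa("iron nail", "nail")."""
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", relation):
        raise ValueError(f"not a valid predicate name: {relation!r}")
    return f"{relation}({to_constant(arg1)}, {to_constant(arg2)})"


evidence_lines = [to_evidence_atom(*triple) for triple in TRIPLES]
print("\n".join(evidence_lines))
```

Keeping the conversion as a pure function over triples means the same code could feed either a flat evidence file or a bulk load into Postgres, whichever the import step ends up using.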

Update

System development (Details on architecture and status):

1. Online inference components implemented.

Proposition generator -- Extract tuples from input sentence and convert into a proposition.

Evidence finder -- Tuple matching over Open IE ClueWeb data.

MLN Inference -- A wrapper around Tuffy's MLN inferencer.

2. Offline components -- axiom and rule generation -- NOT implemented.

3. Planning to use Tuffy MLN Inference system directly.

Why Tuffy and not Jena or another inference engine? Why not Alchemy?

Inference engines such as Jena/OWLIM don't directly support multiple inference paths. The community's response was to suggest Datalog/Prolog implementations.

Tuffy supports the MLN capabilities of Alchemy but is orders of magnitude faster (what takes 6 hours in Alchemy takes 2 minutes in Tuffy).
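A rough sketch of how the three online components listed above could fit together. It is not the actual implementation. Assumptions: the `OpenIETuple` type, the function names, the token-overlap score, the threshold and the toy corpus are all hypothetical. The last stage is a placeholder that returns the best match score. It stands in for MLN inference and does not call Tuffy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenIETuple:
    arg1: str
    rel: str
    arg2: str


def field_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase token sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def match_score(query: OpenIETuple, candidate: OpenIETuple) -> float:
    """Average the per-field overlaps (arg1, rel, arg2)."""
    return (
        field_overlap(query.arg1, candidate.arg1)
        + field_overlap(query.rel, candidate.rel)
        + field_overlap(query.arg2, candidate.arg2)
    ) / 3


def find_evidence(query, corpus, threshold=0.5):
    """Evidence finder: keep corpus tuples scoring at or above threshold."""
    hits = [(match_score(query, c), c) for c in corpus]
    hits = [(s, c) for s, c in hits if s >= threshold]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


def score_proposition(evidence) -> float:
    """Placeholder for the MLN step: best match score, or 0.0 if none."""
    return max((s for s, _ in evidence), default=0.0)


# Toy data for illustration only; real evidence would come from Open IE.
corpus = [
    OpenIETuple("iron nail", "conducts", "electricity"),
    OpenIETuple("plants", "need", "sunlight"),
]
proposition = OpenIETuple("an iron nail", "conducts", "electricity")
evidence = find_evidence(proposition, corpus)
score = score_proposition(evidence)
print(f"{len(evidence)} matching tuple(s), score {score:.3f}")
```

Returning scored evidence instead of a yes/no match keeps the interface open for a later stage that weighs several inference paths, which is the capability the notes say Jena and OWLIM lack.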

Experiments and Evaluation

We are not ready to run the evaluation yet, but here are some useful details.

1. Framework: Vulcan has a good evaluation interface set up. We will use this for starters. (Example output from the evaluation framework.)

2. Data: Training/Test splits set up by Vulcan. The questions cover 4th through 12th grade and AP exams.

Training = 474 questions.
Test = 290 questions.

Training data distribution and Vulcan's current performance:

Grade	All Questions	#Mult.Choice and Non-diag. (MC-ND)	Vulcan Performance on MC-ND
4th grade	249	108	55.09%
8th grade	476	125	55.07%
12th grade	446	160	25.83%
AP	116	81	45.68%
All	1287	474

3. Method: Input sentences that correspond to each assertion. Score assertions using our system and submit to Vulcan's web interface.
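The "All" row of the training table has no performance figure. As a sanity check on the numbers, here is a small sketch that recomputes the MC-ND total and a question-weighted average of the per-grade figures. It assumes each percentage is plain accuracy over that grade's MC-ND questions, which the notes do not state, so the result is a derived estimate and not a number reported by Vulcan.

```python
# (grade, MC-ND question count, Vulcan performance on MC-ND in percent),
# copied from the training data table in these notes.
rows = [
    ("4th grade", 108, 55.09),
    ("8th grade", 125, 55.07),
    ("12th grade", 160, 25.83),
    ("AP", 81, 45.68),
]

total_questions = sum(count for _, count, _ in rows)

# Weight each grade's percentage by its share of the MC-ND questions.
weighted_performance = (
    sum(count * pct for _, count, pct in rows) / total_questions
)

print(f"MC-ND questions: {total_questions}")
print(f"Question-weighted performance: {weighted_performance:.2f}%")
```

Under that assumption the total comes to 474, matching the training set size, and the weighted figure is roughly 43.6%.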

@@ Line 1: / Line 1: @@
-== Update ==
+== Notes ==
-; System development [More details on the system architecture and status [[SystemStatus | here]].]
+* Greg will own evidence finder.
+<blockquote>
+* Walk through the system architecture with Greg.
+</blockquote>
-: 1. Online inference components implemented.
+* Axiomatic representations can be limiting. Figure out how to allow for entailment type matching within Tuffy.
 <blockquote>
-* Proposition generator<br/>
+* Store sentences with axioms.
-* Evidence finder -- Tuple matching over Open IE Clueweb data.<br/>
+* Figure out how to do procedural escapes in Tuffy.
-* MLN Inference -- A wrapper around Tuffy's MLN inferencer.
+* Mid next week figure out an estimate for when we will have a system that works on one or five examples.
 </blockquote>
+* Get a knowledge spec:
+<blockquote>
+* isa, partOf, etc.
+* Don't reimplement. Find resources.
+</blockquote>
-: 2. Offline components -- axioms and rule generation -- NOT implemented.
+* Stephen to take lead on definition extractor.
+<blockquote>
+* Send Stephen literature and other material for definition processing.
-; Experiments and Evaluation
+</blockquote>
+* What are the research problems?
-: 1. Framework: Vulcan has a good evaluation interface setup. We will use this for starters.<br/>
-: 2. Data: Training/Test splits set up by Vulcan. See details [[]]<br/>
-: 3. Method: Input sentences that correspond to each assertion. Score assertions using our system and submit to Vulcan's web interface.<br/>
-; Design questions.
-: 1. Why not use MLN directly? Why use a backward chained inferencer (such as Jena) as an intermediate step?
 <blockquote>
-* Looks like a separte backward-chained inferencer won't be necessary.<br/>
+* Definitions extractor
-* Tuffy, an MLN implementation, does KBMC to scale MLN inference. Details [http://hazy.cs.wisc.edu/hazy/papers/tuffy-vldb2011-slides.pdf|here]<br/>
+* Reading rules from text.
+* Abductive reasoning
+* Procedural escapes for textual matching.
 </blockquote>
-*
-; Analysis
-: 1. Selected 10 propositions that are single Open IE tuples as starting targets.
-: 2. Wrote down [http://homes.cs.washington.edu/~niranjan/vulcan/aug09/stepsinvolved.docx steps involved] in verifying these propositions.
 == Agenda ==
+* Update
+* System architecture
+* Plan for Greg
+<blockquote>
+* Processing text collections (definitions, study guide etc.) using Open IE and import into Solr.
+* Converting WordNet and CNC to Tuffy axiom format and import into Postgres.
+* Convert scored assertions into a format that is acceptable to Vulcan's evaluation framework.
+Long term plan: Greg will be responsible for inference (online) components and
+Niranjan will focus on the offline components (generating axioms and rules) and experimentation.<br/>
-== To Do (Copied over from previous week) ==
-; System building
-: 1. Implement "template matching" using the ClueWeb corpus.<b>Pending</b>
-<blockquote>
-* URL for Open IE backend is available.
-* For an assertion A, find sentences that have high overlap. Generate regex patterns for the proposition. Score sentences by how well they match the regex patterns.
 </blockquote>
+* Experiment/Evaluation plan
-: 2. Continue system building.
+== Update ==
+; System development ([[SystemStatus | Details on architecture and status]])
+: 1. Online inference components implemented.
 <blockquote>
-* Create a derivation scorer stub. This will be replaced with a MLN or a BLP scorer. <b>Done.</b>
+* Proposition generator -- Extract tuples from input sentence and convert into a proposition.<br/>
-* Test with iron nail example.
+* Evidence finder -- Tuple matching over Open IE Clueweb data.<br/>
+* MLN Inference -- A wrapper around Tuffy's MLN inferencer.
 </blockquote>
+: 2. Offline components -- axioms and rule generation -- NOT implemented.
-: 3. Jena API doesn't readily support multiple derivations.
+: 3. Planning to use Tuffy MLN Inference system directly.
 <blockquote>
-* Ask Jena community to find out if this is possible. <b>Done. Not possible.</b>
+<b>Why Tuffy and not Jena or another inference engine? Why not Alchemy?</b>
-* OWLIM as replacement. <b>Done. Doesn't look promising. No response from community.</b>
+* Inference engines such as Jena/OWLim don't directly support multiple inference paths. Community's response is to suggest Datalog/prolog implementations.
+* Tuffy supports MLN capabilities in Alchemy but is orders of magnitude faster (what takes 6 hours in Alchemy takes 2 minutes in Tuffy).
 </blockquote>
-:4. Try out [http://hazy.cs.wisc.edu/hazy/tuffy/ Tuffy MLN] implemenatation. <b>Done.</b>
+; Experiments and Evaluation
-<blockquote>
-* Use output of iron nail example
-* If easy to use write wrappers around Tuffy to hook into our system.
-</blockquote>
-:5 Write evaluation code. <b>Vulcan has a good interface set up.</b>
+Not ready to do evaluation yet but here are some useful details.
-<blockquote>
-* Check with Peter.
-</blockquote>
-:6. Create a [[Vulcan/SystemArchitecture| system architecture]] page with a figure and overview of the main components.
+: 1. Framework: Vulcan has a good evaluation interface setup. We will use this for starters. <b>([http://homes.cs.washington.edu/~niranjan/vulcan/example-results.html Example output from the evaluation framework.])</b><br/>
-<b>Created a [[Vulcan/SystemStatus| System status page]] instead.</b>
+: 2. Data: Training/Test splits set up by Vulcan. The questions cover 4-12th and AP exams. <br/>
 <blockquote>
-* Created a figure. Added it to system design document.
+Training = <b>474</b> questions.<br/>
-* Need to create a wiki page for system architecture and overview.
+Test = <b>290</b> questions.<br/>
-</blockquote>
-; Experiments <b>Pending</b>
+Training data distribution and Vulcan's current performance:
-: 1. Run template matching approach as a baseline.
+{| class="wikitable"
+|-
+!Grade !! All Questions !! #Mult.Choice and<br/> Non-diag. (MC-ND) !! Vulcan Performance<br/> on MC-ND
+|-
+|4th grade || 249 || 108 || 55.09%
+|-
+|8th grade || 476 || 125 || 55.07%
+|-
+| 12th grade || 446 || 160 || 25.83%
+|-
+| AP || 116 || 81 || 45.68%
+|-
+| All || 1287 || 474
+|-
+|}
-: 2. Run inference system as a baseline.
+<!--
+{| class="wikitable"
+|-
+!Grade (# Questions) !! All !! MC-Only !! Non-Diagrams-Only !! MC-Non-Diagrams-Only
+|-
+|4th grade (249) || 35.16% || 52.55% || 49.58% || 55.09%
+|-
+|8th grade (476) || 23.01% || 43.46% || 45.29% || 55.07%
+|-
+| 12th grade (446) || 17.06% || 31.29% || 14.11% || 25.83%
+|-
+| AP (116) || 22.55% || 41.92% || 30.58% || 45.68%
+|-
+| All (1287) || || || ||
+|-
+|}-->
+</blockquote>
+: 3. Method: Input sentences that correspond to each assertion. Score assertions using our system and submit to Vulcan's web interface.<br/>

