Vulcan/SystemTarget

Latest revision as of 17:32, 23 August 2013

This page describes the features to target for the first implementation. The goal for this iteration is a prototype that does a complete walk-through on one or two example questions.

I/O

Input
Natural language sentences.
The system will handle propositions that correspond to three questions.

Q1: X is the best conductor of electricity. X = {Iron nail, wax crayon, plastic cup, rubber boat}
Q2: X causes leaves of a plant to become larger. X = {Growth, A repair, Germination, Decomposition}
Q3: X helps a fox find food. X = {A sense of smell, ...}
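
One way to read these questions is as a template with a slot X, where each answer option yields one proposition for the system to score. The following is a minimal sketch under that assumption, using plain string substitution. The function name `expand_question` is hypothetical and not part of the system described here.

```python
def expand_question(template, options):
    """Substitute each answer option for the leading slot X in the template."""
    # Only the leading slot marker is replaced, so an "X" inside a word is untouched.
    if not template.startswith("X "):
        raise ValueError("template must start with the slot 'X '")
    return [option + template[1:] for option in options]


q1 = expand_question(
    "X is the best conductor of electricity.",
    ["Iron nail", "wax crayon", "plastic cup", "rubber boat"],
)
for sentence in q1:
    print(sentence)
```

Each resulting sentence would then be handed to the pipeline as one input proposition.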

Knowledge
Hand-generated inference rules -- Small set that is necessary to answer the three questions.
Axioms -- Hand-selected from existing resources.

Output
Marginal inference probability for each proposition according to Tuffy.
Debugging output that shows the set of axioms, rules, and new facts used in the inference, preferably in a graphical format.

Method

Steps performed in verifying the input sentence.
1. Create query tuple: Run Open IE 4.0 on each input sentence and extract the best tuple to use as the query tuple for the MLN.
   The best tuple is the one that covers the sentence best. If there is a tie, use the tuple with the longest relation phrase.
   [NOTE: We will only deal with single-tuple propositions for now. The query tuple can be n-ary or nested.]

2. Search for evidence: Find tuples that have some keyword overlap with the input sentence. We will use synonyms to broaden this match further.
   Translate these tuples into Tuffy's evidence format. 

3. Run Tuffy's MLN inference procedure to obtain the marginal probability of the query tuple.

4. Parse Tuffy's output to extract the final score and find the set of rules, axioms, and newly inferred facts used in the inference.
   This will require some understanding of the database schemas that Tuffy uses.
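
The selection rule in step 1 and the keyword match in step 2 can be sketched as follows. This is an illustration only: the `Extraction` class is a simplified binary tuple (the real query tuple can be n-ary or nested), "coverage" is assumed here to mean the fraction of distinct sentence tokens that appear in the tuple, and the stopword list is a placeholder. None of these names come from Open IE 4.0.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    """A simplified binary tuple; real Open IE 4.0 output is richer."""
    arg1: str
    rel: str
    arg2: str

    def tokens(self):
        return (self.arg1 + " " + self.rel + " " + self.arg2).lower().split()


def sentence_tokens(sentence):
    """Distinct lower-cased tokens, ignoring the final period."""
    return set(sentence.lower().rstrip(".").split())


def coverage(extraction, sentence):
    """Fraction of distinct sentence tokens that appear in the extraction."""
    tokens = sentence_tokens(sentence)
    if not tokens:
        return 0.0
    return len(tokens & set(extraction.tokens())) / len(tokens)


def best_tuple(extractions, sentence):
    """Highest coverage wins; ties go to the longest relation phrase."""
    return max(
        extractions,
        key=lambda e: (coverage(e, sentence), len(e.rel.split())),
    )


def keyword_overlap(candidate, sentence, stopwords=frozenset({"a", "the", "of", "is"})):
    """Number of non-stopword sentence tokens shared with a candidate tuple."""
    return len((sentence_tokens(sentence) - stopwords) & set(candidate.tokens()))


sentence = "A sense of smell helps a fox find food."
candidates = [
    Extraction("A sense of smell", "helps", "a fox"),
    Extraction("A sense of smell", "helps a fox find", "food"),
    Extraction("a fox", "find", "food"),
]
query = best_tuple(candidates, sentence)
print(query.rel)
```

Encoding the tie-break as the second element of the sort key keeps the rule in one place; synonym expansion for step 2 would extend `keyword_overlap` and is not shown.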


Setting up axioms and rules.
1. Hand generate the set of rules in a human-readable format.
   Convert these rules into the Tuffy MLN format. 

2. Process the study guide sentences using Open IE 4.0. Index the extractions in Solr for easy search.

3. Use the existing Open IE Solr instance for finding sentences from ClueWeb.

4. Write a converter that exports WordNet synonyms into Tuffy's predicate format.
   I am exploring how to get Tuffy to work with externally defined procedures as predicates. Materializing the synonyms this way
   is a stop-gap arrangement.

5. Come up with a shortcut for dealing with stemming. One solution is to restrict the stem predicate table to the set of words
   in the evidence tuples and the query tuple.
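
Steps 4 and 5 both amount to materializing a predicate table restricted to the words that actually occur in the query and evidence tuples. The sketch below illustrates that idea under several assumptions: the predicate names `stem` and `synonym` are hypothetical, the quoted-constant atom syntax should be checked against the Tuffy version in use, `toy_stem` is a placeholder rather than a real stemmer, and the hand-written dictionary stands in for a WordNet lookup.

```python
def vocabulary(tuples):
    """Collect the distinct lower-cased words in the query and evidence tuples."""
    words = set()
    for fields in tuples:
        for field in fields:
            words.update(field.lower().split())
    return words


def toy_stem(word):
    """Placeholder stemmer: strips a few common suffixes. Not a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def stem_atoms(words):
    """One stem(word, stem) atom per vocabulary word, in sorted order."""
    return ['stem("%s", "%s")' % (w, toy_stem(w)) for w in sorted(words)]


def synonym_atoms(words, synonyms):
    """Emit synonym pairs only for words that occur in the vocabulary."""
    atoms = []
    for word in sorted(words):
        for other in sorted(synonyms.get(word, ())):
            atoms.append('synonym("%s", "%s")' % (word, other))
    return atoms


tuples = [("a sense of smell", "helps", "foxes"), ("foxes", "find", "food")]
hand_synonyms = {"find": {"locate"}, "food": {"nutrient"}}  # stand-in for WordNet
words = vocabulary(tuples)
for atom in stem_atoms(words) + synonym_atoms(words, hand_synonyms):
    print(atom)
```

Restricting both tables to the vocabulary keeps the evidence file small, which is the point of the shortcut in step 5.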
 

Details of the human-readable rule representation can be found on the Vulcan/TextualMatching page. Details about transforming these into MLN rule format can be found on the RuleTransformation page.

System Components

The system will have the following components implemented.

1. Proposition Generator
Convert input sentence into a tuple.
2. Rules converter
Convert hand-generated rules to Tuffy's MLN format rules.
3. Tuffy Wrapper
Run the Tuffy program, collect its output, and parse it for debugging.
4. Axioms generator
Convert WordNet synonyms into Tuffy's evidence format.
5. Tuples searcher
Greg is working on this.
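
As an illustration of component 3, a wrapper might build the Tuffy command line, run it, and parse the marginals from the result file. This is a sketch, not the implemented wrapper. The flags (`-i`, `-e`, `-queryFile`, `-r`, `-marginal`) follow Tuffy's documented command-line usage as I understand it, and the one-line-per-atom output format (probability, then the atom) is an assumption; both should be verified against the installed version. File names and the sample text are made up. Recovering the rules and axioms used in the inference is not covered here.

```python
import subprocess


def tuffy_command(program, evidence, query_file, result, jar="tuffy.jar"):
    """Build the argument list for a marginal-inference run."""
    return [
        "java", "-jar", jar,
        "-i", program,
        "-e", evidence,
        "-queryFile", query_file,
        "-r", result,
        "-marginal",
    ]


def parse_marginals(text):
    """Parse lines assumed to look like '<probability><whitespace><atom>'."""
    scores = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and comments
        prob, atom = line.split(None, 1)
        scores[atom] = float(prob)
    return scores


def run_tuffy(program, evidence, query_file, result):
    """Run Tuffy and return the parsed marginals (requires a Tuffy install)."""
    subprocess.run(tuffy_command(program, evidence, query_file, result), check=True)
    with open(result) as handle:
        return parse_marginals(handle.read())


# Made-up sample text for illustration; not real Tuffy output.
sample = '0.8700\thelps("smell", "fox")\n0.1200\thelps("growth", "fox")\n'
print(parse_marginals(sample))
```

Keeping command construction and output parsing as separate functions lets both be tested without Tuffy or a database available.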

When will this system be ready?

Expect to have a running version on Aug 27th.

  • Some of these components (1 and 3) are already implemented.
  • I am developing the rule converter and working with Tuffy for now. However, among the alternatives -- RockIt (Mathias Niepert), theBeast (Sebastian Riedel), and Alchemy -- RockIt is the one that I prefer, because it has a more convenient notation for representing text-based constants.
  • Greg will build up the tuple search component and process the study guide.