Vulcan/SystemTarget

From Knowitall
Revision as of 05:20, 22 August 2013 by Niranjan (talk | contribs) (System Components)


This page describes the features to target for the first implementation. The goal for this iteration is a prototype that does a complete walk-through on one or two example questions.

I/O

Input
Natural language sentences as input.
The system will handle propositions that correspond to three questions.

Q1: X is the best conductor of electricity. X = {Iron nail, wax crayon, plastic cup, rubber boat}
Q2: X causes leaves of a plant to become larger. X = {Growth, A repair, Germination, Decomposition}
Q3: X helps a fox find food. X = {A sense of smell, ...}

Output
Marginal inference probability for each proposition according to Tuffy.
Set of rules and axioms used (human-readable and MLN format). Note that these will be hand-generated in this iteration.
Debugging output that shows the set of axioms, rules and new facts used in the inference. Preferably in a graphical format.
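
As an illustration of the first output item, here is a minimal sketch of reading marginal probabilities back from a results file. It assumes each result line has the form "probability, whitespace, ground atom"; that layout is an assumption to check against the Tuffy version in use, and the function name `parse_marginals`, the predicate name `tuple`, and the sample numbers are made up for the example.

```python
def parse_marginals(text):
    """Map each ground atom to its marginal probability.

    Assumes one result per line: a number, whitespace, then the atom.
    Lines that do not fit this shape are skipped.
    """
    marginals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue  # skip blanks and comment lines
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        try:
            probability = float(parts[0])
        except ValueError:
            continue  # first field is not a number
        marginals[parts[1]] = probability
    return marginals


def best_proposition(marginals):
    """Return the atom with the highest marginal probability."""
    return max(marginals, key=marginals.get)
```

With one line per candidate answer X, `best_proposition` picks the proposition the system considers most likely.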

Method

Steps performed in verifying the input sentence.
1. Create query tuple: Run Open IE 4.0 on each input sentence. Extract the best tuple to use as the query tuple for MLN. 
   Use the one that covers the sentence best. If there is a tie, use the tuple with the longest relation phrase.

2. Search for evidence: Find tuples that have some keyword overlap with the input sentence. 
   Translate these tuples into Tuffy's evidence format. 

3. Run Tuffy's MLN inference procedure for obtaining the marginal probability of the query tuple.

4. Parse Tuffy's output to extract the final score and find the set of rules, axioms and newly inferred facts used in the inference. 
   This would require some understanding of the database schemas that Tuffy uses. 
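
Steps 1 and 2 can be sketched as follows. This is a rough illustration, not the actual implementation: the `Extraction` class, the token-overlap definition of "coverage", measuring relation length in words, and the predicate name `tuple` are all assumptions introduced for the example. It does not call Open IE; it only ranks tuples that have already been extracted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    """Hypothetical container for one (arg1, relation, arg2) tuple."""
    arg1: str
    rel: str
    arg2: str

    def tokens(self):
        return set(f"{self.arg1} {self.rel} {self.arg2}".lower().split())


def coverage(sentence, extraction):
    """Fraction of the sentence's distinct tokens that appear in the tuple."""
    sentence_tokens = set(sentence.lower().rstrip(".").split())
    if not sentence_tokens:
        return 0.0
    return len(sentence_tokens & extraction.tokens()) / len(sentence_tokens)


def pick_query_tuple(sentence, extractions):
    """Best coverage first; ties go to the longest relation phrase (in words)."""
    return max(
        extractions,
        key=lambda e: (coverage(sentence, e), len(e.rel.split())),
    )


def to_evidence_atom(extraction, predicate="tuple"):
    """Render a tuple as a ground atom with double-quoted string constants."""
    def quote(text):
        # Replace embedded double quotes so the constant stays well-formed.
        return '"' + text.replace('"', "'") + '"'

    fields = (extraction.arg1, extraction.rel, extraction.arg2)
    return f"{predicate}({', '.join(quote(f) for f in fields)})"
```

The same `to_evidence_atom` helper can be applied to every tuple returned by the evidence search to build the evidence file, one atom per line.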


Setting up axioms and rules.
1. Hand generate the set of rules in a human-readable format.
   Convert these rules into the Tuffy MLN format. 

2. Process the study guide sentences using Open IE 4.0. Index the resulting tuples in Solr for easy search. 

3. Use the existing Open IE Solr instance for finding sentences from ClueWeb. 

4. Write a converter that exports WordNet synonyms into Tuffy's predicate format.
   I am exploring how to get Tuffy to work with externally defined procedures as predicates. This materialization
   is a stop-gap arrangement.

5. Come up with a short-cut for dealing with stemming. One solution is to restrict the stem predicate table to the set of words 
   in the evidence tuples and the query tuple.
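
Items 4 and 5 can be sketched as follows. To stay self-contained the example takes synonym sets as plain lists rather than reading WordNet, and it uses a toy stemmer as a stand-in for a real one; the predicate names `synonym` and `stem` and all function names are assumptions made for illustration.

```python
def synonym_atoms(synsets, predicate="synonym"):
    """Emit one ground atom per ordered pair of distinct words in each synset."""
    atoms = []
    for synset in synsets:
        words = sorted(set(synset))
        for first in words:
            for second in words:
                if first != second:
                    atoms.append(f'{predicate}("{first}", "{second}")')
    return atoms


def restricted_stem_atoms(tuples, stem, predicate="stem"):
    """Build stem atoms only for words that occur in the given tuples.

    `tuples` is an iterable of (arg1, rel, arg2) string triples covering the
    evidence tuples and the query tuple; `stem` is any word -> stem function.
    """
    vocabulary = sorted(
        {word.lower() for t in tuples for field in t for word in field.split()}
    )
    return [f'{predicate}("{word}", "{stem(word)}")' for word in vocabulary]


def toy_stem(word):
    """Stand-in stemmer for the example: drops a trailing 's' on longer words."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word
```

Restricting the vocabulary this way keeps the stem table proportional to the evidence actually retrieved for a question instead of the whole lexicon.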
 

Details of the human-readable rule representation can be found here. Details about transforming these into MLN rule format can be found here.

System Components

The system will have the following components implemented.

1. Proposition Generator
Convert input sentence into a tuple.
2. Rules converter
Convert hand-generated rules to Tuffy's MLN format rules.
3. Tuffy Wrapper
Run the Tuffy program, collect its output, and parse it for debugging.
4. Axioms generator
Convert WordNet synonyms into Tuffy's evidence format.
5. Tuples searcher

Greg's working on this. Will know status tomorrow.
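
For the Tuffy wrapper (component 3), a minimal sketch of the invocation is shown below. The flags follow the usage shown in the Tuffy documentation as I understand it (`-i` program, `-e` evidence, `-queryFile` query, `-r` output, `-marginal` for marginal inference) and should be verified against the installed version; the jar location, file names, and function names are placeholders.

```python
import subprocess


def tuffy_command(program, evidence, query, output, jar="tuffy.jar"):
    """Build the argument list for a marginal-inference run."""
    return [
        "java", "-jar", jar,
        "-i", program,        # MLN rules
        "-e", evidence,       # evidence atoms
        "-queryFile", query,  # query atoms
        "-r", output,         # where results are written
        "-marginal",          # marginal probabilities instead of MAP
    ]


def run_tuffy(program, evidence, query, output, jar="tuffy.jar"):
    """Run Tuffy and return its console output for debugging."""
    completed = subprocess.run(
        tuffy_command(program, evidence, query, output, jar),
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

Keeping command construction separate from execution makes the wrapper easy to test without a Tuffy installation.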

When will this system be ready?

Expect to have a running version on Aug 27th.


  • Most of these components (1-3) are already implemented.
  • I am developing the rule converter and working with Tuffy for now. Among the alternatives -- RockIt (Mathias Niepert), theBeast (Sebastian Riedel), and Alchemy -- RockIt is the one that I prefer, because it has more convenient notation for representing text-based constants.
  • Greg will build up the tuple search component and process the study guide.