Vulcan/SystemPrototype

Revision as of 20:40, 27 August 2013

== Overview ==

The prototype is designed to work on three questions. We want the system to output the following:

* Score for the input proposition.
* New facts inferred.
* Facts and rules used in scoring.

== Status ==
Sample MLN programs and output from Tuffy can be found [[Vulcan/SystemPrototype/SampleIO| here]].

* Ran Tuffy on three example questions.
* <b>Input:</b> Evidence relating to the propositions (sketched below).
* <b>Knowledge:</b> Worked out the facts and rules required.
* <b>Output:</b> The system outputs a score for each query predicate; if a query does not appear in the output, its score is zero (see the sample output after the sketch below).
* Hand generated the input evidence for the propositions (one correct and one incorrect) for each of the three questions.
* Hand generated the MLN rules (adapted from Stephen's rules).
* Ran Tuffy to obtain the inference probabilities on the propositions.
* Tuffy gets it right on 2 of the 3 questions, i.e., it assigns the higher probability to the correct proposition.
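For concreteness, here is a minimal, purely illustrative sketch of the kind of input this involves. The predicate names, constants, and weights below are made up; the actual programs are in the sample I/O linked above. Tuffy takes an Alchemy-style MLN program, an evidence file of ground atoms, and a query file, and a typical invocation is along the lines of <code>java -jar tuffy.jar -i prog.mln -e evidence.db -queryFile query.db -r out.txt -marginal</code>.

<pre>
// prog.mln (illustrative): predicate declarations, then weighted rules.
// Variables are lowercase; constants start with an uppercase letter.
MadeOf(obj, mat)
ConductiveMaterial(mat)
Conducts(obj)

// Objects made of a conductive material tend to conduct electricity.
2  MadeOf(o, m) ^ ConductiveMaterial(m) => Conducts(o)
</pre>

<pre>
// evidence.db (illustrative): ground atoms taken as observed facts.
MadeOf(IronNail, Iron)
MadeOf(PlasticCup, Plastic)
ConductiveMaterial(Iron)
</pre>

<pre>
// query.db (illustrative): atoms whose marginal probabilities we want.
Conducts(IronNail)
Conducts(PlasticCup)
</pre>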
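With <code>-marginal</code>, Tuffy runs marginal inference (MC-SAT) and writes one query atom per line with its estimated probability. The numbers below are illustrative, not actual output; as noted above, a query atom that does not appear in the result file has score zero.

<pre>
0.93  Conducts(IronNail)
0.18  Conducts(PlasticCup)
</pre>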
The MLN rules can be found [[Vulcan/SystemPrototype/SampleIO| here]].
* <b>Why does the iron nail example not work?</b>
<blockquote>
* Both "iron nail" and "plastic cup" get similar weights (iron nail's is slightly higher). I don't yet understand the scoring well enough to explain this. Will dig in when I come back.
</blockquote>
  
 
* <b>How do you know that it works?</b>
<blockquote>
* In all three examples, the correct answer is assigned a higher score than the incorrect one.
* Facts inferred through a larger number of steps receive a lower score than facts inferred through fewer steps (see the sketch below).
</blockquote>
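The step-count effect can be seen on a hypothetical two-rule chain (made-up predicates and weights again): with evidence A(X) and queries B(X) and C(X), the conclusion that sits two soft rules away from the evidence typically receives the lower marginal probability, because each soft rule attenuates the score.

<pre>
// Illustrative chain: C(X) is two inference steps from the evidence A(X).
A(thing)
B(thing)
C(thing)

2  A(x) => B(x)   // one step away:  B(X) scores relatively high
2  B(x) => C(x)   // two steps away: C(X) scores lower than B(X)
</pre>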
* <b>What other diagnostics do we have?</b>
<blockquote>
* Inferred facts along with their probabilities.
* Rules that are reachable from the query fact, i.e., the clauses in the MLN that are relevant to the inference of the query fact.
</blockquote>

* <b>What diagnostics do we NOT have?</b>
<blockquote>
* Connections between the clauses in the MLN.
* A reconstruction/visualization of the MLN network.
</blockquote>
* <b>What does this exercise suggest?</b>
<blockquote>
* Use predicates with small arity. For example, avoid writing rules with entire nested tuples as predicates.
* The only reason we'd need a nested tuple is for computing its score. For now we can compute this from the scores of its components: Score(nested_tuple) = Score(top tuple) * Score(nested).
</blockquote>
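As a worked instance of this composition, with illustrative scores rather than actual Tuffy output: <math>\mathrm{Score}(\text{nested tuple}) = \mathrm{Score}(\text{top tuple}) \times \mathrm{Score}(\text{nested}) = 0.9 \times 0.8 = 0.72</math>.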