Difference between revisions of "Vulcan/SystemPrototype"
From Knowitall
Revision as of 20:40, 27 August 2013
Overview
The prototype is designed to work on three questions. We want the system to output the following:
- Score for the input proposition.
- New facts inferred.
- Facts and rules used in scoring.
Status
- Ran Tuffy on three example questions.
  - Hand-generated the input evidence for the propositions (one correct and one incorrect) for each of the three questions.
  - Hand-generated the MLN rules (adapted from Stephen's rules).
  - Ran Tuffy to obtain the inference probabilities for the propositions.
  - Tuffy gets 2 of the 3 questions right, i.e., it assigns a higher probability to the correct proposition.
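The comparison in the last step can be sketched in a few lines of Python. This is only a minimal sketch, assuming the inferred probabilities have already been read out of Tuffy's output into a dictionary. The helper `count_correct`, the question labels, and the probability values are hypothetical placeholders, not the actual results.

```python
from typing import Dict, Tuple


def count_correct(probs: Dict[str, Tuple[float, float]]) -> int:
    """Count questions where the correct proposition gets the strictly higher probability.

    probs maps a question label to (p_correct, p_incorrect).
    """
    return sum(
        1 for p_correct, p_incorrect in probs.values() if p_correct > p_incorrect
    )


# Placeholder values for illustration only (assumption);
# these are NOT the probabilities Tuffy actually produced.
example = {
    "q1": (0.80, 0.30),
    "q2": (0.65, 0.40),
    "q3": (0.48, 0.52),
}

if __name__ == "__main__":
    n = count_correct(example)
    print(f"{n}/{len(example)} questions answered correctly")
```

A tie is counted as a miss here, which is a design choice of this sketch rather than something the notes specify.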
The MLN rules can be found on the Vulcan/SystemPrototype/SampleIO page.
- Why does the iron nail example not work?
  - Both "iron nail" and "plastic cup" get similar weights (iron nail is slightly higher). I don't yet understand the scoring well enough to explain this. Will dig in when I come back.
- How do you know that it works?
  - In all three examples the correct answer is assigned a higher score than the incorrect ones.
  - Facts inferred through a larger number of steps receive a lower score than those inferred through fewer steps.
- What other diagnostics do we have?
  - Inferred facts along with their probabilities.
  - Rules that are reachable from the query fact, i.e., the clauses in the MLN that are relevant to inferring the query fact.
- What diagnostics do we NOT have?
  - Connections between the clauses in the MLN.
  - A reconstruction/visualization of the MLN network.
- What does this exercise suggest?
  - Use predicates with small arity. For example, avoid writing rules that use entire nested tuples as predicates.
  - The only reason we'd need a nested tuple is to compute the score. For now we can compute this from the scores of its components: Score(nested_tuple) = Score(top_tuple) * Score(nested).
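The composition rule above can be illustrated with a short Python sketch. It assumes a nested tuple is represented as a top-level tuple carrying its own score plus an optional nested tuple. The `ScoredTuple` class, the `nested_score` function, and the example scores are hypothetical and not part of the prototype.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredTuple:
    """A tuple with its own score and, optionally, one nested tuple (hypothetical type)."""
    text: str
    score: float
    nested: Optional["ScoredTuple"] = None


def nested_score(t: ScoredTuple) -> float:
    """Score(nested_tuple) = Score(top_tuple) * Score(nested).

    Applying the rule recursively for deeper nesting is an assumption
    of this sketch; the notes only state the one-level case.
    """
    if t.nested is None:
        return t.score
    return t.score * nested_score(t.nested)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    inner = ScoredTuple(text="nested", score=0.8)
    outer = ScoredTuple(text="top", score=0.9, nested=inner)
    print(nested_score(outer))
```

If the component scores lie in [0, 1], the combined score can never exceed either component, so a low-confidence nested tuple pulls the whole score down.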