Vulcan/SystemPrototype
From Knowitall
Revision as of 19:43, 5 September 2013
Overview
The prototype is designed to work on three questions. We want the system to output the following:
- Score for the input proposition.
- New facts inferred.
- Facts and rules used in scoring.
Status
- Ran Tuffy on three example questions. It failed on one question.
- Hand-generated the input evidence for the propositions (one correct and one incorrect) for each of the three questions.
- Hand-generated the MLN rules based on Stephen's human-readable rules. The MLN rules can be found here.
- Ran Tuffy to obtain the inference probabilities on the propositions.
- The system also outputs:
- All inferred facts along with their probabilities.
- All rules that are reachable from the query fact, i.e., clauses in the MLN that are relevant to the inference of the query fact.
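For reference, Tuffy consumes Alchemy-style MLN programs: predicate declarations followed by weighted first-order clauses. A minimal illustrative fragment (the standard "smokers" toy example with made-up weights — not the actual Vulcan rules; exact syntax is per the Tuffy/Alchemy manuals):

```
// Predicate declarations (lowercase identifiers in clauses are variables)
Smokes(person)
Cancer(person)
Friends(person, person)

// Weighted clauses: a finite weight makes the rule "soft"
1.5  !Smokes(x) v Cancer(x)                   // Smokes(x) => Cancer(x)
1.1  !Friends(x, y) v !Smokes(x) v Smokes(y)  // friends influence smoking
```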
- Does it work?
- Tuffy gets it right for <strike>2 out of</strike> all 3 questions, i.e., it assigns a higher probability to the correct proposition.
- For the most part, facts inferred through a larger number of steps have lower scores than those inferred through a smaller number of steps. This depends on the weights on the inference rules.
- I need to better understand how the MLN scoring works in general and how Tuffy implements the inference.
- Tuffy's output probabilities are non-deterministic, making it harder to debug. Need a fix for this.
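The scoring behavior noted above can be made concrete with a tiny example. In an MLN, a possible world x has probability P(x) = exp(Σ_i w_i n_i(x)) / Z, where n_i(x) counts the satisfied groundings of rule i and w_i is its weight. The sketch below (hypothetical rules and weights, not the Vulcan MLN) enumerates all worlds for a two-step chain A => B => C with evidence A, and shows the fact inferred in two steps getting a lower marginal probability:

```python
import itertools
import math

# Toy MLN with evidence A = True and two soft inference rules
# (hypothetical weights, not the actual Vulcan rules):
#   w1: A => B    (one-step inference)
#   w2: B => C    (second step in the chain)
w1, w2 = 3.0, 1.0

def world_weight(B, C, A=True):
    """exp(sum of the weights of the satisfied ground formulas)."""
    total = 0.0
    if (not A) or B:   # A => B is satisfied
        total += w1
    if (not B) or C:   # B => C is satisfied
        total += w2
    return math.exp(total)

worlds = list(itertools.product([False, True], repeat=2))
Z = sum(world_weight(B, C) for B, C in worlds)  # partition function
p_B = sum(world_weight(B, C) for B, C in worlds if B) / Z
p_C = sum(world_weight(B, C) for B, C in worlds if C) / Z

print(f"P(B) = {p_B:.3f}  (inferred in one step)")
print(f"P(C) = {p_C:.3f}  (inferred in two steps)")
```

Note that with w1 = w2 the two marginals come out equal in this toy model, so the "longer chains score lower" pattern really is a function of the rule weights, consistent with the observation above.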
- What diagnostics do we NOT have?
- Connections between the clauses in the MLN.
- A reconstruction/visualization of the MLN network. Working with Tuffy developers on this.
- What next?
- <strike>Figure out why iron nail fails.</strike>
- Use automatically found evidence, i.e., actual Open IE tuples instead of hand-generated evidence.
- Figure out what mismatches exist between required and available knowledge.
- Improve diagnostics output.
- What does this exercise suggest?
- Need to figure out how the weights on the MLN rules and evidence are used. [I assigned them arbitrarily for this round.]
- Use predicates with small arity. For example, avoid writing rules with entire nested tuples as predicates.
- The only reason we'd need a nested tuple is for the purpose of computing the score. For now we can compute this from the scores of its components: Score(nested_tuple) = Score(top tuple) * Score(nested).
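The composition rule above can be written down directly; a minimal sketch with made-up component scores (the function name and inputs are hypothetical):

```python
def score_nested_tuple(score_top: float, score_nested: float) -> float:
    """Score(nested_tuple) = Score(top tuple) * Score(nested), per the rule above."""
    return score_top * score_nested

# Made-up component scores, e.g. marginal probabilities from Tuffy
s = score_nested_tuple(0.9, 0.8)
print(f"{s:.2f}")  # 0.72
```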