Vulcan/SystemPrototype
Overview
The prototype is designed to work on three questions. We want the system to output the following (a sketch of this output shape follows the list):
- Score for the input proposition.
- New facts inferred.
- Facts and rules used in scoring.
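A minimal sketch, in Python, of what the output for one proposition could look like. The class and field names are hypothetical; they do not describe an existing interface.

  from dataclasses import dataclass, field

  # Hypothetical container for the prototype's output on one input proposition.
  # All names are illustrative; they do not correspond to an existing API.
  @dataclass
  class PropositionResult:
      proposition: str      # the input proposition being scored
      score: float          # inference probability for the proposition
      inferred_facts: dict = field(default_factory=dict)  # new fact -> probability
      provenance: list = field(default_factory=list)       # facts and rules used in scoring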
Status
- Ran Tuffy on three example questions. It failed on one question.
- Hand-generated the input evidence for the propositions (one correct and one incorrect) for the three questions.
- Hand-generated the MLN rules based on Stephen's human-readable rules. The MLN rules can be found here.
- Ran Tuffy to obtain the inference probabilities on the propositions.
- The system also outputs:
- All inferred facts along with their probabilities.
- All rules that are reachable from the query fact, i.e., clauses in the MLN that are relevant to the inference of the query fact (a sketch of this reachability computation appears below).
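A minimal Python sketch of the "rules reachable from the query fact" diagnostic, assuming each rule is given as (name, weight, body predicates, head predicate). The rule base and predicate names below are made up for illustration; they are not the prototype's actual MLN.

  from collections import deque

  # Illustrative rules in (name, weight, body_predicates, head_predicate) form.
  EXAMPLE_RULES = [
      ("r1", 1.0, ["MadeOf", "IsMetal"], "ConductsElectricity"),
      ("r2", 0.5, ["OpenIETuple"], "MadeOf"),
      ("r3", 0.5, ["OpenIETuple"], "IsMetal"),
      ("r4", 2.0, ["HasShape"], "RollsDownhill"),
  ]

  def reachable_rules(query_predicate, rules):
      """Walk backwards from the query's predicate through rule heads to their bodies."""
      needed = deque([query_predicate])
      seen_predicates, used_rules = set(), []
      while needed:
          predicate = needed.popleft()
          if predicate in seen_predicates:
              continue
          seen_predicates.add(predicate)
          for rule in rules:
              name, weight, body, head = rule
              if head == predicate:
                  used_rules.append(rule)
                  needed.extend(body)
      return used_rules

  print(reachable_rules("ConductsElectricity", EXAMPLE_RULES))  # r1, r2, r3 but not r4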
- Does it work?
- Tuffy gets it right for 2 out of 3 questions, i.e., it assigns a higher probability to the correct proposition.
- Facts inferred through a larger number of steps have lower scores than those inferred through fewer steps.
- Why does it fail on the one question?
- The system fails on the iron nail example.
- Both "iron nail" and "plastic cup" get same high probability score (0.94). Assigning smaller weights on the tuple generation rules fixes this problem. Previously, these rules had a infinite weight..
- I need to better understand how the MLN scoring works in general and how Tuffy implements the inference.
- So why having
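To illustrate why (near-)infinite weights on the tuple-generation rules can make both candidates score identically, here is a toy exact enumeration of MLN marginals over two query atoms. The clauses and weights are invented for illustration, they are not the prototype's actual MLN, and Tuffy's MC-SAT inference is approximate rather than this brute-force enumeration.

  import itertools, math

  # Toy ground MLN over two query atoms. Each clause is (weight, satisfied_fn),
  # where satisfied_fn maps a world (dict of atom -> bool) to True/False.
  # These clauses and weights are made up; they are not the project's MLN rules.
  ATOMS = ["Conducts(iron_nail)", "Conducts(plastic_cup)"]

  def clauses(gen_weight):
      return [
          # "Tuple generation" clause for each candidate: pushes the atom true.
          (gen_weight, lambda w: w["Conducts(iron_nail)"]),
          (gen_weight, lambda w: w["Conducts(plastic_cup)"]),
          # Counter-evidence: a plastic cup should not conduct (soft, weight 2).
          (2.0, lambda w: not w["Conducts(plastic_cup)"]),
      ]

  def marginals(gen_weight):
      cls = clauses(gen_weight)
      worlds = [dict(zip(ATOMS, values))
                for values in itertools.product([False, True], repeat=len(ATOMS))]
      # MLN semantics: P(world) is proportional to exp(sum of weights of satisfied clauses).
      scores = [math.exp(sum(wt for wt, sat in cls if sat(w))) for w in worlds]
      z = sum(scores)
      return {a: sum(s for w, s in zip(worlds, scores) if w[a]) / z for a in ATOMS}

  print(marginals(gen_weight=20.0))  # near-hard rule: both atoms ~1.0, indistinguishable
  print(marginals(gen_weight=1.0))   # softer rule: roughly 0.73 vs. 0.27

With a weight of 20 the generation clauses dominate and both marginals saturate near 1.0; with a weight of 1 the counter-evidence clause separates the two candidates.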
- What diagnostics do we NOT have?
- Connections between the clauses in the MLN.
- A reconstruction/visualization of the MLN network. Working with Tuffy developers on this.
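A sketch of the missing "connections between the clauses" diagnostic: treat two clauses as connected whenever they mention at least one predicate in common. The rules reuse the illustrative (name, weight, body predicates, head predicate) shape from the reachability sketch above.

  from itertools import combinations

  # Illustrative rules; not the prototype's actual MLN.
  EXAMPLE_RULES = [
      ("r1", 1.0, ["MadeOf", "IsMetal"], "ConductsElectricity"),
      ("r2", 0.5, ["OpenIETuple"], "MadeOf"),
      ("r3", 0.5, ["OpenIETuple"], "IsMetal"),
      ("r4", 2.0, ["HasShape"], "RollsDownhill"),
  ]

  def clause_connections(rules):
      """Return an edge list: two clauses are connected if they share a predicate."""
      edges = []
      for (n1, _, b1, h1), (n2, _, b2, h2) in combinations(rules, 2):
          if set(b1 + [h1]) & set(b2 + [h2]):
              edges.append((n1, n2))
      return edges

  print(clause_connections(EXAMPLE_RULES))  # [('r1', 'r2'), ('r1', 'r3'), ('r2', 'r3')]; r4 is isolated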
- What next?
- Figure out why iron nail fails.
- Use automatically found evidence, i.e., actual Open IE tuples instead of hand-generated evidence (see the conversion sketch after this list).
- Figure out what mismatches exist between required and available knowledge.
- Improve diagnostics output.
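A sketch of the "automatically found evidence" step: turn one Open IE triple into a ground evidence atom. The output format here (a confidence prefix followed by Pred("arg1", "arg2")) is only an assumption about what the evidence file could look like, not a confirmed Tuffy format.

  import re

  def tuple_to_evidence(arg1, rel, arg2, confidence):
      """Turn one Open IE triple into a ground evidence atom string (assumed format)."""
      def constant(text):
          # Constants: lowercase, runs of non-alphanumerics collapsed to underscores.
          return re.sub(r"\W+", "_", text.strip().lower()).strip("_")

      predicate = "".join(word.capitalize() for word in rel.split())
      return f'{confidence} {predicate}("{constant(arg1)}", "{constant(arg2)}")'

  # Example extraction: ("an iron nail", "is made of", "iron") with confidence 0.87
  print(tuple_to_evidence("an iron nail", "is made of", "iron", 0.87))
  # -> 0.87 IsMadeOf("an_iron_nail", "iron")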
- What does this exercise suggest?
- Need to figure out how the weights on the MLN rules and evidence are used. [I assigned them arbitrarily for this round.]
- Use predicates with small arity. For example, avoid writing rules that use entire nested tuples as predicates.
- The only reason we'd need a nested tuple is to compute its score. For now we can compute this from the scores of its components: Score(nested_tuple) = Score(top tuple) * Score(nested tuple).
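A minimal sketch of this score composition, assuming component scores are probabilities in [0, 1]:

  def score_nested_tuple(top_score, nested_score):
      """Score(nested_tuple) = Score(top tuple) * Score(nested tuple)."""
      return top_score * nested_score

  # Example: top tuple scored 0.9, nested tuple scored 0.8 -> combined score 0.72.
  print(score_nested_tuple(0.9, 0.8))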