Vulcan/SystemPrototype

Revision as of 19:43, 5 September 2013

Overview

The prototype is designed to work on three questions. We want the system to output the following:

  • Score for the input proposition.
  • New facts inferred.
  • Facts and rules used in scoring.
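
For concreteness, a minimal sketch of what a per-proposition output record could look like, written in Python. All class and field names here are hypothetical, not taken from the actual system:

 from dataclasses import dataclass, field

 @dataclass
 class PropositionResult:
     proposition: str    # the input proposition being scored
     score: float        # inference probability for the proposition
     new_facts: dict = field(default_factory=dict)  # inferred fact -> probability
     support: list = field(default_factory=list)    # facts and rules used in scoring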

Status

  • Ran Tuffy on the three example questions. It failed on one question.
  • Hand-generated the input evidence for the propositions (one correct and one incorrect) for each of the three questions.
  • Hand-generated the MLN rules based on Stephen's human-readable rules. The MLN rules can be found here.
  • Ran Tuffy to obtain the inference probabilities on the propositions (a sketch of the invocation appears below).
  • The system also outputs:
      • All inferred facts along with their probabilities.
      • All rules that are reachable from the query fact, i.e., the clauses in the MLN that are relevant to the inference of the query fact.
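
For reference, a sketch of how the Tuffy runs above are typically invoked, wrapped in Python. The flags (-i, -e, -queryFile, -r, -marginal) follow the Tuffy documentation, but the file names are placeholders rather than this project's actual files:

 import subprocess

 # Marginal (MC-SAT) inference over the hand-generated evidence.
 subprocess.run(
     ["java", "-jar", "tuffy.jar",
      "-i", "prog.mln",           # predicate declarations + weighted rules
      "-e", "evidence.db",        # hand-generated evidence atoms
      "-queryFile", "query.db",   # the propositions to score
      "-r", "out.txt",            # marginal probabilities are written here
      "-marginal"],               # marginal inference instead of MAP
     check=True,
 )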
  • Does it work?
      • Tuffy gets it right for all 3 questions (initially only 2 of 3), i.e., it assigns a higher probability to the correct proposition.
      • For the most part, facts inferred through a larger number of steps have a lower score than those inferred through a smaller number of steps. This depends on the weights on the inference rules (see the worked example below).
      • I need to better understand how MLN scoring works in general and how Tuffy implements the inference.
      • Tuffy's output probabilities are non-deterministic, which makes debugging harder. Need a fix for this.
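
Since MLN scoring is the open question above, here is a self-contained toy computation of exact MLN marginals by brute-force world enumeration: two chained rules, A => B and B => C, with A given as evidence. The weights and atom names are invented for illustration, and Tuffy approximates these marginals with MC-SAT rather than by enumeration:

 from itertools import product
 from math import exp

 W1, W2 = 3.0, 1.0  # arbitrary weights on A => B and B => C

 def world_weight(a, b, c):
     """exp(sum of weights of the satisfied ground clauses)."""
     s = W1 * ((not a) or b)   # A => B
     s += W2 * ((not b) or c)  # B => C
     return exp(s)

 # A is evidence (true); enumerate the four possible worlds over (B, C).
 worlds = [(True, b, c) for b, c in product([False, True], repeat=2)]
 z = sum(world_weight(*w) for w in worlds)

 p_b = sum(world_weight(*w) for w in worlds if w[1]) / z  # one-step fact B
 p_c = sum(world_weight(*w) for w in worlds if w[2]) / z  # two-step fact C
 print(f"P(B) = {p_b:.3f}, P(C) = {p_c:.3f}")  # 0.932 vs 0.715 here

With W1 == W2 the two marginals coincide in this toy model, so the lower scores on multi-step facts really do come from the rule weights rather than from chain length alone.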
  • What diagnostics do we NOT have?
      • Connections between the clauses in the MLN (a cheap approximation is sketched below).
      • A reconstruction/visualization of the MLN network. Working with the Tuffy developers on this.
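
One cheap way to get the first missing diagnostic is to treat two clauses as connected whenever they share a predicate symbol. A sketch, with invented stand-in clauses (the real rules file would need proper parsing):

 import re
 from itertools import combinations

 clauses = [
     "1   !MadeOf(x, iron) v Conducts(x)",
     "1   !Conducts(x) v CompletesCircuit(x)",
     "0.5 !MadeOf(x, plastic) v !Conducts(x)",
 ]

 def predicates(clause):
     # predicate symbols start uppercase and are immediately followed by "("
     return set(re.findall(r"[A-Z]\w*(?=\()", clause))

 # Report every pair of clauses whose predicate sets overlap.
 for a, b in combinations(clauses, 2):
     shared = predicates(a) & predicates(b)
     if shared:
         print(f"{a}  <->  {b}   via {sorted(shared)}")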
  • What next?
      • Figure out why the iron nail example fails.
      • Use automatically found evidence, i.e., actual Open IE tuples instead of hand-generated evidence (see the conversion sketch below).
      • Figure out what mismatches exist between required and available knowledge.
      • Improve diagnostics output.
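
The "automatically found evidence" step above would need a conversion from Open IE tuples to ground evidence atoms. A sketch of that mapping; the naming conventions are assumptions, not the project's actual scheme:

 def to_atom(arg1, rel, arg2):
     """Render an Open IE (arg1, rel, arg2) tuple as a ground atom."""
     camel = lambda s: s.title().replace(" ", "")
     return f"{camel(rel)}({camel(arg1)}, {camel(arg2)})"

 print(to_atom("iron nail", "is made of", "iron"))  # IsMadeOf(IronNail, Iron)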
  • What does this exercise suggest?
      • Need to figure out how the weights on the MLN rules and evidence are used. [I assigned them arbitrarily for this round.]
      • Use predicates with small arity. For example, avoid writing rules with entire nested tuples as predicates.
      • The only reason we'd need a nested tuple is for computing its score. For now we can compute this from the scores of its components: Score(nested_tuple) = Score(top tuple) * Score(nested).
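
The proposed composition rule, as a one-line helper (argument names are illustrative):

 def nested_tuple_score(top_tuple_score, nested_score):
     """Score(nested_tuple) = Score(top tuple) * Score(nested)."""
     return top_tuple_score * nested_score

 # e.g. a 0.9 top-level tuple around a 0.8 nested tuple scores 0.72
 assert abs(nested_tuple_score(0.9, 0.8) - 0.72) < 1e-9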