Vulcan/MeetingNotes/Aug09 2013

From Knowitall
Revision as of 18:35, 16 August 2013 by Niranjan (talk | contribs) (To Do)


Update

Focusing on system development.
1. Inferencer stub using Jena. The stub takes in axioms and rules and outputs a derivation.
  • The Jena API doesn't readily support multiple derivations.
  • Ask the Jena community whether this is possible, or replace Jena with the OWLIM (or Sesame) triplestore/inference systems.

2. Tested stub with axioms and rules that would help us solve the iron nail example.
3. Built a proposition extractor stub that converts the answer assertions into propositions represented as Open IE 4.0 tuples.
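As a rough illustration of what "takes in axioms and rules and outputs a derivation" means, here is a minimal forward-chaining sketch in plain Python. It does not use Jena, and the axioms and rule shown for the iron nail example are hypothetical stand-ins, not the ones actually tested. Terms starting with "?" are treated as variables.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]
Rule = Tuple[List[Triple], Triple]  # (premises, conclusion)


def unify(pattern: Triple, fact: Triple, env: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend env so that pattern equals fact, or return None if impossible."""
    env = dict(env)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if env.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return env


def match_all(premises: List[Triple], facts: List[Triple],
              env: Dict[str, str]) -> Iterator[Tuple[Dict[str, str], List[Triple]]]:
    """Yield every binding that satisfies all premises, with the facts used."""
    if not premises:
        yield env, []
        return
    for fact in facts:
        new_env = unify(premises[0], fact, env)
        if new_env is not None:
            for final_env, used in match_all(premises[1:], facts, new_env):
                yield final_env, [fact] + used


def derive(axioms: List[Triple], rules: List[Rule], goal: Triple):
    """Forward-chain to a fixed point; return all steps taken if goal is reached."""
    facts = list(axioms)
    steps = []  # each step is (facts used, new fact)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Materialize matches first so facts is not mutated mid-iteration.
            for env, used in list(match_all(premises, facts, {})):
                new_fact = tuple(env.get(t, t) for t in conclusion)
                if new_fact not in facts:
                    facts.append(new_fact)
                    steps.append((used, new_fact))
                    changed = True
    return steps if goal in facts else None


# Hypothetical axioms and rule for the iron nail example.
AXIOMS = [("iron nail", "made of", "iron"), ("iron", "is a", "metal")]
RULES = [([("?x", "made of", "?m"), ("?m", "is a", "metal")],
          ("?x", "conductor of", "electricity"))]
GOAL = ("iron nail", "conductor of", "electricity")

for used, new_fact in derive(AXIOMS, RULES, GOAL) or []:
    print(used, "=>", new_fact)
```

This sketch returns a single derivation trace, which mirrors the limitation noted above: enumerating multiple alternative derivations would need extra bookkeeping.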
Resource collection
1. Gathered assertions from Peter/Phil. Each assertion corresponds to a single multiple choice answer.
2. Found RDF representations for WordNet and imported them into Jena.
Analysis
1. Selected 10 propositions that are single Open IE tuples as starting targets.
2. Started to write down steps involved in verifying these propositions.

Notes

1. Do we really need an inference engine (Jena)?

  • We need an inference engine primarily to limit the set of groundings for BLP or MLN.
  • However, the step by step derivation used by inference engines is a brittle approach and is likely to fail often.
  • Instead, Mausam suggested looking into Knowledge-Based Model Construction (KBMC) as an alternative for finding all relevant axioms and rules that can be fed into an MLN or BLP.

2. Discussed the inference steps involved for 5 different questions.

It was clear that we may not always be able to produce a complete derivation.

3. What happens when deductive inference fails?

  • Add weak entailment rules that can help complete the inference.
  • Use a "linguistically motivated pattern matcher" as a form of abductive reasoning. We can also use this pattern matcher as a standalone method.

Approach A: Identify axioms (tuples) that are highly "similar" to some node in the backward chained derivation graph. Add weak entailment rules (axiom -> derivation node) scored using edit distance.
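A minimal sketch of how Approach A's scoring might work, assuming tuples are compared as space-joined strings and that similarity is the edit distance normalized by the longer string. The function names, the normalization, and the 0.7 threshold are illustrative assumptions, not decisions from the meeting.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def weak_entailment_rules(axioms: List[Triple], nodes: List[Triple],
                          threshold: float = 0.7) -> List[Tuple[Triple, Triple, float]]:
    """Propose (axiom -> derivation node) rules whose similarity clears the threshold."""
    rules = []
    for axiom in axioms:
        for node in nodes:
            score = similarity(" ".join(axiom), " ".join(node))
            if score >= threshold:
                rules.append((axiom, node, score))
    return sorted(rules, key=lambda r: r[2], reverse=True)


# Hypothetical axioms and derivation node, for illustration only.
axioms = [("iron", "is a conductor of", "electricity"),
          ("rubber", "is an", "insulator")]
nodes = [("iron nail", "is a conductor of", "electricity")]
for axiom, node, score in weak_entailment_rules(axioms, nodes):
    print(axiom, "->", node, round(score, 3))
```

Character-level edit distance is only one option; a token-level distance would be less sensitive to long argument strings.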

Approach B: Find the most plausible answer.

Example 1: (x, helps, a fox find food), where x is one of {sense of smell, thick fur, long tail, pointed teeth}

We don't find sentences that directly state that "sense of smell helps fox find food".
However, several sentences say "sense of smell helps animals find food":

"smell helps * find food" returns 7 million hits on Google.
"fur helps * find food" returns no hits.

From this we cannot prove that foxes use their sense of smell to find food, but it increases our belief that they do.

Example 2: (x, conductor of, electricity), where x is one of {iron nail, rubber boat, wax crayon, plastic cup}

We don't find sentences that directly state that "iron nail is a conductor of electricity" (or that it conducts electricity).
However, several sentences match the pattern "iron * is a conductor of electricity":

"iron * is a conductor of electricity" returns 794,000 hits on Google.
"rubber * is a conductor of electricity" returns 196,000 hits.


This is a form of abductive reasoning using linguistically motivated templates.
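One simple way to turn such hit counts into a ranking is to normalize them into relative shares. The sketch below assumes the counts have already been collected by some other means (no search API is called), and it uses only the two counts reported in Example 2. The function name and the normalization are assumptions.

```python
from typing import Dict, List, Tuple


def rank_candidates(hit_counts: Dict[str, int]) -> List[Tuple[str, float]]:
    """Rank answer candidates by their share of the total pattern hits."""
    total = sum(hit_counts.values())
    if total == 0:
        # No evidence for any candidate: every share is zero.
        return [(candidate, 0.0) for candidate in hit_counts]
    ranked = [(candidate, count / total) for candidate, count in hit_counts.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hit counts for the pattern "<x> * is a conductor of electricity" from Example 2.
counts = {"iron": 794_000, "rubber": 196_000}
for candidate, share in rank_candidates(counts):
    print(candidate, round(share, 3))
```

Note that the rubber pattern still gets a substantial number of hits, so the relative share, not the mere presence of matches, is what carries the signal.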

4. How to use this template matching module?

  • Use template matching first to find the most plausible answer. Then, use the inference engine to produce the best possible explanation.
  • Use template matching to add links in the inference graphs (whenever deductive inference fails).
  • Implement Approach B as a standalone method for answering questions.



To Do (Updated on Aug 16)

System building
1. Implement "template matching" using the ClueWeb corpus. Pending.
  • URL for Open IE backend is available.
  • For an assertion A, find sentences that have high overlap. Generate regex patterns for the proposition. Score sentences by how well they match the regex patterns.
2. Continue system building.
  • Create a derivation scorer stub. This will be replaced with an MLN or a BLP scorer. Done.
  • Test with iron nail example.
3. Jena API doesn't readily support multiple derivations.
  • Ask Jena community to find out if this is possible. Done. Not possible.
  • OWLIM as replacement. Done. Doesn't look promising. No response from community.
4. Try out Tuffy MLN implementation. Done.
  • Use the output of the iron nail example.
  • If it is easy to use, write wrappers around Tuffy to hook it into our system.
5. Write evaluation code. Vulcan has a good interface set up.
  • Check with Peter.
6. Create a system architecture page with a figure and overview of the main components.

Created a System status page instead.

  • Created a figure. Added it to system design document.
  • Need to create a wiki page for system architecture and overview.
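The template-matching step in item 1 above (generate regex patterns for a proposition, then score sentences against them) might look roughly like the sketch below. It assumes a proposition is an (arg1, relation, arg2) tuple and uses two patterns: a strict one with the parts adjacent and a loose one allowing up to three intervening words. The pattern shapes and the 1.0 / 0.5 scores are arbitrary illustrative choices, and retrieval from ClueWeb or the Open IE backend is not shown.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]


def patterns_for(proposition: Triple) -> List["re.Pattern[str]"]:
    """Build a strict and a loose regex for an (arg1, relation, arg2) tuple."""
    arg1, rel, arg2 = (re.escape(part) for part in proposition)
    gap = r"(?:\W+\w+){0,3}?\W+"  # up to three intervening words
    strict = re.compile(rf"\b{arg1}\W+{rel}\W+{arg2}\b", re.IGNORECASE)
    loose = re.compile(rf"\b{arg1}{gap}{rel}{gap}{arg2}\b", re.IGNORECASE)
    return [strict, loose]


def score_sentence(sentence: str, proposition: Triple) -> float:
    """Return 1.0 for a strict match, 0.5 for a loose match, 0.0 otherwise."""
    strict, loose = patterns_for(proposition)
    if strict.search(sentence):
        return 1.0
    if loose.search(sentence):
        return 0.5
    return 0.0


# Made-up sentences for illustration; real input would come from the corpus.
proposition = ("smell", "helps", "find food")
sentences = [
    "Smell helps find food.",
    "A keen sense of smell often helps many animals find food.",
    "Thick fur keeps a fox warm.",
]
for sentence in sentences:
    print(score_sentence(sentence, proposition), sentence)
```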
Experiments. Pending.
1. Run template matching approach as a baseline.
2. Run inference system as a baseline.