Vulcan/MeetingNotes/Sep13 2013

Revision as of 20:00, 13 September 2013

Notes

Update

System Dev. To Do
1. Convert the current prototype into a working system. [Niranjan: Not ready yet; in progress.]
  • Implemented dynamic predicates (i.e., predicates don't have to be fully materialized within Tuffy's database).
  • Actual evidence is being selected for each proposition (though not through the Textual Evidence Finder).
  • Rules are fed in Tuffy format (an illustrative snippet appears after this list).
  • Stemming/head-word and synonym transforms still need to be added.
  • Integrate closely with the Textual Evidence Finder.
  • Finish system building and have the system output evaluated by the Vulcan evaluation framework. Revised ETC: Sept 18th.
2. Textual evidence finder [Greg]
  • minimal web interface [Done!]
  • tuple representation design [Done!!]
  • lexical variants (head word, stemming, etc.) [In progress?]
3. Update the system architecture figure to emphasize textual matching using procedural escapes. [Niranjan, Done.]

4. Definition processing -- Stephen has been analyzing and adding to his patterns based on a larger sample of definitions.

  • The key issue is determining what form the output should take to facilitate inference.
  • Niranjan needs to look at examples from Stephen's output to see if the current design fits.
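
For reference, a Tuffy program file is a list of predicate declarations followed by weighted clauses, with evidence supplied as ground atoms in a separate file. The snippet below only illustrates that shape: the predicates and weights are made up, it is not taken from the actual rule set, and the exact syntax should be checked against the Tuffy manual.

    // prog.mln -- illustrative only; predicates and weights are hypothetical
    *Isa(entity, category)            // closed-world evidence predicate
    Conducts(entity, substance)       // open-world query predicate

    // soft rule: things asserted to be metals tend to conduct electricity
    1.5  !Isa(x, Metal) v Conducts(x, Electricity)

    // evidence.db (ground atoms selected as evidence)
    Isa(CopperWire, Metal)

    // query.db
    Conducts(x, Electricity)
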
Points of integration
1. Add UW folks to e-mail list [Peter, done.]
2. Make a single wiki at Vulcan. Move UW's content to the new wiki. [Niranjan will work with Wil Smith. Status?]
3. Integrate with the Vulcan testing framework. [Niranjan: UW's inference system will be a web service that outputs the required XML; see the sketch after this list.]
4. Training data [Niranjan will work with Peter to make sure all data is available at UW.]
5. Share URLs for demos and services at Vulcan [Peter]
6. Send information about various knowledge sources inside vulcan [Peter]
7. Send Stephen pointers about dictionary processing. [Peter, done.]
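
As a starting point for item 3, here is a minimal sketch of the web-service wrapper using only the Python standard library. The URL scheme, answer fields, and XML element names are placeholders; the actual schema is whatever the Vulcan evaluation framework requires, which is not spelled out in these notes.

    # Sketch only: the endpoint and XML element names are placeholders,
    # not the schema required by Vulcan's evaluation framework.
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import urlparse, parse_qs
    from xml.sax.saxutils import escape

    def run_inference(question):
        # Placeholder for the real MLN/Tuffy inference pipeline.
        return [("A", 0.72), ("B", 0.11), ("C", 0.09), ("D", 0.08)]

    class InferenceHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            question = parse_qs(urlparse(self.path).query).get("q", [""])[0]
            answers = run_inference(question)
            body = "<answers>%s</answers>" % "".join(
                '<answer option="%s" confidence="%.2f"/>' % (escape(o), c)
                for o, c in answers)
            self.send_response(200)
            self.send_header("Content-Type", "application/xml")
            self.end_headers()
            self.wfile.write(body.encode("utf-8"))

    if __name__ == "__main__":
        HTTPServer(("", 8080), InferenceHandler).serve_forever()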

Agenda

Planning
0. Let's set some medium-term deadlines.
  • Where should we be at the end of October?
1. Get a working system up that:
  • Handles n-ary and nested tuples.
  • Uses the Textual Evidence Finder to obtain evidence.
  • Uses synonyms and hypernyms from WordNet, plus head-word extraction (see the sketch after this list).
  • Uses the Compound Noun Categorizer (CNC).
  • Performs a single iteration of inference.
  • Outputs newly inferred facts and rules reachable from input proposition.
2. Run the system on the previously identified examples and identify mismatches between available and required knowledge. Address the mismatches and iterate.
3. Create a benchmark/development suite. For each question, find and document all available resources that can be used to answer it.
  • Question parses.
  • Sentences, Open IE tuples, and other relevant knowledge available for answering these questions.
  • Relevant rules.
4. Sharing between Vulcan and UW.
  • Compound Noun Categorizer
  • Describe components and systems/resources in the internal wiki. (Description: I/O with examples, a link to a web service or data, etc.)
  • Question analyses, knowledge requirements, etc. on the benchmark suite.
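
On item 1's lexical matching: a rough sketch of how head words, stems, synonyms, and hypernyms could be collected for a tuple argument using NLTK's WordNet interface. The function and the last-token head-word heuristic are illustrative assumptions, not the actual Textual Evidence Finder code.

    # Illustrative sketch -- not the actual Textual Evidence Finder code.
    from nltk.stem import PorterStemmer
    from nltk.corpus import wordnet as wn

    def lexical_variants(phrase):
        head = phrase.split()[-1]   # crude head word: last token of the phrase
        variants = {phrase, head, PorterStemmer().stem(head)}
        for synset in wn.synsets(head):
            for lemma in synset.lemmas():            # synonyms
                variants.add(lemma.name().replace("_", " "))
            for hyper in synset.hypernyms():         # hypernyms
                for lemma in hyper.lemmas():
                    variants.add(lemma.name().replace("_", " "))
        return variants

    # e.g. lexical_variants("copper wire") yields the phrase, its head word
    # "wire", the stem, and WordNet synonyms/hypernyms of "wire".
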
Technical items
1. Inference system architecture for handling gaps in inference.
2. Tuffy non-determinism issue during rule development. Some options include:
  • Fix Tuffy to be deterministic -- not feasible.
  • Use maximum a posteriori (MAP) inference instead of marginal inference? (See the note below for the distinction.)
  • Good: exact (deterministic) solutions are available and are typically faster than marginal inference; easily formulated with a simple adjustment to the setup.
  • Bad: the output is not probabilities, and Tuffy doesn't implement a deterministic MAP algorithm.
  • Consider switching to theBeast (Pedro recommended).
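  (For reference: with rule weights w_i and counts n_i(y) of satisfied groundings of rule i in a world y, an MLN defines P(y) proportional to exp(sum_i w_i n_i(y)); the two modes then solve different problems:
      MAP inference:      y* = argmax_y sum_i w_i n_i(y)                          -- a single most-likely world, no probabilities
      Marginal inference: P(y_j = true) = sum of P(y) over worlds y with y_j = true -- a probability per query atom.)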
3. Definition output design.
  • The key issue is determining what form the output should take to facilitate inference.
  • Two kinds of uses for definitions:
  • Matching an alternate description -- e.g., Materials that resist flow of electrons are called (A) insulators (B) conductors (C) solids (D) gases.
  • Identifying instances of the defined term -- e.g., Which of the following is an insulator? (A) plastic cup (B) iron nail (C) copper wire (D) water
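
One possible output form for the definition example above, written as a generic first-order rule (the predicate names are illustrative, and the notation is not tied to Tuffy's exact syntax):

    Material(x) AND Resists(x, FlowOfElectrons)  =>  Isa(x, Insulator)

Used left to right, such a rule supports the second use (classifying candidate instances, as in the plastic cup question); matching an alternate description may rely more on the definition's surface or tuple form, so keeping both representations could be worth considering.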