Difference between revisions of "Vulcan/MeetingNotes/Sep13 2013"

From Knowitall

Revision as of 20:03, 13 September 2013

Notes

Update

System Dev. To Do
1. Convert current prototype into a working system. [Niranjan: Not ready yet. In progress.]
  • Implemented dynamic predicates (i.e. predicates don't have to be fully materialized within Tuffy's databases).
  • Actual evidence is being selected for each proposition (though not through Textual Evidence Finder)
  • Rules being used are fed in Tuffy format.
  • Stemming/Headwords and Synonym transforms need to be added.
  • Integrate closely with Textual evidence finder.
  • Finish system building and have system output evaluated by the Vulcan evaluation framework. Revised ETC Sept 18th.
2. Textual evidence finder [Greg]
  • minimal web interface [Done!]
  • tuple representation design [Done!!]
  • lexical variants (head word, stemming etc.) [In progress?]
3. Update the system architecture figure to emphasize textual matching using procedural escapes. [Niranjan, Done.]

4. Definition processing -- Stephen has been analyzing and adding to his patterns based on a larger sample of definitions.

  • Key issue is in determining what form the output should take to facilitate inference.
  • Niranjan needs to look at examples from Stephen's output to see if the current design fits.
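A minimal sketch of how the head-word, stemming, and synonym transforms listed above might be chained when comparing two phrases. Everything here is an assumption for illustration: the function names are hypothetical, the stemmer is a crude plural stripper standing in for a real one, and the synonym table is hand-written rather than drawn from WordNet.

```python
# Hypothetical, hand-written synonym table (a stand-in for a WordNet lookup).
SYNONYMS = {"insulator": "nonconductor"}

def head_word(phrase: str) -> str:
    """Naive head-word guess: the last token of an English noun phrase."""
    return phrase.lower().split()[-1]

def crude_stem(word: str) -> str:
    """Strip one trailing plural 's' (but not 'ss'); not a real stemmer."""
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def normalize(phrase: str) -> str:
    """Apply head-word extraction, stemming, then synonym mapping."""
    word = crude_stem(head_word(phrase))
    return SYNONYMS.get(word, word)

def lexical_match(a: str, b: str) -> bool:
    """Two phrases match if they normalize to the same token."""
    return normalize(a) == normalize(b)
```

Under these assumptions, lexical_match("plastic cups", "a cup") holds because both phrases reduce to the head word "cup".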
Points of integration
1. Add UW folks to e-mail list [Peter, done.]
2. Make a single wiki at Vulcan. Move UW's content to the new wiki. [Niranjan will work with Wil Smith. Status?]
3. Integrate with Vulcan testing framework. [Niranjan: UW's inference system will be a web service that outputs the required XML.]
4. Training data [Niranjan will work with Peter to make sure all data is available at UW.]
5. Share URLs for demos and services at Vulcan [Peter]
6. Send information about various knowledge sources inside Vulcan [Peter]
7. Send Stephen pointers about dictionary processing. [Peter, done.]

Agenda

Planning
0. Let's set some medium term deadlines?
  • Where should we be at the end of October?
1. Get a working system up that:
  • Handles n-ary and nested tuples.
  • Uses textual evidence finder to obtain evidence.
  • Uses synonyms and hypernyms from WordNet and head word extraction.
  • Uses Compound Noun Categorizer (CNC)
  • Performs a single iteration of inference.
  • Outputs newly inferred facts and rules reachable from input proposition.
2. Create a benchmark/development suite. For each question find and document all available resources that can be used for answering the question.
  • Question parses.
  • Sentences, Open IE tuples, and other relevant knowledge available for answering these questions.
  • Relevant rules.
3. Run system on benchmark. Identify mismatch between available and required knowledge. Address mismatch and iterate.
4. Sharing between Vulcan and UW.
  • Describe components, systems/resources in internal Wiki. (Description: I/O with examples, link to a web service or data, etc.)
  • Question analyses, knowledge requirements, etc. on the benchmark suite.
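For planning item 1, one possible shape for n-ary and nested tuples is sketched below. The class name, field names, and example sentence are hypothetical and are not taken from the Textual Evidence Finder's actual tuple design; the point is only that an argument may itself be a tuple and that the argument count is not fixed at two.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Fact:
    """A relation with any number of arguments; an argument may be another Fact."""
    relation: str
    args: Tuple[Union[str, "Fact"], ...]

    def depth(self) -> int:
        """Nesting depth: 1 for a flat tuple, 2 if it contains a flat tuple, and so on."""
        nested = [a.depth() for a in self.args if isinstance(a, Fact)]
        return 1 + max(nested, default=0)

    def render(self) -> str:
        """Readable form, with nested tuples rendered recursively."""
        parts = [a.render() if isinstance(a, Fact) else a for a in self.args]
        return f"({self.relation}: " + "; ".join(parts) + ")"

# Illustrative example: a 3-ary tuple whose second argument is itself a tuple.
inner = Fact("conduct", ("metals", "electricity"))
outer = Fact("say", ("the textbook", inner, "in chapter 3"))
```

Because the dataclass is frozen, instances are hashable, which makes it easy to keep inferred facts in a set when performing a single iteration of inference.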
Technical items
1. Inference system architecture for handling gaps in inference.
2. Tuffy non-determinism issue during rule development. Some options include
  • Fix Tuffy to be deterministic -- Not feasible.
  • Use Maximum-a-posteriori (MAP) inference instead of marginal inference?
    • Good: Exact (deterministic) solutions are available and are typically faster than marginal inference. Easily formulated with a simple adjustment to the setup.
    • Bad: The output is not probabilities, and Tuffy doesn't implement a deterministic algorithm.
  • Consider switching to theBeast (Pedro recommended.)
3. Definition output design.
  • Key issue is in determining what form the output should take to facilitate inference.
  • Two kinds of uses for definitions
    • Matching an alternate description -- e.g., Materials that resist flow of electrons are called (A) insulators (B) conductors (C) solids (D) gases.
    • Identifying instances of the defined term -- e.g., Which of the following is an insulator? (A) plastic cup (B) iron nail (C) copper wire (D) water
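The MAP versus marginal trade-off raised under technical item 2 can be illustrated on a toy weighted-formula model. This is a sketch under stated assumptions, not Tuffy's or theBeast's algorithm: the ground atoms, formulas, and weights are invented, and both queries are answered by exhaustively enumerating all possible worlds, which is only practical for a handful of atoms.

```python
import itertools
import math

# Hypothetical ground atoms for a toy insulator question.
ATOMS = ["Metal(nail)", "Conducts(nail)", "Insulator(nail)"]

# Weighted ground formulas as (weight, predicate over a world); weights are invented.
FORMULAS = [
    (2.0, lambda w: w["Metal(nail)"]),                                        # soft evidence
    (1.5, lambda w: (not w["Metal(nail)"]) or w["Conducts(nail)"]),           # Metal => Conducts
    (1.0, lambda w: (not w["Conducts(nail)"]) or not w["Insulator(nail)"]),   # Conducts => not Insulator
]

def worlds():
    """Enumerate every truth assignment to the atoms, in a fixed order."""
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))

def score(world):
    """Sum of the weights of the formulas that the world satisfies."""
    return sum(weight for weight, holds in FORMULAS if holds(world))

def map_world():
    """MAP inference: the single highest-scoring world (a truth assignment, not probabilities)."""
    return max(worlds(), key=score)

def marginals():
    """Marginal inference: the probability of each atom under P(world) proportional to exp(score)."""
    z = 0.0
    totals = {atom: 0.0 for atom in ATOMS}
    for w in worlds():
        p = math.exp(score(w))
        z += p
        for atom in ATOMS:
            if w[atom]:
                totals[atom] += p
    return {atom: totals[atom] / z for atom in ATOMS}
```

Both functions return the same result on every run because the enumeration order is fixed. The contrast with the notes' concern is that sampling-based marginal inference can vary between runs, while MAP yields one assignment with no probabilities attached.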