Vulcan/MeetingNotes/Sep13 2013
Notes
Update
- System Dev. To Do
- 1. Convert current prototype into a working system. [Niranjan: Not ready yet; in progress.]
- Implemented dynamic predicates (i.e. predicates don't have to be fully materialized within Tuffy's databases).
- Actual evidence is being selected for each proposition (though not through Textual Evidence Finder)
- Rules being used are fed in Tuffy format.
- Stemming/Headwords and Synonym transforms need to be added.
- Integrate closely with Textual evidence finder.
- Finish system building and have the system output evaluated by the Vulcan evaluation framework. Revised ETC Sept 18th.
- 2. Textual evidence finder [Greg]
- minimal web interface [Done!]
- tuple representation design [Done!!]
- lexical variants (head word, stemming etc.) [In progress?]
- 3. Update the system architecture figure to emphasize textual matching using procedural escapes. [Niranjan, Done.]
- 4. Definition processing -- Stephen has been analyzing and adding to his patterns based on a larger sample of definitions.
- Key issue is in determining what form the output should take to facilitate inference.
- Niranjan needs to look at examples from Stephen's output to see if the current design fits.
- Points of integration
- 1. Add UW folks to e-mail list [Peter, done.]
- 2. Make a single wiki at Vulcan. Move UW's content to the new wiki. [Niranjan will work with Wil Smith. Status?]
- 3. Integrate with Vulcan testing framework. [Niranjan: UW's inference system will be a web service that outputs the required XML.]
- 4. Training data [Niranjan will work with Peter to make sure all data is available at UW.]
- 5. Share URLs for demos and services at Vulcan [Peter]
- 6. Send information about various knowledge sources inside Vulcan [Peter]
- 7. Send Stephen pointers about dictionary processing. [Peter, done.]
Agenda
- Planning
- 0. Let's set some medium-term deadlines?
- Where should we be at the end of October?
- 1. Get a working system up that:
- Handles n-ary and nested tuples.
- Uses textual evidence finder to obtain evidence.
- Uses synonyms and hypernyms from WordNet and head word extraction.
- Uses Compound Noun Categorizer (CNC)
- Performs a single iteration of inference.
- Outputs newly inferred facts and rules reachable from input proposition.
- 2. Create a benchmark/development suite. For each question find and document all available resources that can be used for answering the question.
- Question parses.
- Sentences, Open IE tuples, and other relevant knowledge available for answering these questions.
- Relevant rules.
- 3. Run system on benchmark suite, identify mismatch between available and required knowledge. Address mismatch and iterate.
- 4. Sharing between Vulcan and UW.
- Describe components, systems/resources in internal Wiki. (Description: I/O with examples, link to a web service or data, etc.)
- Question analyses, knowledge requirements, etc. on the benchmark suite.
- Technical items
- 1. Inference system architecture for handling gaps in inference.
- 2. Tuffy non-determinism issue during rule development. Some options include:
- Fix Tuffy to be deterministic -- Not feasible.
- Use Maximum-a-posteriori (MAP) inference instead of marginal inference?
- Good: Exact (deterministic) solutions are available and are typically faster than marginal inference. Easily formulated with a simple adjustment to the setup.
- Bad: The output is a single most-likely assignment, not probabilities, and Tuffy doesn't implement a deterministic algorithm.
- Consider switching to theBeast (Pedro recommended.)
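
To make the MAP vs. marginal trade-off concrete, here is a minimal sketch on a toy weighted-formula model. It is not Tuffy's algorithm: the atoms, weights, and function names are all hypothetical, and the exhaustive enumeration of worlds is only practical for a handful of ground atoms. It shows that MAP returns one deterministic best assignment while marginal inference returns a probability per atom.

```python
import itertools
import math

# Hypothetical ground atoms; not taken from the actual system.
ATOMS = ["Insulator(cup)", "Conducts(cup)"]

# Hypothetical weighted ground formulas: (weight, predicate over a world).
FORMULAS = [
    (1.5, lambda w: w["Insulator(cup)"]),
    (2.0, lambda w: not (w["Insulator(cup)"] and w["Conducts(cup)"])),
    (0.5, lambda w: w["Conducts(cup)"]),
]


def score(world):
    """Sum of the weights of the formulas satisfied in this world."""
    return sum(weight for weight, holds in FORMULAS if holds(world))


def all_worlds():
    """Enumerate every truth assignment to ATOMS (exponential in len(ATOMS))."""
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def map_world():
    """MAP inference: the single highest-scoring world. No probabilities."""
    return max(all_worlds(), key=score)


def marginals():
    """Marginal inference: P(atom is true) under the log-linear distribution."""
    z = 0.0
    totals = {atom: 0.0 for atom in ATOMS}
    for world in all_worlds():
        mass = math.exp(score(world))
        z += mass
        for atom in ATOMS:
            if world[atom]:
                totals[atom] += mass
    return {atom: totals[atom] / z for atom in ATOMS}


if __name__ == "__main__":
    print("MAP world:", map_world())
    print("Marginals:", marginals())
```

Both functions are deterministic here only because the enumeration is exact; a sampling-based or randomized-search solver would not give that guarantee.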
- 3. Definition output design.
- Key issue is in determining what form the output should take to facilitate inference.
- Two kinds of uses for definitions
- Matching an alternate description -- e.g., Materials that resist flow of electrons are called (A) insulators (B) conductors (C) solids (D) gases.
- Identifying instances of the defined term -- e.g., Which of the following is an insulator? (A) plastic cup (B) iron nail (C) copper wire (D) water
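
One way to think about an output form that serves both uses is sketched below. This is only an illustration under stated assumptions: the `Definition` record, the toy `DEFINITIONS` and `FACTS` tables, and both function names are hypothetical and are not Stephen's output format. Exact string comparison stands in for real textual matching, and answer options are assumed to be already normalized (e.g., singular head words).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    term: str              # the defined term, e.g. "insulator"
    genus: str             # the broader class, e.g. "material"
    properties: frozenset  # distinguishing properties, as normalized strings


# Hypothetical definitions, assumed to come from definition processing.
DEFINITIONS = {
    "insulator": Definition("insulator", "material",
                            frozenset({"resists flow of electrons"})),
    "conductor": Definition("conductor", "material",
                            frozenset({"allows flow of electrons"})),
}

# Hypothetical background facts about candidate instances.
FACTS = {
    "plastic cup": {"resists flow of electrons"},
    "iron nail": {"allows flow of electrons"},
    "copper wire": {"allows flow of electrons"},
}


def term_for_description(described_properties, options):
    """Use 1: pick the option whose definition is covered by the description."""
    described = set(described_properties)
    for option in options:
        definition = DEFINITIONS.get(option)
        if definition is not None and definition.properties <= described:
            return option
    return None


def instances_of(term, candidates):
    """Use 2: keep candidates whose known facts satisfy the term's definition."""
    definition = DEFINITIONS[term]
    return [c for c in candidates
            if definition.properties <= FACTS.get(c, set())]


if __name__ == "__main__":
    print(term_for_description({"resists flow of electrons"},
                               ["insulator", "conductor", "solid", "gas"]))
    print(instances_of("insulator",
                       ["plastic cup", "iron nail", "copper wire", "water"]))
```

In this sketch the same record supports both directions: the first use goes from properties to a term, and the second goes from a term to properties that candidate instances must satisfy.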