Difference between revisions of "Paralex"

Latest revision as of 21:58, 25 October 2013

Goals

  • Paraphrase-Driven Open-Domain Question Answering
  • Answer noisy user questions by generalizing from a question-paraphrase corpus.

Work Log

Oct 25

Project on hold pending stable release of Paralex/Open-IE Demo.

Oct 17

Shifting focus from this project to the Paralex Demo project until a stable demo is up.

  • Merged all of my code changes into the main branch, including new scoring features and training data.
  • Discussed ideas for other sub-projects with Tony. Next work will likely be on relation synonyms using PMI computed over the triplestore.

Oct 9

Discussed with Tony:

  • Paraphrase module:
    • Paraphrase template PMI scores are now computed from more accurate template co-occurrence counts, and templates must meet a minimum frequency of 50.
    • Template language model (LM) trained on n-grams from the lemmatized question corpus.
    • Ideas for the upcoming template classifier: features for PMI, LM, and POS-tag pattern, plus leveraging the paraphrase corpus for unsupervised learning.
  • Answer Ranking
    • Baseline ranking: AnswerDerivation ordering
    • Strategies for training answer ranking function:
      • Uniform sample of answers from representative questions?
      • Use answers given in WikiAnswers for unsupervised learning?
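
For reference, the template PMI scoring with a minimum-frequency cutoff discussed above can be sketched roughly as follows. This is only an illustration and not the project's actual code: the function name `template_pmi`, the representation of co-occurrences as a list of ordered `(template_a, template_b)` pairs, and the use of per-position counts as marginals are all assumptions.

```python
import math
from collections import Counter


def template_pmi(pairs, min_freq=50):
    """Score co-occurring template pairs by pointwise mutual information.

    `pairs` is an iterable of ordered (template_a, template_b) co-occurrences
    (a hypothetical input format). Pairs are skipped when either template
    occurs fewer than `min_freq` times in its position.
    """
    pairs = list(pairs)
    total = len(pairs)
    joint = Counter(pairs)                  # c(a, b)
    left = Counter(a for a, _ in pairs)     # c(a) in the first position
    right = Counter(b for _, b in pairs)    # c(b) in the second position

    scores = {}
    for (a, b), n_ab in joint.items():
        if left[a] < min_freq or right[b] < min_freq:
            continue  # too rare for a reliable estimate
        # PMI(a, b) = log( p(a, b) / (p(a) * p(b)) )
        scores[(a, b)] = math.log(n_ab * total / (left[a] * right[b]))
    return scores
```

A call such as `template_pmi(pairs, min_freq=50)` returns a dictionary keyed by template pair. Raising `min_freq` trades coverage for more stable scores, since PMI tends to overestimate associations between rare items.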

Current Goals:

  • Produce a set of training (Q,A) pairs for hand-labeling from a uniform sample of answers over the training questions.
  • Explore feasibility of self-labeling data from answers given in WikiAnswers.
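
One possible way to draw the (Q,A) pairs for hand-labeling is sketched below. It assumes "uniform" means uniform over all candidate (question, answer) pairs; sampling questions first and then answers would give a different distribution. The function name `sample_qa_pairs` and the dictionary input format are hypothetical.

```python
import random


def sample_qa_pairs(answers_by_question, k, seed=0):
    """Draw up to k distinct (question, answer) pairs uniformly at random.

    `answers_by_question` maps each training question to a list of candidate
    answers (an assumed format). Every pair is equally likely to be chosen,
    so questions with more candidates contribute more pairs on average.
    """
    pairs = [
        (question, answer)
        for question, answers in answers_by_question.items()
        for answer in answers
    ]
    rng = random.Random(seed)  # fixed seed keeps the labeling set reproducible
    return rng.sample(pairs, min(k, len(pairs)))
```

Fixing the seed means the same sample can be regenerated later if the labeling set needs to be rebuilt or extended.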