Paralex

From Knowitall
Revision as of 22:01, 9 October 2013 by Rbart


Goals

Paraphrase-Driven Open-Domain Question Answering: answer noisy user questions by generalizing from a question-paraphrase corpus.

Work Log

Oct 9

Discussed with Tony:

  • Paraphrase module:
    • Paraphrase template PMI scores are now computed from more accurate template co-occurrence counts, and templates must meet a minimum frequency of 50.
    • Template language model (LM) trained on n-grams from the lemmatized question corpus.
    • Ideas for the upcoming template classifier: features for PMI, LM score, and POS-tag pattern, plus leveraging the paraphrase corpus for unsupervised learning.
  • Answer Ranking
    • Baseline ranking: AnswerDerivation ordering
    • Strategies for training answer ranking function:
      • Uniform sample of answers from representative questions?
      • Use answers given in WikiAnswers for unsupervised learning?
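The PMI scoring with a frequency cutoff mentioned under the paraphrase module can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function name `template_pmi`, the input format (a list of ordered template pairs observed as paraphrases), and the probability estimates are all assumptions; only the threshold of 50 comes from the notes above.

```python
import math
from collections import Counter

MIN_TEMPLATE_FREQ = 50  # minimum template frequency from the work log


def template_pmi(pairs, min_freq=MIN_TEMPLATE_FREQ):
    """Score template pairs by pointwise mutual information (PMI).

    pairs: iterable of (template_a, template_b) co-occurrences.
    This input format is an assumption made for illustration.
    """
    pair_counts = Counter()
    template_counts = Counter()
    for a, b in pairs:
        pair_counts[(a, b)] += 1
        template_counts[a] += 1
        template_counts[b] += 1

    total_pairs = sum(pair_counts.values())
    total_templates = sum(template_counts.values())

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        # Skip pairs in which either template is too rare to score reliably.
        if template_counts[a] < min_freq or template_counts[b] < min_freq:
            continue
        p_ab = n_ab / total_pairs
        p_a = template_counts[a] / total_templates
        p_b = template_counts[b] / total_templates
        scores[(a, b)] = math.log(p_ab / (p_a * p_b))
    return scores
```

Filtering on frequency before scoring matters because PMI tends to overrate pairs built from rare templates, where a single chance co-occurrence yields a large score.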

Current Goals:

  • Produce a set of training (Q,A) pairs for hand-labeling from a uniform sample of answers over training questions.
  • Explore feasibility of self-labeling data from answers given in WikiAnswers.
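
One way to draw the (Q,A) pairs for hand-labeling is sketched below. It assumes that "uniform sample" means uniform over the pooled candidate answers of all training questions, which the notes do not specify; the function name `sample_qa_pairs` and the dictionary input format are hypothetical.

```python
import random


def sample_qa_pairs(answers_by_question, k, seed=0):
    """Uniformly sample up to k (question, answer) pairs without replacement.

    answers_by_question: dict mapping a question string to a list of
    candidate answers (a hypothetical format chosen for illustration).
    """
    # Pool every candidate so each (Q, A) pair is equally likely to be drawn.
    all_pairs = [
        (question, answer)
        for question, answers in answers_by_question.items()
        for answer in answers
    ]
    rng = random.Random(seed)  # fixed seed keeps the labeling set reproducible
    return rng.sample(all_pairs, min(k, len(all_pairs)))
```

Pooling before sampling means questions with many candidate answers contribute more pairs; sampling a fixed number of answers per question would be the alternative if balance across questions is preferred.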