Difference between revisions of "Paralex"

Revision as of 22:01, 9 October 2013

Discussed with Tony:

Paraphrase module:
- Paraphrase template PMI scores are now computed from more accurate template co-occurrence counts, and templates must meet a minimum frequency of 50.
- Template language model (LM) trained on n-grams from the lemmatized question corpus.
- Ideas for the upcoming template classifier: features from PMI, LM, and POS-tag pattern, plus leveraging the paraphrase corpus for unsupervised learning.
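
The PMI scoring with a frequency cutoff could look roughly like the sketch below. It is not the project's code: the function name `template_pmi`, the input format (a list of co-occurring template pairs), and the choice to apply the cutoff to each template's marginal count are all assumptions made for illustration. Only the threshold of 50 comes from the notes above.

```python
import math
from collections import Counter

MIN_TEMPLATE_FREQ = 50  # minimum template frequency mentioned in the notes


def template_pmi(pairs, min_freq=MIN_TEMPLATE_FREQ):
    """Score co-occurring template pairs with pointwise mutual information.

    pairs: iterable of (template_a, template_b) co-occurrences, e.g. one entry
    per aligned paraphrase pair (hypothetical input format).
    Returns {(template_a, template_b): pmi} for pairs whose templates both
    meet the minimum frequency.
    """
    pairs = list(pairs)
    joint = Counter(pairs)                 # count(a, b)
    left = Counter(a for a, _ in pairs)    # count(a) on the left side
    right = Counter(b for _, b in pairs)   # count(b) on the right side
    total = len(pairs)

    scores = {}
    for (a, b), n_ab in joint.items():
        # Assumption: the cutoff applies to each template's marginal count.
        if left[a] < min_freq or right[b] < min_freq:
            continue
        # PMI = log( p(a, b) / (p(a) * p(b)) )
        scores[(a, b)] = math.log((n_ab * total) / (left[a] * right[b]))
    return scores
```

Rare templates give unstable PMI estimates, which is one plausible reason for a cutoff of this kind; where exactly the threshold is applied in the real pipeline is not stated in the notes.
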
Answer ranking:
- Baseline ranking: AnswerDerivation ordering
- Strategies for training answer ranking function:
  - Uniform sample of answers from representative questions?
  - Use answers given in WikiAnswers for unsupervised learning?

Current Goals:

Produce a set of training (Q,A) pairs for hand-labeling, drawn from a uniform sample of answers over the training questions.
Explore feasibility of self-labeling data from answers given in WikiAnswers.
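
One way to draw the hand-labeling sample is sketched below. This is only one reading of "uniform sample of answers": it pools every (question, answer) pair and samples uniformly from the pool, so questions with more candidate answers contribute more pairs. The function name `sample_qa_pairs` and the dictionary input format are hypothetical.

```python
import random


def sample_qa_pairs(answers_by_question, k, seed=0):
    """Draw up to k (question, answer) pairs uniformly without replacement.

    answers_by_question: dict mapping each training question to a list of
    candidate answers (hypothetical format).
    """
    rng = random.Random(seed)  # fixed seed so the labeling set is reproducible
    pool = [(q, a) for q, answers in answers_by_question.items() for a in answers]
    return rng.sample(pool, min(k, len(pool)))
```

If each question should carry equal weight instead, an alternative is to pick a question uniformly first and then one of its answers; which variant was intended is not specified in the notes.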

@@ Line 1: / Line 1: @@
 == Goals ==
-Paraphrase-Driven Open-Domain Question Answering
+* Paraphrase-Driven Open-Domain Question Answering
-Answer noisy user questions by generalizing from a question-paraphrase corpus.
+* Answer noisy user questions by generalizing from a question-paraphrase corpus.
 == Work Log ==