Pattern Learning

From Knowitall
Revision as of 00:31, 20 October 2011 by Schmmd


Building the bootstrapping data

Determining target relations

  1. Restrict the high-quality set of ClueWeb extractions to those with proper-noun arguments
  2. Choose the most frequent relations from this set
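
The two steps above can be sketched roughly as follows. This is only an illustration, not the pipeline's actual code: the `Extraction` tuple, the `top_k` parameter, and the capitalization heuristic in `looks_like_proper_noun` are all assumptions (a real system would more likely rely on part-of-speech tags).

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Extraction(NamedTuple):
    """Hypothetical (arg1, relation, arg2) triple."""
    arg1: str
    relation: str
    arg2: str


def looks_like_proper_noun(argument: str) -> bool:
    # Assumption: a crude stand-in for a POS-tag check.
    # Every whitespace-separated token must start with an uppercase letter.
    tokens = argument.split()
    return bool(tokens) and all(tok[0].isupper() for tok in tokens)


def target_relations(extractions: Iterable[Extraction], top_k: int) -> List[str]:
    # Step 1: keep extractions whose two arguments both look like proper nouns.
    proper = [
        e for e in extractions
        if looks_like_proper_noun(e.arg1) and looks_like_proper_noun(e.arg2)
    ]
    # Step 2: rank relations by frequency within that restricted set.
    counts = Counter(e.relation for e in proper)
    return [relation for relation, _ in counts.most_common(top_k)]


sample = [
    Extraction("Barack Obama", "was born in", "Hawaii"),
    Extraction("Albert Einstein", "was born in", "Ulm"),
    Extraction("Microsoft", "acquired", "Skype"),
    Extraction("he", "was born in", "a small town"),  # dropped: not proper nouns
]
top = target_relations(sample, top_k=2)
```

How many relations to keep (`top_k`) is not stated on this page, so it is left as a parameter.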

Determining target extractions

  1. Count how often each argument occurs
  2. Keep extractions from the target relations whose arguments occur commonly (100)
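
A minimal sketch of this filter is shown below. It makes several assumptions: the "(100)" is read as a minimum occurrence count, argument occurrences are counted over the extraction set itself, and both arguments must pass the threshold. The function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: (arg1, relation, arg2)
Extraction = Tuple[str, str, str]


def target_extractions(
    extractions: Iterable[Extraction],
    relations: Set[str],
    min_count: int = 100,  # assumption: "(100)" is a minimum occurrence count
) -> List[Extraction]:
    extractions = list(extractions)

    # Step 1: count how often each argument string occurs, in either position.
    arg_counts: Counter = Counter()
    for arg1, _, arg2 in extractions:
        arg_counts[arg1] += 1
        arg_counts[arg2] += 1

    # Step 2: keep extractions from the target relations whose arguments
    # both occur at least min_count times.
    return [
        (arg1, relation, arg2)
        for arg1, relation, arg2 in extractions
        if relation in relations
        and arg_counts[arg1] >= min_count
        and arg_counts[arg2] >= min_count
    ]


data = [
    ("Obama", "was born in", "Hawaii"),
    ("Obama", "visited", "Hawaii"),
    ("Einstein", "was born in", "Ulm"),
]
kept = target_extractions(data, {"was born in"}, min_count=2)
```

The tiny `min_count=2` in the usage lines is only there to make the toy data meaningful.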

Reducing the lemma grep results

  1. Remove duplicate sentences.
  2. Remove extractions that occur anomalously frequently.
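
One way these two reduction steps might look is sketched below. The page does not define "anomalously frequently", so the cutoff used here (a multiple of the median per-extraction count) is purely an assumption, as are the `Match` type and the `factor` parameter.

```python
from collections import Counter
from statistics import median
from typing import Hashable, List, Tuple

# Hypothetical representation of one lemma grep hit: (sentence, extraction)
Match = Tuple[str, Hashable]


def reduce_matches(matches: List[Match], factor: float = 10.0) -> List[Match]:
    # Step 1: remove duplicate sentences, keeping the first occurrence.
    # Note: this is sentence-level, so a sentence matched by two different
    # extractions is kept only for the first one.
    seen = set()
    unique: List[Match] = []
    for sentence, extraction in matches:
        if sentence not in seen:
            seen.add(sentence)
            unique.append((sentence, extraction))
    if not unique:
        return []

    # Step 2: drop extractions that occur far more often than is typical.
    # Assumption: "anomalous" means more than factor * median count.
    counts = Counter(extraction for _, extraction in unique)
    cutoff = factor * median(counts.values())
    return [(s, e) for s, e in unique if counts[e] <= cutoff]


matches = [("s1", "A"), ("s1", "A"), ("s2", "A"), ("s3", "B"), ("s4", "C")]
matches += [(f"x{i}", "D") for i in range(50)]  # "D" is the outlier
reduced = reduce_matches(matches)
```

A median-based cutoff is used because a few very frequent extractions would inflate a mean-based one; any other outlier rule could be substituted.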