Difference between revisions of "Datasets"

From Knowitall
Line 3:
 
== Corpora ==
 
* [http://googleresearch.blogspot.com/2013/05/syntactic-ngrams-over-time.html Syntactic Ngrams over Time] (2013)
 
+ * [http://www.yelp.com/dataset_challenge/ Yelp Dataset Challenge]: sample of Yelp data from Phoenix, AZ (2013)
 
* [http://storage.googleapis.com/books/ngrams/books/datasetsv2.html Google N-grams] N-grams from a large corpus of books (2010)
 

Revision as of 03:15, 18 July 2013

Below is a list of potentially useful NLP datasets.

Corpora

Knowledge bases

Entities

Relations

Paraphrasing