QA-SRL: Question-Answer Driven Semantic Role Labeling

Use Natural Language to Annotate Natural Language

We use question-answer pairs to model verbal predicate-argument structure. Each question starts with a wh-word (Who, What, Where, When, etc.) and contains a verb predicate from the sentence; each answer is a phrase in the sentence. For example:

UCD finished the 2006 championship as Dublin champions , by beating St Vincents in the final .
finished	Who finished something?	UCD
	What did someone finish?	the 2006 championship
	What did someone finish something as?	Dublin champions
	How did someone finish something?	by beating St Vincents in the final
beating	Who beat someone?	UCD
	When did someone beat someone?	in the final
	Who did someone beat?	St Vincents
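
The layout of the example above can be read into a simple structure. The following is a minimal sketch, assuming the tab-separated columns shown on this page (verb, question, answer, with an empty verb column meaning "same verb as the previous row"). The names `parse_example` and `answers_are_spans` are hypothetical helpers for illustration, and this layout is not necessarily the file format of the released dataset.

```python
# Hypothetical helpers: they parse the tab-separated layout of the example
# shown on this page, which may differ from the released dataset's format.

EXAMPLE = (
    "UCD finished the 2006 championship as Dublin champions , "
    "by beating St Vincents in the final .\n"
    "finished\tWho finished something?\tUCD\n"
    "\tWhat did someone finish?\tthe 2006 championship\n"
    "\tWhat did someone finish something as?\tDublin champions\n"
    "\tHow did someone finish something?\tby beating St Vincents in the final\n"
    "beating\tWho beat someone?\tUCD\n"
    "\tWhen did someone beat someone?\tin the final\n"
    "\tWho did someone beat?\tSt Vincents\n"
)


def parse_example(text):
    """Return (sentence, {verb: [(question, answer), ...]})."""
    lines = [line for line in text.split("\n") if line.strip()]
    sentence, rows = lines[0], lines[1:]
    frames = {}
    current_verb = None
    for row in rows:
        verb, question, answer = row.split("\t")
        if verb:  # a non-empty first column starts a new predicate
            current_verb = verb
            frames[current_verb] = []
        frames[current_verb].append((question, answer))
    return sentence, frames


def answers_are_spans(sentence, frames):
    """Check that every answer occurs verbatim in the sentence."""
    return all(
        answer in sentence
        for pairs in frames.values()
        for _, answer in pairs
    )


if __name__ == "__main__":
    sentence, frames = parse_example(EXAMPLE)
    for verb, pairs in frames.items():
        for question, answer in pairs:
            print(f"{verb}\t{question}\t{answer}")
    print(answers_are_spans(sentence, frames))
```

Grouping the pairs by verb mirrors the annotation itself: each verb predicate in the sentence gets its own set of questions, and every answer can be checked against the sentence text.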

Human-in-the-Loop Parsing

Coming soon!

Publications

The QA-SRL framework is described in the following paper:

Question-Answer Driven Semantic Role Labeling: Using Natural Language to Annotate Natural Language
Luheng He, Mike Lewis and Luke Zettlemoyer
In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP-2015)

The QA-SRL Dataset

File Format

Dataset	No. Sentences	No. Verbs	No. QAs
newswire-train	744	2020	4904
newswire-dev	249	664	1606
newswire-test	248	652	1599
Wikipedia-train	1174	2647	6414
Wikipedia-dev	392	895	2183
Wikipedia-test	393	898	2201

*The newswire data does not contain the original sentences. To get the complete data, you will need to download the following Python script and run it with the CoNLL-2009 English training data.
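
The counts in the table can be combined into a few summary figures. The following is a rough sketch that only uses the numbers listed above; `STATS`, `totals`, and `qas_per_verb` are illustrative names, not part of the released code.

```python
# Counts copied from the table above: (sentences, verbs, QAs) per split.
STATS = {
    "newswire-train": (744, 2020, 4904),
    "newswire-dev": (249, 664, 1606),
    "newswire-test": (248, 652, 1599),
    "Wikipedia-train": (1174, 2647, 6414),
    "Wikipedia-dev": (392, 895, 2183),
    "Wikipedia-test": (393, 898, 2201),
}


def totals(stats):
    """Sum each column (sentences, verbs, QAs) over all splits."""
    return tuple(sum(column) for column in zip(*stats.values()))


def qas_per_verb(stats):
    """Average number of question-answer pairs per verb, for each split."""
    return {name: qas / verbs for name, (_, verbs, qas) in stats.items()}


if __name__ == "__main__":
    print("sentences, verbs, QAs:", totals(STATS))
    for name, ratio in qas_per_verb(STATS).items():
        print(f"{name}\t{ratio:.2f}")
```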

Our Annotation Tool

Code for generating the annotation spreadsheets can be found here: https://github.com/luheng/qasrl_annotation

Contact

If you have any questions about the data or the code, please contact: {first name of first author} at cs dot washington dot edu