We use question-answer pairs to model verbal predicate-argument structure. The questions start with wh-words (Who, What, Where, When, etc.) and contain a verb predicate from the sentence; the answers are phrases in the sentence. For example:
UCD finished the 2006 championship as Dublin champions , by beating St Vincents in the final .

| Verb | Question | Answer |
|---|---|---|
| finished | Who finished something? | UCD |
| | What did someone finish? | the 2006 championship |
| | What did someone finish something as? | Dublin champions |
| | How did someone finish something? | by beating St Vincents in the final |
| beating | Who beat someone? | UCD |
| | When did someone beat someone? | in the final |
| | Who did someone beat? | St Vincents |
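
The example above can be held in a small data structure. The following is a minimal sketch, not the dataset's actual file format: the names `QAPair`, `VerbAnnotation`, and `answers_in_sentence` are hypothetical and chosen only for illustration. It encodes the annotation shown above and checks the property stated earlier, that every answer is a phrase taken from the sentence.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QAPair:
    """One question-answer pair for a verb (hypothetical container)."""
    question: str
    answer: str


@dataclass
class VerbAnnotation:
    """All question-answer pairs attached to one verb predicate."""
    verb: str
    qa_pairs: List[QAPair] = field(default_factory=list)


# The tokenized example sentence, copied from the table above.
sentence = (
    "UCD finished the 2006 championship as Dublin champions , "
    "by beating St Vincents in the final ."
)

annotations = [
    VerbAnnotation("finished", [
        QAPair("Who finished something?", "UCD"),
        QAPair("What did someone finish?", "the 2006 championship"),
        QAPair("What did someone finish something as?", "Dublin champions"),
        QAPair("How did someone finish something?",
               "by beating St Vincents in the final"),
    ]),
    VerbAnnotation("beating", [
        QAPair("Who beat someone?", "UCD"),
        QAPair("When did someone beat someone?", "in the final"),
        QAPair("Who did someone beat?", "St Vincents"),
    ]),
]


def answers_in_sentence(sentence: str, annotations: List[VerbAnnotation]) -> bool:
    """Return True if every answer occurs verbatim in the sentence."""
    return all(
        qa.answer in sentence
        for ann in annotations
        for qa in ann.qa_pairs
    )


if __name__ == "__main__":
    print(answers_in_sentence(sentence, annotations))
```

Grouping the pairs by verb mirrors the table layout, where each verb in the sentence carries its own set of questions.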
Coming soon!
The QA-SRL framework is described in the following paper:
Question-Answer Driven Semantic Role Labeling: Using Natural Language to Annotate Natural Language
Luheng He, Mike Lewis and Luke Zettlemoyer
In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP-2015)
| Dataset | No. Sentences | No. Verbs | No. QAs |
|---|---|---|---|
| newswire-train | 744 | 2020 | 4904 |
| newswire-dev | 249 | 664 | 1606 |
| newswire-test | 248 | 652 | 1599 |
| Wikipedia-train | 1174 | 2647 | 6414 |
| Wikipedia-dev | 392 | 895 | 2183 |
| Wikipedia-test | 393 | 898 | 2201 |
*The newswire data does not contain the original sentences. To obtain the complete data, download and run the following Python script with the CoNLL-2009 English training data.
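
For a quick overview of the dataset size, the per-split counts in the table above can be summed per domain. This is only an illustrative sketch: the `stats` dictionary is transcribed by hand from the table, and the helper `totals` is a hypothetical name, not part of any released tooling.

```python
# (sentences, verbs, QA pairs) per split, transcribed from the table above.
stats = {
    "newswire-train": (744, 2020, 4904),
    "newswire-dev": (249, 664, 1606),
    "newswire-test": (248, 652, 1599),
    "Wikipedia-train": (1174, 2647, 6414),
    "Wikipedia-dev": (392, 895, 2183),
    "Wikipedia-test": (393, 898, 2201),
}


def totals(prefix=""):
    """Sum each column over all splits whose name starts with `prefix`."""
    rows = [counts for name, counts in stats.items() if name.startswith(prefix)]
    return tuple(sum(column) for column in zip(*rows))


if __name__ == "__main__":
    for domain in ("newswire", "Wikipedia"):
        sentences, verbs, qas = totals(domain)
        print(f"{domain}: {sentences} sentences, {verbs} verbs, "
              f"{qas} QAs, {qas / verbs:.2f} QAs per verb")
```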