Tagger
The Tagger classes search for content in a sentence and mark it with a Type. Each tagger is responsible for deserializing its own XML content, but the descriptor is a field common to all taggers. The descriptor is a string that names the tagger. For example, a tagger that looks for "knife", "sword", and "gun" might have the descriptor "AttackWeapon". In hindsight, this field might have been better named "name".
Here is an example of a simple tagger.
<CaseInsensitiveKeywordTagger descriptor="WaterVehicle">
  <constraint type="NounPhraseConstraint" />
  <keywords>
    <keyword>watercraft</keyword>
    <keyword>tugcraft</keyword>
    <keyword>tanker</keyword>
    <keyword>yacht</keyword>
  </keywords>
</CaseInsensitiveKeywordTagger>
This tagger searches for the keywords, ignoring case, and tags the matches. The constraint further requires that the text to be tagged lie within a noun phrase (as defined by OpenNLP). What if we wanted to match phrases like "yachts" and "tugcrafts"? Then we would use a NormalizedKeywordTagger.
<NormalizedKeywordTagger descriptor="WaterVehicle">
  <constraint type="NounPhraseConstraint" />
  <keywords>
    <keyword>watercraft</keyword>
    <keyword>tugcraft</keyword>
    <keyword>tanker</keyword>
    <keyword>yacht</keyword>
  </keywords>
</NormalizedKeywordTagger>
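As a further illustration, the "AttackWeapon" tagger mentioned at the start of this page could be written along the same lines. This is only a sketch: the choice of CaseInsensitiveKeywordTagger and the use of a NounPhraseConstraint are assumptions made for the example, and the element layout is simply copied from the WaterVehicle taggers above.

```xml
<!-- Hypothetical tagger for the "AttackWeapon" example; the tagger type
     and the constraint are assumptions, not taken from the project. -->
<CaseInsensitiveKeywordTagger descriptor="AttackWeapon">
  <constraint type="NounPhraseConstraint" />
  <keywords>
    <keyword>knife</keyword>
    <keyword>sword</keyword>
    <keyword>gun</keyword>
  </keywords>
</CaseInsensitiveKeywordTagger>
```

Here the descriptor names the tagger, while the keyword list is the content that this particular tagger class deserializes for itself.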
There are many different types of taggers. For a complete list see the javadoc [1].