Difference between revisions of "Rule Learner/Rules"

From Knowitall
Revision as of 20:28, 10 May 2011

Rules are represented with an XML syntax. Here is an example.

<rule> 
  <form name="FounderOf">
    <argument type="TypeConstraint" part="argument1" name="Founder">
      <descriptor>Person</descriptor>
    </argument>
    <argument type="TypeConstraint" part="argument2" name="Organization">
      <descriptor>Organization</descriptor>
    </argument>
  </form>
  <constraints>
    <constraint type="TermConstraint" part="predicate">
      <term>founder</term>
    </constraint>
  </constraints>
</rule>


This rule extracts an ontological relation from an extraction when three conditions hold: the extraction's predicate contains the string "founder", its first argument contains the type "Person", and its second argument contains the type "Organization". The arguments of the ontological relation are the text covered by the "Person" and "Organization" types. Arguments must be defined with CaptureConstraints, that is, constraints that can be resolved to text when they match. Here is a more complicated rule.
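To make the matching behaviour described above concrete, here is a minimal Python sketch that applies the FounderOf rule to a single extraction. It is not the Rule Learner implementation: the `apply_rule` function, the dictionary layout of the extraction, and the example sentence are all assumptions made for illustration. It also assumes that a TermConstraint matches a whole token, case-insensitively.

```python
import xml.etree.ElementTree as ET

RULE_XML = """
<rule>
  <form name="FounderOf">
    <argument type="TypeConstraint" part="argument1" name="Founder">
      <descriptor>Person</descriptor>
    </argument>
    <argument type="TypeConstraint" part="argument2" name="Organization">
      <descriptor>Organization</descriptor>
    </argument>
  </form>
  <constraints>
    <constraint type="TermConstraint" part="predicate">
      <term>founder</term>
    </constraint>
  </constraints>
</rule>
"""


def apply_rule(rule_xml, extraction):
    """Return (relation name, captured arguments) if the rule matches, else None.

    `extraction` is a hypothetical structure: each part ("argument1",
    "predicate", "argument2") maps to its text and to a dict from type
    label to the text that the type covers.
    """
    rule = ET.fromstring(rule_xml)

    # Plain constraints only filter: every one of them must hold.
    for constraint in rule.find("constraints"):
        part = extraction[constraint.get("part").lower()]
        if constraint.get("type") != "TermConstraint":
            raise ValueError("unsupported constraint: " + constraint.get("type"))
        term = constraint.findtext("term").strip().lower()
        # Assumption: the term must appear as a whole token.
        if term not in part["text"].lower().split():
            return None

    # Argument constraints filter and also capture the text under the type.
    form = rule.find("form")
    captured = {}
    for argument in form:
        part = extraction[argument.get("part").lower()]
        wanted_type = argument.findtext("descriptor").strip()
        if wanted_type not in part["types"]:
            return None
        captured[argument.get("name")] = part["types"][wanted_type]
    return form.get("name"), captured


# Hypothetical extraction for "Ada Smith is the founder of the Acme Widget Company".
extraction = {
    "argument1": {"text": "Ada Smith", "types": {"Person": "Ada Smith"}},
    "predicate": {"text": "is the founder of", "types": {}},
    "argument2": {
        "text": "the Acme Widget Company",
        "types": {"Organization": "Acme Widget Company"},
    },
}

result = apply_rule(RULE_XML, extraction)
print(result)
```

Note that the captured text for the second argument is the span under the "Organization" type, not the whole argument, which is the sense in which a capturing constraint "resolves as text".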

 <rule>
   <form name="hasFather">
     <argument part="ARGUMENT1" type="TypeConstraint" name="Son">
       <descriptor>Person</descriptor>
     </argument>
     <argument part="ARGUMENT2" type="TypeConstraint" name="Father">
       <descriptor>Person</descriptor>
     </argument>
   </form>
   <constraints>
     <constraint part="PREDICATE" type="SequenceConstraint">
       <term>
         <lemma>the</lemma>
       </term>
       <term>
         <lemma>son</lemma>
       </term>
     </constraint>
   </constraints>
 </rule>

This rule will extract an ontological relation from an extraction if the extraction's predicate contains "the son" and both arguments contain the type "Person". The relation's arguments will be the text under the "Person" types.
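One way to read the SequenceConstraint in the rule above is as a list of per-token requirements that must be met by consecutive tokens of the predicate. The Python sketch below follows that reading. The names `parse_terms` and `sequence_matches`, the token dictionaries, and the assumption that the matched tokens must be contiguous are all hypothetical; the actual matching semantics are defined by the Rule Learner code.

```python
import xml.etree.ElementTree as ET

CONSTRAINT_XML = """
<constraint part="PREDICATE" type="SequenceConstraint">
  <term>
    <lemma>the</lemma>
  </term>
  <term>
    <lemma>son</lemma>
  </term>
</constraint>
"""


def parse_terms(constraint_xml):
    """Turn each <term> into a dict of required token attributes."""
    constraint = ET.fromstring(constraint_xml)
    return [
        {child.tag: child.text.strip() for child in term}
        for term in constraint.findall("term")
    ]


def sequence_matches(terms, tokens):
    """Return True if some contiguous run of tokens satisfies the terms in order."""
    n = len(terms)
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if all(
            token.get(key) == value
            for term, token in zip(terms, window)
            for key, value in term.items()
        ):
            return True
    return False


# Hypothetical tokenized predicate "is the son of", with lemmas attached.
predicate = [
    {"text": "is", "lemma": "be"},
    {"text": "the", "lemma": "the"},
    {"text": "son", "lemma": "son"},
    {"text": "of", "lemma": "of"},
]

terms = parse_terms(CONSTRAINT_XML)
print(sequence_matches(terms, predicate))
```

Because every child element of a `<term>` is treated as a required attribute, the same matcher would also handle terms that carry more than one requirement.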

The SequenceConstraint is rather verbose. In fact, this rule could also be represented with a StringConstraint.

 <rule>
   <form name="hasFather">
     <argument part="ARGUMENT1" type="TypeConstraint" name="Son">
       <descriptor>Person</descriptor>
     </argument>
     <argument part="ARGUMENT2" type="TypeConstraint" name="Father">
       <descriptor>Person</descriptor>
     </argument>
   </form>
   <constraints>
     <constraint part="PREDICATE" type="StringConstraint">
       <string>the son</string>
     </constraint>
   </constraints>
 </rule>

The SequenceConstraint is still necessary because it can express more complicated rules than a StringConstraint can. Consider the following.


 <rule>
   <form name="hasFather">
     <argument part="ARGUMENT1" type="TypeConstraint" name="Son">
       <descriptor>Person</descriptor>
     </argument>
     <argument part="ARGUMENT2" type="TypeConstraint" name="Father">
       <descriptor>Person</descriptor>
     </argument>
   </form>
   <constraints>
     <constraint part="PREDICATE" type="SequenceConstraint">
       <term>
         <lemma>the</lemma>
       </term>
       <term>
         <lemma>son</lemma>
         <pos>NN</pos>
       </term>
     </constraint>
   </constraints>
 </rule>

In this rule, there is a further constraint: the part of speech of "son" must be "NN" (common noun). The other reason to use a SequenceConstraint is that it better defines what it means to generalize the rule. At present, a StringConstraint does not define the generalize method because there is no intuitive way to do so. The SequenceConstraint does (to see how it behaves, look at the code or the javadoc).

The XML format is easy to serialize and deserialize, but it is rather hard to read, so each rule has a toString (and toMultilineString) method.

   hasFather {
       Son=TypeConstraint(ARGUMENT1, person)
       Father=TypeConstraint(ARGUMENT2, person)
   }
   constraints {
       SequenceConstraint(PREDICATE, [Term(lemma="the"), Term(lemma="son")])
   }

Ahh, much better. At least to me.
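For readers who want to see how the compact notation relates to the XML, the following Python sketch renders the hasFather rule in a similar form. It is only an approximation written for this page: `to_multiline_string` and `render_constraint` are hypothetical helpers, not the library's toString or toMultilineString, and the lowercasing of the descriptor simply mirrors the output shown above. Only constraints built from `<term>` elements with child fields are handled.

```python
import xml.etree.ElementTree as ET

RULE_XML = """
<rule>
  <form name="hasFather">
    <argument part="ARGUMENT1" type="TypeConstraint" name="Son">
      <descriptor>Person</descriptor>
    </argument>
    <argument part="ARGUMENT2" type="TypeConstraint" name="Father">
      <descriptor>Person</descriptor>
    </argument>
  </form>
  <constraints>
    <constraint part="PREDICATE" type="SequenceConstraint">
      <term>
        <lemma>the</lemma>
      </term>
      <term>
        <lemma>son</lemma>
      </term>
    </constraint>
  </constraints>
</rule>
"""


def render_constraint(constraint):
    """Render one <constraint> element, e.g. SequenceConstraint(PREDICATE, [...])."""
    terms = []
    for term in constraint.findall("term"):
        fields = ", ".join(
            '{}="{}"'.format(field.tag, field.text.strip()) for field in term
        )
        terms.append("Term({})".format(fields))
    return "{}({}, [{}])".format(
        constraint.get("type"), constraint.get("part"), ", ".join(terms)
    )


def to_multiline_string(rule_xml):
    """Render a rule in a compact, brace-delimited form (illustrative only)."""
    rule = ET.fromstring(rule_xml)
    form = rule.find("form")
    lines = [form.get("name") + " {"]
    for argument in form.findall("argument"):
        # Lowercasing mirrors the sample output; whether the library does so is assumed.
        descriptor = argument.findtext("descriptor").strip().lower()
        lines.append(
            "    {}={}({}, {})".format(
                argument.get("name"),
                argument.get("type"),
                argument.get("part"),
                descriptor,
            )
        )
    lines.append("}")
    lines.append("constraints {")
    for constraint in rule.find("constraints"):
        lines.append("    " + render_constraint(constraint))
    lines.append("}")
    return "\n".join(lines)


print(to_multiline_string(RULE_XML))
```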