Difference between revisions of "Rule Learner/Rules"

Revision as of 02:17, 7 May 2011

Rules are represented with an XML syntax. Here is an example.

<rule> 
  <form name="FounderOf">
    <argument type="TypeConstraint" part="argument1" name="Founder">
      <descriptor>Person</descriptor>
    </argument>
    <argument type="TypeConstraint" part="argument2" name="Organization">
      <descriptor>Organization</descriptor>
    </argument>
  </form>
  <constraints>
    <constraint type="TermConstraint" part="predicate">
      <term>founder</term>
    </constraint>
  </constraints>
</rule>
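To make the structure of a rule concrete, here is a minimal sketch that reads the example above with Python's standard `xml.etree.ElementTree` module. It is not part of the Rule Learner code; the function name `summarize_rule` and the dictionary layout it returns are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The example rule from this page, verbatim.
RULE_XML = """
<rule>
  <form name="FounderOf">
    <argument type="TypeConstraint" part="argument1" name="Founder">
      <descriptor>Person</descriptor>
    </argument>
    <argument type="TypeConstraint" part="argument2" name="Organization">
      <descriptor>Organization</descriptor>
    </argument>
  </form>
  <constraints>
    <constraint type="TermConstraint" part="predicate">
      <term>founder</term>
    </constraint>
  </constraints>
</rule>
"""


def summarize_rule(xml_text):
    """Hypothetical helper: pull the relation name, argument constraints,
    and remaining constraints out of a rule's XML."""
    root = ET.fromstring(xml_text)
    form = root.find("form")
    arguments = [
        {
            "name": arg.get("name"),
            "part": arg.get("part"),
            "type": arg.get("type"),
            "descriptor": arg.findtext("descriptor"),
        }
        for arg in form.findall("argument")
    ]
    constraints = [
        {"part": c.get("part"), "type": c.get("type"), "term": c.findtext("term")}
        for c in root.findall("constraints/constraint")
    ]
    return {
        "relation": form.get("name"),
        "arguments": arguments,
        "constraints": constraints,
    }


summary = summarize_rule(RULE_XML)
print(summary["relation"])
for argument in summary["arguments"]:
    print(argument)
```

The `form` element names the relation and its arguments, while `constraints` holds the conditions that are checked but do not become arguments.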


This rule extracts an ontological relation from an extraction if three conditions hold: the extraction's predicate contains the string "founder", the first argument contains the type "Person", and the second argument contains the type "Organization". The arguments of the ontological relation are the text under the types "Person" and "Organization". Here is a more complicated rule.

 <rule>
   <form name="hasFather">
     <argument part="ARGUMENT1" type="TypeConstraint" name="Son">
       <descriptor>Person</descriptor>
     </argument>
     <argument part="ARGUMENT2" type="TypeConstraint" name="Father">
       <descriptor>Person</descriptor>
     </argument>
   </form>
   <constraints>
     <constraint part="PREDICATE" type="SequenceConstraint">
       <term>
         <lemma>the</lemma>
       </term>
       <term>
         <lemma>son</lemma>
       </term>
     </constraint>
   </constraints>
 </rule>

This rule will extract an ontological relation from an extraction if the extraction's predicate contains "the son" and both arguments contain the type "Person". The relation's arguments will be the text under the "Person" types.
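The matching behaviour described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the Rule Learner's implementation: the `Token` and `Part` classes, the `types` mapping from a type name to the text under it, and the reading of "contains" as a contiguous run of lemmas are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    text: str
    lemma: str


@dataclass
class Part:
    """Hypothetical model of one part of an extraction
    (argument1, predicate, or argument2)."""
    tokens: list
    types: dict = field(default_factory=dict)  # type name -> text under that type


def contains_lemma_sequence(tokens, lemmas):
    """True if the lemmas occur as a contiguous run in tokens (assumed semantics)."""
    n = len(lemmas)
    return any(
        [t.lemma for t in tokens[i:i + n]] == lemmas
        for i in range(len(tokens) - n + 1)
    )


def apply_has_father(arg1, predicate, arg2):
    """Apply the hasFather rule; return the relation or None if it does not match."""
    # TypeConstraint on both arguments: each must contain a "Person" type.
    if "Person" not in arg1.types or "Person" not in arg2.types:
        return None
    # SequenceConstraint on the predicate: lemmas "the" then "son".
    if not contains_lemma_sequence(predicate.tokens, ["the", "son"]):
        return None
    return {
        "relation": "hasFather",
        "Son": arg1.types["Person"],
        "Father": arg2.types["Person"],
    }


isaac = Part([Token("Isaac", "isaac")], {"Person": "Isaac"})
abraham = Part([Token("Abraham", "abraham")], {"Person": "Abraham"})
is_the_son_of = Part(
    [Token("is", "be"), Token("the", "the"), Token("son", "son"), Token("of", "of")]
)
is_the_daughter_of = Part(
    [Token("is", "be"), Token("the", "the"), Token("daughter", "daughter"), Token("of", "of")]
)

match = apply_has_father(isaac, is_the_son_of, abraham)
no_match = apply_has_father(isaac, is_the_daughter_of, abraham)
print(match)
print(no_match)
```

Because the comparison is on lemmas rather than surface text, a predicate such as "are the sons of" would also match in this sketch, provided its tokens are lemmatized to "the" and "son".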

The SequenceConstraint is rather verbose. In fact, this rule could also be represented with a StringConstraint.

 <rule>
   <form name="hasFather">
     <argument part="ARGUMENT1" type="TypeConstraint" name="Son">
       <descriptor>Person</descriptor>
     </argument>
     <argument part="ARGUMENT2" type="TypeConstraint" name="Father">
       <descriptor>Person</descriptor>
     </argument>
   </form>
   <constraints>
     <constraint part="PREDICATE" type="StringConstraint">
       <string>the son</string>
     </constraint>
   </constraints>
 </rule>

The SequenceConstraint is still necessary because a rule may need more than a literal string match. Consider the following.


 <rule>
   <form name="hasFather">
     <argument part="ARGUMENT1" type="TypeConstraint" name="Son">
       <descriptor>Person</descriptor>
     </argument>
     <argument part="ARGUMENT2" type="TypeConstraint" name="Father">
       <descriptor>Person</descriptor>
     </argument>
   </form>
   <constraints>
     <constraint part="PREDICATE" type="SequenceConstraint">
       <term>
         <lemma>the</lemma>
       </term>
       <term>
         <lemma>son</lemma>
         <pos>NN</pos>
       </term>
     </constraint>
   </constraints>
 </rule>

In this rule, there is a further constraint: the part of speech of "son" must be "NN" (common noun). The other reason to use a SequenceConstraint is that it better defines what it means to generalize the rule. A StringConstraint, at present, does not define the generalize method because there is no intuitive way to generalize a raw string. The SequenceConstraint does (to see how it behaves, look at the code or the javadoc).

The XML format is easy to serialize and deserialize, but it is rather hard to read, so each rule has a toString (and toMultilineString) method that gives a more compact form.

   hasFather {
       Son=TypeConstraint(ARGUMENT1, person)
       Father=TypeConstraint(ARGUMENT2, person)
   }
   constraints {
       SequenceConstraint(PREDICATE, [TermConstraint(PREDICATE, lemma="the"), TermConstraint(PREDICATE, lemma="son", pos="NN")])
   }

Ahh, much better. At least in my mind. I could probably make this prettier.
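For readers who want the compact form without running the Java code, a similar rendering can be approximated from the XML. The sketch below is an assumption-laden Python imitation of the output shown above, not the actual toString or toMultilineString implementation: `render_rule` and `render_term` are hypothetical names, only constraints made of structured `term` elements are handled, and the descriptor is lowercased simply because the sample output shows "person".

```python
import xml.etree.ElementTree as ET

# The hasFather rule with the part-of-speech constraint, as on this page.
RULE_XML = """
<rule>
  <form name="hasFather">
    <argument part="ARGUMENT1" type="TypeConstraint" name="Son">
      <descriptor>Person</descriptor>
    </argument>
    <argument part="ARGUMENT2" type="TypeConstraint" name="Father">
      <descriptor>Person</descriptor>
    </argument>
  </form>
  <constraints>
    <constraint part="PREDICATE" type="SequenceConstraint">
      <term><lemma>the</lemma></term>
      <term><lemma>son</lemma><pos>NN</pos></term>
    </constraint>
  </constraints>
</rule>
"""


def render_term(part, term):
    """Render one <term> as TermConstraint(PART, field="value", ...)."""
    fields = ", ".join(f'{child.tag}="{child.text}"' for child in term)
    return f"TermConstraint({part}, {fields})"


def render_rule(xml_text):
    """Hypothetical imitation of the compact multi-line format shown above."""
    root = ET.fromstring(xml_text)
    form = root.find("form")
    lines = [f"{form.get('name')} {{"]
    for arg in form.findall("argument"):
        # Lowercasing mirrors the sample output; the real method may differ.
        descriptor = arg.findtext("descriptor").lower()
        lines.append(
            f"    {arg.get('name')}={arg.get('type')}({arg.get('part')}, {descriptor})"
        )
    lines.append("}")
    lines.append("constraints {")
    for constraint in root.findall("constraints/constraint"):
        part = constraint.get("part")
        terms = ", ".join(render_term(part, t) for t in constraint.findall("term"))
        lines.append(f"    {constraint.get('type')}({part}, [{terms}])")
    lines.append("}")
    return "\n".join(lines)


rendered = render_rule(RULE_XML)
print(rendered)
```

A StringConstraint or a plain-text TermConstraint would need its own rendering branch, which is left out here.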