Vulcan/DefinitionsExtractor

From Knowitall

Overview

Definitions fall into a small number of syntactic patterns. Each begins with “X is Y”, giving a class that X belongs to, followed by some distinguishing characteristic. A typical pattern is “X is Y that rel Z”, e.g. “Carnivore is an animal that eats other animals.” Another typical pattern is “X is Y prep Z”, e.g. “Fossil is the remains of a living thing from many years ago.”

We need a much more aggressive extractor than Open IE 4.0 for definitions, since we want to include all words in a definition, and to create relations from prepositions and gerunds that Open IE doesn’t handle.

One approach is to create a Definition Pattern Matcher that takes as input the POS-tagged sentence, the parse tree, and the SRL-based tuples. It then matches the definition to one of the manually identified definition patterns and creates a conjunction of tuples.
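
As a rough illustration of this idea, the sketch below handles only one pattern, “X is Y that rel Z”, and uses a regular expression over the raw sentence rather than the POS tags, parse tree, and SRL tuples described above. The names `PATTERN_THAT` and `match_x_is_y_that` are hypothetical, and treating the single word after “that” as the relation is a simplifying assumption.

```python
import re

# Hypothetical, simplified stand-in for one definition pattern:
#   X is Y that rel Z  =>  (X, is-a, Y) & (Y, rel, Z)
# Assumption: the relation is the single word following "that",
# and a leading article on Y ("a", "an", "the") is dropped.
PATTERN_THAT = re.compile(
    r"^(?P<x>.+?) is (?:an? |the )?(?P<y>.+?) that (?P<rel>\S+) (?P<z>.+?)\.?$",
    re.IGNORECASE,
)


def match_x_is_y_that(sentence):
    """Return a conjunction (list) of tuples, or None if the pattern does not match."""
    m = PATTERN_THAT.match(sentence.strip())
    if m is None:
        return None
    x, y, rel, z = (m.group(g).strip() for g in ("x", "y", "rel", "z"))
    return [(x, "is-a", y), (y, rel, z)]


if __name__ == "__main__":
    print(match_x_is_y_that(
        "Stamen is the part of the plant that holds the male cells for reproduction"
    ))
```

A surface regex like this cannot tell a multi-word relation from its argument, which is one reason the approach above relies on the parse tree and SRL tuples instead.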

Patterns

I analyzed a random set of 25 definition sentences and identified the following 15 definition patterns. I will continue this analysis on a larger sample.

1.	X is Y that rel Z => (X, is-a, Y) & (Y, rel, Z)   freq 3
	Stamen is the part of the plant that holds the male cells for reproduction
	(stamen, is-a, part of the plant) & (part of the plant, holds, the male cells for reproduction)

2.	X is Y that rel1 Z rel2 W => (X, is-a, Y) & (Y, rel1, (Z, rel2, W))   freq 2
	Adaptation is a special trait that helps an organism survive in its environment
	(Adaptation, is-a, special trait) & (special trait, helps, (organism, survive in, its environment))

3.	X is Y that rel1 when Z rel2 W => (X, is-a, Y) & (Y, rel1 when, (Z, rel2, W))   freq 2
	Volcano is a mountain that forms when red-hot melted rock flows through a crack onto earth's surface
	(Volcano, is-a, mountain) &
		(mountain, forms when, (red-hot melted rock, flows through, crack onto earth's surface))

4. 	X is Y rel by Z => (X, is-a, Y) & (Z, rel, Y)   freq 1
	Carbon dioxide is a gas breathed out by animals
	(Carbon dioxide, is-a, gas) & (animals, breathe out, gas)

5.	X is Y rel Z => (X, is-a, Y) & (Y, rel, Z)   freq 1
	Experiment is an organized test designed to support or disprove a hypothesis
	(Experiment,  is-a, organized test) & (organized test,  designed to support or disprove, a hypothesis)

6.	X is Y prep Z => (X, is-a, Y) & (Y, is prep, Z)   freq 6
	Carbon dioxide is a gas in the air
	(Carbon dioxide, is-a, gas) & (gas, is in, the air)

7.	X is Y prep Z rel W => (X, is-a, Y) & (Y, is prep, (Z, rel, W))   freq 2
	Inclined plane is a flat surface with one end higher than the other
	(Inclined plane, is-a, flat surface) & (flat surface, is with, (one end, higher than, the other))

8.	X is Y prep1 Z prep2 which W rel => (X, is-a, Y) & (Y, is prep1, Z) & (W, rel prep2, Z)  freq 1
	Stomata is the holes on the bottoms of leaves through which air and water pass
	(Stomata, is-a, holes) & (holes, is on, the bottoms of leaves) & (air and water, pass through, holes)

9.	X is Y prep rel Z => (X, is-a, Y) & (Y,  is prep rel, Z)   freq 1
	Classification is the grouping of things by using a set of rules
	(Classification, is-a, grouping of things)   & (grouping of things , is by using,  a set of rules)

10.  X is having Y rel Z => (X, is having, (Y, rel, Z))   freq 1
	Hybrid is having two or more different things mixed together
	(Hybrid, is having, (two or more different things, mixed, together))

11.  X is Y; for example Z => (X, is-a, Y) & (Z, is example of, X)   freq 1
	Star is a huge, burning sphere of gases; for example, the sun
	(Star,  is-a, huge, burning sphere of gases)  & (sun, is example of, Star)

12.  X is Y or Z => (X, is-a, Y) OR (X, is-a, Z)  freq 1
	Force is a push or pull
	(Force, is-a, push) OR (Force, is-a, pull)

13.  X is Y rel1 Z rel2 W => (X, is-a, Y) & (Y, rel1, Z) & (Y, rel2, W)  freq 1
	Electromagnet is an arrangement of wire wrapped around a core producing a temporary magnet
	(Electromagnet, is-a, arrangement of wire)   & (arrangement of wire,  wrapped around,  core)   &
		 (arrangement of wire,  producing, a temporary magnet)

14.  X is Y that Z rel1 as it rel2 W => (X, is-a, Y) & (Z, rel1, Y) & (Z, rel2, W)   freq 1
	Orbit is the path that an object such as a planet makes as it revolves around a second object
	(Orbit, is-a,  path)  &  (object such as a planet,  makes, path)  &  
		(object such as a planet,  revolves around,  a second object)

15.  X is Y prep1 Z prep2 W => (X, is-a, Y) & (Y, is prep1, Z) & (Y, is prep2, W)   freq 1
	Competition is the struggle among living things for the same resources
	(Competition, is-a, struggle) & (struggle, is among, living things) &
		 (struggle, is for, the same resources)
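
The outputs above mix flat tuples, nested tuples, and tuples joined by & or OR. One possible way to hold them in code is sketched below. The `Tuple3` and `Extraction` classes and the `render` function are hypothetical names chosen for illustration, not part of the extractor described on this page.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Tuple3:
    """One (arg1, rel, arg2) tuple; arg2 may itself be a nested tuple."""
    arg1: str
    rel: str
    arg2: Union[str, "Tuple3"]


@dataclass(frozen=True)
class Extraction:
    """Tuples joined by a single connective, '&' or 'OR'."""
    tuples: List[Tuple3]
    connective: str = "&"


def render(item):
    """Format a tuple or extraction in the notation used on this page."""
    if isinstance(item, Extraction):
        return f" {item.connective} ".join(render(t) for t in item.tuples)
    if isinstance(item, Tuple3):
        return f"({item.arg1}, {item.rel}, {render(item.arg2)})"
    return item  # plain string argument


# Pattern 3 (nested second argument) and pattern 12 (disjunction).
volcano = Extraction([
    Tuple3("Volcano", "is-a", "mountain"),
    Tuple3("mountain", "forms when",
           Tuple3("red-hot melted rock", "flows through",
                  "crack onto earth's surface")),
])
force = Extraction(
    [Tuple3("Force", "is-a", "push"), Tuple3("Force", "is-a", "pull")],
    connective="OR",
)

if __name__ == "__main__":
    print(render(volcano))
    print(render(force))
```

Allowing `arg2` to be another `Tuple3` covers the nested outputs of patterns 2, 3, 7, and 10, while the `connective` field separates the OR in pattern 12 from the conjunctions used everywhere else.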