Difference between revisions of "Solr"

From Knowitall
(Added instructions on how to make your own version of the Open IE demo.)
 
Revision as of 19:58, 29 January 2013

Open IE demo

To create your own instance of the Open IE demo, you will need to do three things:

  1. Copy the Solr installation and start the new instances
  2. Import the data
  3. Run the Open IE demo

Copy the Solr installation

  1. Copy the folder rv-n16:/scratch/usr/schmmd/disk1 to your own directory, e.g., rv-n15:/scratch/usr/jstn/disk1.
  2. Create 3 more folders on the same machine: /scratch/usr/jstn/disk2, /scratch2/usr/jstn/disk3, /scratch2/usr/jstn/disk4
  3. Check whether a Solr instance is already running on that machine by going to http://rv-n15.cs.washington.edu:8983/solr. If one is, you need to use a different port:
    1. Inside the disk1 folder, open solr/example/etc/jetty.xml and change the default in the line <Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set> to an unused port number.
  4. Open distribute.sh in the disk1 folder and modify the servers line to point to the folders you created. If you changed the port number, update every occurrence of it in the script as well.
  5. Run distribute.sh. It takes a while, because it copies the solr folder to each directory. It also starts a screen session for each Solr instance.
  6. Check that it's working by going to http://rv-n15.cs.washington.edu:8983/solr, changing the port if necessary. You should see the Solr dashboard.
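
As a rough sketch of steps 2, 3 and 6, the helpers below list the four instance folders and test whether anything answers on a Solr port. The function names instance_dirs, solr_url and solr_running are made up for illustration, and the use of scp and curl is an assumption; the hosts, username and paths are the examples used in the steps above.

```shell
#!/bin/sh
# Hypothetical helpers for the folder layout and port check described above.

# Print the four instance folders for a given username, one per line.
instance_dirs() {
    user=$1
    printf '%s\n' \
        "/scratch/usr/$user/disk1" \
        "/scratch/usr/$user/disk2" \
        "/scratch2/usr/$user/disk3" \
        "/scratch2/usr/$user/disk4"
}

# Build the Solr dashboard URL for a host and port.
solr_url() {
    printf 'http://%s:%s/solr\n' "$1" "$2"
}

# Succeed if something answers at the Solr URL (assumes curl is installed).
solr_running() {
    curl --silent --fail --output /dev/null --max-time 5 "$(solr_url "$1" "$2")"
}

# Example use on rv-n15 (commented out so the file can be sourced safely):
#   scp -r rv-n16:/scratch/usr/schmmd/disk1 /scratch/usr/jstn/disk1
#   instance_dirs jstn | sed 1d | xargs mkdir -p
#   solr_running rv-n15.cs.washington.edu 8983 && echo "port 8983 is already in use"
```

Before running distribute.sh, a successful solr_running call means the port is taken and jetty.xml needs a different one. After distribute.sh, the same call succeeding is the expected result for step 6.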

Import the data

Assuming you have a flat text file with ReVerbExtractionGroups in it, you now need to import the data. The import takes a day or two, because the data has to be indexed.

  1. Make sure your data file is available locally (i.e., not sitting in HDFS).
  2. Clone the openie-backend project with git.
  3. Check out the solr branch if it hasn't been merged yet (to find out, check for the SolrLoader class).
  4. Open pom.xml and set the mainClass element to edu.washington.cs.knowitall.browser.solr.SolrLoader.
  5. Run mvn clean compile assembly:single
  6. Go to the target folder and run cat mydata | java -jar openie-backend.jar http://rv-n15.cs.washington.edu:8983/solr stdin, adjusting the args as necessary.
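
Steps 1 and 6 can be wrapped in a small function, sketched here under a few assumptions: the name import_extractions is hypothetical, the jar is assumed to have been built already by the mvn command in step 5, and the data file is fed through a redirect, which is equivalent to the cat pipeline in step 6.

```shell
#!/bin/sh
# Hypothetical wrapper around the import command from step 6.
#   $1 = local data file, $2 = Solr URL, $3 = path to the assembled jar
import_extractions() {
    data=$1
    url=$2
    jar=$3
    # Step 1: the file must be on the local disk, not in HDFS.
    if [ ! -f "$data" ]; then
        echo "data file not found locally: $data" >&2
        return 1
    fi
    # Same as: cat "$data" | java -jar "$jar" "$url" stdin
    java -jar "$jar" "$url" stdin < "$data"
}

# Example use from the target folder (commented out so the file can be sourced):
#   import_extractions mydata http://rv-n15.cs.washington.edu:8983/solr openie-backend.jar
```

Because the import runs for a long time, it may be worth starting it inside a screen session, as distribute.sh does for the Solr instances.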

Run the demo

  1. Check out the solr branch of openie-demo (if it hasn't been merged).
  2. Run sbt run
  3. Visit your site at http://rv-n15.cs.washington.edu:9000
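
The "if it hasn't been merged" condition can be checked from the command line. The sketch below assumes the remote is named origin and that an unmerged branch would show up as origin/solr; the function name checkout_solr_if_present is hypothetical.

```shell
#!/bin/sh
# Check out the solr branch only if the remote still has one.
# Assumes the remote is called "origin".
checkout_solr_if_present() {
    if git rev-parse --verify --quiet "refs/remotes/origin/solr" >/dev/null 2>&1; then
        git checkout solr
    else
        echo "no origin/solr branch found; staying on the current branch" >&2
    fi
}

# Example use inside the openie-demo clone (commented out so the file can be sourced):
#   checkout_solr_if_present && sbt run
```

Once sbt run reports that the server has started, the site should be reachable on port 9000 as in step 3.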