Difference between revisions of "Scoobi"
From Knowitall
Revision as of 20:27, 17 January 2013
Scoobi (Github) is a Scala library for Hadoop. We have a number of Scoobi jobs in the browser-hadoop project under edu.washington.cs.knowitall.browser.hadoop.scoobi.
Running
To run a Scoobi job, set the main class in the browser-hadoop pom.xml and build the jar with mvn clean compile assembly:single. You can then test the job locally by running java -jar myjob.jar [args]. Or, you can run it on Hadoop using a command like this: hadoop jar myjob.jar -Dmapred.task.timeout=1200000 -Dmapred.child.java.opts=-Xmx4G [args] -- scoobi nolibjars
If you get an error like java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper, you probably forgot to add -- scoobi nolibjars to the end of the command.
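Since the trailing -- scoobi nolibjars is easy to forget, one option is to wrap the Hadoop invocation in a small shell function that always appends it. The following is only a sketch: the function name scoobi_hadoop_cmd and the example arguments input-dir and output-dir are hypothetical and not part of browser-hadoop; the -D options are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of browser-hadoop): prints the hadoop command
# for a Scoobi job jar, always appending the trailing "-- scoobi nolibjars".
scoobi_hadoop_cmd() {
  jar="$1"
  shift
  # Hadoop -D options come right after the jar name, the job's own
  # arguments follow, and the Scoobi options go last.
  echo "hadoop jar $jar" \
    "-Dmapred.task.timeout=1200000" \
    "-Dmapred.child.java.opts=-Xmx4G" \
    "$@" "-- scoobi nolibjars"
}

# Print the command for inspection (input-dir and output-dir are placeholders).
scoobi_hadoop_cmd myjob.jar input-dir output-dir
```

The function only prints the command, so you can check it before running it. Arguments containing spaces would need extra quoting, which this sketch does not handle.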