Nutch - Getting "Error: JAVA_HOME is not set." when trying to crawl
Question

First and foremost, I'm a Nutch/Hadoop newbie. I have installed Cassandra, and I have installed Nutch on the Master node of my EMR cluster. When I attempt to execute a crawl using the following command:

    sudo bin/crawl crawl urls -dir crawl -depth 3 -topN 5

I get:

    Error: JAVA_HOME is not set.

If I run the command without 'sudo', I get:

    Injector: starting at 2014-07-16 02:12:24
    Injector: crawlDb: urls/crawldb
    Injector: urlDir: crawl
    Injector: Converting injected urls to crawl db entries.
    Injector: org