Question
I'm trying to debug some issues with a single-node Hadoop cluster on my Mac. All of the setup docs say to add:
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
to remove this error:
Unable to load realm info from SCDynamicStore
This works, but it only seems to work for STDOUT. When I check my Hadoop logs directory, under "job_###/attempt_###/stderr", the error is still there:
2013-02-08 09:58:23.662 java[2772:1903] Unable to load realm info from SCDynamicStore
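Note that HADOOP_OPTS only affects the client JVM that submits the job; the per-task JVMs spawned by the TaskTracker take their flags from mapred.child.java.opts instead. This isn't part of the original post, but one way to push the same krb5 properties into the task JVMs is to set that property per job (the streaming jar path assumes a stock Hadoop 1.1.1 layout, and the input/output paths are placeholders):
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-1.1.1.jar \
  -D mapred.child.java.opts="-Xmx200m -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk" \
  -input /path/to/input \
  -output /path/to/output \
  -mapper /bin/cat
# keeping -Xmx200m preserves the Hadoop 1.x default heap that this property normally carries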
I'm having great difficulty loading RVM Rubies into the Hadoop environment to execute Ruby code with Hadoop streaming. STDOUT is printing that RVM is loaded and using the right Ruby/gemset, but my STDERR logs show:
env: ruby_noexec_wrapper: No such file or directory
Is there some way to find out what path Hadoop is actually using to execute the jobs, or if it's invoking some other environment here?
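One way to see exactly which PATH and environment the task processes get (not from the original post; the streaming jar path assumes a stock Hadoop 1.1.1 layout and the input/output paths are placeholders) is to run a throwaway streaming job whose mapper simply dumps the environment:
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-1.1.1.jar \
  -D mapred.reduce.tasks=0 \
  -input /path/to/any/input \
  -output /tmp/env_dump \
  -mapper /usr/bin/env
# the part-* files under /tmp/env_dump show PATH, GEM_HOME, etc. exactly as the tasks see them,
# which makes it easy to tell whether the RVM wrappers are on the task's PATH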
Further background:
I'm using Hadoop 1.1.1 installed via Homebrew. It's set up in a manner very similar to "INSTALLING HADOOP ON MAC OSX LION", and I'm debugging an implementation of wukong 3.0.0 as the wrapper for executing Hadoop jobs.
Answer 1:
To answer my own question so others can find it.
I appeared to be loading RVM in my hadoop-env.sh, but I must not have restarted the cluster after adding it. To make sure your Rubies and gemsets are loaded, add the standard RVM clause to hadoop-env.sh, something like:
[[ -s "/Users/ScotterC/.rvm/scripts/rvm" ]] && source "/Users/ScotterC/.rvm/scripts/rvm"
And make sure to restart the cluster so it picks it up. Oddly enough, without restarting, my logs would show that RVM was being loaded, but it clearly wasn't executing that Ruby and its respective gemfiles. After restarting, it worked.
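For a single-node Hadoop 1.x install, restarting is just the stop/start scripts, and a quick way to sanity-check the updated hadoop-env.sh (a sketch; the conf path depends on where Homebrew put things) is to source it and see whether the RVM wrapper resolves:
stop-all.sh && start-all.sh              # Hadoop 1.x scripts; use their full path if they aren't on your PATH
source /path/to/your/conf/hadoop-env.sh
which ruby ruby_noexec_wrapper           # both should resolve inside ~/.rvm once the rvm clause has been sourced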
Source: https://stackoverflow.com/questions/14775459/hadoop-environment-variables