Question
I am trying to set up Tachyon on the S3 filesystem. I am completely new to Tachyon and am still reading whatever I can find on it. My tachyon-env.sh is given below:
#!/usr/bin/env bash
# This file contains environment variables required to run Tachyon. Copy it as tachyon-env.sh and
# edit that to configure Tachyon for your site. At a minimum,
# the following variables should be set:
#
# - JAVA_HOME, to point to your JAVA installation
# - TACHYON_MASTER_ADDRESS, to bind the master to a different IP address or hostname
# - TACHYON_UNDERFS_ADDRESS, to set the under filesystem address.
# - TACHYON_WORKER_MEMORY_SIZE, to set how much memory to use (e.g. 1000mb, 2gb) per worker
# - TACHYON_RAM_FOLDER, to set where worker stores in memory data
# - TACHYON_UNDERFS_HDFS_IMPL, to set which HDFS implementation to use (e.g. com.mapr.fs.MapRFileSystem,
# org.apache.hadoop.hdfs.DistributedFileSystem)
# The following gives an example:
if [[ `uname -a` == Darwin* ]]; then
  # Assuming Mac OS X
  export JAVA_HOME=${JAVA_HOME:-$(/usr/libexec/java_home)}
  export TACHYON_RAM_FOLDER=/Volumes/ramdisk
  export TACHYON_JAVA_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc="
else
  # Assuming Linux
  if [ -z "$JAVA_HOME" ]; then
    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
  fi
  export TACHYON_RAM_FOLDER=/mnt/ramdisk
fi
export JAVA="$JAVA_HOME/bin/java"
export TACHYON_MASTER_ADDRESS=localhost
export TACHYON_UNDERFS_ADDRESS=s3n://test
#export TACHYON_UNDERFS_ADDRESS=hdfs://localhost:9000
export TACHYON_WORKER_MEMORY_SIZE=0.5GB
export TACHYON_UNDERFS_HDFS_IMPL=org.apache.hadoop.hdfs.DistributedFileSystem
CONF_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
export TACHYON_JAVA_OPTS+="
-Dlog4j.configuration=file:$CONF_DIR/log4j.properties
-Dtachyon.debug=false
-Dtachyon.underfs.address=$TACHYON_UNDERFS_ADDRESS
-Dtachyon.underfs.hdfs.impl=$TACHYON_UNDERFS_HDFS_IMPL
-Dtachyon.data.folder=$TACHYON_UNDERFS_ADDRESS/tmp/tachyon/data
-Dtachyon.workers.folder=$TACHYON_UNDERFS_ADDRESS/tmp/tachyon/workers
-Dtachyon.worker.memory.size=$TACHYON_WORKER_MEMORY_SIZE
-Dtachyon.worker.data.folder=$TACHYON_RAM_FOLDER/tachyonworker/
-Dtachyon.master.worker.timeout.ms=60000
-Dtachyon.master.hostname=$TACHYON_MASTER_ADDRESS
-Dtachyon.master.journal.folder=$TACHYON_HOME/journal/
-Dorg.apache.jasper.compiler.disablejsr199=true
-Djava.net.preferIPv4Stack=true
-Dfs.s3n.awsAccessKeyId=123
-Dfs.s3n.awsSecretAccessKey=456
"
# Master specific parameters. Default to TACHYON_JAVA_OPTS.
export TACHYON_MASTER_JAVA_OPTS="$TACHYON_JAVA_OPTS"
# Worker specific parameters that will be shared to all workers. Default to TACHYON_JAVA_OPTS.
export TACHYON_WORKER_JAVA_OPTS="$TACHYON_JAVA_OPTS"
However, when I try to format Tachyon, I get the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/jets3t/service/S3ServiceException
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.createDefaultStore(NativeS3FileSystem.java:224)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:214)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at tachyon.UnderFileSystemHdfs.<init>(UnderFileSystemHdfs.java:89)
at tachyon.UnderFileSystemHdfs.getClient(UnderFileSystemHdfs.java:56)
at tachyon.UnderFileSystem.get(UnderFileSystem.java:69)
at tachyon.UnderFileSystem.get(UnderFileSystem.java:54)
at tachyon.Format.formatFolder(Format.java:32)
at tachyon.Format.main(Format.java:59)
Caused by: java.lang.ClassNotFoundException: org.jets3t.service.S3ServiceException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 13 more
Should I change my jets3t jar file, or is it something else? The question may be really basic, but that is exactly my level right now; I have only run some basic tests with Tachyon so far.
I would be glad for any help!
Answer 1:
The problem is that accessing S3 requires more dependencies than are packaged by default. For Tachyon 0.5.0 to work with Hadoop 1.0.4, you first have to export paths to:
* jets3t 0.7.1
* commons-httpclient 3.1
You can do that like this (hijacking the TACHYON_CLASSPATH variable):
export TACHYON_CLASSPATH=~/.m2/repository/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar:~/.m2/repository/net/java/dev/jets3t/jets3t/0.7.1/jets3t-0.7.1.jar
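If those jars are not in your local Maven repository, point TACHYON_CLASSPATH at wherever they actually live. As a quick sanity check (a minimal bash sketch, nothing Tachyon-specific, that just walks the variable exported above), you can confirm every entry exists before formatting:
# Walk the colon-separated classpath and flag any entry missing on disk;
# a missing jar is exactly what produces the NoClassDefFoundError at format time.
IFS=':' read -ra ENTRIES <<< "$TACHYON_CLASSPATH"
for jar in "${ENTRIES[@]}"; do
  [ -f "$jar" ] || echo "Missing classpath entry: $jar"
done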
Then you also have to use this variable in the start script: everywhere TACHYON_JAR is used with -cp, prepend TACHYON_CLASSPATH to it.
Example:
(nohup $JAVA -cp $TACHYON_JAR -Dtachyon.home=$TACHYON_HOME -Dtachyon.logger.type="MASTER_LOGGER" -Dlog4j.configuration=file:$TACHYON_CONF_DIR/log4j.properties $TACHYON_MASTER_JAVA_OPTS tachyon.master.TachyonMaster > /dev/null 2>&1) &
becomes
(nohup $JAVA -cp $TACHYON_CLASSPATH:$TACHYON_JAR -Dtachyon.home=$TACHYON_HOME -Dtachyon.logger.type="MASTER_LOGGER" -Dlog4j.configuration=file:$TACHYON_CONF_DIR/log4j.properties $TACHYON_MASTER_JAVA_OPTS tachyon.master.TachyonMaster > /dev/null 2>&1) &
Finally, you can format your S3 bucket and start Tachyon:
./bin/tachyon format
./bin/tachyon-start local
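Once it is up, a quick way to check that the master is actually serving against the S3 under filesystem is to list the root with the Tachyon shell (assuming the tfs command is available in your 0.5.0 build):
./bin/tachyon tfs ls /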
A warning about the s3n scheme: I've looked deeper into the code and found some odd things regarding S3 credentials. Apparently, only the s3n scheme will work (because only this flavor is injected/copied into the conf). Hence, the under-filesystem URL for S3 must use the s3n scheme, and so must your Java properties in tachyon-env.sh. Luckily, your configuration is already fine, but others' might not be.
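To make that concrete, these are the settings in the tachyon-env.sh above that all have to agree on the s3n flavor (the bucket name and keys are the placeholder values from the question):
# The under-FS address uses the s3n scheme ...
export TACHYON_UNDERFS_ADDRESS=s3n://test
# ... and the credentials use the matching fs.s3n.* property keys.
export TACHYON_JAVA_OPTS+="
  -Dfs.s3n.awsAccessKeyId=123
  -Dfs.s3n.awsSecretAccessKey=456
"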
Source: https://stackoverflow.com/questions/26637407/error-in-setting-up-tachyon-on-s3-under-filesystem