What should hadoop.tmp.dir be?

礼貌的吻别 2020-12-24 03:28

Hadoop has a configuration parameter, hadoop.tmp.dir, which, as per the documentation, is "A base for other temporary directories." I presume, this path ref

3 answers
  • 2020-12-24 03:39

    Had a look around for information on this one. The only thing I could come up with was this passage from the Amazon Elastic MapReduce Dev Guide:

    In hadoop-site.xml, we set hadoop.tmp.dir to /mnt/var/lib/hadoop/tmp. /mnt is where we mount the "extra" EC2 volumes, which can contain a lot more data than the default volume. (The exact amount depends on instance type.) Hadoop's RunJar.java (the module that unpacks the input JARs) interprets hadoop.tmp.dir as a Hadoop file system path rather than a local path, so it writes to the path in HDFS instead of a local path. HDFS is mounted under /mnt (specifically /mnt/var/lib/hadoop/dfs/). So, you can write lots of data to it.
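
    For reference, the setting described in that passage would look roughly like the fragment below. This is only a sketch: the value is the one quoted above, while the file name is an assumption that depends on the Hadoop version (older releases read hadoop-site.xml, later ones read core-site.xml).

    ```xml
    <!-- hadoop-site.xml on old releases; core-site.xml on later ones (assumption) -->
    <configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <!-- Path quoted from the EMR guide; /mnt is the larger extra volume -->
        <value>/mnt/var/lib/hadoop/tmp</value>
      </property>
    </configuration>
    ```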

  • 2020-12-24 03:51

    Let me add a bit more to kkrugler's answer:

    There are three HDFS properties whose default values contain hadoop.tmp.dir:

    1. dfs.name.dir: the directory where the namenode stores its metadata, with default value ${hadoop.tmp.dir}/dfs/name.
    2. dfs.data.dir: the directory where HDFS data blocks are stored, with default value ${hadoop.tmp.dir}/dfs/data.
    3. fs.checkpoint.dir: the directory where the secondary namenode stores its checkpoints, with default value ${hadoop.tmp.dir}/dfs/namesecondary.

    This is why you saw the /mnt/hadoop-tmp/hadoop-${user.name} in your HDFS after formatting namenode.
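
    If you would rather not have these three directories follow hadoop.tmp.dir, one option is to set them explicitly. The fragment below is a minimal sketch, not a recommended layout: the /data/hadoop/... paths are hypothetical placeholders, and the property names are the older ones used in this answer (Hadoop 2 and later rename them to dfs.namenode.name.dir, dfs.datanode.data.dir and dfs.namenode.checkpoint.dir).

    ```xml
    <!-- hdfs-site.xml: explicit locations, so the defaults based on
         ${hadoop.tmp.dir} no longer apply. All paths are hypothetical. -->
    <configuration>
      <property>
        <name>dfs.name.dir</name>
        <value>/data/hadoop/dfs/name</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/data/hadoop/dfs/data</value>
      </property>
      <property>
        <name>fs.checkpoint.dir</name>
        <value>/data/hadoop/dfs/namesecondary</value>
      </property>
    </configuration>
    ```

    Keeping the namenode metadata out of a directory named "tmp" also makes it less likely to be removed by accident during cleanup.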

  • 2020-12-24 04:01

    It's confusing, but hadoop.tmp.dir is used as the base for temporary directories locally, and also in HDFS. The documentation isn't great, but mapred.system.dir is set by default to "${hadoop.tmp.dir}/mapred/system", and this defines the path on HDFS where the Map/Reduce framework stores system files.

    If you don't want these to be tied together, you can edit your mapred-site.xml so that the definition of mapred.system.dir no longer depends on ${hadoop.tmp.dir}.
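
    As a rough sketch of that override (the /hadoop/mapred/system value is a hypothetical example, not a documented default):

    ```xml
    <!-- mapred-site.xml: give mapred.system.dir its own value so it no
         longer follows ${hadoop.tmp.dir}. The path is hypothetical. -->
    <configuration>
      <property>
        <name>mapred.system.dir</name>
        <value>/hadoop/mapred/system</value>
      </property>
    </configuration>
    ```

    As described above, this value is treated as a path on HDFS, so changing it does not move anything on the local disk.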
