Question
On the cluster I'm working on, every user is given a 60 GB Hadoop quota. Historically, the project I'm working on generates a lot of Hive queries. To make things run faster, I'm trying to parallelize these queries (which are unrelated to each other), but as a result the directory /user/{myusername}/.staging/ fills up with job_{someid} directories, which in turn are filled with the Hive jars and consume these 60 GB very fast. While I can limit the parallelization factor, I would also like to see whether I can ask Hive to put these jars in a different directory, say /tmp/{myusername}, where I have a lot more space.
Any idea how I can tell Hive/Beeline to create the .staging directory under /tmp/{myusername}?
Answer 1:
The easiest way is to set it when you start your beeline session:
beeline --hive.exec.stagingdir=/tmp/{myusername}
I think you can also do it via !set inside beeline, but I don't have the syntax to hand.
Answer 2:
The above doesn't work.
We found that the following works:
beeline --hiveconf hive.exec.stagingdir=/tmp/{myusername}
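If the queries are launched from a script, the invocation above can be built per user rather than typed by hand. This is only a minimal sketch, assuming a POSIX shell: the helper names staging_dir and beeline_cmd and the JDBC URL are hypothetical placeholders, and the command is printed as a dry run rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: build the staging path for a user.
# /tmp/<user> mirrors the /tmp/{myusername} layout from the question.
staging_dir() {
    printf '/tmp/%s\n' "$1"
}

# Hypothetical helper: print (not run) the beeline invocation so it can be
# inspected first. The JDBC URL passed as $2 is a placeholder assumption.
beeline_cmd() {
    printf 'beeline -u %s --hiveconf hive.exec.stagingdir=%s\n' \
        "$2" "$(staging_dir "$1")"
}

# Dry run for the current user; falls back to a dummy name if USER is unset.
beeline_cmd "${USER:-myusername}" "jdbc:hive2://localhost:10000"
```

Once the printed command looks right for your cluster, it can be run directly or passed to eval.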
Source: https://stackoverflow.com/questions/37908837/hive-beeline-how-can-i-set-the-job-staging-directory