Question
I am getting a "No space left on device" error when running my Amazon EMR jobs, with m1.large as the instance type for the Hadoop instances created by the jobflow. The job generates at most approx. 10 GB of data, and the capacity of an m1.large instance is supposed to be 2 x 420 GB (according to: EC2 instance types), so I am confused how just 10 GB of data could lead to a "disk space full" kind of message. I am aware that this kind of error can also occur if the total number of inodes allowed on the filesystem has been completely exhausted, but that limit is in the millions and I am pretty sure my job is not producing that many files. I have also noticed that when I create an m1.large EC2 instance on its own, it is assigned a root volume of only 8 GB by default. Could this be how EMR provisions its instances as well? And if so, when do the 420 GB disks get allotted to an instance?
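For context, the two failure modes mentioned above (out of bytes vs. out of inodes) can be told apart from the shell. A minimal diagnostic sketch (commands only; output omitted, and none of this is specific to EMR):

$ df -h    # block usage per filesystem; 100% Use% here means you are out of bytes
$ df -i    # inode usage per filesystem; 100% IUse% here means you are out of inodes even if bytes remain
$ lsblk    # lists all block devices, including instance-store disks that are not mounted for local use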
Also, here is the output of "df -hi", "mount" and "lsblk":
$ df -hi
Filesystem                            Inodes IUsed IFree IUse% Mounted on
/dev/xvda1                              640K  100K  541K   16% /
tmpfs                                   932K     3  932K    1% /lib/init/rw
udev                                    930K   454  929K    1% /dev
tmpfs                                   932K     3  932K    1% /dev/shm
ip-10-182-182-151.ec2.internal:/mapr    100G   50G   50G   50% /mapr

$ mount
/dev/xvda1 on / type ext3 (rw,noatime)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/var/run on /run type none (rw,bind)
/var/lock on /run/lock type none (rw,bind)
/dev/shm on /run/shm type none (rw,bind)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
ip-10-182-182-151.ec2.internal:/mapr on /mapr type nfs (rw,addr=10.182.182.151)
$ lsblk
NAME  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda1 202:1    0   10G  0 disk /
xvdb  202:16   0  420G  0 disk
xvdc  202:32   0  420G  0 disk
Answer 1:
With the help of @slayedbylucifer I was able to identify that the problem was that, by default, the complete disk space is made available to HDFS (MapR-FS) on the cluster. As a result, only the default 10 GB mounted on / is left for local use by the machine. There is an option called --mfs-percentage which can be used (when using the MapR distribution of Hadoop) to specify the split of disk space between the local filesystem and HDFS; the local-filesystem share is mounted at /var/tmp.
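For reference, a minimal sketch of passing that option at cluster launch, assuming the legacy elastic-mapreduce CLI and that --mfs-percentage is forwarded to the MapR distribution through --args (the edition/version values below are purely illustrative):

$ elastic-mapreduce --create --alive \
    --name "mapr-job" \
    --instance-type m1.large \
    --num-instances 3 \
    --supported-product mapr \
    --args "--edition,m3,--version,2.1.3,--mfs-percentage,60"

With a 60/40 split like this, roughly 40% of the instance-store disks stays available to the local filesystem under /var/tmp instead of being handed entirely to MapR-FS.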
. Make sure that the option mapred.local.dir
is set to a directory inside /var/tmp
because that is where all the logs of the tasktracker attempts go in which can be huge in size for big jobs. The logging in my case was causing the disk space error. I set the value of --mfs-percentage
to 60 and was able to run the job successfully thereafter.
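A minimal sketch of that setting, assuming Hadoop 1.x-style configuration; the directory /var/tmp/mapred/local and the location of mapred-site.xml are illustrative and depend on the install:

$ sudo mkdir -p /var/tmp/mapred/local    # local dir on the /var/tmp share created by --mfs-percentage
# then add the following property to mapred-site.xml and restart the TaskTrackers:
<property>
  <name>mapred.local.dir</name>
  <value>/var/tmp/mapred/local</value>
</property>

The point of the change is simply that intermediate map output and attempt logs then land on the large local share rather than on the small 10 GB root volume.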
Source: https://stackoverflow.com/questions/19561578/getting-no-space-left-on-device-for-approx-10-gb-of-data-on-emr-m1-large-inst