Saturday, February 11, 2017

Disable disk calculation/allocation in MapR

By default, each map task is allocated 0.5 disks while each reduce task is allocated 1.33 disks:


[root@node9 ~]# hadoop conf | grep mapreduce.*.disk 
<property><name>mapreduce.map.disk</name><value>0.5</value></property>
<property><name>mapreduce.reduce.disk</name><value>1.33</value></property>
[root@node9 ~]#

How do I disable disk calculation/allocation in MapR?



Job Level:
You can set the following value when you run a job:


-Dmapreduce.map.disk=0 and/or -Dmapreduce.reduce.disk=0
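
For example, with one of the stock MapReduce example jobs (the jar path and the input/output directories below are only illustrative; adjust them for your install):

hadoop jar /opt/mapr/hadoop/hadoop-*/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount \
  -Dmapreduce.map.disk=0 -Dmapreduce.reduce.disk=0 /user/mapr/in /user/mapr/out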


Cluster level:
You can set the following properties in /opt/mapr/hadoop/hadoop-*/etc/hadoop/mapred-site.xml on all the nodes (no restart is needed). A sketch for pushing the edited file out is shown after the XML below.


<property>
<name>mapreduce.map.disk</name>
<value>0</value>
<description>Number of "disks" allocated per map task</description>
</property>


<property>
<name>mapreduce.reduce.disk</name>
<value>0</value>
<description>Number of "disks" allocated per reduce task</description>
</property>
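
To copy the edited file to every node, a simple loop works (a sketch only; the node names are placeholders, and hadoop-2.7.0 stands for whatever hadoop-* directory your install uses):

for n in node1 node2 node3 node4 node5 node6 node7 node8 node9; do
  scp /opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/mapred-site.xml \
      $n:/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/mapred-site.xml
done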

After the change, the values should look like this:

[root@node9 ~]# hadoop conf | grep mapreduce.*.disk
<property><name>mapreduce.map.disk</name><value>0</value><source>mapred-site.xml</source></property>
<property><name>mapreduce.reduce.disk</name><value>0</value><source>mapred-site.xml</source></property>
[root@node9 ~]#

Note: The cluster is fully functional without using any 'disks' resource, but applications will hang in the SCHEDULED state if any resource is set to zero in the scheduler configuration, because the ApplicationMaster container cannot be assigned to the app_attempt (the AM context cannot be built when any resource is zero).

So the fair scheduler configuration should contain at least some value for every resource (non-zero and non-fractional).

For this reason, let's keep the disk value as 1 in the fair scheduler allocation file (fair-scheduler.xml), even though our jobs will never ask for disks:

<minResources>143155 mb,58 vcores,1 disks</minResources>
<maxResources>386310 mb,118 vcores,1 disks</maxResources>
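
For context, this is roughly how those lines sit inside a queue definition in fair-scheduler.xml (the queue name is illustrative and the memory/vcore figures are simply taken from the example above):

<allocations>
  <queue name="default">
    <minResources>143155 mb,58 vcores,1 disks</minResources>
    <maxResources>386310 mb,118 vcores,1 disks</maxResources>
  </queue>
</allocations>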
