Thursday, April 14, 2016

Configuring the Fair Scheduler with ACL on MapR Cluster


This blog assumes you have an unsecured MapR 4.0.2 cluster installed.

1) Add the lines below to yarn-site.xml on all ResourceManager nodes, then restart the ResourceManagers.

vi /opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop/yarn-site.xml

<property><name>yarn.admin.acl</name><value>mapr</value></property>
<property><name>yarn.acl.enable</name><value>true</value></property>

With this setting, the mapr user is the administrator of the YARN cluster and can kill any job and submit jobs to any queue. By default, yarn.admin.acl is set to *, which means anyone can be an admin.

Note: An empty value for yarn.admin.acl is not considered valid by YARN; it falls back to the value configured in yarn-default.xml, which allows access to everyone.
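
The restart from step 1 can be done through the MapR Warden, for example with maprcli (the node name below is a placeholder for your ResourceManager host, and the exact maprcli syntax may vary by MapR version):

[root@master ~]# maprcli node services -name resourcemanager -action restart -nodes master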


2) Now add the settings below to fair-scheduler.xml.

vi /opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop/fair-scheduler.xml

<allocations>
<queue name="root">
<aclSubmitApps>mapr</aclSubmitApps>
<aclAdministerApps>mapr</aclAdministerApps>

<queue name="mapr">
<minResources>20000 mb,40 vcores,5 disks</minResources>
<maxResources>30000 mb,50 vcores,50 disks</maxResources>
<maxRunningApps>10</maxRunningApps>
<weight>1.0</weight>
<schedulingPolicy>fair</schedulingPolicy>
<aclSubmitApps>mapr</aclSubmitApps>
</queue>

<queue name="abizer">
<minResources>20000 mb,40 vcores,5 disks</minResources>
<maxResources>30000 mb,50 vcores,50 disks</maxResources>
<maxRunningApps>10</maxRunningApps>
<weight>1.0</weight>
<schedulingPolicy>fair</schedulingPolicy>
<aclSubmitApps>abizer</aclSubmitApps>
</queue>
</queue>
</allocations>
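
The Fair Scheduler reloads fair-scheduler.xml on its own (roughly every 10 seconds), so changes to the allocation file do not require an RM restart. As a quick sanity check that the queues were created, you can list them, for example:

[mapr@master ~]$ hadoop queue -list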

3) The output below shows the queue ACLs for each user:
i) The root user has no permission to submit to or administer any queue.

[root@master hadoop]# hadoop queue -showacls
Queue acls for user :  root
Queue  Operations
=====================
root 
root.abizer 
root.default 
root.mapr 

Eg 1:
[root@master ~]# yarn jar /opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1-mapr-1501.jar pi 1 2
Number of Maps  = 1
Samples per Map = 2
Wrote input for Map #0
Starting Job
16/04/14 21:42:28 INFO input.FileInputFormat: Total input paths to process : 1
16/04/14 21:42:28 INFO mapreduce.JobSubmitter: number of splits:1
16/04/14 21:42:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460696891161_0009
16/04/14 21:42:28 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
16/04/14 21:42:28 INFO impl.YarnClientImpl: Submitted application application_1460696891161_0009
16/04/14 21:42:28 INFO mapreduce.JobSubmitter: Cleaning up the staging area maprfs:/var/mapr/cluster/yarn/rm/staging/root/.staging/job_1460696891161_0009
java.io.IOException: Failed to run job : User root cannot submit applications to queue root.root
            at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:321)

Eg 2:
[root@master ~]# yarn jar /opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1-mapr-1501.jar pi -Dmapreduce.job.queuename=root.abizer 1 2
Number of Maps  = 1
Samples per Map = 2
Wrote input for Map #0
Starting Job
16/04/14 21:42:13 INFO input.FileInputFormat: Total input paths to process : 1
16/04/14 21:42:13 INFO mapreduce.JobSubmitter: number of splits:1
16/04/14 21:42:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460696891161_0008
16/04/14 21:42:14 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
16/04/14 21:42:14 INFO impl.YarnClientImpl: Submitted application application_1460696891161_0008
16/04/14 21:42:14 INFO mapreduce.JobSubmitter: Cleaning up the staging area maprfs:/var/mapr/cluster/yarn/rm/staging/root/.staging/job_1460696891161_0008
java.io.IOException: Failed to run job : User root cannot submit applications to queue root.abizer


ii) The mapr user has permission to administer any queue and can submit applications to any queue.

[mapr@master hadoop]$ hadoop queue -showacls
Queue acls for user :  mapr
Queue  Operations
=====================
root  ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
root.abizer  ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
root.mapr  ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
root.root  ADMINISTER_QUEUE,SUBMIT_APPLICATIONS

[mapr@master root]$ yarn jar /opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1-mapr-1501.jar pi -Dmapreduce.job.queuename=root.abizer 1 2
Number of Maps  = 1
Samples per Map = 2
Wrote input for Map #0
Starting Job
16/04/14 21:40:34 INFO input.FileInputFormat: Total input paths to process : 1
16/04/14 21:40:34 INFO mapreduce.JobSubmitter: number of splits:1
16/04/14 21:40:34 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460696891161_0007
16/04/14 21:40:35 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
16/04/14 21:40:35 INFO impl.YarnClientImpl: Submitted application application_1460696891161_0007
16/04/14 21:40:35 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1460696891161_0007/
16/04/14 21:40:35 INFO mapreduce.Job: Running job: job_1460696891161_0007
16/04/14 21:40:43 INFO mapreduce.Job: Job job_1460696891161_0007 running in uber mode : false
16/04/14 21:40:43 INFO mapreduce.Job:  map 0% reduce 0%
16/04/14 21:40:49 INFO mapreduce.Job:  map 100% reduce 0%
16/04/14 21:40:55 INFO mapreduce.Job:  map 100% reduce 100%
16/04/14 21:40:55 INFO mapreduce.Job: Job job_1460696891161_0007 completed successfully
16/04/14 21:40:55 INFO mapreduce.Job: Counters: 46
            File System Counters
                        FILE: Number of bytes read=0
                        FILE: Number of bytes written=162489
                        FILE: Number of read operations=0
                        FILE: Number of large read operations=0
                        FILE: Number of write operations=0
                        MAPRFS: Number of bytes read=336
                        MAPRFS: Number of bytes written=303
                        MAPRFS: Number of read operations=43
                        MAPRFS: Number of large read operations=0
                        MAPRFS: Number of write operations=59
            Job Counters
                        Launched map tasks=1
                        Launched reduce tasks=1
                        Data-local map tasks=1
                        Total time spent by all maps in occupied slots (ms)=4040
                        Total time spent by all reduces in occupied slots (ms)=10986
                        Total time spent by all map tasks (ms)=4040
                        Total time spent by all reduce tasks (ms)=3662
                        Total vcore-seconds taken by all map tasks=4040
                        Total vcore-seconds taken by all reduce tasks=3662
                        Total megabyte-seconds taken by all map tasks=4136960
                        Total megabyte-seconds taken by all reduce tasks=11249664
                        DISK_MILLIS_MAPS=2020
                        DISK_MILLIS_REDUCES=4870
            Map-Reduce Framework
                        Map input records=1
                        Map output records=2
                        Map output bytes=18
                        Map output materialized bytes=0
                        Input split bytes=134
                        Combine input records=0
                        Combine output records=0
                        Reduce input groups=2
                        Reduce shuffle bytes=24
                        Reduce input records=2
                        Reduce output records=0
                        Spilled Records=4
                        Shuffled Maps =1
                        Failed Shuffles=0
                        Merged Map outputs=2
                        GC time elapsed (ms)=40
                        CPU time spent (ms)=1000
                        Physical memory (bytes) snapshot=780922880
                        Virtual memory (bytes) snapshot=5446422528
                        Total committed heap usage (bytes)=722468864
            Shuffle Errors
                        IO_ERROR=0
            File Input Format Counters
                        Bytes Read=118
            File Output Format Counters
                        Bytes Written=97
Job Finished in 21.633 seconds
Estimated value of Pi is 4.00000000000000000000
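
Because mapr has ADMINISTER_QUEUE on every queue, it can also kill applications submitted by other users, for example (substitute the application ID of the job you want to kill):

[mapr@master ~]$ yarn application -kill <application_id>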

iii) The abizer user has permission to submit applications to the "root.abizer" queue and cannot administer any queue (by default, a user can still manage their own applications).

[abizer@master hadoop]$ hadoop queue -showacls
Queue acls for user :  abizer

Queue  Operations
=====================
root 
root.abizer  SUBMIT_APPLICATIONS
root.default 
root.mapr 


[abizer@master root]$ yarn jar /opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1-mapr-1501.jar pi -Dmapreduce.job.queuename=root.mapr 1 2
Number of Maps  = 1
Samples per Map = 2
Wrote input for Map #0
Starting Job
16/04/14 21:38:29 INFO input.FileInputFormat: Total input paths to process : 1
16/04/14 21:38:29 INFO mapreduce.JobSubmitter: number of splits:1
16/04/14 21:38:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460696891161_0006
16/04/14 21:38:30 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
16/04/14 21:38:30 INFO impl.YarnClientImpl: Submitted application application_1460696891161_0006
16/04/14 21:38:30 INFO mapreduce.JobSubmitter: Cleaning up the staging area maprfs:/var/mapr/cluster/yarn/rm/staging/abizer/.staging/job_1460696891161_0006
java.io.IOException: Failed to run job : User abizer cannot submit applications to queue root.mapr
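
The same submission pointed at abizer's own queue would be expected to succeed, since root.abizer grants him SUBMIT_APPLICATIONS:

[abizer@master root]$ yarn jar /opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1-mapr-1501.jar pi -Dmapreduce.job.queuename=root.abizer 1 2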

4) Audit logging:

The Resource Manager will log attempts to kill or move applications. These messages are logged by the RMAuditLogger as part of standard operational logging. 

For example: mapr started application application_1460696891161_0018, and when another user (abizer1 in the log below) tries to kill the application, you will see a log message like "User doesn't have permissions to MODIFY_APP":

2016-04-14 22:16:55,433 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=mapr     IP=10.10.70.112 OPERATION=Submit Application Request    TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1460696891161_0018

2016-04-14 22:17:24,489 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=abizer1  IP=10.10.70.112 OPERATION=Kill Application Request      TARGET=ClientRMService  RESULT=FAILURE  DESCRIPTION=Unauthorized user   PERMISSIONS=User doesn't have permissions to MODIFY_APP APPID=application_1460696891161_0018
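
These entries are written to the ResourceManager log, so a quick way to pull them out is to grep for RMAuditLogger (the log directory and file name below are typical for this install but may differ on your cluster):

[mapr@master ~]$ grep RMAuditLogger /opt/mapr/hadoop/hadoop-2.5.1/logs/yarn-mapr-resourcemanager-*.log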



To conclude, both the queue-level settings (aclSubmitApps, aclAdministerApps) and the YARN admin ACL settings (yarn.acl.enable, yarn.admin.acl) have to be in place. Even if you disable access through one of them, a user who still has permission through the queue ACLs or the admin ACL will be able to kill other users' jobs or submit jobs to other users' queues.

Note: The ApplicationMaster link might not be accessible in the RM UI after the above change. To make the link work, change hadoop.http.staticuser.user to the mapr user and restart the HistoryServer (HS).

To do so, add the property below to core-site.xml:

<property><name>hadoop.http.staticuser.user</name><value>mapr</value></property>
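
The HistoryServer can then be restarted through Warden as well, for example (the node name is a placeholder, and the service name and syntax should be verified for your MapR version):

[root@master ~]# maprcli node services -name historyserver -action restart -nodes master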