Configuring the Fair Scheduler with ACL on MapR Cluster
This blog assumes you have a non-secure MapR 4.0.2 cluster installed.
1) Add the lines below to yarn-site.xml on all ResourceManager nodes, then restart the RM.
vi /opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop/yarn-site.xml
<property><name>yarn.admin.acl</name><value>mapr</value></property>
<property><name>yarn.acl.enable</name><value>true</value></property>
With this setting, the mapr user is the administrator for the YARN cluster and can kill or submit any job regardless of queue ACLs. By default, yarn.admin.acl is set to *, which means everyone is an admin.
Note: An empty value for yarn.admin.acl is not considered valid by YARN; it falls back to the value configured in yarn-default.xml, which allows access to everyone.
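Both properties take Hadoop's AccessControlList format: a comma-separated user list, then a space, then a comma-separated group list, with "*" meaning everyone and a lone space meaning no one. As an illustration only (a minimal Python model of that check, not YARN's actual code):

```python
def acl_allows(acl, user, groups=()):
    """Minimal model of Hadoop's AccessControlList check.
    Format: "user1,user2 group1,group2". "*" matches everyone;
    a lone space (no users, no groups) matches no one."""
    if acl.strip() == "*":
        return True
    parts = acl.split(" ", 1)
    users = [u for u in parts[0].split(",") if u]
    grps = [g for g in parts[1].split(",") if g] if len(parts) > 1 else []
    return user in users or any(g in grps for g in groups)

print(acl_allows("*", "anyone"))     # True: wildcard admits all
print(acl_allows("mapr", "mapr"))    # True: listed user
print(acl_allows("mapr", "abizer"))  # False: not listed
print(acl_allows(" ", "root"))       # False: lone space means nobody
```

This is why setting yarn.admin.acl to "mapr" (rather than leaving the default "*") is what actually restricts admin rights.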
2) Now add the settings below to fair-scheduler.xml:
vi /opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop/fair-scheduler.xml
<allocations>
  <queue name="root">
    <aclSubmitApps>mapr</aclSubmitApps>
    <aclAdministerApps>mapr</aclAdministerApps>
    <queue name="mapr">
      <minResources>20000 mb,40 vcores,5 disks</minResources>
      <maxResources>30000 mb,50 vcores,50 disks</maxResources>
      <maxRunningApps>10</maxRunningApps>
      <weight>1.0</weight>
      <schedulingPolicy>fair</schedulingPolicy>
      <aclSubmitApps>mapr</aclSubmitApps>
    </queue>
    <queue name="abizer">
      <minResources>20000 mb,40 vcores,5 disks</minResources>
      <maxResources>30000 mb,50 vcores,50 disks</maxResources>
      <maxRunningApps>10</maxRunningApps>
      <weight>1.0</weight>
      <schedulingPolicy>fair</schedulingPolicy>
      <aclSubmitApps>abizer</aclSubmitApps>
    </queue>
  </queue>
</allocations>
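Before relying on the allocation file, it can be worth sanity-checking which submit ACL each queue ends up with. The sketch below is plain Python with the XML inlined to mirror the configuration above; note it is a simplification, since real Fair Scheduler ACLs are additionally inherited from parent queues:

```python
import xml.etree.ElementTree as ET

# Inline copy of the allocation file above (trimmed to the ACL elements).
ALLOCATIONS = """
<allocations>
  <queue name="root">
    <aclSubmitApps>mapr</aclSubmitApps>
    <queue name="mapr"><aclSubmitApps>mapr</aclSubmitApps></queue>
    <queue name="abizer"><aclSubmitApps>abizer</aclSubmitApps></queue>
  </queue>
</allocations>
"""

def submit_acls(xml_text):
    """Return {queue.path: aclSubmitApps} for every queue in the file."""
    acls = {}
    def walk(elem, prefix):
        for q in elem.findall("queue"):
            path = f"{prefix}.{q.get('name')}" if prefix else q.get("name")
            # Queues without an explicit ACL default to "*" here; in a real
            # cluster they inherit the parent queue's ACL instead.
            acls[path] = q.findtext("aclSubmitApps", default="*")
            walk(q, path)
    walk(ET.fromstring(xml_text), "")
    return acls

for path, acl in submit_acls(ALLOCATIONS).items():
    print(path, "->", acl)
```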
3) The output below shows the queue ACLs for each user, where:
i) the root user has no permission to submit to or administer any queue
[root@master hadoop]# hadoop queue -showacls
Queue acls for user :  root
Queue Operations
=====================
root
root.abizer
root.default
root.mapr
Eg 1:
[root@master ~]# yarn jar /opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1-mapr-1501.jar pi 1 2
Number of Maps = 1
Samples per Map = 2
Wrote input for Map #0
Starting Job
16/04/14 21:42:28 INFO input.FileInputFormat: Total input paths to process : 1
16/04/14 21:42:28 INFO mapreduce.JobSubmitter: number of splits:1
16/04/14 21:42:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460696891161_0009
16/04/14 21:42:28 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
16/04/14 21:42:28 INFO impl.YarnClientImpl: Submitted application application_1460696891161_0009
16/04/14 21:42:28 INFO mapreduce.JobSubmitter: Cleaning up the staging area maprfs:/var/mapr/cluster/yarn/rm/staging/root/.staging/job_1460696891161_0009
java.io.IOException: Failed to run job : User root cannot submit applications to queue root.root
        at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:321)
Eg 2:
[root@master ~]# yarn jar /opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1-mapr-1501.jar pi -Dmapreduce.job.queuename=root.abizer 1 2
Number of Maps = 1
Samples per Map = 2
Wrote input for Map #0
Starting Job
16/04/14 21:42:13 INFO input.FileInputFormat: Total input paths to process : 1
16/04/14 21:42:13 INFO mapreduce.JobSubmitter: number of splits:1
16/04/14 21:42:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460696891161_0008
16/04/14 21:42:14 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
16/04/14 21:42:14 INFO impl.YarnClientImpl: Submitted application application_1460696891161_0008
16/04/14 21:42:14 INFO mapreduce.JobSubmitter: Cleaning up the staging area maprfs:/var/mapr/cluster/yarn/rm/staging/root/.staging/job_1460696891161_0008
java.io.IOException: Failed to run job : User root cannot submit applications to queue root.abizer
ii) The mapr user has permission to administer every queue and can submit applications to any queue.
[mapr@master hadoop]$ hadoop queue -showacls
Queue acls for user :  mapr

Queue  Operations
=====================
root  ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
root.abizer  ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
root.mapr  ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
root.root  ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
[mapr@master root]$ yarn jar /opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1-mapr-1501.jar pi -Dmapreduce.job.queuename=root.abizer 1 2
Number of Maps = 1
Samples per Map = 2
Wrote input for Map #0
Starting Job
16/04/14 21:40:34 INFO input.FileInputFormat: Total input paths to process : 1
16/04/14 21:40:34 INFO mapreduce.JobSubmitter: number of splits:1
16/04/14 21:40:34 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460696891161_0007
16/04/14 21:40:35 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
16/04/14 21:40:35 INFO impl.YarnClientImpl: Submitted application application_1460696891161_0007
16/04/14 21:40:35 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1460696891161_0007/
16/04/14 21:40:35 INFO mapreduce.Job: Running job: job_1460696891161_0007
16/04/14 21:40:43 INFO mapreduce.Job: Job job_1460696891161_0007 running in uber mode : false
16/04/14 21:40:43 INFO mapreduce.Job:  map 0% reduce 0%
16/04/14 21:40:49 INFO mapreduce.Job:  map 100% reduce 0%
16/04/14 21:40:55 INFO mapreduce.Job:  map 100% reduce 100%
16/04/14 21:40:55 INFO mapreduce.Job: Job job_1460696891161_0007 completed successfully
16/04/14 21:40:55 INFO mapreduce.Job: Counters: 46
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=162489
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                MAPRFS: Number of bytes read=336
                MAPRFS: Number of bytes written=303
                MAPRFS: Number of read operations=43
                MAPRFS: Number of large read operations=0
                MAPRFS: Number of write operations=59
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=4040
                Total time spent by all reduces in occupied slots (ms)=10986
                Total time spent by all map tasks (ms)=4040
                Total time spent by all reduce tasks (ms)=3662
                Total vcore-seconds taken by all map tasks=4040
                Total vcore-seconds taken by all reduce tasks=3662
                Total megabyte-seconds taken by all map tasks=4136960
                Total megabyte-seconds taken by all reduce tasks=11249664
                DISK_MILLIS_MAPS=2020
                DISK_MILLIS_REDUCES=4870
        Map-Reduce Framework
                Map input records=1
                Map output records=2
                Map output bytes=18
                Map output materialized bytes=0
                Input split bytes=134
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=24
                Reduce input records=2
                Reduce output records=0
                Spilled Records=4
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=2
                GC time elapsed (ms)=40
                CPU time spent (ms)=1000
                Physical memory (bytes) snapshot=780922880
                Virtual memory (bytes) snapshot=5446422528
                Total committed heap usage (bytes)=722468864
        Shuffle Errors
                IO_ERROR=0
        File Input Format Counters
                Bytes Read=118
        File Output Format Counters
                Bytes Written=97
Job Finished in 21.633 seconds
Estimated value of Pi is 4.00000000000000000000
iii) The abizer user has permission to submit applications to the “root.abizer“ queue only, and cannot administer any queue apart from managing its own applications (the default behavior).
[abizer@master hadoop]$ hadoop queue -showacls
Queue acls for user :  abizer

Queue  Operations
=====================
root
root.abizer  SUBMIT_APPLICATIONS
root.default
root.mapr
[abizer@master root]$ yarn jar /opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1-mapr-1501.jar pi -Dmapreduce.job.queuename=root.mapr 1 2
Number of Maps = 1
Samples per Map = 2
Wrote input for Map #0
Starting Job
16/04/14 21:38:29 INFO input.FileInputFormat: Total input paths to process : 1
16/04/14 21:38:29 INFO mapreduce.JobSubmitter: number of splits:1
16/04/14 21:38:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460696891161_0006
16/04/14 21:38:30 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
16/04/14 21:38:30 INFO impl.YarnClientImpl: Submitted application application_1460696891161_0006
16/04/14 21:38:30 INFO mapreduce.JobSubmitter: Cleaning up the staging area maprfs:/var/mapr/cluster/yarn/rm/staging/abizer/.staging/job_1460696891161_0006
java.io.IOException: Failed to run job : User abizer cannot submit applications to queue root.mapr
4) Audit logging:
The ResourceManager logs attempts to submit, kill, or move applications. These messages are written by RMAuditLogger as part of standard operational logging.
For example, the mapr user started application application_1460696891161_0018; when the abizer user then tries to kill that application, a log message appears stating "User doesn't have permissions to MODIFY_APP".
2016-04-14 22:16:55,433 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=mapr IP=10.10.70.112 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1460696891161_0018
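Because these audit entries are plain KEY=value pairs, they are easy to post-process when reviewing who did what. A minimal sketch in Python using the sample line above (values such as OPERATION may contain spaces, so the split is on the next KEY= boundary rather than on whitespace):

```python
import re

# Sample RMAuditLogger entry from the ResourceManager log above.
SAMPLE = ("2016-04-14 22:16:55,433 INFO "
          "org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: "
          "USER=mapr IP=10.10.70.112 OPERATION=Submit Application Request "
          "TARGET=ClientRMService RESULT=SUCCESS "
          "APPID=application_1460696891161_0018")

def parse_audit(line):
    """Split an RMAuditLogger entry into its KEY=value fields.
    A value runs until the next ' KEY=' boundary or end of line."""
    return dict(re.findall(r"([A-Z]+)=(.*?)(?=\s+[A-Z]+=|$)", line))

fields = parse_audit(SAMPLE)
print(fields["USER"], fields["OPERATION"], fields["RESULT"])
```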
To conclude: both the queue-level settings (aclSubmitApps, aclAdministerApps) and the YARN admin ACL settings (yarn.acl.enable, yarn.admin.acl) have to be in place. Even if you disable access through one of them, a user who still has permission through the queue ACL or the admin ACL will be able to kill other users' jobs or submit jobs to another user's queue.
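The interaction described above amounts to an OR of the two checks: an operation is allowed if either the admin ACL or the queue ACL permits it, so both must exclude a user to lock them out. A sketch of that logic (a simplified Python model ignoring groups, not the actual YARN code):

```python
def can_submit(user, yarn_acl_enable, yarn_admin_acl, queue_submit_acl):
    """A user may submit if ACL checks are disabled, if the user is a
    YARN admin, or if the queue's aclSubmitApps lists the user.
    Closing one path is not enough: both ACLs must exclude the user."""
    if not yarn_acl_enable:
        return True  # ACL enforcement is off entirely
    def in_acl(acl):
        return acl.strip() == "*" or user in [u for u in acl.split(",") if u]
    return in_acl(yarn_admin_acl) or in_acl(queue_submit_acl)

print(can_submit("abizer", True, "mapr", "abizer"))  # True: queue ACL allows
print(can_submit("root", True, "mapr", "mapr"))      # False: excluded by both
print(can_submit("mapr", True, "mapr", "abizer"))    # True: admin ACL allows
```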
Note: the ApplicationMaster link in the RM UI might not be accessible after the change above. To make the link work, change hadoop.http.staticuser.user to the mapr user and restart the HistoryServer.
To do so, add the property below to core-site.xml:
<property><name>hadoop.http.staticuser.user</name><value>mapr</value></property>