Monitoring Tez jobs on MapR 5.x
This blog describes how to install Hive-on-Tez and how to set up a way to monitor Tez jobs (manual steps).
Step I: Install and configure Tez 0.8 on Hive 2.1
Note: This blog assumes you already have an unsecured MapR 5.2.2 cluster set up with Hive 2.1 and Java 8.
1) Create the /apps/tez directory on MapR-FS. To create it, run the following commands:
hadoop fs -mkdir /apps
hadoop fs -mkdir /apps/tez
2) Set up the repo for downloading the Tez packages, then install the Tez package.
i) Repo location.
[root@node107rhel72 ~]# cat /etc/yum.repos.d/mapr_eco.repo
[MapR_Ecosystem]
name=MapR Ecosystem Components
baseurl=http://package.mapr.com/releases/MEP/MEP-3.0.2/redhat
gpgcheck=1
enabled=1
protected=1
ii) Tez package install.
[root@node107rhel72 ~]# yum install mapr-tez
3) Upload the Tez libraries to the tez directory on MapR-FS. To upload, run the following commands:
hadoop fs -put /opt/mapr/tez/tez-0.8 /apps/tez
hadoop fs -chmod -R 755 /apps/tez
4) Verify the upload.
hadoop fs -ls /apps/tez/tez-0.8
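Steps 1 through 4 can also be bundled into one small shell function so they are easy to repeat on another cluster. This is only a sketch: the function name upload_tez_libs and its arguments are made up for illustration, and it swaps the two mkdir calls for a single "hadoop fs -mkdir -p", which does not fail when /apps already exists.

```shell
#!/bin/sh
# Sketch only: upload_tez_libs is a hypothetical helper, not part of MapR or Tez.
# $1 = local Tez directory, $2 = target directory on MapR-FS.
upload_tez_libs() {
  src="$1"
  dest="$2"
  # -p creates missing parent directories and is a no-op if the directory exists.
  hadoop fs -mkdir -p "$dest" || return 1
  hadoop fs -put "$src" "$dest" || return 1
  hadoop fs -chmod -R 755 "$dest" || return 1
  # List the uploaded directory to verify the upload.
  hadoop fs -ls "$dest/$(basename "$src")"
}

# Example call with the paths used in this blog:
# upload_tez_libs /opt/mapr/tez/tez-0.8 /apps/tez
```

The function stops at the first failing command, so a failed upload is not followed by a misleading chmod or listing.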
5) Set the Tez environment variables. Open the /opt/mapr/hive/hive-2.1/conf/hive-env.sh file, add the Tez environment variable lines, and save the file. Then open the /opt/mapr/hive/hive-2.1/conf/hive-site.xml file, add the following lines, and save the file:
<property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>
Note: Repeat this step on each node where you want Hive on Tez to be configured (usually the edge nodes). I am testing on a one-node cluster, so this blog performs the step only once, on one node.
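To confirm that the engine setting really landed in hive-site.xml on a given node, a small lookup helper can print the value of a named property. This is a sketch under assumptions: get_property is a hypothetical name, and the awk script is not a real XML parser. It only handles the layout used in this blog, where <name> and <value> each sit on one line.

```shell
#!/bin/sh
# Sketch only: get_property is a hypothetical helper.
# $1 = path to a Hadoop-style XML config file, $2 = property name.
# Prints the first <value> found at or after the matching <name> line.
get_property() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character <value> and 8-character </value> tags.
      v = substr($0, RSTART + 7, RLENGTH - 15)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      print v
      exit
    }
  ' "$1"
}

# Example call:
# get_property /opt/mapr/hive/hive-2.1/conf/hive-site.xml hive.execution.engine
```

The same helper works on yarn-site.xml and tez-site.xml, since they use the same property layout. It prints nothing when the property is missing.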
Step II: Monitoring Tez jobs (manual install)
This section describes how to configure the timeline server for use with the Hive-on-Tez user interface.
1) Install the timeline server. (The RPM will be provided by MapR support.)
Note: Install the timeline server on a single node. The Hive-on-Tez user interface does not support High Availability (HA).
2) Install the mapr-patch below, or a later one (get this from MapR support):
rpm -ivh mapr-patch-5.2.2.44680.GA-20180118212034.x86_64.rpm
3) Add the following entry to the /opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/yarn-env.sh file (on each node):
export YARN_TIMELINESERVER_OPTS="${YARN_TIMELINESERVER_OPTS} ${MAPR_LOGIN_OPTS}"
4) Add the following entry to the /opt/mapr/hadoop/hadoop-2.7.0/bin/yarn file after line "elif [ "$COMMAND" = "timelineserver" ] ; then" (on each node):
CLASSPATH=${CLASSPATH}:$MAPR_HOME/lib/JPam-1.1.jar
5) Edit the /opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/yarn-site.xml file (on each node):
<property>
<description>Indicate to clients whether Timeline service is enabled or not.
If enabled, the TimelineClient library used by end-users will post entities
and events to the Timeline server.</description>
<name>yarn.timeline-service.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.timeline-service.hostname</name>
<value> <hostname> </value>
</property>
<property>
<description>The setting that controls whether yarn system metrics is
published on the timeline server or not by RM.</description>
<name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.timeline-service.http-cross-origin.enabled</name>
<value>true</value>
</property>
6) Configure the Tomcat server. This step describes how to configure and manage the Tomcat server used by the Hive-on-Tez user interface.
i) Extract the Tomcat server:
cd $TEZ_HOME/tomcat/
sudo tar -zxvf tomcat.tar.gz -C $TEZ_HOME/tomcat
ii) Change the permissions for the tomcat directory to the user who will be running the Tomcat server:
sudo chown -R <$USER>:<$USER_GROUP> $TEZ_HOME/tomcat
iii) Configure the Timeline Server base URL and the Resource Manager web URL in the tez-ui scripts/configs.js file:
Replace TIME_LINE_BASE_URL with the real URL, e.g. 'http://10.10.70.107:8188'
Replace RM_WEB_URL with the real URL, e.g. 'http://10.10.70.107:8088'
[root@node107rhel72 ~]# grep -i url /opt/mapr/tez/tez-0.8/tomcat/apache-tomcat-9.0.1/webapps/tez-ui/scripts/configs.js
timelineBaseUrl: 'http://10.10.70.107:8188',
RMWebUrl: 'http://10.10.70.107:8088',
[root@node107rhel72 ~]#
Note: The timelineBaseUrl maps to the YARN Timeline Server, and the RMWebUrl maps to the YARN Resource Manager.
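The two replacements can also be scripted instead of edited by hand. The sketch below assumes the shipped configs.js still contains the literal placeholders TIME_LINE_BASE_URL and RM_WEB_URL. The function name set_tez_ui_urls is hypothetical.

```shell
#!/bin/sh
# Sketch only: set_tez_ui_urls is a hypothetical helper.
# $1 = path to configs.js, $2 = timeline server URL, $3 = Resource Manager URL.
# The URLs must not contain "|" or "&", which are special to sed here.
set_tez_ui_urls() {
  file="$1"
  tmp="$file.tmp.$$"
  # Keep a backup of the untouched file next to the original.
  cp -p "$file" "$file.bak" || return 1
  # "|" is used as the sed delimiter because the URLs contain "/".
  sed -e "s|TIME_LINE_BASE_URL|$2|g" -e "s|RM_WEB_URL|$3|g" "$file.bak" > "$tmp" || return 1
  # Write back through the existing file so its owner and mode are preserved.
  cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example call with the values used in this blog:
# set_tez_ui_urls /opt/mapr/tez/tez-0.8/tomcat/apache-tomcat-9.0.1/webapps/tez-ui/scripts/configs.js \
#   http://10.10.70.107:8188 http://10.10.70.107:8088
```

Afterwards, the grep shown above should list the real URLs. The .bak copy makes it easy to revert.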
iv) Now restart Tomcat server :
To stop the Tomcat server, run this script:
$TEZ_HOME/tomcat/apache-tomcat-<version>/bin/shutdown.sh
To start the Tomcat server, run this script:
$TEZ_HOME/tomcat/apache-tomcat-<version>/bin/startup.sh
7) Integrating the Hive-on-Tez User Interface with Tez
Perform these actions on each of the nodes where you have Hive-on-Tez configured.
i) Add the following entry to the /opt/mapr/tez/tez-<version>/conf/tez-site.xml file, replacing <hostname>:<port> with the real host name and port. Use 9383 for the port, which is the default Tomcat port for the Hive-on-Tez user interface.
<property>
<description>Enable Tez to use the Timeline Server for History Logging</description>
<name>tez.history.logging.service.class</name>
<value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
<property>
<description>URL for where the Tez UI is hosted</description>
<name>tez.tez-ui.history-url.base</name>
<value>http://<hostname>:<port>/tez-ui/</value>
</property>
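Because this entry has to be repeated on every Hive-on-Tez node, it can help to generate it with the host name already filled in. The following is a sketch, not a MapR tool: tez_ui_properties is a hypothetical function that prints the two properties above, defaulting to port 9383.

```shell
#!/bin/sh
# Sketch only: tez_ui_properties is a hypothetical helper.
# $1 = host running the Tez UI Tomcat server, $2 = port (defaults to 9383).
tez_ui_properties() {
  host="$1"
  port="${2:-9383}"
  cat <<EOF
<property>
<description>Enable Tez to use the Timeline Server for History Logging</description>
<name>tez.history.logging.service.class</name>
<value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
<property>
<description>URL for where the Tez UI is hosted</description>
<name>tez.tez-ui.history-url.base</name>
<value>http://${host}:${port}/tez-ui/</value>
</property>
EOF
}

# Example call: print the snippet for this blog's node, then paste it
# inside the <configuration> element of tez-site.xml.
# tez_ui_properties 10.10.70.107
```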
8) Ideally, Warden should be down while you make these configuration changes. If the cluster is already up and you are only adding the YARN timeline service on one node, the services below need to be restarted.
Restart the resource manager:
maprcli node services -name resourcemanager -action restart -nodes <hostname>
Start the timeline server (use -action restart if it is already running):
maprcli node services -name timelineserver -action start -nodes <nodename>
Validation: test job
hive> create table testtez (a1 int);
OK
Time taken: 1.001 seconds
hive> insert into testtez values (1);
Query ID = mapr_20180127013145_0aa28027-ceeb-4b85-9118-ed82e431bdb4
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1517043976772_0001)
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 01/01 [==========================>>] 100% ELAPSED TIME: 8.98 s
----------------------------------------------------------------------------------------------
Loading data to table default.testtez
OK
Time taken: 13.09 seconds
hive>
Application Timeline server :
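One way to confirm from the command line that the monitoring pieces are reachable is to request the timeline server's REST endpoint and the Tez UI page and look at the HTTP status codes. The sketch below makes several assumptions: both services run on the same host (as on this one-node cluster), the ports are 8188 and 9383 as configured above, and Tez publishes its DAGs under the TEZ_DAG_ID entity type. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: http_status and check_tez_monitoring are hypothetical helpers.

# Prints the HTTP status code for a URL ("000" if the server is unreachable).
http_status() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# $1 = host running the timeline server and the Tez UI Tomcat server.
# Prints one "<status> <url>" line per endpoint.
check_tez_monitoring() {
  host="$1"
  for url in \
    "http://$host:8188/ws/v1/timeline/TEZ_DAG_ID?limit=1" \
    "http://$host:9383/tez-ui/"
  do
    code=$(http_status "$url")
    echo "$code $url"
  done
}

# Example call for this blog's node:
# check_tez_monitoring 10.10.70.107
```

A 200 on both lines suggests that the timeline server is answering and that Tomcat is serving the UI. It does not prove that the test job was recorded. For that, open the timeline URL without the -o /dev/null redirect and look for the DAG entry.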