Run Pig Jobs with Oozie
This post assumes that the Oozie and Pig packages listed below are already installed on a running MapR cluster and that the steps from blog 1 (linked after the package list) have already been followed.
mapr-oozie-4.1.0.201606271017-1.noarch
mapr-oozie-internal-4.1.0.201606271017-1.noarch
mapr-pig-0.14.201608040131-1.noarch
http://abizeradenwala.blogspot.com/2015/07/installing-oozie-and-running-sample-job.html
Since Oozie is currently bundled with Pig v0.12 (the JARs listed below), the following steps are needed for the Oozie Pig action to work.
/opt/mapr/oozie/oozie-4.1.0/share1/lib/pig/pig-withouthadoop-0.12.1-mapr-1408-h2.jar
/opt/mapr/oozie/oozie-4.1.0/share1/lib/pig-2/pig-withouthadoop-0.12.1-mapr-1408-h2.jar
1) The Oozie share/lib directory has two sets of JAR files for Pig. We will use the Pig JAR files from the share/lib/pig-2 directory with MapR distribution versions 4.0.0 and later. To specify the JAR files for a given Pig job, add the following section to the workflow.xml file:
<name>oozie.action.sharelib.for.pig</name>
<value>pig-2</value>
2) Stop Oozie:
maprcli node services -name oozie -action stop -nodes <nodes>
3) Remove all files located within the /opt/mapr/oozie/oozie<version>/share2/lib/pig*/ directory EXCEPT the oozie-sharelib-pig-<version>-mapr.jar file.
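The cleanup in this step can be sketched as a small shell function. This is only an illustration: the function name clean_pig_sharelib is hypothetical, and it assumes the JAR to keep always matches the pattern oozie-sharelib-pig-*-mapr.jar. Try it on a copy of the directory before running it on a real installation.

```shell
# Hypothetical helper: delete every regular file in one sharelib Pig
# directory except the Oozie sharelib JAR itself.
clean_pig_sharelib() {
    dir="$1"
    for f in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        case "$(basename "$f")" in
            oozie-sharelib-pig-*-mapr.jar) ;;   # keep this JAR
            *) rm -f -- "$f" ;;                 # remove everything else
        esac
    done
}

# Possible usage (paths follow the layout used in this post):
#   for d in /opt/mapr/oozie/oozie-4.1.0/share2/lib/pig*/; do
#       clean_pig_sharelib "$d"
#   done
```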
Now copy the new Pig JARs to the share lib location:
cp <PIG_HOME>/pig-core-h2.jar <OOZIE_HOME>/share2/lib/pig-2/
cp <PIG_HOME>/lib/* <OOZIE_HOME>/share2/lib/pig-2/
4) Remove the ZooKeeper JARs:
rm -rf <OOZIE_HOME>/share2/lib/pig-2/zookeeper*.jar
5) Move all the old JARs from the latest share lib directory in MapR-FS to a temporary location:
hadoop fs -mv /oozie/share/lib/lib_20160804181903/pig-2/* /abizer
Then copy the latest JARs into the share lib in MapR-FS:
hadoop fs -put /opt/mapr/oozie/oozie-4.1.0/share2/lib/pig-2/* /oozie/share/lib/lib_20160804181903/pig-2
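The lib_<timestamp> directory name differs from cluster to cluster, so rather than hard-coding it, the latest one can be picked from a directory listing. The following is a minimal sketch: latest_sharelib is a hypothetical helper name, and it assumes the timestamp suffixes all have the same number of digits, so that a plain lexical sort puts the newest last.

```shell
# Hypothetical helper: read paths on stdin (one per line) and print the
# newest lib_<timestamp> directory. Lines that are not such paths
# (for example the "Found N items" header) are ignored.
latest_sharelib() {
    grep -E '/lib_[0-9]+$' | sort | tail -n 1
}

# Possible usage, taking the last column of the listing as the path:
#   latest=$(hadoop fs -ls /oozie/share/lib | awk '{print $NF}' | latest_sharelib)
#   hadoop fs -ls "$latest/pig-2"
```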
6) Copy workflow.xml to the MapR-FS location specified in the job.properties file:
hadoop fs -put workflow.xml /user/mapr/examples/apps/pig/workflow.xml
Example of my workflow.xml
[mapr@node3 pig-2]$ cat /opt/mapr/oozie/oozie-4.1.0/examples/apps/pig/workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.2" name="pig-wf">
    <start to="pig-node"/>
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/output-data/pig"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>mapred.compress.map.output</name>
                    <value>true</value>
                </property>
                <property>
                    <name>oozie.action.sharelib.for.pig</name>
                    <value>pig-2</value>
                </property>
            </configuration>
            <script>id.pig</script>
            <param>INPUT=/user/${wf:user()}/input-data/text</param>
            <param>OUTPUT=/user/${wf:user()}/output-data/pig</param>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Pig failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
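The workflow above references ${jobTracker}, ${nameNode} and ${queueName}, which have to be defined in job.properties before the job is submitted. A quick sanity check could look like the sketch below. The function name missing_props is hypothetical, and the check only looks for simple key=value lines, with each key used as a regular expression.

```shell
# Hypothetical helper: print each required key that has no "key=" line
# in the given properties file.
missing_props() {
    file="$1"
    shift
    for key in "$@"; do
        grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file" || echo "$key"
    done
}

# Possible usage:
#   missing_props /opt/mapr/oozie/oozie-4.1.0/examples/apps/pig/job.properties \
#       nameNode jobTracker queueName
```

An empty result means all the listed keys were found.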
7) Start Oozie:
maprcli node services -name oozie -action start -nodes <nodes>
8) As user mapr, I run the sample workflow:
[mapr@node3 root]$ /opt/mapr/oozie/oozie-4.1.0/bin/oozie job -oozie="http://localhost:11000/oozie" -config /opt/mapr/oozie/oozie-4.1.0/examples/apps/pig/job.properties -run
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/oozie/oozie-4.1.0/lib/slf4j-simple-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]
job: 0000000-160805145748028-oozie-mapr-W
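When scripting the submission, the job ID can be pulled out of the "job: ..." line shown above so it can be reused in later commands. This is a minimal sketch: extract_job_id is a hypothetical helper, and it assumes the ID is printed on a line starting with "job: ", as in the output above.

```shell
# Hypothetical helper: read the submit command's output on stdin and
# print only the workflow job ID.
extract_job_id() {
    sed -n 's/^job: *//p'
}

# Possible usage:
#   job_id=$(/opt/mapr/oozie/oozie-4.1.0/bin/oozie job \
#       -oozie="http://localhost:11000/oozie" \
#       -config /opt/mapr/oozie/oozie-4.1.0/examples/apps/pig/job.properties \
#       -run | extract_job_id)
```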
9) Checking the status shows that the Pig workflow was executed successfully by Oozie:
[mapr@node3 root]$ /opt/mapr/oozie/oozie-4.1.0/bin/oozie job -info 0000000-160805145748028-oozie-mapr-W -oozie="http://localhost:11000/oozie"
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/oozie/oozie-4.1.0/lib/slf4j-simple-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]
Job ID : 0000000-160805145748028-oozie-mapr-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : pig-wf
App Path : maprfs:/user/mapr/examples/apps/pig
Status : SUCCEEDED
Run : 0
User : mapr
Group : -
Created : 2016-08-05 18:58 GMT
Started : 2016-08-05 18:58 GMT
Last Modified : 2016-08-05 18:59 GMT
Ended : 2016-08-05 18:59 GMT
CoordAction ID: -
Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0000000-160805145748028-oozie-mapr-W@:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0000000-160805145748028-oozie-mapr-W@pig-node OK job_1470423487588_0001 SUCCEEDED -
------------------------------------------------------------------------------------------------------------------------------------
0000000-160805145748028-oozie-mapr-W@end OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
[mapr@node3 root]$
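For scripted checks, the workflow status can be read from the "Status" line of the -info output above. The sketch below assumes that line has the form "Status : <value>". The name workflow_status is hypothetical.

```shell
# Hypothetical helper: read "oozie job -info" output on stdin and print
# the workflow status (the value on the first line that starts with "Status").
workflow_status() {
    sed -n 's/^Status[[:space:]]*:[[:space:]]*//p' | head -n 1
}

# Possible usage:
#   status=$(/opt/mapr/oozie/oozie-4.1.0/bin/oozie job \
#       -info 0000000-160805145748028-oozie-mapr-W \
#       -oozie="http://localhost:11000/oozie" | workflow_status)
#   [ "$status" = "SUCCEEDED" ] && echo "workflow finished successfully"
```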