Monday, March 21, 2016

HBase table copy across secure MapR clusters

This post assumes you have two secure clusters up and running.

Note:
Source cluster A - node 10.10.70.111
Destination cluster B - node 10.10.70.109

Also, on a secure cluster HBase must be told to use MapR security when it submits the MapReduce job. To do so, edit /opt/mapr/conf/env.sh on all nodes as shown below: comment out the SIMPLE_LOGIN_OPTS line and export MAPR_LOGIN_OPTS instead.


#export MAPR_HBASE_CLIENT_OPTS="${SIMPLE_LOGIN_OPTS} -Dzookeeper.sasl.client=false"
export MAPR_HBASE_CLIENT_OPTS="${MAPR_LOGIN_OPTS}"
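
A quick way to confirm the change on each node is to look for an active (uncommented) export line. The following is only a minimal sketch, assuming env.sh looks like the snippet above; the function name hbase_client_uses_mapr_login is made up for illustration and is not part of MapR.

```shell
#!/bin/sh
# Hypothetical helper (not a MapR tool): succeeds only if the given env.sh
# contains an active, uncommented export of MAPR_HBASE_CLIENT_OPTS that
# references MAPR_LOGIN_OPTS. A line starting with '#' does not match.
hbase_client_uses_mapr_login() {
    env_file="${1:-/opt/mapr/conf/env.sh}"
    grep -Eq '^[[:space:]]*export[[:space:]]+MAPR_HBASE_CLIENT_OPTS=.*MAPR_LOGIN_OPTS' "$env_file"
}
```

After sourcing the function, something like `hbase_client_uses_mapr_login && echo "env.sh OK"` can be run on each node.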

Stop all services on the destination cluster (cluster B) and perform the following steps:

1. Merge the source truststore into the destination truststore.

scp /opt/mapr/conf/ssl_truststore 10.10.70.109:/tmp/     ( Run on source (A): copy the truststore to the destination (B) cluster )

chmod 644 /opt/mapr/conf/ssl_truststore                         ( On destination (B): make the truststore writable )

( Merge the source truststore into the destination truststore )
/opt/mapr/server/manageSSLKeys.sh merge /tmp/ssl_truststore /opt/mapr/conf/ssl_truststore

chmod 400 /opt/mapr/conf/ssl_truststore                        ( Make the truststore read-only again )

2. Remove the existing cldb.key, maprserverticket and mapruserticket from cluster B.

rm -f /opt/mapr/conf/cldb.key ; rm -f /opt/mapr/conf/maprserverticket; 
rm -f /opt/mapr/conf/mapruserticket

3. Copy cldb.key from cluster A to cluster B (to the CLDB and ZooKeeper nodes).

scp /opt/mapr/conf/cldb.key  10.10.70.109:/opt/mapr/conf/ 

4. Make a copy of 'maprserverticket' on cluster A

cp /opt/mapr/conf/maprserverticket /opt/mapr/conf/maprserverticket_copy

5. Edit the copied maprserverticket_copy and change the cluster name (the first field) to the name of cluster B, keeping the rest of the line as is.

For example:
$ cat /opt/mapr/conf/maprserverticket 
Source tXDbAlh2d6+TJZPjGpXcWo/LRHWqPV6iMo58iQnSUSBpvx+daM7kM64ww/Fpow934VVnH7acdMBfV2fxipno49LsSryEcC1aEnMVHFw6Sbwifrr8PURddkhrd6kEO11+JSFgFdF4qYXfmuQGZAIHX+OgORkztRhdF0AKWEPif38+OiuFXJrqxzchms/FoqUYQ9o50bTeuZ82zieG4Z6/sR/CFbtrsko8EA9pH9xHu9Od5M6PauSBWOj4J+nwpJVTOKT4PlM3kuk/0Z7IVg==

$ cat /opt/mapr/conf/maprserverticket_copy 
Dest tXDbAlh2d6+TJZPjGpXcWo/LRHWqPV6iMo58iQnSUSBpvx+daM7kM64ww/Fpow934VVnH7acdMBfV2fxipno49LsSryEcC1aEnMVHFw6Sbwifrr8PURddkhrd6kEO11+JSFgFdF4qYXfmuQGZAIHX+OgORkztRhdF0AKWEPif38+OiuFXJrqxzchms/FoqUYQ9o50bTeuZ82zieG4Z6/sR/CFbtrsko8EA9pH9xHu9Od5M6PauSBWOj4J+nwpJVTOKT4PlM3kuk/0Z7IVg==
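
The edit in step 5 can also be done non-interactively. This is a minimal sketch rather than a MapR-provided tool: the function name set_ticket_cluster_name is hypothetical, and it assumes the ticket is a single line whose first whitespace-delimited field is the cluster name, and that the new name contains no characters special to sed (such as / or &).

```shell
#!/bin/sh
# Hypothetical helper: print the ticket with the first field of line 1
# (the cluster name) replaced by the new name. The rest of the line,
# including the encoded ticket body, is passed through unchanged.
# Assumption: new_name has no '/' or '&', which sed would treat specially.
set_ticket_cluster_name() {
    ticket_file="$1"
    new_name="$2"
    sed "1s/^[^[:space:]]*/${new_name}/" "$ticket_file"
}
```

With the cluster names from the example above, it could be used as `set_ticket_cluster_name /opt/mapr/conf/maprserverticket Dest > /opt/mapr/conf/maprserverticket_copy`.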


6. Copy the modified maprserverticket to cluster B (all nodes)

scp /opt/mapr/conf/maprserverticket_copy 10.10.70.109:/opt/mapr/conf/maprserverticket

7. Make sure the ownership and permissions of cldb.key and maprserverticket are correct (owned by mapr:mapr, mode 600).

[mapr@410-Dest ~]$ ls -l /opt/mapr/conf/cldb.key 
-rw------- 1 mapr mapr 89 Mar 18 15:51 /opt/mapr/conf/cldb.key
[mapr@410-Dest ~]$ ls -l /opt/mapr/conf/maprserverticket 
-rw------- 1 mapr mapr 282 Mar 18 16:11 /opt/mapr/conf/maprserverticket
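
The same check can be scripted so it is easy to repeat on every node. A minimal sketch, assuming GNU stat (the -c option) is available; the function name has_owner_and_mode is hypothetical, and the expected values mapr:mapr and 600 are taken from the listing above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file has the expected owner:group
# and octal permission mode. Relies on GNU stat's -c format option.
has_owner_and_mode() {
    file="$1"
    want_owner="$2"   # e.g. mapr:mapr
    want_mode="$3"    # e.g. 600 (shown as -rw------- by ls -l)
    [ "$(stat -c '%U:%G %a' "$file")" = "$want_owner $want_mode" ]
}
```

For example: `has_owner_and_mode /opt/mapr/conf/cldb.key mapr:mapr 600 || echo "fix cldb.key"`, and likewise for /opt/mapr/conf/maprserverticket.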

8. Start ZooKeeper and Warden on the respective nodes.

9. Make sure the services are up and running on cluster B, then attempt CopyTable from the source.

Run the CopyTable command on the source cluster:

 hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=10.10.70.109:5181:/hbase weblog
2016-03-18 16:28:39,679 INFO  [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-03-18 16:28:39,874 INFO  [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-mapr-1503--1, built on 03/26/2015 18:33 GMT
2016-03-18 16:28:39,874 INFO  [main] zookeeper.ZooKeeper: Client environment:java.version=1.7.0_45

2016-03-18 16:28:39,887 INFO  [main] zookeeper.ZooKeeper: Client environment:user.name=mapr
2016-03-18 16:28:39,887 INFO  [main] zookeeper.ZooKeeper: Client environment:user.home=/home/mapr
2016-03-18 16:28:39,887 INFO  [main] zookeeper.ZooKeeper: Client environment:user.dir=/opt/mapr/zookeeper/zookeeper-3.4.5/logs
2016-03-18 16:28:39,888 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=10.10.70.109:5181 sessionTimeout=30000 watcher=com.mapr.util.zookeeper.ZKDataRetrieval@2e14ec19
2016-03-18 16:28:39,914 INFO  [main-SendThread(410-Dest:5181)] zookeeper.Login: successfully logged in.
2016-03-18 16:28:39,915 INFO  [main-SendThread(410-Dest:5181)] client.ZooKeeperSaslClient: Client will use MAPR-SECURITY as SASL mechanism.
2016-03-18 16:28:39,918 INFO  [main-SendThread(410-Dest:5181)] zookeeper.ClientCnxn: Opening socket connection to server 410-Dest/10.10.70.111:5181. Will attempt to SASL-authenticate using Login Context section 'Client'
2016-03-18 16:28:39,944 INFO  [main-SendThread(410-Dest:5181)] zookeeper.ClientCnxn: Session establishment complete on server 410-Dest/10.10.70.109:5181, sessionid = 0x1538c23bd930050, negotiated timeout = 30000
2016-03-18 16:28:39,947 INFO  [main] zookeeper.ZKDataRetrieval: Connected to ZK: 10.10.70.111:5181
2016-03-18 16:28:39,947 INFO  [main] zookeeper.ZKDataRetrieval: Getting serviceData for master node of resourcemanager
2016-03-18 16:28:39,965 INFO  [main-EventThread] zookeeper.ZKDataRetrieval: Process path: null. Event state: SaslAuthenticated. Event type: None
2016-03-18 16:28:39,975 INFO  [main] client.MapRZKBasedRMFailoverProxyProvider: Updated RM address to 410-Dest/10.10.70.109:8032
2016-03-18 16:28:40,279 INFO  [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-03-18 16:28:40,373 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x74727a94 connecting to ZooKeeper ensemble=10.10.70.111:5181
2016-03-18 16:28:40,374 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=10.10.70.111:5181 sessionTimeout=90000 watcher=hconnection-0x74727a940x0, quorum=10.10.70.111:5181, baseZNode=/hbase
2016-03-18 16:28:40,375 INFO  [main-SendThread(410-Source:5181)] client.ZooKeeperSaslClient: Client will use MAPR-SECURITY as SASL mechanism.
2016-03-18 16:28:40,376 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Opening socket connection to server 410-Source/10.10.70.111:5181. Will attempt to SASL-authenticate using Login Context section 'Client'
2016-03-18 16:28:40,383 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Socket connection established to 410-Source/10.10.70.111:5181, initiating session
2016-03-18 16:28:40,409 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Session establishment complete on server 410-Source/10.10.70.111:5181, sessionid = 0x1538bc340af0076, negotiated timeout = 40000
2016-03-18 16:28:40,435 INFO  [main] mapreduce.TableOutputFormat: Created table instance for weblog
2016-03-18 16:28:41,032 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x20828fe4 connecting to ZooKeeper ensemble=10.10.70.109:5181
2016-03-18 16:28:41,032 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString=10.10.70.109:5181 sessionTimeout=90000 watcher=hconnection-0x20828fe40x0, quorum=10.10.70.109:5181, baseZNode=/hbase
2016-03-18 16:28:41,033 INFO  [main-SendThread(410-Source:5181)] client.ZooKeeperSaslClient: Client will use MAPR-SECURITY as SASL mechanism.
2016-03-18 16:28:41,033 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Opening socket connection to server 410-Dest/10.10.70.111:5181. Will attempt to SASL-authenticate using Login Context section 'Client'
2016-03-18 16:28:41,034 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Socket connection established to 410-Dest/10.10.70.111:5181, initiating session
2016-03-18 16:28:41,067 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Session establishment complete on server 410-Source/10.10.70.111:5181, sessionid = 0x1538c23bd930051, negotiated timeout = 40000
2016-03-18 16:28:41,115 INFO  [main] util.RegionSizeCalculator: Calculating region sizes for table "weblog".
2016-03-18 16:28:41,500 INFO  [main] mapreduce.JobSubmitter: number of splits:1
2016-03-18 16:28:41,511 INFO  [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-03-18 16:28:41,765 INFO  [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1458345179933_0003
2016-03-18 16:28:41,964 INFO  [main] security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
2016-03-18 16:28:42,001 INFO  [main] impl.YarnClientImpl: Submitted application application_1458345179933_0003
2016-03-18 16:28:42,063 INFO  [main] mapreduce.Job: The url to track the job: https://410-Source:8090/proxy/application_1458345179933_0003/
2016-03-18 16:28:42,064 INFO  [main] mapreduce.Job: Running job: job_1458345179933_0003
2016-03-18 16:28:53,332 INFO  [main] mapreduce.Job: Job job_1458345179933_0003 running in uber mode : false
2016-03-18 16:28:53,334 INFO  [main] mapreduce.Job:  map 0% reduce 0%
2016-03-18 16:28:59,602 INFO  [main] mapreduce.Job:  map 100% reduce 0%
2016-03-18 16:28:59,622 INFO  [main] mapreduce.Job: Job job_1458345179933_0003 completed successfully
2016-03-18 16:28:59,786 INFO  [main] mapreduce.Job: Counters: 41
            File System Counters
                        FILE: Number of bytes read=0
                        FILE: Number of bytes written=111880
                        FILE: Number of read operations=0
                        FILE: Number of large read operations=0
                        FILE: Number of write operations=0
                        MAPRFS: Number of bytes read=66
                        MAPRFS: Number of bytes written=0
                        MAPRFS: Number of read operations=11
                        MAPRFS: Number of large read operations=0
                        MAPRFS: Number of write operations=0
            Job Counters 
                        Launched map tasks=1
                        Data-local map tasks=1
                        Total time spent by all maps in occupied slots (ms)=4698
                        Total time spent by all reduces in occupied slots (ms)=0
                        Total time spent by all map tasks (ms)=4698
                        Total vcore-seconds taken by all map tasks=4698
                        Total megabyte-seconds taken by all map tasks=4810752
                        DISK_MILLIS_MAPS=2349
            Map-Reduce Framework
                        Map input records=2
                        Map output records=2
                        Input split bytes=66
                        Spilled Records=0
                        Failed Shuffles=0
                        Merged Map outputs=0
                        GC time elapsed (ms)=51
                        CPU time spent (ms)=1400
                        Physical memory (bytes) snapshot=223748096
                        Virtual memory (bytes) snapshot=1834762240
                        Total committed heap usage (bytes)=159383552
            HBase Counters
                        BYTES_IN_REMOTE_RESULTS=0
                        BYTES_IN_RESULTS=102
                        MILLIS_BETWEEN_NEXTS=562
                        NOT_SERVING_REGION_EXCEPTION=0
                        NUM_SCANNER_RESTARTS=0
                        REGIONS_SCANNED=1
                        REMOTE_RPC_CALLS=0
                        REMOTE_RPC_RETRIES=0
                        RPC_CALLS=3
                        RPC_RETRIES=0
            File Input Format Counters 
                        Bytes Read=0
            File Output Format Counters 
                        Bytes Written=0
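
As a rough sanity check, the "Map input records" and "Map output records" counters in the output above can be compared; in this run both are 2. The sketch below assumes the job output was saved to a file, and the function names job_counter and copytable_counts_match are hypothetical. It only compares the two counters and does not replace counting the rows on the destination table.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a named counter from saved job
# output, where counters appear as "<indent>Name=value" lines.
job_counter() {
    log_file="$1"
    counter="$2"
    awk -F= -v name="$counter" '
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }   # trim the indent
        key == name { print $2; exit }
    ' "$log_file"
}

# Succeeds if both counters are present and equal, i.e. every row the
# mappers read was also emitted for writing.
copytable_counts_match() {
    in_count=$(job_counter "$1" "Map input records")
    out_count=$(job_counter "$1" "Map output records")
    [ -n "$in_count" ] && [ "$in_count" = "$out_count" ]
}
```

For example, after redirecting the CopyTable output to a file such as copytable.log (an arbitrary name), run `copytable_counts_match copytable.log && echo "counts match"`.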
