Wednesday, March 30, 2016

MapR-DB Table Replication on Secure Cluster

This post assumes you have two secure clusters up and running.

Note:
Source Cluster  - Node
Destination Cluster  - Node  

The steps to set up MapR-DB table replication in a secure environment are as follows.

1) On every node in the SOURCE CLUSTER, verify that maprserverticket, cldb.key, ssl_truststore and ssl_keystore are identical. Run md5sum on these files on each node to confirm.
2) On every node in the DESTINATION CLUSTER, verify that maprserverticket, cldb.key, ssl_truststore and ssl_keystore are identical. Run md5sum on these files on each node to confirm.
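
The md5sum comparison in steps 1 and 2 can be scripted. This is only a sketch: it assumes the files have already been fetched from each node into local copies (for example with scp), and the function name all_same_md5 and the /tmp/check paths are made up for illustration.

```shell
# Succeeds (exit 0) only when every file given has the same MD5 checksum.
all_same_md5() {
    local sums
    # A missing or unreadable file makes md5sum fail; treat that as a mismatch.
    sums=$(md5sum "$@") || return 1
    # md5sum prints "<checksum>  <file>"; count the distinct checksums.
    [ "$(printf '%s\n' "$sums" | awk '{print $1}' | sort -u | wc -l)" -eq 1 ]
}

# Hypothetical usage, with copies of cldb.key fetched from three nodes:
#   all_same_md5 /tmp/check/node1/cldb.key /tmp/check/node2/cldb.key /tmp/check/node3/cldb.key \
#       && echo "cldb.key is identical on all nodes"
```

Run it once per file (maprserverticket, cldb.key, ssl_truststore, ssl_keystore) and once per cluster.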
3) Copy /opt/mapr/conf/ssl_truststore from the DESTINATION CLUSTER to /tmp/ on the CLDB node of the SOURCE CLUSTER, then run the commands below on that node to merge it into the source truststore.

Note: Skip the ssl_truststore merge if you have already done it earlier.
$ chmod 644 /opt/mapr/conf/ssl_truststore 
$ /opt/mapr/server/ merge /tmp/ssl_truststore /opt/mapr/conf/ssl_truststore 
$ chmod 444 /opt/mapr/conf/ssl_truststore 

4) Copy the merged truststore file /opt/mapr/conf/ssl_truststore to /opt/mapr/conf/ on all nodes in the SOURCE CLUSTER.
5) Generate a cross-cluster ticket on the DESTINATION CLUSTER. In this case I created a ticket that lasts 10 years.

$ maprlogin generateticket -type crosscluster -out /tmp/destination-ticket -duration 3650:0:0 

Note: It is critical to specify an appropriate value for the duration. After the ticket expires, communication between the clusters will stop. In this example, a duration of ten years is used for convenience of explanation. Use a value that is consistent with your security policies.
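
Since replication stops when the ticket expires, it can help to note the expiry date at the time the ticket is generated. A minimal sketch, assuming GNU date and reading the first field of the 3650:0:0 duration above as days; ticket_expiry_date is a hypothetical helper, not a MapR command.

```shell
# Print the UTC date (YYYY-MM-DD) that lies the given number of days from now.
# Relies on the -d option of GNU date.
ticket_expiry_date() {
    local days="$1"
    date -u -d "+${days} days" +%Y-%m-%d
}

# Example: ticket_expiry_date 3650   # the day a ten-year ticket generated today lapses
```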

6) Copy the file /tmp/destination-ticket from the DESTINATION CLUSTER to /tmp on the SOURCE CLUSTER's CLDB node.
7) On the SOURCE CLUSTER, append the contents of /tmp/destination-ticket to /opt/mapr/conf/maprserverticket.

$ cat /tmp/destination-ticket >> /opt/mapr/conf/maprserverticket 
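
Running the append twice would leave two copies of the ticket in the file. The sketch below guards against that, assuming each ticket is a single line whose first field is the cluster name, as in the ticket listings in this post. append_ticket_once is a hypothetical helper.

```shell
# Append a ticket file to a server ticket file unless a ticket for the same
# cluster (the first field of the line) is already present.
append_ticket_once() {
    local new_ticket="$1" server_ticket="$2" cluster
    cluster=$(awk 'NR == 1 {print $1}' "$new_ticket")
    if awk -v c="$cluster" '$1 == c {found = 1} END {exit !found}' "$server_ticket"; then
        echo "ticket for ${cluster} already present, nothing appended" >&2
        return 0
    fi
    # Start on a fresh line if the existing file lacks a trailing newline.
    if [ -n "$(tail -c 1 "$server_ticket")" ]; then
        echo >> "$server_ticket"
    fi
    cat "$new_ticket" >> "$server_ticket"
}

# Example: append_ticket_once /tmp/destination-ticket /opt/mapr/conf/maprserverticket
```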

8) Copy the file /opt/mapr/conf/maprserverticket to all nodes in the SOURCE CLUSTER.

9) Stop ZooKeeper and warden on the SOURCE CLUSTER, then start ZooKeeper and, once ZooKeeper is up, start warden.

10) On the SOURCE CLUSTER, create user tickets for user mapr for both the source and the destination cluster.

maprlogin password
maprlogin password -cluster Dest

cat /tmp/maprticket_2000
Source KV34qQ0jtmQXObJglDiZqqHHm507pbYOsHd4qIEEavC+0PGDlB/YeTBGReOxf+EleSEO78pYvNqzoqK5uK+5Gibx0v+XPEyl2UuDgBR6GUBwx4yUUxnUY7Ct4STdcHmvcyE47AVM4gXc9ivQCvkokyIvZwYiGtwVQ8rnTNrLuzuUPAH8GMbR486UgMQ8axy8QIcA2zexIT0K0Ct7Fj612UPVonXZDfnAB2yG5gEhdmxLOMPmQLm9qt6f49Pzrn96IwHGLXQtUAmfrTwrbPPPOSUshA==
Dest 4D9Z469Y3j7h3sy2CVZwQrlXDEWHCtmCENQQGFvVzoGsytXp4K3OLOf+BZhLIoTBZuu2uzmV/1SbnqYUfO9NXsxAx3Bomez9iZ3ni7Kfk9m9CTEPydl9updp8IFQZ83jQ7IERM3WgN/rouEg3T/BnwPA2+U2cnGjeeCgXH3lmopJGiYFCegXWhhn9TmKawH0Vp4f3tDBBo2nWjr1sCnBvsBXhYP6DQzA3vLdmbGWQn6d2IJRNUA0irG8MSjxzZ4E9y4S2hu4gnLYE0IXgXNoWWhawQ==
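
A quick way to confirm that the user ticket file holds an entry for both clusters is to look at the first field of each line. A small sketch along those lines; missing_tickets is a hypothetical helper, and the cluster names are the ones used in this post.

```shell
# Print each expected cluster name that has no line in the ticket file.
# The exit status is 0 when no cluster is missing, 1 otherwise.
missing_tickets() {
    local ticket_file="$1" cluster missing=0
    shift
    for cluster in "$@"; do
        if ! awk -v c="$cluster" '$1 == c {found = 1} END {exit !found}' "$ticket_file"; then
            echo "$cluster"
            missing=1
        fi
    done
    return "$missing"
}

# Example: missing_tickets /tmp/maprticket_2000 Source Dest
```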

11) Create a table and put some data in it.

$ echo "create '/user/mapr/AbiSourcet1', {NAME => 'c1', VERSIONS => 1}" | hbase shell 
$ echo "put '/user/mapr/AbiSourcet1', 'r1', 'c1', '$(date)'" | hbase shell 

12) Install the mapr-gateway package on the DESTINATION CLUSTER if it is not already present, and make the SOURCE side aware of the gateway.

On the DESTINATION CLUSTER (a cluster restart is needed for warden to manage this service):

$ yum install mapr-gateway -y

On the SOURCE CLUSTER, verify that the source cluster can resolve the destination gateway:

$ maprcli cluster gateway resolve -dstcluster Dest
Source              GatewayHosts

To make the source cluster aware of the destination gateway, run on the SOURCE CLUSTER:

$ maprcli cluster gateway set -dstcluster <dest_cluster_name> -gateways <dest_gateway_hostname> 

13) Now set up and start replication between the source table and a replica table.

$ maprcli table replica autosetup -path /mapr/Source/user/mapr/AbiSourcet1 -replica /mapr/Dest/user/mapr/AbiDestt1 

The above command performs the following steps in the background:
  1. Creates a table on the replication cluster with the required column families.
  2. Declares the new table to be a replica of the source table with a paused replication state.
  3. Declares the source table as an upstream source for the replica.
  4. Runs the CopyTables utility to load a copy of the source data into the replica.
  5. Clears the paused replication state to start the replication stream.

14) Add a new row to the table on the source side, then verify on the destination side that the update was received by the replica, to confirm replication is working as expected.

echo "put '/user/mapr/AbiSourcet1', 'r4', 'c1', '$(date)'" | hbase shell
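
For the destination-side check, the replica table can be scanned and the output searched for the new row key. A minimal sketch, assuming the usual hbase shell scan layout in which each result line starts with the row key followed by a column=... entry. row_in_scan is a hypothetical helper.

```shell
# Read hbase shell scan output on stdin; succeed if a cell for the row key appears.
row_in_scan() {
    local row="$1"
    awk -v r="$row" '$1 == r && $2 ~ /^column=/ {found = 1} END {exit !found}'
}

# On the DESTINATION CLUSTER, for the replica created in step 13:
#   echo "scan '/user/mapr/AbiDestt1'" | hbase shell | row_in_scan r4 && echo "r4 replicated"
```

Replication is asynchronous, so the row may take a moment to show up; repeating the check after a short wait is reasonable.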

Monday, March 21, 2016

Hbase table copy across secure MapR clusters

This post assumes you have two secure clusters up and running.

Note:
Source Cluster A - Node
Destination Cluster B - Node  

Also, on a secure cluster HBase must be told to use MapR security when it submits the MapReduce job. To do so, edit the /opt/mapr/conf/ file as below (on all nodes).

#export MAPR_HBASE_CLIENT_OPTS="${SIMPLE_LOGIN_OPTS} -Dzookeeper.sasl.client=false"
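
Reading the leading # above as "comment this export out", the edit can be applied with sed on each node. This is a sketch under that assumption; the file name is not given in full above, so the helper takes the path as an argument, and comment_out_simple_login is a made-up name.

```shell
# Comment out the export that sets simple login options for the HBase client.
# A .bak copy of the file is left next to it. Running it twice is harmless,
# because an already commented line no longer matches the address pattern.
comment_out_simple_login() {
    local conf_file="$1"
    sed -i.bak \
        '/^export MAPR_HBASE_CLIENT_OPTS=.*SIMPLE_LOGIN_OPTS.*zookeeper\.sasl\.client=false/ s/^/#/' \
        "$conf_file"
}

# Example: comment_out_simple_login <path to the conf file being edited>
```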

Stop all services on the destination cluster (Cluster B) and perform the following steps:

1. Merge the source truststore into the destination truststore.

scp /opt/mapr/conf/ssl_truststore     ( Copy the truststore from the source (A) to the destination (B) cluster )

chmod 644 /opt/mapr/conf/ssl_truststore     ( Make the truststore writable )

/opt/mapr/server/ merge /tmp/ssl_truststore /opt/mapr/conf/ssl_truststore     ( Merge the source truststore into the destination one )

chmod 400 /opt/mapr/conf/ssl_truststore     ( Make the truststore read-only )

2. Remove the existing cldb.key, maprserverticket and mapruserticket from cluster B.

rm -f /opt/mapr/conf/cldb.key ; rm -f /opt/mapr/conf/maprserverticket; 
rm -f /opt/mapr/conf/mapruserticket

3. Copy cldb.key from cluster A to cluster B (to the CLDB and ZooKeeper nodes).

scp /opt/mapr/conf/cldb.key 

4. Make a copy of 'maprserverticket' on cluster A

cp /opt/mapr/conf/maprserverticket /opt/mapr/conf/maprserverticket_copy

5. Edit the copy, maprserverticket_copy, and change the cluster name (the first field) to the name of cluster B. Leave the rest of the line as is.

$ cat /opt/mapr/conf/maprserverticket 
Source tXDbAlh2d6+TJZPjGpXcWo/LRHWqPV6iMo58iQnSUSBpvx+daM7kM64ww/Fpow934VVnH7acdMBfV2fxipno49LsSryEcC1aEnMVHFw6Sbwifrr8PURddkhrd6kEO11+JSFgFdF4qYXfmuQGZAIHX+OgORkztRhdF0AKWEPif38+OiuFXJrqxzchms/FoqUYQ9o50bTeuZ82zieG4Z6/sR/CFbtrsko8EA9pH9xHu9Od5M6PauSBWOj4J+nwpJVTOKT4PlM3kuk/0Z7IVg==

$ cat /opt/mapr/conf/maprserverticket_copy 
Dest tXDbAlh2d6+TJZPjGpXcWo/LRHWqPV6iMo58iQnSUSBpvx+daM7kM64ww/Fpow934VVnH7acdMBfV2fxipno49LsSryEcC1aEnMVHFw6Sbwifrr8PURddkhrd6kEO11+JSFgFdF4qYXfmuQGZAIHX+OgORkztRhdF0AKWEPif38+OiuFXJrqxzchms/FoqUYQ9o50bTeuZ82zieG4Z6/sR/CFbtrsko8EA9pH9xHu9Od5M6PauSBWOj4J+nwpJVTOKT4PlM3kuk/0Z7IVg==
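
Steps 4 and 5 can be combined into one command that writes the copy with the first field replaced. A small sketch; retarget_ticket is a hypothetical helper and Dest is the destination cluster name used in this post.

```shell
# Write a copy of a ticket file with the first field (the cluster name) replaced.
# Everything after the first field is kept; the original file is not modified.
retarget_ticket() {
    local src="$1" dst="$2" new_cluster="$3"
    awk -v c="$new_cluster" 'NF { $1 = c } { print }' "$src" > "$dst"
}

# Example:
#   retarget_ticket /opt/mapr/conf/maprserverticket /opt/mapr/conf/maprserverticket_copy Dest
```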

6. Copy the modified ticket to cluster B (all nodes), where it becomes /opt/mapr/conf/maprserverticket.

scp /opt/mapr/conf/maprserverticket_copy

7. Make sure the ownership and permissions of cldb.key and maprserverticket are correct (owned by mapr:mapr, readable and writable by the owner only).

[mapr@410-Dest ~]$ ls -l /opt/mapr/conf/cldb.key 
-rw------- 1 mapr mapr 89 Mar 18 15:51 /opt/mapr/conf/cldb.key
[mapr@410-Dest ~]$ ls -l /opt/mapr/conf/maprserverticket 
-rw------- 1 mapr mapr 282 Mar 18 16:11 /opt/mapr/conf/maprserverticket
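
The same check can be done without reading ls output by eye. A minimal sketch, assuming GNU stat; has_owner_and_mode is a hypothetical helper, and mode 600 corresponds to the -rw------- shown above.

```shell
# Succeed when the file has the expected owner:group and octal mode (GNU stat).
has_owner_and_mode() {
    local file="$1" owner="$2" mode="$3"
    [ "$(stat -c '%U:%G %a' "$file")" = "$owner $mode" ]
}

# Example:
#   has_owner_and_mode /opt/mapr/conf/cldb.key mapr:mapr 600 || echo "fix cldb.key ownership or mode"
```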

8. Start ZooKeeper and warden on the respective nodes.

9. Make sure the services are up and running on cluster B, then attempt CopyTable from the source.

Run the CopyTable command on the source cluster:

 hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr= weblog
2016-03-18 16:28:39,679 INFO  [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-03-18 16:28:39,874 INFO  [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-mapr-1503--1, built on 03/26/2015 18:33 GMT
2016-03-18 16:28:39,874 INFO  [main] zookeeper.ZooKeeper: Client environment:java.version=1.7.0_45

2016-03-18 16:28:39,887 INFO  [main] zookeeper.ZooKeeper: Client
2016-03-18 16:28:39,887 INFO  [main] zookeeper.ZooKeeper: Client environment:user.home=/home/mapr
2016-03-18 16:28:39,887 INFO  [main] zookeeper.ZooKeeper: Client environment:user.dir=/opt/mapr/zookeeper/zookeeper-3.4.5/logs
2016-03-18 16:28:39,888 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString= sessionTimeout=30000 watcher=com.mapr.util.zookeeper.ZKDataRetrieval@2e14ec19
2016-03-18 16:28:39,914 INFO  [main-SendThread(410-Dest:5181)] zookeeper.Login: successfully logged in.
2016-03-18 16:28:39,915 INFO  [main-SendThread(410-Dest:5181)] client.ZooKeeperSaslClient: Client will use MAPR-SECURITY as SASL mechanism.
2016-03-18 16:28:39,918 INFO  [main-SendThread(410-Dest:5181)] zookeeper.ClientCnxn: Opening socket connection to server 410-Dest/ Will attempt to SASL-authenticate using Login Context section 'Client'
2016-03-18 16:28:39,944 INFO  [main-SendThread(410-Dest:5181)] zookeeper.ClientCnxn: Session establishment complete on server 410-Dest/, sessionid = 0x1538c23bd930050, negotiated timeout = 30000
2016-03-18 16:28:39,947 INFO  [main] zookeeper.ZKDataRetrieval: Connected to ZK:
2016-03-18 16:28:39,947 INFO  [main] zookeeper.ZKDataRetrieval: Getting serviceData for master node of resourcemanager
2016-03-18 16:28:39,965 INFO  [main-EventThread] zookeeper.ZKDataRetrieval: Process path: null. Event state: SaslAuthenticated. Event type: None
2016-03-18 16:28:39,975 INFO  [main] client.MapRZKBasedRMFailoverProxyProvider: Updated RM address to 410-Dest/
2016-03-18 16:28:40,279 INFO  [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-03-18 16:28:40,373 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x74727a94 connecting to ZooKeeper ensemble=
2016-03-18 16:28:40,374 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString= sessionTimeout=90000 watcher=hconnection-0x74727a940x0, quorum=, baseZNode=/hbase
2016-03-18 16:28:40,375 INFO  [main-SendThread(410-Source:5181)] client.ZooKeeperSaslClient: Client will use MAPR-SECURITY as SASL mechanism.
2016-03-18 16:28:40,376 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Opening socket connection to server 410-Source/ Will attempt to SASL-authenticate using Login Context section 'Client'
2016-03-18 16:28:40,383 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Socket connection established to 410-Source/, initiating session
2016-03-18 16:28:40,409 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Session establishment complete on server 410-Source/, sessionid = 0x1538bc340af0076, negotiated timeout = 40000
2016-03-18 16:28:40,435 INFO  [main] mapreduce.TableOutputFormat: Created table instance for weblog
2016-03-18 16:28:41,032 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x20828fe4 connecting to ZooKeeper ensemble=
2016-03-18 16:28:41,032 INFO  [main] zookeeper.ZooKeeper: Initiating client connection, connectString= sessionTimeout=90000 watcher=hconnection-0x20828fe40x0, quorum=, baseZNode=/hbase
2016-03-18 16:28:41,033 INFO  [main-SendThread(410-Source:5181)] client.ZooKeeperSaslClient: Client will use MAPR-SECURITY as SASL mechanism.
2016-03-18 16:28:41,033 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Opening socket connection to server 410-Dest/ Will attempt to SASL-authenticate using Login Context section 'Client'
2016-03-18 16:28:41,034 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Socket connection established to 410-Dest/, initiating session
2016-03-18 16:28:41,067 INFO  [main-SendThread(410-Source:5181)] zookeeper.ClientCnxn: Session establishment complete on server 410-Source/, sessionid = 0x1538c23bd930051, negotiated timeout = 40000
2016-03-18 16:28:41,115 INFO  [main] util.RegionSizeCalculator: Calculating region sizes for table "weblog".
2016-03-18 16:28:41,500 INFO  [main] mapreduce.JobSubmitter: number of splits:1
2016-03-18 16:28:41,511 INFO  [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-03-18 16:28:41,765 INFO  [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1458345179933_0003
2016-03-18 16:28:41,964 INFO  [main] security.ExternalTokenManagerFactory: Initialized external token manager class -
2016-03-18 16:28:42,001 INFO  [main] impl.YarnClientImpl: Submitted application application_1458345179933_0003
2016-03-18 16:28:42,063 INFO  [main] mapreduce.Job: The url to track the job: https://410-Source:8090/proxy/application_1458345179933_0003/
2016-03-18 16:28:42,064 INFO  [main] mapreduce.Job: Running job: job_1458345179933_0003
2016-03-18 16:28:53,332 INFO  [main] mapreduce.Job: Job job_1458345179933_0003 running in uber mode : false
2016-03-18 16:28:53,334 INFO  [main] mapreduce.Job:  map 0% reduce 0%
2016-03-18 16:28:59,602 INFO  [main] mapreduce.Job:  map 100% reduce 0%
2016-03-18 16:28:59,622 INFO  [main] mapreduce.Job: Job job_1458345179933_0003 completed successfully
2016-03-18 16:28:59,786 INFO  [main] mapreduce.Job: Counters: 41
            File System Counters
                        FILE: Number of bytes read=0
                        FILE: Number of bytes written=111880
                        FILE: Number of read operations=0
                        FILE: Number of large read operations=0
                        FILE: Number of write operations=0
                        MAPRFS: Number of bytes read=66
                        MAPRFS: Number of bytes written=0
                        MAPRFS: Number of read operations=11
                        MAPRFS: Number of large read operations=0
                        MAPRFS: Number of write operations=0
            Job Counters 
                        Launched map tasks=1
                        Data-local map tasks=1
                        Total time spent by all maps in occupied slots (ms)=4698
                        Total time spent by all reduces in occupied slots (ms)=0
                        Total time spent by all map tasks (ms)=4698
                        Total vcore-seconds taken by all map tasks=4698
                        Total megabyte-seconds taken by all map tasks=4810752
            Map-Reduce Framework
                        Map input records=2
                        Map output records=2
                        Input split bytes=66
                        Spilled Records=0
                        Failed Shuffles=0
                        Merged Map outputs=0
                        GC time elapsed (ms)=51
                        CPU time spent (ms)=1400
                        Physical memory (bytes) snapshot=223748096
                        Virtual memory (bytes) snapshot=1834762240
                        Total committed heap usage (bytes)=159383552
            HBase Counters
            File Input Format Counters 
                        Bytes Read=0
            File Output Format Counters 
                        Bytes Written=0
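
In the counters above, Map input records and Map output records are both 2, which is what a complete copy of a two-row table should look like. A small sketch that makes the same comparison on saved job output; copied_all_rows and the copytable.log file name are made up for illustration.

```shell
# Read MapReduce job output on stdin; succeed when the "Map input records"
# and "Map output records" counters are both present and equal.
copied_all_rows() {
    awk -F= '
        /Map input records=/  { input = $2 }
        /Map output records=/ { output = $2 }
        END { exit !(input != "" && input == output) }
    '
}

# Example, with the CopyTable output saved to a file first:
#   copied_all_rows < copytable.log && echo "all rows read were written"
```

Matching counters only show that every row read was emitted; scanning the table on the destination is still the direct check.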