Big Data / Systems / Cloud / AI : March 2017

Friday, March 31, 2017

Monitoring MapR Secured cluster.

Monitoring infra structure is single most important thing which is needed in cluster to watch out for any abnormalities and pro-actively fix before the problem aggrevates to cause major issues.

This blog assumes you have 5.2 Secured mapr cluster installed and is up and running. Now follow below steps to setup Monitoring infra for the cluster.

1) Verify root and mapr user has valid ticket if not please create.

[root@node10 ~]# maprlogin password

[Password for user 'root' at cluster 'Destination': ] 

MapR credentials of user 'root' for cluster 'Destination' are written to '/tmp/maprticket_0'

[mapr@node10 root]$ maprlogin password

[Password for user 'mapr' at cluster 'Destination': ] 

MapR credentials of user 'mapr' for cluster 'Destination' are written to '/tmp/maprticket_2000'

2) Setup MEP repo compatible with Mapr cluster version you installed .

[root@node10 ~]# cat /etc/yum.repos.d/mapr_core.repo 

[MapR_Core]

name = MapR Core Components

baseurl = http://package.mapr.com/releases/v5.2.0/redhat

gpgcheck = 1

enabled = 1

protected = 1

[maprecosystem]

name = MapR Technologies

baseurl = http://package.mapr.com/releases/MEP/MEP-2.0/redhat/

enabled=1

gpgcheck=0

protect=1

3) Clean up the repo's and install below packages 

Metric Monitoring

yum clean all

i)  yum install mapr-collectd mapr-grafana mapr-opentsdb -y

Verify packages get installed correctly along with the dependencies.

Installed:

  mapr-collectd.x86_64 0:5.5.1.201612081135-1                 mapr-grafana.x86_64 0:3.1.1.201612081149-1                 mapr-opentsdb.noarch 0:2.2.1.201612081520-1                

Dependency Installed:

  mapr-asynchbase.noarch 0:1.7.0.201611162116-1                                                                                                                                      

Complete!

Note : I am demo'ing with single node cluster but in real scenarios you will have more nodes . So collectd can run on all the nodes while Opentsdb/grafana can run on one node which has very few services installed.

Log Monitoring Architecture

Verify packages get installed correctly

ii) yum install mapr-fluentd mapr-elasticsearch mapr-kibana -y

Installed:

  mapr-elasticsearch.noarch 0:2.3.3.201612081350-1               mapr-fluentd.x86_64 0:0.14.00.201612081135-1               mapr-kibana.x86_64 0:4.5.1.201612081226-1              

Complete!

4) Run configure.sh along with below options on every node for cluster to know about Elastic search and OpenTSDB node.

/opt/mapr/server/configure.sh -R -ES 10.10.70.110 -OT 10.10.70.110

Verify configuration goes smooth and no errors are thrown.

Configuring Hadoop-2.7.0 at /opt/mapr/hadoop/hadoop-2.7.0

Done configuring Hadoop

Node setup configuration:  cldb collectd elasticsearch fileserver fluentd gateway grafana hbinternal kibana opentsdb webserver zookeeper

Log can be found at:  /opt/mapr/logs/configure.log

Running config for package collectd

Running config for package opentsdb

17/03/31 15:59:32 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum

17/03/31 15:59:32 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.

Running config for package grafana

MAC verified OK

{"id":1,"message":"Datasource added"}Running config for package elasticsearch

Running config for package fluentd

Running config for package kibana

MAC verified OK

Now from from the MCS, select the Grafana/OpenTSDB/Kibana view

http://<IPaddressOfOpenTSDBNode>:4242

GRAFANA :

LINK Username/passwd

http://<IPaddressOfGrafanaNode>:3000 admin/admin

Below is sample CLDB Dashboard.

Below is sample Node Dashboard.

Note : If grafana is not displaying details it might be related to known bug in OpenTSDB older version fixed in version 1.3.0 ( https://maprdrill.atlassian.net/browse/SPYG-806 )

Below is workaround to fix the issue.

On all openTSDB nodes

1) Edit /opt/mapr/opentsdb/*/etc/opentsdb/opentsdb.conf

Add: tsd.core.meta.enable_tsuid_tracking = true under the --------- CORE ---------- section.

2) Restart all OpenTSDB services

3) cd /opt/mapr/opentsdb/*/bin

4) Run # ./tsdb <OpenTSDB pid> metasync

5) Restart Grafana and navigate back to the Grafana dashboards

KIBANA :

http://<IPaddressOfKibanaNode>:5601

When the Kibana page loads, it displays (Setting tab) Configure an index pattern screen. Provide following values:

Note: The Index contains time-based events option is selected by default and should remain selected.

Field Value

Index name or pattern mapr_monitoring-*

Time-field @timestamp
Click Create.

Field	Value
Index name or pattern	mapr_monitoring-*
Time-field	@timestamp

Now new Index pattern is created.

Now click on Discover Tab, it will display the logs indexed also it will give an option to search for patterns.

Thursday, March 30, 2017

Jstat

A lot of time while troubleshooting performance issue we end up narrowing down to point where Java process went on break for doing Full GC . During this break clients wait on response and obviously there is delay in response, which is seen as performance issue if the java process goes in full GC very often indicating Heap memory given to Java process is not enough and just increasing the heap memory followed by restart of the process should alleviate the problem if there are no memory leak. To find what is accumulated in process heap you can follow below blog.

http://abizeradenwala.blogspot.com/2017/03/heap-dump.html

Example : if CLDB doesn't respond and on reviewing CLDB logs we see below message. It is very likely Java process was doing Full GC hence stopped .

2016-03-02 17:12:30,489 WARN CLDBServer [Health-Monitor]: JVM Monitor thread not scheduled for 2440 milliseconds. This could be due to one of the following: GC pause, all CPUs being busy, or a JVM freeze.

To confirm if GC was kicked in during the time we can use jstat to get details, Here FGC and GCT are most important which will give number of times Java process went for full GC and total number of seconds spent in GC . Obviously we need to take few iteration a minute or 2 apart during the time of issue and confirm, if the counters increase it would be confirmed GC is kicking in and cause of JVM pause.

Examples : jstat -gcutil <Java-PID> <Milli Sec> <Number of iteration>

[root@node9 ~]# jstat -gcutil 9424 1000 10

S0 S1 E O P YGC YGCT FGC FGCT GCT

0.14 0.00 80.11 2.14 99.90 112 0.864 0 0.000 0.864

0.14 0.00 80.13 2.14 99.90 112 0.864 0 0.000 0.864

0.14 0.00 80.18 2.14 99.90 112 0.864 0 0.000 0.864

0.14 0.00 80.19 2.14 99.90 112 0.864 0 0.000 0.864

[root@node9 ~]# jstat -gc 9424 1000 10

S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT

32448.0 32448.0 45.6 0.0 260032.0 224388.9 259776.0 5550.0 33536.0 33503.4 112 0.864 0 0.000 0.864

32448.0 32448.0 45.6 0.0 260032.0 224405.3 259776.0 5550.0 33536.0 33503.4 112 0.864 0 0.000 0.864

32448.0 32448.0 45.6 0.0 260032.0 228188.9 259776.0 5550.0 33536.0 33503.4 112 0.864 0 0.000 0.864

32448.0 32448.0 45.6 0.0 260032.0 228342.3 259776.0 5550.0 33536.0 33503.4 112 0.864 0 0.000 0.864

32448.0 32448.0 45.6 0.0 260032.0 228353.2 259776.0 5550.0 33536.0 33503.4 112 0.864 0 0.000 0.864

32448.0 32448.0 45.6 0.0 260032.0 228369.5 259776.0 5550.0 33536.0 33503.4 112 0.864 0 0.000 0.864

S0C – S0 Capacity; the capacity in KB of Survivor Space 0.
S1C – S1 Capacity; the capacity in KB of Survivor Space 1.
S0 -- Survivor space 0 utilization as a percentage of the space's current capacity.
S0U – S0 Utilization; how much space in KB is being used in Survivor space 0.
S1U – S1 Utilization; how much space in KB is being used in Survivor space 1.
S1 -- Survivor space 1 utilization as a percentage of the space's current capacity.
EC – Eden Capacity; the capacity, in KB, of Eden.
EU – Eden Utilization; how much space, in KB, is being used in Eden.
E -- Eden space utilization as a percentage of the space's current capacity.
OC – Old Capacity; the capacity, in KB, of the Old Generation.
OU – Old Utilization; how much space, in KB, is being used by the Old Generation.
O -- Old space utilization as a percentage of the space's current capacity.
PC – Perm Capacity; the capacity, in KB, of the Permanent Generation.
PU – Perm Utilization; how much space, in KB, is being used by the Permanent Generation.
P -- Permanent space utilization as a percentage of the space's current capacity.
YGC – Young Garbage Collections; the number of young garbage collections that this JVM has performed so far.
YGCT – Young Garbage Collection Time; the number of seconds that this JVM has spent performing young garbage collections so far.
FGC – Full Garbage Collections; the number of full garbage collections this JVM has performed so far.
FGCT – Full Garbage Collection Time; the number of seconds that this JVM has spent performing young garbage collections so far.
GCT – Garbage Collection Time; the total number of seconds that this JVM has spent performing garbage collections so far. (YGCT + FGCT).

Wednesday, March 29, 2017

Mapr Mirroring details (Steps from MFS logs )

Mapr Mirroring details (Steps from MFS logs )

This is continuation of earlier blog but will concentrate from MFS side . MFS main work during the mirroring is creating snapshots for source and destination followed by re-syncing for the data to be exactly same as source and role forwarding the containers in destination mirror snapshot to mirror volume to be part of final Mirror volume followed by deletion of any mirror snapshots. In short CLDB only give direction MFS is the one who does all the work during mirroring operation. (http://abizeradenwala.blogspot.com/2017/03/mapr-mirroring-details-steps-from-cldb.html)

1) First MFS creates mirror snapshot of the volume on source .

2017-03-27 15:50:33,5013 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2243 snapcid 256000049

2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2245 snapcid 256000050

2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2246 snapcid 256000051

2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2247 snapcid 256000052

2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2248 snapcid 256000053

2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2249 snapcid 256000054

2017-03-27 15:50:33,5263 INFO Snapshot snapshot.cc:324 snap container 2243 : Creating snapshot 256000049 snapId 256000049

2017-03-27 15:50:33,8601 INFO Snapshot volumesnap.cc:895 Snapped volume : snap mirrorsrcsnap.19731699.27-Mar-2017-15-50-33 snapId 256000049 nameCid 2243 took 397 msec

2) Now MFS will create mirrosnapshot on destination volume. 

2017-03-27 15:50:35,3261 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2244 snapcid 256000055

2017-03-27 15:50:35,3364 INFO Snapshot snapshot.cc:324 snap container 2244 : Creating snapshot 256000055 snapId 256000050

2017-03-27 15:50:35,4183 INFO Snapshot volumesnap.cc:895 Snapped volume : snap mirrorsnap.27-Mar-2017-15-50-35 snapId 256000050 nameCid 2244 took 112 sec

3) During this step CLDB has collected the info on how many containers exist on source volume and instructs MFS to create same number of containers on mirror volume mirror volume for containers to be mapped 1-1 for resync (Source - Destination)

2017-03-27 15:50:36,1145 INFO Container create.cc:805 Created Container on sp SP2:/dev/sdd with cid 2250 rindex 0

2017-03-27 15:50:36,1145 INFO Container create.cc:805 Created Container on sp SP2:/dev/sdd with cid 2254 rindex 0

2017-03-27 15:50:36,1256 INFO Container create.cc:805 Created Container on sp SP1:/dev/sdc with cid 2251 rindex 0

2017-03-27 15:50:36,1256 INFO Container create.cc:805 Created Container on sp SP1:/dev/sdc with cid 2253 rindex 0

2017-03-27 15:50:36,1256 INFO Container create.cc:805 Created Container on sp SP1:/dev/sdc with cid 2252 index 0

4) Now MFS on source and destination have to communicate and finish the resync for data to be transferred for which CID on source and destination are paired.

2017-03-27 15:50:37,8192 INFO Replication containerrestore.cc:5066 Get Resync Status and Start Resync WA 0xd915c018 replicacid 2244 srccid 256000049  mirrord 2

2017-03-27 15:50:37,8192 INFO Replication resyncworkareas.cc:125 FindContainerStoppedWA inputs replicacid 2244 srccid 256000049 wa (nil)

2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2243 destcid 2244

2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2248 destcid 2250

2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2245 destcid 2251

2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2246 destcid 2252

2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2247 destcid 2253

2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2249 destcid 2254

5) To explain the details of this resync process I will only concentrate on one pair of container which will be resynced .

Creator container        Mirror vol container

2243(Snap 256000049)          -- >     2244 (Snap 4288880229)

i) First we create clone of destination rw container

2017-03-27 15:50:38,2581 INFO Replication containerrestore.cc:1289 Creating clone container of rwcid 2244

2017-03-27 15:50:38,2581 INFO Snapshot snapshot.cc:324 snap container 2244 : Creating snapshot 4288880229 snapId 0

ii) Now resync from srccid which is Snap cid 256000049 to replica clone cid ( 4288880229 ) starts.

2017-03-27 15:50:38,6469 INFO Replication containerrestore.cc:1690 Starting resync from TxnVN 0:0 SnapVN 35:35 WriteVN 0:0 srccid 256000049 replicacid 2244 needreconnect 0

2017-03-27 15:50:38,6471 INFO Replication containerresync.cc:176 CONTAINER_RESYNC_START -- destnode FSID 5979527104878929364, 10.10.70.109:5660, 10.10.70.109:5692,, srccid 256000049, replicacid 2244, wa 0xe34bc000, resyncwacount 8 replica txnvn 0:0 writevn 0:0 snapvn 35:35 chainSeqNumber 0 destRWMirrorId 0 destCloneMirrorId 0, destVolumeId 19731699, metaDataOnly 0, replica compression type 3

2017-03-27 15:50:38,6471 INFO Replication containerresync.cc:178 srccid 256000049 replicacid 2244 replicaIsRwmirrorCapable 1 srcPrevVolSnapId 0

2017-03-27 15:50:38,6472 INFO Replication containerresync.cc:1025 Converting the resync to full resync because VNs are all zero for srccid 256000049, replicacid 2244, destnode 10.10.70.109:5660

2017-03-27 15:50:38,6472 INFO Replication containerresync.cc:2929 Resync of cid 2244 snapcontainer 256000049 prevcid 0 prevresynccid 0 txnvn 0 writevn 0 snapvn 35

2017-03-27 15:50:38,6472 INFO Replication containerresyncfromsnapshot.cc:162 CONTAINER_RESYNC_FROM_SNAPSHOT_START -- from cid 256000049 replica 2244 txnVN 0 snapVN 35 writeVN 0 isundo 0, rollforwardcontainer 0 dumpSnapshotInode 0, isRwmirrorCapable 1, replicaIsRwmirrorCapable 1, using starting Inode 0 send starting Inode 0 

2017-03-27 15:50:38,6472 INFO Replication containerresyncfromsnapshot.cc:374  srccid 256000049 replicacid 2244 sessionid 4020444768 undoneeded 0 txnundoneeded 0 writeundoneeded 0 snapundoneeded 0 undotxnvn 0 undowritevn 0 undosnapvn 18446744073709551615 redotxnvn 0 redowritevn 0 redosnapvn 0

2017-03-27 15:50:38,6472 INFO Replication containerresyncfromsnapshot.cc:383 Converting the resync to full resync because VNs are all zero for srccid 256000049, replicacid 2244 

2017-03-27 15:50:38,6472 INFO Replication containerresyncfromsnapshot.cc:422 Redoing replica cid 2244 from txnvn:0 snapvn:0 writevn:0 

2017-03-27 15:50:38,7820 INFO Replication containerresyncfromsnapshot.cc:1564 ResyncFromSnapshot resyncing versioninfo srccid:256000049 replicacid:2244TxnVN 1048700:1048700 WriteVN 1048698:1048698 snapVN 34:34 maxUniq 262570, maxinum 255

2017-03-27 15:50:38,7822 INFO Replication inoderestore.cc:836 Container restore for cid 4288880229, srccid 256000049, replicacid 2244, maxinum on replica is 255 and on src is 255, no inode delete needed

2017-03-27 15:50:38,7822 INFO Replication containerrestore.cc:3249 ContainerRestore updating versioninfo for cid 2244 txnvn 1048700:1048700 writevn 1048698:1048698 snapvn 35:35 maxUniq 262570 updateSnapVnSpace 1

2017-03-27 15:50:38,8825 INFO Replication containerrestore.cc:4021 RestoreDataEnd Complete resync data WA 0xd90ba018 srccid 256000049 replicacid 2244

2017-03-27 15:50:38,8828 INFO Replication containerresyncfromsnapshot.cc:1977 CONTAINER_RESYNC_FROM_SNAPSHOT_END-- from cid 256000049, replica 2244, time 235 ms, numInodes 16, numInodesIterated 256, numBytes 68986, numMsgs 18, err 0

2017-03-27 15:50:38,8828 INFO Replication containerresync.cc:3070 Resync from snapshot completed srccid:256000049 replicacid:2244 resynccid:256000049 err 0 svderr 0

2017-03-27 15:50:38,8828 INFO Replication containerresync.cc:3885 wa->srccid=256000049 set_mirrorstate=3 replicaCid=2244

2017-03-27 15:50:38,8828 INFO Replication containerrestore.cc:4615 RestoreContainerCleanupGotRWContainer cid 2244 replicatecleanup 1 isMaster 1 isStale 0

2017-03-27 15:50:38,9025 INFO Replication containerrestore.cc:4734 Resync complete for srccid 256000049 replicacid 2244  srcnode 10.10.70.109:5660 rollforward 0

2017-03-27 15:50:38,9025 INFO Replication containerrestore.cc:4981 CONTAINER_RESTORE_END -- Resync of container successful srcnode FSID 5979527104878929364, 10.10.70.109:5660, 10.10.70.109:5692,, srccid 256000049, replicacid 2244, time 1082 ms, numMsgs 21, numBytes 69073

2017-03-27 15:50:38,9025 INFO Replication containerrestore.cc:4521 RestoreEnd Complete resync data WA 0xd8e32018 srccid 256000049 replicacid 2244

2017-03-27 15:50:38,9025 INFO Replication containerresync.cc:3944 ResyncContainer complete srccid 256000049 replicacid 2244 err 0x0

2017-03-27 15:50:38,9025 INFO Replication containerresync.cc:3282 CONTAINER_RESYNC_END -- destnode FSID 5979527104878929364, 10.10.70.109:5660, 10.10.70.109:5692, srccid 256000049, replicacid 2244, time 255 ms, numIterations 1, numInodesIterated 256numInodes 16, numMsgs 18, numBytes 68986, err 0

iii) Next steps after resync is role forward where the clone container 4288880229 role is swapped with 2244 and roll forward flag in the container is cleared ( In turn all the data us available in destination container now )

2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:318 Updating the mirrorid of container 2244 to mirrorId 1 nextMirrorId 2 

2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:397 Setting the rollforward flag for cid 2244

2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:530 UpdateMirrorId cid 2244 CLDB mirrorId 1 CLDB nextMirrorId 2 FS mirrorId 0 FS cloneMirrorId 2

2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:556 UpdateMirrorId cid 2244 version of clone container txnvn 1048700:1048700, writevn 1048698:1048698, snapvn 35:35

2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880229 for cid 2244

2017-03-27 15:53:35,2580 INFO Container rollback.cc:89 RollForward 0 :Readwrite cid 2244.

2017-03-27 15:53:35,2732 INFO Container rollback.cc:990 RollForward 2244 :Clearing rollforward in progress flag for cid 2244

6) Once resync for all the containers is completed as explained in above step , all the resynced clone containers need to roll forward to the original CID .

2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880229 for cid 2244

2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880224 for cid 2251

2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880228 for cid 2254

2017-03-27 15:53:35,2581 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880227 for cid 2250

2017-03-27 15:53:35,2581 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880226 for cid 2252

2017-03-27 15:53:35,2581 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880225 for cid 2253

7) As a final step Clone containers created for resync are deleted followed by deleting source mirror snapcid ( Snap volume completely )

2017-03-27 15:53:35,2735 INFO Container delete.cc:1528 Delete container 4288880229 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:35,2736 INFO Container delete.cc:1528 Delete container 4288880224 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:35,2736 INFO Container delete.cc:1528 Delete container 4288880228 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:35,2736 INFO Container delete.cc:1528 Delete container 4288880226 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:35,2736 INFO Container delete.cc:1528 Delete container 4288880227 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:35,2736 INFO Container delete.cc:1528 Delete container 4288880225 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:36,0959 INFO Container delete.cc:1528 Delete container 256000052 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:36,0959 INFO Container delete.cc:1528 Delete container 256000053 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:36,0960 INFO Container delete.cc:1528 Delete container 256000049 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:36,0960 INFO Container delete.cc:1528 Delete container 256000050 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:36,0961 INFO Container delete.cc:1528 Delete container 256000051 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:36,0962 INFO Container delete.cc:1528 Delete container 256000054 : Ownership transfer for cntr-tree not required.

Increasing Log retention

Increasing Log retention

Logs are the first place we check when we need to troubleshoot an issue looking for error's or any clue what went wrong. Some times default log retention are not enough for us to go back in time to figure out actually what happened and get RC of the issue. This blogs talk about various configs which can be tweaked for longer log retention for various services.

MFS : In configuration file /opt/mapr/conf/mfs.conf increase value for below config.

mfs.max.logfile.size.in.mb

maxSizePerLogFile = maxLogSize / MAX_NUM_OF_LOG_FILES

where

maxLogSize = total amount of space that MFS log files can consume
MAX_NUM_OF_LOG_FILES = total number of MFS log files ( 5 hard coded )

Lets say we put this value to 10Gb then you will have 5 log file (mfs.log-3 to mfs.log-3.4) of 2 GB each.

CLDB : In configuration file /opt/mapr/conf/log4j.cldb.properties

log4j.appender.R.MaxFileSize ( Size of each cldb.log before rolling over )

log4j.appender.R.MaxBackupIndex ( Number of cldb.log* before oldest one is deleted )

Lets say i set log4j.appender.R.MaxFileSize to 1024Mb and log4j.appender.R.MaxBackupIndex to 10 then I will have 10 GB to total CLDB logs before they are purged.

Resource Manager : In configuration file /opt/mapr/hadoop/hadoop-<Version>/etc/hadoop/log4j.properties

hadoop.log.maxfilesize ( Size of each yarn-mapr-resourcemanager-<hostname>.log before rolling over )

hadoop.log.maxbackupindex ( Number of yarn-mapr-resourcemanager-<hostname>.log* before oldest one is deleted )

Lets say i set hadoop.log.maxfilesize to 1024Mb and hadoop.log.maxbackupindexto 10 then I will have 10 GB to total RM logs before they are purged.

Zookeeper : By default zookeeper keeps 10mb of logs and max backup index upto 4

ZK log size is controlled by below log4j.properties /opt/mapr/zookeeper/zookeeper-3.4.5/conf/log4j.properties file

# Max log file size of 10MB
log4j.appender.ROLLINGFILE.MaxFileSize=10MB
# uncomment the next line to limit number of backup files
#log4j.appender.ROLLINGFILE.MaxBackupIndex=10
log4j.appender.ROLLINGFILE.MaxBackupIndex=4

Ideally to retain logs for longer period set log4j.appender.ROLLINGFILE.MaxFileSize to 512 MB and log4j.appender.ROLLINGFILE.MaxBackupIndex to 10 followed by ZK restart on all ZK nodes in rolling fashion for the new config to be picked.

Monday, March 27, 2017

Mapr Mirroring details (Steps from CLDB logs )

Mapr Mirroring details (Steps from CLDB logs )

A mirror volume is a read-only physical copy of another volume, i.e source volume. We can use mirror volumes in the same cluster (local mirroring) to provide local load balancing by using mirror volumes to serve read requests for the most frequently accessed data in the cluster or as backup solution incase someone drops/deletes source volume.

This Blog assumes you have 5.2 secure cluster with correct ticket for access and permission to initiate mirror. Below are command line steps to create mirror / Same can be achieved via MCS .
1) Create source volume.

maprcli volume create -name srcvol -path /srcvol

2) Create mirror volume "mirrorvol" which is mirror of source volume "srcvol" .

maprcli volume create -name mirrorvol -source srcvol@Source -path /mirrorvol -type mirror

3) Loading source volume with some data for us to mirror it gradually.

dd if=/dev/zero of=11gfile bs=65536 count=163840

hadoop fs -put 11gfile /srcvol/

4) Now start the mirror.

[mapr@node9 logs]$ maprcli volume mirror start -name mirrorvol

messages

Started mirror operation for volume(s) 'mirrorvol'

[mapr@node9 logs]$

In this blog we will review CLDB logs to understand detailed steps/Stages during mirroring.

i) Mirroring starts (STATE_MIRROR_STARTED ):

2017-03-27 15:50:31,869 INFO VolumeUpdateManager [RPC-2]: VolumeUpdate: VolName: mirrorvolStarting mirror op STATE_UPDATE

ii) Now collects details of source and mirror volume as a setup process :

2017-03-27 15:50:31,894 INFO VolumeMirror [RPC-2]: Starting mirror of volume mirrorvol, cluster Source volumeId 19731699, mirrortype MIRROR_TYPE_LIVE, rollforward post mirror true, snap expiry time 0, dest creds uid 2000, src creds uid 2000 binding 43470384 cldbPort 7222 restoreSessionId 0

2017-03-27 15:50:31,951 INFO VolumeMirrorInfo [pool-5-thread-1]: Read mirror status of volume mirrorvol is STATE_MIRROR_STARTED

2017-03-27 15:50:32,160 INFO VolumeMirrorInfo [pool-5-thread-3]: Got the list of containers in the destination volumemirrorvol count 1 number of master nodes for volume 1

iii) Create mirrorsnapshot on source .

2017-03-27 15:50:33,458 INFO VolumeMirrorInfo [pool-5-thread-5]: Creating source snapshot mirrorsrcsnap.19731699.27-Mar-2017-15-50-33 of volume srcvol@Source src vol id 2238200 src root cid 2243 for mirroring of mirrorvol@Sourceenabled featurelist - mfs.feature.devicefile.support, mfs.feature.tables, mfs.feature.db.spillv2.support, mfs.feature.db.ace.support, mfs.feature.db.regionmerge.support, mfs.feature.filecipherbit.support, mfs.feature.db.bulkload.support, mfs.feature.db.repl.support, mfs.feature.db.json.support, mfs.feature.rwmirror.support, mfs.feature.fileace.support, mfs.feature.hardlinks.support, mfs.feature.volume.upgrade, mfs.feature.audit.support

2017-03-27 15:50:33,863 INFO VolumeMirrorInfo [pool-5-thread-5]: Created Source volume snapshotmirrorsrcsnap.19731699.27-Mar-2017-15-50-33 snapshotId 256000049 for src volume srcvol@Source mirror volume mirrorvol@Source dataSrcSnapCreateTimeMillis 1490655033490

iv) Gets list if containers on source Snapshot for mirroring .

2017-03-27 15:50:34,018 INFO VolumeMirrorInfo [pool-5-thread-6]: Got the list of containers in the source snapshot for mirorring of mirrorvol@Source src snapshot name mirrorsrcsnap.19731699.27-Mar-2017-15-50-33 count 6

v) Create mirrosnapshot on destination.

2017-03-27 15:50:35,304 INFO VolumeMirrorInfo [pool-5-thread-8]: Creating destination snapshot mirrorsnap.27-Mar-2017-15-50-35 of volume mirrorvol@Source dest vol id 19731699 dest rw cid 2244 snap expiry time 1491259835303

2017-03-27 15:50:35,418 INFO VolumeMirrorInfo [pool-5-thread-8]: Created destination volume snapshotmirrorsnap.27-Mar-2017-15-50-35 snapshotId 256000050 for src volume srcvol@Source mirror volume mirrorvol@Source retry count 0

vi) Now CLDB role is only to get update on status for mirror progress from MFS ( MFS are directly involved in resync 1-1 ) . Status changes to STATE_MIRROR_RESYNC_INPROGRESS .

2017-03-27 15:50:35,959 INFO VolumeUpdateManager [RPC-6]: Saving the src vol snapshot id 256000049 src vol snapshot name mirrorsrcsnap.19731699.27-Mar-2017-15-50-33

2017-03-27 15:50:35,985 INFO VolumeMirrorInfo [pool-5-thread-10]: Update mirror status of volume mirrorvol@Source newstate STATE_MIRROR_RESYNC_INPROGRESS mirrorid 1 nextMirrorId 2 completed successfully. srcsnapshot name mirrorsrcsnap.19731699.27-Mar-2017-15-50-33 destsnap name mirrorsnap.27-Mar-2017-15-50-35 destsnap id 256000050 srcsnap id 256000049

During this resync process if containers are needed for Mapping containers from source to mirror volume they are created.

2017-03-27 15:50:36,135 INFO ContainerAllocator [RPC-2]: ContainerCreate Container ID:2254 vol:19731699 Servers: 10.10.70.109

2017-03-27 15:50:36,136 INFO ContainerAllocator [RPC-9]: ContainerCreate Container ID:2252 vol:19731699 Servers: 10.10.70.109

2017-03-27 15:50:36,136 INFO ContainerAllocator [RPC-4]: ContainerCreate Container ID:2251 vol:19731699 Servers: 10.10.70.109

2017-03-27 15:50:36,137 INFO ContainerAllocator [RPC-7]: ContainerCreate Container ID:2253 vol:19731699 Servers: 10.10.70.109

2017-03-27 15:50:36,137 INFO ContainerAllocator [RPC-8]: ContainerCreate Container ID:2250 vol:19731699 Servers: 10.10.70.109

v) Resync Completes .

2017-03-27 15:53:34,922 INFO VolumeMirrorInfo [VolumeMirrorThread0]: Completed resync of containers of volume mirrorvol@Source

vi) Now the Status of the mirror changes to STATE_MIRROR_ROLLFORWARD_INPROGRESS. This stage is where snapshot containers in destination mirror volume are promoted to latest Copy which is part of Read only mirrored volume.

2017-03-27 15:53:35,067 INFO VolumeMirrorInfo [pool-5-thread-5]: Update mirror status of volume mirrorvol@Source newstate STATE_MIRROR_ROLLFORWARD_INPROGRESS mirrorid 1 nextMirrorId 2 completed successfully. srcsnapshot name mirrorsrcsnap.19731699.27-Mar-2017-15-50-33 destsnap name mirrorsnap.27-Mar-2017-15-50-35 destsnap id 256000050 srcsnap id 256000049

2017-03-27 15:53:35,301 INFO VolumeMirrorInfo [VolumeMirrorThread0]: Completed rollforward of containers of volume mirrorvol@Source

vii) Now source mirror snapshot is removed.

2017-03-27 15:53:35,352 INFO VolumeMirrorInfo [pool-5-thread-3]: Removing source snapshot id 256000049 of volume srcvol@Source for mirroring of mirrorvol@Source op StartMirror OpFailed false

2017-03-27 15:53:35,360 INFO SnapshotProcessor [RPC-2]: SnapshotRemove: Removing snapshot mirrorsrcsnap.19731699.27-Mar-2017-15-50-33 with snapshotid 256000049 for volume srcvol

2017-03-27 15:53:35,417 INFO PurgeExecutor [pool-1-thread-2]: [purgeSnapshot] Deleted snapshot with Id: 256000049

viii) Finally State for mirror changes to complete (STATE_MIRROR_COMPLETE)

2017-03-27 15:53:35,476 INFO VolumeMirrorInfo [pool-5-thread-9]: Update mirror status of volume mirrorvol@Source newstate STATE_MIRROR_COMPLETE mirrorid 1 nextMirrorId 2 completed successfully. srcsnapshot name destsnap name mirrorsnap.27-Mar-2017-15-50-35 destsnap id 256000050 srcsnap id 0

2017-03-27 15:53:35,552 INFO VolumeMirrorInfo [VolumeMirrorThread0]: Mirroring successfully completed for volume mirrorvol@Source From srcvol@Source

Some commands of interest :

To know 1:1 mapping between containers of mirror and source volume we can run below command to get the details.

[mapr@node9 logs]$ maprcli dump volumeinfo -volumename mirrorvol -json |egrep "ContainerId|CreatorContainerId|NameContainer"

"ContainerId":2244,

"NameContainer":"true",

"CreatorContainerId":2243,

"ContainerId":2250,

"NameContainer":"false",

"CreatorContainerId":2248,

"ContainerId":2251,

"NameContainer":"false",

"CreatorContainerId":2245,

"ContainerId":2252,

"NameContainer":"false",

"CreatorContainerId":2246,

"ContainerId":2253,

"NameContainer":"false",

"CreatorContainerId":2247,

"ContainerId":2254,

"NameContainer":"false",

"CreatorContainerId":2249,

Below command gives details on % of mirror completed (mirror-percent-complete) and volume type ("volumetype":1 indicates mirror volume) .

[mapr@node9 logs]$ maprcli volume info -name mirrorvol -json

{

"timestamp":1490660616571,

"timeofday":"2017-03-27 05:23:36.571 GMT-0700",

"status":"OK",

"total":1,

"data":[

{

"acl":{

"Principal":"User mapr",

"Allowed actions":[

"dump",

"restore",

"m",

"a",

"d",

"fc"

]

"creator":"mapr",

"aename":"mapr",

"aetype":0,

"numreplicas":"3",

"minreplicas":"2",

"nsNumReplicas":"3",

"nsMinReplicas":"2",

"allowGrant":"false",

"reReplTimeOutSec":"0",

"replicationtype":"high_throughput",

"rackpath":"/data",

"mirrorthrottle":"1",

"accesstime":"March 27, 2017",

"readonly":"1",

"mountdir":"/mirrorvol",

"volumename":"mirrorvol",

"mounted":1,

"quota":"0",

"advisoryquota":"0",

"snapshotcount":"1",

"logicalUsed":"10255",

"used":"1300",

"snapshotused":"0",

"totalused":"1300",

"scheduleid":0,

"schedulename":"",

"mirrorscheduleid":0,

"volumetype":1,

"mirrortype":2,

"creatorcontainerid":2243,

"creatorvolumeuuid":"-5484106972208618544:-9062173038321204647",

"volumeid":19731699,

"actualreplication":[

100,

"nameContainerSizeMB":0,

"nameContainerId":2244,

"needsGfsck":false,

"maxinodesalarmthreshold":"0",

"dbrepllagsecalarmthresh":"0",

"limitspread":"true",

"partlyOutOfTopology":0,

"auditVolume":0,

"audited":0,

"coalesceInterval":60,

"enableddataauditoperations":"getattr,setattr,chown,chperm,chgrp,getxattr,listxattr,setxattr,removexattr,read,write,create,delete,mkdir,readdir,rmdir,createsym,lookup,rename,createdev,truncate,tablecfcreate,tablecfdelete,tablecfmodify,tablecfScan,tableget,tableput,tablescan,tablecreate,tableinfo,tablemodify,getperm,getpathforfid,hardlink",

"disableddataauditoperations":"",

"mirrorSrcVolume":"srcvol",

"mirrorSrcVolumeId":2238200,

"mirrorSrcCluster":"Source",

"mirrorDataSrcVolume":"srcvol",

"mirrorDataSrcVolumeId":2238200,

"mirrorDataSrcCluster":"Source",

"lastSuccessfulMirrorTime":1490655215453,

"mirror-percent-complete":100,

"mirrorId":2,

"nextMirrorId":2,

"mirrorstatus":0,

"volumeAces":{

"readAce":"p",

"writeAce":"p"

"fixCreatorId":"false",

"ReplTypeConversionInProgress":"0"

}

]

}