Wednesday, March 29, 2017

MapR Mirroring details (Steps from MFS logs)

This is a continuation of the earlier post on the CLDB side of mirroring (http://abizeradenwala.blogspot.com/2017/03/mapr-mirroring-details-steps-from-cldb.html), but here I concentrate on the MFS side. During mirroring, MFS creates snapshots on the source and the destination, resyncs the data so that the destination is exactly the same as the source, rolls the resynced containers forward so that they become part of the final mirror volume, and finally deletes the mirror snapshots. In short, CLDB only gives directions; MFS is the one that does all the work during a mirroring operation.

1) First, MFS creates a mirror snapshot of the volume on the source.

2017-03-27 15:50:33,5013 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2243 snapcid 256000049
2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2245 snapcid 256000050
2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2246 snapcid 256000051
2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2247 snapcid 256000052
2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2248 snapcid 256000053
2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2249 snapcid 256000054

2017-03-27 15:50:33,5263 INFO Snapshot snapshot.cc:324 snap container 2243 : Creating snapshot 256000049 snapId 256000049
2017-03-27 15:50:33,8601 INFO Snapshot volumesnap.cc:895 Snapped volume : snap mirrorsrcsnap.19731699.27-Mar-2017-15-50-33 snapId 256000049 nameCid 2243 took 397 msec
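When following a longer log, the "Added snap map" lines above can be collected into a lookup table from each read-write container to its snapshot container. This is a minimal sketch that assumes the message format shown in these excerpts; `parse_snap_map` is a hypothetical helper name, not a MapR tool.

```python
import re

# Matches the "Added snap map" message as it appears in the excerpts above.
SNAP_MAP_RE = re.compile(r"Added snap map for rwcid (\d+) snapcid (\d+)")

def parse_snap_map(lines):
    """Return {rw container id: snapshot container id} from MFS log lines."""
    mapping = {}
    for line in lines:
        match = SNAP_MAP_RE.search(line)
        if match:
            mapping[int(match.group(1))] = int(match.group(2))
    return mapping

sample = [
    "2017-03-27 15:50:33,5013 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2243 snapcid 256000049",
    "2017-03-27 15:50:33,5262 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2245 snapcid 256000050",
]
snap_map = parse_snap_map(sample)
print(snap_map)
```

Lines that do not match the pattern are simply skipped, so the whole mfs log can be passed in unfiltered.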

2) Next, MFS creates a mirror snapshot on the destination volume.

2017-03-27 15:50:35,3261 INFO Snapshot volumesnap.cc:1173 Added snap map for rwcid 2244 snapcid 256000055
2017-03-27 15:50:35,3364 INFO Snapshot snapshot.cc:324 snap container 2244 : Creating snapshot 256000055 snapId 256000050
2017-03-27 15:50:35,4183 INFO Snapshot volumesnap.cc:895 Snapped volume : snap mirrorsnap.27-Mar-2017-15-50-35 snapId 256000050 nameCid 2244 took 112 msec

3) By this step CLDB has collected how many containers exist on the source volume, and it instructs MFS to create the same number of containers on the mirror volume, so that containers can be mapped 1-1 (source to destination) for the resync.

2017-03-27 15:50:36,1145 INFO Container create.cc:805 Created Container on sp SP2:/dev/sdd with cid 2250 rindex 0
2017-03-27 15:50:36,1145 INFO Container create.cc:805 Created Container on sp SP2:/dev/sdd with cid 2254 rindex 0
2017-03-27 15:50:36,1256 INFO Container create.cc:805 Created Container on sp SP1:/dev/sdc with cid 2251 rindex 0
2017-03-27 15:50:36,1256 INFO Container create.cc:805 Created Container on sp SP1:/dev/sdc with cid 2253 rindex 0
2017-03-27 15:50:36,1256 INFO Container create.cc:805 Created Container on sp SP1:/dev/sdc with cid 2252 rindex 0
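Note that only five containers are created here, while the source volume has six. Judging from the logs, the mirror volume already had container 2244, so five new ones bring it to six. To see how the new containers are spread across storage pools, the "Created Container" lines can be counted per pool. A small sketch, assuming the message format above; `containers_per_pool` is a hypothetical helper name.

```python
import re
from collections import Counter

# Captures the storage pool label and the new container id.
CREATE_RE = re.compile(r"Created Container on sp (\S+) with cid (\d+)")

def containers_per_pool(lines):
    """Count newly created containers per storage pool."""
    counts = Counter()
    for line in lines:
        match = CREATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

sample = [
    "2017-03-27 15:50:36,1145 INFO Container create.cc:805 Created Container on sp SP2:/dev/sdd with cid 2250 rindex 0",
    "2017-03-27 15:50:36,1145 INFO Container create.cc:805 Created Container on sp SP2:/dev/sdd with cid 2254 rindex 0",
    "2017-03-27 15:50:36,1256 INFO Container create.cc:805 Created Container on sp SP1:/dev/sdc with cid 2251 rindex 0",
    "2017-03-27 15:50:36,1256 INFO Container create.cc:805 Created Container on sp SP1:/dev/sdc with cid 2253 rindex 0",
    "2017-03-27 15:50:36,1256 INFO Container create.cc:805 Created Container on sp SP1:/dev/sdc with cid 2252 rindex 0",
]
pool_counts = containers_per_pool(sample)
print(dict(pool_counts))
```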

4) MFS on the source and on the destination now have to communicate and complete the resync that transfers the data. For this, the CIDs on the source and destination are paired.

2017-03-27 15:50:37,8192 INFO Replication containerrestore.cc:5066 Get Resync Status and Start Resync WA 0xd915c018 replicacid 2244 srccid 256000049  mirrord 2
2017-03-27 15:50:37,8192 INFO Replication resyncworkareas.cc:125 FindContainerStoppedWA inputs replicacid 2244 srccid 256000049 wa (nil)
2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2243 destcid 2244
2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2248 destcid 2250
2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2245 destcid 2251
2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2246 destcid 2252
2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2247 destcid 2253
2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2249 destcid 2254
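The pairing in the lines above can be rebuilt as a forward map (source to destination) and an inverse map (destination to source), which also makes it easy to confirm that the mapping really is 1-1. This is only a sketch based on the log text; `build_volume_maps` is a hypothetical name, and the real MFS data structures are not shown in the logs.

```python
import re

# Matches "... inverse volume map for rwcid <src> destcid <dst>".
PAIR_RE = re.compile(r"volume map for rwcid (\d+) destcid (\d+)")

def build_volume_maps(lines):
    """Return (forward, inverse) dicts and reject any non 1-1 pairing."""
    forward, inverse = {}, {}
    for line in lines:
        match = PAIR_RE.search(line)
        if not match:
            continue
        src, dst = int(match.group(1)), int(match.group(2))
        if src in forward or dst in inverse:
            raise ValueError(f"container paired twice: {src} -> {dst}")
        forward[src] = dst
        inverse[dst] = src
    return forward, inverse

sample = [
    "2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2243 destcid 2244",
    "2017-03-27 15:50:38,6467 INFO Replication volumemirrorserver.cc:274 Added volume map and inverse volume map for rwcid 2248 destcid 2250",
]
forward, inverse = build_volume_maps(sample)
print(forward, inverse)
```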

5) To explain the resync process in detail, I will concentrate on just one pair of containers being resynced.

Source container              Mirror volume container
2243 (Snap 256000049)   -->   2244 (Snap 4288880229)

i) First, a clone of the destination rw container is created.

2017-03-27 15:50:38,2581 INFO Replication containerrestore.cc:1289 Creating clone container of rwcid 2244
2017-03-27 15:50:38,2581 INFO Snapshot snapshot.cc:324 snap container 2244 : Creating snapshot 4288880229 snapId 0

ii) The resync now starts from the srccid, which is snap cid 256000049, to the replica clone cid (4288880229).

2017-03-27 15:50:38,6469 INFO Replication containerrestore.cc:1690 Starting resync from TxnVN 0:0 SnapVN 35:35 WriteVN 0:0 srccid 256000049 replicacid 2244 needreconnect 0
2017-03-27 15:50:38,6471 INFO Replication containerresync.cc:176 CONTAINER_RESYNC_START -- destnode FSID 5979527104878929364, 10.10.70.109:5660, 10.10.70.109:5692,, srccid 256000049, replicacid 2244, wa 0xe34bc000, resyncwacount 8 replica txnvn 0:0 writevn 0:0 snapvn 35:35 chainSeqNumber 0 destRWMirrorId 0 destCloneMirrorId 0, destVolumeId 19731699, metaDataOnly 0, replica compression type 3
2017-03-27 15:50:38,6471 INFO Replication containerresync.cc:178 srccid 256000049 replicacid 2244 replicaIsRwmirrorCapable 1 srcPrevVolSnapId 0
2017-03-27 15:50:38,6472 INFO Replication containerresync.cc:1025 Converting the resync to full resync because VNs are all zero for srccid 256000049, replicacid 2244, destnode 10.10.70.109:5660
2017-03-27 15:50:38,6472 INFO Replication containerresync.cc:2929 Resync of cid 2244 snapcontainer 256000049 prevcid 0 prevresynccid 0 txnvn 0 writevn 0 snapvn 35
2017-03-27 15:50:38,6472 INFO Replication containerresyncfromsnapshot.cc:162 CONTAINER_RESYNC_FROM_SNAPSHOT_START -- from cid 256000049 replica 2244 txnVN 0 snapVN 35 writeVN 0 isundo 0, rollforwardcontainer 0 dumpSnapshotInode 0, isRwmirrorCapable 1, replicaIsRwmirrorCapable 1, using starting Inode 0 send starting Inode 0 
2017-03-27 15:50:38,6472 INFO Replication containerresyncfromsnapshot.cc:374  srccid 256000049 replicacid 2244 sessionid 4020444768 undoneeded 0 txnundoneeded 0 writeundoneeded 0 snapundoneeded 0 undotxnvn 0 undowritevn 0 undosnapvn 18446744073709551615 redotxnvn 0 redowritevn 0 redosnapvn 0
2017-03-27 15:50:38,6472 INFO Replication containerresyncfromsnapshot.cc:383 Converting the resync to full resync because VNs are all zero for srccid 256000049, replicacid 2244 
2017-03-27 15:50:38,6472 INFO Replication containerresyncfromsnapshot.cc:422 Redoing replica cid 2244 from txnvn:0 snapvn:0 writevn:0 
2017-03-27 15:50:38,7820 INFO Replication containerresyncfromsnapshot.cc:1564 ResyncFromSnapshot resyncing versioninfo srccid:256000049 replicacid:2244TxnVN 1048700:1048700 WriteVN 1048698:1048698 snapVN 34:34 maxUniq 262570, maxinum 255
2017-03-27 15:50:38,7822 INFO Replication inoderestore.cc:836 Container restore for cid 4288880229, srccid 256000049, replicacid 2244, maxinum on replica is 255 and on src is 255, no inode delete needed
2017-03-27 15:50:38,7822 INFO Replication containerrestore.cc:3249 ContainerRestore updating versioninfo for cid 2244 txnvn 1048700:1048700 writevn 1048698:1048698 snapvn 35:35 maxUniq 262570 updateSnapVnSpace 1
2017-03-27 15:50:38,8825 INFO Replication containerrestore.cc:4021 RestoreDataEnd Complete resync data WA 0xd90ba018 srccid 256000049 replicacid 2244
2017-03-27 15:50:38,8828 INFO Replication containerresyncfromsnapshot.cc:1977 CONTAINER_RESYNC_FROM_SNAPSHOT_END-- from cid 256000049, replica 2244, time 235 ms, numInodes 16, numInodesIterated 256, numBytes 68986, numMsgs 18, err 0
2017-03-27 15:50:38,8828 INFO Replication containerresync.cc:3070 Resync from snapshot completed srccid:256000049 replicacid:2244 resynccid:256000049 err 0 svderr 0
2017-03-27 15:50:38,8828 INFO Replication containerresync.cc:3885 wa->srccid=256000049 set_mirrorstate=3 replicaCid=2244
2017-03-27 15:50:38,8828 INFO Replication containerrestore.cc:4615 RestoreContainerCleanupGotRWContainer cid 2244 replicatecleanup 1 isMaster 1 isStale 0
2017-03-27 15:50:38,9025 INFO Replication containerrestore.cc:4734 Resync complete for srccid 256000049 replicacid 2244  srcnode 10.10.70.109:5660 rollforward 0
2017-03-27 15:50:38,9025 INFO Replication containerrestore.cc:4981 CONTAINER_RESTORE_END -- Resync of container successful srcnode FSID 5979527104878929364, 10.10.70.109:5660, 10.10.70.109:5692,, srccid 256000049, replicacid 2244, time 1082 ms, numMsgs 21, numBytes 69073
2017-03-27 15:50:38,9025 INFO Replication containerrestore.cc:4521 RestoreEnd Complete resync data WA 0xd8e32018 srccid 256000049 replicacid 2244
2017-03-27 15:50:38,9025 INFO Replication containerresync.cc:3944 ResyncContainer complete srccid 256000049 replicacid 2244 err 0x0
2017-03-27 15:50:38,9025 INFO Replication containerresync.cc:3282 CONTAINER_RESYNC_END -- destnode FSID 5979527104878929364, 10.10.70.109:5660, 10.10.70.109:5692, srccid 256000049, replicacid 2244, time 255 ms, numIterations 1, numInodesIterated 256numInodes 16, numMsgs 18, numBytes 68986, err 0
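The CONTAINER_RESYNC_END line carries the summary numbers for the resync (duration, bytes sent, error code). A rough sketch for pulling them out and deriving a transfer rate follows; it assumes the field names exactly as printed above, and `resync_stats` is a hypothetical helper name.

```python
import re

def resync_stats(line):
    """Extract duration, byte count and error code from a CONTAINER_RESYNC_END line."""
    time_ms = int(re.search(r"time (\d+) ms", line).group(1))
    num_bytes = int(re.search(r"numBytes (\d+)", line).group(1))
    err = int(re.search(r"\berr (\d+)", line).group(1))
    # Avoid dividing by zero for resyncs reported as taking 0 ms.
    rate = num_bytes * 1000 / time_ms if time_ms else None
    return {"time_ms": time_ms, "num_bytes": num_bytes, "err": err,
            "bytes_per_sec": rate}

line = ("CONTAINER_RESYNC_END -- destnode FSID 5979527104878929364, "
        "10.10.70.109:5660, 10.10.70.109:5692, srccid 256000049, replicacid 2244, "
        "time 255 ms, numIterations 1, numInodesIterated 256numInodes 16, "
        "numMsgs 18, numBytes 68986, err 0")
stats = resync_stats(line)
print(stats)
```

An `err` value of 0 together with a non-zero `num_bytes` is a quick sign that the container pair resynced cleanly.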

iii) The next step after the resync is the roll forward, in which the role of clone container 4288880229 is swapped with that of 2244 and the rollforward flag on the container is cleared. (As a result, all the data is now available in the destination container.)

2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:318 Updating the mirrorid of container 2244 to mirrorId 1 nextMirrorId 2 
2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:397 Setting the rollforward flag for cid 2244
2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:530 UpdateMirrorId cid 2244 CLDB mirrorId 1 CLDB nextMirrorId 2 FS mirrorId 0 FS cloneMirrorId 2
2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:556 UpdateMirrorId cid 2244 version of clone container txnvn 1048700:1048700, writevn 1048698:1048698, snapvn 35:35
2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880229 for cid 2244
2017-03-27 15:53:35,2580 INFO Container rollback.cc:89 RollForward 0 :Readwrite cid 2244.
2017-03-27 15:53:35,2732 INFO Container rollback.cc:990 RollForward 2244 :Clearing rollforward in progress flag for cid 2244
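The timestamps show a gap between the end of the resync for this container (15:50:38) and its roll forward (15:53:35), which fits with the roll forward waiting until every container in the volume has finished. The gap can be measured as sketched below. One assumption to flag: the four digits after the comma are treated as a decimal fraction of a second, which the logs themselves do not confirm. `parse_ts` is a hypothetical helper name.

```python
from datetime import datetime

def parse_ts(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,ffff' timestamp of an MFS log line."""
    date_part, time_part = line.split()[:2]
    # %f accepts one to six digits and treats them as a fraction of a second.
    return datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S,%f")

resync_end = parse_ts(
    "2017-03-27 15:50:38,9025 INFO Replication containerresync.cc:3944 "
    "ResyncContainer complete srccid 256000049 replicacid 2244 err 0x0")
roll_forward = parse_ts(
    "2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 "
    "Rolling forward the clone container 4288880229 for cid 2244")

gap_seconds = (roll_forward - resync_end).total_seconds()
print(gap_seconds)
```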

6) Once the resync has completed for all containers as explained in the step above, every resynced clone container is rolled forward to its original CID.

2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880229 for cid 2244
2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880224 for cid 2251
2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880228 for cid 2254
2017-03-27 15:53:35,2581 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880227 for cid 2250
2017-03-27 15:53:35,2581 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880226 for cid 2252
2017-03-27 15:53:35,2581 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880225 for cid 2253

7) As a final step, the clone containers created for the resync are deleted, followed by the source mirror snapshot containers (that is, the whole snapshot volume).

2017-03-27 15:53:35,2735 INFO Container delete.cc:1528 Delete container 4288880229 : Ownership transfer for cntr-tree not required.
2017-03-27 15:53:35,2736 INFO Container delete.cc:1528 Delete container 4288880224 : Ownership transfer for cntr-tree not required.
2017-03-27 15:53:35,2736 INFO Container delete.cc:1528 Delete container 4288880228 : Ownership transfer for cntr-tree not required.
2017-03-27 15:53:35,2736 INFO Container delete.cc:1528 Delete container 4288880226 : Ownership transfer for cntr-tree not required.
2017-03-27 15:53:35,2736 INFO Container delete.cc:1528 Delete container 4288880227 : Ownership transfer for cntr-tree not required.
2017-03-27 15:53:35,2736 INFO Container delete.cc:1528 Delete container 4288880225 : Ownership transfer for cntr-tree not required.

2017-03-27 15:53:36,0959 INFO Container delete.cc:1528 Delete container 256000052 : Ownership transfer for cntr-tree not required.
2017-03-27 15:53:36,0959 INFO Container delete.cc:1528 Delete container 256000053 : Ownership transfer for cntr-tree not required.
2017-03-27 15:53:36,0960 INFO Container delete.cc:1528 Delete container 256000049 : Ownership transfer for cntr-tree not required.
2017-03-27 15:53:36,0960 INFO Container delete.cc:1528 Delete container 256000050 : Ownership transfer for cntr-tree not required.
2017-03-27 15:53:36,0961 INFO Container delete.cc:1528 Delete container 256000051 : Ownership transfer for cntr-tree not required.
2017-03-27 15:53:36,0962 INFO Container delete.cc:1528 Delete container 256000054 : Ownership transfer for cntr-tree not required.
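To check that the cleanup is complete, the clone containers named in the "Rolling forward" lines can be compared with the containers named in the "Delete container" lines. Any clone that was rolled forward but never deleted would show up as a leftover. A minimal sketch, assuming the message formats above; `leftover_clones` is a hypothetical helper name.

```python
import re

ROLL_RE = re.compile(r"Rolling forward the clone container (\d+) for cid (\d+)")
DELETE_RE = re.compile(r"Delete container (\d+)")

def leftover_clones(lines):
    """Return clone container ids that were rolled forward but never deleted."""
    clones, deleted = set(), set()
    for line in lines:
        match = ROLL_RE.search(line)
        if match:
            clones.add(int(match.group(1)))
            continue
        match = DELETE_RE.search(line)
        if match:
            deleted.add(int(match.group(1)))
    return clones - deleted

sample = [
    "2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880229 for cid 2244",
    "2017-03-27 15:53:35,2580 INFO Replication volumemirrorserver.cc:672 Rolling forward the clone container 4288880224 for cid 2251",
    "2017-03-27 15:53:35,2735 INFO Container delete.cc:1528 Delete container 4288880229 : Ownership transfer for cntr-tree not required.",
]
remaining = leftover_clones(sample)
print(remaining)
```

Run against the full excerpts from steps 6 and 7, all six clone containers appear in both sets, so nothing should be left over.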


