Monday, October 31, 2016

MFS/NFS Debug Logging


The default log level for MapR-FS is INFO, but there will be instances when we need more detailed logging while troubleshooting an issue.

Allowed Trace Levels are: 
FATAL
ERROR
WARN
INFO
DEBUG (very verbose)

To set the log level to DEBUG for all modules:

      maprcli trace setlevel -level debug -port 5660 -host <Hostname>
      maprcli trace setmode -mode continuous -port 5660 -host <Hostname>

Verify that all modules are set to DEBUG level:
maprcli trace info           

To set the levels back to the default:
maprcli trace setlevel -level default -port 5660 -host <Hostname>
maprcli trace setmode -mode default -port 5660 -host <Hostname>
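
As a convenience, the enable/verify/revert cycle can be scripted. Below is a minimal sketch, assuming the node is reachable on MFS port 5660 and that reproducing the issue fits in roughly a five-minute window; adjust the host and the sleep to your case.

# Minimal sketch: enable MFS DEBUG tracing, wait while the issue is reproduced, revert to defaults.
HOST=$(hostname -f)            # or the hostname of the node being debugged
maprcli trace setlevel -level debug -port 5660 -host "$HOST"
maprcli trace setmode -mode continuous -port 5660 -host "$HOST"
maprcli trace info             # confirm the new trace levels

sleep 300                      # reproduce the issue during this window

maprcli trace setlevel -level default -port 5660 -host "$HOST"
maprcli trace setmode -mode default -port 5660 -host "$HOST"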


Debug for the resync module
Debug logging can also be enabled for specific modules. For example, for a resync/replication issue we can turn on debug logging for the Replication module on both the source and the destination with the commands below.

maprcli trace setlevel -level DEBUG -module Replication -host $src_host
maprcli trace setlevel -level DEBUG -module Replication -host $dest_host
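
For example, assuming the hypothetical shell variables below hold the resolvable names of the source and destination nodes, the same command can be applied to both sides in one shot:

# Hypothetical hosts; replace with your actual source and destination nodes.
src_host=source-node.example.com
dest_host=dest-node.example.com

for h in "$src_host" "$dest_host"; do
    maprcli trace setlevel -level DEBUG -module Replication -host "$h"
done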


MapR-DB
Similarly, if debug logging needs to be enabled for troubleshooting MapR-DB issues, these are the modules where debug needs to be turned on.

 maprcli trace setlevel -module DB -level debug -host <Hostname>

 maprcli trace setlevel -module DBRpc -level debug -host <Hostname>
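
Since both modules are usually enabled together, a small loop keeps the two commands in sync; this is just a sketch reusing the same maprcli calls shown above.

HOST=$(hostname -f)            # or the node you are troubleshooting
for module in DB DBRpc; do
    maprcli trace setlevel -module "$module" -level debug -host "$HOST"
done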



NFS
While troubleshooting an NFS issue we might need detailed logs showing when an op reached the NFS server and when it was served back to the client. To do so, we can turn on DEBUG logging in the NFS server with the commands below.

 maprcli trace setlevel -level debug -port 9998 -host `hostname -f`
 maprcli trace setmode -mode continuous -port 9998 -host `hostname -f`
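
The same enable/revert pattern used for MFS applies here, just against the NFS server port (9998). A rough sketch, assuming the reproduction fits in a five-minute window:

HOST=$(hostname -f)
maprcli trace setlevel -level debug -port 9998 -host "$HOST"
maprcli trace setmode -mode continuous -port 9998 -host "$HOST"

sleep 300                      # reproduce the slow or failing NFS operation here

maprcli trace setlevel -level default -port 9998 -host "$HOST"
maprcli trace setmode -mode default -port 9998 -host "$HOST"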


Tuesday, October 25, 2016

Multi MFS

 




For MapR deployments on clusters with SSDs, more fileserver (MFS) instances can be configured on every node. On servers with SSDs and a MapR-DB license, this feature is automatically enabled (2 instances). The feature was introduced because, with high network and disk throughput, a single MFS process was becoming the bottleneck.



This blog assumes you have a MapR 5.2 cluster installed with nodes that have 6 SPs created (1 SSD per SP).

1) To configure 6 instances, i.e. 1 instance per SP, run the command below and restart Warden.


maprcli config save -values {"multimfs.numinstances.pernode":"6"}

Once the fileservers come back up, each node will have 6 instances running.
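
Putting the command and the Warden restart together, a minimal sketch (the JSON is single-quoted so the shell passes it through untouched; run the config save once from any node and restart Warden on each data node):

# Set the desired instance count cluster-wide.
maprcli config save -values '{"multimfs.numinstances.pernode":"6"}'

# Restart Warden on each fileserver node for the change to take effect.
service mapr-warden restart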





You can confirm this by running the commands below.

i) /opt/mapr/server/mrconfig info instances

6
5660 5661 5662 5663 5664 5665
     
     Or
ii) The command below gives the SP layout along with the port on which each instance is started.


/opt/mapr/server/mrconfig sp list -v
ListSPs resp: status 0:6
No. of SPs (6), totalsize 1350248 MB, totalfree 1346902 MB

SP 0: name SP1, Online, size 225041 MB, free 224481 MB, path /dev/sdb, log 200 MB, port 5660, guid 8cd70192656eb99200567f391f013f71, clusterUuid -6798498073769914619--7342816723903035526, disks /dev/sdb
SP 1: name SP2, Online, size 225041 MB, free 224498 MB, path /dev/sdd, log 200 MB, port 5661, guid 4d1ce1445bb7895200567f392304a1b8, clusterUuid -6798498073769914619--7342816723903035526, disks /dev/sdd
SP 2: name SP3, Online, size 225041 MB, free 224501 MB, path /dev/sdc, log 200 MB, port 5662, guid 82c3bef9e8b7107000567f392707cea1, clusterUuid -6798498073769914619--7342816723903035526, disks /dev/sdc
SP 3: name SP4, Online, size 225041 MB, free 224446 MB, path /dev/sde, log 200 MB, port 5663, guid 625c0c900f7e9ce500567f392b0b1f75, clusterUuid -6798498073769914619--7342816723903035526, disks /dev/sde
SP 4: name SP5, Online, size 225041 MB, free 224496 MB, path /dev/sdf, log 200 MB, port 5664, guid 3984f36383daabe000567f392f0d5cac, clusterUuid -6798498073769914619--7342816723903035526, disks /dev/sdf
SP 5: name SP6, Online, size 225041 MB, free 224478 MB, path /dev/sdg, log 200 MB, port 5665, guid 8396bc5ee969dafd00567f3934025547, clusterUuid -6798498073769914619--7342816723903035526, disks /dev/sdg

        or

maprcli config load -json | grep multimfs.numinstances.pernode
"multimfs.numinstances.pernode":"6",


2) Any mrconfig command will give details for all instances.


/opt/mapr/server/mrconfig info threads

----------------------
|From Instance 5660::|
----------------------
No Running threads

----------------------
|From Instance 5661::|
----------------------
No Running threads

----------------------
|From Instance 5662::|
----------------------
No Running threads

----------------------
|From Instance 5663::|
----------------------
No Running threads

----------------------
|From Instance 5664::|
----------------------
No Running threads

----------------------
|From Instance 5665::|
----------------------

No Running threads

3) To get details for a specific instance, run the command below with that instance's port.


/opt/mapr/server/mrconfig -i -h `hostname -f` -p 5661 info threads

----------------------
|From Instance 5661::|
----------------------
No Running threads
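
To run the same query against every instance explicitly, the port list printed by mrconfig info instances (the second line of its output, as shown above) can be looped over. A rough sketch:

HOST=$(hostname -f)
# The last line of 'info instances' is the space-separated list of instance ports.
PORTS=$(/opt/mapr/server/mrconfig info instances | tail -1)

for p in $PORTS; do
    echo "=== instance on port $p ==="
    /opt/mapr/server/mrconfig -i -h "$HOST" -p "$p" info threads
done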



NOTE: If for any reason you need a specific host to run a different number of instances, you can do so in either of the ways below, followed by a Warden restart for the change to be picked up.

maprcli node modify -nodes <HOSTNAME> -numInstances 3

or 

mv /opt/mapr/conf/mfsinstances_6  /opt/mapr/conf/mfsinstances_3




Understanding Logs 

Additional log files are created in $MAPR_HOME/logs when this feature is enabled.
Log files are created one per instance: mfs.log-<N> for the primary instance and mfs.log.<instanceID>-<N> for the others, where <N> is greater than 0. For example:
    • mfs.log-3 for the primary instance
    • mfs.log.1-3 for the second instance

Example of the logs for my 6 instances:
ls  mfs.log*-3
mfs.log.1-3  mfs.log.2-3  mfs.log-3  mfs.log.3-3  mfs.log.4-3  mfs.log.5-3


When the instances come up, the messages below are logged in mfs.log*-3. From these logs we can cross-check which SP/disk is assigned to which instance (a one-line grep for this is sketched after the instance logs), and also see how each instance registers with the CLDB.

Instance 1

2015-12-27 01:26:38,2768 INFO  loadsp.cc:216 InitStoragePools from disktab, instance 0 numInstances 6
2015-12-27 01:26:38,2769 INFO  iomgr.cc:2705 found 6 disks in disktab
2015-12-27 01:26:56,9202 INFO  spinit.cc:1045 SP SP1:/dev/sdb Initialized
2015-12-27 01:26:56,9202 INFO  loadsp.cc:182 Done loading disks from disktab for instance 0
2015-12-27 01:26:57,0399 INFO  cldbha.cc:532 Got registration request from instance 4, numInstances 6
2015-12-27 01:26:57,6216 INFO  cldbha.cc:532 Got registration request from instance 2, numInstances 6
2015-12-27 01:26:58,2328 INFO  cldbha.cc:532 Got registration request from instance 0, numInstances 6
2015-12-27 01:26:58,2549 INFO  cldbha.cc:532 Got registration request from instance 5, numInstances 6
2015-12-27 01:26:58,8337 INFO  cldbha.cc:532 Got registration request from instance 3, numInstances 6
2015-12-27 01:27:02,6280 INFO  fileserver.cc:11207 Registered with cldb 10.10.72.18:7222

Instance 2

2015-12-27 01:26:56,9048 INFO  spinit.cc:1045 SP SP2:/dev/sdc Initialized
2015-12-27 01:26:56,9048 INFO  loadsp.cc:182 Done loading disks from disktab for instance 1

Instance 3

2015-12-27 01:26:56,9274 INFO  spinit.cc:1045 SP SP3:/dev/sdd Initialized
2015-12-27 01:26:56,9274 INFO  loadsp.cc:182 Done loading disks from disktab for instance 2
2015-12-27 01:26:57,6215 INFO  cldbha.cc:487 Switching to primary instance to register with CLDB

Instance 4

2015-12-27 01:26:56,9100 INFO  spinit.cc:1045 SP SP4:/dev/sde Initialized
2015-12-27 01:26:56,9100 INFO  loadsp.cc:182 Done loading disks from disktab for instance 3
2015-12-27 01:26:58,8336 INFO  cldbha.cc:487 Switching to primary instance to register with CLDB

Instance 5

2015-12-27 01:26:56,8195 INFO  spinit.cc:1045 SP SP5:/dev/sdf Initialized
2015-12-27 01:26:56,8195 INFO  loadsp.cc:182 Done loading disks from disktab for instance 4
2015-12-27 01:26:57,0399 INFO  cldbha.cc:487 Switching to primary instance to register with CLDB


Instance 6

2015-12-27 01:26:56,9032 INFO  spinit.cc:1045 SP SP6:/dev/sdg Initialized
2015-12-27 01:26:56,9032 INFO  loadsp.cc:182 Done loading disks from disktab for instance 5
2015-12-27 01:26:58,2548 INFO  cldbha.cc:487 Switching to primary instance to register with CLDB
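
The SP-to-instance mapping can also be pulled out of all the instance logs at once; a quick sketch, assuming the logs live in /opt/mapr/logs and use the "SP <name>:<disk> Initialized" format shown above:

# Print which SP/disk each instance log reports as initialized.
grep -H "Initialized" /opt/mapr/logs/mfs.log*-3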


CLDB logs :

2015-12-27 01:27:02,545 INFO FileServerHandler [RPC-2]: FSRegister: Registered FileServer: 10.10.72.18- at topology /default-rack/tss2-18/5660
2015-12-27 01:27:02,546 INFO FileServerHandler [RPC-2]: FileServer Registration Request: Node Configuration
2015-12-27 01:27:02,546 INFO FileServerHandler [RPC-2]: NumCpus: 24 Avail Memory: 22503 Num Sps: 6 Num Instances: 6

From the logs we can see that although all the instances start in succession, they do not individually register with the CLDB; instead, each instance switches to the primary instance, which in turn registers with the CLDB.

Tuesday, October 18, 2016

Fuse installation and DEBUG


This post assumes you already have a MapR cluster set up and need to set up the FUSE (POSIX) client for reading from and writing to the cluster.

The MapR POSIX client package requires the mapr-client package to be installed on the node, since it has a dependency on it.

[root@node10 logs]# repoquery --requires mapr-posix-client-basic
/bin/sh
mapr-client

[root@node10 logs]# 

1) Install the basic POSIX client package on your machine.

yum install mapr-posix-client-basic


2)  Run configure.sh to set this node as the client node.
For example:
/opt/mapr/server/configure.sh -N <clustername> -C <CLDBhost> -Z <ZooKeeperhost> -c -R

3) Start mapr-posix-client-basic.

service mapr-posix-client-basic start
           or
/opt/mapr/initscripts/mapr-posix-client-basic start
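
End to end, the three steps above boil down to a short sequence. The cluster name, CLDB host, and ZooKeeper host below are placeholders; substitute your own values.

# Hypothetical cluster details; replace with your own.
yum install -y mapr-posix-client-basic
/opt/mapr/server/configure.sh -N my.cluster.com -C cldb1.example.com -Z zk1.example.com -c -R
service mapr-posix-client-basic start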

Once it is up, you will see logs like the following in /opt/mapr/logs/posix-client-basic.log.

Sun Oct 16 00:35:26 PDT 2016: Running /opt/mapr/initscripts/mapr-posix-client-basic start
Sun Oct 16 00:35:26 PDT 2016: Mounting posix-client-basic /mapr --log_path /opt/mapr/logs -o allow_other -o big_writes -o auto_unmount -o async_dio -o max_background=64 at /mapr ...
Starting fuse with 1 libraries
Sun Oct 16 00:35:26 PDT 2016: Result:0
Started registering to cldb.Successfully registered to cldb.Invalid Option, record already found, cannot override secure=true for cluster: Abizer.cluster.com

The filesystem below will be mounted, through which the client has access to the cluster.

[root@node10 logs]# df -hP /mapr
Filesystem          Size  Used Avail Use% Mounted on
posix-client-basic  168G  437M  168G   1% /mapr

[root@node10 logs]# 


Troubleshooting:

1) You can force an unmount with the command below if needed.

umount -lf /mapr


2) If FUSE is hanging:

i) While in the state where fio makes no progress, run gstack <pid> (of the FUSE process) in a loop with a sleep of 1 second for, say, 5 minutes to capture the state of the FUSE process; a sketch of such a loop follows.
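
A minimal sketch of such a loop; the pgrep pattern assumes the FUSE process command line contains "posix-client-basic" (otherwise set the PID manually), and the 5-minute window is an assumption.

# Capture the FUSE process stack once per second for ~5 minutes.
FUSE_PID=$(pgrep -f posix-client-basic | head -1)   # assumption: process name matches; set the PID manually if not
OUT=/tmp/fuse_gstack.$(date +%F_%H%M%S).log
for i in $(seq 1 300); do
    echo "=== $(date) iteration $i ===" >> "$OUT"
    gstack "$FUSE_PID" >> "$OUT"
    sleep 1
done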

ii) It would also be helpful to enable debug logs. We can use fcdebug to enable debug logging at run time (say, for 5 minutes).

Find the Shmid (1262092291 in this example); it is logged in /opt/mapr/logs/ffs.log-0 when FUSE starts up.

2016-10-16 00:35:26,7650 ERROR FuseAPI fs/client/fileclient/cc/fuse_api.cc:1357 Thread: 11050 Shmid to be used by fcdebug 1262092291

Enable DEBUG for all modules; it takes about 60 seconds for the debug logs to start printing.

 /opt/mapr/server/tools/fcdebug -s 1262092291 -l DEBUG
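
Since the shmid changes on every FUSE start, it can also be pulled out of the log automatically; a sketch assuming the "Shmid to be used by fcdebug <id>" line format shown above:

# Extract the latest shmid from ffs.log-0 and enable DEBUG via fcdebug.
SHMID=$(grep "Shmid to be used by fcdebug" /opt/mapr/logs/ffs.log-0 | tail -1 | awk '{print $NF}')
/opt/mapr/server/tools/fcdebug -s "$SHMID" -l DEBUG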

After you enable DEBUG, any operation via the FUSE client will log messages like the ones below.

2016-10-17 22:47:13,6229 DEBUG FuseOps fs/client/fuse/cc/fuse_ops_ll.c:597 Thread: 11123 lookup: parent 550024249360, name abi, libindex 0
2016-10-17 22:47:13,6229 DEBUG Client fs/client/fileclient/cc/client.cc:3119 Thread: 11123 Lookupfid : start = abi, end = (nil)
2016-10-17 22:47:13,6230 DEBUG Client fs/client/fileclient/cc/client.cc:3220 Thread: 11123 Pathwalk of abi started, pfid 2049.16.2
2016-10-17 22:47:13,6230 DEBUG Cidcache fs/client/fileclient/cc/cidcache.cc:729 Thread: 11123 GetBinding: for cid 2049 isUpd 0
2016-10-17 22:47:13,6230 DEBUG Cidcache fs/client/fileclient/cc/cidcache.cc:563 Thread: 11123 >allocate and fill for cid 2049, isUpd 0
2016-10-17 22:47:13,6230 DEBUG Cidcache fs/client/fileclient/cc/cidcache.cc:517 Thread: 11123 Found valid entry for cid 2049, numSer 1
2016-10-17 22:47:13,6230 DEBUG Client fs/client/fileclient/cc/client.cc:3223 Thread: 11123 Rpc: PathWalkPlusProc to Fileserver: 10.10.70.110:5692 for fid: 2049.16.2
2016-10-17 22:47:13,6237 DEBUG ClntProfileRpc fs/client/fileclient/cc/client.cc:3223 Thread: 11123 Profile: CltRpcDone: server 10.10.70.110:5692 took 0 msec error 2 FID 2049.16.2  PathWalkPlus path abi
iii) You may have to run "ulimit -c unlimited" and then "kill -6 <fuse-pid>" or "kill -11 <fuse-pid>" to get a core for the process.

These details, along with the application logs showing the timestamp when the customer observed the delays, should be good enough to start troubleshooting.