AWS/Azure(Cloud)/Spark/Hadoop / Linux : September 2015

Monday, September 21, 2015

Performance Issue / Hang issue

- First, use vmstat to check system resource utilization and see which resource is scarce while the hang is observed. If the cause is a memory constraint or the system is swapping, it is easy to address by adding more memory to the server or by capping processes so they use only a dedicated amount of memory. The blog post below walks through identifying system resource constraints step by step.

http://abizeradenwala.blogspot.com/2015/07/resource-utilization-to-quickly.html
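As a rough sketch of the vmstat check described above, the snippet below filters vmstat output for samples where the system is swapping. The function name swapping_samples and the sample rows are made up for illustration, and the column positions assume the standard procps vmstat layout (si and so in columns 7 and 8).

```shell
# Reads `vmstat <interval> <count>` output on stdin and prints every sample
# where the si (swap-in) or so (swap-out) column is non-zero.
# Assumed column order: r b swpd free buff cache si so bi bo in cs us sy id wa st
swapping_samples() {
    awk '$1 ~ /^[0-9]+$/ {
        n++
        if ($7 > 0 || $8 > 0)
            printf "sample %d: si=%s so=%s free=%s\n", n, $7, $8, $4
    }'
}

# Hypothetical sample output (not taken from a real server).
sample_vmstat='procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812345  49381 174255    0    0     5    12  110  220  3  1 95  1  0
 3  2 204800  10240   1024  20480  512 2048   900  4100  950 1800 20 15 10 55  0'

printf '%s\n' "$sample_vmstat" | swapping_samples
# On a live system: vmstat 5 12 | swapping_samples
# (the first vmstat sample reports averages since boot, so judge from the later ones)
```

Sustained non-zero si/so together with low free memory would point at memory pressure rather than CPU or disk.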

- If a specific process is causing the problem, start with htop to check whether the processes hogging the most resources are running multiple threads internally. htop is usually not installed by default and might need the package below.

htop-1.0.1-2.el6.x86_64

In htop, press t to get a nested tree of processes and threads, and look specifically at the PID that seems to be hogging the most resources.

Alternatively, to list all threads and processes currently running on the system, run the ps command below. Running it as a cron job that polls the system every 5 minutes while the hang is recreated can show whether a huge number of threads have been spun up internally by Java.

ps -eLf | grep 21917

UID PID PPID LWP C NLWP STIME TTY TIME CMD

mapr 21887 1 21917 0 53 Sep11 00:01:05 /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java

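-L Show threads, possibly with LWP (light-weight process, i.e. the thread ID) and NLWP (number of light-weight processes, i.e. the number of threads in the process) columns. In the row above, NLWP = 53 means this java process is running 53 threads.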
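To make the thread count easier to spot in a large ps listing, a small filter along these lines could be used. This is only a sketch: the function name busy_thread_pids, the default limit of 50 and the log path in the cron comment are assumptions, not part of the original procedure.

```shell
# Reads `ps -eLf` output on stdin and prints "PID NLWP" once per process
# whose thread count (NLWP, column 6) is at or above the given limit.
busy_thread_pids() {
    limit="${1:-50}"
    awk -v limit="$limit" '$2 ~ /^[0-9]+$/ && $6 >= limit && !seen[$2]++ { print $2, $6 }'
}

# Header and row copied from the ps output shown above.
sample_ps_threads='UID PID PPID LWP C NLWP STIME TTY TIME CMD
mapr 21887 1 21917 0 53 Sep11 00:01:05 /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java'

printf '%s\n' "$sample_ps_threads" | busy_thread_pids 50

# On a live system: ps -eLf | busy_thread_pids 200
# Possible crontab entry to poll every 5 minutes (hypothetical log path):
# */5 * * * * ps -eLf >> /var/tmp/ps_threads.log 2>&1
```

Comparing the NLWP value across successive polls shows whether the thread count keeps growing as the hang approaches.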

- If a process is continuously hogging CPU, it suggests that some part of the code is not using the CPU effectively, or that some threads are spinning on the CPU without making progress. For a Java process, run

"kill -3 <PID>" every 1 second so that the process writes a thread dump to its standard output, and carefully review the thread dumps with the developers to identify any issue in the code. Comparing successive dumps usually shows which thread is slow or made little or no progress during the timeframe when the issue was observed, which helps the developers put a fix in the code to avoid a potential hang.

Friday, September 11, 2015

Iostat

Often we need to find out disk I/O utilization, check whether all disks are performing well, and monitor input/output device loading by observing the time the physical disks are active in relation to their average transfer rates. In the example below I ran dd against disk "sdb" (note that writing to the raw device overwrites its contents) with iostat running in the background; the disk is visibly busy, with %util at almost 100%.

dd bs=1M count=4096 if=/dev/zero of=/dev/sdb oflag=direct ( Directly writing to disk )

iostat was run with the options below:
-m = display numbers in MB
-t = print time stamp
-x = display extended statistics

iostat -m -t -x 1

09/11/2015 02:28:39 PM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 2.27 41.48 0.00 56.25
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 44.00 0.00 22.00 1024.00 1.57 34.82 22.68 99.80
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Line 1 : Prints the time stamp
Lines 2 and 3 : Print CPU stats/utilization. Here the CPU is not busy computing: it is 56.25% idle and 41.48% waiting on I/O

Lines 4 to n : Print the column header followed by various stats for each disk. The meaning of each stat, and what to infer from the numbers, is explained below.

rrqm/s : The number of read requests merged per second that were queued to the hard disk
wrqm/s : The number of write requests merged per second that were queued to the hard disk
r/s : The number of read requests per second
w/s : The number of write requests per second
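rsec/s : The number of sectors read from the hard disk per second (shown as rMB/s, in megabytes per second, when -m is used as above)
wsec/s : The number of sectors written to the hard disk per second (shown as wMB/s, in megabytes per second, when -m is used as above)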
avgrq-sz : The average size (in sectors) of the requests that were issued to the device.
avgqu-sz : The average queue length of the requests that were issued to the device. If there are I/O performance complaints while avgqu-sz is low, the cause is application specific and can be addressed with more aggressive read-ahead, fewer fsyncs, etc. One interesting point: avgqu-sz, await, svctm and %util are interdependent ( await = avgqu-sz * svctm / (%util/100) ).
await : The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them. If not a lot of I/O is being generated but requests are still pending, the disk may be slow due to a H/W issue.
svctm : The average service time (in milliseconds) for I/O requests that were issued to the device
%util : Percentage of elapsed time during which I/O requests were issued to the device (bandwidth utilization for the device); iostat man pages of this era word it as "percentage of CPU time". Device saturation occurs when this value is close to 100%. This value excludes caching: a request served from cache is very unlikely to show up in %util, unlike in other values.
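The relation between avgqu-sz, svctm, %util and await can be checked against the sdb row from the iostat output earlier in this post. This is a quick arithmetic sketch; the function name estimated_await is made up for illustration.

```shell
# await ~= avgqu-sz * svctm / (%util / 100)
# Arguments: avgqu-sz, svctm (ms), %util
estimated_await() {
    awk -v q="$1" -v s="$2" -v u="$3" 'BEGIN { printf "%.2f\n", q * s / (u / 100) }'
}

# sdb row: avgqu-sz=1.57 svctm=22.68 %util=99.80
estimated_await 1.57 22.68 99.80   # prints 35.68; iostat reported await = 34.82
```

The estimate lands within about 1 ms of the reported await, which is as close as the rounded per-interval averages allow.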

If the values below are high in the iostat output, the specific disk is under pressure. Clearly, in the example above sdb is being utilized heavily.

- The average service time (svctm)
- Percentage of time during which I/O requests were issued (%util)
- Consistently high reads/writes (r/s and w/s)
- await is continuously high (very important)
- avgqu-sz is continuously high (very important)
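The checks above can be partly automated by filtering iostat output for devices near saturation. The sketch below assumes the column layout shown in this post, where await, svctm and %util are the last three columns; newer sysstat versions use a different layout. The function name busy_disks and the default threshold of 90 are assumptions.

```shell
# Reads `iostat -x` output on stdin and prints device, await and %util for
# every device whose %util (last column) is at or above the threshold.
busy_disks() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        /^Device/ { in_devices = 1; next }    # header row starts a device block
        NF == 0   { in_devices = 0; next }    # blank line ends the block
        in_devices && $NF + 0 >= t + 0 { print $1, "await=" $(NF - 2), "util=" $NF }
    '
}

# Rows copied from the iostat output shown above.
sample_iostat='Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 44.00 0.00 22.00 1024.00 1.57 34.82 22.68 99.80
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00'

printf '%s\n' "$sample_iostat" | busy_disks 90
# On a live system: iostat -m -t -x 1 | busy_disks 90
```

A device that shows up in every interval, rather than in an occasional burst, is the one worth investigating.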

Note :- " -n " option Displays the network filesystem (NFS) report

Commonly accepted average IOPS by rotational speed:

Rotational Speed (rpm)	IOPS
5400	50-80
7200	75-100
10k	125-150
15k	175-210

Sunday, September 6, 2015

Top Command Explanation

As a Linux system admin, top is a frequently used command for viewing resource utilization (memory and CPU) by processes on a server. It helps find which process is using which system resources and pin down the process that is hogging memory or churning CPU.

Although there are better and more user-friendly tools than top, such as htop, I consider top a first-level command, to be followed by more specific commands to drill down to the root cause (RC) of the issue. This post covers how to use top and how to read its results.

Reading Linux Top Command Output:

When we execute the top command on Linux it shows a lot of output; here I show how to read it row by row.

Result Row #1:

Row 1 shows the server uptime since the last reboot, the number of currently logged-in users and the load average on the server (over the last 1, 5 and 15 minutes). The same output is available from the Linux uptime command.

top - 21:56:08 up 62 days, 6:38, 3 users, load average: 0.08, 0.04, 0.00
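A load average is easier to judge relative to the number of CPUs. The sketch below pulls the 1-minute value out of the row above and divides it by a CPU count; the function name load_per_cpu and the CPU count of 4 are assumptions for illustration.

```shell
# Extracts the 1-minute load average from a top/uptime header line and
# divides it by the number of CPUs.
load_per_cpu() {
    line="$1"; ncpu="$2"
    printf '%s\n' "$line" |
        awk -v n="$ncpu" -F'load average: ' '{ split($2, l, ", "); printf "%.2f\n", l[1] / n }'
}

load_per_cpu "top - 21:56:08 up 62 days, 6:38, 3 users, load average: 0.08, 0.04, 0.00" 4
# On a live system (English locale): load_per_cpu "$(uptime)" "$(nproc)"
```

A value that stays well above 1 per CPU usually means tasks are queuing for CPU or are blocked on I/O.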

Result Row #2:

Row 2 shows the number of processes running on the server and their state.

Tasks: 187 total, 1 running, 186 sleeping, 0 stopped, 0 zombie

Zombie process is a process that has completed execution but still has an entry in the process table. This entry is still needed to allow the parent process to read its child’s exit status. Zombies are basically the leftover bits of dead processes that haven’t been cleaned up properly. A program that creates zombie processes isn’t programmed properly – programs aren’t supposed to let zombie processes stick around.
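When the zombie count in row 2 is non-zero, the entries and their parents can be listed from ps output. This is a sketch: the function name list_zombies and the sample rows are made up, and the input is assumed to come from "ps -eo pid,ppid,stat,comm".

```shell
# Reads `ps -eo pid,ppid,stat,comm` output on stdin and prints zombie entries
# (STAT starting with Z) together with the parent PID to investigate.
list_zombies() {
    awk '$3 ~ /^Z/ { print "zombie pid=" $1, "parent=" $2, "cmd=" $4 }'
}

# Hypothetical sample output (not taken from a real server).
sample_ps='  PID  PPID STAT COMMAND
 4706     1 S<l  mfs
 5120  4999 Z    worker <defunct>'

printf '%s\n' "$sample_ps" | list_zombies
# On a live system: ps -eo pid,ppid,stat,comm | list_zombies
```

The parent PID is the useful part, since it is the parent that has not read the exit status of its child.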

Result Row #3:

Row 3 shows the CPU utilization status on the server: how much CPU is free and how much is being used, and by what.

Cpu(s): 0.3%us, 0.5%sy, 0.0%ni, 98.8%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st

0.3%us : %CPU used by user processes
0.5%sy : %CPU used by system (kernel) processes
0.0%ni : %CPU used by user processes running with an adjusted nice value
98.8%id : %CPU in idle state
0.3%wa : %CPU waiting on I/O
0.0%hi / 0.0%si : %CPU used by hardware/software interrupts
0.0%st : Steal time is the time that a virtual CPU waits for a real CPU while the hypervisor is servicing another virtual processor.
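For scripting, the non-idle share of CPU can be derived from the idle figure in this row. The sketch below assumes the "98.8%id" format shown above (newer top versions print "98.8 id" instead); the function name cpu_busy is made up for illustration.

```shell
# Extracts the idle percentage from a top "Cpu(s):" line and prints
# the non-idle share (100 - idle), which includes I/O wait.
cpu_busy() {
    printf '%s\n' "$1" |
        sed -n 's/.*[ ,]\([0-9.]*\)%id.*/\1/p' |
        awk '{ printf "%.1f\n", 100 - $1 }'
}

cpu_busy "Cpu(s): 0.3%us, 0.5%sy, 0.0%ni, 98.8%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st"
```

For the row above this prints 1.2, the sum of the us, sy and wa figures after rounding.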

Result Row #4:

Row 4 shows memory utilization on the server: how much memory is used. The same results are available from the free command.

Total Memory Used Memory Free Memory Buffered Memory

Mem: 8193720k total, 6317764k used, 1875956k free, 493816k buffers

Result Row #5:

Row 5 shows swap utilization on the server: how much swap is being used. The same results are available from the free command (last line). If the system is swapping / using swap memory, it usually indicates the system is under memory pressure, which will most likely make it run very slowly. Note that the last field on this row, "cached", is not swap: it is the amount of memory used by the page cache.

Total Swap Mem Used Swap Free Swap Cached Memory (page cache)

Swap: 9215992k total, 3920k used, 9212072k free, 1742556k cached
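Because buffers and page cache can be reclaimed when applications need memory, the "used" figure in row 4 overstates real usage. A rough sketch of the arithmetic with the numbers from rows 4 and 5 follows; the function name app_used is made up for illustration.

```shell
# Memory used by applications = used - buffers - cached (all in kB).
# Arguments: total, used, buffers, cached
app_used() {
    awk -v total="$1" -v used="$2" -v buf="$3" -v cache="$4" 'BEGIN {
        app = used - buf - cache
        printf "%d kB (%.1f%% of total)\n", app, 100 * app / total
    }'
}

# total/used/buffers from the Mem row, cached from the Swap row above
app_used 8193720 6317764 493816 1742556
```

So although the Mem row reports about 6.3 GB used, roughly 4.1 GB (about half of the 8 GB total) is held by applications.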

Result Row #6 ( Running Processes ):

In this section you will see all running processes on the server and details about each process, as below.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4706 mapr 10 -10 2706m 2.4g 17m S 2.0 30.7 641:03.31 mfs

PID - Process ID of the process
USER - User who is running the process
PR - Priority of the process
NI - Nice value of the process
VIRT - Virtual memory used by the process
RES - Physical memory used by the process (actual memory)
SHR - Shared memory used by the process
S - Current status of the process
%CPU / %MEM - % CPU/MEM used by the process
TIME+ - Total CPU time the process has used since it started (not wall-clock running time)
COMMAND - Name of the process

By default top sorts the output by %CPU usage (field k). To sort the output on any other field, press SHIFT+F, select the letter of the field to sort by from the list below, and press Enter.

a: PID = Process Id

b: PPID = Parent Process Pid

c: RUSER = Real user name

d: UID = User Id

e: USER = User Name

f: GROUP = Group Name

g: TTY = Controlling Tty

h: PR = Priority

i: NI = Nice value

j: P = Last used cpu (SMP)

* K: %CPU = CPU usage

l: TIME = CPU Time

m: TIME+ = CPU Time, hundredths

n: %MEM = Memory usage (RES)

o: VIRT = Virtual Image (kb)

p: SWAP = Swapped size (kb)

q: RES = Resident size (kb)

r: CODE = Code size (kb)

s: DATA = Data+Stack size (kb)

t: SHR = Shared Mem size (kb)

u: nFLT = Page Fault count

v: nDRT = Dirty Pages count

w: S = Process Status

x: COMMAND = Command name/line

y: WCHAN = Sleeping in Function

z: Flags = Task Flags <sched.h>

Note :- When top is running and you press 1 it will show you per CPU (different for each CPU) resource utilization .
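For scripts, where the interactive SHIFT+F menu is not available, a captured process table can be sorted on a named column instead. The sketch below is one possible approach, not a feature of top itself: the function name sort_by_column is made up, the second sample row is hypothetical, and it only makes sense for purely numeric columns such as %CPU and %MEM.

```shell
# Reads a process table (header line first) on stdin and prints it sorted in
# descending numeric order on the named column.
sort_by_column() {
    col="$1"
    input=$(cat)
    header=$(printf '%s\n' "$input" | head -n 1)
    # Find the position of the requested column in the header line.
    idx=$(printf '%s\n' "$header" |
        awk -v col="$col" '{ for (i = 1; i <= NF; i++) if ($i == col) print i }')
    [ -n "$idx" ] || { echo "no such column: $col" >&2; return 1; }
    printf '%s\n' "$header"
    printf '%s\n' "$input" | tail -n +2 | LC_ALL=C sort -k"$idx","$idx"nr
}

# First row is the mfs example from this post; the java row is hypothetical.
sample_top='PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4706 mapr 10 -10 2706m 2.4g 17m S 2.0 30.7 641:03.31 mfs
21887 mapr 20 0 1900m 300m 12m S 5.5 3.7 1:05.00 java'

printf '%s\n' "$sample_top" | sort_by_column %MEM
```

Sorting the same sample on %CPU puts the java row first, while %MEM keeps mfs on top.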
