Big Data / Systems / Cloud / AI : System Resource utilization

System Resource utilization

To get a quick overview of a system, you need a tool that reports system stats you can read and interpret to work out which resource is causing the bottleneck. vmstat is a good tool for this: it reports information about processes, memory, swap, block IO, system, and CPU activity in a single line per sample. Other tools report system utilization stats as well, but I prefer vmstat because its output is easy to read and works well as a preliminary check to help identify system bottlenecks. To gain more insight into a suspected issue afterwards, a different kind of tool is required, i.e. one capable of more in-depth data collection for analysis.

In this post we will talk only about vmstat and what each column represents, so that you can tell whether an issue is due to high CPU utilization, disk IO, system swapping, etc.

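 Here is the output of the vmstat command from my test node:

$ vmstat 1 3 (the first number is the delay in seconds between reports and the second is the number of reports, so here I get stats every 1 second, 3 times only)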

  [root@ip-10-128-160-140 ~]# vmstat 1 3

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----

r b swpd free buff cache si so bi bo in cs us sy id wa st

0 0 0 23940396 178804 5127788 0 0 0 5 3 6 0 0 100 0 0

0 0 0 23940396 178804 5127788 0 0 0 0 161 278 0 0 100 0 0

0 0 0 23940396 178804 5127788 0 0 0 32 374 329 1 1 98 0 0

[root@ip-10-128-160-140 ~]#

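  The first line lists the six categories for which stats are displayed, and the second line names the columns within each category. The rows that follow give the data you need to judge whether you are running into any system-level bottlenecks. Note that the first row reports averages since the last reboot, while the later rows cover each sampling interval. Memory values are reported in "KB" (units of 1024 bytes) by default; the CPU columns are percentages.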
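To work with these numbers in a script, the column-name line can be paired with each data row. Below is a minimal Python sketch of that idea; the `parse_vmstat` helper and the `SAMPLE` constant are names I made up for illustration, and the sample text is simply the output shown above with the shell prompt lines left out.

```python
# Sample text copied from the vmstat run shown in this post.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 23940396 178804 5127788 0 0 0 5 3 6 0 0 100 0 0
0 0 0 23940396 178804 5127788 0 0 0 0 161 278 0 0 100 0 0
0 0 0 23940396 178804 5127788 0 0 0 32 374 329 1 1 98 0 0
"""


def parse_vmstat(text):
    """Return one dict per data row, keyed by vmstat column name.

    Hypothetical helper for illustration; it is not part of vmstat itself.
    """
    lines = [line.split() for line in text.splitlines() if line.strip()]
    # The column-name line is the first one whose first token is "r".
    header_index = next(i for i, tokens in enumerate(lines) if tokens[0] == "r")
    columns = lines[header_index]
    rows = []
    for tokens in lines[header_index + 1:]:
        # Skip anything that is not a full row of integers
        # (for example a repeated header or a shell prompt).
        if len(tokens) != len(columns) or not all(t.isdigit() for t in tokens):
            continue
        rows.append(dict(zip(columns, map(int, tokens))))
    return rows


rows = parse_vmstat(SAMPLE)
for row in rows:
    # Print a few columns of interest from each sample.
    print(row["r"], row["free"], row["id"], row["wa"])
```

Keying each row by column name means later checks can refer to `row["wa"]` or `row["so"]` instead of counting positions in the line.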

The info below is from "man vmstat" and is self-explanatory; it gives details of every column for which stats are collected.

  
FIELD DESCRIPTION FOR VM MODE

Procs

r: The number of processes waiting for run time.

b: The number of processes in uninterruptible sleep.

  Memory

swpd: the amount of virtual memory used.

free: the amount of idle memory.

buff: the amount of memory used as buffers.

cache: the amount of memory used as cache.

inact: the amount of inactive memory. (-a option)

active: the amount of active memory. (-a option)

  Swap

si: Amount of memory swapped in from disk (/s).

so: Amount of memory swapped to disk (/s).

  IO

bi: Blocks received from a block device (blocks/s).

bo: Blocks sent to a block device (blocks/s).

  System

in: The number of interrupts per second, including the clock.

cs: The number of context switches per second.

  CPU

These are percentages of total CPU time.

us: Time spent running non-kernel code. (user time, including nice time)

sy: Time spent running kernel code. (system time)

id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.

wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.

st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.
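As a rough illustration of how these columns map to the usual suspects (CPU, swapping, disk IO), here is a small Python sketch that turns one sample into a list of hints. The function name `flag_bottlenecks` and every threshold in it are my own assumptions for a first look, not values from the man page, and the "busy" sample is made up rather than taken from a real system.

```python
def flag_bottlenecks(sample, cpu_count, wa_limit=20, busy_limit=90):
    """Return human-readable hints for one vmstat sample (a dict of columns).

    All thresholds are illustrative assumptions, not official guidance.
    """
    hints = []
    # More processes waiting for run time than CPUs suggests CPU saturation.
    if sample["r"] > cpu_count:
        hints.append("CPU: more processes waiting to run than CPUs")
    # User plus system time close to 100% leaves little idle time.
    if sample["us"] + sample["sy"] >= busy_limit:
        hints.append("CPU: little idle time left")
    # Any swap-in or swap-out activity means memory pressure.
    if sample["si"] > 0 or sample["so"] > 0:
        hints.append("Memory: system is swapping")
    # High IO-wait or processes in uninterruptible sleep point at IO.
    if sample["wa"] >= wa_limit or sample["b"] > 0:
        hints.append("Disk IO: processes are likely waiting on IO")
    return hints


# Third sample from the vmstat run in this post: an idle system.
idle = {"r": 0, "b": 0, "si": 0, "so": 0, "us": 1, "sy": 1, "id": 98, "wa": 0}
# Hypothetical sample, invented to show the hints firing.
busy = {"r": 12, "b": 3, "si": 0, "so": 850, "us": 20, "sy": 10, "id": 25, "wa": 45}

print(flag_bottlenecks(idle, cpu_count=4))
print(flag_bottlenecks(busy, cpu_count=4))
```

A single sample rarely proves anything, so hints like these are only useful when they show up consistently across several intervals.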

  On systems with a lot of memory, it is worth running the command with the "-S" option and specifying "M", which reports the memory stats in "MB" (units of 1048576 bytes) and is more human-readable.

  [root@ip-10-128-160-140 ~]# vmstat -S M 1 3

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----

r b swpd free buff cache si so bi bo in cs us sy id wa st

0 0 0 23386 174 5000 0 0 0 5 3 6 0 0 100 0 0

0 0 0 23386 174 5000 0 0 0 0 262 299 0 0 100 0 0

0 0 0 23386 174 5000 0 0 0 0 140 256 0 0 100 0 0

[root@ip-10-128-160-140 ~]#
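The relationship between the two outputs is plain division by 1024. The Python sketch below converts the memory columns from the first run (in KB) to MB, assuming the result is truncated to a whole number; `kib_to_mib` is just a name I picked for the example. The buff value matches the "-S M" run exactly, while free and cache differ slightly, presumably because the two runs were taken at different moments.

```python
KIB_PER_MIB = 1024


def kib_to_mib(kib):
    """Convert a value in 1024-byte units to 1048576-byte units, truncating."""
    return kib // KIB_PER_MIB


# Memory columns from the first vmstat run in this post (default units).
first_run = {"free": 23940396, "buff": 178804, "cache": 5127788}

for name, kib in first_run.items():
    print(name, kib_to_mib(kib))
# free 23379
# buff 174
# cache 5007
```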

 With the above info, a system administrator can identify system bottlenecks, or at least get a data point on where to dig deeper to get to the root cause of the performance problem they are troubleshooting. I plan to write another post with examples where I will manually simulate system bottlenecks and run vmstat in parallel to get data points that point to the problem.

Friday, July 3, 2015
