Performance tuning tools
by Matt Frye
- P.S.—I want to see my processes
- Being on top
- Sar, yes, sar!
- Check your system, STAT!
- About the author
As a system administrator, part of your daily duties is to monitor systems for performance and to tune them where necessary. While there are expensive software products and benchmarking tools that can hone a machine to optimum efficiency, Linux® itself includes several basic tools that let the knowledgeable system administrator gather information and use it to decide where and when to tune a system.
P.S.—I want to see my processes
One of the most basic tools we can use is the ps utility. ps provides a snapshot of current processes. This snapshot can range from a single user's active processes to every process on the system. The simplest example, of course, is to run the ps command with no options, which produces output similar to:
Example 1. Basic output of ps

      PID TTY          TIME CMD
     2873 pts/1    00:00:00 bash
     3002 pts/1    00:00:00 ps
We see in Example 1, “Basic output of ps” that we get some minimal information about the processes we are running, including ps itself. ps displays the process ID (PID), the terminal associated with the process (TTY), the cumulative CPU time in [dd-]hh:mm:ss format (TIME), and the executable name (CMD). Spectacular, right? Well, ps does this and a whole lot more. I should mention at this point that the version of ps that I am using for this article is something special compared to the ps of yesteryear and of classic UNIX®. This ps, procps version 3.2.5, accepts several kinds of options: UNIX options, which may be grouped and must be preceded by a dash; BSD options, which may be grouped and must not be used with a dash; and GNU long options, which are preceded by two dashes. For the uninitiated, those who are new to Linux, or refugees from some older BSD or System V variant, this is good news. A system administrator can track down a process via several sets of options.
Example 2. Output of ps -ef | grep mfrye

    root      2784  2774  0 22:45 pts/2    00:00:00 su - mfrye
    mfrye     2785  2784  0 22:45 pts/2    00:00:00 -bash
    root      2895  1870  0 23:04 ?        00:00:00 sshd: mfrye [priv]
    mfrye     2897  2895  0 23:04 ?        00:00:00 sshd: mfrye@pts/3
    mfrye     2898  2897  0 23:04 pts/3    00:00:00 -bash
    mfrye     3274  2785  0 23:34 pts/2    00:00:00 ps -ef
    mfrye     3275  2785  0 23:34 pts/2    00:00:00 grep mfrye
Example 3. Output of ps aux | grep mfrye

    root      2784  0.0  0.0  71368  1288 pts/2    S    22:45   0:00 su - mfrye
    mfrye     2785  0.0  0.0  55124  1536 pts/2    S    22:45   0:00 -bash
    root      2895  0.0  0.1  38228  2660 ?        Ss   23:04   0:00 sshd: mfrye [priv]
    mfrye     2897  0.0  0.1  38228  2748 ?        S    23:04   0:00 sshd: mfrye@pts/3
    mfrye     2898  0.0  0.0  55124  1528 pts/3    Ss   23:04   0:00 -bash
    mfrye     3272  0.0  0.0  52948   872 pts/2    R+   23:34   0:00 ps aux
    mfrye     3273  0.0  0.0  51192   636 pts/2    S+   23:34   0:00 grep mfrye
In Example 2, “Output of ps -ef | grep mfrye” and Example 3, “Output of ps aux | grep mfrye”, we see the output of ps with different arguments. (The BSD form takes no dash, as the ps aux entry in the listing itself shows.) We can use this output to track a particular set of processes (owned by mfrye) via either of two sets of options (UNIX and BSD, respectively). So what’s the big deal, you’re thinking? OK, so bash is a pretty tame example. When another process consumes more memory, or some other resource, than you want, ps can be a very quick, easy, and effective way to track that process down. So now we’ve tracked down a particular process, but we know little beyond its memory footprint and its accumulated CPU time, which, as you may appreciate, is not ideal. Luckily, there’s more.
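To make the "track that process down" step concrete, here is a minimal Python sketch that parses BSD-style ps aux output and ranks one user's processes by resident memory. The names PsRow, parse_ps_aux, and top_memory_users are hypothetical, and the sketch assumes the standard eleven-column ps aux layout (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND). The sample rows are taken from Example 3, with the header line that grep filtered out restored.

```python
from typing import NamedTuple


class PsRow(NamedTuple):
    user: str
    pid: int
    cpu_percent: float
    mem_percent: float
    rss_kib: int
    command: str


def parse_ps_aux(text):
    """Parse `ps aux` rows, assuming the standard eleven-column layout."""
    rows = []
    for line in text.splitlines():
        # The command may contain spaces, so split off only the first ten fields.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header line and anything malformed
        rows.append(PsRow(fields[0], int(fields[1]), float(fields[2]),
                          float(fields[3]), int(fields[5]), fields[10]))
    return rows


def top_memory_users(rows, user, limit=3):
    """Return the `limit` largest processes (by resident set size) owned by `user`."""
    owned = [row for row in rows if row.user == user]
    return sorted(owned, key=lambda row: row.rss_kib, reverse=True)[:limit]


# Rows copied from Example 3; the header is what `ps aux` prints before grep drops it.
SAMPLE = """\
USER       PID %CPU %MEM   VSZ  RSS TTY   STAT START TIME COMMAND
root      2784  0.0  0.0 71368 1288 pts/2 S    22:45 0:00 su - mfrye
mfrye     2785  0.0  0.0 55124 1536 pts/2 S    22:45 0:00 -bash
mfrye     2897  0.0  0.1 38228 2748 ?     S    23:04 0:00 sshd: mfrye@pts/3
mfrye     3272  0.0  0.0 52948  872 pts/2 R+   23:34 0:00 ps aux
"""

if __name__ == "__main__":
    for row in top_memory_users(parse_ps_aux(SAMPLE), "mfrye"):
        print(row.pid, row.rss_kib, row.command)
```

In practice the text would come from running ps aux rather than from a pasted sample; parsing a captured string keeps the sketch independent of whatever happens to be running on your machine.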
Being on top
To track a process in relation to overall system usage, another basic performance monitoring tool is top. To start top, simply run top from the command line. A typical glimpse of top output without any formatting can be seen in Example 4, “Basic output of top”.
Example 4. Basic output of top

    top - 23:50:16 up  3:25,  1 user,  load average: 0.00, 0.00, 0.00
    Tasks:  88 total,   1 running,  87 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
    Mem:   2055112k total,   227684k used,  1827428k free,    53556k buffers
    Swap:  2096472k total,        0k used,  2096472k free,   100884k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        1 root      16   0  4876  596  500 S  0.0  0.0   0:00.78 init
        2 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0
        3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
        4 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/1
        5 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/1
        6 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/2
        7 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/2
        8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/3
        9 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/3
       10 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 events/0
       11 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 events/1
       12 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 events/2
       13 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 events/3
       14 root      19  -5     0    0    0 S  0.0  0.0   0:00.00 khelper
       15 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kthread
       22 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 kacpid
      106 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/0
      107 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/1
      108 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/2
      109 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/3
      112 root      15   0     0    0    0 S  0.0  0.0   0:00.00 khubd
      162 root      20   0     0    0    0 S  0.0  0.0   0:00.00 pdflush
      163 root      15   0     0    0    0 S  0.0  0.0   0:00.01 pdflush
      166 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 aio/0
      167 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 aio/1
      168 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 aio/2
      169 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 aio/3
Top is an interactive tool that allows a system administrator to view the process table in order of CPU or memory usage, by user, and at varying refresh rates. For example, a system administrator who wants to monitor the processes running under the user apache (option u, apache), sorted by memory usage (option M), updated every half second (option s, .5) would get output like that in Example 5, “Example of top output sorted by user apache”. Note that the delay command is a lowercase s; an uppercase S toggles cumulative time mode instead.
Example 5. Example of top output sorted by user apache

    top - 23:58:42 up  3:33,  1 user,  load average: 0.00, 0.00, 0.00
    Tasks:  88 total,   1 running,  87 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
    Mem:   2055112k total,   227436k used,  1827676k free,    53740k buffers
    Swap:  2096472k total,        0k used,  2096472k free,   101220k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     1911 apache    16   0  113m  13m 7984 S  0.0  0.7   0:00.00 httpd
     1912 apache    15   0  113m  13m 7980 S  0.0  0.7   0:00.00 httpd
     1913 apache    16   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
     1914 apache    20   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
     1915 apache    20   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
     1916 apache    20   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
     1917 apache    20   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
     1918 apache    25   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
Top is useful for viewing real-time process behavior within the context of system resources. A faster refresh rate gives a finer-grained view of short-lived spikes in system load. For example, if you have a system running an Oracle® Database, and your startup time for the database is unacceptably slow, you will be able to see which processes consume the greater part of memory while the system is pegged. While top is a good interactive tool, you may not have the time or inclination to sit and watch processes for more than a few minutes. Luckily, there’s more.
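If you would rather capture a top snapshot than watch one, procps top can also run non-interactively. The sketch below builds such a command line using the -b (batch mode), -n (number of refreshes), -d (delay), and -u (user) command-line options. The function names top_batch_command and capture_top are hypothetical, and the sketch assumes a procps-style top is on the PATH.

```python
import subprocess


def top_batch_command(user=None, delay_seconds=None, iterations=1):
    """Build an argument list that runs procps top non-interactively."""
    # -b: batch mode (plain text, no screen control); -n: number of refreshes
    command = ["top", "-b", "-n", str(iterations)]
    if delay_seconds is not None:
        command += ["-d", str(delay_seconds)]  # -d: delay between refreshes
    if user is not None:
        command += ["-u", user]  # -u: show only this user's processes
    return command


def capture_top(**options):
    """Run top in batch mode and return its text output."""
    result = subprocess.run(top_batch_command(**options),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Roughly the batch counterpart of the interactive u / s settings above.
    print(" ".join(top_batch_command(user="apache", delay_seconds=0.5, iterations=3)))
```

The captured text can then be saved or searched like any other command output, which suits the case where nobody is available to watch the screen.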
Sar, yes, sar!
Sar is one of those utilities that conjures up images of UNIX nerds that took Latin in high school (when Latin was still offered in high schools). Because of sar’s relative oddness, it is often lumped into the same category as sendmail for ease of configuration. To be fair, there is wonderful documentation for most such utilities. However, looking beyond sar’s reputation for obscurity in output as well as syntax reveals a powerful system monitoring tool.
You can install sar by installing the sysstat package with the command yum install sysstat. You also need to initialize sar the first time by running /usr/lib/sa/sa1 1 1 and /usr/lib/sa/sa2 -A, or by letting cron run these commands. The sysstat package places the corresponding cron entries in /etc/cron.d/sysstat, and running sar with no arguments will not give you meaningful output until these commands have run.
Running sar with no arguments will give you some pretty obvious output as to what’s going on in your system. In Example 6, “Basic output of sar”, we see the day’s statistics so far, averaged over each ten-minute interval across all CPUs. You will notice that these are the same pieces of information that we saw in top, except that in this case, sar gives us a time breakdown of when loads occurred.
Example 6. Basic output of sar

    Linux 2.6.12-1.1398_FC4smp (knuth)      08/28/2005

    12:00:01 AM       CPU     %user     %nice   %system   %iowait     %idle
    12:10:01 AM       all      0.01      0.00      0.01      0.00     99.98
    12:20:01 AM       all      0.01      0.00      0.01      0.00     99.98
    12:30:01 AM       all      0.01      0.00      0.01      0.01     99.98
    12:40:01 AM       all      0.00      0.00      0.00      0.00    100.00
    12:50:01 AM       all      0.00      0.00      0.00      0.01     99.99
    01:00:01 AM       all      0.00      0.00      0.00      0.00    100.00
    01:10:01 AM       all      0.00      0.00      0.00      0.00    100.00
    01:20:01 AM       all      0.00      0.00      0.00      0.00    100.00
    01:30:01 AM       all      0.00      0.00      0.00      0.00    100.00
    01:40:01 AM       all      0.00      0.00      0.00      0.00    100.00
    01:50:01 AM       all      0.00      0.00      0.00      0.00    100.00
    Average:          all      0.00      0.00      0.00      0.00     99.99
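Because sar's value lies in showing when loads occurred, a small filter over its interval rows can pick out the busy periods for you. The following Python sketch assumes the column layout shown in Example 6 (a 12-hour timestamp, the CPU column, then percentages ending in %idle); the name busy_intervals and the 90% threshold are hypothetical choices. Since the machine in this article was idle all night, the second sample row uses made-up numbers purely to exercise the filter.

```python
def busy_intervals(sar_text, max_idle=90.0):
    """Return (timestamp, idle%) pairs for intervals where %idle fell below max_idle."""
    busy = []
    for line in sar_text.splitlines():
        fields = line.split()
        # Data rows look like: 12:10:01 AM all 0.01 0.00 0.01 0.00 99.98
        if len(fields) != 8 or fields[2] != "all":
            continue  # skips the banner, the column header, and the Average row
        idle = float(fields[-1])
        if idle < max_idle:
            busy.append((f"{fields[0]} {fields[1]}", idle))
    return busy


# First data row is from Example 6; the 12:20:01 row and the Average row are
# invented for illustration and were not measured on any system.
SAMPLE = """\
12:00:01 AM  CPU  %user  %nice  %system  %iowait  %idle
12:10:01 AM  all   0.01   0.00     0.01     0.00  99.98
12:20:01 AM  all  35.20   0.00     4.10    12.50  48.20
Average:     all  17.61   0.00     2.05     6.25  74.09
"""

if __name__ == "__main__":
    for timestamp, idle in busy_intervals(SAMPLE):
        print(f"{timestamp}: only {idle:.2f}% idle")
```

On a system whose locale prints 24-hour timestamps, the rows have one field fewer, so the field count and index in the sketch would need adjusting.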
Incidentally, these values are stored by running sar in cron. Fedora™ Core 4 has the following entries in /etc/cron.d/sysstat, by default:
    # run system activity accounting tool every 10 minutes
    */10 * * * * root /usr/lib/sa/sa1 1 1
    # generate a daily summary of process accounting at 23:53
    53 23 * * * root /usr/lib/sa/sa2 -A
The sa1 script collects and stores binary data in the system activity daily data file, and sa2 writes a daily report in the /var/log/sa/ directory. Sar can also be invoked to provide real-time statistics on the fly. In Example 7, “Example output of sar 1 10”, I have invoked sar with the options for a one-second interval over 10 iterations. This is a very effective way to evaluate where a bottleneck might lie. If you’re having problems with I/O wait when certain reads take place, you’ll be able to see it here. Running sar in this fashion offers you the dynamic output of top with the specificity of sar.
Example 7. Example output of sar 1 10

    Linux 2.6.12-1.1398_FC4smp (knuth)      08/28/2005

    02:13:43 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:13:44 AM       all      0.00      0.00      0.00      0.00    100.00
    02:13:45 AM       all      0.00      0.00      0.00      0.00    100.00
    02:13:46 AM       all      0.00      0.00      0.00      0.00    100.00
    02:13:47 AM       all      0.00      0.00      0.00      0.00    100.00
    02:13:48 AM       all      0.00      0.00      0.00      0.00    100.00
    02:13:49 AM       all      0.00      0.00      0.00      0.00    100.00
    02:13:50 AM       all      0.00      0.00      0.00      0.00    100.00
    02:13:51 AM       all      0.00      0.00      0.00      0.00    100.00
    02:13:52 AM       all      0.00      0.00      0.00      0.00    100.00
    02:13:53 AM       all      0.00      0.00      0.00      0.00    100.00
    Average:          all      0.00      0.00      0.00      0.00    100.00
Sar can also restrict the same report to a particular processor. Example 8, “sar -P 1 1 5 output” shows five one-second iterations for CPU 1, and Example 9, “sar -P 2 1 5 output” shows five one-second iterations for CPU 2.
Example 8. sar -P 1 1 5 output

    Linux 2.6.12-1.1398_FC4smp (knuth)      08/28/2005

    02:28:24 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:28:25 AM         1      0.00      0.00      0.00      0.00    100.00

    02:28:25 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:28:26 AM         1      0.00      0.00      0.00      0.00    100.00

    02:28:26 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:28:27 AM         1      0.00      0.00      0.00      0.00    100.00

    02:28:27 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:28:28 AM         1      0.00      0.00      0.00      0.00    100.00

    02:28:28 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:28:29 AM         1      0.00      0.00      0.00      0.00    100.00

    Average:          CPU     %user     %nice   %system   %iowait     %idle
    Average:            1      0.00      0.00      0.00      0.00    100.00
Example 9. sar -P 2 1 5 output

    Linux 2.6.12-1.1398_FC4smp (knuth)      08/28/2005

    02:28:33 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:28:34 AM         2      0.00      0.00      0.00      0.00    100.00

    02:28:34 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:28:35 AM         2      0.00      0.00      0.00      0.00    100.00

    02:28:35 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:28:36 AM         2      0.00      0.00      0.00      0.00    100.00

    02:28:36 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:28:37 AM         2      0.00      0.00      0.00      0.00    100.00

    02:28:37 AM       CPU     %user     %nice   %system   %iowait     %idle
    02:28:38 AM         2      0.00      0.00      0.00      0.00    100.00

    Average:          CPU     %user     %nice   %system   %iowait     %idle
    Average:            2      0.00      0.00      0.00      0.00    100.00
Check your system, STAT!
There are a number of *stat commands that appear in any given system, and I would like to mention two which I think are most useful. The first of these is iostat. Iostat reports CPU statistics and input/output statistics for devices and partitions. While it seems that CPU statistics are available in every utility mentioned here so far, it’s the I/O part of iostat that makes it useful. Run without any parameters, iostat gives you a single report covering the history since boot for the CPU and all devices. This is useful for a quick look at device utilization and, in this case, looking at CPU usage makes a lot of sense. In Example 10, “Basic output of iostat”, iostat shows blocks read and written, both per second and in total.
Example 10. Basic output of iostat

    Linux 2.6.12-1.1398_FC4smp (knuth)      08/28/2005

    avg-cpu:  %user   %nice    %sys %iowait   %idle
               0.01    0.00    0.01    0.04   99.93

    Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
    sda               0.92        12.27         8.27     289810     195288
In Example 11, “Output of iostat -p sda 1 3”, iostat displays three reports at one-second intervals for device sda and all its partitions. It’s easy to see how iostat can deliver real-time statistics on the partitions’ reads and writes.
Example 11. Output of iostat -p sda 1 3

    Linux 2.6.12-1.1398_FC4smp (knuth)      08/28/2005

    avg-cpu:  %user   %nice    %sys %iowait   %idle
               0.01    0.00    0.01    0.04   99.93

    Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
    sda               0.92        12.08         8.36     289810     200592
    sda3              0.01         0.02         0.00        386          0
    sda2              1.68        12.01         8.36     288138     200544
    sda1              0.02         0.04         0.00       1024         48

    avg-cpu:  %user   %nice    %sys %iowait   %idle
               0.00    0.00    0.00    0.00  100.00

    Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
    sda               0.00         0.00         0.00          0          0
    sda3              0.00         0.00         0.00          0          0
    sda2              0.00         0.00         0.00          0          0
    sda1              0.00         0.00         0.00          0          0

    avg-cpu:  %user   %nice    %sys %iowait   %idle
               0.00    0.00    0.00    0.00  100.00

    Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
    sda               0.00         0.00         0.00          0          0
    sda3              0.00         0.00         0.00          0          0
    sda2              0.00         0.00         0.00          0          0
    sda1              0.00         0.00         0.00          0          0
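As a rough illustration of putting the per-partition figures to work, the Python sketch below reads an iostat device table and reports which partition has the highest read rate. It assumes the report layout shown in this article (a "Device:" header row with Blk_read/s and Blk_wrtn/s columns); other sysstat releases may label the columns differently, in which case the column names would need adjusting. The function names parse_iostat_devices and busiest_device are hypothetical.

```python
def parse_iostat_devices(text):
    """Map device name -> {column: value} for each row of an iostat device table.

    If the text holds several reports, later rows overwrite earlier ones,
    so the result reflects the most recent report.
    """
    devices = {}
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "avg-cpu:":
            columns = None  # a blank line or a CPU section ends the device table
        elif fields[0] == "Device:":
            columns = fields[1:]
        elif columns is not None and len(fields) == len(columns) + 1:
            devices[fields[0]] = dict(zip(columns, map(float, fields[1:])))
    return devices


def busiest_device(devices, column="Blk_read/s"):
    """Return the name of the device with the highest value in `column`."""
    return max(devices, key=lambda name: devices[name][column])


# The first report from Example 11.
SAMPLE = """\
avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.01    0.00    0.01    0.04   99.93

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.92        12.08         8.36     289810     200592
sda3              0.01         0.02         0.00        386          0
sda2              1.68        12.01         8.36     288138     200544
sda1              0.02         0.04         0.00       1024         48
"""

if __name__ == "__main__":
    devices = parse_iostat_devices(SAMPLE)
    # Leave out the whole disk so that only partitions are compared.
    partitions = {name: row for name, row in devices.items() if name != "sda"}
    print(busiest_device(partitions))
```

With the first report from Example 11, nearly all of the disk's read traffic turns out to belong to sda2.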
The last utility I would like to mention is vmstat. Vmstat reports statistics on virtual memory and can be useful when trying to identify system bottlenecks. Vmstat does not count itself as a running process, and it can be used in a number of modes. Run with no parameters, vmstat displays a single report of averages since boot; adding the -a option shows active and inactive memory. Like iostat, vmstat can be run in iterations, at a particular interval. In Example 12, “Output of vmstat 1 5”, vmstat is run at one-second intervals for five iterations.
Example 12. Output of vmstat 1 5

    procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
     r  b   swpd    free   buff  cache   si   so    bi    bo   in    cs us sy  id wa
     0  0      0 1826368  57028 102352    0    0     1     1  251     6  0  0 100  0
     0  0      0 1826368  57028 102352    0    0     0     0 1008    13  0  0 100  0
     0  0      0 1826368  57028 102352    0    0     0     0 1004    13  0  0 100  0
     0  0      0 1826368  57036 102344    0    0     0    60 1007    25  0  0 100  0
     0  0      0 1826368  57036 102344    0    0     0     0 1004    13  0  0 100  0
Vmstat can also provide a quick list of memory-related statistics from the vmstat -s command:
          2055112  total memory
           229240  used memory
            84480  active memory
            91816  inactive memory
          1825872  free memory
            57224  buffer memory
           102156  swap cache
          2096472  total swap
                0  used swap
          2096472  free swap
             1130  non-nice user cpu ticks
              247  nice user cpu ticks
             1110  system cpu ticks
          9995941  idle cpu ticks
             3860  IO-wait cpu ticks
               35  IRQ cpu ticks
               56  softirq cpu ticks
           144945  pages paged in
           108540  pages paged out
                0  pages swapped in
                0  pages swapped out
         25092942  interrupts
           575618  CPU context switches
       1126139091  boot time
             4447  forks
as well as partition information from the vmstat -p sda2 command:
    sda2          reads   read sectors  writes    requested writes
                  15200     288218      27285     218280
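The vmstat -s listing is regular enough to load into a dictionary, which makes quick ratios easy to compute. This Python sketch assumes each line is a number followed by a label, as in the listing above; newer vmstat releases may print a unit before the label, which would change the dictionary keys. The names parse_vmstat_s and memory_used_percent are hypothetical.

```python
def parse_vmstat_s(text):
    """Map each `vmstat -s` label to its integer value."""
    stats = {}
    for line in text.splitlines():
        # Each line is "<number> <label>", with leading padding on the number.
        value, _, label = line.strip().partition(" ")
        if value.isdigit() and label:
            stats[label.strip()] = int(value)
    return stats


def memory_used_percent(stats):
    """Used memory as a percentage of total memory."""
    return 100.0 * stats["used memory"] / stats["total memory"]


# A few lines copied from the vmstat -s listing above.
SAMPLE = """\
      2055112  total memory
       229240  used memory
      1825872  free memory
      2096472  total swap
            0  used swap
"""

if __name__ == "__main__":
    stats = parse_vmstat_s(SAMPLE)
    print(f"{memory_used_percent(stats):.1f}% of memory in use")
```

The same dictionary gives direct access to the swap and CPU tick counters when the full listing is parsed.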
Many of the functions of the utilities discussed in this article overlap. This is the result of having several authors, each of whom has attempted to provide you with as elegant and powerful a utility as possible. The overlap can cause some confusion or apathy toward these tools, because they seem redundant or are perceived to be “bloated.” However, the system administrator who recognizes each tool for its strengths and its ability to report cleanly the characteristics of a running system will find that the system comes with a rather complete tool set, not only for reacting to performance issues but also for predicting them via proactive monitoring.