Linux Performance Monitoring: top, vmstat, iostat, sar Commands

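Linux performance monitoring helps you identify bottlenecks before they become outages. This guide covers the core command-line tools for RHCA-level performance analysis: top, vmstat, iostat, sar, and free, plus network checks, resource limits, and basic sysctl tuning.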

top — Interactive Process Monitor

# top                           # launch
# top -b -n 1 > /tmp/top.txt   # batch mode snapshot

# Inside top:
# 1         → per-CPU breakdown
# M         → sort by memory
# P         → sort by CPU (default)
# k         → kill process (prompts for PID)
# r         → renice process
# u         → filter by user
# f         → add/remove columns
# q         → quit

# Key metrics to watch:
# load average: 3 values (1, 5, 15 min); sustained values above the CPU count (nproc) suggest saturation
# %us: user-space CPU    %sy: kernel CPU
# %wa: I/O wait — high = disk bottleneck
# Mem: used vs free (remember: Linux uses free RAM for cache — normal)
# Swap: used swap alone can be stale pages; sustained swap-in/out (si/so in vmstat) = memory pressure
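
As a rough illustration of the load-average rule above, this sketch divides a load value by the CPU count. The helper name load_per_cpu is hypothetical and not part of top or any other tool; a result that stays above 1.00 would suggest more runnable work than CPUs.

```shell
# load_per_cpu LOAD NCPU: print LOAD divided by NCPU with two decimals.
# Hypothetical helper, for illustration only.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# Typical use on a live system (1-minute load from /proc/loadavg):
#   load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

Normalizing by CPU count makes the number comparable across machines of different sizes.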

vmstat — Virtual Memory Statistics

# vmstat [interval] [count]
# vmstat 1 10                  # stats every 1 sec, 10 times

# Key columns:
# r   = processes running or waiting for CPU (runnable); sustained r > nCPUs = CPU saturation
# b   = processes in uninterruptible sleep (I/O wait)
# si  = swap-in (KB/s) from disk to RAM
# so  = swap-out (KB/s) from RAM to disk — high = memory problem
# bi  = blocks read from disk per second
# bo  = blocks written to disk per second
# us  = % user CPU
# sy  = % system CPU
# wa  = % I/O wait — high = disk bottleneck
# id  = % idle
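
The column rules above can be applied automatically. The sketch below reads vmstat output on stdin and reports samples where the run queue exceeds a CPU count or where swap-out is non-zero. The name vmstat_flags and the message wording are assumptions for illustration; columns are located by header name because the layout can differ between procps versions.

```shell
# vmstat_flags NCPU: read `vmstat` output on stdin, flag suspicious samples.
# Hypothetical helper; thresholds are illustrative, not official guidance.
vmstat_flags() {
    awk -v ncpu="$1" '
        # Header row: remember the position of every column name.
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Skip the banner line and anything that is not a data row.
        !("r" in col) || $1 !~ /^[0-9]+$/ { next }
        {
            n++
            msg = ""
            if ($(col["r"]) > ncpu) msg = msg " runq>" ncpu
            if ($(col["so"]) > 0)   msg = msg " swap-out"
            if (msg != "") print "sample " n ":" msg
        }'
}

# Typical use:
#   vmstat 1 10 | vmstat_flags "$(nproc)"
```

Note that the first vmstat sample reports averages since boot, so a flag on sample 1 says little about the current state.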

# Memory summary:
# vmstat -s

iostat — Disk I/O Statistics

# yum install sysstat -y          # provides iostat, sar, etc.

# iostat [interval] [count]
# iostat 1 5                      # disk stats every 1 sec, 5 times (first report = averages since boot)

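# Key columns (await and %util appear only with -x):
# tps    = transfers (I/O requests) per second
# kB_read/s, kB_wrtn/s = throughput
# await  = average time per I/O request, queue + service (ms); shown as r_await/w_await on newer sysstat; high = slow disk
# %util  = % of time the device was busy; near 100% suggests saturation (less conclusive on SSD/RAID, which serve requests in parallel)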

# Extended stats (adds await, queue size, %util):
# iostat -x 1 5                   # extended stats
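
To watch only the devices that look saturated, the extended output can be filtered. This is a minimal sketch, assuming the header row starts with "Device" and contains a "%util" column, which holds for the sysstat versions I am aware of; the function name busy_devices and the default 90% threshold are hypothetical.

```shell
# busy_devices [LIMIT]: read `iostat -x` output on stdin and print
# "device %util" for devices at or above LIMIT percent (default 90).
# Hypothetical helper; the %util column is found by name, not by position.
busy_devices() {
    awk -v limit="${1:-90}" '
        $1 == "avg-cpu:" { u = 0; next }     # CPU block: ignore until next device header
        $1 ~ /^Device/ {
            for (i = 1; i <= NF; i++) if ($i == "%util") u = i
            next
        }
        u && NF >= u && $u + 0 >= limit { print $1, $u }'
}

# Typical use:
#   iostat -x 1 5 | busy_devices 90
```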

sar — System Activity Reporter

# CPU utilization:
# sar -u 1 5                     # every 1 sec, 5 times

# Memory usage:
# sar -r 1 5

# Disk I/O:
# sar -b 1 5                     # system-wide I/O and transfer rates
# sar -d 1 5                     # per block device

# Network:
# sar -n DEV 1 5                 # per interface stats

# Historical data (collected by sysstat every 10 min by default):
# sar -u                         # today's CPU history
# sar -u -f /var/log/sa/sa01     # specific day (saDD = day of month, here the 1st)
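
Because the daily files are named after the day of the month, the path for a given date can be derived rather than typed. A small sketch, assuming the RHEL-style /var/log/sa directory used above; sa_file is a hypothetical helper name.

```shell
# sa_file YYYY-MM-DD: print the sysstat data file path for that date.
# Hypothetical helper; assumes files live in /var/log/sa and are named saDD.
sa_file() {
    printf '/var/log/sa/sa%s\n' "${1##*-}"   # keep only the day part
}

# Typical use with GNU date (yesterday's CPU history):
#   sar -u -f "$(sa_file "$(date -d yesterday +%F)")"
```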

# Enable sysstat collection:
# systemctl enable sysstat
# systemctl start sysstat

free — Memory Usage

# free -h                        # human readable
# free -m                        # in megabytes

# Output explained:
#              total    used    free   shared  buff/cache   available
# Mem:          16G     4G      2G     500M       10G         11G
# Swap:          4G      0       4G

# "available" is what you actually have for new processes
# "buff/cache" is used by OS for performance — can be reclaimed
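
Since "available" is the figure that matters, a quick check can express it as a percentage of total RAM. The sketch below assumes the column order shown above (available is the 7th field on the Mem: line); avail_pct is a hypothetical name.

```shell
# avail_pct: read `free` output on stdin, print available memory as a
# whole-number percentage of total (truncated). Hypothetical helper.
avail_pct() {
    awk '/^Mem:/ { printf "%d\n", $7 * 100 / $2 }'
}

# Typical use (plain numbers, so avoid -h):
#   free -m | avail_pct
```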

Network Performance

# sar -n DEV 1 5                # network interface stats
# iftop                         # real-time bandwidth per connection
# nethogs                       # per-process bandwidth
# ss -s                         # socket statistics

# Check NIC speed and errors:
# ethtool eth0
# ip -s link show eth0
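
The error and drop counters that ip -s link prints are also exposed as plain files under /sys/class/net/IFACE/statistics, which is convenient for scripts. A minimal sketch; nic_errors is a hypothetical name, and the optional second argument exists only so the function can be pointed at a different directory tree.

```shell
# nic_errors IFACE [SYSFS_ROOT]: print error and drop counters for IFACE.
# Hypothetical helper; SYSFS_ROOT defaults to /sys/class/net.
nic_errors() (
    dir="${2:-/sys/class/net}/$1/statistics"
    for c in rx_errors tx_errors rx_dropped tx_dropped; do
        printf '%s=%s\n' "$c" "$(cat "$dir/$c")"
    done
)

# Typical use:
#   nic_errors eth0
```

The counters are cumulative since the interface came up, so compare two readings taken some time apart rather than judging a single value.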

System Resource Limits

# Current limits:
# ulimit -a                     # all limits for current shell

# Set limits (temporary):
# ulimit -n 65535               # max open files
# ulimit -u 4096                # max user processes

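# Permanent limits (applied by pam_limits at next login; systemd services use LimitNOFILE=/LimitNPROC= in the unit file instead):
# vim /etc/security/limits.conf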
* soft nofile 65535
* hard nofile 65535
apache soft nproc 4096
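
New limits only reach processes started after the change, so it can help to check what a running process actually has. This sketch parses the "Max open files" row of /proc/PID/limits; open_file_limits is a hypothetical name, and the PID in the usage line is a placeholder.

```shell
# open_file_limits: read a /proc/PID/limits file on stdin and print
# "soft hard" for the open-files limit. Hypothetical helper.
open_file_limits() {
    awk '/^Max open files/ { print $4, $5 }'
}

# Typical use (1234 is a placeholder PID):
#   open_file_limits < /proc/1234/limits
```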

Performance Tuning Quick Wins

# Tune kernel parameters (sysctl):
# sysctl vm.swappiness                            # check current
# sysctl -w vm.swappiness=10                      # reduce swap tendency (temporary, until reboot)
# echo "vm.swappiness=10" >> /etc/sysctl.conf     # permanent
# sysctl -p                                       # reload /etc/sysctl.conf now

# Key sysctl parameters:
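vm.swappiness=10              # lower = prefer reclaiming page cache over swapping out process memory
vm.dirty_ratio=15             # % of available memory that may be dirty before writers block and flush synchronously
kernel.shmmax=68719476736     # max size of one shared memory segment in bytes (64 GiB; e.g. for Oracle DB)
net.core.somaxconn=1024       # max listen() backlog (pending-connection queue) per socket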
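
A value written to a config file is not necessarily the value the kernel is running with. The sketch below compares key=value lines from a file against the live values under /proc/sys and prints any that differ. It assumes single-valued parameters; sysctl_diff is a hypothetical name, and the optional second argument only exists so the function can be pointed at another directory tree.

```shell
# sysctl_diff FILE [PROC_ROOT]: report settings in FILE whose running value
# differs. Hypothetical helper; PROC_ROOT defaults to /proc/sys.
sysctl_diff() (
    file="$1"
    root="${2:-/proc/sys}"
    while IFS= read -r line; do
        line="${line%%#*}"                              # drop comments
        case "$line" in *=*) ;; *) continue ;; esac     # skip blanks and non-assignments
        key=$(printf '%s' "${line%%=*}" | tr -d '[:space:]')
        want=$(printf '%s' "${line#*=}" | tr -d '[:space:]')
        # vm.swappiness maps to $root/vm/swappiness
        have=$(cat "$root/$(printf '%s' "$key" | tr . /)" 2>/dev/null || true)
        [ "$have" = "$want" ] || echo "$key: running=${have:-?} file=$want"
    done < "$file"
)

# Typical use:
#   sysctl_diff /etc/sysctl.conf
```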