Application Monitoring

Application monitoring is a critical part of any project, but too little attention is usually paid to building effective monitoring while the project is still moving toward completion. Once the project is complete and live, the lack of proper monitoring shows up as downtime: support staff do not know that the application is having problems, or that it is not working at all.

iostat, vmstat, netstat – Performance Monitoring & Tuning in Unix & Linux

This document is written primarily with reference to Solaris performance monitoring and tuning, but these tools are also available on other Unix variants and on Linux, with slight differences in syntax.

iostat, vmstat and netstat are the three most commonly used tools for performance monitoring. They come built in with the operating system and are easy to use. iostat stands for input/output statistics and reports statistics for I/O devices such as disk drives. vmstat gives statistics for virtual memory, and netstat gives network statistics.

The following pages describe these tools and their use in performance monitoring, covering their syntax, examples, an explanation of the results, and solutions to common problems.

iostat – Input Output statistics

iostat reports terminal and disk I/O activity and CPU utilization. The first line of output covers the time period since boot, and each subsequent line covers the prior interval. The kernel maintains a number of counters to keep track of these values.
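Because the first report covers the whole period since boot, it is usually set aside when looking at current load. The following Python sketch shows one way to do that on captured output. It assumes the extended format, where every report repeats a column-header line whose first word is "disk" and may be preceded by a banner line starting with "extended". The function names are hypothetical, and other platforms may lay the output out differently.

```python
def split_reports(text, header_token="disk", banner_token="extended"):
    """Split captured iostat output into reports.

    Assumption: each report starts with a column-header line whose first
    word is `header_token`, optionally preceded by a banner line whose
    first word is `banner_token`.
    """
    reports = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == banner_token:
            continue                      # blank line or banner: ignore
        if fields[0] == header_token:
            reports.append([])            # header line opens a new report
        elif reports:
            reports[-1].append(line)      # data line of the current report
    return reports


def interval_reports(text):
    """Return every report except the first, which covers time since boot."""
    return split_reports(text)[1:]
```

With iostat -x 5 3, for instance, this would keep the second and third reports, each describing the preceding 5 seconds.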

iostat’s activity class options default to tdc (terminal, disk, and CPU). If any activity class option is specified, this default is completely overridden; for example, iostat -d reports statistics about the disks only.

iostat syntax

Basic syntax is iostat options interval count
options – let you specify the device class for which information is needed: disk, CPU or terminal (-d, -c, -t or -tdc). The -x option gives extended statistics.

interval – is the time period in seconds between two samples. iostat 4 will give data at 4-second intervals.

count – is the number of times the data is reported. iostat 4 5 will give data at 4-second intervals, 5 times.
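When iostat is driven from a monitoring script, the same three parts (options, interval, count) can be assembled programmatically. The Python sketch below is only an illustration: iostat_argv and run_iostat are hypothetical helper names, and the option letters are passed through unchecked, so they must be valid for the iostat on the target system.

```python
import subprocess


def iostat_argv(options="", interval=None, count=None):
    """Build the argument list for: iostat [-options] [interval [count]]."""
    argv = ["iostat"]
    if options:
        argv.append("-" + options)        # e.g. "xtc" becomes "-xtc"
    if interval is not None:
        argv.append(str(interval))
        if count is not None:
            argv.append(str(count))
    elif count is not None:
        # A count is only meaningful after an interval.
        raise ValueError("count requires an interval")
    return argv


def run_iostat(options="", interval=None, count=None):
    """Run iostat and return its standard output as text."""
    result = subprocess.run(
        iostat_argv(options, interval, count),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Here iostat_argv("xtc", 5, 2) builds the command line used in the example that follows, and iostat_argv(interval=4) corresponds to iostat 4.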

iostat Example

$ iostat -xtc 5 2
                  extended disk statistics                  tty          cpu
     disk   r/s  w/s  Kr/s  Kw/s wait actv svc_t  %w  %b  tin tout  us  sy  wt  id
     sd0    2.6  3.0  20.7  22.7  0.1  0.2  59.2   6  19    0   84   3  85  11   0
     sd1    4.2  1.0  33.5   8.0  0.0  0.2  47.2   2  23
     sd2    0.0  0.0   0.0   0.0  0.0  0.0   0.0   0   0
     sd3   10.2  1.6  51.4  12.8  0.1  0.3  31.2   3  31

The fields have the following meanings:
      disk    name of the disk
      r/s     reads per second
      w/s     writes per second
      Kr/s    kilobytes read per second
      Kw/s    kilobytes written per second
      wait    average number of transactions waiting for service (queue length)
      actv    average number of transactions actively being serviced
              (removed from the queue but not yet completed)
      svc_t   average service time, in milliseconds
      %w      percent of time there are transactions waiting
              for service (queue non-empty)
      %b      percent of time the disk is busy (transactions
              in progress)
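To work with these fields in a script, each disk line can be turned into a dictionary keyed by column name. The following Python sketch assumes the column order of the example above (disk name followed by nine numeric disk columns). DISK_FIELDS, parse_disk_line and parse_disks are hypothetical names, and any trailing tty and cpu columns are ignored.

```python
# Numeric disk columns, in the order shown in the example header line.
DISK_FIELDS = ["r/s", "w/s", "Kr/s", "Kw/s", "wait", "actv", "svc_t", "%w", "%b"]


def parse_disk_line(line):
    """Return a dict for one disk line, or None if the line is not disk data."""
    parts = line.split()
    if len(parts) < 1 + len(DISK_FIELDS):
        return None                       # too short: banner, prompt, blank
    try:
        values = [float(p) for p in parts[1:1 + len(DISK_FIELDS)]]
    except ValueError:
        return None                       # non-numeric: the header line
    stats = dict(zip(DISK_FIELDS, values))
    stats["disk"] = parts[0]
    return stats


def parse_disks(text):
    """Parse every disk line found in a block of iostat -x output."""
    rows = (parse_disk_line(line) for line in text.splitlines())
    return [row for row in rows if row is not None]
```

Applied to the example output above, parse_disks returns four rows, one each for sd0 through sd3.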

iostat Results and Solutions

The values to look for in the iostat output are:
* Reads/writes per second (r/s , w/s)
* Percentage busy (%b)
* Service time (svc_t)

If a disk shows consistently high reads/writes, its percentage busy (%b) is greater than 5 percent, and its average service time (svc_t) is greater than 30 milliseconds, then one of the following actions needs to be taken:

  1. Tune the application to use disk I/O more efficiently by modifying its disk queries and using the cache facilities available in application servers.
  2. Spread the file system across two or more disks using the disk striping feature of a volume manager, DiskSuite, etc.
  3. Increase the inode cache system parameter, ufs_ninode, which is the number of inodes to be held in memory. Inodes are cached globally (for UFS), not on a per-file-system basis.
  4. Move the file system to a faster disk or controller, or replace the existing disk or controller with a faster one.
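The %b and svc_t thresholds given above can be checked mechanically. The Python sketch below is one possible reading of "consistently": a disk is reported only if it exceeds both limits in every report supplied. The function and constant names are hypothetical, each disk is assumed to be a dictionary with "disk", "%b" and "svc_t" keys, and judging whether reads/writes are "high" is left to the reader, since the text gives no number for it.

```python
BUSY_PCT_LIMIT = 5.0     # %b threshold from the text
SVC_T_LIMIT_MS = 30.0    # svc_t threshold from the text, in milliseconds


def over_limits(stats, busy_limit=BUSY_PCT_LIMIT, svc_limit=SVC_T_LIMIT_MS):
    """True if one disk sample exceeds both the %b and svc_t limits."""
    return stats["%b"] > busy_limit and stats["svc_t"] > svc_limit


def consistently_over_limits(reports):
    """Return the names of disks that exceed both limits in every report.

    `reports` is a list of reports; each report is a list of per-disk dicts.
    """
    flagged = None
    for report in reports:
        over = {d["disk"] for d in report if over_limits(d)}
        # Keep only disks that were also over the limits in earlier reports.
        flagged = over if flagged is None else flagged & over
    return sorted(flagged or [])
```

With the single example report shown earlier, sd0, sd1 and sd3 exceed both limits, while sd2 is idle.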

fsck – Check & Repair Unix and Linux File Systems

Learn about fsck modes, fsck phases and fsck error messages, with explanations and suggested courses of action for successfully repairing Unix and Linux file systems.

fsck (file system consistency check) is a system utility in Unix, Linux and other Unix-like systems for checking and repairing file system inconsistencies.

A file system can become inconsistent for several reasons; the most common is an abnormal shutdown caused by hardware failure, power failure, or switching off the system without a proper shutdown. In these cases the superblock of the file system is not updated and holds mismatched information about data blocks, free blocks and inodes.

fsck in Linux

fsck is described in this document with reference to the ufs file system, but it can be used on Linux systems as follows:

fsck -t ext2 /dev/sda3
fsck.ext2 /dev/sda3
fsck.ext4 /dev/sda3
fsck.ext3 /dev/sda3

It returns one of the following exit codes, or the sum of several when more than one condition applies:

0 – No errors
1 – File system errors corrected
2 – System should be rebooted
4 – File system errors left uncorrected
8 – Operational error
16 – Usage or syntax error
32 – Fsck canceled by user request
128 – Shared library error
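Since the Linux fsck exit status can combine several of these conditions, a script that wraps fsck is better off treating it as a bit mask than comparing against single values. The Python sketch below decodes a status into the messages listed above; FSCK_EXIT_BITS and decode_fsck_status are hypothetical names.

```python
# Bit values and meanings as listed above for the Linux fsck exit status.
FSCK_EXIT_BITS = {
    1: "File system errors corrected",
    2: "System should be rebooted",
    4: "File system errors left uncorrected",
    8: "Operational error",
    16: "Usage or syntax error",
    32: "Fsck canceled by user request",
    128: "Shared library error",
}


def decode_fsck_status(status):
    """Return the list of conditions encoded in an fsck exit status."""
    if status == 0:
        return ["No errors"]
    # Test each bit in turn; several may be set at once.
    return [text for bit, text in FSCK_EXIT_BITS.items() if status & bit]
```

A status of 3, for example, decodes to both "File system errors corrected" and "System should be rebooted".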

fsck checks the file systems defined in /etc/fstab on Linux and in /etc/vfstab on Solaris.
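On Linux, the sixth field of each /etc/fstab entry (fs_passno) tells fsck in which order to check the file system, and a value of 0 or a missing field means it is not checked. The Python sketch below lists the entries fsck would consider, in pass order. It handles the Linux fstab layout only (Solaris /etc/vfstab uses different columns), and fsck_candidates is a hypothetical name.

```python
def fsck_candidates(fstab_text):
    """List (passno, device, mount_point, fs_type) for entries fsck checks.

    Entries whose sixth field (fs_passno) is 0 or absent are skipped.
    """
    entries = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # blank line or comment
        fields = line.split()
        if len(fields) < 3:
            continue                      # not a usable entry
        passno = int(fields[5]) if len(fields) > 5 else 0
        if passno > 0:
            entries.append((passno, fields[0], fields[1], fields[2]))
    # Lower pass numbers are checked first; ties keep file order.
    return sorted(entries, key=lambda entry: entry[0])
```

It could be called as fsck_candidates(open("/etc/fstab").read()) on a Linux system.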