Friday, November 30, 2012

Learn AWK by Examples

Here are some AWK (or rather gawk) tricks that I normally use to tackle data-crunching problems. You may also want to find out how shell variables can be passed to awk.

Print all users in /etc/passwd whose names start with a, c, or d

$ awk -F: '
/^[acd]/ { 
    print $1
}' /etc/passwd

daemon
colord
avahi-autoipd
avahi
chihung

Process stdin using range patterns

$ echo 'BEGIN
abc
def
END
junk
junk
START
pqrst
uvwxyz
STOP' | awk '
/START/,/STOP/ { print "-->" $0 }
/BEGIN/,/END/  { print "==>" $0 }
'

==>BEGIN
==>abc
==>def
==>END
-->START
-->pqrst
-->uvwxyz
-->STOP
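
For readers more at home in Python, the range-pattern behaviour can be sketched roughly as follows. This is only an approximation of awk's /start/,/stop/ semantics, written for Python 3; the helper name in_range is made up for this illustration.

```python
import re

def in_range(lines, start, stop):
    """Yield lines from one matching `start` through the next one matching `stop`,
    roughly like awk's /start/,/stop/ range pattern."""
    active = False
    for line in lines:
        if not active and re.search(start, line):
            active = True          # range opens on this line
        if active:
            yield line
            if re.search(stop, line):
                active = False     # range closes after the stop line is emitted

data = ["BEGIN", "abc", "def", "END", "junk", "junk",
        "START", "pqrst", "uvwxyz", "STOP"]

for line in in_range(data, "BEGIN", "END"):
    print("==>" + line)
for line in in_range(data, "START", "STOP"):
    print("-->" + line)
```

The "junk" lines fall outside both ranges, so they are never printed.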

Print 10 random integers N with 10<=N<20, using /dev/null as a dummy input file

$ awk -v n=10 -v start=10 -v end=20 '
BEGIN {
    srand()
    for (i=1; i<=n; ++i) {
        printf("%d\n", start+rand()*(end-start))
    }
}' /dev/null

17
13
15
19
10
11
19
13
11
19

Count by file types in current directory

$ ls -l | 
awk '
BEGIN {
    dirs=0
    files=0
    socks=0
    links=0
}
/^total/ { next }
/^d/ { ++dirs }
/^-/ { ++files }
/^s/ { ++socks }
/^l/ { ++links }
END { print "dirs=" dirs, "files=" files, "socks=" socks, "links=" links }
'

dirs=3 files=10 socks=0 links=0

Count by file types in current directory (using array)

$ ls -l | 
awk '
$1 != "total" {
    c1=substr($1,1,1)
    ++s[c1]
}
END {
    printf("dirs=%d files=%d socks=%d links=%d\n",
        s["d"], s["-"], s["s"], s["l"])
}
'

dirs=3 files=10 socks=0 links=0

Calculate total size of all .gz files in /usr/share directory

$ find /usr/share -type f -name "*.gz" -ls | 
awk '
{
    s+=$7
}
END {
    printf("%.2f MB\n", s/(1024*1024))
}
'

63.08 MB

Count files by users in /home directory

$ find /home -mount -type f -ls | 
awk '
{
    ++count[$5]
    size[$5]+=$7
} 
END {
    for ( i in count ) {
        printf("User=%s Count=%d Size=%s\n", i, count[i], size[i])
    }
}
'

User=chihung Count=16301 Size=6346658343
User=root Count=10 Size=80109

Print all users whose names start with e, f, or g, together with their group names. The group id to name mapping is stored in the gid2name array by processing the first file, /etc/group. (Note: I present two similar ways to do the same task.)

$ awk -F: '
NR==FNR {
    gid2name[$3]=$1
}
NR>FNR && /^[e-g]/ {
    print $1, gid2name[$4]
}' /etc/group /etc/passwd

games games
gnats gnats
gdm gdm
games games
gnats gnats
gdm gdm

$ awk -F: '
FILENAME=="/etc/group" {
    gid2name[$3]=$1
}
FILENAME=="/etc/passwd" && /^[e-g]/ {
    print $1, gid2name[$4]
}' /etc/group /etc/passwd

games games
gnats gnats
gdm gdm
games games
gnats gnats
gdm gdm
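
The same two-pass idea (build a lookup table from the first file, then use it while scanning the second) can be sketched in Python 3. The sample lines are made up for illustration so that the snippet runs without touching the real /etc/group and /etc/passwd.

```python
# Sample data in /etc/group and /etc/passwd format (made up for illustration).
group_lines = [
    "root:x:0:",
    "games:x:60:",
    "gdm:x:120:",
]
passwd_lines = [
    "root:x:0:0:root:/root:/bin/bash",
    "games:x:5:60:games:/usr/games:/bin/sh",
    "gdm:x:114:120:Gnome Display Manager:/var/lib/gdm:/bin/false",
]

# First pass: map group id (field 3) to group name (field 1).
gid2name = {}
for line in group_lines:
    fields = line.split(":")
    gid2name[fields[2]] = fields[0]

# Second pass: users starting with e, f or g, with their primary group name.
result = []
for line in passwd_lines:
    fields = line.split(":")
    if fields[0][:1] in ("e", "f", "g"):
        result.append((fields[0], gid2name.get(fields[3], "")))

for user, group in result:
    print(user, group)
```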


Count all file extensions in /usr/include directory

$ find /usr/include -mount -type f | 
awk -F/ '
{
    basename=$NF
    n=split(basename, arr, ".")
    if ( n>1 ) {
        ext=arr[n]
        ++summary[ext]
    }
}
END {
    for ( i in summary ) {
        print i, summary[i]
    }
}
'

h 4335
def 1
x 17
hpp 245
c 6
tcc 37

Multi-line record with blank line(s) as separator

$ cat ~/.mozilla/firefox/profiles.ini 
[General]
StartWithLastProfile=1

[Profile0]
Name=default
IsRelative=1
Path=5d0x3te1.default

$ awk '
BEGIN {
    FS="\n"
    RS=""
}
{
    for ( i=1; i<=NF; ++i ) {
        print "NR=" NR, "NF=" i, "Data=" $i
    }
}' ~/.mozilla/firefox/profiles.ini

NR=1 NF=1 Data=[General]
NR=1 NF=2 Data=StartWithLastProfile=1
NR=2 NF=1 Data=[Profile0]
NR=2 NF=2 Data=Name=default
NR=2 NF=3 Data=IsRelative=1
NR=2 NF=4 Data=Path=5d0x3te1.default
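
As a rough Python 3 counterpart, the sketch below splits the same text into records at runs of empty lines and then into one field per line. The function name records is an assumption made for this example, and it only approximates what awk does with RS="" and newline as FS.

```python
import re

text = """[General]
StartWithLastProfile=1

[Profile0]
Name=default
IsRelative=1
Path=5d0x3te1.default
"""

def records(text):
    """Split text into records separated by one or more empty lines,
    and each record into one field per line."""
    chunks = re.split(r"\n{2,}", text.strip("\n"))
    return [chunk.split("\n") for chunk in chunks]

for nr, rec in enumerate(records(text), start=1):
    for nf, field in enumerate(rec, start=1):
        print("NR=%d NF=%d Data=%s" % (nr, nf, field))
```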

Print all the section headers in a .ini file, using a user-defined function to remove the square brackets

$ cat ~/.mozilla/firefox/profiles.ini 
[General]
StartWithLastProfile=1

[Profile0]
Name=default
IsRelative=1
Path=5d0x3te1.default

$ awk '                                             
function rmsq(n) {
    gsub("\\[","",n)
    gsub("]","",n)
    return n
}
BEGIN {
    FS="\n"
    RS=""
}
{
    print rmsq($1)
}' ~/.mozilla/firefox/profiles.ini

General
Profile0


Monday, October 22, 2012

Remove Hundreds of Thousands of Files, take 3

After blogging about using find to remove files, I realised that there were thousands of errors in the execve system calls. By simulating the same scenario with just 2 files, I found that the search PATH is the culprit:
[pid 29481] execve("/usr/lib/lightdm/lightdm/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/usr/local/sbin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/usr/local/bin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/usr/sbin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/usr/bin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/sbin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = -1 ENOENT (No such file or directory)
[pid 29481] execve("/bin/rm", ["rm", "-f", "./somefiles-a"], [/* 50 vars */]) = 0
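
A small Python 3 sketch of this PATH search makes the cost easy to estimate. The PATH value is reconstructed from the directory order in the trace above, and path_candidates is a name invented for this illustration; the counting assumes rm exists only in /bin, as the trace suggests.

```python
import os

def path_candidates(command, path):
    """Return the full paths that would be tried, in order, for `command`."""
    return [os.path.join(d, command) for d in path.split(os.pathsep) if d]

# PATH reconstructed from the directory order in the strace output.
path = ("/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:"
        "/usr/sbin:/usr/bin:/sbin:/bin")

candidates = path_candidates("rm", path)
for c in candidates:
    print(c)

# Every directory searched before /bin costs one failed execve per file.
failed = candidates.index("/bin/rm")
print("failed attempts per file:", failed)
print("failed attempts for 17576 files:", failed * 17576)
```

Six failed lookups per file over 17576 files comes to 105456, which is the execve error count shown in the strace summary of the plain find -exec rm run.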

The number of system calls (especially failed ones) is reduced if I specify the full path of 'rm'.

$ touch somefiles-{a..z}{a..z}{a..z}

$ strace -cf find . -name "somefiles-*" -exec /bin/rm -f {} \;
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.59   36.418146        2072     17576           waitpid
  2.50    0.940861          54     17576           clone
  0.72    0.271665          15     17576           unlinkat
  0.07    0.025220           0    105467           close
  0.03    0.012440         541        23           getdents64
  0.03    0.011857           1     17577           fstatat64
  0.02    0.009425           1     17576     17576 _llseek
  0.01    0.005058           0     52737           open
  0.01    0.004589           0    140626           mmap2
  0.01    0.003785           0     17577           ioctl
  0.00    0.001053           0     52757           brk
  0.00    0.001028           0     52735           fstat64
  0.00    0.000388           0     17577           munmap
  0.00    0.000245           0     70311           mprotect
  0.00    0.000000           0     17580           read
  0.00    0.000000           0     17577           execve
  0.00    0.000000           0     52734     52734 access
  0.00    0.000000           0         1           gettimeofday
  0.00    0.000000           0         2           uname
  0.00    0.000000           0     17581           fchdir
  0.00    0.000000           0         3           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         2           getrlimit
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0     17577           set_thread_area
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           openat
  0.00    0.000000           0     17577           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00   37.705760                738330     70311 total

Another way to further reduce the number of system calls, as well as the run time, is to take advantage of the ability of 'rm' to take more than one file as argument. By piping the output of 'find' to 'xargs -L 10 /bin/rm -f', we ask 'rm' to remove 10 files at a time. You can see the massive reduction in system calls and run time.

$ cat rm.sh
#! /bin/bash

find . -name "somefiles-*" | xargs -L 10 /bin/rm -f


$ touch somefiles-{a..z}{a..z}{a..z}

$ strace -cf ./rm.sh
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.63    9.880623        5611      1761         1 waitpid
  0.98    0.097799           6     17576           unlinkat
  0.18    0.018443          10      1760           clone
  0.12    0.011728         510        23           getdents64
  0.08    0.008373           0     17577           fstatat64
  0.00    0.000331           0      7058         4 open
  0.00    0.000312           0      1842           read
  0.00    0.000289           0     14108           mmap2
  0.00    0.000041           0      1761      1760 ioctl
  0.00    0.000039           0     12342         2 close
  0.00    0.000000           0        69           write
  0.00    0.000000           0      1761           execve
  0.00    0.000000           0         1           time
  0.00    0.000000           0         1           getpid
  0.00    0.000000           0      5296      5288 access
  0.00    0.000000           0         1           pipe
  0.00    0.000000           0      5323           brk
  0.00    0.000000           0         3           dup2
  0.00    0.000000           0         1           getppid
  0.00    0.000000           0         1           getpgrp
  0.00    0.000000           0         2           gettimeofday
  0.00    0.000000           0      1765           munmap
  0.00    0.000000           0         1           sigreturn
  0.00    0.000000           0         3           uname
  0.00    0.000000           0      7049           mprotect
  0.00    0.000000           0         5           fchdir
  0.00    0.000000           0      1761           _llseek
  0.00    0.000000           0        25           rt_sigaction
  0.00    0.000000           0        21           rt_sigprocmask
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0        24         8 stat64
  0.00    0.000000           0      5295           fstat64
  0.00    0.000000           0         9           getuid32
  0.00    0.000000           0         9           getgid32
  0.00    0.000000           0         9           geteuid32
  0.00    0.000000           0         9           getegid32
  0.00    0.000000           0         3         1 fcntl64
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0      1761           set_thread_area
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00   10.017978                106026      7065 total
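
The process counts in this summary can be sanity-checked with a little arithmetic, sketched here in Python 3. The breakdown of the three extra execve calls is my assumption about where they come from, not something strace reports directly.

```python
import math

files = 26 ** 3          # somefiles-{a..z}{a..z}{a..z}
batch = 10               # xargs -L 10

rm_runs = math.ceil(files / batch)
print("files:", files)
print("rm invocations:", rm_runs)

# Assumption: three more execve calls come from the script itself
# (bash running rm.sh, find, and xargs).
expected_execve = rm_runs + 3
print("expected execve calls:", expected_execve)
```

Under that assumption the estimate matches the 1761 execve calls in the table, compared with one 'rm' process per file when -exec is used.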

Remove Hundreds of Thousands of Files, take 2

An alternative way to efficiently remove hundreds of thousands of files with find in Linux.

Normally the boilerplate for removing files with 'find' is find some-dir -name "*pattern*" -exec rm -f {} \;. This is very inefficient because it has to fork as many processes as there are files, and forking takes time. If fork takes 0.01s to create a process, it will take 1,000s (16+ min) just to create the 'rm' processes for 100,000 files.

Below is a summary of the strace system call counts for the 3 solutions (the Python way, the traditional find way with -exec, and find -delete) when deleting 17576 files (26*26*26). 'find -delete' is clearly the winner. See for yourself.

  • Python way - 18647 system calls, 0.0896s run time
  • find -exec rm - 843786 system calls, 43.801s run time
  • find -delete - 17711 system calls, 0.0793s run time
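
To put these figures side by side, here is a short Python 3 sketch that derives per-file and relative numbers. The call counts and seconds are copied from the "total" rows of the strace summaries in this post; the variable names are only for illustration.

```python
# (system calls, seconds) copied from the "total" rows of the strace summaries.
runs = {
    "python":        (18647, 0.089626),
    "find -exec rm": (843786, 43.801012),
    "find -delete":  (17711, 0.079346),
}
files = 17576
base_calls, base_secs = runs["find -delete"]

# For each method: calls per file, and calls/time relative to find -delete.
summary = {}
for name, (calls, secs) in runs.items():
    summary[name] = (calls / files, calls / base_calls, secs / base_secs)

for name, (per_file, rel_calls, rel_time) in summary.items():
    print("%-14s %6.1f calls/file %8.1fx calls %8.1fx time"
          % (name, per_file, rel_calls, rel_time))
```

Both the Python script and find -delete need roughly one system call per file, while find -exec rm needs dozens per file because every file gets its own process.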

Python way:

$ touch somefiles-{a..z}{a..z}{a..z}

$ strace -cf ./rm.py somefiles
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 81.86    0.073372           4     17576           unlink
 17.78    0.015938         590        27           getdents64
  0.09    0.000080           1        89           close
  0.08    0.000071           0       153           read
  0.08    0.000070           1       135        74 stat64
  0.07    0.000059           0       268       182 open
  0.04    0.000036           0       137           fstat64
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           chdir
  0.00    0.000000           0         9         9 access
  0.00    0.000000           0        12           brk
  0.00    0.000000           0         5         1 ioctl
  0.00    0.000000           0         4         2 readlink
  0.00    0.000000           0        50           munmap
  0.00    0.000000           0         1           uname
  0.00    0.000000           0        10           mprotect
  0.00    0.000000           0         3           _llseek
  0.00    0.000000           0        68           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         2           getcwd
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0        74           mmap2
  0.00    0.000000           0         9           lstat64
  0.00    0.000000           0         1           getuid32
  0.00    0.000000           0         1           getgid32
  0.00    0.000000           0         1           geteuid32
  0.00    0.000000           0         1           getegid32
  0.00    0.000000           0         1         1 futex
  0.00    0.000000           0         1           set_thread_area
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         3           openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.089626                 18647       269 total

Traditional find -exec rm -f {} \;:

$ touch somefiles-{a..z}{a..z}{a..z}

$ strace -cf find . -name "somefiles-*" -exec rm -f {} \;
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 97.91   42.883595        2440     17576           waitpid
  1.30    0.571413          33     17576           clone
  0.55    0.241115          14     17576           unlinkat
  0.07    0.030407           0    105467           close
  0.04    0.017349           1     17577           fstatat64
  0.03    0.014306           1     17576     17576 _llseek
  0.03    0.012770           0     52737           open
  0.02    0.008407           0    140626           mmap2
  0.02    0.006971           0     17577           ioctl
  0.01    0.004180         182        23           getdents64
  0.01    0.004000           0    123033    105456 execve
  0.01    0.003189           0     52757           brk
  0.00    0.001418           0     17577           munmap
  0.00    0.001373           0     52735           fstat64
  0.00    0.000519           0     70311           mprotect
  0.00    0.000000           0     17580           read
  0.00    0.000000           0     52734     52734 access
  0.00    0.000000           0         1           gettimeofday
  0.00    0.000000           0         2           uname
  0.00    0.000000           0     17581           fchdir
  0.00    0.000000           0         3           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         2           getrlimit
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0     17577           set_thread_area
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           openat
  0.00    0.000000           0     17577           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00   43.801012                843786    175767 total

find -delete way:

$ touch somefiles-{a..z}{a..z}{a..z}

$ strace -cf find . -name "somefiles-*" -delete
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 87.20    0.069193           4     17576           unlinkat
 12.69    0.010070         438        23           getdents64
  0.10    0.000083           5        17           mmap2
  0.00    0.000000           0         4           read
  0.00    0.000000           0         9           open
  0.00    0.000000           0        11           close
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         6         6 access
  0.00    0.000000           0        29           brk
  0.00    0.000000           0         1           ioctl
  0.00    0.000000           0         1           gettimeofday
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         2           uname
  0.00    0.000000           0         7           mprotect
  0.00    0.000000           0         5           fchdir
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         7           fstat64
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_thread_area
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           openat
  0.00    0.000000           0         1           fstatat64
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.079346                 17711         7 total

cron Has No Limit in Linux

cron in Solaris limits the number of cron jobs it can run, as configured in /etc/cron.d/queuedefs.

cron in Ubuntu (or Linux) does not have this limitation. I downloaded the source (apt-get source cron) and could not find any hard limit coded in the program. I also tried launching a long-running job (sleep 36000000) every minute and managed to pump in 1500+ jobs before my little netbook (with 1GB RAM) started swapping.

Although the hardware is the only limit for cron, it would be nice to monitor how many cron jobs are running and how long each one takes. According to 'man cron', you can pass the '-L' flag to log more information about each job:

-L loglevel
               Tell  cron what to log about jobs (errors are logged regardless
               of this value) as the sum of the following values:

                   1      will log the start of all cron jobs

                   2      will log the end of all cron jobs

                   4      will log all failed jobs (exit status != 0)

                   8      will log the process number of all cron jobs
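
Because the log level is a sum of powers of two, it behaves like a bitmask. The Python 3 sketch below decodes a level back into its parts; the flag values and descriptions come from the man page excerpt, while the names LOG_FLAGS and describe are invented for this example.

```python
# Flag values as listed in the man page excerpt.
LOG_FLAGS = {
    1: "start of all cron jobs",
    2: "end of all cron jobs",
    4: "failed jobs (exit status != 0)",
    8: "process number of all cron jobs",
}

def describe(loglevel):
    """Return the descriptions of the flags that make up `loglevel`."""
    return [text for bit, text in sorted(LOG_FLAGS.items()) if loglevel & bit]

# Logging start, end and process number corresponds to -L 11 (1 + 2 + 8).
level = 1 + 2 + 8
print(level, describe(level))
```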

For troubleshooting cron, this link (Reasons Why Crontab Does Not Work) is really useful.

Saturday, October 20, 2012

RTFM - Read The Fine Manual

After I resolved my user's issue regarding the crontab file format, I asked him to refer to the man page of crontab (man -s 5 crontab).

My colleague thought that I was being sarcastic :-). I may be cheeky sometimes, but not this time.

The truth is that the default man page for crontab is in section 1, which covers user commands (see "man Intro" or "man -s 1 Intro").

If you read the "See Also" section of "man crontab", it refers you to crontab(5), which is the section 5 man page for crontab:

CRONTAB(1)                                                          CRONTAB(1)

NAME
       crontab - maintain crontab files for individual users (Vixie Cron)

SYNOPSIS
       crontab [ -u user ] file
       crontab [ -u user ] [ -i ] { -e | -l | -r }

DESCRIPTION
       crontab  is  the  program used to install, deinstall or list the tables
       used to drive the cron(8) daemon in Vixie Cron.   Each  user  can  have
       their    own    crontab,    and    though    these    are    files   in
       /var/spool/cron/crontabs, they are not intended to be edited directly.

...
SEE ALSO
       crontab(5), cron(8)

FILES
       /etc/cron.allow
       /etc/cron.deny
       /var/spool/cron/crontabs
...

Only section 5 of the man page gives you the details of the file format. So run "man -s 5 crontab":

CRONTAB(5)                                                          CRONTAB(5)

NAME
       crontab - tables for driving cron

DESCRIPTION
       A  crontab file contains instructions to the cron(8) daemon of the gen‐
       eral form: ``run this command at this time on this date''.   Each  user
       has  their  own crontab, and commands in any given crontab will be exe‐
       cuted as the user who owns the crontab.  Uucp  and  News  will  usually
       have  their  own  crontabs, eliminating the need for explicitly running
       su(1) as part of a cron command.
...
       Commands are executed by cron(8) when the minute, hour,  and  month  of
       year  fields  match  the current time, and when at least one of the two
       day fields (day of month, or day of week) match the current  time  (see
       ``Note'' below).  cron(8) examines cron entries once every minute.  The
       time and date fields are:

              field          allowed values
              -----          --------------
              minute         0-59
              hour           0-23
              day of month   1-31
              month          1-12 (or names, see below)
              day of week    0-7 (0 or 7 is Sun, or use names)

       A field may be an asterisk (*), which always stands for ``first-last''.
...
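
To make the field layout concrete, here is a deliberately simplified Python 3 sketch that splits a crontab entry into its five time fields and the command, and checks plain numbers against the ranges in the table. It is not how cron itself parses entries: names, ranges, lists and steps are left out, and both parse_entry and the sample entry are hypothetical.

```python
# Allowed numeric ranges from the crontab(5) table.
FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 7),
]

def parse_entry(line):
    """Split a crontab line into its five time fields and the command.

    Simplification: only '*' or a single number is accepted per field."""
    parts = line.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected 5 time fields and a command")
    entry = {}
    for (name, low, high), value in zip(FIELDS, parts[:5]):
        if value != "*" and not (value.isdigit() and low <= int(value) <= high):
            raise ValueError("bad %s field: %r" % (name, value))
        entry[name] = value
    entry["command"] = parts[5]
    return entry

# Hypothetical entry: run a backup script at 02:30 every Sunday.
print(parse_entry("30 2 * * 0 /usr/local/bin/backup.sh --full"))
```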

For more details on each section, simply run "man -s SECTION# Intro". Here is an overview of all the sections on my Ubuntu:

  • Section 1 - user commands
  • Section 2 - system calls
  • Section 3 - library functions
  • Section 4 - special files
  • Section 5 - file formats
  • Section 6 - games
  • Section 7 - overview, conventions, and miscellany section
  • Section 8 - administration and privileged commands

Sunday, September 30, 2012

Disk Usage Summary per User and Time, take 2

I wanted to compare awk (gawk) with Python, so I coded the same thing in gawk. Here is the code:
#! /bin/bash
#
# count user file size by block
#


if [ $# -ne 1 ]; then
    echo "Usage: $0 <directory>"
    exit 1
fi


if [ ! -d "$1" ]; then
    echo "Error. $1 does not exist"
    exit 1
fi


now=$(date +%s)
ls -lRs --time-style=+%s "$1" | awk -v now="$now" '

function print_header() {
    printf("%-15s %8s %8s %8s %8s %8s %8s %8s %8s\n", 
        "User", "0m-1m", "1m-3m", "3m-6m", "6m-1y",
        "1y-2y", "2y-3y", "3y-  ", "Total")
}

function print_line() {
    d8="--------"
    d15="---------------"
    printf("%-15s %8s %8s %8s %8s %8s %8s %8s %8s\n", d15, d8, d8, d8, d8, d8, d8, d8, d8)
}

function print_footer() {
    printf("\nNote: Size in GB\n")
}

BEGIN {
    print_header()
    print_line()

    factor=1024.0*1024.0

    y0m0=0
    y0m1=1*30*86400
    y0m3=3*30*86400
    y0m6=6*30*86400
    y1m0=1*365*86400
    y2m0=2*365*86400
    y3m0=3*365*86400
    yxm0=100*365*86400
}

# match directory, link and file
$2 ~ /^[dl-]/ {
    block=$1
    user= $4
    epoch=$7

    users[user]=1

    dt=now-epoch

    if ( y0m0<=dt && dt<y0m1 ) { cnt=1 }
    if ( y0m1<=dt && dt<y0m3 ) { cnt=2 }
    if ( y0m3<=dt && dt<y0m6 ) { cnt=3 }
    if ( y0m6<=dt && dt<y1m0 ) { cnt=4 }
    if ( y1m0<=dt && dt<y2m0 ) { cnt=5 }
    if ( y2m0<=dt && dt<y3m0 ) { cnt=6 }
    if ( y3m0<=dt && dt<yxm0 ) { cnt=7 }

    summary[user,cnt]+=block
    total_time[cnt]+=block
    total_user[user]+=block
}
END {
    # sort user name using asorti (gawk)
    n=asorti(users, users_sorted)
    for(i=1;i<=n;++i) {
        user=users_sorted[i]
        printf("%-15s", user)
        for(cnt=1;cnt<=7;++cnt) {
            if ( summary[user,cnt] == "" ) {
                summary[user,cnt]=0.0
            }
            printf(" %8.2f", summary[user,cnt]/factor)
        }

        # print per user total
        printf(" %8.2f\n", total_user[user]/factor)
    }

    print_line()

    # print total per time
    total=0.0
    printf("%15s", "Total:")
    for(cnt=1;cnt<=7;++cnt) {
        if ( total_time[cnt] == "" ) {
            total_time[cnt]=0.0
        }
        total+=total_time[cnt]
        printf(" %8.2f", total_time[cnt]/factor)
    }
    printf(" %8.2f\n", total/factor)

    print_footer()
}
'


With 814MB and 10,208 files in /var, the Python solution took 1.17s and gawk took 0.95s. I have yet to find out how the two compare for millions of files.
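
The age bucketing is the heart of the script, so here is a compact Python 3 sketch of just that step. It uses the same boundaries (30-day months, 365-day years) but looks the bucket up with bisect instead of a chain of if statements; unlike the script, it has no 100-year upper bound, and the names BOUNDS, LABELS and bucket are made up for this example.

```python
from bisect import bisect_right

DAY = 86400
# Upper bounds of the first six buckets, in seconds, as in the script.
BOUNDS = [30 * DAY, 90 * DAY, 180 * DAY, 365 * DAY, 2 * 365 * DAY, 3 * 365 * DAY]
LABELS = ["0m-1m", "1m-3m", "3m-6m", "6m-1y", "1y-2y", "2y-3y", "3y-"]

def bucket(age_seconds):
    """Return the label of the age bucket that `age_seconds` falls into."""
    if age_seconds < 0:
        raise ValueError("file time is in the future")
    return LABELS[bisect_right(BOUNDS, age_seconds)]

print(bucket(10 * DAY))     # 0m-1m
print(bucket(30 * DAY))     # 1m-3m
print(bucket(400 * DAY))    # 1y-2y
print(bucket(2000 * DAY))   # 3y-
```

An age exactly on a boundary goes into the older bucket, matching the "lower <= dt < upper" tests in the script.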

Saturday, September 29, 2012

Disk Usage Summary per User and Time

If you are a system administrator, you often face a disk utilisation dilemma. On one hand, you need to clean up old and unwanted files. On the other hand, you cannot do so because they are owned by other users and you need their permission.

The script below summarises users' disk utilisation by file age. Hopefully this can help users determine when they need to do housekeeping.

# cat b.py
#! /usr/bin/python

import fileinput, time, re


fmt="%-15s %8s %8s %8s %8s %8s %8s %8s %8s"
def print_line():
    print fmt % ('-'*15,'-'*8,'-'*8,'-'*8,'-'*8,'-'*8,'-'*8,'-'*8, '-'*8)
def print_header():
    print fmt % ('User','0-1m','1m-3m','3m-6m','6m-1y','1y-2y','2y-3y','3y-  ', 'Total')
def print_footer():
    print '\nNote: Size in GB'


# match directory, link, file
p=re.compile("^[ ]*[1-9][0-9]* [dl-]")


now=int(time.time())


y0m0=0
y0m1=1*30*86400
y0m3=3*30*86400
y0m6=6*30*86400
y1m0=1*365*86400
y2m0=2*365*86400
y3m0=3*365*86400
yxm0=100*365*86400
tranges=(
    [y0m0, y0m1],
    [y0m1, y0m3],
    [y0m3, y0m6],
    [y0m6, y1m0],
    [y1m0, y2m0],
    [y2m0, y3m0],
    [y3m0, yxm0]
)


users=set()
summary=dict()
total_time=dict()
total_user=dict()


for line in fileinput.input():

    if p.match(line):

        (block, perm, link, user, group, size, epoch, others)=line.split(None,7)
        iblock=int(block)
        dt=now-int(epoch)
        users.add(user)

        # summary per user+duration
        cnt=0
        for (t1,t2) in tranges:
            if t1<=dt and dt<t2:
                key=(user,cnt)
                if key in summary:
                    summary[key]+=iblock
                else:
                    summary[key]=iblock

                # total per duration
                if cnt in total_time:
                    total_time[cnt]+=iblock
                else:
                    total_time[cnt]=iblock

            cnt+=1

        # total per user
        if user in total_user:
            total_user[user]+=iblock
        else:
            total_user[user]=iblock


allusers=list(users)
allusers.sort()
factor=1024.0*1024.0


print_header()
print_line()


for user in allusers:
    print "%-15s" % user,
    for cnt in range(len(tranges)):
        key=(user,cnt)
        if key in summary:
            gb=summary[key]/factor
        else:
            gb=0.0
        print "%8.2lf" % gb,

    # user total
    gb=total_user[user]/factor
    print "%8.2lf" % gb


print_line()


print "%15s" % "Total:",
total=0.0
for cnt in range(len(tranges)):
    if cnt in total_time:
        gb=total_time[cnt]/factor
    else:
        gb=0.0
    print "%8.2lf" % gb,
    total+=gb
print "%8.2lf" % total


print_footer()


Here is a sample output from my 16GB SSD netbook:

# ls -lRs --time-style=+%s /var | ./b.py
User                0-1m    1m-3m    3m-6m    6m-1y    1y-2y    2y-3y    3y-      Total
--------------- -------- -------- -------- -------- -------- -------- -------- --------
avahi-autoipd       0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
chihung             0.00     0.00     0.00     0.41     0.00     0.00     0.00     0.41
colord              0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
daemon              0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
libuuid             0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
lightdm             0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
lp                  0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
man                 0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
mysql               0.03     0.00     0.00     0.00     0.00     0.00     0.00     0.03
root                0.21     0.07     0.04     0.02     0.00     0.00     0.00     0.35
speech-dispatcher     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
syslog              0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
www-data            0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
--------------- -------- -------- -------- -------- -------- -------- -------- --------
         Total:     0.24     0.07     0.04     0.43     0.00     0.00     0.00     0.79

Note: Size in GB

ssh runs only once inside a loop

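If you need to run ssh inside a loop, particularly one that reads its host list from stdin (such as a "while read" loop), pass the "-n" flag so that ssh takes its stdin from /dev/null. Otherwise ssh consumes the rest of the loop's input and the loop stops after the first run.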

for h in host1 host2 host3 host4
do
    ssh -n user@$h "/run/something"
done

The man page on my Ubuntu says:
     -n      Redirects stdin from /dev/null (actually, prevents reading from
             stdin).  This must be used when ssh is run in the background.  A
             common trick is to use this to run X11 programs on a remote
             machine.  For example, ssh -n shadows.cs.hut.fi emacs & will
             start an emacs on shadows.cs.hut.fi, and the X11 connection will
             be automatically forwarded over an encrypted channel.  The ssh
             program will be put in the background.  (This does not work if
             ssh needs to ask for a password or passphrase; see also the -f
             option.)
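
The effect is easiest to see in a loop that reads host names from stdin. The Python 3 sketch below drives such a shell loop through subprocess, with cat standing in for ssh (an assumption made so that it runs without any remote host: like ssh without -n, cat reads whatever is left on stdin). The helper name run_loop is made up.

```python
import subprocess

def run_loop(child):
    """Run a `while read` loop over three host names, executing `child`
    in the loop body, and return the host names the loop actually visited."""
    script = r"""
printf 'host1\nhost2\nhost3\n' | while read h; do
    echo "$h"
    %s
done
""" % child
    out = subprocess.run(["sh", "-c", script], capture_output=True, text=True)
    return out.stdout.splitlines()

# Stand-in for plain ssh: the child swallows the remaining host names.
print(run_loop("cat > /dev/null"))

# Stand-in for ssh -n: the child's stdin is /dev/null, so the loop keeps its input.
print(run_loop("cat < /dev/null > /dev/null"))
```

In the first run the child drains the pipe, so the next read hits end of file and the loop ends after host1; in the second run all three hosts are visited.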

Tuesday, August 21, 2012

Remove Hundreds of Thousands of Files

If you need to remove tonnes of files in a directory, you will likely hit the "argument list too long" error when you try "rm -f *.log.*". This happens because your shell expands the wildcard into actual filenames and the resulting argument list exceeds ARG_MAX. In Linux, run "getconf ARG_MAX" to find out the limit. My Ubuntu showed 2097152 as my ARG_MAX.
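
The same limit can be read from Python 3 with os.sysconf, and a rough size check shows how quickly a long file list outgrows it. The fits helper and the generated file names are hypothetical, and the estimate is deliberately crude: it ignores the environment and per-argument pointer overhead, which also count toward the limit.

```python
import os

# Same value that "getconf ARG_MAX" prints.
arg_max = os.sysconf("SC_ARG_MAX")
print("ARG_MAX:", arg_max)

def fits(names, limit=arg_max):
    """Rough check of whether `names` would fit on one command line.

    Counts each name plus its terminating NUL byte only."""
    return sum(len(os.fsencode(n)) + 1 for n in names) < limit

# Hypothetical file names, 200,000 of them (15 bytes each including the NUL).
names = ["app.log.%06d" % i for i in range(200000)]
print("fits in one rm command:", fits(names))
```

With a limit of 2097152 bytes, those 200,000 names need about 3,000,000 bytes and do not fit.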

If you are in this situation, you are better off using a high level scripting language such as Python. With Python, you do not have to 'exec' 'rm' for every file.

Here is my script to do this task efficiently:

#! /usr/bin/python


import os,sys,glob


nargv=len(sys.argv)
if nargv==2:
    pattern='*%s*' % sys.argv[1]
    basedir=os.getcwd()
elif nargv==3:
    pattern='*%s*' % sys.argv[1]
    basedir=sys.argv[2]
else:
    print "Usage: %s pattern [directory]" % sys.argv[0]
    print "       eg. %s .log - to remove *.log* in current directory" % sys.argv[0]
    print "       eg. %s 201207 /var/log/app - to remove *201207* in /var/log/app directory" % sys.argv[0]
    print ""
    sys.exit(1)

os.chdir(basedir)
for f in glob.glob(pattern):
    os.remove(f)

Friday, May 04, 2012

Let's GO Further

If you are interested in building concurrent software, watch this

Here are some of the links:

Thursday, April 19, 2012

Oracle v. Google

Moabcon 2012

Interested in HPC and Cloud? Here are the videos from Moabcon 2012. Thanks to insideHPC.com


Monday, March 12, 2012

OCBC Cycling Analysis

After participating in the recent OCBC Cycling event, I was eager to analyse the results. Here is my first attempt.

These are my findings

  • The race flagged off at 6:30am, which is the official start time for the 39km race.
  • The first 6 batches of cyclists had staggered starts, followed by the rest of the participants.
  • The start cut-off was at around 7:10am, most likely to prepare for the next race, which was supposed to start at 7:30am.
  • A very small group of cyclists started after 7:30am. I suppose these were latecomers who missed the cut-off time and managed to sneak into the next race.
  • The majority of the professional cyclists registered early (assuming the smaller the race number, the earlier the registration).
  • Beginner or novice cyclists (like myself) tend to register late.

gnuplot command file:

set title "OCBC Cycling Singapore 2012\n39km The Challenge"
set terminal png size 900,500
set output 'data39.png'
set ylabel 'Race No.'
set key box
set xdata time
set format x '%H:%M'
set timefmt '%H:%M:%S'
set xtics 900 rotate by -90 scale 0 font ",8"
set xrange ['06:15:00':]
set grid
plot \
'data39.txt' using 2:1 with points title 'Start time', \
'data39.txt' using 3:1 with points title 'End time', \
'mytime.txt' using 2:1 with lines title 'My time'

$ cat mytime.txt 
37807 05:30:00
37807 10:00:00

30000 06:47:27
55000 06:47:27

30000 08:24:14
55000 08:24:14
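
The layout of mytime.txt works because gnuplot breaks a "with lines" plot at every blank line: the first block draws a horizontal line at one race number, and the other two blocks draw vertical lines at two clock times. Here is a minimal Python sketch that writes the same layout. The function name marker_blocks and its parameters are my own, and I am assuming the three blocks stand for race number, start time and finish time.

```python
def marker_blocks(race_no, start, finish,
                  x_span=("05:30:00", "10:00:00"), y_span=(30000, 55000)):
    """Return mytime.txt-style text: three 2-point blocks separated by blank lines."""
    blocks = [
        # horizontal line: same race number, two different times
        [(race_no, x_span[0]), (race_no, x_span[1])],
        # vertical line at the start time: same time, two different race numbers
        [(y_span[0], start), (y_span[1], start)],
        # vertical line at the finish time
        [(y_span[0], finish), (y_span[1], finish)],
    ]
    text = "\n\n".join(
        "\n".join(f"{y} {t}" for y, t in block) for block in blocks
    )
    return text + "\n"


print(marker_blocks(37807, "06:47:27", "08:24:14"), end="")
```

Redirect the output to mytime.txt and the plot command above picks it up with "using 2:1".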


Sunday, March 04, 2012

Participated in OCBC Cycling

My son and I took part in this year's OCBC Cycling Singapore 2012. We had lots of fun cycling the 39km challenge race. I tracked my race using My Tracks for Android and here is the map.

These are our results


Saturday, February 18, 2012

Visualising COE (Certificate of Entitlement)

I managed to web scrape all the COE (Certificate of Entitlement) data from the LTA site at http://www.lta.gov.sg/content/lta/en/corporate/corp_info/index_corp_press.html.
With gnuplot's multiplot feature, I am able to combine the COE quota premium ($) and quota bidding in a single plot. Also, I am able to put Categories A, B and E together in an animated GIF file. Here is the animation:









It is clear that far fewer COE quotas were given out over the past 2 years, and therefore the COE price keeps going up. Anyway, my objective is visualisation rather than analysing the COE trend. Here is my gnuplot code:
set terminal png size 800,480 font "Arial,10"
set xdata time
set timefmt "%Y-%m-%d"
set format y '%6.0f'
set xrange [:"2012-08-01"]
set xtic offset 4


# -- 1st plot
set output 'coe-a.png'
set multiplot
set title "Singapore C.O.E. (Category A <1600cc), last update 2012-07-19"
set ylabel "Quota Premium ($)"
set size 1,0.65
set origin 0,0.35
set bmargin 0
unset key
set format x ''
set grid
set yrange [0:100000]
set ytics 10000
plot 'coe.txt' using 1:3 with lines

unset title
set key right box
set bmargin
set size 1,0.3
set origin 0,0
set tmargin 0
set ylabel 'Quota'
set format x '%Y'
set yrange [0:6000]
set ytics 1000
plot \
'coe.txt' using 1:4 with filledcurves x1 linetype 1 title 'Total Bids', \
'coe.txt' using 1:2 with filledcurves x1 linetype 2 title 'Quota', \
'coe.txt' using 1:4 with lines linetype -1 notitle, \
'coe.txt' using 1:2 with lines linetype -1 notitle


reset 

unset multiplot
set output 'coe-b.png'
set xdata time
set timefmt "%Y-%m-%d"
set format y '%6.0f'
set xrange [:"2012-08-01"]
set xtic offset 4

# -- 2nd plot
set multiplot
set title "Singapore C.O.E. (Category B >1600cc), last update 2012-07-19"
set ylabel "Quota Premium ($)"
set size 1,0.65
set origin 0,0.35
set bmargin 0
unset key
set format x ''
set grid
set yrange [0:100000]
set ytics 10000
plot 'coe.txt' using 1:8 with lines


unset title
set key right box
set bmargin
set size 1,0.3
set origin 0,0
set tmargin 0
set ylabel 'Quota'
set format x '%Y'
set yrange [0:6000]
set ytics 1000
plot \
'coe.txt' using 1:9 with filledcurves x1 linetype 1 title 'Total Bids', \
'coe.txt' using 1:7 with filledcurves x1 linetype 2 title 'Quota', \
'coe.txt' using 1:9 with lines linetype -1 notitle, \
'coe.txt' using 1:7 with lines linetype -1 notitle


reset

unset multiplot
set xdata time
set output 'coe-e.png'
set timefmt "%Y-%m-%d"
set format y '%6.0f'
set xrange [:"2012-08-01"]
set xtic offset 4

# -- 3rd plot
set multiplot
set title "Singapore C.O.E. (Category E -  Open), last update 2012-07-19"
set ylabel "Quota Premium ($)"
set size 1,0.65
set origin 0,0.35
set bmargin 0
unset key
set format x ''
set grid
set yrange [0:100000]
set ytics 10000
plot 'coe.txt' using 1:23 with lines


unset title
set key right box
set bmargin
set size 1,0.3
set origin 0,0
set tmargin 0
set ylabel 'Quota'
set format x '%Y'
set yrange [0:6000]
set ytics 1000
plot \
'coe.txt' using 1:24 with filledcurves x1 linetype 1 title 'Total Bids', \
'coe.txt' using 1:22 with filledcurves x1 linetype 2 title 'Quota', \
'coe.txt' using 1:24 with lines linetype -1 notitle, \
'coe.txt' using 1:22 with lines linetype -1 notitle
Interested in the data? It is in the source. :-)
FYI, I will update the graph regularly to reflect the COE trend.
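
The three plots above differ only in the output file name, the title and the three column numbers read from coe.txt (quota, premium, total bids), so the gnuplot file can also be generated. Here is a minimal Python sketch of that idea. The names CATEGORIES, TEMPLATE and coe_script are my own, and I moved "unset multiplot" to the end of each section, which I assume is equivalent for the output.

```python
# (file suffix, title text, quota column, premium column, total-bids column)
CATEGORIES = [
    ("a", "Category A <1600cc", 2, 3, 4),
    ("b", "Category B >1600cc", 7, 8, 9),
    ("e", "Category E -  Open", 22, 23, 24),
]

# Raw string so that gnuplot's trailing backslashes survive unchanged.
TEMPLATE = r"""reset
set terminal png size 800,480 font "Arial,10"
set output 'coe-{name}.png'
set xdata time
set timefmt "%Y-%m-%d"
set format y '%6.0f'
set xrange [:"2012-08-01"]
set xtic offset 4
set multiplot
set title "Singapore C.O.E. ({title}), last update {updated}"
set ylabel "Quota Premium ($)"
set size 1,0.65
set origin 0,0.35
set bmargin 0
unset key
set format x ''
set grid
set yrange [0:100000]
set ytics 10000
plot 'coe.txt' using 1:{premium} with lines
unset title
set key right box
set bmargin
set size 1,0.3
set origin 0,0
set tmargin 0
set ylabel 'Quota'
set format x '%Y'
set yrange [0:6000]
set ytics 1000
plot \
'coe.txt' using 1:{bids} with filledcurves x1 linetype 1 title 'Total Bids', \
'coe.txt' using 1:{quota} with filledcurves x1 linetype 2 title 'Quota', \
'coe.txt' using 1:{bids} with lines linetype -1 notitle, \
'coe.txt' using 1:{quota} with lines linetype -1 notitle
unset multiplot
"""


def coe_script(updated="2012-07-19"):
    """Return one gnuplot section per COE category, joined into a single script."""
    return "\n".join(
        TEMPLATE.format(name=name, title=title, quota=quota,
                        premium=premium, bids=bids, updated=updated)
        for name, title, quota, premium, bids in CATEGORIES
    )


if __name__ == "__main__":
    print(coe_script())
```

Adding another category then becomes a one-line change to the CATEGORIES list.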


Wednesday, February 15, 2012

Visualising Age Distribution of Motor Vehicles

Today's newspaper has an ad for our Government web sites, and one of them caught my attention. That's data.gov.sg. The data from LTA on Age Distribution of Motor Vehicles is the one that I want to visualise using gnuplot.

The animated image clearly shows that more new cars were introduced in 2006. After that, the new car business gradually dropped to a record low in 2011. This definitely tallies with the COE (Certificate of Entitlement) price and quota.

I am still trying to get hold of the COE data so that I can combine all this information in a single plot. It would be really nice if the data from data.gov.sg were given in its raw format.

Here is my gnuplot code to generate the animated gif file

set terminal gif size 800,480 animate delay 100
set output 'car.gif'
set auto x
set yrange [0:120000]
set style data histogram
set style histogram cluster gap 3
set style fill solid border -1
set boxwidth 0.9
set xtic rotate by -90 scale 0 font ",8"
set key box
set grid
set title 'Age Distribution of Motor Vehicle - 2000'
plot '2000' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2001'
plot '2001' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2002'
plot '2002' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2003'
plot '2003' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2004'
plot '2004' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2005'
plot '2005' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2006'
plot '2006' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2007'
plot '2007' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2008'
plot '2008' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2009'
plot '2009' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2010'
plot '2010' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
#
set title 'Age Distribution of Motor Vehicle - 2011'
plot '2011' using 2:xtic(1) title 'Cars', '' u 4 title 'Motorcycles', '' u 6 title 'Buses', '' u 8 title 'Goods & Other Vehicles'
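
The twelve title/plot pairs above are identical apart from the year, so they can be generated rather than typed. Here is a minimal Python sketch; the function names frame and frames are my own. It only prints the per-year part, and the "set terminal" and style lines at the top still have to come first.

```python
def frame(year):
    """Return the two gnuplot lines that draw one frame of the animation."""
    return (
        f"set title 'Age Distribution of Motor Vehicle - {year}'\n"
        f"plot '{year}' using 2:xtic(1) title 'Cars', "
        "'' u 4 title 'Motorcycles', "
        "'' u 6 title 'Buses', "
        "'' u 8 title 'Goods & Other Vehicles'\n"
    )


def frames(first=2000, last=2011):
    """Return all frames, separated by '#' comment lines as in the listing above."""
    return "#\n".join(frame(year) for year in range(first, last + 1))


print(frames(), end="")
```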

Here is all the data

$ for i in 20*
do
echo ==$i==
cat $i
echo
done
==2000==
0-<1 58097 (14.8%) 11620 (8.9%) 1347 (11.0%) 22706 (18.2%)
1-<2 38441 (9.8%) 9887 (7.5%) 887 (7.2%) 11211 (9.0%)
2-<3 27856 (7.1%) 8916 (6.8%) 807 (6.6%) 9293 (7.4%)
3-<4 24160 (6.1%) 7210 (5.5%) 1019 (8.3%) 8459 (6.8%)
4-<5 28211 (7.2%) 6581 (5.0%) 731 (5.9%) 7725 (6.2%)
5-<6 29790 (7.6%) 6044 (4.6%) 1164 (9.5%) 7539 (6.0%)
6-<7 29543 (7.5%) 9011 (6.9%) 978 (8.0%) 6149 (4.9%)
7-<8 38186 (9.7%) 6777 (5.2%) 949 (7.7%) 7342 (5.9%)
8-<9 28030 (7.1%) 5083 (3.9%) 875 (7.1%) 7460 (6.0%)
9-<10 25588 (6.5%) 3601 (2.7%) 831 (6.8%) 6549 (5.2%)
10-<11 8378 (2.1%) 2985 (2.3%) 768 (6.2%) 5927 (4.7%)
11-<12 8675 (2.2%) 4320 (3.3%) 603 (4.9%) 6617 (5.3%)
12-<13 2634 (0.7%) 3843 (2.9%) 395 (3.2%) 3477 (2.8%)
13-<14 2287 (0.6%) 3097 (2.4%) 329 (2.7%) 1601 (1.3%)
14-<15 2151 (0.5%) 2917 (2.2%) 180 (1.5%) 1116 (0.9%)
15-<16 1895 (0.5%) 4901 (3.7%) 156 (1.3%) 1652 (1.3%)
16-<17 4161 (1.1%) 8388 (6.4%) 84 (0.7%) 3035 (2.4%)
17-<18 6681 (1.7%) 8678 (6.6%) 78 (0.6%) 3247 (2.6%)
18-<19 9235 (2.4%) 7654 (5.8%) 82 (0.7%) 2520 (2.0%)
19-<20 3781 (1.0%) 4539 (3.5%) 37 (0.3%) 1020 (0.8%)
20- 15181 (3.9%) 4912 (3.8%) 0 (0.0%) 209 (0.2%)

==2001==
0-<1 67134 (16.6%) 13980 (10.7%) 861 (6.8%) 13895 (10.9%)
1-<2 58000 (14.3%) 11496 (8.8%) 1348 (10.7%) 22696 (17.8%)
2-<3 38210 (9.4%) 9752 (7.4%) 887 (7.0%) 11163 (8.8%)
3-<4 27614 (6.8%) 8757 (6.7%) 807 (6.4%) 9259 (7.3%)
4-<5 19420 (4.8%) 6908 (5.3%) 1015 (8.0%) 8044 (6.3%)
5-<6 25157 (6.2%) 6078 (4.6%) 728 (5.8%) 7519 (5.9%)
6-<7 25574 (6.3%) 5357 (4.1%) 1157 (9.2%) 7291 (5.7%)
7-<8 23843 (5.9%) 8348 (6.4%) 967 (7.7%) 5893 (4.6%)
8-<9 34102 (8.4%) 6121 (4.7%) 938 (7.4%) 6975 (5.5%)
9-<10 24297 (6.0%) 4245 (3.2%) 848 (6.7%) 7061 (5.5%)
10-<11 14480 (3.6%) 2747 (2.1%) 751 (5.9%) 5203 (4.1%)
11-<12 8238 (2.0%) 2819 (2.2%) 727 (5.8%) 5109 (4.0%)
12-<13 8521 (2.1%) 4035 (3.1%) 571 (4.5%) 5598 (4.4%)
13-<14 2556 (0.6%) 3371 (2.6%) 379 (3.0%) 2769 (2.2%)
14-<15 2135 (0.5%) 2482 (1.9%) 296 (2.3%) 1097 (0.9%)
15-<16 1956 (0.5%) 2230 (1.7%) 54 (0.4%) 512 (0.4%)
16-<17 1791 (0.4%) 4271 (3.3%) 129 (1.0%) 1199 (0.9%)
17-<18 3916 (1.0%) 7339 (5.6%) 66 (0.5%) 2149 (1.7%)
18-<19 5694 (1.4%) 7355 (5.6%) 51 (0.4%) 2222 (1.7%)
19-<20 5450 (1.3%) 6085 (4.6%) 44 (0.3%) 1386 (1.1%)
20- 7266 (1.8%) 7134 (5.4%) 0 (0.0%) 233 (0.2%)

==2002==
0-<1 62935 (15.6%) 17078 (13.0%) 648 (5.1%) 10317 (8.2%)
1-<2 67066 (16.6%) 13870 (10.6%) 859 (6.8%) 13892 (11.0%)
2-<3 57110 (14.1%) 11322 (8.6%) 1346 (10.6%) 22611 (18.0%)
3-<4 36747 (9.1%) 9562 (7.3%) 885 (7.0%) 11023 (8.8%)
4-<5 26719 (6.6%) 8509 (6.5%) 804 (6.3%) 9197 (7.3%)
5-<6 13305 (3.3%) 6441 (4.9%) 996 (7.8%) 7363 (5.8%)
6-<7 20309 (5.0%) 5442 (4.1%) 718 (5.7%) 7249 (5.8%)
7-<8 18122 (4.5%) 4571 (3.5%) 1148 (9.0%) 6858 (5.4%)
8-<9 15608 (3.9%) 7356 (5.6%) 946 (7.4%) 5446 (4.3%)
9-<10 21537 (5.3%) 5129 (3.9%) 907 (7.1%) 6338 (5.0%)
10-<11 13222 (3.3%) 2984 (2.3%) 747 (5.9%) 4674 (3.7%)
11-<12 14418 (3.6%) 2573 (2.0%) 734 (5.8%) 4916 (3.9%)
12-<13 7913 (2.0%) 2595 (2.0%) 679 (5.3%) 4377 (3.5%)
13-<14 8230 (2.0%) 3639 (2.8%) 523 (4.1%) 4713 (3.7%)
14-<15 2411 (0.6%) 2735 (2.1%) 320 (2.5%) 2176 (1.7%)
15-<16 1821 (0.5%) 1639 (1.2%) 226 (1.8%) 449 (0.4%)
16-<17 1870 (0.5%) 1900 (1.4%) 51 (0.4%) 396 (0.3%)
17-<18 1654 (0.4%) 3586 (2.7%) 102 (0.8%) 906 (0.7%)
18-<19 3357 (0.8%) 5857 (4.5%) 47 (0.4%) 1533 (1.2%)
19-<20 3096 (0.8%) 5415 (4.1%) 21 (0.2%) 1260 (1.0%)
20- 6824 (1.7%) 9234 (7.0%) 0 (0.0%) 237 (0.2%)

==2003==
0-<1 81244 (20.0%) 14926 (11.1%) 699 (5.5%) 13742 (11.0%)
1-<2 62827 (15.5%) 17009 (12.6%) 648 (5.1%) 10273 (8.2%)
2-<3 66234 (16.3%) 13682 (10.2%) 857 (6.8%) 13770 (11.0%)
3-<4 47358 (11.7%) 11052 (8.2%) 1336 (10.6%) 22139 (17.7%)
4-<5 27250 (6.7%) 9291 (6.9%) 868 (6.9%) 10236 (8.2%)
5-<6 22390 (5.5%) 8189 (6.1%) 797 (6.3%) 8978 (7.2%)
6-<7 8327 (2.1%) 5861 (4.3%) 975 (7.7%) 6129 (4.9%)
7-<8 12810 (3.2%) 4811 (3.6%) 703 (5.6%) 6407 (5.1%)
8-<9 10545 (2.6%) 3871 (2.9%) 1072 (8.5%) 5971 (4.8%)
9-<10 6747 (1.7%) 6312 (4.7%) 923 (7.3%) 4746 (3.8%)
10-<11 4207 (1.0%) 4280 (3.2%) 842 (6.7%) 5261 (4.2%)
11-<12 13164 (3.2%) 2643 (2.0%) 728 (5.8%) 4001 (3.2%)
12-<13 14236 (3.5%) 2275 (1.7%) 707 (5.6%) 4266 (3.4%)
13-<14 6678 (1.6%) 2255 (1.7%) 612 (4.8%) 3321 (2.7%)
14-<15 6834 (1.7%) 3043 (2.3%) 442 (3.5%) 2937 (2.3%)
15-<16 1849 (0.5%) 1808 (1.3%) 256 (2.0%) 611 (0.5%)
16-<17 1638 (0.4%) 1336 (1.0%) 64 (0.5%) 330 (0.3%)
17-<18 1629 (0.4%) 1604 (1.2%) 31 (0.2%) 284 (0.2%)
18-<19 1157 (0.3%) 3039 (2.3%) 71 (0.6%) 620 (0.5%)
19-<20 1079 (0.3%) 4855 (3.6%) 22 (0.2%) 800 (0.6%)
20- 7125 (1.8%) 12625 (9.4%) 0 (0.0%) 201 (0.2%)

==2004==
0-<1 96670 (23.2%) 12046 (8.8%) 684 (5.3%) 14901 (11.8%)
1-<2 81164 (19.5%) 14855 (10.9%) 699 (5.4%) 13736 (10.8%)
2-<3 60289 (14.5%) 16825 (12.4%) 644 (5.0%) 9992 (7.9%)
3-<4 56374 (13.5%) 13448 (9.9%) 853 (6.6%) 13360 (10.5%)
4-<5 27860 (6.7%) 10732 (7.9%) 1316 (10.2%) 20725 (16.4%)
5-<6 13038 (3.1%) 8927 (6.6%) 846 (6.6%) 8897 (7.0%)
6-<7 15284 (3.7%) 7762 (5.7%) 784 (6.1%) 8686 (6.9%)
7-<8 4311 (1.0%) 5272 (3.9%) 947 (7.3%) 5003 (3.9%)
8-<9 6970 (1.7%) 4241 (3.1%) 691 (5.4%) 5662 (4.5%)
9-<10 4538 (1.1%) 3178 (2.3%) 1046 (8.1%) 5459 (4.3%)
10-<11 811 (0.2%) 4843 (3.6%) 890 (6.9%) 4190 (3.3%)
11-<12 4165 (1.0%) 3788 (2.8%) 829 (6.4%) 4969 (3.9%)
12-<13 12979 (3.1%) 2263 (1.7%) 704 (5.5%) 3305 (2.6%)
13-<14 13490 (3.2%) 1969 (1.4%) 688 (5.3%) 3505 (2.8%)
14-<15 4395 (1.1%) 1877 (1.4%) 567 (4.4%) 2256 (1.8%)
15-<16 3856 (0.9%) 2481 (1.8%) 373 (2.9%) 655 (0.5%)
16-<17 1565 (0.4%) 1509 (1.1%) 242 (1.9%) 433 (0.3%)
17-<18 1298 (0.3%) 1096 (0.8%) 26 (0.2%) 224 (0.2%)
18-<19 1248 (0.3%) 1340 (1.0%) 26 (0.2%) 200 (0.2%)
19-<20 753 (0.2%) 2589 (1.9%) 37 (0.3%) 378 (0.3%)
20- 6045 (1.4%) 15081 (11.1%) 0 (0.0%) 173 (0.1%)

==2005==
0-<1 109165 (24.9%) 12122 (8.7%) 776 (5.9%) 14138 (11.0%)
1-<2 96518 (22.0%) 11976 (8.6%) 684 (5.2%) 14898 (11.6%)
2-<3 78754 (18.0%) 14712 (10.6%) 699 (5.3%) 13689 (10.7%)
3-<4 46496 (10.6%) 16599 (12.0%) 639 (4.8%) 9394 (7.3%)
4-<5 34396 (7.8%) 13127 (9.5%) 840 (6.4%) 12379 (9.7%)
5-<6 10562 (2.4%) 10358 (7.5%) 1290 (9.8%) 18348 (14.3%)
6-<7 6644 (1.5%) 8500 (6.1%) 819 (6.2%) 7718 (6.0%)
7-<8 8462 (1.9%) 7326 (5.3%) 767 (5.8%) 8133 (6.3%)
8-<9 2284 (0.5%) 4794 (3.5%) 922 (7.0%) 4245 (3.3%)
9-<10 3250 (0.7%) 3661 (2.6%) 674 (5.1%) 5160 (4.0%)
10-<11 644 (0.1%) 2218 (1.6%) 1006 (7.6%) 4944 (3.9%)
11-<12 787 (0.2%) 4241 (3.1%) 879 (6.6%) 4035 (3.1%)
12-<13 4003 (0.9%) 3271 (2.4%) 812 (6.1%) 4514 (3.5%)
13-<14 12431 (2.8%) 1903 (1.4%) 666 (5.0%) 2595 (2.0%)
14-<15 11822 (2.7%) 1661 (1.2%) 664 (5.0%) 2437 (1.9%)
15-<16 1823 (0.4%) 1483 (1.1%) 504 (3.8%) 690 (0.5%)
16-<17 2859 (0.7%) 2214 (1.6%) 342 (2.6%) 454 (0.4%)
17-<18 1113 (0.3%) 1299 (0.9%) 211 (1.6%) 194 (0.2%)
18-<19 879 (0.2%) 945 (0.7%) 16 (0.1%) 52 (0.0%)
19-<20 844 (0.2%) 1154 (0.8%) 10 (0.1%) 42 (0.0%)
20- 4458 (1.0%) 15024 (10.8%) 0 (0.0%) 134 (0.1%)

==2006==
0-<1 116741 (24.7%) 11456 (8.1%) 985 (7.1%) 13358 (10.1%)
1-<2 109075 (23.1%) 12047 (8.5%) 778 (5.6%) 14133 (10.6%)
2-<3 93240 (19.7%) 11848 (8.4%) 686 (5.0%) 14908 (11.2%)
3-<4 63124 (13.4%) 14511 (10.2%) 701 (5.1%) 13655 (10.3%)
4-<5 26056 (5.5%) 16349 (11.5%) 629 (4.5%) 8451 (6.4%)
5-<6 15655 (3.3%) 12805 (9.0%) 816 (5.9%) 11285 (8.5%)
6-<7 5823 (1.2%) 9934 (7.0%) 1249 (9.0%) 17076 (12.9%)
7-<8 3398 (0.7%) 8052 (5.7%) 789 (5.7%) 6918 (5.2%)
8-<9 4456 (0.9%) 6862 (4.8%) 748 (5.4%) 7732 (5.8%)
9-<10 1174 (0.2%) 4366 (3.1%) 901 (6.5%) 3830 (2.9%)
10-<11 1131 (0.2%) 2953 (2.1%) 640 (4.6%) 4625 (3.5%)
11-<12 634 (0.1%) 1975 (1.4%) 1004 (7.3%) 4806 (3.6%)
12-<13 746 (0.2%) 3696 (2.6%) 865 (6.3%) 3800 (2.9%)
13-<14 3617 (0.8%) 2849 (2.0%) 784 (5.7%) 3969 (3.0%)
14-<15 10967 (2.3%) 1587 (1.1%) 639 (4.6%) 1885 (1.4%)
15-<16 8972 (1.9%) 1318 (0.9%) 625 (4.5%) 1219 (0.9%)
16-<17 1295 (0.3%) 1344 (0.9%) 483 (3.5%) 556 (0.4%)
17-<18 1914 (0.4%) 1985 (1.4%) 312 (2.3%) 343 (0.3%)
18-<19 697 (0.1%) 1135 (0.8%) 193 (1.4%) 134 (0.1%)
19-<20 554 (0.1%) 825 (0.6%) 4 (0.0%) 30 (0.0%)
20- 3039 (0.6%) 13984 (9.9%) 0 (0.0%) 128 (0.1%)

==2007==
0<1 106502 (20.7%) 10343 (7.2%) 775 (5.5%) 10652 (7.7%)
1<2 116656 (22.7%) 11338 (7.9%) 981 (6.9%) 13378 (9.7%)
2<3 108606 (21.1%) 11897 (8.3%) 777 (5.5%) 14204 (10.2%)
3<4 81376 (15.8%) 11704 (8.2%) 687 (4.8%) 14926 (10.8%)
4<5 42069 (8.2%) 14297 (10.0%) 695 (4.9%) 13583 (9.8%)
5<6 12678 (2.5%) 16098 (11.2%) 611 (4.3%) 7833 (5.7%)
6<7 10607 (2.1%) 12396 (8.6%) 798 (5.6%) 10740 (7.7%)
7<8 3638 (0.7%) 9397 (6.5%) 1225 (8.6%) 16386 (11.8%)
8<9 2024 (0.4%) 7418 (5.2%) 768 (5.4%) 6677 (4.8%)
9<10 2288 (0.4%) 6068 (4.2%) 729 (5.1%) 7647 (5.5%)
10-<11 502 (0.1%) 3405 (2.4%) 885 (6.2%) 3724 (2.7%)
11-<12 1125 (0.2%) 2725 (1.9%) 630 (4.4%) 4516 (3.3%)
12-<13 621 (0.1%) 1732 (1.2%) 999 (7.0%) 4632 (3.3%)
13-<14 698 (0.1%) 3142 (2.2%) 856 (6.0%) 3499 (2.5%)
14-<15 3223 (0.6%) 2475 (1.7%) 761 (5.4%) 3341 (2.4%)
15-<16 9311 (1.8%) 1249 (0.9%) 604 (4.3%) 907 (0.7%)
16-<17 6982 (1.4%) 1209 (0.8%) 615 (4.3%) 1021 (0.7%)
17-<18 980 (0.2%) 1224 (0.9%) 458 (3.2%) 463 (0.3%)
18-<19 1380 (0.3%) 1760 (1.2%) 281 (2.0%) 264 (0.2%)
19-<20 445 (0.1%) 992 (0.7%) 57 (0.4%) 94 (0.1%)
20- 2974 (0.6%) 12613 (8.8%) 0 (0.0%) 117 (0.1%)

==2008==
0<1 96945 (17.6%) 10336 (7.1%) 1506 (10.1%) 8630 (6.0%)
1<2 106440 (19.3%) 10212 (7.0%) 778 (5.2%) 10640 (7.4%)
2<3 116471 (21.2%) 11162 (7.7%) 980 (6.5%) 13364 (9.3%)
3<4 102520 (18.6%) 11740 (8.1%) 775 (5.2%) 14192 (9.9%)
4<5 60442 (11.0%) 11508 (7.9%) 686 (4.6%) 14910 (10.4%)
5<6 23981 (4.4%) 14094 (9.7%) 695 (4.6%) 13510 (9.4%)
6<7 8570 (1.6%) 15795 (10.9%) 598 (4.0%) 7393 (5.2%)
7<8 7668 (1.4%) 11928 (8.2%) 783 (5.2%) 10315 (7.2%)
8<9 2474 (0.4%) 8761 (6.0%) 1198 (8.0%) 15908 (11.1%)
9<10 1131 (0.2%) 6491 (4.5%) 747 (5.0%) 6446 (4.5%)
10-<11 594 (0.1%) 4381 (3.0%) 691 (4.6%) 7226 (5.1%)
11-<12 498 (0.1%) 3203 (2.2%) 874 (5.8%) 3711 (2.6%)
12-<13 1113 (0.2%) 2513 (1.7%) 627 (4.2%) 4428 (3.1%)
13-<14 604 (0.1%) 1561 (1.1%) 993 (6.6%) 4494 (3.1%)
14-<15 649 (0.1%) 2737 (1.9%) 837 (5.6%) 3155 (2.2%)
15-<16 2698 (0.5%) 2156 (1.5%) 718 (4.8%) 2303 (1.6%)
16-<17 7810 (1.4%) 1121 (0.8%) 589 (3.9%) 813 (0.6%)
17-<18 5353 (1.0%) 1094 (0.8%) 443 (3.0%) 872 (0.6%)
18-<19 746 (0.1%) 1039 (0.7%) 378 (2.5%) 390 (0.3%)
19-<20 851 (0.2%) 1257 (0.9%) 80 (0.5%) 153 (0.1%)
20- 2897 (0.5%) 12199 (8.4%) 0 (0.0%) 113 (0.1%)

==2009==
0-<1 68464 (11.9%) 8827 (6.0%) 1376 (8.8%) 5552 (3.8%)
1-<2 96927 (16.8%) 10248 (7.0%) 1505 (9.6%) 8624 (6.0%)
2-<3 106281 (18.4%) 10076 (6.9%) 778 (5.0%) 10631 (7.3%)
3-<4 116043 (20.1%) 10982 (7.5%) 978 (6.2%) 13352 (9.2%)
4-<5 93610 (16.2%) 11540 (7.9%) 773 (4.9%) 14172 (9.8%)
5-<6 44002 (7.6%) 11323 (7.7%) 681 (4.3%) 14865 (10.3%)
6-<7 17511 (3.0%) 13800 (9.4%) 687 (4.4%) 13337 (9.2%)
7-<8 5936 (1.0%) 15437 (10.5%) 575 (3.7%) 7057 (4.9%)
8-<9 5465 (0.9%) 11277 (7.7%) 759 (4.8%) 9947 (6.9%)
9-<10 1574 (0.3%) 7901 (5.4%) 1165 (7.4%) 15447 (10.7%)
10-<11 505 (0.1%) 4889 (3.3%) 704 (4.5%) 6136 (4.2%)
11-<12 586 (0.1%) 4118 (2.8%) 684 (4.4%) 7140 (4.9%)
12-<13 488 (0.1%) 2967 (2.0%) 870 (5.6%) 3678 (2.5%)
13-<14 1096 (0.2%) 2329 (1.6%) 618 (3.9%) 4263 (2.9%)
14-<15 577 (0.1%) 1365 (0.9%) 979 (6.3%) 4187 (2.9%)
15-<16 550 (0.1%) 2313 (1.6%) 812 (5.2%) 2548 (1.8%)
16-<17 2377 (0.4%) 1919 (1.3%) 683 (4.4%) 2121 (1.5%)
17-<18 6618 (1.1%) 1021 (0.7%) 515 (3.3%) 686 (0.5%)
18-<19 4297 (0.7%) 985 (0.7%) 372 (2.4%) 658 (0.5%)
19-<20 661 (0.1%) 842 (0.6%) 145 (0.9%) 290 (0.2%)
20- 3420 (0.6%) 12178 (8.3%) 0 (0.0%) 111 (0.1%)

==2010==
0-<1 41407 (7.0%) 8228 (5.6%) 1088 (6.8%) 3905 (2.7%)
1-<2 68503 (11.5%) 8744 (5.9%) 1376 (8.6%) 5547 (3.9%)
2-<3 96887 (16.3%) 10139 (6.9%) 1509 (9.5%) 8623 (6.0%)
3-<4 105917 (17.8%) 9973 (6.8%) 781 (4.9%) 10633 (7.4%)
4-<5 115583 (19.4%) 10846 (7.4%) 976 (6.1%) 13331 (9.3%)
5-<6 88437 (14.9%) 11317 (7.7%) 773 (4.9%) 14154 (9.9%)
6-<7 37564 (6.3%) 11092 (7.5%) 678 (4.3%) 14838 (10.3%)
7-<8 14014 (2.4%) 13529 (9.2%) 673 (4.2%) 13238 (9.2%)
8-<9 4713 (0.8%) 14983 (10.2%) 559 (3.5%) 6879 (4.8%)
9-<10 3790 (0.6%) 10391 (7.1%) 730 (4.6%) 9380 (6.5%)
10-<11 558 (0.1%) 5958 (4.0%) 1063 (6.7%) 12946 (9.0%)
11-<12 501 (0.1%) 4648 (3.2%) 688 (4.3%) 6116 (4.3%)
12-<13 581 (0.1%) 3875 (2.6%) 680 (4.3%) 7105 (4.9%)
13-<14 475 (0.1%) 2779 (1.9%) 862 (5.4%) 3662 (2.5%)
14-<15 1082 (0.2%) 2140 (1.5%) 604 (3.8%) 4068 (2.8%)
15-<16 526 (0.1%) 1174 (0.8%) 959 (6.0%) 3678 (2.6%)
16-<17 533 (0.1%) 2068 (1.4%) 790 (5.0%) 2430 (1.7%)
17-<18 2018 (0.3%) 1706 (1.2%) 618 (3.9%) 1973 (1.4%)
18-<19 5121 (0.9%) 920 (0.6%) 436 (2.7%) 563 (0.4%)
19-<20 3217 (0.5%) 851 (0.6%) 93 (0.6%) 437 (0.3%)
20- 3758 (0.6%) 11921 (8.1%) 0 (0.0%) 107 (0.1%)

==2011==
0-<1 27748 (4.6%) 7991 (5.5%) 1502 (9.0%) 5175 (3.6%)
1-<2 41426 (6.9%) 8175 (5.6%) 1089 (6.5%) 3903 (2.7%)
2-<3 68512 (11.3%) 8677 (6.0%) 1376 (8.3%) 5545 (3.8%)
3-<4 96877 (16.0%) 10037 (6.9%) 1509 (9.1%) 8616 (5.9%)
4-<5 105783 (17.5%) 9880 (6.8%) 781 (4.7%) 10630 (7.3%)
5-<6 115335 (19.1%) 10714 (7.4%) 975 (5.9%) 13317 (9.2%)
6-<7 87554 (14.5%) 11118 (7.6%) 773 (4.6%) 14142 (9.7%)
7-<8 34178 (5.7%) 10837 (7.4%) 672 (4.0%) 14809 (10.2%)
8-<9 11710 (1.9%) 13134 (9.0%) 658 (4.0%) 13135 (9.0%)
9-<10 3376 (0.6%) 13623 (9.4%) 530 (3.2%) 6647 (4.6%)
10-<11 588 (0.1%) 6836 (4.7%) 670 (4.0%) 7895 (5.4%)
11-<12 558 (0.1%) 5701 (3.9%) 1053 (6.3%) 12857 (8.9%)
12-<13 501 (0.1%) 4404 (3.0%) 683 (4.1%) 6098 (4.2%)
13-<14 575 (0.1%) 3638 (2.5%) 671 (4.0%) 7072 (4.9%)
14-<15 464 (0.1%) 2577 (1.8%) 841 (5.1%) 3610 (2.5%)
15-<16 1016 (0.2%) 1913 (1.3%) 580 (3.5%) 3565 (2.5%)
16-<17 519 (0.1%) 1084 (0.7%) 941 (5.7%) 3604 (2.5%)
17-<18 512 (0.1%) 1831 (1.3%) 701 (4.2%) 2313 (1.6%)
18-<19 1094 (0.2%) 1499 (1.0%) 543 (3.3%) 1787 (1.2%)
19-<20 1149 (0.2%) 768 (0.5%) 104 (0.6%) 319 (0.2%)
20- 4248 (0.7%) 11243 (7.7%) 0 (0.0%) 119 (0.1%)
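
Each row holds the age band followed by count and percentage pairs for cars, motorcycles, buses, and goods and other vehicles. That is why the gnuplot commands read columns 2, 4, 6 and 8 and skip the percentages in between. Here is a minimal Python sketch that splits a row the same way; the function name parse_row is my own.

```python
def parse_row(line):
    """Split one data row into (age band, counts, percentage shares)."""
    fields = line.split()
    age_band = fields[0]
    # gnuplot columns 2,4,6,8 are Python indices 1,3,5,7
    counts = [int(value) for value in fields[1::2]]
    # the columns in between look like "(14.8%)"
    shares = [float(value.strip("()%")) for value in fields[2::2]]
    return age_band, counts, shares


row = "0-<1 58097 (14.8%) 11620 (8.9%) 1347 (11.0%) 22706 (18.2%)"
print(parse_row(row))
# ('0-<1', [58097, 11620, 1347, 22706], [14.8, 8.9, 11.0, 18.2])
```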


Sunday, July 03, 2011

How A Simple Test Script Evolves

I will try to walk you through how a simple test script evolves from a one-time throw-away script into a more generic program that can be re-used.

The requirement is pretty simple. "Can I have a script to submit program-a to run 5 times for 10 minutes, 3 times for 5 minutes and 2 times for half an hour?". There you go with a one-time throw-away script. BTW, this is supposed to run on a Linux system.

#! /bin/sh

./program-a --duration 600
./program-a --duration 600
./program-a --duration 600
./program-a --duration 600
./program-a --duration 600

./program-a --duration 300
./program-a --duration 300
./program-a --duration 300

./program-a --duration 1800
./program-a --duration 1800

Our job is done. But what if the duration requirement changes? We would have to modify quite a few places in the above script. So the script evolved to do some looping.

#! /bin/sh

for i in 1 2 3 4 5
do
    ./program-a --duration 600
done

for i in 1 2 3
do
    ./program-a --duration 300
done

for i in 1 2
do
   ./program-a --duration 1800
done

Not too bad! But there are still a few places where we hard-code things like how many times we loop and the duration. OK, maybe we can take some form of input instead of fixed values.

#! /bin/sh

while read duration howmany
do
    seq $howmany | while read count
    do
        ./program-a --duration $duration
    done
done <<EOF
600 5
300 3
1800 2
EOF

Now all I have to do is change the input data (within the EOF markers) in the script to control the duration in seconds and how many times to loop. Hey, but I still need to modify the script every time there is a change in requirement. Can't we have that information controlled by input arguments? Also, I want to tell the script the duration in terms of day/hour/minute/second instead of just seconds, because I am too lazy to calculate. Not a problem at all. Let's say the input argument takes the form of "1d23h45m6sx7" to represent looping through the program 7 times with a duration of 1 day 23 hours 45 minutes 6 seconds (that is 171906 seconds, if you work it out). This calculation can easily be done by using sed to substitute day (d) with '*86400+', hour (h) with '*3600+', minute (m) with '*60+' and second (s) with '', then piping the result to bc. You also need to remove any trailing '+' to avoid an illegal arithmetic expression.

#! /bin/sh

if [ $# -eq 0 ]; then
    echo "Usage: $0  [?d?h?m?sx?]"
    echo "       Eg. $0 1d23h45m6sx17 4h30sx8 1hx7"
    exit 1
fi


# check arg syntax
for arg in "$@"
do
    echo "$arg" | egrep '^([1-9][0-9]*d)?([1-9][0-9]*h)?([1-9][0-9]*m)?([1-9][0-9]*s)?x[1-9][0-9]*$' > /dev/null 
    if [ $? -ne 0 ]; then
        echo "Error. $arg syntax not correct"
        exit 2
    fi
done


for arg in "$@"
do

    # turn e.g. 1d23h45m6s into 1*86400+23*3600+45*60+6 and let bc evaluate it
    duration=`echo "${arg%x*}" | sed -e 's/d/*86400+/;s/h/*3600+/;s/m/*60+/;s/s//;s/+$//' | bc`
    for count in `seq "${arg#*x}"`
    do
        ./program-a --duration "$duration"
    done
done

If you need to run program-a for 1.5 days once, for 3 hours and 5 seconds twice, and for 30 minutes three times, all you need to do is
./loop.sh 1d12hx1 3h5sx2 30mx3
I hope this post shows you how a simple script can evolve into something generic enough for your future test scripts to be based on.
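
To double-check the arithmetic behind the "?d?h?m?sx?" arguments, here is a minimal Python sketch of the same conversion. The names SPEC and parse_spec are my own, and the regular expression is copied from the egrep check in the script, with capture groups added.

```python
import re

# Same pattern as the egrep check, with the numbers captured.
SPEC = re.compile(
    r"^(?:([1-9][0-9]*)d)?(?:([1-9][0-9]*)h)?"
    r"(?:([1-9][0-9]*)m)?(?:([1-9][0-9]*)s)?x([1-9][0-9]*)$"
)


def parse_spec(arg):
    """Convert e.g. '1d23h45m6sx7' into (duration in seconds, repeat count)."""
    match = SPEC.match(arg)
    if match is None:
        raise ValueError("%s syntax not correct" % arg)
    # Missing parts count as 0. A bare 'x3' also matches, as it does in the
    # egrep pattern; here it yields a duration of 0.
    days, hours, minutes, seconds, times = (
        int(group) if group else 0 for group in match.groups()
    )
    duration = days * 86400 + hours * 3600 + minutes * 60 + seconds
    return duration, times


for arg in ["1d23h45m6sx7", "1d12hx1", "3h5sx2", "30mx3"]:
    print(arg, parse_spec(arg))
```

The first argument gives (171906, 7), which matches the figure worked out above.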


Sunday, May 29, 2011

Monitor Solaris Zones with prstat

Solaris provides an interactive command line tool, prstat, to help you monitor zone utilisation when you provide the "-Z" flag. The screen displays both process info and zone info, fitted to your terminal window.
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP       
  8632 chihung  7204K 6792K cpu10   49    0   0:00:00 0.0% prstat/1
   584 root     1696K  684K sleep   59    0   0:00:00 0.0% smcboot/1
   260 root     2008K 1244K sleep   59    0   0:00:00 0.0% ttymon/1
   228 root     2264K  924K sleep   59    0   0:01:16 0.0% cron/1
   112 root     3524K 2600K sleep   59    0   0:00:00 0.0% picld/5
   110 root     2140K 1288K sleep   59    0   0:00:00 0.0% syseventd/14
   255 root     1696K  904K sleep   59    0   0:00:05 0.0% sac/1
   231 daemon   2280K  940K sleep   59    0   0:00:00 0.0% rpcbind/1
   130 root     7952K 6216K sleep   59    0   0:00:03 0.0% devfsadm/72
 22936 root     3348K 1356K sleep   59    0   0:00:00 0.0% sshd/1
  9186 root     4424K 2508K sleep   59    0   0:00:00 0.0% htt_server/2
   237 daemon   2028K 1276K sleep   60  -20   0:00:00 0.0% nfs4cbd/2
   138 daemon   4076K 2188K sleep   59    0   0:09:56 0.0% kcfd/4
   238 daemon    495M  494M sleep   59    0   0:06:46 0.0% nfsmapid/4
   137 root     1312K  888K sleep   59    0   0:00:00 0.0% powerd/2
     9 root     9672K 8720K sleep   59    0   0:03:00 0.0% svc.configd/17
     7 root       12M   10M sleep   59    0   0:02:47 0.0% svc.startd/13
   239 daemon   2336K 1512K sleep   59    0   0:00:00 0.0% statd/1
ZONEID    NPROC  SIZE   RSS MEMORY      TIME  CPU ZONE                        
     0      123 1489M 1008M   3.0%  10:01:44 0.0% global                      
    26       29   90M   56M   0.1%   0:18:41 0.0% john                      
    27       33  169M  123M   0.3%   0:23:04 0.0% mark                 
     5       31  155M  118M   0.3%   0:18:47 0.0% node102                     
     2       32  161M  122M   0.3%   1:21:49 0.0% sgeexec2                    
    62       33  168M  122M   0.3%   0:22:38 0.0% henry                     
    19       29   89M   54M   0.1%   0:18:56 0.0% peter                     
Total: 2221 processes, 8427 lwps, load averages: 1.65, 4.26, 2.60

With the "-c" flag to avoid overwriting the previous display and "-n 1,99999" to tell prstat to display information for up to 99999 zones (that should be enough for all your zones), you can pipe the output to AWK to extract MEMORY and CPU. If you schedule this task to start at midnight daily with a sampling interval of 300 seconds and 288 samples, you cover a whole day of monitoring for all your zones.

Here is a script to convert the prstat zone data to CSV

#! /bin/ksh


export PATH=/usr/bin:/bin:/usr/sbin:/sbin


# zone IDs store as comma separated
zids=`zoneadm list -v | awk 'NR>1{printf("%d,",$1)}'`
zids=${zids%,}

# zone NAMEs store as comma separated
znames=`zoneadm list -v | awk 'NR>1{printf("%s,",$2)}'`
znames=${znames%,}


prstat -Z -n 1,99999 $1 $2 | nawk -v zids=$zids -v znames=$znames '
# print header (mem/* and cpu/*)
BEGIN {
 n=split(zids,a,",")
 m=split(znames,b,",")
 # %mem ($5)
 for(i=1;i<=n;++i){
  printf("mem/%s,",b[i])
 }
 # %cpu ($7)
 for(i=1;i<=n;++i){
  printf("cpu/%s,",b[i])
 }
 printf("\n")
}
# store each interval in array
/ZONEID/,/Total/ {
 if ( $0 !~ /ZONEID/ && $0 !~ /Total/ ) {
  gsub("%","")
  mem[$1]=$5
  cpu[$1]=$7
 }
}
/Total/ {
 for(i=1;i<=n;++i) {
  printf("%s,",mem[a[i]])
 }
 for(i=1;i<=n;++i) {
  printf("%s,",cpu[a[i]])
 }
 printf("\n")
}'
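
For readers more at home in Python than AWK, here is a minimal sketch of the same extraction. It is not a replacement for the script above. The names SAMPLE and zone_rows are my own, and the sample text is cut down from the prstat screen shown earlier.

```python
SAMPLE = """\
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  8632 chihung  7204K 6792K cpu10   49    0   0:00:00 0.0% prstat/1
ZONEID    NPROC  SIZE   RSS MEMORY      TIME  CPU ZONE
     0      123 1489M 1008M   3.0%  10:01:44 0.0% global
    26       29   90M   56M   0.1%   0:18:41 0.0% john
Total: 2221 processes, 8427 lwps, load averages: 1.65, 4.26, 2.60
"""


def zone_rows(lines, zone_ids):
    """Yield one [mem..., cpu...] row per prstat interval, ordered by zone_ids."""
    mem, cpu = {}, {}
    in_zones = False
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ZONEID":
            in_zones = True           # zone summary starts after this header
        elif fields[0] == "Total:":
            in_zones = False          # end of one sampling interval
            yield ([mem.get(z, "") for z in zone_ids]
                   + [cpu.get(z, "") for z in zone_ids])
        elif in_zones:
            # same columns as $5 (MEMORY) and $7 (CPU) in the awk script
            mem[fields[0]] = fields[4].rstrip("%")
            cpu[fields[0]] = fields[6].rstrip("%")


for row in zone_rows(SAMPLE.splitlines(), ["0", "26"]):
    print(",".join(row))
```

Process lines above the ZONEID header are skipped, just as the AWK range pattern skips them.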

Let's run the script (zstat.sh) with an interval of 10 seconds for 4 samples. If you have 3 non-global zones on this machine, the output will look like this

./zstat.sh 10 4
mem/global,mem/zone1,mem/zone2,mem/zone3,cpu/global,cpu/zone1,cpu/zone2,cpu/zone3,
3.2,1.2,1.0,1.0,23.0,5.0,7.0,9.0,2.0,
4.7,1.5,1.7,1.5,31.0,15.0,6.0,9.0,1.0,
11.0,5.7,2.3,3.0,40.0,5.0,22.0,13.0,5.0,
13.2,7.2,4.0,2.0,34.0,5.0,17.0,10.0,7.0,
I ran this script on my machine with 67 zones and imported the CSV into OpenOffice; here is the sample output:


Saturday, March 05, 2011

200 Countries, 200 Years, 4 Minutes

Another great Data Visualization example.

I wonder why Congo is still stuck in the bottom-left of the chart. You may want to find out more about Rape of a Nation

No More Hanging Jobs in Cron

Have you encountered 'prtdiag' or other commands hanging for whatever reason? If your script happens to run these commands and is launched from cron, your jobs will simply pile up until cron hits its limit. By default, Solaris configures cron to run at most 100 concurrent jobs, and the 101st job will just fail.

I developed a launcher program (watchdog) to limit the runtime of a script (worker) that may have the above-mentioned behaviour. It works well with my worker script and it should work for other programs too. So, no more hanging jobs in cron!

#! /bin/ksh
#
# A watchdog program to limit the elapsed time of the worker shell script
# to avoid hanging processes that can pile up if worker runs under cron
#


export PATH=/usr/bin:/usr/sbin:/bin


#
# default time limit is 60 seconds
#
timelimit=${1:-60}

worker="${0%/*}/check-worker.ksh"
worker_name=${worker##*/}
worker_name=${worker_name%.*}
if [ ! -f $worker ]; then
    echo "Error. \"$worker\" cannot be found"
    exit 1
fi
if [ ! -x $worker ]; then
    echo "Error. \"$worker\" is not executable"
    exit 2
fi


watchdog()
{
    sleep 1; # wait for the worker to start
    while [ $timelimit -gt 0 ]
    do
        # pgrep is available since 5.8, else use ps -ef | grep -v grep | grep $worker_name
        jobid=`pgrep $worker_name`
        if [ $? -eq 1 ]; then
            break
        else
            sleep 1
        fi
        ((timelimit-=1))
    done
    if [ $timelimit -eq 0 ]; then
        # kill worker + child processes
        ptree $jobid | awk '$1=='$jobid'{start=1}start==1{print $1}' | while read pid
            do
                kill -TERM "$pid" > /dev/null 2>&1
            done
    fi
}


#
# start the watchdog before the worker
#
watchdog &


tmpfile="/tmp/.$worker_name.$$"
$worker > $tmpfile 2>&1 &
worker_id=$!
wait $worker_id > /dev/null 2>&1
rc=$?


if [ $rc -ne 0 ]; then
    # replace this line to do whatever you want, send email, sms, logger....
    #
    # echo .... | mailx someone@somewhere.com

    details=`cat $tmpfile 2>/dev/null`
    echo "Exit status=$rc. There is a problem with the server '`hostname`' - $details"
fi


rm -f $tmpfile
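
The same watchdog idea can be sketched in Python, where the subprocess module already supports a timeout. This is a minimal sketch, not a port of the ksh script. The function name run_with_limit is my own, and it assumes a POSIX system, because it starts the worker in its own session so the whole process group can be sent SIGTERM.

```python
import os
import signal
import subprocess
import sys


def run_with_limit(cmd, timelimit=60):
    """Run cmd; return (exit status, output), or (None, output) if it timed out."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        start_new_session=True,   # worker becomes leader of its own process group
    )
    try:
        out, _ = proc.communicate(timeout=timelimit)
        return proc.returncode, out.decode(errors="replace")
    except subprocess.TimeoutExpired:
        # terminate the worker and any children still in its process group
        os.killpg(proc.pid, signal.SIGTERM)
        out, _ = proc.communicate()
        return None, out.decode(errors="replace")


if __name__ == "__main__":
    # hypothetical worker: a child that sleeps longer than the limit
    status, details = run_with_limit(
        [sys.executable, "-c", "import time; time.sleep(30)"], timelimit=1
    )
    if status != 0:
        print("Exit status=%s. There is a problem with the worker - %s"
              % (status, details))
```

As in the ksh version, a worker that ignores SIGTERM would still hang, so a follow-up SIGKILL may be worth adding.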
