Wednesday, January 28, 2009

Sun Tech Days 2009, Lab Instructions

Tuesday, January 27, 2009

Learn the 10% of SQL That Accounts for 90% of Queries

I just finished reading Data Crunching: Solve Everyday Problems Using Java, Python, and More, which I borrowed from the National Library. In Chapter 6, the author promises that readers will "... meet the 10% of SQL that accounts for 90% of common use ...".

Here I am writing that 10% down in my blog so that I can refer to it next time, and I am using an SQLite database to illustrate all the steps. BTW, SQLite (a software library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine) is an extremely powerful database and definitely worth your time to learn.

Here I am going to create the database tables and populate them with data in my Cygwin environment:

$ uname -a
CYGWIN_NT-6.0 user-PC 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin

$ ls -l test.db
ls: cannot access test.db: No such file or directory

$ sqlite3.exe test.db
SQLite version 3.6.2
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> CREATE TABLE Person (
   ...> EmpId  INTEGER NOT NULL PRIMARY KEY,
   ...> FirstName TEXT NOT NULL,
   ...> LastName TEXT NOT NULL,
   ...> Rate  DECIMAL
   ...> );
sqlite> CREATE TABLE Assigned (
   ...> EmpId  INTEGER NOT NULL,
   ...> ProjId INTEGER,
   ...> StartDate DATE,
   ...> EndDate DATE
   ...> );
sqlite> CREATE TABLE Customer (
   ...> CustId INTEGER NOT NULL PRIMARY KEY,
   ...> ContactInfo TEXT
   ...> );
sqlite> CREATE TABLE Project (
   ...> ProjId INTEGER NOT NULL PRIMARY KEY,
   ...> ProjName TEXT,
   ...> CustId INTEGER,
   ...> StartDate DATE,
   ...> EndDate DATE
   ...> );
sqlite>
sqlite> INSERT INTO Person VALUES (3001,'Dave','Thomas',400);
sqlite> INSERT INTO Person VALUES (3002,'Andy','Hunt',400);
sqlite> INSERT INTO Person VALUES (4001,'Greg','Wilson',320);
sqlite> INSERT INTO Person VALUES (4002,'Grace','Hopper',500);
sqlite> INSERT INTO Person VALUES (4003,'Alan','Turing',500);
sqlite> INSERT INTO Person VALUES (4004,'Chunk','Babbage',125);
sqlite>
sqlite> INSERT INTO Project VALUES (904,'RubyMath',70043,'2004-05-01','2004-10-30');
sqlite> INSERT INTO Project VALUES (905,'DBBridge',70047,'2004-05-01','2004-10-30');
sqlite>
sqlite> INSERT INTO Customer VALUES (70043,'MegaCorp Inc.');
sqlite> INSERT INTO Customer VALUES (70047,"Deadlines 'R' Us");
sqlite> INSERT INTO Customer VALUES (70101,'UNiversity of Euphoria');
sqlite>
sqlite> INSERT INTO Assigned VALUES (3001,904,'2005-02-01','2005-02-28');
sqlite> INSERT INTO Assigned VALUES (3002,904,'2005-02-01','2005-03-15');
sqlite> INSERT INTO Assigned VALUES (4001,904,'2005-02-01','2005-03-21');
sqlite> INSERT INTO Assigned VALUES (4001,905,'2005-01-10','2005-02-22');
sqlite> INSERT INTO Assigned VALUES (4002,905,'2005-01-20','2005-04-01');
sqlite> INSERT INTO Assigned VALUES (4004,905,'2005-02-10','2005-03-31');
sqlite>
sqlite> .q

$ ls -l test.db
-rwxrwxrwx 1 user None 5120 Jan 27 20:18 test.db

Here are some SQL statements covering joins, nesting and negation. Basically they answer the following questions:

  • Who is paying for the RubyMath project?
  • Get the forenames and surnames of employees on the RubyMath project
  • Select people who are NOT assigned to the RubyMath project
  • Select people who are assigned to exactly one project
  • Find people on project 904 or 905, but not both
  • Find the most expensive contractors
$ sqlite3.exe test.db
SQLite version 3.6.2
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> -- Who is paying for the RubyMath project ?
sqlite> SELECT Customer.ContactInfo
   ...> FROM   Customer, Project
   ...> WHERE (Customer.CustId = Project.CustId)
   ...>   AND (Project.ProjName = 'RubyMath');
MegaCorp Inc.
sqlite>
sqlite>
sqlite> SELECT Customer.ContactInfo
   ...> FROM   Customer INNER JOIN Project
   ...> ON     Customer.CustId = Project.CustId
   ...> WHERE  Project.ProjName = 'RubyMath';
MegaCorp Inc.
sqlite>
sqlite>
sqlite> -- Get forenames and surnames of employees on RubyMath project
sqlite> SELECT Person.FirstName, Person.LastName
   ...> FROM   Person, Project, Assigned
   ...> WHERE (Person.EmpId = Assigned.EmpId)
   ...>   AND (Project.ProjName = 'RubyMath')
   ...>   AND (Assigned.ProjId = Project.ProjId);
Dave|Thomas
Andy|Hunt
Greg|Wilson
sqlite>
sqlite>
sqlite> -- Select people who are NOT assigned to the RubyMath project
sqlite> SELECT Person.FirstName, Person.LastName
   ...> FROM   Person
   ...> WHERE  Person.EmpId NOT IN
   ...>        (SELECT Assigned.EmpId
   ...>         FROM   Assigned, Project
   ...>         WHERE (Assigned.ProjId = Project.ProjId)
   ...>         AND   (Project.ProjName = 'RubyMath'));
Grace|Hopper
Alan|Turing
Chunk|Babbage
sqlite>
sqlite>
sqlite> -- Select people who are assigned to exactly one project
sqlite> SELECT Person.FirstName, Person.LastName
   ...> FROM   Person, Assigned
   ...> WHERE  (Person.EmpId = Assigned.EmpId)
   ...> AND    (Assigned.EmpId NOT IN
   ...>         (SELECT A.EmpId
   ...>          FROM   Assigned A, Assigned B
   ...>          WHERE (A.EmpId = B.EmpId)
   ...>          AND   (A.ProjId < B.ProjId)));
Dave|Thomas
Andy|Hunt
Grace|Hopper
Chunk|Babbage
sqlite>
sqlite>
sqlite> -- Find people in 904 or 905, but not both
sqlite> SELECT Person.FirstName, Person.LastName
   ...> FROM   Person, Assigned
   ...> WHERE  (Person.EmpId = Assigned.EmpId)
   ...> AND    ((Assigned.ProjId = 904) OR (Assigned.ProjId = 905))
   ...> AND    (Assigned.ProjId NOT IN
   ...>         (SELECT A.ProjId
   ...>          FROM Assigned A, Assigned B
   ...>          WHERE (A.ProjId = 904) AND (B.ProjId = 905)));
Greg|Wilson
Grace|Hopper
Chunk|Babbage
sqlite>
sqlite>
sqlite> -- Find the most expensive contractors
sqlite> SELECT Person.FirstName, Person.LastName
   ...> FROM Person
   ...> WHERE (Person.Rate NOT IN
   ...>        (SELECT A.Rate
   ...>         FROM Person A, Person B
   ...>         WHERE A.Rate < B.Rate));
Grace|Hopper
Alan|Turing
sqlite> .q

$
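The last query finds the maximum by negation: nobody charges more than the most expensive contractors. An arguably clearer formulation uses a scalar subquery with MAX. Here is a sketch using Python's built-in sqlite3 module against an in-memory copy of the Person table (not the test.db file):

```python
import sqlite3

# Rebuild a minimal in-memory copy of the Person table from above
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Person (EmpId INTEGER PRIMARY KEY, "
            "FirstName TEXT, LastName TEXT, Rate DECIMAL)")
con.executemany("INSERT INTO Person VALUES (?,?,?,?)", [
    (3001, 'Dave', 'Thomas', 400), (3002, 'Andy', 'Hunt', 400),
    (4001, 'Greg', 'Wilson', 320), (4002, 'Grace', 'Hopper', 500),
    (4003, 'Alan', 'Turing', 500), (4004, 'Chunk', 'Babbage', 125),
])

# Same answer as the NOT IN version: rows whose Rate equals the maximum
rows = con.execute("SELECT FirstName, LastName FROM Person "
                   "WHERE Rate = (SELECT MAX(Rate) FROM Person)").fetchall()
# rows -> [('Grace', 'Hopper'), ('Alan', 'Turing')]
```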

Here are SQL statements for aggregation and views, answering the following questions:

  • Get the total rate for each project
  • Find consultants whose rate is above their project's average
$ sqlite3.exe test.db
SQLite version 3.6.2
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> -- Get the total rate for all consultants
sqlite> SELECT Project.ProjName, SUM(Person.Rate)
   ...> FROM   Person, Assigned, Project
   ...> WHERE  (Person.EmpId = Assigned.EmpId)
   ...> AND    (Project.ProjId = Assigned.ProjId)
   ...> GROUP BY Assigned.ProjId;
RubyMath|1120
DBBridge|945
sqlite>
sqlite>
sqlite> -- view
sqlite> CREATE VIEW ProjAveRate AS
   ...>   SELECT Project.ProjId AS ProjId,
   ...>          AVG(Person.Rate) AS AveRate
   ...>   FROM   Person, Assigned, Project
   ...>   WHERE  (Person.EmpId = Assigned.EmpId)
   ...>   AND    (Project.ProjId = Assigned.ProjId)
   ...>   GROUP BY Assigned.ProjId;
sqlite>
sqlite>
sqlite> SELECT ProjAveRate.ProjId, Person.FirstName, Person.LastName, Person.Rate, ProjAveRate.AveRate
   ...> FROM   Person, Assigned, ProjAveRate
   ...> WHERE  (Person.EmpId = Assigned.EmpId)
   ...> AND    (Assigned.ProjId = ProjAveRate.ProjId)
   ...> AND    (Person.Rate > ProjAveRate.AveRate);
904|Dave|Thomas|400|373.333333333333
904|Andy|Hunt|400|373.333333333333
905|Greg|Wilson|320|315.0
905|Grace|Hopper|500|315.0
sqlite> .q

$
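The view answers the per-project version of the "above average" question; the company-wide version needs no view at all, just a scalar subquery with AVG. Again a sketch with Python's sqlite3 module and an in-memory copy of the Person table:

```python
import sqlite3

# In-memory copy of the Person table from above
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Person (EmpId INTEGER PRIMARY KEY, "
            "FirstName TEXT, LastName TEXT, Rate DECIMAL)")
con.executemany("INSERT INTO Person VALUES (?,?,?,?)", [
    (3001, 'Dave', 'Thomas', 400), (3002, 'Andy', 'Hunt', 400),
    (4001, 'Greg', 'Wilson', 320), (4002, 'Grace', 'Hopper', 500),
    (4003, 'Alan', 'Turing', 500), (4004, 'Chunk', 'Babbage', 125),
])

# Everyone whose rate is above the company-wide average (2245/6 ~ 374.17)
rows = con.execute("SELECT FirstName, LastName FROM Person "
                   "WHERE Rate > (SELECT AVG(Rate) FROM Person)").fetchall()
# rows -> [('Dave', 'Thomas'), ('Andy', 'Hunt'),
#          ('Grace', 'Hopper'), ('Alan', 'Turing')]
```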

If you cannot get hold of the book, you may want to read the author's article - Top Ten Data Crunching Tips and Tricks.


Monday, January 26, 2009

Shell Script Documentation, with commands output

Two months ago, I blogged about how we can build documentation into a shell script. Even if the shell script has sufficient documentation, sometimes you may not have the privilege to run those commands to fully understand what the script is really doing. Imagine you are given the code snippet below:
for i in `someAdmCmd list | awk 'NR> 2{print $1}'`
do
    someAdmCmd -v list instance=$i | awk 'NR>2 && $5>0 {printf("%s Failed\n", $1)}'
done
You may want to know why you need to skip 2 lines and what that 5th field is supposed to be.
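The awk one-liner is easier to reason about once it is spelled out. A hypothetical Python equivalent of `awk 'NR>2 && $5>0 {printf("%s Failed\n", $1)}'`, run against a made-up listing in the same shape as the sample output further below:

```python
# Sample listing in the same layout as the someAdmCmd output shown later
sample = """\
Name Path Status Enabled Failed
================================
someA-1 2 OK 2 0
someA-2 2 Failed 1 1
"""

failed = []
for n, line in enumerate(sample.splitlines(), start=1):
    if n <= 2:                    # NR > 2: skip the header and the '====' ruler
        continue
    fields = line.split()
    if int(fields[4]) > 0:        # $5 > 0: the 'Failed' column
        failed.append(f"{fields[0]} Failed")
# failed -> ['someA-2 Failed']
```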

It would be helpful if the original script could provide sample output of someAdmCmd. You can do so by ending your script with exit 0, followed by whatever command output you like. You don't even have to comment it out, because your script will be terminated by exit 0 anyway. Just like below:

#! /bin/sh
...
...

exit 0

# someAdmCmd list
Name  Description Time
======================
someA Something A 12:00
someB Something B 13:00
...

# someAdmCmd -v list instance=someA
Name Path Status Enabled Failed
================================
someA-1 2 OK 2 0
someA-2 2 Failed 1 1
someA-3 2 OK 2 0
someA-4 2 OK 2 0
....

Don't you find this script self-explanatory? I certainly think so.


The Practice of Software Engineering from Queue Editorial Board Members

I just came across this two-part series from ACM Queue editorial board members Steve Bourne, Eric Allman and Bryan Cantrill, discussing the practice of software engineering:
  1. Part 1 (http://queue.acm.org/detail_audio.cfm?id=1413258)
  2. Part 2 (http://queue.acm.org/detail_audio.cfm?id=1454460)


Saturday, January 24, 2009

Shell Script Performance, Part 3

Real cases/scenarios are sometimes hard to come by, and you definitely want to be prepared when you find yourself in such a situation. It is good if you can create such scenarios yourself. In Part 2 on shell script performance, I asked my audience to come up with the fastest way to create 1000 files. Of course you do not want to wait minutes or hours for just 1000 files. What if you want to test with 10,000 or even 100,000 files?

With that many files in your directory, you can practice your shell scripting by changing the prefix, suffix, or extension, padding with zeros, ... the number of things you can script is endless.

In Part 1, we learned that process creation is a very expensive operation and that we should avoid running too many commands, especially inside a loop. So I am not going to use a traditional approach like this one:
count=1; while [ $count -le 1000 ]; do touch file-$count.txt; count=`expr $count + 1`; done

Below shows you various ways to 'skin a cat':

  1. create1.sh - use 'seq' to generate the numbers 1 to 1000 for a for loop; each iteration creates a new file
  2. create2.sh - use the Bash shell's built-in brace expansion
  3. create3.sh - use 'seq -f' to generate the file names and supply them to touch to create all the files. [Thanks to a comment by pjz]
  4. create.tcl - a Tcl implementation
  5. create.py - a Python implementation

And the run time:

  1. create1.sh - 2m36.452s
  2. create2.sh - 0m2.371s
  3. create3.sh - 0m2.075s
  4. create.tcl - 0m7.831s
  5. create.py - 0m2.403s
As you can see, the fewer commands you use, the shorter the run time. In this case, pjz won! In create2.sh, I introduced Bash brace expansion, a very useful feature for generating arguments on the command line. See this blog for how I created 48 'devices' (c0d0t0s0, ...) in a single command line.
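Brace expansion is essentially the cross product of the listed alternatives. A rough Python analogue using itertools.product — the actual 48-device naming pattern is in the other post, so the c{0,1}d0t{0..7}s{0..2} layout here is just an assumption for illustration:

```python
from itertools import product

# Assumed layout: 2 controllers x 8 targets x 3 slices = 48 'devices'
names = [f"c{c}d0t{t}s{s}"
         for c, t, s in product(range(2), range(8), range(3))]
# names[0] -> 'c0d0t0s0'
```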

create1.sh

$ cat create1.sh
#! /bin/sh

if [ $# -ne 3 ]; then
        echo "Usage: $0 <prefix> <start#> <end#>"
        exit 1
fi

prefix=$1
start=$2
end=$3

for i in `seq -w $start $end`
do
        touch $prefix-$i.txt
done

$ time ./create1.sh file 1 1000

real    2m36.452s
user    0m4.260s
sys     0m24.288s

$ rm -f file-*

create2.sh

$ cat create2.sh
#! /bin/sh

if [ $# -ne 1 ]; then
        echo "Usage: $0 <prefix>"
        exit 1
fi

prefix=$1
n="0,1,2,3,4,5,6,7,8,9"
eval touch $prefix-0{$n}{$n}{$n}.txt
mv $prefix-0000.txt $prefix-1000.txt

$ time ./create2.sh file

real    0m2.371s
user    0m0.031s
sys     0m0.856s

$ rm -f file-*

create3.sh

$ cat ./create3.sh
#! /bin/sh


if [ $# -ne 3 ]; then
        echo "Usage: $0 <prefix> <start#> <end#>"
        exit 1
fi
prefix=$1
start=$2
end=$3

touch `seq -f "$prefix-%04g.txt" $start $end`

$ time ./create3.sh file 1 1000

real    0m2.075s
user    0m0.108s
sys     0m0.746s

$ rm -f file-*

create.tcl

$ cat create.tcl
#! /cygdrive/c/Tcl8.4.19/bin/tclsh

if { $argc != 3 } {
        puts stderr "Usage: $argv0 <prefix> <start#> <end#>"
        exit 1
}

set prefix [lindex $argv 0]
set start  [lindex $argv 1]
set end    [lindex $argv 2]
set pad    [string length $end]
set format "%s-%0${pad}d.txt"

for { set i $start } { $i <= $end } { incr i } {
        set fname [format $format $prefix $i]
        set fp [open $fname w]
        close $fp
}

$ time ./create.tcl file 1 1000

real    0m7.831s
user    0m0.000s
sys     0m0.031s

$ rm -f file-*

create.py

$ cat create.py
#! /usr/bin/python

import sys

if len(sys.argv) != 4:
        sys.stderr.write('Usage: %s <prefix> <start#> <end#>' % (sys.argv[0]))
        exit(1)

prefix = sys.argv[1]
start = int(sys.argv[2])
end = int(sys.argv[3])
pad = len(sys.argv[3])
format = '%s-%0' + str(pad) + 'd.txt'
for i in xrange(start, end+1):
        fname = format % (prefix, i)
        fp = open(fname, 'w')
        fp.close()

$ time ./create.py file 1 1000

real    0m2.403s
user    0m0.031s
sys     0m0.951s

$ rm -f file-*


Friday, January 23, 2009

Solaris 10 How To Guides

Today, I came across this How-To Guide from Sun:
Working with ZFS Snapshots

However, the base URL http://www.sun.com/software/solaris/howtoguides/ does not provide a full listing of what is available; it just returns a "Page Not Found". With the help of Google, you can uncover all the PDF files.

Simply type the following into Google search: site:www.sun.com inurl:howtoguides filetype:pdf and you will see a full list like this:

  • USING SOLARIS 10 SECURITY
  • POSTGRESQL ON SOLARIS 10
  • Consolidating Servers and Applications with Solaris Containers
  • Solaris Operating System - Solaris Live Upgrade How-To Guide
  • A SOLARIS™ CONTAINER
  • SMF MANIFEST
  • ZFS IN SOLARIS CONTAINERS
  • Working With ZFS Snapshots-Solaris 10 How-To Guides
  • How to Install the Solaris 10 OS on x86 Systems
  • SOLARIS 10 SYSTEM
  • SERVICE MANAGEMENT FACILITY
  • A TWO-NODE CLUSTER


Thursday, January 22, 2009

Debug - Running A Crontab Script Interactively

If your script runs from cron, you don't want any output (stdout/stderr) sent to cron, because it will end up in your mailbox. You may want to consider the approach below, which automatically puts your script in 'debug' mode when it is run from an interactive shell.

If you are logged in to an interactive shell, the tty command will print your pseudo-terminal. Whereas if you run tty in an at/cron environment, you will get "not a tty".

$ tty
/dev/pts/3

$ at now
at> tty > somefile
at> <EOT>
job 20 at 2009-01-22 21:37

$ cat somefile
not a tty

Now we can take advantage of tty: when you run your cron script from an interactive shell, it automatically switches to 'debug' mode.

_debug_=1
if [ "`tty`" = "not a tty" ]; then
   _debug_=0
fi

debug()
{
    if [ $_debug_ = 1 ]; then
       if [ "$1" = "-x" ]; then
           shift
           "$@"
       else
           echo "$@"
       fi
    fi
}


debug This is a comment
debug -x uname -a
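The same interactive-versus-cron test is a one-liner in other languages too. A hypothetical Python equivalent of the debug helper, using isatty on stdout instead of the tty command:

```python
import subprocess
import sys

# Debug only when stdout is a real terminal; cron/at hands us a pipe or file
try:
    DEBUG = sys.stdout.isatty()
except Exception:
    DEBUG = False

def debug(*args, run=False, enabled=None):
    """Echo args, or run them as a command when run=True, only in debug mode."""
    if enabled is None:
        enabled = DEBUG
    if not enabled:
        return None
    if run:
        return subprocess.call(list(args))
    print(*args)
    return 0

debug("This is a comment")
debug("uname", "-a", run=True)
```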

Useful ?


Wednesday, January 21, 2009

Shell Script Performance, Take 2

In this second part on shell script performance, I am going to walk you through my performance journey with another example.

Suppose you need to change the extension of 1000 files (eg, from .txt to .log), so you write a script to do it. Let's start with a straightforward approach like this:

$ ls -1 *.txt | wc -l
1000

$ cat ext1.sh
#! /bin/bash

if [ $# -ne 2 ]; then
        echo "Usage: $0 <original-extension> <new-extension>"
        exit 1
fi
ext_o=$1
ext_n=$2


for i in *.$ext_o
do
        mv $i `basename $i $ext_o`$ext_n

done

$ time ./ext1.sh txt log

real    4m45.135s
user    0m10.040s
sys     0m58.621s

$ ls -1 *.log | wc -l
1000

As you can see, you need to run the "basename" command 1000 times to strip the extension before renaming the files, and that's the overhead of forking 1000 processes. As I mentioned in my previous blog, even an empty shell script makes 160+ system calls and takes 0.13 seconds to run. All of this adds up and slows down your script when you are working with a large data set. To increase your script's performance, you basically need to follow these principles:

  1. Use the most appropriate tools for the job (more efficient)
  2. Try to reduce the amount of commands to run (less forking)
  3. Explore all the options and capability of every command (may reduce no. of commands)
  4. Explore the functionality of built-in shell construct (so no forking)

In my second attempt, I will use the shell's built-in pattern matching:

${variable%pattern} - If the pattern matches the end of the variable's value, delete the shortest part that matches and return the rest.

Example:

$ f=abc.txt

$ echo ${f%.txt}.log
abc.log
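For comparison, the same suffix swap in Python (plain slicing, no forking either):

```python
f = "abc.txt"
# Python analogue of the Bash ${f%.txt}.log idiom
new = (f[:-len(".txt")] if f.endswith(".txt") else f) + ".log"
# new -> 'abc.log'
```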

Here is the run time:

$ ls -1 *log | wc -l
1000

$ cat ext2.sh
#! /bin/bash

if [ $# -ne 2 ]; then
        echo "Usage: $0 <original-extension> <new-extension>"
        exit 1
fi
ext_o=$1
ext_n=$2


for i in *.$ext_o
do
        mv $i ${i%.$ext_o}.$ext_n

done

$ time ./ext2.sh log txt

real    2m6.684s
user    0m4.765s
sys     0m28.804s

$ ls -1 *.txt | wc -l
1000

That's 2.2 times faster than the first version. Is this good enough? Let's explore other scripting languages. Modern scripting languages (Perl, Python, Tcl, ... just to name a few) have a lot of capabilities built in, so the script does not need to fork extra processes. Below are my Tcl and Python implementations.

In Tcl:

$ ls -1 *.txt | wc -l
1000

$ cat ext.tcl
#! /cygdrive/c/Tcl8.4.19/bin/tclsh


if { $argc != 2 } {
        puts stderr "Usage: $argv0 <original-ext> <new-ext>"
        exit 1
}
set ext_o [lindex $argv 0]
set ext_n [lindex $argv 1]


foreach ofile [glob -nocomplain *.$ext_o] {
        file rename $ofile "[file rootname $ofile].$ext_n"
}

$ time ./ext.tcl txt log

real    0m12.149s
user    0m0.000s
sys     0m0.015s

$ ls -1 *log | wc -l
1000

In Python:

$ ls -1 *log | wc -l
1000

$ cat ext.py
#! /usr/bin/python

import os, sys

if len(sys.argv) != 3:
        print "Usage:", sys.argv[0], "<original-ext> <new-ext>"
        sys.exit(1)

ext_o = os.path.extsep + sys.argv[1]
ext_n = os.path.extsep + sys.argv[2]

for f in os.listdir('.'):
        [fname,ext] = os.path.splitext(f)
        if ext == ext_o:
                f_new = fname + ext_n
                os.rename(f,f_new)

$ time ./ext.py log txt

real    0m4.307s
user    0m0.062s
sys     0m1.653s

$ ls -1 *.txt | wc -l
1000

Wow, we were able to reduce the runtime from almost 5 minutes to just over 4 seconds. That's over 66 times faster than the first version.

  1. 4m45.135s (1st version of shell script)
  2. 2m6.684s (2nd version, using built-in Bash function)
  3. 0m12.149s (Tcl version)
  4. 0m4.307s (Python version)

Next time you write a script, make sure it is tested with a large data set for performance reasons. BTW, can you come up with the fastest script to create 1000 files (eg, file-0001.txt ... file-1000.txt) in a running sequence? Stay tuned for "Shell Script Performance, Take 3", 'cos I will show you how to take advantage of shell built-ins so that your script can run as fast as Python.


Friday, January 16, 2009

Shell Script Performance

In UNIX, creating a separate process is considered very expensive. To prove this point, I created an empty shell script and ran truss on it to see how many system calls even a bare-bones shell needs.

$ cat empty.sh
#! /bin/sh

$ truss -c ./empty.sh

syscall               seconds   calls  errors
_exit                    .000       1
read                     .000       2
open                     .000       8       1
close                    .000       9
time                     .000       1
brk                      .001      17
getpid                   .000       4
mount                    .000       1       1
getuid                   .000       2
getgid                   .000       2
sysi86                   .000       1
ioctl                    .000       5       1
execve                   .000       1
umask                    .000       2
fcntl                    .000       2
readlink                 .000       1       1
sigaction                .000       1
getcontext               .000       1
setustack                .000       1
mmap                     .003      34
munmap                   .000      10
xstat                    .001      11       3
getrlimit                .000       1
memcntl                  .000       7
sysconfig                .000      10
lwp_sigmask              .000       1
lwp_private              .000       1
llseek                   .000       3
schedctl                 .000       1
resolvepath              .000       9
stat64                   .000       3
fstat64                  .000      11
open64                   .000       1
                     --------  ------   ----
sys totals:              .016     167      7
usr time:                .006
elapsed:                 .130

Although it took only 0.13 seconds to run, a total of 167 system calls were invoked. Imagine you need to 'daisy-chain' a few commands together and run that 100 times in a loop. Your total run time will be quite substantial.

The example below clearly illustrates this point. Suppose you do a few invert-match (-v) greps on the output of ps -ef, and run that 100 times. I am providing 3 different solutions:

  1. Daisy chain a number of grep commands
  2. Daisy chain a number of fgrep (fast grep - Interpret PATTERN as a list of fixed strings) commands
  3. Use egrep (Interpret PATTERN as an extended regular expression) command
$ cat ex-grep.sh; time ./ex-grep.sh
#! /bin/sh

for i in `perl -e '$,=" ";print 1..100'`
do
 ps -ef | grep -v root | grep -v daemon | grep -v nothing | grep -v oralce | grep -v weblogic > /dev/null 2>&1
done

real 0m14.696s
user 0m4.081s
sys 0m8.142s

$ cat ex-fgrep.sh; time ./ex-fgrep.sh
#! /bin/sh

for i in `perl -e '$,=" ";print 1..100'`
do
 ps -ef | fgrep -v root | fgrep -v daemon | fgrep -v nothing | fgrep -v oralce | fgrep -v weblogic > /dev/null 2>&1
done

real 0m14.392s
user 0m4.085s
sys 0m7.936s

$ cat ex-egrep.sh; time ./ex-egrep.sh
#! /bin/sh

for i in `perl -e '$,=" ";print 1..100'`
do
 ps -ef | egrep -v 'root|daemon|nothing|oralce|weblogic' > /dev/null 2>&1
done

real 0m8.527s
user 0m2.705s
sys 0m3.884s

As you can see, fgrep gives only a slight improvement even though it is based on fixed-string matching. This is because ex-fgrep.sh still forks as many processes as ex-grep.sh; in other words, they have similar overhead. However, ex-egrep.sh has the least overhead, because regular-expression alternation can group all the OR cases together in one command, avoiding all the unnecessary process forking.
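The same "one pattern instead of five processes" idea carries over into any scripting language. A small Python sketch of the egrep alternation, with made-up input lines ('oralce' kept as in the transcript above):

```python
import re

# The strings the grep chain above filters out
patterns = ["root", "daemon", "nothing", "oralce", "weblogic"]
# One alternation replaces five chained grep -v processes
combined = re.compile("|".join(map(re.escape, patterns)))

# Made-up ps-like lines for illustration
lines = ["root 1 init", "user 2 bash", "weblogic 3 java"]
kept = [l for l in lines if not combined.search(l)]
# kept -> ['user 2 bash']
```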

$ truss -c ./ex-grep.sh 2>&1 | tail -5
open64                   .013     107       5
                     --------  ------   ----
sys totals:             1.125   11216   1273
usr time:                .366
elapsed:               19.250

$ truss -c ./ex-fgrep.sh 2>&1 | tail -5
open64                   .014     107       5
                     --------  ------   ----
sys totals:             1.128   11241   1279
usr time:                .369
elapsed:               19.340

$ truss -c ./ex-egrep.sh 2>&1 | tail -5
open64                   .013     107       5
                     --------  ------   ----
sys totals:              .593    6819    869
usr time:                .193
elapsed:               11.490

To achieve better performance in UNIX shell scripting, we should use the most appropriate tool for the job and fully explore all the available options. BTW, do you know the UNIX philosophy?

Write programs that do one thing and do it well.

If you are running the above on a multi-CPU box, you may not see much difference, because the forked processes run on different CPUs. I tested this on my office's SunFire X4600 with 16 CPUs and there isn't any significant difference.

FYI, the above benchmark results were obtained on OpenSolaris 2008.11 under VirtualBox with 1 CPU assigned, on my Intel Centrino Duo notebook.


Thursday, January 08, 2009

SNMP on OpenSolaris 2008.11

Apparently there isn't any SNMP service daemon available in OpenSolaris:
# uname -a
SunOS opensolaris 5.11 snv_101b i86pc i386 i86pc Solaris

# svcs -a | grep -i snmp
disabled       22:28:58 svc:/network/device-discovery/printers:snmp

If you want to explore SNMP, you may want to download Net-SNMP. In my case, I downloaded the source code for net-snmp-5.4.2.1 and configured & compiled it with the default settings. BTW, I installed SunStudio Express via the OpenSolaris Package Manager. The default install location is /usr/local.

# which cc
/usr/bin/cc

# ls -l /usr/bin/cc
lrwxrwxrwx 1 root root 33 2008-12-04 09:09 /usr/bin/cc -> ../../opt/SunStudioExpress/bin/cc

# cc -V
cc: Sun Ceres C 5.10 SunOS_i386 2008/10/22
usage: cc [ options] files.  Use 'cc -flags' for details

# CC=cc

# export CC

# ./configure
....

# make
...

# make install

You need to configure it with /usr/local/bin/snmpconf -g basic_setup for a basic configuration. Simply answer a dozen questions and the tool will create the snmpd.conf file. Now all you have to do is launch the snmpd daemon with this configuration file:
/usr/local/sbin/snmpd -c /usr/local/etc/snmpd.conf
Below shows how you can explore the SNMP MIB information. I am using version 2c of SNMP. You can get an individual MIB value (eg, sysDescr.0 - the system description), or recursively walk the system or interface (if) subtree, or even the entire mib-2 section of the MIB-2 tree. BTW, my OpenSolaris is running in VirtualBox.

# /usr/local/bin/snmpget -c public -v 2c localhost sysDescr.0
SNMPv2-MIB::sysDescr.0 = STRING: SunOS opensolaris 5.11 snv_101b i86pc

# /usr/local/bin/snmpwalk -c public -v 2c localhost system
SNMPv2-MIB::sysDescr.0 = STRING: SunOS opensolaris 5.11 snv_101b i86pc
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.3
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (115731) 0:19:17.31
SNMPv2-MIB::sysContact.0 = STRING: chihung@singnet.com.sg
SNMPv2-MIB::sysName.0 = STRING: opensolaris
SNMPv2-MIB::sysLocation.0 = STRING: home
SNMPv2-MIB::sysServices.0 = INTEGER: 79
SNMPv2-MIB::sysORLastChange.0 = Timeticks: (1) 0:00:00.01
SNMPv2-MIB::sysORID.1 = OID: SNMP-FRAMEWORK-MIB::snmpFrameworkMIBCompliance
SNMPv2-MIB::sysORID.2 = OID: SNMP-MPD-MIB::snmpMPDCompliance
SNMPv2-MIB::sysORID.3 = OID: SNMP-USER-BASED-SM-MIB::usmMIBCompliance
SNMPv2-MIB::sysORID.4 = OID: SNMPv2-MIB::snmpMIB
SNMPv2-MIB::sysORID.5 = OID: TCP-MIB::tcpMIB
SNMPv2-MIB::sysORID.6 = OID: IP-MIB::ip
SNMPv2-MIB::sysORID.7 = OID: UDP-MIB::udpMIB
SNMPv2-MIB::sysORID.8 = OID: SNMP-VIEW-BASED-ACM-MIB::vacmBasicGroup
SNMPv2-MIB::sysORID.9 = OID: IF-MIB::ifMIB
SNMPv2-MIB::sysORDescr.1 = STRING: The SNMP Management Architecture MIB.
SNMPv2-MIB::sysORDescr.2 = STRING: The MIB for Message Processing and Dispatching.
SNMPv2-MIB::sysORDescr.3 = STRING: The management information definitions for the SNMP User-based Security Model.
SNMPv2-MIB::sysORDescr.4 = STRING: The MIB module for SNMPv2 entities
SNMPv2-MIB::sysORDescr.5 = STRING: The MIB module for managing TCP implementations
SNMPv2-MIB::sysORDescr.6 = STRING: The MIB module for managing IP and ICMP implementations
SNMPv2-MIB::sysORDescr.7 = STRING: The MIB module for managing UDP implementations
SNMPv2-MIB::sysORDescr.8 = STRING: View-based Access Control Model for SNMP.
SNMPv2-MIB::sysORDescr.9 = STRING: The MIB module to describe generic objects for network interface sub-layers
SNMPv2-MIB::sysORUpTime.1 = Timeticks: (0) 0:00:00.00
SNMPv2-MIB::sysORUpTime.2 = Timeticks: (0) 0:00:00.00
SNMPv2-MIB::sysORUpTime.3 = Timeticks: (0) 0:00:00.00
SNMPv2-MIB::sysORUpTime.4 = Timeticks: (0) 0:00:00.00
SNMPv2-MIB::sysORUpTime.5 = Timeticks: (0) 0:00:00.00
SNMPv2-MIB::sysORUpTime.6 = Timeticks: (0) 0:00:00.00
SNMPv2-MIB::sysORUpTime.7 = Timeticks: (0) 0:00:00.00
SNMPv2-MIB::sysORUpTime.8 = Timeticks: (0) 0:00:00.00
SNMPv2-MIB::sysORUpTime.9 = Timeticks: (1) 0:00:00.01

# /usr/local/bin/snmpwalk -c public -v 2c localhost if
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.2 = INTEGER: 2
IF-MIB::ifDescr.1 = STRING: lo0
IF-MIB::ifDescr.2 = STRING: e1000g0
IF-MIB::ifType.1 = INTEGER: softwareLoopback(24)
IF-MIB::ifType.2 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifMtu.1 = INTEGER: 8232
IF-MIB::ifMtu.2 = INTEGER: 1500
IF-MIB::ifSpeed.1 = Gauge32: 127000000
IF-MIB::ifSpeed.2 = Gauge32: 1000000000
IF-MIB::ifPhysAddress.1 = STRING: 
IF-MIB::ifPhysAddress.2 = STRING: 8:0:27:41:ea:8
IF-MIB::ifAdminStatus.1 = INTEGER: up(1)
IF-MIB::ifAdminStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifLastChange.1 = Timeticks: (0) 0:00:00.00
IF-MIB::ifLastChange.2 = Timeticks: (0) 0:00:00.00
IF-MIB::ifInOctets.1 = Counter32: 3686760
IF-MIB::ifInOctets.2 = Counter32: 6909176
IF-MIB::ifInUcastPkts.1 = Counter32: 11974
IF-MIB::ifInUcastPkts.2 = Counter32: 6268
IF-MIB::ifInNUcastPkts.1 = Counter32: 0
IF-MIB::ifInNUcastPkts.2 = Counter32: 0
IF-MIB::ifInDiscards.1 = Counter32: 0
IF-MIB::ifInDiscards.2 = Counter32: 0
IF-MIB::ifInErrors.1 = Counter32: 0
IF-MIB::ifInErrors.2 = Counter32: 0
IF-MIB::ifInUnknownProtos.1 = Counter32: 0
IF-MIB::ifInUnknownProtos.2 = Counter32: 0
IF-MIB::ifOutOctets.1 = Counter32: 3694152
IF-MIB::ifOutOctets.2 = Counter32: 498473
IF-MIB::ifOutUcastPkts.1 = Counter32: 11998
IF-MIB::ifOutUcastPkts.2 = Counter32: 3699
IF-MIB::ifOutNUcastPkts.1 = Counter32: 0
IF-MIB::ifOutNUcastPkts.2 = Counter32: 127
IF-MIB::ifOutDiscards.1 = Counter32: 0
IF-MIB::ifOutDiscards.2 = Counter32: 0
IF-MIB::ifOutErrors.1 = Counter32: 0
IF-MIB::ifOutErrors.2 = Counter32: 0
IF-MIB::ifOutQLen.1 = Gauge32: 0
IF-MIB::ifOutQLen.2 = Gauge32: 0
IF-MIB::ifSpecific.1 = OID: SNMPv2-SMI::zeroDotZero
IF-MIB::ifSpecific.2 = OID: SNMPv2-SMI::zeroDotZero

# /usr/local/bin/snmpwalk -c public -v 2c localhost mib-2
.....

See this image for the SNMP MIB tree layout


Monday, January 05, 2009

Systems Administration Advent Calendar

Recently I came across sysadvent: the Systems Administration Advent Calendar. It featured 25 sysadmin articles, published between 1-25 Dec 2008 and contributed by a number of great system administrators.


Friday, January 02, 2009

Don't Shout At Your Disks

I came across this blog from Brendan Gregg. See this video to find out why you shouldn't shout at your disks.
