Wednesday, October 29, 2008

Difficult NAWK to Understand

I have not been involved in the UNIX.com shell programming forum for almost a month. Today I received an email reminder from the forum administrator. One of the questions that caught my attention is this nawk one-liner:
I found a command who prints x lines before and after a line who contain a searched string in a text file. The command is :
nawk 'c-->0;$0~s{if(b)for(c=b+1;c>1;c--)print r[(NR-c+1)%b];print;c=a}b{r[NR%b]=$0}' b=2 a=4 s="string" file1
It works very well but I can't understand the syntax, too difficult with "man nawk". Is that some one who will be able to comment this syntax ?

The one-liner relies on many awk shortcuts and defaults, which is what makes it so cryptic. My 'deciphered' version (note that 'c-->0' parses as 'c-- > 0', i.e. compare first, then decrement):

nawk -v before=2 -v after=4 -v search="string" '
current-- > 0 {
        print
}
$0 ~ search{
        if ( before ) {
                for ( current=before+1 ; current>1 ; current-- ) {
                        print rec[(NR-current+1)%before]
                }
        }
        print
        current=after
}
before{
        rec[NR%before]=$0
}' file1

I think the code is now pretty self-explanatory, hopefully :-)
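
To see the context-printing logic at work without a real file, here is a minimal sketch that wraps the same program in a shell function and feeds it a few sample lines. The function name context_grep and the sample input are made up for illustration, and plain awk is assumed to behave like nawk for this program.

```shell
context_grep() {
    # $1 = lines to print before a match, $2 = lines after, $3 = search pattern
    # Reads from standard input.
    awk -v before="$1" -v after="$2" -v search="$3" '
    current-- > 0 { print }              # still inside the "after" window
    $0 ~ search {
        if (before)                      # replay the buffered "before" lines
            for (current = before + 1; current > 1; current--)
                print rec[(NR - current + 1) % before]
        print                            # the matching line itself
        current = after                  # open a new "after" window
    }
    before { rec[NR % before] = $0 }     # ring buffer of the last "before" lines
    '
}

# 2 lines before and 1 line after the line containing MATCH
printf '%s\n' one two three MATCH five six seven | context_grep 2 1 MATCH
```

With this sample input the function prints two, three, MATCH and five, one per line.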


Sed Trick

My friend is asking me ...
I need to run a sed command to change "CONSOLE=/dev/console" to "#CONSOLE=/dev/console". But the back slashes are giving me problems.

In Solaris, you can allow root direct access by commenting out 'CONSOLE=/dev/console' in the /etc/default/login file.

If you use sed 's/.../.../' as below, you need to escape every forward slash in the pattern as '\/':

sed 's/CONSOLE=\/dev\/console/#CONSOLE=\/dev\/console/' /etc/default/login

The above is definitely very clumsy. Do you know that sed allows you to use another character (any character you choose) as the pattern separator? In my case I used "!" because "!" does not appear in the sed expression, and therefore I do not need any backslashes.

sed 's!CONSOLE=/dev/console!#CONSOLE=/dev/console!' /etc/default/login

You can even shorten the sed command by asking it to remember the matched pattern so that you can reuse it. Anything enclosed in parentheses (which you need to escape as \( and \)) is remembered, and you can refer to it as \1 for the first group, \2 for the second, etc.

sed 's!\(CONSOLE=/dev/console\)!#\1!' /etc/default/login
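
The same substitution can be tried safely on sample input instead of the real /etc/default/login. This is only a sketch: the function names are hypothetical, the leading ^ anchor is my addition so that an already commented line is left alone, and the second variant uses &, which sed replaces with the whole matched text.

```shell
comment_out_console() {
    # Alternative delimiter "!" plus a remembered group \( ... \) reused as \1
    sed 's!^\(CONSOLE=/dev/console\)!#\1!'
}

comment_out_console_amp() {
    # Same effect: & stands for everything the pattern matched
    sed 's!^CONSOLE=/dev/console!#&!'
}

printf '%s\n' 'PASSREQ=YES' 'CONSOLE=/dev/console' | comment_out_console
printf '%s\n' 'PASSREQ=YES' 'CONSOLE=/dev/console' | comment_out_console_amp
```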


Monday, October 27, 2008

Power of 2, Part 2

A month ago I was trying to find consecutive zeroes in powers of 2. Nine consecutive zeroes appear in 2^100823 (this number has 30,351 digits!), found after 4.13 hours of runtime on an AMD Opteron 2.6GHz CPU.
$ python
Python 2.5.2 (r252:60911, Oct  2 2008, 09:47:11) [C] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> zero9=2**100823
>>> len(str(zero9))
30351
>>> '0'*9 in str(zero9)
True
>>>

BTW, my Python program has been running for 609h 41m 10s (25.4 days) and it still has not found 10 consecutive zeroes in 2^N. The question is, does any 2^N ever contain 10 consecutive zeroes? I will leave my program running to 'find' out the truth.


Give Your Shell Some Colours

Came across this Colorful Shells -- Using ANSI Color Codes. I think you can start putting colours in your shell scripting.

Instead of using capital letters to draw someone's attention, you can use red (or even blink) to show error messages and green to show success:

red="\033[31m"
green="\033[32m"
off="\033[0m"
echo -e "${red}Error. Unable to run command${off}"
echo -e "${green}OK. Run successful.${off}"

If you run the script from the above-mentioned article (shown below), you will see the effect of every combination of these sequences.

for attr in 0 1 4 5 7 ; do
  echo "----------------------------------------------------------------"
  printf "ESC[%s;Foreground;Background - \n" $attr
  for fore in 30 31 32 33 34 35 36 37; do
    for back in 40 41 42 43 44 45 46 47; do
      printf '\033[%s;%s;%sm %02s;%02s ' $attr $fore $back $fore $back
    done
    printf '\n'
  done
  printf '\033[0m'
done

Here is the summary of the sequence:

  • Text properties - 0(default), 1(bold), 22(not bold), 4(underlined), 24(not underlined), 5(blinking), 25(not blinking), 7(inverse), 27(not inverse)
  • Foreground colour - 30(black), 31(red), 32(green), 33(yellow), 34(blue), 35(magenta), 36(cyan), 37(white)
  • Background colour - 40(black), 41(red), 42(green), 43(yellow), 44(blue), 45(magenta), 46(cyan), 47(white)
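
Since 'echo -e' is not understood by every shell, a small helper built on printf may be more portable. This is a sketch only: the function name colour_msg is hypothetical, and it simply wraps the message in one of the codes listed above followed by the reset sequence.

```shell
colour_msg() {
    # $1 = attribute or colour code from the list above (e.g. 31 for red)
    # remaining arguments = the message to print
    code=$1
    shift
    printf '\033[%sm%s\033[0m\n' "$code" "$*"
}

colour_msg 31 "Error. Unable to run command"
colour_msg 32 "OK. Run successful."
```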


Wednesday, October 22, 2008

Interview Question, Part 2

A few days ago I blogged about the interview question and I hope that you have given it a try. As I mentioned in that post, the ability to create the scenario is also a skill in itself.

OK, let's go ahead and create 1000 files (img-*.png) and remove some of them for this exercise. In the Bash shell, it is very easy to write a 'for' loop from 1 to 1000 to create (touch) 1000 files:

prefix="img-"
ext="png"
for ((num=1;num<=1000;++num))
do
    touch $prefix$num.$ext
done

There you go, you have 1000 files for free. Suppose we delete img-111.png, img-222.png and img-333.png to represent the missing files that we are supposed to uncover. Using the same construct as above, we can loop from 1 to 1000 and test for the existence of each file. If it does not exist, simply print out the file name:

prefix="img-"
ext="png"
for ((num=1;num<=1000;++num))
do
    fname="$prefix$num.$ext"
    if [ ! -f "$fname" ]
    then
        echo "$fname does not exist"
    fi
done

Mission accomplished, but not for me. Why not challenge yourself by asking: can we do better than that, or can we do away with the loop altogether? In the Bash shell, brace expansion generates the file names for you, so you do not need a counting 'for' loop. Below shows how to create 1000 files in a one-liner and how to use the curly braces in a 'for fname in' loop. I have also introduced a 'short-circuit' test in the loop.

$ touch img-{1..1000}.png

$ for fname in img-{1..1000}.png
do
    [ -f "$fname" ] || echo "$fname does not exist"
done

The Bash shell really gives you a lot of flexibility. If you were to use the Bourne shell, you would have to do the counting explicitly, like this (code snippet):

num=1
while [ $num -le 1000 ]
do
   ...
   num=`expr $num + 1`
done

Here is the bonus point. You DO NOT need a loop in this exercise. As the command line length limit (see /usr/include/limits.h in Solaris) is 1048320 for a 32-bit program and 2096640 for a 64-bit program, you can take advantage of this and let the Bash shell expand all the file names using curly braces. As for the 3 missing files, you can do an 'ls -1' on the 1000 files (using curly braces again) and let the system report errors for the missing ones. Since the listing of the 997 existing files goes to standard output and the errors for the 3 missing files go to standard error, we can simply redirect standard output to /dev/null. The missing files are then reported automatically.

$ touch img-{1..1000}.png

$ rm -f img-111.png img-222.png img-333.png

$ ls -1 img-{1..1000}.png > /dev/null
img-111.png: No such file or directory
img-222.png: No such file or directory
img-333.png: No such file or directory

Hope you find this exercise challenging. The whole idea is to always challenge yourself, and not to be satisfied with just one solution. Here is the challenge: if you do an 'ls -1' listing of img-*.png, the order will start with img-1.png, img-10.png, img-100.png, img-1000.png, img-101.png, ... like below. How can we get a listing in numerical order, i.e., img-1.png, img-2.png, img-3.png, ...?

$ ls -1 img-*.png
img-1.png
img-10.png
img-100.png
img-1000.png
img-101.png
img-102.png
img-103.png
img-104.png
img-105.png
img-106.png
img-107.png
...
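
One possible answer to the challenge, offered as a sketch rather than the only solution: let sort split each name at the "-" and compare the second field numerically. The function name list_numeric is hypothetical, and the scratch directory relies on mktemp -d being available.

```shell
list_numeric() {
    # Field 2 is whatever follows the first "-" (e.g. "10.png");
    # the n modifier compares its leading digits as a number.
    ls -1 img-*.png | sort -t- -k2,2n
}

# Try it in a scratch directory so that no real files are touched
scratch=$(mktemp -d) && cd "$scratch" || exit 1
touch img-1.png img-2.png img-10.png img-100.png img-1000.png
list_numeric
```

This only works while the prefix itself contains no "-"; a different prefix would need a different field separator or key.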


Monday, October 20, 2008

Interview Question

Nowadays I need to conduct a lot of interviews. When a candidate mentions "shell scripting" in his/her resume, I will normally try to find out how he/she approaches a typical scenario like this one:

[Question]: Suppose you are expecting 1000 files in the directory and the filename starts with a prefix followed by a running number say from 1 to 1000 and file extension. Eg, img-1.png, img-2.png, ... etc. However, when you do a ls -l img* | wc -l, you realise that you only have 997 files there. The question will be, how can you tell which 3 files are missing ?

BTW, this is a typical situation I encounter during animation rendering in a grid computing environment. Sometimes an image just cannot be rendered for no obvious reason and I have to re-submit those missing frames.

I will let my readers think about this problem before I post the solution. IMO, the ability to simulate such a situation, i.e. being able to generate 1000 files in a directory, is also a skill. 1.5 years ago I blogged about the limit (32765) on the number of directories in Solaris and I was able to simulate that situation.


Friday, October 17, 2008

How Virtual Hosting Works

Probably most people know that web servers support virtual hosts, but not many know how it works from the web server's point of view. Let me use my two blog addresses (http://chihungchan.blogspot.com/ and http://model4maths.blogspot.com/) as an example. If you do an nslookup, you will see that they both point to the same host, blogspot.l.google.com (209.85.175.191), because blogspot uses virtual hosting.
$ nslookup chihungchan.blogspot.com
Server:         203.166.128.168
Address:        203.166.128.168#53

Non-authoritative answer:
chihungchan.blogspot.com        canonical name = blogspot.l.google.com.
Name:   blogspot.l.google.com
Address: 209.85.175.191


$ nslookup model4maths.blogspot.com
Server:         203.166.128.168
Address:        203.166.128.168#53

model4maths.blogspot.com        canonical name = blogspot.l.google.com.
Name:   blogspot.l.google.com
Address: 209.85.175.191

If you telnet to the host at port 80 and try to get the home page with a bare-bones HTTP request, you will not get what you want.

$ telnet chihungchan.blogspot.com 80
Trying 209.85.175.191...
Connected to chihungchan.blogspot.com (209.85.175.191).
Escape character is '^]'.
GET / HTTP/1.1

HTTP/1.1 302 Found
Location: http://www.google.com.sg/
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=eb19920a5651e1f6:TM=1224256716:LM=1224256716:S=TUKFXgZR8MBbM
KAf; expires=Sun, 17-Oct-2010 15:18:36 GMT; path=/; domain=.google.com
Date: Fri, 17 Oct 2008 15:18:36 GMT
Server: gws
Content-Length: 222

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.com.sg/">here</A>.
</BODY></HTML>

According to the HTTP/1.1 specification, you need to send the Host: request header to identify which host (or virtual host) you want to get information from. Below I telnet to chihungchan.blogspot.com but get the page from model4maths.blogspot.com:
$ telnet chihungchan.blogspot.com 80 | grep "<title>"
GET / HTTP/1.1
Host: model4maths.blogspot.com

<title>Model Approach for Primary School Maths</title>
Connection closed by foreign host.
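
The request typed into telnet above can also be generated by a script. A minimal sketch: the function name build_request is hypothetical, the "Connection: close" header is my addition so that the server ends the connection after one response, and sending the request relies on a netcat (nc) binary and network access, both of which are assumptions.

```shell
build_request() {
    # $1 = virtual host name for the Host: header, $2 = path (defaults to /)
    # HTTP lines end in CRLF and the header block ends with an empty line.
    printf 'GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "${2:-/}" "$1"
}

# Show the request text (carriage returns removed for display)
build_request model4maths.blogspot.com | tr -d '\r'

# To actually send it to the shared server (needs nc and network access):
#   build_request model4maths.blogspot.com | nc chihungchan.blogspot.com 80 | grep -i '<title>'
```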

All modern browsers and command line utilities automatically add the "Host:" HTTP header when you request a page. You may want to install the LiveHTTP Headers add-on in your Firefox browser to see what is under the hood. Below are the request and the response for this blog:

GET / HTTP/1.1
Host: chihungchan.blogspot.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: .......

HTTP/1.x 200 OK
Content-Type: text/html; charset=UTF-8
Last-Modified: Fri, 17 Oct 2008 15:01:20 GMT
Cache-Control: max-age=0 private
Etag: "f5b850fe-6429-414d-a845-2ff9cb6c82f7"
Content-Encoding: gzip
Transfer-Encoding: chunked
Date: Fri, 17 Oct 2008 15:40:04 GMT
Server: GFE/1.3


Accountability, At The Shell Level

Recently one of my projects had an 'accountability' problem. A service daemon went down at 3+am and we had no clue why it was not responding to requests. From the 'last' command, we realised that someone had logged in to the server, but we were not able to trace what the user had done. BTW, this user owns that service daemon and has all the rights to start/stop the service.

To prevent such an incident from happening again, maybe we can 'track' that user's activity. In Solaris, I can turn on the C2 log as mentioned in my previous blog. However, that service daemon is running on Linux.

One quick fix is to tap into the history capability of the Bash shell. According to the bash man page, these are the variables that control the command history:

       HISTTIMEFORMAT
              If  this  variable  is  set and not null, its value is used as a
              format string for strftime(3) to print the time stamp associated
              with  each  history  entry displayed by the history builtin.  If
              this variable is set, time stamps are  written  to  the  history
              file so they may be preserved across shell sessions.
       HISTFILE
              The name of the file in which command history is saved (see HIS-
              TORY  below).   The default value is ~/.bash_history.  If unset,
              the command history is  not  saved  when  an  interactive  shell
              exits.
       HISTFILESIZE
              The maximum number of lines contained in the history file.  When
              this variable is assigned a value, the  history  file  is  trun-
              cated,  if necessary, by removing the oldest entries, to contain
              no more than that number of lines.  The default  value  is  500.
              The history file is also truncated to this size after writing it
              when an interactive shell exits.
       HISTSIZE
              The number of commands to remember in the command  history  (see
              HISTORY below).  The default value is 500.

It is possible to put the history file in another location instead of the default $HOME/.bash_history. We can set that in /etc/profile like this, so that the file is unique for every login session, based on time and process ID.

_HISTDIR="/history/`whoami`"
[ -d "$_HISTDIR" ] || mkdir "$_HISTDIR"
HISTFILE="$_HISTDIR/`date '+%Y%m%d%H%M%S'`-$$"
HISTFILESIZE=3000
HISTSIZE=3000
HISTTIMEFORMAT="%Y-%m-%dT%H:%M:%S "

The /history directory has to be world-writable with the sticky bit on (mode 1777, like /tmp) so that any user can create their own directory and own its content. Because of HISTTIMEFORMAT, a timestamp is also recorded in the history file.
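
The shared top-level directory could be prepared once by root roughly like this. It is a sketch under assumptions: the function name make_history_root is hypothetical, and mode 1777 (world-writable plus sticky bit, as on /tmp) is my reading of what the setup needs. The demo uses a scratch path from mktemp -d rather than the real /history.

```shell
make_history_root() {
    # $1 = directory to create (the post uses /history)
    # 1777 = read/write/execute for everyone, plus the sticky bit so that
    # users cannot remove or rename each other's entries
    mkdir -p "$1" && chmod 1777 "$1"
}

# On the real server, as root:  make_history_root /history
demo="$(mktemp -d)/history"
make_history_root "$demo"
ls -ld "$demo"
```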

$ cat /history/`whoami`/20081017220353-16324
#1224252235
history
#1224252247
cd /history/chihung/
#1224252248
ls
#1224252249
ls -l
#1224252252
more 20081017220337

Now we should get some accountability.


Thursday, October 16, 2008

Open Source Ponytail

Came across the video below from http://cuddletech.com/blog/pivot/entry.php?id=977. If you are a Sun/Solaris fan, I am sure you will like this video.

Wednesday, October 15, 2008

The Ultimate ZFS Tutorial in Video

Bill Moore and Jeff Bonwick gave a three-hour tutorial on ZFS at the 2008 Storage Developer Conference:
  1. Part 1
  2. Part 2
  3. Part 3

Want to learn more about ZFS? Here are some useful links:


Tuesday, October 14, 2008

A Pretty Good Shell Scripting Primer

I stumbled upon a pretty good document in Apple's Developer Connection - Shell Scripting Primer. I particularly like the "Designing Scripts for Cross-Platform Deployment" and "Performance Tuning" chapters.

Here is the table of contents:

  1. Shell Script Basics
  2. Result Codes, Subroutines, Scoping, Sourcing
  3. Paint by Numbers
  4. Regular Expressions Unfettered
  5. How awk-ward
  6. Designing Scripts for Cross-Platform Deployment
  7. Advanced Techniques
  8. Performance Tuning
  9. Appendix A: Other Tools and Information
  10. Appendix B: An Extreme Example: The Monte Carlo (Bourne) Method for Pi


Friday, October 10, 2008

New Blog

I always wanted to create a new blog to talk about primary school mathematics using the model approach. Now I think I can afford the time to do so because my son has just finished his PSLE this week.

Feel free to visit http://model4maths.blogspot.com/
Model Approach for Primary School Maths
A Model A Day, Keeps The Algebra Away


Wednesday, October 08, 2008

Shell Script Consolidation, Part 2

After the success in consolidating the Sun ONE Web Server scripts, I tried to tackle those scripts for the middleware. The middleware includes content management server and application server. To complicate the matter, two instances of the same middleware are running in the same box and listening at different ports.

The logic of the original script is basically to ensure that the script is executed by a specific user. It also tries to find out the status of the process before taking any action (start, stop). Suppose we have:

  1. INSTANCE 1
    Install Directory: /opt/middleware/u01
    User: "user1"
    Executable: /opt/middleware/u01/bin/cms.exe
    Start/Stop script: /opt/scripts/u01/cms.sh
  2. INSTANCE 2
    Install Directory: /opt/middleware/u02
    User: "user2"
    Executable: /opt/middleware/u02/bin/cms.exe
    Start/Stop script: /opt/scripts/u02/cms.sh

The original start/stop script for the middleware somehow uses an "inverted match" (-v in grep) to check for a process that is not owned by the other user. For example, for INSTANCE 1 it does this:
RUN=`ps auxwww | grep cms.exe | grep -v grep | grep -v user2 | wc -l`

Although the above command sequence is not a foolproof way to locate the correct process, the inverted match manages to work in this case with two instances of the middleware. What will happen if we have 3 or more instances running on the same hardware? I will leave it to my readers to figure that out :-)

In Linux, and also in Solaris, there is a more reliable way to locate the process. The commands you want to explore are pgrep and pkill.
A return code of 0 (zero) from pgrep -u user1 cms.exe > /dev/null 2>&1 indicates that a process "cms.exe" owned by "user1" exists.
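
A start/stop script could wrap that check in a small function, roughly as sketched here. The function name is_running is hypothetical, "user1" and "cms.exe" are the example names from this post, and the -x option (exact name match, so that the "." in cms.exe is not left to match arbitrary longer names) is my addition.

```shell
is_running() {
    # $1 = process owner, $2 = exact process name
    # The exit status is 0 only if at least one matching process exists.
    pgrep -x -u "$1" "$2" > /dev/null 2>&1
}

if is_running user1 cms.exe; then
    echo "cms.exe owned by user1 is running"
else
    echo "cms.exe owned by user1 is not running"
fi
```

Because the owner is passed in as a parameter, the same check works for any number of instances, which the inverted grep approach cannot offer.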

By right we should only have one set of scripts to handle all the middleware, but "by left" ...


Monday, October 06, 2008

Shell Script Consolidation

My ex-colleague created a lot of web instances under the Sun ONE Web Server (S1WS) installation. However, for every instance of the web server, he also created a separate script to start/stop/query the web instance. Basically all these scripts are more or less the same except for the instance directory and listening port. BTW, the way he found the process ID of the web instance is not really foolproof because it is based on listing the processes and grepping for the instance name (ps -ef | grep instance).

If you look under the 'hood', you will realise that the S1WS start/stop scripts actually make use of a utility called 'parsexml' to get information related to their instance. 'parsexml' is located at $INSTALL_DIR/lib/parsexml, but there isn't any documentation on how to use it. Thanks to the UNIX utility 'strings', which prints the strings of printable characters in a file, we discovered its usage and the corresponding variable names by running 'strings' on the parsexml file. So, to find out the PID of the web instance, you display the content of the file reported for PID_FILE: $INSTALL_DIR/lib/parsexml path -g PID_FILE (where path is the instance's config directory, as used in the script further down)

# $INSTALL_DIR/lib/parsexml -h
failure: CONF1115: Error opening -h/server.xml (File not found)

# $INSTALL_DIR/lib/parsexml --help
failure: CONF1115: Error opening --help/server.xml (File not found)

# $INSTALL_DIR/lib/parsexml
Usage: parsexml path -g option
       parsexml path

# strings $INSTALL_DIR/lib/parsexml
...

Usage: parsexml path -g option
       parsexml path
server.xml
SERVER_PID_FILE
SERVER_TEMP_DIR
SERVER_USER
SERVER_PLATFORM_SUBDIR
SERVER_JVM_LIBPATH
%s: invalid option
Options: PID_FILE
         TEMP_DIR
         USER
         JVM_LIBPATH
         PLATFORM_SUBDIR
...

# $INSTALL_DIR/lib/parsexml $INSTALL_DIR/https-chihungchan.blogspot.com/config -g PID_FILE
/tmp/https-chihungchan.blogspot.com-9d93b4d6-1/pid

# cat /tmp/https-chihungchan.blogspot.com-9d93b4d6-1/pid
5266

Since all these scripts are similar, we should be able to 'consolidate' them into a single generic script (I called it s1ws.sh). To do so, we need to supply the script with the instance name. However, the instance name can be pretty long and it would not be user-friendly to ask the user to type it every time he/she needs to run the script. I cannot remember where I came across the auto-complete trick in the bash shell, but by default it completes the names of files/directories in your current working directory. In our case, we cannot expect the user to always run our script (s1ws.sh) from the $INSTALL_DIR directory. The trick is to run some UNIX commands (ls -1d ... | sed ...) to extract the instance names and supply them to 'complete', which is a bash builtin command.

_COMP=($(ls -1d /opt/SUNWwbsvr/admin-server /opt/SUNWwbsvr/https-* | sed 's#.*/##'))
complete -o default -W "${_COMP[*]}" s1ws.sh

In the s1ws.sh script, the user can query the status of the web instance. The status includes the process ID, the process name and arguments, the listening port and its corresponding netstat LISTEN output. The listening port number can be obtained from '$INSTALL_DIR/$INSTANCE/config/server.xml'. With the xml_grep command in Linux, you can extract the port number using XPath notation. In our case, the XPath is /server/http-listener/port. Here is a snippet of the server.xml:

<server>
  ...
  <http-listener>
    <name>http-listener-1</name>
    <ip>192.168.1.2</ip>
    <port>80</port>
    <server-name>chihungchan.blogspot.com</server-name>
    <default-virtual-server-name>chihungchan.blogspot.com</default-virtual-server-name>
  </http-listener>
  ...
</server>

Here is the generic "s1ws.sh":

#! /bin/sh


INSTALL_DIR="/opt/SUNWwbsvr"
if [ ! -d "$INSTALL_DIR" ]; then
        echo "Error: \"$INSTALL_DIR\" directory does not exist"
        exit 1
fi
PARSEXML="$INSTALL_DIR/lib/parsexml"


usage()
{
        echo "Usage: $0 <instance> <start|stop|status>"
        echo ""
}


if [ $# -ne 2 ]; then
        usage
        exit 1
fi


INSTANCE="$1"
ACTION="$2"


isRunning()
{
        IS_RUNNING=0

        CONFIG_DIR="$INSTALL_DIR/$INSTANCE/config"
        if [ -d "$CONFIG_DIR" ]; then
                PID_FILE=`$PARSEXML "$CONFIG_DIR" -g PID_FILE`
                if [ -f "$PID_FILE" ]; then
                        PID=`cat "$PID_FILE"`
                else
                        echo "***WARNING*** Web instance \"$INSTANCE\" is not running"
                        return
                fi
        else
                echo "***ERROR*** Web instance \"$INSTANCE\" does not contain any configuration file"
                return
        fi

        #
        # check if process exist, via /proc
        #
        if [ -d /proc/$PID ]; then
                IS_RUNNING=1
                if [ "$VERBOSE" -eq 1 ]; then
                        echo "Process ID: $PID"
                        printf '%s\n\n' "`tr '\000' ' ' < /proc/$PID/cmdline`"

                        # listen to port
                        port=`xml_grep --cond /server/http-listener/port --text_only $INSTALL_DIR/$INSTANCE/config/server.xml`
                        echo "Listening at port: $port"
                        netstat -an | awk '$6 == "LISTEN" && $4 ~/:'$port'$/{print}'

                fi
        fi
}


VERBOSE=0
case "$ACTION" in
status)
        VERBOSE=1
        isRunning
        if [ "$IS_RUNNING" -eq 0 ]; then
                exit 1
        else
                exit 0
        fi
        ;;
start)
        isRunning
        if [ "$IS_RUNNING" -eq 1 ]; then
                echo "***WARNING*** Web instance \"$INSTANCE\" is still running. No need to start"
        else
                $INSTALL_DIR/$INSTANCE/bin/startserv
                isRunning
                if [ "$IS_RUNNING" -eq 1 ]; then
                        echo "***INFO**** Successfully started."
                fi
        fi
        ;;
stop)
        isRunning
        if [ "$IS_RUNNING" -eq 0 ]; then
                echo "***WARNING*** Web instance \"$INSTANCE\" is stopped. No need to stop"
        else
                $INSTALL_DIR/$INSTANCE/bin/stopserv
                isRunning
                if [ "$IS_RUNNING" -eq 0 ]; then
                        echo "***INFO**** Successfully stopped."
                fi
        fi
        ;;
*)
        usage
        exit 1
        ;;
esac

and "s1ws.sh" in action:

# s1ws.sh
Usage: /usr/local/bin/s1ws.sh <instance> <start|stop|status>

# s1ws.sh <tab>
admin-server https-chihungchan.blogspot.com https-chihungchan.blogspot.com-dev1 https-chihungchan.blogspot.com-dev2 https-chihungchan.blogspot.com-test1 https-chihungchan.blogspot.com-test2 https-chihungchan.blogspot.com-uat

# s1ws.sh admin-server status
Process ID: 7215
webservd-wdog -d /opt/SUNWwbsvr/admin-server/config -r /opt/SUNWwbsvr -t /tmp/admin-server-9d93b4d6-1 -u root

Listening at port: 8989
tcp        0      0 0.0.0.0:8989                0.0.0.0:*                   LISTEN

# s1ws.sh admin-server start
***WARNING*** Web instance "admin-server" is still running. No need to start

# s1ws.sh admin-server stop
server has been shutdown
***WARNING*** Web instance "admin-server" is not running
***INFO**** Successfully stopped.

# s1ws.sh admin-server status
***WARNING*** Web instance "admin-server" is not running

# s1ws.sh admin-server start
***WARNING*** Web instance "admin-server" is not running
Sun Java System Web Server 7.0U1 B06/12/2007 21:21
info: CORE3016: daemon is running as super-user
info: CORE5076: Using [Java HotSpot(TM) Server VM, Version 1.5.0_09] from [Sun Microsystems Inc.]
info: WEB0100: Loading web module in virtual server [admin-server] at [/admingui]
info: WEB0100: Loading web module in virtual server [admin-server] at [/jmxconnector]
info: HTTP3072: admin-ssl-port: https://chihungchan.blogspot.com:8989 ready to accept requests
info: CORE3274: successful server startup
***INFO**** Successfully started.

# s1ws.sh admin-server status
Process ID: 7331
webservd-wdog -d /opt/SUNWwbsvr/admin-server/config -r /opt/SUNWwbsvr -t /tmp/admin-server-9d93b4d6-1 -u root

Listening at port: 8989
tcp        0      0 0.0.0.0:8989                0.0.0.0:*                   LISTEN

# s1ws.sh https-chihungchan.blogspot.com status
Process ID: 5266
webservd-wdog -d /opt/SUNWwbsvr/https-chihungchan.blogspot.com/config -r /opt/SUNWwbsvr -t /tmp/https-chihungchan.blogspot.com-9d93b4d6-1 -u webservd

Listening at port: 80
tcp        0      0 192.168.1.2:80             0.0.0.0:*                   LISTEN

After this 'shell script consolidation' exercise, we only need to remember and maintain one script instead of half a dozen scripts or more.


Saturday, October 04, 2008

Regular Expression

My colleague was asking me how to list the settings in the default /etc/squid/squid.conf. This file has 4325 lines: 4030 comment lines, 260 blank lines and only 35 settings (in RedHat). As you can see, the configuration file is heavily commented and it is extremely hard to locate the settings.

Regular expressions to the rescue. Here I am going to walk you through how we arrive at the final solution:

  1. grep -v '#' /etc/squid/squid.conf
    This gives you the lines without (-v) any occurrence of '#'. The quotes are needed, otherwise the shell treats '#' as the start of a comment. However, this also drops settings that carry a trailing comment, such as "acl Safe_ports port 80 # http"
  2. egrep -v '^#' /etc/squid/squid.conf
    This "Extended Grep" understands extended regular expressions in the pattern. ^ is an anchor that represents the start of the line, so '^#' matches lines that start with #. What if a comment line starts with blank spaces or tabs before the '#'?
  3. egrep -v '^[[:blank:]]*#' /etc/squid/squid.conf
    A bracket expression matches any single character from the set inside the brackets. In our case we want a space or a tab. grep does not understand the escape sequence "\t" inside brackets (it would match a backslash or the letter t), so we use the POSIX character class [[:blank:]], which stands for space and tab. The * matches the preceding element zero or more times. Although we have got rid of the comments, we still have a lot of blank lines to deal with.
  4. egrep -v '^[[:blank:]]*#' /etc/squid/squid.conf | egrep -v '^$'
    How about taking advantage of a pipe to run the previous step's output through another 'egrep' to get rid of the blank lines? ^ and $ are anchors for the start and the end of the line, so ^$ means no character in the line. OK, that's what we want, but can we do it with just a single egrep? Of course we can.
  5. egrep -v '(^[[:blank:]]*#|^$)' /etc/squid/squid.conf
    With grouping "()" and alternation "|", we are telling egrep to match either a comment or a blank line. What if the blank lines are not really blank, but contain spaces or tabs?
  6. egrep -v '(^[[:blank:]]*#|^[[:blank:]]*$)' /etc/squid/squid.conf
    This will do the job!

If your grep is GNU grep, you can write it in a more compact syntax with the \s shorthand (an extension borrowed from Perl, not part of POSIX):
egrep -v '(^\s*#|^\s*$)' /etc/squid/squid.conf
\s is equivalent to the POSIX class [[:space:]], the whitespace characters (space, tab, carriage return, newline, vertical tab, form feed)
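
The final pattern can be checked on a few sample lines before running it against the real squid.conf. This is a minimal sketch: the function name active_settings and the sample input are made up, and it uses 'grep -E' with the POSIX class [[:space:]] so that it does not depend on GNU extensions.

```shell
active_settings() {
    # Drop lines that are comments (optionally indented) or contain only
    # whitespace; everything else is passed through from standard input.
    grep -Ev '^[[:space:]]*(#|$)'
}

# Sample input: a comment, an empty line, an indented comment, two settings
# and a whitespace-only line (printf turns \t into a real tab)
printf '# comment\n\n  \t# indented comment\nhttp_port 3128\nacl Safe_ports port 80 # http\n \t \n' |
    active_settings
```

Only the http_port and acl lines survive, including the acl line with its trailing comment.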

Regular expressions are definitely a life saver if you need to manipulate data. Do you know that lots of other commands have regular expression support built in? Run this to find out which commands (man section 1) have this support:

cd /usr/share/man/man1
for i in *gz
do
    zgrep -li regexp $i
done

BTW, sed (stream editor) can do the same job without an inverted match (-v); the \s here assumes GNU sed:
sed -e '/^\s*#/d;/^\s*$/d' /etc/squid/squid.conf


Wednesday, October 01, 2008

Power of 2

I came across this link yesterday and found the book that the author mentioned pretty interesting. The book is Impossible?: Surprising Solutions to Counterintuitive Conundrums. Guess what, our excellent National Library has a number of copies available. Although most of the maths covered in the book are very hard for me to swallow, I found this "Power of 2" section very interesting from the viewpoint of programming (BTW, I am thinking in Python).

The original text in the book:

... the first power of 2 with 8 consecutive zeroes, July 1963, ... the authors provided what the note's title suggests:
Consecutive zeros    Power of 2
2                    53
3                    242
4                    377
5                    1491
6                    1492
7                    6801
8                    14007
the first power of 2 which contains precisely eight consecutive zeros. To be explicit, taking the first case, 2^53 = 9 007 199 254 740 992 ...

The Karst's IBM 1620 computer took 1 hour 18 minutes to find those eight consecutive zeros on 1 January 1964, ...

With Python, it is not difficult at all to do multi-precision integer arithmetic. Here is my version, and it took only 46.67 seconds on my office notebook to compute up to eight consecutive zeros. My notebook (Dell Latitude D630 with Intel Core Duo CPU T7500 @ 2.20GHz) is 100.28 times faster than the IBM 1620! What would happen if I could bring this notebook back to the 1960s? Also, I am curious to know how many punch cards they needed for that calculation. My Python script is only 18 lines (including blank lines) and I am sure the code base shrank by a factor of more than 100 as well. Now you get a glimpse of the advancement of both hardware and software over almost half a century.

$ cat power2.py
#! /usr/bin/python

import sys,time


num=1L
count=1L
start=time.time()
while True:
        try:
                if '0'*count in str(2**num):
                        print num, 'elapsed time: %f secs' % (time.time()-start)

                        count+=1
                num+=1
        except KeyboardInterrupt:
                print 'Calculated till:', num
                sys.exit(1)

$ ./power2.py
10 elapsed time: 0.000000 secs
53 elapsed time: 0.000000 secs
242 elapsed time: 0.015000 secs
377 elapsed time: 0.015000 secs
1491 elapsed time: 0.078000 secs
1492 elapsed time: 0.078000 secs
6801 elapsed time: 5.531000 secs
14007 elapsed time: 46.671000 secs
