Friday, August 27, 2010

Z to gz On The Fly

If you have a lot of compress'ed files (*.Z) and you want to convert them to gzip format to save disk space, here is a function you can use. The function, Z2gz, makes use of standard input and standard output to uncompress and gzip on the fly, avoiding the creation of any intermediate file. It also carries the original file's timestamp and ACL (and hence its permissions) over to the new file.

Z2gz()
{
for Z in *.Z
do
    [ -f "$Z" ] || continue
    gz="${Z%.Z}.gz"
    uncompress < "$Z" | gzip > "$gz"
    touch -r "$Z" "$gz"
    getfacl "$Z" | setfacl -f - "$gz"
    rm -f "$Z"
done
}

Z2gz in action. In this exercise, we saved 11,970,771 bytes

$ ls -l *.Z
-rw-------   1 chihung   chihung     5278987 Feb  2  2009 apache.tar.Z
-rw-r-----   1 chihung   chihung     1479993 Aug 27 09:29 freetype-2.3.1-sol10-sparc-local.Z
-rw-r-----   1 chihung   chihung     25318755 Feb  2  2009 ganglia.tar.Z
-rw-r-----   1 chihung   chihung      797471 Aug 27 09:29 libgcc-3.4.6-sol10-sparc-local.Z
-rw-r--r--   1 chihung   chihung     4039745 Feb 19  2009 mediawiki-1.6.12.tar.Z

$ Z2gz

$ ls -l *.gz
-rw-------   1 chihung   chihung     3641373 Feb  2  2009 apache.tar.gz
-rw-r-----   1 chihung   chihung     1019829 Aug 27 09:29 freetype-2.3.1-sol10-sparc-local.gz
-rw-r-----   1 chihung   chihung     16940517 Feb  2  2009 ganglia.tar.gz
-rw-r-----   1 chihung   chihung      527891 Aug 27 09:29 libgcc-3.4.6-sol10-sparc-local.gz
-rw-r--r--   1 chihung   chihung     2814570 Feb 19  2009 mediawiki-1.6.12.tar.gz
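
To double-check the savings figure, sum the size column (field 5) of the two ls -l listings with a quick awk one-liner. The sizes below are copied straight from the listings above; on real data you would pipe `ls -l *.Z` (or `ls -l *.gz`) into the same awk program.

```shell
# Sum a column of byte sizes with awk; in practice:
#   ls -l *.Z | awk '{ s += $5 } END { print s }'
before=`printf '%s\n' 5278987 1479993 25318755 797471 4039745 | awk '{ s += $1 } END { print s }'`
after=`printf '%s\n' 3641373 1019829 16940517 527891 2814570 | awk '{ s += $1 } END { print s }'`
echo "saved `expr $before - $after` bytes"     # saved 11970771 bytes
```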


Saturday, August 14, 2010

Bandwidth Throttling

One of the users wanted to transfer a big file (300GB) from a Unix server to an external USB HDD. To make matters worse, the UNIX server is located across the WAN, and the customer's management had declared that no one is allowed to transfer any file bigger than 1GB. What management really meant was: do not jam up the WAN link with big file transfers.

Guess what: people interpreted the statement literally, AS IS. The 300GB file was split into small chunks, each slightly smaller than the 1GB limit, and merged back at the destination end. Although this does not violate the management rule, it makes no difference in terms of bandwidth utilisation. And if the user is smart (or dumb) enough to initiate multiple transfers in parallel to accelerate the job, it will definitely choke up the WAN link.
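
For the record, the workaround itself is nothing more than split(1) and cat. A minimal sketch with a tiny file and a tiny chunk size (the real case would use something like split -b 1000m to stay under the 1GB limit):

```shell
# Split a file into fixed-size chunks, then merge them back and verify.
# 1000-byte chunks here for illustration only.
tmp=`mktemp -d`
cd "$tmp"
head -c 5000 /dev/urandom > bigfile     # stand-in for the 300GB file
split -b 1000 bigfile chunk_            # produces chunk_aa .. chunk_ae
cat chunk_* > merged                    # reassemble at the destination end
cmp bigfile merged && echo "merge OK"
```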

Did you know that there are many ways to do bandwidth throttling?

  • Client end
    • rsync --bwlimit, scp -l, wget --limit-rate, curl --limit-rate
  • Server end
    • lighttpd with the connection.kbytes-per-second setting in the configuration file. See traffic shaping for more details
    • mod_bw module in Apache
    • and many more
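
The idea behind all the client-end tools is the same: send a block, pause, send the next block. Here is a minimal sketch of that technique (throttled_copy is just a name made up here; in real life you would reach for rsync --bwlimit or curl --limit-rate instead):

```shell
# Client-end throttling sketch: copy src to dst in 64 KB blocks,
# sleeping between blocks to cap the average transfer rate.
throttled_copy()
{
    src=$1; dst=$2; delay=$3    # delay: seconds to pause per 64 KB block
    size=`wc -c < "$src"`
    blocks=`expr \( $size + 65535 \) / 65536`
    : > "$dst"
    i=0
    while [ $i -lt $blocks ]
    do
        dd if="$src" of="$dst" bs=65536 count=1 skip=$i seek=$i conv=notrunc 2>/dev/null
        sleep "$delay"
        i=`expr $i + 1`
    done
}
```

For example, `throttled_copy big.tar /mnt/usb/big.tar 0.1` pauses 0.1s per 64 KB block, capping the transfer at roughly 640 KB/s on a fast link.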

With bandwidth throttling, network utilisation can be controlled. It also avoids the splitting/merging of small files, and the temporary storage needed to house them.

One thing to take note of: some of the older file systems and utilities can only handle files smaller than 4GB. See the comparison of file systems for details.
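
Before copying onto such a file system (a FAT-formatted USB HDD, say), it is worth scanning for files that would trip the limit. A minimal sketch (files_over is a hypothetical helper; for the 4GB case the limit is 4294967296 bytes):

```shell
# List regular files under a directory whose size exceeds a byte limit.
files_over()
{
    dir=$1; limit=$2
    find "$dir" -type f | while read f
    do
        size=`wc -c < "$f"`
        if [ "$size" -gt "$limit" ]; then
            echo "$f"
        fi
    done
}
```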


Sunday, August 01, 2010

Failsafe gzip

If you gzip a file that is currently open in another process, gzip writes the compressed data to a new file and removes the original. The process holding the old file descriptor keeps reading/writing the deleted original, so whatever it writes from then on never makes it into the .gz file. You can verify that the i-node (ls -li) is different after gzip.
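
A quick way to see this behaviour for yourself, in a throwaway scratch directory:

```shell
# gzip writes demo.txt.gz as a new file, then unlinks demo.txt;
# the i-node of the .gz therefore differs from the original's.
tmp=`mktemp -d`
echo "some data" > "$tmp/demo.txt"
before=`ls -i "$tmp/demo.txt" | awk '{ print $1 }'`
gzip "$tmp/demo.txt"
after=`ls -i "$tmp/demo.txt.gz" | awk '{ print $1 }'`
echo "i-node before: $before, after: $after"
```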

In Solaris, you can use fuser to check what processes are currently holding on to the file.

The script below is a fail-safe version that gzips only those files that no process is holding on to.

#! /bin/ksh
#
# gzip those files that no process is holding on to it
#


PATH=/usr/bin:/bin:/usr/sbin


usage()
{
        echo "Usage: $0 -n <name> [-d <dir>] [-b] [-e] [-[1-9]]"
        echo "       -n <name>: part of the file name"
        echo "       -d <dir> : directory to search, default to current"
        echo "       -b       : begins with the <name>"
        echo "       -e       : ends with the <name>"
        echo "       -1 .. -9 : gzip compression flag, default to -6"
}


set -- `getopt n:d:be123456789 $*`
if [ $? != 0 ]; then
        usage
        exit 1
fi


finddir="."
gzip="-6"
while [ $# -gt 0 ]
do
        case $1 in
        -n)
                name=$2
                findname="*${name}*"
                shift 2
                ;;
        -d)
                finddir=$2
                shift 2
                ;;
        -b)
                findname="${name}*"
                shift
                ;;
        -e)
                findname="*${name}"
                shift
                ;;
        -[1-9])
                gzip="$1"
                shift
                ;;
        --)
                shift
                break
                ;;
        *)
                break
                ;;
        esac
done


#
# checking
#
if [ -z "$findname" ]; then
        usage
        exit 2
fi
if [ ! -d "$finddir" ]; then
        echo "Error. $finddir does not exist"
        exit 3
fi


find "$finddir" -name "$findname" -type f 2>/dev/null | while read f
do
        pids=`fuser -u "$f" 2>/dev/null`
        if [ -z "$pids" ]; then
                echo "Running gzip $gzip $f ... \c"
                gzip $gzip "$f"
                echo OK
        fi
done
