Friday, December 05, 2008

Avoid Using Temporary Files, Part 2

Last Saturday, I blogged about how we can avoid using temporary files in shell scripting. At the end of that post, I asked a question: how can we achieve this if the command outputs do not all have the same number of lines?

My first implementation started off with two commands with unequal output, and I managed that without much difficulty. I thought I was done with this. Wait! What if there are more than two command outputs? That means I would have to rewrite it again. Why not do it once and for all, and craft a more generic function that is able to handle multiple command outputs?

My approach is to take advantage of a subshell. I also introduce a "separator" line between the commands so that the "paste" step can tell the outputs apart. Of course, your "separator" has to be unique: it must not appear in any of the command output. I define my own "_paste_" command using AWK and store the command output in memory in an AWK associative array, with the key based on "#file and #line". Here is my code:

$ cat t3.sh
#! /bin/sh

PATH=/usr/bin:/bin

seq()
{
    nawk -v start="$1" -v end="$2" '
        END {for(i=start;i<=end;++i){print i}}' /dev/null
}
calc1()
{
    for i in `seq $1 $2`
    do
        expr $i \* $i
    done
}
calc2()
{
    for i in `seq $1 $2`
    do
        expr $i \* $i \* $i
    done
}
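_paste_()
{
    # Relies on the global $sep set below; quoted so it reaches awk intact.
    nawk -v sep="$sep" '
        BEGIN {
            nfile=1
            nline=1
            max=0
        }
        # A separator line starts the next "file" (command output).
        $0==sep {
            ++nfile
            nline=1
            next
        }
        # Store every other line under its (file, line) position and
        # remember the longest output seen so far.
        {
            if ( nline>max ) {
                max=nline
            }
            line[nfile,nline]=$0
            ++nline
        }
        # Print row by row; cells missing from shorter outputs are empty.
        END {
            for (l=1;l<=max;++l) {
                printf("%s", line[1,l])
                for (f=2;f<=nfile;++f) {
                    printf("\t%s", line[f,l])
                }
                printf("\n")
            }
        }'
}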


sep="@@@@@"
(
   calc1 1 10; echo "$sep"
   calc2 1 13; echo "$sep"
   calc2 1 15
) | _paste_

$ ./t3.sh
1       1       1
4       8       8
9       27      27
16      64      64
25      125     125
36      216     216
49      343     343
64      512     512
81      729     729
100     1000    1000
        1331    1331
        1728    1728
        2197    2197
                2744
                3375
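
The subshell-plus-separator pattern above can also be wrapped so that the caller just lists the commands. What follows is only a sketch under a few assumptions: "paste_cmds" is a hypothetical helper name that is not part of t3.sh, plain "awk" stands in for "nawk", each command is passed as a single trusted string (it goes through "eval"), and every command ends its output with a newline so the separator lands on a line of its own.

```shell
#!/bin/sh
# Sketch only: paste_cmds is a hypothetical wrapper, not part of t3.sh.
sep="@@@@@"    # assumed never to appear in any command output

paste_cmds()
{
    first=1
    for cmd in "$@"
    do
        # Emit the separator between outputs, never before the first one.
        if [ "$first" -eq 0 ]; then
            echo "$sep"
        fi
        first=0
        # eval lets the string call shell functions such as calc1;
        # only pass strings you trust.
        eval "$cmd"
    done | awk -v sep="$sep" '
        BEGIN { nfile = 1; nline = 1; max = 0 }
        $0 == sep { ++nfile; nline = 1; next }
        {
            if (nline > max) max = nline
            line[nfile, nline] = $0
            ++nline
        }
        END {
            for (l = 1; l <= max; ++l) {
                printf("%s", line[1, l])
                for (f = 2; f <= nfile; ++f)
                    printf("\t%s", line[f, l])
                printf("\n")
            }
        }'
}

# Three outputs of different lengths, joined column-wise with tabs.
paste_cmds 'printf "a\nb\nc\n"' 'printf "1\n"' 'printf "x\ny\n"'
```

With the functions from t3.sh defined in the same script, the call at the bottom of t3.sh could then be written as paste_cmds 'calc1 1 10' 'calc2 1 13' 'calc2 1 15'. The memory caveat is unchanged, since awk still holds every line until the end.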

This implementation may not be very efficient, especially if we have to deal with massive output from the commands, because all the data is held in memory. What I have in mind is to do this in Python. Wanna give it a try?
