Friday, April 10, 2009

paste: too many files- limit 12, in Solaris

When you collect performance data from a system, you will usually want to prefix each sample with a timestamp (HH:MM:SS). If you collect similar data across a number of servers, you will want to combine it and import it into your favourite spreadsheet software for further analysis.

In UNIX, you can use paste to join the files side by side. Below I create 20 files (host*.txt) containing some random data, each line prefixed with a timestamp:

$ for i in `perl -e '$,=" ";print 1..20'`
do
        for j in 1 2 3 4 5 6 7 8 9
        do
                ((v=RANDOM%100))
                echo "0$j:00:00 $v"
        done > host$i.txt
done

$ paste host1.txt host2.txt host3.txt host4.txt
01:00:00 56     01:00:00 61     01:00:00 83     01:00:00 50
02:00:00 59     02:00:00 1      02:00:00 96     02:00:00 72
03:00:00 31     03:00:00 33     03:00:00 71     03:00:00 60
04:00:00 54     04:00:00 29     04:00:00 61     04:00:00 36
05:00:00 62     05:00:00 69     05:00:00 25     05:00:00 36
06:00:00 2      06:00:00 72     06:00:00 76     06:00:00 8
07:00:00 69     07:00:00 59     07:00:00 91     07:00:00 89
08:00:00 51     08:00:00 75     08:00:00 80     08:00:00 61
09:00:00 17     09:00:00 12     09:00:00 59     09:00:00 83

Looks promising. Now I need to get rid of the redundant timestamps. With AWK, we just keep the first timestamp and then take the even-numbered fields, which hold the values.

$ paste host1.txt host2.txt host3.txt host4.txt | awk '
{
        printf("%s\t%d", $1, $2)
        for ( i=4 ; i<=NF ; i+=2 ) {
                printf("\t%d",$i)
        }
        printf("\n")
}'
01:00:00        56      61      83      50
02:00:00        59      1       96      72
03:00:00        31      33      71      60
04:00:00        54      29      61      36
05:00:00        62      69      25      36
06:00:00        2       72      76      8
07:00:00        69      59      91      89
08:00:00        51      75      80      61
09:00:00        17      12      59      83

So far so good. Now I want to do that for all the hosts (host*.txt). You can use Bash brace expansion to supply the arguments to paste

$ paste host{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}.txt
paste: too many files- limit 12

Ouch! paste on Solaris cannot take more than 12 files. Stuck? What we can do is paste one file at a time in a loop, accumulating the result in all_hosts.csv. Since I want to import the output into a spreadsheet, the final step will write it out as comma-separated values (CSV).

$ tmpfile=".tmp-$$"

$ cp /dev/null all_hosts.csv

$ for i in `perl -e '$,=" ";print 1..20'`
do
        f="host$i.txt"
        paste all_hosts.csv $f > $tmpfile
        mv $tmpfile all_hosts.csv
done
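
One caveat with the loop above: paste joins files by line position, not by timestamp, so a host file with a missing or extra sample would silently put its values on the wrong rows. Below is a minimal sketch of a pre-check, assuming every file should carry the same timestamps as host1.txt. The scratch directory, the three tiny host files and the ref.ts file name are made up for illustration.

```shell
# Hypothetical demo data: three small host files in a scratch directory.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf '01:00:00 5\n02:00:00 7\n' > host1.txt
printf '01:00:00 9\n02:00:00 3\n' > host2.txt
printf '01:00:00 4\n' > host3.txt       # one sample missing on purpose

# Save the timestamp column of the reference file.
awk '{print $1}' host1.txt > ref.ts

# Compare the timestamp column of every host file against the reference.
bad=""
for f in host*.txt
do
        awk '{print $1}' "$f" | cmp -s - ref.ts || bad="$bad $f"
done

echo "misaligned:${bad:- none}"
```

With the demo data this reports host3.txt as misaligned. Any file listed here should be fixed or left out before it goes through the paste loop.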

Now we have all the hosts' data in one file, still tab-separated and with a timestamp in front of every value. The final step is to remove the redundant timestamps and print the result as CSV.

$ awk '
{
        printf("%s,%d", $1, $2)
        for ( i=4 ; i<=NF ; i+=2 ) {
                printf(",%d",$i)
        }
        printf("\n")
}' all_hosts.csv
01:00:00,56,61,83,50,22,6,83,10,6,88,75,61,46,24,33,8,82,90,29,90
02:00:00,59,1,96,72,80,63,5,61,42,90,7,24,78,58,5,85,35,79,0,46
03:00:00,31,33,71,60,99,41,61,92,34,84,61,46,8,1,52,10,21,82,84,69
04:00:00,54,29,61,36,7,85,69,2,26,42,56,82,17,14,93,95,45,76,3,37
05:00:00,62,69,25,36,54,42,81,8,2,94,44,10,44,28,64,68,96,22,9,45
06:00:00,2,72,76,8,96,21,85,35,89,92,93,98,31,99,67,25,77,43,73,9
07:00:00,69,59,91,89,39,72,11,45,90,9,28,15,22,3,66,64,83,46,60,40
08:00:00,51,75,80,61,22,60,61,12,37,66,24,34,92,21,63,99,27,45,40,35
09:00:00,17,12,59,83,32,44,78,91,16,89,97,52,81,52,51,59,78,14,85,49
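
As an alternative sketch, AWK can build the same kind of CSV directly from the host files, which avoids paste and its file limit altogether. This assumes every file holds "timestamp value" pairs on matching lines and that the files are passed in the desired column order. The demo files below are invented for illustration, and on Solaris you may need nawk rather than the default awk for FNR to be available.

```shell
# Hypothetical demo data: three small host files in a scratch directory.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf '01:00:00 56\n02:00:00 59\n' > host1.txt
printf '01:00:00 61\n02:00:00 1\n'  > host2.txt
printf '01:00:00 83\n02:00:00 96\n' > host3.txt

# Build the argument list in numeric order (host1, host2, ... rather than
# the lexical order host1, host10, host11, ... that host*.txt would give).
files=""
i=1
while [ $i -le 3 ]
do
        files="$files host$i.txt"
        i=$((i + 1))
done

# FNR is the line number within the current file, so row[FNR] collects the
# value found on line FNR of every file. Timestamps come from the first file.
awk '
FNR == NR { ts[FNR] = $1; rows = FNR }
          { row[FNR] = row[FNR] "," $2 }
END {
        for (n = 1; n <= rows; n++)
                print ts[n] row[n]
}' $files > all_hosts.csv

cat all_hosts.csv
```

With the demo data this prints 01:00:00,56,61,83 and 02:00:00,59,1,96. The trade-off is that AWK keeps every row in memory until the END block, which is fine for files of this size.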
