Friday, May 30, 2008

Swapping Rows and Columns

If you need to swap rows and columns, you may want to take a look at this script. It stores the data inside AWK in matrix form, a[1,1], a[1,2], ..., and while processing each row it also tracks the maximum number of fields seen in any row. The swapped rows and columns are printed in AWK's END block, with rows shorter than the maximum padded with empty fields. In linear algebra this operation is called the transpose, i.e. Aij becomes Aji.

The script below works under Cygwin on Windows XP.

#! /bin/sh


usage()
{
 echo "Usage: $0 [-h] [-i sep] [-o sep] [input-file]"
 echo "       -h : to print this help message"
 echo "       -i : input field separator  [default: whitespace]"
 echo "       -o : output field separator [default: space]"
 echo "Note: special field separator - NULL, SPACE, PIPE"
}


set -- `getopt i:o:h $* 2>/dev/null`
if [ $? -ne 0 ]; then
 usage
 exit 1
fi


isep="[ \t]+"
osep=" "
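# Walk the options as normalised by getopt. Testing $1 directly (rather
# than iterating over a snapshot of $*) keeps a separator value such as
# "-i" from being mistaken for an option after the shift.
while [ $# -gt 0 ]; do
 case $1 in
  -i)
   isep=$2
   if [ "$isep" = "NULL" ]; then
    isep=""
   elif [ "$isep" = "SPACE" ]; then
    isep=" "
   elif [ "$isep" = "PIPE" ]; then
    isep="|"
   fi
   shift 2
   ;;
  -o)
   osep=$2
   if [ "$osep" = "NULL" ]; then
    osep=""
   elif [ "$osep" = "SPACE" ]; then
    osep=" "
   elif [ "$osep" = "PIPE" ]; then
    osep="|"
   fi
   shift 2
   ;;
  -h)
   usage
   exit 0
   ;;
  --)
   # End of options: only the input file (if any) remains in $1.
   shift
   break
   ;;
  *)
   break
   ;;
 esac
done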



gawk -v isep="$isep" -v osep="$osep" '
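BEGIN {
 FS=isep
 max=0
}
{
 # Store field i of input row NR as a[i,NR], i.e. already transposed.
 for ( i=1 ; i<=NF ; ++i ) {
  a[i,NR]=$i
 }
 # Remember the widest row; it sets the number of output rows.
 if ( NF > max ) { max=NF }
}
END {
 for ( i=1 ; i<=max ; ++i ) {
  # Print every cell but the last, each followed by the separator ...
  for ( j=1 ; j<NR ; ++j ) {
   printf("%s%s", a[i,j], osep)
  }
  # ... then the last cell (j == NR here) and a newline.
  # Cells missing from short rows print as empty strings.
  print a[i,j]
 }
}' $1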
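
The -i and -o branches in the script repeat the same NULL/SPACE/PIPE mapping. As a possible tidy-up, that mapping could live in one small function. The sketch below is only an illustration: map_sep is a hypothetical name and is not part of the script above.

```shell
#!/bin/sh
# map_sep: hypothetical helper (not in the original script) that turns the
# special keywords into the actual separator and passes anything else through.
map_sep() {
 case $1 in
  NULL)  printf '%s' '' ;;
  SPACE) printf '%s' ' ' ;;
  PIPE)  printf '%s' '|' ;;
  *)     printf '%s' "$1" ;;
 esac
}

# Command substitution only strips trailing newlines, so a single space
# or an empty string comes back intact.
isep=$(map_sep NULL)
osep=$(map_sep PIPE)
printf 'isep=[%s] osep=[%s]\n' "$isep" "$osep"
```

Inside the option loop, each branch would then shrink to something like isep=$(map_sep "$2") followed by shift 2.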

The script in action:

$ uname -a
CYGWIN_NT-5.1 chihung 1.5.25(0.156/4/2) 2007-12-14 19:21 i686 Cygwin

$ ./rowcol.sh -h
Usage: ./rowcol.sh [-h] [-i sep] [-o sep] [input-file]
       -h : to print this help message
       -i : input field separator  [default: whitespace]
       -o : output field separator [default: space]
Note: special field separator - NULL, SPACE, PIPE

$ echo a b c | ./rowcol.sh
a
b
c

$ cat rowcol-1.txt
a b c d e f A B C D E F
g h i j k l G H I J K L
m n o p q r M N O P Q R s h o r t
s t u v w x S T U V W X l o n g e r
y z 1 2 3 4 Y Z 1 2 3 4
5 6 7 8 9 0 5 6 7 8 9 0

$ ./rowcol.sh -o PIPE rowcol-1.txt
a|g|m|s|y|5
b|h|n|t|z|6
c|i|o|u|1|7
d|j|p|v|2|8
e|k|q|w|3|9
f|l|r|x|4|0
A|G|M|S|Y|5
B|H|N|T|Z|6
C|I|O|U|1|7
D|J|P|V|2|8
E|K|Q|W|3|9
F|L|R|X|4|0
||s|l||
||h|o||
||o|n||
||r|g||
||t|e||
|||r||

$ ./rowcol.sh -o , rowcol-1.txt
a,g,m,s,y,5
b,h,n,t,z,6
c,i,o,u,1,7
d,j,p,v,2,8
e,k,q,w,3,9
f,l,r,x,4,0
A,G,M,S,Y,5
B,H,N,T,Z,6
C,I,O,U,1,7
D,J,P,V,2,8
E,K,Q,W,3,9
F,L,R,X,4,0
,,s,l,,
,,h,o,,
,,o,n,,
,,r,g,,
,,t,e,,
,,,r,,

$ cat rowcol-2.txt
chanchihung
chihungchan
hungchichan
chanhungchi

$ ./rowcol.sh -i NULL -o : rowcol-2.txt
c:c:h:c
h:h:u:h
a:i:n:a
n:h:g:n
c:u:c:h
h:n:h:u
i:g:i:n
h:c:c:g
u:h:h:c
n:a:a:h
g:n:n:i
