Swapping Rows to Columns
If you need to swap rows and columns, you may want to take a look at this script. Basically, I store the data in matrix form, a[1,1], a[1,2], ..., inside AWK. While processing each row, I also determine the maximum number of fields in any row. The rows and columns are then swapped and output in AWK's END block. In linear algebra this is called the transpose, i.e. Aij becomes Aji.
The script below works on my Cygwin under Windows XP.
#! /bin/sh

# Print a short help message.
usage()
{
    echo "Usage: $0 [-h] [-i sep] [-o sep] [input-file]"
    echo " -h : to print this help message"
    echo " -i : input field separator [default: whitespace]"
    echo " -o : output field separator [default: space]"
    echo "Note: special field separator - NULL, SPACE, PIPE"
}
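
# Normalize the command line with getopt; on a bad option, show usage and exit.
set -- `getopt i:o:h $* 2>/dev/null`
if [ $? -ne 0 ]; then
    usage
    exit 1
fi

# Defaults: split input on runs of blanks, join output with a single space.
isep="[ \t]+"
osep=" "

# The word list of the loop is fixed up front, while shift moves $1 and $2
# along with it: whenever $i is an option, $2 holds its value.
# NULL, SPACE and PIPE stand in for separators that are awkward to type.
for i in $*; do
    case $i in
    -i)
        isep=$2
        if [ "$isep" = "NULL" ]; then
            isep=""
        elif [ "$isep" = "SPACE" ]; then
            isep=" "
        elif [ "$isep" = "PIPE" ]; then
            isep="|"
        fi
        shift 2
        ;;
    -o)
        osep=$2
        if [ "$osep" = "NULL" ]; then
            osep=""
        elif [ "$osep" = "SPACE" ]; then
            osep=" "
        elif [ "$osep" = "PIPE" ]; then
            osep="|"
        fi
        shift 2
        ;;
    -h)
        usage
        exit 0
        ;;
    --)
        shift
        ;;
    esac
done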
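
gawk -v isep="$isep" -v osep="$osep" '
BEGIN {
    FS=isep
    max=0
}
{
    # Store field i of record NR as a[i,NR], i.e. already transposed.
    for ( i=1 ; i<=NF ; ++i ) {
        a[i,NR]=$i
    }
    # Track the widest row so that shorter rows are padded with empty cells.
    if ( NF > max ) { max=NF }
}
END {
    # One output line per input column; the last cell ends the line.
    for ( i=1 ; i<=max ; ++i ) {
        for ( j=1 ; j<NR ; ++j ) {
            printf("%s%s", a[i,j], osep)
        }
        print a[i,j]
    }
}' $1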
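
For comparison, here is a minimal sketch of the same option handling written with the shell's built-in getopts instead of the external getopt command. It is not part of the script above, only an alternative that assumes a POSIX shell; the helper names map_sep and parse_args and the variable consumed are made up for this illustration. One practical difference is that the arguments are never re-split, so a file name containing spaces survives.

```shell
#!/bin/sh
# Sketch only: option parsing with the POSIX built-in getopts.
# map_sep and parse_args are hypothetical helpers, not part of rowcol.sh.

# Translate the special separator names (NULL, SPACE, PIPE).
map_sep()
{
    case $1 in
    NULL)  printf '%s' "" ;;
    SPACE) printf '%s' " " ;;
    PIPE)  printf '%s' "|" ;;
    *)     printf '%s' "$1" ;;
    esac
}

# Set isep/osep from the given arguments and record how many
# arguments were consumed by options.
parse_args()
{
    isep="[ \t]+"
    osep=" "
    OPTIND=1
    while getopts i:o:h opt "$@"; do
        case $opt in
        i) isep=$(map_sep "$OPTARG") ;;
        o) osep=$(map_sep "$OPTARG") ;;
        h) echo "help requested"; return 0 ;;
        *) return 1 ;;
        esac
    done
    # The caller can shift this many arguments to reach the file name.
    consumed=$((OPTIND - 1))
}

parse_args -i PIPE -o , "my file.txt"
echo "isep=[$isep] osep=[$osep] consumed=$consumed"
```

Run as is, this should print isep=[|] osep=[,] consumed=4, leaving "my file.txt" as the single remaining argument after a shift of 4.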
The script in action:
$ uname -a
CYGWIN_NT-5.1 chihung 1.5.25(0.156/4/2) 2007-12-14 19:21 i686 Cygwin
$ ./rowcol.sh -h
Usage: ./rowcol.sh [-h] [-i sep] [-o sep] [input-file]
-h : to print this help message
-i : input field separator [default: whitespace]
-o : output field separator [default: space]
Note: special field separator - NULL, SPACE, PIPE
$ echo a b c | ./rowcol.sh
a
b
c
$ cat rowcol-1.txt
a b c d e f A B C D E F
g h i j k l G H I J K L
m n o p q r M N O P Q R s h o r t
s t u v w x S T U V W X l o n g e r
y z 1 2 3 4 Y Z 1 2 3 4
5 6 7 8 9 0 5 6 7 8 9 0
$ ./rowcol.sh -o PIPE rowcol-1.txt
a|g|m|s|y|5
b|h|n|t|z|6
c|i|o|u|1|7
d|j|p|v|2|8
e|k|q|w|3|9
f|l|r|x|4|0
A|G|M|S|Y|5
B|H|N|T|Z|6
C|I|O|U|1|7
D|J|P|V|2|8
E|K|Q|W|3|9
F|L|R|X|4|0
||s|l||
||h|o||
||o|n||
||r|g||
||t|e||
|||r||
$ ./rowcol.sh -o , rowcol-1.txt
a,g,m,s,y,5
b,h,n,t,z,6
c,i,o,u,1,7
d,j,p,v,2,8
e,k,q,w,3,9
f,l,r,x,4,0
A,G,M,S,Y,5
B,H,N,T,Z,6
C,I,O,U,1,7
D,J,P,V,2,8
E,K,Q,W,3,9
F,L,R,X,4,0
,,s,l,,
,,h,o,,
,,o,n,,
,,r,g,,
,,t,e,,
,,,r,,
$ cat rowcol-2.txt
chanchihung
chihungchan
hungchichan
chanhungchi
$ ./rowcol.sh -i NULL -o : rowcol-2.txt
c:c:h:c
h:h:u:h
a:i:n:a
n:h:g:n
c:u:c:h
h:n:h:u
i:g:i:n
h:c:c:g
u:h:h:c
n:a:a:h
g:n:n:i
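
Since this is a transpose, feeding the output through the same logic a second time should give back the original, provided every row has the same number of fields. With ragged input such as rowcol-1.txt the empty padding cells would not survive whitespace splitting. Below is a minimal sketch of that check. It uses a cut-down function rather than rowcol.sh itself; the name transpose and the sample data are made up for illustration, and a plain awk is assumed to be on the PATH.

```shell
#!/bin/sh
# Sketch only: a cut-down version of the AWK core, wrapped in a function.
# transpose is a hypothetical name; it reads stdin, splits on whitespace
# and joins the output cells with a single space.
transpose()
{
    awk '
    {
        for ( i=1 ; i<=NF ; ++i ) a[i,NR]=$i
        if ( NF > max ) max=NF
    }
    END {
        for ( i=1 ; i<=max ; ++i ) {
            line = a[i,1]
            for ( j=2 ; j<=NR ; ++j ) line = line " " a[i,j]
            print line
        }
    }'
}

original="a b c
d e f"

# Transpose once, then transpose the result again.
once=$(printf '%s\n' "$original" | transpose)
twice=$(printf '%s\n' "$once" | transpose)

printf '%s\n' "$once"
if [ "$twice" = "$original" ]; then
    echo "round trip OK"
else
    echo "round trip differs"
fi
```

For this 2x3 sample the first pass prints the three lines "a d", "b e" and "c f", and the second pass restores the original two rows, so the script ends with "round trip OK".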

