New Trick to Parse a Field-Separated File
After so many years of using UNIX, I am still learning new shell scripting tricks. Today I am going to show you one that saves a lot of repetitive code when extracting fields from a field-separated file such as /etc/passwd.
I used to do it this way:
$ while read line
do
user=`echo $line | cut -d: -f1`
uid=`echo $line | cut -d: -f3`
gid=`echo $line | cut -d: -f4`
echo "User=$user ($uid:$gid)"
done < /etc/passwd
User=root (0:0)
User=daemon (1:1)
User=bin (2:2)
User=sys (3:3)
User=adm (4:4)
User=lp (71:8)
User=uucp (5:5)
User=nuucp (9:9)
User=dladm (15:3)
User=smmsp (25:25)
User=listen (37:4)
User=gdm (50:50)
User=webservd (80:80)
User=postgres (90:90)
User=nobody (60001:60001)
User=noaccess (60002:60002)
User=nobody4 (65534:65534)
User=chihung (100:1)
The new way:
$ while IFS=: read user x uid gid dummy
do
echo "User=$user ($uid:$gid)"
done < /etc/passwd
User=root (0:0)
User=daemon (1:1)
User=bin (2:2)
User=sys (3:3)
User=adm (4:4)
User=lp (71:8)
User=uucp (5:5)
User=nuucp (9:9)
User=dladm (15:3)
User=smmsp (25:25)
User=listen (37:4)
User=gdm (50:50)
User=webservd (80:80)
User=postgres (90:90)
User=nobody (60001:60001)
User=noaccess (60002:60002)
User=nobody4 (65534:65534)
User=chihung (100:1)
The trick is to set the IFS variable (the internal field separator) to a colon (IFS=:) just for the "read" command. Any name=value pair placed before a command is in effect only for that command and has no side effect on your current shell. The transcript below shows variables being passed to the "env" command this way, followed by the part of the sh man page (man sh on Solaris) that describes how "read" uses IFS.
$ /bin/sh
$ MY_a=a MY_b=b MY_c=c env | grep MY
MY_a=a
MY_b=b
MY_c=c
$ env | grep MY
$ man sh
....
read name ...
One line is read from the standard input and, using the
internal field separator, IFS (normally space or tab),
to delimit word boundaries, the first word is assigned
to the first name, the second word to the second name,
and so forth, with leftover words assigned to the last
name. Lines can be continued using \newline. Characters
other than newline can be quoted by preceding them with
a backslash. These backslashes are removed before words
are assigned to names, and no interpretation is done on
the character that follows the backslash. The return
code is 0, unless an EOF is encountered.
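The man page's note that leftover words go to the last name is why the loop above ends with a "dummy" variable. Here is a minimal sketch of that behaviour, assuming a POSIX-style shell such as bash, dash or ksh. The sample record is made up for illustration and is not taken from a real system. It also checks that IFS in the calling shell is left alone.

```shell
#!/bin/sh
# Hypothetical record in /etc/passwd layout (illustrative data only).
sample='alice:x:1001:100:Alice Example:/home/alice:/bin/sh'

# Remember the shell's IFS so we can compare it afterwards.
old_ifs=$IFS

# Only three names are given, so everything after the second
# colon (separators included) is assigned to the last name, "rest".
IFS=: read user pw rest <<EOF
$sample
EOF

echo "user=$user"
echo "rest=$rest"

# The IFS=: prefix applied only to read; the shell's own IFS is unchanged.
if [ "$IFS" = "$old_ifs" ]; then
    echo "IFS unchanged"
fi
```

If the last name were omitted from the passwd loop, gid would pick up the rest of the line instead of just the fourth field.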
Now I can write cleaner and more efficient scripts to parse any field-separated file.
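As a rough sketch of the same idea with a different separator, the loop below reads comma-separated records from a here-document. The field names and the inventory data are hypothetical, and it assumes a POSIX-style shell (bash, dash, ksh) where a redirected while loop runs in the current shell, so the running total is still visible after the loop.

```shell
#!/bin/sh
# Hypothetical comma-separated records: name,quantity,location
total=0

# IFS=, applies only to read, so each line is split on commas.
while IFS=, read name qty location
do
    echo "$name: $qty in $location"
    total=$((total + qty))
done <<EOF
disk,4,rack1
fan,12,rack2
psu,2,rack1
EOF

echo "total=$total"
```

Only the IFS value and the list of names change from one file format to the next; the loop body stays the same.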