Monday, February 04, 2008

Print a Sequence of Dates

Every now and then my boss asks me to generate a log summary for a range of dates. What I normally do is select the log files by hand and feed them to a 'for' loop for processing. Most of the time I can shorten the input to the 'for' loop with wildcards and character ranges, so that the shell expands the file selection for me.

For example, suppose I need to process the gzipped log files from 2008-01-23 to 2008-02-03. This is what I did in the past:

for i in access_log-2008012[3-9].gz access_log-2008013*.gz access_log-2008020[1-3].gz
do
gunzip < "$i"
done | awk ' ... 

That can be quite tedious and error-prone. Did you know that on Linux you can print a sequence of numbers using seq (see other implementations)? It would be nice to have a similar command for dates.
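To see why seq on its own does not do the job, here is a small sketch. The function name naive_dateseq is made up for illustration, and it assumes GNU seq, which prints large integers in full. Treating YYYYMMDD as a plain number walks straight through values that are not dates:

```shell
# naive_dateseq is a hypothetical helper: it treats YYYYMMDD as a plain integer.
# Assumes GNU seq (other implementations may format large numbers differently).
naive_dateseq() {
    seq "$1" "$2"
}

# Crossing a month boundary produces non-dates such as 20080132
naive_dateseq 20080130 20080202 | head -4
```

The range above should really contain only four days, but the numeric sequence keeps counting through 20080132, 20080133 and so on until it reaches 20080201.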

Below is a shell function (dateseq) that does just that, using Tcl:

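dateseq() {
# usage: dateseq START END [FORMAT]
# -gmt 1 keeps every 86400-second step on a day boundary, even across a DST change
echo "set s [clock scan $1 -gmt 1];set e [clock scan $2 -gmt 1];for {set i \$s} {\$i<=\$e} {incr i 86400} {\
puts [clock format \$i -format ${3:-%Y%m%d} -gmt 1]}" | tclsh
}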
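If tclsh is not installed, the same idea can probably be sketched with GNU date alone. This is only a sketch under assumptions: the name dateseq_gnu is hypothetical, and it relies on the -d option of GNU date, which other date implementations do not provide in this form.

```shell
# dateseq_gnu is a hypothetical name; assumes GNU date (the -d option).
# usage: dateseq_gnu START END [FORMAT], with START and END as YYYYMMDD
dateseq_gnu() {
    fmt=${3:-%Y%m%d}
    # Work in UTC so that stepping by one day is not affected by DST
    d=$(date -u -d "$1" +%Y%m%d) || return 1
    e=$(date -u -d "$2" +%Y%m%d) || return 1
    # YYYYMMDD values compare correctly as integers
    while [ "$d" -le "$e" ]
    do
        date -u -d "$d" "+$fmt"
        d=$(date -u -d "$d + 1 day" +%Y%m%d)
    done
}
```

It takes the same three arguments as the Tcl version, for example: dateseq_gnu 20080123 20080203 %Y-%m-%d. It starts one date process per day, so it is slower on long ranges, but for a few weeks of log files that hardly matters.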

Here is how to use it within a 'for' loop, how to specify your own format, and how to apply it to the log files:

$ for i in `dateseq 20080123 20080203`
do
echo $i
done
20080123
20080124
20080125
20080126
20080127
20080128
20080129
20080130
20080131
20080201
20080202
20080203

$ for i in `dateseq 20080123 20080203 %Y-%b-%d`
do
echo $i
done
2008-Jan-23
2008-Jan-24
2008-Jan-25
2008-Jan-26
2008-Jan-27
2008-Jan-28
2008-Jan-29
2008-Jan-30
2008-Jan-31
2008-Feb-01
2008-Feb-02
2008-Feb-03

$ for i in `dateseq 20080123 20080203`
do
f="access_log-$i.gz"
[ -f "$f" ] && gunzip < "$f"
done | wc -l
12892723
