Friday, July 11, 2008

Tcl Code Refactoring, Part 2

Yesterday I talked about refactoring my Tcl code for manipulating CSV data. The problem kept circling around my head while I was driving home last night. It is pretty inefficient to loop through all 285 columns of every row just to test and change a few of them, 6 columns in my case. So after finishing all my "routine duty" at home, I managed to find time to modify my code so that it only replaces those 6 columns in the list. Although lreplace has to create a new list for each replacement, that is still better than walking through the whole list.

With this modification, I managed to squeeze another 0.7 seconds out of the run time. OK, I think I am hitting the performance limit here, so what is the next step? Since I am learning Python, maybe the snake has something to offer.

Bingo! The default Python installation comes with a csv module. Hey, that's a good opportunity for me to practise my Python skills. Below is a simple skeleton that reads a CSV file and writes it back out in CSV format.

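#! /usr/bin/python
# Python 2 skeleton: copy a CSV file row by row.

import csv, sys

# Expect exactly two arguments: the input file and the output file.
if len(sys.argv) != 3:
    print "Usage:", sys.argv[0], "csv(in) csv(out)"
    sys.exit(1)

# Python 2's csv module expects files opened in binary mode.
reader = csv.reader(open(sys.argv[1], "rb"))
writer = csv.writer(open(sys.argv[2], "wb"))

# Each row arrives as a list of strings; write it out unchanged.
for row in reader:
    writer.writerow(row)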

Compared with the equivalent functionality in Tcl, the Python version is 6 times faster! Because Python lists are mutable, replacing values in a list is very efficient: the element is simply overwritten in place. In Tcl, by contrast, lreplace has to build another list object.
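To make the in-place replacement concrete, here is a minimal sketch of how the skeleton might be extended to touch only a few columns per row. The column indices, the fix_value transformation and the helper names are all hypothetical, picked only for illustration; my real job replaces 6 columns out of 285 with its own rules. The sketch is written for Python 3 and uses in-memory streams so it can be tried without any files.

```python
import csv
import io

# Hypothetical column indices, for illustration only.
COLUMNS_TO_FIX = (1, 3)


def fix_value(value):
    """Placeholder transformation: trim whitespace and upper-case the field."""
    return value.strip().upper()


def fix_row(row, columns=COLUMNS_TO_FIX):
    """Overwrite only the selected columns of row, in place."""
    for i in columns:
        row[i] = fix_value(row[i])  # list item assignment, no new list is built
    return row


def convert(infile, outfile):
    """Copy CSV rows from infile to outfile, fixing the selected columns."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        writer.writerow(fix_row(row))


if __name__ == "__main__":
    source = io.StringIO("a, b ,c,d\n")
    target = io.StringIO()
    convert(source, target)
    print(target.getvalue())
```

The loop in fix_row visits only the listed indices, so the work per row depends on how many columns need changing, not on how wide the row is.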

For this exercise, I will definitely go the Python way. So stay tuned for more performance news. To be continued ...
