Tcl Code Refactoring
It is not difficult to parse the mapping file and store it in a Tcl associative array so that it can be used for dynamic user name substitution. The CSV module from Tcllib proved extremely useful for parsing the CSV output. To make the mapping work properly, I need to dynamically generate the switch body that decides whether to substitute a user ID with the real user name, or to set the default email address / telephone number when the cell is blank. The body has to be generated dynamically because patterns inside a braced switch body do not undergo variable substitution. However, it is very inefficient to rebuild that Tcl code on every iteration of the while loop.
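To illustrate the substitution problem, here is a minimal sketch. The column indices and the two proc names are made up for the example and are not from my actual script. With a braced body, switch sees the pattern as the literal text $index(email), so the column number never matches.

```tcl
# Hypothetical column indices, for illustration only.
array set index {email 3 telephone 4}

# Braced body: the pattern is the literal string "$index(email)", not 3.
proc classifyBraced {i} {
    global index
    switch -- $i {
        $index(email) { return email }
        default       { return other }
    }
}

# Inline form: each pattern is a separate word, so Tcl substitutes
# $index(email) before switch ever sees it.
proc classifyInline {i} {
    global index
    switch -- $i $index(email) { return email } default { return other }
}

puts [classifyBraced 3]   ;# prints "other"
puts [classifyInline 3]   ;# prints "email"
```

The inline form works for a handful of cases, but it gets unwieldy with many patterns and multi-line bodies, which is why I generate the braced body as a string instead.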
It took 10 seconds to process a CSV file of 554 rows x 285 columns. I am definitely not satisfied with that run time, and I am sure Tcl can do better.
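For anyone who wants to take a similar measurement, Tcl's built-in time command is one way to do it. The sketch below is only an assumption about how this could be set up: processRows is a hypothetical stand-in that just visits every cell, not my real processing code.

```tcl
# Hypothetical stand-in for the real per-cell work: visit every cell once.
proc processRows {rows cols} {
    set count 0
    for {set r 0} {$r < $rows} {incr r} {
        for {set c 0} {$c < $cols} {incr c} {
            incr count
        }
    }
    return $count
}

# "time" returns a string like "1234 microseconds per iteration";
# the first word is the elapsed time in microseconds.
set result [time {processRows 554 285} 1]
set micros [lindex $result 0]
puts [format "%.3f seconds" [expr {$micros / 1e6}]]
```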
It is code refactoring time. By taking the switch body out of the loop and generating it once using subst, we avoid building that piece of code over and over again. We can also collapse all the matching cases into a single command body using the "-" trick in switch to avoid repeating code. Below is a code snippet:
    set switchBody [subst -nocommands {
        $index(email) {
            if { [string length \$cell] == 0 } {
                set cell $defaultEmail
            }
        }
        $index(telephone) {
            if { [string length \$cell] == 0 } {
                set cell $defaultTelephone
            }
        }
        $index(assigned) -
        $index(closed) -
        $index(fixed) -
        $index(response) {
            if { [info exists map(\$cell)] == 1 } {
                set cell \$map(\$cell)
            }
        }
    }]
    ...
    while { [gets $fp line] >= 0 } {
        set lcsv [::csv::split -alternate $line]
        set lcsvN [llength $lcsv]
        set new {}
        for { set i 0 } { $i < $lcsvN } { incr i } {
            set cell [lindex $lcsv $i]
            switch $i $switchBody
            lappend new $cell
        }
        puts [::csv::join $new]
    }
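To try the same technique without Tcllib or a real data file, here is a cut-down, self-contained sketch. The column layout, the user IDs and names, and the default email are all invented for the example. It builds the switch body once with subst -nocommands, uses "-" to share one body between two columns, and then reuses the body for every cell.

```tcl
# Hypothetical column positions, user map and default; not the real data.
array set index {email 1 assigned 2 closed 3}
array set map {u01 "Alice Tan" u02 "Bob Lee"}
set defaultEmail nobody@example.com

# Built once: $index(...) and $defaultEmail are substituted now, while
# \$cell and \$map(...) stay as variable references for run time.
# Note: a default value containing spaces would need extra quoting here.
set switchBody [subst -nocommands {
    $index(email) {
        if { [string length \$cell] == 0 } { set cell $defaultEmail }
    }
    $index(assigned) -
    $index(closed) {
        if { [info exists map(\$cell)] } { set cell \$map(\$cell) }
    }
}]

# Two sample rows, already split into cells.
set rows {{1001 {} u01 u02} {1002 a@example.com u02 u99}}
set out {}
foreach row $rows {
    set new {}
    set i 0
    foreach cell $row {
        switch -- $i $switchBody
        lappend new $cell
        incr i
    }
    lappend out $new
}
puts [join $out \n]
```

The unknown ID u99 in the second row is left untouched because it has no entry in map, and the non-empty email is kept as it is.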
Now the run time is down to 3.8 seconds, which is about 2.5 times faster. I may have to tune this code further when my colleague provides me with the real data source, which has a few hundred thousand records.
Labels: Tcl