Wednesday, August 29, 2007

Sun Grid Engine Accounting Users Summary

In Sun Grid Engine (SGE), all job accounting information is written to the file $SGE_ROOT/$SGE_CELL/common/accounting. Each accounting record consists of 43 fields separated by colons (':').

For details of all the fields, see the accounting(5) man page.

If you want to find out how jobs and their elapsed (wallclock) time are distributed across users, the fields of interest are the 4th (owner) and the 14th (ru_wallclock) of each accounting record.
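
As a quick check of the field positions, here is a minimal sketch that prints the owner and the wallclock time of a record. The sample record is made up for illustration (the queue, host, user and job names are hypothetical) and is truncated to its first 14 fields; the function name owner_and_wallclock is also just an illustrative choice.

```shell
# Hypothetical, truncated accounting record (first 14 fields only).
record='all.q:node01:users:alan:sleep.sh:101:sge:0:1188370000:1188370010:1188370070:0:0:60'

# Print field 4 (owner) and field 14 (ru_wallclock) of colon-separated input.
owner_and_wallclock() {
        awk -F: '{ print $4, $14 }'
}

printf '%s\n' "$record" | owner_and_wallclock
```

Against a real installation, the same function could read the accounting file on standard input instead of the sample record.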

$ man accounting


N1 Grid Engine File Formats                         ACCOUNTING(5)

NAME
     accounting - N1 Grid Engine accounting file format

DESCRIPTION
     An accounting record  is  written  to  the  N1  Grid  Engine
     accounting file for each job having finished. The accounting
     file is processed by qacct(1) to derive  accounting  statis-
     tics.

FORMAT
     Each job is represented by a line in  the  accounting  file.
     Empty  lines  and  lines which contain one character or less
     are ignored.  Accounting record  entries  are  separated  by
     colon  (':')  signs.  The  entries  denote in their order of
     appearance:

     qname
          Name of the cluster queue in which the job has run.

     hostname
          Name of the execution host.

     group
          The effective group id of the job owner when  executing
          the job.

     owner
          Owner of the N1 Grid Engine job.

     job_name
          Job name.

     job_number
          Job identifier - job number.

     account
          An account  string  as  specified  by  the  qsub(1)  or
          qalter(1) -A option.

     priority
          Priority value assigned to the job corresponding to the
          priority  parameter  in  the  queue  configuration (see
          queue_conf(5)).

     submission_time
          Submission time in seconds (since epoch format).

     start_time
          Start time in seconds (since epoch format).

     end_time
          End time in seconds (since epoch format).

N1GE 6          Last change: 2004/04/19 10:52:07                1


     failed
          Indicates the problem which  occurred  in  case  a  job
          could  not  be  started  on  the  execution  host (e.g.
          because the owner of the  job  did  not  have  a  valid
          account  on  that  machine). If N1 Grid Engine tries to
          start a job multiple times, this may lead  to  multiple
          entries  in  the  accounting  file corresponding to the
          same job ID.

     exit_status
          Exit status of  the  job  script  (or  N1  Grid  Engine
          specific status in case of certain error conditions).

     ru_wallclock
          Difference between end_time and start_time (see above).
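
Since the man page defines ru_wallclock as end_time minus start_time, the field positions can be cross-checked by comparing field 14 with field 11 minus field 10. The sketch below assumes hypothetical, truncated sample records (the second one is deliberately inconsistent); check_wallclock is an illustrative name, and real accounting data may not always satisfy the equality exactly.

```shell
# Hypothetical, truncated accounting records (first 14 fields only).
records='all.q:node01:users:alan:sleep.sh:101:sge:0:1188370000:1188370010:1188370070:0:0:60
all.q:node02:users:bob:sim.sh:102:sge:0:1188370000:1188370020:1188370050:0:0:45'

# Report records whose ru_wallclock ($14) differs from end_time ($11)
# minus start_time ($10).
check_wallclock() {
        awk -F: '$14 != $11 - $10 {
                print "mismatch: job " $6 " owner " $4 \
                      " (recorded " $14 ", computed " ($11 - $10) ")"
        }'
}

printf '%s\n' "$records" | check_wallclock
```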

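Below is an AWK script that summarises the accounting information, followed by its output. For each user, the "Job" column is that user's share of the total number of jobs and the "Run Time" column is the share of the total wallclock time. The usernames are fictitious.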

$ cat sge-summary.sh
#! /bin/sh

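awk -F":" '
NF==43 {
        # $4  - owner
        # $14 - ru_wallclock (end_time minus start_time)
        ++jsum[$4]              # jobs per user
        ++jcnt                  # total number of jobs
        tsum[$4]+=$14           # wallclock time per user
        tcnt+=$14               # total wallclock time
}
END {
        printf("%-10s\tJob\tRun Time\n", "User")
        for(i in jsum) {
                # "%%" prints a literal percent sign; guard against a zero total
                printf("%-10s\t%.2f%%\t%.2f%%\n", i, jsum[i]*100/jcnt, tcnt ? tsum[i]*100/tcnt : 0)
        }
}' "$SGE_ROOT/$SGE_CELL/common/accounting"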


$ ./sge-summary.sh
User            Job     Run Time
alan            0.02%    0.00%
bob             2.43%    0.00%
carl            0.02%    0.00%
daryl           0.84%    0.00%
edwin           0.20%    0.00%
francis         0.01%    0.00%
george          0.06%    0.00%
harry           0.02%    0.00%
irene           0.02%    0.00%
jeffrey         0.71%    99.36%
karen           0.05%    0.00%
leo             0.04%    0.00%
mark            0.04%    0.00%
nelson          95.32%   0.64%
oliver          0.22%    0.00%
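
Percentages hide the absolute numbers, so most users show up as 0.00% of the run time. As a variation, here is a minimal sketch that prints the job count and the total wallclock hours per user, assuming ru_wallclock is in seconds like the start and end times. The sample records are hypothetical and truncated to 14 fields, which is why the guard is relaxed to NF >= 14 instead of the NF==43 used above; user_totals is an illustrative name. The output is piped through sort because AWK does not guarantee the order of a for-in loop.

```shell
# Hypothetical, truncated accounting records (first 14 fields only).
records='all.q:node01:users:alan:a.sh:101:sge:0:1188370000:1188370000:1188373600:0:0:3600
all.q:node01:users:alan:a.sh:102:sge:0:1188370000:1188373600:1188375400:0:0:1800
all.q:node02:users:bob:b.sh:103:sge:0:1188370000:1188370000:1188377200:0:0:7200'

# Print "user  jobs  hours" per owner, sorted by user name.
# Reads accounting records on standard input.
user_totals() {
        awk -F: '
        NF >= 14 {
                ++jobs[$4]              # $4  - owner
                secs[$4] += $14         # $14 - ru_wallclock, assumed seconds
        }
        END {
                for (u in jobs)
                        printf("%-10s\t%d\t%.2f\n", u, jobs[u], secs[u] / 3600)
        }' | sort
}

printf '%s\n' "$records" | user_totals
```

On a real cluster the function could be fed the accounting file directly, for example user_totals < "$SGE_ROOT/$SGE_CELL/common/accounting", with the guard tightened back to NF==43 if desired.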
