Tuesday, April 29, 2008

Sun Grid Engine for Rendering

In the past, I submitted animation scene files to SGE (Sun Grid Engine) as one job per frame. That was very hard to manage because we had to deal with lots of SGE job IDs for a single scene.

Did you know that in SGE you can treat the whole scene as an array job and let SGE handle the scheduling of the individual sub-tasks? For example, if you have a scene file with frames 1 to 256 to render, you can submit it as qsub -t 1-256:1 render.sh scenefile.mb. SGE will assign only one SGE job ID. This gives you the flexibility to manipulate the entire job through a single SGE job ID, e.g. alter the job dependency, change the priority, hold the job in the queue, or remove it from the queue. You no longer have to keep track of individual frames. The SGE man page describes the -t option as:

Submits a so called Array Job, i.e. an array of identical tasks
being differentiated only by an index number and being treated
by Sun Grid Engine almost like a series of jobs. The option
argument to -t specifies the number of array job tasks and the
index number which will be associated with the tasks. The index
numbers will be exported to the job tasks via the environment
variable SGE_TASK_ID.
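
Since the whole scene maps to a single job ID, one command can act on every frame at once. A few illustrative examples (job ID 572 is just a placeholder):

qalter -p -100 572   # lower the priority of every task in job 572
qhold 572            # hold all pending tasks
qrls 572             # release them again
qdel 572             # remove the whole job, i.e. all remaining frames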

In essence,

  • Scene file == SGE job ID / name (set in the JOB_ID / JOB_NAME environment variables)
  • Frame in scene file == SGE task ID (set in the SGE_TASK_ID environment variable)

When a task is scheduled to run on an execution node, the SGE_TASK_ID environment variable will be set to that task's ID within the array job. In this case, SGE_TASK_ID is your frame number.
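
For reference, render.sh can be as simple as the sketch below. This is a minimal example, not my production script; the Render command and its -s/-e flags are assumptions based on Maya's command-line renderer, so substitute your own renderer invocation:

#!/bin/sh
#$ -S /bin/sh
#$ -cwd

# the scene file is the first argument passed through qsub
scenefile=$1

# SGE sets SGE_TASK_ID to the frame number of this array task
frame=$SGE_TASK_ID

# render exactly one frame (start frame == end frame)
Render -s $frame -e $frame "$scenefile"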

You may also want to couple the submission command (qsub) with "-P" (project), "-A" (accounting string) and "-N" (job name) to help you streamline other activities such as accounting and monitoring.
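
Putting it together, a submission could look like this (the project and account names here are made up):

qsub -t 1-256:1 -N scenefile_001 -P renderfarm -A client_x render.sh scenefile_001.mb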

The images generated from the rendering can easily consist of hundreds of files totalling megabytes or gigabytes. Downloading individual images would be troublesome, and 'bandwidth unfriendly' if they are not compressed. I explored a number of ways to accomplish this effectively:

  1. Write an "epilog" script in the SGE queue configuration (qconf -mq all.q) so that it runs at the end of each task. However, once the array job is running in the queue, we have no control over which task completes first; we cannot assume the last frame will be rendered last. The script has to be smart enough to know that the current task is indeed the last one left in the queue. In my newly revamped cluster, I ensure the SGE job name is unique, so I can easily identify the scene file from qstat -xml:
    # count how many jobs with this name SGE still knows about;
    # if this is the only one left, the current task is the last to finish
    njob=`SGE_ROOT=/gridware/sge /gridware/sge/bin/lx24-amd64/qstat -xml | grep -c "<JB_name>$JOB_NAME</JB_name>"`
    if [ "$njob" -eq 1 ]; then
        cd /san/renderImages
        tar cf "$JOB_NAME.tar" "./$JOB_NAME"
        gzip "$JOB_NAME.tar"
    fi
    
    
  2. Take advantage of SGE job dependencies: a first job does the rendering and a second job does the compression, with the dependency that the second job only starts once the first job finishes (see the sketch after this list).
  3. Run a cron job on the server where the storage is directly attached. The script needs to make sure it does not compress a scene whose job is still running. Since the script lists the directories in reverse chronological order, we can terminate the loop once we reach a directory that has already been processed:
    #!/bin/sh
    
    dir="/san/renderImages"
    
    cd "$dir" || exit 1
    
    # newest directories first
    for i in `ls -1td scenefile*`
    do
            # skip scenes whose SGE job is still running
            SGE_ROOT=/gridware/sge /gridware/sge/bin/lx24-amd64/qstat -xml | grep "<JB_name>$i</JB_name>" > /dev/null
            if [ $? -eq 0 ]; then
                    continue
            fi
    
            # previously compressed; everything older is done too, so stop
            tarfile="$i.tar"
            if [ -f "$tarfile.gz" ]; then
                    break
            fi
    
            # compress now
            tar cf "$tarfile" "./$i"
            gzip "$tarfile"
    done
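
For the second method, the dependency can be expressed with qsub's -hold_jid option. A sketch, assuming a hypothetical compress.sh that tars and gzips the scene's output directory:

# first job renders all the frames of the scene
qsub -t 1-256:1 -N scenefile_001 render.sh scenefile_001.mb

# second job starts only after every task of scenefile_001 has finished
qsub -hold_jid scenefile_001 compress.sh scenefile_001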
    

I opted for the last method because the 'tar' and 'gzip' run on the server with the direct-attached storage, so no network traffic is incurred. Also, unlike the second method, it requires no additional SGE job. The first and second methods would have to run the compression on an execution node, reading and writing over the network.
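
The script can then be driven from the storage server's crontab, e.g. every 10 minutes (the script path and interval are just examples):

*/10 * * * * /san/scripts/compress_renders.sh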

BTW, the output of qstat -xml looks like this:

<?xml version='1.0'?>
<job_info  xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <queue_info>
    <job_list state="running">
      <JB_job_number>572</JB_job_number>
      <JAT_prio>0.55500</JAT_prio>
      <JB_name>scenefile_001</JB_name>
      <JB_owner>renderer</JB_owner>
      <state>r</state>
      <JAT_start_time>2008-04-29T10:48:11</JAT_start_time>
      <queue_name>all.q@l1</queue_name>
      <slots>1</slots>
      <tasks>9</tasks>
    </job_list>
    <job_list state="running">
      <JB_job_number>570</JB_job_number>
      <JAT_prio>0.55500</JAT_prio>
      <JB_name>scenefile_002</JB_name>
      <JB_owner>renderer</JB_owner>
      <state>r</state>
      <JAT_start_time>2008-04-29T10:47:16</JAT_start_time>
      <queue_name>all.q@l10</queue_name>
      <slots>1</slots>
      <tasks>68</tasks>
    </job_list>
 ...
