Monday, July 07, 2008

Three Hundred Thousand Files In A Single Directory

A colleague of mine manages a set of Red Hat Enterprise Linux servers. On one of the web servers, the / partition was running low on disk space. The obvious way to find out which directory is the culprit is to run
find / -mount -type d -exec du -sk {} \; | sort -n -k 1
This lists every directory sorted by size, with the biggest disk consumers at the bottom.
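For example, since sort -n puts the largest entries last, you can tack on tail to see just the ten biggest directories:
find / -mount -type d -exec du -sk {} \; | sort -n -k 1 | tail -10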

My colleague traced the problem to /var/spool/mqueue, the directory where sendmail queues mail it has not yet delivered. That single directory contained 314,000+ files: mails that could not be delivered for some reason and got stuck in the queue. If you run "ls -l" there, you will wait ages for the listing to come back. (See this blog to understand how to minimise the number of system calls in a directory listing.)
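One quick way to count the files without that long wait is to skip the sorting and per-file stat() calls altogether; ls -f prints entries in raw directory order (note the count includes the . and .. entries):
ls -f /var/spool/mqueue | wc -l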

Reading one of the 314,000+ files showed that its content was generated by an rsync command in the crontab. The rsync job is supposed to synchronise the web content between two servers, and it runs every minute. The crontab entry looks like this:
* * * * * rsync ....
Since the rsync command does not redirect standard output and standard error, cron automatically mails any output to the user. And since sendmail is not running, each mail sits in the queue (/var/spool/mqueue/*). With rsync running every minute, that is 1,440 mails (24*60) queued every day. If the server has been up for 218 days, you end up with 314,000+ files in /var/spool/mqueue.
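A quick sanity check of the arithmetic in the shell:
echo $(( 24 * 60 ))       # 1440 mails queued per day
echo $(( 1440 * 218 ))    # 313920, roughly the 314,000+ files observed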

Appending >/dev/null 2>&1 to the rsync entry in the crontab simply discards all output from the rsync command, so cron has nothing to mail out.
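The corrected entry would look something like this; the real rsync arguments were elided above, so the paths and host here are only placeholders:
# source, destination and options below are placeholders for the real ones
* * * * * rsync -a /web/content/ otherserver:/web/content/ >/dev/null 2>&1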

If you intend to clean up all the files in /var/spool/mqueue, you may want to do what my colleague did: remove the /var/spool/mqueue directory and re-create it. If you try to operate on the files with a wildcard (*), for example ls *, the shell cannot build the command line and you get "-bash: /bin/ls: Argument list too long".
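A sketch of the remove-and-recreate approach, done as a rename first so a fresh queue directory is available immediately (the ownership and mode shown are assumptions; check your distribution's defaults for /var/spool/mqueue before relying on them):
mv /var/spool/mqueue /var/spool/mqueue.old
mkdir /var/spool/mqueue
chown root:root /var/spool/mqueue   # assumed owner; verify on your system
chmod 700 /var/spool/mqueue         # assumed mode; verify on your system
rm -rf /var/spool/mqueue.old
Because rm -rf operates on the directory itself, it never builds a huge argument list, so it sidesteps the "Argument list too long" problem entirely.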

BTW, do you think Windows can function properly with 314K files in a single directory?


2 Comments:

Blogger samwyse said...

I just found your blog; it is very interesting. I've been using Unix for about 25 years, and while I already know a lot of the stuff you're just discovering, there are a few things that you've been able to teach me.

Two thoughts on this post. First, if you want to get rid of a big directory without losing any new files that may arrive while you are deleting it, it's better to first rename it, then delete it. I realize that it wasn't a big concern in this case, but it might be in the future.

Second, if you want to issue shell commands against a really long list of files, look into xargs. For example, "cd /var/spool/mqueue; ls | xargs rm" (ls prints bare filenames, so run it from inside the directory). The xargs program knows how big a shell command line can get, and will issue a new rm command for every few hundred files that are listed. In this case, there might be other emails among the 314,000 that you want to examine before you delete them, so let's just get rid of the ones that mention rsync: "cd /var/spool/mqueue; ls | xargs grep -l rsync | xargs rm". The "xargs grep -l rsync" stage works as a filter, passing along only those files that contain the word rsync. Once you've done that, you can check the remaining files for, perhaps, other cron commands that don't run as often as your misbehaving rsync.

8:44 PM  
Blogger chihungchan said...

Thanks for the comment. I didn't know I could daisy-chain xargs. In most cases I just limit the output with find's -name "*pattern*" option.

9:28 PM  
