mkdir, the limit
So the question is: is there anything we can do? I don't think so, because this is a built-in limit in Solaris. What we can do is ask ourselves why we need that many sub-directories, and whether the flat directory structure can be changed to a hierarchical one. It is pretty obvious that too many files or directories in one folder will cause a performance problem when an application interacts with it.
Let's do an experiment just to convince ourselves that there is indeed a limit:
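To make the "hierarchical" idea concrete, here is a minimal sketch of one way to spread numbered directories over two levels. The function name bucket_path and the bucket size of 1000 are my own assumptions for illustration, not something from the original setup, and the $(( )) arithmetic needs a POSIX shell (ksh or bash rather than the old Solaris /bin/sh).

```sh
# Map a numeric id to a two-level path so that no single directory holds
# more than BUCKET_SIZE entries. Both names below are hypothetical.
BUCKET_SIZE=1000

bucket_path() {
    # First level: the id divided by the bucket size, zero-padded to 3 digits.
    # Second level: the id itself.
    printf '%03d/%s\n' $(( $1 / BUCKET_SIZE )) "$1"
}

bucket_path 32768     # prints 032/32768
```

With this layout, mkdir -p "$(bucket_path 32768)" creates the directory, and 32768 ids need only 33 first-level directories of at most 1000 entries each, far below the link limit.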
$ uname -a
SunOS myhost 5.9 Generic_118558-11 sun4u sparc SUNW,Sun-Fire-V240
$ psrinfo -v
Status of processor 0 as of: 04/14/2007 12:16:12
  Processor has been on-line since 10/16/2006 09:10:39.
  The sparcv9 processor operates at 1002 MHz,
  and has a sparcv9 floating point processor.
Status of processor 1 as of: 04/14/2007 12:16:12
  Processor has been on-line since 10/16/2006 09:10:38.
  The sparcv9 processor operates at 1002 MHz,
  and has a sparcv9 floating point processor.
$ mkdir test1
$ cd test1
$ time for i in `perl -e '$,=" ";print 1..32768'`
do
    mkdir $i
done
mkdir: Failed to make directory "32766"; Too many links
mkdir: Failed to make directory "32767"; Too many links
mkdir: Failed to make directory "32768"; Too many links
real 2m52.716s
user 0m40.830s
sys 2m7.210s
$ cd ..
$ perl
$dir="test1";
($x,$x,$x,$nlink)=stat($dir);
print $nlink,"\n";
32767
As we can see, the maximum number of sub-directories one can create is 32765. The link count of the parent is 32767 because a directory starts with two links, one for its name in its own parent and one for itself (.), and every sub-directory we create adds one more link through its parent entry (..). So 2 + 32765 = 32767.
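The link arithmetic can be written down as a small shell sketch. The function names max_subdirs and link_count are mine, chosen for illustration, and the "2 links used by the directory itself" rule is the traditional UFS-style accounting seen in the experiment; other file systems may count differently.

```sh
# Sub-directories a directory can hold, given the file system's link limit:
# two links are already used by the directory itself (its name in the
# parent directory, and its own "." entry).
max_subdirs() {
    echo $(( $1 - 2 ))
}

# Link count of a directory, read from the second column of "ls -ld".
link_count() {
    ls -ld "$1" | awk '{ print $2 }'
}

max_subdirs 32767     # prints 32765
```

Running link_count test1 after the experiment should report the same 32767 that the Perl stat call printed. On systems that support it, getconf LINK_MAX . reports the limit for the current file system, which can be fed straight into max_subdirs.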
Let's sidetrack a little and look at this from the performance angle. It took almost 3 minutes to create 32765 sub-directories. Can it be faster? Let's see what interpreted languages like Tcl and Perl can offer.
The Perl way:
$ perl -v
This is perl, v5.6.1 built for sun4-solaris-64int
(with 48 registered patches, see perl -V for more detail)
Copyright 1987-2001, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using `man perl' or `perldoc perl'. If you have access to the
Internet, point your browser at http://www.perl.com/, the Perl Home Page.
$ mkdir test2
$ cd test2
$ time perl
$|=1;
$start=1;
$end=32768;
for ($i=$start; $i<=$end; $i++) {
    mkdir($i);
}
real 0m21.098s
user 0m0.180s
sys 0m7.740s
Wow, we are talking about an 8 times speed-up. How about my favourite, Tcl? The Tcl way:
$ cat a.tcl
#! /usr/sfw/bin/tclsh8.3

for { set i 1 } { $i <= 32768 } { incr i } {
    file mkdir $i
}
$ mkdir test3
$ cd test3
$ time ../a.tcl
can't create directory "32766": too many links
    while executing
"file mkdir $i"
    ("for" body line 2)
    invoked from within
"for { set i 1 } { $i <= 32768 } { incr i } {
    file mkdir $i
}
"
    (file "../a.tcl" line 3)
real 0m20.796s
user 0m0.900s
sys 0m8.040s
Wow, I am impressed that Tcl 8.3 (latest is 8.4) is as good as Perl 5.6.1 (latest is 5.8.8).
BTW, removing that many sub-directories (3 x 32765, under test1, test2 and test3) took:
$ time /bin/rm -rf test1 test2 test3
real 0m35.425s
user 0m1.210s
sys 0m19.740s
Why can Perl and Tcl perform better than the shell script? There is no fork or exec of a new process for each directory in Perl and Tcl, because the "mkdir" function is built into their interpreters. So how much overhead are we talking about? Let's do another experiment.
$ truss mkdir newdir 2>&1 | wc -l
48
$ truss mkdir u
execve("/usr/bin/mkdir", 0xFFBFFCF4, 0xFFBFFD00) argc = 2
resolvepath("/usr/lib/ld.so.1", "/usr/lib/ld.so.1", 1023) = 16
resolvepath("/usr/bin/mkdir", "/usr/bin/mkdir", 1023) = 14
stat("/usr/bin/mkdir", 0xFFBFFAC8) = 0
open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
stat("/usr/lib/libgen.so.1", 0xFFBFF5D0) = 0
resolvepath("/usr/lib/libgen.so.1", "/usr/lib/libgen.so.1", 1023) = 20
open("/usr/lib/libgen.so.1", O_RDONLY) = 3
mmap(0x00010000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFF3A0000
mmap(0x00010000, 98304, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFF380000
mmap(0xFF380000, 22677, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF380000
mmap(0xFF396000, 2343, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 24576) = 0xFF396000
munmap(0xFF386000, 65536) = 0
memcntl(0xFF380000, 6304, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3) = 0
stat("/usr/lib/libc.so.1", 0xFFBFF5D0) = 0
resolvepath("/usr/lib/libc.so.1", "/usr/lib/libc.so.1", 1023) = 18
open("/usr/lib/libc.so.1", O_RDONLY) = 3
mmap(0xFF3A0000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3A0000
mmap(0x00010000, 802816, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFF280000
mmap(0xFF280000, 701788, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF280000
mmap(0xFF33C000, 24664, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 704512) = 0xFF33C000
munmap(0xFF32C000, 65536) = 0
memcntl(0xFF280000, 117372, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3) = 0
stat("/usr/lib/libdl.so.1", 0xFFBFF5D0) = 0
resolvepath("/usr/lib/libdl.so.1", "/usr/lib/libdl.so.1", 1023) = 19
open("/usr/lib/libdl.so.1", O_RDONLY) = 3
mmap(0xFF3A0000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3A0000
mmap(0x00002000, 8192, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFF3FA000
mmap(0xFF3FA000, 1894, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3FA000
close(3) = 0
mmap(0x00000000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFF370000
stat("/usr/platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1", 0xFFBFF2E0) = 0
resolvepath("/usr/platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1", "/usr/platform/sun4u-us3/lib/libc_psr.so.1", 1023) = 41
open("/usr/platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1", O_RDONLY) = 3
mmap(0xFF3A0000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3A0000
close(3) = 0
getustack(0xFFBFF914)
getrlimit(RLIMIT_STACK, 0xFFBFF90C) = 0
getcontext(0xFFBFF748)
setustack(0xFF343A5C)
brk(0x000245E8) = 0
brk(0x000265E8) = 0
umask(0) = 077
umask(077) = 0
mkdir("u", 0777) = 0
_exit(0)
OK, we now know that every 'mkdir' command makes 48 system calls. So it takes 32,765 x 48 = 1,572,720 system calls to create that many directories. In Tcl or Perl, it takes only 6 system calls to create a single directory, which is 8 times fewer. This tallies with our initial speed-up calculation.
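For anyone who wants to redo the sums, here is a small sketch using only the numbers reported in this post. The function names syscall_total and ratio are made up for illustration, and awk is used for the division because shell arithmetic is integer-only.

```sh
# Total system calls = directories created x system calls per directory.
syscall_total() {
    echo $(( $1 * $2 ))
}

# Ratio of two counts or timings, printed to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

syscall_total 32765 48    # prints 1572720 (one mkdir process per directory)
syscall_total 32765 6     # prints 196590  (built-in mkdir in Perl or Tcl)
ratio 48 6                # prints 8.0
ratio 172.716 21.098      # prints 8.2 (shell loop vs. Perl, real time in seconds)
```

The system call ratio (8.0) and the measured wall-clock ratio (8.2) are close, which is why the two figures are said to tally. It is a rough match only, since not every system call costs the same.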
Here are the 6 system calls, from a truss of tclsh running a single "file mkdir t":
9364/1: read(0, " f i l e m k d i r t".., 4096) = 13
9364/1: stat("t", 0xFFBFE910) Err#2 ENOENT
9364/1: umask(0) = 077
9364/1: umask(077) = 0
9364/1: mkdir("t", 0700) = 0
9364/1: write(1, " % ", 2) = 2
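If the cost is mostly the process start-up around each mkdir, the shell can claw some of it back without switching language: mkdir accepts many directory names in one invocation, and xargs will pack the names into as few invocations as the argument-length limit allows. The sketch below is an untested idea in the sense that I did not time it on the machine above, and the function name make_dirs_batched is hypothetical.

```sh
# Create directories 1..COUNT under PARENT, letting xargs hand many names
# to each mkdir process instead of starting one process per directory.
make_dirs_batched() {
    parent=$1
    count=$2
    (
        cd "$parent" || exit 1
        i=1
        while [ "$i" -le "$count" ]; do
            echo "$i"
            i=$((i + 1))
        done | xargs mkdir
    )
}

# Small demonstration in a scratch directory (path chosen for illustration).
scratch=${TMPDIR:-/tmp}/mkdir_batch_demo.$$
mkdir "$scratch" && make_dirs_batched "$scratch" 100
ls "$scratch" | wc -l     # counts the 100 directories just created
rm -rf "$scratch"
```

The mkdir("...") system call itself is still made once per directory. What shrinks is the execve, library mapping and exit work seen in the truss output, which is now paid once per batch rather than once per directory.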
Labels: performance, Perl, Solaris, Tcl, unix