mkdir, the limit
So the question is: is there anything we can do? I don't think so, because this is a built-in limit in Solaris. What we can do is ask ourselves why we need that many sub-directories, and whether we can change the flat directory structure into a hierarchical one. It is pretty obvious that too many files or directories in one folder will cause performance problems for any application that interacts with it.
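To make the "hierarchical instead of flat" idea concrete, here is a minimal sketch of one common approach: hash each name and use the hash to pick a two-level bucket directory. The function name shard_path and the 256 x 256 layout are my own assumptions for illustration, not something from the original setup; cksum and awk are standard POSIX tools.

```shell
#!/bin/sh
# Hypothetical helper: map a name to "xx/yy/name", where xx and yy are
# two hex digits each, derived from the CRC that cksum computes.
shard_path() {
    name=$1
    # cksum prints "<crc> <byte count>"; keep only the CRC.
    crc=$(printf '%s' "$name" | cksum | awk '{ print $1 }')
    # First level: low 8 bits of the CRC. Second level: the next 8 bits.
    printf '%02x/%02x/%s\n' $((crc % 256)) $((crc / 256 % 256)) "$name"
}

# Show where a sample name would land (nothing is created here).
shard_path user42
```

With this layout, `mkdir -p "$(shard_path user42)"` would create the directory under its bucket, and no single directory needs to hold more than 256 bucket sub-directories plus whatever names hash into it.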
Let's do an experiment to convince ourselves that there is indeed a limit:
$ uname -a
SunOS myhost 5.9 Generic_118558-11 sun4u sparc SUNW,Sun-Fire-V240
$ psrinfo -v
Status of processor 0 as of: 04/14/2007 12:16:12
Processor has been on-line since 10/16/2006 09:10:39.
The sparcv9 processor operates at 1002 MHz,
and has a sparcv9 floating point processor.
Status of processor 1 as of: 04/14/2007 12:16:12
Processor has been on-line since 10/16/2006 09:10:38.
The sparcv9 processor operates at 1002 MHz,
and has a sparcv9 floating point processor.
$ mkdir test1
$ cd test1
$ time for i in `perl -e '$,=" ";print 1..32768'`
do
mkdir $i
done
mkdir: Failed to make directory "32766"; Too many links
mkdir: Failed to make directory "32767"; Too many links
mkdir: Failed to make directory "32768"; Too many links
real 2m52.716s
user 0m40.830s
sys 2m7.210s
$ cd ..
$ perl
$dir="test1";
($x,$x,$x,$nlink)=stat($dir);
print $nlink,"\n";
32767
As we can see, the maximum number of sub-directories one can create is 32765. The link count of test1 is 32767 because an empty directory already has two links, its entry in the parent directory and its own "." entry, and every sub-directory adds one more through its ".." entry. So 32767 - 2 = 32765.
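The arithmetic behind that number can be sketched in a few lines of shell. The function name max_subdirs is hypothetical; `getconf LINK_MAX` is the POSIX way to ask for the link limit of a given path, and its answer depends on the operating system and file system, so no output is shown for it.

```shell
#!/bin/sh
# A directory's link count is 2 (its name in the parent plus its own ".")
# plus one ".." entry for each immediate sub-directory.
max_subdirs() {
    # $1 = the link limit of the file system (LINK_MAX)
    echo $(($1 - 2))
}

# The limit observed in the experiment above.
max_subdirs 32767

# Ask the system what the limit is for the current directory.
getconf LINK_MAX . 2>/dev/null || echo "LINK_MAX not reported"
```

The first command prints 32765, matching the last directory that mkdir managed to create.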
Let's sidetrack a little and look at this from the performance angle. It took almost 3 minutes to create 32765 sub-directories. Can it be faster? Let's see what interpreted languages like Tcl and Perl can offer.
The Perl way:
$ perl -v
This is perl, v5.6.1 built for sun4-solaris-64int
(with 48 registered patches, see perl -V for more detail)
Copyright 1987-2001, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using `man perl' or `perldoc perl'. If you have access to the
Internet, point your browser at http://www.perl.com/, the Perl Home Page.
$ mkdir test2
$ cd test2
$ time perl
$|=1;
$start=1;
$end=32768;
for ($i=$start; $i<=$end; $i++) {
    mkdir($i);   # return value is not checked, so the last three failures are silent
}
real 0m21.098s
user 0m0.180s
sys 0m7.740s
Wow, we are talking about an 8 times speed-up (21 seconds instead of 2m52s). How about my favourite, Tcl? The Tcl way:
$ cat a.tcl
#! /usr/sfw/bin/tclsh8.3
for { set i 1 } { $i <= 32768 } { incr i } {
file mkdir $i
}
$ mkdir test3
$ cd test3
$ time ../a.tcl
can't create directory "32766": too many links
while executing
"file mkdir $i"
("for" body line 2)
invoked from within
"for { set i 1 } { $i <= 32768 } { incr i } {
file mkdir $i
}
"
(file "../a.tcl" line 3)
real 0m20.796s
user 0m0.900s
sys 0m8.040s
Wow, I am impressed that Tcl 8.3 (the latest is 8.4) is as fast as Perl 5.6.1 (the latest is 5.8.8).
BTW, removing that many sub-directories (3 x 32765, under test1, test2 and test3) took:
$ time /bin/rm -rf test1 test2 test3
real 0m35.425s
user 0m1.210s
sys 0m19.740s
Why can Perl and Tcl perform better than the shell script? Obviously, Perl and Tcl do not fork and exec a process for every directory, because they have the "mkdir" function built into their interpreters. So how much overhead are we talking about? Let's do another experiment.
$ truss mkdir newdir 2>&1 | wc -l
48
$ truss mkdir u
execve("/usr/bin/mkdir", 0xFFBFFCF4, 0xFFBFFD00) argc = 2
resolvepath("/usr/lib/ld.so.1", "/usr/lib/ld.so.1", 1023) = 16
resolvepath("/usr/bin/mkdir", "/usr/bin/mkdir", 1023) = 14
stat("/usr/bin/mkdir", 0xFFBFFAC8) = 0
open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
stat("/usr/lib/libgen.so.1", 0xFFBFF5D0) = 0
resolvepath("/usr/lib/libgen.so.1", "/usr/lib/libgen.so.1", 1023) = 20
open("/usr/lib/libgen.so.1", O_RDONLY) = 3
mmap(0x00010000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFF3A0000
mmap(0x00010000, 98304, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFF380000
mmap(0xFF380000, 22677, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF380000
mmap(0xFF396000, 2343, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 24576) = 0xFF396000
munmap(0xFF386000, 65536) = 0
memcntl(0xFF380000, 6304, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3) = 0
stat("/usr/lib/libc.so.1", 0xFFBFF5D0) = 0
resolvepath("/usr/lib/libc.so.1", "/usr/lib/libc.so.1", 1023) = 18
open("/usr/lib/libc.so.1", O_RDONLY) = 3
mmap(0xFF3A0000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3A0000
mmap(0x00010000, 802816, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFF280000
mmap(0xFF280000, 701788, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF280000
mmap(0xFF33C000, 24664, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 704512) = 0xFF33C000
munmap(0xFF32C000, 65536) = 0
memcntl(0xFF280000, 117372, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3) = 0
stat("/usr/lib/libdl.so.1", 0xFFBFF5D0) = 0
resolvepath("/usr/lib/libdl.so.1", "/usr/lib/libdl.so.1", 1023) = 19
open("/usr/lib/libdl.so.1", O_RDONLY) = 3
mmap(0xFF3A0000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3A0000
mmap(0x00002000, 8192, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFF3FA000
mmap(0xFF3FA000, 1894, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3FA000
close(3) = 0
mmap(0x00000000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFF370000
stat("/usr/platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1", 0xFFBFF2E0) = 0
resolvepath("/usr/platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1", "/usr/platform/sun4u-us3/lib/libc_psr.so.1", 1023) = 41
open("/usr/platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1", O_RDONLY) = 3
mmap(0xFF3A0000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3A0000
close(3) = 0
getustack(0xFFBFF914)
getrlimit(RLIMIT_STACK, 0xFFBFF90C) = 0
getcontext(0xFFBFF748)
setustack(0xFF343A5C)
brk(0x000245E8) = 0
brk(0x000265E8) = 0
umask(0) = 077
umask(077) = 0
mkdir("u", 0777) = 0
_exit(0)
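Most of the calls in a trace like the one above are process start-up (resolving and mapping shared libraries); only one of them is the mkdir itself. A minimal sketch of how to see that cost without leaving the shell is to compare one mkdir process per directory against a few mkdir processes that each receive many names. Using xargs for the second variant is my own illustration and was not part of the experiment; the function names are hypothetical, and `mktemp -d` is assumed to be available.

```shell
#!/bin/sh
# make_loop N: one fork+exec of mkdir per directory, as in the shell loop.
make_loop() {
    i=1
    while [ "$i" -le "$1" ]; do
        mkdir "$i"
        i=$((i + 1))
    done
}

# make_batch N: print the names and let xargs hand many of them to each
# mkdir process, so only a handful of processes are started.
make_batch() {
    i=1
    while [ "$i" -le "$1" ]; do
        echo "$i"
        i=$((i + 1))
    done | xargs mkdir
}

# Small demo in a scratch directory, cleaned up afterwards.
scratch=$(mktemp -d) || exit 1
mkdir "$scratch/loop" "$scratch/batch"
( cd "$scratch/loop"  && make_loop 200 )
( cd "$scratch/batch" && make_batch 200 )
ls "$scratch/loop"  | wc -l
ls "$scratch/batch" | wc -l
rm -rf "$scratch"
```

Both variants end up with the same 200 directories. Timing each function with a larger count would show how much of the elapsed time is process start-up; no figures are given here because none were measured.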
OK, we now know that every run of the 'mkdir' command makes 48 system calls, so creating 32,765 directories takes 32,765 x 48 = 1,572,720 system calls. In Tcl/Perl, it takes only 6 system calls to create a single directory, which is 8 times fewer. This tallies with the speed-up we measured earlier. Here is what truss shows when "file mkdir t" is typed into tclsh:
9364/1: read(0, " f i l e m k d i r t".., 4096) = 13
9364/1: stat("t", 0xFFBFE910) Err#2 ENOENT
9364/1: umask(0) = 077
9364/1: umask(077) = 0
9364/1: mkdir("t", 0700) = 0
9364/1: write(1, " % ", 2) = 2
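For anyone who wants to check the numbers, the arithmetic can be reproduced directly in the shell. This is only a restatement of the figures already quoted (48 and 6 system calls per directory, and the two "real" timings); the variable names are mine.

```shell
#!/bin/sh
dirs=32765                      # directories actually created in each run
shell_calls=$((dirs * 48))      # 48 system calls per mkdir process
interp_calls=$((dirs * 6))      # 6 system calls per "file mkdir" in the trace
echo "shell loop:  $shell_calls system calls"
echo "interpreter: $interp_calls system calls"
echo "ratio of system calls: $((shell_calls / interp_calls))"
# Elapsed time from the "real" lines: 2m52.716s versus 0m21.098s.
awk 'BEGIN { printf "ratio of elapsed time: %.1f\n", 172.716 / 21.098 }'
```

This prints 1572720 and 196590 system calls, a ratio of 8, and an elapsed-time ratio of 8.2.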
Labels: performance, Perl, Solaris, Tcl, unix
