For the past few months, I have had the opportunity to work with a number of
SunFire X4500 (a.k.a. Thumper)
servers running the latest
Solaris 10
11/06 with
raidz2 and spares implemented in
ZFS. Since handing the implementation over to the customer, I no longer have the opportunity to 'play' with those machines. Even if I did, it would be unwise to try out all the cool Solaris 10 features on a customer's production server.
So, how can I simulate an environment with 48 disks using an old
Sun Netra T1 105? My T1 configuration is:
- Memory: 256 MB
- Disk: 2x 18GB (all partitions are mirrored)
- CPU: 1x 440MHz UltraSPARC-IIi
- Patch: Recommended and Security Patch Cluster, Mar 12 2007, especially these two patches:
- 124204 - ZFS memory leak with large files
- 120068 - vulnerability in telnetd
Make 48 disks (files) with
mkfile(1M):
# mkdir /zdisk
# cd /zdisk
# for i in c{0,1,2,3,4,5}t{0,1,2,3,4,5,6,7}d0
do
mkfile 100m $i
done
# ls /zdisk
c0t0d0 c0t5d0 c1t2d0 c1t7d0 c2t4d0 c3t1d0 c3t6d0 c4t3d0 c5t0d0 c5t5d0
c0t1d0 c0t6d0 c1t3d0 c2t0d0 c2t5d0 c3t2d0 c3t7d0 c4t4d0 c5t1d0 c5t6d0
c0t2d0 c0t7d0 c1t4d0 c2t1d0 c2t6d0 c3t3d0 c4t0d0 c4t5d0 c5t2d0 c5t7d0
c0t3d0 c1t0d0 c1t5d0 c2t2d0 c2t7d0 c3t4d0 c4t1d0 c4t6d0 c5t3d0
c0t4d0 c1t1d0 c1t6d0 c2t3d0 c3t0d0 c3t5d0 c4t2d0 c4t7d0 c5t4d0
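A quick sanity check on the loop above: the brace expansion should produce exactly 6 x 8 = 48 device names. (This sketch assumes a brace-expanding shell such as bash or ksh93; plain /bin/sh on Solaris will not expand the braces.)

```shell
# Count the names generated by the brace expansion used in the loop
echo c{0,1,2,3,4,5}t{0,1,2,3,4,5,6,7}d0 | wc -w    # expect 48
```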
Create a RAIDZ2 (double-parity) pool from 7 sets of 6 disks each, with the remaining 6 disks as hot spares. Notice that every RAIDZ2 group cuts across all the controllers, thanks to a tip from the
Joyeur blog:
# zpool create zpool \
raidz2 /zdisk/c{0,1,2,3,4,5}t0d0 \
raidz2 /zdisk/c{0,1,2,3,4,5}t1d0 \
raidz2 /zdisk/c{0,1,2,3,4,5}t2d0 \
raidz2 /zdisk/c{0,1,2,3,4,5}t3d0 \
raidz2 /zdisk/c{0,1,2,3,4,5}t4d0 \
raidz2 /zdisk/c{0,1,2,3,4,5}t5d0 \
raidz2 /zdisk/c{0,1,2,3,4,5}t6d0 \
spare /zdisk/c{0,1,2,3,4,5}t7d0
# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
zpool 3.91G 288K 3.91G 0% ONLINE -
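The 3.91G reported above is roughly what back-of-the-envelope arithmetic predicts: on this release, `zpool list` reports raw capacity for raidz vdevs, so 42 data-vdev files of 100MB each, less a few MB per device that ZFS reserves for labels and metadata. The space actually usable for data is lower, since each 6-disk raidz2 spends two disks on parity. (These are my rough estimates, not figures from the pool itself.)

```shell
# Raw space across the seven raidz2 vdevs (spares are not counted):
echo $(( 7 * 6 * 100 ))        # MB before parity -- close to the reported 3.91G
# Approximate usable data space: each 6-disk raidz2 yields 4 disks of data
echo $(( 7 * (6 - 2) * 100 ))  # MB available for data after double parity
```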
# zpool status
pool: zpool
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
zpool ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t0d0 ONLINE 0 0 0
/zdisk/c1t0d0 ONLINE 0 0 0
/zdisk/c2t0d0 ONLINE 0 0 0
/zdisk/c3t0d0 ONLINE 0 0 0
/zdisk/c4t0d0 ONLINE 0 0 0
/zdisk/c5t0d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t1d0 ONLINE 0 0 0
/zdisk/c1t1d0 ONLINE 0 0 0
/zdisk/c2t1d0 ONLINE 0 0 0
/zdisk/c3t1d0 ONLINE 0 0 0
/zdisk/c4t1d0 ONLINE 0 0 0
/zdisk/c5t1d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t2d0 ONLINE 0 0 0
/zdisk/c1t2d0 ONLINE 0 0 0
/zdisk/c2t2d0 ONLINE 0 0 0
/zdisk/c3t2d0 ONLINE 0 0 0
/zdisk/c4t2d0 ONLINE 0 0 0
/zdisk/c5t2d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t3d0 ONLINE 0 0 0
/zdisk/c1t3d0 ONLINE 0 0 0
/zdisk/c2t3d0 ONLINE 0 0 0
/zdisk/c3t3d0 ONLINE 0 0 0
/zdisk/c4t3d0 ONLINE 0 0 0
/zdisk/c5t3d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t4d0 ONLINE 0 0 0
/zdisk/c1t4d0 ONLINE 0 0 0
/zdisk/c2t4d0 ONLINE 0 0 0
/zdisk/c3t4d0 ONLINE 0 0 0
/zdisk/c4t4d0 ONLINE 0 0 0
/zdisk/c5t4d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t5d0 ONLINE 0 0 0
/zdisk/c1t5d0 ONLINE 0 0 0
/zdisk/c2t5d0 ONLINE 0 0 0
/zdisk/c3t5d0 ONLINE 0 0 0
/zdisk/c4t5d0 ONLINE 0 0 0
/zdisk/c5t5d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t6d0 ONLINE 0 0 0
/zdisk/c1t6d0 ONLINE 0 0 0
/zdisk/c2t6d0 ONLINE 0 0 0
/zdisk/c3t6d0 ONLINE 0 0 0
/zdisk/c4t6d0 ONLINE 0 0 0
/zdisk/c5t6d0 ONLINE 0 0 0
spares
/zdisk/c0t7d0 AVAIL
/zdisk/c1t7d0 AVAIL
/zdisk/c2t7d0 AVAIL
/zdisk/c3t7d0 AVAIL
/zdisk/c4t7d0 AVAIL
/zdisk/c5t7d0 AVAIL
errors: No known data errors
Let's take ZFS for a test drive. First, I will create a ZFS file system (zfs1) without compression (the default) and simulate a corrupted disk. We then 'scrub' the pool and 'replace' the corrupted disk with a new one. Note that the MD5 hash of the file created before the corruption stays the same throughout the whole process (before corruption, after corruption, and after replacing the faulty disk).
# zfs create zpool/zfs1
# dd if=/dev/urandom of=/zpool/zfs1/somefile.bin bs=1024 count=1000
1000+0 records in
1000+0 records out
# digest -a md5 /zpool/zfs1/somefile.bin
c61163bc590222cfbc0576b933b9ba53
# dd if=/dev/zero of=/zdisk/c5t6d0 bs=1024 count=10
10+0 records in
10+0 records out
# zpool scrub zpool
# zpool status
pool: zpool
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-4J
scrub: resilver stopped with 0 errors on Fri Mar 23 08:54:20 2007
config:
NAME STATE READ WRITE CKSUM
zpool DEGRADED 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t0d0 ONLINE 0 0 0
/zdisk/c1t0d0 ONLINE 0 0 0
/zdisk/c2t0d0 ONLINE 0 0 0
/zdisk/c3t0d0 ONLINE 0 0 0
/zdisk/c4t0d0 ONLINE 0 0 0
/zdisk/c5t0d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t1d0 ONLINE 0 0 0
/zdisk/c1t1d0 ONLINE 0 0 0
/zdisk/c2t1d0 ONLINE 0 0 0
/zdisk/c3t1d0 ONLINE 0 0 0
/zdisk/c4t1d0 ONLINE 0 0 0
/zdisk/c5t1d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t2d0 ONLINE 0 0 0
/zdisk/c1t2d0 ONLINE 0 0 0
/zdisk/c2t2d0 ONLINE 0 0 0
/zdisk/c3t2d0 ONLINE 0 0 0
/zdisk/c4t2d0 ONLINE 0 0 0
/zdisk/c5t2d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t3d0 ONLINE 0 0 0
/zdisk/c1t3d0 ONLINE 0 0 0
/zdisk/c2t3d0 ONLINE 0 0 0
/zdisk/c3t3d0 ONLINE 0 0 0
/zdisk/c4t3d0 ONLINE 0 0 0
/zdisk/c5t3d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t4d0 ONLINE 0 0 0
/zdisk/c1t4d0 ONLINE 0 0 0
/zdisk/c2t4d0 ONLINE 0 0 0
/zdisk/c3t4d0 ONLINE 0 0 0
/zdisk/c4t4d0 ONLINE 0 0 0
/zdisk/c5t4d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t5d0 ONLINE 0 0 0
/zdisk/c1t5d0 ONLINE 0 0 0
/zdisk/c2t5d0 ONLINE 0 0 0
/zdisk/c3t5d0 ONLINE 0 0 0
/zdisk/c4t5d0 ONLINE 0 0 0
/zdisk/c5t5d0 ONLINE 0 0 0
raidz2 DEGRADED 0 0 0
/zdisk/c0t6d0 ONLINE 0 0 0
/zdisk/c1t6d0 ONLINE 0 0 0
/zdisk/c2t6d0 ONLINE 0 0 0
/zdisk/c3t6d0 ONLINE 0 0 0
/zdisk/c4t6d0 ONLINE 0 0 0
spare DEGRADED 0 0 0
/zdisk/c5t6d0 UNAVAIL 0 0 0 corrupted data
/zdisk/c0t7d0 ONLINE 0 0 0
spares
/zdisk/c0t7d0 INUSE currently in use
/zdisk/c1t7d0 AVAIL
/zdisk/c2t7d0 AVAIL
/zdisk/c3t7d0 AVAIL
/zdisk/c4t7d0 AVAIL
/zdisk/c5t7d0 AVAIL
errors: No known data errors
# digest -a md5 /zpool/zfs1/somefile.bin
c61163bc590222cfbc0576b933b9ba53
# mkfile 100m /zdisk/newdisk
# zpool replace zpool /zdisk/c5t6d0 /zdisk/newdisk
# zpool status
pool: zpool
state: DEGRADED
scrub: resilver completed with 0 errors on Fri Mar 23 08:57:26 2007
config:
NAME STATE READ WRITE CKSUM
zpool DEGRADED 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t0d0 ONLINE 0 0 0
/zdisk/c1t0d0 ONLINE 0 0 0
/zdisk/c2t0d0 ONLINE 0 0 0
/zdisk/c3t0d0 ONLINE 0 0 0
/zdisk/c4t0d0 ONLINE 0 0 0
/zdisk/c5t0d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t1d0 ONLINE 0 0 0
/zdisk/c1t1d0 ONLINE 0 0 0
/zdisk/c2t1d0 ONLINE 0 0 0
/zdisk/c3t1d0 ONLINE 0 0 0
/zdisk/c4t1d0 ONLINE 0 0 0
/zdisk/c5t1d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t2d0 ONLINE 0 0 0
/zdisk/c1t2d0 ONLINE 0 0 0
/zdisk/c2t2d0 ONLINE 0 0 0
/zdisk/c3t2d0 ONLINE 0 0 0
/zdisk/c4t2d0 ONLINE 0 0 0
/zdisk/c5t2d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t3d0 ONLINE 0 0 0
/zdisk/c1t3d0 ONLINE 0 0 0
/zdisk/c2t3d0 ONLINE 0 0 0
/zdisk/c3t3d0 ONLINE 0 0 0
/zdisk/c4t3d0 ONLINE 0 0 0
/zdisk/c5t3d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t4d0 ONLINE 0 0 0
/zdisk/c1t4d0 ONLINE 0 0 0
/zdisk/c2t4d0 ONLINE 0 0 0
/zdisk/c3t4d0 ONLINE 0 0 0
/zdisk/c4t4d0 ONLINE 0 0 0
/zdisk/c5t4d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t5d0 ONLINE 0 0 0
/zdisk/c1t5d0 ONLINE 0 0 0
/zdisk/c2t5d0 ONLINE 0 0 0
/zdisk/c3t5d0 ONLINE 0 0 0
/zdisk/c4t5d0 ONLINE 0 0 0
/zdisk/c5t5d0 ONLINE 0 0 0
raidz2 DEGRADED 0 0 0
/zdisk/c0t6d0 ONLINE 0 0 0
/zdisk/c1t6d0 ONLINE 0 0 0
/zdisk/c2t6d0 ONLINE 0 0 0
/zdisk/c3t6d0 ONLINE 0 0 0
/zdisk/c4t6d0 ONLINE 0 0 0
spare DEGRADED 0 0 0
replacing DEGRADED 0 0 0
/zdisk/c5t6d0 UNAVAIL 0 0 0 corrupted data
/zdisk/newdisk ONLINE 0 0 0
/zdisk/c0t7d0 ONLINE 0 0 0
spares
/zdisk/c0t7d0 INUSE currently in use
/zdisk/c1t7d0 AVAIL
/zdisk/c2t7d0 AVAIL
/zdisk/c3t7d0 AVAIL
/zdisk/c4t7d0 AVAIL
/zdisk/c5t7d0 AVAIL
errors: No known data errors
Once the resilver completes and the failed device is detached, the hot spare automatically returns to the available pool:
# zpool status
pool: zpool
state: ONLINE
scrub: resilver completed with 0 errors on Fri Mar 23 08:57:26 2007
config:
NAME STATE READ WRITE CKSUM
zpool ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t0d0 ONLINE 0 0 0
/zdisk/c1t0d0 ONLINE 0 0 0
/zdisk/c2t0d0 ONLINE 0 0 0
/zdisk/c3t0d0 ONLINE 0 0 0
/zdisk/c4t0d0 ONLINE 0 0 0
/zdisk/c5t0d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t1d0 ONLINE 0 0 0
/zdisk/c1t1d0 ONLINE 0 0 0
/zdisk/c2t1d0 ONLINE 0 0 0
/zdisk/c3t1d0 ONLINE 0 0 0
/zdisk/c4t1d0 ONLINE 0 0 0
/zdisk/c5t1d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t2d0 ONLINE 0 0 0
/zdisk/c1t2d0 ONLINE 0 0 0
/zdisk/c2t2d0 ONLINE 0 0 0
/zdisk/c3t2d0 ONLINE 0 0 0
/zdisk/c4t2d0 ONLINE 0 0 0
/zdisk/c5t2d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t3d0 ONLINE 0 0 0
/zdisk/c1t3d0 ONLINE 0 0 0
/zdisk/c2t3d0 ONLINE 0 0 0
/zdisk/c3t3d0 ONLINE 0 0 0
/zdisk/c4t3d0 ONLINE 0 0 0
/zdisk/c5t3d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t4d0 ONLINE 0 0 0
/zdisk/c1t4d0 ONLINE 0 0 0
/zdisk/c2t4d0 ONLINE 0 0 0
/zdisk/c3t4d0 ONLINE 0 0 0
/zdisk/c4t4d0 ONLINE 0 0 0
/zdisk/c5t4d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t5d0 ONLINE 0 0 0
/zdisk/c1t5d0 ONLINE 0 0 0
/zdisk/c2t5d0 ONLINE 0 0 0
/zdisk/c3t5d0 ONLINE 0 0 0
/zdisk/c4t5d0 ONLINE 0 0 0
/zdisk/c5t5d0 ONLINE 0 0 0
raidz2 ONLINE 0 0 0
/zdisk/c0t6d0 ONLINE 0 0 0
/zdisk/c1t6d0 ONLINE 0 0 0
/zdisk/c2t6d0 ONLINE 0 0 0
/zdisk/c3t6d0 ONLINE 0 0 0
/zdisk/c4t6d0 ONLINE 0 0 0
/zdisk/newdisk ONLINE 0 0 0
spares
/zdisk/c0t7d0 AVAIL
/zdisk/c1t7d0 AVAIL
/zdisk/c2t7d0 AVAIL
/zdisk/c3t7d0 AVAIL
/zdisk/c4t7d0 AVAIL
/zdisk/c5t7d0 AVAIL
errors: No known data errors
# digest -a md5 /zpool/zfs1/somefile.bin
c61163bc590222cfbc0576b933b9ba53
Now I am going to create another ZFS file system with compression turned on. The time taken to create a fairly big file (about 100MB) in the compressed file system is 52.112 seconds vs 49.419 seconds without compression. Also, the MD5 hashes of the same file under the two file systems are identical.
# time dd if=/dev/urandom of=/zpool/zfs1/bifile.bin bs=1024 count=100000
100000+0 records in
100000+0 records out
real 0m49.419s
user 0m0.701s
sys 0m41.169s
# zfs get compression zpool/zfs1
NAME PROPERTY VALUE SOURCE
zpool/zfs1 compression off local
# zfs create zpool/zfs2
# zfs set compression=on zpool/zfs2
# time dd if=/dev/urandom of=/zpool/zfs2/bifile.bin bs=1024 count=100000
100000+0 records in
100000+0 records out
real 0m52.112s
user 0m0.697s
sys 0m40.897s
# cp /zpool/zfs1/bifile.bin /zpool/zfs2/bifile.bin-copy-zfs1
# digest -a md5 /zpool/zfs1/bifile.bin
b15a3f71dd6ffb937c9cbf508cb442ff
# digest -a md5 /zpool/zfs2/bifile.bin-copy-zfs1
b15a3f71dd6ffb937c9cbf508cb442ff
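The near-identical timings are expected: /dev/urandom output is essentially incompressible, so `compression=on` has almost nothing to chew on. A quick way to see this outside ZFS (a sketch using gzip as a stand-in compressor, not ZFS's own lzjb) is to compare random data against highly compressible zeros:

```shell
# 1MB of random bytes barely shrinks under gzip...
dd if=/dev/urandom of=/tmp/rand.bin bs=1024 count=1024 2>/dev/null
gzip -c /tmp/rand.bin | wc -c     # stays close to the original 1048576 bytes
# ...while 1MB of zeros collapses to almost nothing
dd if=/dev/zero of=/tmp/zero.bin bs=1024 count=1024 2>/dev/null
gzip -c /tmp/zero.bin | wc -c     # a few KB at most
```

On a real pool, `zfs get compressratio zpool/zfs2` would show how little was gained on the random file.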
Solaris 10 rocks, ZFS on Solaris 10 rocks++.
PS. I also explored
IP Filter
on the T1 so that I can implement a host-based firewall for my customer. The article,
Using Solaris IP Filters, is a very good starting point. It is pretty easy to set up, and I tried it out with Samba.
Labels: Solaris, ZFS