ZFS on 48 Disks without X4500
So, how do you simulate an environment with 48 disks using an old Sun Netra T1 105? My T1 configuration is:
- Memory: 256 MB
- Disk: 2x 18GB (all partitions are mirrored)
- CPU: 1x 440MHz UltraSPARC-IIi
- Patch: Recommended and Security Patches, Mar 12 2007, in particular:
  - 124204 - zfs memory leak for large file
  - 120068 - vulnerability in telnetd
Make 48 disks (files) with mkfile(1M):
# mkdir /zdisk
# cd /zdisk
# for i in c{0,1,2,3,4,5}t{0,1,2,3,4,5,6,7}d0
  do
    mkfile 100m $i
  done
# ls /zdisk
c0t0d0  c0t5d0  c1t2d0  c1t7d0  c2t4d0  c3t1d0  c3t6d0  c4t3d0  c5t0d0  c5t5d0
c0t1d0  c0t6d0  c1t3d0  c2t0d0  c2t5d0  c3t2d0  c3t7d0  c4t4d0  c5t1d0  c5t6d0
c0t2d0  c0t7d0  c1t4d0  c2t1d0  c2t6d0  c3t3d0  c4t0d0  c4t5d0  c5t2d0  c5t7d0
c0t3d0  c1t0d0  c1t5d0  c2t2d0  c2t7d0  c3t4d0  c4t1d0  c4t6d0  c5t3d0
c0t4d0  c1t1d0  c1t6d0  c2t3d0  c3t0d0  c3t5d0  c4t2d0  c4t7d0  c5t4d0
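The c{...}t{...}d0 pattern above relies on brace expansion, which a plain Bourne shell does not perform. As a minimal sketch for such shells, the same 48 names can be produced with nested loops; disk_names is just an illustrative helper name, not something from the original session:

```shell
# Build the 48 file names with nested loops instead of brace expansion.
disk_names() {
    for c in 0 1 2 3 4 5; do
        for t in 0 1 2 3 4 5 6 7; do
            echo "c${c}t${t}d0"
        done
    done
}

# 6 controllers x 8 targets; print the count and the first few names.
disk_names | wc -l
disk_names | head -3
```

Feeding the names into the loop then looks like `for i in $(disk_names); do mkfile 100m $i; done`.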
Create a pool of 7 RAIDZ2 (double parity) groups of 6 disks each, with the remaining 6 disks as hot spares. You can see every RAIDZ2 group cuts across all the controllers, thanks to the Joyeur blog.
# zpool create zpool \
    raidz2 /zdisk/c{0,1,2,3,4,5}t0d0 \
    raidz2 /zdisk/c{0,1,2,3,4,5}t1d0 \
    raidz2 /zdisk/c{0,1,2,3,4,5}t2d0 \
    raidz2 /zdisk/c{0,1,2,3,4,5}t3d0 \
    raidz2 /zdisk/c{0,1,2,3,4,5}t4d0 \
    raidz2 /zdisk/c{0,1,2,3,4,5}t5d0 \
    raidz2 /zdisk/c{0,1,2,3,4,5}t6d0 \
    spare /zdisk/c{0,1,2,3,4,5}t7d0
# zpool list
NAME    SIZE    USED    AVAIL   CAP   HEALTH   ALTROOT
zpool   3.91G   288K    3.91G   0%    ONLINE   -
# zpool status
  pool: zpool
 state: ONLINE
 scrub: none requested
config:
        NAME                   STATE    READ WRITE CKSUM
        zpool                  ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t0d0      ONLINE      0     0     0
            /zdisk/c1t0d0      ONLINE      0     0     0
            /zdisk/c2t0d0      ONLINE      0     0     0
            /zdisk/c3t0d0      ONLINE      0     0     0
            /zdisk/c4t0d0      ONLINE      0     0     0
            /zdisk/c5t0d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t1d0      ONLINE      0     0     0
            /zdisk/c1t1d0      ONLINE      0     0     0
            /zdisk/c2t1d0      ONLINE      0     0     0
            /zdisk/c3t1d0      ONLINE      0     0     0
            /zdisk/c4t1d0      ONLINE      0     0     0
            /zdisk/c5t1d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t2d0      ONLINE      0     0     0
            /zdisk/c1t2d0      ONLINE      0     0     0
            /zdisk/c2t2d0      ONLINE      0     0     0
            /zdisk/c3t2d0      ONLINE      0     0     0
            /zdisk/c4t2d0      ONLINE      0     0     0
            /zdisk/c5t2d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t3d0      ONLINE      0     0     0
            /zdisk/c1t3d0      ONLINE      0     0     0
            /zdisk/c2t3d0      ONLINE      0     0     0
            /zdisk/c3t3d0      ONLINE      0     0     0
            /zdisk/c4t3d0      ONLINE      0     0     0
            /zdisk/c5t3d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t4d0      ONLINE      0     0     0
            /zdisk/c1t4d0      ONLINE      0     0     0
            /zdisk/c2t4d0      ONLINE      0     0     0
            /zdisk/c3t4d0      ONLINE      0     0     0
            /zdisk/c4t4d0      ONLINE      0     0     0
            /zdisk/c5t4d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t5d0      ONLINE      0     0     0
            /zdisk/c1t5d0      ONLINE      0     0     0
            /zdisk/c2t5d0      ONLINE      0     0     0
            /zdisk/c3t5d0      ONLINE      0     0     0
            /zdisk/c4t5d0      ONLINE      0     0     0
            /zdisk/c5t5d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t6d0      ONLINE      0     0     0
            /zdisk/c1t6d0      ONLINE      0     0     0
            /zdisk/c2t6d0      ONLINE      0     0     0
            /zdisk/c3t6d0      ONLINE      0     0     0
            /zdisk/c4t6d0      ONLINE      0     0     0
            /zdisk/c5t6d0      ONLINE      0     0     0
        spares
          /zdisk/c0t7d0        AVAIL
          /zdisk/c1t7d0        AVAIL
          /zdisk/c2t7d0        AVAIL
          /zdisk/c3t7d0        AVAIL
          /zdisk/c4t7d0        AVAIL
          /zdisk/c5t7d0        AVAIL
errors: No known data errors
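To make the layout easier to see, here is a small dry-run sketch that only prints the vdev arguments (it never calls zpool) and does some rough capacity arithmetic. The helper name vdev_spec and the variables are illustrative, and the arithmetic ignores ZFS label and metadata overhead, so treat the numbers as approximate:

```shell
# Print (do not run) the vdev arguments: one raidz2 group per target 0-6,
# each built from the same target number on controllers c0-c5,
# followed by the six t7 files as spares.
vdev_spec() {
    for t in 0 1 2 3 4 5 6; do
        printf 'raidz2'
        for c in 0 1 2 3 4 5; do
            printf ' /zdisk/c%dt%dd0' "$c" "$t"
        done
        printf '\n'
    done
    printf 'spare'
    for c in 0 1 2 3 4 5; do
        printf ' /zdisk/c%dt7d0' "$c"
    done
    printf '\n'
}

vdev_spec

# Rough capacity arithmetic for 7 groups of 6 x 100 MB files with 2 parity each.
ngroups=7; disks_per_group=6; parity=2; disk_mb=100
echo "raw MB:    $(( ngroups * disks_per_group * disk_mb ))"
echo "usable MB: $(( ngroups * (disks_per_group - parity) * disk_mb ))"
```

The raw figure (4200 MB) is in the same ballpark as the 3.91G reported by zpool list, which, as far as I know, counts parity space for RAID-Z pools; the space actually usable for data is closer to the second figure.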
Let's take ZFS for a test drive. First, I create a ZFS file system (zfs1), which has compression off by default, and write a file of random data to it. I then simulate a corrupted disk by overwriting the start of one of the backing files with zeros, 'scrub' the pool, and 'replace' the corrupted disk with a new disk. Note that a hot spare (c0t7d0) takes over as soon as the corruption is detected. You can see the MD5 hash of the file is the same throughout the whole process: before the corruption, after the corruption, and after replacing the faulty disk.
# zfs create zpool/zfs1
# dd if=/dev/urandom of=/zpool/zfs1/somefile.bin bs=1024 count=1000
1000+0 records in
1000+0 records out
# digest -a md5 /zpool/zfs1/somefile.bin
c61163bc590222cfbc0576b933b9ba53
# dd if=/dev/zero of=/zdisk/c5t6d0 bs=1024 count=10
10+0 records in
10+0 records out
# zpool scrub zpool
# zpool status
  pool: zpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: resilver stopped with 0 errors on Fri Mar 23 08:54:20 2007
config:
        NAME                   STATE    READ WRITE CKSUM
        zpool                  DEGRADED    0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t0d0      ONLINE      0     0     0
            /zdisk/c1t0d0      ONLINE      0     0     0
            /zdisk/c2t0d0      ONLINE      0     0     0
            /zdisk/c3t0d0      ONLINE      0     0     0
            /zdisk/c4t0d0      ONLINE      0     0     0
            /zdisk/c5t0d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t1d0      ONLINE      0     0     0
            /zdisk/c1t1d0      ONLINE      0     0     0
            /zdisk/c2t1d0      ONLINE      0     0     0
            /zdisk/c3t1d0      ONLINE      0     0     0
            /zdisk/c4t1d0      ONLINE      0     0     0
            /zdisk/c5t1d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t2d0      ONLINE      0     0     0
            /zdisk/c1t2d0      ONLINE      0     0     0
            /zdisk/c2t2d0      ONLINE      0     0     0
            /zdisk/c3t2d0      ONLINE      0     0     0
            /zdisk/c4t2d0      ONLINE      0     0     0
            /zdisk/c5t2d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t3d0      ONLINE      0     0     0
            /zdisk/c1t3d0      ONLINE      0     0     0
            /zdisk/c2t3d0      ONLINE      0     0     0
            /zdisk/c3t3d0      ONLINE      0     0     0
            /zdisk/c4t3d0      ONLINE      0     0     0
            /zdisk/c5t3d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t4d0      ONLINE      0     0     0
            /zdisk/c1t4d0      ONLINE      0     0     0
            /zdisk/c2t4d0      ONLINE      0     0     0
            /zdisk/c3t4d0      ONLINE      0     0     0
            /zdisk/c4t4d0      ONLINE      0     0     0
            /zdisk/c5t4d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t5d0      ONLINE      0     0     0
            /zdisk/c1t5d0      ONLINE      0     0     0
            /zdisk/c2t5d0      ONLINE      0     0     0
            /zdisk/c3t5d0      ONLINE      0     0     0
            /zdisk/c4t5d0      ONLINE      0     0     0
            /zdisk/c5t5d0      ONLINE      0     0     0
          raidz2               DEGRADED    0     0     0
            /zdisk/c0t6d0      ONLINE      0     0     0
            /zdisk/c1t6d0      ONLINE      0     0     0
            /zdisk/c2t6d0      ONLINE      0     0     0
            /zdisk/c3t6d0      ONLINE      0     0     0
            /zdisk/c4t6d0      ONLINE      0     0     0
            spare              DEGRADED    0     0     0
              /zdisk/c5t6d0    UNAVAIL     0     0     0  corrupted data
              /zdisk/c0t7d0    ONLINE      0     0     0
        spares
          /zdisk/c0t7d0        INUSE    currently in use
          /zdisk/c1t7d0        AVAIL
          /zdisk/c2t7d0        AVAIL
          /zdisk/c3t7d0        AVAIL
          /zdisk/c4t7d0        AVAIL
          /zdisk/c5t7d0        AVAIL
errors: No known data errors
# digest -a md5 /zpool/zfs1/somefile.bin
c61163bc590222cfbc0576b933b9ba53
# mkfile 100m /zdisk/newdisk
# zpool replace zpool /zdisk/c5t6d0 /zdisk/newdisk
# zpool status
  pool: zpool
 state: DEGRADED
 scrub: resilver completed with 0 errors on Fri Mar 23 08:57:26 2007
config:
        NAME                   STATE    READ WRITE CKSUM
        zpool                  DEGRADED    0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t0d0      ONLINE      0     0     0
            /zdisk/c1t0d0      ONLINE      0     0     0
            /zdisk/c2t0d0      ONLINE      0     0     0
            /zdisk/c3t0d0      ONLINE      0     0     0
            /zdisk/c4t0d0      ONLINE      0     0     0
            /zdisk/c5t0d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t1d0      ONLINE      0     0     0
            /zdisk/c1t1d0      ONLINE      0     0     0
            /zdisk/c2t1d0      ONLINE      0     0     0
            /zdisk/c3t1d0      ONLINE      0     0     0
            /zdisk/c4t1d0      ONLINE      0     0     0
            /zdisk/c5t1d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t2d0      ONLINE      0     0     0
            /zdisk/c1t2d0      ONLINE      0     0     0
            /zdisk/c2t2d0      ONLINE      0     0     0
            /zdisk/c3t2d0      ONLINE      0     0     0
            /zdisk/c4t2d0      ONLINE      0     0     0
            /zdisk/c5t2d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t3d0      ONLINE      0     0     0
            /zdisk/c1t3d0      ONLINE      0     0     0
            /zdisk/c2t3d0      ONLINE      0     0     0
            /zdisk/c3t3d0      ONLINE      0     0     0
            /zdisk/c4t3d0      ONLINE      0     0     0
            /zdisk/c5t3d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t4d0      ONLINE      0     0     0
            /zdisk/c1t4d0      ONLINE      0     0     0
            /zdisk/c2t4d0      ONLINE      0     0     0
            /zdisk/c3t4d0      ONLINE      0     0     0
            /zdisk/c4t4d0      ONLINE      0     0     0
            /zdisk/c5t4d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t5d0      ONLINE      0     0     0
            /zdisk/c1t5d0      ONLINE      0     0     0
            /zdisk/c2t5d0      ONLINE      0     0     0
            /zdisk/c3t5d0      ONLINE      0     0     0
            /zdisk/c4t5d0      ONLINE      0     0     0
            /zdisk/c5t5d0      ONLINE      0     0     0
          raidz2               DEGRADED    0     0     0
            /zdisk/c0t6d0      ONLINE      0     0     0
            /zdisk/c1t6d0      ONLINE      0     0     0
            /zdisk/c2t6d0      ONLINE      0     0     0
            /zdisk/c3t6d0      ONLINE      0     0     0
            /zdisk/c4t6d0      ONLINE      0     0     0
            spare              DEGRADED    0     0     0
              replacing        DEGRADED    0     0     0
                /zdisk/c5t6d0  UNAVAIL     0     0     0  corrupted data
                /zdisk/newdisk ONLINE      0     0     0
              /zdisk/c0t7d0    ONLINE      0     0     0
        spares
          /zdisk/c0t7d0        INUSE    currently in use
          /zdisk/c1t7d0        AVAIL
          /zdisk/c2t7d0        AVAIL
          /zdisk/c3t7d0        AVAIL
          /zdisk/c4t7d0        AVAIL
          /zdisk/c5t7d0        AVAIL
errors: No known data errors
# zpool status
  pool: zpool
 state: ONLINE
 scrub: resilver completed with 0 errors on Fri Mar 23 08:57:26 2007
config:
        NAME                   STATE    READ WRITE CKSUM
        zpool                  ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t0d0      ONLINE      0     0     0
            /zdisk/c1t0d0      ONLINE      0     0     0
            /zdisk/c2t0d0      ONLINE      0     0     0
            /zdisk/c3t0d0      ONLINE      0     0     0
            /zdisk/c4t0d0      ONLINE      0     0     0
            /zdisk/c5t0d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t1d0      ONLINE      0     0     0
            /zdisk/c1t1d0      ONLINE      0     0     0
            /zdisk/c2t1d0      ONLINE      0     0     0
            /zdisk/c3t1d0      ONLINE      0     0     0
            /zdisk/c4t1d0      ONLINE      0     0     0
            /zdisk/c5t1d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t2d0      ONLINE      0     0     0
            /zdisk/c1t2d0      ONLINE      0     0     0
            /zdisk/c2t2d0      ONLINE      0     0     0
            /zdisk/c3t2d0      ONLINE      0     0     0
            /zdisk/c4t2d0      ONLINE      0     0     0
            /zdisk/c5t2d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t3d0      ONLINE      0     0     0
            /zdisk/c1t3d0      ONLINE      0     0     0
            /zdisk/c2t3d0      ONLINE      0     0     0
            /zdisk/c3t3d0      ONLINE      0     0     0
            /zdisk/c4t3d0      ONLINE      0     0     0
            /zdisk/c5t3d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t4d0      ONLINE      0     0     0
            /zdisk/c1t4d0      ONLINE      0     0     0
            /zdisk/c2t4d0      ONLINE      0     0     0
            /zdisk/c3t4d0      ONLINE      0     0     0
            /zdisk/c4t4d0      ONLINE      0     0     0
            /zdisk/c5t4d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t5d0      ONLINE      0     0     0
            /zdisk/c1t5d0      ONLINE      0     0     0
            /zdisk/c2t5d0      ONLINE      0     0     0
            /zdisk/c3t5d0      ONLINE      0     0     0
            /zdisk/c4t5d0      ONLINE      0     0     0
            /zdisk/c5t5d0      ONLINE      0     0     0
          raidz2               ONLINE      0     0     0
            /zdisk/c0t6d0      ONLINE      0     0     0
            /zdisk/c1t6d0      ONLINE      0     0     0
            /zdisk/c2t6d0      ONLINE      0     0     0
            /zdisk/c3t6d0      ONLINE      0     0     0
            /zdisk/c4t6d0      ONLINE      0     0     0
            /zdisk/newdisk     ONLINE      0     0     0
        spares
          /zdisk/c0t7d0        AVAIL
          /zdisk/c1t7d0        AVAIL
          /zdisk/c2t7d0        AVAIL
          /zdisk/c3t7d0        AVAIL
          /zdisk/c4t7d0        AVAIL
          /zdisk/c5t7d0        AVAIL
errors: No known data errors
# digest -a md5 /zpool/zfs1/somefile.bin
c61163bc590222cfbc0576b933b9ba53
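The before/after digest comparison in the transcript can be scripted so that you do not have to eyeball the hashes. This is only a sketch: checksum_of and verify_unchanged are made-up helper names, and cksum is used simply because it exists on any POSIX system (digest -a md5, as used above, would do the same job on Solaris). The demo runs on a throwaway temp file, not on a real pool:

```shell
# Print a checksum for a file (first field of cksum output).
checksum_of() {
    cksum < "$1" | awk '{ print $1 }'
}

# usage: verify_unchanged FILE EXPECTED_CHECKSUM
# Returns 0 if the file still has the expected checksum, 1 otherwise.
verify_unchanged() {
    actual=$(checksum_of "$1")
    if [ "$actual" = "$2" ]; then
        echo "OK: $1 unchanged"
    else
        echo "MISMATCH: $1 expected $2, got $actual"
        return 1
    fi
}

# Demo on a scratch file: record the checksum, then check it again.
demo=$(mktemp)
printf 'some test data\n' > "$demo"
before=$(checksum_of "$demo")
verify_unchanged "$demo" "$before"
rm -f "$demo"
```

In the test above you would record the checksum of /zpool/zfs1/somefile.bin once, then call verify_unchanged after the dd, after the scrub, and after the replace.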
Now I am going to create another ZFS file system with compression on. You can see the time taken to create a big file (about 100 MB) in the compressed file system is only 52.112 seconds vs 49.419 seconds without compression. Also, the MD5 hash of the file is the same after it is copied from zfs1 to the compressed zfs2.
# time dd if=/dev/urandom of=/zpool/zfs1/bifile.bin bs=1024 count=100000
100000+0 records in
100000+0 records out
real    0m49.419s
user    0m0.701s
sys     0m41.169s
# zfs get compression zpool/zfs1
NAME        PROPERTY     VALUE  SOURCE
zpool/zfs1  compression  off    local
# zfs create zpool/zfs2
# zfs set compression=on zpool/zfs2
# time dd if=/dev/urandom of=/zpool/zfs2/bifile.bin bs=1024 count=100000
100000+0 records in
100000+0 records out
real    0m52.112s
user    0m0.697s
sys     0m40.897s
# cp /zpool/zfs1/bifile.bin /zpool/zfs2/bifile.bin-copy-zfs1
# digest -a md5 /zpool/zfs1/bifile.bin
b15a3f71dd6ffb937c9cbf508cb442ff
# digest -a md5 /zpool/zfs2/bifile.bin-copy-zfs1
b15a3f71dd6ffb937c9cbf508cb442ff
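One caveat on the timing comparison: data from /dev/urandom is essentially incompressible, so the test mostly measures the overhead of attempting compression rather than any benefit from it. The sketch below illustrates the point with gzip, which is only a stand-in here (ZFS uses its own compression algorithm); compressed_size is an illustrative helper name:

```shell
# usage: compressed_size DEVICE KB
# Read KB kilobytes from DEVICE and print the size in bytes after gzip.
compressed_size() {
    dd if="$1" bs=1024 count="$2" 2>/dev/null | gzip -c | wc -c | tr -d ' '
}

# Size of the test file written above: bs x count.
echo "dd wrote: $(( 1024 * 100000 )) bytes"

# Compare 100 KB (102400 bytes) of random data with 100 KB of zeros.
echo "random:   $(compressed_size /dev/urandom 100) bytes after gzip"
echo "zeros:    $(compressed_size /dev/zero 100) bytes after gzip"
```

The random sample should come out at roughly its original size, while the zeros shrink to a tiny fraction. On the pool itself, `zfs get compressratio zpool/zfs2` reports the ratio ZFS actually achieved.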
Solaris 10 rocks, ZFS on Solaris 10 rocks++.
PS. I also explored IP Filter on the T1 so that I can implement a host-based firewall for my customer. The article, Using Solaris IP Filters, is a very good starting point. It is pretty easy to implement, and I tried it out with Samba.
2 Comments:
You mentioned in the post that the file had the same digest value before, after, and during pool corruption. But doesn't this just mean that the file is probably not contained on a disk that got corrupted? If the file is on the corrupted disk, you would see a different digest value, right?
bstone@aspirinsoftware.com
I tried to corrupt just 1 disk, and because the zpool is based on raidz2 (RAID 6), ZFS is able to figure out which stripe is good and which one is bad. That's why the MD5 digest remains the same before, during and after the corruption.
Also, replacing the 'disk' is just a breeze.