As a previous post stated, I had run out of space on my 5.4TB NAS. New parts were ordered through Newegg.com and were in transit as this post was first being drafted. A rented 6TB external hard drive was used to back up all the data from the current NAS so that its parts could be cannibalized for the revised build. The new NAS has approximately 12.4TB of usable space after the RAIDZ3 parity overhead, and thanks to the hot-swappable bays on the new case it will be easily expandable to 24.8TB (by adding another set of 10x 2TB drives in a second RAIDZ3 configuration) without temporary storage or even a system shutdown. This setup can withstand three of the ten drives failing without any data loss. All of this is thanks to the ZFS file system, so let's take a closer look at this versatile NAS setup.
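To make the RAIDZ3 layout concrete, here is a rough sketch of how such a pool would be created on Solaris. The pool name "tank" and the `cXtYd0` device names are placeholders, not my actual devices; check `format` on your own system for the real ones.

```shell
# Hypothetical: build a triple-parity pool from ten 2TB drives.
# Three drives' worth of capacity goes to parity, leaving roughly
# seven drives (~12.4TB) usable -- the figure quoted above.
zpool create tank raidz3 c8t0d0 c8t1d0 c8t2d0 c8t3d0 c8t4d0 \
                         c8t5d0 c8t6d0 c8t7d0 c8t8d0 c8t9d0

# Verify the layout and capacity.
zpool status tank
zpool list tank
```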
This go-around I decided I needed to be able to expand the system in the future by having spare bays and ports. Drive throughput doesn't double every 18 months the way other sectors of technology do, so buying SATA III hardware is likely a safe bet; it won't be phased out too quickly. Also, as prices for 2TB drives continually fall, it makes sense to build an array that meets current demands without overbuilding too much, since the same drives or their equivalents will likely be significantly cheaper in just a few months' time.
I decided early on to reuse my motherboard, processor, power supply, and current 2TB drives for cost reasons.
As for the new hardware, I started off by evaluating computer chassis. First I looked into consumer towers; the most 5¼” bays I could find in a single tower was 9, and that one was fairly cheap. Then I started looking at hard drive cages and soon realized, given the prices, that buying a purpose-built chassis would be more cost effective than piecing together something that could hold quite a few drives. Most of the quality drive cages ran in excess of $90 for a unit that converts 3x 5¼” bays into 5x 3½” drive slots; if you do decide to go that route I suggest Icy Dock, as their products are well built and pretty. The chassis I ended up purchasing has 20 hot-swappable bays on 5 backplanes using individual SATA ports; in hindsight I should have gone with backplanes using SAS SFF-8087 connectors, as that would have been better for cooling and cable routing.
The next step was getting 20 SATA drives onto a motherboard with only 6 SATA ports. My first plan of attack was to use simple SATA port multipliers. After having many issues with them, including I/O errors while running benchmark tests on the preliminary setup, I decided to RMA the cards. The next idea was a RocketRAID 2760 card: 24 ports on a PCI Express 2.0 x16 interface. It would have been wicked fast had it passed the raw drives through to the OS, but it wouldn't. What I ended up settling on was a 16-port internal HBA card from LSI; using this card means I also need the 4 remaining motherboard SATA ports to fully populate all the bays in my case. Not a problem: ZFS can handle that, as you can even mix PATA/IDE and SATA drives in the same pool without issue.
For the additional hard drives I simply went to NewEgg.com and did a filtered search for 2TB SATA hard drives. Currently, 2TB drives have the lowest cost-to-capacity ratio: 3TB drives seem to go for double the price, and 1TB drives are only about $20 cheaper than 2TB drives. The drive I settled on was $59.99 after a $10 rebate, and yes, they are Deathstars; two died within a few days, first throwing I/O errors and eventually not even showing up in the BIOS. NewEgg's RMA policy sucks: not only do you have to pay return shipping on the dead devices, NewEgg also doesn't ship the replacement product quickly (2+ days after receiving the broken drives). And forget about cross-ship RMAs.
As for the RAM I went all out, using 16GB to max out the motherboard's capacity. ZFS is a RAM-intensive system and performance suffers below the recommended minimum of 2GB. With large writes, ZFS buffers the data in RAM and then flushes it to the disks in bulk. ZFS can also use an attached SSD as a dedicated log device for the ZFS Intent Log (ZIL), which speeds up synchronous writes; an SSD can likewise serve as a second-level read cache (L2ARC).
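Adding those SSD roles to an existing pool is a one-liner each. A hedged sketch, assuming a pool named "tank" and placeholder SSD device names:

```shell
# Hypothetical: dedicate one SSD as a ZIL log device (SLOG) for
# synchronous writes, and another as an L2ARC read cache.
zpool add tank log c9t0d0
zpool add tank cache c9t1d0

# The "logs" and "cache" sections should now appear in the status output.
zpool status tank
```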
For the OS drives I went with a pair of 32GB SSDs in a mirrored configuration. Originally I was going to use an old CF-to-IDE card reader, but for $10 more I got the two 32GB SSDs, which have double the capacity of the 16GB CF cards I was looking at, with far faster read and write speeds.
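If you install Solaris onto a single SSD first, the second one can be attached to the root pool afterward to form the mirror. A sketch under assumed device/slice names (the `cXtYd0s0` slices are placeholders for your actual root-pool devices):

```shell
# Hypothetical: mirror the root pool onto the second SSD.
# "rpool" is Solaris's default root pool name.
zpool attach rpool c3t0d0s0 c3t1d0s0

# Watch the resilver; don't count on the mirror until it finishes.
# (You'll also want the boot loader installed on the second drive.)
zpool status rpool
```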
The operating system – Solaris 11 Express:
Before building my first NAS I had never heard of ZFS. I deduced from the name that it was a filesystem, but it was foreign to me; UFS, btrfs, and XFS were all new in my book as well, and I hadn't really read praise for any particular one. As I was set on using FreeNAS at the time, I went with UFS; it was the recommended filesystem, since the others carried a risk of data loss due partly to incompatibility and partly to a lack of development to patch bugs. FreeNAS 0.7 makes a point of ensuring the user knows that ZFS is experimental and should only be used in a testing environment; there was even a warning in dmesg on boot about ZFS's experimental state in FreeNAS 0.7.2. I looked a little into ZFS and saw it had a few nice features, but heeding the warnings about FreeNAS's lack of support, I dropped the idea.
This time around I decided a WebGUI was not needed. The only times I had used it were during initial setup and for occasional status checks, so in moving on from FreeNAS that feature was easy to let go. I'm convinced ZFS is the way to go, and the best-supported OS for ZFS is Solaris 11 from Sun/Oracle. I started running a copy of Solaris in VirtualBox; the virtual machine was set up with several virtual hard drives so I could really test ZFS and try to find fault. Aside from being very slow, since I had only allocated 2GB of RAM to the VM, everything worked, and I decided I could live with Solaris despite a slight learning curve.
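You don't even need virtual disks to kick the tires: ZFS will happily build a throwaway pool out of plain files. A low-risk sketch (paths and pool name are arbitrary):

```shell
# Create five 256MB backing files and a scratch RAIDZ3 pool from them.
mkfile 256m /tmp/d0 /tmp/d1 /tmp/d2 /tmp/d3 /tmp/d4
zpool create playpool raidz3 /tmp/d0 /tmp/d1 /tmp/d2 /tmp/d3 /tmp/d4

# Simulate a drive failure and watch the pool stay online.
zpool offline playpool /tmp/d2
zpool status playpool

# Tear it all down when done experimenting.
zpool destroy playpool
rm /tmp/d0 /tmp/d1 /tmp/d2 /tmp/d3 /tmp/d4
```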
ZFS – a closer look:
As I mentioned, I had never heard of ZFS before my first NAS build, and I would have used it had FreeNAS 0.7.2 supported it stably. The number of useful features ZFS has is remarkable. The key ones are as follows:
- Pooled storage. Adding a drive to a pool is like sticking more RAM in a PC: the capacity is simply available, with none of the partition-and-volume song and dance other filesystems require.
- Built-in data checksumming, which most filesystems don't have. This detects silent data corruption(1)(2)(3) and repairs it in mirrored or RAIDZ configurations.
- Built-in, completely transparent encryption, which from what I can tell doesn't affect performance too much when using a processor with AES hardware instructions (AES-NI). Encryption keys can be passphrases, key files, or raw hex keys.
- Improved software RAID within the filesystem itself: striping (RAID 0), mirroring (RAID 1), RAIDZ (single parity, like RAID 5), RAIDZ2 (double parity, like RAID 6), RAIDZ3 (triple parity), and nested combinations of any of these. RAID protects against data loss when a hard drive inevitably dies.
- Limitless snapshots, comparable in concept to what Time Machine provides on OS X.
- Transparent compression
- Data deduplication. ZFS looks for blocks with identical contents, stores a single copy, and references that block for all other occurrences.
- Background online data scrubbing and resilvering, which are the repair and rebuild processes.
- Pool importing. ZFS recognizes drives that were part of a previous pool even on a completely different motherboard, controller, port, order on said controller, and/or whatever else you can throw at it. ZFS just imports the pool and all is well; it will even warn you if you are about to overwrite a drive that holds part of a pool. This was jaw-dropping the first time I saw it.
- Copy-on-write. Modified blocks are written to free space and the metadata is then updated atomically, so the on-disk state is always consistent.
- Quick RAID rebuilds, since resilvering only touches blocks that actually hold data rather than recreating the entire disk, empty space included. This can be critical if the lower RAIDZ or RAIDZ2 options are used and other drives are starting to fail(1).
- Scalability. ZFS is a 128-bit filesystem, which means it can have pools as large as 256 quadrillion zettabytes (1 zettabyte = 1 billion TB).
- Pipelined I/O. This manages IO for maximum performance and data safety.
- Variable block size. Great for optimizing the block size for the size of files.
- Open source. If the licensing were less strict you'd see it on far more OSes, including Linux.
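Several of the features above are single commands in practice. A hedged sketch, assuming a pool named "tank" and a hypothetical dataset "tank/docs":

```shell
# Transparent compression and deduplication are per-dataset properties.
zfs create -o compression=on tank/docs
zfs set dedup=on tank/docs

# Snapshots are instant and cheap; rollback is just as quick.
zfs snapshot tank/docs@before-cleanup
zfs rollback tank/docs@before-cleanup

# Kick off an online scrub to verify every checksum in the pool.
zpool scrub tank

# List any pools found on attached disks, ready to be imported.
zpool import
```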
As stated before, one of the nicest things about ZFS is its pooled storage, which means when the need arises to expand to NAS v3, all that will be needed is additional hard drives; the rest of the current hardware already has spare bays and ports. I'll be able to add another RAIDZ3 array of any size of drives in a concatenated setup (still one massive pool of data) without erasing any of the data, which will be necessary, because I don't think 10TB drives are in the near future (4+ years for consumer level).
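That future expansion would again be one command: a second RAIDZ3 vdev appended to the existing pool. Sketch with placeholder device names:

```shell
# Hypothetical NAS v3 expansion: append a second 10-drive RAIDZ3 vdev.
# Existing data stays in place; the new capacity is available immediately,
# with no downtime thanks to the hot-swap bays.
zpool add tank raidz3 c10t0d0 c10t1d0 c10t2d0 c10t3d0 c10t4d0 \
                      c10t5d0 c10t6d0 c10t7d0 c10t8d0 c10t9d0
```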
Below is a screenshot of a NewEgg.com cart with all my hardware plus 10 more drives. If you price hunt or use NewEgg's price alerts you can lower the cost to about $2500 as of this writing. The LSI HBA card has a huge markup on NewEgg; I ordered from WiredZone.com, which was $100 cheaper:
It's hard to dispute that ZFS is anything short of awesome. Given all of its advantages and features, you might wonder why it isn't used on far more systems and OSes, including whatever you are currently running. It does have one large disadvantage: ZFS's open-source CDDL license makes it difficult to bring to the masses on Linux, or any other OS for that matter. If you are willing to run Solaris on your NAS box, you'll have one wicked filesystem. There aren't major differences between Solaris and Linux for everyday use; the backend on Solaris is slightly different from what you might be used to, but it still uses GNOME for the frontend and has a package manager, so for semi-power users converting over won't be much of a learning curve.
Another possible solution, if you are running some high-end hardware or are still at the research stage and willing to drop the cash, is a virtualized setup running VMware's ESXi hypervisor. Basically, you could run a virtualized Solaris OS with the storage controller passed straight through to it, let ZFS handle all the storage aspects, and run a daemon (iSCSI, SFTP, AFP, or CIFS via Samba, among others) to serve that data to a more familiar OS for more comfortable file serving and handling.