Thoughts and feelings on data storage implementations after four years of immersion.

Background:

Four years ago I got serious about the way I stored my data. At that time I had four hard drives, each with a different filesystem, each holding a different type of data, and the drives were scattered between different PCs. I transferred files between the PCs with thumb drives, DVDs, and occasionally SFTP. Overall the system was inefficient. So I set out to build something that ensured data integrity, allowed me to access all of the data from multiple devices over the network simultaneously, and was fast and streamlined. ZFS was the obvious and indisputable choice. The choice I struggled with for a while was which OS to run ZFS on; during that search my system was optimized, time passed, and different configurations were tested, eventually leading to data storage nirvana.

After reading extensively, I dove into this journey. It started with Solaris running as a virtual machine on top of ESXi. An ESXi whitebox was set up as an All-In-One [AIO] server. The ESXi hypervisor ran off a USB boot device. In addition, there was a very small VMFS datastore on a couple of junk hard drives in a hardware RAID1 configuration. This small VMFS datastore held a Solaris virtual machine [VM] and a pfSense VM. pfSense would bring up my network services (WAN and LAN DHCP), and then the Solaris VM would bring up the primary data storage, including the iSCSI-backed VMFS. Both of these VMs used PCIe passthrough: Solaris received an HBA and pfSense a dual-port NIC. When Solaris finally booted I would have to reconfigure the iSCSI LUNs because of the encryption. After the LUNs/zvols were reconfigured they were presented to the ESXi machine via iSCSI as a datastore which contained the non-critical VMs.
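
For context, that post-boot reconfiguration amounted to reloading the ZFS encryption key and re-presenting the zvol-backed LUNs. Below is a minimal sketch of roughly that sequence in Python, assuming Solaris 11's zfs key command and the COMSTAR tools (stmfadm, itadm); the dataset name and LU GUID are hypothetical placeholders, not values from my system.

    import subprocess

    def run(cmd):
        print("+ " + " ".join(cmd))
        subprocess.check_call(cmd)

    # 1. Load the wrapping key for the encrypted dataset holding the zvols.
    #    This prompts for a passphrase, which is part of why the sequence
    #    resisted automation at the time.
    run(["zfs", "key", "-l", "tank/vmstore"])

    # 2. Bring the zvol-backed logical unit back online and confirm its view.
    LU_GUID = "600144F0AAAAAAAAAAAAAAAAAAAAAAAA"  # hypothetical GUID
    run(["stmfadm", "online-lu", LU_GUID])
    run(["stmfadm", "list-view", "-l", LU_GUID])

    # 3. Confirm the iSCSI target is up; ESXi can then rescan and mount the VMFS.
    run(["itadm", "list-target", "-v"])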

Even at the time I found it kind of funny that the system relied on a nested VM to give itself more storage. However, this setup was limited; it did not lend itself to being pushed. The nested storage was fairly easy to bring to a crawl, or down entirely, by running too many VMs or too much I/O. Adding to my caution, the thought of restarting, a 30-minute process, hindered my desire to be creative. To me, 30 minutes from pressing the power button to having the nested datastore back up and running was too long. There were a number of commands, non-scriptable at the time with the talent I had then, that were needed to get everything to start up in the correct order on the separate OSes. Another contributing factor was that once the AIO went down, so did the WAN connection.
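
In hindsight, at least the ESXi side of that startup order could probably have been scripted once the Solaris VM was reachable. A rough sketch in Python, assuming the standard esxcli and vim-cmd tools on the ESXi host; the datastore name is a hypothetical placeholder:

    import subprocess, time

    def out(cmd):
        return subprocess.check_output(cmd).decode()

    DATASTORE = "nested-vmfs"  # hypothetical name of the iSCSI-backed datastore

    # Rescan the storage adapters until the nested VMFS datastore appears.
    while DATASTORE not in out(["esxcli", "storage", "filesystem", "list"]):
        subprocess.call(["esxcli", "storage", "core", "adapter", "rescan", "--all"])
        time.sleep(30)

    # Power on every registered VM whose files live on that datastore.
    for line in out(["vim-cmd", "vmsvc/getallvms"]).splitlines()[1:]:
        if "[" + DATASTORE + "]" in line:
            vmid = line.split()[0]
            subprocess.call(["vim-cmd", "vmsvc/power.on", vmid])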

During one crash the entire system collapsed. The smaller datastore became corrupted, and I couldn't access the larger datastore, which held the disk images needed to reinstall pfSense and Solaris 11. This equated to about a week of downtime while I downloaded the install ISOs on a slow connection.

I recently switched to a multiple-independent-server setup (4x Dell R610s, 1x Dell R510, 1x Norco DAS). One of the R610s is used as a 12-port router, the R510 is used for data storage (both SAN and NAS), and the remaining R610s are used as a lab for testing configurations and breaking them. This latest setup is the best yet, as the ability to be reckless in the lab without consequence has allowed me to learn things I would have been too cautious to try before.

 

So here are some quick tips from my experience:

  • Unless you are building a strictly play-only lab, don't use Intel Engineering Sample CPUs. They are cheaper than the retail releases and typically work without issue, but for me they seemed to trigger Purple Screens of Death on VMware at the most inopportune times.
  • Whitebox servers are definitely cheaper than brand-name servers of the same component generation, but sometimes at the cost of stability. For example, I had a SuperMicro motherboard in a whitebox system with an unsolvable IRQ issue. No matter what configuration I tried, a hard-to-reproduce PSoD would occasionally occur due to an IRQ conflict. I ended up going with one-generation-old servers from Dell (12G is current as of this writing and at the time of my purchases) and haven't had the same issue in a similar configuration. The Dell servers are also much quieter at idle than my whitebox setup and have more granular fan control under load.
  • With Dell servers, you should assume the RAM configuration is the one you are going to use from now until the end of time. In my experience, Dells are finicky about the RAM modules being used and which slots are populated. Differences in size, speed, rank, and sometimes even model number of the RAM modules will create less-than-optimal running configurations. My suggestion is to either buy a used system with no RAM and populate it yourself (more expensive in the end, but you get exactly what you want in terms of speed and modules) OR buy a Dell system with enough RAM to begin with (the cheaper route). This exacting nature may be the case with all [server] motherboards, but my experience has been limited to Dells so far. I have had Dells disable RAM modules because the physical slot configuration was less than optimal in the BIOS's opinion; it is also worth mentioning that these Dells are the first systems I have dealt with that hold 128GB of RAM.
  • Another thing to really consider is where and how you store your data. Think about how you are locking your data and drives to a particular OS, RAID card, software RAID implementation, or ZFS pool version. For example, most hardware RAID is locked to that particular model of card and sometimes even to that particular firmware version on the card. As for ZFS, there are four common pool versions in the wild: 15, 28, 5000, and 34; the latter two are separate forks with no interoperability between them (see the sketch just after this list).
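
A quick way to see which lineage a pool belongs to before moving drives around is to ask the ZFS tools directly. A small sketch in Python, assuming the standard zpool commands are on the PATH; the pool name "tank" is a hypothetical placeholder:

    import subprocess

    # A numeric value (e.g. 28 or 34) identifies an Oracle-style pool version;
    # OpenZFS-lineage pools report "-" here and use feature flags (nominal
    # version 5000). Upgrades are one-way and stay within the same lineage.
    print(subprocess.check_output(["zpool", "get", "version", "tank"]).decode())

    # List the versions/features the local tools understand, i.e. how far an
    # upgrade on this particular OS would take the pool.
    print(subprocess.check_output(["zpool", "upgrade", "-v"]).decode())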

 

My own brief experiences:

I started on a FreeNAS 0.7 / NAS4Free install, which was particularly buggy at the time, especially with software RAID5. I used this configuration for about a year.

  • Pros: It was the only thing available that met my needs at the time: NAS functionality and plug-ins. My data was a little safer than just having a single copy on a single drive.
  • Cons: Crashed a lot. Had many data-loss scares. My drives were locked to the hardware configuration and the software RAID5.

I used FreeNAS 8 for a brief time, only to switch back to Solaris soon after. This was due to personal preference, but I hold FreeNAS in the highest regard, along with the company that backs it, iXsystems.

I originally started with Solaris, a noble cause no doubt. I almost immediately switched to the FreeNAS 0.7 / NAS4Free setup because I was intimidated by its nearly exclusive command-line configuration. I then switched back to Solaris 11 a year later after briefly testing other platforms, only to realize what I had been missing on Solaris. Solaris 11 is not for everyone, but it now fits my needs wholly.

  • Pros: Very stable. Native data encryption. Built-in CIFS support. A pool can be transferred to a fresh install without issue (a sketch follows this list). SAN technologies built into the OS. Enterprise-grade OS.
  • Cons: Solaris is a beast to wrestle, depending on what you want to do with it. Natively encrypted data is locked to Solaris 11+. Largely managed through the command-line interface.
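
The pool transfer mentioned above is essentially a clean export on the old system followed by an import on the new one. A minimal sketch in Python, assuming the usual zpool commands and Solaris 11's zfs key; "tank" is a hypothetical pool name:

    import subprocess

    # On the old install: export the pool cleanly. (If the old system already
    # died, "zpool import -f" on the new one can force-import it instead.)
    subprocess.check_call(["zpool", "export", "tank"])

    # On the fresh install: scan attached devices for importable pools, then import.
    subprocess.call(["zpool", "import"])                # lists pools found on disk
    subprocess.check_call(["zpool", "import", "tank"])  # add "-f" if never exported

    # Encrypted Solaris 11 datasets still need their keys loaded after the import.
    subprocess.check_call(["zfs", "key", "-l", "-r", "tank"])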

 

More considerations:

It's also worth noting that most HBAs and RAID cards are temperamental about hard drives that take a long time to recover from errors, i.e. drives without TLER/ERC-style timeouts, which is typical of desktop hard drives. Sometimes my aging drives would drop from the array, which would cause "ZFS UNAVAIL There are insufficient replicas for the pool to continue functioning." and always stopped my heart. The solution is to use enterprise drives, which are very costly, or to use NAS-oriented drives, which are closer to desktop drives in both price and hardware but run firmware that allows for different error handling, a method that resembles how enterprise drives handle errors. My personal preference for NAS drives is WD's Reds, but each manufacturer has a NAS-variant drive now.
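
On drives whose firmware supports it, the error-recovery timeout can be inspected, and sometimes capped, with smartmontools. A sketch in Python, assuming smartctl is installed; the device names are hypothetical examples:

    import subprocess

    DRIVES = ["/dev/sda", "/dev/sdb"]  # hypothetical device names

    for dev in DRIVES:
        # Show the current SCT error recovery timers (values are in 100 ms units).
        subprocess.call(["smartctl", "-l", "scterc", dev])
        # Request a 7-second cap, the same ballpark NAS/enterprise firmware uses.
        # Drives without ERC support will simply reject this.
        subprocess.call(["smartctl", "-l", "scterc,70,70", dev])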

With btrfs now being considered mostly stable, it should definitely be a consideration, as it looks to have a really cool feature set. At the time I was deciding on my long-term data solution, btrfs was not nearly stable enough for me to have considered it.

Consider buying long-warrantied (>=3 year) used drives on Amazon, yes, used. My reasoning is that most drives fail either in the first week or toward the end of their warranty cycle, so by buying used I effectively have other people test out the drives for me. I have checked the SMART data on these used drives; the "Power On Hours Count" has typically been around 300 hours, a value I find acceptable. My personal hypothesis for why people return drives with such low hours is that they "rent" hard drives from Amazon to transition to larger datasets on new storage systems and take the return fee hit as compensation. Then again, I could have just gotten a batch from someone who had done exactly that. Another advantage to buying used is the potential for considerable savings, depending on the number of drives you are buying. Sixty percent of the first batch of drives I bought (all new, 1TB, 3-year warranties, various manufacturers) failed. I was fortunate that most of them failed around seventy percent of the way through their warranty period; one did fail after a week. It typically took two weeks for the manufacturer to return a refurbished drive with a renewed warranty, so I ended up buying a spare drive to use as a cold spare during these exchanges. None of the second batch of drives (all used, 3TB, 5-year warranties, WD Reds) have failed after two years of use. Not exactly apples to apples, but an idea to consider. Google and Backblaze have interesting hard drive failure data to look into as well.
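
Checking the hours on an incoming used drive takes a minute with smartmontools. A small sketch in Python; the device list and threshold are hypothetical examples, not values from my setup:

    import subprocess

    THRESHOLD_HOURS = 500                          # arbitrary "acceptable wear" cutoff
    DRIVES = ["/dev/sda", "/dev/sdb", "/dev/sdc"]  # hypothetical device names

    for dev in DRIVES:
        for line in subprocess.check_output(["smartctl", "-A", dev]).decode().splitlines():
            if "Power_On_Hours" in line:
                raw = line.split()[9]             # RAW_VALUE column
                hours = int(raw.split("h")[0])    # some firmware reports e.g. "317h+27m"
                verdict = "ok" if hours < THRESHOLD_HOURS else "more wear than expected"
                print("%s: %d power-on hours (%s)" % (dev, hours, verdict))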

 

Final Thoughts:

Having personally experienced bit rot, failed drives, and data loss long before ZFS, I couldn't imagine storing my data on anything but ZFS from now on. Getting the software and hardware configuration just right for your use and needs is tough. I hope this post can illuminate your way if you are still trying to achieve data storage nirvana.


One Response to Thoughts and feelings on data storage implementations after four years of immersion.

  1. Blai says:

    Dear EpiJunkie,

    Interesting post. I'd like to configure a NAS for my storage needs. I have been using Solaris forever, so the OS is not the question. The question is the hardware. What do you think about installing Solaris 11.2 on a FreeNAS Mini? Would it be a good match? Do you have a recommendation for cheap and good hardware for Solaris 11.2?

    Thanks!

    Blai
