Harddrives Update (2)

January 16th, 2014

Wow, the donation drive is running pretty well. I really appreciate every single donation, so I want to share my current thoughts with you. Right now I’m about 50/50 between a dedicated hardware RAID and ZFS. In the beginning I was leaning more towards the HW RAID solution because I know those controllers from work, but the deeper I dig into ZFS docs and blog posts, the more I have to rethink my setup.

There’s still some time to go and I don’t have to hurry (it’s not like the world is ending tomorrow). When spending that much money I can and should take my time to carefully check my options and write them down, to make it easier to compare the two final solutions.

What I wanted to do anyway

  • add more RAM (finally fill all banks)
  • add (enterprise) SSDs (RAID1) for MySQL
  • add a pair of small SSDs or flash disks for the hypervisor (to separate hypervisor and storage, which isn’t the case right now).

How my final setup could look (a rough sketch of the ZFS variant follows this list):

  • 2x small flash disks (PCIe) or SSDs (SATA) for the hypervisor (I’ll keep Debian + Libvirt + KVM, which I’m very happy with)
  • 4x 1TB SAS drives with RAID6 or raidz2
  • LVM volumes or ZFS volumes (zvols) for the virtual guests (evemaps, www, mail, vpn, development, test, etc.)
  • I’m very likely adding 2 enterprise SSDs in a RAID1 array to store high-IO stuff like databases, to keep it away from the spinning drives.
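
To make that list a bit more tangible, here’s a minimal sketch of how the ZFS variant of this layout could be assembled. It is not my actual setup: the pool names (tank, fast), the zvol sizes and the /dev/disk/by-id paths are placeholders for illustration only.

```python
#!/usr/bin/env python3
"""Rough sketch of the planned layout as zpool/zfs commands, driven from Python.
All names, sizes and device paths are placeholders -- adjust before running."""
import subprocess

def run(*cmd):
    """Print and execute one command (requires ZFS on Linux to be installed)."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 4x 1TB SAS drives in a raidz2 vdev (double parity, comparable to RAID6)
sas_disks = [f"/dev/disk/by-id/scsi-SAS_DISK_{i}" for i in range(1, 5)]
run("zpool", "create", "-o", "ashift=12", "tank", "raidz2", *sas_disks)

# 2x enterprise SSDs as a mirrored pool for the high-IO MySQL data
ssds = ["/dev/disk/by-id/ata-SSD_1", "/dev/disk/by-id/ata-SSD_2"]
run("zpool", "create", "-o", "ashift=12", "fast", "mirror", *ssds)

# One zvol per virtual guest, used as a block device instead of an LVM volume
for guest, size in [("evemaps", "200G"), ("www", "50G"), ("mail", "50G")]:
    run("zfs", "create", "-V", size, f"tank/{guest}")
```

Each zvol then shows up under /dev/zvol/tank/<name> and can be handed to Libvirt/KVM as a plain block device, much like an LVM logical volume would be.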

These are the basics. Now let’s look at the up- and downsides of the two solutions I’m currently considering:

HW RAID:

  • Pro
    • Usually fire-and-forget, and often used in enterprise environments for single servers that don’t require more than 4 or 8 disks
    • Overall performance gain through the controller cache with cache protection (flash or battery)
    • I know from work how MegaRAID controllers perform and how they can be managed
  • Contra
    • RAID controllers with cache and cache protection are quite expensive
    • Hardware dependency on the RAID controller
    • The optional SSD read/write cache on the controller costs extra license money

ZFS:

  • Pro
    • Flexible configuration, tuning and additional features
    • No hardware dependency
    • Built-in SSD read cache (L2ARC) and write cache (ZIL for sync writes) (see the sketch below this list)
    • For the price of the HW RAID controller I could get a SAS host adapter plus an additional pair of SSDs for read/write caching
    • ZFS on Linux was declared production-ready about 10 months ago (which I hadn’t noticed)
  • Contra
    • Higher CPU usage due to software RAID compared with HW RAID
    • Higher RAM usage for the ARC (in-RAM cache) compared with HW RAID
    • ZFS on Linux always had that “better don’t use it, or use it on Solaris” touch, and I don’t intend to move away from Debian + Libvirt/KVM as the hypervisor
    • Needs more thought, planning and probably tuning to find the best configuration in detail
    • Even though I maintain ZFS servers (mostly backup storage servers) at work, I don’t have much experience with how ZFS reacts under load.
  • ZFS + HW RAID
    • While doing my research I even found people suggesting to use a HW RAID controller to gain speed through its write-back cache and to pass every single disk through as its own single-disk array (RAID0 or JBOD) for ZFS to work with. But that sounds very much like voodoo to me.
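
Since the built-in caches are one of the main arguments for me, here is a rough sketch of how an L2ARC device and a separate ZIL device (SLOG) could be attached to an existing pool. Again, the pool name tank and the device paths are placeholders, not my real hardware.

```python
#!/usr/bin/env python3
"""Sketch: attaching an SSD read cache (L2ARC) and a mirrored SLOG for the ZIL
to an existing pool. Pool name and device paths are placeholders."""
import subprocess

def zpool(*args):
    cmd = ["zpool", *args]
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# L2ARC: a single SSD is enough; losing it only costs cached reads
zpool("add", "tank", "cache", "/dev/disk/by-id/ata-CACHE_SSD")

# SLOG for the ZIL: better mirrored, since sync writes land here before the pool
zpool("add", "tank", "log", "mirror",
      "/dev/disk/by-id/ata-LOG_SSD_1", "/dev/disk/by-id/ata-LOG_SSD_2")

# Verify the resulting layout
zpool("status", "tank")
```

And if the higher RAM usage of the ARC ever becomes a problem, ZFS on Linux lets you cap it via the zfs_arc_max module parameter (e.g. an options line in /etc/modprobe.d/zfs.conf).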

Conclusion:

Both are equally expensive (when using the same drives), so in the end it all comes down to some very basic points:

  • Software RAID vs. hardware RAID
  • Flexibility vs. hardware dependency
  • Built-in read/write caches on dedicated SSDs vs. the built-in controller cache (BBU/cache protection) with an optional but expensive SSD cache

UPDATE (18.01.2014)

I’ll go with the ZFS solution. If something goes wrong I can still throw out the HBA and buy a HW RAID controller, but I don’t think it will come to that.

I know the donation drive isn’t over yet, but I’ve decided to already order about 2.5k € worth of hardware (SAS controller, SAS drives, SSDs, memory). I know that exceeds what my donation drive was aiming for, but I was sick of having limited memory (and I know ZFS likes more memory). What’s better than RAM? More RAM! 🙂

I’ll share more details when I’m done.

3 Responses to “Harddrives Update (2)”

  1. Dunedan says:

    You definitely want to read the following article about next-gen filesystems before choosing a filesystem/raid combination: http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/

    • Wollari says:

      Thank you. That’s a really interesting article. I’d heard about bitrot earlier, but had nearly forgotten about it. It makes my decision lean even more strongly towards ZFS right now, even though I hammered against it in the first days.

      I’ve spent the last evenings with lots of reading and researching. The more I read about ZFS, the more I like it, even though I’ve already used it at work for a couple of years on a couple of machines.

      With my current migration concept I’ll of course keep evemaps and my other VMs on the old drives for now. I’ll plug everything in and go back home (to minimize the downtime of evemaps, etc.).

      Then I’ll have enough time to play around with my storage pool and do some tests, and when I’m happy I’ll move my VMs to the new storage. If I don’t like it, I still have time to take the other option (HW RAID). So far I’m optimistic.

      The only thing that always kept me away from ZFS on Linux was the poor implementation in the early days and all the license bickering about kernel integration or not.

  2. TauCabalander says:

    I’ve been using BTRFS on a 24 TiB array, sitting on top of hardware RAID-6 (an old 16-port 3Ware 9650SE-16ML card), for about a year now. It was EXT4 before that, until I hit the 32-bit kernel limit (32-bit buffer cache indices x 4 KiB blocks = 16 TiB). I could have gone 64-bit EXT4 when I moved to a 64-bit kernel, but I really didn’t want to re-format to get it, and BTRFS can convert an EXT4 filesystem in place.

    BTRFS is great. I wouldn’t touch its RAID-5/6 support yet, which is not all there (very young), hence I’m still using a hardware RAID card. However, its RAID-0 and RAID-1 code is in good shape.

    ZFS has some issues, especially under Linux, and it is really more appropriate for major server hardware as it is a bit of a resource hog.

    In all cases, make sure you have backups! FWIW, Google recommends 3 copies of everything for their datacenters.
