Unless you’ve been living under a rock, you have probably noticed that HyperConverged Infrastructure (HCI) is a fast-growing segment of the hardware market. At least that’s what all the marketing campaigns and analysts are saying. Gartner said earlier this year that the HCI market will grow to consume 24% of the integrated systems market by 2019. According to IDC, HCI systems made up 11.4% of integrated systems sales in Q4 2015, with YoY growth of 170.5%. Of course, to a certain degree this growth is driven by vendors pushing HCI very hard at the consumer and in the channel. I can’t tell you how many briefings, webinars, and marketing campaigns I’ve been hit with about HCI in the last 12 months. Excessive is the word that comes most readily to mind. Nevertheless, where there’s smoke there is likely hyperconverged fire.
Working for a company with multiple hardware partners has its advantages, and this has allowed me to actually get my hands on four different HCI systems. Before I go any further, this is the part where I tell you that the opinions expressed in this post are mine and mine alone, and that this in no way reflects or is endorsed by my employer. There, now I feel better. Moving on…
So like I said, I’ve been lucky enough to have worked on four different HCI systems. They are as follows:
Before I get into the details of the various systems, I want to pontificate a little about HCI and its place in the industry. The way that HCI is billed, you would think that it was a panacea for whatever ails your datacenter. Need more compute? Want to get rid of storage arrays? Need more automation? If you answered yes to any or all of these, then HCI is for you! The reality is that HCI is the right tool for some jobs, but by no means all. I see the main use cases for when:
An HCI is not going to replace your storage arrays, at least not in the short term. Storage arrays are easily expandable, with tiered storage types that meet your needs. An HCI appliance has a pre-configured amount of storage in it, and you usually need to purchase identical nodes for the storage replication to work properly. So expanding the storage in a node is a non-trivial affair. For the time being, it’s likely you’ll be maintaining a SAN alongside your hyperconverged nodes. Maybe in the long term all lower tiers of storage can be shuffled up into S3 or StorSimple.
An HCI is also not going to replace all of your traditional dedicated servers, the ones that are not virtualized and aren’t going to be anytime soon. While it’s true that you can basically virtualize anything you want, that doesn’t necessarily mean that you should. HCI relies on a hypervisor, typically ESXi or Hyper-V, and virtualization to run all of its instances. If you have an Exchange buildout following the preferred architecture or an Oracle RAC on physical boxes, then HCI isn’t going to virtualize and condense those bad boys.
An HCI is not going to reduce your network footprint, at least not any more than any other virtualization solution. And in some cases it will require a higher port density on your ToR switches. For instance, an HPE C7000 enclosure with two Virtual Connect FlexFabric 20/40 modules only really needs four 10Gb uplinks (two per module) to your ToR switches. It’s possible that you will need additional uplinks if things are getting saturated, so let’s say a maximum of eight uplinks (four per module), providing connectivity for 16 hosts in the enclosure. That’s a ratio of 1:2, one uplink for every two hosts, and I’m not even considering the 40Gb QSFP ports that would drop it down to 1:4 or 1:8. In HCI world, each host has two 10Gb ports, so the ratio gets flipped to 2:1, two ports for every host.
What about rack density? C7000 enclosures are 10U apiece with 16 half-height hosts. So you can fit four in a rack with two 1U ToR switches, assuming you can serve enough power to the rack for all four enclosures. You would need 32 ports for the enclosures, and then sufficient ports to uplink your ToR to your spine switches (assuming leaf and spine). Two 24-port switches would probably be good enough, or you could go 32 or 48 ports to be safe. That would be 64 hosts per rack. An HC250 has four nodes in 2U, so each 2U requires eight 10Gb ports. Load up 40U and you’ve got 80 nodes requiring 160 ports! You’ve got an additional 16 hosts in your rack, but you need a port density that goes beyond two 48-port ToR switches. Now you’re looking at two 2U 96-port switches.
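If you want to check the rack math above or plug in your own hardware, here is a rough Python sketch. It only uses the figures from this post (40U of compute space, eight uplinks per C7000 in the worst case, two 10Gb ports per HC250 node). The `RackLayout` class and its field names are made up for illustration, and the sketch ignores spine uplinks, power, and the rack space the switches themselves take up.

```python
from dataclasses import dataclass


@dataclass
class RackLayout:
    """Hypothetical helper: one rack filled with identical enclosures/chassis."""
    name: str
    unit_height_u: int    # rack units per enclosure or chassis
    hosts_per_unit: int   # blades or nodes per enclosure or chassis
    ports_per_unit: int   # 10Gb ToR ports consumed per enclosure or chassis
    usable_u: int = 40    # rack space assumed available for compute

    @property
    def units(self) -> int:
        # How many enclosures/chassis fit in the usable space
        return self.usable_u // self.unit_height_u

    @property
    def hosts(self) -> int:
        return self.units * self.hosts_per_unit

    @property
    def tor_ports(self) -> int:
        return self.units * self.ports_per_unit

    @property
    def ports_per_host(self) -> float:
        return self.tor_ports / self.hosts


# Worst-case C7000: 8 uplinks per 10U enclosure of 16 half-height blades
blade = RackLayout("C7000 blades", unit_height_u=10, hosts_per_unit=16, ports_per_unit=8)
# HC250: 4 nodes per 2U chassis, 2 x 10Gb ports per node
hci = RackLayout("HC250 nodes", unit_height_u=2, hosts_per_unit=4, ports_per_unit=8)

for layout in (blade, hci):
    print(f"{layout.name}: {layout.hosts} hosts, {layout.tor_ports} ToR ports, "
          f"{layout.ports_per_host:.1f} ports per host")
```

With those inputs the blade rack comes out at 64 hosts on 32 ports (0.5 ports per host) and the HC250 rack at 80 hosts on 160 ports (2.0 ports per host), which matches the numbers above.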
So what do you get with HCI that’s different from buying a blade chassis and storage array? Well, first off, you don’t need a storage array; this is converged infrastructure after all. Each node in the HCI has an identical amount of storage in it, and the contents of that storage are replicated to other nodes in the HCI environment. The value that an individual vendor adds to the equation is the software which handles that storage replication. In HPE-based hardware, the storage replication is handled by StoreVirtual VSA (Virtual Storage Appliance), formerly known as LeftHand before the company was acquired by HP in 2008. Nutanix uses their own Acropolis data fabric. VxRail uses VSAN, naturally. Every other HCI vendor uses either VSAN or some proprietary storage replication technology. For me, the different replication technologies can be a key differentiator since they determine how much usable storage you actually end up with, and how reliable/durable your data is at rest. Then there are questions of which ones can support native encryption, site-to-site replication, data deduplication, and more.
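To make the usable storage point concrete, here is a back-of-the-envelope sketch in Python. It assumes the simplest possible model, where every block is stored a fixed number of times across the cluster, and it does not describe how StoreVirtual, Acropolis, or VSAN actually lay out data. The function name, the node count, and the per-node capacity are all hypothetical, and real products will also lose capacity to metadata and spare space, or gain some back through deduplication and compression.

```python
def usable_capacity_tb(nodes: int, raw_tb_per_node: float, copies: int) -> float:
    """Usable capacity under a naive model: every block is kept `copies` times.

    This is an illustrative assumption, not any vendor's actual algorithm.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    if copies > nodes:
        raise ValueError("cannot keep more copies than there are nodes")
    return nodes * raw_tb_per_node / copies


# Hypothetical cluster: four identical nodes with 10 TB of raw storage each
for copies in (2, 3):
    usable = usable_capacity_tb(nodes=4, raw_tb_per_node=10.0, copies=copies)
    print(f"{copies} copies: {usable:.1f} TB usable out of 40.0 TB raw")
```

Even in this toy model, going from two copies to three drops the same 40 TB of raw disk from 20 TB usable to a little over 13 TB, which is why the replication scheme matters so much when comparing vendors.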
The second thing that HCI promises is the linear scalability of components. If you need additional capacity for a particular cluster, you just snap in an additional node and away you go. The management plane should make that process straightforward and quick. In theory you shouldn’t need to install a hypervisor, configure a bunch of settings, and go through all the other rigmarole associated with expanding your compute infrastructure. The key differentiator between the various HCI offerings is how truly simple it is to pop in another node, and what the caveats are surrounding node expansion. Do you need to add a certain number of nodes at a time? Does the expansion have to exactly match what you already have? What is the maximum number of nodes per cluster, and does performance suffer beyond a certain number of nodes? A lot of this comes down to the hypervisor being used, the management overlay, and the automation layer. Which brings me to my last point…
The final thing that HCI promises to bring is simplified automation and end-user-driven provisioning. That’s not unique to HCI, and you really could do it with any platform, but since an HCI vendor controls the whole stack of components, they should in theory be able to bring some powerful automation and a programmable API. This is chasing the dream that already exists in the public cloud sphere with AWS and Azure. Okay okay, Google Compute Engine too. All of those public cloud providers are offering Infrastructure as a Service (IaaS): the ability for users to provision resources for themselves through a portal or an API. The ability to monitor consumption. The ability for admins to create a library of complex, pre-built application solutions and provide them to the end user. And the ability to expand to web-scale capacity in a linear and predictable fashion.
That’s the promise of Hyperconverged Infrastructure. A nascent field with a bunch of offerings. How well do they live up to the promise? In my next post we’ll take a look at the HPE HC250 and HC380.