Conventional HCI architectures increase troubleshooting complexity and can put critical operations at risk.
This post is the final installment in a series looking at the limitations of deploying conventional HCI architectures for enterprise IT.
In the first two posts in this series, I looked at HCI costs and HCI performance challenges. This time, I look at one of the most often-claimed advantages of conventional HCI: that these architectures decrease complexity. The “simplicity” of HCI comes at a price:
Don’t overlook these factors when you are making important infrastructure decisions that you’re going to have to live with for many years.
The tightly coupled architecture of HCI makes it more difficult to troubleshoot performance issues. Because everything is layered together on each node, it becomes almost impossible to isolate the source of a performance bottleneck.
If increasing a VM’s memory and CPU resources doesn’t solve the problem, you then have to assume the problem is IO.
As you can see, the process gets complicated quickly and grows with the size of your HCI cluster. Often, the only solution to the above scenario is to add another node.
Virtualization helped solve many traditional infrastructure issues such as hardware maintenance and patching. With external storage, you can easily move VMs to another host by moving the compute and memory state using VMware vMotion or Hyper-V Live Migration. With HCI architectures, storage is more tightly coupled with compute, so there’s a lot more to think about:
All this means that your IT team needs to be extra careful about scheduling maintenance windows, and, in many cases, you lose the independence to do maintenance activities on individual hosts because they have a broader impact. Doing maintenance becomes risky, but we know the risks of not doing patching and other maintenance all too well.
A single HCI failure can trigger much larger problems. When a host fails for any reason, it has the following effects:
Failure of even a single component, such as a flash drive, can cause an entire node to collapse. The result is a far greater impact on operations than when storage is decoupled from the host.
Conventional HCI destroys the stateless nature of virtualization and increases risk. Performance problems are much easier to troubleshoot with a decoupled architecture, especially when the architecture has been built from the ground up to provide workload-granular analytics. For example, Tintri allows you to see the root cause of any latency issue across compute, network, and storage, giving you a comprehensive view of your infrastructure at the VM or container level.
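To make the idea of a per-VM latency breakdown concrete, here is a minimal sketch in Python. It is not Tintri’s API or data model; the `LatencySample` class, the `slowest_vms` helper, the VM names, and all of the numbers are hypothetical and exist only to illustrate how splitting latency by layer points to a root cause.

```python
from dataclasses import dataclass


@dataclass
class LatencySample:
    """Hypothetical per-VM latency record, split by infrastructure layer."""
    vm: str
    compute_ms: float
    network_ms: float
    storage_ms: float

    def total_ms(self) -> float:
        # End-to-end latency is treated here as the simple sum of the layers.
        return self.compute_ms + self.network_ms + self.storage_ms

    def dominant_layer(self) -> str:
        # The layer contributing the most latency is the first place to look.
        layers = {
            "compute": self.compute_ms,
            "network": self.network_ms,
            "storage": self.storage_ms,
        }
        return max(layers, key=layers.get)


def slowest_vms(samples, threshold_ms):
    """Return samples whose total latency exceeds threshold_ms, slowest first."""
    flagged = [s for s in samples if s.total_ms() > threshold_ms]
    return sorted(flagged, key=lambda s: s.total_ms(), reverse=True)


# Made-up sample data for illustration only.
samples = [
    LatencySample("vm-web-01", compute_ms=0.4, network_ms=0.3, storage_ms=0.9),
    LatencySample("vm-db-01", compute_ms=0.5, network_ms=0.6, storage_ms=6.2),
    LatencySample("vm-app-01", compute_ms=3.8, network_ms=0.4, storage_ms=1.1),
]

for s in slowest_vms(samples, threshold_ms=5.0):
    print(f"{s.vm}: {s.total_ms():.1f} ms total, dominated by {s.dominant_layer()}")
```

With this kind of breakdown, a slow database VM and a slow application VM can be traced to different layers (storage for one, compute for the other) instead of being treated as the same generic performance problem.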
Because storage and compute are physically and logically separated, none of the risks described above affect the Tintri enterprise cloud platform, making it a lower-risk option for large-scale enterprise infrastructure deployments.