Putting the Intelligent in Intelligent Infrastructure - Part I - the File System
by Rob Girard, Principal Technical Marketing Engineer
There’s an increasing amount of buzz about “Intelligent Infrastructure” these days. Someone’s gotta stand up and call BS! I am seeing ridiculous claims from many of our competitors, beating their chests and calling their storage solutions “intelligent.” I’m not going to name names… I don’t have to… your social media feeds are already full of these claims.
You might be thinking to yourself, “What makes your storage solutions intelligent?” I’m glad you asked, because I want to set the record straight!
First, we need to level set on what intelligence is. A few years ago a colleague recommended a great book to me: ‘On Intelligence’ by Jeff Hawkins. Although written in 2005, it still holds a lot of weight today. This book set out to explore artificial intelligence (AI), but it first needed to pause and ask, “What is intelligence?” More specifically, “What about the human brain makes it intelligent?”
If I were to take a stab at summarizing (a skill I am wildly horrible at, as evidenced by my long-windedness), here’s what I took away from it: We are prediction machines. We’ve been designed to re-wire ourselves as we observe, learn, and adapt. On first inspection of any new object or concept, we spend a lot of mental energy discovering and classifying it. As we move through the world, if we had to use our most expensive brain cycles to analyze EVERYTHING we see, hear, touch, taste, and smell as if it were the first time we’d encountered it, we wouldn’t make it out of bed in the morning! But as we learn, we can break down concepts into smaller pieces of familiarity. The cool thing is that we don’t just learn in the conscious portion of our minds; we push those learned patterns down to the unconscious edges where they are perceived.
When we encounter something normal, we simply pass a brief interpretation of that object/experience to our internal CPUs to deal with: cat, ball, airplane jets, etc. But, in the event we witness something we’ve learned to recognize that doesn’t seem quite right, such as a 3rd eye on a cat (which we nearly missed in our brief passing!)… SOUND THE ALARMS! Summon all mental energy to analyze and probe the unknown! Give it a new meaning and classify it for the next time we encounter a three-eyed feline. In this case, it turns out it was another small object we recognize: a dead leaf, which was sitting on the cat’s forehead. Decrease DEFCON status and return to business as usual.
OK, enough banter on a topic (the human brain) in which I am not an expert. Instead, let me turn to a topic with which I am much more familiar: computers, systems, and storage. There are complex relationships between apps, the users driving them, the systems they run within, their interaction with neighboring systems, the effect they have on network paths, the storage I/O they generate, and the storage medium employed to persist data.
Intelligence starts with an intelligent design, a foundation from which future intelligence can spawn. Admittedly, the design of a Tintri VMstore isn’t on the same level as the design of our wetware brains, but it’s still pretty impressive. We started with a custom file system, built from the ground up to be intelligent. Not intelligent for everything; we had specific goals to solve tough storage challenges that were specific to virtualized environments. We set out to do this in 2008, long before most enterprises were “all-in” on virtualization and before the number of virtual machines (VMs) deployed had surpassed the number of physical servers deployed. But we could see it was on the rise and simply “made sense.” That, and it feels good to solve a tough problem for the betterment of humankind.
Referring back to my brain example, which is the foundation for human intelligence, a key tenet of its operational model is the ability to classify random signals into things, i.e., objects/concepts that we can describe, understand, work with, and/or avoid. What are some things our file system needs to intrinsically understand? Virtual machines and the files that comprise them. We taught our file system that a VM is described by a configuration file (a .vmx file in the case of VMware’s ESX/ESXi/vSphere). This file tells us more about the VM, such as how many vDisks it has, and where they live (VMNAME.vmdk and VMNAME-flat.vmdk). These vDisks may have snapshots (VMNAME-0000000001.vmdk & VMNAME-0000000001-delta.vmdk) as well as vSwap files that are used when host memory pressure needs to spill contents over to storage. The list goes on….
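To make the idea of "classifying files into things" concrete, here is a minimal sketch in Python. It is not how the VMstore file system is implemented; it only groups a flat list of file names into per-VM objects using the VMware naming conventions mentioned above. The pattern table, the role names, the `.vswp` extension for swap files, and the assumption that snapshot sequence numbers have at least six digits are all illustrative choices of mine. A real implementation would read the .vmx file rather than guess from names.

```python
import re
from collections import defaultdict

# Hypothetical pattern table, checked in order (most specific first).
# Assumption: snapshot sequence numbers have six or more digits, so a disk
# named "web-01.vmdk" is not mistaken for a snapshot of a VM called "web".
PATTERNS = [
    ("config", re.compile(r"^(?P<vm>.+)\.vmx$")),
    ("snapshot_delta", re.compile(r"^(?P<vm>.+)-\d{6,}-delta\.vmdk$")),
    ("snapshot_descriptor", re.compile(r"^(?P<vm>.+)-\d{6,}\.vmdk$")),
    ("disk_data", re.compile(r"^(?P<vm>.+)-flat\.vmdk$")),
    ("disk_descriptor", re.compile(r"^(?P<vm>.+)\.vmdk$")),
    ("swap", re.compile(r"^(?P<vm>.+)\.vswp$")),
]


def classify(filename):
    """Return (vm_name, role) for a file name, or (None, 'unknown')."""
    for role, pattern in PATTERNS:
        match = pattern.match(filename)
        if match:
            return match.group("vm"), role
    return None, "unknown"


def group_by_vm(filenames):
    """Group recognized files into {vm_name: {role: [file names]}}."""
    vms = defaultdict(lambda: defaultdict(list))
    for name in filenames:
        vm, role = classify(name)
        if vm is not None:
            vms[vm][role].append(name)
    return {vm: dict(roles) for vm, roles in vms.items()}


if __name__ == "__main__":
    files = [
        "web01.vmx",
        "web01.vmdk",
        "web01-flat.vmdk",
        "web01-0000000001.vmdk",
        "web01-0000000001-delta.vmdk",
        "notes.txt",
    ]
    for vm, roles in group_by_vm(files).items():
        print(vm, roles)
```

Once files are grouped this way, operations such as snapshots or clones can be expressed against the VM object rather than against individual files, which is the point of the paragraph above.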
OK, we taught our file system about certain objects. Later, we taught it more objects to support additional hypervisor integrations, such as the equivalent files for Hyper-V, Red Hat Enterprise Virtualization, XenServer, and OpenStack. Recently, we taught the VMstore yet another new trick: storing, serving, protecting, and cloning objects beyond VMs, to include SQL Server database files: .mdf, .ldf, .ndf (and a few more along the way that we caught in our QA processes). Since we did it in an extensible way, we can programmatically do the same again and again in the future, with minimal lift relative to the initial effort.
The result? We have a file system that understands what a VM is, instead of one that serves a set of seemingly random read and write calls for various offsets whose contents and relevance are known only to the owner of the file system (the vSphere hosts, in the case of VMFS), not to the underlying storage system.
What else? All the higher-level objects need their components (files) to be captured at any point in time, and these need to be completely consistent with one another, preserving write order. Yes, I am referring to snapshots! Per-VM snapshots, added in v1.4, which achieved general availability in early 2012. I remember this date because I was a customer that participated in the beta.
These snapshots can be used to roll back to a point in time, for data recovery. They can be replicated for offsite protection, and the replication itself is another intelligent design that not only dedupes on the wire to save precious WAN bandwidth but replicates at a per-VM granularity. This ensures more useful 100% complete replica VMs vs. a big LUN that might have only completed 90% by the time it’s needed (which = 0% from a functional perspective).
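As a rough illustration of what "dedupe on the wire" can mean, here is a minimal Python sketch of per-VM replication that sends only blocks the destination does not already hold. This is a generic fingerprint-based approach under my own assumptions (fixed 4 KiB blocks, SHA-256 fingerprints, a dictionary standing in for the remote system), not a description of Tintri’s replication protocol.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size


def replicate(snapshot, remote_store, block_size=BLOCK_SIZE):
    """Replicate one VM snapshot, sending only blocks the remote lacks.

    remote_store maps fingerprint -> block and stands in for the
    destination system. Returns (recipe, bytes_sent), where recipe is the
    ordered list of fingerprints needed to rebuild this VM's snapshot.
    """
    recipe = []
    bytes_sent = 0
    for offset in range(0, len(snapshot), block_size):
        block = snapshot[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint not in remote_store:
            # Only unseen blocks cross the WAN.
            remote_store[fingerprint] = block
            bytes_sent += len(block)
        recipe.append(fingerprint)
    return recipe, bytes_sent


def restore(recipe, remote_store):
    """Rebuild a snapshot on the destination from its recipe."""
    return b"".join(remote_store[fp] for fp in recipe)


if __name__ == "__main__":
    remote = {}
    monday = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
    tuesday = b"A" * BLOCK_SIZE * 3 + b"C" * BLOCK_SIZE
    for label, snap in (("monday", monday), ("tuesday", tuesday)):
        recipe, sent = replicate(snap, remote)
        print(label, "bytes sent:", sent, "of", len(snap))
```

Because each recipe describes exactly one VM, a replica is either complete and usable or not yet present, which mirrors the per-VM granularity argument above.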
But wait, there’s more! You can breed new objects through intelligent cloning! This means you can instantly create more VMs (or databases) without any performance overhead. Not only can VMstore create them, but we know what to do with them, automatically adding them to the inventory of specified hosts! Plus, these are all incredibly space efficient since they don’t contain the churn of any other data except the exact data set you want: the data that belongs to specific VMs.
We taught our file system about the concept of hosts and clusters, and how to poll these to automatically discover which of the files we host belong to particular VMs. In the case of cloning, we also taught it how to speak to these hypervisors in their native language (APIs) to add the new VMs we just created into inventory.
We also taught our file system the ground rules of performance. All VMs (and databases) need to play nice with one another. This is where things got really interesting. We introduced machine learning (ML) to discover the performance characteristics of VMs, and then assume that the VM will continue to behave this way for the near future. So, we reserve performance to prevent a VM from being impacted by new workloads, or an existing workload whose characteristics have strayed from their demonstrated norm. Under the covers, our file system is constantly tweaking and tuning how to prioritize I/O for each VM based upon a lot of rules: some are constants, others are variable based on how other VMs are behaving. These rules for recognizing that I/O for each VM is independent from all other I/O are the foundation of our patented auto-QoS engine.
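The auto-QoS engine itself is patented and its rules are not spelled out here, so the following Python sketch is only my own simplified illustration of the general idea: learn each VM’s recent norm, reserve that much performance for it, and let new or misbehaving workloads share only what is left. The class name, the exponentially weighted moving average, and the `alpha` and `headroom` parameters are all assumptions, not Tintri’s algorithm.

```python
class PerVmQosSketch:
    """Toy per-VM performance reservation based on observed behavior."""

    def __init__(self, capacity_iops, alpha=0.2, headroom=1.25):
        self.capacity = capacity_iops  # total IOPS the system can serve
        self.alpha = alpha             # weight given to the newest sample
        self.headroom = headroom       # multiplier applied to the learned norm
        self.norm = {}                 # vm -> smoothed IOPS

    def observe(self, vm, iops):
        """Update the learned norm for a VM with a new IOPS sample."""
        previous = self.norm.get(vm)
        if previous is None:
            self.norm[vm] = float(iops)
        else:
            self.norm[vm] = self.alpha * iops + (1 - self.alpha) * previous

    def reservations(self):
        """Reserve norm * headroom per VM, scaled down if oversubscribed."""
        wanted = {vm: n * self.headroom for vm, n in self.norm.items()}
        total = sum(wanted.values())
        if total <= self.capacity:
            return wanted
        scale = self.capacity / total
        return {vm: w * scale for vm, w in wanted.items()}

    def allocate(self, demand):
        """Grant each VM up to its reservation, then share any spare capacity."""
        reserved = self.reservations()
        grant = {vm: min(d, reserved.get(vm, 0.0)) for vm, d in demand.items()}
        spare = self.capacity - sum(grant.values())
        unmet = {vm: demand[vm] - grant[vm]
                 for vm in demand if demand[vm] > grant[vm]}
        total_unmet = sum(unmet.values())
        if spare > 0 and total_unmet > 0:
            share = min(1.0, spare / total_unmet)
            for vm, shortfall in unmet.items():
                grant[vm] += shortfall * share
        return grant


if __name__ == "__main__":
    qos = PerVmQosSketch(capacity_iops=1000)
    for sample in (380, 420, 400):
        qos.observe("db", sample)
    qos.observe("web", 200)
    # "noisy" is a new workload with no history, so it has no reservation.
    print(qos.allocate({"db": 400, "web": 200, "noisy": 5000}))
```

In this toy model the established VMs receive what they have historically needed, and the new noisy workload is limited to whatever capacity remains.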
I’m going to stop myself here because hopefully I’ve made my point that we’ve put a LOT of thought into creating a file system to tackle a specific problem. We had the foresight to load it with variables and tunables to adapt and mature it. Our latest v4.5 is not just a number; it’s a whole lot of improvement way beyond 1.0! It’s this base brain that we’ve designed, built, and delivered to customers all over the world.
Folks: This is the real deal. With VMstore, we’re providing you with the choice of a different experience that demonstrably can simplify your environment and substantially reduce your administrative overhead. If you want to see and experience first-hand what intelligence resides in our Intelligent Infrastructure and how it can change your environment for the better, I encourage you to reach out to us for a demo or talk to an existing Tintri customer and ask them about their experience.
In my next blog, Putting the Intelligent in Intelligent Infrastructure - Part II - Metrics and Analytics, I’ll continue the intelligent infrastructure discussion and explain how the granularity of metrics and analytics can fundamentally change your user experience and reduce your administrative overhead.
Thank you for reading; I look forward to the next time. – Rob.