As we were developing the Tintri OS 1.4 release (coming this summer), we looked at how customers are using our systems. We’re previewing Tintri’s new advanced snapshot and cloning capabilities with some early-access customers right now and seeing lots of eyes light up. There isn’t anything like this out there today that can truly snapshot at the VM level. When we talked with these customers, many of them voiced a common refrain, which went something like:
“Virtualizing lets us do things like snapshot or clone VMs easily, which should make managing our environment much simpler, but the storage layer trips us up when we try to scale.”
VM snapshots and clones are closely related (in fact, clones are a form of snapshot), but they have different attributes and uses.
Being a Clone Isn’t Easy
Many storage systems provide some form of snapshots and clones, but they differ greatly in terms of efficiency and performance. Some track changes between snapshots using very large blocks. This means that even small changes between snapshots consume large amounts of space. Others efficiently track changes to data but create lots of metadata, increasing management overhead. Finally, many implementations impose significant performance penalties for clones, which result in slow reads and writes.
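To make the granularity point concrete, here is a minimal sketch of how the tracking block size affects the space a snapshot delta consumes. The block sizes (1 MiB and 4 KiB) and the write pattern are illustrative assumptions, not figures from any particular storage system.

```python
def snapshot_delta_bytes(changed_offsets, block_size):
    """Space needed to preserve changes when tracking is done in fixed-size blocks.

    Any block touched by at least one write must be kept in full.
    """
    dirty_blocks = {offset // block_size for offset in changed_offsets}
    return len(dirty_blocks) * block_size


# Hypothetical workload: 1,000 small writes scattered 1 MiB apart on a virtual disk.
writes = [i * 1024 * 1024 for i in range(1000)]

coarse = snapshot_delta_bytes(writes, block_size=1024 * 1024)  # assumed 1 MiB tracking
fine = snapshot_delta_bytes(writes, block_size=4 * 1024)       # assumed 4 KiB tracking

print(f"coarse tracking: {coarse} bytes, fine tracking: {fine} bytes")
```

With this scattered workload, coarse tracking keeps 256 times as much data as fine tracking, even though the writes themselves are identical.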
In every instance, storage systems take snapshots at the storage object layer — usually a volume or LUN. Although these can be space-efficient, managing VM snapshots quickly becomes complicated as the volume of metadata explodes.
Rethinking VM Snapshots and Clones
We approached this problem from the standpoint of a virtualized environment. Our customers wanted to snapshot and clone VMs, but had only two options: use the hypervisor's native capabilities, which consume host resources and are inefficient at large scale, or use array snapshots, which operate on volumes or LUNs rather than on individual VMs.
Tintri snapshots and clones are operations on the VM itself (and just the VM!). They make very efficient use of data and metadata and impose very little, if any, performance overhead. This is achieved by sharing both data and metadata at a fine granularity and by designing the core data paths from the ground up to work efficiently with snapshots and clones. Snapshots and clones can be created instantaneously regardless of the size of the VM, and use no additional space until they are modified. Furthermore, because snapshots and clones are created and managed on a per-VM basis, users have more flexibility in managing data protection policies for different VMs.
Tintri VM Snapshots
Snapshots can be created and deleted manually, or automatically according to a default schedule. The schedule can also be customized for individual VMs (Figure 1).
Figure 1: VM snapshot management policy
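The idea of a default schedule with per-VM customization can be sketched as a simple override lookup. This is only an illustration: the `SnapshotPolicy` fields, the default values, and the VM name `sql-prod-01` are hypothetical and do not reflect the actual Tintri settings.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SnapshotPolicy:
    interval_hours: int  # how often a snapshot is created (hypothetical field)
    retain_count: int    # how many snapshots are kept before deletion (hypothetical field)


# Assumed system-wide default schedule.
DEFAULT_POLICY = SnapshotPolicy(interval_hours=24, retain_count=7)

# Assumed per-VM customizations; only the overridden fields are listed.
overrides = {"sql-prod-01": {"interval_hours": 1}}


def effective_policy(vm_name):
    """Return the default policy with any per-VM overrides applied."""
    return replace(DEFAULT_POLICY, **overrides.get(vm_name, {}))


print(effective_policy("sql-prod-01"))
print(effective_policy("web-07"))
```

A VM without an entry in `overrides` simply follows the default schedule.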
You can also choose to create either crash-consistent or VM-consistent snapshots (Figure 2). Tintri snapshots are integrated with vSphere to allow the creation of VM-consistent snapshots: i.e., before the snapshot is created on the storage array, the VM is first quiesced and stabilized. With crash-consistent snapshots, however, the VM is not quiesced. As a result, creation of these snapshots is significantly faster, but there is no guarantee that in-flight IO for the VM will be captured in the snapshot.
Users can view various snapshot statistics such as total snapshot space usage (Figure 3) and per-VM snapshot details including the change rate (Figure 4) to decide how to tune the snapshot schedule.
Figure 3: Total VM snapshot usage
Figure 4: Per VM snapshot change rate
Tintri cloning allows users to instantaneously create multiple space-efficient, high-performance clones or copies of a VM. Clones are treated as first-class citizens just like all other VMs. A single cloning operation can simultaneously create hundreds of clones, and can be customized based on templates created in VMware vCenter (Figure 5).
Figure 5: Clone creation
It’s a very easy process for the user, but a lot of work happens behind the curtain to clone a live VM. The Tintri OS first takes a snapshot of the VM. The clone is then configured to share the data and metadata in the snapshot. Since all of the clone’s data and metadata is initially stored in the shared snapshot, the initial space consumption of the clone is zero. The space consumption of the clone increases only as new data is written to the clone. Clones can also be created from an existing snapshot of a VM. As with all VMs on a system, clones benefit from Tintri’s deduplication and compression in flash. As a result, the clones are both space-efficient and benefit from flash performance.
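The sharing described above can be illustrated with a toy model. This is not Tintri's implementation; it is a minimal sketch in which a clone reads through to an immutable snapshot and stores only the blocks written after it was created. The class and method names are hypothetical.

```python
class Snapshot:
    """Immutable point-in-time block map: block number -> data."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_no):
        return self._blocks.get(block_no)


class Clone:
    """A writable VM image that shares all unmodified blocks with a snapshot."""

    def __init__(self, snapshot):
        self._base = snapshot
        self._own = {}  # only blocks written after the clone was created

    def read(self, block_no):
        if block_no in self._own:
            return self._own[block_no]
        return self._base.read(block_no)

    def write(self, block_no, data):
        self._own[block_no] = data  # the shared snapshot is never modified

    def blocks_consumed(self):
        return len(self._own)


snap = Snapshot({0: b"boot", 1: b"data"})
clone_a, clone_b = Clone(snap), Clone(snap)
clone_a.write(1, b"new")

print(clone_a.read(1), clone_b.read(1))
print(clone_a.blocks_consumed(), clone_b.blocks_consumed())
```

Both clones start with zero blocks of their own. Writing to `clone_a` grows only `clone_a`, and `clone_b` still sees the original snapshot data.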
Figure 6 below illustrates the benefit of Tintri clones. We have a VM called linux-vm with a provisioned size of 500GB and a total space consumption of 200GB. Suppose the user creates five copies of this VM. The brute-force solution would copy this VM’s virtual disk files a total of five times. This is very time-consuming and space-inefficient: the five copies would consume 1TB in total (5 × 200GB). Five Tintri clones, by contrast, use no additional space at creation; each per-VM clone consumes additional space only as it is modified.
Figure 6: Copies vs. clones
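The arithmetic behind this comparison is simple enough to check directly. The sketch below uses the 200GB and five-copy figures from the example; the 2GB written per clone is a hypothetical value added for illustration, and the model ignores deduplication and compression.

```python
USED_GB = 200    # space consumed by linux-vm, from the example above
NUM_COPIES = 5


def full_copy_space_gb(used_gb, copies):
    """Every full copy duplicates all of the consumed space."""
    return used_gb * copies


def clone_space_gb(written_gb_per_clone):
    """Clones consume space only for data written after creation."""
    return sum(written_gb_per_clone)


full = full_copy_space_gb(USED_GB, NUM_COPIES)    # 1000 GB, i.e. 1TB
at_creation = clone_space_gb([0] * NUM_COPIES)    # nothing written yet
after_writes = clone_space_gb([2] * NUM_COPIES)   # hypothetical: 2 GB written per clone

print(full, at_creation, after_writes)
```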
You can create VM snapshots of Tintri clones, which inherit the default snapshot creation and deletion policy of the base VM. You can even create clones of clones and snapshots of clones of clones and so forth. Upon creation, clones are automatically registered with vCenter. Any specified customization specs are applied, and the clones are ready to be powered on within seconds. The cloning capability is also available through the NFS VMware APIs for Array Integration (VAAI) plug-in, so array-side cloning can be initiated directly from vCenter.
We're very excited about the new snapshot and cloning capability. As one customer told us last week, “This will revolutionize how you look at storage in your virtual environments. Cloning is so fast I didn’t have time to refill my coffee.”
Pratap Singh, Member of Technical Staff at Tintri, also contributed to this blog.