One of the benefits of VM-aware storage is that application behavior is easily visible. This applies to several dimensions—performance impact, but also data generation impact.
Tintri per-VM capacity usage and snapshots allow unprecedented visibility into application behavior.
You can get this information at a glance by searching for the VM in the VMstore UI and reading off the results.
Tintri per-VM replication, tightly integrated with per-VM snapshots, allows unprecedented control over storage replication. You can select exactly what to replicate.
Is your application generating more data this week than it did last week? Has something suddenly changed? Has a user suddenly decided to fill up their VM's local filesystem with incompressible contraband media content? Has somebody turned on guest-side disk or filesystem encryption, causing a big drop in data compressibility on the storage side?
It's easy to notice things like this when you have per-VM snapshots; all you have to do is look at the size of the incremental snapshots (daily or hourly) and see literally how much new data a VM generated in a given day.
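As a rough sketch of that comparison, the snippet below takes two lists of daily incremental snapshot sizes and flags a VM whose average daily change has jumped. The function names, the 2x threshold, and the sample numbers are all invented for illustration; they are not part of any Tintri API, and real values would be read from the per-VM snapshot sizes shown in the VMstore UI.

```python
from statistics import mean

def growth_ratio(last_week_mb, this_week_mb):
    """Ratio of this week's average daily changed MB to last week's."""
    baseline = mean(last_week_mb)
    if baseline == 0:
        # No change at all last week: any new data counts as unbounded growth.
        return float("inf") if mean(this_week_mb) > 0 else 1.0
    return mean(this_week_mb) / baseline

def changed_suddenly(last_week_mb, this_week_mb, threshold=2.0):
    """True when average daily change grew by at least `threshold` times.

    The default threshold of 2.0 is an arbitrary illustrative choice.
    """
    return growth_ratio(last_week_mb, this_week_mb) >= threshold

# Made-up daily incremental snapshot sizes in MB (not real customer data).
last_week = [900, 1100, 1000, 950, 1050, 1000, 1000]
this_week = [1000, 1200, 4000, 4200, 3900, 4100, 4000]

print(changed_suddenly(last_week, this_week))
```

A check like this only says that something changed. Whether the cause is new media files, guest-side encryption, or a legitimate workload shift still takes a look at the VM itself.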
In the field
Recently we had an interesting case where a customer needed to replicate a 10TB virtual machine between two data centers approximately 1000 miles apart. The network bandwidth available was approximately 80Mbps. How long would it take to replicate such a VM?
Calculating this answer was straightforward. First, we looked at the post-compressed size of the VM. This is possible because the Tintri filesystem calculates the post-compressed size of the VM's live data as well as its snapshots. It doesn't matter what the compression rate of the overall system is; that might vary significantly from the compression rate for the data in any one particular VM.
And, because Tintri VM replication also compresses the data in transit, the sum of the compressed size of the VM's snapshots indicates how much total data will have to transfer over the network. Tintri's VM replication also utilizes deduplication, but in this case we made no assumption that any of the VM's data would be found on the destination.
Here is the data on this VM's snapshots, taken directly from the VMstore UI on the customer's replication source system:
The "Changed MB" column gives the space consumed by each snapshot on a post-compression basis. The snapshots totaled about 8.5TB, and we estimated that replicating them would take approximately 11 days. The math is pretty simple. We used a throughput of 9 MB/sec, a little under the 10 MB/sec line rate of an 80Mbps link. 8.5TB divided by (9 MB/sec * 3600 seconds/hour * 24 hours/day) gives approximately 11 days.
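The same arithmetic can be written out as a short sketch. It assumes decimal units (1 TB = 1,000,000 MB) and a steady 9 MB/sec for the whole transfer; the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 3600
MB_PER_TB = 1_000_000  # decimal units, an assumption of this sketch

def replication_days(compressed_tb, throughput_mb_per_sec):
    """Days needed to send `compressed_tb` at a steady throughput."""
    total_mb = compressed_tb * MB_PER_TB
    seconds = total_mb / throughput_mb_per_sec
    return seconds / SECONDS_PER_DAY

# 8.5 TB of post-compression snapshot data at 9 MB/sec
print(f"{replication_days(8.5, 9):.1f} days")  # prints "10.9 days"
```

Using binary units instead (1 TB = 1,048,576 MB) pushes the estimate to roughly 11.5 days, so the result is best read as "about 11 days" rather than as a precise figure.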
One more question
But what if the application in this VM were to generate a lot of additional new data while the existing snapshots were being replicated? If the application generated, say, 1TB per day of new data (this is not unheard of), we would never be able to catch up, given the available WAN bandwidth: at 9 MB/sec the link moves only about 0.78TB per day.
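A minimal sketch of that catch-up condition follows. It assumes the new data rate is measured post-compression, that throughput and data generation are both constant, and that units are decimal. The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 3600
MB_PER_TB = 1_000_000  # decimal units, an assumption of this sketch

def link_capacity_tb_per_day(throughput_mb_per_sec):
    """How much data the link can move in one day, in TB."""
    return throughput_mb_per_sec * SECONDS_PER_DAY / MB_PER_TB

def catch_up_days(backlog_tb, new_data_tb_per_day, throughput_mb_per_sec):
    """Days until replication catches up, or None if it never does.

    The backlog shrinks only by the difference between what the link
    can move per day and what the application adds per day.
    """
    net_tb_per_day = (
        link_capacity_tb_per_day(throughput_mb_per_sec) - new_data_tb_per_day
    )
    if net_tb_per_day <= 0:
        return None  # new data arrives at least as fast as the link drains it
    return backlog_tb / net_tb_per_day

print(catch_up_days(8.5, 1.0, 9))   # None: 1 TB/day exceeds ~0.78 TB/day
print(catch_up_days(8.5, 0.01, 9))  # a little over 11 days
```

The closer the daily change rate gets to the link's daily capacity, the longer the catch-up time grows, which is why the incremental snapshot sizes are worth checking before a large replication starts.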
In this case, however, there was nothing to worry about. We looked at the incremental snapshot sizes -- the snapshots newer than the oldest one -- and they were very modest, cumulatively under 1% of the size of the oldest and largest snapshot of the VM. This pattern is typical of many applications -- a lot of apps just don't generate all that much new data, most of the time.
For the VM here, this means that, barring some unexpected drastic change in application behavior, the incremental snapshots would replicate quickly once the large base snapshot finished (snapshots replicated in order, oldest to newest). Ultimately, the VM here finished replicating in 11 days, as predicted.