Prior to the widespread adoption of virtualization in enterprise environments, storage was “predictable.” System admins and storage admins could easily describe what the IO patterns looked like at the server, fabric and storage levels. For example, the enterprise BI environment may have had high read requests during the day, and significant write operations at night as the processing, analysis and reporting functions were executed.
For the most part, storage was isolated to a specific server. Run-of-the-mill file systems did not handle shared/concurrent access well, which resulted in isolated local storage or LUN-masked SAN storage. Storage and compute became a 1:1 mapping, which was fairly predictable.
However, once virtualization hit the scene, many virtual machines began sharing the same storage, and their IO was sliced and diced into what is known as the IO Blender effect.
Picture, for a moment, what the contents of a blender look like when you push the pulse button. The contents are being mixed up and jumping around, hardly resembling what was there before. IO is no longer predictable. Rather, it looks like it has gone through an IO Blender.
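To make the blender picture concrete, here is a minimal Python sketch (not taken from any real hypervisor or array). It assumes eight hypothetical VMs, each reading 1,000 blocks sequentially from its own region of a shared LUN, and a made-up `blend` function that interleaves their requests at random while preserving each VM's own order. The function names, block counts and region spacing are all illustrative assumptions.

```python
import random


def vm_stream(start_block, length):
    """Blocks one hypothetical VM reads: strictly sequential in its own region."""
    return list(range(start_block, start_block + length))


def blend(streams, seed=0):
    """Interleave per-VM streams at random, keeping each VM's own order.

    This stands in for a hypervisor multiplexing many guests onto one LUN;
    real schedulers are more sophisticated than a uniform random pick.
    """
    rng = random.Random(seed)
    cursors = [0] * len(streams)
    pending = [i for i, s in enumerate(streams) if s]
    blended = []
    while pending:
        i = rng.choice(pending)
        blended.append(streams[i][cursors[i]])
        cursors[i] += 1
        if cursors[i] == len(streams[i]):
            pending.remove(i)
    return blended


def sequential_fraction(blocks):
    """Share of requests whose address immediately follows the previous one."""
    if len(blocks) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur == prev + 1)
    return hits / (len(blocks) - 1)


# Eight VMs, each in a region far enough apart that streams never touch.
streams = [vm_stream(vm * 100_000, 1_000) for vm in range(8)]
single = sequential_fraction(streams[0])
mixed = sequential_fraction(blend(streams))
print(f"one VM alone:      {single:.0%} sequential")
print(f"eight VMs blended: {mixed:.0%} sequential")
```

Each VM on its own is perfectly sequential, yet once eight equally busy VMs are blended, only on the order of one request in eight still follows its predecessor at the array. That is the pattern a sequentially tuned array struggles with.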
What does the IO Blender affect?
The IO Blender effect touches a number of infrastructure components and can degrade system performance:
1) Insufficient controller cache - Storage controllers are configured with specific amounts of cache that help give breathing room and some level of workspace for incoming/outgoing data. The unpredictable nature of IO Blender patterns and concurrent access to the same resources can overwhelm controller caches and result in degraded performance with added read/write latencies.
2) Insufficient SSD cache - SSD devices are being touted as the best thing since sliced bread, and their impact is indeed significant. However, so is their cost. Modern storage systems can utilize SSDs as a caching mechanism. Due to cost, though, many environments can only afford the bare minimum for caching functionality. The IO Blender effect can exhaust the added buffer that SSDs provide in SAN caching. Due to the random IO patterns and concurrent access, a SAN cache algorithm may have a difficult time determining which data is hot and which is cold for migration, or deciding when to write cached data to the spinning disks in the array, or it may only be able to keep a small amount of data from each of a number of systems in cache. Essentially, the purchase delivers a questionable return for the environment.
3) Read/write scheduling in arrays - Storage controllers are responsible for coordinating read operations from the arrays and write operations to them. Depending on the disk configuration and array configuration, scheduling and performing these operations can have a significant impact on performance, especially when large numbers of requests arrive from disparate systems.
Additionally, consider the fact that traditional storage arrays are typically designed for handling sequential reads and writes. This design decision made sense at the time because it could be expected that inbound/outbound requests were for larger contiguous amounts of data. The array could still handle multiple requests at the same time (analogous to a mini IO Blender), but the full IO Blender effect completely negates any semblance of sequentiality. Sequential reads/writes… What’s that?!
4) VMDK file locking in block storage - In VMware-based environments that utilize block storage, a mechanism is in place to lock VMDK files prior to accessing them for IO operations. In a highly active environment, such as one impacted by the IO Blender effect, the hypervisor spends a great deal of time and effort locking and unlocking files. Plus, the SAN needs to perform multiple operations on cues from the hypervisor. All of the locking and unlocking can take a heavy toll on SAN and environmental performance.
5) LUN Alignment - Operating system disk access relies upon placing data in blocks/clusters on disk. SAN storage relies upon placing data on blocks in the array of disks. A condition can exist wherein the OS blocks do not line up with the blocks on SAN storage, either because the block sizes do not match (by some factor) or because the OS blocks start partway through an array block. This LUN misalignment can have a serious impact on performance. See the following post from Jason Boche (http://www.boche.net/blog/index.php/2009/03/20/storage-block-size-and-alignment/) for more details on LUN alignment.
Make note of Figure 1 and Figure 2 in particular: the resulting LUN misalignment can cause a 2x performance hit in the worst-case scenario, especially in high-transaction/small-request environments. This assumes a request is for data that is the same size as the storage blocks. Requesting a misaligned 2k block, for example, may result in 4k needing to be retrieved. A more realistic example may involve an extra one or two block requests from the array. This situation can definitely have a significant impact on all SAN consumers, including the other environments that contribute to the IO Blender.
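The arithmetic behind that 2x worst case can be sketched in a few lines of Python. The helper `array_blocks_touched` is a hypothetical name, and the 4 KB array block size and the 63-sector (32,256-byte) starting offset are assumptions chosen for illustration rather than values taken from the post linked above.

```python
def array_blocks_touched(offset, size, block_size):
    """Count the array blocks that a request of `size` bytes at `offset` spans."""
    if size <= 0 or block_size <= 0 or offset < 0:
        raise ValueError("offset must be >= 0; size and block_size must be > 0")
    first = offset // block_size
    last = (offset + size - 1) // block_size
    return last - first + 1


ARRAY_BLOCK = 4096          # assumed array block size in bytes
MISALIGNED_START = 63 * 512  # assumed partition offset: 63 sectors of 512 bytes

# A 4 KB request that starts on an array block boundary.
aligned = array_blocks_touched(0, 4096, ARRAY_BLOCK)
# The same 4 KB request shifted by the misaligned partition offset.
misaligned = array_blocks_touched(MISALIGNED_START, 4096, ARRAY_BLOCK)
# The 2k example from the text: a 2k request straddling two 2k array blocks.
two_k = array_blocks_touched(1024, 2048, 2048)

print(f"aligned 4 KB request:    {aligned} array block(s)")
print(f"misaligned 4 KB request: {misaligned} array block(s)")
print(f"misaligned 2k request:   {two_k} array block(s), {two_k * 2048} bytes read")
```

Under these assumptions the aligned request touches one array block and the misaligned one touches two, which is where the worst-case doubling comes from. Larger requests pay proportionally less: an 8 KB request at the same misaligned offset touches three blocks instead of two.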
Join us next week for part 2, which will address the future of the IO Blender.