Flash Revolution, Part 1: Disk-based Flash Products

Edward Lee

Architect

Flash is revolutionizing storage. The decades-long bottleneck in storage, the disk spindle, is being obliterated! A single commodity SSD is 400 times faster than a hard disk. In comparison, the speed of sound is "only" 250 times faster than walking! Moreover, flash will continue to scale with rapid improvements in semiconductor technology.

While flash provides extraordinarily high IOPS, it brings a whole new set of problems: write amplification, latency spikes, limited write endurance, and, last but not least, very high $/GB. Today, commodity MLC SSDs cost about $2/GB, approximately twenty times more than SATA hard disks. At that price, running many mainstream applications on SSD is too expensive. To leverage the high IOPS of flash while compensating for its high $/GB, flash storage systems employ a variety of techniques such as caching, tiering, and inline compression and deduplication.
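
To make the cost argument concrete, here is a rough sketch of how data reduction changes the effective price of flash. The $2/GB price and the twenty-fold gap to SATA disk come from the paragraph above; the reduction ratios and the function name `effective_cost_per_gb` are illustrative assumptions, not measurements or any vendor's method.

```python
# Raw prices taken from the article: MLC SSD at about $2/GB,
# SATA hard disk roughly twenty times cheaper.
SSD_COST_PER_GB = 2.00
HDD_COST_PER_GB = SSD_COST_PER_GB / 20  # about $0.10/GB


def effective_cost_per_gb(raw_cost_per_gb: float, reduction_ratio: float) -> float:
    """Cost per GB of logical data when each physical GB holds
    `reduction_ratio` GB of logical data after compression and dedupe."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_gb / reduction_ratio


if __name__ == "__main__":
    # Hypothetical data-reduction ratios, chosen only for illustration.
    for ratio in (1.0, 2.0, 4.0):
        cost = effective_cost_per_gb(SSD_COST_PER_GB, ratio)
        print(f"{ratio:.0f}:1 reduction -> ${cost:.2f}/GB effective, "
              f"{cost / HDD_COST_PER_GB:.0f}x the raw HDD price")
```

Under these assumptions, even a 4:1 reduction leaves flash at about five times the raw price of disk, which is why data reduction is usually combined with caching or tiering rather than used alone.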

Existing storage vendors and new entrants have attempted to exploit flash in different ways. These products can be grouped into two broad categories, based on their impact on latency:

  • Disk-based products with flash as a cache.
  • Flash-based products (with or without HDDs for expanded capacity).

Figure 1: Flash-as-cache architectures still suffer from high latency because writes must be committed to disk.

As seen in Figure 1, disk-based products are fundamentally designed to optimize the use of hard disk drives, with flash bolted on as a cache to accelerate read performance. Flash as a cache is relatively easy to implement, so it is not surprising that legacy storage vendors have taken this path. Many flash-as-cache implementations are non-persistent and non-redundant, so performance plummets after crashes or component failures. Since the “master” copy remains on hard disk, reads benefit from flash, but writes do not. Therefore, overall performance will not scale directly with improvements in flash technology.
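
The behavior described above can be illustrated with a toy model of a write-through, non-persistent flash cache. This is a minimal sketch, not any vendor's implementation: the class name `WriteThroughCache` and the latency values (0.5 ms for flash, 10 ms for disk) are assumptions chosen to fall within the ranges quoted in this article.

```python
class WriteThroughCache:
    """Toy model: disk holds the master copy, flash is a volatile read cache."""

    def __init__(self, flash_ms: float = 0.5, disk_ms: float = 10.0):
        self.flash_ms = flash_ms  # assumed flash access latency
        self.disk_ms = disk_ms    # assumed disk access latency
        self.disk = {}            # master copy of every block
        self.flash = {}           # non-persistent cache contents

    def write(self, key, value) -> float:
        """Commit to disk before acknowledging; returns latency in ms."""
        self.disk[key] = value
        self.flash[key] = value   # keep the cached copy current
        return self.disk_ms       # the write always pays disk latency

    def read(self, key):
        """Return (value, latency_ms); a miss goes to disk and warms the cache."""
        if key in self.flash:
            return self.flash[key], self.flash_ms
        value = self.disk[key]
        self.flash[key] = value
        return value, self.disk_ms

    def crash(self) -> None:
        """Simulate a restart: the cache is lost, the master copy survives."""
        self.flash.clear()


if __name__ == "__main__":
    cache = WriteThroughCache()
    print("write latency:", cache.write("block-1", b"data"), "ms")
    print("warm read:", cache.read("block-1")[1], "ms")
    cache.crash()
    print("read after crash:", cache.read("block-1")[1], "ms")
```

In this model a faster flash device lowers only the warm-read latency; write latency and post-crash read latency remain tied to the disk.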

Disk-based architectures can achieve, at best, about 5-15 ms latency, even with flash added to the system. Flash-based architectures can consistently deliver sub-millisecond latency.

Figure 2: Disk-based architectures suffer from high latency even with flash added to them.

Figure 2 shows the impact of flash hit rate on latency. Hit rates of 50% are typical for disk-based products with flash as a cache. Even with hit rates as high as 67%, average read latencies are ten times higher than those of flash-based products.
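
The relationship between hit rate and latency can be approximated with a simple weighted average. The sketch below is an illustration under stated assumptions rather than the model behind Figure 2: the function name `average_read_latency_ms` is hypothetical, the 0.5 ms flash latency is an assumed sub-millisecond value, and the disk latencies span the 5-15 ms range quoted above.

```python
def average_read_latency_ms(hit_rate: float, flash_ms: float, disk_ms: float) -> float:
    """Expected read latency when a fraction `hit_rate` of reads is served
    from flash and the remainder falls through to disk."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * flash_ms + (1.0 - hit_rate) * disk_ms


if __name__ == "__main__":
    FLASH_MS = 0.5  # assumed sub-millisecond flash latency
    for disk_ms in (5.0, 10.0, 15.0):   # range quoted in the article
        for hit_rate in (0.50, 0.67):   # hit rates discussed in the article
            avg = average_read_latency_ms(hit_rate, FLASH_MS, disk_ms)
            print(f"disk={disk_ms:>4.1f} ms  hit rate={hit_rate:.0%}  "
                  f"average={avg:.2f} ms  ({avg / FLASH_MS:.1f}x flash)")
```

The miss term dominates the average: at a 67% hit rate, a third of all reads still pay full disk latency, so the average stays well above the sub-millisecond latency of flash for every disk latency in this range.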

Next up in the series: Flash Revolution, Part 2: Flash-based Products.