Different Approaches for Delivering Ultra-Low Latency Shared Storage

Several new storage systems have come to market with the goal of delivering shared flash resources as a service to high-scale, distributed applications.

These products take advantage of some of the following technology developments in the storage and networking space: standards-based PCIe-connected SSDs, RDMA-capable Ethernet networking at up to 100 GbE, a standard storage protocol designed for PCIe-connected SSDs (NVMe), and a standard protocol for remotely accessing NVMe devices (NVMe over Fabrics, or NVMe-oF). Note that Red Hat 7.4 and Ubuntu 16 both now include inbox NVMe-oF support.

Together, these four technologies can give applications performance from shared, networked storage similar to that of SSDs installed directly in the server as DAS, while retaining the capacity and data management advantages of networked storage. As a result, a new wave of ultra-low-latency shared storage products is coming to market, each trying to deliver on that promise.

Different approaches use varying combinations of the four low-latency technologies above and therefore come with different benefits and drawbacks.

Many challenges arise when combining high-speed, low-latency networking on the front end of a storage system with very fast media on the back end. The primary problem is that the traditional storage controller architecture found in the storage arrays of the past couple of decades will be overwhelmed and become a huge performance bottleneck. This includes the current-generation controllers in all-flash arrays. Simply upgrading the network interfaces and protocols, or swapping out SAS or SATA SSDs for NVMe SSDs, will squander much of the performance gain these new technologies could deliver, and will therefore not satisfy the requirements of high-scale application environments that currently use DAS.
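The bottleneck argument can be made concrete with a rough per-I/O latency budget. The sketch below simply adds up the time spent in the media, the network fabric, and the storage controller, then reports what fraction of the total the controller accounts for. Every figure in it is a hypothetical placeholder chosen for illustration, not a measurement of any product, and the function names are invented for this example.

```python
# Illustrative per-I/O latency budget.
# All numbers are assumed placeholders, not measurements of any product.

def end_to_end_latency_us(media_us, fabric_us, controller_us):
    """Sum the latency contributions along the data path (microseconds)."""
    return media_us + fabric_us + controller_us


def controller_share(media_us, fabric_us, controller_us):
    """Fraction of the total per-I/O latency spent in the storage controller."""
    total = end_to_end_latency_us(media_us, fabric_us, controller_us)
    return controller_us / total


# Hypothetical data paths: same fast media and fabric, different controllers.
legacy = dict(media_us=100.0, fabric_us=10.0, controller_us=400.0)
redesigned = dict(media_us=100.0, fabric_us=10.0, controller_us=20.0)

for name, path in (("legacy controller", legacy),
                   ("redesigned controller", redesigned)):
    total = end_to_end_latency_us(**path)
    share = controller_share(**path)
    print(f"{name}: {total:.0f} us total, {share:.0%} spent in the controller")
```

Under these assumed numbers, faster media and networking barely move the total while the controller term dominates, which is the sense in which a legacy controller squanders the gains.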

To alleviate this, most of these new products use the host tier to scale performance, deploying a custom software stack on the application servers to access shared storage resources. This mirrors how HDD-based shared storage developed over time. At first, there were JBODs with simple controllers and no real data management features. Vendors then started building host-based software, in the form of clustered file systems and volume managers, to manage shared storage. Finally, storage vendors moved data management directly into the storage controllers, which is how most shared storage is deployed today.

We are seeing a similar progression in shared NVMe-based storage products. Early products implement storage management features in the host tier to deliver performance and avoid the controller bottleneck. However, this means that customers need to install custom software on all of their servers to take advantage of these products. They lose the advantage of using an inbox NVMe-oF driver in their low-latency environment, and the custom software consumes CPU and memory resources in the host tier that applications could otherwise use.

So, how do you build an ultra-low-latency shared storage system that is disaggregated from the application tier and self-contained like traditional enterprise arrays, AND that delivers all of the benefits of the four technologies described above?

The Pavilion Memory Array is the only product that has accomplished this feat, and doing so required rethinking storage array controller hardware and software design from the ground up. The team at Pavilion started working on this more than three years ago, and the product you see today is the result of this fresh approach.