Normalizing Data Storage Performance Reporting Across Vendors for Customer Clarity

There is currently no common approach in the data storage industry to normalizing vendor performance across the most meaningful vectors. Blocks & Files editor Chris Mellor highlights this issue in a recent article entitled “Storage product comparisons on the rack,” where he calls out how challenging it is for customers to determine real-world performance expectations across suppliers. Mr. Mellor also suggests an approach to normalizing vendor performance.

What customers see today is storage providers publishing hero numbers and selected real-world benchmarks that best showcase their results, rather than following universal industry definitions and benchmarking tools that would provide greater deployment clarity.

As a result, vendors typically publish just one or two numbers that “they are proud of”: maybe IOPS, maybe throughput, maybe latency. And most of the time those figures are for READ only, rather than a full set of numbers for both READ and WRITE performance across all dimensions.

If the storage industry were to report performance numbers in a consistent and normalized way, customers could trust the claims. Just as the Truth in Lending Act of 1968 brought standardized disclosure to lending, customers, analysts, and the press should hold the data storage industry accountable in 2021 to “truth in performance” reporting.

The Reality:

Given that the industry does not currently publish numbers in this way, customers are left to find ways to compare all the different solutions so that they can determine the right product for their environment.  

Building upon Chris Mellor’s article, an apples-to-apples comparison would be quite easy to construct, yet, as mentioned, not all of the data is publicly shared.

This means customers must ask vendors to share the following set of standard metrics across all dimensions of data storage performance so they can do their own analysis. The dimensions should be measured with a common tool and at common block/file/object sizes (a simple structure for recording these metrics follows the list):

  1. IOPS for Block Storage: small-block I/O (typically 4K), measured in IOPS with READ/WRITE results and latencies for each.
  2. Throughput for Block Storage: large-block I/O (typically 1MB), measured as throughput in GB/sec for both READ and WRITE operations.
  3. File Storage: measured by throughput for large files (1MB) and small files (4K), showing READ/WRITE GB/sec and IOPS.
  4. Object Storage: measured on 1MB objects with READ/WRITE throughput.
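As a minimal sketch of how a customer might record these four dimensions side by side, the structure below keeps READ and WRITE results, units, and latencies together for each vendor (the field names are our own illustration, not an industry standard):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DimensionResult:
    """One performance dimension for one vendor (illustrative field names)."""
    read_value: float                        # READ result (IOPS or GB/sec)
    write_value: float                       # WRITE result (IOPS or GB/sec)
    unit: str                                # "IOPS" or "GB/sec"
    read_latency_ms: Optional[float] = None  # latency, where applicable
    write_latency_ms: Optional[float] = None

@dataclass
class VendorResults:
    vendor: str
    tool: str                   # record the benchmarking tool and version used
    block_4k: DimensionResult   # dimension 1: small-block (4K) IOPS
    block_1m: DimensionResult   # dimension 2: large-block (1MB) throughput
    file_1m: DimensionResult    # dimension 3: large-file (1MB) throughput
    file_4k: DimensionResult    # dimension 3: small-file (4K) results
    object_1m: DimensionResult  # dimension 4: 1MB object throughput
```

Once every vendor’s numbers are captured in the same shape, gaps such as a missing WRITE figure or an unreported latency become immediately visible.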

Customers armed with this information will be far better equipped to make the right decision for their environment, while analysts and consultants will find new ways to add value for their clients by providing this type of insight. Pulling the information together will yield important insights for any organization.

Of course, it would be easier if all vendors simply shared this data publicly; we presume Chris Mellor would then update his article to show all competitors’ numbers.

Note: Pavilion has been providing customers with tools to evaluate vendor performance, and there are a few key points we’ve learned over the years. There are many different benchmarking tools, and any “generic numbers” should include a reference to the tool that produced them. To compare apples to apples, each vendor needs to publish numbers using the same methodology. Two of the best tools are the Flexible IO Tester (fio) and gdsio for NVIDIA Magnum IO GPUDirect Storage.
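As a hedged illustration of “same methodology,” the sketch below runs one of the dimensions above (4K random READ) with fio and pulls the headline numbers from its JSON output. The target file path, queue depth, and job count are illustrative choices, not a standard, and the JSON field names can vary slightly across fio versions:

```python
import json
import subprocess

# Illustrative 4K random READ job; run the identical command against each vendor's storage.
cmd = [
    "fio",
    "--name=randread-4k",
    "--filename=/mnt/test/fio.dat",  # test file on the storage under evaluation (illustrative path)
    "--size=10G",
    "--rw=randread",
    "--bs=4k",
    "--ioengine=libaio",
    "--direct=1",
    "--iodepth=32",
    "--numjobs=8",
    "--runtime=60",
    "--time_based",
    "--group_reporting",
    "--output-format=json",
]

out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
job = json.loads(out)["jobs"][0]

# Field names follow fio's JSON output; exact keys may differ by fio version.
print("READ IOPS:", job["read"]["iops"])
print("READ bandwidth (KiB/s):", job["read"]["bw"])
```

Repeating the same job for WRITE and for a 1MB block size fills in the remaining block-storage dimensions.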

The Bottom Line:

Vendors should present results using a set of common denominators. In his article, Chris Mellor suggests using the most common unit of measure in a modern data center: the rack unit (RU). In the table below, Mr. Mellor breaks out vendor-published numbers, and it is easy to see where gaps exist for “truth in performance” reporting.

Source: Blocks & Files, “Storage product comparisons on the rack,” April 8, 2021

Mr. Mellor’s analysis above specifically covers Block storage performance, so what about File and Object? To move this conversation forward, we’ve taken his approach and expanded it to cover these as well. (More to come on that in a minute.)

For the industry, performance density per RU is the most sensible common denominator, with storage capacity per RU as a secondary measure. While this blog focuses on performance, capacity density per RU is an important and complementary metric that is simple to measure and incredibly insightful when evaluating the architecture of any storage solution. Customers can also ask vendors to share their prices to create a metric we call “cost density”.

Why the Metrics Matter:

As an example, there are vendors who claim 350GB/sec of throughput for 1MB file performance, yet the equipment required to do this occupies nearly two entire racks, or 84 rack units (roughly 4.2GB/sec per RU). So normalizing to the RU matters.

To determine the amount of performance per RU, simply divide the published performance by the rack units consumed by the solution tested. For example (a short calculation sketch follows these two examples):

  • If a vendor has a solution that delivers 40GB/sec in 4RU, then their performance per RU is 10GB/sec.
  • If another vendor publishes 300GB/sec of throughput delivered in 48RU, then their per-RU performance is only 6.25GB/sec.
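The same division, as a short sketch using the two examples above:

```python
def performance_per_ru(throughput_gb_per_sec: float, rack_units: float) -> float:
    """Performance density: published throughput divided by the rack units consumed."""
    return throughput_gb_per_sec / rack_units

print(performance_per_ru(40, 4))    # 10.0  GB/sec per RU
print(performance_per_ru(300, 48))  # 6.25  GB/sec per RU
```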

As another example, an object vendor publishes 1.372Tb/sec (bits) WRITE performance. As you drill into the metrics, it is actually:

  • 171GB/sec (bytes)
  • It took 352 HPE servers at 2RU each to deliver it
  • That equates to 704 rack units

So the WRITE performance density per rack unit in this case is 171GB/sec ÷ 704 RU ≈ 0.24 GB/sec per RU.
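Worked through as a short sketch (decimal units are assumed for the bit-to-byte conversion):

```python
published_tbits_per_sec = 1.372        # published WRITE throughput, in terabits/sec
servers, ru_per_server = 352, 2        # server count and rack units per server

gb_per_sec = published_tbits_per_sec * 1000 / 8  # terabits -> gigabytes: ~171.5 GB/sec
total_ru = servers * ru_per_server               # 704 RU
print(gb_per_sec / total_ru)                     # ~0.24 GB/sec per RU
```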

By standardizing on the RU, customers can get a clear picture of the performance density they can expect from a given product. 

When evaluating performance density, results should include all elements of the solution.  For example:

  • For Appliance-based solutions: the number of enclosures/servers required, plus the supporting network equipment needed to interconnect those devices.
  • For Software-Defined solutions: the number of servers, enclosures, and networking devices required to enable the storage solution.
  • For any solution: which services were included, which high-availability and/or data protection schemes were applied (such as RAID 6, RAID 10, or erasure coding), and the resulting percentage of usable capacity.

A conversation on apples-to-apples metrics comparisons would not be complete without a discussion of latency. We’ve already covered that topic in another blog, “Truth and Accuracy in Reporting: The True Measure of Latency,” and additional thoughts will be forthcoming.

The Desired Outcome:

When customers hold the industry accountable for apples-to-apples comparisons, everyone wins. Customers win because they will have a roadmap for making the best decisions for their environments. Vendors win because clarity and transparency spur innovation.

A Tool for Customers To Use:

To assist in this process, we have taken the liberty of building upon the tool published in the Blocks & Files article to also include File and Object.

Customers can use the downloadable spreadsheet tool and watch this video on how to use it to calculate comparisons.

The tool also generates table results and visual graphs for review. Once populated, it normalizes storage vendor performance so that customers can document their own apples-to-apples comparisons until the industry does it for them. Download the tool.

If you have any questions about how to use the tool, or want to understand more about how the most performant, dense, scalable, and flexible storage platform in the universe can help you and your organization shatter expectations, simply reach out to Pavilion at info@pavilion.io; we’d be happy to assist you.

Analysis by:

Costa Hasapopoulos