Why TCP/RoCE is the Right Way to Build your NVMe-oF Storage Fabric

For over two decades fibre channel (FC) has been the interconnect of choice for fabric attached block storage. Offering a secure, lossless transport with good performance, fibre channel has been deployed in almost every data center in the world. 

And when you are responsible for vast amounts of organizational data and have to meet uptime and performance requirements, using a proven interconnect that provides time tested security and reliability would seem to make a lot of sense. Fibre channel, as that proven interconnect, has been the de facto storage network of the data center for decades. Much like your grandfather’s Oldsmobile, it is safe, secure, and reliable. 

It’s time to trade in that Oldsmobile

Storage administrators are familiar with fibre channel. For all its ubiquity as a data transport, ethernet is not as familiar or trusted in the world of storage. Ethernet has been the domain of the network administrator, and in the data center, roles and groups are often well defined with limited crossover. As a result, many storage administrators have had little hands-on experience with ethernet and little chance to become comfortable with it.

As the Tesla of storage networking, NVMe-oF is set to change that. While NVMe-oF can use either fibre channel or ethernet as a transport, the advantages of TCP/RoCE over FC are clear. When compared to NVMe over Fibre Channel, NVMe over TCP/RoCE offers higher performance, lower latency, significantly reduced costs, greater ease of use, and a more clearly defined roadmap for the future. You can read more about these advantages here.

Still, for all the benefits of moving to NVMe over TCP, storage professionals want to be sure that it can provide the performance they need, along with the security and reliability that were hallmarks of FC. Fortunately, all of these concerns can be easily addressed. Configuring a storage network to use NVMe-oF over TCP/RoCE is simple, secure, and reliable. Like a race car with airbags, it is both fast and safe. 

Get your passengers there safely

Converged ethernet provides a lossless transport by eliminating the packet drops caused by queue overload. This addresses the worry, common in the early days of iSCSI, that ethernet allows packets to be dropped.

Have the road to yourself

Fibre channel runs on a physical network separate from the data network, so storage traffic and general network traffic have always been isolated from each other. This separation can be maintained on ethernet by following the best practice of placing storage traffic and data traffic on different switches. When this cannot be done, separate VLANs can provide the same isolation logically.

The Road Ahead

With 10, 25, and even 100GbE solutions commonly deployed, along with 200 and 400GbE products now available, ethernet offers significantly greater bandwidth than fibre channel. Fibre channel does not currently have a clear roadmap beyond 128 GFC. Even if a next generation were to become available, doubling performance, fibre channel would still lag far behind ethernet in terms of available bandwidth. 
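The bandwidth argument above comes down to simple arithmetic. The sketch below treats the generation names as nominal per-port rates in Gb/s, which is an assumption that ignores encoding overhead and real-world throughput. The doubled fibre channel figure is hypothetical and stands in for the "next generation, doubling performance" case.

```python
# Nominal per-port rates in Gb/s, taken from the generation names in the text.
# Assumption: the labels are treated as raw rates; encoding overhead is ignored.
ETHERNET_GBPS = [10, 25, 100, 200, 400]
FC_CURRENT_GBPS = 128
FC_DOUBLED_GBPS = FC_CURRENT_GBPS * 2  # hypothetical next generation

fastest_ethernet = max(ETHERNET_GBPS)


def ratio(ethernet_gbps: int, fc_gbps: int) -> float:
    """Return how many times faster the ethernet port is than the FC port."""
    return ethernet_gbps / fc_gbps


print(f"{fastest_ethernet}GbE vs {FC_CURRENT_GBPS}GFC: "
      f"{ratio(fastest_ethernet, FC_CURRENT_GBPS):.2f}x")
print(f"{fastest_ethernet}GbE vs a doubled FC generation: "
      f"{ratio(fastest_ethernet, FC_DOUBLED_GBPS):.2f}x")
```

Under these nominal figures, the fastest ethernet port still comes out ahead even after the hypothetical doubling.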

Test Drive: Building Your Network

When using NVMe over TCP, servers can be configured with any standard 10GbE or faster NIC, such as those from Nvidia (Mellanox), Intel, QLogic, Broadcom and many others. Switch choices include those offered by Juniper, Cisco, Arista, Nvidia (Mellanox), Dell, HPE and others that support 25GbE or higher. As a best practice, the storage should be on a separate network from the data. Where that is not possible, logical separation of the storage from the data network can be achieved using VLANs.
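On a Linux host, attaching to an NVMe over TCP target is typically done with the nvme-cli tool, which the article itself does not mention. The sketch below only builds the command line rather than running it. The target address and subsystem NQN are hypothetical placeholders, and port 4420 is assumed as the service ID; check your array's documentation for the actual values.

```python
import shlex
from typing import List


def build_nvme_connect(traddr: str, nqn: str,
                       transport: str = "tcp",
                       trsvcid: str = "4420") -> List[str]:
    """Build (but do not run) an nvme-cli connect command line."""
    if transport not in ("tcp", "rdma"):
        raise ValueError(f"unsupported transport: {transport}")
    return [
        "nvme", "connect",
        "--transport", transport,  # "tcp" here; "rdma" is used for RoCE
        "--traddr", traddr,        # IP address of the target port
        "--trsvcid", trsvcid,      # assumed service ID (port)
        "--nqn", nqn,              # subsystem NQN exported by the target
    ]


# Hypothetical target address and subsystem NQN, for illustration only.
cmd = build_nvme_connect("192.0.2.10", "nqn.2014-08.org.example:subsystem1")
print(shlex.join(cmd))
```

Building the argument list separately keeps the example safe to run on any machine; passing it to `subprocess.run` on a configured host would perform the actual connection.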

When using NVMe over RoCE the requirements are similar, but there are a couple of things to note. First, the servers will need NICs that support RDMA, such as those from Nvidia (Mellanox) or QLogic. Second, the fabric should be 25GbE or faster. It will also need to support RoCE v2 and, depending on the configuration, flow control for RDMA will need to be enabled.
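The requirements listed above for both transports can be captured as a small pre-deployment checklist. This is a minimal sketch: the `HostLink` type, its field names, and the `check_requirements` function are all hypothetical, and the thresholds are simply the ones stated in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostLink:
    """Hypothetical description of one server's path to the storage fabric."""
    nic_gbe: int          # NIC speed in GbE
    nic_rdma: bool        # NIC supports RDMA
    fabric_gbe: int       # switch fabric speed in GbE
    fabric_rocev2: bool   # fabric supports RoCE v2


def check_requirements(transport: str, link: HostLink) -> List[str]:
    """Return a list of unmet requirements (empty means the link qualifies)."""
    problems = []
    if transport == "tcp":
        if link.nic_gbe < 10:
            problems.append("NVMe over TCP needs a 10GbE or faster NIC")
    elif transport == "roce":
        if not link.nic_rdma:
            problems.append("NVMe over RoCE needs an RDMA-capable NIC")
        if link.fabric_gbe < 25:
            problems.append("NVMe over RoCE needs a 25GbE or faster fabric")
        if not link.fabric_rocev2:
            problems.append("NVMe over RoCE needs a fabric that supports RoCE v2")
    else:
        raise ValueError(f"unknown transport: {transport}")
    return problems


# Example: a 10GbE NIC without RDMA qualifies for TCP but not for RoCE.
link = HostLink(nic_gbe=10, nic_rdma=False, fabric_gbe=10, fabric_rocev2=False)
print(check_requirements("tcp", link))
print(check_requirements("roce", link))
```

Flow control for RDMA is left out of the check because the text says it depends on the configuration.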

Of course, to take advantage of your high speed network, you need a storage solution that can deliver the performance you need. The last thing you want is to take that old station wagon out on the race track. 

The Pavilion Hyperparallel Flash Array (HFA) provides great configuration flexibility. The HFA can support up to 40 100GbE ports, unlike legacy all flash arrays (AFAs), which typically have only four or eight ports. This allows the HFA to be deployed in direct connect mode when used in rackscale configurations. In this configuration, each server connects directly to a controller port on the HFA, eliminating the need for a switch, and servers can use multiple ports to connect to different controller ports for high availability. Alternatively, switches can be used either to enable full hardware and network path HA or to interconnect with other racks, making the rack a unit of failure.
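To make the port counts above concrete, the sketch below estimates how many servers could be direct-connected for a given number of ports per server. The function name is hypothetical, and the calculation assumes every port is available for host connections, which a real deployment may not allow.

```python
HFA_PORTS = 40    # maximum 100GbE ports stated in the text
PORT_GBPS = 100   # nominal rate per port


def direct_connect_servers(ports_per_server: int,
                           total_ports: int = HFA_PORTS) -> int:
    """Servers that fit when each one uses `ports_per_server` array ports."""
    if ports_per_server < 1:
        raise ValueError("each server needs at least one port")
    return total_ports // ports_per_server


for paths in (1, 2, 4):
    servers = direct_connect_servers(paths)
    print(f"{paths} port(s) per server: up to {servers} servers, "
          f"{paths * PORT_GBPS} Gb/s nominal per server")
```

The same arithmetic applied to a legacy array with eight ports leaves room for only four dual-connected servers, which is why switches are usually required there.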

The Right Fuel

The network can also be configured with any number of servers connected to any number of arrays. Pavilion HFAs have been deployed into production environments with thousands of server connections to multiple arrays. In such a leaf-spine deployment, servers can be configured to connect to multiple switches for a full HA solution, and flow control for RDMA should be enabled to ensure performance across the fabric.

The Complete Package

The net result is that NVMe-oF over TCP/RoCE can deliver far greater throughput with far lower latency than NVMe over Fibre Channel, while being simpler to configure and manage. Modern ethernet hardware that supports TCP and RoCE is broadly available and has demonstrated the performance, reliability and security that storage networks require.

For more information on how NVMe over TCP compares to NVMe over Fibre Channel, read our performance blog.