Faster Data Center Transfers with InfiniBand Network Block Device


The storage team of ProfitBricks has been looking for a way to speed transfers between VMs on compute nodes and physical devices on storage servers, connected via InfiniBand, in their data centers. As a solution, they developed the IBNBD driver, which presents itself as a block device on the client side and transmits the block requests to the server side, according to Danil Kipnis, Software Developer at ProfitBricks GmbH.

“Any application requiring block IO transfer over an InfiniBand network can benefit from the IBNBD driver,” says Kipnis.

Danil Kipnis, Software Developer at ProfitBricks GmbH
In his presentation at the upcoming Vault conference, Kipnis will describe the design of the driver and discuss its application in cloud infrastructure. We spoke with Kipnis to get a preview of his talk.

Linux.com: Please give our readers a brief overview of the IBNBD driver project.

Danil Kipnis: IBNBD (InfiniBand network block device) allows for RDMA transfer of block IO over an InfiniBand network. The driver presents itself as a block device on the client side and transmits the block requests in a zero-copy fashion to the server side via InfiniBand. The server part of the driver converts the incoming buffers back into BIOs and hands them down to the underlying block device. As soon as IO responses come back from the drive, they are transmitted back to the client.
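To make the client side concrete: once the driver is loaded, the remote disk appears as an ordinary block device, so any program can issue IO against it with standard system calls. The minimal C sketch below is not taken from the project; it assumes a hypothetical device node /dev/ibnbd0 and reads one block with O_DIRECT, so the request travels through the block layer and, via the IBNBD client, over RDMA to the server.

#define _GNU_SOURCE   /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK_SIZE 4096

int main(void)
{
    void *buf = NULL;
    int fd;

    /* O_DIRECT bypasses the page cache, so the read goes straight to
     * the block layer and from there over the network to the server. */
    fd = open("/dev/ibnbd0", O_RDWR | O_DIRECT);   /* hypothetical device node */
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* O_DIRECT buffers must be aligned to the device's logical block size. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0) {
        fprintf(stderr, "posix_memalign failed\n");
        close(fd);
        return EXIT_FAILURE;
    }

    /* Read the first block of the remote device. */
    if (pread(fd, buf, BLOCK_SIZE, 0) != BLOCK_SIZE)
        perror("pread");
    else
        printf("read %d bytes from the remote block device\n", BLOCK_SIZE);

    free(buf);
    close(fd);
    return EXIT_SUCCESS;
}

From the application's point of view, nothing here is InfiniBand-specific; that transparency is the point of exposing the transfer as a plain block device.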

Linux.com: What has motivated your work in this area? What problem(s) are you aiming to solve?

Kipnis: ProfitBricks is an IaaS company. Internally, our data centers consist of compute nodes (where customer VMs are running) and storage servers (where the hard drives are) connected via an InfiniBand network. The storage team at ProfitBricks has been looking for a fast way to transfer customer IOs from a VM on a compute node to the physical device on the storage server. We developed the driver in order to take advantage of the high bandwidth and low latency of InfiniBand RDMA for IO transfer without introducing the overhead of an intermediate transport protocol layer.

Linux.com: Are there existing solutions? How do they differ?

Kipnis: The SRP driver serves the same purpose while using SCSI as an intermediate protocol. The same goes for iSER. A very similar project to ours is accelio/nbdx by Mellanox. It differs from IBNBD in that it operates in user space on the server side, and, to the best of my knowledge, its development is currently on hold in favor of NVMe over Fabrics. While NVMe-oF solutions do simplify the overall storage stack, they also sacrifice flexibility on the storage side, which can be required in a distributed replication approach.

Linux.com: What applications are likely to benefit most from the approach you describe?  

Kipnis: Any application requiring block IO transfer over an InfiniBand network can benefit from the IBNBD driver. The most obvious area is the cloud context, where customer volumes are scattered across a server cluster. Here, one often wants to start a VM on one machine and then attach to it a block device that physically resides on a different machine.

Linux.com: What further work are you focusing on?

Kipnis: Currently, we are working on integrating the IBNBD driver into a new replication solution for our data centers. There, we want to take advantage of the InfiniBand multicast feature as a way to deliver IOs to the different legs of a RAID setup. This would require, among other things, extending the driver with a “reliable multicast” feature.

Interested in attending the Vault conference? Linux.com readers can register now with the discount code LINUXRD5 to save $35 on the attendee registration price.