Node.js Buffer API Changes

By James M Snell

I recently landed Node.js pull request https://github.com/nodejs/node/pull/4682 into the master branch. It’s important to understand what it does and what changes are in store for the upcoming Node.js v6 release.

The Node.js Buffer API is one of the most extensively used APIs in Node. There is, however, a challenge. About three months ago, a discussion started around the ambiguities and usability issues that arise when using the “Buffer()” constructor to create new Buffer instances. Take the following example:

const jsonString = getJsonStringSomehow();
const myBuffer = Buffer(JSON.parse(jsonString));

Many developers may not be aware that it is possible to generate a JSON representation of a Buffer instance, pass that around, and create a new Buffer from that JSON string. For instance, given the following Buffer instance:

Buffer([1,2,3])

The JSON representation is:

{"type":"Buffer","data":[1,2,3]}
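
As a rough illustration of that round trip, the sketch below serializes a Buffer and rebuilds it from the parsed JSON. It creates the Buffer with Buffer.from([1, 2, 3]) rather than Buffer([1,2,3]); that substitution is an assumption made here so the snippet runs cleanly on Node.js versions that provide Buffer.from().

```javascript
// Serialize a Buffer to JSON and rebuild an equal Buffer from the result.
const original = Buffer.from([1, 2, 3]);

// JSON.stringify() calls the Buffer's toJSON() method.
const json = JSON.stringify(original);
console.log(json); // {"type":"Buffer","data":[1,2,3]}

// Rebuild from the parsed "data" array, which holds the raw byte values.
const parsed = JSON.parse(json);
const restored = Buffer.from(parsed.data);
console.log(restored.equals(original)); // true
```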

The example code above essentially takes that JSON string, parses it, and passes it off to the Buffer constructor to create the new Buffer. Easy. But there’s a problem. What happens if the jsonString happens to simply be a number instead of the actual JSON representation of the Buffer?

const jsonString = '100';
const myBuffer = Buffer(JSON.parse(jsonString));

What many developers do not realize is that JSON.parse(jsonString) will successfully parse this input as a JSON Number. Passing that number into the Buffer constructor causes a Buffer of that many bytes to be created by allocating new, uninitialized memory. This uninitialized memory can contain potentially sensitive data that can end up being leaked if not handled appropriately.
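
One way to guard against this is to check the shape of the parsed value before it ever reaches a Buffer constructor. The following is only a minimal sketch: extractBufferBytes is a hypothetical helper name, not part of Node.js, and the error message is illustrative.

```javascript
// Hypothetical helper (not a Node.js API): accept only the JSON shape that a
// Buffer serializes to, and reject everything else, including bare numbers.
function extractBufferBytes(jsonString) {
  const parsed = JSON.parse(jsonString);
  const looksLikeBuffer =
    parsed !== null &&
    typeof parsed === 'object' &&
    parsed.type === 'Buffer' &&
    Array.isArray(parsed.data);
  if (!looksLikeBuffer) {
    throw new TypeError('Expected the JSON representation of a Buffer');
  }
  return parsed.data; // plain array of byte values
}

console.log(extractBufferBytes('{"type":"Buffer","data":[1,2,3]}')); // [ 1, 2, 3 ]

try {
  extractBufferBytes('100'); // parses as the number 100, so it is rejected
} catch (err) {
  console.log(err.message); // Expected the JSON representation of a Buffer
}
```

The returned array can then be handed to whichever Buffer creation call the application uses, without any risk of a number being interpreted as a size.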

The behavior of the Buffer constructor has been well documented for quite some time. Even so, the fact that the constructor behaves significantly differently depending on the kind of value passed into it is a fundamental API usability issue. If not understood, it can lead a developer to inadvertently introduce significant bugs and vulnerabilities into their applications. To fix this, https://github.com/nodejs/node/pull/4682 introduces a number of new constructor methods used to create Buffer instances. The existing Buffer() constructor will continue to work, but starting with the upcoming release of Node.js v6, it is recommended that all developers begin migrating their code to use the new constructors.

// Allocate *initialized* memory. It will be zero-filled by default.
// The optional fill and encoding parameters can be used to specify
// an alternate fill value.
Buffer.alloc(size[, fill[, encoding]])
// Allocate *uninitialized* memory
Buffer.allocUnsafe(size)
// Create a Buffer from a String, Array, Buffer, or ArrayBuffer
Buffer.from(str[, encoding])
Buffer.from(array)
Buffer.from(buffer)
Buffer.from(arrayBuffer[, offset[, length]])

The new Buffer.allocUnsafe(size) method is the direct replacement for the existing Buffer(size) constructor where size is the number of bytes to allocate for the newly created Buffer. It is important to understand that the memory allocated by this method is uninitialized and must be overwritten completely in order to avoid accidentally leaking data when the Buffer is read out.
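
A minimal sketch of that rule: allocate uninitialized memory only when every byte is about to be overwritten. The payload value and variable names are illustrative assumptions.

```javascript
// Allocate exactly as many bytes as the data needs, then overwrite all of them.
const payload = 'hello'; // illustrative data
const size = Buffer.byteLength(payload, 'utf8');

const buf = Buffer.allocUnsafe(size); // contents are arbitrary at this point
const written = buf.write(payload, 0, 'utf8'); // returns the number of bytes written

// Only read the Buffer out once every byte has been overwritten.
console.log(written === buf.length); // true
console.log(buf.toString('utf8'));   // hello
```

If the number of bytes written can be smaller than the allocation, the remainder still holds uninitialized memory and should be filled or sliced off before the Buffer is exposed.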

The new Buffer.alloc(size[, fill[, encoding]]) method, on the other hand, always allocates Buffer instances with initialized memory. If the fill parameter is left undefined, the instance will be zero-filled. If the fill parameter is provided, the new Buffer is automatically filled with that value. This is generally equivalent to the existing Buffer(size).fill(val) pattern.
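
The three call shapes can be sketched as follows; the sizes and fill values are arbitrary choices for illustration.

```javascript
// No fill argument: the memory is zero-filled.
const zeroed = Buffer.alloc(4);
console.log(zeroed); // <Buffer 00 00 00 00>

// Numeric fill: every byte is set to that value.
const ones = Buffer.alloc(4, 1);
console.log(ones); // <Buffer 01 01 01 01>

// String fill with an encoding: the encoded bytes repeat to fill the Buffer.
const pattern = Buffer.alloc(6, 'ab', 'utf8');
console.log(pattern.toString('utf8')); // ababab
```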

The various Buffer.from() methods are the direct replacements for the equivalent Buffer(str[, encoding]), Buffer(array), Buffer(buffer), and Buffer(arrayBuffer) constructors. The key difference with Buffer.from() is that an error will be thrown if the first argument passed is a Number.
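
The sketch below exercises each form and the number check. The input values are illustrative, and the comments on copying versus sharing memory describe the behavior of Buffer.from() as documented by Node.js rather than anything stated in this article.

```javascript
// From a string, with an optional encoding.
const fromString = Buffer.from('hello', 'utf8');

// From an array of byte values.
const fromArray = Buffer.from([1, 2, 3]);

// From another Buffer: the bytes are copied, so the two are independent.
const copy = Buffer.from(fromArray);
copy[0] = 9;
console.log(fromArray[0]); // 1

// From an ArrayBuffer with an offset and length: the memory is shared.
const arrayBuffer = new ArrayBuffer(8);
const view = Buffer.from(arrayBuffer, 2, 4);
view[0] = 0xff;
console.log(new Uint8Array(arrayBuffer)[2]); // 255

// A number as the first argument is rejected instead of allocating memory.
let numberError = null;
try {
  Buffer.from(100);
} catch (err) {
  numberError = err;
}
console.log(numberError instanceof TypeError); // true
```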

Another significant addition coming in v6 is the --zero-fill-buffers command-line flag. Setting this flag when launching Node will force all Buffers created using Buffer.allocUnsafe(), Buffer(size), and SlowBuffer(size) to be zero-filled, overriding the existing default behavior.

$ node
> Buffer.allocUnsafe(10)
<Buffer 50 04 80 02 01 00 00 00 0a 00>
$ node --zero-fill-buffers
> Buffer.allocUnsafe(10)
<Buffer 00 00 00 00 00 00 00 00 00 00>

Using this new command-line flag, developers can continue to safely use older modules that have not yet been updated to use the new constructor APIs and that may not properly validate the input to the Buffer() constructor.

With these changes, you may be wondering what is going to happen to the existing Buffer() constructor. The answer is simple: nothing much. This pull request adds a note to the documentation that the existing Buffer() constructors have been deprecated; however, the constructor will continue to operate without any changes. In Node.js core terms we call this a “soft deprecation” or “docs-only deprecation”. Existing code that uses the Buffer() constructor will continue to operate as it has before.

It must be noted, however, that an additional change is being considered for the Buffer(size) constructor. Currently (and historically), Buffer(size) has always allocated uninitialized memory. The additional change being considered is to switch that so that Buffer(size) will allocate initialized memory by default. If this change is made, it will have to be backported to all Node.js release streams (v5, v4, v0.12 and v0.10). The decision to make this change is still being discussed. For now, however, Buffer(size) continues to operate as it always has.

James M Snell is IBM Technical Lead for Node.js.

This article was originally posted on Medium.