Linux needs better network file systems


Author: Mark Stone

In a previous article we looked at local file systems in Linux. In this article we’ll examine the range of choices available for Linux network file systems. While the choices are many, we’ll see that Linux still faces significant innovation challenges; yesterday’s network paradigm isn’t necessarily the best approach to the network of tomorrow.

The Traditional Paradigm

Our current model of the network file system is defined by the paradigm of the enterprise workstation. In this model, a large enterprise has a number of knowledge workers based at a single campus, all using individual workstations that are tied together on a single local area network (LAN).

In this model, it makes sense to centralize certain services and files so that those services and files reside on only one (or a few) servers rather than replicating them on every single workstation. The resulting efficiencies fall into three categories:

  • Administration. The fewer machines the IT staff has to touch, the more efficiently they can operate. File backup and restore is a simple example. Having a backup/recovery plan for a central file server for critical files is much easier than having a backup/recovery plan for every workstation on the LAN.
  • Resources. Not all resources need to be used all the time. Making infrequently used resources available to all on a central server is more efficient. Printing is a simple example. The cost, maintenance, and management overhead of attaching a printer to every workstation would be prohibitive, and indeed most printers would sit idle most of the time. A central, shared print server makes much more sense.
  • Collaboration. Groups working on a common project need to share and exchange files regularly. Dispersing group data to individual workstations makes it more difficult to share files, and also leads to confusion over which copy of a file is the master copy. Better to have a central file server for the work group to which each group member has access.

Not all knowledge workers fit the traditional paradigm. Companies have multiple campuses. Some workers work remotely. But for the era in which standard network file systems were developed, the single-campus, single-LAN model was fine.

Traditional Solutions: NFS and Samba

By their very nature, network file systems are superimposed on top of the local file system; without a local file system already in place, there is nothing the network file system can identify to mount over the network. Linux really doesn’t have a native network file system, no network equivalent of ext2/ext3. In the LAN environment, Linux’s file system capabilities have been born of the necessity to get along with other operating systems.

NFS, then, is the main network file system used by Linux in Unix environments. Samba is the main network file system used by Linux in Windows environments that depend on Microsoft’s SMB protocol for network file sharing. Born of different operating system environments, NFS and Samba also use somewhat different metaphors.

NFS borrows its terminology from that of local file systems. Accessing a directory on another computer over the network looks like mounting a partition on a local file system. Once network-mounted, the directory is accessible as if it were another directory on the local machine.
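
To make the analogy concrete, here is a minimal sketch, wrapping the relevant mount commands in Python; the server name, export path, and mount points are hypothetical, and both operations assume root privileges and an NFS server that actually exports the directory.

    # A minimal sketch of the mount analogy. The host "fileserver" and the
    # paths used here are illustrative only.
    import subprocess

    def mount_local(device, mountpoint):
        # Mounting a local partition: the traditional case.
        subprocess.run(["mount", device, mountpoint], check=True)

    def mount_nfs(server, export, mountpoint):
        # Mounting a remote directory over NFS looks almost identical; only
        # the "device" changes, from a disk partition to host:/path.
        subprocess.run(["mount", "-t", "nfs", f"{server}:{export}", mountpoint],
                       check=True)

    mount_local("/dev/hda2", "/home")                     # local file system
    mount_nfs("fileserver", "/export/home", "/mnt/home")  # network file system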

Samba’s metaphor is based on the notion of services. “Share,” as in sharing a file or directory, is one possible service. Once sharing is authorized, Samba’s behavior toward the end user looks similar to NFS. Samba understands other services, however, such as “print,” which lets you access another machine’s printer but not its files.
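
The same idea can be sketched for Samba. The snippet below, again a rough Python wrapper around standard client tools, first asks a server which services it offers (file shares and printers alike) and then mounts one share; the host “winserver”, the share name, and the user are hypothetical, and the exact mount helper varies by system (smbfs on older setups, cifs on newer ones).

    # A rough sketch of Samba's "service" metaphor; names are illustrative.
    import subprocess

    def list_services(server, user):
        # Ask the server which services (file shares, printers) it offers.
        subprocess.run(["smbclient", "-L", server, "-U", user], check=True)

    def mount_share(server, share, mountpoint, user):
        # Once a "share" service is authorized, using it looks much like NFS.
        subprocess.run(["mount", "-t", "cifs", f"//{server}/{share}", mountpoint,
                        "-o", f"username={user}"], check=True)

    list_services("winserver", "alice")
    mount_share("winserver", "projects", "/mnt/projects", "alice")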

Both NFS and Samba were created in a world where the dominant network paradigm was the LAN on a single campus. While both file systems have adapted to changing network conditions, that adaptation has at times been awkward.

The New Paradigm of Occasional Connectivity

Two innovations have dramatically changed the requirements for network file systems subsequent to the initial development of NFS and the SMB protocol:

  • The first, most obvious change is the widespread proliferation of Internet connectivity in the mid to late 90s, transforming corporate LANs from isolated to interconnected networks. This changed security demands dramatically; suddenly outside intrusion over the network was a serious concern. The Internet also changed use profiles; suddenly knowledge workers expected corporate network access from home or from on the road.
  • The second, more subtle change has been the proliferation of wireless network technology and portable computing devices that use wireless technology. The result is a paradoxical notion in which connectivity is both pervasive and sporadic: pervasive, in that we are now accustomed to thinking of network access as never more than a hotspot or cell phone call away; sporadic, in that users at the end of a wireless tether are still at best occasionally connected.

To understand how these changes impact file systems, consider a simple model: The original Palm handheld. Sitting in its cradle, it was one computing device networked (in a limited sense) to another. Removed from the cradle it became a roaming device only occasionally connected. It shared files with a desktop computer, and those files had to be synchronized. An address book or calendar entry could be changed on the Palm, on the desktop, or independently and differently on both. All of these changes had to be kept in proper synchronization.

Palm’s simple approach to synchronization was to update files from whichever device had a change since last synchronization, and, when in doubt, to duplicate entries. That taught users to treat the Palm as much as possible as a read-only device and do their data entry on the desktop. The complexities that arose from this simple network structure foreshadowed many of the challenges of network file systems today.
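
The rule is easy to state in code. The toy sketch below is not Palm’s HotSync implementation, just an illustration of the policy described above: take whichever side changed since the last synchronization and, when both sides changed, keep both copies rather than guess.

    # Toy illustration of Palm-style synchronization; the record format is
    # hypothetical. "baseline" holds each record as of the last sync.
    def sync(handheld, desktop, baseline):
        merged = {}
        for key in set(handheld) | set(desktop):
            h, d, base = handheld.get(key), desktop.get(key), baseline.get(key)
            if h == d:                      # no disagreement
                merged[key] = h
            elif d == base:                 # changed only on the handheld
                merged[key] = h
            elif h == base:                 # changed only on the desktop
                merged[key] = d
            else:                           # changed on both: duplicate entry
                merged[key] = h
                merged[key + " (copy)"] = d
        return {k: v for k, v in merged.items() if v is not None}

    before = {"dentist": "Tue 2pm"}
    print(sync({"dentist": "Wed 3pm"}, {"dentist": "Tue 4pm"}, before))
    # {'dentist': 'Wed 3pm', 'dentist (copy)': 'Tue 4pm'}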

Once their address books and calendars could go with them everywhere, knowledge workers expected to be able to access and update them everywhere. Pre-Palm, you accepted that calendar and address book updates made away from the office would have to wait until you returned. Now Palm has spoiled us all; we expect such changes and updates to be available on demand, any time, from anywhere.

Add to that the notebook computer, at most a novelty device when NFS and SMB were born. Now not just address books and calendars are on the road, but all of a knowledge worker’s digital work. To that mix we now add cell phones that act like PDAs, and a current generation of PDAs that include much of a notebook’s functionality. Finally, none of these devices now need to depend on any kind of cable or wire to access a network. Fixed-point access is becoming a thing of the past.

What’s emerging is a network of computing devices where any device could be connected from anywhere at any time, but where connectivity can also be lost at any time. This kind of network environment introduces three main challenges:

  • Authentication
  • Data Transport
  • Synchronization

Traditional network file systems often prove ill-adapted for these challenges. In the original design of NFS, authentication was done for hosts, not users. Thus anyone who could gain access to a given machine could also gain access to all of the machines for which that one was a valid NFS host. The addition of access control lists and privilege limiting has mitigated this problem, but these are ad hoc fixes for a system not designed for the current network environment.

Further, both NFS and the SMB protocol send data in clear text over the network. At a time when LANs were mostly isolated rather than interconnected this wasn’t a problem. Today it’s a major security risk.

Of course, not all problems necessarily need to be solved at the file system level. NFS can run over an ssh tunnel, allowing ssh to provide encrypted data transport and an extra level of authentication. Similarly, in a Windows environment Microsoft’s VPN provides an encrypted tunnel.
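
As a rough sketch of the tunneling approach, the Python below forwards a local port to the server’s NFS port over ssh and then mounts through it. The host, export, and port numbers are hypothetical; the sketch assumes the server speaks NFS over TCP, and a real setup may also need the mount protocol’s own port forwarded.

    # Hedged sketch: NFS traffic carried inside an encrypted ssh tunnel.
    import subprocess

    SERVER = "fileserver.example.com"
    LOCAL_PORT = 3049   # arbitrary unprivileged local port

    # Forward LOCAL_PORT to the NFS port (2049) on the server, in the background.
    subprocess.run(["ssh", "-f", "-N",
                    "-L", f"{LOCAL_PORT}:localhost:2049", SERVER], check=True)

    # Mount "through" the tunnel: the client talks to localhost, and ssh
    # carries the traffic, encrypted, to the real server.
    subprocess.run(["mount", "-t", "nfs",
                    "-o", f"tcp,port={LOCAL_PORT}",
                    "localhost:/export/home", "/mnt/home"], check=True)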

What none of these approaches handle very well is synchronization. Think of someone copying a file onto a laptop, working on it on the plane, then reconnecting to a home or corporate server later. Now suppose that in the interim someone else in the group has been making different changes to the same file.

Some of these issues can be dealt with at the application level rather than the file system level. Rsync, a powerful program that came out of the Samba project, provides remote file synchronization over the network. Tackling synchronization at the application level, however, leaves either the user or the IT staff responsible for setting up, managing, and tracking it. To accomplish all of this seamlessly at the file system level, we aren’t talking about just a network file system. We’re talking about a distributed file system.
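
For the record, the application-level approach looks something like the sketch below: rsync, driven here from Python, pulls fresh copies before disconnecting and pushes local changes back after reconnecting. The host and directory names are hypothetical, and note that nothing here detects the conflicting-edits case described above; that burden stays with the user.

    # Application-level synchronization with rsync; paths are illustrative.
    import subprocess

    def pull(server, remote_dir, local_dir):
        # Grab the latest copies before disconnecting (e.g. before a flight).
        subprocess.run(["rsync", "-avz", "-e", "ssh",
                        f"{server}:{remote_dir}", local_dir], check=True)

    def push(local_dir, server, remote_dir):
        # Send local changes back once connectivity returns.
        subprocess.run(["rsync", "-avz", "-e", "ssh",
                        local_dir, f"{server}:{remote_dir}"], check=True)

    pull("fileserver", "/srv/projects/", "/home/alice/projects/")
    push("/home/alice/projects/", "fileserver", "/srv/projects/")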

New Tricks from an Old Approach: Coda

Much of the theoretical work done on modern file systems stems from research at Carnegie Mellon University (CMU). An alternative to NFS, for example, is AFS, derived from the Andrew File System research project at CMU.

Perhaps the most ambitious file system project at CMU is Coda, a distributed file system originally derived from AFS2 and the brainchild of Professor Satyanarayanan. Coda is designed for mobile computing in an occasionally connected environment: it works under partial network failure and responds gracefully to total network failure. Encryption is built in for data transport, with additional security provided by authentication and access control.

The basic ideas behind Coda are:

  • The master copy of a file is kept on a Coda server
  • Coda clients maintain a persistent cache of copies of files from the server
  • Coda checks the network both for the availability of connections between client and server, and for the approximate bandwidth of the connection
  • The client cache is updated intelligently based on available bandwidth; the less bandwidth, the smaller the update increments, all the way down to the worst case of zero bandwidth, i.e., no connection (see the sketch after this list)
  • Updates from the client to the master must be complete; no partial file changes are ever written to the master copy
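
The bandwidth-adaptive idea in particular is easy to illustrate. The sketch below is not Coda’s implementation, just a toy policy that scales the size of each update increment to the measured bandwidth and falls back to the local cache when the connection disappears entirely; the numbers are arbitrary.

    # Toy sketch of bandwidth-adaptive cache updating (not Coda's actual code).
    def plan_update(pending_bytes, bandwidth_kbps):
        """Decide how much of the pending cache update to send right now."""
        if bandwidth_kbps <= 0:
            return 0                  # disconnected: operate from the cache
        # Send at most a few seconds' worth of data per increment, so a weak
        # link trickles small updates while a fast LAN syncs in one pass.
        return min(pending_bytes, bandwidth_kbps * 1024 * 5 // 8)

    for bw in (0, 56, 11000, 100000):  # none, modem, 802.11b, fast Ethernet
        print(bw, "kbps ->", plan_update(10 * 1024 * 1024, bw), "bytes this pass")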

All of this sounds like a big step forward in solving the problems of a distributed file system. The technical challenges are not small, however, and Coda is still very much a work in progress. Work on Coda began in 1987, and the FAQ for the project reports, “a small userbase (20-30 users) and a few servers are pretty workable. Such a setup has been running here at CMU for the past couple of years without significant disasters. Don’t expect to easily handle terabytes of data or a large group of non-technical oriented users.”

Coda’s Descendants: Intermezzo

Keep in mind that Coda is a research project. It aims to solve the distributed file system problem in a fundamental and comprehensive way. In the real world, an 80% solution will often do. Towards that end, a lighter-weight descendant of Coda has been designed for Linux: Intermezzo.

Intermezzo has been developed by kernel hacker, file system guru, and former Coda project member Peter Braam.

Intermezzo follows a similar architectural philosophy to Coda. There is a server element to the file system, and a client element, with the client side relying on a persistent cache to keep files in synch. Communication between client and server is handled by a separate program, InterSync.
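
Conceptually, the client side of such a design can be pictured as a journal of local changes that a separate synchronization step replays to the server once connectivity returns. The sketch below illustrates that idea only; it is not InterSync’s actual protocol, and the journal format and file locations are hypothetical.

    # Conceptual sketch of a client-side change journal with later replay.
    import json, os

    JOURNAL = os.path.expanduser("~/.cache-journal")

    def record_change(path, operation):
        # While disconnected, append each local modification to the journal.
        with open(JOURNAL, "a") as log:
            log.write(json.dumps({"op": operation, "path": path}) + "\n")

    def reintegrate(send_to_server):
        # When connectivity returns, replay the journal in order, then clear it.
        if not os.path.exists(JOURNAL):
            return
        with open(JOURNAL) as log:
            for line in log:
                send_to_server(json.loads(line))
        os.remove(JOURNAL)

    record_change("/project/report.txt", "write")
    reintegrate(lambda change: print("replaying", change))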

Intermezzo has been included as a file system option for the Linux kernel since kernel version 2.4.15. Like Coda, it is far from a finished project, but still represents an important future direction for Linux file systems.

A Word About Clusters

The haphazard network world of the Internet and mobile users may seem the very opposite of the tightly structured network of Linux clusters. Surprisingly, the file system challenges are quite similar.

Think of the Internet as a cluster in slow motion. In a cluster environment of Fibre Channel interconnects, the lag time associated with disk access can look like a server failure or network outage does on the Internet. What might look like continuous availability in another context looks like intermittent connectivity in the high-demand cluster context.

Thought of in this way, it should come as no surprise that the most direct application of Intermezzo is not for mobile users, but for clusters. In fact, Peter Braam and his team are working on a commercial version of their file system architecture, called Lustre, that is available through Braam’s company, ClusterFS. Lustre has been used at Lawrence Livermore National Laboratory, the National Center for Supercomputing Applications (NCSA), and other supercomputing centers.

The Future of Network File Systems

In today’s network paradigm, the network file system challenge has become the distributed file system challenge, as we have moved from self-contained LAN environments to a world of occasionally connected computing. To be competitive in this environment, an operating system must have a file system that handles distribution and synchronization problems smoothly and securely.

Apple understands this. Apple’s relentless focus on the “digital lifestyle” has led them to work hard at getting a wide array of devices, from cell phones to iPods to video cameras, to connect and communicate. Mac OS X gets high marks for its capabilities in this area.

Microsoft certainly understands the challenge as well. While Windows-based networks today are still mostly locked into a complex of VPNs and SMB, the plans for Longhorn are quite different. The whole .NET infrastructure, and the way Avalon aims to leverage it, should address many distributed file system issues in a way that is transparent to the user.

Will Linux compete? The potential is there, and projects like Intermezzo show that many of the right building blocks are in place. What remains is for a high-profile company or project to step forward and make the distributed file system problem a priority. So far, that hasn’t happened.