With emerging technology, it is tempting to assume that old means inadequate, and that an older tool must lack the features and performance the business requires. Cloud technology changes quickly, so do we still need something like Swift, which predates OpenStack?
To answer this question, we must understand Swift’s unique architecture. Only with Swift can we harness the power of the BLOB.
A central concept in Swift is the Binary Large OBject (BLOB). Instead of using block storage, Swift divides data into some number of binary streams. Any file, of any format, can be reduced to a series of ones and zeros, a process sometimes referred to as serialization. Start at the first bit of a file and count ones and zeros until you have a block, a megabyte, or even five gigabytes. This becomes an object. The next run of bits becomes another object, and so on until there is no more file to divide. These objects can be stored locally or sent to a Swift proxy server. The proxy server sends each object to a series of storage servers. The proxy also uses memcached to cache frequently requested information, such as account and container details, at memory speeds. That was a definite advantage in the days before inexpensive solid state drives.
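The idea of cutting a file into ordered objects and joining them again can be sketched in a few lines of Python. This is only an illustration of the concept, not how Swift itself is implemented; the function names and the tiny 8-byte object size are assumptions chosen to keep the example readable.

```python
from typing import Iterator, List


def split_into_objects(data: bytes, object_size: int) -> Iterator[bytes]:
    """Yield consecutive slices of `data`, each at most `object_size` bytes."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    for offset in range(0, len(data), object_size):
        yield data[offset:offset + object_size]


def reassemble(objects: List[bytes]) -> bytes:
    """Join the objects back together in their original order."""
    return b"".join(objects)


# 30 bytes standing in for the contents of a file (hypothetical data).
payload = b"0123456789" * 3
objects = list(split_into_objects(payload, object_size=8))

# The pieces can live anywhere, as long as their order is preserved.
restored = reassemble(objects)
```

With a 30-byte payload and 8-byte objects, the split yields three full objects and one shorter remainder, and joining them in order gives back the original bytes.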
These independent objects can be placed anywhere, as long as they can be brought back together in the same order, which is what Swift does on our behalf through services. Swift uses three services to track the blobs, where they are stored, and who owns them:
- Object Servers
- Container Servers
- Account Servers
These services can be deployed on the same system or individually across several systems, which allows the Swift cluster to scale and meet changing storage needs. The three services are independent of one another and distribute their data among the available nodes. This distribution has led to the use of the term “ring services.” The distribution among the object, container, and account rings is not round-robin, as the name might imply. Instead, it uses an algorithm that takes the partition index and device weights into account to determine which nodes should store an object and its replicas.
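A rough sketch can show how a hash-based partition index and device weights combine to pick a storage location. It is a deliberately simplified model, not Swift's actual ring code: the partition count, device names, object path, and the contiguous assignment of partitions are all assumptions for illustration, and replicas and zones are ignored.

```python
import hashlib
import struct

PART_POWER = 8  # hypothetical: 2**8 = 256 partitions


def partition_for(path: str, part_power: int = PART_POWER) -> int:
    """Map a path to a partition index using the top bits of its MD5 hash."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    top32 = struct.unpack_from(">I", digest)[0]  # first 4 bytes, big-endian
    return top32 >> (32 - part_power)


def assign_partitions(weights: dict, part_power: int = PART_POWER) -> list:
    """Build a partition -> device table, sized in proportion to each weight."""
    total_parts = 2 ** part_power
    total_weight = sum(weights.values())
    table = []
    cumulative = 0
    for device, weight in sorted(weights.items()):
        cumulative += weight
        # Fill the table up to this device's cumulative share of partitions.
        upto = round(total_parts * cumulative / total_weight)
        table.extend([device] * (upto - len(table)))
    return table


# Hypothetical devices; the third has twice the weight of the others.
devices = {"node1/sdb": 1, "node2/sdb": 1, "node3/sdb": 2}
table = assign_partitions(devices)

part = partition_for("/AUTH_demo/photos/cat.jpg")
device = table[part]
```

The point of the design is that placement depends only on the hash of the name and the weighted table, so any node holding the same table reaches the same answer without consulting a central index, and a heavier device simply owns more partitions.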
The Object Servers are responsible for storing the actual blobs. The object is stored as a file, while the metadata is stored in extended attributes (xattrs). As long as the local filesystem supports xattrs, you should be able to use it for local storage. Each node can use its own filesystem; the entire cluster does not need to use the same one.
The objects are stored relative to a container. The Container Server keeps a database of which objects are in which containers. It also maintains a total number of objects and how much storage each container is using.
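The bookkeeping described above can be modelled with a small database. The sketch below uses Python's built-in `sqlite3` module as a stand-in; the table layout, function names, and sample objects are hypothetical and far simpler than anything a real Container Server maintains.

```python
import sqlite3

# In-memory database standing in for a container listing (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects ("
    "  container TEXT, name TEXT, size_bytes INTEGER,"
    "  PRIMARY KEY (container, name))"
)


def record_object(container: str, name: str, size_bytes: int) -> None:
    """Register an object in a container, replacing any earlier entry."""
    conn.execute(
        "INSERT OR REPLACE INTO objects VALUES (?, ?, ?)",
        (container, name, size_bytes),
    )


def container_stats(container: str) -> dict:
    """Return the object count and total bytes used by one container."""
    count, used = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(size_bytes), 0) "
        "FROM objects WHERE container = ?",
        (container,),
    ).fetchone()
    return {"object_count": count, "bytes_used": used}


record_object("photos", "cat.jpg", 2048)
record_object("photos", "dog.jpg", 4096)
record_object("backups", "db.tar", 10000)

stats = container_stats("photos")
```

Here `stats` reports two objects and 6144 bytes for the `photos` container, while the `backups` container is counted separately.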
The third of the “ring services” tracks container ownership and is maintained by the Account Server.
While the most common deployment of Swift runs all three services on each new node, this can easily be changed as necessary. Some services may be more active than others, and the resource demands on a node can differ from ring to ring as well. The flexibility of Swift means we can change our cluster to meet storage demands for size or speed as necessary. We can deploy more Object Servers without spending resources on additional Account Servers.
Swift's architecture frees us from the constraints often found with NAS systems. We can store any data, anywhere we want, on whichever hardware we want. There is no vendor lock-in. Rackspace developed a forward-thinking solution to cloud storage. As an open source tool, it has revolutionised enterprise storage.
I discuss Swift in more detail in my recent Linux Foundation webinar on OpenStack: Exploring Object Storage with Ceph and Swift.