What is Open Source Cloud?

Editor’s Note: This is a guest post from Joe Brockmeier, community evangelist for CloudStack at Citrix. 

For all the talk about cloud, it might come as a surprise to many in the industry that “cloud” is not a well-understood term. It’s often perceived as “just a buzzword” or something without a lot of substance. While the term can be abused, it’s actually an important concept and it’s certainly not just a passing fad.

In talking to people following the Apache CloudStack graduation, and meeting with the local Linux User Group (LUG), it dawned on me that cloud still bears some explanation. Let’s take a look at the standard definition, some types of clouds, and why it matters.

NIST Definition (And Then Some…)

The National Institute of Standards and Technology (NIST) has a pretty good definition of cloud computing, which breaks down into five basic characteristics:

  • On-demand self-service – this means that users can provision their own services without requiring any interaction with another person. You can plunk down a credit card with Amazon, Dropbox, Contegix, or any number of cloud providers and start using the service almost immediately.

  • Broad network access – the cloud’s functions are available over the network.

  • Resource pooling – compute, storage, network, etc., are all pooled so that multiple users may make use of the service. Users don’t need to know the details about the resources they’re being assigned.

  • Rapid elasticity – the resources can scale (up or down) rapidly in response to demand. Resources usually appear unlimited (or close to it) to the end users.

  • Measured service – users can see how much of the resource they’re using, and are usually billed accordingly. Providers can tell exactly how much of a service each user has consumed, and can bill (or, for private clouds, charge back) accordingly.

I also add one other item to the definition, which is implied by the Broad network access item but not stated explicitly:

  • API – if the service doesn’t expose an API, it’s not really “cloudy.” You should be able to access a service programmatically. This is especially true if you’re talking about an open cloud service.
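To make those characteristics a bit more concrete, here is a minimal sketch of a toy, in-memory "cloud" in Python. Everything in it is hypothetical: the `ToyCloud` class, its method names, and the per-CPU-hour rate are made up for illustration and do not correspond to any real provider's API. It only shows how self-service provisioning, a shared pool, scaling up and down, and metering fit together behind a programmatic interface.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCloud:
    """Hypothetical in-memory pool; not any real provider's API."""

    total_cpus: int
    rate_per_cpu_hour: float
    allocations: dict = field(default_factory=dict)  # user -> CPUs currently held
    usage: dict = field(default_factory=dict)        # user -> CPU-hours consumed

    def free_cpus(self) -> int:
        """Resource pooling: all users draw from one shared pool."""
        return self.total_cpus - sum(self.allocations.values())

    def provision(self, user: str, cpus: int) -> None:
        """On-demand self-service: no human approves the request."""
        if cpus > self.free_cpus():
            raise RuntimeError("pool exhausted")
        self.allocations[user] = self.allocations.get(user, 0) + cpus

    def release(self, user: str, cpus: int) -> None:
        """Elasticity: scale back down when demand drops."""
        held = self.allocations.get(user, 0)
        self.allocations[user] = max(0, held - cpus)

    def tick(self, hours: float = 1.0) -> None:
        """Measured service: record CPU-hours for every user."""
        for user, cpus in self.allocations.items():
            self.usage[user] = self.usage.get(user, 0.0) + cpus * hours

    def bill(self, user: str) -> float:
        """Turn metered usage into a charge (or a chargeback)."""
        return self.usage.get(user, 0.0) * self.rate_per_cpu_hour
```

A user who provisions 4 CPUs for two hours, releases 2 of them, and runs for one more hour has consumed 10 CPU-hours, and `bill()` charges for exactly that. A real service meters far more than CPU time, but the shape is the same.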

Types of Cloud

There are several types of cloud services that are common (you’ll find other Thing-as-a-Service types, but these are the dominant three):

  • Software-as-a-Service (SaaS) – things like Dropbox, Google Docs, Salesforce.com, or ownCloud are SaaS: software that provides network- or Web-based applications or services. Pretty much everything is abstracted away from the user here: they don’t need to know what OS the application is running on, nor how many resources are allocated to it. The user doesn’t have to handle upgrading software, or worry about underlying dependencies.

  • Platform-as-a-Service (PaaS) – a PaaS is a service or stack that takes care of the infrastructure, middleware, and orchestration to allow developers to focus on creating an application. Basically, it abstracts away the infrastructure layer so developers can create an application in their favorite language/framework, without getting bogged down in deployment details like the underlying operating system.

Examples of a PaaS: Google AppEngine, Engine Yard, or if you want an open source version, OpenShift.

  • Infrastructure-as-a-Service (IaaS) – finally, that brings us to the IaaS layer. Users can provision compute, storage, and network resources but the underlying details are still abstracted away. So, for example, you can spin up an “instance” using CloudStack or Amazon EC2 with the equivalent of 2 Xeon CPUs at 2.0GHz, 4GB of RAM, and 100GB of storage and a public IP address.

But you don’t have to worry about which server that resides on, what the underlying hypervisor is, how to provision the IP address on the switches, etc.

Examples of IaaS include Apache CloudStack, Eucalyptus, and OpenStack on the open cloud side, or Amazon Web Services EC2 and Google Compute on the non-open side.
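The "you don’t have to worry about which server that resides on" part can be sketched in a few lines. This is a toy first-fit placement routine, assuming a hypothetical `Host` record and `place_instance` function. Real IaaS platforms use more sophisticated allocators that also weigh storage, networking, and policy, so treat this as an illustration of the idea rather than how any particular project does it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    """Hypothetical record of a hypervisor host's spare capacity."""

    name: str
    free_cpus: int
    free_ram_gb: int


def place_instance(hosts: List[Host], cpus: int, ram_gb: int) -> Optional[str]:
    """First-fit placement: reserve capacity on the first host with room.

    Returns the chosen host's name, or None if nothing fits. The caller
    only states what it needs; it never names a host.
    """
    for host in hosts:
        if host.free_cpus >= cpus and host.free_ram_gb >= ram_gb:
            host.free_cpus -= cpus
            host.free_ram_gb -= ram_gb
            return host.name
    return None
```

The user asks for "2 CPUs and 4GB of RAM" and gets an instance back. Which host it landed on is the platform's problem.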

How It Works

If you’re using an IaaS, you really don’t need to know how it works – that’s the beauty of it. But if you’re thinking about deploying one, it helps to know how it works and what you’re talking about.

It may be easiest to think of IaaS cloud as a sort of meta-OS. If you think about Linux, it manages all the resources of your server, desktop, laptop, or mobile device so that you can run applications on top of the hardware. It’s in charge of the network, storage, processor, etc.

An IaaS is like that, but at scale. It’s telling the individual hypervisors, network components, storage devices or servers what to do so that they don’t have to be managed manually.

It sounds like it should be amazingly complex – and an IaaS can be non-trivial to set up – but it’s not as complex as you might think.

Take Apache CloudStack, for example: you have an application that runs on one or more master servers and communicates with the hypervisors, storage, and network devices. It provides an interface via an API or Web-based UI that admins and users can interact with to manage resources. Instead of having to shell into a server and provision it directly, a user or admin can request specific resources and CloudStack will take care of the rest.
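As a rough illustration of what "interacting via an API" looks like, here is a sketch of building a signed request for a CloudStack-style API in Python. It follows the signing scheme as I understand it from the CloudStack developer documentation (URL-encode the values, sort the parameters by name, lower-case the string, sign it with HMAC-SHA1 using your secret key, then Base64-encode the result), so check it against the docs for your version before relying on it. The `signed_query` helper, the host name, and the keys are placeholders, not real values.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_query(params: dict, api_key: str, secret_key: str) -> str:
    """Build a signed query string for a CloudStack-style API call (sketch)."""
    full = dict(params, apiKey=api_key, response="json")
    # Sort the pairs by lower-cased field name and URL-encode each value.
    ordered = sorted(full.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in ordered)
    # Sign the lower-cased string with HMAC-SHA1, then Base64-encode the digest.
    digest = hmac.new(
        secret_key.encode("utf-8"),
        query.lower().encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"{query}&signature={quote(signature, safe='')}"


# Placeholder endpoint and credentials, for illustration only.
url = "http://cloud.example.com:8080/client/api?" + signed_query(
    {"command": "listVirtualMachines"}, "my-api-key", "my-secret-key"
)
```

Nothing here touches the network. The resulting URL is what you would hand to an HTTP client, and the same pattern applies to commands that create or destroy resources.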

Why It Matters

This is extremely powerful when operating at scale, or in an environment where resources have to be managed programmatically (think of a test/dev environment, for example). It is just as useful where users need to provision their own resources on demand, where resources must be isolated from other users, and where you want to avoid giving admin privileges to too many people.

This is the scale that Linux and open systems have made possible, and is necessary when running some of today’s organizations and applications. Organizations today need, in many cases, to manage hundreds or thousands (or tens of thousands) of servers with applications that are spread out over many, many individual VMs or servers.

Developers need to be able to write applications that can be spread over tens or thousands of servers as demand requires, rather than trying to “scale up” applications on bigger and beefier hardware.
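The arithmetic behind scaling out is simple enough to sketch. Assuming each instance can handle a fixed load (a simplification, and the `instances_needed` function and its parameters are hypothetical), the number of instances follows demand directly:

```python
import math


def instances_needed(
    requests_per_sec: float,
    capacity_per_instance: float,
    minimum: int = 1,
) -> int:
    """Return how many identical instances are needed to cover the demand."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Round up: a fraction of an instance still means one more instance.
    return max(minimum, math.ceil(requests_per_sec / capacity_per_instance))
```

With instances that each handle 100 requests per second, 950 requests per second needs 10 instances and 12,000 needs 120. Scaling up, by contrast, stops at the biggest single machine you can buy.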

Having an open cloud matters because we need to be able to continue the work that GNU and Linux folks have been doing for more than twenty years, at scale. It matters because we need the cloud to be bigger than Amazon or proprietary companies – and because users and organizations should have as much control over their computing destiny at scale as they have had on individual servers.

So, though many folks are probably tired of hearing about “cloud this” and “cloud that”, it’s really not going away anytime soon. And if you’re interested in software freedom, this is the next generation.