How DIGIT Created High Availability on the Public Cloud to Keep Its Games Running

The mobile gaming company must deliver a seamless experience for its gamers and absorb spikes in player activity on its Massively Multiplayer Online gaming platform. That’s why the company built a high-availability infrastructure that runs on Amazon Web Services (AWS) and allows it to launch a cluster in less than 5 minutes using Apache Mesos.

“We want to enable developers to iterate fast on their ideas and to be able to deploy new code changes as fast as possible,” say DevOps engineer Emmanuel Rieg and build and release engineer Ross McKinley, below. “We’re aiming at deploying multiple times a week, whenever a given feature is stable or bug is fixed.”

Rieg and McKinley will give a talk next week at MesosCon Europe on how they went from a blank canvas AWS account to a fully functional PaaS, to set up their immutable infrastructure.  Here they give a short preview of their talk and share tips for developing on top of Mesos.

Linux.com: Why do you build your applications on AWS?

Emmanuel & Ross: We are very impressed by the diversity of services offered by Amazon. This is coupled with good AWS support in other tools we use. Developer friendliness is really important to us. The ability to run our cluster in an isolated environment (VPC) was a deciding factor.

Linux.com: How do you create high availability on the public cloud?

Emmanuel & Ross: HA is achievable on the public cloud. In our case, we couple redundancy across Availability Zones (AZs) with monitoring and autonomous systems to ensure our games can keep running. Using only one AZ will not ensure HA, as that entire zone could fail for a short time. Each of our applications runs in multiple containers at the same time. They are all monitored to make sure they can handle the current load. When one container goes down, another takes its place. The same applies to all parts of our infrastructure. All services are autoscaling and sit behind a service discovery system. On top of this, nodes in our cluster are deployed across multiple AZs, each of which is an isolated network with its own NAT gateway. This way we can survive a whole zone going down.
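
The idea described above (spread replicas across zones, then replace any that are lost) can be illustrated with a small sketch. This is not DIGIT's tooling or any Mesos API: the zone names, function names, and the simple round-robin and least-loaded policies are all assumptions made for illustration, and a real cluster scheduler would make these decisions with far more information.

```python
from collections import Counter
from itertools import cycle

# Hypothetical zone names, used only for this illustration.
ZONES = ["zone-a", "zone-b", "zone-c"]


def place_replicas(zones, replicas):
    """Spread `replicas` container instances round-robin across `zones`."""
    zone_cycle = cycle(zones)
    return [next(zone_cycle) for _ in range(replicas)]


def reschedule_after_zone_failure(placement, failed_zone, zones):
    """Replace every replica lost with `failed_zone` by one in a healthy zone."""
    healthy = [z for z in zones if z != failed_zone]
    if not healthy:
        raise RuntimeError("no healthy zone left to reschedule into")

    survivors = [z for z in placement if z != failed_zone]
    lost = len(placement) - len(survivors)
    counts = Counter(survivors)

    for _ in range(lost):
        # Put each replacement in the healthy zone that currently runs the fewest replicas.
        target = min(healthy, key=lambda z: counts[z])
        counts[target] += 1
        survivors.append(target)
    return survivors


placement = place_replicas(ZONES, 6)
after = reschedule_after_zone_failure(placement, "zone-b", ZONES)
print(Counter(placement))
print(Counter(after))
```

With a single zone in `ZONES`, the second function has nowhere to reschedule and raises an error, which mirrors the point that one AZ alone cannot provide HA.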

Linux.com: What role does Mesos play in your infrastructure?

Emmanuel & Ross: Mesos is the foundation we use to run all of our environments. It allows us to scale quickly and handle spikes in players gracefully, and it enables our tech teams to develop with velocity.

Linux.com: Why is speed (i.e., launching a cluster in under 5 minutes) important to your business?

Emmanuel & Ross: As we use an Immutable Infrastructure, many components can be affected when performing large updates. Keeping the feedback loop short on infrastructure changes enables us to react to problems and deploy fixes with minimal user impact.

We want to enable developers to iterate fast on their ideas and to be able to deploy new code changes as fast as possible. We’re aiming at deploying multiple times a week, whenever a given feature is stable or bug is fixed. This also enables us to quickly roll back deployments that go awry.

Linux.com: What is your top tip for creating development environments on top of Mesos?

Emmanuel & Ross: Have a comprehensive monitoring solution, automate everything, and codify your infrastructure.

Good monitoring is the key to a successful development environment. Without monitoring, you’re flying blind and will have a hard time tracking down issues.

A fully automated continuous delivery system for validating and pushing changes makes it easy to ensure that bad practices, like manual intervention and works-of-art, are avoided.

Infrastructure-as-Code is mandatory to prevent servers and infrastructure from becoming a work-of-art that cannot be replicated. Treat your servers as cattle: each one is fully replaceable at any time.

 

Join the Apache Mesos community at MesosCon Europe on Aug. 31 – Sept. 1, 2016! Look forward to 40+ talks from users, developers and maintainers deploying Apache Mesos including Netflix, Apple, Twitter and others. Register now.

Apache, Apache Mesos, and Mesos are either registered trademarks or trademarks of the Apache Software Foundation (ASF) in the United States and/or other countries. MesosCon is run in partnership with the ASF.