DataStax Chief Strategy Officer Sam Ramji believes software startups in the NoSQL space have ‘crossed a very important boundary.’ He feels the market is at a tipping point, and DataStax is all set to make the most of this opportunity. “Once you’re generating $100 million in revenue per year, that’s pretty substantial,” Ramji said.
A couple of years ago, the NoSQL market was about $4 billion. Last year, it shot up to $6.5 billion. According to IDC, the market is expected to grow at about 35% a year for the next few years. “By 2023, we think NoSQL as a market would be about $21-$22 billion,” he said.
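As a rough check on those figures, the sketch below compounds last year's $6.5 billion at 35% a year. It assumes that "last year" means 2019 and that the growth rate holds constant, neither of which is stated above; the function name is purely illustrative.

```python
def project_market(start_billion: float, annual_growth: float, years: int) -> float:
    """Compound a starting market size at a fixed annual growth rate."""
    return start_billion * (1 + annual_growth) ** years


# Assumption: "last year" is 2019, so 2023 is four years out.
for year in range(2019, 2024):
    size = project_market(6.5, 0.35, year - 2019)
    print(f"{year}: ${size:.1f} billion")
```

Under those assumptions the 2023 figure comes to about $21.6 billion, which is consistent with the $21-$22 billion Ramji quotes.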
Ramji is working to ensure that the large enterprises succeeding with Cassandra, an open source project, continue to succeed despite the enormous additional market pressure they face as the world goes digital at speed this year.
DataStax-Cassandra-Kubernetes ecosystem
A big trend Ramji witnessed last year was the emergence of Kubernetes as the de facto standard for container management. “Trends for Kubernetes of data are a lot harder to see because mostly the big movement with Kubernetes is the standardization of workflow in application development,” Ramji opined. Not to be left behind, DataStax is releasing a new Kubernetes operator for Cassandra.
When asked how DataStax would differentiate its operator from others already on the market, Ramji said that ‘differentiation’ is a word he believes open source communities should stop using.
“I actually don’t think differentiation versus open source or open source communities is a sensible thing,” he said. There are going to be multiple Kubernetes operators for Cassandra. Each is written by a different company to solve its own problem of operating Cassandra at scale, automatically, in its own environment. Each environment has very specific needs. It’s not about differentiation; it’s about solving different problems with the same open source solution.
“As we’re a commercial provider of an open source project, we have to go towards generalization. Generalization and specificity exist in dynamic harmony. Who cares who wrote the Kube operator for Cassandra that becomes the most popular? What I care about is that users grab both. They install it in their environment and away they go with scalable data. It’s an invitation for the whole community to participate and build a great common operator,” he said.
Why bet on Cassandra?
When asked what makes Cassandra uniquely suited for cloud native workloads when the project itself predates these technologies, Ramji said Cassandra’s uniqueness stems from two attributes: how applications experience Cassandra and how Cassandra scales.
“If you’re talking to a relational database or you’re talking to a less scalable database, you’re always going to end up in a state of needing to shard the data. This is a huge cyclical burden for the entire system. Cassandra is shardless. Once you’ve written your application and talked to Cassandra, you’d never have to change how you talk to Cassandra,” he explains.
Cassandra’s ability to scale out, its other differentiator, is based on its masterless, or multi-master, architecture. There’s no single point of failure.
“Therefore, the ability to scale Cassandra — from a few nodes to many nodes, from one cluster to multiple clusters, or multiple clusters to multiple regions — is all tested and proved. This was exactly the problem that Cassandra was built to solve for Facebook in the first place,” says Ramji.
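To make the "shardless" and masterless ideas more concrete, here is a minimal sketch of consistent hashing on a token ring, the general technique Cassandra is usually described as using to place data. It is a toy model under stated assumptions, not Cassandra's implementation: the `TokenRing` class, its method names, the MD5 hash, and the node names are all hypothetical, and real replica placement is more involved.

```python
import bisect
import hashlib


class TokenRing:
    """Toy consistent-hash ring: any node can compute which peers own a key."""

    def __init__(self, nodes, replicas=3, vnodes=8):
        self.replicas = replicas
        self.ring = []  # sorted list of (token, node) pairs
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _token(value: str) -> int:
        # MD5 is used here only as a stable hash for illustration.
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node: str, vnodes: int = 8) -> None:
        # Each node claims several positions (virtual nodes) on the ring.
        for i in range(vnodes):
            bisect.insort(self.ring, (self._token(f"{node}#{i}"), node))

    def owners(self, key: str):
        """Walk clockwise from the key's token, collecting distinct nodes."""
        distinct = len({node for _, node in self.ring})
        wanted = min(self.replicas, distinct)
        start = bisect.bisect(self.ring, (self._token(key), ""))
        found = []
        for offset in range(len(self.ring)):
            node = self.ring[(start + offset) % len(self.ring)][1]
            if node not in found:
                found.append(node)
            if len(found) == wanted:
                break
        return found


ring = TokenRing(["node-a", "node-b", "node-c"])
before = ring.owners("user:42")
ring.add_node("node-d")  # scale out; the calling code below is unchanged
after = ring.owners("user:42")
print(before, after)
```

The point of the sketch is that the caller only ever asks for a key. Which nodes hold it is derived from the ring, so adding `node-d` changes placement without changing how the application talks to the cluster, and no single node acts as a master.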
Three Pillars of Growth
Ramji firmly believes Cassandra and DataStax offer the opportunity of a lifetime to understand data better and to serve the community by linking a well-proven open source database with a well-proven open source cloud native technology.
Putting his money where his mouth is, Ramji is building three pillars — open source, scale-out, and cloud native — that will help DataStax deliver on its promise.
In 2016, there was a loss of focus as different parties imposed their own requirements on Cassandra. The first pillar aims to bring back that focus and restore the vibrancy of Cassandra as an open source community.
“Netflix, Instagram, Apple, DataStax, and others were going in different directions. That scattered the tribes. Now, we’re trying to gather the tribes back together. Apple is leading the way to get Cassandra 4.0 released this year. We’ll probably see the beta in Q2 and we’ll see the GA later this year,” said Ramji.
DataStax, which has a small team, has allocated about 25 engineers to work on nothing but open source.
“For us to allocate such a high percentage of our engineering force, which is a quarter of our engineering team, tells you that open source is absolutely vital to us,” he said.
The second pillar, scale-out, is the core of what Cassandra is meant for, and it is what the project has been pushed on for a decade. Ramji intends to continue delivering on what makes Cassandra special.
Cassandra is hardened to scale. It’s not just about a lot of nodes. It’s about a lot of nodes in a cluster. It’s about a lot of clusters in a multi-cluster and a lot of multi-clusters in multiple regions. All of it is one addressable data fabric. “That’s what it’s really good at,” he said.
Ramji believes the third and final pillar, cloud native, will be the big stretch.
“To be able to get a proper cloud native database, you want something that will ride along with Kubernetes, Istio, Envoy, and Prometheus. It will also have to expand and contract and fit the application workload that’s being directed through Kubernetes. That’s a super interesting area,” he said.
Working towards a perfect pairing of data and compute for a cloud native world, DataStax will be releasing a Kubernetes operator management API this year.
In conclusion
Going forward, Ramji believes the design of the storage engine interface will receive a lot of attention this year.
“There’s so much advancement happening in storage, networking, network block storage and large scale environments for AI, for ML as well as for just mainstream deployment of large scale applications. As we build the storage engine interface as part of the architecture for participation for Cassandra, we will see a lot more different experts and companies come to bear so that we can plug into all of their differentiated awesome environments,” he concluded.