Check out this process that will let you get a Hadoop cluster up and running on AWS in two easy steps.
I use Apache Hadoop to process huge data loads. Setting up Hadoop on a cloud provider such as AWS involves spinning up a bunch of EC2 instances, configuring the nodes to talk to each other, installing software, editing the config files on the master and data nodes, and starting services.
This made it a good candidate for automation, given the problems I wanted to solve:
- How do I build the cluster in minutes (as opposed to hours, or even days, for a large number of data nodes)?
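To give a feel for the kind of work worth automating, here is a minimal sketch of just the config-file step: generating the files the master and data nodes need from a list of hostnames. It is an illustration, not the process this article describes. The function names, the example hostnames, and the port 9000 are assumptions; `fs.defaultFS` and `dfs.replication` are standard Hadoop properties, and the `workers` file is the Hadoop 3 name for the list of data nodes (it was `slaves` in Hadoop 2).

```python
import xml.etree.ElementTree as ET


def hadoop_site_xml(properties):
    """Render a dict of Hadoop properties as a *-site.xml document string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


def render_cluster_config(master, data_nodes, port=9000):
    """Build the text of the config files to copy to every node.

    `master` and `data_nodes` are hostnames (hypothetical here); `port`
    is an assumed NameNode port, adjust it to match your cluster.
    """
    # Never ask for more replicas than there are data nodes, and cap at 3.
    replication = max(1, min(3, len(data_nodes)))
    return {
        "core-site.xml": hadoop_site_xml(
            {"fs.defaultFS": f"hdfs://{master}:{port}"}
        ),
        "hdfs-site.xml": hadoop_site_xml({"dfs.replication": replication}),
        # One data node hostname per line.
        "workers": "\n".join(data_nodes) + "\n",
    }


if __name__ == "__main__":
    files = render_cluster_config(
        "master.internal", ["dn1.internal", "dn2.internal"]
    )
    for filename, content in files.items():
        print(f"--- {filename} ---")
        print(content)
```

Doing this by hand for two data nodes is tolerable; doing it for dozens, on top of launching instances and starting services, is where the hours and days go.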
Read more at DZone