Repo     Name  Release Date  Tarballs      Apt      Yum
Stable   CDH1  March 2009    /cdh/stable   /debian  /redhat/cdh/stable
Testing  CDH2  August 2009   /cdh/testing  /debian  /redhat/cdh/testing

2.8.2. Starting and stopping a cluster

Example 101. Starting a cluster called "my-hadoop-cluster" with 10 worker nodes

% bin/hadoop-ec2 launch-cluster my-hadoop-cluster 10

When the master has started, the console displays a message like:

Master is ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com
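If you script the launch, the hostname can be pulled out of that message for later use. The following is a minimal sketch, assuming the line has exactly the form shown above; the function name master_host is hypothetical and is not part of hadoop-ec2.

```shell
# Hypothetical helper: read console output on standard input and print the
# hostname from any line of the form "Master is <host>".
# Assumes the message has exactly the form shown in the text above.
master_host() {
  sed -n 's/^Master is \(.*\)$/\1/p'
}

# Example using the placeholder hostname from the text:
echo "Master is ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com" | master_host
```

Lines that do not match the pattern are ignored, so the helper can be fed the whole console output.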

To access Hadoop's web UI, you need to allow access to the relevant ports by setting permissions on the security groups in your cluster. This is most easily achieved using Elastic Fox, a Firefox plugin for managing EC2 instances. In the Elastic Fox control panel:

  1. Go to the "Security Groups" tab
  2. Select "my-hadoop-cluster-master" (or the appropriate security group for your cluster) and click the green check icon.
  3. Under Protocol Details, choose "HTTP" and port 50030 (jobtracker).
  4. Enter your host or network details. It is important to restrict access to your own host or network to prevent others from accessing your cluster. Click "Add".
  5. Repeat to open port 50070 (namenode) for the master's group.
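
If you prefer the command line to Elastic Fox, the same two rules can be expressed with the ec2-authorize command from the Amazon EC2 API tools. The sketch below only prints the commands so you can review them before running anything. It assumes the API tools are installed and configured; the function name and the 203.0.113.0/24 network are placeholders for illustration.

```shell
# Sketch only: print (do not run) ec2-authorize commands that would open the
# jobtracker (50030) and namenode (50070) web UI ports to a single network.
# Assumptions: the Amazon EC2 API command line tools are installed; the
# function name and the example CIDR are placeholders, not part of hadoop-ec2.
print_authorize_cmds() {
  group="$1"   # security group, e.g. my-hadoop-cluster-master
  cidr="$2"    # your own host or network in CIDR notation
  for port in 50030 50070; do
    echo "ec2-authorize ${group} -P tcp -p ${port} -s ${cidr}"
  done
}

print_authorize_cmds my-hadoop-cluster-master 203.0.113.0/24
```

Replace the placeholder network with your own address range. Opening these ports to 0.0.0.0/0 would expose the cluster to everyone.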
Tip

To monitor MapReduce jobs visit http://ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com:50030 in a browser (substituting the hostname printed when the master started). Similarly, you can monitor HDFS at http://ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com:50070.
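
As a small convenience, the two addresses can be generated from the master hostname. This is a sketch under the assumption that the default ports named above are in use; the function name hadoop_ui_urls is hypothetical.

```shell
# Hypothetical helper: print the web UI URLs for a given master hostname.
# Port 50030 is the jobtracker and port 50070 is the namenode, as named in
# the text above.
hadoop_ui_urls() {
  master="$1"
  echo "http://${master}:50030"
  echo "http://${master}:50070"
}

# Example using the placeholder hostname from the text:
hadoop_ui_urls ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com
```

The URLs respond only after the security group rules for those ports are in place.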

Note

If you have selected an instance type with multiple disks (one of the large or xlarge instances), the cluster takes several minutes to come up after the instance has started, because the extra disks must be formatted first.

Example 102. Logging into your Hadoop cluster

% bin/hadoop-ec2 login my-hadoop-cluster

This opens an SSH session to the master node, which is a convenient place to run jobs from. For example, create some test input and run a job on it:

Example 103. Running the Hadoop grep example on EC2

# hadoop fs -mkdir input
# hadoop fs -put /etc/hadoop/conf/*.xml input
# hadoop jar /usr/lib/hadoop/hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
# hadoop fs -cat output/part-00000 | head
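
To get a feel for what the grep job computes, the same extract-and-count logic can be approximated locally with standard shell tools. This is an illustration only, not how Hadoop runs the job. The function name count_matches is hypothetical, the sample input is made up, and the ordering of entries with equal counts may differ from the job's output.

```shell
# Hypothetical local approximation of the grep example: extract every match
# of the regular expression from standard input, count each distinct match,
# and list the most frequent first as "count<TAB>match".
# Relies on grep -o, which GNU and BSD grep both provide.
count_matches() {
  grep -oE "$1" | sort | uniq -c | sort -rn | awk '{print $1 "\t" $2}'
}

# Made-up sample input; on the cluster the input is the uploaded XML files.
printf 'dfs.replication\ndfs.name.dir dfs.replication\n' | count_matches 'dfs[a-z.]+'
```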

Caution

WHEN YOU TERMINATE YOUR CLUSTER ALL DATA WILL BE LOST!

Example 104. Terminating your Hadoop cluster

% bin/hadoop-ec2 terminate-cluster my-hadoop-cluster