Repo     Name  Release Date  Tarballs      Apt      Yum
Stable   CDH1  March 2009    /cdh/stable   /debian  /redhat/cdh/stable
Testing  CDH2  August 2009   /cdh/testing  /debian  /redhat/cdh/testing

2.6. Customization

You can specify a list of packages to install on every instance at boot time using the --user-packages command-line option (or the user_packages configuration parameter). Separate the package names with spaces. Package names must match the package manager that installs them (yum or apt-get, depending on the OS), since the same software is often packaged under different names on different distributions.

Example 92. Installing RPMs for R and git

% hadoop-ec2 launch-cluster --user-packages 'R git-core' my-hadoop-cluster 10
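Because package names depend on the package manager, it can help to keep the per-manager lists in one place. The following is a minimal sketch, not part of hadoop-ec2: the helper name user_packages_for is hypothetical, and the apt-get names (r-base for R, git-core for git) are assumptions about the Debian/Ubuntu packages that should be checked against the target image.

```shell
# Hypothetical helper (not part of hadoop-ec2): print the space-separated
# package list to pass to --user-packages for a given package manager.
user_packages_for() {
  case "$1" in
    yum)     echo "R git-core" ;;       # RPM names, as in Example 92
    apt-get) echo "r-base git-core" ;;  # assumed Debian/Ubuntu names
    *)       echo "unknown package manager: $1" >&2; return 1 ;;
  esac
}

# Usage (commented out, since it launches real instances):
# hadoop-ec2 launch-cluster --user-packages "$(user_packages_for yum)" my-hadoop-cluster 10
```

Quoting the command substitution keeps the whole list as a single argument, which is what the space-separated option value requires.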

You have full control over the script that runs when each instance boots. The default script, hadoop-ec2-init-remote.sh, is a good starting point for adding extra configuration or customizing the instance. Copy the script to your home directory (or a similar location), edit the copy, and set the --user-data-file command-line option (or the user_data_file configuration parameter) to point to it.
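The copy-and-modify workflow might look like the sketch below. The helper customize_init_script is hypothetical, the location of the default script depends on where the scripts were installed, and appending a command to the end of the copy is only one simple way to customize it; larger changes would normally be made by editing the copy directly.

```shell
# Hypothetical helper: copy the default boot script and append one extra
# command to the copy, leaving the original untouched.
# Arguments: <default-script> <copy> <command-to-append>
customize_init_script() {
  default_script=$1
  copy=$2
  extra_command=$3
  cp "$default_script" "$copy" || return 1
  printf '\n# Local customization\n%s\n' "$extra_command" >> "$copy"
}

# Usage (commented out; the source path is an assumption and the second
# command launches real instances):
# customize_init_script ./hadoop-ec2-init-remote.sh "$HOME/my-init-remote.sh" \
#   'echo "custom boot step ran" >> /var/log/custom-boot.log'
# hadoop-ec2 launch-cluster --user-data-file "$HOME/my-init-remote.sh" my-hadoop-cluster 10
```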

Another way of customizing the instance, which may be more appropriate for larger changes, is to create your own AMI using one of the base images listed in the table above.

It's possible to use any AMI, as long as it (i) runs the user data supplied at boot, including when that data is gzip-compressed, and (ii) has Java installed.
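To make the two requirements concrete, here is a minimal sketch of what a boot-time hook on a custom AMI would need to do. It is not the mechanism used by the base images, which this section does not describe. The function names are hypothetical, and the sketch assumes the user data has already been saved to a local file.

```shell
# Hypothetical boot hook: run a user-data payload whether or not it is
# gzip-compressed. Argument: path to the saved user-data file.
run_user_data() {
  payload=$1
  if gzip -t < "$payload" 2>/dev/null; then
    # Payload is valid gzip: decompress to stdout and execute it.
    gzip -dc < "$payload" | bash
  else
    # Payload is a plain script: execute it directly.
    bash "$payload"
  fi
}

# Requirement (ii): succeed only if a java executable is on the PATH.
java_installed() {
  command -v java >/dev/null 2>&1
}
```

Reading the payload through standard input avoids any dependence on the file having a .gz suffix.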