::Go back to Oozie Documentation Index::

Building Oozie

System Requirements

JDK commands (java, javac) must be in the command path.

The Maven command (mvn) must be in the command path.

Oozie Documentation Generation

To generate the documentation, Oozie uses a patched Doxia plugin for Maven with improved twiki support.

This plugin is available from Yahoo GitHub Maven repository and it is automatically downloaded by Maven when building Oozie.

The source of the modified plugin is available in the Oozie GitHub repository, in the ydoxia branch.

To build and install it locally run the following command in the ydoxia branch:

$ mvn install

Passphare-less SSH Setup

NOTE: SSH actions are deprecated in Oozie 2.

To run SSH Testcases and for easier Hadoop start/stop configure SSH to localhost to be passphrase-less.

Create your SSH keys without a passphrase and add the public key to the authorized file:

$ ssh-keygen -t dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys2

Test that you can ssh without password:

$ ssh localhost

Building and Testing Oozie

Since Oozie 2.1, the same Oozie distribution works with different versions of Hadoop.

The JARs for the specified Hadoop and Pig versions must be available in one of the Maven repositories defined in Oozie main 'pom.xml' file. Or they must be installed in the local Maven cache.

Examples of Compiling and Testing Oozie

Compiling and testing with Hadoop 0.20.2 using embedded minicluster:

$ mvn clean test -Dhadoop20=true -DhadoopGroupId=org.apache.hadoop -DhadoopVersion=0.20.2

Compiling and testing with Hadoop 0.20.2 using a Hadoop cluster:

$ mvn clean test -Dhadoop20=true -DhadoopGroupId=org.apache.hadoop -DhadoopVersion=0.20.2 \
  -Doozie.test.hadoop.minicluster=false \
  -Doozie.test.job.tracker=localhost:9001 -Doozie.test.name.node=hdfs://localhost:9000

Compiling and testing with Hadoop 0.20.104.2 using embedded minicluster and 'simple' authentication:

$ mvn clean test -DhadoopGroupId=com.yahoo.hadoop -DhadoopVersion=0.20.104.2

Compiling and testing with Hadoop 0.20.104.2 using a Hadoop cluster and 'simple' authentication:

$ mvn clean test -DhadoopGroupId=com.yahoo.hadoop -DhadoopVersion=0.20.104.2 \
  -Doozie.test.hadoop.security=simple -Doozie.test.hadoop.minicluster=false \
  -Doozie.test.job.tracker=localhost:9001 -Doozie.test.name.node=hdfs://localhost:9000

Compiling and testing with Hadoop 0.20.104.2 using a Hadoop cluster and 'kerberos' authentication:

$ mvn clean test -DhadoopGroupId=com.yahoo.hadoop -DhadoopVersion=0.20.104.2 \
  -Doozie.test.hadoop.security=kerberos -Doozie.test.hadoop.minicluster=false \
  -Doozie.test.job.tracker=localhost:9001 -Doozie.test.name.node=hdfs://localhost:9000

NOTE: The embedded minicluster cannot be used when testing with 'kerberos' authentication.

Compiling and testing with a custom oozie-site.xml (i.e. in order to use a different DB):

$ mvn clean test -Doozie.test.config.file=/home/tucu/oozie-site-mysql.xml

Build Options Reference

generateSite : generates Oozie documentation, default is undefined (no documentation is generated)

mysql : Includes MySQL JDBC driver in the distro, default is undefined, MySQL JDBC driver is not included.

oracle : Includes Oracle JDBC driver in the distro, default is undefined, Oracle JDBC driver is not included.

skipTests : skips the execution of all testcases, no value required, default is undefined

test = : runs a single test case, to run a test give the test class name without package and extension, no default

hadoop20 = : indicates the build/test should not include classes for Hadoop 20S, default is 'false'

hadoopGroupId = : indicates the Hadoop artifacts group ID to use, default value 'com.yahoo.hadoop'

hadoopVersion = : indicates the Hadoop artifacts version to use, default value '0.20.104.2'

hadoopScope = : indicates the Hadoop artifacts scope, default is 'provided' and the generated oozie.war does not include Hadoop JARs. If 'compile' is used the generated oozie.war will contain Hadoop JARs.

pigGroupId = : indicates the Pig artifact group ID to use, default value 'com.yahoo.hadoop'

pigVersion = : indicates the Pig artifact version to use, default value '0.7.0'

oozie.test.config.file = : indicates a custom Oozie configuration file for running the testcases. The specified file must be an absolute path. For example, it can be useful to specify different database than HSQL for running the testcases.

oozie.test.hadoop.minicluster = : indicates if Hadoop minicluster should be started for testcases, default value 'true'

oozie.test.job.tracker = : indicates the URI of the JobTracker when using a Hadoop cluster for testing, default value 'localhost:9001'

oozie.test.name.node = : indicates the URI of the NameNode when using a Hadoop cluster for testing, default value 'hdfs://localhost:9000'

oozie.test.hadoop.pipes = : indicates if Hadoop pipes test case should be run or not, default value 'false'

oozie.test.hadoop.security = : indicates the type of Hadoop authentication for testing, valid values are 'simple' or 'kerberos, default value 'simple'

oozie.test.kerberos.keytab.file = : indicates the location of the keytab file, default value '${user.home}/oozie.keytab'

oozie.test.kerberos.realm = : indicates the Kerberos real, default value 'LOCALHOST'

oozie.test.kerberos.oozie.principal = : indicates the Kerberos principal for oozie, default value '${user.name}/localhost'

oozie.test.kerberos.jobtracker.principal = : indicates the Kerberos principal for the JobTracker, default value 'mapred/localhost'

oozie.test.kerberos.namenode.principal = : indicates the Kerberos principal for the NameNode, default value 'hdfs/localhost'

Oozie Version Numbers

When changing the Oozie version, it must be changed through out the pom.xml files. Look for the comment in the pom.xml files to find all the locations to change it.

Building an Oozie Distribution

The following Maven invocation builds an Oozie distribution:

$ mvn clean package assembly:single

The following properties should be specified when building a release:

  • -DgenerateDocs : forces the generation of Oozie documentation
  • -Dbuild.time= : timestamps the distribution
  • -Dvc.revision= : specifies the source control revision number of the distribution
  • -Dvc.url= : specifies the source control URL of the distribution

The provided bin/mkdistro.sh script runs the above Maven invocation setting all these properties to the right values.

NOTE: Use a Hadoop Security version when creating a distribution, otherwise there will be classes missing and the distribution will not work with a Hadoop Security distribution.

IDE Setup

Eclipse and IntelliJ can use directly Oozie Maven project files.

The only special consideration is that the following source directories from the client module must be added to the core module source path:

  • client/src/main/java : as source directory
  • client/src/main/resources : as source directory
  • client/src/test/java : as test-source directory
  • client/src/test/resources : as test-source directory

::Go back to Oozie Documentation Index::