Introducing Oozie Editor/Dashboard
The Oozie Editor/Dashboard application allows you to define Oozie workflow and coordinator applications, run workflow and coordinator jobs, and view the status of jobs.
A workflow application is a collection of actions arranged in a directed acyclic graph (DAG). It includes control flow nodes (start, end, fork, join, and kill) and action nodes (MapReduce, streaming, Java, Pig, Hive, Sqoop, Shell, and ssh actions). The current release does not support the decision control flow node or the fs and Oozie sub-workflow action nodes.
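To illustrate what "directed acyclic graph" means for a workflow, here is a minimal Python sketch. It is not how Oozie or Hue represents workflows internally. The node names and the dictionary layout are hypothetical and chosen only for illustration. The function orders the nodes so that every node comes after its predecessors, and reports an error if the graph contains a cycle.

```python
from collections import deque

# Hypothetical workflow: node name -> list of successor node names.
WORKFLOW = {
    "start": ["fork"],
    "fork": ["pig-step", "hive-step"],
    "pig-step": ["join"],
    "hive-step": ["join"],
    "join": ["end"],
    "end": [],
}


def topological_order(graph):
    """Return the nodes in an order that respects every edge.

    Raises ValueError if the graph contains a cycle (that is, it is not a DAG).
    """
    # Count incoming edges for every node.
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for nxt in successors:
            indegree[nxt] += 1

    # Nodes with no incoming edges can run first.
    ready = deque(node for node, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in graph[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Any node left unvisited sits on a cycle.
    if len(order) != len(graph):
        raise ValueError("workflow contains a cycle, so it is not a DAG")
    return order
```

In this sketch the fork node has two successors that can run in parallel, and the join node becomes ready only after both have finished.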
A coordinator application allows you to define and execute recurrent and interdependent workflow jobs. The coordinator application defines the conditions under which the execution of workflows can occur.
Oozie Editor/Dashboard Installation and Configuration
Oozie Editor/Dashboard is one of the applications installed as part of Hue. For more information about installing Hue, see Hue Installation. For information about Oozie, see Oozie Documentation.
 | Note: In order to run streaming or Pig jobs as part of a workflow, Oozie must be configured to use the Oozie ShareLib. If this is not the case, Pig and streaming actions will not run. See Hue Installation for more information. |
Starting Oozie Editor/Dashboard
To start Oozie Editor/Dashboard, click the Oozie Editor/Dashboard icon in the navigation bar at the top of the Hue browser page. Oozie Editor/Dashboard opens with the following screens:
- Dashboard - shows the running and completed workflow and coordinator jobs. This screen is selected by default and opens to the Workflows page.
- Workflows - shows available workflows.
- Coordinators - shows available coordinators.
- History - shows a list of submitted jobs.
Filtering Lists in Oozie Editor/Dashboard
The Dashboard, Workflows, Coordinators, and History screens contain lists of workflows, coordinators, and jobs. When you type in the Filter field on these screens, the lists are dynamically filtered to display only those rows containing text that matches the specified substring.
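The filtering behavior described above can be sketched in Python as follows. This is an illustration only, not Hue's implementation. The row layout is hypothetical, and case-insensitive matching is an assumption, since the text does not say whether the match is case sensitive.

```python
def filter_rows(rows, text):
    """Keep only the rows in which at least one cell contains `text`.

    Assumption: matching is case-insensitive. The source does not say.
    """
    needle = text.lower()
    return [
        row for row in rows
        if any(needle in str(cell).lower() for cell in row)
    ]
```

For example, with rows of the form (name, description), typing "sales" would keep only the rows whose name or description contains that substring. An empty filter keeps every row.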
Permissions in Oozie Editor/Dashboard
In the Dashboard, workflows and coordinators can be viewed, submitted, and modified only by their owner or a superuser.
Editor permissions for performing actions on workflows and coordinators are summarized in the following table:
Action | Superuser or Owner | All |
View | Y | Only if "Is shared" is set |
Submit | Y | Only if "Is shared" is set |
Modify | Y | N |
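The editor permission rules in the table can be expressed as a small Python function. This is a sketch of the rules as stated in this section, not Hue's own code. The function name and its parameters are hypothetical.

```python
def can_perform(action, is_owner, is_superuser, is_shared):
    """Return True if a user may perform `action` on a workflow or coordinator.

    `action` is one of "view", "submit", or "modify".
    """
    if action not in ("view", "submit", "modify"):
        raise ValueError("unknown action: %r" % (action,))
    # The owner and superusers may do everything.
    if is_owner or is_superuser:
        return True
    # Other users may never modify.
    if action == "modify":
        return False
    # Other users may view or submit only when "Is shared" is set.
    return is_shared
```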
Oozie Dashboard
Oozie Dashboard shows a summary of the running and completed workflow and coordinator jobs.
You can view jobs for a period up to the last 30 days.
You can filter the list by date (1, 7, 15, or 30 days) or status (Succeeded, Running, or Killed). The date and status buttons are toggles.
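As a rough illustration of how the date and status filters combine, the following Python sketch narrows a list of jobs. The job dictionaries and their field names ("status", "started") are hypothetical and do not reflect the Dashboard's actual data model. Filtering on the job start time is also an assumption.

```python
from datetime import datetime, timedelta


def filter_jobs(jobs, now, days=None, statuses=None):
    """Return the jobs that pass the active filters.

    days:     keep jobs started within the last `days` days (None = no date filter)
    statuses: keep jobs whose status is in this set (None or empty = no status filter)
    """
    selected = []
    for job in jobs:
        if days is not None and job["started"] < now - timedelta(days=days):
            continue
        if statuses and job["status"] not in statuses:
            continue
        selected.append(job)
    return selected
```

Passing `days=7` and `statuses={"Succeeded"}` corresponds to having the 7-day and Succeeded toggles switched on at the same time.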
Workflows
Click the Workflows tab to view the running and completed workflows for the filters you have specified.
Click a workflow row in the Running or Completed table to view detailed information about that workflow.
For the selected workflow, the following tabs and information are available.
- The Graph tab shows the workflow DAG.
- The Actions tab shows you details about the actions that make up this workflow.
- Click the Id link to see additional details about this action.
- Click the External Id link to view this action in the Hue Job Browser.
- The Details tab shows job statistics including start and end times, and provides a link to the actual workflow definition in the File Browser.
- The Configuration tab shows selected job configuration settings.
- The Logs tab shows log output generated by the workflow.
- The Definition tab shows the Oozie workflow definition, as it appears in the workflow.xml file (also linked under the application path properties in the Details tab and the Configuration tab).
Coordinators
Click the Coordinators tab to view the running and completed coordinator jobs for the filters you have specified.
For the selected coordinator, the following tabs and information are available.
- The Calendar tab shows the timestamp of the job. Click the timestamp to open the workflow DAG.
- The Actions tab shows you details about the actions that make up this coordinator.
- Click the Id link to see additional details about this action.
- Click the External Id link to view this action in the Hue Job Browser.
- The Configuration tab shows selected job configuration settings.
- The Logs tab shows log output generated by the coordinator.
- The Definition tab shows the Oozie coordinator definition, as it appears in the coordinator.xml file (also linked under the oozie.coord.application.path property in the Configuration tab).
Workflow Editor
The Workflow Editor is where you create or edit Oozie workflows and submit them for execution. The Workflow Editor comes with several preinstalled sample workflows.
In Workflow Editor, you can create workflows that include MapReduce, streaming, Java, Pig, Hive, Sqoop, Shell, and ssh actions. You can create these actions in the Workflow Editor, or you can import job designs from Job Designer to be used as actions in your workflow.
Click the Workflows tab to open the Workflow Editor.
The main page of the workflow editor shows the current set of workflow designs.
Each row shows a workflow design: its name, its description, and the timestamp of its last modification. It also shows:
- Steps: the number of steps in the workflow execution path, that is, the number of execution steps between the start and end of the workflow. If the control path contains control flow nodes, this number is not necessarily the same as the number of actions in the workflow.
- Status: who can run the workflow. shared means users other than the owner can access the workflow. personal means only the owner can modify or submit the workflow. The default is personal.
- Owner: the user that created the workflow.
Opening a Workflow
To open a workflow, click the workflow. Proceed with Editing a Workflow.
Creating a Workflow
To create a workflow:
- Click the Create button at the top right of the Action Chooser.
- In the Name field, type a name.
- To specify the HDFS deployment directory and Oozie schema version, click advanced.
- Click Save. The workflow editor opens. Proceed with Editing a Workflow.
Editing a Workflow
In the workflow editor you can add and delete actions, clone actions, create and remove fork and join control nodes, and move actions as follows:
- Add actions to the workflow by doing one of the following:
- Click the Add tab.
- Click a + action button, where action is MapReduce, Streaming, Java, Pig, Hive, Sqoop, Shell, or Ssh.
- Set the action properties and click Save.
- Click the Import tab.
- Click the + Job Design button.
- Click a radio button next to a job design and click Import.
- Clone actions by clicking the clone button.
- Create or remove a fork and join by moving an action up or down with the move buttons.
- Change the position of an action by clicking a move button in the same direction twice.
- To upload a file (for example, a Pig script) to the folder containing the files referenced by the workflow, click the Upload button.
 | Note: Each action must have a unique name. |
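If you maintain workflows outside the editor, a quick check like the following Python sketch can catch action names that are used more than once before you import or submit a workflow. The function name is hypothetical, and the editor's own validation may behave differently.

```python
from collections import Counter


def duplicate_action_names(names):
    """Return a sorted list of the action names that appear more than once."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)
```

An empty result means that every action name in the list is unique.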
Editing Workflow Properties
- In the workflow editor, click the Properties tab.
- To share the workflow with all users, check the Is shared checkbox.
- To set advanced execution options, click advanced and edit the deployment directory, add parameters and job properties, or change the Oozie schema version.
- Click Save.
Submitting a Workflow
To submit a workflow for execution, click the radio button next to the workflow and click the Submit button.
Scheduling a Workflow
To schedule a workflow for recurring execution, click the radio button next to the workflow and click the Schedule button. A coordinator is created and opened in the coordinator editor.
Coordinator Editor
The Coordinator Editor is where you create or edit Oozie coordinator applications and submit them for execution. The Coordinator Editor contains one preinstalled sample coordinator.
Opening a Coordinator
To open a coordinator, click the coordinator. Proceed with Editing a Coordinator.
Creating a Coordinator
To create a coordinator:
- Click the Create button at the top right of the Action Chooser.
- In the Name field, type a name.
- In the Workflow drop-down list, choose a workflow that the coordinator will schedule.
- In the Frequency area, specify how often the workflow will be scheduled and how many times it will run.
- Click Save. The coordinator editor opens. Proceed with Editing a Coordinator.
Editing a Coordinator
 | Note: Most workflows require either an input dataset, an output dataset, or both. |
In the coordinator editor you specify coordinator properties and the datasets on which the workflow scheduled by the coordinator will operate as follows:
- To share the coordinator with all users, check the Is shared checkbox.
- To set advanced execution options, click advanced and fill in properties that determine how long a coordinator will wait before timing out, how many coordinators can wait and run concurrently, the coordinator scheduling policy, and the coordinator schema version.
- In the Frequency area, set how many times the coordinator will run for each specified unit, the start and end time of the coordinator, and the timezone of the start and end times.
- The inputs and outputs of the workflow must be mapped to some data. Click Add and select a dataset from the Dataset drop-down menu and map it to one variable of your workflow.
If no datasets exist, follow the procedure in Creating a Dataset.
- Select a dataset from the Dataset drop-down menu.
- Click Save.
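To give a feel for how the frequency, start time, and end time interact, the following Python sketch lists the times at which a coordinator would trigger its workflow. It is a simplification, not Oozie's scheduling logic. It assumes a fixed frequency in minutes, treats the end time as exclusive, and ignores timezones and daylight saving time. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def nominal_times(start, end, frequency_minutes):
    """List the trigger times from `start` up to, but not including, `end`."""
    if frequency_minutes <= 0:
        raise ValueError("frequency_minutes must be positive")
    step = timedelta(minutes=frequency_minutes)
    times = []
    current = start
    while current < end:  # assumption: the end time is exclusive
        times.append(current)
        current += step
    return times
```

With a start of midnight on one day, an end of midnight on the next day, and a frequency of 360 minutes, this sketch yields four trigger times, six hours apart.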
Creating a Dataset
- In the coordinator editor, do one of the following:
- Click the Datasets tab at the top of the editor.
- In the Data area, click the Datasets button.
- Click Create.
- In the Start and Frequency fields, specify when and how often input datasets will be available.
- In the Uri field, specify a URI template for the location of input and output datasets. You can use variables in the template to construct URIs and URI paths containing dates and timestamps.
- Specify the timezone of the start date.
- In the Done flag field, specify the flag that identifies when input datasets are ready.
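The following Python sketch shows how a URI template with date variables might resolve to a concrete dataset location. The variable names YEAR, MONTH, DAY, HOUR, and MINUTE are assumed here to be the ones defined by the Oozie coordinator specification, because this section does not list them. The HDFS host and path are made up for illustration, and the substitution uses Python's string.Template rather than Oozie's own expression language.

```python
from datetime import datetime
from string import Template


def resolve_uri(template, instance_time):
    """Fill the date and time variables of a URI template for one dataset instance."""
    values = {
        "YEAR": "%04d" % instance_time.year,
        "MONTH": "%02d" % instance_time.month,
        "DAY": "%02d" % instance_time.day,
        "HOUR": "%02d" % instance_time.hour,
        "MINUTE": "%02d" % instance_time.minute,
    }
    # string.Template understands the ${NAME} syntax used in the template.
    return Template(template).substitute(values)


# Hypothetical template; replace the host and path with your own.
EXAMPLE_TEMPLATE = "hdfs://namenode.example.com:8020/data/logs/${YEAR}/${MONTH}/${DAY}"
```

Calling `resolve_uri(EXAMPLE_TEMPLATE, datetime(2013, 3, 7))` produces a path ending in `/2013/03/07`, one directory per day.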
Submitting a Coordinator
To submit a coordinator for execution, click the radio button next to the coordinator and click the Submit button.
Submissions History
The Submissions History is where you view the history of workflow and coordinator jobs. Clicking a link in the Name column opens the workflow or coordinator in an editor. Clicking a link in the Submission Id column opens the job in the Dashboard.