The Beeswax application enables you to perform queries on Apache Hive, a data warehousing system designed to work with Hadoop. You can create Hive tables, load data, run and manage Hive queries, and download the results as a Microsoft Office Excel worksheet file or a comma-separated values file.
Beeswax is installed and configured as part of Hue. For more information about installing Hue, see Hue Installation.
Beeswax assumes an existing Hive installation. The Hue installation instructions include the configuration necessary for Beeswax to access Hive. You can view the current Hive configuration from the Settings tab in the Beeswax application.
By default, a Beeswax user can see the saved queries of all users, both their own and those of other Beeswax users. To restrict viewing of saved queries to the query owner and Hue administrators, edit the /etc/hue/hue.ini file: in the [beeswax] section, find and uncomment the share_saved_queries property and set it to false.
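As a sketch, the relevant part of /etc/hue/hue.ini would look roughly like this after the change. Other properties in the [beeswax] section are omitted, and it is assumed here that Hue must be restarted before the new value takes effect.

```ini
[beeswax]
# Restrict saved queries to their owner and Hue administrators.
share_saved_queries=false
```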
To start the Beeswax application, click the Beeswax icon in the navigation bar at the top of the Hue browser page.
The first time you run Beeswax, the "Welcome to Beeswax for Hive" page appears, and prompts you to install the sample tables or import your own tables.
Once some tables have been created, either by installing the samples or by importing your own data, clicking the Beeswax tab takes you directly to the Query Editor.
The tabs in the Beeswax navigation bar allow you to navigate to the main functional areas of Beeswax.
You can install two sample Beeswax tables to use as examples.
Once you have installed the sample data, you will no longer see the Import Data and Install Samples buttons when you run Beeswax.
If you want to import your own data instead of installing the sample tables:
Note: If the Welcome to Beeswax page with the Import Data button no longer appears, you can still import your own data by clicking the Tables tab and creating a new table.
The Query Editor view lets you create queries in Hive's Query Language (HQL), which is similar to Structured Query Language (SQL). You can name and save your queries to use later. When you submit a query, the Beeswax Server uses Hive to run it. You can either wait for the query to complete, or return later to find it in the Beeswax History view. You can also request to receive an email message after the query is completed.
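For illustration, a query typed into the Query Editor might look like the following sketch. The table and column names (`page_views`, `url`, `view_date`) are hypothetical and are not part of the sample data.

```sql
-- Hypothetical table: page_views(url STRING, view_date STRING)
-- Count views per URL for one day and list the ten most viewed URLs.
SELECT url, COUNT(*) AS view_count
FROM page_views
WHERE view_date = '2012-01-15'
GROUP BY url
ORDER BY view_count DESC
LIMIT 10
```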
For more information about HQL syntax, see http://wiki.apache.org/hadoop/Hive/LanguageManual.
Note: To run a query, you must be logged in to Hue as a user that also has a Unix user account on the remote server.
To create and run a query:
The section to the left of the Query field lets you specify the following options:
Option | Description |
---|---|
Hive Settings | Use Hive Settings to override the Hive and Hadoop default settings. Click Add to configure a new setting. » For Key, enter a Hive or Hadoop configuration variable name. » For Value, enter the value you want to use for the variable. For example, to override the directory where structured Hive query logs are created, you would enter hive.querylog.location for Key, and a path for Value. Click Add again to add another new setting. To view the default settings, click the Settings tab at the top of the page. For information about Hive configuration variables, see: http://wiki.apache.org/hadoop/Hive/AdminManual/Configuration. For information about Hadoop configuration variables, see: http://hadoop.apache.org/common/docs/current/mapred-default.html |
File Resources | Use File Resources to make locally accessible files available at query execution time on the entire Hadoop cluster. Hive uses Hadoop's Distributed Cache to distribute the added files to all machines in the cluster at query execution time. Click Add to configure a new setting. From the Type drop-down menu, choose one of the following: jar — Adds the resources to the Java classpath. This is required in order to reference objects such as user defined functions. archive — Automatically unarchives resources when distributing them. file — Adds resources to the distributed cache. Typically, this might be a transform script (or similar) to be executed. For Path, enter the path to the file. You can also click Choose a File to browse and select the file. Note: It is not necessary to specify files used in a transform script if the files are available in the same path on all machines in the Hadoop cluster. |
User-defined Functions | You can use user-defined functions in a query. Specify the function name for Name, and specify the class name for Class name. Click Add to configure a new setting. You must specify a JAR file for the user-defined functions in File Resources. To include a user-defined function in a query, add a $ (dollar sign) before the function name in the query. For example, if MyTable is a user-defined function name in the query, you would type: SELECT * $MyTable |
Parameterization | To display a dialog box for you or other users to enter parameter values when a query is executed, check Parameterization. This is enabled by default. |
Email Notification | To receive an email message after a query completes, check Email Notification. The email is sent to the email address specified in the logged-in user's profile. |
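The Hive Settings, File Resources, and User-defined Functions options correspond roughly to statements you could otherwise issue yourself in the Hive command-line interface. The following sketch shows those statements. The paths, the function name `my_lower`, and the class `com.example.hive.udf.MyLower` are hypothetical, and Beeswax may apply the options differently internally.

```sql
-- Hive Settings: override a configuration variable for the session.
SET hive.querylog.location=/tmp/hive-querylogs;

-- File Resources, type jar: add the resource to the Java classpath.
ADD JAR /tmp/udfs/my-udfs.jar;

-- File Resources, type archive: unarchived automatically when distributed.
ADD ARCHIVE /tmp/resources/lookup.zip;

-- File Resources, type file: for example, a transform script.
ADD FILE /tmp/scripts/clean.py;

-- User-defined Functions: bind a function name to a class in the added JAR.
CREATE TEMPORARY FUNCTION my_lower AS 'com.example.hive.udf.MyLower';
```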
Beeswax enables you to view the history of queries that you have previously run. Results for these queries are available for one week or until Hue is restarted.
To view query history:
You can view a list of saved queries of all users by clicking Saved Queries in the Beeswax window. You can copy any user's query, but you can only edit, delete, and view the history of your own queries.
To edit a saved query:
To delete a saved query:
To copy a saved query:
To copy a query in the Beeswax Query History window:
When working with Hive tables, you can use Beeswax to:
Although you can create tables by executing the appropriate HQL DDL query commands, it is easier to create a table using the Beeswax table creation wizard.
There are two ways to create a table: from a file, or manually.
If you create a table from a file, the format of the data in the file will determine some of the properties of the table, such as the record and file formats. The data from the file you specify is imported automatically upon table creation.
When you create a table manually, you specify all the properties of the table, and then execute the resulting query to actually create the table. You then import data into the table as an additional step.
To create a table from a file:
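For comparison, the HQL DDL route might look like the following sketch for a comma-delimited text table. The table name, columns, and delimiter are hypothetical; the statement the wizard produces from your choices is assumed to be of a similar shape.

```sql
-- Hypothetical table stored as comma-delimited text files.
CREATE TABLE page_views (
  url       STRING,
  view_date STRING,
  user_id   INT
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE
```

As with the manual wizard path, the table is empty after this statement runs, and data is imported in a separate step.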
To create a table manually:
To browse the data in a table:
To browse the metadata in a table:
When importing data, you can choose to append or overwrite the table's data with data from a file.
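In HQL terms, the two choices correspond to `LOAD DATA` without and with the `OVERWRITE` keyword. A sketch, assuming a hypothetical table `page_views` and hypothetical HDFS file paths:

```sql
-- Append: existing rows are kept and the rows in the file are added.
LOAD DATA INPATH '/user/hue/uploads/views_day2.csv'
INTO TABLE page_views;

-- Overwrite: existing data in the table is replaced by the file's contents.
LOAD DATA INPATH '/user/hue/uploads/views_day3.csv'
OVERWRITE INTO TABLE page_views;
```

Note that `LOAD DATA INPATH` moves the source file into the table's storage location rather than copying it.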
To import data into a table:
To drop a table:
To view a table's location:
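The table operations in this section also have HQL counterparts that can be run from the Query Editor, one statement at a time. A sketch, using a hypothetical table `page_views`:

```sql
-- Browse data: look at the first few rows.
SELECT * FROM page_views LIMIT 10;

-- Browse metadata: list the columns and their types.
DESCRIBE page_views;

-- View location: the detailed output includes the table's storage location.
DESCRIBE EXTENDED page_views;

-- Drop the table.
DROP TABLE page_views;
```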