API Usage Tutorial

Cloudera Manager Concepts

The API terminology is similar to that used in the web UI:

Cluster

A cluster is a set of hosts running interdependent services. All services in a cluster have the same CDH version. A Cloudera Manager installation may have multiple clusters, which are uniquely identified by different names.

You can issue commands against a cluster.

Service

A service is an abstract entity providing a capability in a cluster. Examples of services are HDFS, MapReduce, YARN, and HBase. A service is usually distributed, and contains a set of roles that physically run on the cluster. A service has its own configuration, status and roles. You may issue commands against a service, or against a set of roles in bulk. Additionally, an HDFS service has nameservices, and a MapReduce service has activities.

All services belong to a cluster (except for the Cloudera Management Service), and each is uniquely identified by its name within a Cloudera Manager installation. The types of services available depend on the CDH version of the cluster.

Role

A role performs specific actions for a service, and is assigned to a host. It usually runs as a daemon process, such as a DataNode or a TaskTracker, though not all roles are daemon processes. Once created, a role cannot be reassigned to a different host; to move it, you must delete and re-create it.

A role has its own configuration and status. API commands on roles are always issued in bulk at the service level.

Role Type

Role type refers to the class that a role belongs to. For example, an HBase service has the Master role type and the RegionServer role type. Different service types have different sets of role types. This is not to be confused with a role, which refers to a specific role instance that is physically assigned to a host.

You can specify configuration for a role type, which is inherited by all role instances of that type.

Host

The Cloudera Manager Agent runs on hosts that are managed by Cloudera Manager. You can assign service roles to hosts.

Cloudera Manager

Everything related to the operation of Cloudera Manager is available under the /cm resource. This includes global commands, system configuration, and the Cloudera Management Service.

Cloudera Management Service

Only available in the Enterprise Edition, the Management Service provides monitoring, diagnostic and reporting features for your Hadoop clusters. The operation of this service is similar to other Hadoop services, except that the Management Service does not belong to a cluster.

Metrics

A metric is a property that can be measured to quantify the state of an entity or activity, such as the number of open file descriptors or CPU utilization percentage. The full list of metric schemas is available through the Cloudera Manager API /timeseries/schema endpoint.

Cloudera Manager enables retrieval of metric data using a language called tsquery. See the tsquery documentation for details on how to write a tsquery.
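
A tsquery must be URL-encoded before it is passed as the query parameter of the /timeseries endpoint. A minimal sketch using Python's standard library, with the query string patterned on the example later in this tutorial:

```python
from urllib.parse import quote

# A tsquery selecting DFS capacity metrics for the HDFS-1 service.
tsquery = "select dfs_capacity, dfs_capacity_used where entityName=HDFS-1"

# URL-encode the query so it can be used as the ?query= parameter.
encoded = quote(tsquery)
url = "http://localhost:7180/api/v43/timeseries?query=" + encoded
print(url)
```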

Debugging the API

You may enable debug logging in Cloudera Manager for API-related activities. The setting is called "Enable Debugging of API" on the Administration page of the Cloudera Manager Admin Console. When enabled, the Cloudera Manager server log contains full traces of all API requests and responses, along with debug logging of the request handling. Due to the large volume of log data this can generate, you should enable it only during development.

API Usage Examples

The following examples use curl without a cookie jar, for ease of copy-and-paste. Note, however, that this is an inefficient way to authenticate, because the server must re-validate the credentials on every request.
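
For reference, curl's -u flag simply attaches an HTTP Basic Authorization header to each request. A sketch of what that header looks like for the admin:admin credentials used throughout these examples:

```python
import base64

# curl -u admin:admin sends this header on every request; a cookie jar
# (or a client that reuses a session cookie) avoids re-authenticating.
credentials = base64.b64encode(b"admin:admin").decode("ascii")
auth_header = "Basic " + credentials
print(auth_header)  # → Basic YWRtaW46YWRtaW4=
```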

Explore Around

What clusters do we have?


  $ curl -u admin:admin 'http://localhost:7180/api/v43/clusters'

    {
      "items" : [ {
        "name" : "Cluster 1",
        "displayName" : "Cluster 1",
        "fullVersion" : "7.1.5",
        "maintenanceMode" : false,
        "maintenanceOwners" : [ ],
        "clusterUrl" : "http://localhost:7180/cmf/clusterRedirect/Cluster+1",
        "hostsUrl" : "http://localhost:7180/cmf/clusterRedirect/Cluster+1/hosts",
        "entityStatus" : "GOOD_HEALTH",
        "uuid" : "5e422ec0-4d9b-46a2-b3e9-b95616963079",
        "clusterType" : "BASE_CLUSTER",
        "tags" : [ ]
      } ]
    }
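
Responses like the one above are plain JSON, so they are easy to consume from a script. A minimal sketch that extracts the cluster names from a /clusters response body (abridged from the output above):

```python
import json

# Abridged /clusters response body, as returned above.
body = '''
{
  "items" : [ {
    "name" : "Cluster 1",
    "displayName" : "Cluster 1",
    "fullVersion" : "7.1.5",
    "entityStatus" : "GOOD_HEALTH"
  } ]
}
'''

clusters = json.loads(body)["items"]
names = [c["name"] for c in clusters]
print(names)  # → ['Cluster 1']
```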

This shows the services running in a cluster, with status and health information (in the Enterprise Edition). Abridged output:


  $ curl -u admin:admin 'http://localhost:7180/api/v43/clusters/Cluster%201/services'
{
  "items" : [ {
    "healthChecks" : [ {
      "name" : "ZOOKEEPER_CANARY_HEALTH",
      "summary" : "GOOD",
      "suppressed" : false
    }, {
      "name" : "ZOOKEEPER_SERVERS_HEALTHY",
      "summary" : "GOOD",
      "suppressed" : false
    } ],
    "maintenanceOwners" : [ ],
    "tags" : [ ],
    "name" : "ZOOKEEPER-1",
    "type" : "ZOOKEEPER",
    "clusterRef" : {
      "clusterName" : "Cluster 1",
      "displayName" : "Cluster 1"
    },
    "serviceUrl" : "http://localhost:7180/cmf/serviceRedirect/ZOOKEEPER-1",
    "serviceVersion" : "CDH 7.1.5",
    "roleInstancesUrl" : "http://localhost:7180/cmf/serviceRedirect/ZOOKEEPER-1/instances",
    "serviceState" : "STARTED",
    "healthSummary" : "GOOD",
    "configStalenessStatus" : "FRESH",
    "clientConfigStalenessStatus" : "FRESH",
    "maintenanceMode" : false,
    "displayName" : "ZOOKEEPER-1",
    "entityStatus" : "GOOD_HEALTH"
  }, {...

This shows the custom configuration of the HDFS-1 service and all its role types. Configuration parameters with default values are excluded; they are shown only in the "full" view.


  $ curl -u admin:admin http://localhost:7180/api/v43/clusters/Cluster%201/services/HDFS-1/config
{
  "items" : [ {
    "name" : "core_site_safety_valve",
    "value" : "",
    "sensitive" : false
  }, {
    "name" : "dfs_replication",
    "value" : "2",
    "sensitive" : false
  }, {
    "name" : "hdfs_blocks_with_corrupt_replicas_thresholds",
    "value" : "{\"warning\":\"2.5\",\"critical\":\"3.0\"}",
    "sensitive" : false
  }, {
    "name" : "zookeeper_service",
    "value" : "ZOOKEEPER-1",
    "sensitive" : false
  } ]
}
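
The config endpoint returns a flat list of name/value pairs, which folds naturally into a dictionary. A sketch using items from the output above:

```python
# Config items as returned by the .../services/HDFS-1/config call above.
items = [
    {"name": "core_site_safety_valve", "value": "", "sensitive": False},
    {"name": "dfs_replication", "value": "2", "sensitive": False},
    {"name": "zookeeper_service", "value": "ZOOKEEPER-1", "sensitive": False},
]

# Fold the list into a simple name -> value mapping.
config = {item["name"]: item["value"] for item in items}
print(config["dfs_replication"])  # → 2
```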

The full configuration view shows all parameters, along with their descriptions. Abridged output:


  $ curl -u admin:admin http://localhost:7180/api/v43/clusters/Cluster%201/services/HDFS-1/config?view=full
{
  "items" : [ {
    "name" : "HTTP_proxy_user_groups_list",
    "required" : false,
    "default" : "*",
    "displayName" : "HTTP Proxy User Groups",
    "description" : "Comma-delimited list of groups that you want to allow the HTTP user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'. This is used by WebHCat.",
    "relatedName" : "hadoop.proxyuser.HTTP.groups",
    "sensitive" : false,
    "validationState" : "OK",
    "validationWarningsSuppressed" : false
  }, {
    "name" : "HTTP_proxy_user_hosts_list",
    "required" : false,
    "default" : "*",
    "displayName" : "HTTP Proxy User Hosts",
    "description" : "Comma-delimited list of hosts where you want to allow the HTTP user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'. This is used by WebHCat.",
    "relatedName" : "hadoop.proxyuser.HTTP.hosts",
    "sensitive" : false,
    "validationState" : "OK",
    "validationWarningsSuppressed" : false
  }, {
    "name" : "audit_event_log_dir",
    "required" : false,
    "default" : "/var/log/hadoop-hdfs/audit",
    "displayName" : "Audit Log Directory",
    "description" : "Path to the directory where audit logs will be written. The directory will be created if it doesn't exist.",
    "relatedName" : "audit_event_log_dir",
    "sensitive" : false,
    "validationState" : "OK",
    "validationWarningsSuppressed" : false
  }, {
    "name" : "catch_events",
    "required" : false,
    "default" : "true",
    "displayName" : "Enable Log Event Capture",
    "description" : "When set, each role identifies important log events and forwards them to Cloudera Manager.",
    "relatedName" : "",
    "sensitive" : false,
    "validationState" : "OK"
  }, {
    "name" : "core_site_safety_valve",
    "value" : "",
    "required" : false,
    "displayName" : "Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml",
    "description" : "For advanced use only, a string to be inserted into core-site.xml. Applies to all roles and client configurations in this HDFS service as well as all its dependent services. Any configs added here will be overridden by their default values in HDFS (which can be found in hdfs-default.xml).",
    "relatedName" : "",
    "sensitive" : false,
    "validationState" : "OK",
    "validationWarningsSuppressed" : false
  }, {
    "name" : "dfs_block_local_path_access_user",
    "required" : false,
    "displayName" : "DataNode Local Path Access Users",
    "description" : "Comma separated list of users allowed to do short circuit read. A short circuit read allows a client co-located with the data to read HDFS file blocks directly from HDFS. If empty, will default to the DataNode process' user.",
    "relatedName" : "dfs.block.local-path-access.user",
    "sensitive" : false,
    "validationState" : "OK",
    "validationWarningsSuppressed" : false
  }, {...
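
In the full view, a parameter that has been customized carries an explicit "value" field, while one still at its default carries only "default". A sketch of telling the two apart, with entries patterned on (not copied verbatim from) the output above:

```python
# Two entries patterned on the full-view output above: one at its
# default, one explicitly customized (values here are illustrative).
params = [
    {"name": "audit_event_log_dir", "default": "/var/log/hadoop-hdfs/audit"},
    {"name": "dfs_replication", "value": "2", "default": "3"},
]

# A parameter is customized if it carries an explicit "value".
customized = [p["name"] for p in params if "value" in p]
print(customized)  # → ['dfs_replication']
```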

Add a New Service and Roles

This adds a new HBase service called "my_hbase". The API input is a list of services, for bulk operation. Even though the call creates only one service, it still passes in a list (with one item). The API returns the newly created service.


  $ curl -X POST -H "Content-Type:application/json" -u admin:admin \
  -d '{ "items": [ { "name": "my_hbase", "type": "HBASE" } ] }' \
  'http://localhost:7180/api/v43/clusters/Cluster%201/services'

  {
  "items" : [ {
    "maintenanceOwners" : [ ],
    "tags" : [ ],
    "name" : "my_hbase",
    "type" : "HBASE",
    "clusterRef" : {
      "clusterName" : "Cluster 1",
      "displayName" : "Cluster 1"
    },
    "serviceUrl" : "http://localhost:7180/cmf/serviceRedirect/my_hbase",
    "serviceVersion" : "CDH 7.1.5",
    "roleInstancesUrl" : "http://localhost:7180/cmf/serviceRedirect/my_hbase/instances",
    "serviceState" : "NA",
    "configStalenessStatus" : "FRESH",
    "clientConfigStalenessStatus" : "FRESH",
    "maintenanceMode" : false,
    "displayName" : "my_hbase",
    "entityStatus" : "UNKNOWN"
  } ]
}
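
The request body for these bulk calls can be generated rather than hand-written. A sketch building the one-item service list used above:

```python
import json

# Build the bulk "items" payload for POST .../services, as used above.
payload = {"items": [{"name": "my_hbase", "type": "HBASE"}]}
body = json.dumps(payload)
print(body)
```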

This creates a Master role and a RegionServer role. The API returns the newly created roles.


  $ curl -X POST -H "Content-Type:application/json" -u admin:admin \
  -d '{"items": [
        { "name": "master1", "type": "MASTER", "hostRef": { "hostId": "localhost" } },
        { "name": "rs1", "type": "REGIONSERVER", "hostRef": { "hostId": "localhost" } } ] }' \
  'http://localhost:7180/api/v43/clusters/Cluster%201/services/my_hbase/roles'

  {
  "items" : [ {
    "maintenanceOwners" : [ ],
    "name" : "master1",
    "type" : "MASTER",
    "serviceRef" : {
      "clusterName" : "Cluster 1",
      "serviceName" : "my_hbase",
      "serviceDisplayName" : "my_hbase",
      "serviceType" : "HBASE"
    },
    "hostRef" : {
      "hostId" : "0c65ecab-1eda-4a90-8de2-739d19d56e0a",
      "hostname" : "localhost"
    },
    "roleUrl" : "http://localhost:7180/cmf/roleRedirect/master1",
    "roleState" : "STOPPED",
    "configStalenessStatus" : "FRESH",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "roleConfigGroupRef" : {
      "roleConfigGroupName" : "my_hbase-MASTER-BASE"
    },
    "tags" : [ ]
  }, {
    "maintenanceOwners" : [ ],
    "name" : "rs1",
    "type" : "REGIONSERVER",
    "serviceRef" : {
      "clusterName" : "Cluster 1",
      "serviceName" : "my_hbase",
      "serviceDisplayName" : "my_hbase",
      "serviceType" : "HBASE"
    },
    "hostRef" : {
      "hostId" : "0c65ecab-1eda-4a90-8de2-739d19d56e0a",
      "hostname" : "localhost"
    },
    "roleUrl" : "http://localhost:7180/cmf/roleRedirect/rs1",
    "roleState" : "STOPPED",
    "configStalenessStatus" : "FRESH",
    "maintenanceMode" : false,
    "commissionState" : "COMMISSIONED",
    "roleConfigGroupRef" : {
      "roleConfigGroupName" : "my_hbase-REGIONSERVER-BASE"
    },
    "tags" : [ ]
  } ]
}
  

Set Configuration

This sets the service dependencies and the HDFS root directory for our newly created HBase service. The API returns the set of custom configuration values.


  $ curl -X PUT -H "Content-Type:application/json" -u admin:admin \
  -d '{ "items": [
        { "name": "hdfs_rootdir", "value": "/my_hbase" },
        { "name": "zookeeper_service", "value": "ZOOKEEPER-1" },
        { "name": "hdfs_service", "value": "HDFS-1" } ] }' \
  'http://localhost:7180/api/v43/clusters/Cluster%201/services/my_hbase/config'

  {
  "items" : [ {
    "name" : "hdfs_rootdir",
    "value" : "/my_hbase",
    "sensitive" : false
  }, {
    "name" : "hdfs_service",
    "value" : "HDFS-1",
    "sensitive" : false
  }, {
    "name" : "zookeeper_service",
    "value" : "ZOOKEEPER-1",
    "sensitive" : false
  } ]
}

Issue Commands

After setting the root directory, we need to create it in HDFS. There is an HBase service-level command for that. As with all API command calls, the issued command runs asynchronously. The API returns the command object, which may still be active.


  $ curl -X POST -u admin:admin \
  'http://localhost:7180/api/v43/clusters/Cluster%201/services/my_hbase/commands/hbaseCreateRoot'

  {
    "id" : 241,
    "name" : "CreateRootDir",
    "startTime" : "2021-01-13T20:02:22.391Z",
    "active" : true,
    "serviceRef" : {
      "clusterName" : "Cluster 1",
      "serviceName" : "my_hbase",
      "serviceDisplayName" : "my_hbase",
      "serviceType" : "HBASE"
    }
  }

We can check the command's status at the /commands endpoint to see whether it has finished.


  $ curl -u admin:admin 'http://localhost:7180/api/v43/commands/241'

  {
  "id" : 241,
  "name" : "CreateRootDir",
  "startTime" : "2021-01-13T20:02:22.391Z",
  "endTime" : "2021-01-13T20:02:28.745Z",
  "active" : false,
  "success" : true,
  "resultMessage" : "Successfully created HDFS directory.",
  "serviceRef" : {
    "clusterName" : "Cluster 1",
    "serviceName" : "my_hbase",
    "serviceDisplayName" : "my_hbase",
    "serviceType" : "HBASE"
  },
  "children" : {
    "items" : [ ]
  },
  "canRetry" : false
}
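
Because commands run asynchronously, scripts typically poll /commands/{id} until "active" becomes false. A sketch of that loop, with a stand-in fetch function in place of the real HTTP GET of the endpoint shown above:

```python
import time

# Stand-in for an HTTP GET of /api/v43/commands/<id>; a real client
# would fetch and JSON-decode the response. Canned states keep this
# sketch self-contained.
_states = iter([
    {"id": 241, "active": True},
    {"id": 241, "active": False, "success": True,
     "resultMessage": "Successfully created HDFS directory."},
])

def fetch_command(command_id):
    return next(_states)

def wait_for_command(command_id, poll_seconds=0):
    """Poll until the command is no longer active, then return it."""
    while True:
        cmd = fetch_command(command_id)
        if not cmd["active"]:
            return cmd
        time.sleep(poll_seconds)

result = wait_for_command(241)
print(result["success"], result["resultMessage"])
```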

We now start the new HBase service.


  $ curl -X POST -u admin:admin 'http://localhost:7180/api/v43/clusters/Cluster%201/services/my_hbase/commands/start'

  {
  "id" : 242,
  "name" : "Start",
  "startTime" : "2021-01-13T20:05:05.500Z",
  "active" : true,
  "serviceRef" : {
    "clusterName" : "Cluster 1",
    "serviceName" : "my_hbase",
    "serviceDisplayName" : "my_hbase",
    "serviceType" : "HBASE"
  }
}

Again, we poll to check the command's result.


  $ curl -u admin:admin 'http://localhost:7180/api/v43/commands/242'

  {
  "id" : 242,
  "name" : "Start",
  "startTime" : "2021-01-13T20:05:05.500Z",
  "endTime" : "2021-01-13T20:05:27.589Z",
  "active" : false,
  "success" : true,
  "resultMessage" : "Successfully started service.",
  "serviceRef" : {
    "clusterName" : "Cluster 1",
    "serviceName" : "my_hbase",
    "serviceDisplayName" : "my_hbase",
    "serviceType" : "HBASE"
  },
  "children" : {
    "items" : [ {
      "id" : 243,
      "name" : "Start",
      "startTime" : "2021-01-13T20:05:05.529Z",
      "endTime" : "2021-01-13T20:05:27.586Z",
      "active" : false,
      "success" : true,
      "resultMessage" : "Successfully started process.",
      "serviceRef" : {
        "clusterName" : "Cluster 1",
        "serviceName" : "my_hbase",
        "serviceDisplayName" : "my_hbase",
        "serviceType" : "HBASE"
      },
      "roleRef" : {
        "clusterName" : "Cluster 1",
        "serviceName" : "my_hbase",
        "roleName" : "master1"
      }
    }, {
      "id" : 244,
      "name" : "Start",
      "startTime" : "2021-01-13T20:05:05.625Z",
      "endTime" : "2021-01-13T20:05:27.577Z",
      "active" : false,
      "success" : true,
      "resultMessage" : "Successfully started process.",
      "serviceRef" : {
        "clusterName" : "Cluster 1",
        "serviceName" : "my_hbase",
        "serviceDisplayName" : "my_hbase",
        "serviceType" : "HBASE"
      },
      "roleRef" : {
        "clusterName" : "Cluster 1",
        "serviceName" : "my_hbase",
        "roleName" : "rs1"
      }
    } ]
  },
  "canRetry" : false
}

Querying Metric Data

This retrieves DFS capacity metric data for the HDFS-1 service.


  $ curl -u admin:admin \
  'http://localhost:7180/api/v43/timeseries?query=select%20dfs_capacity,%20dfs_capacity_used,%20dfs_capacity_used_non_hdfs%20where%20entityName=HDFS-1'

  {
  "items" : [ {
    "timeSeries" : [ {
      "metadata" : {
        "metricName" : "dfs_capacity",
        "entityName" : "HDFS-1",
        "startTime" : "2021-01-07T16:01:56.565Z",
        "endTime" : "2021-01-07T16:06:56.565Z",
        "attributes" : {
          "serviceType" : "HDFS",
          "clusterDisplayName" : "Cluster 1",
          "entityName" : "HDFS-1",
          "clusterName" : "Cluster 1",
          "active" : "true",
          "serviceDisplayName" : "HDFS-1",
          "serviceName" : "HDFS-1",
          "category" : "SERVICE",
          "version" : "CDH 7.1.5"
        },
        "unitNumerators" : [ "bytes" ],
        "unitDenominators" : [ ],
        "expression" : "SELECT dfs_capacity WHERE entityName = \"HDFS-1\" AND category = SERVICE",
        "metricCollectionFrequencyMs" : 60000,
        "rollupUsed" : "RAW"
      },
      "data" : [ {
        "timestamp" : "2021-01-07T16:02:14.086Z",
        "value" : 5.19384997888E11,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:03:19.087Z",
        "value" : 5.19384997888E11,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:04:19.088Z",
        "value" : 5.19384997888E11,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:05:19.092Z",
        "value" : 5.19384997888E11,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:06:24.092Z",
        "value" : 5.19384997888E11,
        "type" : "SAMPLE"
      } ]
    }, {
      "metadata" : {
        "metricName" : "dfs_capacity_used",
        "entityName" : "HDFS-1",
        "startTime" : "2021-01-07T16:01:56.565Z",
        "endTime" : "2021-01-07T16:06:56.565Z",
        "attributes" : {
          "serviceType" : "HDFS",
          "clusterDisplayName" : "Cluster 1",
          "entityName" : "HDFS-1",
          "clusterName" : "Cluster 1",
          "active" : "true",
          "serviceDisplayName" : "HDFS-1",
          "serviceName" : "HDFS-1",
          "category" : "SERVICE",
          "version" : "CDH 7.1.5"
        },
        "unitNumerators" : [ "bytes" ],
        "unitDenominators" : [ ],
        "expression" : "SELECT dfs_capacity_used WHERE entityName = \"HDFS-1\" AND category = SERVICE",
        "metricCollectionFrequencyMs" : 60000,
        "rollupUsed" : "RAW"
      },
      "data" : [ {
        "timestamp" : "2021-01-07T16:02:14.086Z",
        "value" : 1.419948338E9,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:03:19.087Z",
        "value" : 1.419948338E9,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:04:19.088Z",
        "value" : 1.419948338E9,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:05:19.092Z",
        "value" : 1.419972608E9,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:06:24.092Z",
        "value" : 1.419972608E9,
        "type" : "SAMPLE"
      } ]
    }, {
      "metadata" : {
        "metricName" : "dfs_capacity_used_non_hdfs",
        "entityName" : "HDFS-1",
        "startTime" : "2021-01-07T16:01:56.565Z",
        "endTime" : "2021-01-07T16:06:56.565Z",
        "attributes" : {
          "serviceType" : "HDFS",
          "clusterDisplayName" : "Cluster 1",
          "entityName" : "HDFS-1",
          "clusterName" : "Cluster 1",
          "active" : "true",
          "serviceDisplayName" : "HDFS-1",
          "serviceName" : "HDFS-1",
          "category" : "SERVICE",
          "version" : "CDH 7.1.5"
        },
        "unitNumerators" : [ "bytes" ],
        "unitDenominators" : [ ],
        "expression" : "SELECT dfs_capacity_used_non_hdfs WHERE entityName = \"HDFS-1\" AND category = SERVICE",
        "metricCollectionFrequencyMs" : 60000,
        "rollupUsed" : "RAW"
      },
      "data" : [ {
        "timestamp" : "2021-01-07T16:02:14.086Z",
        "value" : 2.910454139E10,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:03:19.087Z",
        "value" : 2.9094375118E10,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:04:19.088Z",
        "value" : 2.9094678222E10,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:05:19.092Z",
        "value" : 2.9094936576E10,
        "type" : "SAMPLE"
      }, {
        "timestamp" : "2021-01-07T16:06:24.092Z",
        "value" : 2.9095247872E10,
        "type" : "SAMPLE"
      } ]
    } ],
    "warnings" : [ ],
    "timeSeriesQuery" : "select dfs_capacity, dfs_capacity_used, dfs_capacity_used_non_hdfs where entityName=HDFS-1"
  } ]
}
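
The sample values in a response like this can be combined directly; for instance, dividing dfs_capacity_used by dfs_capacity gives the fraction of DFS capacity in use. A sketch using the first sample of each series above:

```python
# First samples of dfs_capacity and dfs_capacity_used from the
# response above, in bytes.
dfs_capacity = 5.19384997888e11
dfs_capacity_used = 1.419948338e9

used_percent = 100.0 * dfs_capacity_used / dfs_capacity
print(round(used_percent, 2))  # → 0.27
```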