ApiHiveCloudReplicationArguments Data Model

Replication arguments for Hive services.

Properties
name data type constraints description
sourceAccount string  
destinationAccount string  
cloudRootPath string  
replicationOption ReplicationOption  
sentryMigration boolean required
Properties inherited from ApiHiveReplicationArguments
sourceService ApiServiceRef   The service to replicate from.
tableFilters array of ApiHiveTable   Filters for tables to include in the replication. Optional. If not provided, all tables in all databases are included.
exportDir string   Directory where the export file will be saved, located in the HDFS service that stores the target Hive service's data. Optional. If not provided, Cloudera Manager will pick a directory for storing the data.
force boolean   Whether to force overwriting of mismatched tables. Defaults to false.
replicateData boolean   Whether to replicate table data stored in HDFS. Defaults to false.

If set to true, the "hdfsArguments" property must also be set to configure the HDFS replication job.

hdfsArguments ApiHdfsReplicationArguments   Arguments for the HDFS replication job.

This must be provided when choosing to replicate table data stored in HDFS. The "sourceService", "sourcePath" and "dryRun" properties of the HDFS arguments are ignored; their values are derived from the Hive replication's information.

The "destinationPath" property is used slightly differently than in regular HDFS replication jobs: it maps the root path of the source service into the target service. It may be omitted, in which case the source and target paths will match.

Example: if the destination path is set to "/new_root", a "/foo/bar" path in the source will be stored in "/new_root/foo/bar" in the target.

replicateImpalaMetadata boolean   Whether to replicate the Impala metadata (i.e., the metadata for Impala UDFs and their corresponding binaries in HDFS).
runInvalidateMetadata boolean   Whether to run an INVALIDATE METADATA query.
dryRun boolean   Whether to perform a dry run. Defaults to false.
numThreads number   Number of threads to use in the multi-threaded export/import phase.
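
The "destinationPath" mapping described under "hdfsArguments" can be illustrated with a small Python sketch. The helper name map_to_target is hypothetical and not part of the Cloudera Manager API; it only mirrors the documented rule that the source root is re-rooted under the destination path, and that an omitted destination path leaves the path unchanged.

```python
import posixpath
from typing import Optional


def map_to_target(source_path: str, destination_path: Optional[str] = None) -> str:
    """Illustrative helper (not a Cloudera Manager function).

    Re-roots an absolute source path under destination_path, following the
    rule documented for the "destinationPath" property.
    """
    if not source_path.startswith("/"):
        raise ValueError("source_path must be an absolute path")
    if not destination_path:
        # destinationPath omitted: source and target paths match.
        return source_path
    # Drop the leading slash so join() appends instead of replacing the root.
    return posixpath.join(destination_path, source_path.lstrip("/"))


print(map_to_target("/foo/bar", "/new_root"))  # /new_root/foo/bar
print(map_to_target("/foo/bar"))               # /foo/bar
```

The first call reproduces the example given for "destinationPath"; the second shows the case where the property is omitted.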

Example

{
  "sourceAccount" : "...",
  "destinationAccount" : "...",
  "cloudRootPath" : "...",
  "replicationOption" : "METADATA_ONLY",
  "sentryMigration" : true,
  "sourceService" : {
    "peerName" : "...",
    "clusterName" : "...",
    "serviceName" : "...",
    "serviceDisplayName" : "...",
    "serviceType" : "..."
  },
  "tableFilters" : [ {
    "database" : "...",
    "tableName" : "..."
  }, {
    "database" : "...",
    "tableName" : "..."
  } ],
  "exportDir" : "...",
  "force" : true,
  "replicateData" : true,
  "hdfsArguments" : {
    "sourceService" : {
      "peerName" : "...",
      "clusterName" : "...",
      "serviceName" : "...",
      "serviceDisplayName" : "...",
      "serviceType" : "..."
    },
    "sourcePath" : "...",
    "destinationPath" : "...",
    "mapreduceServiceName" : "...",
    "schedulerPoolName" : "...",
    "userName" : "...",
    "sourceUser" : "...",
    "numMaps" : 12345,
    "dryRun" : true,
    "bandwidthPerMap" : 12345,
    "abortOnError" : true,
    "removeMissingFiles" : true,
    "preserveReplicationCount" : true,
    "preserveBlockSize" : true,
    "preservePermissions" : true,
    "logPath" : "...",
    "skipChecksumChecks" : true,
    "skipListingChecksumChecks" : true,
    "skipTrash" : true,
    "replicationStrategy" : "STATIC",
    "preserveXAttrs" : true,
    "exclusionFilters" : [ "...", "..." ],
    "raiseSnapshotDiffFailures" : true,
    "destinationCloudAccount" : "..."
  },
  "replicateImpalaMetadata" : true,
  "runInvalidateMetadata" : true,
  "dryRun" : true,
  "numThreads" : 12345
}
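
Before submitting a payload like the one above, a client may want to check the two constraints this page states: "sentryMigration" is marked as required, and "hdfsArguments" must be provided when "replicateData" is true. The following Python sketch is one possible client-side check. The function name check_arguments and the message strings are assumptions made for illustration, and the server may enforce further rules that are not listed here.

```python
from typing import Any, Dict, List


def check_arguments(args: Dict[str, Any]) -> List[str]:
    """Illustrative client-side check (not part of the Cloudera Manager API).

    Returns a list of problems found in an ApiHiveCloudReplicationArguments
    payload, based only on the constraints stated in the property table.
    """
    problems: List[str] = []

    # "sentryMigration" is the only property marked as required.
    if "sentryMigration" not in args:
        problems.append("sentryMigration is required")

    # "replicateData" defaults to false; when true, "hdfsArguments" is needed.
    if args.get("replicateData", False) and args.get("hdfsArguments") is None:
        problems.append("hdfsArguments must be set when replicateData is true")

    return problems


payload = {"sentryMigration": True, "replicateData": True}
print(check_arguments(payload))
# ['hdfsArguments must be set when replicateData is true']
```

Note that the "sourceService", "sourcePath" and "dryRun" properties inside "hdfsArguments" are documented as ignored, so the sketch does not validate them.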