Chapter 12. HBase Operational Management

Table of Contents

12.1. HBase Tools and Utilities
12.1.1. HBase hbck
12.1.2. HFile Tool
12.1.3. WAL Tools
12.1.4. Compression Tool
12.1.5. CopyTable
12.1.6. Export
12.1.7. Import
12.1.8. RowCounter
12.2. Node Management
12.2.1. Node Decommission
12.2.2. Rolling Restart
12.3. Metrics
12.3.1. Metric Setup
12.3.2. RegionServer Metrics
12.4. HBase Monitoring
12.5. Cluster Replication
12.6. HBase Backup
12.6.1. Full Shutdown Backup
12.6.2. Live Cluster Backup - Replication
12.6.3. Live Cluster Backup - CopyTable
12.6.4. Live Cluster Backup - Export
12.7. Capacity Planning
12.7.1. Storage
12.7.2. Regions
This chapter will cover operational tools and practices required of a running HBase cluster. The subject of operations is related to the topics of Chapter 11, Troubleshooting and Debugging HBase, Chapter 10, Performance Tuning, and Chapter 2, Configuration but is a distinct topic in itself.

12.1. HBase Tools and Utilities

Here we list HBase tools for administration, analysis, fixup, and debugging.

12.1.1. HBase hbck

An fsck for your HBase install

To run hbck against your HBase cluster run

$ ./bin/hbase hbck

At the end of the command's output it prints OK or INCONSISTENCY. If your cluster reports inconsistencies, pass -details to see more detail emitted. If inconsistencies are reported, run hbck a few times because the inconsistency may be transient (e.g. the cluster is starting up or a region is splitting). Passing -fix may correct the inconsistency (this is an experimental feature).
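Because inconsistencies may be transient, the re-check advice above can be scripted; a minimal sketch follows, where the retry count, sleep interval, and grep pattern are illustrative assumptions rather than part of the hbck tool itself:

```shell
#!/bin/sh
# Re-run hbck a few times before treating an inconsistency as real, since
# transient states (cluster startup, region splits) can trigger false alarms.
# Retry count and sleep interval below are illustrative assumptions.
for attempt in 1 2 3; do
  if ./bin/hbase hbck | grep -q "INCONSISTENCY"; then
    echo "attempt $attempt: inconsistencies reported; retrying..."
    sleep 60
  else
    echo "cluster reports OK"
    exit 0
  fi
done
# Still inconsistent after retries: emit details for investigation.
./bin/hbase hbck -details
exit 1
```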

12.1.2. HFile Tool

See Section 8.6.4.2.2, “HFile Tool”.

12.1.3. WAL Tools

12.1.3.1. HLog tool

The main method on HLog offers manual split and dump facilities. Pass it WALs or the products of a split, the content of the recovered.edits directory.

You can get a textual dump of a WAL file content by doing the following:

 $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.HLog --dump hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012 

The return code will be non-zero if there are issues with the file, so you can test the health of the file by redirecting STDOUT to /dev/null and testing the program's return code.
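The exit-code check just described might look like the following sketch; the WAL path is the example path from above and should be substituted with a file from your own /hbase/.logs tree:

```shell
#!/bin/sh
# Verify a WAL file by dumping it and inspecting only the exit status.
# WAL_FILE is a placeholder path; substitute one of your own log files.
WAL_FILE="hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012"
if ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.HLog --dump "$WAL_FILE" > /dev/null; then
  echo "WAL file looks healthy"
else
  echo "WAL file has problems (non-zero exit)" >&2
fi
```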

Similarly, you can force a split of a log file directory by doing:

 $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.HLog --split hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/

12.1.4. Compression Tool

See the “CompressionTest Tool” coverage in the compression appendix.

12.1.5. CopyTable

CopyTable is a utility that can copy part or all of a table, either to the same cluster or to another cluster. The usage is as follows:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable [--rs.class=CLASS] [--rs.impl=IMPL] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] tablename

Options:

  • rs.class hbase.regionserver.class of the peer cluster. Specify if different from current cluster.
  • rs.impl hbase.regionserver.impl of the peer cluster.
  • starttime Beginning of the time range. Without endtime means starttime to forever.
  • endtime End of the time range.
  • new.name New table's name.
  • peer.adr Address of the peer cluster given in the format hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
  • families Comma-separated list of ColumnFamilies to copy.

Args:

  • tablename Name of table to copy.

Example of copying 'TestTable' to a cluster that uses replication for a 1 hour window:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable
--rs.class=org.apache.hadoop.hbase.ipc.ReplicationRegionInterface
--rs.impl=org.apache.hadoop.hbase.regionserver.replication.ReplicationRegionServer
--starttime=1265875194289 --endtime=1265878794289
--peer.adr=server1,server2,server3:2181:/hbase TestTable

12.1.6. Export

Export is a utility that will dump the contents of a table to HDFS in a sequence file. Invoke it via:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
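As a concrete sketch of the optional arguments, the following exports only the latest version of each cell written during a one-hour window; the table name 'TestTable', the output directory, and the timestamps are placeholder values:

```shell
# Export the latest version (1) of each cell in 'TestTable' written during a
# one-hour window; table name, output path, and timestamps are placeholders.
$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export TestTable \
    /user/hbase/export/TestTable 1 1265875194289 1265878794289
```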

12.1.7. Import

Import is a utility that will load data that has been exported back into HBase. Invoke via:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
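For instance, loading a previous Export back in might look like this; the target table must already exist, and both the table name 'TestTableCopy' and the input directory are placeholder values:

```shell
# Load a prior Export back into an already-created table.
# 'TestTableCopy' and the input directory are placeholder names.
$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import TestTableCopy \
    /user/hbase/export/TestTable
```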

12.1.8. RowCounter

RowCounter is a utility that will count all the rows of a table. This is a good utility to use as a sanity check to ensure that HBase can read all the blocks of a table if there are any concerns about metadata inconsistency.

$ bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename> [<column1> <column2>...]
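For example, the invocations below count rows in a table named 'TestTable', first across all columns and then restricted to a single column; the table, family, and qualifier names are placeholders, and the total is reported in the MapReduce job's counters on completion:

```shell
# Count all rows in 'TestTable' (a placeholder name); the total appears in
# the job counters once the MapReduce job completes.
$ bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter TestTable

# Restrict the count to a single column, given as family:qualifier
# (both placeholders):
$ bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter TestTable cf1:col1
```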