A common question for HBase administrators is estimating how much storage will be required for an HBase cluster. There are several apsects to consider, the most important of which is what data load into the cluster. Start with a solid understanding of how HBase handles data internally (KeyValue).
It is critical to understand that there is a KeyValue instance for every attribute stored in a row, and the rowkey-length, ColumnFamily name-length and attribute lengths will drive the size of the database more than any other factor.
KeyValue instances are aggregated into blocks, and the blocksize is configurable on a per-ColumnFamily basis. Blocks are aggregated into StoreFile's. See ???.
Another common question for HBase administrators is determining the right number of regions per RegionServer. This affects both storage and hardware planning. See ???.