java.lang.Object
  org.apache.hadoop.hbase.io.hfile.HFile
@InterfaceAudience.Private public class HFile
File format for hbase. A file of sorted key/value pairs. Both keys and values are byte arrays.
The memory footprint of an HFile includes the following (taken from the TFile documentation, but it also applies to HFile):
File is made of data blocks followed by meta data blocks (if any), a fileinfo block, data block index, meta data block index, and a fixed size trailer which records the offsets at which file changes content type.
<data blocks><meta blocks><fileinfo><data index><meta index><trailer>
Each block has a bit of magic at its start. Blocks are comprised of key/values. In data blocks, both keys and values are byte arrays. In metadata blocks, the key is a String and the value is a byte array. An empty file looks like this:
<fileinfo><trailer>
That is, neither data nor meta blocks are present.
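The section order described above can be sketched in plain Java. This is an illustrative model only, not HBase's writer: the class `HFileLayoutSketch`, the `Section` enum, and the `layout` method are hypothetical names, and the sketch assumes that an index section is written only when the corresponding blocks exist, which is one reading consistent with the empty-file layout `<fileinfo><trailer>`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the on-disk section order; not part of the HBase API.
class HFileLayoutSketch {

    // The six section kinds named in the class description, in file order.
    enum Section { DATA_BLOCKS, META_BLOCKS, FILEINFO, DATA_INDEX, META_INDEX, TRAILER }

    // Returns the sections in the order they would appear in the file.
    // Assumption: an index is present only if its blocks are present.
    static List<Section> layout(boolean hasData, boolean hasMeta) {
        List<Section> sections = new ArrayList<>();
        if (hasData) {
            sections.add(Section.DATA_BLOCKS);
        }
        if (hasMeta) {
            sections.add(Section.META_BLOCKS);
        }
        sections.add(Section.FILEINFO);   // always present
        if (hasData) {
            sections.add(Section.DATA_INDEX);
        }
        if (hasMeta) {
            sections.add(Section.META_INDEX);
        }
        sections.add(Section.TRAILER);    // fixed-size trailer, always last
        return sections;
    }

    public static void main(String[] args) {
        System.out.println("full file:  " + layout(true, true));
        System.out.println("empty file: " + layout(false, false));
    }
}
```

Calling `layout(false, false)` yields only the file info block and the trailer, matching the empty-file case described above.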
TODO: Do scanners need to be able to take a start and end row?
TODO: Should BlockIndex know the name of its file? Should it have a Path that points at its file, say for the case where an index lives apart from an HFile instance?
| Nested Class Summary | |
|---|---|
| static interface | HFile.CachingBlockReader - An abstraction used by the block index. |
| static interface | HFile.Reader - An interface used by clients to open and iterate an HFile. |
| static interface | HFile.Writer - API required to write an HFile. |
| static class | HFile.WriterFactory - This variety of ways to construct writers is used throughout the code, and we want to be able to swap writer implementations. |
| Field Summary | |
|---|---|
| static String | BLOOM_FILTER_DATA_KEY - Meta data block name for bloom filter bits. |
| static AtomicLong | dataBlockReadCnt |
| static int | DEFAULT_BYTES_PER_CHECKSUM - The number of bytes per checksum. |
| static ChecksumType | DEFAULT_CHECKSUM_TYPE |
| static String | DEFAULT_COMPRESSION - Default compression name: none. |
| static Compression.Algorithm | DEFAULT_COMPRESSION_ALGORITHM - Default compression: none. |
| static String | FORMAT_VERSION_KEY - The configuration key for the HFile version to use for new files. |
| static int | MAX_FORMAT_VERSION - Maximum supported HFile format version. |
| static int | MAXIMUM_KEY_LENGTH - Maximum length of key in HFile. |
| static int | MIN_FORMAT_VERSION - Minimum supported HFile format version. |
| static int | MIN_NUM_HFILE_PATH_LEVELS - We assume that an HFile path ends with ROOT_DIR/TABLE_NAME/REGION_NAME/CF_NAME/HFILE, so it has at least this many levels of nesting. |
| Constructor Summary | |
|---|---|
| HFile() | |
| Method Summary | |
|---|---|
| static void | checkFormatVersion(int version) - Checks the given HFile format version, and throws an exception if invalid. |
| static HFile.Reader | createReader(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, CacheConfig cacheConf) |
| static HFile.Reader | createReaderWithEncoding(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, CacheConfig cacheConf, DataBlockEncoding preferredEncodingInCache) |
| static HFile.Reader | createReaderWithEncoding(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, FSDataInputStreamWrapper fsdis, long size, CacheConfig cacheConf, DataBlockEncoding preferredEncodingInCache) |
| static long | getChecksumFailuresCount() - Number of checksum verification failures. |
| static int | getFormatVersion(org.apache.hadoop.conf.Configuration conf) |
| static Collection<Long> | getPreadLatenciesNanos() |
| static int | getPreadOps() |
| static long | getPreadTimeMs() |
| static Collection<Long> | getReadLatenciesNanos() |
| static int | getReadOps() |
| static long | getReadTimeMs() |
| static String[] | getSupportedCompressionAlgorithms() - Get names of supported compression algorithms. |
| static Collection<Long> | getWriteLatenciesNanos() |
| static int | getWriteOps() |
| static HFile.WriterFactory | getWriterFactory(org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf) - Returns the factory to be used to create HFile writers. |
| static HFile.WriterFactory | getWriterFactoryNoCache(org.apache.hadoop.conf.Configuration conf) - Returns the factory to be used to create HFile writers. |
| static long | getWriteTimeMs() |
| static boolean | isReservedFileInfoKey(byte[] key) - Return true if the given file info key is reserved for internal use. |
| static void | main(String[] args) |
| static void | offerReadLatency(long latencyNanos, boolean pread) |
| static void | offerWriteLatency(long latencyNanos) |
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Field Detail | 
|---|
public static final int MAXIMUM_KEY_LENGTH
public static final Compression.Algorithm DEFAULT_COMPRESSION_ALGORITHM
public static final int MIN_FORMAT_VERSION
public static final int MAX_FORMAT_VERSION
public static final String DEFAULT_COMPRESSION
public static final String BLOOM_FILTER_DATA_KEY
public static final int MIN_NUM_HFILE_PATH_LEVELS
public static final int DEFAULT_BYTES_PER_CHECKSUM
public static final ChecksumType DEFAULT_CHECKSUM_TYPE
public static volatile AtomicLong dataBlockReadCnt
public static final String FORMAT_VERSION_KEY
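The field summary for `MIN_NUM_HFILE_PATH_LEVELS` assumes that an HFile path ends with ROOT_DIR/TABLE_NAME/REGION_NAME/CF_NAME/HFILE. A minimal sketch of what such a nesting check could look like follows. The class `HFilePathLevelsSketch`, its methods, and the constant value 5 are assumptions for illustration: the value is inferred only by counting the five path components named in that description, and this page does not state the actual constant.

```java
// Hypothetical helper; not part of the HBase API.
class HFilePathLevelsSketch {

    // Assumed value: ROOT_DIR, TABLE_NAME, REGION_NAME, CF_NAME, HFILE.
    static final int ASSUMED_MIN_LEVELS = 5;

    // Counts the non-empty components of a slash-separated path.
    static int countLevels(String path) {
        int levels = 0;
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                levels++;
            }
        }
        return levels;
    }

    // True if the path is nested deeply enough to match the assumed layout.
    static boolean hasEnoughLevels(String path) {
        return countLevels(path) >= ASSUMED_MIN_LEVELS;
    }

    public static void main(String[] args) {
        // Example paths are made up for illustration.
        System.out.println(hasEnoughLevels("/hbase/mytable/region1/cf/hfile1"));
        System.out.println(hasEnoughLevels("/hbase/mytable/hfile1"));
    }
}
```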
| Constructor Detail | 
|---|
public HFile()
| Method Detail | 
|---|
public static final void offerReadLatency(long latencyNanos,
                                          boolean pread)
public static final void offerWriteLatency(long latencyNanos)
public static final Collection<Long> getReadLatenciesNanos()
public static final Collection<Long> getPreadLatenciesNanos()
public static final Collection<Long> getWriteLatenciesNanos()
public static final int getReadOps()
public static final long getReadTimeMs()
public static final int getPreadOps()
public static final long getPreadTimeMs()
public static final int getWriteOps()
public static final long getWriteTimeMs()
public static final long getChecksumFailuresCount()
public static int getFormatVersion(org.apache.hadoop.conf.Configuration conf)
public static final HFile.WriterFactory getWriterFactoryNoCache(org.apache.hadoop.conf.Configuration conf)
Returns the factory to be used to create HFile writers. Disables block cache access for all writers created through the returned factory.
public static final HFile.WriterFactory getWriterFactory(org.apache.hadoop.conf.Configuration conf,
                                                         CacheConfig cacheConf)
Returns the factory to be used to create HFile writers.
public static HFile.Reader createReaderWithEncoding(org.apache.hadoop.fs.FileSystem fs,
                                                    org.apache.hadoop.fs.Path path,
                                                    CacheConfig cacheConf,
                                                    DataBlockEncoding preferredEncodingInCache)
                                             throws IOException
fs - A file system
path - Path to HFile
cacheConf - Cache configuration for the HFile's contents
preferredEncodingInCache - Preferred in-cache data encoding algorithm
IOException - If the file is invalid; the exception thrown is a CorruptHFileException, a flavor of IOException
public static HFile.Reader createReaderWithEncoding(org.apache.hadoop.fs.FileSystem fs,
                                                    org.apache.hadoop.fs.Path path,
                                                    FSDataInputStreamWrapper fsdis,
                                                    long size,
                                                    CacheConfig cacheConf,
                                                    DataBlockEncoding preferredEncodingInCache)
                                             throws IOException
fs - A file system
path - Path to HFile
fsdis - A stream of the file at path
size - Max size of the trailer
cacheConf - Cache configuration for the HFile's contents
preferredEncodingInCache - Preferred in-cache data encoding algorithm
IOException - If the file is invalid; the exception thrown is a CorruptHFileException, a flavor of IOException
public static HFile.Reader createReader(org.apache.hadoop.fs.FileSystem fs,
                                        org.apache.hadoop.fs.Path path,
                                        CacheConfig cacheConf)
                                 throws IOException
fs - Filesystem
path - Path to file to read
cacheConf - This must not be null. See CacheConfig.CacheConfig(Configuration).
IOException - Will throw a CorruptHFileException (DoNotRetryIOException subtype) if the HFile is corrupt/invalid.
public static boolean isReservedFileInfoKey(byte[] key)
public static String[] getSupportedCompressionAlgorithms()
public static void main(String[] args)
                 throws IOException
IOException
public static void checkFormatVersion(int version)
                               throws IllegalArgumentException
Checks the given HFile format version, and throws an exception if invalid. Note that if the version number comes from an input file and has not been verified, the caller needs to re-throw an IOException to indicate that this is not a software error, but corrupted input.
version - an HFile version
IllegalArgumentException - if the version is invalid
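The re-throw pattern described for `checkFormatVersion` can be sketched as follows. This is a stand-alone illustration rather than HBase's implementation: the class `FormatVersionCheckSketch`, the method `versionFromUntrustedInput`, and the bounds 1 and 2 are hypothetical, since this page does not give the values of `MIN_FORMAT_VERSION` and `MAX_FORMAT_VERSION`.

```java
import java.io.IOException;

// Hypothetical stand-in for the version check; not the HBase implementation.
class FormatVersionCheckSketch {

    // Assumed bounds, chosen only for illustration.
    static final int ASSUMED_MIN_VERSION = 1;
    static final int ASSUMED_MAX_VERSION = 2;

    // Mirrors the documented contract: throws IllegalArgumentException
    // when the version is outside the supported range.
    static void checkFormatVersion(int version) {
        if (version < ASSUMED_MIN_VERSION || version > ASSUMED_MAX_VERSION) {
            throw new IllegalArgumentException("Invalid HFile version: " + version
                + " (expected " + ASSUMED_MIN_VERSION + " to " + ASSUMED_MAX_VERSION + ")");
        }
    }

    // A caller that read the version from an unverified file converts the
    // failure into an IOException, signalling corrupt input rather than a bug.
    static int versionFromUntrustedInput(int rawVersion) throws IOException {
        try {
            checkFormatVersion(rawVersion);
            return rawVersion;
        } catch (IllegalArgumentException e) {
            throw new IOException("Corrupt input: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        try {
            versionFromUntrustedInput(99);
        } catch (IOException e) {
            System.out.println("Rejected as corrupt input: " + e.getMessage());
        }
    }
}
```

Keeping the original exception as the cause preserves the detailed message for debugging while letting callers treat the failure as an I/O problem.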