java.lang.Object
  org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
    org.apache.hadoop.hbase.mapreduce.TableInputFormatBase
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class TableInputFormatBase
extends org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
A base for TableInputFormats. Receives an HTable and a
 Scan instance that defines the input columns etc. Subclasses may use
 other TableRecordReader implementations.
 
An example of a subclass:
   class ExampleTIF extends TableInputFormatBase implements Configurable {
     private Configuration conf;

     public Configuration getConf() {
       return conf;
     }

     public void setConf(Configuration configuration) {
       this.conf = HBaseConfiguration.create(configuration);
       try {
         // mandatory
         setHTable(new HTable(conf, Bytes.toBytes("exampleTable")));
       } catch (IOException e) {
         throw new RuntimeException("Cannot open exampleTable", e);
       }
       // mandatory; columnA and columnB are taken to be column families here
       Scan scan = new Scan();
       scan.addFamily(Bytes.toBytes("columnA"));
       scan.addFamily(Bytes.toBytes("columnB"));
       // optional
       scan.setFilter(new RowFilter(CompareFilter.CompareOp.EQUAL,
         new RegexStringComparator("keyPrefix.*")));
       setScan(scan);
     }
   }
 
| Constructor Summary | |
|---|---|
| TableInputFormatBase() | |
| Method Summary | |
|---|---|
| org.apache.hadoop.mapreduce.RecordReader<ImmutableBytesWritable,Result> | createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context) Builds a TableRecordReader. |
| protected HTable | getHTable() Allows subclasses to get the HTable. |
| Scan | getScan() Gets the scan defining the actual details like columns etc. |
| List<org.apache.hadoop.mapreduce.InputSplit> | getSplits(org.apache.hadoop.mapreduce.JobContext context) Calculates the splits that will serve as input for the map tasks. |
| protected boolean | includeRegionInSplit(byte[] startKey, byte[] endKey) Tests whether the given region is to be included in the InputSplit while splitting the regions of a table. |
| protected void | setHTable(HTable table) Allows subclasses to set the HTable. |
| void | setScan(Scan scan) Sets the scan defining the actual details like columns etc. |
| protected void | setTableRecordReader(TableRecordReader tableRecordReader) Allows subclasses to set the TableRecordReader. |
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Constructor Detail | 
|---|
public TableInputFormatBase()
| Method Detail | 
|---|
public org.apache.hadoop.mapreduce.RecordReader<ImmutableBytesWritable,Result> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
                                                                                                  org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                                           throws IOException
Specified by:
  createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
Parameters:
  split - The split to work with.
  context - The current context.
Throws:
  IOException - When creating the reader fails.
See Also:
  InputFormat.createRecordReader(org.apache.hadoop.mapreduce.InputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext)
public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
                                                       throws IOException
Specified by:
  getSplits in class org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
Parameters:
  context - The current job context.
Throws:
  IOException - When creating the list of splits fails.
See Also:
  InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext)
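This page does not say how getSplits derives its splits. One plausible reading, assumed here and not confirmed by the text above, is one split per region, clipped to the row range of the Scan. The sketch below illustrates that idea in plain Java without any HBase dependency. `SplitSketch`, `clip`, `compareKeys`, and `key` are hypothetical names, and an empty key stands for an open-ended boundary.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of HBase. */
class SplitSketch {

    /** Unsigned lexicographic comparison of two row keys. */
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Returns one {start, stop} pair for every region [regionStarts[i], regionEnds[i])
     * that overlaps the scan range [scanStart, scanStop). Empty keys mean "unbounded".
     */
    static List<byte[][]> clip(byte[][] regionStarts, byte[][] regionEnds,
                               byte[] scanStart, byte[] scanStop) {
        List<byte[][]> splits = new ArrayList<>();
        for (int i = 0; i < regionStarts.length; i++) {
            byte[] rs = regionStarts[i];
            byte[] re = regionEnds[i];
            boolean scanStartsBeforeRegionEnd =
                scanStart.length == 0 || re.length == 0 || compareKeys(scanStart, re) < 0;
            boolean scanStopsAfterRegionStart =
                scanStop.length == 0 || compareKeys(scanStop, rs) > 0;
            if (!(scanStartsBeforeRegionEnd && scanStopsAfterRegionStart)) {
                continue; // region lies entirely outside the scan range
            }
            // Later of the two start keys.
            byte[] splitStart =
                (scanStart.length == 0 || compareKeys(rs, scanStart) >= 0) ? rs : scanStart;
            // Earlier of the two stop keys, treating empty as "no upper bound".
            byte[] splitStop;
            if (scanStop.length == 0) {
                splitStop = re;
            } else if (re.length == 0) {
                splitStop = scanStop;
            } else {
                splitStop = compareKeys(re, scanStop) <= 0 ? re : scanStop;
            }
            splits.add(new byte[][] {splitStart, splitStop});
        }
        return splits;
    }
}
```

Under this assumption, a scan with no start or stop row yields one split per region, while a narrow scan yields splits only for the regions it touches.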
protected boolean includeRegionInSplit(byte[] startKey,
                                       byte[] endKey)
 Excluding a region through this method is an optimization. It is useful when there is a specific reason to leave an
 entire region out of the MapReduce job (so that the region contributes no InputSplit), based on the region's start and end keys.
 One such case is a job that remembers the last-processed top record and repeatedly revisits the [last, current) interval.
 Because the keys are ordered, skipping regions this way reduces the number of InputSplits and also the load on the region servers.
 
 
 Note: endKey.length == 0 is possible for the last (most recent) region.
 
 Override this method to exclude whole regions from the MapReduce job. By default, no region is excluded (that is, all regions are included).
Parameters:
  startKey - Start key of the region
  endKey - End key of the region
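The [last, current) case described above can be sketched as a plain-Java predicate that an override of includeRegionInSplit might delegate to. This is a minimal sketch under stated assumptions, not HBase code: `RegionFilterSketch`, `overlaps`, `compareKeys`, and `key` are hypothetical names, both `last` and `current` are assumed to be non-empty row keys, and keys are assumed to sort as unsigned bytes.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not part of HBase. */
class RegionFilterSketch {

    /** Unsigned lexicographic comparison of two row keys. */
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Returns true when the region [startKey, endKey) overlaps the
     * interval [last, current). An empty endKey marks the last region,
     * which has no upper bound; an empty startKey marks the first region.
     */
    static boolean overlaps(byte[] startKey, byte[] endKey, byte[] last, byte[] current) {
        boolean endsAfterLast = endKey.length == 0 || compareKeys(endKey, last) > 0;
        boolean startsBeforeCurrent = compareKeys(startKey, current) < 0;
        return endsAfterLast && startsBeforeCurrent;
    }
}
```

A subclass could then return `RegionFilterSketch.overlaps(startKey, endKey, last, current)` from its includeRegionInSplit override, with `last` and `current` read from wherever the job keeps its progress marker.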
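protected HTable getHTable()
Allows subclasses to get the HTable.
protected void setHTable(HTable table)
Allows subclasses to set the HTable.
Parameters:
  table - The table to get the data from.
public Scan getScan()
Gets the scan defining the actual details like columns etc.
public void setScan(Scan scan)
Sets the scan defining the actual details like columns etc.
Parameters:
  scan - The scan to set.
protected void setTableRecordReader(TableRecordReader tableRecordReader)
Allows subclasses to set the TableRecordReader.
Parameters:
  tableRecordReader - A different TableRecordReader implementation.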