org.apache.hadoop.hive.ql.exec
Class FileSinkOperator

java.lang.Object
  extended by org.apache.hadoop.hive.ql.exec.Operator<T>
      extended by org.apache.hadoop.hive.ql.exec.TerminalOperator<FileSinkDesc>
          extended by org.apache.hadoop.hive.ql.exec.FileSinkOperator
All Implemented Interfaces:
Serializable, Node

public class FileSinkOperator
extends TerminalOperator<FileSinkDesc>
implements Serializable

File Sink operator implementation.

See Also:
Serialized Form

Nested Class Summary
 class FileSinkOperator.FSPaths
           
static interface FileSinkOperator.RecordWriter
          RecordWriter.
static class FileSinkOperator.TableIdEnum
          TableIdEnum.
 
Nested classes/interfaces inherited from class org.apache.hadoop.hive.ql.exec.Operator
Operator.OperatorFunc, Operator.ProgressCounter, Operator.State
 
Field Summary
protected  boolean autoDelete
           
protected  org.apache.hadoop.io.BytesWritable commonKey
           
protected  List<String> dpColNames
           
protected  DynamicPartitionCtx dpCtx
           
protected  int dpStartCol
           
protected  List<String> dpVals
           
protected  List<Object> dpWritables
           
protected  org.apache.hadoop.fs.FileSystem fs
           
protected  HiveOutputFormat<?,?> hiveOutputFormat
           
protected  boolean isCompressed
           
protected  org.apache.hadoop.mapred.JobConf jc
           
protected  int maxPartitions
           
protected  int numDynParts
           
protected  org.apache.hadoop.fs.Path parent
           
protected  FileSinkOperator.RecordWriter[] rowOutWriters
           
protected  Serializer serializer
           
protected  org.apache.hadoop.fs.Path specPath
           
protected  FileSinkOperator.TableIdEnum tabIdEnum
           
protected  HashMap<String,FileSinkOperator.FSPaths> valToPaths
           
 
Fields inherited from class org.apache.hadoop.hive.ql.exec.Operator
alias, beginTime, childOperators, childOperatorsArray, childOperatorsTag, colExprMap, conf, counterNames, counterNameToEnum, counters, done, fatalErrorCntr, groupKeyObject, id, inputObjInspectors, inputRows, isLogInfoEnabled, LOG, numInputRowsCntr, numOutputRowsCntr, operatorId, out, outputObjInspector, outputRows, parentOperators, reporter, state, statsMap, timeTakenCntr, totalTime
 
Constructor Summary
FileSinkOperator()
           
 
Method Summary
 void augmentPlan()
          Called during semantic analysis as operators are being added in order to give them a chance to compute any additional plan information needed.
 void closeOp(boolean abort)
          Operator specific close routine.
protected  void fatalErrorMessage(StringBuilder errMsg, long counterCode)
          Gets the fatal error message based on the counter's code.
 String getName()
          Implements the getName function for the Node Interface.
 OperatorType getType()
          Return the type of the specific operator among the types in OperatorType.
protected  void initializeOp(org.apache.hadoop.conf.Configuration hconf)
          Operator specific initialization.
 void jobClose(org.apache.hadoop.conf.Configuration hconf, boolean success, JobCloseFeedBack feedBack)
          Unlike other operator interfaces, which are called from a map or reduce task, jobClose is called from the job client side once the job has completed.
 void mvFileToFinalPath(String specPath, org.apache.hadoop.conf.Configuration hconf, boolean success, org.apache.commons.logging.Log log, DynamicPartitionCtx dpCtx)
           
 void processOp(Object row, int tag)
          Process the row.
 
Methods inherited from class org.apache.hadoop.hive.ql.exec.Operator
allInitializedParentsAreClosed, areAllParentsInitialized, assignCounterNameToEnum, checkFatalErrors, cleanUpInputFileChanged, cleanUpInputFileChangedOp, close, dump, dump, endGroup, forward, getChildOperators, getChildren, getColumnExprMap, getConf, getCounterNames, getCounterNameToEnum, getCounters, getDone, getExecContext, getGroupKeyObject, getIdentifier, getInputObjInspectors, getOperatorId, getParentOperators, getSchema, getStats, incrCounter, initEvaluators, initEvaluators, initEvaluatorsAndReturnStruct, initialize, initializeChildren, initializeCounters, initializeLocalWork, initOperatorId, logStats, passExecContext, preorderMap, process, removeChild, removeChildAndAdoptItsChildren, replaceChild, replaceParent, reset, resetId, resetLastEnumUsed, resetStats, setAlias, setChildOperators, setColumnExprMap, setConf, setCounterNames, setCounterNameToEnum, setDone, setExecContext, setGroupKeyObject, setId, setInputObjInspectors, setOperatorId, setOutputCollector, setParentOperators, setReporter, setSchema, startGroup, updateCounters
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

valToPaths

protected transient HashMap<String,FileSinkOperator.FSPaths> valToPaths

numDynParts

protected transient int numDynParts

dpColNames

protected transient List<String> dpColNames

dpCtx

protected transient DynamicPartitionCtx dpCtx

isCompressed

protected transient boolean isCompressed

parent

protected transient org.apache.hadoop.fs.Path parent

hiveOutputFormat

protected transient HiveOutputFormat<?,?> hiveOutputFormat

specPath

protected transient org.apache.hadoop.fs.Path specPath

dpStartCol

protected transient int dpStartCol

dpVals

protected transient List<String> dpVals

dpWritables

protected transient List<Object> dpWritables

rowOutWriters

protected transient FileSinkOperator.RecordWriter[] rowOutWriters

maxPartitions

protected transient int maxPartitions

fs

protected transient org.apache.hadoop.fs.FileSystem fs

serializer

protected transient Serializer serializer

commonKey

protected transient org.apache.hadoop.io.BytesWritable commonKey

tabIdEnum

protected transient FileSinkOperator.TableIdEnum tabIdEnum

autoDelete

protected transient boolean autoDelete

jc

protected transient org.apache.hadoop.mapred.JobConf jc
Constructor Detail

FileSinkOperator

public FileSinkOperator()
Method Detail

initializeOp

protected void initializeOp(org.apache.hadoop.conf.Configuration hconf)
                     throws HiveException
Description copied from class: Operator
Operator specific initialization.

Overrides:
initializeOp in class Operator<FileSinkDesc>
Throws:
HiveException

processOp

public void processOp(Object row,
                      int tag)
               throws HiveException
Description copied from class: Operator
Process the row.

Specified by:
processOp in class Operator<FileSinkDesc>
Parameters:
row - The object representing the row.
tag - The tag of the row, which usually identifies the parent this row comes from. Rows with the same tag should always have exactly the same rowInspector.
Throws:
HiveException
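
To illustrate the tag contract described in the parameters above, here is a minimal sketch that remembers the row layout first seen for each tag and flags any later row whose layout differs. TagContractSketch, accept, and the layout strings are hypothetical stand-ins and are not part of Hive; an actual operator would compare inspectors rather than strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this class does not exist in Hive.
class TagContractSketch {
    // Layout first observed for each tag (a string stands in for a row inspector).
    private final Map<Integer, String> layoutByTag = new HashMap<>();

    // Returns true if the layout matches what was previously seen for this tag,
    // or if this is the first row seen for the tag.
    boolean accept(int tag, String layout) {
        String known = layoutByTag.putIfAbsent(tag, layout);
        return known == null || known.equals(layout);
    }

    public static void main(String[] args) {
        TagContractSketch sketch = new TagContractSketch();
        System.out.println(sketch.accept(0, "struct<a:int>"));    // prints true
        System.out.println(sketch.accept(1, "struct<b:string>")); // prints true
        System.out.println(sketch.accept(0, "struct<b:string>")); // prints false
    }
}
```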

fatalErrorMessage

protected void fatalErrorMessage(StringBuilder errMsg,
                                 long counterCode)
Description copied from class: Operator
Gets the fatal error message based on the counter's code.

Overrides:
fatalErrorMessage in class Operator<FileSinkDesc>
Parameters:
errMsg - output parameter to which the error message should be appended.
counterCode - input counter code.
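
The output-parameter convention above (append to errMsg rather than return a string) can be sketched as follows. This is only an illustration of the calling pattern: FatalErrorSketch, HYPOTHETICAL_CODE, and the message texts are assumptions, since this page does not list the actual counter codes or messages.

```java
// Hypothetical illustration only; not the FileSinkOperator implementation.
class FatalErrorSketch {
    // Assumed counter code; the actual codes are not documented on this page.
    static final long HYPOTHETICAL_CODE = 1L;

    // Appends a message for the given counter code to the caller's buffer.
    // Text already in errMsg is preserved.
    static void fatalErrorMessage(StringBuilder errMsg, long counterCode) {
        if (counterCode == HYPOTHETICAL_CODE) {
            errMsg.append("hypothetical fatal condition");
        } else {
            errMsg.append("unknown counter code ").append(counterCode);
        }
    }

    public static void main(String[] args) {
        StringBuilder errMsg = new StringBuilder("Job failed: ");
        fatalErrorMessage(errMsg, HYPOTHETICAL_CODE);
        System.out.println(errMsg); // prints "Job failed: hypothetical fatal condition"
    }
}
```

Appending to a shared StringBuilder lets a caller collect messages from several operators into one buffer.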

closeOp

public void closeOp(boolean abort)
             throws HiveException
Description copied from class: Operator
Operator specific close routine. Operators that inherit from this class should override this function with their specific cleanup routine.

Overrides:
closeOp in class Operator<FileSinkDesc>
Throws:
HiveException

getName

public String getName()
Description copied from class: Operator
Implements the getName function for the Node Interface.

Specified by:
getName in interface Node
Overrides:
getName in class Operator<FileSinkDesc>
Returns:
the name of the operator

jobClose

public void jobClose(org.apache.hadoop.conf.Configuration hconf,
                     boolean success,
                     JobCloseFeedBack feedBack)
              throws HiveException
Description copied from class: Operator
Unlike other operator interfaces, which are called from a map or reduce task, jobClose is called from the job client side once the job has completed.

Overrides:
jobClose in class Operator<FileSinkDesc>
Parameters:
hconf - Configuration with which the job was submitted
success - whether the job was completed successfully or not
Throws:
HiveException
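
The distinction drawn above between task-side calls and the client-side jobClose call can be sketched as a simple call sequence. LifecycleSketch and its simplified method signatures are hypothetical and only record the order of calls; they do not reproduce what FileSinkOperator does in each step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of call order only; not Hive code.
class LifecycleSketch {
    final List<String> events = new ArrayList<>();

    // These three are invoked inside a map or reduce task.
    void initializeOp() { events.add("task:initializeOp"); }
    void processOp(Object row, int tag) { events.add("task:processOp tag=" + tag); }
    void closeOp(boolean abort) { events.add("task:closeOp abort=" + abort); }

    // This one is invoked on the job client once the whole job has completed.
    void jobClose(boolean success) { events.add("client:jobClose success=" + success); }

    // Drives one operator through the full sequence and returns the recorded events.
    static List<String> run(List<String> rows) {
        LifecycleSketch op = new LifecycleSketch();
        op.initializeOp();
        for (String row : rows) {
            op.processOp(row, 0);
        }
        op.closeOp(false);
        op.jobClose(true);
        return op.events;
    }

    public static void main(String[] args) {
        for (String event : run(Arrays.asList("row1", "row2"))) {
            System.out.println(event);
        }
    }
}
```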

mvFileToFinalPath

public void mvFileToFinalPath(String specPath,
                              org.apache.hadoop.conf.Configuration hconf,
                              boolean success,
                              org.apache.commons.logging.Log log,
                              DynamicPartitionCtx dpCtx)
                       throws IOException,
                              HiveException
Throws:
IOException
HiveException

getType

public OperatorType getType()
Description copied from class: Operator
Return the type of the specific operator among the types in OperatorType.

Specified by:
getType in class Operator<FileSinkDesc>
Returns:
OperatorType.*

augmentPlan

public void augmentPlan()
Description copied from class: Operator
Called during semantic analysis as operators are being added in order to give them a chance to compute any additional plan information needed. Does nothing by default.

Overrides:
augmentPlan in class Operator<FileSinkDesc>


Copyright © 2011 The Apache Software Foundation