org.apache.hadoop.hive.ql.stats.jdbc
Class JDBCStatsAggregator

java.lang.Object
  extended by org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsAggregator
All Implemented Interfaces:
StatsAggregator

public class JDBCStatsAggregator
extends Object
implements StatsAggregator


Constructor Summary
JDBCStatsAggregator()
           
 
Method Summary
 String aggregateStats(String fileID, String statType)
          This method aggregates a given statistic from all tasks (partial stats).
 boolean cleanUp(String rowID)
          This method clears the temporary statistics that have been published without being aggregated.
 boolean closeConnection()
          This method closes the connection to the temporary storage.
 boolean connect(org.apache.hadoop.conf.Configuration hiveconf)
          This method connects to the temporary storage.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

JDBCStatsAggregator

public JDBCStatsAggregator()
Method Detail

connect

public boolean connect(org.apache.hadoop.conf.Configuration hiveconf)
Description copied from interface: StatsAggregator
This method connects to the temporary storage.

Specified by:
connect in interface StatsAggregator
Parameters:
hiveconf - HiveConf that contains the connection parameters.
Returns:
true if connection is successful, false otherwise.

aggregateStats

public String aggregateStats(String fileID,
                             String statType)
Description copied from interface: StatsAggregator
This method aggregates a given statistic from all tasks (partial stats). After aggregation, this method also automatically removes all records that have been aggregated.

Specified by:
aggregateStats in interface StatsAggregator
Parameters:
fileID - a prefix of the keys used in StatsPublisher to publish stats. All rows whose keys start with this prefix are aggregated. For example, suppose the StatsPublisher publishes stats under the following compound key: the output directory name (unique per FileSinkOperator) + the partition specs (only for dynamic partitions) + taskID (last component of the task file). The keyPrefix for aggregation could then be the first two components, which aggregates stats across all tasks for each partition.
statType - a string naming the statistic to be aggregated, e.g. "numRows".
Returns:
a string representation of a long value, or null if an error or exception occurs.
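
The prefix matching described for fileID can be illustrated with a small in-memory sketch. This is not the JDBC-backed implementation: the class PrefixAggregationSketch, its publish and remaining helpers, and the nested-map storage are all hypothetical, and returning "0" when no row matches is an assumption of the sketch rather than documented behaviour.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for the temporary storage; not the Hive implementation. */
class PrefixAggregationSketch {
    // statType -> (publisher key -> partial value). The layout is illustrative only.
    private final Map<String, Map<String, Long>> store = new TreeMap<>();

    /** Records one partial statistic under a compound key (hypothetical helper). */
    void publish(String key, String statType, long value) {
        store.computeIfAbsent(statType, t -> new TreeMap<>()).put(key, value);
    }

    /** Sums every partial value whose key starts with keyPrefix, then removes those rows. */
    String aggregateStats(String keyPrefix, String statType) {
        if (keyPrefix == null || statType == null) {
            return null; // follows the documented "null on error" contract
        }
        long sum = 0L;
        Map<String, Long> rows = store.get(statType);
        if (rows != null) {
            Iterator<Map.Entry<String, Long>> it = rows.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, Long> row = it.next();
                if (row.getKey().startsWith(keyPrefix)) {
                    sum += row.getValue();
                    it.remove(); // aggregated records are removed afterwards
                }
            }
        }
        // Assumption of this sketch: no matching rows yields "0" rather than null.
        return Long.toString(sum);
    }

    /** Number of rows still stored for statType (hypothetical helper for inspection). */
    int remaining(String statType) {
        Map<String, Long> rows = store.get(statType);
        return rows == null ? 0 : rows.size();
    }
}
```

With partial values 10 and 20 published under two task keys that share the prefix "/out/dir1/ds=2011-01-01/", calling aggregateStats with that prefix and "numRows" returns "30" and leaves rows under other prefixes untouched.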

closeConnection

public boolean closeConnection()
Description copied from interface: StatsAggregator
This method closes the connection to the temporary storage.

Specified by:
closeConnection in interface StatsAggregator
Returns:
true if the connection is closed successfully, false otherwise.

cleanUp

public boolean cleanUp(String rowID)
Description copied from interface: StatsAggregator
This method clears the temporary statistics that have been published without being aggregated. Typically this happens when a job fails, or is forcibly stopped after publishing some statistics.

Specified by:
cleanUp in interface StatsAggregator
Parameters:
rowID - a prefix of the keys used in StatsPublisher to publish stats. It is the same as the first parameter in aggregateStats().
Returns:
true if cleanup is successful, false otherwise.
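
One way a caller might sequence the four methods is sketched below. Everything here is hypothetical: AggregatorLike is a local mirror of the documented signatures (with a Map standing in for org.apache.hadoop.conf.Configuration so the sketch compiles without Hadoop), RecordingAggregator is a test double, and calling cleanUp when aggregation returns null is a design choice of this sketch, not documented Hive behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class AggregatorLifecycleSketch {
    /** Hypothetical local mirror of the four documented methods. */
    interface AggregatorLike {
        boolean connect(Map<String, String> conf);
        String aggregateStats(String keyPrefix, String statType);
        boolean closeConnection();
        boolean cleanUp(String keyPrefix);
    }

    /** Returns the aggregated value, or -1 if connecting or aggregating failed. */
    static long aggregateOrCleanUp(AggregatorLike agg, Map<String, String> conf,
                                   String keyPrefix, String statType) {
        if (!agg.connect(conf)) {
            return -1L;
        }
        try {
            String value = agg.aggregateStats(keyPrefix, statType);
            if (value == null) {
                // Assumption: on failure, clear whatever was published under this prefix.
                agg.cleanUp(keyPrefix);
                return -1L;
            }
            return Long.parseLong(value);
        } finally {
            agg.closeConnection(); // always release the connection
        }
    }

    /** Test double that records the order of calls and returns a fixed result. */
    static final class RecordingAggregator implements AggregatorLike {
        final List<String> calls = new ArrayList<>();
        private final String result;

        RecordingAggregator(String result) {
            this.result = result;
        }

        public boolean connect(Map<String, String> conf) {
            calls.add("connect");
            return true;
        }

        public String aggregateStats(String keyPrefix, String statType) {
            calls.add("aggregateStats");
            return result;
        }

        public boolean closeConnection() {
            calls.add("closeConnection");
            return true;
        }

        public boolean cleanUp(String keyPrefix) {
            calls.add("cleanUp");
            return true;
        }
    }
}
```

Closing the connection in a finally block keeps the temporary storage connection from leaking whether aggregation succeeds or fails.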


Copyright © 2011 The Apache Software Foundation