org.apache.pig
Class EvalFunc<T>

java.lang.Object
  extended by org.apache.pig.EvalFunc<T>
Direct Known Subclasses:
ARITY, AVG, AVG.Final, AVG.Initial, AVG.Intermediate, BagSize, CONCAT, ConstantSize, COUNT, COUNT.Final, COUNT.Initial, COUNT.Intermediate, DIFF, Distinct, Distinct.Final, Distinct.Initial, Distinct.Intermediate, DoubleAvg, DoubleAvg.Final, DoubleAvg.Initial, DoubleAvg.Intermediate, DoubleMax, DoubleMax.Final, DoubleMax.Initial, DoubleMax.Intermediate, DoubleMin, DoubleMin.Final, DoubleMin.Initial, DoubleMin.Intermediate, DoubleSum, DoubleSum.Final, DoubleSum.Initial, DoubleSum.Intermediate, FilterFunc, FloatAvg, FloatAvg.Final, FloatAvg.Initial, FloatAvg.Intermediate, FloatMax, FloatMax.Final, FloatMax.Initial, FloatMax.Intermediate, FloatMin, FloatMin.Final, FloatMin.Initial, FloatMin.Intermediate, FloatSum, FloatSum.Final, FloatSum.Initial, FloatSum.Intermediate, GFAny, GFCross, GFReplicate, IntAvg, IntAvg.Final, IntAvg.Initial, IntAvg.Intermediate, IntMax, IntMax.Final, IntMax.Initial, IntMax.Intermediate, IntMin, IntMin.Final, IntMin.Initial, IntMin.Intermediate, IntSum, IntSum.Final, IntSum.Initial, IntSum.Intermediate, LongAvg, LongAvg.Final, LongAvg.Initial, LongAvg.Intermediate, LongMax, LongMax.Final, LongMax.Initial, LongMax.Intermediate, LongMin, LongMin.Final, LongMin.Initial, LongMin.Intermediate, LongSum, LongSum.Final, LongSum.Initial, LongSum.Intermediate, MapSize, MAX, MAX.Final, MAX.Initial, MAX.Intermediate, MIN, MIN.Final, MIN.Initial, MIN.Intermediate, SIZE, StringConcat, StringMax, StringMax.Final, StringMax.Initial, StringMax.Intermediate, StringMin, StringMin.Final, StringMin.Initial, StringMin.Intermediate, StringSize, SUM, SUM.Final, SUM.Initial, SUM.Intermediate, TOKENIZE, TupleSize

public abstract class EvalFunc<T>
extends Object

Base class for functions that are applied to a dataset. The function is applied to each Tuple in the set. The programmer should not make assumptions about state maintained between invocations of the exec() method, since the Pig runtime schedules and localizes invocations based on information provided at runtime. The programmer also should not make assumptions about when or how many times the class will be instantiated, since it may be instantiated multiple times in both the front end and the back end.


Field Summary
protected  org.apache.commons.logging.Log log
           
protected  PigLogger pigLogger
           
protected  PigProgressable reporter
           
protected  Type returnType
           
 
Constructor Summary
EvalFunc()
           
 
Method Summary
abstract  T exec(Tuple input)
          This callback method must be implemented by all subclasses.
 void finish()
          Placeholder for cleanup to be performed at the end.
 List<FuncSpec> getArgToFuncMapping()
           
 org.apache.commons.logging.Log getLogger()
           
 PigLogger getPigLogger()
           
 PigProgressable getReporter()
           
 Type getReturnType()
           
protected  String getSchemaName(String name, Schema input)
           
 boolean isAsynchronous()
          This function should be overridden to return true for functions that return their values asynchronously.
 Schema outputSchema(Schema input)
           
 void progress()
           
 void setPigLogger(PigLogger pigLogger)
           
 void setReporter(PigProgressable reporter)
           
 void warn(String msg, Enum warningEnum)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

reporter

protected PigProgressable reporter

log

protected org.apache.commons.logging.Log log

pigLogger

protected PigLogger pigLogger

returnType

protected Type returnType

Constructor Detail

EvalFunc

public EvalFunc()

Method Detail

getSchemaName

protected String getSchemaName(String name,
                               Schema input)

getReturnType

public Type getReturnType()

progress

public final void progress()

warn

public final void warn(String msg,
                       Enum warningEnum)

finish

public void finish()
Placeholder for cleanup to be performed at the end. User defined functions can override.
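
A minimal sketch of overriding finish(), assuming the default implementation is a no-op. So that it compiles without Pig on the classpath, Tuple and EvalFunc are replaced by small stand-in types that model only the members needed here; CountryName, ListTuple and holdsTable() are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FinishSketch {
    // Stand-in for org.apache.pig.data.Tuple (assumption: only get and size are modelled).
    interface Tuple {
        Object get(int fieldNum) throws IOException;
        int size();
    }

    // Stand-in for org.apache.pig.EvalFunc<T>; finish() is assumed to default to a no-op.
    static abstract class EvalFunc<T> {
        public void finish() { }
        public abstract T exec(Tuple input) throws IOException;
    }

    // Hypothetical UDF that maps a country code to a name via an in-memory table.
    static class CountryName extends EvalFunc<String> {
        private Map<String, String> table; // built on first use, dropped in finish()

        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            if (table == null) {
                table = new HashMap<>();
                table.put("DE", "Germany");
                table.put("FR", "France");
            }
            return table.get(input.get(0).toString());
        }

        @Override
        public void finish() {
            table = null; // release whatever exec() acquired
        }

        // Hypothetical helper so the sketch can show the effect of finish().
        boolean holdsTable() {
            return table != null;
        }
    }

    // Hypothetical list-backed Tuple, used only to drive the sketch outside Pig.
    static class ListTuple implements Tuple {
        private final List<Object> fields;
        ListTuple(Object... values) { this.fields = Arrays.asList(values); }
        public Object get(int fieldNum) { return fields.get(fieldNum); }
        public int size() { return fields.size(); }
    }

    public static void main(String[] args) throws IOException {
        CountryName f = new CountryName();
        System.out.println(f.exec(new ListTuple("DE")));
        f.finish(); // called by hand here; in Pig the runtime decides when
        System.out.println(f.holdsTable());
    }
}
```

Because exec() rebuilds the table whenever it is missing, the function still works if it is invoked again after finish().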


exec

public abstract T exec(Tuple input)
                throws IOException
This callback method must be implemented by all subclasses. This is the method that will be invoked on every Tuple of a given dataset. Since the dataset may be divided up in a variety of ways, the programmer should not make assumptions about state that is maintained between invocations of this method.

Parameters:
input - the Tuple to be processed.
Returns:
result, of type T.
Throws:
IOException
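
The contract above can be illustrated with a small subclass. This is only a sketch: so that it compiles without Pig on the classpath, Tuple and EvalFunc are replaced by stand-in types that model just the exec() signature and two Tuple accessors. UpperCase and ListTuple are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class ExecSketch {
    // Stand-in for org.apache.pig.data.Tuple (assumption: only get and size are modelled).
    interface Tuple {
        Object get(int fieldNum) throws IOException;
        int size();
    }

    // Stand-in for org.apache.pig.EvalFunc<T>, reduced to the exec() signature.
    static abstract class EvalFunc<T> {
        public abstract T exec(Tuple input) throws IOException;
    }

    // Hypothetical UDF: upper-cases the first field of each input Tuple.
    static class UpperCase extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            // The result depends on the current Tuple only; nothing is kept between calls.
            if (input == null || input.size() == 0) {
                return null;
            }
            Object field = input.get(0);
            return field == null ? null : field.toString().toUpperCase();
        }
    }

    // Hypothetical list-backed Tuple, used only to drive the sketch outside Pig.
    static class ListTuple implements Tuple {
        private final List<Object> fields;
        ListTuple(Object... values) { this.fields = Arrays.asList(values); }
        public Object get(int fieldNum) { return fields.get(fieldNum); }
        public int size() { return fields.size(); }
    }

    public static void main(String[] args) throws IOException {
        UpperCase f = new UpperCase();
        System.out.println(f.exec(new ListTuple("pig")));
        System.out.println(f.exec(new ListTuple()));
    }
}
```

Since exec() reads nothing but its argument, the result is the same regardless of how the dataset is split or how many instances of the class are created.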

outputSchema

public Schema outputSchema(Schema input)
Parameters:
input - Schema of the input
Returns:
Schema of the output

isAsynchronous

public boolean isAsynchronous()
This function should be overridden to return true for functions that return their values asynchronously. Currently Pig never attempts to execute a function asynchronously.

Returns:
true if the function can be executed asynchronously.

getReporter

public PigProgressable getReporter()

setReporter

public final void setReporter(PigProgressable reporter)

getArgToFuncMapping

public List<FuncSpec> getArgToFuncMapping()
                                   throws FrontendException
Returns:
A List of FuncSpec objects, each representing a function class that can handle inputs matching the schema held in that FuncSpec
Throws:
FrontendException

getPigLogger

public PigLogger getPigLogger()

setPigLogger

public final void setPigLogger(PigLogger pigLogger)

getLogger

public org.apache.commons.logging.Log getLogger()


Copyright © ${year} The Apache Software Foundation