com.cloudera.sqoop.mapreduce
Class ExportJobBase

java.lang.Object
  extended by com.cloudera.sqoop.mapreduce.JobBase
      extended by com.cloudera.sqoop.mapreduce.ExportJobBase
Direct Known Subclasses:
JdbcExportJob, MySQLExportJob

public class ExportJobBase
extends JobBase

Base class for running an export MapReduce job.


Field Summary
protected  ExportJobContext context
           
static java.lang.String EXPORT_MAP_TASKS_KEY
          Number of map tasks to use for an export.
static org.apache.commons.logging.Log LOG
           
static java.lang.String SQOOP_EXPORT_TABLE_CLASS_KEY
          What SqoopRecord class to use to read a record for export.
 
Fields inherited from class com.cloudera.sqoop.mapreduce.JobBase
inputFormatClass, mapperClass, options, outputFormatClass
 
Constructor Summary
ExportJobBase(ExportJobContext ctxt)
           
ExportJobBase(ExportJobContext ctxt, java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> mapperClass, java.lang.Class<? extends org.apache.hadoop.mapreduce.InputFormat> inputFormatClass, java.lang.Class<? extends org.apache.hadoop.mapreduce.OutputFormat> outputFormatClass)
           
 
Method Summary
protected  void configureInputFormat(org.apache.hadoop.mapreduce.Job job, java.lang.String tableName, java.lang.String tableClassName, java.lang.String splitByCol)
          Configure the InputFormat to use for the job.
protected  void configureMapper(org.apache.hadoop.mapreduce.Job job, java.lang.String tableName, java.lang.String tableClassName)
          Set the mapper class implementation to use in the job, as well as any related configuration (e.g., map output types).
protected  int configureNumTasks(org.apache.hadoop.mapreduce.Job job)
          Configure the number of map/reduce tasks to use in the job.
protected  java.lang.Class<? extends org.apache.hadoop.mapreduce.InputFormat> getInputFormatClass()
           
protected  org.apache.hadoop.fs.Path getInputPath()
           
protected  java.lang.Class<? extends org.apache.hadoop.mapreduce.OutputFormat> getOutputFormatClass()
           
protected  boolean inputIsSequenceFiles()
           
static boolean isSequenceFiles(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path p)
           
 void runExport()
          Run an export job to dump a table from HDFS to a database.
protected  boolean runJob(org.apache.hadoop.mapreduce.Job job)
          Actually run the MapReduce job.
 
Methods inherited from class com.cloudera.sqoop.mapreduce.JobBase
configureOutputFormat, getMapperClass, loadJars, setInputFormatClass, setMapperClass, setOptions, setOutputFormatClass, unloadJars
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

SQOOP_EXPORT_TABLE_CLASS_KEY

public static final java.lang.String SQOOP_EXPORT_TABLE_CLASS_KEY
What SqoopRecord class to use to read a record for export.

See Also:
Constant Field Values

EXPORT_MAP_TASKS_KEY

public static final java.lang.String EXPORT_MAP_TASKS_KEY
Number of map tasks to use for an export.

See Also:
Constant Field Values

context

protected ExportJobContext context

Constructor Detail

ExportJobBase

public ExportJobBase(ExportJobContext ctxt)

ExportJobBase

public ExportJobBase(ExportJobContext ctxt,
                     java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> mapperClass,
                     java.lang.Class<? extends org.apache.hadoop.mapreduce.InputFormat> inputFormatClass,
                     java.lang.Class<? extends org.apache.hadoop.mapreduce.OutputFormat> outputFormatClass)

Method Detail

isSequenceFiles

public static boolean isSequenceFiles(org.apache.hadoop.conf.Configuration conf,
                                      org.apache.hadoop.fs.Path p)
                               throws java.io.IOException
Returns:
true if p is a SequenceFile, or a directory containing SequenceFiles.
Throws:
java.io.IOException
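
The check described above can be illustrated with a minimal sketch in plain Java. It relies on one known fact: Hadoop SequenceFiles begin with the three ASCII bytes S, E, Q. Everything else is an assumption made for illustration: the class name SeqFileCheck, the method looksLikeSequenceFiles, the use of java.nio.file instead of the Hadoop FileSystem API, and the rule of skipping entries whose names start with "_" or ".". This page does not say how isSequenceFiles itself makes the decision.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class SeqFileCheck {
    // Hadoop SequenceFiles start with the ASCII bytes 'S', 'E', 'Q'.
    private static final byte[] MAGIC = {'S', 'E', 'Q'};

    /**
     * Returns true if p is a file starting with the SEQ header, or a
     * directory whose first visible data file starts with it.
     * Hypothetical helper; not the Sqoop implementation.
     */
    public static boolean looksLikeSequenceFiles(Path p) throws IOException {
        if (!Files.isDirectory(p)) {
            return hasMagic(p);
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(p)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                // Assumption: skip bookkeeping entries such as _SUCCESS
                // or .crc files, and do not descend into subdirectories.
                if (name.startsWith("_") || name.startsWith(".")
                        || Files.isDirectory(entry)) {
                    continue;
                }
                // Judge the whole directory by the first data file found.
                // Iteration order is unspecified, so "first" is arbitrary.
                return hasMagic(entry);
            }
        }
        return false; // no data files to classify
    }

    private static boolean hasMagic(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] header = new byte[MAGIC.length];
            int read = in.readNBytes(header, 0, header.length);
            return read == MAGIC.length && Arrays.equals(header, MAGIC);
        }
    }
}
```

Inspecting a single file keeps the check cheap, at the cost of misclassifying a directory that mixes formats.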

getInputPath

protected org.apache.hadoop.fs.Path getInputPath()
                                          throws java.io.IOException
Returns:
the Path to the files to be exported to the database.
Throws:
java.io.IOException

configureInputFormat

protected void configureInputFormat(org.apache.hadoop.mapreduce.Job job,
                                    java.lang.String tableName,
                                    java.lang.String tableClassName,
                                    java.lang.String splitByCol)
                             throws java.lang.ClassNotFoundException,
                                    java.io.IOException
Description copied from class: JobBase
Configure the InputFormat to use for the job.

Overrides:
configureInputFormat in class JobBase
Throws:
java.lang.ClassNotFoundException
java.io.IOException

getInputFormatClass

protected java.lang.Class<? extends org.apache.hadoop.mapreduce.InputFormat> getInputFormatClass()
                                                                                          throws java.lang.ClassNotFoundException
Overrides:
getInputFormatClass in class JobBase
Returns:
the InputFormat class to use for the job.
Throws:
java.lang.ClassNotFoundException

getOutputFormatClass

protected java.lang.Class<? extends org.apache.hadoop.mapreduce.OutputFormat> getOutputFormatClass()
                                                                                            throws java.lang.ClassNotFoundException
Overrides:
getOutputFormatClass in class JobBase
Returns:
the OutputFormat class to use for the job.
Throws:
java.lang.ClassNotFoundException

configureMapper

protected void configureMapper(org.apache.hadoop.mapreduce.Job job,
                               java.lang.String tableName,
                               java.lang.String tableClassName)
                        throws java.lang.ClassNotFoundException,
                               java.io.IOException
Description copied from class: JobBase
Set the mapper class implementation to use in the job, as well as any related configuration (e.g., map output types).

Overrides:
configureMapper in class JobBase
Throws:
java.lang.ClassNotFoundException
java.io.IOException

configureNumTasks

protected int configureNumTasks(org.apache.hadoop.mapreduce.Job job)
                         throws java.io.IOException
Description copied from class: JobBase
Configure the number of map/reduce tasks to use in the job.

Overrides:
configureNumTasks in class JobBase
Throws:
java.io.IOException

runJob

protected boolean runJob(org.apache.hadoop.mapreduce.Job job)
                  throws java.lang.ClassNotFoundException,
                         java.io.IOException,
                         java.lang.InterruptedException
Description copied from class: JobBase
Actually run the MapReduce job.

Overrides:
runJob in class JobBase
Throws:
java.lang.ClassNotFoundException
java.io.IOException
java.lang.InterruptedException

runExport

public void runExport()
               throws ExportException,
                      java.io.IOException
Run an export job to dump a table from HDFS to a database.

Throws:
java.io.IOException - if the export job encounters an I/O error.
ExportException - if the job fails unexpectedly or is misconfigured.
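
The protected configure* and runJob methods on this page suggest a template-method design, in which runExport drives a fixed sequence of overridable steps. The sketch below shows that pattern in plain Java with stand-in types. The names ExportFlowSketch, ExportFlow, and ExportFailedException are hypothetical, as are the order of the steps and the failure messages. This page does not state the order in which runExport invokes its hooks.

```java
import java.io.IOException;

public class ExportFlowSketch {
    /** Stand-in for ExportException; the real class is part of Sqoop. */
    public static class ExportFailedException extends Exception {
        public ExportFailedException(String message) {
            super(message);
        }
    }

    /** Hypothetical base class whose hook names mirror this page. */
    public abstract static class ExportFlow {
        // Records which hooks ran, so the sequence can be inspected.
        public final StringBuilder trace = new StringBuilder();

        protected void configureInputFormat() {
            trace.append("input;");
        }

        protected void configureMapper() {
            trace.append("mapper;");
        }

        protected int configureNumTasks() {
            trace.append("tasks;");
            return 1; // assumed default: a single map task
        }

        /** Subclasses decide how the job actually runs. */
        protected abstract boolean runJob() throws IOException;

        /** Template method: fixed skeleton, overridable steps. */
        public void runExport() throws IOException, ExportFailedException {
            configureInputFormat();
            configureMapper();
            int numMaps = configureNumTasks();
            if (numMaps < 1) {
                throw new ExportFailedException("Misconfigured: no map tasks");
            }
            if (!runJob()) {
                throw new ExportFailedException("Export job failed");
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExportFlow flow = new ExportFlow() {
            @Override
            protected boolean runJob() {
                return true; // pretend the job succeeded
            }
        };
        flow.runExport();
        System.out.println(flow.trace); // lists the hooks that ran, in order
    }
}
```

A caller of the real runExport would handle the two declared exceptions separately in the same way: IOException for I/O problems, and ExportException for a failed or misconfigured job.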

inputIsSequenceFiles

protected boolean inputIsSequenceFiles()
Returns:
true if the input directory contains SequenceFiles.


Copyright © 2010 Cloudera, Inc.