commit 318bc781117fa276ae81a3d111f5eeba0020634f
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Mar 20 10:44:19 2012 -0700

    CLOUDERA-BUILD. Allow Datanodes and Tasktrackers to connect to Namenodes and Jobtrackers with
    different build revisions.
    
    Reason: Allow DN/TT upgrade and host addition with refreshed builds
    Author: Eli Collins
    Ref: CDH-4560

commit 78ca997f549a89d60b39ae466f02a2797fa8003a
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Wed Mar 14 11:16:06 2012 -0700

    The LTC should set supplementary groups in addition to euid and egid when switching users.

commit 217a3767c48ad11d4632e19a22897677268c40c4
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Feb 13 23:27:11 2012 -0800

    Amend HDFS-854. Datanode should scan devices in parallel to generate block report.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-4455

commit 03b655719d13929bd68bb2c2f9cee615b389cea9
Author: Eli Collins <eli@cloudera.com>
Date:   Thu Jan 5 10:35:36 2012 -0800

    HDFS-2751. Datanode drops OS cache behind reads even for short reads.
    
    HDFS-2465 has some code which attempts to disable the "drop cache
    behind reads" functionality when the reads are <256KB (eg HBase
    random access). But this check was missing in the close()
    function, so it always drops cache behind reads regardless of the
    size of the read. This hurts HBase random read performance when
    this patch is enabled.
    
    Reason: Bug
    Author: Todd Lipcon
    Ref: CDH-4047
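
    Illustration only, not part of the patch: a minimal Python sketch of
    the size check described above, assuming the 256KB threshold from the
    message. BlockSenderSketch and should_drop_cache are hypothetical names,
    not the real DataNode classes.

```python
# Threshold taken from the commit message: reads under 256KB keep their cache.
SHORT_READ_THRESHOLD = 256 * 1024


def should_drop_cache(drop_behind_enabled: bool, bytes_read: int) -> bool:
    """Return True only when drop-behind is enabled and the read was long."""
    return drop_behind_enabled and bytes_read >= SHORT_READ_THRESHOLD


class BlockSenderSketch:
    """Hypothetical stand-in for the DN read path (assumption, not real code)."""

    def __init__(self, drop_behind_enabled: bool):
        self.drop_behind_enabled = drop_behind_enabled
        self.bytes_read = 0
        self.cache_drops = 0

    def read(self, num_bytes: int) -> None:
        self.bytes_read += num_bytes

    def close(self) -> None:
        # The bug was that close() skipped this check and always dropped
        # the cache; here close() applies the same size test as the read path.
        if should_drop_cache(self.drop_behind_enabled, self.bytes_read):
            self.cache_drops += 1
```

    With this check a 4KB random read leaves the OS cache alone on close,
    while a multi-megabyte sequential read still drops it.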

commit c9e37a4c57a325228ecb3d333dba302ac2098e2f
Author: Andrew Bayer <andrew@cloudera.com>
Date:   Tue Dec 20 10:15:26 2011 -0800

    Updating for CDH3u3 release.

commit a04c9e2a6bd20dcb50e7242b1c0fb35e5614d1cc
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Dec 18 14:05:30 2011 -0800

    HDFS-2702. A single failed name dir can cause the NN to exit.
    
    There's a bug in FSEditLog#rollEditLog which results in the NN process
    exiting if a single name dir has failed.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-3921

commit fcab1c7f36866fdc09cb9939ff8786d690502b81
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Dec 18 14:00:58 2011 -0800

    HDFS-2703. removedStorageDirs is not updated everywhere we remove a
    storage dir.
    
    There are a number of places (FSEditLog#open, purgeEditLog, and
    rollEditLog) where we remove a storage directory but don't add it to the
    removedStorageDirs list. This means a storage dir may have been removed
    but we don't see it in the log or Web UI.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-3921
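
    Illustration only, not part of the patch: one way to avoid this class
    of bug is to route every removal through a single helper that also
    records the directory. The Python sketch below assumes that design;
    StorageSketch and its fields are hypothetical names.

```python
class StorageSketch:
    """Hypothetical model of a set of storage dirs (assumption, not FSImage)."""

    def __init__(self, dirs):
        self.storage_dirs = list(dirs)
        self.removed_storage_dirs = []

    def remove_storage_dir(self, directory: str) -> bool:
        """Single removal path: drop the dir and always record the removal."""
        if directory not in self.storage_dirs:
            return False
        self.storage_dirs.remove(directory)
        # Recording here means every caller (open, purge, roll) is covered.
        self.removed_storage_dirs.append(directory)
        return True
```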

commit 09a73c40213864289256e8b7e749eef3f7caa778
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Dec 18 13:52:48 2011 -0800

    HDFS-2701. Cleanup FS* processIOError methods.
    
    Let's rename the various "processIOError" methods to be more
    descriptive. The current code makes it difficult to identify and
    reason about bug fixes. While we're at it let's remove "Fatal" from
    the "Unable to sync the edit log" log since it's not actually a fatal
    error (this is confusing to users). And 2NN "Checkpoint done" should
    be info, not a warning (also confusing to users).
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-3921

commit d0b1764daa4a3ca527d6ea553c009e46def2aeee
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Dec 18 13:28:46 2011 -0800

    Amend HADOOP-4885. HADOOP-4885 erroneously removed the call to
    processIOError in rollEditLog. Without it the NN process will not exit
    if there are no valid edit streams. GetImageServlet will NPE but the
    NN keeps running, accepting modifications it will not be able to
    persist.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-3921

commit 9817a9bd9bf215c4f66268e8d5e9f87cd8a417b4
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Dec 18 13:28:21 2011 -0800

    Amend HADOOP-4885. Cleanup.
    
    Author: Eli Collins
    Ref: CDH-3921

commit 60a51e89b21e9577519b447d3779acc8b83ce990
Author: Harsh J <harsh@cloudera.com>
Date:   Wed Nov 23 14:09:55 2011 +0530

    CLOUDERA-BUILD. Log jsvc output to HADOOP_LOG_DIR instead of /tmp.
    
    Description: JSVC output currently goes to /tmp and is not configurable.
    Reason: Customer request
    Author: Harsh J
    Ref: CDH-3832

commit 3d21fccdb5d67426a05655377e5dc0b926479673
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Dec 13 18:46:10 2011 -0800

    CLOUDERA-BUILD. Update url for searching generated docs.

commit 164e84c499699ada743fdc5f39dce212f3e742fc
Author: Philip Zeyliger <philip@cloudera.com>
Date:   Sun Dec 4 21:31:52 2011 -0800

    CLOUDERA-BUILD. Add deprecated alias to removed TransferFsImage method.
    
    The fix for HDFS-2305/CDH-2761 changed the method signature
    of a static method in TransferFsImage.  This adds back
    the old method signature to allow for compatibility with
    plug-ins that wish to use it with CDH3.
    
    Reason: Compatibility
    Author: Philip Zeyliger
    Ref: OPSAPS-5105, CDH-2761

commit 7a636fd4deeb1fbe8f73dd1cab15c180e996ef8a
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Dec 9 16:55:46 2011 -0800

    HDFS-2654. Make BlockReaderLocal not extend RemoteBlockReader2.
    
    Reason: Performance
    Author: Eli Collins
    Ref: CDH-3850

commit 2b54843a4f2fb1d85c4a2cc4a4ad981f961e6a77
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Dec 9 20:25:09 2011 -0800

    HDFS-2653. DFSClient should cache whether addrs are non-local when
    short-circuiting is enabled.
    
    Reason: Performance
    Author: Eli Collins
    Ref: CDH-3850

commit c955c99664e9902542ce4a0787ffac909a22a69c
Author: Eli Collins <eli@cloudera.com>
Date:   Thu Dec 8 17:07:29 2011 -0800

    HDFS-2246. Shortcut a local client's reads to a Datanode's files directly.
    
    Reason: Performance
    Author: Jitendra Nath Pandey, Eli Collins
    Ref: CDH-3850

commit ae79e854209b90fcf5e574a8c927cde3bdd3eb9c
Author: Jonathan Hsieh <jon@cloudera.com>
Date:   Tue Dec 6 08:08:25 2011 -0800

    HADOOP-6886 LocalFileSystem Needs createNonRecursive API
    
    Reason: Bug (HBase data loss)
    Author: Jitendra Nath Pandey and Nicolas Spiegelberg
    Ref: CDH-3816

commit fa023cef12584d6f38f17b05ea95445eb187cb9e
Author: Jonathan Hsieh <jon@cloudera.com>
Date:   Fri Dec 2 17:15:06 2011 -0800

    HADOOP-7879 DistributedFileSystem#createNonRecursive should also incrementWriteOps statistics.
    
    Reason: Bug (hbase data loss)
    Author: Jonathan Hsieh
    Ref: CDH-3798

commit dc1f32f60531bc146d6c42f473d5ba6a0b59fff3
Author: Jonathan Hsieh <jon@cloudera.com>
Date:   Wed Nov 30 08:54:06 2011 -0800

    HADOOP-7870 Fix recursive create.
    
    Reason: Bug (hbase data loss)
    Author: Jonathan Hsieh
    Ref: CDH-3798

commit 9934440df63c790387a54b212aacad4ee12a9dc9
Author: Jonathan Hsieh <jon@cloudera.com>
Date:   Mon Nov 28 10:19:56 2011 -0800

    HADOOP-6840 Support non-recursive create() in FileSystem and SequenceFile.Writer
    
    Reason: Bug (hbase data loss)
    Author: Nicolas Spiegelberg and Jitendra Nath Pandey
    Ref: CDH-3815

commit 2d52a4c5814814bf0a95c71906d2b4efcc1aa755
Author: Jonathan Hsieh <jon@cloudera.com>
Date:   Mon Nov 28 10:50:31 2011 -0800

    HDFS-617 Support for non-recursive create() in HDFS
    
    This backport adds CDH-specific backwards-compatibility handling of non-recursive file create()
    
    Reason: Bug (hbase data loss)
    Author: Kan Zhang
    Ref: CDH-3815

commit 356e443236ce15100943cfeabc9001b1d26bc77c
Author: Alejandro Abdelnur <tucu@cloudera.com>
Date:   Fri Dec 9 13:46:10 2011 -0800

    HADOOP-7902 skipping name rules setting (if already set) should be done on UGI initialization only
    
      Fixes regression introduced by HADOOP-7887
    
      Author: Alejandro Abdelnur
      Ref: CDH-3898

commit b9bc59e6d6f024d69f7cbee65b50fc0d5f99ead4
Author: Alejandro Abdelnur <tucu@cloudera.com>
Date:   Wed Dec 7 15:38:31 2011 -0800

    HADOOP-7887 KerberosAuthenticatorHandler is not setting KerberosName name rules from configuration
    
      Author: Alejandro Abdelnur
      Ref: CDH-3890

commit 25d2544f138b41918c7861880c4dc61988b808b0
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Dec 5 16:16:31 2011 -0800

    HDFS-2638. Improve a block recovery log.
    
    It would be useful to know whether an attempt to recover a block is
    failing because the block was already recovered (has a new GS) or the
    block is missing.
    
    Reason: Debugging
    Author: Eli Collins
    Ref: CDH-3888

commit df2ba14900cd081fa87eec9aedf3f75eca9c2885
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Dec 6 10:54:51 2011 -0800

    HDFS-2637. The rpc timeout for block recovery is too low.
    
    The RPC timeout for block recovery does not take into account that it
    issues multiple RPCs itself. This can cause recovery to fail if the
    network is congested or DNs are busy.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-3834

commit 8b29bcbea6d6886e84d6dfced7179439f219543e
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Nov 27 21:57:11 2011 -0800

    HDFS-854. Datanode should scan devices in parallel to generate block report.
    
    A Datanode should scan its disk devices in parallel so that the time
    to generate a block report is reduced. This will reduce the startup
    time of a cluster.
    
    Author: Dmytro Molkov, Eli Collins
    Ref: CDH-3853

commit caa7d7ddd79b0775edb780c6a9cbf71d23bc4a99
Author: Alejandro Abdelnur <tucu@cloudera.com>
Date:   Tue Nov 29 11:49:09 2011 -0800

    HADOOP-7853 multiple javax security configurations cause conflicts.
    
      Reason: Hadoop-auth initialization and UGI initialization issues due to global config.
      Author: Daryn Sharp
      Ref: CDH-3865

commit ade43c9a6eee7eec2f237edcadb370f07c72a176
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Nov 28 10:47:17 2011 -0800

    CLOUDERA-BUILD. Remove external guava dependency.
    
    Remove the guava-r09 external dependency by rebasing the r09 jar on
    o.a.h.thirdparty.guava and bundling the rebased jar in our lib dir.
    
    Author: Eli Collins
    Ref: CDH-3833

commit aaeecc5050d10ff5788cea98de3004e72a0c3a3c
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Nov 27 18:20:29 2011 -0800

    CLOUDERA-BUILD. Update eclipse classpath.

commit 99b2072558bb79eea211c1965a6c896750c850ea
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Nov 27 10:23:23 2011 -0800

    Amend MAPREDUCE-3015. Rename TaskTrackerStatus#getTaskFailures.
    
    Reason: JobTracker plugin compatibility
    Author: Eli Collins
    Ref: CDH-3307

commit 12c64dfa5a3548776a039c835c5e5c7aa844a2f5
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Nov 25 23:45:41 2011 -0800

    MAPREDUCE-2413. TT should handle disk failures by reinitializing itself.
    MAPREDUCE-2928. MR-2413 improvements.
    MAPREDUCE-2957. The TT should not re-init if it has no good local dirs.
    MAPREDUCE-2850. Add test for MAPREDUCE-2413.
    MAPREDUCE-3395. Add mapred.disk.healthChecker.interval to mapred-default.xml.
    MAPREDUCE-2415. Distribute the user task logs on to multiple disks.
    MAPREDUCE-3424. MR-2415 cleanup.
    MAPREDUCE-3015. Add local dir failure info to metrics and the web UI.
    MAPREDUCE-3419. Don't mark exited TT threads as dead in MiniMRCluster.
    
    Author: Ravi Gummadi, Bharath Mundlapudi, Eli Collins
    Ref: CDH-3307

commit 84bbaaca1521811875b9926d98658259421be1f6
Author: Eli Collins <eli@cloudera.com>
Date:   Sat Nov 19 18:59:15 2011 -0800

    HDFS-2541. For a sufficiently large number of blocks, the DN Scanner
    may request a random number with a negative seed value.
    
    Author: Harsh J
    Ref: CDH-3803
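
    Illustration only, not part of the patch: a Python sketch of how a
    32-bit multiplication can go negative for a large block count, and how
    widening before clamping avoids it. The per-block multiplier of
    600000 ms and all function names are assumptions made for the example.

```python
import random

INT_MIN, INT_MAX = -2**31, 2**31 - 1


def to_int32(value: int) -> int:
    """Wrap an integer the way Java 32-bit int arithmetic does."""
    return (value - INT_MIN) % 2**32 + INT_MIN


def scan_period_overflowing(num_blocks: int, ms_per_block: int = 600_000) -> int:
    # int * int wraps silently, so a large block count yields a negative value.
    return to_int32(max(num_blocks, 1) * ms_per_block)


def scan_period_safe(num_blocks: int, max_period_ms: int,
                     ms_per_block: int = 600_000) -> int:
    # Multiply in wide arithmetic first, then clamp into the positive int range.
    return min(max_period_ms, max(num_blocks, 1) * ms_per_block, INT_MAX)


def pick_offset(bound: int, rng: random.Random) -> int:
    # Mirrors java.util.Random.nextInt(n), which rejects n <= 0.
    if bound <= 0:
        raise ValueError("bound must be positive")
    return rng.randrange(bound)
```

    Under these assumptions 5000 blocks already overflow: 5000 * 600000
    exceeds the largest 32-bit int, so the wrapped value is negative and
    pick_offset rejects it.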

commit 898431eb9641a2932a09ff1373f478b0f77075d5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Nov 22 11:44:28 2011 -0800

    MAPREDUCE-2905. Fix fair scheduler to prevent clumping of tasks when assignmultiple is enabled.
    
    Reason: spread load more evenly on clusters with many slots
    Author: Todd Lipcon and Jeff Bean
    Ref: CDH-3509

commit 1944559432d8af46f68feb041969bd26b2f950b8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Nov 15 14:20:09 2011 -0800

    MAPREDUCE-936. Allow a load difference in fairshare scheduler
    
    Improves throughput of task scheduling in the scheduler, by allowing
    some nodes to have more tasks scheduled than others while scheduling
    is happening.
    
    Reason: Backporting in advance of MAPREDUCE-2905, which depends on this patch.
    Author: Zheng Shao
    Ref: CDH-3509

commit 62916ff723c390df008abce1b0fd80cccdbc4105
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Nov 20 15:39:23 2011 -0800

    HADOOP-7457. Remove out-of-date Chinese language documentation.
    
    Author: Jakob Homan
    Ref: CDH-3842

commit abe5bf1c3ef82f90c356695fc16eb615293a56df
Author: Eli Collins <eli@cloudera.com>
Date:   Sat Nov 19 23:15:06 2011 -0800

    MAPREDUCE-2555. Avoid spurious logging from completed tasks.
    
    Author: Thomas Graves
    Ref: CDH-3855

commit a390e45180fbaaf4a3ea84d97ec0316c182ec8c9
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Nov 20 11:22:14 2011 -0800

    HADOOP-6614. RunJar should provide more diags when it can't create a temp file.
    
    Author: Jonathan Hsieh
    Ref: CDH-3841

commit e732b3c97a9232182d3a28917ca3f9006f968ac9
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Nov 18 15:45:26 2011 -0800

    MAPREDUCE-3343. TaskTracker Out of Memory because of distributed cache.
    
    This Out of Memory error happens when you run a large number of jobs
    (using the distributed cache) on a TaskTracker.
    
    Author: Zhao Yunjiong
    Ref: CDH-3798

commit 371e8d5b2a38f743adecbe146b7c7e77683568c8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Nov 16 12:33:47 2011 -0800

    HADOOP-7761. Improve performance of raw comparisons.
    
    Reason: low risk performance improvement
    Author: Todd Lipcon
    Ref: CDH-3822

commit 8a120e1edaf1a833708003af806d7929f84a08be
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Nov 16 12:15:14 2011 -0800

    HDFS-2379. Allow block reports to proceed without holding FSDataset lock
    
    Reason: fix timeouts talking to DNs on datanodes with lots of blocks
    Author: Todd Lipcon
    Ref: CDH-3823

commit 2818706e4a48344eefb3f840fc70d8f3f34a381e
Author: Eli Collins <eli@cloudera.com>
Date:   Thu Nov 17 08:44:19 2011 -0800

    Amend MAPREDUCE-2777. Remove TestTTMemoryReporting.

commit e691c9ae61baee4920ac8dd8535bb75503339bd8
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Nov 16 16:30:55 2011 -0800

    MAPREDUCE-2777. Adds cumulative cpu usage and total heap usage to task
    counters. Backport MAPREDUCE-220 and MAPREDUCE-2469.
    
    MAPREDUCE-220. Collecting cpu and memory usage for MapReduce tasks.
    
    It would be nice for TaskTracker to collect cpu and memory usage for
    individual Map or Reduce tasks over time.
    
    MAPREDUCE-2469. Task counters should also report the total heap usage
    of the task.
    
    Currently, the task counters report VSS and RSS usage of the task. The
    task counters should also report the total heap usage of the task.
    The task might be configured with a max heap size of M but the
    task's total heap usage might only be H, where H < M. In such a case,
    knowing only M doesn't provide a complete picture of the task's memory
    usage.
    
    Author: Scott Chen, Amar Kamat
    Ref: CDH-1458

commit ea81df1ff5de3b12df08e22eeffca29295914eae
Author: Roman Shaposhnik <rvs@cloudera.com>
Date:   Wed Nov 16 17:22:21 2011 -0800

    BIGTOP-261. pseudo distributed config would benefit from dfs.safemode.extension set to 0 and dfs.safemode.min.datanodes set to 1
    
    Reason: Improvement
    Author: Roman Shaposhnik
    Ref: DISTRO-330

commit b45a9f40b02b5d5859c389bfa7b17df94317614a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Nov 11 14:40:01 2011 -0800

    MAPREDUCE-3289. Use fadvise in the TaskTracker's MapOutputServlet.
    
    The TaskTracker now uses the posix_fadvise syscall to page in map output
    before serving it to the reducers. After serving the output, it evicts
    it from the buffer cache since it will not be read again in the majority
    of cases.
    
    This new behavior can be disabled by setting mapred.tasktracker.shuffle.fadvise
    to false.
    
    This patch differs from the upstream version since the upstream version applies
    to the NodeManager in MR2.
    
    Reason: Low-risk performance improvement
    Author: Todd Lipcon
    Ref: CDH-3818
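
    Illustration only, not part of the patch: a Python sketch of the
    page-in, serve, evict sequence described above, using os.posix_fadvise
    where the platform provides it. The function names are hypothetical and
    the advice calls are treated as best effort.

```python
import os


def _advise(fd: int, length: int, advice_name: str) -> None:
    """Best-effort fadvise; silently skipped where unsupported."""
    if not hasattr(os, "posix_fadvise"):
        return
    try:
        os.posix_fadvise(fd, 0, length, getattr(os, advice_name))
    except OSError:
        pass  # advice is only a hint, so a failure must not break serving


def serve_with_fadvise(path: str, chunk_size: int = 64 * 1024) -> bytes:
    """Read a whole file, hinting the kernel before and after the read."""
    chunks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        length = os.fstat(fd).st_size
        # Ask the kernel to page the data in before it is served.
        _advise(fd, length, "POSIX_FADV_WILLNEED")
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
        # The data is unlikely to be read again, so let the kernel evict it.
        _advise(fd, length, "POSIX_FADV_DONTNEED")
    finally:
        os.close(fd)
    return b"".join(chunks)
```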

commit 139923b6c91849e59e4288f65c245f7a71cecc22
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Nov 14 15:43:23 2011 -0800

    MAPREDUCE-3184. Add a thread to the TaskTracker which monitors for spinning Jetty selector threads, and shuts down the daemon when one is detected.
    
    Reason: detect common JVM/Jetty bug and cause the TT to suicide, minimizing impact on running jobs
    Author: Todd Lipcon
    Ref: CDH-2785

commit 45504c489dfa7255be40ecf2f2a7a8c60cece01b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Oct 26 17:57:22 2011 -0700

    HDFS-2267. DataXceiver thread name incorrect while waiting on op during keepalive.
    
    Reason: trivial bug fix for thread names after HDFS-941
    Author: Todd Lipcon
    Ref: CDH-3777

commit 7ed1f860044cef4a43b1eceadf2d0a5e8f11c174
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Oct 26 16:53:27 2011 -0700

    HDFS-941. Reuse connections between client and DN
    
    Also incorporates HDFS-2071. Use of isConnected() in DataXceiver is invalid
    
    Reason: big performance improvement for random-read workloads
    Author: bc Wong and Todd Lipcon
    Ref: CDH-3777

commit e7c439b8842ae77d185c85d56ce294f8c6467507
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Apr 21 23:03:27 2010 -0700

    HDFS-1001. DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
    
    Reason: This patch is necessary for backport of HDFS-941 (socket reuse).
    Author: bc Wong
    Ref: CDH-3777

commit 11a4341b9a7cddce9f86b1a47219c64a09cd0c8f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Nov 11 14:39:43 2011 -0800

    HDFS-2465. Add HDFS support for fadvise readahead and drop-behind.
    
    The DataNode now can pass IO advice down to the operating system to improve
    performance. The new behavior defaults off and can be enabled with the
    following configs:
    - dfs.datanode.readahead.bytes (number of bytes to readahead)
    - dfs.datanode.drop.cache.behind.writes (boolean)
    - dfs.datanode.sync.behind.writes (boolean)
    - dfs.datanode.drop.cache.behind.reads (boolean)
    
    Reason: low-risk performance improvements
    Author: Todd Lipcon
    Ref: CDH-3818

commit d23d17a52da7e94deac6d720e7d62706f00ee6f8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Nov 11 14:38:40 2011 -0800

    HADOOP-7753. Support fadvise and sync_file_range in NativeIO. Add ReadaheadPool infrastructure for use in HDFS and MR.
    
    Reason: low-risk performance improvement
    Author: Todd Lipcon
    Ref: CDH-3818

commit a888190fdc5b6356dd5325073b3120c6dfc66288
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Oct 26 21:14:35 2011 -0700

    MAPREDUCE-3278. Fix a busy loop in ReduceTask that would cause 100% cpu utilization during the fetch phase.
    
    Previously, if the number of fetch threads in the reducer exceeded the number
    of unique hosts on which map outputs were available, the reducer would spin
    in a tight loop waiting for fetches to complete. This adds a proper wait/notify
    to avoid wasting CPU.
    
    Author: Todd Lipcon
    Reason: low risk performance improvement
    Ref: CDH-3817
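
    Illustration only, not part of the patch: a Python sketch of replacing
    a spin loop with wait/notify, using threading.Condition. FetchTracker
    and its methods are hypothetical names, not the ReduceTask code.

```python
import threading


class FetchTracker:
    """Hypothetical tracker of completed map-output fetches."""

    def __init__(self):
        self._cond = threading.Condition()
        self._completed = 0

    def fetch_completed(self) -> None:
        # Called by a fetch thread; wakes any waiter instead of being polled.
        with self._cond:
            self._completed += 1
            self._cond.notify_all()

    def wait_for_completions(self, count: int, timeout=None) -> bool:
        # Blocks without burning CPU until enough fetches finish or the
        # timeout expires; returns False on timeout.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._completed >= count, timeout)
```

    The waiting thread sleeps inside wait_for and is woken only when a
    fetch thread calls fetch_completed, so an idle wait uses no CPU.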

commit 6ef50ae8cebf1b974e3fd4da6c18e2b7ff12dc52
Author: Ahmed Radwan <ahmed@cloudera.com>
Date:   Fri Nov 11 15:38:00 2011 -0800

    HDFS-94. The "Heap Size" in HDFS web ui may not be accurate.
    
    Reason: Bug
    Author: Dmytro Molkov
    Ref: CDH-3681

commit f2d1ba670c2ede6b02f4ee86ca57fed7cedfc38f
Author: Andrew Bayer <andrew@cloudera.com>
Date:   Wed Nov 9 09:47:23 2011 -0800

    CLOUDERA-BUILD. Setting default for IVY_MIRROR_PROP.

commit fcfa442e85a7f3f107b2d4ea71b5f362b9fd3f99
Author: Andrew Bayer <andrew@cloudera.com>
Date:   Wed Nov 2 13:11:26 2011 -0700

    CLOUDERA-BUILD. Adding IVY_MIRROR_PROPS to ant calls.
    
    This will allow overriding the URLs Ivy uses for Maven repositories,
    so that we can have internal builds take advantage of our internal
    Maven mirror.

commit 7b21fe4cdbd166feb49af7b0a5c266010b8cc1aa
Author: Andrew Bayer <andrew@cloudera.com>
Date:   Thu Oct 20 09:23:21 2011 -0700

    Updating for CDH3u3 development

commit 95a824e4005b2a94fe1c11f1ef9db4c672ba43cb
Author: Roman Shaposhnik <rvs@cloudera.com>
Date:   Tue Oct 11 18:00:22 2011 -0700

    CLOUDERA-BUILD. hadoop-0.20 package should not ship a cloudera folder

commit 5fc6261a4c399bcb75bcc7cadf6cdd74f9362bcd
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Mon Oct 10 18:02:41 2011 -0700

    HDFS-2422. The NN should tolerate the same number of low-resource volumes as failed volumes
    
    Reason: Bug
    Author: Aaron T. Myers
    Ref: CDH-3684

commit 207f93e5bb26dab8f38ac0f4e0740f9ac1910791
Author: Andrew Bayer <andrew@cloudera.com>
Date:   Fri Sep 30 09:46:56 2011 -0700

    Prep for CDH3u2 release

commit 4f2ec73b8231ee5c7f4b1423f5a5dd895386fd2c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Sep 28 03:51:39 2011 -0700

    MAPREDUCE-2980. Use patched jetty to avoid fetch failures and other HTTP-related issues
    
    This changes the jetty build to the following tag:
    https://github.com/toddlipcon/jetty-hadoop-fix/tree/6.1.26.cloudera.1
    
    This tag was built by taking Jetty 6.1.26, then merging the NIO
    selector code from the Jetty "6.1.22z6" branch, provided by Greg Wilkins.
    In cluster testing, it resolves many HTTP-related issues.
    
    Reason: avoid high fetch failure rate, causing production issues
    Author: Todd Lipcon
    Ref: CDH-2785

commit 4c8f9e91fde59526b729c56276c5649e58fa10fb
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Sep 27 20:41:15 2011 -0700

    HDFS-2332. Add test for HADOOP-7629 (using an immutable FsPermission object as an RPC parameter fails).
    
    Author: Todd Lipcon
    Reason: unit test corresponding to other backport
    Ref: CDH-3568

commit 864b534240e5837629c1419893aa75346e66bf64
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Sep 27 20:40:18 2011 -0700

    HADOOP-7629. Allow immutable FsPermission objects to be used as IPC parameters.
    
    Author: Todd Lipcon
    Reason: necessary for Mahout tests to pass
    Ref: CDH-3568

commit ba8f83f3ff5877e2f4039dc0c02cb26fc4f45852
Author: Harsh J <harsh@cloudera.com>
Date:   Fri Sep 23 17:19:58 2011 +0530

    MAPREDUCE-2932. Missing instrumentation plugin class shouldn't crash the TT startup
    
    By design, a missing instrumentation plugin class shouldn't crash TT
    startup; the TT should fall back to the default instead.
    
    Reason: Improvement
    Author: Harsh J
    Ref: CDH-3533

commit 3b6931e2d9882c8b4aa83436fefa6d2bb36973c8
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Thu Sep 22 18:47:02 2011 -0700

    HADOOP-7674. TestKerberosName fails in 20 branch.
    
    Reason: Bug
    Author: Jitendra Nath Pandey
    Ref: CDH-3632

commit a54b6aa06f2b7e22691455c043cf9535fb51d703
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Thu Sep 22 18:41:54 2011 -0700

    Amend HADOOP-7119. Add in duplicate test TestKerberosName
    
    Reason: Didn't commit this test originally since it would have always failed.
    Author: Alejandro Abdelnur
    Ref: CDH-3558

commit 15d9970fde48893b17320bf6596b39f4c512b0f2
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Thu Sep 22 18:27:40 2011 -0700

    HADOOP-7645. HTTP auth tests requiring Kerberos infrastructure are not disabled on branch-0.20-security
    
    Reason: Bug
    Author: Jitendra Nath Pandey
    Ref: CDH-3609

commit b85680d3efc1fbefa6e2237e5756562106b0c39b
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Thu Sep 22 18:26:14 2011 -0700

    Amend HADOOP-7119. Add in tests which require Kerberos infrastructure
    
    Reason: Didn't commit these tests originally since they would have always failed.
    Author: Alejandro Abdelnur
    Ref: CDH-3558

commit e6697b11c4d14cfb0832eae2ac0c496684a69f22
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Wed Sep 21 16:02:24 2011 -0700

    HADOOP-7621. alfredo config should be in a file not readable by users
    
    Reason: Bug
    Author: Alejandro Abdelnur
    Ref: CDH-3608

commit f77da510bd7e4606477bf475731c8b303ffcae68
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Wed Sep 21 15:58:20 2011 -0700

    HADOOP-7666. branch-0.20-security doesn't include o.a.h.security.TestAuthenticationFilter
    
    Reason: Bug
    Author: Aaron T. Myers
    Ref: CDH-3626

commit d6c9b1b1a69e27a2d7ea66cc6bca427ebc0ed426
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Wed Sep 21 15:56:13 2011 -0700

    HADOOP-7665. branch-0.20-security doesn't include SPNEGO settings in core-default.xml
    
    Reason: Bug
    Author: Aaron T. Myers
    Ref: CDH-3625

commit 81caf3bf3a5099a7b15a6f55aae930d7beb96a0c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Sep 18 10:14:23 2011 -0700

    HDFS-1779. Fix a bug regarding recovery of blocks being written while NN restarts
    
    This patch adds a new RPC 'blocksBeingWrittenReport()' which the DN calls
    on startup and when it re-connects to a restarted NameNode. This reports
    all blocks currently under construction, so the NN can re-add them to
    the targets list for a block if necessary.
    
    Reason: avoid HBase data-loss scenario when NN crashes
    Author: Hairong Kuang
    Ref: CDH-3507

commit b5bf4322cc047c1f95b814b49bc872c1433dd235
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Sep 21 16:49:22 2011 -0700

    HADOOP-7653. tarball doesn't include .eclipse.templates.
    
    The hadoop tarball doesn't include .eclipse.templates. This results in
    a failure to successfully run ant eclipse-files.
    
    Reason: Bug
    Author: Jonathan Natkins
    Ref: CDH-3266

commit 77e1a32e46942124c1dcaa8ba731da7e499bd547
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Mon Sep 19 00:10:39 2011 -0700

    HADOOP-7119. add Kerberos HTTP SPNEGO authentication support to Hadoop JT/NN/DN/TT web-consoles
    
    Reason: New Feature
    Author: Alejandro Abdelnur
    Ref: CDH-3558

commit d6d4c8bbd31486ad7661331044d0d568f6b5eabb
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Sep 16 16:37:32 2011 -0700

    HDFS-2186. DN volume failures on startup are not counted.
    
    Volume failures detected on startup are not currently counted/reported
    as such. Eg if you have configured 4 volumes, 2 tolerated failures,
    and you start a DN with two failed volumes it will come up and report
    (to the NN) no failed volumes. The DN will still be able to tolerate 2
    additional volume failures (ie it's OK with no valid volumes
    remaining). The intent of the volume failure toleration config value
    is that if more than this # of volumes of the total set of configured
    volumes have failed the DN should shut down, therefore volume failures
    detected on startup should count against this quota.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-3371
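
    Illustration only, not part of the patch: a Python sketch of the
    accounting described above, where failures found at startup count
    against the tolerated total. The function name and parameters are
    hypothetical.

```python
def should_shut_down(failed_at_startup: int, failed_at_runtime: int,
                     tolerated: int, count_startup_failures: bool = True) -> bool:
    """Return True when more volumes have failed than the config tolerates."""
    failed = failed_at_runtime
    if count_startup_failures:
        # The fix: volumes already bad at startup use up the quota too.
        failed += failed_at_startup
    return failed > tolerated
```

    For the example in the message (4 volumes, 2 tolerated, 2 failed at
    startup), the DN stays up until one more volume fails. With
    count_startup_failures=False, which models the old behaviour, it would
    keep running even after all 4 volumes had failed.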

commit 8f0f3f85374720b8daa89b69a053a85138804f94
Author: Ahmed Radwan <ahmed@cloudera.com>
Date:   Fri Sep 16 01:55:10 2011 -0700

    MAPREDUCE-2836. Provide option to fail jobs when submitted
    to non-existent pools.
    
    Reason: Improvement
    Author: Ahmed Radwan
    Ref: CDH-3464

commit ea52f4e327fea44c37ff6f1c99619b254bf1f25e
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Tue Sep 13 13:48:57 2011 -0700

    HDFS-2305. Running multiple 2NNs can result in corrupt file system
    
    Reason: Bug
    Author: Aaron T. Myers
    Ref: CDH-2761

commit 4d71976d7346279ff6aa6063b8e9562a9d0df281
Author: Ahmed Radwan <ahmed@cloudera.com>
Date:   Wed Aug 17 12:52:48 2011 -0700

    MAPREDUCE-2992. TestLinuxTaskController is broken.
    
    Reason: Bug
    Author: Ahmed Radwan
    Ref: CDH-3477

commit f29635a416a79d2aad6fa7361438381d9c02b713
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Aug 15 18:38:54 2011 -0700

    HDFS-1480. Fix some cases where rack policy could be violated
    
    This patch fixes an issue in determining replication targets where
    decommissioning or corrupt replicas were considered the same as
    valid replicas when considering rack locality policy. This would
    cause all replicas of a block to end up on the same rack when
    many nodes were decommissioned.
    
    Reason: avoid replication policy violation
    Ref: CDH-3069
    Author: Todd Lipcon
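
    Illustration only, not part of the patch: a Python sketch of counting
    racks using only valid replicas, so decommissioning or corrupt copies
    do not satisfy the rack policy. Replica, the field names, and the
    two-rack minimum are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """Hypothetical replica record (assumption, not the NN data structure)."""
    rack: str
    decommissioning: bool = False
    corrupt: bool = False


def racks_with_valid_replicas(replicas):
    """Racks that hold at least one live, non-corrupt replica."""
    return {r.rack for r in replicas
            if not (r.decommissioning or r.corrupt)}


def needs_another_rack(replicas, min_racks: int = 2) -> bool:
    # Only valid replicas count toward the rack policy.
    return len(racks_with_valid_replicas(replicas)) < min_racks
```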

commit cac73dae4df7c536f36870083840a1a8f8c44303
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Sep 7 18:19:33 2011 -0700

    MAPREDUCE-2760. mapreduce.jobtracker.split.metainfo.maxsize typoed in
    mapred-default.xml.
    
    The configuration mapreduce.jobtracker.split.metainfo.maxsize is
    incorrectly included in mapred-default.xml as
    mapreduce.job.split.metainfo.maxsize. It seems that jobtracker is
    correct, since this is a JT-wide property rather than a job property.
    
    Reason: Bug
    Author: Todd Lipcon
    Ref: CDH-3547

commit 9459e990a858f2452f04de02fce4cd011c1a8c6d
Author: Alejandro Abdelnur <tucu@cloudera.com>
Date:   Thu Aug 25 08:46:06 2011 -0700

    HADOOP-7507. jvm metrics all use the same namespace.
    
      Reason: Bug
      Author: Alejandro Abdelnur
      Ref: CDH-3297

commit 542c18a9d5d871d6363f93d99133e627688ef564
Author: Harsh J <harsh@cloudera.com>
Date:   Fri Aug 26 14:05:42 2011 +0530

    HDFS-1959. Better error message for missing namenode directory.
    
    Better error message when NN starts with a missing name
    dir.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-3502

commit 64eca816d6b6e35c27464065d300a05932165ac8
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Fri Aug 26 12:44:14 2011 -0700

    HDFS-970. FSImage writing should always fsync before close
    
    Reason: Bug
    Author: Todd Lipcon
    Ref: CDH-3474

commit dbf17dc0186ee1096f760751d2bb872798d91f26
Author: Alejandro Abdelnur <tucu@cloudera.com>
Date:   Thu Aug 25 17:20:08 2011 -0700

    CLOUDERA-BUILD. Add Snappy-Java config file to switch off in-JAR native library
    
        Reason: Improvement
        Author: Alejandro Abdelnur
        Ref: CDH-3492

commit e87de0366ed98d24abb2f575509937ec25f38330
Author: Roman Shaposhnik <rvs@cloudera.com>
Date:   Mon Aug 22 14:50:45 2011 -0700

    CLOUDERA-BUILD. 32-bit builds of jsvc embed 64-bit library paths

commit 76e6564a623218e81417b241a4f1fa71b1db5606
Author: Ahmed Radwan <ahmed@cloudera.com>
Date:   Wed Aug 3 10:36:58 2011 -0700

    MAPREDUCE-2651. Race condition in LTC for job log directory creation
    
    Reason: Bug
    Author: Bharath Mundlapudi
    Ref: CDH-3385

commit 935bf0003568fa985ea241b5e68c1dc462395d13
Author: Roman Shaposhnik <rvs@cloudera.com>
Date:   Wed Aug 17 20:47:10 2011 -0700

    CLOUDERA-BUILD. Provide libsnappyjava.so

commit e3f8dc3926d119ce3b765325114e5b8ada01120f
Author: Tom White <tom@cloudera.com>
Date:   Wed Aug 10 17:12:27 2011 -0700

    MAPREDUCE-1943. Implement limits on per-job JobConf, Counters, StatusReport, Split-Sizes
    
    Reason: Improvement
    Author: Mahadev Konar
    Ref: CDH-1794

commit aeebe0d1415a3dc7a70aa88771607ef1eaebb192
Author: Tom White <tom@cloudera.com>
Date:   Wed Aug 10 16:59:00 2011 -0700

    MAPREDUCE-1482. Better handling of task diagnostic information stored in the TaskInProgress.
    
    Reason: Improvement
    Author: Amar Kamat
    Ref: CDH-1794

commit 27e22060592cee0ce920b592a43e46f84d01857b
Author: Eli Collins <eli@cloudera.com>
Date:   Sat Aug 13 13:15:21 2011 -0700

    HDFS-2259. DN web-UI doesn't work with paths that contain html.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-3304

commit af12a8df06d5a9a72f2f22b2c47e71808437ac81
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Aug 12 13:28:44 2011 -0700

    HDFS-2235. Encode servlet paths.
    HADOOP-7531. Add servlet util methods for handling paths in requests.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-3304

commit d46071c27b19c7a691005882260967d62cda6dfd
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Aug 3 13:48:23 2011 -0700

    HDFS-1317. HDFSProxy needs additional changes to work after changes to
    streamFile servlet in HDFS-1109.
    
    Reason: Bug
    Author: Rohini Palaniswamy
    Ref: CDH-3304

commit bdd9f9900811d5032e0c29b013ef45b46f7ffea2
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Aug 3 13:43:49 2011 -0700

    HDFS-1109. HFTP and URL Encoding.
    
    Reason: Bug
    Author: Dmytro Molkov
    Ref: CDH-3304

commit 349cd124819f31d29c0a6dad7f21ad595e7ab788
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Aug 3 13:32:41 2011 -0700

    HDFS-1340. A null delegation token is appended to the url if security
    is disabled when browsing filesystem.
    
    Reason: Bug
    Author: Jitendra Pandey
    Ref: CDH-3304

commit 581dfd390459c3e3b9962724330b4b33967f4559
Author: Eli Collins <eli@cloudera.com>
Date:   Sat Aug 13 19:00:43 2011 -0700

    HDFS-2023. Backport of NPE for File.list and File.listFiles. Merged
    ports of HADOOP-7322, HDFS-1934, HADOOP-7342, and HDFS-2019.
    
    Reason: Bug
    Author: Bharath Mundlapudi
    Ref: CDH-3307

commit 95683ebe551b9e4ee2e0b0419696dc46e2710162
Author: Eli Collins <eli@cloudera.com>
Date:   Sat Aug 13 19:17:59 2011 -0700

    CLOUDERA-BUILD. Point *-default doc links to the right place.

commit 28582806dc186d5abcbdc0c442d72eff84aa2c34
Author: Ahmed Radwan <ahmed@cloudera.com>
Date:   Tue Aug 9 19:24:53 2011 -0700

    MAPREDUCE-2524. Backport trunk heuristics for failing maps when we get fetch
    failures retrieving map output during shuffle.
    
    Reason: Improvement
    Author: Thomas Graves
    Ref: CDH-3441

commit 33a9d3f31ae34a873890b2a8b16bfb49808dc537
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Mon Aug 8 15:05:11 2011 -0700

    HDFS-2190. NN fails to start if it encounters an empty or malformed fstime file
    
    Reason: Bug
    Author: Aaron T. Myers
    Ref: CDH-3331

commit 8d1c5e03eb7440901fadb1fc85509dc1e61bff86
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Jul 29 15:15:28 2011 -0700

    HADOOP-7491. hadoop command should respect HADOOP_OPTS when given a class name.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-3392

commit ad6ac50988232cc950bc69d0866e67f5565ec9fa
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Fri Jul 22 10:06:59 2011 -0700

    CLOUDERA-BUILD. Updating for CDH3u2 SNAPSHOT.

commit 927c26b2cabbbe742026e5ba70855476dc38968e
Author: Ahmed Radwan <ahmed@cloudera.com>
Date:   Mon Jul 18 05:26:48 2011 -0700

    MAPREDUCE-2529. Recognize Jetty bug 1342 and handle it.
    
    Reason: Bug
    Author: Thomas Graves
    Ref: CDH-3351

commit edb61378f7b6c80a7385e3e24997de429f39e0d8
Author: Tom White <tom@cloudera.com>
Date:   Wed Jul 13 13:34:52 2011 -0700

    MAPREDUCE-2638. Create a simple stress test for the fair scheduler
    
    Reason: Test
    Author: Tom White
    Ref: CDH-2847

commit bdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Jul 11 18:49:14 2011 -0700

    MAPREDUCE-2670. Fixing spelling mistake in FairSchedulerServlet.java.
    
    Reason: Bug
    Author: Eli Collins
    Ref: DISTRO-273

commit be5652e5aa6a3888f4e52608c3a591b03ad48487
Author: Tom White <tom@cloudera.com>
Date:   Wed Jul 6 17:43:59 2011 +0100

    CLOUDERA-BUILD. Undeprecate backported MapReduce library classes using the old API.
    
    Ref: CDH-3203

commit 3e156a031877ce98c438376dcf7accb56b95dc65
Author: Andrew Bayer <andrew@cloudera.com>
Date:   Wed Jul 6 00:42:16 2011 -0700

    CLOUDERA-BUILD. Updating versions for cdh3u1 release.

commit 3bec5b05964cf8a1f705ad0cf3b10c3ac707f1d5
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Tue Jul 5 19:43:03 2011 -0700

    HDFS-1758. Web UI JSP pages thread safety issue
    
    Reason: Bug
    Author: Tanping Wang
    Ref: CDH-2842

commit 8eff3591387814abe8e079f2689bf9a38aa498f2
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Tue Jul 5 18:15:17 2011 -0700

    HDFS-2011. Removal and restoration of storage directories on checkpointing failure doesn't work properly
    
    Reason: Bug
    Author: Ravi Prakash
    Ref: CDH-3315

commit 12a5778288de7628dfa2a27fd344e83a8ce6cdc2
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jul 5 19:44:20 2011 -0700

    MAPREDUCE-2447. Fix Child.java to set Task.jvmContext sooner to avoid corner cases in error handling.
    
    Reason: Fix possible NPE if TaskLogs.syncLogs fails in child
    Author: Siddharth Seth
    Ref: CDH-3132

commit 0eab1fbea6a968c2514a16a2f96a36ebfa30c6b6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jul 5 18:13:12 2011 -0700

    Amend MAPREDUCE-2373. Fix a possible NPE if setPermissions fails while launching task script.
    
    Reason: avoid NPE seen in production
    Author: Todd Lipcon
    Ref: CDH-3151

commit b5c5941d73cf037dc03ee5c8848708da6f6d5566
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jul 5 17:52:35 2011 -0700

    HDFS-1628. AccessControlException should display the full path.
    
    org.apache.hadoop.security.AccessControlException should display the
    full path for which the access is denied.
    
    Reason: Improvement
    Author: John George
    Ref: CDH-2765

commit 50cee77a34b3d7b7c8a7a710fb3f4e8e1448288c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jul 5 16:55:57 2011 -0700

    MAPREDUCE-2443. Fix TaskAspect for TaskUmbilicalProtocol.ping.
    
    Author: Siddharth Seth
    Reason: fix test-system compile after MR-2429
    Ref: CDH-3132

commit 5829715e1d4bf739668d5b246bf00b3f136733f2
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jul 5 16:33:13 2011 -0700

    MAPREDUCE-2429. Validate JVM in TaskUmbilicalProtocol.
    
    Reason: Fix issue where TT gets into inconsistent state
    Author: Siddharth Seth
    Ref: CDH-3132

commit 708b259abe2a0e287b4370cc41da87254b4c46dd
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jul 5 16:02:22 2011 -0700

    HDFS-1836. Thousands of CLOSE_WAIT sockets.
    
    Reason: Bug
    Author: Bharath Mundlapudi
    Ref: CDH-3200

commit 3412dde10617df0cffa4d4744d6b1f2a0d59e23a
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jul 5 14:49:16 2011 -0700

    HADOOP-7272. Remove unnecessary security related info logs.
    
    Two info logs are printed each time a connection to the RPC server is
    established, which is not necessary. On a production cluster, these log
    lines made up close to 50% of the lines in the namenode log. I propose
    changing them into debug logs.
    
    Reason: Improvement
    Author: Suresh Srinivas
    Ref: CDH-3174

commit 7d08d6a9f223f270e5f4728a85e0ed3934a347f7
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jul 5 14:33:07 2011 -0700

    HADOOP-7325. hadoop command - do not accept class names starting with a hyphen.
    
    Reason: Improvement
    Author: Brock Noland
    Ref: CDH-3244

commit ece7c80048db98aae5a81603ae426b8663afb975
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jul 5 13:16:18 2011 -0700

    HADOOP-7053. wrong FSNamesystem Audit logging setting in conf/log4j.properties.
    
    "log4j.logger.org.apache.hadoop.fs.FSNamesystem.audit=WARN" should be
    "log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=WARN".
    
    Reason: Bug
    Author: Jingguo Yao
    Ref: CDH-3293

commit 72cefcf1c9848ccb08a391a22830b403cd70a9a9
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jul 5 12:18:57 2011 -0700

    HDFS-1897. Documentation refers to removed option dfs.network.script.
    
    The HDFS user guide tells users to use dfs.network.script for rack
    awareness. In fact, this option has been removed and using it will
    trigger a fatal error on DataNode startup. Documentation should
    describe the current rack awareness configuration system.
    
    Reason: Bug
    Author: Andrew Whang
    Ref: CDH-3153

commit c30f419af201daea7d7131d5c50fef6b09997513
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jul 5 12:15:41 2011 -0700

    MAPREDUCE-2472. Extra whitespace in mapred.child.java.opts breaks JVM initialization.
    
    When creating taskjvm.sh, we split mapred.child.java.opts on " " and
    then create a quoted argument for each of those results. So, if you
    have an extra space anywhere in this configuration, you get an
    argument '' in the child command line, which the JVM interprets as an
    empty class name. This results in a ClassNotFoundException and the
    task cannot run.
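    
    A minimal sketch of the failure mode described above, written in Python
    for illustration rather than in the Java of the actual patch. The function
    names are hypothetical, and the whitespace-tolerant variant is only one
    possible way to avoid the empty argument, not necessarily what the patch
    does.

```python
def naive_split(opts):
    """Split on a single space, as the commit message describes."""
    return opts.split(" ")


def tolerant_split(opts):
    """Split on runs of whitespace so extra spaces never yield empty arguments."""
    return opts.split()


# Note the two spaces between the options.
opts = "-Xmx200m  -Dfoo=bar"

print(naive_split(opts))     # ['-Xmx200m', '', '-Dfoo=bar']  <- '' becomes an empty argument
print(tolerant_split(opts))  # ['-Xmx200m', '-Dfoo=bar']
```

    Quoting each element of the first result would put '' on the child
    command line, which is the empty class name the message refers to.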
    
    Reason: Bug
    Author: Aaron T. Myers
    Ref: CDH-3152

commit 76ac08dad430d600c6cc69424c21d16a4ba42d42
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jul 5 12:10:54 2011 -0700

    HADOOP-7247. Fix documentation to reflect new jar names.
    
    In several places, we have the old jar naming style of
    hadoop-*-examples.jar. With Ivy and Maven, we had to rename the jars to
    hadoop-examples-*.jar. Therefore, we need to update the documentation.
    
    Reason: Improvement
    Author: Owen O'Malley
    Ref: CDH-3099

commit d0a46bc2c278a9f6c19365ac712c2945269f8ee1
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jul 5 11:26:08 2011 -0700

    HADOOP-5464. DFSClient does not treat write timeout of 0 properly.
    
    dfs.datanode.socket.write.timeout is used for sockets to and from
    datanodes. It is 8 minutes by default. Some users set this to 0,
    effectively disabling the write timeout (for some specific reasons).
    
    When this is set to 0, DFSClient sets the timeout to 5 seconds by
    mistake while writing to DataNodes. This is exactly the opposite of
    the real intention of setting it to 0, since 5 seconds is too short.
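    
    A minimal sketch of the intended semantics, in Python for illustration
    only. The commit message does not say how the 5-second value arises, so
    the per-node extension constant and the function name below are
    assumptions made for the example, not the DFSClient code.

```python
# Hypothetical per-node extension added to a non-zero write timeout (assumption).
WRITE_TIMEOUT_EXTENSION_MS = 5 * 1000


def effective_write_timeout_ms(configured_ms, num_nodes):
    """Return the socket write timeout to use, where 0 means 'no timeout'."""
    if configured_ms <= 0:
        # A configured value of 0 disables the timeout; it must stay 0
        # rather than pick up an extension and become a short timeout.
        return 0
    return configured_ms + WRITE_TIMEOUT_EXTENSION_MS * num_nodes


print(effective_write_timeout_ms(0, 3))              # 0 (timeout disabled)
print(effective_write_timeout_ms(8 * 60 * 1000, 1))  # 485000
```

    The point of the sketch is only that the "disabled" value is checked
    before any arithmetic is applied to it.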
    
    Reason: Bug
    Author: Raghu Angadi
    Ref: CDH-3101

commit 1476f32a4bb161a3ffc81231b99b472b0dbe3adb
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jul 5 10:30:44 2011 -0700

    HDFS-1753. Resource Leak in StreamFile.
    
    Reason: Bug
    Author: Uma Maheswara Rao G
    Ref: CDH-3243

commit 8b1f6a660e604fb39284ef8cad7821a6ec27baf5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Jul 4 14:53:15 2011 -0700

    HADOOP-7428. IPC connection is orphaned with null 'out' member
    
    Reason: Can impact a user's ability to submit jobs, among other issues
    Author: Todd Lipcon
    Ref: CDH-3306

commit 2e9c10c247d5be1ea9b9b20983ada0e898d7e3ab
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Jul 4 14:44:06 2011 -0700

    HADOOP-7440. HttpServer.getParameterValues throws NPE for missing parameters
    
    Reason: fix user-visible NPE
    Author: Todd Lipcon
    Ref: CDH-3083

commit e08745678913d8a348815dcee69465f4a6a03540
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jun 17 18:05:16 2011 -0700

    HADOOP-7402. TestConfiguration doesn't clean up after itself
    
    Reason: test cleanliness
    Author: Aaron T. Myers
    Ref: CDH-3279

commit f0e8a989ff4be91c1428eea1d6e27cc3d13d5817
Author: Ahmed Radwan <ahmed@cloudera.com>
Date:   Thu Jun 23 07:20:43 2011 -0700

    MAPREDUCE-2254. Allow setting of end-of-record delimiter for
    TextInputFormat.
    MAPREDUCE-2602. Allow setting of end-of-record delimiter for
    TextInputFormat (for the old API).
    
    Reason: Improvement
    Author: Ahmed Radwan
    Ref: CDH-3268

commit 671085620586a21d4c4e3a35476e823d237045c9
Author: Eli Collins <eli@cloudera.com>
Date:   Thu Jun 30 01:21:10 2011 -0700

    HDFS-1592. Datanode startup doesn't honor volumes.tolerated.
    
    Reason: Bug
    Author: Bharath Mundlapudi
    Ref: CDH-3064

commit 0faf23ca0a9b6b8a90282a3b266db278a28394fa
Author: Eli Collins <eli@cloudera.com>
Date:   Sat Jun 25 16:10:55 2011 -0700

    HDFS-1692. In secure mode, Datanode process doesn't exit when disks fail.
    
    Reason: Bug
    Author: Bharath Mundlapudi
    Ref: CDH-3064

commit 5bd0314bbe72ffab90c310d110986fc71165f121
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Jun 27 22:40:57 2011 -0700

    HDFS-2117. DiskChecker#mkdirsWithExistsAndPermissionCheck may return
    true even when the dir is not created.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-3064

commit 645b176875769c8dcbf7d839926eb735bdfd5b14
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Jun 29 23:16:27 2011 -0700

    HADOOP-7040. DiskChecker:mkdirsWithExistsCheck swallows FileNotFoundException.
    
    Reason: Bug
    Author: Boris Shkolnik
    Ref: CDH-3064

commit 8d991630bc1a04c70fbc31435b4fdb3f26033cf8
Author: Eli Collins <eli@cloudera.com>
Date:   Sat Jun 25 10:35:11 2011 -0700

    HDFS-235. Add support for byte-ranges to hftp.
    HDFS-2110. Cleanup StreamFile#sendPartialData.
    HADOOP-7429. Add another IOUtils#copyBytes method.
    HADOOP-7057. IOUtils.readFully and IOUtils.skipFully have typo in exception creation's message.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-3243

commit 4fd9e18fb751824c140d5b67645fc925a92a7c1f
Author: Alejandro Abdelnur <tucu00@gmail.com>
Date:   Tue Jun 28 17:35:48 2011 -0700

    HADOOP-7433. Snappy SO file/links are copied to the wrong directory
    
        Reason: Bug, they must be copied to the $OS_ARCH directory
        Author: Alejandro Abdelnur
        Ref: CDH-3300

commit 3c9402dcc658e6415c59e4866ec3ee0227e819f1
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Jun 22 19:10:58 2011 -0700

    HDFS-1850. DN should transmit absolute failed volume count rather than
    increments to the NN.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-3065

commit 5a995e37d6430a6790f27476680da1555fbfc031
Author: Eli Collins <eli@cloudera.com>
Date:   Mon May 9 18:06:45 2011 -0700

    HDFS-556. Provide info on failed volumes in the web ui.
    
    HDFS-457 provided better handling of failed volumes but did not provide a
    corresponding view of this functionality on the web ui, such as a view of
    which datanodes have failed volumes. This would be a good feature to have.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-1099

commit 13eaedf77798820b92ac17caf717b8e3ea5f8562
Author: Eli Collins <eli@cloudera.com>
Date:   Mon May 9 18:05:57 2011 -0700

    HDFS-811. Add metrics, failure reporting and additional tests for HDFS-457.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-1099

commit c216ea863bcca97efc8220bf1a7507bcd4b12ca5
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Jun 28 17:35:32 2011 -0700

    HADOOP-7290. Unit test failure in TestUserGroupInformation.
    
    Reason: Bug
    Author: Eli Collins
    Ref: DISTRO-266

commit 87bcc8eb1940ed60ee1b9dc6489781dd1841e932
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Tue Jun 28 14:48:18 2011 -0700

    HADOOP-7144. Implement capability of querying individual property of a mbean using JMXProxyServlet
    
    Reason: Improvement
    Author: Tanping Wang
    Ref: CDH-3229

commit 329522db408a8cbd943f7a8f62b646b8b238bfe3
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Tue Jun 28 14:20:35 2011 -0700

    HADOOP-7144. Expose JMX with something like JMXProxyServlet
    
    Reason: Provide the ability to get JMX metrics and status info via HTTP
    Author: Robert Joseph Evans
    Ref: CDH-3229

commit a3dcabc6542035a0943648bdace5702192c7187c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 5 22:47:40 2011 -0700

    Amend HADOOP-6762 to fix potential deadlock.
    
    This fixes a deadlock that occurs if the writing of the call parameters
    throws an IOException after the timeout ping time has elapsed.
    
    Author: Todd Lipcon
    Ref: DISTRO-120

commit 0fb261a32591092f5b0601b294f0ea0ba8b72310
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jun 17 17:35:35 2011 -0700

    HADOOP-7121. Exceptions while serializing IPC call responses are not handled well. Contributed by Todd Lipcon.
    
    Reason: bug fixes for potential hangs of IPC layer, and additional test
            coverage for DISTRO-120
    Author: Todd Lipcon
    Ref: DISTRO-120

commit e7ab332a8037e7117919c833c0ac0a999307d681
Author: Roman Shaposhnik <rvs@cloudera.com>
Date:   Tue Jun 28 09:20:13 2011 -0700

    CLOUDERA BUILD. Making call succeed regardless of the file permissions

commit ec7e2867282ab51ad353ff6ed4f268425036a6e3
Author: Alejandro Abdelnur <tucu00@gmail.com>
Date:   Fri Jun 24 14:30:59 2011 -0700

    CLOUDERA BUILD. Pull Snappy source, build it and wire it to Hadoop build

commit bd3ea0cbe457e49b5e1bbdf4d7dc57599feb3537
Author: Alejandro Abdelnur <tucu00@gmail.com>
Date:   Fri Jun 24 11:33:16 2011 -0700

    HADOOP-7206 SNAPPY backport
    
    Adds Snappy compression support.
    
        Reason: New Functionality
        Author: Issei Yoshida, Alejandro Abdelnur
        Ref: CDH-3039
    

commit 871f34b3bdb7d323cfc91f0538ccb848deebc7f3
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Mon Jun 20 17:13:22 2011 -0700

    HDFS-1602. NameNode storage failed replica restoration is broken
    
    Reason: Bug
    Author: Boris Shkolnik
    Ref: CDH-3208

commit 032c764a0a933e004085442758083d4fea2cf876
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Tue Jun 21 15:49:06 2011 -0700

    HDFS-2100. Improve TestStorageRestore
    
    Reason: Test
    Author: Aaron T. Myers
    Ref: CDH-3208

commit 699edb198e0518572957f7edb77570615850da59
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Jun 22 15:22:51 2011 -0700

    CLOUDERA-BUILD. Update eclipse template classpath to remove dupes and update versions.

commit b4d1557ba2d16f1fa3af7e4b7bb1265bc7cb6a30
Author: Eli Collins <eli@cloudera.com>
Date:   Sun May 22 19:51:10 2011 -0700

    HDFS-1978. All but first option in LIBHDFS_OPTS is ignored.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-3210

commit 1f3e7f44a9c6f56b4a2921faa82f0d81321dbd64
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Jun 22 10:54:00 2011 -0700

    HDFS-2055. Add hflush support to libhdfs.
    
    Reason: New Feature
    Author: Travis Crawford
    Ref: DISTRO-257

commit 2348c8bfd82cbb3aa4685e7f8d85968c7cbe08b1
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Jun 22 10:30:47 2011 -0700

    HDFS-420. Fuse-dfs should cache fs handles.
    
    Fuse-dfs should cache fs handles on a per-user basis. This significantly
    increases performance (and has the side effect of fixing the current code
    which leaks fs handles).
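    
    A minimal sketch of per-user handle caching, in Python for illustration
    only (fuse-dfs itself is C code built on libhdfs). The class name, the
    `connect` callback, and the locking scheme are assumptions made for the
    example and are not taken from the patch.

```python
import threading


class HandleCache:
    """Caches one filesystem handle per user so each user connects only once."""

    def __init__(self, connect):
        self._connect = connect        # hypothetical callable: user -> handle
        self._handles = {}             # user name -> cached handle
        self._lock = threading.Lock()  # guards the dictionary across threads

    def get(self, user):
        with self._lock:
            handle = self._handles.get(user)
            if handle is None:
                # First request for this user: open a handle and remember it.
                handle = self._connect(user)
                self._handles[user] = handle
            return handle


opened = []
cache = HandleCache(lambda user: opened.append(user) or ("handle", user))
cache.get("alice")
cache.get("alice")
cache.get("bob")
print(opened)  # ['alice', 'bob']
```

    Reusing the cached handle avoids both the cost of reconnecting on every
    operation and the leak that comes from never releasing per-call handles.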
    
    Reason: Improvement
    Author: Brian Bockelman, Eli Collins
    Ref: CDH-2786

commit 0eeb795d156edf6f4e7c5c4b722d85737cd49736
Author: Roman Shaposhnik <rvs@cloudera.com>
Date:   Fri Jun 17 10:07:31 2011 -0700

    CLOUDERA-BUILD. Building RPMs from SRPMs in CDH needs to rebuild the projects

commit 21868a0d245c73742c90d23a82a7536c198a5a3f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 15 10:53:23 2011 -0800

    HADOOP-7145. Configuration.getLocalPath should trim strings
    
    Reason: fix potential bug with local dirs
    Author: Todd Lipcon
    Ref: CDH-2662

commit 9d0bd80bedd72f3b366d5ceda970109a0d3e124a
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Fri Jun 17 16:02:59 2011 -0700

    HDFS-2082. SecondaryNameNode web interface doesn't show the right info
    
    Reason: Bug
    Author: Aaron T. Myers
    Ref: CDH-3277

commit 68250308093f335bac63e65171ae22db03412c13
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Fri Jun 17 12:30:43 2011 -0700

    HADOOP-3741. SecondaryNameNode has http server on dfs.secondary.http.address but without any contents
    
    Reason: New Feature
    Author: Tsz Wo (Nicholas), SZE
    Ref: CDH-1695

commit d7915a354ade800a163788af7dd43f187f0442aa
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Fri Jun 17 14:24:48 2011 -0700

    HADOOP-4794. Add branch info to HadoopVersionAnnotation
    
    Reason: Improvement
    Author: Chris Douglas
    Ref: CDH-3274

commit a7f154b738ffdb129eb07be88abc925e447d6b00
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jun 17 17:04:09 2011 -0700

    Amend MAPREDUCE-2323. Fix bug causing NPE when the "set pool" option is used twice on the same job.
    
    Reason: important bug fix
    Ref: CDH-3036
    Author: Todd Lipcon

commit 97d8bb472f57c1abc73a5240675a37b8e4b5b31a
Author: Tom White <tom@cloudera.com>
Date:   Tue Jun 7 16:59:31 2011 -0700

    HADOOP-7323. Add capability to resolve compression codec based on codec name
    
    Reason: Improvement
    Author: Alejandro Abdelnur
    Ref: CDH-3226

commit 829bc94b23b9ab447fc51919cecfe5d9bd0a0c2b
Author: Tom White <tom@cloudera.com>
Date:   Tue Jun 7 16:40:23 2011 -0700

    HADOOP-6996. Allow CodecFactory to return a codec object given a codec's class name
    
    Reason: Improvement
    Author: Hairong Kuang
    Ref: CDH-3226

commit 8b49cf2446f0a5ac5f750b2abc07787c40142878
Author: Roman Shaposhnik <rvs@cloudera.com>
Date:   Fri Jun 3 19:07:34 2011 -0700

    MAPREDUCE-2260. Remove auto-generated native build files
    HADOOP-6436. Remove auto-generated native build files
    HDFS-1582. Remove auto-generated native build files
    HDFS-1619. Remove AC_TYPE* from the libhdfs
    
    Reason: Native build files generated on older version of Linux/autotools tend to break builds on newer OSes
    Author: Roman Shaposhnik
    Ref: CDH-894

commit d94813ecd0d4b3f63f4d30baa8a22a59dc76d5a8
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Tue May 24 14:26:47 2011 -0700

    Revert "HADOOP-6988. Add support for reading multiple hadoop delegation token files"
    
    This reverts commit ce67cd87f21543348ca5c137dee3ff0dc7f338dd.

commit 0dc60cdb69d7a52068629bcecf79d69dd1cb1132
Author: Eli Collins <eli@cloudera.com>
Date:   Wed May 18 14:44:27 2011 -0700

    MAPREDUCE-2505. Explain how to use ACLs in the fair scheduler.
    
    The fair scheduler already works with the ACL system introduced through the
    mapred.queue.* parameters, but the documentation doesn't explain how to use
    this. We should add a paragraph or two about it.
    
    Reason: Improvement
    Author: Matei Zaharia
    Ref: CDH-2050

commit 6a658661998bfec440c181e01cefb3dbee7f525a
Author: Roman Shaposhnik <rvs@cloudera.com>
Date:   Fri May 13 18:23:00 2011 -0700

    DISTRO-224. CDH packages should depend on JRE, not JDK

commit 1db6dae127e0a93084ab4cebb840d3af91e429c0
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Mon May 16 19:17:18 2011 -0700

    Back-port HADOOP-7124, HDFS-1814, MAPREDUCE-2473 - Hadoop /usr/bin/groups equivalent
    
    Reason: Allows users to query and display their group membership.
    Author: Aaron T. Myers
    Ref: CDH-2986

commit 86fe9a95f6356855038f9c605fe54e682304c88e
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Thu May 12 15:52:25 2011 -0700

    Amend HDFS-1378. Edit log replay should track and report file offsets in case of errors
    
    Reason: Original back-port had a bug. This back-port includes the fix as committed to trunk.
    Author: Aaron T. Myers and Todd Lipcon
    Ref: CDH-3072

commit d30b8b83175fbf96644cffbda37e90cb4703c139
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu May 12 15:07:51 2011 -0700

    HADOOP-6947. Kerberos login should set the refreshKrb5Config option
    
    Reason: necessary for daemons that will use multiple keytab files for different principals
    Author: Todd Lipcon
    Ref: CDH-3184

commit b96019898f7c4cba702371a7ef238977ddc88b0e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu May 12 15:05:11 2011 -0700

    HADOOP-7189. Add ability to enable JAAS debug option with an environment variable.
    
    Adds the HADOOP_JAAS_DEBUG environment variable, which, when set to "true", dumps
    extra debugging information out of JAAS.
    
    Reason: aids debugging of security issues like "Failure to login"
    Author: Ted Yu
    Ref: CDH-3183

commit c2804fb55590f62018f4fc379275ae01af001adc
Author: Konstantin Boudnik <cos@apache.org>
Date:   Wed May 4 17:01:30 2011 -0700

    MAPREDUCE-2023. TestDFSIO read test may not read specified bytes.
    
    Reason: Fixing a bug in the test
    Author: Hong Tang
    Ref:    CDH-3148

commit b9c48d0bf87c3ca3cabd467c6b03360a022e5669
Author: Konstantin Boudnik <cos@apache.org>
Date:   Wed May 4 14:10:34 2011 -0700

    MAPREDUCE-1832. Support for file sizes less than 1MB in DFSIO benchmark.
    
    Reason: Reverting backport of MAPREDUCE-1614 and completing the merge of MAPREDUCE-1832.
    Author: Konstantin Boudnik
    Ref:    CDH-3140

commit 763893247e8e94a6da8060d2335550b90cf0662e
Author: Alejandro Abdelnur <tucu00@gmail.com>
Date:   Thu Apr 28 13:21:45 2011 -0700

    MAPREDUCE-2457. job submission should inject group.name
    
    Description:
    Reason: functionality commonly used by FairScheduler
    Author: Alejandro Abdelnur
    Ref: CDH-3088

commit a7c507d6d763fd6f8868198959f2759749841426
Author: Konstantin Boudnik <cos@apache.org>
Date:   Mon May 2 14:43:35 2011 -0700

    MAPREDUCE-1614. TestDFSIO should allow to configure output directory
    
    Reason: Fixing bug in the test
    Author: Konstantin Boudnik
    Ref: CDH-3123

commit 858d5bb8ad49cf2b3f65af939be97bcbae6b25e5
Author: Konstantin Boudnik <cos@apache.org>
Date:   Mon May 2 14:42:41 2011 -0700

    MAPREDUCE-1832. Support for file sizes less than 1MB in DFSIO benchmark.
    
    Reason: Backport test improvements.
    Author: Konstantin Shvachko
    Ref: CDH-3117

commit 0efd27c636b1a2a23c64e019af50cebcc2c98d83
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Thu Apr 28 00:32:33 2011 -0700

    Amend HADOOP-6995. Allow wildcards to be used in ProxyUsers configurations
    
    Reason: Forgot to backport documentation portion of the change
    Author: Todd Lipcon
    Ref: CDH-3100

commit e863f5bee5763ec354384645b9d62743a052fae9
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Thu Apr 28 00:20:22 2011 -0700

    HDFS-1846. Don't fill preallocated portion of edits log with 0x00
    
    Reason: Improvement
    Author: Aaron T. Myers
    Ref: CDH-3059

commit e78be89d287e49207547f82a68e92b0d9a6d5413
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Wed Apr 27 13:45:03 2011 -0700

    HDFS-1862. Improve test reliability of HDFS-1594
    
    Reason: Test
    Author: Aaron T. Myers
    Ref: CDH-3095

commit 08b60c48c049cc3e8965f6f8cf8bad55b2969e99
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Fri Apr 22 18:02:26 2011 -0700

    HDFS-1594. When the disk becomes full Namenode is getting shutdown and not able to recover
    
    Reason: Bug
    Author: Aaron T. Myers
    Ref: CDH-2895

commit 71618e9eb918d8e27a57db8a40683bd8a3e0d7d1
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Thu Apr 21 17:35:44 2011 -0700

    HADOOP-7229. Absolute path to kinit in auto-renewal thread
    
    Reason: Bug
    Author: Aaron T. Myers
    Ref: CDH-3024

commit 927d00941693e7774174c795b91bba1811d801bd
Author: Tom White <tom@cloudera.com>
Date:   Tue Apr 19 14:19:58 2011 -0700

    MAPREDUCE-1813. NPE in PipeMapred.MRErrorThread
    
    Reason: Bug
    Author: Ravi Gummadi
    Ref: CDH-2154

commit 58815d50145c62a961f14a6f789491b3e4272fbe
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Apr 18 14:52:10 2011 -0700

    HADOOP-7045. TestDU fails on systems with local file systems with extended attributes.
    
    We should modify the test to allow for some extra on-disk slack. The on-disk
    usage could also be smaller if the file data is all zeros or compression is
    enabled. The test currently handles the former by writing random data, we're
    punting on the latter.
    
    Reason: Test
    Author: Eli Collins
    Ref: CDH-3033

commit d4375b1e0415d9c76885af1df6cd2ebc3db33237
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Apr 15 12:29:35 2011 -0700

    HADOOP-7159. RPC server should log the client hostname when read exception happened.
    
    Reason: Improvement
    Author: Scott Chen
    Ref: CDH-2766

commit d6a988a1f38609634c8b5364a7caac03871d2c25
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Thu Apr 7 12:15:29 2011 -0700

    CLOUDERA-BUILD. Updating for CDH3u1 development.

commit 81256ad0f2e4ab2bd34b04f53d25a6c23686dd14
Author: Bruno Mahé <bruno@cloudera.com>
Date:   Thu Mar 24 11:47:04 2011 -0700

    DISTRO-185. Needs to add sun jdk provided by RHEL to the list of jvm candidates
    
    Description: Services wouldn't start since they could not find the sun jdk on RHEL6
    Reason: Bug
    Author: Bruno Mahé
    Ref: CDH-2858

commit 03ab6d72146cfd99028fec22f6d28994d515df12
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Mon Mar 21 11:26:14 2011 -0700

    CLOUDERA-BUILD. Changing fuse-dfs tests to use test.junit.output.format for Junit formatter, rather than hardcoding as plain.

commit aa3b91aaeca5e5bcd5988ee0fe1d619167ed38fa
Author: Tom White <tom@cloudera.com>
Date:   Sun Mar 20 21:43:55 2011 -0700

    HDFS-1762. Allow TestHDFSCLI to be run against a cluster
    
    Author: Tom White
    Ref: CDH-2797

commit b399ca4df2d7bafc27fc91361d451358ef8a394a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 15 17:21:29 2011 -0700

    MAPREDUCE-2366. TaskTracker can't retrieve stdout and stderr from web UI
    
    Reason: bug fix via 0.20-security-203
    Author: Richard King
    Ref: CDH-2772

commit 53dd2c7f23291ee58a5d0d4ab8bab1b5bf47b2ba
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 11 10:31:04 2011 -0800

    HDFS-1520, HDFS-1554, HDFS-1555. Add new lightweight recoverLease API for use by HBase
    
    This adds a limited-public API recoverLease() which is used by the HBase master
    when recovering the HBase write-ahead log.
    
    Author: Hairong Kuang, backport help from Andrew Purtell
    Ref: CDH-2812

commit fd14a491d0a1bcae807aa4d985b71c4170eb1136
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 16 15:14:23 2011 -0700

    HDFS-1759. Improve error message when starting secure DN without jsvc
    
    Author: Todd Lipcon
    Ref: CDH-2554

commit 978164b1b4e1ed07f236b21c6cc757b3a96f3ec0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 15 15:01:08 2011 -0700

    MAPREDUCE-2364. Don't hold the rjob lock while localizing resources.
    
    Reason: TT deadlock, patch from branch 0.20-security-203
    Author: Devaraj Das
    Ref: CDH-2772

commit e1d94a529d7adac4012854703ea4b10d21f8829b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 15 15:00:27 2011 -0700

    MAPREDUCE-1563. TaskDiagnosticInfo may be missed sometimes
    
    Reason: Bug fix via 0.20-security-203
    Author: Krishna Ramachandran
    Ref: CDH-2772

commit 244bc14fd518142b015ef1b539ec899daeb18e77
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 15 14:55:38 2011 -0700

    MAPREDUCE-2356. Fix a task state corrupting race
    
    Reason: can cause a task to succeed even though all attempts were errors
    Author: Luke Lu
    Ref: CDH-2772

commit 7c3266e8072d54c2d18755c4b0c4d3fb153f5dc0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 15 14:22:34 2011 -0700

    CLOUDERA-BUILD. Add .gitignore for Cloudera maven target directories

commit d8f01d688916739125f3d321be75ed741b7b3a6f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 16 13:43:06 2011 -0700

    CLOUDERA-BUILD. Update footer to indicate new product naming
    
    Author: Todd Lipcon
    Ref: CDH-2831

commit d52118b2e49f1ccd29286574ad017707cdd63d0c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 16 13:20:17 2011 -0700

    MAPREDUCE-2377. task-controller fails to parse configuration if it doesn't end in \n
    
    Reason: fix hard-to-diagnose bug
    Author: Todd Lipcon
    Ref: CDH-2578

commit ca46366798e704396bd2de8e3ef4bc1b074b88a9
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 15 18:34:37 2011 -0700

    CLOUDERA-BUILD. Default cloudera.hash to empty string
    
    This restores the proper behavior of inferring the git hash from the
    current repository, if it's not overridden on the command line.
    
    Author: Todd Lipcon
    Ref: CDH-2829

commit 6ca2af6321cbabf8029092ce6550ec8e78673fba
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 15 14:28:51 2011 -0700

    HADOOP-7104. Remove unnecessary DNS reverse lookups from RPC layer
    
    Reason: Fixes potential performance issues when DNS blips occur
    Author: Kan Zhang
    Ref: DISTRO-108

commit bf19274fc15bb5b37089f3a50db7dbb053c92490
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Mar 15 21:36:53 2011 -0700

    HDFS-780. Revive TestFuseDFS.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-2778

commit f0cefd74f8727ebd331c6712ab5b4c004e46a629
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Mar 15 10:37:46 2011 -0700

    CLOUDERA-BUILD. Nuke DFSConfigKeys.DFS_BLOCK_SIZE_KEY.
    
    Ref: CDH-2828

commit 04dffae7b2e160fedc5aa9fdb7daa0eb79e93b0f
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Mar 15 10:29:45 2011 -0700

    HDFS-1189. Quota counts missed between clear quota and set quota.
    
    Reason: Bug
    Author: John George
    Ref: CDH-2788

commit 6b6df88107a98b899aaac7dc20f061bc9f60735f
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Mar 15 10:12:26 2011 -0700

    HDFS-1258. Clearing namespace quota on "/" corrupts FS image.
    
    Reason: Bug
    Author: Aaron T. Myers
    Ref: CDH-2788

commit 0863f15d727e1ad6e96a0887a93a82f315c8f734
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 11 16:33:05 2011 -0800

    Amend HADOOP-7167. Allow list of tests to be excluded during build.
    
    No longer uses /dev/null as a canonical empty file, since it causes
    the build to fail on Cygwin.
    
    Author: Todd Lipcon
    Ref: CDH-2777

commit 16fa5d016e9bbe79896adf0c24dd8510b31c0325
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 11 15:55:15 2011 -0800

    MAPREDUCE-2379, HADOOP-7184. Distributed cache sizing configurations are missing from mapred-default.xml
    
    * Moves local.cache.size from core-default.xml into mapred-default.xml
    * Adds documentation for mapreduce.tasktracker.cache.local.numberdirectories
    * Fixes the configuration parameter mapreduce.tasktracker.cache.local.numberdirectories
      to be named the same as it is in trunk -- previous betas had the incorrect name
      mapreduce.tasktracker.local.cache.numberdirectories
    
    Reason: fix docs
    Author: Todd Lipcon
    Ref: CDH-2815

commit fd48669392567338109a981164083c781d5e7993
Author: Jenkins <dev-kitchen@cloudera.com>
Date:   Sat Mar 12 13:40:39 2011 -0800

    CLOUDERA-BUILD. Updating versions for cdh3u0 release.

commit 0929aa6f798e6e1b736bc8715ade29686bec08f3
Author: Tom White <tom@cloudera.com>
Date:   Fri Mar 11 18:49:07 2011 -0800

    HADOOP-7183. WritableComparator.get should not cache comparator objects
    
    Reason: regression in HADOOP-6881
    Author: Tom White
    Ref: CDH-2810

commit 2a243d114e14d80a036c6a614672df0a88f6f8f7
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 11 13:18:32 2011 -0800

    Amend MAPREDUCE-2178. Fix compilation failure due to unchecked return code on gcc 4.4.4
    
    Reason: Fix test on Ubuntu Maverick
    Author: Todd Lipcon
    Ref: CDH-2813

commit da757539930beecd990188d5b0e2796f8fbb3953
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 11 13:10:28 2011 -0800

    MAPREDUCE-2376. Allow test-task-controller to specify the user to test as
    
    Can now specify a username in the TC_TEST_USERNAME environment variable
    in order for this test to pass when running as a userid < 1000.
    
    Reason: fix build on Cloudera hudson where uid = 101
    Author: Todd Lipcon
    Ref: CDH-2811

commit 69fc8b16f4f098ad215582fdfc3efea26e54464f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 8 11:52:33 2011 -0800

    HADOOP-7156. Workaround for unsafe implementations of getpwuid_r
    
    Adds a new configuration hadoop.work.around.non.threadsafe.getpwuid
    which can be used to enable a mutex around this call to work around
    the thread-unsafe behavior.
    
    Reason: RHEL 6.0 and some other systems have thread-unsafe implementations
            of this libc call. This causes JVM crashes during the shuffle
            where this call is made frequently from many threads.
    Author: Todd Lipcon
    Ref: CDH-2725

commit 50194947583182a237e14c08a968de770cd3f969
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 9 14:20:28 2011 -0800

    MAPREDUCE-2372. TaskLogAppender mechanism shouldn't be set in log4j.properties
    
    Reason: fixes cleanup tasks to log to proper directory even if using a CDH2
            log4j.properties
    Author: Todd Lipcon
    Ref: CDH-2793

commit 8f4bc5f77bb496928529d1d56fe5831d35c89d83
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 8 13:24:22 2011 -0800

    MAPREDUCE-2371. Fix TaskLogsTruncater to not need to call obtainLogsDirOwner
    
    Reason: fixes unnecessary fork in child tasks which causes higher ulimit requirements
            compared to CDH2
    Author: Todd Lipcon
    Ref: CDH-2784

commit c48cec8fa73f8aaa3a565a4f57985b93157e6caf
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 9 14:20:51 2011 -0800

    MAPREDUCE-2373. When tasks exit with a nonzero exit status, task runner should log the stderr as well as stdout
    
    Reason: assists debugging of task failures
    Author: Todd Lipcon
    Ref: CDH-2794

commit ec8790e50f212782f59ec904210e6cd07a62eb8e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 9 14:38:33 2011 -0800

    MAPREDUCE-2374. Don't use PrintWriter API for writing taskjvm.sh
    
    Reason: PrintWriter obscures errors. Also seems to fix a race condition
            which caused "Text file busy" errors launching taskjvm.sh
            on some QA clusters
    Author: Todd Lipcon
    Ref: CDH-2794

commit 6037aff7bb49c057b9661d83fb7c89dfd3694738
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 8 15:55:11 2011 -0800

    HADOOP-7154. Set MALLOC_ARENA_MAX in default config
    
    Reason: RHEL 6.0 support
    Author: Todd Lipcon
    Ref: CDH-2721

commit 462e80e19c2ab2e40aa6ca4b590580de9b9a4a1b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 8 11:02:49 2011 -0800

    HADOOP-7172. SecureIO should not check owner on non-secure clusters that have no native support
    
    Reason: Fix shuffle performance regression when native libraries are not installed
    Author: Todd Lipcon
    Ref: CDH-2779

commit 97c67eea39f2d15ecb7a479efda60204fc46e4c5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Mar 7 12:25:32 2011 -0800

    Amend MAPREDUCE-2234. Previous patch resulted in too many ls -l calls during heartbeats
    
    The previous commit under this JIRA changed the checkLocalDirs function to use
    the checkDir() function that takes a permission. This is fine at startup, but
    is expensive since it results in an `ls -l` fork for every local directory.
    This happens on every heartbeat and is not necessary. This patch amends
    the function to only use this form of checkDir() at startup, and otherwise
    just use the less expensive native Java calls.
    
    Author: Todd Lipcon
    Ref: CDH-2780

commit b6f34f37281d49de97e7d41e55ffbed596036067
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Mar 7 23:06:10 2011 -0800

    HADOOP-7173. Remove unused fstat() call from NativeIO
    
    Reason: Remove unused code after HADOOP-7115
    Author: Todd Lipcon
    Ref: CDH-2779

commit df154b1e141761112f667455baab2b4620f1b465
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Mar 7 22:36:45 2011 -0800

    HADOOP-7115. Reapply final patch to add a cache to username resolution
    
    Also fixes a bug where a user not found would trigger an assertion error
    and crash the JVM.
    
    Author: Devaraj Das
    Ref: CDH-2779

commit cc57649b0c17113dde2fc8b350206dbeb159c9e3
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Mar 7 22:23:14 2011 -0800

    Revert "HADOOP-7115. Reduces the number of calls to getpwuid_r and getpwgid_r, by implementing a cache in NativeIO."
    
    This reverts commit 3ef31bcc86610d496976b4de9ada82e73f47f162.

commit d9541f7113f0e678af0819f45876bbcd454b20d5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 3 15:19:25 2011 -0800

    Amend MAPREDUCE-2323. Fix unregistering of fair scheduler metrics updater during fairsched termination
    
    Fixes an occasional test failure where different test cases
    in the same JVM were causing each other to fail.
    
    Author: Todd Lipcon
    Ref: CDH-2677

commit c7c90372c078febe77344c656b512c3d927c6a71
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 8 10:16:20 2011 -0800

    HDFS-1625. TestDataNodeMXBean fails if disk space usage changes during test run
    
    Reason: flaky test
    Author: Tsz Wo (Nicholas), SZE
    Ref: CDH-2783

commit 1eed2c5af334077a27d5007b006db345de9c4d0f
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Tue Mar 8 12:50:50 2011 -0800

    CLOUDERA-BUILD. Changing releases repo to point to staging area.

commit 63c98116795d0b7908d2de335e7fcd53449bc514
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Mar 7 17:36:15 2011 -0800

    HADOOP-7167. Allow using a file to exclude certain tests from the build.
    
    Reason: ability to exclude known-flaky tests on golden Hudson
    Author: Todd Lipcon
    Ref: CDH-2777

commit c388c047ece60f560187add9446de412292a583c
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Mon Feb 28 11:00:42 2011 -0800

    CLOUDERA-BUILD. Fixing KITCHEN-815.
    
    * Invoking mvn before anything else now for property generation, etc.
    * Adding ant-contrib jar to support that.

commit bd69a6ea66f4aa6905a7347a94e6cc351bfb235a
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Mon Mar 7 15:57:15 2011 -0800

    CLOUDERA-BUILD. Fixing contrib paths.

commit f7a7a032b7f4300084951720331cf6732756b5b2
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Mon Mar 7 13:51:21 2011 -0800

    CLOUDERA-BUILD. Removing source:jar from do-release-build

commit c3174e9c80710d30fa832394709174fc8d7f6e6b
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Mon Mar 7 12:44:36 2011 -0800

    CLOUDERA-BUILD. Source jars weren't being generated due to change to use existing jars for artifacts.

commit 01b42afec6b6e878ed7805aac2e9a95c779d9840
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Mon Mar 7 10:50:47 2011 -0800

    CLOUDERA-BUILD. Simplifying repository setup.

commit 43f756d9569ac009dbae2c84064b29e8163aaa19
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Thu Mar 3 11:41:36 2011 -0800

    CLOUDERA-BUILD. DISTRO-109 - Use original jars as Maven artifacts rather than exploding/rebuilding.

commit 74127ea6ddff6b107dc0e2a7a72365482e33a5c0
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Sun Mar 6 18:40:26 2011 -0800

    CLOUDERA-BUILD. Adding relativePath to parent POMs.

commit 8d34833e7e62ebd73a6ce4868a150172e46b9701
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Thu Mar 3 22:12:20 2011 -0800

    CLOUDERA-BUILD. Cleanup.

commit 0e6382feae06e358114932b0f5136862311cca6a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 3 15:42:06 2011 -0800

    HADOOP-6943. The GroupMappingServiceProvider interface should be public
    
    Reason: organizations may want to implement this interface for their needs
    Author: Aaron T. Myers
    Ref: CDH-2263

commit 85731af89c0e110d0219cdb4f6ea1cf09eb2e53a
Author: Tom White <tom@cloudera.com>
Date:   Wed Mar 2 15:48:21 2011 -0800

    MAPREDUCE-2351. mapred.job.tracker.history.completed.location should support an arbitrary filesystem URI
    
    Reason: Limitation
    Author: Tom White
    Ref: CDH-2714

commit 25ece8066682682f6fdd595845dbf71555aef5bb
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 1 13:17:58 2011 -0800

    Amend MAPREDUCE-2178. Update tests for fixed configuration checking code
    
    Author: Todd Lipcon
    Ref: CDH-2755

commit d62a49fc3196810f096fdbbd0ca4f48af976d5df
Author: Tom White <tom@cloudera.com>
Date:   Tue Mar 1 09:50:14 2011 -0800

    MAPREDUCE-1845. FairScheduler.tasksToPreempt() can return negative number
    
    Reason: Bug
    Author: Scott Chen
    Ref: CDH-1555

commit 46ff62f304a755dc4d47f595194cf2c6d01faab5
Author: Tom White <tom@cloudera.com>
Date:   Mon Feb 28 13:32:52 2011 -0800

    HADOOP-7011. KerberosName.main(...) throws NPE
    
    Reason: Useful for debugging
    Author: Aaron T. Myers
    Ref: CDH-2673

commit 3fd1dd275427012b93dab53d8e9b3c78aed1fc6f
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Fri Feb 25 15:31:51 2011 -0800

    CLOUDERA-BUILD. Add source jars to Maven process, and add
    hadoop-mrunit to Mavenization.
    
    * This is for KITCHEN-866. I also discovered that we had been deploying
    hadoop-mrunit in CDH2 but not in CDH3B4, so I've added that.

commit bc52432d9832d82bc4f60166c7c718e65fe63359
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Fri Feb 25 15:31:32 2011 -0800

    CLOUDERA-BUILD. Rolling back attempt to speed up build by pre-caching artifacts in Maven repo - ended up breaking in non-Maven context.

commit e4596923b767eb163e141e41d5058c983e95f885
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Wed Feb 23 09:53:00 2011 -0800

    CLOUDERA-BUILD. Fixing reactor repo specification.

commit 80e7cd19dd23f552efd0bdf1f8b0509aa6b4b3d3
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Mon Feb 21 10:32:23 2011 -0800

    CLOUDERA-BUILD. Using local Maven repo as primary first in chain.
    
    Tweaks to pre-fetch dependencies into ~/.m2/repository before ant
    build is run, with Ivy configured to get from there before trying
    Maven Central.

commit af111808a1edd957a56fe77d1ba2fdc4233cafda
Author: Jenkins <dev-kitchen@cloudera.com>
Date:   Sat Feb 19 00:28:02 2011 -0800

    CLOUDERA-BUILD. Preparing for cdh3u0 development.

commit 3aa7c91592ea1c53f3a913a581dbfcdfebe98bfe
Author: Jenkins <dev-kitchen@cloudera.com>
Date:   Sat Feb 19 00:27:52 2011 -0800

    CLOUDERA-BUILD. Preparing for CDH3B4 release.

commit dd51c56ab63cb12bc207647f314ab99e1e8da32b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 17 16:37:05 2011 -0800

    Amend HADOOP-7070. Fix spurious warning message when running on machine with no krb5.conf
    
    The issue is that UGI.initialize would call KerberosName.setConfiguration before setting
    its own flag to indicate it was initialized. Then, if there was no krb5.conf,
    the class initializer of KerberosName would call back into UGI.isSecurityEnabled,
    causing initialize() to be run a second time.
    
    This bug doesn't exist upstream.
    
    Reason: spurious warnings
    Author: Todd Lipcon
    Ref: CDH-2688

commit 1ff4b594c6f9926cf49842672740a229bf06491d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 16 11:15:12 2011 -0800

    Amend MAPREDUCE-2178. Add log message when task JVM fails to fork
    
    Author: Todd Lipcon
    Ref: CDH-2671

commit f57c22b8ec079abc5a051551bce9b1209fa3e6a3
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 15 23:32:32 2011 -0800

    MAPREDUCE-2332. Improve error message when userlogs dir has bad ownership
    
    Patch differs from trunk patch on account of MR-2178
    
    Reason: common source of user error
    Author: Todd Lipcon
    Ref: CDH-2670

commit 211e7bb1ea1fecc5894d53815c70be8b68c46643
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 15 19:03:04 2011 -0800

    MAPREDUCE-2331. Cover task graph servlet in fair scheduler system test
    
    Reason: improve jcarder coverage
    Author: Todd Lipcon
    Ref: CDH-2660

commit 4e93ef108e3ea798f22ef901f090999fe44a8888
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 15 19:02:54 2011 -0800

    MAPREDUCE-2180. Add coverage of Fair Scheduler servlet to system test
    
    Reason: improve jcarder coverage for possible deadlocks
    Author: Todd Lipcon
    Ref: CDH-2660

commit 279a018f693a5721d7228e7c801327dda0aecb81
Author: Bruno Mahé <bruno@gnoll.org>
Date:   Tue Feb 15 15:25:19 2011 -0800

    CLOUDERA-BUILD. Installation script needs to be adapted for the new naming scheme.
    
    Reason: Our mavenization effort changes our artifacts names
    Author: Bruno Mahé
    Ref: KITCHEN-833

commit cbada181614e3a32c9bbc2bc5e274798aa94217e
Author: Tom White <tom@cloudera.com>
Date:   Mon Feb 14 14:57:21 2011 -0800

    CLOUDERA-BUILD. TestLocalMRNotification times out in CDH3.

commit 2ac40e32af497c4c0d69c5921bd1504356b11086
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 15 10:47:43 2011 -0800

    Amend MAPREDUCE-1441. Reapply trimming of whitespace in mapred.local.dir configurations
    
    Reason: User bug report - regression from b2 to b3
    Author: Todd Lipcon
    Ref: CDH-2662

commit 061eb38e4b442cf3f97fcb45a3059384fd74d036
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 15 10:17:37 2011 -0800

    CLOUDERA-BUILD. Fix a bug where HADOOP_DAEMON_DETACHED leaked into the environment of children
    
    This fixes a problem reported on the cdh-user list where tasks that forked out to
    call bin/hadoop ended up only catching the first 10 lines of output.
    
    Tested by writing a streaming script that catted a large text file off HDFS - verified
    bug is fixed.
    
    Author: Todd Lipcon
    Ref: CDH-2661

commit 88e89c048d8f6f346667e64b782b7daf91d8a019
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Feb 13 21:06:17 2011 -0800

    Amend HADOOP-7093. Revert incompatible change in semantics of HttpServer
    
    The original backport pulled in part of HADOOP-6461, which changed the way the
    "webapps" directory is located on the classpath. This broke HBase's ability
    to locate its UIs. In order to avoid having to patch HBase in CDH, this
    patch reverts that part of the change and works around the issue in the tests
    a different way.
    
    Reason: Should work with upstream HBase
    Author: Todd Lipcon
    Ref: CDH-2635

commit e0934a30d7f6a3adb2b9b2f534eefda9a4ece41d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 14 23:31:13 2011 -0800

    HADOOP-7140. IPC Reader threads should stop when server stops
    
    Reason: bug preventing TT from shutting down when build version is incompatible
    Author: Todd Lipcon
    Ref: CDH-2634

commit 3ad9f29cdcc14fbf41c6642b746ee04afaa92ff5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 12 19:24:07 2011 -0800

    MAPREDUCE-2323. Add metrics to the fair scheduler
    
    Reason: Necessary for CMON, useful for monitoring
    Author: Todd Lipcon
    Ref: OPSAPS-2076

commit 3f5313383362c86a2df8be55d2c524d82f9fac85
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Feb 13 20:14:09 2011 -0800

    Amend MAPREDUCE-2242. Reapply after MAPREDUCE-2178.
    
    Reason: fix environment escaping
    Author: Todd Lipcon
    Ref: CDH-2572

commit 541407b9f144228f2b7934decc114c59b769e481
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 12 13:28:13 2011 -0800

    Amend MAPREDUCE-2178. Revert incompatible API change to FileUtil.chmod
    
    Reverts a change which removed InterruptedException from FileUtil.chmod's signature.
    Though the function never throws InterruptedException, this removal causes
    compilation failures for any clients who try to catch this exception (incl Pig)
    
    Reason: fix Pig build failure
    Author: Todd Lipcon
    Ref: CDH-2633

commit a4778bbf1c461b56828f9810ea44c5893f929150
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 11 17:58:47 2011 -0800

    CLOUDERA-BUILD. Re-bootstrap native builds with maintainer mode
    
    Also includes bootstrap.sh where missing

commit b329fa59501af3bef287aa0bdaa4c517cd41ad04
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 11 21:17:06 2011 -0800

    CLOUDERA-BUILD. Add AM_MAINTAINER_MODE to all configure.ac

commit 50329213ff9f712bc07922a212c0931d20a31de6
Author: Konstantin Boudnik <cos@apache.org>
Date:   Fri Feb 11 17:59:15 2011 -0800

    HADOOP-6879. Provide SSH based (Jsch) remote execution API for system tests
    
    Reason: missing dependency breaks system tests build
    Author: Konstantin Boudnik
    Ref: CDH-2622

commit 4b44138e14ad63a6b54a962e16c8f1fd922b3a80
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 11 00:38:48 2011 -0800

    CLOUDERA-BUILD. task-controller configuration directory should be inferred from task-controller location
    
    Searches at ../../conf/ for task-controller.cfg
    
    Author: Todd Lipcon
    Ref: CDH-2623

commit a6f5e7109f538e2b9374c6518c80b0575e2dfa9f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 11 15:06:42 2011 -0800

    Amend HADOOP-5489. hdfsproxy-env.sh.template was updated but not hdfsproxy-env.sh
    
    Reason: avoid local modifications to src tree on build
    Author: Todd Lipcon
    Ref: CDH-2588

commit cc3ba6c2c33ea827e6a54cda2759d03e7e2da4c1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 19 15:11:58 2011 -0800

    HADOOP-7114. FsShell should dump errors at debug level
    
    Reason: easier to debug exceptions thrown in FsShell
    Author: Todd Lipcon
    Ref: CDH-2624

commit 5a57891c772488d8b02bcf54f4247f8fffa81d1f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 11 13:06:26 2011 -0800

    Amend MAPREDUCE-2178. Remove AC_SYS_LARGEFILE from configure.ac
    
    This flag allows opening of files >2GB, but the task-controller doesn't need
    to do this. The removal is important because some RHEL5 systems have an
    fts.h which is incompatible with the resultant CFLAG when building 32-bit.
    
    Reason: RHEL5 32-bit build
    Author: Todd Lipcon
    Ref: CDH-2623

commit df540fdaa94d96a2a1bc2685774ae44b145bfa98
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 11 10:56:52 2011 -0800

    Amend MAPREDUCE-1493. Fix a typo in HTML markup on jobdetailshistory
    
    (typo made in original backport, not upstream)
    
    Reason: fix invalid HTML
    Author: Todd Lipcon
    Ref: CDH-2622

commit e70a9985960b7ac9e2f6bf3826e93f6f8c44e46a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 9 14:58:03 2011 -0800

    HADOOP-5913. Add support for starting/stopping queues.
    
    Author: Rahul K Singh
    Ref: CDH-2622

commit 1da8cc2964b488a55091e7b8b8f3a494ba4c1772
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 11 12:11:24 2011 -0800

    MAPREDUCE-2321. Check for NativeIO at TT startup
    
    Reason: Easier failure diagnosis for secure TT
    Author: Todd Lipcon
    Ref: CDH-2623

commit 4b1697b297d13990e17c3b3eaaf508686a2e78a5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 1 14:05:34 2011 -0800

    MAPREDUCE-2289. Fix job staging directory to get automatically chmodded to correct permissions if incorrect
    
    Reason: fixes failures in TestFairSchedulerSystem
    Author: Todd Lipcon
    Ref: CDH-2626

commit 5d44075f3ac224bf9a259b0731035734d9c152a2
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 3 09:23:55 2011 -0800

    Amend MAPREDUCE-2234. TaskTracker should fail on startup if log dir isn't writable
    
    Reapply after MAPREDUCE-2178 backport.
    
    Reason: Easier diagnosis of misconfigured TT permissions
    Author: Todd Lipcon
    Ref: CDH-2500

commit a2b4149afd53d59fd9a279117c6917e4c83583a3
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 10 19:40:24 2011 -0800

    HDFS-1318, MAPREDUCE-2330. Add MXBeans for JT, TT, DN, NN
    
    Author: Tanping Wang, Luke Lu
    Ref: CDH-2622

commit ee5c73991b43fd49a1a4eed599d2d52065054209
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 11 12:17:35 2011 -0800

    Amend MAPREDUCE-2178. Check result of chdir
    
    Reason: necessary to pass -Werror on more recent gcc
    Author: Todd Lipcon
    Ref: CDH-2623

commit 717544d462bc56188d165008bc1d841bc1c03904
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 11 12:16:08 2011 -0800

    Amend MAPREDUCE-2178. Check argc *after* checks for perms, etc
    
    Reason: Fix error messages during taskcontroller setup
    Author: Todd Lipcon
    Ref: CDH-2623

commit b8f3851b9604b8c1156c4a9df5c6a8b532676104
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 11 12:11:15 2011 -0800

    Amend MAPREDUCE-2178. Fix racy check for config file perms
    
    Reason: Security fix
    Author: Todd Lipcon
    Ref: CDH-2623

commit fa6aca09466301c65d8d8e5d92c43e50f46683ad
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 7 10:05:49 2011 -0800

    Amend MAPREDUCE-2103. Reapply "task-controller permissions checks too stringent" after MAPREDUCE-2173
    
    Reason: match documentation
    Author: Todd Lipcon
    Ref: CDH-2623

commit 7361ea92c13d6ca986e332acef70c2d8983c2f4c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 10 21:15:48 2011 -0800

    Amend MAPREDUCE-2265. Restore sbin location for task controller install
    
    Reason: reapply after YDH 0.20.100 merge
    Author: Todd Lipcon
    Ref: CDH-2623

commit 0ed0d5e311f4f0c57ab6bafa39d324e19dd15b53
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 7 09:45:39 2011 -0800

    Amend MAPREDUCE-967. Reapply behavior which was clobbered by MAPREDUCE-2178
    
    (TT should not unpack job jars unnecessarily)
    
    Author: Todd Lipcon
    Ref: CDH-2623

commit c2b050e2e466b4e43a8458fd05c72e289ed2d563
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 7 09:55:35 2011 -0800

    CLOUDERA-BUILD. Integrate task-controller changes from MAPREDUCE-2178 into Cloudera build

commit ac1fd5519d4cd7ddc5cf740d4b4459232523fb12
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 9 16:54:58 2011 -0800

    MAPREDUCE-2178. Write task initialization to avoid race conditions leading to privilege escalation and resource leakage by performing more actions as the user.
    
    Author: Owen O'Malley, Devaraj Das, Chris Douglas
    Ref: CDH-2622

commit bb004aae8abc4f7e772adda6a75f433cf7cb198d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 19 18:02:52 2011 -0800

    HDFS-1597. Fix assertion in TestEditLogRace
    
    Reason: Sporadic test failure
    Author: Todd Lipcon
    Ref: CDH-2559

commit dbaa8cd7a1a81de7700dcec4517dbf2012906641
Author: Todd Lipcon <todd@lipcon.org>
Date:   Wed Jan 26 21:24:38 2011 -0800

    HDFS-1601. Pipeline ACKs are sent as lots of tiny TCP packets
    
    Reason: HBase performance
    Author: Todd Lipcon
    Ref: CDH-2627

commit 0d086cde04450cbb5a5f6d39a345aafcdadaa511
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HDFS-1114. Reduce NameNode memory usage by an alternate hash table
    
    Reason: reduce memory usage in the NameNode
    Author: Tsz Wo (Nicholas) Sze
    Ref: CDH-2622

commit 921d337cfa66dcc22207f3fb42e385aff4e229d0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HDFS-1119. Introduce a GSet interface to BlocksMap.

commit 216d29555d3fb62aca7362a5611bbc5ec7846b6a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HDFS-599. Allow NameNode to have a separate port for service requests from client requests.
    
    Reason: Allows port-based QoS to prioritize DN RPCs over client RPCs, also increases fairness
    Author: Dmytro Molkov
    Ref: CDH-2622

commit 53c9961d5b350c96200a3b85c2302a2b569e6fa8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HDFS-1298. Add support in HDFS for new statistics added in FileSystem to track the file system operations.
    
    Author: Suresh Srinivas
    Ref: CDH-2622

commit f1625663dc6008b89af3ff80e19d64f4717f1a9b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HDFS-1315. Add fsck event to audit log and remove other audit log events corresponding to FSCK listStatus
    
    Author: Suresh Srinivas
    Ref: CDH-2622

commit 3d026b0a1706483a4860ad80fc17b103448ac1b0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HDFS-1383. Better error messages in HFTP
    
    Author: Tsz Wo (Nicholas) Sze
    Ref: CDH-2622

commit f148732a5c983877fb62ecfe5815eb445a192573
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HDFS-1061. Memory footprint optimization for INodeFile object.
    
    Author: Bharath Mundlapudi
    Ref: CDH-2622

commit edda8a863002796aa282fa26d74f8843eac4b728
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HDFS-1307. Add start time, end time and total time taken for FSCK to FSCK report.
    
    Author: Suresh Srinivas
    Ref: CDH-2622

commit b2cfa8caaa27a75a4452d9e26d3f3a169e13730e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HDFS-1085. HFTP read may fail silently on the client side if there is an exception on the server side.
    
    Author: Tsz Wo (Nicholas) Sze
    Ref: CDH-2622

commit 66beb0bfe053b9c0fa02f7ac82310081fa6da2cd
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HADOOP-6713. The RPC server Listener thread is a scalability bottleneck.
    
    Author: Dmytro Molkov
    Ref: CDH-2622

commit 807e918943e9f17d4ab7912bdb9cc90970c02ef6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HADOOP-6859. Introduce additional statistics to FileSystem to track file system operations.
    
    Author: Suresh Srinivas
    Ref: CDH-2622

commit 8dd45e436896108d8806e5a555621ea6b346912f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:52 2011 -0800

    HADOOP-6899. RawLocalFileSystem#setWorkingDir() does not work for relative names
    
    Author: Sanjay Radia
    Ref: CDH-2622

commit 7a31be4853d46090c7bd7798bdb7cd41915b421c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 18:05:06 2011 -0800

    HADOOP-6669. Respect compression configuration when creating DefaultCodec
    
    Author: Koji Noguchi
    Ref: CDH-2622

commit b2586915b911182f60e949de3dd340ae8e8099ca
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 10 20:59:06 2011 -0800

    CLOUDERA-BUILD. Re-bootstrap native

commit 1fb15b9ee1b9edd5961b5972da4062117b4709e5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 10 20:58:22 2011 -0800

    CLOUDERA-BUILD. Native build for JNI group mapping code
    
    Original JNI patch is against Yahoo's distro which has divergent build files.

commit bb55a89bf7a3decd9846989f31d93cb4ed8588b5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    HADOOP-6864. Provide a JNI-based implementation of ShellBasedUnixGroupsNetgroupMapping
    
    Author: Boris Shkolnik
    Ref: CDH-2622

commit 2780f0d352553b1a5c177fe20afdea223bd1e405
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    HADOOP-6818. Provides a JNI implementation of group resolution.
    
    Author: Devaraj Das
    Ref: CDH-2622

commit 562d6a6d79943f4c132e9db773898db533b4dbfd
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    MAPREDUCE-1545. Add 'first-task-launched' to job-summary
    
    Author: Luke Lu
    Ref: CDH-2622

commit 4595403c6e7b1e594ea5759784aaa65eb6d46786
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    MAPREDUCE-2023. TestDFSIO read test may not read specified bytes.
    
    Author: Hong Tang
    Ref: CDH-2622

commit 4cbfcd923d102ce6bcccb5dcddc1ed124f42bb8f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 9 17:20:29 2011 -0800

    MAPREDUCE-2005. Improvements to TestDelegationTokenRenewal
    
    Reason: improve test printouts
    Author: Boris Shkolnik
    Ref: CDH-2622

commit 6cceb85a5f8743aaef4a98957e31f6930a013cdd
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    MAPREDUCE-1961. ConcurrentModificationException when shutting down Gridmix
    
    Author: Hong Tang
    Ref: CDH-2622

commit 16d9cf021a1989467b1372d3c2a050e6c4606230
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    MAPREDUCE-339. JobTracker should give preference to failed tasks over virgin tasks so as to terminate the job ASAP if it is
    
    Author: Devaraj Das
    Ref: CDH-2622

commit 81bd8d5735c682d69d59712a7333a70d081a4216
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    MAPREDUCE-1936. Make Gridmix3 more customizable.
    
    Author: Hong Tang
    Ref: CDH-2622

commit 1b9bc9af319325bd26e1530ce18527ca8f74dafd
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    MAPREDUCE-1778 CompletedJobStatusStore initialization should fail if {mapred.job.tracker.persist.jobstatus.dir} is unwritable
    
    Author: Krishna Ramachandran
    Ref: CDH-2622

commit e0104169bfac10c2760fcf133b01fd2b710208cb
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    MAPREDUCE-1868 Add read timeout on userlog pull
    
    Author: Krishna Ramachandran
    Ref: CDH-2622

commit a6b1ad67cf903e1e56963dcc64d7d7599321d386
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    MAPREDUCE-1850. Include job submit host information (name and ip) in jobconf and jobdetails display
    
    Author: Krishna Ramachandran
    Ref: CDH-2622

commit 4098a214be838238a9879f7f978e34e89b736986
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 18:01:15 2011 -0800

    HDFS-1626. Make block invalidate limit configurable
    
    Author: Tsz Wo (Nicholas) Sze
    Ref: CDH-2622

commit f1b4799fad93b4f02ee29ce5ef5fc217ff72e377
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 9 14:53:16 2011 -0800

    MAPREDUCE-2328. Add configs for memory-related configurations to mapred-default.xml
    
    Author: Yahoo Eng
    Ref: CDH-2622

commit f879e570e2fe88776d15de04a8597898d06f3f77
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 10 14:27:45 2011 -0800

    HDFS-1364. Makes long running HFTP-based applications do relogins if necessary.
    
    Author: Jitendra Pandey
    Ref: CDH-2622

commit ce9aa5ef9dfc5c4fb8e85f9e9e47e67a4b724296
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 9 17:10:49 2011 -0800

    CLOUDERA-BUILD. Increase Xmx for compiling fault injection tests
    
    Ref: CDH-2622

commit f19b644c987305988a36e7d6038b16a9768cb084
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 9 17:31:35 2011 -0800

    Amend MAPREDUCE-2096. MapTask SpillRecord usage doesn't need username.
    
    Author: Yahoo Eng
    Ref: CDH-2622

commit c76f57dcd993063cf960fb42e65219edd5230432
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 9 17:31:34 2011 -0800

    Amend MAPREDUCE-1100. Change log messages in ReduceTask from info to debug level
    
    Reason: reduces log size for large reduce tasks
    Author: Yahoo Eng
    Ref: CDH-2622

commit 674795fc9a60d06dbea41bd9d5a133439822a62b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 10 14:13:46 2011 -0800

    Amend HADOOP-6706. Improve retry behavior for RPC clients
    
    Author: Kan Zhang
    Ref: CDH-2622

commit bfa0b28baad26de8315ec1f9282728913863c3e7
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 10 14:27:45 2011 -0800

    Partial HADOOP-6965. Refactor getTGT and getRefreshTime out of anonymous class, add synchronized block around relogin
    
    Author: Jitendra Pandey
    Ref: CDH-2622

commit e8b759460ab487e98093292be7fa90afa65f47ec
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 10 14:27:45 2011 -0800

    Partial HADOOP-6471. Use StringBuilder in StringUtils.join
    
    Author: Yahoo Eng
    Ref: CDH-2622

commit 3ef31bcc86610d496976b4de9ada82e73f47f162
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 20:03:53 2011 -0800

    HADOOP-7115. Reduces the number of calls to getpwuid_r and getpwgid_r, by implementing a cache in NativeIO.
    
    Author: Devaraj Das
    Ref: CDH-2622

commit d2032071037eb33c562d97b16e0cd291f4e3f23b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 16:59:51 2011 -0800

    MAPREDUCE-1521. Protection against incorrectly configured reduces
    
    Author: Mahadev Konar
    Ref: CDH-2622

commit 6bc623041a1c0d511250bcfdcae85a7b084b0d5f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 19:35:12 2011 -0800

    HDFS-1153. Verify dfsnodelist input for correctness
    
    Author: Ravi Phulari
    Ref: CDH-2622

commit bf655f10661132486cb40ee098fdedbbb5937892
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 19:30:24 2011 -0800

    Partial MAPREDUCE-2055. Cache counters in retired job info
    
    Does not apply entirety of upstream JIRA as described. Simply caches
    Counters in the retired job info.
    
    Author: Krishna Ramachandran
    Ref: CDH-2622

commit 3ae2cde7b036603b8aa19e2ab31994dd3209eded
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 18:35:44 2011 -0800

    MAPREDUCE-1960. Add ability to limit size of jobconf
    
    Author: Mahadev Konar
    Ref: CDH-2622

commit 7429b6597999c1a867926b61d5075bdf85a1be6d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 18:04:02 2011 -0800

    Amend HDFS-457. Include new test TestDataNodeVolumeFailure
    
    Ref: CDH-2622
    Author: Boris Shkolnik

commit af1598cf2f8ce26c43f74b0be684662287e34095
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 18:03:25 2011 -0800

    HDFS-1101. TestDiskError should check all nodes in cluster for test case
    
    Reason: Test failure
    Author: Chris Douglas
    Ref: CDH-2622

commit 2c5115af8426e44b9de804b80fcc9502d64efadd
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 10 15:16:56 2011 -0800

    MAPREDUCE-1118. Enhance the JobTracker web-ui to ensure tabular columns are sortable, also added a /scheduler servlet to CapacityScheduler for enhanced UI for queue information.
    
    Author: Krishna Ramachandran
    Ref: CDH-2622

commit ba185a27aa4bb1bd965e6aa32a9b5bf3e8388f91
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 17:38:39 2011 -0800

    MAPREDUCE-1872, MAPREDUCE-517. Capacity scheduler improvements plus minor framework changes to support
    
    - JobInProgress changes to support locality decisions
    - JobQueueJobInProgressListener.JobSchedulingInfo now has equals() method for
    
    Author: Arun Murthy
    Ref: CDH-2622

commit 83a6619c2656f543e046521f515d97fd70d647bb
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 2 17:37:15 2011 -0800

    MAPREDUCE-1774. Additions to Herriot Testing to test Gridmix, Streaming, Task Controllers
    
    Includes:
      MAPREDUCE-1758 Building blocks for the herriot test cases
      MAPREDUCE-1827 [Herriot] Task Killing/Failing tests for a streaming job.
      MAPREDUCE-2053 [Herriot] Test Gridmix file pool for different input file sizes based on pool minimum size
      MAPREDUCE-2033 [Herriot] Gridmix generate data tests with various submission policies and different user resolvers.
      ... and others from YDH
    
    Reason: QA / YDH merge
    Ref: CDH-2622

commit 5dcc0777f30ae030e20e5e1e3512a0ed6a90e7fc
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Feb 6 13:22:31 2011 -0800

    DISTRO-90. FUSE can pick up the wrong libjvm.so.
    
    Reason: Bug
    Author: Eli Collins
    Ref: DISTRO-90

commit f8e6600fbdc454600990e4f3732462e9b56e0b1b
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Feb 6 13:37:24 2011 -0800

    MAPREDUCE-2256. FairScheduler fairshare preemption from multiple pools may
    preempt all tasks from one pool causing that pool to go below fairshare.
    
    Consider a cluster with 600 map slots and 3 pools. The fair share of each
    pool is initially 200, and the fair-share preemption timeout is 5 minutes.
    
    1) Pool1 schedules 300 map tasks first.
    2) Pool2 then schedules another 300 map tasks.
    3) Pool3 demands 300 map tasks but gets no slots because all are taken.
    4) After 5 minutes, pool3 should preempt 200 map slots. Instead of preempting
    100 slots each from pool1 and pool2, the bug causes it to preempt all 200
    slots from pool2 (the last started), pushing pool2 below its fair share. This
    happens because the preemptTask method does not reduce a pool's count of
    remaining tasks as it preempts them.
    
    This scenario is an extreme case, but some excess preemption can occur
    whenever this bug is triggered.
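    
    The arithmetic of the scenario above can be sketched as follows. This is
    a minimal illustration, not the FairScheduler source: the class
    PreemptionSketch, the preempt method, and the map-based bookkeeping are
    all hypothetical. It assumes each pool gives up at most its surplus over
    its fair share and that its running count is reduced as slots are taken.
    
    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Hypothetical illustration only; not the FairScheduler implementation.
    class PreemptionSketch {
        /**
         * Takes up to `needed` slots from pools running above `fairShare`.
         * `running` maps pool name to running task count and is updated in
         * place, so a pool's surplus shrinks as its tasks are preempted.
         */
        static Map<String, Integer> preempt(Map<String, Integer> running,
                                            int fairShare, int needed) {
            Map<String, Integer> preempted = new LinkedHashMap<>();
            for (Map.Entry<String, Integer> e : running.entrySet()) {
                if (needed == 0) {
                    break;
                }
                int surplus = Math.max(0, e.getValue() - fairShare);
                int take = Math.min(surplus, needed);
                if (take > 0) {
                    // Assumed bookkeeping: reduce what is left in this pool.
                    e.setValue(e.getValue() - take);
                    preempted.put(e.getKey(), take);
                    needed -= take;
                }
            }
            return preempted;
        }

        public static void main(String[] args) {
            Map<String, Integer> running = new LinkedHashMap<>();
            running.put("pool2", 300); // last started, considered first
            running.put("pool1", 300);
            running.put("pool3", 0);
            System.out.println(preempt(running, 200, 200));
        }
    }
    ```
    
    With the numbers from the scenario, the sketch takes 100 slots from each
    of pool2 and pool1, leaving both at their fair share of 200.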
    
    Reason: Bug
    Author: Priyo Mustafi
    Ref: CDH-2593

commit cce41bfecdffd8f37b5a9ae571a827e8042b39c4
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Feb 6 13:12:41 2011 -0800

    CLOUDERA-BUILD. tar file has incorrect permissions for jsvc and task-controller.
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-2553

commit fa3b91e008607ff69bd2796f025680aacc97bd11
Author: Eli Collins <eli@cloudera.com>
Date:   Sat Feb 5 16:21:19 2011 -0800

    DISTRO-44. Hadoop core POM missing jackson dependency.
    
    Reason: Bug
    Author: Eli Collins
    Ref: DISTRO-44

commit f40f6bef0808d34b0632bd759b7916946b6a500c
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Feb 6 14:01:28 2011 -0800

    HADOOP-5489. hadoop-env.sh still refers to java1.5.
    
    Reason: Bug
    Author: Steve Loughran
    Ref: CDH-2588

commit e3356ca6f8a2ee616f610da19fc141d7578a905d
Author: Andrew Bayer <andrew.bayer@gmail.com>
Date:   Thu Jan 27 15:55:01 2011 -0800

    CLOUDERA-BUILD. Changes to support CDH Mavenization.

commit 7e7c0e2d4fe19559a728d2c0860f406124c578e3
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Jan 31 17:55:47 2011 -0800

    Amend MAPREDUCE-1716. Fix test case to wait for up to 20 seconds to verify truncation
    
    Reason: truncation is done in a separate thread at JVM finish time, which may come
      after the job is complete
    Author: Todd Lipcon
    Ref: CDH-2579
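    
    The bounded wait described in this commit can be sketched as a polling
    helper. This is a minimal sketch, not the actual test code: the class
    WaitForSketch, the waitFor method, and the polling interval are all
    hypothetical.
    
    ```java
    import java.util.function.BooleanSupplier;

    // Hypothetical helper; the real test may be structured differently.
    class WaitForSketch {
        /**
         * Polls `condition` every `intervalMs` until it holds or `timeoutMs`
         * elapses. Returns true if the condition was observed in time.
         */
        static boolean waitFor(BooleanSupplier condition,
                               long timeoutMs, long intervalMs) {
            long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
            while (true) {
                if (condition.getAsBoolean()) {
                    return true;
                }
                if (System.nanoTime() >= deadline) {
                    return false;
                }
                try {
                    Thread.sleep(intervalMs);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt and give up waiting.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
    }
    ```
    
    A test could then call waitFor(() -> logIsTruncated(), 20000, 100), where
    logIsTruncated is a placeholder for the actual truncation check.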

commit f6ffedb4441ec43ef7d81fe483807115e98aca41
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 26 13:31:56 2011 -0800

    HADOOP-6882. Update the patch level of Jetty to 6.1.26
    
    Reason: Address XSS and many other upstream bugs
    Author: Owen O'Malley
    Ref: CDH-2564

commit 545bcc1060833f76eab19fa0425f890cb3f9d2cb
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 28 13:39:43 2011 -0800

    MAPREDUCE-2242. Fix environment escaping in LinuxTaskController
    
    Reason: Support env variables containing double quotes
    Author: Todd Lipcon
    Ref: CDH-2572

commit c5df4748c04337af74ca80a84a03e15ba2de2f0e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jan 27 13:01:24 2011 -0800

    HDFS-1353. Remove getBlockLocations optimization that blew out LocatedBlocks response size
    
    Reason: Address OOME found by QA
    Author: Jakob Homan
    Ref: CDH-2573

commit 8bc90cb06955b191c5d4370ca75b3b14aabc9657
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 28 14:33:31 2011 -0800

    HADOOP-5050. TestDFSShell.testFilePermissions should not assume umask setting.
    
    Reason: test failure on machines with different umask
    Author: Jakob Homan
    Ref: CDH-2574

commit 0d4eb1a867620813affdfd3291cb618d6fce63ca
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 28 10:33:02 2011 -0800

    HADOOP-7122. Shell commands leak Timers when timeout expires
    
    Reason: Thread leak seen on JT
    Author: Todd Lipcon
    Ref: CDH-2568

commit 2ad8c54fecae73213da7c74da9f90ba953f9f9c5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 26 17:30:18 2011 -0800

    MAPREDUCE-2253. Servlets should specify content type
    
    Reason: Fix display in browsers
    Author: Todd Lipcon
    Ref: DISTRO-72

commit 51399a0f149292ee18138646488a8070c8b7f34c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 25 15:03:29 2011 -0800

    HADOOP-7118. Fix NullPointerException in Configuration.writeXml
    
    Reason: Bug fix
    Author: Todd Lipcon
    Ref: CDH-2558

commit be89980babbc50eb7e1ccce9b583fff0ae24cf80
Author: Tom White <tom@cloudera.com>
Date:   Wed Jan 26 10:06:06 2011 -0800

    MAPREDUCE-2082. Race condition in writing the jobtoken password file when launching pipes jobs
    
    Reason: security
    Author: Jitendra Nath Pandey
    Ref: CDH-2562

commit c5645ced5c2b32c0657ba3ca60643165c28173ff
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 14 00:42:43 2011 -0800

    MAPREDUCE-1085. For tasks, "ulimit -v -1" is being run when user doesn't specify mapred.child.ulimit
    
    Reason: spurious errors in logs
    Author: Todd Lipcon
    Ref: CDH-2560

commit b02ac3f86f9d929316edd10855721b67459192ba
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jan 20 13:12:06 2011 -0800

    MAPREDUCE-2277. Fix TestCapacitySchedulerWithJobTracker intermittent failure
    
    Reason: test failure
    Author: Todd Lipcon
    Ref: CDH-2547

commit 6b63d73a1917a6c0529158c3bb78ec2ec16ad7ce
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jan 20 16:05:01 2011 -0800

    HDFS-1589. Don't start secure cluster with insecure ports
    
    Reason: security
    Author: Todd Lipcon
    Ref: CDH-2557

commit 8b4374bfa12b1a1ed8cc8e0ab209ad763becf791
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 19 14:22:55 2011 -0800

    HADOOP-3953. Implement sticky bit for directories in HDFS.
    
    Reason: security on /tmp
    Author: Jakob Homan
    Ref: CDH-2091

commit 2bec46c2f46e42a35a69fdbd6f37f8979599e83d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 19 14:48:57 2011 -0800

    Amend HADOOP-5643. Remove PermissionChecker class accidentally left around
    
    This class was supposed to be removed by HADOOP-5643 but accidentally was
    left in the tree. Unreferenced except in one place - now updated to refer
    to the new implementation.
    
    Reason: clean up - noticed during sticky bit backport
    Author: Todd Lipcon
    Ref: CDH-2091

commit 562be1407b9e3c2d8907daaa9500ac96364c9fa2
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 18 10:12:08 2011 -0800

    MAPREDUCE-2238. Avoid racy permissions handling
    
    Reason: leaving undeletable dirs in userlogs directory
    Author: Todd Lipcon

commit 2df0683fe8b9a6f1c7dc9f9ec49697960b473add
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 18 09:46:30 2011 -0800

    HADOOP-7110. Use JNI to implement chmod for performance
    
    Reason: fork can be rather slow, chmod is common
    Author: Todd Lipcon

commit efda213ca9682c9ee555b6c9582eb039cfefc122
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 18 09:43:52 2011 -0800

    Revert "HADOOP-6304. Use java.io.File.set{Readable|Writable|Executable} where possible in RawLocalFileSystem"
    
    This reverts commit 13e93cafe8d4b1e8b741c1873118cdba0313a564.

commit b715fdffb59ad674e16d31db09b75884ddd2e0fa
Author: Tom White <tom@cloudera.com>
Date:   Mon Jan 24 17:41:41 2011 -0800

    HADOOP-5836. Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail
    
    Reason: Bug fix
    Author: Ian Nowland
    Ref: DISTRO-76

commit 516adbfc45e739130bdbb047e45f068a38e72988
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 19 00:27:37 2011 -0800

    HDFS-1330. Make RPCs to DataNodes timeout.
    
    Reason: Customer request
    Author: Hairong Kuang
    Ref: CDH-2044

commit 1e3ffff9722ebd775b870a4c914f202930bb525e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 19 00:18:37 2011 -0800

    HADOOP-6889. Make RPC to have an option to timeout.
    
    Reason: Customer request
    Author: Hairong Kuang
    Ref: CDH-2044

commit fadd26e431dc879d9611f22f2974d4eab30d7efa
Author: Tom White <tom@cloudera.com>
Date:   Fri Jan 7 16:06:03 2011 -0800

    MAPREDUCE-1382. MRAsyncDiscService should tolerate missing local.dir
    
    Reason: Makes it possible for jobtracker and tasktracker to share config file and have different volumes.
    Author: Zheng Shao
    Ref: CDH-2395, DISTRO-36

commit eb118d65f792dd3947b886ea7f2c971556d496cf
Author: Tom White <tom@cloudera.com>
Date:   Tue Jan 18 13:35:06 2011 -0800

    MAPREDUCE-787. -files, -archives should honor user given symlink path
    
    Reason: bug fix
    Author: Amareshwari Sriramadasu
    Ref: CDH-2538

commit 2b0e1289ccbdb9c6837e4ab11fdf73fa8980571c
Author: Tom White <tom@cloudera.com>
Date:   Tue Jan 18 13:33:40 2011 -0800

    MAPREDUCE-572. If #link is missing from uri format of -cacheArchive then streaming does not throw error.
    
    Reason: bug fix
    Author: Amareshwari Sriramadasu
    Ref: CDH-2538

commit a144f415c0e14d1b4d42c72ccf5c97dc8f8423e8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 18 14:17:56 2011 -0800

    Amend HADOOP-6539. Roll back some doc changes that snuck in from trunk
    
    Reason: referenced features not backported into CDH3
    Author: Todd Lipcon
    Ref: CDH-2541

commit 5ebec5b74ea0b6fe9270cc40f770bf4cf4f7d4a7
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jan 13 17:33:24 2011 -0800

    HADOOP-7093. Servlets should default to text/plain.
    
    Reason: fix /stacks and /metrics to be usable again
    Author: Todd Lipcon
    Ref: DISTRO-72

commit 185d654adfa40db3978a2f552feec95748589c89
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Dec 18 18:28:04 2010 -0800

    HDFS-1560. DataNode should set permissions on its data dirs rather than failing to start. Also, should default to 700
    
    Reason: Easier setup, better security
    Author: Todd Lipcon
    Ref: CDH-2530

commit 390cedb3ba0ec9bf7e4859f89c3e10dd40be2763
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 14 17:39:21 2011 -0800

    Amend MAPREDUCE-1092. Enable asserts for tests by default
    
    Reason: reapply patch accidentally reverted by Herriot merge
    Author: Todd Lipcon
    Ref: CDH-520

commit 61bf38c1c0b31ef18b93ed225c8367ab4d5d7f96
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 11 15:46:43 2011 -0800

    DISTRO-73. Fix filesystem leak when userlog location has different FS URI than JT
    
    No upstream JIRA since this was fixed upstream by MAPREDUCE-157
    
    Reason: Thread leak reported on cdh-user
    Author: Todd Lipcon
    Ref: DISTRO-73

commit 5c54c0cae529a17fe30d17642b868f2609c0731b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jan 13 11:54:25 2011 -0800

    Amend MAPREDUCE-1784. Include TestIFile unit test
    
    Reason: missed in prior commit
    Author: Eli Collins
    Ref: CDH-862

commit b48ee52a2c451a673765c67141448fa9cdc7e37a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jan 13 11:56:04 2011 -0800

    HADOOP-7101. UserGroupInformation.getCurrentUser() fails when called from non-Hadoop JAAS context
    
    Reason: Hadoop access fails running from within JMX-created JAAS context
    Author: Todd Lipcon
    Ref: CDH-2525

commit 329ae61a7987d576c0d73a395f773fa820594ea4
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Jan 7 10:47:45 2011 -0800

    HADOOP-7089. Fix link resolution logic in hadoop-config.sh.
    
    The link resolution logic in bin/hadoop-config.sh fails when
    executed via a symlink, from the root directory. We can replace this
    logic with cd -P and pwd -P, which should be portable across Linux,
    Solaris, BSD, and OSX.
    
    Reason: Bug
    Author: Eli Collins
    Ref: DISTRO-9

commit 0f0f7b996033179d70f3750b3d1d0ff4a1b1aef3
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Jan 5 11:32:26 2011 -0800

    CLOUDERA-BUILD. Fix documentation urls that use "current".
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-2405

commit bd69ffce6f04c6d4f3685f55403b5d57191057d9
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 11 15:53:13 2011 -0800

    MAPREDUCE-1178. Fix ClassCastException in MultipleInputs by adding a DelegatingRecordReader.
    
    Reason: bug fix
    Author: Amareshwari Sriramadasu and Jay Booth
    Ref: CDH-2513

commit 6ff69b095f390fba1e8ba3c315a93889a94de481
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 11 15:56:18 2011 -0800

    MAPREDUCE-655. Change KeyValueLineRecordReader and KeyValueTextInputFormat to use new mapreduce api.
    
    Reason: Required for MultipleInputs
    Author: Amareshwari Sriramadasu
    Ref: CDH-2513

commit c1ec4018591d3e2bbb6fa8f664f9355a76e94ad5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 11 15:48:01 2011 -0800

    MAPREDUCE-369. Change org.apache.hadoop.mapred.lib.MultipleInputs to use new mapreduce API.
    
    Amended to not deprecate the old API.
    
    Reason: Customer request, low risk
    Author: Amareshwari Sriramadasu
    Ref: CDH-2513

commit de6b20455e53435d6079b0ed9b0a005bc0c435ff
Author: Konstantin Boudnik <cos@apache.org>
Date:   Tue Jan 11 13:43:02 2011 -0800

    HADOOP-7072 Remove java5 dependencies from build
    
    Description:
    Reason: test is affected.
    Author: cos
    Ref: CDH-2485

commit 5b2e26fd1cfa592931dc9606d6cb81aaf9a5712d
Author: Tom White <tom@cloudera.com>
Date:   Mon Jan 10 16:31:24 2011 -0800

    HADOOP-5170. Reverted: "Set max map/reduce tasks on a per-job basis, either per-node or cluster-wide"
    
    Reason: Patch not accepted upstream. See MAPREDUCE-698 and MAPREDUCE-704.
    Author: Tom White
    Ref: CDH-789

commit 51a15afdd3f2b33e9c6573bfa9d002034edaaaf7
Author: Tom White <tom@cloudera.com>
Date:   Mon Jan 10 13:49:10 2011 -0800

    HADOOP-5476. calling new SequenceFile.Reader(...) leaves an InputStream open, if the given sequence file is broken
    
    Reason: Fix file handle leak, as requested on Hive list.
    Author: Michael Tamm
    Ref: DISTRO-28

commit bda05051c5ad4c56d210427bbe6445c3db66573e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 7 14:20:01 2011 -0800

    MAPREDUCE-2234. If Localizer can't create task log directory, it should fail on the spot.
    
    Reason: Make common source of support tickets easier to diagnose
    Author: Todd Lipcon
    Ref: CDH-2500

commit b57d9d0a60f8d871511750465ad94dd18a103656
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Dec 18 15:44:25 2010 -0800

    MAPREDUCE-2219. Fix JT startup to not require mapred.system.dir inside a dir that it owns
    
    Reason: Easier permissions
    Author: Todd Lipcon
    Ref: CDH-2499

commit b5f1e39c0561d262829ae4cce546773a418db96e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 7 14:07:54 2011 -0800

    HADOOP-7070. Delegate calls up to parent UserGroupInformation
    
    Reason: Fix login behavior underneath glassfish or other JAAS-using containers
    Author: Todd Lipcon
    Ref: DISTRO-66

commit a3421bf550672c6615541e1f73a5e0add9fcc158
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 7 13:59:13 2011 -0800

    HDFS-1542. Add test for HADOOP-7082, a deadlock writing Configuration to HDFS.
    
    Author: Todd Lipcon
    Ref: CDH-2498

commit d0fcd663498ab6af0ae550ea6ace527ac7f7eae3
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 7 13:56:43 2011 -0800

    HADOOP-7082. Configuration.writeXml should not hold lock while outputting.
    
    Reason: Avoid deadlock submitting jobs
    Author: Todd Lipcon
    Ref: CDH-2498

commit 5d85605d7f324d9bb5751bf9e1733170dd97a911
Author: Tom White <tom@cloudera.com>
Date:   Thu Jan 6 12:21:40 2011 -0800

    CLOUDERA-BUILD. Part of MAPREDUCE-157 to fix doubly-escaped job history links
    
    Reason: Bug fix
    Author: Tom White
    Ref: CDH-2283

commit 73c38ae4211b732cae575d7f52f233e2cf6f909e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 5 14:39:10 2011 -0800

    MAPREDUCE-1734. Un-deprecate the old MapReduce API in the 0.20 branch.
    
    Reason: Old APIs will remain through at least 0.23
    Author: Todd Lipcon
    Ref: CDH-2494

commit 4882770efb2a9eb52ae51d5b35e6ba3a2737c44e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Dec 19 17:38:57 2010 -0800

    MAPREDUCE-1906. Allow heartbeat interval minimum to be configured
    
    Author: Todd Lipcon
    Ref: CDH-2319

commit f9f9182ecf6d208fd28b23941b5e851e1efedec7
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Jan 3 21:35:29 2011 -0800

    HADOOP-6578. Configuration should trim whitespace around a lot of value types.
    
    Reason: Improvement
    Author: Michele Catasta
    Ref: CDH-2266

commit 4214d3e60326a9b41e84f85895aca325d634c304
Author: Konstantin Boudnik <cos@apache.org>
Date:   Mon Dec 20 12:16:32 2010 -0800

    CDH-2381. org.apache.hadoop.cli.TestCLI.testAll (from TestCLI) failing in golden CDH3-Hadoop Hudson job
    
    Description:
    Reason: test is affected.
    Author: cos
    Ref: CDH-2381

commit 4ad53f3de801a1a670d658d4d933d9576b99445c
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Dec 17 00:00:55 2010 -0800

    MAPREDUCE-1938. Ability for having user's classes take precedence over
    the system classes for tasks' classpath.
    
    MapReduce should let users supply, for their own jobs, alternate
    implementations of classes that are already defined in the MapReduce
    libraries, for example an alternate implementation of
    CombineFileInputFormat.
    
    Reason: New feature
    Author: Devaraj Das
    Ref: DISTRO-64

commit b0ed02a3d621bbf994f8fb5dc1d86a451afe490d
Author: Tom White <tom@cloudera.com>
Date:   Mon Dec 13 15:00:20 2010 -0800

    MAPREDUCE-1699. JobHistory shouldn't be disabled for any reason
    
    Reason: Bug
    Author: Arun C Murthy
    Ref: CDH-1691

commit ba2c7a5b99915ca1431e3024dd80ac359c8005a1
Author: Tom White <tom@cloudera.com>
Date:   Tue Dec 14 17:41:09 2010 -0800

    MAPREDUCE-1853. MultipleOutputs does not cache TaskAttemptContext
    
    Reason: Bug
    Author: Torsten Curdt
    Ref: CDH-2010

commit 43fb37a6b9693003cc9ea1161bc080e5309b1973
Author: Eli Collins <eli@cloudera.com>
Date:   Thu Dec 9 08:54:22 2010 -0800

    MAPREDUCE-1621. Streaming's TextOutputReader.getLastOutput throws NPE if it has
    never read any output.
    
    If TextOutputReader.readKeyValue() has never successfully read a line,
    then its bytes member will be left null. Thus when logging a task
    failure, PipeMapRed.getContext() can trigger an NPE when it calls
    outReader_.getLastOutput().
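    
    The failure mode described above, and one way to guard against it, can be
    sketched as follows. This is a minimal stand-in, not the streaming code:
    the class LastOutputSketch and its recordLine method are hypothetical,
    and the actual fix may handle the null case differently.
    
    ```java
    import java.nio.charset.StandardCharsets;

    // Hypothetical stand-in for a reader that remembers the last line read.
    class LastOutputSketch {
        private byte[] bytes; // stays null until a line is read successfully

        void recordLine(String line) {
            bytes = line.getBytes(StandardCharsets.UTF_8);
        }

        /** Returns the last line read, or null if nothing has been read. */
        String getLastOutput() {
            if (bytes == null) {
                return null; // guard: avoids dereferencing a null buffer
            }
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }
    ```
    
    Without the null check, building the String from a null byte array would
    throw a NullPointerException before any output had been read.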
    
    Reason: Bug
    Author: Amareshwari Sriramadasu
    Ref: CDH-855

commit af9ef1fcde9aa7ed6d84481837dd5c3e6e4ecc14
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 7 18:29:42 2010 -0800

    MAPREDUCE-1784. IFile should check for null compressor.
    
    Reason: Avoid NPE
    Author: Eli Collins
    Ref: CDH-862

commit 50796a1b13f77ef5c2e098f6a651bf52c05cd2f7
Author: Alejandro Abdelnur <tucu00@gmail.com>
Date:   Tue Nov 30 14:03:00 2010 +0800

    CDH-2234 adding Oozie needed config to Hadoop config example-confs/
    
    Description: adding Oozie needed config to Hadoop config example-confs/
    Reason: to enable zero config for Oozie out of the box
    Author: Alejandro
    Ref: CDH-2234

commit 39b1d616bd1d7cf88cb057d1fd70b0d9b17a9992
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Nov 24 14:19:20 2010 -0800

    Amend HADOOP-6978. Add AC_SYS_LARGEFILE to native build to fix issue with large files
    
    Author: Owen O'Malley
    Ref: CDH-2009

commit 552ebe400b6d94b02d8a3ffebb61b433f7e13aa1
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Nov 12 23:18:51 2010 -0800

    HDFS-1250. Namenode accepts block report from dead datanodes.
    
    Reason: Bug
    Author: Suresh Srinivas
    Ref: CDH-2277

commit fa4ca629131059ade47618d0ed201c4ddc3abe72
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Nov 12 19:46:24 2010 -0800

    HADOOP-6813. Add a new newInstance method in FileSystem that takes a "user" as argument.
    
    Reason: Improvement
    Author: Devaraj Das
    Ref: CDH-648

commit f8b9f2f2e062b33c752c53e5aa3f871f08fa359c
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Nov 10 14:58:26 2010 -0800

    HADOOP-6985. Suggest that HADOOP_OPTS be preserved in hadoop-env.sh.template.
    
    Reason: Improvement
    Author: Ramkumar Vadali
    Ref: CDH-2271

commit 78b9e608a82c69e59950d4be585fc17e79c8eeca
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Nov 3 16:32:40 2010 -0700

    CLOUDERA-BUILD. Remove the MySQL Connector/J library. See SQOOP-97.

commit 835e4b2f8d5f5b8de9eaaf6b2585a62224574323
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Oct 19 15:19:34 2010 -0700

    HDFS-1464. Fix reporting of 2NN address when dfs.secondary.http.address is default
    
    Reason: regression due to HDFS-1080
    Author: Todd Lipcon
    Ref: CDH-2226

commit 62a9a1327165a1a363639c2f21b79be61616f7b3
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 14 16:21:34 2010 -0400

    HADOOP-6663. Fix decompression of empty compressed files
    
    Author: Kang Xiao
    Ref: CDH-2215

commit 98c55c28258aa6f42250569bd7fa431ac657bdbd
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Oct 8 17:07:56 2010 -0700

    CLOUDERA-BUILD. Fix ownership of .out files when starting daemons as root
    
    Author: Todd Lipcon

commit 16ba98db9791a1a24aff066ae884c64abb4b589a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Oct 8 14:10:50 2010 -0700

    CLOUDERA-BUILD. Use su instead of sudo for dropping root privileges.
    
    This fixes an issue on EC2, where some AMIs don't properly support
    sudo.

commit 9616bfbd1f2dd2686a29f47c62fff08d955a7ac8
Author: Todd Lipcon <todd@lipcon.org>
Date:   Thu Oct 7 23:11:25 2010 -0700

    HADOOP-6995. Allow wildcards to be used in ProxyUsers configurations
    
    Author: Todd Lipcon
    Ref: CDH-648

commit 374e10963329ec08d861774d056d1c5ee673f4c8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Oct 8 12:43:11 2010 -0700

    Amend MAPREDUCE-2096. Fix IndexOutOfBoundsException truncating logs when tasks produced no log output
    
    Author: Todd Lipcon
    Ref: CDH-648

commit 49e808c8751615fe154061d456f171f8bb582504
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 7 18:10:47 2010 -0700

    CLOUDERA-BUILD. Add symlinks to built HADOOP_HOME like hadoop-core.jar -> hadoop-core-0.20.2+NNN.jar
    
    This helps other projects create symlinks into the installed hadoop-home without
    having to declare a dependency on a particular patchlevel of the jar.

commit 4904e0fbd60c5f043bb1451ca4e3be012be8cf59
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 7 14:50:46 2010 -0700

    Amend HDFS-1260. Add some sanity checking on FSDataset
    
    Reason: Help debug errors seen in the wild
    Author: Todd Lipcon
    Ref: CDH-913

commit b919f0a99b2ac3b48a32b0906d19c2b306f7a554
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Oct 6 15:38:19 2010 -0700

    CLOUDERA-BUILD. Don't use HADOOP_IDENT_STRING to set user
    
    This was a misuse of this variable - it should only determine the name of the log/pid files

commit eed3bc71002f4cbf3fd0aaeef7016cb80cf61a4a
Author: Todd Lipcon <todd@lipcon.org>
Date:   Wed Oct 6 18:05:30 2010 -0700

    CLOUDERA-BUILD. Amend bin/hadoop changes to properly start tasktracker and jobtracker with sudo

commit 8bb561e0dc46995cca059b5de334b3b790b8ae17
Author: Todd Lipcon <todd@lipcon.org>
Date:   Wed Oct 6 17:29:17 2010 -0700

    Amend MAPREDUCE-2096. fsError() when called from within MR should not do authorization
    
    Reason: Fix incorrect authorization exception
    Author: Todd Lipcon / Devaraj Das
    Ref: CDH-648

commit ce67cd87f21543348ca5c137dee3ff0dc7f338dd
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Tue Oct 5 21:03:50 2010 -0700

    HADOOP-6988. Add support for reading multiple hadoop delegation token files
    
    Author: Aaron T. Myers
    Reason: So Hue can submit jobs authenticated against both the JT and NN.
    Ref: CDH-648

commit ca36717c2b3bc9d610ba2a049b98f798b9d8c1c1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Oct 5 15:33:11 2010 -0700

    CLOUDERA-BUILD. No need to restrict jsvc usage to secure clusters
    
    Reason: It is simpler to always start the DN as root and let it drop privileges
            when jsvc is available. This is OK even if kerberos auth is off.
    Author: Todd Lipcon

commit 60a6eece06bde26516649bdcbed4096dd734503e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Oct 4 18:01:39 2010 -0700

    CLOUDERA-BUILD. Send SecurityAudit logs to the console unless running through hadoop-daemon.sh
    
    Reason: Fixes issue where clients would try to write SecurityAuth.audit logs
    Author: Todd Lipcon

commit dccf120c3796312b1a67481daaa0366b13d471fe
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Oct 4 16:32:24 2010 -0700

    CLOUDERA-BUILD. Amend Task Controller for sbin-located task-controller
    
    Reason: an earlier commit moved task-controller to an sbin directory; this
            updates the Java side
    Author: Todd Lipcon

commit c7f9a8ece8b63fa571420b0c1e40044177b8e42d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Oct 4 16:11:04 2010 -0700

    CLOUDERA-BUILD. Redo hadoop and hadoop-daemon.sh scripts to be more compatible with packaging
    
    Author: Todd Lipcon

commit d56be41bb9648f721ba6714827ccfbf503af7d84
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Oct 4 12:32:09 2010 -0700

    Amend MAPREDUCE-2103. task-controller does not require setgid permissions

commit 7689035d99d720f374c543697016ef23fec7f4f8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Oct 4 11:49:36 2010 -0700

    CLOUDERA-BUILD. Update example secure config

commit eebf85c655d085b5cc49860d5ac59078a99e2349
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Oct 4 12:59:03 2010 -0700

    DISTRO-29. Switch Hue thrift plugin port to 10090 to avoid conflicting with HBase.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-1815

commit a93572183d61bcc9523206450a017c8908795009
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Oct 3 22:52:24 2010 -0700

    Amend MAPREDUCE-2096. Fix issue where JVM authorization was incorrectly triggered
    
    Reason: TaskRunner calls TaskTracker.reportDiagnosticInfo directly at one point,
            so the current user is the MR user, rather than the Job. This patch
            changes the TaskRunner to call to an unauthorized version of the function.
    Author: Todd Lipcon
    Ref: CDH-648

commit 7fb2c9a498db04a93aeee6fe7f2beb4abdf7489f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Oct 3 22:51:21 2010 -0700

    MAPREDUCE-2103. task-controller shouldn't require o-r permissions
    
    Author: Todd Lipcon
    Ref: CDH-648

commit 9a17aaf708514474dff8be5706c798b4c1d5199f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Oct 3 18:00:24 2010 -0700

    CLOUDERA-BUILD. jsvc and task-controller should install into a platform-specific dir

commit 766c6c6e77514164afbd5f14ca171419106d93de
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Oct 3 17:45:22 2010 -0700

    CLOUDERA-BUILD. do-release-build should build task controller

commit 81762d84ddc11fb5268c2ae92feb47d9e1197f1a
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Sep 20 09:37:25 2010 -0700

    HDFS-1377. Quota bug for partial blocks allows quotas to be violated.
    
    There may be a delta in FSDirectory#replaceNode even with identical
    blocks because INode#diskspaceConsumed rounds up the size of the last
    block if newnode is under construction. This causes us to incorrectly
    reduce the space consumed for quota accounting. Looking at uses of
    this function, oldnode and newnode should always have the same blocks;
    therefore we should not expect a delta here.
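    
    Illustrative sketch only, not the patched Hadoop code: the class, method, and
    parameter names below are hypothetical stand-ins for INode#diskspaceConsumed.
    It assumes the charge is the sum of block sizes times replication, with the
    last block charged at the full preferred block size while the file is under
    construction, and shows how identical blocks can still yield a delta.

```java
public class QuotaDeltaSketch {
    /**
     * Hypothetical model of the disk space charged against a quota.
     * Assumption: while a file is under construction, its last block is
     * rounded up to the preferred block size before replication is applied.
     */
    public static long diskspaceConsumed(long[] blockSizes,
                                         long preferredBlockSize,
                                         short replication,
                                         boolean underConstruction) {
        long size = 0;
        for (long b : blockSizes) {
            size += b;
        }
        if (underConstruction && blockSizes.length > 0) {
            // Round the last (partial) block up to a full block.
            size += preferredBlockSize - blockSizes[blockSizes.length - 1];
        }
        return size * replication;
    }

    public static void main(String[] args) {
        long[] blocks = {64, 10};   // same blocks for both nodes, arbitrary units
        long asUnderConstruction = diskspaceConsumed(blocks, 64, (short) 3, true);
        long asFinalized = diskspaceConsumed(blocks, 64, (short) 3, false);
        // A non-zero difference here is the unexpected delta described above.
        System.out.println(asUnderConstruction - asFinalized);
    }
}
```

    With these inputs the two charges differ even though the block list is the
    same, which is why a delta computed across the replacement can be misleading.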
    
    Reason: Bug
    Author: Eli Collins
    Ref: CDH-2092

commit 78625c0dfb4e0f819f79ce29d215097e87790012
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Sep 29 22:44:25 2010 -0700

    HADOOP-6408. Add a servlet at /conf to display running configuration
    
    Reason: Easier debugging and support
    Author: Todd Lipcon
    Ref: CDH-2175

commit 91fa1dfdd74ebac1e88da1d3adb644cf5fe84e7a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Sep 29 00:44:57 2010 -0700

    HADOOP-6496. HttpServer sends wrong content-type for CSS files (and others)
    
    Author: Todd Lipcon
    Reason: Fixes styling on web UIs

commit 9309cf6f1851cc1b379028235b79cc2cf9fe1774
Author: Bruno Mahé <bruno@cloudera.com>
Date:   Mon Sep 27 19:43:02 2010 -0700

    DISTRO-38. Autotools cannot find libssl on fedora
    
    Description: Some GNU/Linux distributions have changed the DSO-linking semantics of the gcc compiler.
    Previously, ld would attempt to implicitly satisfy link requirements and would therefore implicitly add libcrypto when linking to libssl.
    The dependency on libcrypto must now be stated explicitly on these platforms when linking to libssl.
    See https://fedoraproject.org/wiki/Features/ChangeInImplicitDSOLinking and https://fedoraproject.org/wiki/UnderstandingDSOLinkChange
    Reason: Bug
    Author: Bruno Mahé
    Ref: DISTRO-38

commit daa2fd5e76c63c9d9efa11225383fd5496442862
Author: Bruno Mahé <bruno@cloudera.com>
Date:   Mon Sep 27 16:53:47 2010 -0700

    CDH-2137. Jsvc requires the architecture flag to be set on the link command
    
    Reason: Bug
    Author: Bruno Mahé

commit 66e1ba8787ef26f68cc3ec125efd85a776748c36
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Sep 29 13:13:30 2010 -0700

    Amend MAPREDUCE-2096. Rebootstrap native
    
    Reason: Previous libtoolize wasn't run with --copy, so broken link was in repo
    Author: Todd Lipcon
    Ref: CDH-648

commit 90167ef041f15f351ac6357212c477c682373e05
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Sep 27 16:38:35 2010 -0700

    HADOOP-6907, HADOOP-6938, HADOOP-6905. Fix RPC client behavior to use a per-connection configuration.
    
    Author: Kan Zhang
    Ref: CDH-648

commit 3f2759c884c496ef71a75db9d436ebfe61e04111
Author: Todd Lipcon <todd@lipcon.org>
Date:   Mon Sep 27 22:35:31 2010 -0700

    MAPREDUCE-1288. DistributedCache may localize a private file for multiple users
    
    Reason: bug fix when multiple users add the same "private" file to their distributed caches
    Author: Devaraj Das
    Ref: CDH-648

commit c109efb9579e830587e5f7c1762c816d5d241b71
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Sep 27 16:10:43 2010 -0700

    HDFS-1301. TestHDFSProxy needs to use the server side conf for ProxyUser settings.
    
    Reason: Fix failing unit test after HADOOP-6815 application
    Author: Boris Shkolnik
    Ref: CDH-648

commit a1cdd7b028bfd5aaf6bdbfe18122b9a0fb44ed12
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Sep 17 10:50:38 2010 -0700

    CLOUDERA-BUILD. Upgrade Jackson to 1.5.2 to avoid conflicts with Avro and HBase
    
    Author: Todd Lipcon

commit 11d842c61eb63c156e1c3f753d795868bbd2fa0a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Sep 23 13:03:53 2010 -0700

    HADOOP-6815. refreshSuperUserGroupsConfiguration should use server side configuration for the refresh
    
    Author: Boris Shkolnik
    Ref: CDH-648

commit 6b3856a94ca4748cc8e891cdd11473c03f821ee4
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Sep 6 12:54:33 2010 -0700

    MAPREDUCE-2096. Secure local filesystem IO from symlink vulnerabilities
    
    Reason: security vulnerability that could be exploited to gain access to
            other users' job credentials, task output, etc.
    Author: Todd Lipcon
    Ref: CDH-2009

commit 0b213def5dbb9dc7a90009a3446a913ea15f5ee7
Author: Aaron T. Myers <atm@cloudera.com>
Date:   Fri Sep 24 11:47:16 2010 -0700

    HADOOP-6951. Distinct minicluster services (e.g. NN and JT) overwrite each other's service policies
    
    Description: Make ServiceAuthorizationManager's map of service ACLs instance-specific, instead of static.
    Reason: To make HUE's tests work against CDH3.
    Author: Aaron T. Myers
    Ref: CDH-648

commit 85565602b4cebbd91829a0d434e86edd8990fcbc
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Sep 20 22:14:32 2010 -0700

    DISTRO-32. Make the default example configuration support Hue.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-1815

commit 0248b41179a0baf9dd7e4120137f0c24b7251e95
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Sep 21 13:54:40 2010 -0700

    DISTRO-1. Add /usr/lib/jvm/default-java to HADOOP_HOME detection.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-1979

commit 55958019974d56fb1b66e209b49c22efe4a4aa95
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Sep 16 16:52:39 2010 -0700

    CLOUDERA-BUILD. Change pom templates to use com.cloudera.hadoop groupId

commit 6931d93bec73254f13ba08cbe49589a747eb399d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Sep 8 21:51:47 2010 -0400

    HADOOP-6946. SecurityUtil's ticket-fetching should call UGI.getCurrentUser rather than directly accessing JAAS
    
    This fixes a bug where a daemon could call login() and thus set the loginUser(),
    and then still have a null Subject, leading to an inability to fetch TGTs.
    This impacted, for example, the "-checkpoint force" start-up option of the 2NN.
    
    Reason: Fix 2NN startup with forced checkpoint
    Author: Todd Lipcon
    Ref: CDH-648

commit 5fe725b1a48326bf606dadfc636586904aa861c4
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Sep 6 11:41:10 2010 -0700

    HDFS-1378. Track and report file offsets in cases of edit log replay failure.
    
    Author: Todd Lipcon

commit c3b6e1fadf01e1955fff7361cb7872ff4fd997ab
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Sep 15 14:47:45 2010 -0700

    Amend HADOOP-6656. Renewal thread should shut down if it fails to renew
    
    Reason: fixes tight infinite loop that heavily loads KDC
    Author: Todd Lipcon

commit 593f3831671202afef2555243f37ca8f7ac2c46c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Sep 8 14:30:44 2010 -0700

    Amend HDFS-895. Fix races between close() and sync()

commit a9adf89fd17aa3199c4c4f26d7a2d5f8ccffc84d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Sep 10 16:15:12 2010 -0700

    Amend HADOOP-6539. Fix docs to remove mention of sticky bit feature not backported
    
    Author: Todd Lipcon
    Ref: CDH-648

commit 4da1b0da8e176f6d7cf5bdc13786f37e254b6eda
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Sep 10 15:45:34 2010 -0700

    HDFS-1387. Update HDFS permissions guide to reflect security
    
    Reason: documentation
    Ref: CDH-648
    Author: Todd Lipcon

commit 10db8fc860cd1c5de28d204b0efecb37476f0483
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Sep 16 15:04:18 2010 -0700

    HDFS-1404. Incorrect logic in TestNodeCount causes test failures
    
    Reason: Fix occasional red build
    Author: Todd Lipcon

commit 283d6b8d3d1c0ffece93bdf4046b09972b0f44a3
Author: Eli Collins <eli@cloudera.com>
Date:   Thu Sep 16 00:05:09 2010 -0700

    HADOOP-6950. Suggest that HADOOP_CLASSPATH should be preserved in hadoop-env.sh.template.
    
    Reason: Improvement
    Author: Philip Zeyliger
    Ref: CDH-2135

commit b7679d80577d1d3625f520fc01787b4f75faab1d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Sep 15 22:43:18 2010 -0700

    MAPREDUCE-2073. TestTrackerDistributedCacheManager should be explicit about test environment requirements
    
    Reason: Assist testing
    Author: Todd Lipcon
    Ref: CDH-648

commit 892b49d1fd8725323dfbbb19269ec16debe05c57
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Sep 15 20:08:16 2010 -0700

    HDFS-1267. fuse-dfs does not compile.
    
    Reason: Bug
    Author: Devaraj Das
    Ref: CDH-2134

commit 98f1914cc0c6f91f8c0e3aa8cea8e7609b49c901
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Sep 15 20:00:17 2010 -0700

    HDFS-1000. Updates libhdfs to the new API for UGI.
    
    Reason: Bug
    Author: Devaraj Das
    Ref: CDH-648

commit 5966f146fdb0202c0ffd66d3ec3f0c7c4def6afe
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Sep 15 19:55:36 2010 -0700

    Revert "HDFS-1000. libhdfs needs to be updated to use the new UGI"
    
    Description: This is being reverted to apply a newer version of the patch.
    Author: Devaraj Das
    Ref: UNKNOWN

commit d0b28bf2a7ebeff419c7226310aaff7de290af22
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Sep 10 09:41:32 2010 -0700

    HADOOP-6881. The efficient comparators aren't always used except for BytesWritable and Text.
    
    Reason: Bug
    Author: Owen O'Malley
    Ref: CDH-2112

commit cdb501c28dcdeec73ccf92a886bf943f665a5693
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Sep 3 17:09:58 2010 -0700

    HDFS-446. Improvements to Offline Image Viewer.
    
    Author: Jakob Homan
    Ref: CDH-2106

commit 83da6170d68e29c1ae7881c2606af59a2145a8aa
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Sep 3 17:09:05 2010 -0700

    HDFS-461. Tool to analyze file size distribution in HDFS.
    
    Author: Konstantin Shvachko
    Ref: CDH-2106

commit dc0e28f08c2df37a6b99614a5c764fc4037032a0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Sep 3 17:03:28 2010 -0700

    HADOOP-5752. Add a new hdfs image processor, Delimited, to oiv.
    
    Author: Jakob Homan
    Reason: Hue Headlamp app
    Ref: CDH-2106

commit 7362cada95bd07ff3b034f5c7fb15b42365c2d06
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Sep 3 16:55:59 2010 -0700

    HADOOP-5467. Add offline image viewer tool for HDFS filesystem images
    
    Author: Jakob Homan
    Reason: Necessary for Hue Headlamp application
    Ref: CDH-2106

commit b94821f874983e64c78fc93d95539a4f262dca78
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Sep 3 15:56:19 2010 -0700

    HADOOP-6939. Fix inconsistent lock ordering in AbstractDelegationTokenSecretManager
    
    Reason: Fix potential deadlock
    Author: Todd Lipcon
    Ref: CDH-648

commit ae58865f3d65faa78707c79536c16e5b7ce40c16
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Sep 2 16:44:47 2010 -0700

    MAPREDUCE-2051. Add a fair share scheduler system test
    
    Reason: Helps identify deadlocks or races in fair scheduler
    Author: Todd Lipcon
    Ref: CDH-1823

commit b839ebbb2f517eb57930dbe8ed40afc5307dbe3a
Author: Eli Collins <eli@cloudera.com>
Date:   Thu Sep 2 16:07:47 2010 -0700

    MAPREDUCE-1280. Eclipse Plugin does not work with Eclipse Ganymede (3.4).
    
    Reason: Bug
    Author: Alex Kozlov
    Ref: CDH-537

commit 62661841b0687c431f4a066323b8ebb959b90612
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Sep 1 09:51:41 2010 -0700

    DISTRO-27. Fix CombineFileInputFormat incompatible API change
    
    - Revert CombineFileInputFormat to branch-0.20 r990003
    - Reapply following patches to old-API CombineFileInputFormat:
     - MAPREDUCE-1480. Apply more correct progress indication to old-API CombineFileRecordReader
     - MAPREDUCE-1423. Improve performance of CombineFileInputFormat when multiple pools are configured
    - Resuscitate old-API test for CombineFileInputFormat
    
    Author: Todd Lipcon et al
    Reason: Fix hive integration issue
    Ref: DISTRO-27

commit e8d93d35b92d602d8095657bb08a949bfb5aeea8
Author: Eli Collins <eli@cloudera.com>
Date:   Mon Aug 30 11:14:20 2010 -0700

    HADOOP-5861. s3n files are not getting split by default.
    
    Reason: Bug
    Author: Tom White
    Ref: CDH-2011

commit 527b0ee624e8a02b357f2a2a1a31fa798f832d35
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Aug 27 16:26:17 2010 -0700

    HADOOP-6925. BZip2Codec incorrectly implements read().
    
    Description: HADOOP-4012 added an implementation of read() in BZip2InputStream
    that doesn't work correctly when reading bytes > 0x80. This causes
    EOFExceptions when working with BZip2 compressed data inside of sequence files
    in some datasets.
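    
    Illustrative sketch only, not the BZip2InputStream code: the class and method
    names are hypothetical. It assumes the failure mode is the common one where a
    single-byte read() returns a signed Java byte without masking, so high-bit
    values come back negative and 0xff is indistinguishable from end-of-stream.

```java
public class SingleByteReadSketch {
    /** Unmasked variant: returns the signed byte, so 0xff reads as -1 (EOF). */
    public static int readUnmasked(byte[] src, int pos) {
        if (pos >= src.length) {
            return -1; // genuine end of data
        }
        return src[pos]; // sign-extended when widened to int
    }

    /** Masked variant: honours the InputStream.read() contract of 0..255 or -1. */
    public static int readMasked(byte[] src, int pos) {
        if (pos >= src.length) {
            return -1; // genuine end of data
        }
        return src[pos] & 0xff; // keep only the low 8 bits
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0x7f, (byte) 0xff};
        for (int i = 0; i < data.length; i++) {
            System.out.println(readUnmasked(data, i) + " vs " + readMasked(data, i));
        }
    }
}
```

    A caller that treats any negative return as end-of-stream would stop early on
    the unmasked variant, which is consistent with the EOFExceptions reported.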
    
    Reason: Bug
    Author: Todd Lipcon
    Ref: CDH-2068

commit 8f374b1eff2a54fd05590b935c3179c9b686fc0b
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Aug 27 09:30:42 2010 -0700

    HADOOP-6928. Fix BooleanWritable comparator in 0.20.
    
    Description: The RawComparator for BooleanWritable was fixed as part of
    HADOOP-5699 in 0.21 and trunk. The fix should be pushed back into 0.20.
    
    Reason: Bug
    Author: Owen O'Malley
    Ref: CDH-2063

commit 0dee7a8262a12b12e448a4342b636842646c16d0
Author: Eli Collins <eli@cloudera.com>
Date:   Thu Aug 26 20:15:54 2010 -0700

    HADOOP-6833. IPC leaks call parameters when exceptions thrown.
    
    Reason: Bug
    Author: Todd Lipcon
    Ref: CDH-2063

commit e7c81789d095a30fb8abf93557d10b84ea66eaea
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Jun 28 13:37:33 2010 -0700

    CLOUDERA-BUILD. Add sample configuration for a secure cluster based on YDH's sample
    
    Ref: CDH-648

commit fc5270e00c648eb20737918eb689a0d4c4200e98
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Aug 25 14:14:13 2010 -0700

    Amend HDFS-1260. Fix case where FSDataset's volume map could become inconsistent with disk storage

commit 5e76abac366a112c5d221750332ba2c272f319d0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Aug 25 12:14:02 2010 -0700

    MAPREDUCE-2034. Fix TestSubmitJob to verify actual IOException text

commit ab1f3a96c00e2eb53569c1ba682d73ed10bfb4b5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Aug 25 11:33:35 2010 -0700

    Amend HADOOP-6762. Fix gridmix test failures when JobMonitor RPC is interrupted
    
    Reason: HADOOP-6762 added a new exception cause when outbound RPCs are Interrupted.
            This patch fixes gridmix to be aware of InterruptedExceptions.

commit ecc1a3b745384b0f925cb6efc7b6775240ad9195
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Aug 2 17:47:05 2010 -0700

    HDFS-1164. Fix problem in TestHdfsProxy when user running tests doesn't belong to 'users' group
    
    Author: Todd Lipcon
    Reason: fix broken unit test
    Ref: CDH-648

commit bef7c171a5fd2663b5d16bdbba4477ee54947df6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Aug 2 17:42:36 2010 -0700

    HDFS-1313. HdfsProxy changes from HDFS-481 missed in y20.1xx
    
    Author: Rohini Palaniswamy
    Reason: Changes accidentally omitted from HDFS-481 YDH backport, fixes hdfsproxy
    Ref: YDH

commit 67048e890eff6c9cd548dcdc980f5ff3072234cc
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jul 2 22:53:21 2010 -0700

    MAPREDUCE-1682. Fix speculative execution to ensure tasks are not scheduled after job failure.
    
    Author: Arun C Murthy
    Reason: Fixes potential wasted task slots
    Ref: YDH

commit d9b7bd0ff1b74a579761d1bd8d9130c7adb9e80c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jul 2 16:49:11 2010 -0700

    MAPREDUCE-1914. Ensure unique sub-directories for artifacts in the DistributedCache are cleaned up.
    
    Author: Dick King
    Reason: Without patch, distributed cache accumulates directories until reaching dirent limit (32K)
            after which the TT fails.
    Ref: YDH

commit eb44564b61a0467aa2891fd3a434eda20ac30d7b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jul 2 16:34:35 2010 -0700

    MAPREDUCE-1538. Add a limit on the number of artifacts in the DistributedCache to ensure we cleanup aggressively.
    
    Author: Scott Chen
    Reason: Without patch, subdirectory count in cache grows without bound.
    Ref: YDH

commit f9051921efb8d76b0dcd0eed27fd15600635caf0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jul 27 19:08:31 2010 -0700

    MAPREDUCE-2035. Fix task controller build to use -Wall, fix warnings

commit 9d3d402301267201d771becef005864e48ea5b82
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Jun 28 12:05:46 2010 -0700

    MAPREDUCE-1900. MapReduce daemons should close FileSystems that are not needed anymore
    
    Patch: https://issues.apache.org/jira/secure/attachment/12448230/mapred-fs-close.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12448509/fs-close-delta.patch
    Author: Kan Zhang
    Reason: Secured MR daemons often open DFS instances on behalf of a given user, which then
            end up stored in the FS Cache data structure. This patch allows those cache
            entries to be collected, preventing possible OOME scenario.
    Ref: CDH-648

commit 945dc2bdacb99855b66ce70b3024a6f0f8b9f2d6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jun 23 14:37:10 2010 -0700

    HADOOP-6832. Add a static user plugin for web auth for external users.
    
    Author: Owen O'Malley
    Reason: Security
    Ref: CDH-648

commit 00b53896f9de47f36fe8ea5a4ffaa13a85877a3c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Apr 29 12:51:22 2010 -0700

    HDFS-1007. Update HFTPFileSystem to use delegation tokens to support security.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443223/1007-bugfix.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12446280/hdfs-1007-long-running-hftp-client.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12446362/hdfs-1007-securityutil-fix.patch
    Author: Devaraj Das
    Reason: Security
    Ref: CDH-648

commit 221b3e83ec620bb4903946574fe0b250db58fc8a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 28 15:14:24 2010 -0700

    HDFS-1178. The NameNode servlets should not use RPC to connect to the NameNode.
    
    Author: Owen O'Malley
    Reason: Cleanup
    Ref: YDH

commit 8de996f7fac526c605e1931744e7937f90471e88
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu May 20 23:04:35 2010 -0700

    MAPREDUCE-1807. Re-factor TestQueueManager to not timeout
    
    Author: Dick King
    Reason: Fix failing unit test
    Ref: YDH

commit f831304f9adfd7668283310e73ea66185674adc6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu May 20 11:51:43 2010 -0700

    HADOOP-6781. Security audit log shouldn't have exceptions in it.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12445092/HADOOP-6781-BP20.patch
    Author: Boris Shkolnik
    Ref: YDH

commit 5459775249f78827a23863322002f5b0695a04d7
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed May 19 14:08:04 2010 -0700

    HADOOP-6776. UserGroupInformation.createProxyUser's javadoc is broken
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444980/6776.patch
    Author: Devaraj Das
    Ref: YDH

commit 07b56fcda093c41a142668171cf7bc953c9e4db8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue May 18 17:01:12 2010 +0530

    Amend MAPREDUCE-1664. Bug fix to enable queue admins to view jobs.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444782/1664.qAdminsJobView.20S.v1.6.patch
    Author: Ravi Gummadi
    Reason: bug fix to prior patch

commit e81e1a34349a1d6a35faddeed4d7ff2087a6a48c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon May 17 10:56:25 2010 -0700

    HDFS-1157. Modifications introduced by HDFS-1150 are breaking aspect's bindings
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444716/hdfs-1157.patch
    Author: Konstantin Boudnik
    Ref: YDH

commit a8b230d68070e829a2717805c8d3f7c995bf0ae0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 14 22:05:38 2010 -0700

    HDFS-1130. Authorize access to default HDFS servlets with a DFS administrator ACL
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444565/hdfs-1130.3.patch
    Author: Devaraj Das
    Ref: CDH-648

commit f87ec798d3f48c701cdb24372ad15e7d269e580f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jul 27 16:59:02 2010 -0700

    Amend HDFS-1150. Allow the requirement of DNs on low ports to be relaxed by a config
    
    Reason: simplifies testing, and allows running with other security methods
    Author: Todd Lipcon
    Ref: CDH-648

commit 8ad787821eaf906170fc913c5030b147b0bd6e80
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 14 16:37:07 2010 -0700

    HDFS-1150. Let DataNodes bind to privileged ports in order to verify their identity better to clients
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444541/HDFS-1150-Y20S-ready-8.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12444811/hdfs-1150-bugfix-1.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12444864/hdfs-1150-bugfix-1.2.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12445111/HDFS-1150-BF-Y20-LOG-DIRS-2.patch
    Author: Jakob Homan
    Reason: The DataXceiverProtocol does not provide mutual authentication. Binding the DNs to a low port
            number makes it harder for an attacker to impersonate a DN.
    Ref: CDH-648

commit 20f55449358f96fd20f74f3f92c24dce763158e1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 14 12:58:38 2010 +0530

    MAPREDUCE-1716. Truncate logs of finished tasks to prevent node thrash due to excessive logging
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444476/patch-log-truncation-bugs-20100514.txt
    Author: Vinod K V
    Ref: YDH

commit 055c06d1fba4cead09fe70e1cc56873f51927dfb
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu May 13 23:54:54 2010 -0700

    MAPREDUCE-1442. Fixed regex in job-history related to parsing Counter values.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444349/mr-1442-y20s-v1.patch
    Author: Luke Lu
    Reason: Avoid StackOverflowError when JobHistory parses a really long line
    Ref: YDH

commit 24a779aa3f4ee27c596f26c0a524433422c91689
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu May 13 19:08:04 2010 -0700

    HADOOP-6760. WebServer shouldn't increase port number in case of negative port setting caused by Jetty's race
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444455/HADOOP-6760.0.20.patch
    Author: Konstantin Boudnik
    Ref: YDH

commit a1bd71986a43df303166c7a6a3bf6d3e38d2f908
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu May 13 14:04:39 2010 -0700

    HDFS-1146. Add Javadoc for getDelegationTokenSecretManager in FSNamesystem
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444261/HDFS-1146-y20.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 7a38053228f4b8368c183a6658d93df2a216d4ab
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun May 9 09:19:41 2010 -0700

    MAPREDUCE-1744. Fixed DistributedCache apis to take a user-supplied FileSystem to allow for better proxy behaviour for Oozie.
    
    Amended to not deprecate any methods, since their future in the next major release has not been decided yet.
    Patch: https://issues.apache.org/jira/secure/attachment/12444060/MAPREDUCE-1744.patch
    Author: Dick King

commit a696ed00e06ec9cac5b0f0e53a7fc6bcafb6c69f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun May 9 01:30:36 2010 -0700

    MAPREDUCE-1733. Authentication between pipes processes and java counterparts.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444054/MR-1733-y20.3.patch
    Author: Jitendra Nath Pandey
    Reason: Security
    Ref: CDH-648

commit 37bbc27c772ba033670be8d6323ca1a9191d34a7
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 7 16:23:32 2010 -0700

    HADOOP-6756. Clean up and document configuration keys in CommonConfigurationKeys.java
    
    Patch: https://issues.apache.org/jira/secure/attachment/12444008/jira.HADOOP-6756-0.20-1.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12444017/jira.HADOOP-6756-0.20-1-FS_DEFAULT_NAME_KEY.patch
    Author: Erik Steffl
    Ref: YDH

commit 5127aafd818ec2c57d02481f221bff3534e12d14
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 7 15:08:14 2010 -0700

    HDFS-1136. FileChecksumServlets.RedirectServlet doesn't carry forward the delegation token
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443986/HDFS-1136-BP20-2.patch
    Author: Boris Shkolnik
    Reason: Security
    Ref: CDH-648

commit 6951f1e5bbac61b1a1cc3587b712598ed993c729
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 7 23:50:08 2010 +0530

    MAPREDUCE-1759. Improve exception message for unauthorized user doing killJob, killTask, setJobPriority
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443983/1759.20S.1.patch
    Author: Ravi Gummadi
    Reason: Security
    Ref: CDH-648

commit 8a02b7518c90634ec256fa836757919f600fa0e9
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 7 23:13:24 2010 +0530

    HADOOP-6715. Fix AccessControlList.toString behavior when ACL is set to "*"
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443982/6715.20S.6.patch
    Author: Ravi Gummadi
    Reason: Security
    Ref: CDH-648

commit 057e7fc4942d0aa9a36f5b0ae2dd73d516fcf8ba
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 7 13:15:22 2010 +0530

    HADOOP-6757. Fix NPE when streaming jobs launch further Hadoop clients
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443934/BZ-3620565-v1.0.patch
    Author: Amar Kamat
    Ref: YDH

commit b24898d75d8239697d215475dfe1b305ada90e5f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 7 12:37:30 2010 +0530

    HADOOP-6631. FileUtil.fullyDelete should continue deleting after partial failure
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443931/HADOOP-6631-20100506-ydist.final.txt
    Author: Ravi Gummadi
    Ref: YDH

commit 2b0d6bb28d3733a2c42ebbbb4bfc294044c8e619
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 7 11:52:42 2010 +0530

    MAPREDUCE-1754, HADOOP-6748. Replace mapred.permissions.supergroup with an ACL instead of single group
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443928/patch-1754-ydist.txt
    Author: Amareshwari Sriramadasu
    Reason: Security
    Ref: CDH-648

commit cf50e5a91e7159f7f06518162a7400cc681c7f08
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu May 6 11:27:20 2010 -0700

    HADOOP-6701. Incorrect exit codes for "dfs -chown", "dfs -chgrp"
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442987/HADOOP-6701-v20.patch
    Author: Ravi Phulari
    Ref: YDH

commit 463557b922ac3579aa130d01f780ab3c2e32b70f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed May 5 23:38:06 2010 +0000

    HADOOP-6640. FileSystem.get() does RPC retries within a static synchronized block
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443759/getFS_yahoo20s.patch
    Author: Hairong Kuang
    Reason: Fixes potential performance issue in multithreaded environment
    Ref: YDH

commit 781ae842245a6fb948de72d39658b28eab7c2cfe
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed May 5 14:09:32 2010 -0700

    HDFS-1006. Secure image transfer between NN and 2NN
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443766/hdfs-1006-bugfix-1.patch
    Author: Boris Shkolnik
    Reason: Security
    Ref: CDH-648

commit 497fefb3b9132b912a569577bc643c20a273707e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed May 5 09:15:13 2010 -0700

    HADOOP-6745. Add JavaDoc to Server.RpcMetrics, UGI
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443726/HADOOP-6745-BP20-2.patch
    Author: Boris Shkolnik
    Reason: Security
    Ref: CDH-648

commit 51d7be14f892334ea1ad399a34d91ea1bd10e804
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed May 5 11:00:56 2010 +0530

    MAPREDUCE-1707. Fix potential NPE in TaskRunner
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443680/MAPREDUCE-1707-20100504-ydist.txt
    Author: Vinod K V
    Ref: YDH

commit 2647aae4985ca29e222567d8bfc77e2569fde81c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue May 4 19:14:10 2010 +0000

    HDFS-1104. Change fsck to not update block access times
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443523/fsckATime_Yahoo0.20.patch
    Author: Hairong Kuang
    Reason: prevents a possible NN OOME during fsck
    Ref: YDH

commit 392bd0e68be1621406d3556303c30f00d6dfd019
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon May 3 18:58:34 2010 -0700

    Amend HADOOP-6332. Add large-scale test framework "Herriot" that runs against real clusters.
    
    Reason: Phase two of Herriot framework
    Patch: https://issues.apache.org/jira/secure/attachment/12443539/6332-phase2.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12443668/6332-phase2.fix1.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12443788/6332-phase2.fix2.patch
    Author: Konstantin Boudnik
    Ref: YDH

commit c04645dd48af89b8191e090efd15f8520946a606
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 30 14:34:53 2010 -0700

    HADOOP-6693. Add metrics to track kerberos login activity
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443326/HADOOP-6693.rel20.1.patch
    Author: Suresh Srinivas
    Reason: Security
    Ref: YDH

commit 4bce823b6ae54b527cfa25e7802e9a1c83f5b5d2
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Apr 29 14:19:05 2010 -0700

    HADOOP-6710. Symbolic umask for file creation is not consistent with posix
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443134/hadoop-6710.rel20.patch
    Author: Suresh Srinivas
    Ref: YDH

commit e34d5e43768d0f91c0eb847071de7f9b8ec5b323
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 22 14:24:21 2010 +0530

    MAPREDUCE-670 and HDFS-1022. Add a fast "commit test" target
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436553/mapreduce-670-y20.patch
    Author: Jothi Padmanabhan
    Ref: YDH

commit 60b22e764e8d44570f900632b30b37df22855d0d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 27 23:46:55 2010 -0700

    MAPREDUCE-1711. Gridmix should provide an option to submit jobs to the same queues as specified in the trace.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12443040/MR-1711-yhadoop-20-1xx-7.patch
    Author: rahul k singh
    Ref: YDH

commit 24093ee575601dc3cfbb67ac63d742fab1f40f2f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 27 14:52:55 2010 -0700

    MAPREDUCE-1687. Stress submission policy does not always stress the cluster. (htang)
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442692/mr-1687-yhadoop-20.1xx-20100423-2.patch
    Author: rahul k singh
    Ref: YDH

commit 41079a8480984ef8ac84dfe8c97930f0631afc07
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Apr 25 12:06:08 2010 -0700

    MAPREDUCE-1641. Fix DistributedCache to ensure same files cannot be put in both the archives and files sections.
    
    Author: Dick King
    Ref: YDH

commit 13c09b6981b1f5bd8e58b3586d67e4652ab716c8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Apr 24 00:22:59 2010 +0530

    MAPREDUCE-1664. Fix job and queue ACLs to interact in a more useful manner.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442697/1664.20S.3.4.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12443139/M1664y20s-testfix.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12444043/mr-1664-20-bugfix.patch
    
    Author: Ravi Gummadi
    Reason: Security
    Ref: CDH-648

commit 2d09f15358560a61169e64c488f5e6fa7aff1d7f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 23 15:46:55 2010 +0530

    MAPREDUCE-1397. Fix a possible NPE when a killTask command races with a JVM exit
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442657/patch-1397-ydist.txt
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit 7c0d1d3221c5558979942f8aaba3f14f29c5f7b4
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 9 15:38:50 2010 -0700

    HADOOP-6670. Use the UserGroupInformation's Subject as the criteria for equals and hashCode.
    
    Author: Owen O'Malley
    Reason: Security bug fix
    Ref: CDH-648

commit 82b157a5b805c10485712ddb108aa8248ad0df0c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Apr 22 10:13:04 2010 -0700

    HADOOP-6716. System won't start in non-secure mode when krb5.conf (edu.mit.kerberos on Mac) is not present
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442487/HADOOP-6716-BP20-3.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit d8b98cd21187c2d012f022f5fbd3511281f4adbf
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Apr 22 16:47:53 2010 +0530

    MAPREDUCE-1607. Fix possible cleanup task failure in LinuxTaskController
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442538/patch-1607-ydist.txt
    Author: Amareshwari Sriramadasu
    Ref: CDH-648

commit d8fe4d58aa93d151e55cd3e427a4b90ce470e7c9
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Apr 22 10:23:26 2010 +0530

    MAPREDUCE-1533. JobTracker performance improvements.
    
    Author: Dick King
    Reason: Simple CPU usage optimizations in JT and Capacity Scheduler
    Ref: YDH

commit 1b1896055e04da90194893c90ac7e2a3787e5077
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 20 10:55:56 2010 -0700

    MAPREDUCE-1701. Fix a problem with exception handling in delegation token renewals.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442239/MAPREDUCE-1701-BP20-1.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit ea5f8c7922fc5a66e26deb4da55c6088b006e1f0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Apr 19 16:38:48 2010 -0700

    HDFS-1096. Allow refresh of superuser proxy group mappings
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442244/HDFS-1096-BP20-7.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit 877288c5ab55849a08a89ce342a8d5984f18a6df
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 20 01:21:33 2010 +0530

    HDFS-1012. HDFSProxy support for fully qualified HDFS path in addition to simple unqualified path
    
    Patch: https://issues.apache.org/jira/secure/attachment/12441034/HDFS-1012-bp-y20s.patch
    Author: Srikanth Sundarrajan
    Ref: YDH

commit 4aa8da2788a0794be5469227140509cc87d29c47
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 20 01:13:14 2010 +0530

    HDFS-1011. Improve Logging in HDFSProxy to include cluster name associated with the request
    
    Patch: https://issues.apache.org/jira/secure/attachment/12441031/HDFS-1011-bp-y20s.patch
    Author: Ramesh Sekaran
    Ref: YDH

commit 689eb75cdd88b4b7a080ab3883f2a317cfb2c664
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 20 01:08:10 2010 +0530

    HDFS-1010. HDFSProxy: Retrieve group information from UnixUserGroupInformation instead of LdapEntry
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439437/HDFS-1010-bp-y20s.patch
    Author: Srikanth Sundarrajan
    Ref: YDH

commit cb39a8280712bb507e159151de5e62895d79268d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 20 01:01:07 2010 +0530

    HDFS-481. Allow HdfsProxy to securely impersonate the real user
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442210/HDFS-481-NEW.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12442280/HDFS-481-bp-y20s.patch
    Author: Srikanth Sundarrajan
    Ref: CDH-648

commit 0bb2bf837469bdafa5df1b14ed1fb2991070c9d0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Apr 19 13:15:50 2010 +0530

    MAPREDUCE-1657. Fix incorrect error message when trying to view already-deleted logs of a task.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442135/MR1657.20S.1.patch
    Author: Ravi Gummadi
    Ref: CDH-648

commit 1beeed7552f5e83ef24f8926450bb81d7be02b8d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Apr 19 11:00:05 2010 +0530

    MAPREDUCE-1692. Remove TestStreamedMerge from the streaming tests
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442134/patch-1692-ydist.txt
    Author: Amareshwari Sriramadasu
    Reason: Test no longer applicable
    Ref: YDH

commit eb456ec5bc5a3afca9a854e9f4708892c8a6e2f5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 16 17:35:14 2010 -0700

    HDFS-1081. Improve performance of block access token implementation
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442023/HADOOP-1081-Y20-2.patch
    Reason: Reduce number of calls to expensive HMAC functions to reduce NN CPU usage
    Author: Jakob Homan
    Ref: CDH-648

commit 457cf14f7168bacd185c03c9afc8527210e06410
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 16 14:35:38 2010 -0700

    MAPREDUCE-1656. JobStory should provide queue info.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12441905/mr-1656-yhadoop-20.1xx.patch
    Author: Hong Tang
    Ref: YDH

commit c2b68855b127c7dc532ce836fa60dc5c1836f6ec
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 16 14:10:31 2010 -0700

    MAPREDUCE-1317. Reducing memory consumption of rumen objects. Contributed by Hong Tang.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442004/mapreduce-1317-yhadoo-20.1xx.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12443927/3623945-yahoo-20-1xx.patch
    
    Author: Hong Tang
    Ref: YDH

commit cbc852fdeccf007260c6e81a15280272a6f90def
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 16 12:45:26 2010 -0700

    HADOOP-6706. Improve relogin behavior for RPC clients
    
    Patch: https://issues.apache.org/jira/secure/attachment/12441782/6706.bp20.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12442253/6706.bp20.1.patch
    Author: Devaraj Das
    Reason: Security
    Ref: CDH-648

commit 1444a9469340822ed0af92ee9ac780c7b9835c26
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 23 09:36:33 2010 -0700

    HADOOP-6718. Client does not close connection when an exception happens during SASL negotiation
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442614/6718-bp20.patch
    Author: Devaraj Das
    Ref: CDH-648

commit 12bcbba89226fdce99733f366e8eaacd09d95ab7
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 16 15:36:16 2010 +0530

    MAPREDUCE-1617. Fix unit test failures due to IPv6-related issues
    
    Patch: https://issues.apache.org/jira/secure/attachment/12441951/mr-1617-v1.3.patch
    Author: Luke Lu
    Reason: fix unit test
    Ref: YDH

commit d60df2ded7690aed311726e6493ca88d578b882b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 20 12:17:18 2010 -0800

    HADOOP-6545. Cached FileSystem objects can lead to wrong token being used in setting up connections
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436456/6545-bp20.patch
    Author: Devaraj Das
    Ref: CDH-648

commit 110cd5235ba960e003ac94824a29a8b0ac36a031
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Apr 24 09:23:55 2010 -0700

    MAPREDUCE-1718. Fix a bug in HFTPFileSystem so that delegation tokens function correctly
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442726/MAPREDUCE-1718-BP20-2.patch
    Author: Boris Shkolnik
    Reason: Security
    Ref: CDH-648

commit f37d58671d0e9d601cfd446e6966b3a906d95029
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 16 14:43:28 2010 +0530

    MAPREDUCE-587. Fix TestStreamingExitStatus failure case on OSX
    
    Patch: https://issues.apache.org/jira/secure/attachment/12414990/MAPREDUCE-587-v1.0.patch
    Author: Amar Kamat
    Ref: YDH

commit eb3c35987b4434c85fb0203c866a7f8fd56674aa
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Apr 14 13:20:06 2010 +0530

    MAPREDUCE-1985. Fix for java.lang.ArrayIndexOutOfBoundsException in analysejobhistory.jsp of jobs with 0 maps
    
    Author: Vinod Kumar
    Ref: YDH

commit 5bebf947cd534ee350844c3626d11dac315372ed
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 13 10:39:44 2010 -0700

    MAPREDUCE-1680. Add a metric to track number of heartbeats processed by the JobTracker
    
    Patch: https://issues.apache.org/jira/secure/attachment/12441621/mapreduce-1680--2010-04-08.patch
    Author: Dick King
    Ref: YDH

commit 87c3e693adb3912b5c2755cf7e31f9c1b9973273
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Apr 12 16:48:37 2010 -0700

    MAPREDUCE-1683. Removes JNI calls to get jvm current/max heap usage in ClusterStatus by default.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12441563/MAPREDUCE-1683_yhadoop_20_S.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12441978/MAPREDUCE-1683_part2_yhadoop_20_10.patch
    Reason: Performance improvement
    Ref: YDH

commit 6ddae27ba50b6895509839bb89a7a8e2a0550284
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Apr 12 13:03:29 2010 -0700

    HADOOP-6687. User object in the subject in UGI should be reused in case of a relogin.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12440979/HADOOP-6687-y20.2.patch
    Author: Jitendra Nath Pandey
    Ref: YDH

commit f0f38e93276cd5ca8a12c26a0cf138c65fef1951
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Apr 12 11:10:09 2010 +0530

    MAPREDUCE-1635. Fix ResourceEstimator after MAPREDUCE-842
    
    Patch: https://issues.apache.org/jira/secure/attachment/12441448/patch-1635-ydist.txt
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit 56acf64d1453e7de0c87d58bb4565dc1748de5a8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Apr 10 17:19:45 2010 +0530

    MAPREDUCE-1526. Gridmix: Cache the job related information while submitting the job to avoid many RPC calls to JobTracker.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12440983/1594-yhadoop-20-1xx-1-5.patc://issues.apache.org/jira/secure/attachment/12441333/1526-yhadoop-20-101-4.patch
    Author: rahul k singh
    Ref: YDH

commit 14c524cb4f78e52ee1d916fa92dfd2665a3a2527
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Apr 7 09:31:55 2010 -0700

    HADOOP-6674. Turn off SASL checksums for RPCs. (jitendra via omalley)
    
    Patch: https://issues.apache.org/jira/secure/attachment/12442640/HADOOP-6674-y20.1.bugfix.patch
    Author: Jitendra Nath Pandey
    Reason: Performance Improvement in Secure RPC
    Ref: CDH-648

commit c145357a86dcabe072a3f93775c8f45161452841
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 2 17:30:00 2010 -0700

    HADOOP-5958. Replace fork of DF with library call.
    
    Author: Aaron Kimball
    Ref: YDH

commit 2de11fbbf173eef5b35a3ae10777c87728492355
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 9 15:31:59 2010 -0700

    HDFS-999. Secondary namenode should login using kerberos if security is configured.
    
    Author: Boris Shkolnik
    Ref: CDH-648

commit 9d2cc8f9f81c8a39f4c8e1d7e00765ee29808145
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Apr 7 10:56:55 2010 +0530

    MAPREDUCE-1594. Support sleep jobs in gridmix
    
    Patch: https://issues.apache.org/jira/secure/attachment/12440983/1594-yhadoop-20-1xx-1-5.patch
    Author: rahul k singh
    Ref: YDH

commit af4ddb7f8866cf5fdcee45a91a7f10cb2e70c51a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 6 17:26:10 2010 -0700

    HDFS-955. Fix bug where FSImage.saveFSImage could lose edits
    
    Patch: https://issues.apache.org/jira/secure/attachment/12440925/saveNamespace-0.20.patch
    Author: Konstantin Shvachko
    Ref: YDH

commit d229a5fc83e4722162912a60eae40121f2e504e6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Apr 6 17:01:51 2010 -0700

    HDFS-1007. Update HFTP to use delegation tokens
    
    Patch: https://issues.apache.org/jira/secure/attachment/12440931/HDFS-1007-BP20-fix-3.patch
    Author: Devaraj Das
    Ref: CDH-648

commit 46bb5af895747384e2698de9a628b8bea86093d0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Apr 5 16:28:56 2010 -0700

    HDFS-1080. SecondaryNameNode image transfer should use the defined http address rather than local ip address
    
    Patch: https://issues.apache.org/jira/secure/attachment/12440810/HDFS-1080-Y20.patch
    Author: Jakob Homan
    Ref: YDH

commit 46683a5902d3c698e08dde2b1e2464da7636b809
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 2 17:48:05 2010 -0700

    HADOOP-6539. Update various pieces of documentation
    
    Patch: https://issues.apache.org/jira/secure/attachment/12440665/C6539-2-y20s.patch
    Author: Corinne Chandel
    Ref: YDH

commit 9637fcdf4f49a8a6f8674f9ca047b33ab34dba0d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Apr 1 16:35:15 2010 -0700

    HADOOP-6682. Fix incorrect hostname normalization for hostnames starting with [a-f]
    
    Author: Jakob Homan
    Ref: YDH

commit 976302c14eeebc784e85d4af4746d45d39803a70
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 26 11:24:22 2010 -0700

    HADOOP-6661. Add documentation on how to securely impersonate other users
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439897/HADOOP-6661-y20.2.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 25c7e968255a8fdacaaaca0358ce8894d7d925a3
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 24 17:27:04 2010 -0700

    MAPREDUCE-1624. Document job credentials and delegation tokens
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439738/job-creds.2.patch
    Author: Devaraj Das
    Ref: CDH-648

commit c0dbf7361b383dd2e179eb33a5beeb7d6e52fc10
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 23 00:20:20 2010 -0700

    HADOOP-6656. Renew Kerberos TGT when 80% of the renew lifetime has been used up. (omalley)
    
    Author: Devaraj Das
    Reason: Security framework needs to renew Kerberos tickets while the process is running
    Ref: CDH-648

commit 9ecece0b50dbc523a1b9a7c48bb6330ae6e8c93e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Mar 21 14:10:53 2010 -0700

    HADOOP-6653. Protect against NPE in setupSaslConnection when the real user is NULL.
    
    Author: Owen O'Malley
    Ref: CDH-648

commit a1871a06d05bcaa41c68673b68339a10c5f5e183
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Mar 20 17:06:36 2010 -0700

    HADOOP-6652. Remove redundant cache in ShellBasedUnixGroupsMapping
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439372/groups.patch
    Author: Devaraj Das
    Ref: CDH-648

commit ba1b824bfccac3386041eab7dbf29f5c7d4b8662
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Mar 20 15:22:59 2010 -0700

    HADOOP-6649. The login object should be moved to the subject in the UGI.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439344/HADOOP-6649-y20.1.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12439391/HADOOP-6649-y20.1.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 68b5ce77348d26ed4c694a505e7ad00f4ebfc650
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 19 19:22:15 2010 -0700

    HADOOP-6637. Add benchmark for overhead of RPC session establishment
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439348/miniRPCBenchmark-20-100.patch
    
    Author: Konstantin Shvachko
    Ref: CDH-648

commit 5f84248c0faa67eddf505c5afa7b9f320b23e35c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 19 17:12:53 2010 -0700

    HADOOP-6648. Credentials must ignore null tokens that can be generated when using HFTP to talk to insecure clusters.
    
    Author: Devaraj Das
    Ref: CDH-648

commit 635b88303ce6ef229308b9787469cc5b04b4fe3c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 19 14:05:16 2010 -0700

    HADOOP-6647. Service authorization should compare short names, not full names.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439325/HADOOP-6647-BP20.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit ea78340db1b2eb3333a5f79fad076285f82917e7
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Mar 20 00:01:23 2010 +0530

    MAPREDUCE-1612. job conf file is not accessible from job history web page
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439310/jobconf_history_jsp.fix.20S.patch
    Author: Ravi Gummadi
    Ref: YDH

commit 94abbf7bee9701230cff703aac7d740ff0333176
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 19 22:58:21 2010 +0530

    MAPREDUCE-1611. Add service authorization to the AdminOperationsProtocol
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439295/MAPREDUCE-1611-20100319-ydist.txt
    Author: Amar Kamat
    Ref: CDH-648

commit 51c966ff9e07ff017eb1960cac87d8035e4549c6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 19 10:10:35 2010 -0700

    HADOOP-6644. Fix incorrect code style in util.Shell
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439243/HADOOP-6644-BP20.patch
    Author: Boris Shkolnik
    Ref: YDH

commit ef89b9ba5cacecaf5fd292b8cf2e6ebe347e85b7
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 19 19:49:22 2010 +0530

    MAPREDUCE-1609. TaskTracker.localizeJob should not set permissions on job log directory recursively
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439278/MAPREDUCE-1609-20-1.patch
    Author: Amareshwari Sriramadasu
    Ref: CDH-648

commit 9d3a600899712a637109816b5f50b99257ceb79c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 19 14:27:30 2010 +0530

    MAPREDUCE-1610. Forrest documentation should be updated to reflect the changes in MAPREDUCE-856
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439252/MAPREDUCE-1610-20.patch
    Author: Ravi Gummadi
    Ref: CDH-648

commit aa3c9671cd7cfa6818c913f293d2605b87431e60
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 19 01:10:57 2010 -0700

    Amend MAPREDUCE-1532. Add more informative logging messages for configuration-authentication mismatch
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439248/1532-bp20.4.2.patch
    Author: Devaraj Das
    Ref: CDH-648

commit 8f3777a09e9cc3e260e68c1a177908204b0dad8c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 19 13:26:36 2010 +0530

    MAPREDUCE-1417. Forrest documentation should be updated to reflect the changes in MAPREDUCE-744
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439247/MAPREDUCE-1417-20.patch
    Author: Ravi Gummadi
    Ref: CDH-648

commit adbf650d942aff9d372289110672916fcb4574a7
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 19 10:03:46 2010 +0530

    HADOOP-6634. AccessControlList uses full-principal names to verify acls causing queue-acls to fail
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439238/HADOOP-6634-20100317-ydist.1.txt
    Author: Vinod K V
    Ref: CDH-648

commit 89e10647b3c733c16124b648a6bfe187670390da
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 18 17:02:47 2010 -0700

    HADOOP-6642. Fix javac, javadoc, findbugs warnings
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439225/C6642-1y20.patch
    Author: Chris Douglas
    Ref: CDH-648

commit 5abf4f644d7fc869858f896708b49f211ddf17d4
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 18 15:42:18 2010 -0700

    HDFS-1044. Cannot submit mapreduce job from secure client to unsecure server
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439220/HDFS-1044-BP20-6.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit 0f77b062de438f4d138da08d4028a7afe7a233bd
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 18 15:23:21 2010 -0700

    HADOOP-6638. Try to relogin after a failed RPC connection (expired TGT) only if the subject is loginUser or proxyUgi.realUser.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439080/HADOOP-6638-BP20.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit cfbc26a6211204e5e87d9986019fc97110662127
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 18 03:58:08 2010 -0700

    HADOOP-6632. Support for using different Kerberos keys for different instances of Hadoop services
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439144/HADOOP-6632-Y20S-22.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12439307/6632.mr.patch
    Author: Kan Zhang
    Ref: CDH-648

commit 5b6834d2f255fec50d44ec50bec38b0004055c7d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 18 02:45:31 2010 -0700

    HADOOP-6526. Need mapping from long principal names to local OS user names
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439139/HADOOP-6526-y20.4.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12442917/3595485.patch
    Author: Owen O'Malley
    Reason: Security
    Ref: YDH

commit 87c00a2594f42c8d96479ba339800e5224136902
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 18 12:05:01 2010 +0530

    MAPREDUCE-1604. Document Job ACLs in forrest
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439114/patch-1604-ydist.txt
    Author: Amareshwari Sriramadasu
    Ref: CDH-648

commit c69f8f7977591ac297cf5501374c4f40acfea7ee
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 17 20:32:34 2010 -0700

    HDFS-1045. In secure clusters, re-login is necessary for https clients before opening connections
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439110/HDFS-1045-Y20.patch
    Author: Jakob Homan
    Ref: CDH-648

commit 0f37bc72df436c30fbb3a1c826b2040f8118570b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 17 14:13:39 2010 -0700

    HADOOP-6603. Clarify a comment in SecurityUtil
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439078/fix_comment_y20.patch
    Author: Jakob Homan
    Ref: CDH-648

commit b10afd0f9b75e7deb5045e19a9a954769c0925e6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 26 18:40:08 2010 +0000

    HDFS-985. HDFS should issue multiple RPCs for listing a large directory
    
    Patch: http://issues.apache.org/jira/secure/attachment/12437088/iterativeLS_yahoo1.patc
    Patch: http://issues.apache.org/jira/secure/attachment/12437499/testFileStatus.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12439066/directoryBrowse_0.20yahoo_2.patch
    Reason: Performance of large directory access
    Author: Hairong Kuang
    Ref: YDH

commit 353f15176ed481f268f37e33d4fc9f1745f44afe
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 18 00:06:22 2010 +0530

    MAPREDUCE-1543. Log messages of JobACLsManager should use security logging of HADOOP-6586
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439057/mapreduce-1543-y20s-3.patch
    Author: Luke Lu
    Ref: CDH-648

commit 02d4404d39e60322ea13a2ce0160635df4ef3155
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 17 10:30:20 2010 -0700

    MAPREDUCE-1606. TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439054/MR1606.20S.1.patch
    Author: Ravi Gummadi
    Ref: CDH-648

commit aa56ca096e5bba16a873600c57d29586896432ce
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 17 09:12:13 2010 -0700

    HADOOP-6633. Normalize property names for JT/NN kerberos principal names in configuration
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438949/HADOOP-6633-BP20-2.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit ec983446e7a1600a95fcf60bd92205f5b9318d99
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 17 00:21:35 2010 -0700

    HADOOP-6613. RPC server should check for version mismatch before authentication method
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437831/HADOOP-6613-Y20S-1.patch
    Author: Kan Zhang
    Ref: CDH-648

commit d8de7f9dbe8b94d380849f2f5f4ff357ce50e1a6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 17 11:57:13 2010 +0530

    HADOOP-5592. Fix typo in streaming documentation
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436671/patch-5592-ydist.txt
    Author: Corinne Chandel
    Ref: YDH

commit 3a91217a9802a4abe89a50dc0dcb8425cd42c060
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 17 11:50:39 2010 +0530

    MAPREDUCE-813. Address errors in streaming docs.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436672/patch-813-ydist.txt
    Author: Corinne Chandel
    Ref: YDH

commit d170cf2c73554beec3dbc43d79cd3d15f7b4d99c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 17 11:37:43 2010 +0530

    MAPREDUCE-927. Cleanup of task-logs should happen in TaskTracker instead of the Child
    
    Patch: https://issues.apache.org/jira/secure/attachment/12439009/patch-927-5-dist.txt
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit 2acd7103481363aeadb4593738cc8e6f8f11f483
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 16 15:46:02 2010 -0700

    HDFS-1039. Service should be set in the token in JspHelper.getUGI
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438896/HDFS-1039-y20.2.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12439603/HDFS-1039-y20.2.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit d31d5e1b2eef3d92c60781087aba2857900d2273
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 16 12:01:28 2010 -0700

    MAPREDUCE-1599. MRBench reuses jobConf and credentials therein.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438844/MR-1599-y20.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit f4ad3c7d410bfb5d86053fcbc91b4275fe0fd74c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 11 17:28:05 2010 -0800

    HDFS-1036. In DelegationTokenFetch, dfs.getURI returns no port
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438549/HDFS-1036-BP20.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12438585/HDFS-1036-BP20-1.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12438856/fetchdt_doc.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit e4974cce6a221e1e4b8206145746e26c83fd9253
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 11 20:17:52 2010 -0800

    HDFS-1038. Fix bug causing NPE in nn_browsedfscontent.jsp when security is disabled
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438570/HDFS-1038-y20.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 596d28594c9bd32116c6510e6607308bc1a762e8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 11 16:58:55 2010 -0800

    HADOOP-6627. "Bad Connection to FS" message in FSShell should print message from the exception
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438455/HADOOP-6627-BP20.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit 6d83d8407d13f0e4b8b26f74bfbbd592b4c906ee
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 11 14:38:56 2010 -0800

    HDFS-1033. In secure clusters, NN and SNN should verify the remote principal during image and edits transfer
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438477/HDFS-1033-Y20.patch
    Author: Jakob Homan
    Ref: CDH-648

commit 3d43a9035ebaf0c449647c5f35fbd37444708d9f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 11 10:48:27 2010 -0800

    MAPREDUCE-1522. FileInputFormat may change the file system of an input path
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437994/M1522-1v20.patch
    Author: Tsz Wo (Nicholas), SZE
    Ref: CDH-648

commit 897cd8d3d578c3cee70f039050f3cdd800daafb1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 10 15:45:12 2010 +0530

    MAPREDUCE-1100. User's task-logs filling up local disks on the TaskTrackers
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438394/patch-1100-fix-ydist.2.txt
    Author: Vinod K V
    Ref: YDH

commit 6cbba23fe597ada4f109fc92ecbffc3d01dcc8ac
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 10 15:17:17 2010 +0530

    MAPREDUCE-1422. Changing permissions of files/dirs under job-work-dir may be needed so that cleaning up of job-dir in all mapred-local-directories succeeds always
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438393/mapreduce-1422-y20s.patch
    Author: Amar Kamat
    Ref: CDH-648

commit 1cb57487f49bd7fd14c7575a1e8f5842b3f24e35
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 9 23:39:13 2010 -0800

    HDFS-992. Re-factor block access token implementation to conform to the generic Token interface in Common
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438371/h992-BK-0.20-07.1.patch
    Author: Kan Zhang
    Ref: CDH-648

commit 543dcb4cdef6028146e5f8104123c8ac84e11e6b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 10 11:20:33 2010 +0530

    MAPREDUCE-890. After HADOOP-4491, the user who started mapred system is not able to run job.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438369/MR890.20S.patch
    Author: Ravi Gummadi
    Ref: CDH-648

commit 7b32dc80001de8b93c52041e15ab29bc52d5a68d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 9 17:07:36 2010 -0800

    HADOOP-6598. Remove verbose logging from the Groups class
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438059/HADOOP-6598-BP20.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12438562/HADOOP-6598-BP20-Fix.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit 8c19066cdc5f6605e65d07127649a77309f2943a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 9 12:54:42 2010 -0800

    HADOOP-6620. NPE if renewer is passed as null in getDelegationToken
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438072/HADOOP-6620-y20.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 447441ec4a592225dca7d3bf42a6c6c2f977b2dc
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Mar 8 22:58:43 2010 +0530

    Amend MAPREDUCE-1435. symlinks in cwd of the task are not handled properly after MAPREDUCE-896
    
    Reason: fixes chmod during cleanup to not make private files group-readable, adds tests
    Patch: https://issues.apache.org/jira/secure/attachment/12438172/MR-1435-y20s-1.txt
    Author: Ravi Gummadi
    Ref: CDH-648

commit eb5a68ab5a10bd85527cd4495a17687a826af698
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Mar 7 23:14:19 2010 -0800

    HADOOP-6612. Protocols RefreshUserToGroupMappingsProtocol and RefreshAuthorizationPolicyProtocol will fail with security enabled
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437809/HADOOP-6612-BP20.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit c62d6f0ed1c6d9e48db909acade2683830ebf37c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 5 15:18:03 2010 -0800

    MAPREDUCE-1566. Mechanism to import tokens and secrets from a file into the submitted job.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12438122/mr-1566-1.patch (bugfixes for testcases on top of the patch committed earlier)
    Patch: https://issues.apache.org/jira/secure/attachment/12438376/mr-1566-1.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 1c42a6fc1ed560fd31c0622ecb272e48a56d70a1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 5 18:42:28 2010 -0800

    HADOOP-6603. Provide workaround for issue with Kerberos not resolving cross-realm principal
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437826/HADOOP-6603-Y20S-4.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 072d68d2af1235eaacb49975f78d3457cec60938
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 5 15:41:57 2010 +0530

    MAPREDUCE-1421. LinuxTaskController tests failing on trunk after the commit of MAPREDUCE-1385
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437985/patch-1421-1-ydist.txt
    Author: Amareshwari Sriramadasu
    Ref: CDH-648

commit 69059680d9e0df4004ac7199c2c54ea658c35173
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Mar 5 05:58:25 2010 +0000

    HDFS-814. Add an api to get the visible length of a DFSDataInputStream.
    
    Patch: http://issues.apache.org/jira/secure/attachment/12437934/getLength-yahoo-0.20.patch
    Patch: http://issues.apache.org/jira/secure/attachment/12438026/privateInputStream.patch
    Author: Tsz Wo (Nicholas), SZE
    Ref: YDH

commit e5a03085722a84a8ee0419ba8d91cd022023c0a0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 4 19:05:52 2010 -0800

    HDFS-1023. Allow http server to start as regular principal if https principal not defined.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437962/HADOOP-1023-Y20-1.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12438241/HDFS-1023-Y20-Update-2.patch
    Author: Jakob Homan
    Ref: CDH-648

commit 5dcd47a711f530fb989350569dd5553b93e62490
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Mar 4 00:55:53 2010 -0800

    HDFS-1015. Intermittent failure in TestSecurityTokenEditLog
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437830/HDFS-1015-y20.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 23cdcc2b187566333ae2201cd9706655f55ebf15
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 3 19:14:35 2010 -0800

    HDFS-1020. The canceller and renewer for delegation tokens should be long names.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437838/HDFS-1020-y20.2.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 522fd421225fbab7258a2977e4d40c0c13179376
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 3 19:09:49 2010 -0800

    HDFS-1019. Incorrect default values for delegation tokens in hdfs-default.xml
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437832/HDFS-1019-y20.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 7187e0d7a8367e5e072b5e589db5becae0e6eb1d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 3 18:55:35 2010 -0800

    Amend MAPREDUCE-1430. JobTracker should be able to renew delegation tokens for the jobs
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437822/1430-bp20-bugfix.patch
    Author: Devaraj Das
    Ref: YDH

commit 52f7ba19fcd24172f1576b7c19db1c45427fe85d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 3 18:53:33 2010 -0800

    MAPREDUCE-1559. The DelegationTokenRenewal timer task should use the jobtracker's credentials to create the filesystem
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437821/mr-1559.patch
    Author: Devaraj Das
    Ref: CDH-648

commit cab178d48ab27c74a44009b9d37e686580172794
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 3 18:50:32 2010 -0800

    MAPREDUCE-1550. UGI.doAs should not be used for getting the history file of jobs
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437835/1550-2.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12437870/1550-2.1.patch
    Author: Devaraj Das
    Ref: CDH-648

commit 4fd9491b5aeaea1f598f4314203aa5047576f137
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 3 16:48:17 2010 -0800

    HADOOP-6609. Fix UTF8 to use a thread local DataOutputBuffer instead of
    a static that was causing a deadlock in RPC. (omalley)
    
    Author: Owen O'Malley
    Ref: YDH

commit a4c9f34046770120a47ba864eaaaab0ecc952d86
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 3 10:50:34 2010 -0800

    HDFS-1017. browsedfs jsp should call JspHelper.getUGI rather than using createRemoteUser()
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437683/HDFS-1017-Y20-2.patch
    Author: Jakob Homan
    Ref: CDH-648

commit 5802343b91fc15ee719c2cffabf6a3f9f01f4007
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 3 09:45:14 2010 +0530

    MAPREDUCE-899. When using LinuxTaskController, localized files may become accessible to unintended users if permissions are misconfigured.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437670/mr-899-20.patch
    Author: Amareshwari Sriramadasu
    Ref: CDH-648

commit d763323e21122018c746515655c9fdff00547635
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Mar 3 00:36:08 2010 +0000

    HDFS-204. Revive number of files listed metrics
    
    Patch: http://issues.apache.org/jira/secure/attachment/12437576/getFileNum-yahoo20.patch
    Author: Jitendra Nath Pandey
    Ref: YDH

commit ba0fc48b0d49ba1c03ef50ddb289580e61e0d689
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Mar 2 23:04:42 2010 +0000

    HADOOP-6569. FsShell#cat should avoid calling unnecessary getFileStatus before opening a file to read
    
    Patch: http://issues.apache.org/jira/secure/attachment/12437633/optimizeCat-yahoo2.patch
    Author: Hairong Kuang
    Ref: YDH

commit ebc7ac47eb93eeab3f8b8987ac74a56b0e40a982
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Mar 1 17:25:45 2010 -0800

    HDFS-1014. Error in reading delegation tokens from edit logs.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437547/HDFS-1014-y20.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit e5ec38e7ece3c2e9d7c64ae8bad4fe60e2a75088
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Mar 1 01:05:29 2010 -0800

    HDFS-1006. getImage/putImage http requests should be https for the case of security enabled.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437467/HDFS-1006-Y20.1.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit 0c9d8c2dd9693f0a7317a139b5911a32c85aab61
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Mar 1 00:27:00 2010 -0800

    HDFS-1005. Fsck security
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437435/HDFS-1005-BP20.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12438474/HDFS-1005-BP20-1.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit 5a1ea2b0a3050a0a8fdae64902067504d6f8c8eb
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Mar 1 00:09:00 2010 -0800

    HDFS-1007. HFTP needs to be updated to use delegation tokens
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437458/distcp-hftp.2.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12437464/distcp-hftp.2.1.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12438384/distcp-hftp-2.1.1.patch
    Author: Devaraj Das
    Ref: CDH-648

commit 4bcd449f60c72fb3058a6b3ff1316b7bc10514e8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 27 04:04:26 2010 -0800

    HDFS-992. Re-factor block access token implementation to conform to the generic Token interface in Common
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437340/h992-BK-0.20-07.patch
    Author: Kan Zhang
    Ref: CDH-648

commit 039917c679539092585e96f590fae59052a3cae6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 27 03:26:42 2010 -0800

    MAPREDUCE-1528. TokenStorage should not be static
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437339/MAPREDUCE-1528_yhadoop20.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 40c1a9177db24c0fd589ad8312a31a715f4cf8ad
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 27 03:19:02 2010 -0800

    HADOOP-6584. Provide Kerberized SSL encryption for webservices
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437337/HADOOP-6584-Y20-4.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12437768/HADOOP-6584-FixJavadoc-Y20.patch
    Author: Jakob Homan
    Ref: CDH-648

commit ee57d93dde68a072985a44e025594bbb8c9340b0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 27 16:41:17 2010 +0530

    MAPREDUCE-1493. Authorization for job-history pages
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437336/MAPREDUCE-1493-20100227.3-ydist.txt
    Author: Vinod K V
    Ref: CDH-648

commit e6e5a1bd6943f228f5271627a200350bff525cba
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 27 15:44:10 2010 +0530

    MAPREDUCE-1455. Authorization for servlets
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437322/1455.20S.2.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12437379/1455.20S.2.fix.patch
    Author: Ravi Gummadi
    Ref: CDH-648

commit 3ad925fb44f4e98e986b5648e2c593d1feabafd4
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 27 15:33:40 2010 +0530

    MAPREDUCE-1307. Introduce the concept of Job Permissions
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437331/MAPREDUCE-1307-20100227-ydist.txt
    Author: Vinod K V
    Ref: CDH-648

commit 06b43c680556c94120774e2c22279086654f50ee
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 27 15:03:15 2010 +0530

    HADOOP-6568. Authorization for default servlets
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437323/HADOOP-6568-20100226.1-ydist.patch
    Author: Vinod K V
    Ref: CDH-648

commit cf43306ef9d6a7b1a3d3287a898c05cbd51361b6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 27 00:04:02 2010 -0800

    HADOOP-6589. A framework to enable better error messages when rpc connections
    fail to authenticate. (Kan Zhang via omalley)
    
    Author: Kan Zhang
    Ref: CDH-648

commit ea1922be01c027410bdfdb3776b79cbb2328162d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 26 22:45:11 2010 -0800

    HADOOP-6600. mechanism for authorization check for inter-server protocols
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437320/HADOOP-6600-4-BP20.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12437534/HADOOP-6600-BP20-fix.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit 877787c7557f3c2bb824f526d5b036c734bfc5e1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 26 21:43:27 2010 -0800

    HADOOP-6580, HDFS-993, MR-1516. UGI should contain authentication method.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437317/HADOOP-6580-0_20.5.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 2a717fe40d0a2514647f25457e6a8f344dff7941
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 26 21:22:54 2010 -0800

    HADOOP-6573, HDFS-984, MR-1537. Delegation Tokens should be persisted.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437292/HDFS-984-0_20.4.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 5ba4559365dec19e119d041ddf7f6f65b7fc0c56
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 26 19:26:55 2010 -0800

    HDFS-994, HADOOP-6594. Provide methods for obtaining delegation token from Namenode for hftp and other uses
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436748/HADOOP-6594.patch
    Author: Jakob Homan
    Ref: CDH-648

commit c697b870f3d87e17438115a037e9fd831e757e3f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 26 18:27:28 2010 -0800

    HADOOP-6586. Log authentication and authorization failures and successes
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437302/HADOOP-6586-8-BP20-1.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit ac2d2869ecccd91fdacab7544eca00767f8d64a1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 26 11:28:36 2010 -0800

    HDFS-991. Use delegation token to authenticate to the hdfs servlets.
    
    Author: Owen O'Malley
    Ref: CDH-648

commit 8b1175a8139fc28041c67f095212d7224ac44b2e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 26 16:50:29 2010 -0800

    HADOOP-6599. Split RPC metrics into summary and detailed metrics
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437251/hadoop-6599.rel20.patch
    Author: Suresh Srinivas
    Ref: YDH

commit 12f51ccde0651d7ed7e2458dfd498f0b8856d70a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 26 13:14:06 2010 -0800

    HDFS-998. The servlets should quote server generated strings sent in the response
    
    Patch: http://issues.apache.org/jira/secure/attachment/12436835/H998-0y20.patch
    Author: Chris Douglas
    Ref: CDH-648

commit 0d1bfe678e356b154d237e34d4e4379a84614158
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 26 13:12:41 2010 -0800

    MAPREDUCE-1454. The servlets should quote server generated strings sent in the response
    
    Patch: http://issues.apache.org/jira/secure/attachment/12436834/M1454-0y20.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12437591/M1454-1y20.patch
    Author: Chris Douglas
    Ref: CDH-648

commit 36710ad2bc76df6df01ef540df21fedb9b702160
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 25 18:28:07 2010 -0800

    HDFS-1000. libhdfs needs to be updated to use the new UGI
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437071/hdfs-1000-bp20.4.patch
    Author: Devaraj Das
    Ref: CDH-648

commit a9847d0ea9f695dca2d3dca6e81a8cd1f29665fc
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 25 18:25:29 2010 -0800

    MAPREDUCE-1532. Delegation token is obtained as the superuser
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437096/1532-bp20.4.patch
    Author: Devaraj Das
    Ref: CDH-648

commit c44edf05330198b89ec994b2543e1fe459dd30bd
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Feb 21 22:10:38 2010 -0800

    MAPREDUCE-1430. JobTracker should be able to renew delegation tokens for the jobs
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436542/1430-dd4-BP20.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit 6e62f0d9c2d57096e4dfa937ebeab8c76b354e63
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 25 10:37:41 2010 -0800

    HADOOP-6596. Add a version field to the serialization of the AbstractDelegationTokenIdentifier.
    
    Author: Owen O'Malley
    Ref: CDH-648

commit 74cc8e6e9836a9daab24553b41efca110f50411a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 25 10:29:08 2010 -0800

    HADOOP-5561. Add javadoc.maxmemory to build.xml to allow larger memory.
    
    Author: Jakob Homan
    Ref: YDH

commit 03e35485989da9f5d60dacdcfc67e8566b8590f8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 25 10:14:22 2010 -0800

    HADOOP-6579. Add a mechanism for encoding and decoding Tokens into
    URL-safe strings. Also change commons-codec library to 1.4.
    
    Author: Owen O'Malley
    Ref: CDH-648

commit a04e3abfd58010fd99877910c0f15bbbebb1b45a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 25 20:48:43 2010 +0530

    MAPREDUCE-1354. Incremental enhancements to the JobTracker for better scalability
    
    Patch: https://issues.apache.org/jira/secure/attachment/12437010/mr-1354-y20.patch
    Author: Dick King
    Ref: YDH

commit 69ec78a2c5d4faa8700cf45f7560fe1d10517ddf
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 24 17:16:50 2010 -0800

    HDFS-999. Secondary namenode should login using kerberos if security is configured
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436938/HDFS-999-BP20.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit ac14a11d60fcbd9cc660fc8b48a894cd6774b102
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 25 00:16:28 2010 +0530

    MAPREDUCE-1466. FileInputFormat should save #input-files in JobConf
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436886/MAPREDUCE-1466_yhadoop20-3.patch
    Author: Luke Lu
    Ref: YDH

commit 50ddfaab96c196a2342e7807423d31f5bd10a18e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Feb 24 17:04:46 2010 +0530

    MAPREDUCE-1403. Save file-sizes of each of the artifacts in DistributedCache in the JobConf
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436842/MAPREDUCE-1403_yhadoop20-2.patch
    Author: Arun C Murthy
    Ref: YDH

commit 645b54a053bf565ef2a0f36be8c5a02f80a2775a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 23 23:35:49 2010 -0800

    HADOOP-6566. Hadoop daemons should not start up if the ownership/permissions on the directories used at runtime are misconfigured
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436814/HADOOP-6566_yhadoop20.patch
    Author: Arun C Murthy
    Ref: CDH-648

commit 2943513d74b4a8c1763eccab854b96a57caec7f4
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 23 18:04:41 2010 -0800

    MAPREDUCE-1520. TestMiniMRLocalFS fails on trunk
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436695/patch-1520-20S.txt
    Author: Amareshwari Sriramadasu
    Ref: CDH-648

commit e82c781785f284d926591a91df837021f5c68fcc
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 23 17:45:16 2010 -0800

    HADOOP-6543. Allow authentication-enabled RPC clients to connect to authentication-disabled RPC servers
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436797/6543-bp20.0.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12436807/6543-bp20.1.patch
    Author: Kan Zhang
    Ref: CDH-648

commit 48788d6ab1852ee1f47b56ee6f097022c30ec409
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 23 17:29:15 2010 -0800

    MAPREDUCE-1505. Cluster class should create the rpc client only when needed
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436628/MAPREDUCE-1505_yhadoop20.patch
    Author: Dick King
    Ref: YDH

commit 8a4f25cf0d3631d190ecfa81e17a80bfbb019c69
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 23 16:58:04 2010 -0800

    HADOOP-6549. TestDoAsEffectiveUser should use ip address of the host for superuser ip check
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436794/HADOOP-6549-0_20.1.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit 7bc77f6677512242d19b75c643f21a457087d29a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 23 16:18:31 2010 -0800

    HDFS-786. Implement getContentSummary(..) in HftpFileSystem
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436792/h786_20100223_0.20.patch
    Author: Tsz Wo (Nicholas), SZE
    Ref: YDH

commit ff8f5ea311eb73564805f083a379147fd5aa6d47
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 23 20:28:10 2010 +0000

    HDFS-946. NameNode should not return full path name when listing a directory or getting the status of a file
    
    Patch: http://issues.apache.org/jira/secure/attachment/12436753/HdfsFileStatus-yahoo20.patch
    Patch: http://issues.apache.org/jira/secure/attachment/12436769/HdfsFileStatusProxy-Yahoo20.patch
    Ref: YDH

commit a3fe5640d9cdab14cc7080093b2dbb7933d640d9
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 23 23:04:04 2010 +0530

    MAPREDUCE-1398. TaskLauncher remains stuck on tasks waiting for free nodes even if task is killed.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436724/mr-1398-y20.patch
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit ab177895085d9b5fcfcb1fc695bbd94b753b9160
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 23 22:38:02 2010 +0530

    MAPREDUCE-1476. committer.needsTaskCommit should not be called for a task cleanup attempt
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436722/mr-1476-y20.patch
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit eb3c54506da655c486c06560877a2e30dd3aec3f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 23 06:57:55 2010 +0000

    HADOOP-6467. Performance improvement for liststatus on directories in hadoop archives.
    
    Patch: http://issues.apache.org/jira/secure/attachment/12436653/HADOOP-6467-y.0.20-branch-v2.patch
    Author: Mahadev konar
    Ref: YDH

commit 1ee8f38881d450a818754c889038ef2ea8de865d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 22 23:47:58 2010 +0000

    HADOOP-6558. archive does not work with distcp -update
    
    Patch: http://issues.apache.org/jira/secure/attachment/12436264/c6558_20100216b_y0.20.patch
    Author: Tsz Wo (Nicholas), SZE
    Ref: YDH

commit 5792ec1c6229c000b34fd2cb1f3447ae71ad9949
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 22 15:21:20 2010 -0800

    HADOOP-6583. Capture metrics for authentication/authorization at the RPC layer
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436643/6583-bp20.patch
    Author: Devaraj Das
    Ref: CDH-648

commit f370a6f8f27d6bc813872a89d9dac122dd357b53
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 22 14:35:53 2010 -0800

    HADOOP-6577. IPC server response buffer reset threshold should be configurable
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436399/hadoop-6577.2.rel20.patch (from yahoo-hadoop-0.20 into yahoo-hadoop-0.20.1xx)
    Author: Suresh Srinivas
    Ref: YDH

commit 364fd3118df3fb08ec239306fcc3b1762cb803d0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 22 17:11:21 2010 +0530

    MAPREDUCE-1316. JobTracker holds stale references to retired jobs via unreported tasks
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436563/mapreduce-1316-y20s.patch
    Author: Amar Kamat
    Ref: YDH

commit 2471700c34dece87f611cceaf5b961647ab58700
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 19 15:58:46 2010 -0800

    HADOOP-6551, HDFS-986, MAPREDUCE-1503. Change API for tokens to throw
    exceptions instead of returning booleans.
    
    Author: Owen O'Malley
    Ref: CDH-648

commit 5838b15e3232608c1358887b8910638b1497043f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 19 23:55:25 2010 -0800

    HADOOP-6572. RPC responses may be out-of-order with respect to SASL
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436421/6572-bp20.patch
    Author: Kan Zhang
    Ref: CDH-648

commit 4719ab45ca9a8ae6d8289dddc024854683726b19
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 19 15:11:03 2010 -0800

    HDFS-965. Split the HDFS TestDelegationToken into two tests, one of which
    tests proxy users and the other normal users. (jitendra via omalley)
    
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit a0380f3cae542769bd6861311d0fc709201e9dc3
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 19 14:35:02 2010 -0800

    HADOOP-6332, HDFS-1134, MAPREDUCE-1774. Herriot (system test framework)
    
    Author: Konstantin Boudnik
    Ref: YDH

commit 205a6b6697f7a8934f684e08c187682d0f1d3b2d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 18 22:12:29 2010 +0000

    HADOOP-6560. HarFileSystem throws NPE for har://hdfs-/foo
    
    Patch: http://issues.apache.org/jira/secure/attachment/12436045/c6560_20100212_y0.20.patch
    Author: Tsz Wo (Nicholas), SZE
    Ref: YDH

commit f33ae6567528673de4c4b0de8f40cf2c9ff5741c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 18 12:15:13 2010 +0530

    MAPREDUCE-686. Move TestSpeculativeExecution.Fake* into a separate class so that it can be used by other tests also
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436181/MAPREDUCE-686-y20.patch
    Author: Jothi Padmanabhan
    Ref: YDH

commit dd2ce99bb706ba8e7771b3382de7af687ae8467f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 16 12:51:39 2010 -0800

    HDFS-111. UnderReplicationBlocks should use generic types
    
    Patch: https://issues.apache.org/jira/secure/attachment/12436027/1026-bp20-bugfix.patch
    Author: Devaraj Das
    Ref: YDH

commit 67e25d271e93cc035ed664f9bfa16705f6a45958
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Feb 14 23:34:54 2010 -0800

    HADOOP-6559. The RPC client should try to re-login when it detects that the TGT expired
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435851/h-6559.6.bp20.patch
    Author: Devaraj Das
    Ref: CDH-648

commit 176816d52d875d33877baba51294de5b1868d3aa
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Feb 14 14:50:27 2010 +0530

    HADOOP-2141. speculative execution start up condition based on completion time
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435253/hadoop-2141-yahoo-v1.4.8.patch (only test related changes)
    Author: Andy Konwinski
    Ref: YDH

commit cd035a28f73f373b695b3704243d013508036346
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 11 19:51:52 2010 +0000

    MAPREDUCE-1425. archive throws OutOfMemoryError
    
    Patch: http://issues.apache.org/jira/secure/attachment/12435030/MAPREDUCE-1425_y_0.20.patch
    Author: Mahadev konar
    Ref: YDH

commit 7260de34b087c442e5054410e038f7bc2214e077
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 11 19:35:40 2010 +0000

    MAPREDUCE-1399. The archive command shows a null error message
    
    Patch: http://issues.apache.org/jira/secure/attachment/12435380/m1399_20100205trunk2_y0.20.patch
    Author: Tsz Wo (Nicholas), SZE
    Ref: YDH

commit a60877ba994c36b0d81f7d1c47a81b1111906bd2
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 9 21:08:18 2010 -0800

    HADOOP-6552. KEYTAB_KERBEROS_OPTIONS in UserGroupInformation should have options for automatic renewal of keytab based tickets
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435369/6552.patch
    Author: Devaraj Das
    Ref: CDH-648

commit b96008a997c4cf52f01a32daa103244a27190639
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 9 21:06:07 2010 -0800

    MAPREDUCE-1433. Create a Delegation token for MapReduce
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435412/1433.bp20.patch
    Author: Owen O'Malley
    Ref: CDH-648

commit 29b6749dfdd9038d9f54ef9f0669c5b1fc553463
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 9 01:34:27 2010 -0800

    HADOOP-6547, HDFS-949, MAPREDUCE-1470. Move the Delegation Token feature to common since both HDFS and MapReduce need it
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435271/6547-949-1470-0_20.1.patch
    Author: Devaraj Das
    Ref: CDH-648

commit 542f37d8c93b1cae42aec29789068d27bc2330fb
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 9 12:07:45 2010 +0530

    HADOOP-5879. GzipCodec should read compression level etc from configuration
    
    Patch: http://issues.apache.org/jira/secure/attachment/12435254/hadoop-5879-yahoo-0.20-v1.0.patch
    Author: He Yongqiang
    Ref: YDH

commit 7d2d3c129e0c1358e490d09744ce1b448e430beb
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 9 11:10:54 2010 +0530

    HADOOP-6161. Add get/setEnum to Configuration
    
    Patch: http://issues.apache.org/jira/secure/attachment/12434928/hadoop-6161-yahoo-20-v1.patch
    Author: Chris Douglas
    Ref: YDH

commit ed87d9db29b9375d2c10047bbc3e7d136c76a581
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 8 19:44:20 2010 -0800

    HADOOP-6510, HDFS-935, MAPREDUCE-1464. Add support for a superuser authenticating on behalf of a proxy user.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435223/HADOOP-6510-0_20.4.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit df2da5d463d6a2b09f11c44d1eebbf85fa73ce81
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 8 20:15:15 2010 +0530

    MAPREDUCE-1435. symlinks in cwd of the task are not handled properly after MAPREDUCE-896
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435154/MR-1435-y20s.patch
    Author: Ravi Gummadi
    Ref: CDH-648

commit 4fe06f63ca10d7cc5949615d2d41261782156653
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Feb 7 00:22:29 2010 -0800

    MAPREDUCE-1457. For secure job execution, a couple more UserGroupInformation.doAs calls need to be added
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435115/MAPREDUCE-1457-BPY20.patch.1
    Author: Jakob Homan
    Ref: CDH-648

commit 6b874f7d11e11f14450e882b670180daef48a76e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Feb 6 12:45:49 2010 -0800

    MAPREDUCE-1440. MapReduce should use the short form of the user names
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435087/1440.y20.patch
    Author: Owen O'Malley
    Ref: CDH-648

commit b14b570878b9262f1a77c668b5123345406f8374
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 5 17:29:01 2010 -0800

    HDFS-737. Improvement in metasave output
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435041/HDFS-737.3.rel20.patch
    Author: Jitendra Nath Pandey
    Ref: YDH

commit 5141b5979120da19551385ff1ce13b545266b204
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 5 15:35:16 2010 -0800

    HADOOP-6419. Change RPC layer to support SASL based mutual authentication
    
    Patch: https://issues.apache.org/jira/secure/attachment/12434998/HADOOP-6419-0.20-15.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12435135/6419-bp20-jobsubmitprotocol.patch
    Author: Kan Zhang
    Ref: CDH-648

commit d1f946ae7bfd5e619e8167ebad228be72668b0a9
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 5 15:29:26 2010 -0800

    HADOOP-6538. Set hadoop.security.authentication to "simple" by default
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435031/6538-bp20.patch
    Author: Devaraj Das
    Ref: CDH-648

commit d012efa36429328941a04742bd6febd35d3875ef
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 5 23:06:29 2010 +0000

    HDFS-938. Replace calls to UGI.getUserName() with UGI.getShortUserName()
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435015/HDFS-938-BP20-2.patch
    Author: Jakob Homan
    Ref: CDH-648

commit 4f96064bf4bb838ce0c6d4e99152a02ad9737032
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 5 15:03:57 2010 -0800

    HADOOP-6521. FsPermission:SetUMask not updated to use new-style umask setting.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12434469/hadoop-6521.rel20.1.patch
    Author: Suresh Srinivas
    Ref: YDH

commit ee18c74b284b015975b0df13c750e135ff938fbe
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Feb 5 20:23:24 2010 +0000

    HADOOP-6544. fix ivy settings to include JSON jackson.codehaus.org libs for .20
    
    Patch: https://issues.apache.org/jira/secure/attachment/12435002/contrib.ivy.jackson.patch-3
    Author: Boris Shkolnik
    Reason: contrib build breaks because ivy is not configured to include jackson libs.
    Ref: YDH

commit 18b89be19183117cbe0a567ecb16e8012bc83c48
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Feb 4 18:47:54 2010 -0800

    HDFS-907. Add tests for getBlockLocations and totalLoad metrics.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12434919/HDFS907s.patch
    Author: Ravi Phulari
    Ref: YDH

commit 77510480a8b45b2f9e605b720671d609d7bf4687
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 2 16:25:41 2010 -0800

    HADOOP-6204. Implementing aspects development and fault injection framework for Hadoop
    
    Patch: https://issues.apache.org/jira/secure/attachment/12434616/HADOOP-6204-ydist.patch
    Author: Konstantin Boudnik
    Ref: YDH

commit 993bc455b265d185f74c23ec7ccb272203190298
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Feb 2 10:31:13 2010 -0800

    MAPREDUCE-1432. Add the hooks in JobTracker and TaskTracker to load tokens from the token cache into the user's UGI
    
    Patch: https://issues.apache.org/jira/secure/attachment/12434550/MAPREDUCE-1432-BP20-2.patch
    Author: Devaraj Das
    Ref: CDH-648

commit 32957687aa59ef0d515682d87e521d8aba244e3b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 1 22:52:13 2010 -0800

    MAPREDUCE-1383. Allow storage and caching of delegation token.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12434455/MAPREDUCE-1383-BP20-7.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit ce7c3dc280b3d3ba0b176f0b7a9dc09d5ca163f5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 1 21:26:52 2010 -0800

    HADOOP-6337. Update FilterInitializer class to be more visible and take a conf for further development
    
    Patch: https://issues.apache.org/jira/secure/attachment/12434503/HADOOP-6337-Y.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12434547/HADOOP-6337-Y.patch
    Author: Jakob Homan
    Ref: CDH-648

commit 6fa12363baff6e2e11650c0f395b52e9f47d6266
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Feb 1 10:28:11 2010 -0800

    HADOOP-6520. UGI should load tokens from the environment
    
    Patch: https://issues.apache.org/jira/secure/attachment/12434423/HADOOP-6520-0_20.2.patch
    Author: Devaraj Das
    Ref: CDH-648

commit 256f62f67d848260f25fca2e052fd878b93bfe17
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Jan 31 22:53:34 2010 -0800

    HADOOP-6517, HADOOP-6518. Ability to add/get tokens from UserGroupInformation
    
    Patch: https://issues.apache.org/jira/secure/attachment/12434368/HADOOP-6518-0_20.1.patch
    Author: Owen O'Malley
    Ref: CDH-648

commit 34c6a146be045127f5e43828fd3c578ebb3c113c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 22 16:03:23 2010 -0800

    MAPREDUCE-1376. Support for varied user submission in Gridmix
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431174/M1376-4.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12440324/1376-5-yhadoop20-100.patch
    Author: Chris Douglas
    Ref: YDH

commit e02e4b0320d45fe401470e72d62b73d18a3dd579
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Jan 31 20:00:10 2010 -0800

    HADOOP-6299. Use JAAS LoginContext for our login
    
    Patch: https://issues.apache.org/jira/secure/attachment/12434362/HADOOP-6299-Y20.patch
    Author: Owen O'Malley
    Ref: CDH-648

commit e1ab72fadf1fc48bb21a44c62edd39b3883a392a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 29 00:00:30 2010 +0530

    Amend MAPREDUCE-842. Per-job local data on the TaskTracker node should have right access-control
    
    Reason: follow-up patch to fix a backport bug
    Patch: https://issues.apache.org/jira/secure/attachment/12431690/MR-842-follow-up.patch
    Author: Vinod K V
    Ref: CDH-648

commit 662c95fb0e8b2aaefb64362119aef66a04268eb1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 27 21:58:11 2010 +0530

    MAPREDUCE-1186. While localizing a DistributedCache file, TT sets permissions recursively on the whole base-dir
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431573/1186.20S-6.patch
    Author: Amareshwari Sriramadasu
    Reason: performance
    Ref: YDH

commit 137e608a13a26b47883cea6d03a48974b2f16c68
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 26 23:40:06 2010 -0800

    HDFS-899. Delegation Token Implementation
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431529/HDFS-899-0_20.2.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit db6344ec8a93fb65830f7902e6566a96619ff7dd
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 26 15:10:41 2010 +0530

    MAPREDUCE-896. Users can set non-writable permissions on temporary files for TT and can abuse disk usage.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431413/MR-896.v8-y20.patch
    Author: Ravi Gummadi
    Ref: CDH-648

commit 2aff67e0291b9641d2e17a7288faa694efe16976
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Jan 25 20:36:44 2010 +0530

    MAPREDUCE-744. Support in DistributedCache to share cache files with other users after HADOOP-4493
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431313/744-6-y20.patch
    Author: Devaraj Das
    Ref: CDH-648

commit d1b26621983f80167bf3af5b38ae48467c739f14
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Jan 23 20:27:58 2010 +0530

    MAPREDUCE-1140. Per cache-file refcount can become negative when tasks release distributed-cache files
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431213/patch-1140-3-y20.txt
    Author: Amareshwari Sriramadasu
    Ref: CDH-648

commit 6804e20bd4d9ee5e0005b61d202ce7dd928b5b22
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Jan 23 20:01:51 2010 +0530

    MAPREDUCE-1284. TestLocalizationWithLinuxTaskController fails
    
    Patch: https://issues.apache.org/jira/secure/attachment/12427577/MR-1284.patch
    Author: Ravi Gummadi
    Ref: CDH-648

commit 9e3f0d458c0ac31bad77cc336b6fdf0206fbe0d6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Jan 23 14:00:14 2010 +0530

    MAPREDUCE-1098. Incorrect synchronization in DistributedCache causes TaskTrackers to freeze up during localization of Cache for tasks.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431207/patch-1098-7-y20.txt
    Author: Amareshwari Sriramadasu
    Ref: CDH-648

commit d3131417c36e68cb59ad0833d271d10bd869b27c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 22 17:45:33 2010 -0800

    MAPREDUCE-1338. Add ability to store and load security keys
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431172/MAPREDUCE-1338-BP20-3.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit 478ebff927c0a45f72c531952bcaf7632e990a12
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 22 17:17:08 2010 -0800

    HADOOP-6495. Identifier should be serialized after the password is created In Token constructor
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431145/HADOOP-6495-0_20.2.patch
    Author: Jitendra Nath Pandey
    Ref: CDH-648

commit ef9e572a545e56b790000f16bf6d416b63083520
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 22 16:23:34 2010 +0530

    HADOOP-5457. Failing contrib tests should not stop the rest of the contrib tests
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431103/Hadoop-5457-y20.patch
    Author: Giridharan Kesavan
    Ref: YDH

commit 6b6fdbe4b79d6e623fdbcc60f052749cf99b0c32
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jan 21 12:30:20 2010 -0800

    Amend HADOOP-4181. Add support for git revision in saveVersion.sh
    
    Author: Owen O'Malley
    Reason: Support git revisions without explicitly passing them in
    Ref: YDH

commit 71d9e9a5b3577937fe06b40cebc6656419324323
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jan 21 22:10:58 2010 +0530

    MAPREDUCE-856. Localized files from DistributedCache should have right access-control
    
    Patch: https://issues.apache.org/jira/secure/attachment/12431040/MAPREDUCE-856-20090908-y20.txt
    Author: Vinod K V
    Ref: CDH-648

commit c0826b2e0c43581aa90afff465ddd7401e12b1ee
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 20 16:12:33 2010 +0530

    MAPREDUCE-871. Job/Task local files have incorrect group ownership set by LinuxTaskController binary
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430867/871.20S.patch
    Author: Vinod K V
    Ref: CDH-648

commit 3ca7d8529bcb3cea9640dffaa296c508a07f89a4
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 20 15:28:40 2010 +0530

    MAPREDUCE-476. Extend DistributedCache to work locally (LocalJobRunner)
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430866/476.20S-2.patch
    Author: Philip Zeyliger
    Ref: YDH

commit 5d79da536f9811f47af0e073aa68ca776c24b0da
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 19 20:06:10 2010 +0530

    MAPREDUCE-711. Move Distributed Cache from Common to Map/Reduce
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430713/711.20S.patch
    Author: Vinod K V
    Ref: YDH

commit be2df477d6dc267f4a4b7c6602c8108ece1cb783
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 19 13:30:55 2010 +0530

    MAPREDUCE-478. separate jvm param for mapper and reducer
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430705/478.20S-1.patch
    Author: Arun C Murthy
    Ref: YDH

commit 44df01f8009c02e7346b69389ee8a26ef824bba2
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 19 12:11:07 2010 +0530

    MAPREDUCE-842. Per-job local data on the TaskTracker node should have right access-control
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430697/842.20S-4.patch
    Author: Vinod K V
    Ref: CDH-648

commit 3b9ed4395593a6f67897126126e7c4c74a35c42c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 15 19:47:22 2010 +0530

    MAPREDUCE-408. TestKillSubProcesses fails with assertion failure sometimes
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430404/MR-408.v1.1.y20.patch
    Author: Ravi Gummadi
    Ref: CDH-648

commit fc472723ed6f5ca78abd4bfca56584489c485ee1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 15 19:16:38 2010 +0530

    HADOOP-4041. IsolationRunner does not work as documented
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430398/HADOOP-4041-v4-y20.patch
    Author: Philip Zeyliger
    Ref: YDH

commit c759d3e421565c13c79d6091d1917ce57cbb6636
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 13 22:27:07 2010 -0800

    MAPREDUCE-1316. Fix jobs' retirement from the JobTracker to prevent memory leaks via stale references.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430197/mapreduce-1316-v1.15-branch20-yahoo.patch
    Author: Amar Kamat
    Ref: YDH

commit 10024643cacf3d40faa870505c83dd344f8ff366
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 13 15:28:34 2010 -0800

    MAPREDUCE-1342. Fixed deadlock in global blacklisting of tasktrackers.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430116/patch-1342-3-ydist.txt
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit 675d02da77a6db7b98e8a30afda2926fe768fe3e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 12 15:43:25 2010 -0800

    MAPREDUCE-181. Secure job submission
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430064/181.20.s.3.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12436083/jobclient.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12440358/181.20.s.3.fix.patch
    Author: Devaraj Das
    Ref: CDH-648

commit fda013b025b050107afd17120270ec6e5cb99138
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 12 22:14:37 2010 +0530

    HADOOP-5737. UGI checks in testcases are broken
    
    Patch: https://issues.apache.org/jira/secure/attachment/12430029/HADOOP-5737-y20.patch
    Author: Amar Kamat
    Ref: CDH-648

commit 829bb385ecbde947eeacff39ea2f3f3af703fdbe
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jan 12 12:34:52 2010 +0530

    HADOOP-5771. Create unit test for LinuxTaskController
    
    Patch: https://issues.apache.org/jira/secure/attachment/12429998/5771.20S.patch
    Author: Sreekanth Ramakrishnan
    Ref: CDH-648

commit 0045482b56026f2858bdd983d09ab2a38e06bfa8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jan 8 18:49:32 2010 -0800

    HADOOP-4656, HDFS-685, MAPREDUCE-1083. Add a user to groups mapping service
    
    Patch: https://issues.apache.org/jira/secure/attachment/12429805/MR-1083-0_20.2.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit bff776eae3e576e29b3f48546d620469c7a65a6f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jan 7 11:15:53 2010 -0800

    MAPREDUCE-1250. Refactor job token to use a common token interface
    
    Patch: https://issues.apache.org/jira/secure/attachment/12429629/MR-1250-0_20.2.patch
    Author: Kan Zhang
    Ref: CDH-648

commit f9bf7f1aa9a663f09e3377671722e4bce0fa5f20
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jan 6 13:53:05 2010 -0800

    MAPREDUCE-1026. Shuffle should be secure
    
    Patch: https://issues.apache.org/jira/secure/attachment/12429584/MR-1026-0_20.2.patch
    Author: Boris Shkolnik
    Ref: CDH-648

commit 8761586736722c1c6f2eb3c7ad7d1842383431b6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Jan 4 17:54:04 2010 -0800

    HADOOP-4268. Permission checking in fsck
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428975/HADOOP-4268-0_20.2.patch
    Author: Tsz Wo (Nicholas), SZE
    Ref: CDH-648

commit ac18b1312c05f9d85b23686807c9b0120f99eac0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Jan 4 14:50:55 2010 -0800

    HADOOP-6415. Adding a common token interface for both job token and delegation token
    
    Patch: https://issues.apache.org/jira/secure/attachment/12429399/HADOOP-6415-0_20.2.patch
    Author: Kan Zhang
    Ref: CDH-648

commit dca4c3a73ff34fb37ff2b92c7d6ba2331cd1405d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Dec 25 13:56:07 2009 -0800

    HDFS-764 and HADOOP-6367. Moving Access Token implementation from Common to HDFS
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428959/HADOOP-6367_HDFS-764-0_20.1.patch
    Author: Kan Zhang
    Ref: CDH-648

commit 6205b10e6bbd702cab014af83061791b35a2248a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Dec 24 11:50:12 2009 -0800

    HDFS-409. Add more access token tests
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428924/HDFS-409-0_20.4.patch
    Author: Kan Zhang
    Ref: CDH-648

commit c1c67ca1ab0d958c2258c8c1571adb89996f684a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Dec 24 11:46:03 2009 -0800

    HADOOP-6132. RPC client opens an extra connection for VersionedProtocol
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428925/HADOOP-6132-0_20.1.patch
    Author: Kan Zhang
    Ref: YDH

commit 07ba75bcabd5beeecd41c0c9f54b850304ad9225
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Dec 23 16:39:18 2009 -0800

    Amend HDFS-445. Bring DFSClient block caching code more up to date with trunk
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428885/HDFS-445-0_20.2.patch
    Author: Kan Zhang
    Ref: YDH

commit ad1f19c5727c6457a4e37c26cc1ede0dae3b76ec
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 22 18:05:31 2009 -0800

    HDFS-195. Need to handle access token expiration when re-establishing the pipeline for dfs write
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428788/HDFS-195-0_20.1.patch
    Author: Kan Zhang
    Ref: CDH-648

commit 53782d128507af30429e8c697788aa14fa9849c8
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 22 14:52:12 2009 -0800

    HADOOP-6176. Adding a couple private methods to AccessTokenHandler for testing purposes
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428771/HADOOP-6176-0_20.2.patch
    Author: Kan Zhang
    Ref: CDH-648

commit 97b3aed79705201bfe0bea392ca19e3fc96cd81e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 22 12:26:46 2009 -0800

    HADOOP-5824. Remove unused OP_READ_METADATA functionality from Datanode
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428759/HADOOP-5824-0_20.1.patch
    Author: Kan Zhang
    Ref: YDH

commit 459d8d98b1ad0675b0e1525dfa23e445e1f82453
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 22 00:16:39 2009 -0800

    HADOOP-4359. Access Token: Support for data access authorization checking on DataNodes
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428711/HADOOP-4359-0_20.2.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12435352/4359.patch
    Author: Kan Zhang
    Ref: CDH-648

commit b76311584ce48bdc06c6c1103e45cbc2e2cc9112
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Dec 16 23:46:27 2009 +0530

    MAPREDUCE-1100. Truncate user logs to prevent TaskTrackers' disks from filling up.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428200/MAPREDUCE-1100-20091216.2.txt
    Author: Vinod K V
    Ref: YDH

commit 444beac7f8610cd3ec9433c8fe5e006462a2d07c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 15 21:49:36 2009 -0800

    HADOOP-6441. Prevent remote XSS attacks in Hostname and UTF-7.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12428133/h-6441.20.patch
    Author: Owen O'Malley
    Ref: CDH-648

commit 18ec4a074b50ad5a7d8b3148da40d58ed0baf768
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 15 20:37:18 2009 -0800

    MAPREDUCE-1063. Document Gridmix benchmark
    
    Patch: https://issues.apache.org/jira/secure/attachment/12427976/M1063-y20-0.patch
    Author: Chris Douglas
    Ref: YDH

commit 2b04facf5a226b592874137356929aca62320648
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 15 20:19:19 2009 -0800

    MAPREDUCE-1124. TestGridmixSubmission fails sometimes
    
    Patch: https://issues.apache.org/jira/secure/attachment/12427971/M1124-y20-1.patch
    Author: Chris Douglas
    Ref: YDH

commit 4e30197fdddc64b48c0a3fb4575cdea4e5eaaf9b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 15 18:52:33 2009 +0530

    MAPREDUCE-1143. runningMapTasks counter is not properly decremented in case of failed Tasks.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12427898/MAPRED-1143-ydist-9.patch
    Author: rahul k singh
    Ref: YDH

commit e3a25294b2faf7d57fbf69b75060611362a35463
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 15 15:41:45 2009 +0530

    MAPREDUCE-676. Fix Hadoop Vaidya to ensure it works for map-only jobs.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12410257/vaidya-patch-06092009.patch
    Author: Suhas Gogate
    Ref: YDH

commit 600a35b4e22088859c7b12ece7fa67dbfe489c2b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 15 15:38:41 2009 +0530

    HADOOP-5582. Fix Hadoop Vaidya to use new Counters in org.apache.hadoop.mapreduce package. Contributed by Suhas Gogate.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12407120/vaidya-0.21.0-5582-5764.patch
    Author: Suhas Gogate
    Ref: YDH

commit 4ba755db0d9eae199355905193c424d1a8a78dae
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Dec 14 19:45:50 2009 -0800

    HDFS-595. FsPermission tests need to be updated for new octal configuration parameter from HADOOP-6234
    
    Patch: https://issues.apache.org/jira/secure/attachment/12427977/HDFS-595-Y20.patch
    Author: Jakob Homan
    Ref: YDH

commit 0dbd09e5e1a6f3eaa76ad7a54815d03707da5a27
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Dec 11 13:31:45 2009 +0530

    MAPREDUCE-1171. Allow the read-error notification in shuffle to be configurable.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12427571/patch-1171-1-ydist.txt
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit 29334f33eee06ef864f1fed490da870050e5c7ff
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Dec 11 09:13:53 2009 +0530

    MAPREDUCE-353. Allow shuffle read and connection timeouts to be configurable.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12427566/patch-353-ydist.txt
    Author: Ravi Gummadi
    Ref: YDH

commit a529046a05bfd89965d08b1ec6d80c1a777a8136
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Dec 9 10:23:53 2009 +0530

    MAPREDUCE-754. NPE in expiry thread when a TT is lost
    
    Patch: https://issues.apache.org/jira/secure/attachment/12427347/mapreduce-754-v2.2.1-yahoo.patch
    Author: Amar Kamat
    Ref: YDH

commit 96ee0a0a723e65ca3dbcbbdaece44e3a752256f0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Dec 8 11:27:55 2009 +0530

    MAPREDUCE-1185. URL to JT webconsole for running job and job history should be the same
    
    Patch: https://issues.apache.org/jira/secure/attachment/12426630/patch-1185-3-ydist.txt
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit 578be5dcfdece1f48aae8809648ae00f646bb040
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Dec 4 15:30:39 2009 -0800

    HDFS-781. Metrics PendingDeletionBlocks is not decremented
    
    Patch: https://issues.apache.org/jira/secure/attachment/12426993/hdfs-781.rel20.1.patch
    Author: Suresh Srinivas
    Ref: YDH

commit b350799d72e4a4bec1f76527eb8fe02590295785
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Nov 30 16:10:53 2009 +0530

    HADOOP-4933. ConcurrentModificationException in JobHistory.java
    
    Patch: http://issues.apache.org/jira/secure/attachment/12397116/HADOOP-4933-v1.1.patch
    Author: Amar Kamat
    Ref: YDH

commit 9fa324d4ff152e41e2afada5990d26b0ba296e17
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Nov 27 11:38:52 2009 +0530

    MAPREDUCE-1231. Allow distcp checksumming to be skipped for faster startup time
    
    Patch: https://issues.apache.org/jira/secure/attachment/12426265/mapred-1231-y20-v4.patch
    Author: Jothi Padmanabhan
    Ref: YDH

commit becc6bade8b0d4ef4248cd82da7e7d337bc10cbc
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Nov 24 11:42:17 2009 -0800

    HDFS-758. Changes to report decommissioning status on namenode web UI.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12426000/HDFS-758.5.0-20.patch
    Author: Jitendra Nath Pandey
    Ref: YDH

commit 9d9e86678faa54e23c3ac41c1b8fdb6b379e9b5d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Nov 20 17:01:21 2009 -0800

    HADOOP-6234. Permission configuration files should use octal and symbolic
    
    Patch: https://issues.apache.org/jira/secure/attachment/12425635/COMMON-6234.rel20.1.patch
    Author: Jakob Homan
    Ref: YDH

commit c446d2df912c744705ecc72bf98f424973eb0817
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Nov 19 11:52:38 2009 -0800

    MAPREDUCE-1219. Fixed JobTracker to not collect per-job metrics, thus easing load on it.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12425302/patch-1219-ydist.txt
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit f6b78b61fda941b83973de1dceebf0549c9eaca9
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Nov 17 23:22:41 2009 +0000

    HADOOP-6203. Improve error message when moving to trash fails due to quota issue
    
    Patch: https://issues.apache.org/jira/secure/attachment/12425243/c6203_20091116_0.20.patch
    Author: Boris Shkolnik
    Ref: YDH

commit ec1f09b887a729a7682047248c624a42584d7233
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Nov 16 19:35:16 2009 +0000

    HADOOP-5675. DistCp should not launch a job if it is not necessary
    
    Patch: https://issues.apache.org/jira/secure/attachment/12406687/5675_20090428.patch
    Author: Tsz Wo (Nicholas), SZE
    Ref: YDH

commit f7e4f728e818e137066caaa1f0a277a5485a5080
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Nov 9 16:24:12 2009 -0800

    MAPREDUCE-1196. MAPREDUCE-947 incompatibly changed FileOutputCommitter
    
    Patch: https://issues.apache.org/jira/secure/attachment/12424351/MAPREDUCE-1196_yhadoop20.patch
    Author: Arun C Murthy
    Ref: YDH

commit 03a613af4dafb3212cf52833898f00c2b1f6195d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Nov 5 17:31:56 2009 -0800

    HDFS-625. ListPathsServlet throws NullPointerException
    
    Patch: https://issues.apache.org/jira/secure/attachment/12424176/hdfs-625.0-20.patch
    Author: Suresh Srinivas
    Ref: YDH

commit 6c2dc76b06cb9967d89e6ab94465e0668b921dfa
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Nov 5 14:46:23 2009 -0800

    HADOOP-6343. Stack trace of any runtime exceptions should be recorded in the server logs.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12424150/HADOOP-6343.0-20.patch
    Author: Jitendra Nath Pandey
    Ref: YDH

commit 3bd620eff0f312008d11e29dffea2fa62457a630
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 29 19:11:25 2009 -0700

    HADOOP-6344. rm and rmr fail to correctly move the user's files to the trash prior to deleting when they are over quota.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12423634/HDFS-740-for-Y20.patch
    Author: Jakob Homan
    Ref: YDH

commit ef18bb354fc9cd2b0f99bc141e094d328f4f1f14
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 29 10:19:27 2009 +0530

    MAPREDUCE-1160. Two log statements at INFO level fill up jobtracker logs
    
    Patch: https://issues.apache.org/jira/secure/attachment/12423534/MAPREDUCE-1160-20.patch
    Author: Ravi Gummadi
    Ref: YDH

commit 93ea3b6b97dd23cc0a69eacd2438f11e8a64be54
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Oct 28 21:12:36 2009 +0530

    MAPREDUCE-1158. running_maps metric is not decremented when the tasks of a job are killed/failed
    
    Patch: https://issues.apache.org/jira/secure/attachment/12423451/1158_yahoo.patch
    Author: Sharad Agarwal
    Ref: YDH

commit 5165b65008a62a834903f46db18b991dfe2aeacf
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Oct 26 09:21:56 2009 +0530

    MAPREDUCE-1062. MRReliability test does not work with retired jobs
    
    Patch: https://issues.apache.org/jira/secure/attachment/12422201/mapreduce-1062-3-ydist.patch
    Author: Sreekanth Ramakrishnan
    Ref: YDH

commit 6ae254e4aef44f833859bb060797bd3177085d4e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Oct 25 17:41:49 2009 +0530

    MAPREDUCE-1090. Modify log statement in Tasktracker log related to memory monitoring to include attempt id.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12423142/MAPREDUCE-1090-20.patch
    Author: Hemanth Yamijala
    Ref: YDH

commit 07d06691d3d22dc7055568b8ca574ff264faf6ac
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Oct 25 16:05:47 2009 +0530

    MAPREDUCE-1048. Show total slot usage in cluster summary on jobtracker webui
    
    Patch: http://issues.apache.org/jira/secure/attachment/12423136/MAPREDUCE-1048-20.patch
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit 2041bfbf1352a83f450b8fb6680e3899a3582f5f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Oct 24 17:42:59 2009 +0530

    MAPREDUCE-1103. Additional JobTracker metrics for slot usage
    
    Patch: https://issues.apache.org/jira/secure/attachment/12423030/1103_v5_yahoo_1.patch
    Author: Sharad Agarwal
    Ref: YDH

commit b68a6a3c45b48d7681f5e8dc51571b161a90daec
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 22 20:31:26 2009 +0530

    MAPREDUCE-947. OutputCommitter should have an abortJob method
    
    Patch: https://issues.apache.org/jira/secure/attachment/12422899/mr-947-y20.patch
    Patch: https://issues.apache.org/jira/secure/attachment/12423191/yhadoop20-bug-fix-947.patch
    Author: Amar Kamat
    Ref: YDH

commit a268e40988a356cb7d6912906b88c8752c226656
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Oct 21 23:06:57 2009 +0530

    MAPREDUCE-1105. CapacityScheduler: It should be possible to set queue hard-limit beyond its actual capacity
    
    Patch: https://issues.apache.org/jira/secure/attachment/12422823/MAPREDUCE-1105-yahoo-version20-5.patch
    Author: rahul k singh
    Ref: YDH

commit e529fcd5080e89bc0e759164d7b7cc6fc19d8f69
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Oct 21 11:32:59 2009 +0530

    MAPREDUCE-1086. hadoop commands in streaming tasks are trying to write to tasktracker's log
    
    Patch: https://issues.apache.org/jira/secure/attachment/12422677/MR-1086-yhadoop20.patch
    Author: Ravi Gummadi
    Ref: YDH

commit 4d2f9fdf63f30f0149f60142796838e245e7d564
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Oct 18 23:21:37 2009 -0700

    MAPREDUCE-1088. JobHistory files should have narrower 0600 perms
    
    Patch: https://issues.apache.org/jira/secure/attachment/12422526/MAPREDUCE-1088_yhadoop20.patch
    Author: Arun C Murthy
    Ref: CDH-648

commit 13e93cafe8d4b1e8b741c1873118cdba0313a564
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Oct 18 23:19:27 2009 -0700

    HADOOP-6304. Use java.io.File.set{Readable|Writable|Executable} where possible in RawLocalFileSystem
    
    Patch: https://issues.apache.org/jira/secure/attachment/12422525/HADOOP-6304_yhadoop20.patch
    Author: Arun C Murthy
    Ref: YDH

commit e5b918e037e5a01a4098b43a20e3437b34022328
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 15 16:26:26 2009 +0530

    HADOOP-6284. Add new HADOOP_JAVA_PLATFORM_OPTS passed to the java PlatformName command
    
    Patch: http://issues.apache.org/jira/secure/attachment/12421342/HADOOP-6284-y0.20.1.patch
    Author: Koji Noguchi
    Ref: YDH

commit 5b18d7b8a13873ca3b0cb3f5da074f3ee846e63c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 15 16:05:41 2009 +0530

    MAPREDUCE-732. Node health check script should not log "UNHEALTHY" status for every heartbeat in INFO mode
    
    Patch: http://issues.apache.org/jira/secure/attachment/12413001/MAPRED-732-ydist.patch
    Author: Sreekanth Ramakrishnan
    Ref: YDH

commit f2f02dce3f12d9fe445f62c6a28a7e89c1f33efa
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 15 15:47:51 2009 +0530

    MAPREDUCE-144. TaskMemoryManager should log process-tree's status while killing tasks.
    
    Patch: http://issues.apache.org/jira/secure/attachment/12418917/MAPREDUCE-144-20090907.internal.txt
    Author: Vinod K V
    Reason: This helps a lot in debugging why a particular task has gone beyond memory limits.
    Ref: YDH

commit a77ecd569495efc6bee0059eeafeebbfe6c797c4
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 15 11:37:22 2009 +0530

    MAPREDUCE-277. Job history counters should be available on the UI.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12421419/patch-277-0.20.txt
    Author: Jothi Padmanabhan
    Ref: YDH

commit ce8f674e1c1dc6ff24b33210f726ef4b006552b2
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Oct 8 17:00:59 2009 -0700

    HDFS-587. Test programs support only default queue.
    
    Patch: http://issues.apache.org/jira/secure/attachment/12422760/jira.HDFS-587.branch-0.20-internal.1.patch
    Author: Erik Steffl
    Ref: YDH

commit 5ef23c2c9af26024567584fb6308645c58db8088
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Sep 28 14:28:46 2009 -0700

    MAPREDUCE-270. Fix the tasktracker to optionally send an out-of-band heartbeat on task-completion for better job-latency.
    
    Configuration changes: add mapreduce.tasktracker.outofband.heartbeat
    Patch: https://issues.apache.org/jira/secure/attachment/12420718/MAPREDUCE-270_yhadoop20.patch
    Author: Arun C Murthy
    Reason: increase scheduling throughput for short tasks
    Ref: YDH

commit bfa424ff3808e6dc20199ecc7d52f2592afdbd3a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Sep 28 13:59:32 2009 -0700

    MAPREDUCE-1030. Fix capacity-scheduler to assign a map and a reduce task per-heartbeat.
    
    Patch: http://issues.apache.org/jira/secure/attachment/12420549/MAPREDUCE-1030-2.patch.txt
    Author: rahul k singh
    Ref: YDH

commit 48bfdd9b2a6eac72ac42b0defe5e86501001a7ab
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Sep 28 13:54:07 2009 -0700

    MAPREDUCE-1028. Fixed number of slots occupied by cleanup tasks to one irrespective of slot size for the job.
    
    Patch: http://issues.apache.org/jira/secure/attachment/12420581/yhadoop-0.20-MR1028.patch
    Author: Ravi Gummadi
    Ref: YDH

commit 3ce342baafd3774e4d920a7fcb49a7e091a0cad1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Sep 28 13:36:31 2009 -0700

    MAPREDUCE-964. Fixed start and finish times of TaskStatus to be consistent, thereby fixing inconsistencies in metering tasks.
    
    Patch: http://issues.apache.org/jira/secure/attachment/12420539/mapreduce-964-ydist.patch
    Patch: http://issues.apache.org/jira/secure/attachment/12420893/mapreduce-964-ydist-1.patch
    Author: Sreekanth Ramakrishnan
    Ref: YDH

commit 2219e76392d0bf29d8c40bf2b60d23d7b188ac3d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Sep 24 16:08:33 2009 -0700

    HADOOP-5976. Add a new command, classpath, to the hadoop script. Contributed by Owen O'Malley
    
    Patch: http://issues.apache.org/jira/secure/attachment/12420325/script.patch
    Author: Owen O'Malley and Gary Murry
    Ref: YDH

commit 1e8994a568a45a994ef7c2af354ac1ddc2c1586b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Sep 24 16:07:40 2009 -0700

    HADOOP-5784. Makes the number of heartbeats that should arrive a second at the JobTracker configurable.
    
    Patch: http://issues.apache.org/jira/secure/attachment/12420257/HADOOP-5784_yhadoop20.patch
    Author: Amareshwari Sriramadasu
    Reason: Improve job latency on small clusters
    Ref: YDH

commit eac5f2a5d51414c8fea6ea9792e21cf85433d017
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Sep 24 16:06:26 2009 -0700

    MAPREDUCE-945. Modifies MRBench and TestMapRed to use ToolRunner so that options such as queue name can be passed via command line.
    
    Patch: http://issues.apache.org/jira/secure/attachment/12418910/mapreduce-945-internal-3.8.patch.txt
    Author: Sreekanth Ramakrishnan
    Ref: YDH

commit 073f548e560fd8de055d8d075ac7c5db0239f6cf
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Sep 3 11:25:54 2009 -0700

    HADOOP-6227. Configuration does not lock parameters marked final if they have no value.
    
    Patch: http://issues.apache.org/jira/secure/attachment/12418242/patch-6227-ydist.txt
    Author: Amareshwari Sriramadasu
    Ref: YDH

commit fe36ce2d38b60a1fe1541555e172cc05473debec
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Sep 3 10:54:21 2009 -0700

    Amend HADOOP-5363. Removed pickOneAddress function.
    
    Author: zhiyong zhang
    Ref: YDH

commit e37a57a386abb6f03336097f2d7b0d54d4ec6a82
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon Aug 31 10:23:19 2009 -0700

    HADOOP-5780: Fix slightly confusing log from "-metaSave" on NameNode.
    
    Patch: https://issues.apache.org/jira/secure/attachment/12417831/HADOOP-5780.hadoop-0.20.patch
    Author: Raghu Angadi
    Ref: YDH

commit b131d77cefee39b7296530b018d59ca4d1516b01
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Aug 25 09:33:35 2009 -0700

    Amend MAPREDUCE-768. Improved version of JobTracker configuration dump that also dumps job queues
    
    Author: V.V.Chaitanya Krishna
    Ref: YDH

commit c602e3c58dab89470526d912f32ca05260a18e8c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Aug 18 09:16:07 2009 -0700

    MAPREDUCE-682. Reserved tasktrackers should be removed when a node is globally blacklisted
    
    Patch: http://issues.apache.org/jira/secure/attachment/12414313/mapreduce-682-ydist.patch
    Author: Sreekanth Ramakrishnan
    Ref: YDH

commit 953a6498484ee51bf09691568ecb5e56cdb31034
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Aug 18 09:14:36 2009 -0700

    HADOOP-5420.  Support killing of process groups in LinuxTaskController binary
    
    Author: Sreekanth Ramakrishnan
    Ref: YDH

commit e2a79393fa3a9f88029f289e89831c5dcbd7274c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Aug 18 09:12:41 2009 -0700

    HADOOP-5488. HADOOP-2721 doesn't clean up descendant processes of a jvm
    that exits cleanly after running a task successfully
    
    Author: Ravi Gummadi
    Reason: Avoid zombie processes
    Ref: YDH

commit 000ab92d9544f483b3c59c0c100154badf8fd8a6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Aug 18 09:07:19 2009 -0700

    MAPREDUCE-467. Collect information about number of tasks succeeded / total per time unit for a tasktracker.
    
    Author: Sharad Agarwal
    Reason: Useful operational feature
    Ref: YDH

commit ef2406bed6475cd6665f3601e9d78972beed739f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Aug 13 09:35:35 2009 -0700

    MAPREDUCE-817. Add a cache for retired jobs with minimal job info and provide a way to access history file url
    
    Author: Sharad Agarwal
    Reason: Reduces memory usage of JT for completed jobs
    Ref: YDH

commit ff22ad890d9228399e36846f308dd42f96c49fde
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:49 2009 -0700

    MAPREDUCE-809. Job summary logs from MAPREDUCE-740 show status of completed jobs as RUNNING
    
    Author: Arun C Murthy
    Reason: Bug fix for MAPREDUCE-740
    Ref: YDH

commit 9ed072be95517e09cbc78333abbc3d5129e2db7d
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:48 2009 -0700

    MAPREDUCE-740. Log a job-summary at the end of a job, while allowing it to be
    configured to use a custom appender if desired.
    
    Author: Arun C Murthy
    Ref: YDH

commit cdd93ee3bca2b400f7b193c5b6527705262c4769
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:47 2009 -0700

    MAPREDUCE-771. Setup and cleanup tasks remain in UNASSIGNED state for a long
    time on tasktrackers with long running high RAM tasks.
    
    Author: Hemanth Yamijala
    Reason: Bug fix
    Ref: YDH

commit e53741132f4e458382899f5181e4c3a45a199113
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:47 2009 -0700

    MAPREDUCE-733. When running ant test TestTrackerBlacklistAcrossJobs, losing task tracker heartbeat.
    
    Author: Arun C Murthy
    Reason: Bug fix
    Ref: YDH

commit ce660087bdc95831ee5d2d18621bbdafb2c7e3fb
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:46 2009 -0700

    MAPREDUCE-734. ConcurrentModificationException observed in unreserving slots for HiRam Jobs
    
    Author: Arun Murthy
    Ref: YDH

commit a44f3f66cbc30bf5493aa6a3d21c3b6ca42fbac6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:44 2009 -0700

    MAPREDUCE-693. Conf files not moved to "done" subdirectory after JT restart
    
    Author: Amar Kamat
    Reason: Improves stability of JobTracker job recovery
    Ref: YDH

commit 9e729a1e4afd7f691dfd86f38cb89788e8eeee00
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:44 2009 -0700

    MAPREDUCE-722. More slots are getting reserved for HiRAM job tasks than required
    
    Author: Vinod K V
    Reason: More slots were getting reserved for HiRAM job tasks than required
    Ref: YDH

commit 45605c6b29c206b9ed3ec2324f4f709c914ca1e3
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:42 2009 -0700

    MAPREDUCE-709. Node health check script does not display the correct message on timeout
    
    Author: Sreekanth Ramakrishnan
    Reason: Improve usefulness of health check feature
    Ref: YDH

commit 5c24b7d50ba0960f694bce33332e61fe7c5abe68
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:41 2009 -0700

    MAPREDUCE-732. Removed spurious log statements in the node blacklisting logic.
    
    Author: Sreekanth Ramakrishnan
    Ref: YDH

commit 6b1a17e13ddaf20b519eba0b49d4b0e8717bd5b9
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:40 2009 -0700

    MAPREDUCE-522. Rewrite TestQueueCapacities to make it simpler and avoid timeout errors
    
    Author: Sreekanth Ramakrishnan
    Reason: Fix unit test failures
    Ref: YDH

commit 73597dcbf6f791bd6e01c3096d41fe65ddc2034c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:40 2009 -0700

    MAPREDUCE-532. Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of a queue
    
    Reason: There should be a mechanism to cap the capacity available for a queue/job.
    Author: Rahul K Singh
    Ref: YDH

commit aea5743326793c6f5aa6dc7f7fc5baf5752528d9
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:38 2009 -0700

    MAPREDUCE-211. Provide a node health check script and run it periodically to check the node health status
    
    Reason: Adds ability to preemptively blacklist task-trackers when node health is bad
    Author: Sreekanth Ramakrishnan
    Ref: YDH

commit a89847e2c69619eff9ced8b86c81bfab321a9918
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:34 2009 -0700

    MAPREDUCE-516. Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs
    
    Reason: When a HighRAMJob turns up at the head of the queue, the current implementation
            of support for HighRAMJobs in the Capacity Scheduler has a problem in that the
            scheduler stops assigning tasks to all TaskTrackers in the cluster until the
            HighRAMJob finds suitable TaskTrackers for all its tasks.
    Author: Arun C Murthy
    Ref: YDH

commit 7a6862110776544476ac1066e3dbade4d1456567
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:31 2009 -0700

    HADOOP-5980. LD_LIBRARY_PATH not passed to tasks spawned off by LinuxTaskController
    
    Reason: Security
    Author: Sreekanth Ramakrishnan
    Ref: CDH-648

commit d37609510f33ad26bbe6bf3c3d235b34b804f93a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:28 2009 -0700

    HADOOP-5420. Support killing of process groups in LinuxTaskController binary
    
    Reason: Security - prevent orphaning forked child processes
    Author: Sreekanth Ramakrishnan
    Ref: CDH-648

commit 4c3c667f54a058d0f2e746ceb2e744f56dd9515a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:24 2009 -0700

    HADOOP-5801. JobTracker should refresh the hosts list upon recovery
    
    Reason: YDH
    Author: Amar Kamat
    Ref: YDH

commit 91d28f32f9db514661cc9bd755c8e85756c09cfc
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:26 2009 -0700

    HADOOP-5818. Revert the renaming from checkSuperuserPrivilege to checkAccess by HADOOP-5643
    
    Author: Amar Kamat
    Ref: YDH

commit b46f960ff5488b6d6ace47e127257eb1b0fbc330
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:23 2009 -0700

    HADOOP-5643. Add ability to blacklist a TaskTracker
    
    Author: Amar Kamat
    Ref: YDH

commit ebb508c5a286dc3939d960fbf44ca18b34f1c12f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:22 2009 -0700

    HADOOP-5419. Provide a way for users to find out what operations they can do on which M/R queues
    
    Reason: Security
    Author: Rahul K Singh
    Ref: CDH-648

commit feb0e489f3e9757db541ea1694fe49f902e93f8c
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:17 2009 -0700

    HADOOP-5739 / MAPREDUCE-521. After JobTracker restart Capacity Scheduler does not schedule pending tasks from already running tasks.
    
    Reason: YDH
    Author: Rahul K Singh
    Ref: YDH

commit 32bac3250a29cc47985fc88edadf0844d2519045
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:17 2009 -0700

    HADOOP-5396. Queue ACLs should be refreshed without requiring a restart of the Job Tracker
    
    Reason: Security
    Author: Vinod K V
    Ref: CDH-648

commit cd043f04714cf1a9940fe4351d0919011f8e9f86
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:15 2009 -0700

    HADOOP-4490. Tasks should run as the user who submitted the job
    
    Reason: Security
    Author: Hemanth Yamijala
    Ref: CDH-648

commit c64b6a0deb1311e410f01e5d94b9498795cbbaef
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jul 30 17:40:11 2009 -0700

    HADOOP-4930. Implement setuid executable for Linux to launch tasks as job owners
    
    Reason: Security
    Author: Sreekanth Ramakrishnan
    Ref: CDH-648

commit 5b5972174da804fb6dcb4d0723208bfa42366a31
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Aug 24 16:18:50 2010 -0700

    CLOUDERA-BUILD. Revert scribe log4j.
    
    Ref: CDH-742

commit 2a37b553b8f446f03cb3610b2f7a84f54064f812
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Aug 24 14:01:25 2010 -0700

    CLOUDERA-BUILD. Revert scribe log4j.
    
    Revert "CLOUDERA-BUILD. Apply Scribe patches to Hadoop"
    
    This reverts commit cb7a3677942c1d2f9e0d2a75dbffa09fa6125e61.
    
    Conflicts:
    
    	src/contrib/scribe-log4j/ivy.xml
    
    Ref: CDH-742

commit ea2a876095da80eccebca35890f437307843eb2c
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Aug 24 13:55:39 2010 -0700

    CLOUDERA-BUILD. Revert scribe log4j.
    
    Revert "CLOUDERA-BUILD. Add dependency libraries for Scribe/log4j"
    
    This reverts commit aaeb69f8dda72a2e7aecacd622e99c00bc961efa.
    
    Ref: CDH-742

commit e463bba27fcae3ea83a8d33a64a8c1c38c2a7578
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Aug 24 13:41:13 2010 -0700

    CLOUDERA-BUILD. Revert scribe log4j.
    
    Revert "CLOUDERA-BUILD. Fix scribe-log4j's ivy.xml to properly get log4j on the compile classpath"
    
    This reverts commit 349281bfa0243f5adbbd459266f4a9ac7ac8c1cc.
    
    Ref: CDH-742

commit c912024353450a0fa2c53a95500b4ed653f76129
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Aug 24 22:53:23 2010 -0700

    MAPREDUCE-118. Job.getJobID() will always return null.
    
    Reason: Bug
    Author: Amareshwari Sriramadasu
    Ref: DISTRO-20

commit be7cd3b5cec66c22b58caa8053de4258826e7c08
Author: Eli Collins <eli@cloudera.com>
Date:   Wed Aug 11 15:07:00 2010 -0700

    CLOUDERA-BUILD. Update the default build version.

commit 506dc096fcc4a288fc853dfb527d7fa8888dd6f6
Author: Bruno Mahé <bruno@cloudera.com>
Date:   Fri Jul 16 19:51:45 2010 -0700

    CDH-1085. $SYSTEM_LIB_DIR default value shouldn't contain $PREFIX.
    
    Description: The default value of $SYSTEM_LIB_DIR shouldn't contain $PREFIX,
    because $PREFIX is prepended later on.
    Reason: Bug
    Author: Bruno Mahe
    Ref: CDH-1085

commit b7cba5f7ab2cb9f2240b45dd90c34f4974c5757a
Author: Bruno Mahé <bruno@cloudera.com>
Date:   Mon Jul 12 20:17:48 2010 -0700

    CDH-1085. Native libraries should be installed in /usr/lib64/ on 64bit redhat
    
    Description: On 64-bit Red Hat, native libraries should be installed in /usr/lib64/ instead of
    /usr/lib/. This patch makes it possible to override the destination of native libraries, which
    still defaults to /usr/lib/.
    Reason: Bug
    Author: Bruno Mahe
    Ref: CDH-1085

commit 9b72d268a0b590b4fd7d13aca17c1c453f8bc957
Author: Eli Collins <eli@cloudera.com>
Date:   Sun Jun 27 18:42:45 2010 -0700

    CLOUDERA-BUILD. Make symlinks so old hadoop jar names are preserved (CDH-1543).

commit 4c50269dda2038d202ddb890ffde38dc3fb2ead2
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Thu Jun 24 18:25:09 2010 -0700

    MAPREDUCE-1887. MRAsyncDiskService does not properly absolutize volume root paths.
    
    Description: In MRAsyncDiskService, volume names are sometimes specified as
    relative paths, which are not converted to absolute paths. This can cause
    errors of the form "cannot delete </full/path/to/foo> since it is outside of
    <relative/volume/root>" even though the actual path is inside the root.
    Reason: Bug
    Author: Aaron Kimball
    Ref: CDH-1509
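
The failure mode described above can be sketched as follows. This is a minimal
illustration, not the MRAsyncDiskService code: it uses java.nio.file.Path as a
stand-in for Hadoop's own path type, and the class and method names are
hypothetical. It shows that a containment check only works once both the
volume root and the target have been made absolute.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch; java.nio.file.Path stands in for Hadoop's path type.
final class VolumeRootSketch {
    /** Returns true if target lies under volumeRoot, after absolutizing both. */
    static boolean isInsideVolume(Path target, Path volumeRoot) {
        // A relative root never prefixes an absolute target, so convert both
        // to absolute, normalized form before comparing.
        Path absoluteRoot = volumeRoot.toAbsolutePath().normalize();
        Path absoluteTarget = target.toAbsolutePath().normalize();
        return absoluteTarget.startsWith(absoluteRoot);
    }

    public static void main(String[] args) {
        Path relativeRoot = Paths.get("data", "vol");          // relative volume root
        Path target = relativeRoot.toAbsolutePath().resolve("foo");

        // Naive check: absolute target against relative root.
        System.out.println("naive check:       " + target.startsWith(relativeRoot));
        // Check after absolutizing the root.
        System.out.println("absolutized check: " + isInsideVolume(target, relativeRoot));
    }
}
```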

commit 43ccf90369692c4d8b7d13a7f04b0864c55f615a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jun 23 17:35:08 2010 -0700

    HDFS-1266. Add Apache License Notice to several places where it was missing
    
    Description: Adds license headers to source code
    Reason: Apache policy
    Author: Todd Lipcon
    Ref: CDH-1495

commit bf08bde983501e3ce8ebf6197049262518580611
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jun 23 16:14:50 2010 -0700

    HDFS-1260. tryUpdateBlock should do validation before renaming meta file
    
    Description: Solves bug where block became inaccessible in certain failure
                 conditions (particularly network partitions). Observed under
                 HBase workload at user site.
    Reason: Potential loss of synced data when write pipeline fails
    Author: Todd Lipcon
    Ref: CDH-659

commit 7243001d5511922f293f0641cb8dbc0af4850dae
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jun 18 16:13:45 2010 -0700

    HDFS-1254. Enable append feature by default
    
    Description: Changes dfs.support.append to "true" in hdfs-default.xml
    Reason: Append/sync have been tested in CDH3b2 and are safe to use.
    Author: Dhruba Borthakur
    Ref: CDH-659

commit 0e1d71c08923bb4c4172ef043b0b2d82f95b92fa
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Jun 19 16:26:39 2010 -0700

    HDFS-1252. Updates to TestDFSConcurrentFileOperations (test was previously broken)
    
    Description: Fixes TestDFSConcurrentFileOperations to test the correct
                 semantics for sync feature
    Reason: Test was previously flaky
    Author: Todd Lipcon
    Ref: CDH-659

commit 829497f4867a0e92da712faf02f83c7087df07ce
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Jun 18 19:31:58 2010 -0700

    CLOUDERA-BUILD. Remove Sqoop from the build.

commit 298fda37c4c25434a15886ee9c261e566d595dff
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Jun 18 18:42:37 2010 -0700

    HADOOP-5203. TT's version build is too restrictive.
    
    Description: Use the md5sum checksum of the source for determining version compatibility.
    Reason: Improvement
    Author: Rick Cox (0.20 backport by Bill Au)
    Ref: CDH-1139

commit f07b2df591b91c7de50e8dbb526cf11b27a32a6f
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Jun 18 17:58:53 2010 -0700

    MAPREDUCE-679. XML-based metrics as JSP servlet for JobTracker
    
    Description: A simple XML translation of the existing JobTracker status page
    which provides the same metrics (including the tables of
    running/completed/failed jobs) as the human-readable page. This is a
    relatively lightweight addition to provide some machine-understandable metrics
    reporting.
    Reason: Improvement
    Author: Aaron Kimball
    Ref: CDH-651

commit d8dc8dad821a02619afdbfc3d1cb978b86cb071b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Jun 18 17:24:07 2010 -0700

    MAPREDUCE-1372. ConcurrentModificationException in JobInProgress
    
    Description: Fixes a ConcurrentModificationException in JobInProgress
    Reason: Bug
    Author: Dick King
    Ref: CDH-546

commit e212ca0b0abbd78cdea4596fe9f3c6dbbaa57258
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Jun 18 16:20:01 2010 -0700

    MAPREDUCE-1378. Args in job details links on jobhistory.jsp are not URL encoded
    
    Description: The logFile argument in the job links on the JT jobhistory.jsp
    page is not properly URL encoded leading to links that result in 500 errors.
    Reason: Bug
    Author: Eric Sammer
    Ref: CDH-645
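
As a rough illustration of the fix described above, the sketch below URL
encodes a logFile argument before placing it in a link. It is an assumption
about the general shape of such a fix, not the patch itself: the page name
details.jsp, the class, and the helper are hypothetical; only
java.net.URLEncoder is a real API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch; the page name and helper are not from the patch.
final class JobHistoryLinkSketch {
    /** Builds a link whose logFile query argument is URL encoded. */
    static String jobLink(String logFile) {
        try {
            // Encode the argument so characters such as ':', '/' and spaces
            // cannot break the query string.
            return "details.jsp?logFile=" + URLEncoder.encode(logFile, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints details.jsp?logFile=file%3A%2Flogs%2Fjob+1.xml
        System.out.println(jobLink("file:/logs/job 1.xml"));
    }
}
```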

commit 23e68e669a118d34e265af5e8ffda3615c2666f9
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Jun 18 15:52:15 2010 -0700

    MAPREDUCE-1570. Shuffle stage - Key and Group Comparators
    
    Description: Shuffle method in org.apache.hadoop.mrunit.MapReduceDriverBase
    doesn't currently allow the use of custom GroupingComparator and
    SortComparator. This patch adds these features.
    Reason: Improvement
    Author: Chris White
    Ref: CDH-958

commit 4601521a9793255e8b5881d64ff1a921451bc951
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Jun 18 15:48:41 2010 -0700

    MAPREDUCE-739. Allow relative paths to be created inside archives.
    
    Description: Allow creating archives with relative paths via a -p option on
    the command line. Archives currently store the full path of each input
    source, since multiple sources and regular expressions are allowed as inputs,
    so the created archives contain the full paths of the input sources. This is
    unintuitive and a hassle for users. Instead, users should be able to say that
    the created archive is relative to some absolute path, and an exception should
    be thrown if an input does not conform to that path.
    Reason: Improvement
    Author: Mahadev konar
    Ref: CDH-501

commit 1d4e15f0f8b749981d62bfca9849e0d0493afdad
Author: Todd Lipcon <todd@lipcon.org>
Date:   Thu Jun 17 20:02:51 2010 -0700

    HDFS-1247. Improvements to HDFS-1204 test
    
    Reason: Fixes compile warnings
    Author: Todd Lipcon
    Ref: CDH-659

commit 1fab52d87c29bc7117eb7324d1a152d8d889f62b
Author: Todd Lipcon <todd@lipcon.org>
Date:   Wed Jun 2 18:25:11 2010 -0700

    HDFS-1246. Manual tool to test sync on a cluster
    
    Description: Tool for automated testing that sync maintains every edit after kill -9
    Reason: Cluster Testing of Sync support for CDH3
    Author: Todd Lipcon
    Ref: CDH-659

commit b9259a145f516a01ba37a33b3803c88824fd55e5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Jun 17 09:55:31 2010 -0700

    HDFS-1240. Fix failing TestDFSShell due to HDFS-909 backport on branch-20
    
    Reason: Fix red build
    Author: Todd Lipcon
    Ref: CDH-659

commit 7276208c2789f2c3961c6dc9fa1d2757774971b1
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Jun 16 12:16:25 2010 -0700

    HDFS-1243. Replication tests in TestFileAppend4 should wait for a second for replication to occur
    
    Reason: Test error - fix sporadic failure of TestFileAppend4
    Author: Todd Lipcon
    Ref: CDH-659

commit dc1797ec8380b07117bbc6d662e2f1f56b25e6bd
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue Jun 15 17:56:43 2010 -0700

    HDFS-1207. stallReplicationWork should be marked volatile in FSNamesystem
    
    Description: Small bug fix for code used by tests only
    Reason: Fix sporadic failure of TestFileAppend4
    Author: Todd Lipcon
    Ref: CDH-659

commit a960eea40dbd6a4e87072bdf73ac3b62e772f70a
Author: Todd Lipcon <todd@lipcon.org>
Date:   Sun Jun 13 23:02:38 2010 -0700

    HDFS-1197. Received blocks should not be added to block map prematurely for under construction files
    
    Description: Fixes a possible dataloss scenario when using append() on
                 real-life clusters. Also augments unit tests to uncover
                 similar bugs in the future by simulating latency when
                 reporting blocks received by datanodes.
    Reason: Append support dataloss bug
    Author: Todd Lipcon
    Ref: CDH-659

commit 3cc1405289ac4ec6616a5ba9da18ff421a93678e
Author: Todd Lipcon <todd@lipcon.org>
Date:   Mon Jun 14 01:43:18 2010 -0700

    HDFS-1209. Add parameter dfs.client.block.recovery.retries to determine how many times to try to recover block
    
    Reason: Used by append tests
    Author: Todd Lipcon
    Ref: CDH-659

commit 128395ae4d317204fe8fb118333270826adf96d5
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sun Jun 6 16:38:21 2010 -0400

    HDFS-1118. DFSOutputStream socket leak when can't connect to DN
    
    Reason: Fixes DFS Client socket leaks in an error condition
    Author: Zheng Shao
    Ref: CDH-659

commit 4ba384d2b9f92f7300ce06b35a967e4edc3ba671
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Jun 4 15:10:00 2010 -0700

    HADOOP-6762. Interrupting a thread performing an RPC should not hang that thread.
    
    Description: Moves the sending of parameters for RPC calls to a separate
                 thread, such that interrupting a thread that is making
                 an RPC call does not negatively affect the shared RPC channel.
    Reason: Fixes occasional hangs of HBase under heavy load during failure
            testing.
    Author: Sam Rash
    Ref: CDH-659, CDH-1084

commit 6e99c7e2a12eea782629337f5fb5734e8e5e5865
Author: Todd Lipcon <todd@lipcon.org>
Date:   Wed Jun 2 22:32:45 2010 -0700

    HDFS-1210. DFSClient should print IOE that caused recovery failure
    
    Description: Adds an extra WARN message during DFS client error recovery
    Reason: Makes it easier to debug/diagnose recovery issues
    Author: Todd Lipcon
    Ref: CDH-659

commit 1b8d8c3de261c8334d6eac4f5d3fd42cad894e81
Author: Todd Lipcon <todd@lipcon.org>
Date:   Wed Jun 2 21:53:01 2010 -0700

    HDFS-1186. Writers should be interrupted when recovery is started, not when it's completed.
    
    Description: When the write pipeline recovery process is initiated, this
                 interrupts any concurrent writers to the block under recovery.
                 This prevents a case where some edits may be lost if the
                 writer has lost its lease but continues to write (eg due to
                 a garbage collection pause)
    Reason: Fixes a potential dataloss bug
    Author: Todd Lipcon
    Ref: CDH-659

commit 2ec4301341b249acd0c0cac1792aaa6a6dabab8e
Author: Todd Lipcon <todd@lipcon.org>
Date:   Thu May 20 00:23:20 2010 -0700

    HDFS-915. Write pipeline hangs for too long when ResponseProcessor hits timeout
    
    Description: Previously, the write pipeline would hang for the entire write
                 timeout when it encountered a read timeout (eg due to a
                 network connectivity issue). This patch interrupts the writing
                 thread when a read error occurs.
    Reason: Faster recovery from pipeline failure for HBase and other
            interactive applications.
    Author: Todd Lipcon
    Ref: CDH-659

commit 641090318603c47bfd55e1eea2b039f37e5b723a
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 14 19:20:10 2010 -0700

    HDFS-1218. Replicas that are recovered during DN startup should not be allowed to truncate better replicas.
    
    Description: If a datanode loses power and then recovers, its replicas
                 may be truncated due to the recovery of the local FS
                 journal. This patch ensures that a replica truncated by
                 a power loss does not truncate the block on HDFS.
    Reason: Potential dataloss bug uncovered by power failure simulation
    Author: Todd Lipcon
    Ref: CDH-659

commit 46f2b3ad578ea1d2ee2cca4e6467ba2daa57df0e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 14 19:34:09 2010 -0700

    HDFS-445. pread should refetch block locations when necessary
    
    Description: The positional read API in DFSInputStream was previously
                 missing any retry logic. This patch adds this logic.
    Reason: HBase and other applications depend on the pread API.
    Author: Kan Zhang
    Ref: CDH-659

commit aea067a20e16345f307de7efe80935dd7addbe6b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 14 19:19:56 2010 -0700

    HDFS-1204. LeaseManager expiring leases should only expire the single file, not entire lease
    
    Reason: Logic bug in lease recovery could cause incorrectly interrupted
            writers
    Author: Sam Rash
    Ref: CDH-659

commit 10e5944da20d851a847cb2ef422383507d070085
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu May 13 16:33:15 2010 -0700

    HDFS-1242. Add unit test for the appendFile race condition / synchronization bug fixed in HDFS-142
    
    Reason: Test coverage for previously applied patch.
    Author: Todd Lipcon
    Ref: CDH-659

commit 18174a2abc5a91105ae1adc2bda026d90c41a60b
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed May 12 20:06:33 2010 -0700

    HDFS-1202. Don't try to update block scan status if block scanner is not initialized yet
    
    Reason: Fixes NPE seen at DataNode startup
    Author: Todd Lipcon
    Ref: CDH-659

commit ca9e1b3c59b05de9dc4fafa19f24dca80110bcc0
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed May 12 19:28:56 2010 -0700

    HDFS-1205. Make async disk service threads nameable
    
    Description: HDFS-611 moved some datanode operations to a separate thread
                 pool. This patch ensures that these worker threads have
                 clear names.
    Reason: Aids debugging/diagnosing of issues
    Author: Todd Lipcon
    Ref: CDH-659

commit 1b8316d403ac542772c0745159a7397c798a5698
Author: Todd Lipcon <todd@cloudera.com>
Date:   Tue May 11 16:47:47 2010 -0700

    HDFS-606. Avoid ConcurrentModification in replica invalidation
    
    Description: Replica invalidation iterated over a collection that it
                 also modified, causing a CME. This patch makes a copy
                 before iteration. Performance should be unaffected
                 as this is a rare code path.
    Reason: Avoid runtime exception in namenode
    Author: Konstantin Shvachko
    Ref: CDH-659
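
The copy-before-iteration technique named above can be sketched as follows.
This is a generic illustration under assumed names (the class, the method, and
the "node:block" string format are hypothetical), not the namenode code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of copy-before-iteration; not the actual HDFS code.
final class ReplicaInvalidationSketch {
    /** Removes every replica hosted on the given node from the live set. */
    static Set<String> invalidate(Set<String> replicas, String node) {
        // Iterate over a snapshot: removing from `replicas` while iterating
        // it directly can throw ConcurrentModificationException.
        List<String> snapshot = new ArrayList<>(replicas);
        for (String replica : snapshot) {
            if (replica.startsWith(node + ":")) {
                replicas.remove(replica);
            }
        }
        return replicas;
    }

    public static void main(String[] args) {
        Set<String> replicas = new HashSet<>();
        replicas.add("dn1:blk_1");
        replicas.add("dn2:blk_1");
        replicas.add("dn1:blk_2");
        // prints [dn2:blk_1]
        System.out.println(invalidate(replicas, "dn1"));
    }
}
```

The copy costs one extra allocation per call, which matches the note above
that this is acceptable on a rarely executed path.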

commit b7f908bc77d9344c36dcc409bbfe92709b98cf88
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu May 6 08:52:18 2010 -0700

    HDFS-1244. Misc improvements to TestFileAppend2
    
    Description: Improvements made to a test case to enable it to be run
                 from the command line, with the various test parameters
                 available in arguments.
    Reason: Enable long-running stress tests of append functionality.
    Author: Todd Lipcon
    Ref: CDH-659

commit 370c9a1e75cc5d5e93cec066006ada0485139fb8
Author: Todd Lipcon <todd@lipcon.org>
Date:   Tue Jun 15 18:48:58 2010 -0700

    HDFS-1141. completeFile should check lease holder
    
    Description: Fixes a bug where a writer could finalize an in-progress
                 file after it had already lost its lease. This could occur
                 for example if the writer entered a GC pause after finishing
                 the last block but before finalizing the file.
    Reason: Potential dataloss bug with append/sync
    Author: Todd Lipcon
    Ref: CDH-659

commit 7f0d67fa52b9c58360b06e851bf77bc2f909f65f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed May 5 14:43:40 2010 -0700

    HDFS-1215. Fix TestNodeCount to not infinite loop after HDFS-409 MiniCluster changes
    
    Description: Fixes a test to work properly after some test infrastructure
                 was changed by HDFS-142 in branch-0.20-append.
    Reason: Fixes failing test.
    Author: Todd Lipcon
    Ref: CDH-659

commit 77ac4f46fb5c011b5ac7c5fedb4c51b31580c9ba
Author: Todd Lipcon <todd@lipcon.org>
Date:   Tue Jun 15 18:33:58 2010 -0700

    HDFS-1248. Miscellaneous cleanup and improvements on 0.20 append branch
    
    Description: Miscellaneous code cleanup and logging changes, including:
     - Slight cleanup to recoverFile() function in TestFileAppend4
     - Improve error messages on OP_READ_BLOCK
     - Some comment cleanup in FSNamesystem
     - Remove toInodeUnderConstruction (was not used)
     - Add some checks for null blocks in FSNamesystem to avoid a possible NPE
     - Only log "inconsistent size" warnings at WARN level for non-under-construction blocks.
     - Redundant addStoredBlock calls are also not worthy of WARN level
     - Add some extra information to a warning in ReplicationTargetChooser
    Reason: Improves diagnosis of error cases and clarity of code
    Author: Todd Lipcon
    Ref: CDH-659

commit 46e6199d8819538d96c3f4c5dbbfba163382b2a9
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon May 3 15:02:32 2010 -0700

    HDFS-1122. Don't allow client verification to prematurely add inprogress blocks to DataBlockScanner
    
    Description: When a client reads a block that is also open for writing,
                 it should not add it to the datanode block scanner.
                 If it does, the block scanner can incorrectly mark the
                 block as corrupt, causing data loss.
    Reason: Potential dataloss with concurrent writer-reader case.
    Author: Sam Rash
    Ref: CDH-659

commit 07711a4ea3edd1a504eb9bbb13c93d5573620d34
Author: Todd Lipcon <todd@cloudera.com>
Date:   Mon May 3 12:04:49 2010 -0700

    HDFS-1057. Fixes for concurrent readers behind an appended file
    
    Description: Allows a client to read a file while it is still being
                 written by a writer, so long as the writer has called
                 sync().
    Reason: Used by HBase replication, and useful for other "tail"-like
            applications.
    Author: Sam Rash
    Ref: CDH-659

commit 587de668e43486f7109a885f617b9b757d7a649e
Author: Todd Lipcon <todd@cloudera.com>
Date:   Sat Apr 24 17:33:34 2010 -0700

    HADOOP-6722. Workaround a TCP spec quirk by not allowing NetUtils.connect to connect to itself
    
    Description: TCP's ephemeral port assignment results in the possibility
                 that a client can connect back to its own outgoing socket,
                 resulting in failed RPCs or datanode transfers.
    Reason: Fixes intermittent errors in cluster testing with ephemeral
            IPC/transceiver ports on datanodes.
    Author: Todd Lipcon
    Ref: CDH-659
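    
    Illustrative note: one way to detect the self-connection described above
    is to compare the local and remote endpoints of a freshly connected
    socket. The Python sketch below shows that idea only; the function names
    are hypothetical and the error raised is an assumption, not the behavior
    of NetUtils.connect.

```python
import socket


def is_self_connection(sock):
    """Return True if the socket's local endpoint equals its remote endpoint."""
    return sock.getsockname() == sock.getpeername()


def connect_checked(address, timeout=5.0):
    """Connect to address, rejecting a connection that loops back to itself."""
    sock = socket.create_connection(address, timeout=timeout)
    if is_self_connection(sock):
        # No server is listening; the client reached its own outgoing socket.
        sock.close()
        raise ConnectionRefusedError("connected to own outgoing socket")
    return sock
```

    Treating the self-connection as a refused connection lets the caller's
    normal retry path handle it like any other "nothing is listening" case.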

commit 7a93fcc8c22b7cff87221ec0a8bf8f6689f12b82
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Apr 22 10:24:59 2010 -0700

    HDFS-1203. Add small sleep to prevent DN flooding NN in error cases
    
    Description: If the datanode experiences an error in sending its block
                 reports to the name node, it previously would loop retrying
                 with no delay between attempts. In the case that the DN
                 is sending an invalid report, this will flood the NN with
                 RPCs. This patch adds a short sleep before the retry.
    Reason: Prevents possible flood of RPCs to the NameNode in DN error
            conditions.
    Author: Todd Lipcon
    Ref: CDH-659
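    
    Illustrative note: the retry-with-sleep behavior described above can be
    sketched as follows. This is a minimal Python model, not the DataNode
    code; send_with_retry, retry_delay_s, and max_attempts are hypothetical,
    and the attempt limit exists only to keep the example bounded.

```python
import time


def send_with_retry(send_report, retry_delay_s=0.05, max_attempts=5, sleep=time.sleep):
    """Call send_report until it succeeds, pausing between failed attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send_report()
        except IOError:
            if attempt == max_attempts:
                raise
            # Without this pause, a persistently failing report would be
            # resent in a tight loop and flood the receiver with RPCs.
            sleep(retry_delay_s)
```

    The sleep function is injectable so the delay can be observed in a test
    without actually waiting.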

commit a30c033c1eed744948ddfddb82b81b06e12bba46
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri Apr 16 15:19:08 2010 -0700

    HDFS-561. Fix read timeouts in write pipeline to stage correctly
    
    Description: Previously, the read timeout on the write pipeline was
                 incorrectly calculated. This caused the client to detect
                 the wrong failed datanode when a datanode's network
                 failed or froze for another reason.
    Reason: Fix recovery behavior for frozen datanodes
    Author: Kan Zhang
    Ref: CDH-659

commit 02ab12541a004d67a96428055a58a3b726c1c4b6
Author: Todd Lipcon <todd@cloudera.com>
Date:   Thu Apr 15 01:04:43 2010 -0700

    HDFS-895. Allow hflush/sync to operate in parallel with other writers
    
    Description: Modifies synchronization of the DFSOutputStream sync feature
                 such that multiple threads can sync the same stream
                 concurrently and each will wait only the minimal amount
                 of time. Also allows further writes to continue past the
                 sync point while the sync waits.
    Reason: Substantial performance improvement for durable HBase
    Author: Todd Lipcon
    Ref: CDH-659

commit d1c4359e1abc3f3e5e4fa16ee1c83a3d7f015da3
Author: Todd Lipcon <todd@cloudera.com>
Date:   Wed Apr 14 14:59:39 2010 -0700

    HDFS-1211. BlockReceiver logs too much at INFO level when using sync()
    
    Description: Reduces the log level from INFO to DEBUG for a common message
                 in the datanode log when using the sync feature.
    Reason: Substantially reduces DN log chattiness for syncing clients.
    Author: Todd Lipcon
    Ref: CDH-659

commit 23cfa9e8263ad1d92814b5829e2f50bb37d57857
Author: todd <todd@monster01.sf.cloudera.com>
Date:   Sun Mar 21 16:25:48 2010 -0700

    HDFS-1056. Fix possible multinode deadlocks during block recovery when using ephemeral dataxceiver ports
    
    Description: Fixes the logic by which datanodes identify local RPC targets
                 during block recovery for the case when the datanode
                 is configured with an ephemeral data transceiver port.
    Reason: Potential internode deadlock for clusters using ephemeral ports
    Author: Todd Lipcon
    Ref: CDH-659

commit 08cbce1e413e98d0aaeceeaca26a60c3d9a50b29
Author: todd <todd@monster01.sf.cloudera.com>
Date:   Sun Mar 21 14:56:56 2010 -0700

    HDFS-611. Move block deletions to an async thread. Applying this to make the HDFS-142 patch apply cleanly
    
    Description: Moves the deletion of blocks in the datanode into a thread
                 pool. Substantially improves datanode heartbeat consistency
                 for workloads with heavy deletes and/or lots of disks.
    Reason: Substantially reduces frequency of "could not complete block"
            errors and needless re-replication on clusters with lots of disks
            or heavy deletes.
    Author: Zheng Shao
    Ref: CDH-659

commit 57783d0683f0d675423369e0a0f9f5dd520c17f2
Author: todd <todd@monster01.sf.cloudera.com>
Date:   Sun Mar 21 03:36:45 2010 -0700

    HDFS-1055. Improve thread naming in DN Xceiver
    
    Description: Names the threads created by the DataNode based on the action
                 they are performing.
    Reason: Eases diagnosis of datanode performance/lock contention issues.
    Author: Todd Lipcon
    Ref: CDH-659

commit fddb2bd057e88506a1bb94232426053d1640a34b
Author: todd <todd@monster01.sf.cloudera.com>
Date:   Sun Mar 21 03:36:29 2010 -0700

    HDFS-894. Fix ipcPort tracking in Datanode registration. TODO: add the test case from JIRA
    
    Description: Fixes the NameNode to properly reregister datanodes when they
                 crash and restart with a different IPC port (eg when IPC port
                 is configured to be ephemeral)
    Reason: Fixes errors on clusters with ephemeral ports.
    Author: Todd Lipcon
    Ref: CDH-659

commit bc5217543eccc2cfd8a182cdbb051b39d2abf3e7
Author: Dhruba Borthakur <dhruba@apache.org>
Date:   Fri Jun 11 23:37:38 2010 +0000

    HDFS-1054. remove sleep before retry for allocating a block.
    
    Description: When the write pipeline fails to allocate a new block,
                 it previously slept for a hard-coded 6 seconds before
                 retrying. This sleep has little reasoning behind it,
                 so it is removed.
    Reason: Improve failure recovery performance for interactive applications
            like HBase.
    Author: Todd Lipcon
    Ref: CDH-931

commit 870c7526a3e6a632eb23cf14f9011f279181a759
Author: Dhruba Borthakur <dhruba@apache.org>
Date:   Thu Jun 10 22:25:39 2010 +0000

    HDFS-142. Blocks that are being written by a client are stored in the blocksBeingWritten directory.
    
    git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append@953482 13f79535-47bb-0310-9956-ffa450edef68
    
    Description: Moves blocks being written by clients into a different
                 directory in dfs.data.dir. Also fixes several other bugs
                 in the datanode and namenode to support various error
                 conditions related to append and sync.
    Reason: Necessary for proper recovery of synced data in several error conditions.
    Author: Dhruba Borthakur, Nicolas Spiegelberg, Todd Lipcon
    Ref: CDH-659

commit 8e888717294496caae825d7f3f609d0661e7997a
Author: Dhruba Borthakur <dhruba@apache.org>
Date:   Thu Jun 10 18:46:03 2010 +0000

    HDFS-826. Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline. (dhruba)
    
    Description: Adds an API in DFSOutputStream to determine the current length
                 of the write pipeline.
    Reason: Necessary for better reliability of HBase write-ahead logs.
    Author: Dhruba Borthakur
    Ref: CDH-931

commit 8fcb419648160efaed6fdd467875c3b1743d2bee
Author: Dhruba Borthakur <dhruba@apache.org>
Date:   Wed Jun 9 23:12:21 2010 +0000

    HDFS-988. Fix bug where saveNamespace can corrupt edits log.
    
    Description: Fixes several synchronization errors in the NameNode and ensures
                 that all edits have been synced to the edits log before
                 the namespace is saved.
    Reason: Fixes potential data corruption bug.
    Author: Todd Lipcon
    Ref: CDH-1436

commit f5ace5f920bc16fd202a6e4a53fe0ffe0cb5045e
Author: Todd Lipcon <todd@lipcon.org>
Date:   Thu May 20 01:23:15 2010 -0700

    HDFS-101. Datanodes should continue to forward acks until client stops pipeline.
    
    Description: When one node in the pipeline dies, the datanodes in between the client
                 and the dead node should stay alive and continue to forward acks until
                 the client stops the pipeline. This fixes an issue where the client
                 would incorrectly determine that the local DN had failed when in fact
                 another DN in the pipeline was at fault.
    Reason: Common source of failed pipeline recovery in cluster fault testing
    Author: Hairong Kuang, Todd Lipcon
    Ref: CDH-693

commit 132ef7c852847e9d2c1e7879f2fca26652bb77ef
Author: Dhruba Borthakur <dhruba@apache.org>
Date:   Fri Jun 4 07:20:10 2010 +0000

    HDFS-200. Support append and sync for hadoop 0.20 branch.
    
    Description: Provides basic support for append and sync on 0.20
    Reason: Append and sync required for durable HBase and many other
            applications.
    Author: Dhruba Borthakur
    Ref: CDH-659

commit 092bcd174dbf609f5002078490c357462e0ce8b1
Author: Konstantin Shvachko <shv@apache.org>
Date:   Wed Apr 21 03:05:45 2010 +0000

    HDFS-909. Fix race in edit log rolling
    
    Description: Fixes a race condition when rolling edit logs that can corrupt
                 the logs.
    Reason: Potential namenode metadata corruption bug.
    Author: Todd Lipcon
    Ref: CDH-1174

commit e2a78f767d26b838bf67354a4b85235ddd731038
Author: Eli Collins <eli@cloudera.com>
Date:   Fri Jun 18 14:41:14 2010 -0700

    CLOUDERA-BUILD. Update hadoop-config.sh to reflect new jar version.

commit 1756e97a35451bbc01a493e843f1ec0885c99792
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Jun 18 11:37:22 2010 -0700

    MAPREDUCE-1644. Remove Sqoop from Apache Hadoop (moving to github)
    
    Description: Sqoop is moving to github! All code for sqoop is already live at
    http://github.com/cloudera/sqoop - this issue removes the duplicate code from the Apache Hadoop
    repository. CDH users should install the separate 'sqoop' package for this functionality.
    Reason: Moving to a separate package
    Author: Aaron Kimball
    Ref: CDH-1404

commit e0afb34b89a013419fca4bdcda5f2cf0401f93ca
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Thu Jun 17 19:06:50 2010 -0700

    MAPREDUCE-1302. TrackerDistributedCacheManager can delete file asynchronously
    
    Description: With the help of AsyncDiskService from MAPREDUCE-1213, we should be able to delete
    files from the distributed cache asynchronously. That will help make task initialization faster,
    because task initialization calls the code that localizes files into the cache and may delete
    some other files. The deletion can slow down task initialization.
    Reason: Performance improvement
    Author: Zheng Shao
    Ref: CDH-495

commit 456821d6934fd769ab317c2290a4ff53b075269e
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Thu Jun 17 19:04:31 2010 -0700

    HADOOP-6433. Add AsyncDiskService that is used in both hdfs and mapreduce
    
    Description: Create a thread pool per disk volume, and use it to schedule async disk
    operations.
    Reason: Improvement
    Author: Zheng Shao
    Ref: CDH-495
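    
    Illustrative note: the "one thread pool per disk volume" idea can be
    sketched in Python as below. This is not the AsyncDiskService API; the
    class name, method names, and threads_per_volume parameter are
    hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class AsyncDiskServiceSketch:
    """Keeps a separate thread pool for each volume so that slow work on
    one disk does not queue behind, or block, work on another disk."""

    def __init__(self, volumes, threads_per_volume=1):
        self._pools = {
            volume: ThreadPoolExecutor(max_workers=threads_per_volume)
            for volume in volumes
        }

    def execute(self, volume, task, *args):
        # Schedule the task on the pool that belongs to the given volume.
        return self._pools[volume].submit(task, *args)

    def shutdown(self):
        for pool in self._pools.values():
            pool.shutdown(wait=True)
```

    A caller would submit a delete or similar disk operation with
    execute(volume, task) and continue without waiting for it to finish.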

commit 6e467c42d62aafd00fd2f38269806680427631c8
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Thu Jun 17 18:50:47 2010 -0700

    MAPREDUCE-1213. TaskTrackers restart is very slow because it deletes distributed cache directory synchronously
    
    Description: We are seeing that when we restart a tasktracker, it tries to recursively delete all
    the files in the distributed cache. It invokes FileUtil.fullyDelete(), which is very slow. This
    means that the TaskTracker cannot join the cluster for an extended period of time (up to 2 hours for
    us). The problem is acute if the number of files in the distributed cache is a few thousand.
    Reason: Performance
    Author: Zheng Shao
    Ref: CDH-495

commit 5626a0e301557dbc93ad5084aa9ef4527316db7b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Thu Jun 17 18:45:58 2010 -0700

    MAPREDUCE-1443. DBInputFormat can leak connections
    
    Description: The DBInputFormat creates a Connection to use when enumerating splits, but never closes
    it. This can leak connections to the database which are not cleaned up for a long time.
    Reason: bug
    Author: Aaron Kimball
    Ref: CDH-1435

commit 912eed1c5d50066e68700d2143b775914d7f8e54
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Thu Jun 17 16:00:49 2010 -0700

    MAPREDUCE-1489. DataDrivenDBInputFormat should not query the database when generating only one split
    
    Description: DataDrivenDBInputFormat runs a query to establish bounding values for each split it
    generates; but if it's going to generate only one split (mapreduce.job.maps == 1), then there's no
    reason to do this. This will remove overhead associated with a single-threaded import of a
    non-indexed table since it avoids a full table scan.
    Reason: Improvement
    Author: Aaron Kimball
    Ref: CDH-1431

commit 1c3fc82063212196fd2fac7f55df8eb323e8f601
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Tue Apr 27 11:44:29 2010 -0700

    MAPREDUCE-1728. Oracle timezone strings do not match Java
    
    Description: OracleDBRecordReader sets the session timezone based on the toString representation of
    the current java.util.TimeZone. This is incorrect; Oracle manages a separate database of acceptable
    timezone strings, whose string representations are different than the timezone representations
    recognized by Java.
    Reason: Bug
    Author: Aaron Kimball
    Ref: CDH-961

commit 11bc9be1ff2fd994046acd660afa7631f9203cfb
Author: Eli Collins <eli@cloudera.com>
Date:   Thu May 27 17:44:00 2010 -0700

    HADOOP-6714. FsShell 'hadoop fs -text' does not support compression codecs.
    
    Currently, 'hadoop fs -text myfile' looks at the first few magic bytes
    of a file to determine whether it is gzip compressed or a sequence
    file. This means 'fs -text' cannot properly decode .deflate or .bz2
    files (or other codecs specified via configuration).
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-1136

commit e95781032b5d886aa6583cab1306025fe372babf
Author: Eli Collins <eli@cloudera.com>
Date:   Tue May 25 13:20:00 2010 -0700

    HADOOP-1849. IPC server max queue size should be configurable.
    
    Description: Currently max queue size for IPC server is set to (100 *
    handlers). Usually when RPC failures are observed (e.g. HADOOP-1763),
    we increase the number of handlers and the problem goes away. I think a
    big part of such a fix is the increase in max queue size. I think we
    should make maxQsize per handler configurable (with a bigger default
    than 100). There are other improvements also (HADOOP-1841). The server
    keeps reading RPC requests from clients. When the number of in-flight
    RPCs is larger than maxQsize, the earliest RPCs are deleted. This is
    the main feedback the server has for the client. I have often heard from
    users that Hadoop doesn't handle bursty traffic.
    
    Say the handler count is 10 (default) and the server can handle 1000 RPCs a
    sec (quite conservative/low for a typical server); this implies that an
    RPC can wait for only 1 sec before it is dropped. If there are 3000
    clients and all of them send RPCs around the same time (not very rare,
    with heartbeats etc.), 2000 will be dropped. Instead of dropping the
    earliest RPCs, if the server delays reading new RPCs, the feedback to
    clients would be much smoother. I will file another jira regarding queue
    management.
    
    For this jira I propose to make queue size per handler configurable,
    with a larger default (maybe 500).
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-1133
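    
    Illustrative note: the arithmetic in the description above can be
    checked with a few lines of Python. The function names are hypothetical;
    the numbers (10 handlers, 100 or 500 queue slots per handler, 1000 RPCs
    per second, a burst of 3000) come from the description.

```python
def max_queue_size(handlers, per_handler):
    """Total queue capacity: handler count times per-handler queue size."""
    return handlers * per_handler


def dropped_in_burst(burst, handlers, per_handler):
    """RPCs that do not fit in the queue when a burst arrives at once."""
    return max(0, burst - max_queue_size(handlers, per_handler))


def max_wait_seconds(handlers, per_handler, rpcs_per_sec):
    """Longest time a queued RPC waits when the queue is full."""
    return max_queue_size(handlers, per_handler) / rpcs_per_sec
```

    With 10 handlers and 100 slots per handler the queue holds 1000 calls,
    so a burst of 3000 loses 2000 of them; at 500 slots per handler the same
    burst fits entirely.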

commit 776a20d37142534751178b060285d2813cc66c1c
Author: Eli Collins <eli@cloudera.com>
Date:   Tue May 25 13:09:30 2010 -0700

    HADOOP-6724. IPC doesn't properly handle IOEs thrown by socket factory.
    
    Description: If the socket factory throws an IOE inside
    setupIOStreams, then handleConnectionFailure will be called with
    socket still null, and thus generate an NPE on socket.close(). This
    ends up orphaning clients, etc.
    
    Reason: Bug fix
    Author: Eli Collins
    Ref: CDH-1132

commit 1864359f4ef32974ed41a1278e640e1ee246ef9b
Author: Eli Collins <eli@cloudera.com>
Date:   Tue May 25 13:05:38 2010 -0700

    HADOOP-6723. Unchecked exceptions thrown in IPC connection should not orphan clients.
    
    Description: If the server sends back some malformed data, for
    example, receiveResponse() can end up with an incorrect call ID. Then,
    when it tries to find it in the calls map, it will end up with null
    and throw NPE in receiveResponse. This isn't caught anywhere, so the
    original IPC client ends up hanging forever instead of catching an
    exception. Another example is if the writable implementation itself
    throws an unchecked exception or OOME.
    
    We should catch Throwable in Connection.run() and shut down the
    connection if we catch one.
    
    Reason: Bug fix
    Author: Eli Collins
    Ref: CDH-1131
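    
    Illustrative note: the "catch everything in run() and shut the
    connection down" approach can be sketched in Python as below. This is a
    toy model, not the IPC client; the class layout and the read_response
    callback are hypothetical. Python's Exception stands in for Java's
    Throwable here (MemoryError is an Exception subclass in Python).

```python
class Connection:
    """Toy receiver loop that never leaves waiting callers orphaned."""

    def __init__(self, read_response):
        self.read_response = read_response  # returns (call_id, value)
        self.calls = {}                     # call id -> result slot (a dict)
        self.closed = False
        self.close_cause = None

    def receive_response(self):
        call_id, value = self.read_response()
        # An unknown call id raises KeyError, the analogue of the NPE
        # described above.
        self.calls.pop(call_id)["value"] = value

    def run(self):
        try:
            while not self.closed:
                self.receive_response()
        except Exception as exc:
            # Any unexpected failure closes the connection and wakes callers.
            self.close(exc)

    def close(self, cause):
        self.closed = True
        self.close_cause = cause
        for slot in self.calls.values():
            slot["error"] = cause
        self.calls.clear()
```

    Every pending call receives the failure as its error, so a caller
    blocked on a response sees an exception rather than hanging forever.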

commit 95d64157f05d467dad3e1190a5cba2a3f89b0925
Author: Eli Collins <eli@cloudera.com>
Date:   Thu May 20 17:15:13 2010 -0700

    CLOUDERA-BUILD. Rename the fuse_dfs wrapper.
    
    Description: Rename the fuse_dfs wrapper to hadoop-fuse-dfs.
    
    Reason: Improvement
    Author: Alex Newman
    Ref: CDH-1103

commit d8c973d9c6f650032c88915d9fef6f4a568d37a5
Author: Chad Metcalf <chad@cloudera.com>
Date:   Wed May 19 15:38:14 2010 -0700

    CLOUDERA-BUILD. Fixes for the fuse_dfs wrapper.
    
    Description: The wrapper uses bash syntax (i.e., +=) so we should use
    bash. We need to modprobe fuse explicitly on Ubuntu. Since this is
    installed by install_hadoop.sh we know HADOOP_HOME and should use it
    directly. Lastly, there is more robust JAVA_HOME checking in
    hadoop-config.sh so we should use that.
    
    Reason: Fuse currently broken on Ubuntu
    Author: Chad Metcalf
    Ref: CDH-1089

commit e810911445859693ee0b868c2a5d8bc18360cdb9
Author: Eli Collins <eli@cloudera.com>
Date:   Tue May 18 14:30:04 2010 -0700

    HDFS-1161. Make DN minimum valid volumes configurable
    
    Description: This change adds a dfs.datanode.failed.volumes.tolerated parameter so that users can configure the number of volumes that are allowed to fail before a datanode stops offering service. By default any volume failure will cause a datanode to shut down.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-1081
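    
    Illustrative note: the rule described above reduces to a single
    comparison. The Python sketch below uses a hypothetical function name,
    and the default of 0 is inferred from the statement that any volume
    failure shuts the datanode down by default.

```python
def should_shut_down(failed_volumes, failed_volumes_tolerated=0):
    """Return True once more volumes have failed than the configured
    tolerance allows (the role of dfs.datanode.failed.volumes.tolerated)."""
    return failed_volumes > failed_volumes_tolerated
```

    With the default tolerance a single failed volume stops the datanode;
    with a tolerance of 1 it keeps serving until a second volume fails.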

commit baa77bdde4fd971877418391a4fe491c2d4c2501
Author: Eli Collins <eli@cloudera.com>
Date:   Mon May 17 19:49:44 2010 -0700

    HDFS-1160. Improve some FSDataset warnings and comments.
    
    Description: Cleans up HDFS-547 warnings.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-1080

commit 90f5a4bf77d17adcabb834a3cc2e02becb9f012d
Author: Eli Collins <eli@cloudera.com>
Date:   Mon May 17 18:53:50 2010 -0700

    HDFS-612. FSDataset should not use org.mortbay.log.Log.
    
    Description: Cleans up HDFS-547 logging.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-1079

commit 4a925fe53a2015e504cd8c8796e0e590d22019c4
Author: Eli Collins <eli@cloudera.com>
Date:   Thu Apr 22 14:41:08 2010 -0700

    HDFS-457. Better handling of volume failure in Data Node storage.
    
    Description: Current implementation shuts DataNode down completely when one of the configured volumes of the storage fails. This is rather wasteful behavior because it decreases utilization (good storage becomes unavailable) and imposes extra load on the system (replication of the blocks from the good volumes). These problems will become even more prominent when we move to mixed (heterogeneous) clusters with many more volumes per Data Node.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-472

commit 3af9533ee6f260373f302ff4a16dd04eb75e0616
Author: Chad Metcalf <chad@cloudera.com>
Date:   Mon Mar 1 15:28:19 2010 -0800

    CLOUDERA-BUILD. hadoop-config runs before hadoop-env.sh
    
        conf/hadoop-env.sh says you can update JAVA_HOME there, but it gets
        sourced after hadoop-config.sh, which errors out if JAVA_HOME is not
        set. This patch changes the flow so hadoop-env is always sourced by
        hadoop-config after the --config flag is processed. This will allow
        JAVA_HOME to be set in hadoop-env and still allow for trying to find a valid
        JAVA_HOME.

commit c9295d4ac2848403362e5dbaa78aa7be4ce4254e
Author: Eli Collins <eli@cloudera.com>
Date:   Sat May 15 13:39:08 2010 -0700

    HADOOP-3659. Fix hadoop native to compile on Mac OS X.
    
    Description: This patch makes the autoconf script work on Mac OS X. LZO needs to be installed (including the optional shared libraries) for the compile to succeed. You'll want to regenerate the configure script using autoconf after applying this patch.
    
    Reason: Bug fix
    Author: Eli Collins
    Ref: CDH-825

commit cc035175e1cf1ddef878cba6aa93725f832d0327
Author: Eli Collins <eli@cloudera.com>
Date:   Sat May 15 12:55:06 2010 -0700

    MAPREDUCE-1785. Add streaming config option for not emitting the key.
    
    Description: PipeMapper currently does not emit the key when using TextInputFormat. If you switch to other input formats (eg LzoTextInputFormat) the key will be emitted. We should add an option so users can explicitly make streaming not emit the key so they can change input formats without breaking or having to modify their existing programs.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-856

commit 590a82c257842be51170619deafd15cc2988541e
Author: Eli Collins <eli@cloudera.com>
Date:   Thu May 13 21:25:53 2010 -0700

    HADOOP-4885. Try to restore failed replicas of Name Node storage (at checkpoint time).
    
    Description: If one of the replicas of the NameNode storage fails for whatever reason (for example, a temporary failure of NFS), this Storage object is removed from the list of storage objects forever. It can be added back only on restart of the NameNode. We propose to check the status of a failed storage on every checkpoint and, if it becomes valid, try to restore the edits and fsimage.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-473

commit 0f2f19e1bd5725f6163998ae86d9103c0d552de3
Author: Eli Collins <eli@cloudera.com>
Date:   Thu May 13 20:07:02 2010 -0700

    HDFS-1024. SecondaryNamenode fails to checkpoint because namenode fails with CancelledKeyException.
    
    Description: The secondary namenode fails to retrieve the entire fsimage from the Namenode. It fetches a part of the fsimage but believes that it has fetched the entire fsimage file and proceeds ahead with the checkpointing.
    
    Reason: Bug fix
    Author: Eli Collins
    Ref: CDH-891

commit 0ec1d6ed85a30327c657c2418932728d0e4e98df
Author: Todd Lipcon <todd@lipcon.org>
Date:   Wed May 12 21:33:45 2010 -0700

    HADOOP-6254. Slow reads cause s3n to fail with SocketTimeoutException
    
    Reason: Bug fix for users of s3n:// file system
    Author: Andrew Hitchcock
    Ref: CDH-1035

commit d64943401780c3dd1dc498419f33ded8222c3210
Author: Eli Collins <eli@cloudera.com>
Date:   Wed May 12 12:05:26 2010 -0700

    HADOOP-6667. RPC.waitForProxy should retry through NoRouteToHostException.
    
    Description: RPC.waitForProxy already loops through ConnectExceptions, but NoRouteToHostException is not a subclass of ConnectException. In the case that the NN is on a VIP, the No Route To Host error is reasonably common during a failover, so we should retry through it just the same as the other connection errors.
    
    Reason: Improvement
    Author: Eli Collins
    Ref: CDH-907
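    
    Illustrative note: retrying through both "connection refused" and "no
    route to host" can be sketched in Python as below. This is not the
    RPC.waitForProxy code; the function name, parameters, and injected
    clock and sleep are hypothetical.

```python
import errno
import time


def wait_for_proxy(connect, deadline_s, retry_delay_s=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Retry connect() until it succeeds or deadline_s has elapsed."""
    start = clock()
    while True:
        try:
            return connect()
        except ConnectionRefusedError as exc:
            last_error = exc
        except OSError as exc:
            # "No route to host" is a separate error from "refused", so it
            # needs its own clause; anything else is re-raised immediately.
            if exc.errno != errno.EHOSTUNREACH:
                raise
            last_error = exc
        if clock() - start >= deadline_s:
            raise last_error
        sleep(retry_delay_s)
```

    During a failover behind a virtual IP, both errors are transient, which
    is why the sketch treats them the same way.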

commit a5fb4a8c8bf9d6a3a96c3a06eb3a46febaf21a0f
Author: Todd Lipcon <todd@cloudera.com>
Date:   Fri May 7 15:36:14 2010 -0700

    MAPREDUCE-1375. TestFileArgs fails intermittently
    
    Description: Fixes an error in a test case without modifying code. This is an amendment to the prior fix which did not address the issue properly.
    Reason: Should fix flaky tests.
    Author: Todd Lipcon
    Ref: CDH-657

commit 148d291aa14a4481dc206d2fc9a8527eb6761488
Author: newalex <newalex@ubuntu64-build01.(none)>
Date:   Fri Apr 16 15:48:14 2010 -0700

    CLOUDERA-BUILD. Add a fuse manpage
    
    Description: Adding a fuse_dfs manpage and adding a manpage to the build.
    Reason: New Feature
    Author: Alex Newman
    Ref: CDH-927

commit 9acfd39492f85c92bc45d47d6dcfb309e3826c64
Author: newalex <newalex@centos64-build01.sf.cloudera.com>
Date:   Thu Apr 8 10:35:19 2010 -0700

    CLOUDERA-BUILD. Build script changes to build DEB packages
    
    Description: The required changes to the cloudera hadoop building scripts for pulling the fuse files out and cleaning up its mess v.v. DEBs.
    Reason: Building packages
    Author: Alex Newman
    Ref: CDH-929

commit d144085817496eecc57c510022d66d0540b4511d
Author: newalex <newalex@centos64-build01.sf.cloudera.com>
Date:   Tue Apr 6 14:05:29 2010 -0700

    CLOUDERA-BUILD. Added an RPM for fuse
    
    Description: The required changes to the cloudera hadoop building scripts for pulling the fuse files out and cleaning up its mess.
    Reason: Building packages
    Author: Alex Newman
    Ref: CDH-928

commit 56648efe291503249fec22a242917ec4dddc6214
Author: Eli Collins <eli@cloudera.com>
Date:   Tue Mar 30 15:17:50 2010 -0700

    HADOOP-6522. Fix decoding of codepoint zero in UTF8.
    
    Description: TestUTF8 is actually flaky. It generates 10 random strings to run the test on. If you change this number to 100000 it fails every time. The problem is that the null character (codepoint zero) was correctly encoded but incorrectly decoded. I've attached a patch that fixes this and increases the size of the tests so that problems like this will likely be discovered sooner.
    
    Reason: Bugfix to UTF8
    Author: Eli Collins
    Ref: CDH-718

commit 936a67ba3b34dc8c8efd3df92d9e50309fafb8f6
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Mon Mar 29 23:50:14 2010 -0700

    MAPREDUCE-1460. Oracle support in DataDrivenDBInputFormat
    
    Description: DataDrivenDBInputFormat does not work with Oracle due to various SQL syntax issues.
    Reason: Required for Sqoop/Oracle integration
    Author: Aaron Kimball
    Ref: CDH-888

commit c08f94a6927f9c8b0dfaeb674835afdd3fdd1d08
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Mon Mar 29 17:15:53 2010 -0700

    MAPREDUCE-1569. Mock Contexts & Configurations
    
    Description: Currently the library creates a new Configuration object in the MockMapContext and
    MockReduceContext constructors, rather than allowing the developer to configure and pass their own.
    Reason: Feature improvement for MRUnit
    Author: Chris White
    Ref: CDH-838

commit 27cfda1de80048bf2b46d74d78b61275ecc79be1
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Mon Mar 29 16:43:49 2010 -0700

    MAPREDUCE-1536. DataDrivenDBInputFormat does not split date columns correctly.
    
    Description: The DateSplitter does not properly split a range of (min, max) dates.
    Reason: Bugfix to DateSplitter
    Author: Aaron Kimball
    Ref: CDH-813

commit 7fc6e48e296c30f0afa8ae8da668bddbc9f422bf
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Mon Mar 29 16:11:22 2010 -0700

    MAPREDUCE-1480. CombineFileRecordReader does not properly initialize child RecordReader
    
    Description: CombineFileRecordReader instantiates child RecordReader instances but never calls their initialize() method to give them the proper TaskAttemptContext.
    Reason: Bug in CombineFileInputFormat prevents proper use.
    Author: Aaron Kimball
    Ref: CDH-811

commit 32330fbadb4aed16627397979b90d52f2474ef38
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Mon Mar 29 15:50:20 2010 -0700

    MAPREDUCE-1423. Improve performance of CombineFileInputFormat when multiple pools are configured
    
    Description: I have a map-reduce job that is using CombineFileInputFormat. It has configured 10000
    pools and 30000 files. Creating the splits takes more than an hour. The reason is that
    CombineFileInputFormat.getSplits() converts the same path from String to Path object multiple times,
    once for each instance of a pool. Similarly, it calls Path.toUri() multiple times. This code can be
    optimized.
    
    Reason: Improves CombineFileInputFormat performance (used by Sqoop); needed to apply MAPREDUCE-1480 cleanly
    Author: Dhruba Borthakur
    Ref: CDH-811
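    
    Illustrative note: the optimization described above amounts to
    converting each path once and reusing the result for every pool. The
    Python sketch below shows that idea with urlparse standing in for the
    String-to-Path and Path.toUri() conversions; match_pools and the pool
    predicates are hypothetical.

```python
from urllib.parse import urlparse


def match_pools(paths, pools, parse=urlparse):
    """Assign paths to pools, parsing each path exactly once.

    pools maps a pool name to a predicate over the parsed path.
    """
    # One conversion per path, instead of one per (path, pool) pair.
    parsed = {path: parse(path) for path in paths}
    return {
        name: [path for path in paths if accepts(parsed[path])]
        for name, accepts in pools.items()
    }
```

    With 30000 files and 10000 pools, this does 30000 conversions rather
    than one for every file and pool combination.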

commit 6906389e07244931a108f2930544b9feada3a487
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Mon Mar 29 15:41:38 2010 -0700

    MAPREDUCE-364. Change org.apache.hadoop.examples.MultiFileWordCount to use new mapreduce api.
    
    Description: Updates MultiFileWordCount example to use the new API in
    org.apache.hadoop.mapreduce instead of the deprecated API of
    org.apache.hadoop.mapred.
    
    This incorporates MAPREDUCE-367: Change org.apache.hadoop.mapred.lib.CombineFileInputFormat
    to use the new api.
    
    This solves duplicate issue MAPREDUCE-1112: Fix CombineFileInputFormat for hadoop 0.20
    
    Reason: CombineFileInputFormat required for many clients of the new API, including Sqoop.
    Author: Amareshwari Sriramadasu
    Ref: CDH-811

commit 4b592cf8cb44c018f86abe529d71434d5106ce1e
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Mon Mar 29 13:07:15 2010 -0700

    HADOOP-6382. Publish hadoop jars to apache mvn repo.
    
    Description: This provides an 'ant mvn-install' command that will
    install Hadoop core, streaming, examples, etc. jars in a maven repository.
    
    Uses the maven ant task to publish hadoop 20 jars to the apache maven repo.
    Reason: Required for cross-distribution dependency management in downstream projects (e.g., sqoop)
    Author: Giridharan Kesavan
    Ref: CDH-402

commit 8424e32eb866d677f40a9446f9c4cf74972b751e
Author: Chad Metcalf <chad@cloudera.com>
Date:   Thu Mar 18 17:05:47 2010 -0700

    HADOOP-6643. Set executable bit for python cloud scripts in the distribution
    
    Description: This needs to be set in the tar target.
    Reason: Required for the EC2 scripts.
    Author: Tom White
    Ref: CDH-821

commit cfc3233ece0769b11af9add328261295aaf4d1ad
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:56:30 2010 -0800

    CLOUDERA-BUILD. Fix ivy xml after rebase. Removed a redundant </dependencies> closing tag.
    
    Author: Matt Massie

commit 54e1aefdd7a25a539831cac2c9b1bc3597f119ea
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:56:07 2010 -0800

    CLOUDERA-BUILD. Small tweaks and fixes to Cloudera styling:
    
    Description:
        - Fixes trivial CSS bug for missing table cell borders in Chrome
        - Fixes footer to read "Distribution for Hadoop" instead of "Distribution of Hadoop"
    
    Author: Todd Lipcon

commit ea83036b3838fa97c673e73145d52867b8ace6ac
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:55:30 2010 -0800

    HDFS-1013. Miscellaneous improvements to HTML markup for web UIs
    
    Description: The Web UIs have various bits of bad markup (e.g., missing <head> sections, some pages missing CSS links, inconsistent td vs. th for table headings). We should fix this up.
    
        Improve markup and add Cloudera styling to Web UIs
    
        This adds a favicon and a number of HTML/CSS improvements to make the
        pages more space-efficient and easy on the eyes.
    
        This may be an incompatible change for users who are scraping the HTML
        output of the web UIs. Those users are encouraged to access the data
        programmatically rather than through scraping.
    
        The non-Cloudera-specific improvements will be contributed upstream
        as HDFS-1013 and MAPREDUCE-1544.
    Reason: User experience improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 90ba5543e4c3176343e23943131a34d666c23d89
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:54:58 2010 -0800

    MAPREDUCE-1436. Deadlock in preemption code in fair scheduler
    
    Description: In testing the fair scheduler with preemption, I found a deadlock between updatePreemptionVariables and some code in the JobTracker. This was found while testing a backport of the fair scheduler to Hadoop 0.20, but it looks like it could also happen in trunk and 0.21. Details are in a comment below.
    
    The fair scheduler introduces a potential jobtracker deadlock which
    was fixed on trunk by MAPREDUCE-870. This patch adjusts the locking
    in 0.20-based MapReduce to prevent this condition.
    
    Reason: bugfix (deadlock)
    Author: Matei Zaharia
    Ref: UNKNOWN

commit 6f04e94feee3f40a73449cc6fbe7b4e3c48f1fc4
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:54:13 2010 -0800

    HDFS-696. Java assertion failures triggered by tests
    
    Description: Re-purposing as catch-all ticket for assertion failures when running tests with Java asserts enabled. Running with the attached patch on trunk@823732, the following tests all trigger assertion failures:
    
        - TestAccessTokenWithDFS
        - TestInterDatanodeProtocol
        - TestBackupNode
        - TestBlockUnderConstruction
        - TestCheckpoint
        - TestNameEditsConfigs
        - TestStartup
        - TestStorageRestore
    
        Disable failing asserts (see HDFS-696).
    
        Disabled asserts in HDFS that cause unit tests to fail.
        These will be re-enabled at a later date when the underlying cause is fixed
        upstream. In the meantime, these are disabled to keep our CI server returning
        only new failures. Issue HDFS-696 lists the failing tests and tracks their
        progress.
    Reason: Test harness improvement
    Author: Eli Collins
    Ref: UNKNOWN

commit 74b80b9c9490bba1a1120f3a9376d2f21f3763b6
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:53:38 2010 -0800

    MAPREDUCE-1093. Java assertion failures triggered by tests
    
    Description:
        Removes failing asserts from the CDH build until they are fixed in trunk.
        Tracking MAPREDUCE-1506 to include a fix for this assertion failure.
    Reason: Test harness improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit b4be440cd928976544bcbeb7e10566fc523dbd0c
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:53:13 2010 -0800

    MAPREDUCE-1092. Enable asserts for tests by default
    
    Description: See HADOOP-6309 ("Enable asserts for tests by default", http://issues.apache.org/jira/browse/HADOOP-6309). Let's make the tests run with Java asserts by default.
    Reason: Test coverage improvement
    Author: Eli Collins
    Ref: UNKNOWN

commit 5e7fb9843f99f5e1023f2723210f26ac0c33323b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:52:45 2010 -0800

    MAPREDUCE-1375. TestFileArgs fails intermittently
    
    Description: TestFileArgs failed once for me with the following error:
    
        expected:<[job.jar
        sidefile
        tmp
        ]> but was:<[]>
                at org.apache.hadoop.streaming.TestStreaming.checkOutput(TestStreaming.java:107)
                at org.apache.hadoop.streaming.TestStreaming.testCommandLine(TestStreaming.java:123)
    
        This test was flaky due to trying to write some data into /bin/ls.
        Depending on the speed of the test run, this sometimes resulted
        in a Broken Pipe on flush() which caused the test to fail.
    
    Reason: Bugfix (race condition in test)
    Author: Todd Lipcon
    Ref: UNKNOWN

commit ae699cda01c093097ae723224553773247577aa2
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:52:32 2010 -0800

    HDFS-961. dfs_readdir incorrectly parses paths
    
    Description: fuse-dfs dfs_readdir assumes that DistributedFileSystem#listStatus returns Paths with the same scheme/authority as the dfs.name.dir used to connect. If the NameNode.DEFAULT_PORT port is used, listStatus returns Paths that have authorities without the port (see HDFS-960, "DistributedFileSystem#makeQualified port inconsistency", http://issues.apache.org/jira/browse/HDFS-960), which breaks the following code.
    
        // hack city: todo fix the below to something nicer and more maintainable but
        // with good performance
        // strip off the path but be careful if the path is solely '/'
        // NOTE - this API started returning filenames as full dfs uris
        const char *const str = info[i].mName + dfs->dfs_uri_len + path_len + ((path_len == 1 && *path == '/') ? 0 : 1);
    
    Let's make the path parsing here more robust. listStatus returns normalized paths, so we can find the start of the path by searching for the third slash. A more long-term solution is to have hdfsFileInfo maintain a path object or at least pointers to the relevant URI components.
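    
    As an illustration only, the "third slash" idea can be sketched in Java
    (the actual fuse-dfs code is C). The class and method names here are
    hypothetical, and the sketch assumes a normalized URI of the form
    scheme://authority/path:

```java
// Hypothetical helper, not the fuse-dfs implementation: find the path part of
// a normalized URI such as "hdfs://host:8020/user/foo" by locating the third '/'.
final class ThirdSlash {
    /** Returns the substring starting at the third '/', or "/" if there is none. */
    static String pathOf(String uri) {
        int idx = -1;
        for (int seen = 0; seen < 3; seen++) {
            idx = uri.indexOf('/', idx + 1);
            if (idx < 0) {
                return "/"; // e.g. "hdfs://host" carries no path component
            }
        }
        return uri.substring(idx);
    }

    public static void main(String[] args) {
        // The result does not depend on whether the authority includes a port.
        System.out.println(pathOf("hdfs://localhost:8020/user/foo"));
        System.out.println(pathOf("hdfs://localhost/user/foo"));
    }
}
```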
    Reason: bugfix
    Author: Eli Collins
    Ref: UNKNOWN

commit 7f9f42b27b109eff6fafc6ee24526fcadaf68d69
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:52:23 2010 -0800

    MAPREDUCE-1467. Add a --verbose flag to Sqoop
    
    Description: Need a --verbose flag that sets the log4j level to DEBUG.
    Reason: Logging improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit db680058f5796fc41d61242d60bc86b1b25facf9
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:52:07 2010 -0800

    MAPREDUCE-1469. Sqoop should disable speculative execution in export
    
    Description: Concurrent writers of the same output shard may cause the database to try to insert duplicate primary keys concurrently. Not a good situation. Speculative execution should be forced off for this operation.
    Reason: Bugfix (race condition)
    Author: Aaron Kimball
    Ref: UNKNOWN

commit a5ccc56a79fc53de5ff16c6cb996f41a4216c28d
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:51:29 2010 -0800

    MAPREDUCE-1341. Sqoop should have an option to create hive tables and skip the table import step
    
    Description: In case the client only needs to create tables in Hive, it would be helpful if Sqoop had an optional parameter:
    
        --hive-create-only
    
    which would omit the time-consuming table import step, generate Hive CREATE TABLE statements, and run them.
    
    Also adds a --hive-overwrite flag, which allows overwriting an existing table definition.
    
    Reason: New feature
    Author: Leonid Furman
    Ref: UNKNOWN

commit bdf576aa69eeb56a954416f7c2fcbe0136f421bd
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:51:16 2010 -0800

    HADOOP-4012. Providing splitting support for bzip2 compressed files
    
    Description: Hadoop assumes that if the input data is compressed, it cannot be split (mainly because many codecs need the whole input stream to decompress successfully). In such a case, Hadoop prepares only one split per compressed file, where the lower split limit is 0 and the upper limit is the end of the file. The consequence of this decision is that one compressed file goes to a single mapper. Although this circumvents the codec limitation mentioned above, it substantially reduces the parallelism that splitting would otherwise allow.
    
    BZip2 is a compression/decompression algorithm that compresses blocks of data; these compressed blocks can later be decompressed independently of each other. This is an opportunity: instead of one BZip2-compressed file going to one mapper, we can process chunks of the file in parallel. The correctness criterion for such processing is that, for a bzip2-compressed file, each compressed block is processed by only one mapper and ultimately all the blocks of the file are processed. (By processing we mean the actual use of the uncompressed data coming out of the codec in a mapper.)
    
    We are writing the code to implement this suggested functionality. Although we have used bzip2 as an example, we have tried to extend Hadoop's compression interfaces so that any other codec with the same capability as bzip2 could easily use the splitting support. The details of these changes will be posted when we submit the code.
    Reason: New feature
    Author: Abdul Qadeer
    Ref: UNKNOWN

commit 8e47288583fcdbdf649ddf3486bf201788e79202
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:50:51 2010 -0800

    MAPREDUCE-707. Provide a jobconf property for explicitly assigning a job to a pool
    
    Description: A common use case of the fair scheduler is to have one pool per user, but then to define some special pools for various production jobs, import jobs, etc. Therefore, it would be nice if jobs went by default to the pool of the user who submitted them, but there was a setting to explicitly place a job in another pool. Today, this can be achieved through a sort of trick in the JobConf:
    
        <property>
          <name>mapred.fairscheduler.poolnameproperty</name>
          <value>pool.name</value>
        </property>
    
        <property>
          <name>pool.name</name>
          <value>${user.name}</value>
        </property>
    
    This JIRA proposes to add a property called mapred.fairscheduler.pool that allows a job to be placed directly into a pool, avoiding the need for this trick.
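    
    A minimal sketch of the lookup order this implies, assuming an explicit
    mapred.fairscheduler.pool value takes precedence and the submitting user's
    name is the fallback. The class, the method, and the use of
    java.util.Properties in place of JobConf are hypothetical stand-ins, not
    the scheduler's code:

```java
import java.util.Properties;

// Hypothetical sketch: choose a pool for a job from its configuration.
final class PoolNameSketch {
    static String poolFor(Properties jobConf, String submittingUser) {
        // Assumption: an explicitly configured pool wins over the default.
        String explicit = jobConf.getProperty("mapred.fairscheduler.pool");
        if (explicit != null && !explicit.trim().isEmpty()) {
            return explicit.trim();
        }
        // Default described above: the pool of the user who submitted the job.
        return submittingUser;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(poolFor(conf, "alice"));
        conf.setProperty("mapred.fairscheduler.pool", "production");
        System.out.println(poolFor(conf, "alice"));
    }
}
```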
    Reason: Configuration improvement
    Author: Alan Heirich
    Ref: UNKNOWN

commit 96e17e1e593b818a888c8dfc177b8fb36e514e8f
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:50:18 2010 -0800

    MAPREDUCE-967. (version 2) TaskTracker does not need to fully unjar job jars
    
    Description:
        This is a performance improvement for jobs that contain a large number of
        classes. The unpacking of these jars consumes a large amount of time, as
        does the resulting cleanup. This patch changes the classpath to simply
        include the jar itself, and only unpacks the lib/ directory out of the
        jar in order to add those dependencies to the classpath.
    
        Users who previously depended on this functionality for shipping non-code
        dependencies can use the undocumented configuration parameter
        "mapreduce.job.jar.unpack.pattern" to cause specific jar contents to be unpacked.
    
        This new patch version fixes a streaming regression where the "-file" argument
        no longer worked. It includes a new unit test, TestFileArgs, to protect
        against this regression.
    Author: Todd Lipcon
    Ref: UNKNOWN

commit cf08a128b87bbfae90babd61795599b3645d37a3
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:48:40 2010 -0800

    HDFS-455, MAPREDUCE-1441, HADOOP-6534. Allow spaces in between comma-separated elements in directory list configurations.
    
    Description: Make NN and DN handle comma-separated configuration strings in an intuitive way.
    
    The following configuration causes problems:
    
        <property>
          <name>dfs.data.dir</name>
          <value>/mnt/hstore2/hdfs, /home/foo/dfs</value>
        </property>
    
    The problem is that the space after the comma causes the second storage directory to be " /home/foo/dfs", which is in a directory named <SPACE> that contains a sub-dir named "home" in the Hadoop datanode's default directory. This will typically cause the user's home partition to fill, but it will be very hard for the user to diagnose, since a directory with a whitespace name is hard to spot.
    
    (ripped from HADOOP-2366, "Space in the value for dfs.data.dir can cause great problems", http://issues.apache.org/jira/browse/HADOOP-2366)
    
    This fixes any configuration consisting of a comma-separated list of directories
    (e.g., dfs.data.dir, dfs.name.dir, fs.checkpoint.dir, mapred.local.dir, etc) so that
    the elements may also contain separating whitespace. Without this patch,
    setting mapred.local.dir to "/disk1, /disk2" would create a directory by the name
    " " in the user's home directory, or fail outright. The patch trims the
    directory names as they are fetched from the configuration.
    
    Reason: Configuration improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 65a04ab8197a8db21a97d279ca881b5cd45a5365
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:48:03 2010 -0800

    HADOOP-2366. Space in the value for dfs.data.dir can cause great problems
    
    Description: The following configuration causes problems:
    
        <property>
          <name>dfs.data.dir</name>
          <value>/mnt/hstore2/hdfs, /home/foo/dfs</value>
          <description>
          Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
          </description>
        </property>
    
    The problem is that the space after the comma causes the second storage directory to be " /home/foo/dfs", which is in a directory named <SPACE> that contains a sub-dir named "home" in the Hadoop datanode's default directory. This will typically cause the user's home partition to fill, but it will be very hard for the user to diagnose, since a directory with a whitespace name is hard to spot.
    
    My proposed solution would be to trimLeft all path names from this and similar properties after splitting on comma. This still allows spaces in file and directory names but avoids this problem.
    
        This provides support in Configuration to get comma-separated string lists in such
        a way that whitespace in between elements is ignored. This patch is required for
        later patches which fix mapred.local.dir, dfs.data.dir, etc to support spaces
        in between elements.
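    
        The trimming behaviour can be sketched as follows. This is an
        illustrative stand-alone helper with hypothetical names, not the code
        added to Configuration; dropping empty elements is an assumption of
        the sketch:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split a comma-separated list and trim the whitespace
// around each element, leaving spaces inside an element untouched.
final class TrimmedList {
    static List<String> split(String value) {
        List<String> out = new ArrayList<>();
        if (value == null) {
            return out;
        }
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) { // assumption: empty elements are skipped
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(split("/mnt/hstore2/hdfs, /home/foo/dfs"));
    }
}
```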
    
        Test plan: unit tested in TestStringUtils
    Reason: Configuration improvement
    Author: Michele (@pirroh) Catasta
    Ref: UNKNOWN

commit 8d4807322a42509726b376b37a89739acd6cbd7d
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:47:55 2010 -0800

    MAPREDUCE-1356. Allow user-specified hive table name in sqoop
    
    Description: The table name used in a hive-destination import is currently pegged to the input table name. This should be user-configurable.
    Reason: New feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 8bf3439ff69762a33967dca4abb15c0cd2bb8417
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:47:45 2010 -0800

    MAPREDUCE-1395. Sqoop does not check return value of Job.waitForCompletion()
    
    Description: Old code depended on JobClient.runJob() throwing IOException on failure. Job.waitForCompletion can fail in that manner, or it can fail by returning false. Sqoop needs to check for this condition.
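    
    A sketch of the check, using a hypothetical WaitableJob interface in place
    of the real Job class so the example stays self-contained. Only the boolean
    return of waitForCompletion is taken from the description above:

```java
import java.io.IOException;

// Hypothetical sketch: treat a false return value as a failure, so callers
// see an exception in both failure modes.
final class JobResultCheck {
    /** Stand-in for the single Job method this sketch needs. */
    interface WaitableJob {
        boolean waitForCompletion(boolean verbose) throws Exception;
    }

    static void runOrThrow(WaitableJob job) throws Exception {
        // The job may fail by throwing, or by returning false.
        if (!job.waitForCompletion(true)) {
            throw new IOException("Job failed: waitForCompletion returned false");
        }
    }

    public static void main(String[] args) throws Exception {
        runOrThrow(verbose -> true); // a successful job passes silently
        System.out.println("job succeeded");
    }
}
```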
    Reason: bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit bd4e81234dd12fa9534577f0caa0db5c3d0a99fc
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:47:30 2010 -0800

    CLOUDERA-BUILD. Set HADOOP_PID_DIR to something smarter than /tmp
    
    Author: Chad Metcalf

commit 2466310d0e2a426e848860e9a8411b8ea14e1bb1
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:47:07 2010 -0800

    HADOOP-6453. Hadoop wrapper script shouldn't ignore an existing JAVA_LIBRARY_PATH
    
    Description: Currently the hadoop wrapper script assumes it's the only place that uses JAVA_LIBRARY_PATH and initializes it to an empty string.
    
        JAVA_LIBRARY_PATH=''
    
    This prevents anyone from setting this outside of the hadoop wrapper (say hadoop-config.sh) for their own native libraries.
    
    The fix is pretty simple. Don't initialize it to '' and append the native libs like normal.
    Reason: Bugfix (environment)
    Author: Chad Metcalf
    Ref: UNKNOWN

commit a67b4b1c361c26e002da64953a7f8bc068d29b98
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:46:42 2010 -0800

    MAPREDUCE-1327. Oracle database import via sqoop fails when a table contains the column types such as TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE
    
    Description: When an Oracle table contains the columns "TIMESTAMP(6) WITH LOCAL TIME ZONE" and "TIMESTAMP(6) WITH TIME ZONE", Sqoop fails to map values for those columns to valid Java data types, resulting in the following exception:
    
        ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
        java.lang.NullPointerException
                at org.apache.hadoop.sqoop.orm.ClassWriter.generateFields(ClassWriter.java:253)
                at org.apache.hadoop.sqoop.orm.ClassWriter.generateClassForColumns(ClassWriter.java:701)
                at org.apache.hadoop.sqoop.orm.ClassWriter.generate(ClassWriter.java:597)
                at org.apache.hadoop.sqoop.Sqoop.generateORM(Sqoop.java:75)
                at org.apache.hadoop.sqoop.Sqoop.importTable(Sqoop.java:87)
                at org.apache.hadoop.sqoop.Sqoop.run(Sqoop.java:175)
                at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
                at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
                at org.apache.hadoop.sqoop.Sqoop.main(Sqoop.java:201)
                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    
    Reason: Compatibility improvement
    Author: Leonid Furman
    Ref: UNKNOWN

commit a937ba2b9b6132883d727f856911ae31d22ad619
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:46:26 2010 -0800

    MAPREDUCE-1394. Sqoop generates incorrect URIs in paths sent to Hive
    
    Description: Hive used to require a ':8020' in HDFS URIs used with LOAD DATA statements, even though the normalized form of such a URI does not contain an explicit port number (since 8020 is the default port). Sqoop matched this by hacking the URI strings it forwarded to Hive.
    
    Hive fixed this bug a while ago; Sqoop should catch up.
    Reason: bugfix (compatibility)
    Author: Aaron Kimball
    Ref: UNKNOWN

commit c5c9b8bf0bf83637589a809b3c376cf74a2fb464
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:45:54 2010 -0800

    MAPREDUCE-1313. NPE in FieldFormatter if escape character is set and field is null
    
    Description: Performing an import with the --escaped-by character set on a table with a null field will cause a NullPointerException in FieldFormatter.
    Reason: bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 1c6dd471832946929928801dd9c9e4b79259ad9d
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:45:38 2010 -0800

    HADOOP-6460. Namenode runs out of memory due to memory leak in ipc Server
    
    Description: Namenode heap usage grows disproportionately to the number of objects it supports (files, directories and blocks). Based on heap dump analysis, this is due to large growth in ByteArrayOutputStream instances allocated in o.a.h.ipc.Server.Handler.run().
    Reason: Bugfix (Scalability)
    Author: Suresh Srinivas
    Ref: UNKNOWN

commit d190a8067827ce09cdcb7741d588cce0e0e7aa02
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:45:23 2010 -0800

    HADOOP-5687. Hadoop NameNode throws NPE if fs.default.name is the default value
    
    Description: Throwing an NPE is confusing; an exception with a useful string description could be thrown instead.
    Reason: Logging improvement
    Author: Philip Zeyliger
    Ref: UNKNOWN

commit 7604c6f69076effbb0c9793e114946d679f5912d
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:45:02 2010 -0800

    HADOOP-6505. sed in build.xml fails
    
    Description: I'm not sure whether this is a Solaris thing or an ant 1.7.1 thing, but it definitely doesn't do what it is supposed to. Instead of getting SunOS-x86-32 (or whatever) I get -x86-32.
    
    This patch replaces the sed call with tr.
    Reason: OS compatibility improvement
    Author: Allen Wittenauer
    Ref: UNKNOWN

commit ca662cbba6044be216b586e7359d9fc2f1dd4e4f
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:44:00 2010 -0800

    HDFS-908. (version 2) TestDistributedFileSystem fails with Wrong FS on weird hosts
    
    Description: On the same host where I experienced HDFS-874 ("TestHDFSFileContextMainOperations fails on weirdly configured DNS hosts", http://issues.apache.org/jira/browse/HDFS-874), I also experience this failure for TestDistributedFileSystem:
    
        Testcase: testFileChecksum took 0.492 sec
          Caused an ERROR
        Wrong FS: hftp://localhost.localdomain:59782/filechecksum/foo0, expected: hftp://127.0.0.1:59782
        java.lang.IllegalArgumentException: Wrong FS: hftp://localhost.localdomain:59782/filechecksum/foo0, expected: hftp://127.0.0.1:59782
          at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
          at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:222)
          at org.apache.hadoop.hdfs.HftpFileSystem.getFileChecksum(HftpFileSystem.java:318)
          at org.apache.hadoop.hdfs.TestDistributedFileSystem.testFileChecksum(TestDistributedFileSystem.java:166)
    
    Doesn't appear to occur on trunk or branch-0.21.
    
    This is version two of this patch. The previous patch fixed some systems
    but broke others.
    Reason: Bugfix
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 7fafe032223921ad194c69b16ab451b4aade87fa
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:43:41 2010 -0800

    HADOOP-4368. Superuser privileges required to do "df"
    
    Description: Superuser privileges are required in DFS in order to get the file system statistics (FSNamesystem.java, getStats method). This means that when HDFS is mounted via fuse-dfs as a non-root user, "df" is going to return 16 exabytes total and 0 free instead of the correct amount.
    
    As far as I can tell, there's no need to require superuser privileges to see the file system size (and historically in Unix, this is not required).
    
    To fix this, simply comment out the privilege check in the getStats method.
    Reason: Usability improvement
    Author: Craig Macdonald
    Ref: UNKNOWN

commit 6129c87f5dd1fdb7375c80285534b8b91fbcd392
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:43:25 2010 -0800

    HDFS-412. Hadoop JMX usage makes Nagios monitoring impossible
    
    Description: When Hadoop reports Datanode information to JMX, the bean uses the name "DataNode-" + storageid. The storage ID incorporates a random number and is unpredictable.
    
    This prevents me from monitoring DFS datanodes through Hadoop using the JMX interface; in order to do that, you must be able to specify the bean name on the command line.
    
    The fix is simple; a patch will be coming momentarily. However, there was probably a reason for giving the datanodes unique names that I'm unaware of, so it'd be nice to hear from the metrics maintainer.
    Reason: Monitoring improvement
    Author: Brian Bockelman
    Ref: UNKNOWN

commit 5dfcc6d2d7806636c6237996e1b28a00ba075b4b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:43:05 2010 -0800

    HADOOP-6503. contrib projects should pull in the ivy-fetched libs from the root project
    
    Description: On branch-20 currently, I get an error just running "ant contrib -Dtestcase=TestHdfsProxy". In a full "ant test" build sometimes this doesn't appear to be an issue. The problem is that the contrib projects don't automatically pull in the dependencies of the "Hadoop" ivy project. Thus, they each have to declare all of the common dependencies like commons-cli, etc. Some are missing and this causes test failures.
    Reason: Build system improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit be70b10f11445f4a71807405718bfeebd38ad924
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:42:51 2010 -0800

    MAPREDUCE-1155. Streaming tests swallow exceptions
    
    Description: Many of the streaming tests (including TestMultipleArchiveFiles) catch exceptions and print their stack trace rather than failing the job. This means that tests do not fail even when the job fails.
    Reason: Test coverage improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit f84830ae5e6c862cd0e2b8ebea57880e54c8a082
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:42:33 2010 -0800

    HADOOP-5647. TestJobHistory fails if /tmp/_logs is not writable to. Testcase should not depend on /tmp
    
    Description: TestJobHistory sets /tmp as hadoop.job.history.user.location to check if the history file is created in that directory or not. If /tmp/_logs is already created by some other user, this test will fail because of not having write permission.
    Reason: Bugfix in test harness
    Author: Ravi Gummadi
    Ref: UNKNOWN

commit 669b65f14d78ffd1cf0304cf459d1abbae3412ae
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:42:15 2010 -0800

    CLOUDERA-BUILD. Fix javadoc warnings shown by test-patch, and update eclipse classpath to match current CDH.
    
    Author: Todd Lipcon

commit 51804fd45d3a527a130a373c591a17c185102a0c
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:41:40 2010 -0800

    Revert "HDFS-127: DFSClient block read failures cause open DFSInputStream to become unusable"
    
    Description: This is being reverted as it causes infinite retries when there are no valid replicas.
    Reason: bugfix
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 623bfc0c18087274315dfbd41d025a8a775abe80
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:40:30 2010 -0800

    HDFS-877. Client-driven block verification not functioning
    
    Description: This is actually the reason for HDFS-734 ("TestDatanodeBlockScanner times out in branch 0.20", http://issues.apache.org/jira/browse/HDFS-734). The issue is that DFSInputStream relies on readChunk being called one last time at the end of the file in order to receive the lastPacketInBlock=true packet from the DN. However, DFSInputStream.read checks pos < getFileLength() before issuing the read. Thus gotEOS never shifts to true and checksumOk() is never called.
    
    This is a simpler patch than the one on 0.21/0.22 since those fix a further regression
    since 0.20.
    
    Reason: bugfix
    Author: Todd Lipcon
    Ref: UNKNOWN

commit b332fe77255047409da701dfb97df1bddb5b10cb
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:40:05 2010 -0800

    CLOUDERA-BUILD. Add mockito to 0.20 branch for easier unit testing of HDFS stability patches.
    
    Reason: Test coverage improvement
    Author: Todd Lipcon

commit 44a6c559de056b35c6eb2e2d53798c88d8c779e6
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:39:09 2010 -0800

    HDFS-630. In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.
    
    Description: Created from HDFS-200.
    
    If, during a write, the DFSClient sees that a block replica location for a newly allocated block is not connectable, it re-requests a fresh set of replica locations for the block from the NN. It tries this dfs.client.block.write.retries times (default 3), sleeping 6 seconds between each retry (see DFSClient.nextBlockOutputStream).
    
    This setting works well when you have a reasonably sized cluster; if you have few datanodes in the cluster, every retry may pick the dead datanode, and the above logic bails out.
    
    Our solution: when getting block locations from the namenode, we give the NN the excluded datanodes. The list of dead datanodes applies to only one block allocation.
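    
    A rough sketch of this retry-with-exclusion loop, under stated assumptions:
    the allocator function stands in for the namenode request, the predicate
    stands in for connecting to a datanode, all names are hypothetical, and the
    sleep between retries is omitted. It is not the DFSClient code:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: ask for block targets, and on a connection failure
// retry while telling the allocator which datanodes to avoid.
final class ExcludeRetrySketch {
    static List<String> allocate(Function<Set<String>, List<String>> allocator,
                                 Predicate<String> connectable,
                                 int maxAttempts) {
        // The excluded set lives only for this one block allocation.
        Set<String> excluded = new HashSet<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<String> targets = allocator.apply(excluded);
            if (targets.isEmpty()) {
                break; // nothing left to try
            }
            String bad = null;
            for (String dn : targets) {
                if (!connectable.test(dn)) {
                    bad = dn;
                    break;
                }
            }
            if (bad == null) {
                return targets; // every target accepted the connection
            }
            excluded.add(bad); // do not offer this node again for this block
        }
        throw new IllegalStateException("unable to allocate block; excluded=" + excluded);
    }
}
```

    Without the excluded set, a small cluster can hand back the same dead
    datanode on every attempt, which is the failure described above.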
    Reason: bugfix (Fault tolerance improvement)
    Author: Cosmin Lehene (modified by Cloudera to not break compatibility)
    Ref: UNKNOWN

commit 47c404e0cf10ceb31336d2a77d53e0a971348102
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:37:37 2010 -0800

    HDFS-908. TestDistributedFileSystem fails with Wrong FS on weird hosts
    
    Description: On the same host where I experienced HDFS-874 ("TestHDFSFileContextMainOperations fails on weirdly configured DNS hosts", http://issues.apache.org/jira/browse/HDFS-874), I also experience this failure for TestDistributedFileSystem:
    
        Testcase: testFileChecksum took 0.492 sec
          Caused an ERROR
        Wrong FS: hftp://localhost.localdomain:59782/filechecksum/foo0, expected: hftp://127.0.0.1:59782
        java.lang.IllegalArgumentException: Wrong FS: hftp://localhost.localdomain:59782/filechecksum/foo0, expected: hftp://127.0.0.1:59782
          at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
          at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:222)
          at org.apache.hadoop.hdfs.HftpFileSystem.getFileChecksum(HftpFileSystem.java:318)
          at org.apache.hadoop.hdfs.TestDistributedFileSystem.testFileChecksum(TestDistributedFileSystem.java:166)
    
    Doesn't appear to occur on trunk or branch-0.21.
    Reason: bugfix
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 7c2a791f0a397d924a623e45bf823c238374c42c
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:37:19 2010 -0800

    MAPREDUCE-1258. Fair scheduler event log not logging job info
    
    Description: The MAPREDUCE-706 patch ("Support for FIFO pools in the fair scheduler", http://issues.apache.org/jira/browse/MAPREDUCE-706) seems to have left an unfinished TODO in the Fair Scheduler: in the dump() function for periodically dumping scheduler state to the event log, the part that dumps information about jobs is commented out. This makes the event log less useful than it was before.
    
    It should be fairly easy to update this part to use the new scheduler data structures (Schedulable etc.) and print the data.
    Reason: Logging improvement
    Author: Matei Zaharia
    Ref: UNKNOWN

commit 353f7813bf7dfb0bca1362f9370f6a080256a345
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:36:58 2010 -0800

    MAPREDUCE-1198. Alternatively schedule different types of tasks in fair share scheduler
    
    Description: Matei has mentioned in MAPREDUCE-961 (ResourceAwareLoadManager to dynamically decide new tasks based on current CPU/memory load on TaskTracker(s)) that the current scheduler will first try to launch map tasks until canLaunchTask() returns false and only then look for reduce tasks. This might starve reduce tasks. He also mentioned that alternating between the different types of tasks when scheduling can solve this problem.
    Reason: bugfix
    Author: Scott Chen
    Ref: UNKNOWN

commit ef449fb7832055951e2364cf12a73717b2add3ce
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:36:50 2010 -0800

    MAPREDUCE-698. Per-pool task limits for the fair scheduler
    
    Description: The fair scheduler could use a way to cap the share of a given pool similar to MAPREDUCE-532 (Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of a queue).
    Reason: New feature
    Author: Kevin Peterson
    Ref: UNKNOWN

commit a1e25ec70e677db322b2cce43c6381f865eb3f79
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:36:42 2010 -0800

    HDFS-464. Memory leaks in libhdfs
    
    Description: hdfsExists never calls destroyLocalReference for jPath,
    hdfsDelete does not call it when it fails, and
    hdfsRename does not call it for jOldPath and jNewPath when it fails.
    Reason: bugfix
    Author: Christian Kunz
    Ref: UNKNOWN

commit d93dad715d3c702d15c2a32c85d586c708e70857
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:36:23 2010 -0800

    CLOUDERA-BUILD. Add test ivy configurations to additional projects.
    
    Author: Aaron Kimball
    Reason: Build system improvement

commit 5d0c8f82b87e7cbb541ace9e4f22abfad2799e56
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:35:08 2010 -0800

    CLOUDERA-BUILD. Sqoop bin script now includes jars from contrib/sqoop/lib/ on classpath.
    
    Author: Aaron Kimball

commit 7e009a29c0806537cd50972df90ec87b617eb78f
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:34:54 2010 -0800

    MAPREDUCE-1212. Mapreduce contrib project ivy dependencies are not included in binary target
    
    Description: As in HADOOP-6370 (Contrib project ivy dependencies are not included in binary target), only Hadoop's own library dependencies are promoted to ${build.dir}/lib; any libraries required by contribs are not redistributed.
    Reason: Build system (packaging) improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 8d289f97d6b66cd435f755a4acae9f138de934d6
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:34:43 2010 -0800

    CLOUDERA-BUILD. Update cloud script version to cdh-0.20.1
    
    Author: Tom White

commit ac7eacd44af059d7a859b8d6773a82cd84ba4c9b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:34:35 2010 -0800

    HADOOP-6466. Add a ZooKeeper service to the cloud scripts
    
    Description: It would be good to add other Hadoop services to the cloud scripts.
    Reason: New feature
    Author: Tom White
    Ref: UNKNOWN

commit 06ceb079693292a41085af795c5b2bbc3fd10af2
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:34:24 2010 -0800

    HADOOP-6454. Create setup.py for EC2 cloud scripts
    
    Description: This would make it easier to install the scripts.
    Reason: Installation improvement
    Author: Tom White
    Ref: UNKNOWN

commit 23c45791bbc3a23d69c77f3518b5d1a1a4702ccc
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:34:11 2010 -0800

    HADOOP-6462. contrib/cloud failing, target "compile" does not exist
    
    Description: I'm not seeing this mentioned in Hudson or other bug reports, which confuses me. With the addition of a src/contrib/cloud/build.xml from HADOOP-6426 (Create ant build for running EC2 unit tests), contrib/build.xml no longer builds:
    hadoop-common/src/contrib/build.xml:30: The following error occurred while executing this line:
    Target "compile" does not exist in the project "hadoop-cloud".
    
    What is odd is this: the final patch of HADOOP-6426 does include the stub <target> files needed, yet they aren't in SVN_HEAD, which implies that a different version may have gone in than intended.
    Reason: Build system bugfix
    Author: Tom White
    Ref: UNKNOWN

commit 083a6a1cfb2a5198243aa82a020681ad62da5938
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:33:58 2010 -0800

    HADOOP-6444. Support additional security group option in hadoop-ec2 script
    
    Description: When deploying a Hadoop cluster on EC2 alongside other services, it is very useful to be able to specify additional (pre-existing) security groups to facilitate access control. For example, one could use this feature to add a cluster to a generic "hadoop" group, which authorizes HDFS access from instances outside the cluster. Without such an option, the access control for the security groups created by the script needs to be manually updated after cluster launch.
    Reason: Security improvement
    Author: Paul Egan
    Ref: UNKNOWN

commit 63152ce4ba3c0cf2006016cc825fc72b0bd23d2d
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:33:49 2010 -0800

    HADOOP-6426. Create ant build for running EC2 unit tests
    
    Description: There is no easy way currently to run the Python unit tests for the cloud contrib.
    Reason: Test coverage improvement
    Author: Tom White
    Ref: UNKNOWN

commit a20069b2adfafa59e0001fe5e5685d36d9eb7fee
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:33:15 2010 -0800

    HADOOP-6392. Run namenode and jobtracker on separate EC2 instances
    
    Description: Replace concept of "master" with that of "namenode" and "jobtracker". Still need to be able to run both on one node, of course.
    Reason: Scalability improvement
    Author: Tom White
    Ref: UNKNOWN

commit 361221a2a082d0ab7a87ba0226dbe05938440738
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:33:07 2010 -0800

    HADOOP-6108. Add support for EBS storage on EC2
    
    Description: By using EBS for namenode and datanode storage we can have persistent, restartable Hadoop clusters running on EC2.
    Reason: New feature
    Author: Tom White
    Ref: UNKNOWN

commit 4ca1c78e1b257eefa10b5ed94479df8a6473d3e9
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:32:50 2010 -0800

    HDFS-861. fuse-dfs does not support O_RDWR
    
    Description: Some applications (for us, the big one is rsync) will open a file in read-write mode when they really only intend to read xor write (not both). fuse-dfs should try not to fail until the application actually tries to write to a pre-existing file or read from a newly created file.
    Reason: bugfix
    Author: Brian Bockelman
    Ref: UNKNOWN

commit 00f6976093cc20ea825a35f6831f645dc5f61637
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:32:17 2010 -0800

    HDFS-860. fuse-dfs truncate behavior causes issues with scp
    
    Description: For whatever reason, scp issues a "truncate" once it's written a file to truncate the file to the # of bytes it has written (i.e., if a file is X bytes, it calls truncate(X)).
    
    This fails on the current fuse-dfs.
    Reason: bugfix (tool compatibility)
    Author: Brian Bockelman
    Ref: UNKNOWN

commit 46d2b6d6b27887375c44d691d776f70e89e4b81b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:31:58 2010 -0800

    HDFS-859. fuse-dfs utime behavior causes issues with tar
    
    Description: When trying to untar files onto fuse-dfs, tar will try to set the utime on all the files and directories. However, setting the utime on a directory in libhdfs causes an error.
    
    We should silently ignore the failure of setting a utime on a directory; this will allow tar to complete successfully.
    Reason: bugfix (tool compatibility)
    Author: Brian Bockelman
    Ref: UNKNOWN

commit 9a38b9c423aca358307aa6455977432f34aef990
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:31:45 2010 -0800

    HDFS-858. Incorrect return codes for fuse-dfs
    
    Description: fuse-dfs doesn't pass proper error codes from libhdfs; places I'd like to correct are hdfsFileOpen (which can result in permission denied or quota violations) and hdfsWrite (which can result in quota violations).
    
    By returning the correct error codes, command line utilities return much better error messages, especially for quota violations, which can be a devil to debug.
    Reason: bugfix
    Author: Brian Bockelman
    Ref: UNKNOWN

commit 84afb26bb0e42eda1e26b07e3aac016695f5ad87
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:31:37 2010 -0800

    HDFS-857. Incorrect type for fuse-dfs capacity can cause "df" to return negative values on 32-bit machines
    
    Description: On sufficiently large HDFS installs, the casting of hdfsGetCapacity to a long may cause "df" to return negative values.  tOffset should be used instead.
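    
    Illustration (not part of the patch): the fix itself is in C, but the same truncation can be reproduced in Java by narrowing a 64-bit byte count to a 32-bit signed integer, which is the width of a C "long" on typical 32-bit platforms. The class and method names below are hypothetical.
    
    ```java
    // Hypothetical demo class; it is not part of fuse-dfs or libhdfs.
    final class CapacityNarrowingDemo {
      // Mimics storing a 64-bit byte count in a 32-bit signed integer.
      static int narrowTo32Bits(long capacityBytes) {
        return (int) capacityBytes;
      }
    
      public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024; // 3221225472 bytes
        System.out.println(narrowTo32Bits(threeGiB)); // negative after truncation
        System.out.println(threeGiB);                 // intact as a 64-bit value
      }
    }
    ```
    
    Any capacity at or above 2 GiB no longer fits in 31 value bits, which is why a 64-bit type such as tOffset is needed end to end.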
    Reason: bugfix
    Author: Brian Bockelman
    Ref: UNKNOWN

commit a4cf3e8e86cbd42bef25eb3aab7e464ac86e3068
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:31:19 2010 -0800

    HDFS-856. Hardcoded replication level for new files in fuse-dfs
    
    Description: In fuse-dfs, the number of replicas is always hardcoded to 3 in the arguments to hdfsOpenFile.  We should use the setting in the hadoop configuration instead.
    Reason: Configuration improvement
    Author: Brian Bockelman
    Ref: UNKNOWN

commit e9f3ec90e57b383faf49e6a6eb8cc91e5182d31e
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:31:08 2010 -0800

    HADOOP-5625. Add I/O duration time in client trace
    
    Description: Add I/O duration information into client trace log for analyzing performance.
    
    Reason: Logging improvement
    Author: Lei Xu
    Ref: UNKNOWN

commit 42eeb4540850278563e76841f0c6b369933d5b70
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:30:43 2010 -0800

    HADOOP-5222. Add offset in client trace
    
    Description: By adding the offset to the client trace, the client trace information can provide more accurate information about I/O.
    It is useful for performance analysis.
    
    Since there is no random write now, the offset of writing is always zero.
    Reason: Logging improvement
    Author: Lei Xu
    Ref: UNKNOWN

commit 5880960fb32ae0fc2c16bac1f333dbb237c3448f
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:30:27 2010 -0800

    CLOUDERA-BUILD. Solaris do-release-build fix
    
    Author: Eli Collins
    Ref: CDH-531

commit 35f87aef6d7cd4030644a1d454da2e0a6e2969c0
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:30:18 2010 -0800

    MAPREDUCE-1310. CREATE TABLE statements for Hive do not correctly specify delimiters
    
    Description: Imports to HDFS via Sqoop that also inject metadata into Hive do not correctly specify delimiters; using Hive to access the data results in rows being parsed as NULL characters. See http://getsatisfaction.com/cloudera/topics/sqoop_hive_import_giving_null_query_values for an example bug report.
    Reason: Bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 60784d712cdd5781ceff262bb67e2d484fde428b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:29:56 2010 -0800

    MAPREDUCE-1235. java.io.IOException: Cannot convert value '0000-00-00 00:00:00' from column 6 to TIMESTAMP.
    
    Description: java.io.IOException is thrown when trying to import a table to HDFS using Sqoop. The table has a "0" value in a field of type datetime.
    Full Exception: java.io.IOException: Cannot convert value '0000-00-00 00:00:00' from column 6 to TIMESTAMP.
    Original question: http://getsatisfaction.com/cloudera/topics/cant_import_table?utm_content=reply_link&utm_medium=email&utm_source=reply_notification
    Reason: Bugfix (compatibility)
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 23c116b6ab5615bdb846e22b61a41e92ca287bdf
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:29:47 2010 -0800

    MAPREDUCE-1174. Sqoop improperly handles table/column names which are reserved sql words
    
    Description: In some databases it is legal to name tables and columns with terms that overlap SQL reserved keywords (e.g., CREATE, table, etc.). In such cases, the database allows you to escape the table and column names. We should always escape table and column names when possible.
    Reason: Bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit d4b3b7592c94aa1f4608245829b5de202ed1b148
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:29:39 2010 -0800

    MAPREDUCE-1168. Export data to databases via Sqoop
    
    Description: Sqoop can import from a database into HDFS. It's high time it works in reverse too.
    Reason: New feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit b29023803d1136bf7d4de45853a2d4481fb36d3c
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:29:24 2010 -0800

    MAPREDUCE-1169. Improvements to mysqldump use in Sqoop
    
    Description: Improve Sqoop's integration with mysqldump
    Reason: Feature/performance improvements
    Author: Aaron Kimball
    Ref: UNKNOWN
    
    commit c6b956630e327ddabf674f8e06de02408e603155
    Author: Aaron Kimball <aaron@cloudera.com>
    Date:   Wed Jan 6 16:05:05 2010 -0800
    
        MAPREDUCE-1169. Improvements to mysqldump use in Sqoop

commit 26ba4fd749755a3df79eaa27792662e5b7e3da80
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:29:15 2010 -0800

    MAPREDUCE-1036. An API Specification for Sqoop
    
    Description: Over the last several months, Sqoop has evolved to a state that is functional and has room for extensions. Developing extensions requires a stable API and documentation. I am attaching to this ticket a description of Sqoop's design and internal APIs, which include some open questions. I would like to solicit input on the design regarding these open questions and standardize the API.
    Reason: Documentation
    Author: Aaron Kimball
    Ref: UNKNOWN

commit e8c47124bb2ada5de0cfdf49150dd7296a41df71
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:29:04 2010 -0800

    MAPREDUCE-1069. Implement Sqoop API refactoring
    
    Description: Implement refactoring decisions outlined in MAPREDUCE-1036 (An API Specification for Sqoop).
    Reason: API compatibility
    Author: Aaron Kimball
    Ref: UNKNOWN

commit b73cab8083c1594c0328a565eef05951a17f998a
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:28:46 2010 -0800

    MAPREDUCE-1146. Sqoop dependencies break Eclipse build on Linux
    
    Description: Under Linux, the following error appears in the Eclipse "Problems" view:
    
    - "com.sun.tools cannot be resolved" at line 166 of org.apache.hadoop.sqoop.orm.CompilationManager
    
    The problem doesn't appear on MacOS, though.
    Reason: bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 0629ac30abb5e58fb80be56a385867ac7360de22
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:28:37 2010 -0800

    MAPREDUCE-1148. SQL identifiers are a superset of Java identifiers
    
    Description: SQL identifiers can contain arbitrary characters, can start with numbers, can be words like "class" which are reserved in Java, etc. If Sqoop uses these names literally for class and field names then compilation errors can occur in auto-generated classes. SQL identifiers need to be cleansed to map onto Java identifiers.
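    
    Illustration (not the patch itself): a minimal sketch of one way such cleansing could work, assuming illegal characters become underscores and a leading underscore is added where needed. IdentifierCleanser and toJavaIdentifier are hypothetical names, and the reserved-word set is only a sample.
    
    ```java
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;
    
    // Hypothetical helper; Sqoop's real implementation may differ.
    final class IdentifierCleanser {
      // Only a sample of Java's reserved words; a real mapping needs all of them.
      private static final Set<String> SAMPLE_RESERVED = new HashSet<String>(
          Arrays.asList("class", "int", "public", "static", "void", "new"));
    
      static String toJavaIdentifier(String sqlName) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < sqlName.length(); i++) {
          char c = sqlName.charAt(i);
          // Replace anything that cannot appear inside a Java identifier.
          sb.append(Character.isJavaIdentifierPart(c) ? c : '_');
        }
        // Identifiers may not be empty or start with a digit.
        if (sb.length() == 0 || !Character.isJavaIdentifierStart(sb.charAt(0))) {
          sb.insert(0, '_');
        }
        String candidate = sb.toString();
        // Prefix reserved words so they no longer collide with the keyword.
        return SAMPLE_RESERVED.contains(candidate) ? "_" + candidate : candidate;
      }
    }
    ```
    
    Note that two distinct SQL names can map to the same Java name under this scheme (for example "a-b" and "a b"), so a complete solution would also need collision handling.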
    Reason: bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit dec4c616921b547e5a332a254254d77efc3a7d5e
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:28:25 2010 -0800

    MAPREDUCE-1224. Calling "SELECT t.* from <table> AS t" to get meta information is too expensive for big tables
    
    Description: The query that the SqlManager uses to get the table spec, "SELECT t.* from <table> AS t", is too expensive for big tables, and it is called twice to generate column names and types. For tables that are big enough to be map-reduced, this is too expensive to make Sqoop useful.
    Reason: Performance improvement
    Author: Spencer Ho
    Ref: UNKNOWN

commit 1198ef1375387ba107d46f0ab5e9a7c6a7645931
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:28:15 2010 -0800

    MAPREDUCE-706. Support for FIFO pools in the fair scheduler
    
    Description: The fair scheduler should support making the internal scheduling algorithm for some pools be FIFO instead of fair sharing in order to work better for batch workloads. FIFO pools will behave exactly like the current default scheduler, sorting jobs by priority and then submission time. Pools will have their scheduling algorithm set through the pools config file, and it will be changeable at runtime.
    
    To support this feature, I'm also changing the internal logic of the fair scheduler to no longer use deficits. Instead, for fair sharing, we will assign tasks to the job farthest below its share as a ratio of its share. This is easier to combine with other scheduling algorithms and leads to a more stable sharing situation, avoiding unfairness issues brought up in MAPREDUCE-543 (large pending jobs hog resources) and MAPREDUCE-544 (deficit computation is biased by historical load) that happen when some jobs have long tasks. The new preemption (MAPREDUCE-551, Add preemption to the fair scheduler) will ensure that critical jobs can gain their fair share within a bounded amount of time.
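    
    Illustration (not the patch itself): the "farthest below its share as a ratio of its share" rule can be sketched as picking the job with the smallest runningTasks / fairShare ratio. FairShareSketch, Job and pickNext are hypothetical names, and skipping jobs with a non-positive share is an assumption of this sketch.
    
    ```java
    import java.util.List;
    
    // Hypothetical sketch; the real scheduler works on its own Schedulable types.
    final class FairShareSketch {
      static final class Job {
        final String name;
        final int runningTasks;
        final double fairShare;
    
        Job(String name, int runningTasks, double fairShare) {
          this.name = name;
          this.runningTasks = runningTasks;
          this.fairShare = fairShare;
        }
      }
    
      // Returns the job with the lowest runningTasks / fairShare ratio,
      // or null if no job has a positive share.
      static Job pickNext(List<Job> jobs) {
        Job best = null;
        double bestRatio = Double.POSITIVE_INFINITY;
        for (Job j : jobs) {
          if (j.fairShare <= 0) {
            continue; // assumption: jobs without a share are not considered
          }
          double ratio = j.runningTasks / j.fairShare;
          if (ratio < bestRatio) {
            best = j;
            bestRatio = ratio;
          }
        }
        return best;
      }
    }
    ```
    
    Because the choice depends only on current usage and not on an accumulated deficit, a job with long-running tasks does not build up a backlog of "owed" slots.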
    Reason: New feature
    Author: Matei Zaharia
    Ref: UNKNOWN

commit 5699f5483e2a9ee9debd0f0154c6506ee5dc87e2
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:28:03 2010 -0800

    MAPREDUCE-1285. DistCp cannot handle -delete if destination is local filesystem
    
    Description: The following exception is thrown:
    
    Copy failed: java.io.IOException: wrong value class: org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus is not class org.apache.hadoop.fs.FileStatus
    	at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:988)
    	at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:977)
    	at org.apache.hadoop.tools.DistCp.deleteNonexisting(DistCp.java:1226)
    	at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1134)
    	at org.apache.hadoop.tools.DistCp.copy(DistCp.java:650)
    	at org.apache.hadoop.tools.DistCp.run(DistCp.java:857)
    	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    
    Reason: bugfix
    Author: Peter Romianowski
    Ref: UNKNOWN

commit 34bb813a5884aeb05909c2ce2cc541882ca3eda1
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:27:53 2010 -0800

    MAPREDUCE-764. TypedBytesInput's readRaw() does not preserve custom type codes
    
    Description: The typed bytes format supports byte sequences of the form "<custom type code> <length> <bytes>". When reading such a sequence via TypedBytesInput's readRaw() method, however, the returned sequence currently is "0 <length> <bytes>" (0 is the type code for a bytes array), which leads to bugs such as the one described at http://dumbo.assembla.com/spaces/dumbo/tickets/54.
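    
    Illustration (not the patch itself): a minimal sketch of a raw read that keeps the type code, assuming a one-byte type code followed by a four-byte big-endian length and the payload. RawTypedBytesSketch and readRawPreservingCode are hypothetical names and do not reflect the TypedBytesInput API.
    
    ```java
    import java.nio.ByteBuffer;
    
    // Hypothetical sketch; it only handles "<type code> <length> <bytes>" records.
    final class RawTypedBytesSketch {
      // Copies one record verbatim, so the type code found in the input
      // is returned unchanged instead of being rewritten to 0.
      static byte[] readRawPreservingCode(ByteBuffer in) {
        int start = in.position();
        in.get();                  // skip over the type code without altering it
        int length = in.getInt();  // ByteBuffer reads big-endian by default
        byte[] record = new byte[1 + 4 + length];
        in.position(start);        // rewind to the start of the record
        in.get(record);            // copy code, length and payload together
        return record;
      }
    }
    ```
    
    With an input record whose first byte is a custom code such as 100, the returned array starts with 100 as well.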
    Reason: bugfix
    Author: Klaas Bosteels
    Ref: UNKNOWN

commit 7fd2cb371354219abd108fda35087f08dc481b35
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:27:31 2010 -0800

    HADOOP-6400. Log errors getting Unix UGI
    
    Description: For various reasons, the calls out to `whoami` and `id` can fail when trying to get the unix UGI information. Currently it silently ignores failures and uses the default DrWho/Tardis ugi. This is extremely confusing for users - we should log the exception at warn level when the shell execs fail.
    Reason: Debug logging improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit d6dc22fecc058e12695a481fa354078d9b012089
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:27:21 2010 -0800

    MAPREDUCE-1293. AutoInputFormat doesn't work with non-default FileSystems
    
    Description: AutoInputFormat uses the wrong FileSystem.get() method when getting a reference to a FileSystem object. AutoInputFormat gets the default FileSystem, so this method breaks if the InputSplit's path is pointing to a different FileSystem.
    Reason: bugfix
    Author: Andrew Hitchcock
    Ref: UNKNOWN

commit 25a4ea86b0b085e3afd6f2f040201594155b3de1
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:27:09 2010 -0800

    MAPREDUCE-1131. Using profilers other than hprof can cause JobClient to report job failure
    
    Description: If task profiling is enabled, the JobClient will download the <tt>profile.out</tt> file created by the tasks under profile. If this causes an IOException, the job is reported as a failure to the client, even though all the tasks themselves may complete successfully. The expected result files are assumed to be generated by hprof. Using the profiling system with other profilers will cause job failure.
    Reason: compatibility bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit ab98123c7114752945452af0b96c8de04af9ba93
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:26:02 2010 -0800

    MAPREDUCE-370. Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
    
    Description: Ports the MultipleOutputs OutputFormat to the new context-based API.
    Reason: API compatibility improvement.
    Author: Amareshwari Sriramadasu
    Ref: UNKNOWN

commit 50726d13750f3f71d2fc5d3a012ce81aa2adb26d
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:24:46 2010 -0800

    CLOUDERA-BUILD. Backport MapReduceTestUtil to Hadoop 0.20
    
    Description: MapReduceTestUtil is required for unit tests in subsequent
    patches, but this class itself was not created in one clean JIRA. Therefore
    it was backported "As-is" from the trunk and not in a patch-wise fashion.
    This class is only used in the JUnit tests for Hadoop.
    Author: Aaron Kimball
    Reason: Testing improvement
    Ref: UNKNOWN

commit d713dc1063afc4967381b6583ec424d2850bac63
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:24:30 2010 -0800

    MAPREDUCE-1059. distcp can generate uneven map task assignments
    
    Description: distcp writes out a SequenceFile containing the source files to transfer, and their sizes. Map tasks are created over spans of this file, representing files which each mapper should transfer. In practice, some transfer loads yield many empty map tasks and a few tasks perform the bulk of the work.
    Reason: Improvement for load balancing
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 855b0bf3718f2c397ef79967475468e4153f120a
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:24:20 2010 -0800

    MAPREDUCE-1128. MRUnit Allows Iteration Twice
    
    Description: MRUnit allows one to iterate over a collection of values twice, i.e.:
    
    reduce(Key key, Iterable<Value> values, Context context) {
       for (Value v : values) /* iterate once */;
       for (Value v : values) /* iterate again */;
    }
    
    Hadoop will allow this as well; however, the second iterator will be empty. MRUnit should either match Hadoop's behavior or warn the user that their code is likely flawed.
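    
    Illustration (not the patch itself): the single-pass behavior described above can be mimicked with an Iterable that always hands back the same underlying iterator. SinglePassValues and count are hypothetical names; this is not how Hadoop or MRUnit implement it.
    
    ```java
    import java.util.Iterator;
    
    // Hypothetical sketch of an Iterable whose values can be consumed only once.
    final class SinglePassValues<T> implements Iterable<T> {
      private final Iterator<T> source;
    
      SinglePassValues(Iterator<T> source) {
        this.source = source;
      }
    
      // Every call returns the same iterator, so a second for-each loop
      // finds it already exhausted.
      @Override
      public Iterator<T> iterator() {
        return source;
      }
    
      // Counts the values seen in one pass over the Iterable.
      static <T> int count(Iterable<T> values) {
        int n = 0;
        for (T ignored : values) {
          n++;
        }
        return n;
      }
    }
    ```
    
    Counting the values of such an object twice yields the full count the first time and zero the second time, which is the behavior a test harness would need to reproduce or warn about.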
    Reason: bugfix (API compatibility)
    Author: Aaron Kimball
    Ref: UNKNOWN

commit c9d77f6e1fdbb24b45675e363e3bd5111533893a
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:24:10 2010 -0800

    HDFS-464. Memory leaks in libhdfs
    
    Description: hdfsExists never calls destroyLocalReference for jPath,
    hdfsDelete does not call it when it fails, and
    hdfsRename does not call it for jOldPath and jNewPath when it fails.
    Reason: bugfix
    Author: Christian Kunz
    Ref: UNKNOWN

commit c7996c5e2fbb9260740fec369550551d6320762a
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:23:51 2010 -0800

    HDFS-423. Unbreak FUSE build and fuse_dfs_wrapper.sh
    
    Description: fuse-dfs depends on libhdfs, and the fuse-dfs build.xml still points to the libhfds/libhdfs.so location, but libhdfs is now built in a different location.
    Please take a look at this bug for the location details:
    
    https://issues.apache.org/jira/browse/HADOOP-3344
    
    Thanks,
    Giri
    Reason: Build system bugfix
    Author: Eli Collins
    Ref: UNKNOWN

commit 72b0b791cd347e760807a44f5197599f57afde03
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:23:39 2010 -0800

    CLOUDERA-BUILD. Make bin/hadoop-config.sh work with dev builds
    
    Author: Eli Collins

commit a9466041ccfcdb07f4f0dd34a57c9e9bdd6a3e70
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:23:06 2010 -0800

    HDFS-727. bug setting block size hdfsOpenFile
    
    Description: In hdfsOpenFile in libhdfs invokeMethod needs to cast the block size argument to a jlong so a full 8 bytes are passed (rather than 4 plus some garbage which causes writes to fail due to a bogus block size).
    
    Reason: Bugfix
    Author: Eli Collins
    Ref: UNKNOWN

commit 4e7d205daa86d904614252101bb422664ab6d203
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:22:47 2010 -0800

    Revert MAPREDUCE-967. TaskTracker does not need to fully unjar job jars
    
    Author: Todd Lipcon
    Ref: UNKNOWN

commit d5f0c77a6c81e9e56da81976645614280247f7a2
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:22:18 2010 -0800

    HADOOP-5640. Allow ServicePlugins to hook callbacks into key service events
    
    Description: HADOOP-5257 (Export namenode/datanode functionality through a pluggable RPC layer) added the ability for NameNode and DataNode to start and stop ServicePlugin implementations at NN/DN start/stop. However, this is insufficient integration for some common use cases.
    
    We should add some functionality for Plugins to subscribe to events generated by the service they're plugging into. Some potential hook points are:
    
    NameNode:
      - new datanode registered
      - datanode has died
      - exception caught
      - etc?
    
    DataNode:
      - startup
      - initial registration with NN complete (this is important for HADOOP-4707 to sync up datanode.dnRegistration.name with the NN-side registration)
      - namenode reconnect
      - some block transfer hooks?
      - exception caught
    
    I see two potential routes for implementation:
    
    1) We make an enum for the types of hookpoints and have a general function in the ServicePlugin interface. Something like:
    
    enum HookPoint {
      DN_STARTUP,
      DN_RECEIVED_NEW_BLOCK,
      DN_CAUGHT_EXCEPTION,
     ...
    }
    
    void runHook(HookPoint hp, Object value);
    
    2) We make classes specific to each "pluggable" as was originally suggested in HADOOP-5257. Something like:
    
    class DataNodePlugin {
      void datanodeStarted() {}
      void receivedNewBlock(block info, etc) {}
      void caughtException(Exception e) {}
      ...
    }
    
    I personally prefer option (2) since we can ensure plugin API compatibility at compile-time, and we avoid an ugly switch statement in a runHook() function.
    
    Interested to hear what people's thoughts are here.
    
    HADOOP-5640 puts this in the new test dir. It needs to be in the old one.
    
    Reason: Improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit e9b04609d88ed5d1af442ee950aa5dcd6646e830
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:22:08 2010 -0800

    MAPREDUCE-1017. Compression and output splitting for Sqoop
    
    Description: Sqoop "direct mode" writing will generate a single large text file in HDFS. It is important to be able to compress this data before it reaches HDFS. Due to the difficulty in splitting compressed files in HDFS for use by MapReduce jobs, data should also be split at compression time.
    Reason: New feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 8c9b473e1af036a3e2cc9036a945a4567277db8a
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:21:14 2010 -0800

    HADOOP-6312. Configuration sends too much data to log4j
    
    Description: Configuration objects send a DEBUG-level log message every time they're instantiated, which includes a full stack trace. This is more appropriate for TRACE-level logging, as it renders other debug logs very hard to read.
    Reason: Logging improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 698fe169f31e54111d30e4420cd1c1c5eaeecdec
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:21:03 2010 -0800

    HDFS-686. NullPointerException is thrown while merging edit log and image
    
    Description: Our secondary name node fails to start with a NullPointerException:<br/>
    ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.lang.NullPointerException<br/>
            at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetTimes(FSDirectory.java:1232)<br/>
            at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetTimes(FSDirectory.java:1221)<br/>
            at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:776)<br/>
            at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:992)<br/>
            at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:590)<br/>
            at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:473)<br/>
            at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:350)<br/>
            at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:314)<br/>
            at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:225)<br/>
            at java.lang.Thread.run(Thread.java:619)
    
    <p>This was caused by setting access time on a non-existent file.</p>
    Reason: bugfix
    Author: Hairong Kuang
    Ref: UNKNOWN

commit b2cc8e02f37a1604bb076acefff0ebf016c249d5
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:20:40 2010 -0800

    MAPREDUCE-112. Reduce Input Records and Reduce Output Records counters are not being set when using the new Mapreduce reducer API
    
    Description: After running the examples/wordcount (which uses the new API), the reduce input and output record counters always show 0. This is because these counters are not updated in the new API.
    This adds reduce input and output record counters to the new API.
    Reason: Bugfix
    Author: Jothi Padmanabhan
    Ref: UNKNOWN

commit 3e62477434542dc3de89fd43fd9b19abaf76f0de
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:20:00 2010 -0800

    MAPREDUCE-768. Configuration information should generate dump in a standard format.
    
    Description: We need to generate the configuration dump in a standard format.
    This adds the 'hadoop jobtracker -dumpConfiguration' command.
    This is modified from the original patch in that it does not dump QueueManager configuration,
    because we have not backported HADOOP-5396.
    
    Reason: New feature
    Author: V.V.Chaitanya Krishna
    Ref: UNKNOWN

commit 4d9333b00772455a1ca7a365fa5b5b2f6872abd7
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:19:46 2010 -0800

    HADOOP-6184. Provide a configuration dump in json format.
    
    Description: Configuration dump in json format.
    Reason: New feature
    Author: V.V.Chaitanya Krishna
    Ref: UNKNOWN

commit 96244c3e7d6735f450b618fdcbdbbf9a81436ba3
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:19:27 2010 -0800

    CLOUDERA-BUILD. Duplicated effort. FULL_VERSION already set in package.mk
    
    Description: Revert "Need to pass in FULL_VERSION"
    Author: Chad Metcalf

commit 604d3a71334b9340a6219e3b88bf563b79f5d083
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:19:11 2010 -0800

    CLOUDERA-BUILD. Copy the sqoop manpage to the expected version number
    
    Author: Chad Metcalf

commit 6d428f70591a92a90dca5256968c62a510659240
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:18:58 2010 -0800

    CLOUDERA-BUILD. Bump jdiff stable to 0.20.1
    
    Author: Chad Metcalf

commit 46ffc9aa9260a96bdf67fbaee9a2acd76cfcf675
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:18:44 2010 -0800

    CLOUDERA-BUILD. Need to pass in FULL_VERSION
    
    Author: Chad Metcalf

commit aa7ae9d9826866f94ecfe5629d087ef68e4b5c54
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:18:29 2010 -0800

    MAPREDUCE-999. Improve Sqoop test speed and refactor tests
    
    Description: Sqoop's tests take a long time to run, but this can be improved (by a factor of 2 or more) by taking advantage of <tt>jobclient.completion.poll.interval</tt>.
    Reason: Testing performance improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 084c390ed5fcb03c456121c8497759b40a74f809
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:18:13 2010 -0800

    MAPREDUCE-1089. Fair Scheduler preemption triggers NPE when tasks are scheduled but not running
    
    Description: We see exceptions like this when preemption runs when a task has been scheduled on a TT but has not yet started running.
    
    <p>2009-10-09 14:30:53,989 INFO org.apache.hadoop.mapred.FairScheduler: Should preempt 2 MAP tasks for job_200910091420_0006: tasksDueToMinShare = 2, tasksDueToFairShare = 0<br/>
    2009-10-09 14:30:54,036 ERROR org.apache.hadoop.mapred.FairScheduler: Exception in fair scheduler UpdateThread<br/>
    java.lang.NullPointerException<br/>
            at org.apache.hadoop.mapred.FairScheduler$2.compare(FairScheduler.java:1015)<br/>
            at org.apache.hadoop.mapred.FairScheduler$2.compare(FairScheduler.java:1013)<br/>
            at java.util.Arrays.mergeSort(Arrays.java:1270)<br/>
            at java.util.Arrays.sort(Arrays.java:1210)<br/>
            at java.util.Collections.sort(Collections.java:159)<br/>
            at org.apache.hadoop.mapred.FairScheduler.preemptTasks(FairScheduler.java:1013)<br/>
            at org.apache.hadoop.mapred.FairScheduler.preemptTasksIfNecessary(FairScheduler.java:911)<br/>
            at org.apache.hadoop.mapred.FairScheduler$UpdateThread.run(FairScheduler.java:286)</p>
    Reason: Bugfix
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 34ca2a5547398f9435a5d3d22603d0f7da420226
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:17:48 2010 -0800

    MAPREDUCE-551. Add preemption to the fair scheduler
    
    Description: Task preemption is necessary in a multi-user Hadoop cluster for two reasons: users might submit long-running tasks by mistake (e.g. an infinite loop in a map program), or tasks may be long due to having to process large amounts of data. The Fair Scheduler (<a href="http://issues.apache.org/jira/browse/HADOOP-3746" title="A fair sharing job scheduler"><del>HADOOP-3746</del></a>) has a concept of guaranteed capacity for certain queues, as well as a goal of providing good performance for interactive jobs on average through fair sharing. Therefore, it will support preempting under two conditions:<br/>
    1) A job isn't getting its <em>guaranteed</em> share of the cluster for at least T1 seconds.<br/>
    2) A job is getting significantly less than its <em>fair</em> share for T2 seconds (e.g. less than half its share).
    
    <p>T1 will be chosen smaller than T2 (and will be configurable per queue) to meet guarantees quickly. T2 is meant as a last resort in case non-critical jobs in queues with no guaranteed capacity are being starved.</p>
    
    <p>When deciding which tasks to kill to make room for the job, we will use the following heuristics:</p>
    <ul class="alternate" type="square">
    	<li>Look for tasks to kill only in jobs that have more than their fair share, ordering these by deficit (most overscheduled jobs first).</li>
    	<li>For maps: kill tasks that have run for the least amount of time (limiting wasted time).</li>
    	<li>For reduces: similar to maps, but give extra preference for reduces in the copy phase where there is not much map output per task (at Facebook, we have observed this to be the main time we need preemption - when a job has a long map phase and its reducers are mostly sitting idle and filling up slots).</li>
    </ul>
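
    The two preemption conditions and the map-task heuristic above can be sketched as follows. This is an illustration only: PreemptionSketch, its method names, and the millisecond timestamps are assumptions, and the real scheduler tracks these values per job and per queue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PreemptionSketch {
  // Condition 1: the job has been below its guaranteed share for at least t1Millis.
  // Condition 2: the job has been below half of its fair share for at least t2Millis.
  static boolean shouldPreempt(long now,
                               long lastTimeAtMinShare, long t1Millis,
                               long lastTimeAtHalfFairShare, long t2Millis) {
    boolean minShareStarved = now - lastTimeAtMinShare >= t1Millis;
    boolean fairShareStarved = now - lastTimeAtHalfFairShare >= t2Millis;
    return minShareStarved || fairShareStarved;
  }

  // Heuristic for maps: kill the tasks that have run for the least time first,
  // i.e. order candidate tasks by start time, most recent first.
  static List<Long> killOrder(List<Long> taskStartTimes) {
    List<Long> order = new ArrayList<Long>(taskStartTimes);
    Collections.sort(order, Collections.<Long>reverseOrder());
    return order;
  }
}
```

    Choosing T1 smaller than T2 means the first condition fires sooner, which matches the goal of meeting guarantees quickly.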
    
    This fixes an error in the previous backport where the
    EagerTaskInitializationListener wasn't properly passed the
    TaskTrackerManager before starting.
    
    Reason: New feature
    Author: Matei Zaharia
    Ref: UNKNOWN

commit a3e29eff0b9337a1007ec1b90ccb832dca5c1d20
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:17:33 2010 -0800

    CLOUDERA-BUILD. Fix hadoop wrapper to properly pass through multiword quoted arguments
    
    Author: Todd Lipcon

commit 975647b6c3a6644cabbd48bf14e074a0efda2cb9
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:17:15 2010 -0800

    CLOUDERA-BUILD. Sqoop documentation is now part of the generated tarball. Updated the install script to reflect that change.
    
    Author: Matt Massie

commit 19c038a6af07e3999e83a2178d2328535e00dedb
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:16:55 2010 -0800

    CLOUDERA-BUILD. Generate the sqoop documentation and ensure that it's in the release tarball
    
    Author: Matt Massie

commit 6957626991875302f33bb73630f4f376412f9711
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:16:43 2010 -0800

    CLOUDERA-BUILD. More changes to get debs building correctly
    
    Author: Chad Metcalf

commit 67d1c732cea0eebf59de512301ae8f2a1cb2f349
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:16:30 2010 -0800

    CLOUDERA-BUILD. Reformatted Sqoop manpage asciidoc for CDH build process
    
    Author: Aaron Kimball

commit af158d6aa7ffe72d931bc4763ace7d4a299d077b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:16:14 2010 -0800

    CLOUDERA-BUILD. Only rerun libtoolize if version 2.2 is installed
    
    Author: Todd Lipcon

commit 586992381042e1b4ec8c9ece069561ad2e4dfcc0
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:15:42 2010 -0800

    HADOOP-6279. Add JVM memory usage to JvmMetrics
    
    Description: The JvmMetrics currently publish memory usage from the MemoryMXBean. This is useful, but doesn't include the total heap size (e.g. as displayed in the JT Web UI).
    
    <p>It would be nice to expose Runtime.getRuntime().maxMemory() as part of JvmMetrics.</p>
    
    <p>It seems that Runtime.getRuntime().totalMemory() (used by the JT for "memory used") is the same as the 'memHeapCommittedM' which already exists.</p>
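
    A small sketch of the values discussed above, read directly from java.lang.Runtime. The class name HeapSnapshot and the array layout are assumptions for illustration; the metric names actually published by JvmMetrics are not shown here.

```java
class HeapSnapshot {
  static final double MB = 1024.0 * 1024.0;

  // Returns {maxHeapMB, committedHeapMB, usedHeapMB}.
  static double[] sample() {
    Runtime rt = Runtime.getRuntime();
    long total = rt.totalMemory();          // heap currently committed by the JVM
    long free = rt.freeMemory();            // unused part of the committed heap
    double max = rt.maxMemory() / MB;       // upper bound the JVM will attempt to use
    double committed = total / MB;
    double used = (total - free) / MB;
    return new double[] {max, committed, used};
  }
}
```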
    Reason: Metrics improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 7c168a8a2613d93e19508a91e7c4db3b3cfb503b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:15:26 2010 -0800

    HADOOP-6269. Missing synchronization for defaultResources in Configuration.addResource
    
    Description: Configuration.defaultResources is a simple ArrayList. In two places in Configuration it is accessed without appropriate synchronization, which we've seen to occasionally result in ConcurrentModificationExceptions.
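
    The failure mode can be reproduced in a single thread, since an ArrayList iterator fails as soon as the list is modified during iteration. The sketch below uses hypothetical names (DefaultResourcesDemo and the resource file names) and swaps in CopyOnWriteArrayList as one possible remedy; the actual patch may instead add synchronization around the accesses.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class DefaultResourcesDemo {
  // Iterates the list while appending to it, which is effectively what an
  // unsynchronized reader and writer can do. Returns true if iteration fails.
  static boolean failsOnAddDuringIteration(List<String> resources) {
    try {
      for (String r : resources) {
        if (r.equals("core-default.xml")) {
          resources.add("extra-default.xml");
        }
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  // Fills a list with two example resource names.
  static List<String> seed(List<String> list) {
    list.add("core-default.xml");
    list.add("core-site.xml");
    return list;
  }
}
```

    Passing seed(new ArrayList<String>()) triggers the exception, while seed(new CopyOnWriteArrayList<String>()) does not, because its iterator works on a snapshot.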
    Reason: bugfix (race condition)
    Author: Sreekanth Ramakrishnan
    Ref: UNKNOWN

commit 8bf845170decdcb12254bc1dc98ccbf0fda7d233
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:15:01 2010 -0800

    CLOUDERA-BUILD. Recreate c++ configure files during build if we have the right build dependencies
    
    Author: Todd Lipcon

commit e7e9812fa7a6a256652f2f6bbb269334f883c53b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:14:43 2010 -0800

    CLOUDERA-BUILD. Package sqoop docs w/o requiring asciidoc
    
    Author: Chad Metcalf
    Ref: UNKNOWN

commit 7171eabfad501d635b1da9e0287f50e025b4a83f
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:13:39 2010 -0800

    CLOUDERA-BUILD. Revert "Package sqoop docs."
    
    Description: This reverts packaging of sqoop documentation in preparation
    for including MAPREDUCE-906 properly after it has been committed
    to Apache.
    Author: Chad Metcalf
    Ref: UNKNOWN

commit 4bd437c9d70f2c0d68047e0376a7af21cc4a70e0
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:13:17 2010 -0800

    HADOOP-5891. If dfs.http.address is default, SecondaryNameNode can't find NameNode
    
    Description: As detailed in this blog post:<br/>
    <span class="nobr"><a href="http://www.cloudera.com/blog/2009/02/10/multi-host-secondarynamenode-configuration/">http://www.cloudera.com/blog/2009/02/10/multi-host-secondarynamenode-configuration/</a></span><br/>
    if dfs.http.address is not configured, and the 2NN is a different machine from the NN, the 2NN fails to connect.
    
    <p>In SecondaryNameNode.getInfoServer, the 2NN should notice a "0.0.0.0" dfs.http.address and, in that case, pull the hostname out of fs.default.name. This would fix the default configuration to work properly for most users.</p>
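
    The proposed fallback can be sketched with java.net.URI. InfoServerResolver and resolve are hypothetical names, the host names are examples, and the sketch assumes the HTTP address is always given as host:port.

```java
import java.net.URI;

class InfoServerResolver {
  // If the configured HTTP address uses the wildcard host 0.0.0.0, substitute
  // the host from the default filesystem URI and keep the HTTP port.
  static String resolve(String httpAddress, String fsDefaultName) {
    int colon = httpAddress.lastIndexOf(':');
    String host = httpAddress.substring(0, colon);
    String port = httpAddress.substring(colon + 1);
    if ("0.0.0.0".equals(host)) {
      host = URI.create(fsDefaultName).getHost();
    }
    return host + ":" + port;
  }
}
```

    For example, resolve("0.0.0.0:50070", "hdfs://nn.example.com:8020") yields "nn.example.com:50070", while an explicitly configured host is returned unchanged.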
    Reason: Configuration improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 74e10e4a137b2aa60ab39186115350b5e82464fc
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:11:50 2010 -0800

    HDFS-127. DFSClient block read failures cause open DFSInputStream to become unusable
    
    Description: We are using some Lucene indexes directly from HDFS and for quite long time we were using Hadoop version 0.15.3.
    
    <p>When we tried to upgrade to Hadoop 0.19, index searches started to fail with exceptions like:<br/>
    2008-11-13 16:50:20,314 WARN [Listener-4] [] DFSClient : DFS Read: java.io.IOException: Could not obtain block: blk_5604690829708125511_15489 file=/usr/collarity/data/urls-new/part-00000/20081110-163426/_0.tis<br/>
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)<br/>
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)<br/>
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)<br/>
    at java.io.DataInputStream.read(DataInputStream.java:132)<br/>
    at org.apache.nutch.indexer.FsDirectory$DfsIndexInput.readInternal(FsDirectory.java:174)<br/>
    at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:152)<br/>
    at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)<br/>
    at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:76)<br/>
    at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:63)<br/>
    at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:131)<br/>
    at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:162)<br/>
    at org.apache.lucene.index.TermInfosReader.scanEnum(TermInfosReader.java:223)<br/>
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:217)<br/>
    at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:54) <br/>
    ...</p>
    
    <p>The investigation showed that the root cause was that we had exceeded the number of xcievers in the data nodes; that was fixed by raising the configured limit to 2k.<br/>
    However, one thing that bothered me was that even after the datanodes recovered from the overload and most of the client servers had been shut down, we still observed errors in the logs of the running servers.<br/>
    Further investigation showed that the fix for <a href="http://issues.apache.org/jira/browse/HADOOP-1911" title="infinite loop in dfs -cat command."><del>HADOOP-1911</del></a> introduced another problem: the DFSInputStream instance might become unusable once the number of failures over the lifetime of the instance exceeds the configured threshold.</p>
    
    <p>The fix for this specific issue seems to be trivial: just reset the failure counter before reading the next block (patch will be attached shortly).</p>
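
    The idea can be sketched with a per-stream failure counter that is cleared whenever the stream moves on to a new block. BlockStreamSketch and its methods are hypothetical; this is not the DFSInputStream code.

```java
class BlockStreamSketch {
  private final int maxFailures;
  private int failures = 0;

  BlockStreamSketch(int maxFailures) {
    this.maxFailures = maxFailures;
  }

  // Records one failed read attempt. Returns false once the stream gives up,
  // i.e. when the failure count reaches the configured threshold.
  boolean recordFailure() {
    failures++;
    return failures < maxFailures;
  }

  // Called before reading the next block: failures on earlier blocks no
  // longer count against the stream.
  void startNextBlock() {
    failures = 0;
  }
}
```

    Without the reset in startNextBlock(), transient failures spread over many blocks would eventually exhaust the budget and leave a long-lived stream unusable.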
    
    <p>This also seems to be related to HADOOP-3185, but I'm not sure I really understand the necessity of keeping track of failed block accesses in the DFS client.</p>
    
        HADOOP-4681: Also referenced
    
        This as-yet-uncommitted patch is recommended by HBase people.
        Applied patch "4681.patch" attached to the JIRA on 2008-11-18.
    
    Reason: Bugfix
    Author: Igor Bolotin
    Ref: UNKNOWN

commit ca547d89042fff3a38c0c93b6e0ece78e74ae064
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:11:10 2010 -0800

    HADOOP-4655. FileSystem.CACHE should be ref-counted
    
    Description: FileSystem.CACHE is not ref-counted, and could lead to resource leakage.
    Adds new method FileSystem.newInstance() that always returns a newly allocated
    FileSystem object.
    Reason: Bugfix
    Author: dhruba borthakur
    Ref: UNKNOWN

commit 15660507606b32c3c6c2878f8ed69fe106119bc9
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:10:51 2010 -0800

    MAPREDUCE-967. TaskTracker does not need to fully unjar job jars
    
    Description: In practice we have seen some users submitting job jars that consist of 10,000+ classes. Unpacking these jars into mapred.local.dir and then cleaning up after them has a significant cost (both in wall clock time and in unnecessarily heavy disk utilization). This cost can be easily avoided.
    Reason: Performance improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 648e30e074a16de837fb4c604a198bc780c2e6c5
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:10:34 2010 -0800

    MAPREDUCE-968. NPE in distcp encountered when placing _logs directory on S3FileSystem
    
    Description: If distcp is pointed to an empty S3 bucket as the destination for an s3:// filesystem transfer, it will fail with the following exception:
    
    <p>Copy failed: java.lang.NullPointerException<br/>
    at org.apache.hadoop.fs.s3.S3FileSystem.makeAbsolute(S3FileSystem.java:121)<br/>
    at org.apache.hadoop.fs.s3.S3FileSystem.getFileStatus(S3FileSystem.java:332)<br/>
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:633)<br/>
    at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1005)<br/>
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:650)<br/>
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:857)<br/>
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)<br/>
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)<br/>
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:884) </p>
    Reason: Bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit a61718b87c36dbeddcc6f9917438f81ebdda0214
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:10:22 2010 -0800

    HADOOP-6133. ReflectionUtils performance regression
    
    Description: <a href="http://issues.apache.org/jira/browse/HADOOP-4187" title="Create a MapReduce-specific ReflectionUtils that handles JobConf and JobConfigurable"><del>HADOOP-4187</del></a> introduced extra calls to Class.forName in ReflectionUtils.setConf. This caused a fairly large performance regression. Attached is a microbenchmark that shows the following timings (ms) for 100M constructions of new instances:
    
    <p>Explicit construction (new Test): around ~1.6sec<br/>
    Using Test.class.newInstance: around ~2.6sec<br/>
    ReflectionUtils on 0.18.3: ~8.0sec<br/>
    ReflectionUtils on 0.20.0: ~200sec</p>
    
    <p>This illustrates the ~80x slowdown caused by <a href="http://issues.apache.org/jira/browse/HADOOP-4187" title="Create a MapReduce-specific ReflectionUtils that handles JobConf and JobConfigurable"><del>HADOOP-4187</del></a>.</p>
    Reason: Performance improvement
    Author: Todd Lipcon
    Ref: UNKNOWN
    
    commit 5e299f831420ed52569eefc5ba815359a0ebc64e
    Author: Chad Metcalf <chad@cloudera.com>
    Date:   Tue Sep 15 22:21:42 2009 -0700
    
        HADOOP-6133: ReflectionUtils performance regression

commit b6f790774d34ed34bb7c649142dc770c25121ac3
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:10:13 2010 -0800

    HADOOP-5981. HADOOP-2838 doesnt work as expected
    
    Description: The substitution feature, i.e. X=$X:/tmp, doesn't work as expected.
    
    <p>This issue completes the feature mentioned in <a href="http://issues.apache.org/jira/browse/HADOOP-2838" title="Add HADOOP_LIBRARY_PATH config setting so Hadoop will include external directories for jni"><del>HADOOP-2838</del></a>. <a href="http://issues.apache.org/jira/browse/HADOOP-2838" title="Add HADOOP_LIBRARY_PATH config setting so Hadoop will include external directories for jni"><del>HADOOP-2838</del></a> provided a way to set environment variables in the child process. This issue provides a way to inherit the TaskTracker's environment variables and append to or reset them. So now<br/>
    X=$X:y will inherit X (if set) and append y to it.</p>
    Reason: Bugfix
    Author: Amar Kamat
    Ref: UNKNOWN
    
    commit eb635e4de3a8b2b5bd9f34225770f24be42dcd83
    Author: Chad Metcalf <chad@cloudera.com>
    Date:   Tue Sep 15 22:29:50 2009 -0700
    
        HADOOP-5981: HADOOP-2838 doesnt work as expected

commit 5d4e93d8e0df3c445f56c5eb51965eef92bebd78
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:09:46 2010 -0800

    HADOOP-2838. Add HADOOP_LIBRARY_PATH config setting so Hadoop will include external directories for jni
    
    Description: Currently there is no way to configure Hadoop to use external JNI directories. I propose we add a new variable like HADOOP_CLASS_PATH that is added to the JAVA_LIBRARY_PATH before the process is run.
    
    <p>Users can now set environment variables using mapred.child.env. The following forms are supported:<br/>
    X=Y : set X to Y<br/>
    X=$X:Y : Append Y to X (which should be taken from the tasktracker)</p>
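
    A sketch of how the two forms above could be expanded. It assumes a comma-separated list of NAME=value entries and treats an unset parent variable as empty; ChildEnvSketch and build are hypothetical names, not the TaskTracker implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChildEnvSketch {
  private static final Pattern VAR = Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

  // Parses entries such as "A=1,B=$B:/tmp" and expands $NAME references
  // using the parent (tasktracker) environment.
  static Map<String, String> build(String spec, Map<String, String> parentEnv) {
    Map<String, String> child = new LinkedHashMap<String, String>();
    for (String entry : spec.split(",")) {
      int eq = entry.indexOf('=');
      String name = entry.substring(0, eq).trim();
      Matcher m = VAR.matcher(entry.substring(eq + 1));
      StringBuffer value = new StringBuffer();
      while (m.find()) {
        String inherited = parentEnv.get(m.group(1));
        // Assumption of this sketch: an unset parent variable expands to "".
        m.appendReplacement(value,
            Matcher.quoteReplacement(inherited == null ? "" : inherited));
      }
      m.appendTail(value);
      child.put(name, value.toString());
    }
    return child;
  }
}
```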
    Reason: Improves job launch flexibility
    Author: Amar Kamat
    Ref: UNKNOWN
    
    commit 9b3fc32fa793b338dc700a7f6c437402f80d6b7f
    Author: Chad Metcalf <chad@cloudera.com>
    Date:   Tue Sep 15 22:09:57 2009 -0700
    
        HADOOP-2838: Add HADOOP_LIBRARY_PATH config setting so Hadoop will include external directories for jni

commit 877429c3f94a1e937fbe29b4cbe8da573831d802
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:09:31 2010 -0800

    MAPREDUCE-814. Move completed Job history files to HDFS
    
    Description: Currently completed job history files remain on the jobtracker node. Having the files available on HDFS will enable clients to access these files more easily.
    Reason: New feature
    Author: Sharad Agarwal
    Ref: UNKNOWN
    
    commit c0575c0908fee4ec01f5bc0abbd7f4b2254dd38e
    Author: Chad Metcalf <chad@cloudera.com>
    Date:   Tue Sep 15 18:15:17 2009 -0700
    
        MAPREDUCE-814: Move completed Job history files to HDFS

commit a8bf06eac5312ede0982118801e4495285a442fe
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:08:12 2010 -0800

    MAPREDUCE-693. Conf files not moved to "done" subdirectory after JT restart
    
    Description: After <a href="http://issues.apache.org/jira/browse/MAPREDUCE-516" title="Fix the 'cluster drain' problem in the Capacity Scheduler wrt High RAM Jobs"><del>MAPREDUCE-516</del></a>, when a job is submitted and the JT is restarted (before job files have been written) and the job is killed after recovery, the conf files fail to be moved to the "done" subdirectory.<br/>
    The exact scenario to reproduce this issue is:
    <ul>
    	<li>Submit a job</li>
    	<li>Restart JT before anything is written to the job files</li>
    	<li>Kill the job</li>
    	<li>The old conf files remain in the history folder and fail to be moved to "done" subdirectory</li>
    </ul>
    
    Reason: bugfix
    Author: Amar Kamat
    Ref: UNKNOWN

commit cc22e9f92db6470d244fb17f57601b93bab6db80
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:07:55 2010 -0800

    MAPREDUCE-683. TestJobTrackerRestart fails with Map task completion events ordering mismatch
    
    Description: <tt>TestJobTrackerRestart</tt> fails consistently with Map task completion events ordering mismatch error.
    Reason: bugfix
    Author: Amar Kamat
    Ref: UNKNOWN

commit 57a67dff5d15e3833c7968254df076e440de2765
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:07:39 2010 -0800

    MAPREDUCE-416. Move the completed jobs' history files to a DONE subdirectory inside the configured history directory
    
    Description: Whenever a job completes, the history file can be moved to a directory called DONE. That would make the management of job history files easier (for example, administrators can move the history files from that directory to some other place, delete them, archive them, etc.).
    Reason: System management improvement
    Author: Amar Kamat
    Ref: UNKNOWN

commit 99dfdb9a98e1ebd643f47877be3541962c32dcd0
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:07:18 2010 -0800

    HADOOP-5733. Add map/reduce slot capacity and lost map/reduce slot capacity to JobTracker metrics
    
    Description: It would be nice to have the actual map/reduce slot capacity and the lost map/reduce slot capacity (# of blacklisted nodes * map-slot-per-node or reduce-slot-per-node). This information can be used to calculate a JT view of slot utilization.
    Reason: Metrics improvement
    Author: Sreekanth Ramakrishnan
    Ref: UNKNOWN

commit 955fe9433b13f21079f92e4035393b683486ad07
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:05:59 2010 -0800

    HADOOP-5738. Split waiting tasks field in JobTracker metrics to individual tasks
    
    Description: Currently, JobTracker metrics report waiting tasks as a single field. It would be better to split waiting tasks into maps and reduces.
    Reason: User experience improvement
    Author: Sreekanth Ramakrishnan
    Ref: UNKNOWN

commit 3b8f77cd452c1098c6af5907b787bf9167df806b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:05:48 2010 -0800

    HADOOP-5442. The job history display needs to be paged
    
    Description: Currently the list of job history will try to render the entire list of jobs that have run. That doesn't scale up as more and more jobs run on a job tracker.
    Reason: Scalability improvement
    Author: Amar Kamat
    Ref: UNKNOWN

commit dfac0482267aaf0fabac97c163e0015306ec5b16
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:05:16 2010 -0800

    HADOOP-4842. Streaming combiner should allow command, not just JavaClass
    
    Description: Streaming jobs are way slower than Java jobs for many reasons, but certainly stopping the shell-only programmer from using the combiner feature won't help. Right now, the streaming usage says:
    
    <blockquote>
    <p>  -mapper   &lt;cmd|JavaClassName&gt;      The streaming command to run<br/>
      -combiner &lt;JavaClassName&gt; Combiner has to be a Java class<br/>
      -reducer  &lt;cmd|JavaClassName&gt;      The streaming command to run</p></blockquote>
    Reason: Usability improvement
    Author: Amareshwari Sriramadasu
    Ref: UNKNOWN

commit 33e4f0a87effa466914e292488c47977245edc96
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:04:06 2010 -0800

    MAPREDUCE-987. Exposing MiniDFS and MiniMR clusters as a single process command-line
    
    Description: It's hard to test non-Java programs that rely on significant mapreduce functionality.  The patch I'm proposing shortly will let you just type "bin/hadoop jar hadoop-hdfs-hdfswithmr-test.jar minicluster" to start a cluster (internally, it's using Mini{MR,HDFS}Cluster) with a specified number of daemons, etc.  A test that checks how some external process interacts with Hadoop might start minicluster as a subprocess, run through its thing, and then simply kill the java subprocess.
    
    <p>I've been using just such a system for a couple of weeks, and I like it.  It's significantly easier than developing a lot of scripts to start a pseudo-distributed cluster, and then clean up after it.  I figure others might find it useful as well.</p>
    
    <p>I'm at a bit of a loss as to where to put it in 0.21.  hdfs-with-mr tests have all the required libraries, so I've put it there.  I could conceivably split this into "minimr" and "minihdfs", but it's specifically the fact that they're configured to talk to each other that I like about having them together.  And one JVM is better than two for my test programs.</p>
    Reason: Testing feature
    Author: Philip Zeyliger
    Ref: UNKNOWN

commit 39ff7e5ee285df97c765a73271066df718be0e30
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 17:03:23 2010 -0800

    HADOOP-6267. build-contrib.xml unnecessarily enforces that contrib projects be located in contrib/ dir
    
    Description: build-contrib.xml currently sets hadoop.root to ${basedir}/../../../. This path is relative to the contrib project which is assumed to be inside src/contrib/. We occasionally work on contrib projects in other repositories until they're ready to contribute. We can use the &lt;dirname&gt; ant task to do this more correctly.
    Reason: Build system improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 139bea6660193cc73852832e03fe570437343e96
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 15:02:55 2010 -0800

    HDFS-528. Add ability for safemode to wait for a minimum number of live datanodes
    
    Description: When starting up a fresh cluster programmatically, users often want to wait until DFS is "writable" before continuing in a script. "dfsadmin -safemode wait" doesn't quite work for this on a completely fresh cluster, since when there are 0 blocks on the system, 100% of them are accounted for before any DNs have reported.
    
    <p>This JIRA is to add a command which waits until a certain number of DNs have reported as alive to the NN.</p>
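
    The problem and the added condition can be sketched as below. SafeModeSketch, its method names, and the parameters are illustrative assumptions and do not reflect the NameNode's actual fields or configuration keys.

```java
class SafeModeSketch {
  // Fraction of blocks reported; a cluster with zero blocks counts as fully reported.
  static double reportedRatio(long reported, long total) {
    return total == 0 ? 1.0 : (double) reported / total;
  }

  // Block-only check: passes immediately on a fresh cluster with no blocks.
  static boolean blocksReady(long reported, long total, double threshold) {
    return reportedRatio(reported, total) >= threshold;
  }

  // Adds the extra condition described above: enough datanodes have reported as alive.
  static boolean ready(long reported, long total, double threshold,
                       int liveDatanodes, int minDatanodes) {
    return blocksReady(reported, total, threshold) && liveDatanodes >= minDatanodes;
  }
}
```

    On a fresh cluster blocksReady(0, 0, 0.999) is already true, whereas ready(0, 0, 0.999, 0, 1) stays false until at least one datanode has reported.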
    Reason: New feature
    Author: Todd Lipcon
    Ref: UNKNOWN

commit b301746d45bde2759535549f87c6485f4ee577b2
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 15:02:38 2010 -0800

    HADOOP-4936. Improvements to TestSafeMode
    
    Description: TestSafeMode
    <ul class="alternate" type="square">
    	<li>needs a detailed description of the test case</li>
    	<li>should not call the name-node directly; it should call <tt>DistributedFileSystem</tt> methods instead.</li>
    </ul>
    
    Reason: Test coverage improvement
    Author: Konstantin Shvachko
    Ref: UNKNOWN

commit f04a321596a513e71354f2a6829b44e474077507
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 15:02:22 2010 -0800

    HADOOP-5650. Namenode log that indicates why it is not leaving safemode may be confusing
    
    Description: A namenode with a large number of datablocks is set up with dfs.safemode.threshold.pct set to 1.0. With a small number of unreported blocks, the namenode prints the following as the reason for not leaving safe mode:<br/>
    <tt>The ratio of reported blocks 1.0000 has not reached the threshold 1.0000</tt>
    
    <p>With a large number of blocks, precision used for printing the log may not indicate the difference between the actual ratio of safe blocks to total blocks and the configured threshold. Printing number of blocks instead of ratio will improve the clarity.</p>
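
    The rounding effect is easy to reproduce with String.format. In this sketch the ratio message copies the log line quoted above, while the count-based wording, the class name SafeModeMessageSketch, and the method names are assumptions rather than the text used by the patch.

```java
import java.util.Locale;

class SafeModeMessageSketch {
  // Ratio-based message: four decimal places hide small differences.
  static String ratioMessage(long safe, long total, double threshold) {
    return String.format(Locale.ROOT,
        "The ratio of reported blocks %.4f has not reached the threshold %.4f",
        (double) safe / total, threshold);
  }

  // Count-based message (hypothetical wording): the gap stays visible.
  static String countMessage(long safe, long total, double threshold) {
    long needed = (long) Math.ceil(total * threshold);
    return String.format(Locale.ROOT,
        "Reported blocks %d has not reached the required %d of %d total blocks",
        safe, needed, total);
  }
}
```

    With 9,999,999 of 10,000,000 blocks reported and a threshold of 1.0, the ratio message shows 1.0000 on both sides, while the count message shows 9999999 against 10000000.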
    Reason: User experience improvement
    Author: Suresh Srinivas
    Ref: UNKNOWN

commit 13e35e654c51a5b1cfe809ef1e2c4d2ca46ed612
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 15:01:52 2010 -0800

    HADOOP-4675. Current Ganglia metrics implementation is incompatible with Ganglia 3.1
    
    Description: Ganglia changed its wire protocol in the 3.1.x series; the current implementation only works for 3.0.x.
    
    Patched using
    https://issues.apache.org/jira/secure/attachment/12407207/HADOOP-4675-v7.patch
    
    Reason: Compatibility improvement
    Author: Brian Bockelman
    Ref: UNKNOWN
    
    commit dcf76896b1c8a7b891995b1546eef6ea3018e7ca
    Author: Philip Zeyliger <philip@cloudera.com>
    Date:   Tue Jul 28 15:28:18 2009 -0700
    
        HADOOP-4675. Current Ganglia metrics implementation is incompatible with Ganglia 3.1
    
        Patched using
        https://issues.apache.org/jira/secure/attachment/12407207/HADOOP-4675-v7.patch

commit 4305750d026b895b3afbd0d4a4ee4b3b42596016
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 15:01:29 2010 -0800

    HADOOP-6269. Missing synchronization for defaultResources in Configuration.addResource
    
    Description: Configuration.defaultResources is a simple ArrayList. In two places in Configuration it is accessed without appropriate synchronization, which we've seen to occasionally result in ConcurrentModificationExceptions.
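    
    One possible way to make such a list safe, sketched under the assumption that
    reads far outnumber writes. This is an illustration only: the class below is
    hypothetical, uses an instance field for simplicity, and does not claim to
    reproduce the actual change to Configuration.
    
    ```java
    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;

    final class DefaultResourcesSketch {
        // Iterators over a CopyOnWriteArrayList work on a snapshot, so a concurrent
        // add never causes ConcurrentModificationException.
        private final List<String> defaultResources = new CopyOnWriteArrayList<>();

        // synchronized so the contains-then-add check is atomic.
        synchronized void addDefaultResource(String name) {
            if (!defaultResources.contains(name)) {
                defaultResources.add(name);
            }
        }

        List<String> resources() {
            return defaultResources;
        }

        public static void main(String[] args) {
            DefaultResourcesSketch conf = new DefaultResourcesSketch();
            conf.addDefaultResource("core-default.xml");
            conf.addDefaultResource("core-site.xml");
            // Adding while iterating would fail with a plain ArrayList.
            for (String r : conf.resources()) {
                conf.addDefaultResource("extra-" + r);
            }
            System.out.println(conf.resources());
        }
    }
    ```
    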
    Reason: Bugfix (race condition)
    Author: Sreekanth Ramakrishnan
    Ref: UNKNOWN

commit 90f9c40df18fe464383de52e3d3952638a393e34
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 15:01:08 2010 -0800

    CLOUDERA-BUILD. Make some JT methods and classes public for use from within contrib plugins
    
    Author: Henry Robinson

commit f8e0599a434e1ce94158384f575e912e9f988229
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:59:40 2010 -0800

    MAPREDUCE-461. Enable ServicePlugins for the JobTracker
    
    Description: Allow ServicePlugins (see <a href="http://issues.apache.org/jira/browse/HADOOP-5257" title="Export namenode/datanode functionality through a pluggable RPC layer"><del>HADOOP-5257</del></a>) for the JobTracker.
    (Relies on HADOOP-5640)
    Reason: API Improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit c58318cfa6e26b7dbacd4093d646fc8b66f9eda6
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:58:23 2010 -0800

    HADOOP-5640. Allow ServicePlugins to hook callbacks into key service events
    
    Description: <a href="http://issues.apache.org/jira/browse/HADOOP-5257" title="Export namenode/datanode functionality through a pluggable RPC layer"><del>HADOOP-5257</del></a> added the ability for NameNode and DataNode to start and stop ServicePlugin implementations at NN/DN start/stop. However, this is insufficient integration for some common use cases.
    
    <p>We should add some functionality for Plugins to subscribe to events generated by the service they're plugging into. Some potential hook points are:</p>
    
    <p>NameNode:</p>
    <ul class="alternate" type="square">
    	<li>new datanode registered</li>
    	<li>datanode has died</li>
    	<li>exception caught</li>
    	<li>etc?</li>
    </ul>
    
    <p>DataNode:</p>
    <ul class="alternate" type="square">
    	<li>startup</li>
    	<li>initial registration with NN complete (this is important for HADOOP-4707 to sync up datanode.dnRegistration.name with the NN-side registration)</li>
    	<li>namenode reconnect</li>
    	<li>some block transfer hooks?</li>
    	<li>exception caught</li>
    </ul>
    
    <p>I see two potential routes for implementation:</p>
    
    <p>1) We make an enum for the types of hookpoints and have a general function in the ServicePlugin interface. Something like:</p>
    
    <div class="code panel" style="border-width: 1px;"><div class="codeContent panelContent">
    <pre class="code-java"><span class="code-keyword">enum</span> HookPoint {
      DN_STARTUP,
      DN_RECEIVED_NEW_BLOCK,
      DN_CAUGHT_EXCEPTION,
     ...
    }
    
    void runHook(HookPoint hp, <span class="code-object">Object</span> value);</pre>
    </div></div>
    
    <p>2) We make classes specific to each "pluggable" as was originally suggested in HADOOP-5257. Something like:</p>
    
    <div class="code panel" style="border-width: 1px;"><div class="codeContent panelContent">
    <pre class="code-java">class DataNodePlugin {
      void datanodeStarted() {}
      void receivedNewBlock(block info, etc) {}
      void caughtException(Exception e) {}
      ...
    }</pre>
    </div></div>
    
    <p>I personally prefer option (2) since we can ensure plugin API compatibility at compile-time, and we avoid an ugly switch statement in a runHook() function.</p>
    
    <p>Interested to hear what people's thoughts are here.</p>
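    
    Option (2) can be sketched as a base class whose hooks default to no-ops, so a
    plugin overrides only what it needs and the compiler checks every signature.
    All names here are hypothetical, and the long block id merely stands in for the
    unspecified "block info" argument.
    
    ```java
    import java.util.ArrayList;
    import java.util.List;

    final class PluginDispatchSketch {
        public static void main(String[] args) {
            LoggingPluginSketch plugin = new LoggingPluginSketch();
            List<DataNodePluginSketch> plugins = new ArrayList<>();
            plugins.add(plugin);
            // The service calls each hook directly; no switch on an enum is needed.
            for (DataNodePluginSketch p : plugins) {
                p.datanodeStarted();
                p.receivedNewBlock(42L);
                p.caughtException(new RuntimeException("ignored by this plugin"));
            }
            System.out.println(plugin.events);
        }
    }

    // Hypothetical base class: every hook is a no-op unless overridden.
    abstract class DataNodePluginSketch {
        void datanodeStarted() {}
        void receivedNewBlock(long blockId) {}
        void caughtException(Exception e) {}
    }

    // Example plugin that records only the events it cares about.
    final class LoggingPluginSketch extends DataNodePluginSketch {
        final List<String> events = new ArrayList<>();

        @Override
        void datanodeStarted() {
            events.add("started");
        }

        @Override
        void receivedNewBlock(long blockId) {
            events.add("block " + blockId);
        }
    }
    ```
    
    Adding a new hook to the base class does not break existing plugins, because
    they inherit the no-op default.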
    Reason: API Improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 137999a0b48a81bed10a5f30868dbfe6d176956b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:58:09 2010 -0800

    HADOOP-5257. Export namenode/datanode functionality through a pluggable RPC layer
    
    Description: Adding support for pluggable components would allow exporting DFS functionality using arbitrary protocols, like Thrift or Protocol Buffers. I'm opening this issue on Dhruba's suggestion in HADOOP-4707.
    
    <p>Plug-in implementations would extend this base class:</p>
    
    <div class="code panel" style="border-width: 1px;"><div class="codeContent panelContent">
    <pre class="code-java"><span class="code-keyword">abstract</span> class Plugin {
    
        <span class="code-keyword">public</span> <span class="code-keyword">abstract</span> void datanodeStarted(DataNode datanode);
    
        <span class="code-keyword">public</span> <span class="code-keyword">abstract</span> void datanodeStopping();
    
        <span class="code-keyword">public</span> <span class="code-keyword">abstract</span> void namenodeStarted(NameNode namenode);
    
        <span class="code-keyword">public</span> <span class="code-keyword">abstract</span> void namenodeStopping();
    }</pre>
    </div></div>
    
    <p>Name node instances would then start the plug-ins according to a configuration object, and would also shut them down when the node goes down:</p>
    
    <div class="code panel" style="border-width: 1px;"><div class="codeContent panelContent">
    <pre class="code-java"><span class="code-keyword">public</span> class NameNode {
    
        <span class="code-comment">// [..]
    </span>
        <span class="code-keyword">private</span> void initialize(Configuration conf) {
            <span class="code-comment">// [...]
    </span>        <span class="code-keyword">for</span> (Plugin p: PluginManager.loadPlugins(conf))
              p.namenodeStarted(<span class="code-keyword">this</span>);
        }
    
        <span class="code-comment">// [..]
    </span>
        <span class="code-keyword">public</span> void stop() {
            <span class="code-keyword">if</span> (stopRequested)
                <span class="code-keyword">return</span>;
            stopRequested = <span class="code-keyword">true</span>;
            <span class="code-keyword">for</span> (Plugin p: plugins)
                p.namenodeStopping();
            <span class="code-comment">// [..]
    </span>    }
    
        <span class="code-comment">// [..]
    </span>}</pre>
    </div></div>
    
    <p>Data nodes would do a similar thing in <tt>DataNode.startDatanode()</tt> and <tt>DataNode.shutdown()</tt>.</p>
    Reason: MISSING: Reason for inclusion
    Author: Carlos Valiente
    Ref: UNKNOWN

commit 155394ca5eed2e2a6151a5c9d9452e9cfbb30a11
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:57:58 2010 -0800

    MAPREDUCE-971. distcp does not always remove distcp.tmp.dir
    
    Description: Sometimes distcp leaves behind its tmpdir when the target filesystem is s3n.
    Reason: Bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 7575b83ba0cab30394bad0943ff906ab0609dc40
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:57:49 2010 -0800

    CLOUDERA-BUILD. Package sqoop docs.

commit 9321b18352e55d4d37c25335b578151b18f938f2
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:57:32 2010 -0800

    MAPREDUCE-923. Sqoop's ORM uses URLDecoder on a file, which replaces plus signs in a jar file name with spaces
    
    Description: In findThisJar, sqoop runs URLDecoder.decode on the resulting jar, which has the effect of replacing any + signs in the path with a space.  This obviously breaks the classpath variable that it's trying to set, and the sqoop-generated code fails to compile.  Ironically, Cloudera's hadoop distro is the one that puts + characters in jar files, and so exhibits the bug.  Here is an example from running sqoop with log4j at debug level.  Note the space in the very last term, which should read hadoop-0.20.0+61-sqoop.jar rather than hadoop-0.20.0 61-sqoop.jar.
    
    <p>09/08/27 18:00:07 DEBUG orm.CompilationManager: Invoking javac with args: -sourcepath ./ -d /tmp/sqoop/compile/ -classpath /usr/lib/hadoop-0.20/conf:/usr/java/jdk1.6.0_06/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-0.20.0+61-core.jar:/usr/lib/hadoop-0.20/lib/commons-cli-2.0-SNAPSHOT.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-0.20.0+61-fairscheduler.jar:/usr/lib/hadoop-0.20/lib/hadoop-0.20.0+61-scribe-log4j.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-3.8.1.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar:/usr/local/hadoop/lib/hadoop-gpl-compression.jar:/usr/lib/hadoop-0.20/hadoop-0.20.0+61-core.jar:/usr/lib/hadoop-0.20/contrib/sqoop/hadoop-0.20.0 61-sqoop.jar</p>
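    
    The behavior can be reproduced in isolation. The sketch below contrasts
    URLDecoder, which applies form-encoding rules, with java.net.URI, which leaves
    '+' in a path untouched. The class and method names are hypothetical, and the
    URI-based variant is shown only as one alternative, not necessarily what the
    patch does.
    
    ```java
    import java.io.UnsupportedEncodingException;
    import java.net.URI;
    import java.net.URISyntaxException;
    import java.net.URLDecoder;

    final class PlusSignSketch {
        // Form decoding: '+' becomes a space, which corrupts the jar file name.
        static String decodeAsForm(String path) throws UnsupportedEncodingException {
            return URLDecoder.decode(path, "UTF-8");
        }

        // URI decoding: percent escapes are decoded, '+' is preserved.
        static String decodeAsUri(String fileUrl) throws URISyntaxException {
            return new URI(fileUrl).getPath();
        }

        public static void main(String[] args) throws Exception {
            String jar = "/usr/lib/hadoop-0.20/contrib/sqoop/hadoop-0.20.0+61-sqoop.jar";
            System.out.println(decodeAsForm(jar));
            System.out.println(decodeAsUri("file:" + jar));
        }
    }
    ```
    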
    Reason: Bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit e97883c5b9c389f82a6447e4cb1678c0a0ed83ba
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:57:19 2010 -0800

    CLOUDERA-BUILD. Sqoop asciidoc syntax error
    
    Author: Aaron Kimball

commit 520bda2edcb90dfe9461e16b96aa4a048d33ed7b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:57:11 2010 -0800

    HADOOP-5450. Add support for application-specific typecodes to typed bytes
    
    Description: For serializing objects of types that are not supported by typed bytes serialization, applications might want to use a custom serialization format. Right now, typecode 0 has to be used for the bytes resulting from this custom serialization, which could lead to problems when deserializing the objects because the application cannot know whether a byte sequence following typecode 0 is a custom-serialized object or just a raw sequence of bytes. Therefore, a range of typecodes that are treated as aliases for 0 should be added, such that different typecodes can be used for application-specific purposes.
    Reason: New feature
    Author: Klaas Bosteels
    Ref: UNKNOWN

commit b30fc99332c4a444d275731dac4b4245115d65b2
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:56:59 2010 -0800

    HADOOP-1722. Make streaming to handle non-utf8 byte array
    
    Description: Right now, the streaming framework expects the output of the stream process (mapper or reducer) to be<br/>
    line-oriented UTF-8 text. This limit makes it impossible to use programs whose output may be non-UTF-8<br/>
    (international encodings, or maybe even binary data). Streaming can overcome this limit by introducing a simple<br/>
    encoding protocol. For example, it can allow the mapper/reducer to hex-encode its keys/values, and<br/>
    the framework decodes them on the Java side.<br/>
    This way, as long as the mapper/reducer executables follow this encoding protocol,<br/>
    they can output arbitrary byte arrays and the streaming framework can handle them.
    Reason: New feature
    Author: Klaas Bosteels
    Ref: UNKNOWN

commit 921c135653736bcc279700435358058762bc8f78
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:56:43 2010 -0800

    CLOUDERA-BUILD. More Sqoop documentation updates
    
    Author: Aaron Kimball

commit be7f1dc031e17dc4f53ebe76d27c1b9242105785
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:56:26 2010 -0800

    MAPREDUCE-840. DBInputFormat leaves open transaction
    
    Description: (Reapplied after HADOOP-4687)
    Reason: MISSING: Reason for inclusion
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 89a96d8fff80ac809dbda9582044a7c6b3986d16
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:56:07 2010 -0800

    MAPREDUCE-906. Updated Sqoop documentation
    
    Description: Provides the latest documentation for Sqoop, in both user-guide and manpage form. Built with asciidoc.
    Reason: Documentation
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 51f867aea0667d0191b730ea3abf114e75cafa4b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:55:54 2010 -0800

    MAPREDUCE-907. Sqoop should use more intelligent splits
    
    Description: Sqoop should use the new split generation / InputFormat in <a href="http://issues.apache.org/jira/browse/MAPREDUCE-885" title="More efficient SQL queries for DBInputFormat"><del>MAPREDUCE-885</del></a>
    Reason: Performance / scalability improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 239df04415dba8d12c7d3fbf33c580d473202e94
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:55:28 2010 -0800

    MAPREDUCE-885. More efficient SQL queries for DBInputFormat
    
    Description: DBInputFormat generates InputSplits by counting the available rows in a table, and selecting subsections of the table via the "LIMIT" and "OFFSET" SQL keywords. These are only meaningful in an ordered context, so the query also includes an "ORDER BY" clause on an index column. The resulting queries are often inefficient and require full table scans. Actually using multiple mappers with these queries can lead to O(n^2) behavior in the database, where n is the number of splits. Attempting to use parallelism with these queries is counter-productive.
    
    <p>A better mechanism is to organize splits based on data values themselves, which can be performed in the WHERE clause, allowing for index range scans of tables, and can better exploit parallelism in the database.</p>
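    
    A minimal sketch of value-based splitting, assuming an integer split column
    whose minimum and maximum are already known and small enough that the
    arithmetic cannot overflow. The class and method names are hypothetical and
    this is not the InputFormat code itself.
    
    ```java
    import java.util.ArrayList;
    import java.util.List;

    final class RangeSplitSketch {
        // Returns one WHERE condition per split, covering [min, max] without overlap.
        static List<String> whereClauses(String column, long min, long max, int numSplits) {
            List<String> clauses = new ArrayList<>();
            long span = max - min + 1;
            // Ceiling division so that at most numSplits ranges are produced.
            long step = Math.max(1, (span + numSplits - 1) / numSplits);
            for (long lo = min; lo <= max; lo += step) {
                long hi = Math.min(lo + step, max + 1);
                clauses.add(column + " >= " + lo + " AND " + column + " < " + hi);
            }
            return clauses;
        }

        public static void main(String[] args) {
            for (String clause : whereClauses("id", 1, 100, 4)) {
                System.out.println("SELECT * FROM t WHERE " + clause);
            }
        }
    }
    ```
    
    Each condition is a range predicate on the split column, so the database can
    answer it with an index range scan rather than an ordered full scan.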
    Reason: Performance and scalability improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 23a0d1882c797160cc7b6fae99fc5e686aa30191
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:55:16 2010 -0800

    MAPREDUCE-938. Postgresql support for Sqoop
    
    Description: Sqoop should be able to import from postgresql databases.
    Reason: Compatibility improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 7b89feb34fafd2365f75ab744db9cb07a5443046
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:55:05 2010 -0800

    MAPREDUCE-876. Sqoop import of large tables can time out
    
    Description: Related to <a href="http://issues.apache.org/jira/browse/MAPREDUCE-875" title="Make DBRecordReader execute queries lazily"><del>MAPREDUCE-875</del></a>, Sqoop should use a background thread to ensure that progress is being reported while a database does external work for the MapReduce task.
    Reason: Scalability improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 61d4ef5175dca1859a1320f9e7cad1caeab5d982
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:54:49 2010 -0800

    MAPREDUCE-918. Test hsqldb server should be memory-only.
    
    Description: Sqoop launches a standalone hsqldb server for unit tests, but it currently writes its database to disk and uses a connect string of <tt>//localhost</tt>. If multiple test instances are running concurrently, one test server may serve to the other instance of the unit tests, causing race conditions.
    Reason: Bugfix in test harness
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 1fc17ad34e8288b54503eeb15f788eb4e6a070dc
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:54:37 2010 -0800

    MAPREDUCE-875. Make DBRecordReader execute queries lazily
    
    Description: DBInputFormat's DBRecordReader executes the user's SQL query in the constructor. If the query is long-running, this can cause task timeout. The user is unable to spawn a background thread (e.g., in a MapRunnable) to inform Hadoop of on-going progress.
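    
    Lazy execution can be sketched as follows: the constructor only stores the
    query, and the first read triggers it. The class is hypothetical, and a
    Supplier of an Iterator stands in for running a JDBC statement.
    
    ```java
    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.function.Supplier;

    final class LazyReaderSketch<T> {
        private final Supplier<Iterator<T>> query;
        private Iterator<T> results; // stays null until the first read

        LazyReaderSketch(Supplier<Iterator<T>> query) {
            this.query = query; // nothing is executed here
        }

        boolean hasNext() {
            if (results == null) {
                results = query.get(); // the potentially slow call happens on first use
            }
            return results.hasNext();
        }

        T next() {
            hasNext(); // make sure the query has been run
            return results.next();
        }

        public static void main(String[] args) {
            LazyReaderSketch<String> reader = new LazyReaderSketch<>(() -> {
                System.out.println("query executed");
                return Arrays.asList("row1", "row2").iterator();
            });
            System.out.println("reader constructed");
            while (reader.hasNext()) {
                System.out.println(reader.next());
            }
        }
    }
    ```
    
    Because construction returns immediately, the caller has a chance to start a
    progress-reporting thread before the slow query begins.
    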
    Reason: Scalability improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 21fdb7a7fd501fd63e1a540c2b55cf410d057301
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:54:27 2010 -0800

    MAPREDUCE-825. JobClient completion poll interval of 5s causes slow tests in local mode
    
    Description: The JobClient.NetworkedJob.waitForCompletion() method polls for job completion every 5 seconds. When running a set of short tests in pseudo-distributed mode, this is unnecessarily slow and causes lots of wasted time. When bandwidth is not scarce, setting the poll interval to 100 ms results in a 4x speedup in some tests.  This interval should be parametrized to allow users to control the interval for testing purposes.
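    
    A minimal sketch of a parameterized poll loop. The class, the method signature,
    and the BooleanSupplier standing in for the job status check are assumptions
    made for illustration; they are not the JobClient API.
    
    ```java
    import java.util.function.BooleanSupplier;

    final class PollIntervalSketch {
        // Polls until isComplete returns true; returns how many status checks were made.
        static int waitForCompletion(BooleanSupplier isComplete, long pollIntervalMs)
                throws InterruptedException {
            int polls = 1;
            while (!isComplete.getAsBoolean()) {
                Thread.sleep(pollIntervalMs);
                polls++;
            }
            return polls;
        }

        public static void main(String[] args) throws InterruptedException {
            long deadline = System.currentTimeMillis() + 250;
            // A 100 ms interval notices completion far sooner than a 5 s interval would.
            int polls = waitForCompletion(() -> System.currentTimeMillis() >= deadline, 100);
            System.out.println("status checks: " + polls);
        }
    }
    ```
    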
    Reason: Test performance improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit f996b8a019bffefff183d7d688ccf95b8cb73de5
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:54:15 2010 -0800

    MAPREDUCE-750. Extensible ConnManager factory API
    
    Description: Sqoop uses the ConnFactory class to instantiate a ConnManager implementation based on the connect string and other arguments supplied by the user. This allows per-database logic to be encapsulated in different ConnManager instances, and dynamically chosen based on which database the user is actually importing from. But adding new ConnManager implementations requires modifying the source of a common ConnFactory class. An indirection layer should be used to delegate instantiation to a number of factory implementations which can be specified in the static configuration or at runtime.
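    
    The indirection layer can be sketched as a chain of factories that are
    consulted in order. All names below are hypothetical, and a String stands in
    for a ConnManager instance to keep the example self-contained.
    
    ```java
    import java.util.ArrayList;
    import java.util.List;

    final class FactoryChainSketch {
        // A factory returns a manager for connect strings it understands, else null.
        interface ManagerFactorySketch {
            String accept(String connectString);
        }

        private final List<ManagerFactorySketch> factories = new ArrayList<>();

        // New databases are supported by registering a factory, not by editing this class.
        void register(ManagerFactorySketch factory) {
            factories.add(factory);
        }

        String managerFor(String connectString) {
            for (ManagerFactorySketch factory : factories) {
                String manager = factory.accept(connectString);
                if (manager != null) {
                    return manager; // first factory that accepts wins
                }
            }
            throw new IllegalArgumentException("No manager for " + connectString);
        }

        public static void main(String[] args) {
            FactoryChainSketch chain = new FactoryChainSketch();
            chain.register(cs -> cs.startsWith("jdbc:mysql:") ? "mysql-manager" : null);
            chain.register(cs -> cs.startsWith("jdbc:") ? "generic-jdbc-manager" : null);
            System.out.println(chain.managerFor("jdbc:mysql://db/sales"));
            System.out.println(chain.managerFor("jdbc:postgresql://db/sales"));
        }
    }
    ```
    
    In a real deployment the list of factory classes would come from configuration
    rather than from hard-coded register calls.
    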
    Reason: API flexibility improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 39bdff7bd3b83359884c90ae857d3f3144a94803
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:54:04 2010 -0800

    MAPREDUCE-749. Make Sqoop unit tests more Hudson-friendly
    
    Description: Hudson servers (other than Apache's) need to be able to run the sqoop unit tests which depend on third-party JDBC drivers / database implementations. The build.xml needs some refactoring to make this happen.
    Reason: Test coverage improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 0ca54f2722206685d9e36fcbb2656d0ac1957311
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:53:47 2010 -0800

    MAPREDUCE-792. javac warnings in DBInputFormat
    
    Description: <a href="http://issues.apache.org/jira/browse/MAPREDUCE-716" title="org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle"><del>MAPREDUCE-716</del></a> introduces javac warnings
    Reason: Technical debt
    Author: Aaron Kimball
    Ref: UNKNOWN

commit e39ae9d017e89e4df193b1f8075184320230499b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:52:45 2010 -0800

    MAPREDUCE-716. org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle
    
    Description: Applied "trunk" version of the patch after incorporating
    HADOOP-4687's move of DBInputFormat-related files. (Prior patch was 0.20-branch
    specific)
    Reason: Branch compatibility improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 074e824f5d3d2f6ab862083e6eb4b0df8c881bfc
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:52:27 2010 -0800

    MAPREDUCE-910. MRUnit should support counters
    
    Description: incrCounter() is currently a dummy stub method in MRUnit that does nothing. Would be good for the mock reporter/context implementations to support counters.
    Reason: New feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit b4b7c5d9b4cba84bc47f4a48074fd295d060ab35
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:52:17 2010 -0800

    MAPREDUCE-798. MRUnit should be able to test a succession of MapReduce passes
    
    Description: MRUnit can currently test that the inputs to a given (mapper, reducer) "job" produce certain outputs at the end of the reducer. It would be good to support more end-to-end tests of a series of MapReduce jobs that form a longer pipeline surrounding some data.
    Reason: New Feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 59677d22261974560117fa82e74d9a7f80f804d5
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:52:06 2010 -0800

    MAPREDUCE-800. MRUnit should support the new API
    
    Description: MRUnit's TestDriver implementations use the old org.apache.hadoop.mapred-based classes. TestDrivers and associated mock object implementations are required for org.apache.hadoop.mapreduce-based code.
    Reason: New feature (API Compatibility)
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 7fda23b419b1c98e84eea43a0f35191d41032e18
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:51:53 2010 -0800

    MAPREDUCE-799. Some of MRUnit's self-tests were not being run
    
    Description: Due to method naming issues, some test cases were not being executed.
    Reason: Bugfix; test coverage
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 20d5bf205e9f2864f3da53d30408ba97763a46e9
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:51:40 2010 -0800

    MAPREDUCE-797. MRUnit MapReduceDriver should support combiners
    
    Description: The MapReduceDriver allows you to specify a mapper and a reducer class with a simple sort/"shuffle" between the passes. It would be nice to also support another Reducer implementation being used as a combiner in the middle.
    Reason: New feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 5c873336b3380e6c8f07ca28230ede9d41e4e840
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:50:05 2010 -0800

    Integrate with 0.21-branch versions of DBInputFormat
    
    Description: In 0.21 there is now a DBInputFormat in the mapred/lib/ package
    as well as mapreduce/lib/db. This patch backports the new API edition of
    DBInputFormat to CDH
    Reason: Cross-branch compatibility improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 51b650554e3bc8054e8ca966f5f552c522f7483d
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:49:52 2010 -0800

    HADOOP-5170. Set max map/reduce tasks on a per-job basis, either per-node or cluster-wide
    
    Description: There are a number of use cases for being able to do this.  The focus of this jira should be on finding what would be the simplest to implement that would satisfy the most use cases.
    
    <p>This could be implemented as either a per-node maximum or a cluster-wide maximum.  It seems that for most uses the former is preferable; however, either would fulfill the requirements of this jira.</p>
    
    <p>Some of the reasons for allowing this feature (mine and from others on list):</p>
    <ul class="alternate" type="square">
    	<li>I have some very large CPU-bound jobs.  I am forced to keep the max map/node limit at 2 or 3 (on a 4 core node) so that I do not starve the Datanode and Regionserver.  I have other jobs that are network latency bound and would like to be able to run high numbers of them concurrently on each node.  Though I can thread some jobs, there are some use cases that are difficult to thread (scanning from hbase) and there's significant complexity added to the job rather than letting hadoop handle the concurrency.</li>
    	<li>Poor assignment of tasks to nodes creates some situations where you have multiple reducers on a single node but other nodes that received none.  A limit of 1 reducer per node for that job would prevent that from happening. (only works with per-node limit)</li>
    	<li>Poor man's MR job virtualization.  Since we can limit a job's resources, this gives much more control in allocating and dividing up resources of a large cluster.  (makes most sense w/ cluster-wide limit)</li>
    </ul>
    
    Reason: Configuration improvement
    Author: Matei Zaharia
    Ref: UNKNOWN

commit 99e25a93542251debd248ed71cb380858ca8c9bd
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:49:40 2010 -0800

    HADOOP-6166. Improve PureJavaCrc32
    
    Description: Got some ideas to improve CRC32 calculation.
    Reason: Performance Improvement
    Author: Tsz Wo (Nicholas), SZE
    Ref: UNKNOWN

commit 2d0a97cefa559ab9059d976bda66f9dbcf051e79
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:49:28 2010 -0800

    MAPREDUCE-782. Use PureJavaCrc32 in mapreduce spills
    
    Description: <a href="http://issues.apache.org/jira/browse/HADOOP-6148" title="Implement a pure Java CRC32 calculator"><del>HADOOP-6148</del></a> implemented a Pure Java implementation of CRC32 which performs better than the built-in one. This issue is to make use of it in the mapred package
    Reason: Performance improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit bb65cb649c2924b5a20f06deb9ecd66fc219eeeb
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:49:12 2010 -0800

    HDFS-496. Use PureJavaCrc32 in HDFS
    
    Description: Common now has a pure java CRC32 implementation which is more efficient than java.util.zip.CRC32. This issue is to make use of it.
    Reason: Performance improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit ac73e6d51d5ad1df993097349602e5f3199b952a
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:48:40 2010 -0800

    HADOOP-6148. Implement a pure Java CRC32 calculator
    
    Description: We've seen a reducer writing 200MB to HDFS with replication = 1 spending a long time in crc calculation. In particular, it was spending 5 seconds in crc calculation out of a total of 6 for the write. I suspect that it is the java-jni border that is causing us grief.
    
    This outperforms java.util.zip.CRC32.
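    
    For illustration, a basic table-driven CRC32 written entirely in Java is shown
    below. It is a plain byte-at-a-time version using the standard reflected
    polynomial 0xEDB88320, not the optimized PureJavaCrc32 from the patch, and the
    class name is hypothetical. It produces the same values as java.util.zip.CRC32.
    
    ```java
    import java.nio.charset.StandardCharsets;
    import java.util.zip.CRC32;

    final class SimpleCrc32Sketch {
        private static final int[] TABLE = new int[256];

        static {
            // Precompute the CRC of every possible byte value.
            for (int n = 0; n < 256; n++) {
                int c = n;
                for (int k = 0; k < 8; k++) {
                    c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
                }
                TABLE[n] = c;
            }
        }

        static long crc32(byte[] data) {
            int crc = 0xFFFFFFFF;
            for (byte b : data) {
                // One table lookup per input byte; no native call is involved.
                crc = (crc >>> 8) ^ TABLE[(crc ^ b) & 0xFF];
            }
            return (~crc) & 0xFFFFFFFFL;
        }

        public static void main(String[] args) {
            byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
            CRC32 reference = new CRC32();
            reference.update(data);
            System.out.println(Long.toHexString(crc32(data)));
            System.out.println(Long.toHexString(reference.getValue()));
        }
    }
    ```
    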
    Reason: Performance improvement
    Author: Scott Carey and Todd Lipcon
    Ref: UNKNOWN

commit e7430c8cbd2d182716ac7efb08cb2187c1edab95
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:48:08 2010 -0800

    Updated Sqoop documentation for MAPREDUCE-816, MAPREDUCE-789.
    
    Reason: Documentation improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit aa75ab7f749604c354dcdb0b806aca9cd140f504
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:47:58 2010 -0800

    MAPREDUCE-789. Oracle support for Sqoop
    
    Description: A separate ConnManager is needed for Oracle to support its slightly different syntax and configuration
    Reason: Compatibility improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 6f017db468a82e336a28f451c7d90bc225130094
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:47:33 2010 -0800

    MAPREDUCE-840. DBInputFormat leaves open transaction
    
    Description: DBInputFormat.getSplits() does not call connection.commit() after the COUNT query. This can leave an open transaction against the database which interferes with other connections to the same table.
    Reason: bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 84b622a5f6f5bd145f19f4c08b6263759ac51756
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:47:15 2010 -0800

    MAPREDUCE-816. Rename "local" mysql import to "direct"
    
    Description: A mysqldump-based fast path known as "local mode" is used in sqoop when users pass the argument <tt>--local</tt>. The restriction that this only import from localhost was based on an implementation technique that was later abandoned in favor of a more general one, which can support remote hosts as well. Thus, <tt>--local</tt> is a poor name for the flag. <tt>--direct</tt> is more general and more descriptive. This should be used instead.
    Reason: Interface clarification
    Author: Aaron Kimball
    Ref: UNKNOWN

commit ce75318a484615dc7b161a41710884f34db50c86
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:46:34 2010 -0800

    MAPREDUCE-716. org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle
    
    Description: <p>The out-of-the-box Hadoop implementation works properly with mysql/hsqldb, but NOT with oracle.<br/>
    The reason is that DBInputFormat is implemented with mysql/hsqldb-specific query constructs such as "LIMIT" and "OFFSET".</p>
    
    <p>FIX:<br/>
    Build database-provider-specific logic based on the database provider name (which we can get from the connection).</p>
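    
    A rough sketch of provider-specific paging, assuming the product name has
    already been read from the connection (for example through
    Connection.getMetaData().getDatabaseProductName()). The class, the method, and
    the rno alias are hypothetical, and the Oracle branch uses the common ROWNUM
    wrapping idiom rather than the exact query from the patch.
    
    ```java
    import java.util.Locale;

    final class PagedQuerySketch {
        // Wraps baseQuery so that it returns `length` rows starting after `offset`.
        static String pagedQuery(String productName, String baseQuery, long offset, long length) {
            if (productName.toUpperCase(Locale.ROOT).startsWith("ORACLE")) {
                // Oracle has no LIMIT/OFFSET here; page with ROWNUM instead.
                return "SELECT * FROM (SELECT a.*, ROWNUM rno FROM (" + baseQuery
                    + ") a WHERE ROWNUM <= " + (offset + length) + ") WHERE rno > " + offset;
            }
            // mysql/hsqldb style.
            return baseQuery + " LIMIT " + length + " OFFSET " + offset;
        }

        public static void main(String[] args) {
            String query = "SELECT * FROM t ORDER BY id";
            System.out.println(pagedQuery("MySQL", query, 20, 10));
            System.out.println(pagedQuery("Oracle", query, 20, 10));
        }
    }
    ```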
    
    Reason: Compatibility improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 338de775796c2102ce680eaa983b719b50e9f3ee
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:46:18 2010 -0800

    HADOOP-5469. Exposing Hadoop metrics via HTTP
    
    Description: Implement a "/metrics" URL on the HTTP server of Hadoop daemons, to expose metrics data to users via their web browsers, in plain-text and JSON.
    Reason: New feature
    Author: Philip Zeyliger
    Ref: UNKNOWN

commit cad421ec1c51382f81714ccafb96a6bb8bcc8aec
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:46:11 2010 -0800

    HADOOP-5469. Exposing Hadoop metrics via HTTP
    
    Description: Implement a "/metrics" URL on the HTTP server of Hadoop daemons, to expose metrics data to users via their web browsers, in plain-text and JSON.
    Reason: MISSING: Reason for inclusion
    Author: Philip Zeyliger
    Ref: UNKNOWN

commit 8b09839047997a4b5461703650b5779ec86c1844
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:45:49 2010 -0800

    CLOUDERA-BUILD. Added Sqoop documentation to installation script
    
    Author: Todd Lipcon

commit 7e77c6b13f06dec9c742bf76c81e2ec02d81c7cb
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:45:35 2010 -0800

    CLOUDERA-BUILD. Fix the hadoop/sqoop wrapper scripts
    
    Author: Matt Massie

commit 0caaf80f3a569b91f482de0dcb87f826967f5c7c
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:45:16 2010 -0800

    CLOUDERA-BUILD. Fix a bug in the hadoop/sqoop wrapper generation
    
    Author: Matt Massie
    Ref: UNKNOWN

commit bd8ddae402a876fe78cbb1482362935780b57d84
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:44:59 2010 -0800

    CLOUDERA-BUILD. Update the install hadoop script
    
    Author: Matt Massie
    Ref: UNKNOWN

commit 80cf01124877a5aebd742142b10fda45910f0328
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:44:42 2010 -0800

    CLOUDERA-BUILD. Rename the hadoop man page to be hadoop-0.20
    
    Author: Matt Massie
    Ref: UNKNOWN

commit 78cb9f21a3ddf04f8cef9e37a94f657448d0d111
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:43:51 2010 -0800

    HADOOP-5745. Allow setting the default value of maxRunningJobs for all pools
    
    Description: The &lt;pool&gt; element allows setting the maxRunningJobs for that pool. It would be nice to be able to set a default value for all pools.
    
    <p>In our configuration, pools are autocreated: every new user gets their own pool. We would like to allow each user to run a maximum of 5 jobs at a time. For the etl pool, this limit will be set to a greater value.</p>
    Reason: Improved configuration flexibility
    Author: dhruba borthakur
    Ref: UNKNOWN

commit 3c39e1fa8c3c89fc8f11f1faff46397fa82d5116
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:43:13 2010 -0800

    MAPREDUCE-906. Updated Sqoop documentation.
    
    Description: Update Sqoop documentation with user guide and manpage.
    Reason: Documentation improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 79a2645bc81894331721ef94c255992075ccf195
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:42:14 2010 -0800

    CLOUDERA-BUILD. Added MySQL Connector/J library for Sqoop.
    
    Description: We can ship MySQL Connector/J with CDH because the licenses
    are compatible. However, the public Apache project will not include this
    library in their source repository due to stricter licensing concerns.
    Reason: Simplifies deployment of Sqoop for mysql users
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 4a097b35bf1264a0606f2ebe410c45f16f900f03
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:42:05 2010 -0800

    MAPREDUCE-705. User-configurable quote and delimiter characters for Sqoop records and record reparsing
    
    Description: Sqoop needs a mechanism for users to govern how fields are quoted and what delimiter characters separate fields and records. With delimiters providing an unambiguous format, a parse method can reconstitute the generated record data object from a text-based representation of the same record.
    Reason: New feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 58e23056af0e99ef611ac258719207cc9459a849
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:41:47 2010 -0800

    MAPREDUCE-710. Sqoop should read and transmit passwords in a more secure manner
    
    Description: Sqoop's current support for passwords involves reading passwords from the command line "--password foo", which makes the password visible to other users via 'ps'. An invisible-console approach should be taken.
    
    Related, Sqoop transmits passwords to mysqldump in the same fashion, which is also insecure.
    Reason: Security improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit a67a0f77729fb9005b0c47872d6ba677f6434b41
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:41:34 2010 -0800

    MAPREDUCE-713. Sqoop has some superfluous imports
    
    Description: Some classes have vestigial imports that should be removed
    Reason: Code cleanup
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 0a4dab2eac0ba8b6da5190bc53a9ce8e4344a336
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:41:01 2010 -0800

    MAPREDUCE-685. Sqoop will fail with OutOfMemory on large tables using mysql
    
    Description: The default MySQL JDBC client behavior is to buffer the entire ResultSet in the client before allowing the user to use the ResultSet object. On large SELECTs, this can cause OutOfMemory exceptions, even when the client intends to close the ResultSet after reading only a few rows. The MySQL ConnManager should configure its connection to use row-at-a-time delivery of results to the client.
    Reason: bugfix / scalability improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 499aa76b500136a0e8996898f468b088ca5d7ed3
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:40:50 2010 -0800

    MAPREDUCE-674. Sqoop should allow a "where" clause to avoid having to export entire tables
    
    Description: Sqoop currently only exports at the granularity of a table.  This doesn't work well on systems with large tables, where the overhead of performing a full dump each time is significant.  Allowing the user to specify a where clause is a relatively simple task which will give Sqoop a lot more flexibility.
    Reason: New feature
    Author: Kevin Weil
    Ref: UNKNOWN

commit ed4ba254d7708f363f5f1b4708e9e35061ad936c
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:40:37 2010 -0800

    MAPREDUCE-675. Sqoop should allow user-defined class and package names
    
    Description: Currently Sqoop generates a class for each table to be imported; the class names are equal to the table names and they are not part of any package.
    
    This adds --class-name and --package-name parameters to Sqoop, allowing these aspects of code generation to be controlled.
    Reason: New feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 16e0ca8119b99b244c9eeafd78bb9eb43e4ba639
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:40:20 2010 -0800

    MAPREDUCE-703. Sqoop requires dependency on hsqldb in ivy
    
    Description: Sqoop builds crash without explicit dependency on hsqldb.
    Reason: build system bugfix
    Author: Aaron Kimball
    Ref: UNKNOWN

commit b8e54791e990328db983f070e9a04952301eda35
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:40:04 2010 -0800

    MAPREDUCE-692. Make Hudson run Sqoop unit tests
    
    Description: Running 'ant test-contrib' didn't test Sqoop because it wasn't explicitly listed in the build.xml file in src/contrib/
    Reason: Test coverage
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 8a3b6472ae00542dadf7f7d60991ec0f21b38177
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:39:40 2010 -0800

    HADOOP-5968. Sqoop should only print a warning about mysql import speed once
    
    Description: After HADOOP-5844 (http://issues.apache.org/jira/browse/HADOOP-5844, "Use mysqldump when connecting to local mysql instance in Sqoop"), Sqoop can use mysqldump as an alternative to JDBC for importing from MySQL. If you use the JDBC mechanism, it prints a warning if you could have enabled the mysqldump path instead. But the warning is printed multiple times (every time the LocalMySQLManager is instantiated), and also when the MySQL manager is used for informational queries (e.g., listing tables) rather than true imports.
    
    It should only emit the warning once per session, and only then if it's actually doing an import.
    Reason: User experience improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 86211e3714dc5b1dbcb7a3c328336277f6657de7
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:38:44 2010 -0800

    HADOOP-5967. Sqoop should only use a single map task
    
    Description: The current DBInputFormat implementation uses SELECT ... LIMIT ... OFFSET statements to
    read from a database table. This actually results in several queries all accessing the same table at
    the same time. Most database implementations will actually use a full table scan for each such
    query, starting at row 1 and scanning down until the OFFSET is reached before emitting data to the
    client. The upshot of this is that we see O(n^2) performance in the size of the table when using a
    large number of mappers, when a single mapper would read through the table in O(n) time in the number of rows.
    
    This patch sets the number of map tasks to 1 in the MapReduce job Sqoop launches.
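
    As a rough illustration of the cost described above, the following Python sketch counts the rows a database would touch if every LIMIT/OFFSET query scans from row 1 up to the end of its slice. The function name and the even split of rows across mappers are assumptions made for this example; this is not Sqoop or DBInputFormat code.

```python
def rows_scanned(total_rows: int, num_mappers: int) -> int:
    """Total rows touched when each mapper's query scans from row 1.

    Assumes each mapper issues one SELECT ... LIMIT ... OFFSET query and
    that the database walks past `offset` rows before returning `limit` rows.
    """
    chunk = total_rows // num_mappers
    scanned = 0
    for i in range(num_mappers):
        offset = i * chunk
        # The last mapper also picks up any remainder rows.
        limit = total_rows - offset if i == num_mappers - 1 else chunk
        scanned += offset + limit
    return scanned


one_mapper = rows_scanned(1000, 1)
ten_mappers = rows_scanned(1000, 10)
```

    Under these assumptions, a 1000-row table costs 1000 row reads with one mapper and 5500 with ten. When the mapper count m divides the row count n evenly, the total is n(m+1)/2, so it grows quadratically if m scales with n.
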
    Reason: Performance improvement
    Author: Aaron Kimball
    Ref: UNKNOWN
    
    commit 410db7130a8e85ceed46850f73e74f480d45994e
    Author: Aaron Kimball <aaron@cloudera.com>
    Date:   Thu Jul 23 16:10:21 2009 -0700
    
        HADOOP-5967: Sqoop should only use a single map task

commit b8f5d1d3a30a7461936f3f92bd9f007ed2db43e8
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:38:23 2010 -0800

    HADOOP-5887. Sqoop should create tables in Hive metastore after importing to HDFS
    
    Description: Sqoop (HADOOP-5815, http://issues.apache.org/jira/browse/HADOOP-5815, "Sqoop: A database import tool for Hadoop") imports tables into HDFS; it is a straightforward enhancement to then generate a Hive DDL statement to recreate the table definition in the Hive metastore and move the imported table into the Hive warehouse directory from its upload target.
    
    This feature enhancement makes this process automatic. An import is performed with sqoop in the usual way; providing the argument "--hive-import" will cause it to then issue a CREATE TABLE .. LOAD DATA INTO statement to a Hive shell. It generates a script file and then attempts to run "$HIVE_HOME/bin/hive" on it, or failing that, any "hive" on the $PATH; $HIVE_HOME can be overridden with --hive-home. As a result, no direct linking against Hive is necessary.
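
    The lookup order described above can be sketched in Python as follows. The function name find_hive and its parameters are hypothetical and chosen only for illustration; Sqoop's actual Java implementation is not shown in this commit message.

```python
import os
import shutil
from typing import Mapping, Optional


def find_hive(hive_home_arg: Optional[str] = None,
              env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the path of a hive executable, or None if none is found.

    Order: an explicit --hive-home style override, then $HIVE_HOME,
    then any "hive" found on $PATH.
    """
    env = os.environ if env is None else env
    hive_home = hive_home_arg or env.get("HIVE_HOME")
    if hive_home:
        candidate = os.path.join(hive_home, "bin", "hive")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Fall back to searching the PATH taken from the same environment.
    return shutil.which("hive", path=env.get("PATH"))
```

    Because the tool is located and run as an external program, the caller needs no compile-time dependency on Hive, which matches the "no direct linking" point above.
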
    
    The unit tests provided with this enhancement use a mock implementation of 'bin/hive' that compares the script it's fed with one from a directory full of "expected" scripts. The exact script file referenced is controlled via an environment variable. It doesn't actually load into a proper Hive metastore, but manual testing has shown that this process works in practice, so the mock implementation is a reasonable unit testing tool.
    Reason: New feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 50993494fdc7b2284837562b500e2840106bb3bb
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:37:48 2010 -0800

    CLOUDERA-BUILD. Address issue where docs were not properly copied through to release tarball
    
    Description:
        This was caused by some cleanup in build.xml early on in the CDH 0.20
        branch
    Reason: bugfix
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 3ecb9c07279302d18f7367d49bcd98c4391cbb68
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:37:27 2010 -0800

    CLOUDERA-BUILD. Decrease build time by only rebuilding the native code for each platform
    
    Reason: build system improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit f0c6a810ba7237ec7cc570ecad8a8665768b3d06
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:37:07 2010 -0800

    CLOUDERA-BUILD. Run jdiff against vanilla Hadoop during Cloudera release build
    
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 9cf8f0cb6ed744439d8e90e3ba376edb5d9521f3
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:36:22 2010 -0800

    MAPREDUCE-415. JobControl Job always has an unassigned name
    
    Description: When org.apache.hadoop.mapred.jobcontrol.Job objects are created and added, they do not use the names specified in their respective JobConf files. Instead, the name is hardcoded to "unassigned".
    Reason: bugfix
    Author: Xavier Stevens
    Ref: UNKNOWN

commit 330f009bae260ac990426a988fc56913897a50ca
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:35:03 2010 -0800

    HADOOP-5805. problem using top level s3 buckets as input/output directories
    
    Description: When I specify top level s3 buckets as input or output directories, I get the following exception.
    
    hadoop jar subject-map-reduce.jar s3n://infocloud-input s3n://infocloud-output
    
    java.lang.IllegalArgumentException: Path must be absolute: s3n://infocloud-output
            at org.apache.hadoop.fs.s3native.NativeS3FileSystem.pathToKey(NativeS3FileSystem.java:246)
            at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:319)
            at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:667)
            at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:109)
            at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:738)
            at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1026)
            at com.evri.infocloud.prototype.subjectmapreduce.SubjectMRDriver.run(SubjectMRDriver.java:63)
            at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
            at com.evri.infocloud.prototype.subjectmapreduce.SubjectMRDriver.main(SubjectMRDriver.java:25)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
            at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
            at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
            at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
            at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
    
    The workaround is to specify input/output buckets with sub-directories:
    
    hadoop jar subject-map-reduce.jar s3n://infocloud-input/input-subdir s3n://infocloud-output/output-subdir
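
    The difference between the failing and working URIs can be seen with a generic URI parser. This Python sketch uses the standard library's urlsplit, not Hadoop's Path class, so it only suggests why a bucket-only URI might be rejected as "not absolute"; the helper name path_component is made up for the example.

```python
from urllib.parse import urlsplit


def path_component(uri: str) -> str:
    """Return only the path part of a URI (bucket/host excluded)."""
    return urlsplit(uri).path


# Bucket-only URI from the failing command: the path part is empty.
top_level = path_component("s3n://infocloud-output")

# Workaround URI: the path part starts with "/", i.e. it is absolute.
with_subdir = path_component("s3n://infocloud-output/output-subdir")
```

    Here top_level is the empty string while with_subdir is "/output-subdir", which is consistent with the workaround of adding a sub-directory.
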
    
    Reason: bugfix
    Author: Ian Nowland
    Ref: UNKNOWN

commit 35fa82b5c743e34d62449e0f4abffd885e0dfe4c
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:34:42 2010 -0800

    HADOOP-5656. Counter for S3N Read Bytes does not work
    
    Description: Counter for S3N Read Bytes does not work on trunk. On 0.18 branch neither read nor write byte counters work.
    Reason: Bugfix
    Author: Ian Nowland
    Ref: UNKNOWN

commit a6670de0a1c4b03c293ae47d1595e8c33764aaa5
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:33:43 2010 -0800

    HADOOP-5613. change S3Exception to checked exception
    
    Description: Currently the S3 filesystems can throw unchecked exceptions (S3Exception) which are not declared in the interface of FileSystem. These aren't caught by the various callers and can cause unpredictable behavior. IOExceptions are caught by most users of FileSystem because they are declared in the interface, and hence are handled better.
    
    S3Exception now extends IOException.
    Reason: Improved error-checking at compile time for user applications.
    Author: Andrew Hitchcock
    Ref: UNKNOWN

commit 1f11b63a42ae441eb8d0693ed0e4e01aca553e42
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:33:09 2010 -0800

    HADOOP-5528. Binary partitioner
    
    Description: It would be useful to have a BinaryPartitioner that partitions BinaryComparable keys by hashing a configurable part of the byte array corresponding to each key.
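
    A minimal Python sketch of the idea, not the Hadoop implementation: hash only a chosen slice of each key's bytes and map the hash to a partition. The function name, the start/end parameters, and the use of CRC32 as the hash are all assumptions for illustration.

```python
import zlib
from typing import Optional


def binary_partition(key: bytes, num_partitions: int,
                     start: int = 0, end: Optional[int] = None) -> int:
    """Choose a partition from the bytes key[start:end] only.

    `start` and `end` play the role of the "configurable part" of the key;
    bytes outside that window do not influence the result.
    """
    window = key[start:end]
    # CRC32 is used here only because it is deterministic across runs.
    return zlib.crc32(window) % num_partitions


# Two keys that share their first six bytes land in the same partition.
p1 = binary_partition(b"user42|2010-03-12", 8, 0, 6)
p2 = binary_partition(b"user42|2009-07-15", 8, 0, 6)
```

    Keys that agree on the selected bytes are always routed to the same partition, so a reducer sees all records for that prefix regardless of the rest of the key.
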
    Reason: New feature
    Author: Klaas Bosteels
    Ref: UNKNOWN

commit 716d3598e5a4a18cdfcfcf0dc800e263ef7c7685
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:32:47 2010 -0800

    HADOOP-5240. 'ant javadoc' does not check whether outputs are up to date and always rebuilds
    
    Description: Running 'ant javadoc' twice in a row calls the javadoc program both times; it doesn't check to see whether this is redundant work.
    Reason: Build system improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 2bb607d29d9080a7ca3bce72739ccef654d5392d
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:30:46 2010 -0800

    HADOOP-5175. Option to prohibit jars unpacking
    
    Description: The task tracker moves all unpacked jars into
    ${hadoop.tmp.dir}/mapred/local/taskTracker. When using a lot of external
    libraries via -libjars, this results in several thousand unpacked files.
    The amount of time needed to `du` these directories can increase to the point
    where tasks time out before starting. This patch provides an option to
    suppress jar unpacking.
    Reason: Scalability improvement
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 349281bfa0243f5adbbd459266f4a9ac7ac8c1cc
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:30:16 2010 -0800

    CLOUDERA-BUILD. Fix scribe-log4j's ivy.xml to properly get log4j on the compile classpath
    
    Author: Todd Lipcon
    Reason: bugfix to build system
    Ref: UNKNOWN

commit b07aec5129e618bfeda8ba753fb5138e612b1a8b
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:29:33 2010 -0800

    HADOOP-4829. Allow FileSystem shutdown hook to be disabled
    
    Description: FileSystem sets a JVM shutdown hook so that it can clean up the FileSystem cache. This is great behavior when you are writing a client application, but when you're writing a server application, like the Collector or an HBase RegionServer, you need to control the shutdown of the application and HDFS much more closely. If you set your own shutdown hook, there's no guarantee that your hook will run before the HDFS one, preventing you from taking some shutdown actions.
    Reason: Integration improvement.
    Author: Todd Lipcon
    Ref: UNKNOWN

commit 154c6a6474b02e68c3418fddf9a8ee5d476a8b7d
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:28:14 2010 -0800

    HADOOP-3327. Shuffling fetchers waited too long between map output fetch re-tries
    
    Description: Improves handling of READ_TIMEOUT during map output copying.
    Author: Amareshwari Sriramadasu
    Reason: bugfix
    Ref: UNKNOWN
    
    commit 8a6293fc5c3733035dde8e4d3a68c414a1f800f8
    Author: Devaraj Das <ddas@apache.org>
    Date:   Thu Feb 5 05:35:09 2009 +0000
    
        HADOOP-3327. Improves handling of READ_TIMEOUT during map output copying. Contributed by Amareshwari Sriramadasu.
    
        git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/trunk@741009 13f79535-47bb-0310-9956-ffa450edef68

commit 4ee0ecf4760d7adb3e1a81e018a3b5cd6d2e9775
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:27:44 2010 -0800

    MAPREDUCE-680. Reuse of Writable objects is improperly handled by MRUnit
    
    Description: As written, MRUnit's MockOutputCollector simply stores references to the objects passed in to its collect() method. Thus if the same Text (or other Writable) object is reused as an output container multiple times with different values, these separate values will not all be collected. MockOutputCollector needs to properly use io.serializations to deep copy the objects sent in.
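
    The aliasing problem can be reproduced in a few lines of Python. The classes below are hypothetical stand-ins for Text and MockOutputCollector, and copy.deepcopy stands in for the serialization-based copy mentioned above; this is an analogy, not MRUnit code.

```python
import copy


class MutableText:
    """Stand-in for a reusable Writable such as Text."""

    def __init__(self, value=""):
        self.value = value

    def set(self, value):
        self.value = value


class ReferenceCollector:
    """Stores the caller's objects themselves, so later mutations leak in."""

    def __init__(self):
        self.outputs = []

    def collect(self, key, value):
        self.outputs.append((key, value))


class CopyingCollector(ReferenceCollector):
    """Stores an independent deep copy of each key and value."""

    def collect(self, key, value):
        self.outputs.append((copy.deepcopy(key), copy.deepcopy(value)))


def emit_reusing_one_object(collector):
    """Emit three keys through one reused container, then list what was kept."""
    out = MutableText()
    for word in ("a", "b", "c"):
        out.set(word)  # reuse the same container for every record
        collector.collect(out, 1)
    return [key.value for key, _ in collector.outputs]


by_reference = emit_reusing_one_object(ReferenceCollector())
by_copy = emit_reusing_one_object(CopyingCollector())
```

    The reference-storing collector ends up with three entries that all read "c", while the copying collector keeps "a", "b", and "c" as separate records.
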
    Reason: Bugfix; see description.
    Author: Aaron Kimball
    Ref: UNKNOWN
    
    commit 51bdfdcf947bc8447aa36d68ae802f154516b0b6
    Author: Aaron Kimball <aaron@cloudera.com>
    Date:   Wed Jul 15 10:40:47 2009 -0700
    
        MAPREDUCE-680. Reuse of Writable objects is improperly handled by MRUnit.

commit c2026460d4cf7049c67da65d3a2db2e9bcd9c848
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:27:14 2010 -0800

    HADOOP-5518. MRUnit unit test library
    
    Description: MRUnit is a tool to help authors of MapReduce programs write unit tests.
    
    Testing map() and reduce() methods requires some repeated work to mock the inputs and outputs of a Mapper or Reducer class, and ensure that the correct values are emitted to the OutputCollector based on inputs. Also, testing a mapper and reducer together requires running them with the sorted ordering guarantees made by the shuffle process.
    
    This library provides the above functionality to authors of maps and reduces; it allows you to test maps, reduces, and map-reduce pairs without needing to perform all the setup and teardown work associated with running a job.
    
    Reason: New feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit 6991a0eb635953bf3729bce330c426ed7d8b996a
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:26:29 2010 -0800

    CLOUDERA-BUILD. Add sqoop wrapper to bin
    
    Description: Adds a '/usr/bin/sqoop' wrapper script for users
    Reason: User-experience improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit c365162d7db1ee70c8607ad84a11e4aa594224e7
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:25:56 2010 -0800

    HADOOP-5844. Use mysqldump when connecting to local mysql instance in Sqoop
    
    Description: Sqoop uses MapReduce + DBInputFormat to read the contents of a table into HDFS. On many databases, this implementation is O(N^2) in the number of rows. Also, the use of multiple mappers has low value in terms of throughput, because the database itself is inherently single-threaded. While DBInputFormat/JDBC provides a useful fallback mechanism for importing from databases, db-specific dump utilities will nearly always provide faster throughput, and should be selected when available. This patch allows users to use mysqldump to read from local mysql instances instead of the MapReduce-based input.
    Reason: Performance improvement
    Author: Aaron Kimball
    Ref: UNKNOWN

commit eddbfbca420bfb81a3a565e4324f6189bfd97e41
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:24:58 2010 -0800

    HADOOP-5815. Sqoop: A database import tool for Hadoop
    
    Description:
    Sqoop is a tool designed to help users import existing relational databases into their Hadoop clusters. Sqoop uses JDBC to connect to a database, examine the schema for tables, and auto-generate the necessary classes to import data into HDFS. It then instantiates a MapReduce job to read the table from the database via the DBInputFormat (JDBC-based InputFormat). The table is read into a set of files loaded into HDFS. Both SequenceFile and text-based targets are supported.
    Reason: New feature
    Author: Aaron Kimball
    Ref: UNKNOWN

commit b33265ff77c71af61899a4b3add1e82cc195fdb7
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:23:53 2010 -0800

    MAPREDUCE-714. JobConf.findContainingJar unescapes unnecessarily on Linux
    
    Description: In JobConf.findContainingJar, the path name is decoded using URLDecoder.decode(...). This was done by Doug in r381794 (commit msg "Un-escape containing jar's path, which is URL-encoded.  This fixes things primarily on Windows, where paths are likely to contain spaces.") Unfortunately, jar paths do not appear to be URL encoded on Linux. If you try to use "hadoop jar" on a jar with a "+" in it, this function decodes it to a space and then the job cannot be submitted.
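
    The '+' problem is easy to reproduce outside Hadoop. Java's URLDecoder.decode follows form-encoding rules, in which '+' means a space; Python's unquote_plus behaves the same way, so it is used here as an analogy. The jar path below is a made-up example, and the sketch does not show how the patch itself resolves the issue.

```python
from urllib.parse import unquote, unquote_plus

# Hypothetical jar path with a '+' in the file name.
jar_path = "/usr/lib/hadoop/hadoop-0.20.0+23-core.jar"

# Form-style decoding turns '+' into a space and corrupts the path.
decoded = unquote_plus(jar_path)

# Decoding only percent-escapes leaves '+' untouched.
preserved = unquote(jar_path)
```

    Here decoded contains "hadoop-0.20.0 23-core.jar", a file that does not exist, while preserved is identical to the original path.
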
    Reason: Cloudera-based packages include a '+' in the filename; Hadoop's URL escaper will not
    properly handle jar filenames with a '+' without this patch.
    Author: Todd Lipcon
    Ref: UNKNOWN
    
    commit d9767d2cefab288e581732f71779f3ce8e3267e4
    Author: Todd Lipcon <todd@cloudera.com>
    Date:   Mon Jul 6 19:36:11 2009 -0700
    
        MAPREDUCE-714: Fix JobConf.findContainingJars to work with jars with + in the name

commit aaeb69f8dda72a2e7aecacd622e99c00bc961efa
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:23:23 2010 -0800

    CLOUDERA-BUILD. Add dependency libraries for Scribe/log4j
    
    Author: Todd Lipcon

commit cb7a3677942c1d2f9e0d2a75dbffa09fa6125e61
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:22:41 2010 -0800

    CLOUDERA-BUILD. Apply Scribe patches to Hadoop
    
    Description:
        scribe_hadoop_trunk.patch
        Also, add empty ivy infrastructure for scribe-log4j
    Author: Todd Lipcon

commit d5ead434b221076fb830308d2d112d53aa6dc59f
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:22:26 2010 -0800

    CLOUDERA-BUILD. Use cloudera's versioning info from cloudera.hash in saveVersion.sh
    
    Description:
        This should make the "hadoop version" output far more useful for
        determining exactly what code is running. The cloudera.hash property is
        set by cloudera/build.properties which is generated during the build
        process.

commit bf10e46e425395145dcc4b85db66d45cbf9797b0
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:21:45 2010 -0800

    CLOUDERA-BUILD. Move saveVersion.sh in build.xml to ensure build
    
    Description:
        This error is due to ant 1.7.1 not compiling package-info.java if the
        timestamp of the output class directory is newer than the package-info
        file itself. Since other compiles were happening after package-info.java
        was generated, the build dir was newer and compilation was being
        skipped.
    
        Move cloudera hooks inside the package task of build.xml
    
        Fixes an issue where the fair scheduler jar was not built before the
        hooks were run, and therefore was not included in the target lib/
        directory.
    
    Ref: CLOUDERA-436

commit 5359a3bbd2b09644825be99fdd354ff3276a5d59
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:21:36 2010 -0800

    CLOUDERA-BUILD. New versions of cloudera packaging scripts

commit ee255f3909b9938b1023be6a2c59a8429227c766
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:21:27 2010 -0800

    CLOUDERA-BUILD. Change paths to point to hadoop-0.20 where necessary

commit a2d051bcf456fde45c0a0c3aa512872ce6059a97
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:21:08 2010 -0800

    CLOUDERA-BUILD. Add Hadoop manpage to Hadoop 0.20 repository

commit 9600765ec5d6c3cef9ab34ecb573cbb876acf7ee
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:21:01 2010 -0800

    CLOUDERA-BUILD. Move install_hadoop.sh into hadoop repo

commit 77ac6923ad6e63874a429e7dd13c4a084b6a9556
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:20:52 2010 -0800

    CLOUDERA-BUILD. Add example-confs directory for storing configuration of conf.pseudo

commit 14256386d4cb155fea0f5745dd6c49fba74ff40f
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:20:43 2010 -0800

    CLOUDERA-BUILD. Replace hadoop-config.sh with Cloudera version

commit f7d0a20e0d74f1aac1fb96f3c08ce31e9b9ca5d9
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:20:25 2010 -0800

    CLOUDERA-BUILD. Remove redundant code in build.xml between package and bin-package

commit 0fa65091ecd9dd150d6afb93845d3fb10d80e115
Author: Aaron Kimball <aaron@cloudera.com>
Date:   Fri Mar 12 14:16:59 2010 -0800

    CLOUDERA-BUILD. Hook build.xml to enable contrib modules
