CDH 4 Release Notes
The following sections list all Apache Hadoop JIRAs included in CDH 4
that are not included in the Apache Hadoop base version 2.0.0. The
hadoop-2.0.0-cdh4.1.0.CHANGES.txt
file lists all changes included in CDH 4. The patch for each
change can be found in the cloudera/patches directory in the release tarball.
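Each entry in the lists that follow has the form "- [PROJECT-NUMBER] - summary". As a minimal sketch, the Python snippet below shows one way such entries could be parsed, for example to look up an issue key or count entries per project. The names ENTRY_RE and parse_entries are hypothetical helpers introduced for illustration, not part of CDH or Hadoop, and the sample text is a short excerpt of these notes.

```python
import re
from collections import Counter

# Pattern for one entry, e.g. "- [HADOOP-8614] - IOUtils#skipFully hangs forever on EOF".
# ENTRY_RE and parse_entries are illustrative names, not part of any Hadoop tooling.
ENTRY_RE = re.compile(r"^- \[([A-Z]+)-(\d+)\] - (.+)$")


def parse_entries(text):
    """Return (project, issue_number, summary) tuples for every entry line in text."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            project, number, summary = match.groups()
            entries.append((project, int(number), summary))
    return entries


# Short excerpt of these release notes; heading lines are skipped by the parser.
sample = """\
Common
Bug
- [HADOOP-8614] - IOUtils#skipFully hangs forever on EOF
- [HDFS-744] - Support hsync in HDFS
"""

entries = parse_entries(sample)
counts = Counter(project for project, _, _ in entries)
print(entries)
print(counts)
```

Section headings such as "Common" or "Bug" do not match the pattern and are ignored, so the same function can be run over the full text of these notes.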
Changes Not In Apache Hadoop 2.0.0
Common
Bug
- [HADOOP-8855] - SSL-based image transfer does not work when Kerberos is disabled
- [HADOOP-8833] - fs -text should make sure to call inputstream.seek(0) before using input stream
- [HADOOP-8786] - HttpServer continues to start even if AuthenticationFilter fails to init
- [HADOOP-8821] - Findbugs warning Configuration.dumpDeprecatedKeys() concatenates strings using + in a loop
- [HADOOP-8780] - Update DeprecatedProperties apt file
- [HADOOP-8801] - ExitUtil#terminate should capture the exception stack trace
- [HADOOP-8781] - hadoop-config.sh should add JAVA_LIBRARY_PATH to LD_LIBRARY_PATH
- [HADOOP-8614] - IOUtils#skipFully hangs forever on EOF
- [HADOOP-8749] - HADOOP-8031 changed the way in which relative xincludes are handled in Configuration.
- [HADOOP-8766] - FileContextMainOperationsBaseTest should randomize the root dir
- [HADOOP-8648] - libhadoop: native CRC32 validation crashes when io.bytes.per.checksum=1
- [HADOOP-8770] - NN should not RPC to self to find trash defaults (causes deadlock)
- [HADOOP-8709] - globStatus changed behavior from 0.20/1.x
- [HADOOP-8738] - junit JAR is showing up in the distro
- [HADOOP-8632] - Configuration leaking class-loaders
- [HADOOP-8747] - Syntax error on cmake version 2.6 patch 2 in JNIFlags.cmake
- [HADOOP-8737] - cmake: always use JAVA_HOME to find libjvm.so, jni.h, jni_md.h
- [HADOOP-8703] - distcpV2: turn CRC checking off for 0 byte size
- [HADOOP-8537] - Two TFile tests failing recently
- [HADOOP-8626] - Typo in default setting for hadoop.security.group.mapping.ldap.search.filter.user
- [HADOOP-8031] - Configuration class fails to find embedded .jar resources; should use URL.openStream()
- [HADOOP-8573] - Configuration tries to read from an inputstream resource multiple times.
- [HADOOP-8633] - Interrupted FsShell copies may leave tmp files
- [HADOOP-8637] - FilterFileSystem#setWriteChecksum is broken
- [HADOOP-8550] - hadoop fs -touchz automatically created parent directories
- [HADOOP-8634] - Ensure FileSystem#close doesn't squawk for deleteOnExit paths
- [HADOOP-8606] - FileSystem.get may return the wrong filesystem
- [HADOOP-8721] - ZKFC should not retry 45 times when attempting a graceful fence during a failover
- [HADOOP-8686] - Fix warnings in native code
- [HADOOP-8642] - io.native.lib.available only controls zlib
- [HADOOP-8599] - Non empty response from FileSystem.getFileBlockLocations when asking for data beyond the end of file
- [HADOOP-7754] - Expose file descriptors from Hadoop-wrapped local FileSystems
- [HADOOP-8370] - Native build failure: javah: class file for org.apache.hadoop.classification.InterfaceAudience not found
- [HADOOP-8699] - some common testcases create core-site.xml in test-classes, causing other testcases to fail
- [HADOOP-8659] - Native libraries must build with soft-float ABI for Oracle JVM on ARM
- [HADOOP-8538] - CMake builds fail on ARM
- [HADOOP-7703] - WebAppContext should also be stopped and cleared
- [HADOOP-8551] - fs -mkdir creates parent directories without the -p option
- [HADOOP-8627] - FS deleteOnExit may delete the wrong path
- [HADOOP-8660] - TestPseudoAuthenticator failing with NPE
- [HADOOP-8480] - The native build should honor -DskipTests
- [HADOOP-8613] - AbstractDelegationTokenIdentifier#getUser() should set token auth type
- [HADOOP-8433] - Don't set HADOOP_LOG_DIR in hadoop-env.sh
- [HADOOP-8495] - Update Netty to avoid leaking file descriptors during shuffle
- [HADOOP-8110] - TestViewFsTrash occasionally fails
- [HADOOP-8499] - Lower min.user.id to 500 for the tests
- [HADOOP-8585] - Fix initialization circularity between UserGroupInformation and HadoopConfiguration
- [HADOOP-8587] - HarFileSystem access of harMetaCache isn't threadsafe
- [HADOOP-8586] - Fixup a bunch of SPNEGO misspellings
- [HADOOP-8460] - Document proper setting of HADOOP_PID_DIR and HADOOP_SECURE_DN_PID_DIR
- [HADOOP-8449] - hadoop fs -text fails with compressed sequence files with the codec file extension
- [HADOOP-8552] - Conflict: Same security.log.file for multiple users.
- [HADOOP-8444] - Fix the tests FSMainOperationsBaseTest.java and FileContextMainOperationsBaseTest.java to avoid potential test failure
- [HADOOP-8481] - update BUILDING.txt to talk about cmake rather than autotools
- [HADOOP-8566] - AvroReflectSerializer.accept(Class) throws a NPE if the class has no package (primitive types and arrays)
- [HADOOP-8547] - Package hadoop-pipes examples/bin directory (again)
- [HADOOP-8372] - normalizeHostName() in NetUtils is not working properly when resolving a hostname that starts with a numeric character
- [HADOOP-8168] - empty-string owners or groups cause MissingFormatWidthException in o.a.h.fs.shell.Ls.ProcessPath()
- [HADOOP-8507] - Avoid OOM while deserializing DelegationTokenIdentifier
- [HADOOP-8491] - EditLogFileOutputStream#preallocate should check for incomplete writes
- [HADOOP-8488] - test-patch.sh gives +1 even if the native build fails.
- [HADOOP-8512] - AuthenticatedURL should reset the Token when the server returns other than OK on authentication
- [HADOOP-8509] - JarFinder duplicate entry: META-INF/MANIFEST.MF exception
- [HADOOP-8485] - Don't hardcode "Apache Hadoop 0.23" in the docs
- [HADOOP-8450] - Remove src/test/system
- [HADOOP-8466] - hadoop-client POM incorrectly excludes avro
- [HADOOP-8452] - DN logs backtrace when running under jsvc and /jmx is loaded
- [HADOOP-8400] - All commands warn "Kerberos krb5 configuration not found" when security is not enabled
- [HADOOP-8422] - Deprecate FileSystem#getDefault* and getServerDefault methods that don't take a Path argument
- [HADOOP-8329] - Build fails with Java 7
- [HADOOP-8408] - MR doesn't work with a non-default ViewFS mount table and security enabled
- [HADOOP-8287] - etc/hadoop is missing hadoop-env.sh
- [HADOOP-8406] - CompressionCodecFactory.CODEC_PROVIDERS iteration is thread-unsafe
- [HADOOP-8405] - ZKFC tests leak ZK instances
- [HADOOP-8393] - hadoop-config.sh missing variable exports, causes Yarn jobs to fail with ClassNotFoundException MRAppMaster
- [HADOOP-8316] - Audit logging should be disabled by default
- [HADOOP-7868] - Hadoop native fails to compile when default linker option is -Wl,--as-needed
- [HADOOP-8257] - Auto-HA: TestZKFailoverControllerStress occasionally fails with Mockito error
- [HADOOP-8245] - Fix flakiness in TestZKFailoverController
- [HADOOP-8220] - ZKFailoverController doesn't handle failure to become active correctly
Improvement
- [HADOOP-8806] - libhadoop.so: dlopen should be better at locating libsnappy.so, etc.
- [HADOOP-7688] - When a servlet filter throws an exception in init(..), the Jetty server fails silently.
- [HADOOP-8278] - Make sure components declare correct set of dependencies
- [HADOOP-4572] - INode and its sub-classes should be package private
- [HADOOP-8609] - IPC server logs a useless message when shutting down socket
- [HADOOP-8623] - hadoop jar command should respect HADOOP_OPTS
- [HADOOP-8635] - Cannot cancel paths registered deleteOnExit
- [HADOOP-8525] - Provide Improved Traceability for Configuration
- [HADOOP-7664] - o.a.h.conf.Configuration complains of overriding final parameter even if the value with which it's attempting to override is the same.
- [HADOOP-8075] - Lower native-hadoop library log from info to debug
- [HADOOP-8710] - Remove ability for users to easily run the trash emptier
- [HADOOP-8689] - Make trash a server side configuration option
- [HADOOP-8524] - Allow users to get source of a Configuration parameter
- [HADOOP-8687] - Upgrade log4j to 1.2.17
- [HADOOP-8463] - hadoop.security.auth_to_local needs a key definition and doc
- [HADOOP-3450] - Add tests to Local Directory Allocator for asserting their URI-returning capability
- [HADOOP-8620] - Add -Drequire.fuse and -Drequire.snappy
- [HADOOP-8541] - Better high-percentile latency metrics
- [HADOOP-8533] - Remove Parallel Call in IPC
- [HADOOP-8535] - Cut hadoop build times in half (upgrade maven-compiler-plugin to 2.5.1)
- [HADOOP-8358] - Config-related WARN for dfs.web.ugi can be avoided.
- [HADOOP-8373] - Port RPC.getServerAddress to 0.23
- [HADOOP-8398] - Cleanup BlockLocation
- [HADOOP-8388] - Remove unused BlockLocation serialization
- [HADOOP-8368] - Use CMake rather than autotools to build native code
- [HADOOP-8244] - Improve comments on ByteBufferReadable.read
- [HADOOP-8361] - Avoid out-of-memory problems when deserializing strings
- [HADOOP-8224] - Don't hardcode hdfs.audit.logger in the scripts
- [HADOOP-8353] - hadoop-daemon.sh and yarn-daemon.sh can be misleading on stop
- [HADOOP-8276] - Auto-HA: add config for java options to pass to zkfc daemon
- [HADOOP-8279] - Auto-HA: Allow manual failover to be invoked from zkfc.
- [HADOOP-8306] - ZKFC: improve error message when ZK is not running
- [HADOOP-8247] - Auto-HA: add a config to enable auto-HA, which disables manual FC
- [HADOOP-8246] - Auto-HA: automatically scope znode by nameservice ID
- [HADOOP-8215] - Security support for ZK Failover controller
New Feature
- [HADOOP-8581] - add support for HTTPS to the web UIs
- [HADOOP-8644] - AuthenticatedURL should be able to use SSLFactory
- [HADOOP-8465] - hadoop-auth should support ephemeral authentication
- [HADOOP-8458] - Add management hook to AuthenticationHandler to enable delegation token operations support
- [HADOOP-8135] - Add ByteBufferReadable interface to FSDataInputStream
Test
- [HADOOP-8260] - Auto-HA: Replace ClientBaseWithFixes with our own modified copy of the class
- [HADOOP-8228] - Auto HA: Refactor tests and add stress tests
HDFS
Bug
- [HDFS-3972] - Trash emptier fails in secure HA cluster
- [HDFS-3931] - TestDatanodeBlockScanner#testBlockCorruptionPolicy2 is broken
- [HDFS-3932] - NameNode Web UI broken if the rpc-address is set to the wildcard
- [HDFS-3951] - datanode web ui does not work over HTTPS when datanode is started in secure mode
- [HDFS-3936] - MiniDFSCluster shutdown races with BlocksMap usage
- [HDFS-3938] - remove current limitations from HttpFS docs
- [HDFS-3924] - Multi-byte id in HdfsVolumeId
- [HDFS-3902] - TestDatanodeBlockScanner#testBlockCorruptionPolicy is broken
- [HDFS-3928] - MiniDFSCluster should reset the first ExitException on shutdown
- [HDFS-3664] - BlockManager race when stopping active services
- [HDFS-3510] - Improve FSEditLog pre-allocation
- [HDFS-2797] - Fix misuses of InputStream#skip in the edit log code
- [HDFS-2757] - Cannot read a local block that's being written to when using the local read short circuit
- [HDFS-3897] - QJM: TestBlockToken fails after HDFS-3893
- [HDFS-3895] - hadoop-client must include commons-cli
- [HDFS-3828] - Block Scanner rescans blocks too frequently
- [HDFS-3466] - The SPNEGO filter for the NameNode should come out of the web keytab file
- [HDFS-3054] - distcp -skipcrccheck has no effect
- [HDFS-1490] - TransferFSImage should timeout
- [HDFS-3879] - Fix findbugs warning in TransferFsImage on branch-2
- [HDFS-3792] - Fix two findbugs introduced by HDFS-3695
- [HDFS-3861] - Deadlock in DFSClient
- [HDFS-3816] - Invalidate work percentage default value should be 0.32f instead of 32
- [HDFS-3733] - Audit logs should include WebHDFS access
- [HDFS-3707] - TestFSInputChecker: improper use of skip
- [HDFS-3731] - 2.0 release upgrade must handle blocks being written from 1.0
- [HDFS-3864] - NN does not update internal file mtime for OP_CLOSE when reading from the edit log
- [HDFS-3849] - When re-loading the FSImage, we should clear the existing genStamp and leases.
- [HDFS-3860] - HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
- [HDFS-3856] - TestHDFSServerPorts failure is causing surefire fork failure
- [HDFS-3678] - Edit log files are never being purged from 2NN
- [HDFS-3194] - DataNode block scanner is running too frequently
- [HDFS-3715] - Fix TestFileCreation#testFileCreationNamenodeRestart
- [HDFS-3835] - Long-lived 2NN cannot perform a checkpoint if security is enabled and the NN restarts with outstanding delegation tokens
- [HDFS-3625] - Fix TestBackupNode by properly initializing edit log
- [HDFS-3614] - Revert unused MiniDFSCluster constructor from HDFS-3049
- [HDFS-3823] - QJM: TestQJMWithFaults fails occasionally because of missed setting of HTTP port
- [HDFS-3773] - TestNNWithQJM fails after HDFS-3741
- [HDFS-3794] - WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
- [HDFS-3788] - distcp can't copy large files using webhdfs due to missing Content-Length header
- [HDFS-3808] - fuse_dfs: postpone libhdfs initialization until after fork
- [HDFS-3718] - Datanode won't shutdown because of runaway DataBlockScanner thread
- [HDFS-3048] - Small race in BlockManager#close
- [HDFS-3658] - TestDFSClientRetries#testNamenodeRestart failed
- [HDFS-3790] - test_fuse_dfs.c doesn't compile on centos 5
- [HDFS-2330] - In NNStorage.java, IOExceptions of stream closures can mask root exceptions.
- [HDFS-3760] - primitiveCreate is a write, not a read
- [HDFS-3756] - DelegationTokenFetcher creates 2 HTTP connections, the second one not properly configured
- [HDFS-3724] - add InterfaceAudience annotations to HttpFS classes and making inner enum static
- [HDFS-3732] - fuse_dfs: incorrect configuration value checked for connection expiry timer period
- [HDFS-3553] - Hftp proxy tokens are broken
- [HDFS-3696] - Create files with WebHdfsFileSystem goes OOM when file size is big
- [HDFS-3710] - libhdfs misuses O_RDONLY/WRONLY/RDWR
- [HDFS-3579] - libhdfs: fix exception handling
- [HDFS-3721] - hsync support broke wire compatibility
- [HDFS-3755] - Creating an already-open-for-write file with overwrite=true fails
- [HDFS-3754] - BlockSender doesn't shutdown ReadaheadPool threads
- [HDFS-2966] - TestNameNodeMetrics tests can fail under load
- [HDFS-3738] - TestDFSClientRetries#testFailuresArePerOperation sets incorrect timeout config
- [HDFS-3626] - Creating file with invalid path can corrupt edit log
- [HDFS-3679] - fuse_dfs notrash option sets usetrash
- [HDFS-3720] - hdfs.h must get packaged
- [HDFS-3608] - fuse_dfs: detect changes in UID ticket cache
- [HDFS-3597] - SNN can fail to start on upgrade
- [HDFS-3690] - BlockPlacementPolicyDefault incorrectly casts LOG
- [HDFS-1249] - with fuse-dfs, chown which only has owner (or only group) argument fails with Input/output error.
- [HDFS-3675] - libhdfs: follow documented return codes
- [HDFS-3673] - libhdfs: fix some compiler warnings
- [HDFS-3605] - Block mistakenly marked corrupt during edit log catchup phase of failover
- [HDFS-3646] - LeaseRenewer can hold reference to inactive DFSClient instances forever
- [HDFS-3577] - WebHdfsFileSystem can not read files larger than 24KB
- [HDFS-3581] - FSPermissionChecker#checkPermission sticky bit check missing range check
- [HDFS-3575] - HttpFS does not log Exception Stacktraces
- [HDFS-3157] - Error in deleting block keeps coming from DN even after the block report and directory scanning have happened
- [HDFS-3551] - WebHDFS CREATE does not use client location for redirection
- [HDFS-766] - Error message not clear for set space quota out of boundary values.
- [HDFS-2914] - HA: Standby should not enter safemode when resources are low
- [HDFS-3266] - DFSTestUtil#waitCorruptReplicas doesn't sleep between checks
- [HDFS-3442] - Incorrect count for Missing Replicas in FSCK report
- [HDFS-3391] - TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
- [HDFS-1153] - dfsnodelist.jsp should handle invalid input parameters
- [HDFS-3398] - Client will not retry when primaryDN is down once it's just got pipeline
- [HDFS-3615] - Two BlockTokenSecretManager findbugs warnings
- [HDFS-3485] - DataTransferThrottler will over-throttle when currentTimeMillis jumps
- [HDFS-3574] - Fix small race and do some cleanup in GetImageServlet
- [HDFS-3385] - ClassCastException when trying to append a file
- [HDFS-3539] - libhdfs code cleanups
- [HDFS-3609] - libhdfs: don't force the URI to look like hdfs://hostname:port
- [HDFS-3492] - fix some misuses of InputStream#skip
- [HDFS-470] - libhdfs should handle 0-length reads from FSInputStream correctly
- [HDFS-3306] - fuse_dfs: don't lock release operations
- [HDFS-3633] - libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE
- [HDFS-3611] - NameNode prints unnecessary WARNs about edit log normally skipping a few bytes
- [HDFS-3629] - fix the typo in the error message about inconsistent storage layout version
- [HDFS-3415] - During NameNode starting up, it may pick wrong storage directory inspector when the layout versions of the storage directories are different
- [HDFS-3541] - Deadlock between recovery, xceiver and packet responder
- [HDFS-3580] - incompatible types; no instance(s) of type variable(s) V exist so that V conforms to boolean compiling HttpFSServer.java with OpenJDK
- [HDFS-3491] - HttpFs does not set permissions correctly
- [HDFS-3559] - DFSTestUtil: use Builder class to construct DFSTestUtil instances
- [HDFS-3572] - Cleanup code which inits SPNEGO in HttpServer
- [HDFS-3368] - Missing blocks due to bad DataNodes coming up and down.
- [HDFS-3428] - move DelegationTokenRenewer to common
- [HDFS-3490] - DN WebHDFS methods throw NPE if Namenode RPC address param not specified
- [HDFS-3436] - adding new datanode to existing pipeline fails in case of Append/Recovery
- [HDFS-3548] - NamenodeFsck.copyBlock fails to create a Block Reader
- [HDFS-711] - hdfsUtime does not handle atime = 0 or mtime = 0 correctly
- [HDFS-3446] - HostsFileReader silently ignores bad includes/excludes
- [HDFS-3603] - Decouple TestHDFSTrash from TestTrash
- [HDFS-3531] - EditLogFileOutputStream#preallocate should check for incomplete writes
- [HDFS-3524] - TestFileLengthOnClusterRestart failed due to error message change
- [HDFS-3517] - TestStartup should bind ephemeral ports
- [HDFS-3243] - TestParallelRead timing out on jenkins
- [HDFS-3067] - NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
- [HDFS-3235] - MiniDFSClusterManager doesn't correctly support -format option
- [HDFS-3505] - DirectoryScanner does not join all threads in shutdown
- [HDFS-3501] - Checkpointing with security enabled will stop working after ticket lifetime expires
- [HDFS-3487] - offlineimageviewer should give byte offset information when it encounters an exception
- [HDFS-3486] - offlineimageviewer can't read fsimage files that contain persistent delegation tokens
- [HDFS-3484] - hdfs fsck doesn't work if NN HTTP address is set to 0.0.0.0 even if NN RPC address is configured
- [HDFS-3460] - HttpFS proxyuser validation with Kerberos ON uses full principal name
- [HDFS-3413] - TestFailureToReadEdits timing out
- [HDFS-2982] - Startup performance suffers when there are many edit log segments
- [HDFS-3440] - should more effectively limit stream memory consumption when reading corrupt edit logs
- [HDFS-3444] - hdfs groups command doesn't work with security enabled
- [HDFS-2800] - HA: TestStandbyCheckpoints.testCheckpointCancellation is racy
- [HDFS-3434] - InvalidProtocolBufferException when visiting DN browseDirectory.jsp
- [HDFS-3433] - GetImageServlet should allow administrative requestors when security is enabled
- [HDFS-3432] - TestDFSZKFailoverController tries to fail over too early
- [HDFS-3422] - TestStandbyIsHot timeouts too aggressive
- [HDFS-2759] - Pre-allocate HDFS edit log files after writing version number
- [HDFS-3031] - HA: Error (failed to close file) when uploading large file + kill active NN + manual failover
- [HDFS-3395] - NN doesn't start with HA+security enabled and HTTP address set to 0.0.0.0
- [HDFS-3414] - Balancer does not find NameNode if rpc-address or servicerpc-address are not set in client configs
- [HDFS-3026] - HA: Handle failure during HA state transition
- [HDFS-3396] - FUSE build fails on Ubuntu 12.04
- [HDFS-3328] - NPE in DataNode.getIpcPort
- [HDFS-3261] - TestHASafeMode fails on HDFS-3042 branch
- [HDFS-3037] - TestMultipleNNDataBlockScanner#testBlockScannerAfterRestart is racy
- [HDFS-2976] - Remove unnecessary method (tokenRefetchNeeded) in DFSClient
Improvement
- [HDFS-3910] - DFSTestUtil#waitReplication should timeout
- [HDFS-3907] - Allow multiple users for local block readers
- [HDFS-3819] - Should check whether invalidate work percentage default value is not greater than 1.0f
- [HDFS-3723] - All commands should support meaningful --help
- [HDFS-3765] - Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
- [HDFS-2727] - libhdfs should get the default block size from the server
- [HDFS-3826] - QJM: Some trivial logging / exception text improvements
- [HDFS-3796] - Speed up edit log tests by avoiding fsync()
- [HDFS-3672] - Expose disk-location information for blocks to enable better scheduling
- [HDFS-3513] - HttpFS should cache filesystems
- [HDFS-3667] - Add retry support to WebHdfsFileSystem
- [HDFS-3711] - Manually convert remaining tests to JUnit4
- [HDFS-3697] - Enable fadvise readahead by default
- [HDFS-3583] - Convert remaining tests to Junit4
- [HDFS-3650] - Use MutableQuantiles to provide latency histograms for various operations
- [HDFS-3537] - Move libhdfs and fuse-dfs source to native subdirectories
- [HDFS-3418] - Rename BlockWithLocationsProto datanodeIDs field to storageIDs
- [HDFS-3641] - Move server Util time methods to common and use now instead of System#currentTimeMillis
- [HDFS-3659] - Add missing @Override to methods across the hadoop-hdfs project
- [HDFS-3610] - fuse_dfs: Provide a way to use the default (configured) NN URI
- [HDFS-3612] - Single namenode image directory config warning can be improved
- [HDFS-799] - libhdfs must call DetachCurrentThread when a thread is destroyed
- [HDFS-3568] - fuse_dfs: add support for security
- [HDFS-3613] - GSet prints some INFO level values, which aren't really very useful to all
- [HDFS-2988] - Improve error message when storage directory lock fails
- [HDFS-3604] - Add dfs.webhdfs.enabled to hdfs-default.xml
- [HDFS-3170] - Add more useful metrics for write latency
- [HDFS-3481] - Refactor HttpFS handling of JAX-RS query string parameters
- [HDFS-3504] - Configurable retry in DFSClient
- [HDFS-3520] - Add transfer rate logging to TransferFsImage
- [HDFS-1013] - Miscellaneous improvements to HTML markup for web UIs
- [HDFS-3394] - Do not use generic in INodeFile.getLastBlock()
- [HDFS-3419] - Cleanup LocatedBlock
- [HDFS-3416] - Cleanup DatanodeID and DatanodeRegistration constructors used by testing
- [HDFS-3417] - Rename BalancerDatanode#getName to getDisplayName to be consistent with Datanode
- [HDFS-3401] - Cleanup DatanodeDescriptor creation in the tests
- [HDFS-3230] - Cleanup DatanodeID creation in the tests
- [HDFS-3369] - change variable names referring to inode in blockmanagement to more appropriate names
- [HDFS-2857] - Cleanup BlockInfo class
- [HDFS-3516] - Check content-type in WebHdfsFileSystem
- [HDFS-3666] - Plumb more exception messages to terminate
- [HDFS-3663] - MiniDFSCluster should capture the code path that led to the first ExitException
- [HDFS-3582] - Hook daemon process exit for testing
- [HDFS-3343] - Improve metrics for DN read latency
- [HDFS-2391] - Newly set BalancerBandwidth value is not displayed anywhere
- [HDFS-3475] - Make the replication and invalidation rates configurable
- [HDFS-3372] - offlineEditsViewer should be able to read a binary edits file with recovery mode
- [HDFS-3514] - Add missing TestParallelLocalRead
- [HDFS-3110] - libhdfs implementation of direct read API
- [HDFS-2834] - ByteBuffer-based read API for DFSInputStream
- [HDFS-3454] - Balancer unconditionally logs InterruptedException at INFO level on shutdown if security is enabled
- [HDFS-3341] - Change minimum RPC versions to 2.0.0-SNAPSHOT instead of 2.0.0
- [HDFS-3438] - BootstrapStandby should not require a rollEdits on active node
- [HDFS-2885] - Remove "federation" from the nameservice config options
- [HDFS-3134] - Harden edit log loader against malformed or malicious input
- [HDFS-3335] - check for edit log corruption at the end of the log
- [HDFS-3404] - Make putImage in GetImageServlet infer remote address to fetch from request
- [HDFS-3400] - DNs should be able start with jsvc even if security is disabled
- [HDFS-3390] - DFSAdmin should print full stack traces of errors when DEBUG logging is enabled
- [HDFS-3223] - Auto HA: add zkfc to hadoop-daemon.sh script
New Feature
- [HDFS-3956] - QJM: purge temporary files when no longer within retention period
- [HDFS-3955] - QJM: Make acceptRecovery() atomic
- [HDFS-3950] - QJM: misc TODO cleanup, improved log messages, etc
- [HDFS-3943] - QJM: remove currently unused "md5sum" field.
- [HDFS-3926] - QJM: Add user documentation for QJM
- [HDFS-3894] - QJM: testRecoverAfterDoubleFailures can be flaky due to IPC client caching
- [HDFS-3840] - JournalNodes log JournalNotFormattedException backtrace error before being formatted
- [HDFS-3906] - QJM: quorum timeout on failover with large log segment
- [HDFS-3915] - QJM: Failover fails with auth error in secure cluster
- [HDFS-3914] - QJM: acceptRecovery should abort current segment
- [HDFS-3899] - QJM: Writer-side metrics
- [HDFS-3901] - QJM: send 'heartbeat' messages to JNs even when they are out-of-sync
- [HDFS-3900] - QJM: avoid validating log segments on log rolls
- [HDFS-3885] - QJM: optimize log sync when JN is lagging behind
- [HDFS-3898] - QJM: enable TCP_NODELAY for IPC
- [HDFS-3893] - QJM: Make QJM work with security enabled
- [HDFS-3726] - QJM: if a logger misses an RPC, don't retry that logger until next segment
- [HDFS-3891] - QJM: SBN fails if selectInputStreams throws RTE
- [HDFS-2793] - Add an admin command to trigger an edit log roll
- [HDFS-3870] - QJM: add metrics to JournalNode
- [HDFS-3884] - QJM: Journal format() should reset cached values
- [HDFS-3869] - QJM: expose non-file journal manager details in web UI
- [HDFS-3863] - QJM: track last "committed" txid
- [HDFS-3877] - QJM: Provide defaults for dfs.journalnode.*address
- [HDFS-3412] - Fix findbugs warnings in auto-HA branch
- [HDFS-3845] - Fixes for edge cases in QJM recovery protocol
- [HDFS-3839] - QJM: hadoop-daemon.sh should be updated to accept "journalnode"
- [HDFS-3571] - Allow EditLogFileInputStream to read from a remote URL
- [HDFS-3789] - JournalManager#format() should be able to throw IOException
- [HDFS-3695] - Genericize format() to non-file JournalManagers
- [HDFS-3573] - Supply NamespaceInfo when instantiating JournalManagers
- [HDFS-3800] - QJM: improvements to QJM fault testing
- [HDFS-3797] - QJM: add segment txid as a parameter to journal() RPC
- [HDFS-3799] - QJM: handle empty log segments during recovery
- [HDFS-3798] - Avoid throwing NPE when finalizeSegment() is called on invalid segment
- [HDFS-3795] - QJM: validate journal dir at startup
- [HDFS-3793] - Implement genericized format() in QJM
- [HDFS-3741] - QJM: exhaustive failure injection test for skipped RPCs
- [HDFS-3725] - Fix QJM startup when individual JNs have gaps
- [HDFS-3693] - QJM: JNStorage should read its storage info even before a writer becomes active
- [HDFS-3692] - QJM: support purgeEditLogs() call to remotely purge logs
- [HDFS-3694] - QJM: Fix getEditLogManifest to fetch httpPort if necessary
- [HDFS-3077] - Quorum-based protocol for reading and writing edit logs
- [HDFS-3049] - During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt
- [HDFS-3150] - Add option for clients to contact DNs via hostname
- [HDFS-3190] - Simple refactors in existing NN code to assist QuorumJournalManager extension
- [HDFS-3113] - httpfs does not support delegation tokens
- [HDFS-3637] - Add support for encrypting the DataTransferProtocol
- [HDFS-744] - Support hsync in HDFS
- [HDFS-3535] - Audit logging should log denied accesses
- [HDFS-3159] - Document NN auto-failover setup and configuration
- [HDFS-3200] - Auto-HA: Scope all ZKFC configs by nameservice
- [HDFS-2185] - HA: HDFS portion of ZK-based FailoverController
Task
- [HDFS-3944] - Httpfs resolveAuthority() is not resolving host correctly
Test
- [HDFS-3291] - add test that covers HttpFS working w/ a non-HDFS Hadoop filesystem
- [HDFS-3634] - Add self-contained, mavenized fuse_dfs test
- [HDFS-3709] - TestStartup tests still binding to the ephemeral port
- [HDFS-3665] - Add a test for renaming across file systems via a symlink
- [HDFS-3606] - libhdfs: create self-contained unit test
Wish
- [HDFS-860] - fuse-dfs truncate behavior causes issues with scp
MapReduce
Bug
- [MAPREDUCE-4323] - NM leaks filesystems
- [MAPREDUCE-4444] - nodemanager fails to start when one of the local-dirs is bad
- [MAPREDUCE-4457] - mr job invalid transition TA_TOO_MANY_FETCH_FAILURE at FAILED
- [MAPREDUCE-4493] - Distributed Cache Compatibility Issues
- [MAPREDUCE-4456] - LocalDistributedCacheManager can get an ArrayIndexOutOfBounds when creating symlinks
- [MAPREDUCE-4483] - 2.0 build does not work
- [MAPREDUCE-4492] - Configuring total queue capacity between 100.5 and 99.5 at particular level is successful
- [MAPREDUCE-4440] - Change SchedulerApp & SchedulerNode to be a minimal interface
- [MAPREDUCE-4387] - RM gets fatal error and exits during TestRM
- [MAPREDUCE-4402] - TestFileInputFormat fails intermittently
- [MAPREDUCE-4419] - ./mapred queue -info <queuename> -showJobs displays all the jobs irrespective of <queuename>
- [MAPREDUCE-4342] - Distributed Cache gives inconsistent result if cache files get deleted from task tracker
- [MAPREDUCE-4380] - Empty Userlogs directory is getting created under logs directory
- [MAPREDUCE-4570] - ProcfsBasedProcessTree#constructProcessInfo() prints a warning if procfsDir/<pid>/stat is not found.
- [MAPREDUCE-2374] - "Text File Busy" errors launching MR tasks
- [MAPREDUCE-4470] - Fix TestCombineFileInputFormat.testForEmptyFile
- [MAPREDUCE-4068] - Jars in lib subdirectory of the submittable JAR are not added to the classpath
- [MAPREDUCE-4577] - HDFS-3672 broke TestCombineFileInputFormat.testMissingBlocks() test
- [MAPREDUCE-4372] - Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
- [MAPREDUCE-3782] - teragen terasort jobs fail when using webhdfs://
- [MAPREDUCE-4562] - Support for "FileSystemCounter" legacy counter group name for compatibility reasons is creating incorrect counter name
- [MAPREDUCE-4053] - Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
- [MAPREDUCE-4392] - Counters.makeCompactString() changed behavior from 0.20
- [MAPREDUCE-4407] - Add hadoop-yarn-server-tests-<version>-tests.jar to hadoop dist package
- [MAPREDUCE-4406] - Users should be able to specify the MiniCluster ResourceManager and JobHistoryServer ports
- [MAPREDUCE-4449] - Incorrect MR_HISTORY_STORAGE property name in JHAdminConfig
- [MAPREDUCE-4290] - JobStatus.getState() API is giving ambiguous values
- [MAPREDUCE-4320] - gridmix mainClass wrong in pom.xml
- [MAPREDUCE-4152] - map task left hanging after AM dies trying to connect to RM
- [MAPREDUCE-4306] - Problem running Distributed Shell applications as a user other than the one that started the daemons
- [MAPREDUCE-4441] - Fix build issue caused by MR-3451
- [MAPREDUCE-3940] - ContainerTokens should have an expiry interval
- [MAPREDUCE-4494] - TestFifoScheduler failing with Metrics source QueueMetrics,q0=default already exists!
- [MAPREDUCE-4299] - Terasort hangs with MR2 FifoScheduler
- [MAPREDUCE-4447] - Remove aop cruft from the ant build
- [MAPREDUCE-4224] - TestFifoScheduler throws org.apache.hadoop.metrics2.MetricsException
- [MAPREDUCE-2220] - Fix new API FileOutputFormat-related typos in mapred-default.xml
- [MAPREDUCE-3493] - Add the default mapreduce.shuffle.port property to mapred-default.xml
- [MAPREDUCE-4379] - Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts
- [MAPREDUCE-4467] - IndexCache failures due to missing synchronization
- [MAPREDUCE-4384] - Race conditions in IndexCache
- [MAPREDUCE-4498] - Remove hsqldb jar from Hadoop runtime classpath
- [MAPREDUCE-4233] - NPE can happen in RMNMNodeInfo.
- [MAPREDUCE-3893] - allow capacity scheduler configs maximum-applications and maximum-am-resource-percent configurable on a per queue basis
- [MAPREDUCE-4448] - Nodemanager crashes upon application cleanup if aggregation failed to start
- [MAPREDUCE-4437] - Race in MR ApplicationMaster can cause reducers to never be scheduled
- [MAPREDUCE-4395] - Possible NPE at ClientDistributedCacheManager#determineTimestamps
- [MAPREDUCE-3927] - Shuffle hangs when map.failures.percent is set
- [MAPREDUCE-4270] - data_join test classes are in the wrong package
- [MAPREDUCE-4238] - mavenize data_join
- [MAPREDUCE-4416] - Some tests fail if Clover is enabled
- [MAPREDUCE-4031] - Node Manager hangs on shut down
- [MAPREDUCE-4295] - RM crashes due to DNS issue
- [MAPREDUCE-3889] - job client tries to use /tasklog interface, but that doesn't exist anymore
- [MAPREDUCE-4465] - Update description of yarn.nodemanager.address property
- [MAPREDUCE-3350] - Per-app RM page should have the list of application-attempts like on the app JHS page
- [MAPREDUCE-3873] - Nodemanager is not getting decommissioned if the absolute IP is given in exclude file.
- [MAPREDUCE-4383] - HadoopPipes.cc needs to include unistd.h
- [MAPREDUCE-4276] - Allow setting yarn.nodemanager.delete.debug-delay-sec property to "-1" for easier container debugging.
- [MAPREDUCE-4313] - TestTokenCache doesn't compile due to TokenCache.getDelegationToken compilation error
- [MAPREDUCE-4297] - Usersmap file in gridmix should not fail on empty lines
- [MAPREDUCE-4302] - NM goes down if error encountered during log aggregation
- [MAPREDUCE-3993] - Graceful handling of codec errors during decompression
- [MAPREDUCE-2739] - MR-279: Update installation docs (remove YarnClientFactory)
- [MAPREDUCE-3870] - Invalid App Metrics
- [MAPREDUCE-4002] - MultiFileWordCount job fails if the input path is not from default file system
- [MAPREDUCE-4102] - job counters not available in Jobhistory webui for killed jobs
- [MAPREDUCE-4269] - documentation: Gridmix has javadoc warnings in StressJobFactory
- [MAPREDUCE-3543] - Mavenize Gridmix.
- [MAPREDUCE-4148] - MapReduce should not have a compile-time dependency on HDFS
- [MAPREDUCE-4197] - Include the hsqldb jar in the hadoop-mapreduce tar file
- [MAPREDUCE-4267] - mavenize pipes
- [MAPREDUCE-4341] - add types to capacity scheduler properties documentation
- [MAPREDUCE-4311] - Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings
- [MAPREDUCE-4307] - TeraInputFormat calls FileSystem.getDefaultBlockSize() without a Path - Failure when using ViewFileSystem
- [MAPREDUCE-4250] - hadoop-config.sh missing variable exports, causes Yarn jobs to fail with ClassNotFoundException MRAppMaster
- [MAPREDUCE-4237] - TestNodeStatusUpdater can fail if localhost has a domain associated with it
Improvement
- [MAPREDUCE-4422] - YARN_APPLICATION_CLASSPATH needs a documented default value in YarnConfiguration
- [MAPREDUCE-4283] - Display tail of aggregated logs by default
- [MAPREDUCE-4408] - allow jobs to set a JAR that is in the distributed cache
- [MAPREDUCE-4375] - Show Configuration Traceability in MR UI
- [MAPREDUCE-4157] - ResourceManager should not kill apps that are well behaved
- [MAPREDUCE-4427] - Enable the RM to work with AMs that are not managed by it
- [MAPREDUCE-4511] - Add IFile readahead
- [MAPREDUCE-3289] - Make use of fadvise in the NM's shuffle handler
- [MAPREDUCE-4146] - Support limits on task status string length and number of block locations in branch-2
- [MAPREDUCE-3907] - Document entries in mapred-default.xml for the jobhistory server.
- [MAPREDUCE-3871] - Allow symlinking in LocalJobRunner DistributedCache
- [MAPREDUCE-3787] - [Gridmix] Improve STRESS mode
- [MAPREDUCE-3842] - stop webpages from automatic refreshing
- [MAPREDUCE-4301] - Dedupe some strings in MRAM for memory savings
New Feature
Task
- [MAPREDUCE-4253] - Tests for mapreduce-client-core are lying under mapreduce-client-jobclient