CDH 5.13.1 Release Notes
The following lists all Apache Hadoop JIRA issues included in CDH 5.13.1
that are not included in the Apache Hadoop base version 2.6.0. The
hadoop-2.6.0-cdh5.13.1.CHANGES.txt file lists all changes included in
CDH 5.13.1. The patch for each change can be found in the
cloudera/patches directory in the release tarball.
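To find the patch for a given issue key in an extracted tarball, a small search helper can be enough. The following is a minimal sketch, not a documented tool. It assumes each patch is a text file whose name or contents mention the JIRA key (for example in a commit subject line). These notes do not describe the layout or naming of files under cloudera/patches, and the function name find_patches is hypothetical.

```python
import re
from pathlib import Path


def find_patches(patch_dir, issue_key):
    """Return the names of files under patch_dir that mention issue_key.

    Assumption: patches are text files that name the JIRA key somewhere
    in the file name or body. Adjust if the real layout differs.
    """
    # Match the key as a whole token, so that searching for
    # MAPREDUCE-632 does not also match MAPREDUCE-6324.
    pattern = re.compile(
        r"(?<![A-Za-z0-9])" + re.escape(issue_key) + r"(?!\d)"
    )
    hits = []
    for path in sorted(Path(patch_dir).rglob("*")):
        if not path.is_file():
            continue
        # Decode leniently: a patch may contain non-UTF-8 bytes.
        text = path.read_text(encoding="utf-8", errors="replace")
        if pattern.search(path.name) or pattern.search(text):
            hits.append(path.name)
    return hits
```

Calling find_patches with the path to the extracted cloudera/patches directory and a key from the lists below, such as "MAPREDUCE-6815", returns the matching file names, or an empty list if the assumption about patch contents does not hold.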
Changes Not In Apache Hadoop 2.6.0
MapReduce
Bug
- [MAPREDUCE-6815] - Fix flaky TestKill.testKillTask()
- [MAPREDUCE-6165] - [JDK8] TestCombineFileInputFormat failed on JDK8
- [MAPREDUCE-6201] - TestNetworkedJob fails on trunk
- [MAPREDUCE-6825] - YARNRunner#createApplicationSubmissionContext method is longer than 150 lines
- [MAPREDUCE-6850] - Shuffle Handler keep-alive connections are closed from the server side
- [MAPREDUCE-6762] - ControlledJob#toString failed with NPE when job status is not successfully updated
- [MAPREDUCE-6852] - Job#updateStatus() failed with NPE due to race condition
- [MAPREDUCE-6595] - Fix findbugs warnings in OutputCommitter and FileOutputCommitter
- [MAPREDUCE-6555] - TestMRAppMaster fails on trunk
- [MAPREDUCE-5653] - DistCp does not honour config-overrides for mapreduce.[map,reduce].memory.mb
- [MAPREDUCE-6839] - TestRecovery.testCrashed failed
- [MAPREDUCE-5883] - "Total megabyte-seconds" in job counters is slightly misleading
- [MAPREDUCE-6639] - Process hangs in LocatedFileStatusFetcher if FileSystem.get throws
- [MAPREDUCE-5155] - Race condition in test case TestFetchFailure cause it to fail
- [MAPREDUCE-6172] - TestDbClasses timeouts are too aggressive
- [MAPREDUCE-6715] - Fix Several Unsafe Practices
- [MAPREDUCE-6817] - The format of job start time in JHS is different from those of submit and finish time
- [MAPREDUCE-6571] - JobEndNotification info logs are missing in AM container syslog
- [MAPREDUCE-6801] - Fix flaky TestKill.testKillJob()
- [MAPREDUCE-6541] - Exclude scheduled reducer memory when calculating available mapper slots from headroom to avoid deadlock
- [MAPREDUCE-6789] - Fix TestAMWebApp failure
- [MAPREDUCE-6765] - MR should not schedule container requests in cases where reducer or mapper containers demand resource larger than the maximum supported
- [MAPREDUCE-6798] - Fix intermittent failure of TestJobHistoryParsing.testJobHistoryMethods()
- [MAPREDUCE-6740] - Enforce mapreduce.task.timeout to be at least mapreduce.task.progress-report.interval
- [MAPREDUCE-6579] - JobStatus#getFailureInfo should not output diagnostic information when the job is running
- [MAPREDUCE-6750] - TestHSAdminServer.testRefreshSuperUserGroups is failing
- [MAPREDUCE-6497] - Fix wrong value of JOB_FINISHED event in JobHistoryEventHandler
- [MAPREDUCE-6771] - RMContainerAllocator sends container diagnostics event after corresponding completion event
- [MAPREDUCE-6628] - Potential memory leak in CryptoOutputStream
- [MAPREDUCE-6641] - TestTaskAttempt fails in trunk
- [MAPREDUCE-6670] - TestJobListCache#testEviction sometimes fails on Windows with timeout
- [MAPREDUCE-4784] - TestRecovery occasionally fails
- [MAPREDUCE-6763] - Shuffle server listen queue is too small
- [MAPREDUCE-6764] - Teragen LOG initialization bug
- [MAPREDUCE-6761] - Regression when handling providers - invalid configuration ServiceConfiguration causes Cluster initialization failure
- [MAPREDUCE-6359] - RM HA setup, "Cluster" tab links populated with AM hostname instead of RM
- [MAPREDUCE-6259] - IllegalArgumentException due to missing job submit time
- [MAPREDUCE-6724] - Single shuffle to memory must not exceed Integer#MAX_VALUE
- [MAPREDUCE-6442] - Stack trace is missing when error occurs in client protocol provider's constructor
- [MAPREDUCE-6242] - Progress report log is incredibly excessive in application master
- [MAPREDUCE-6680] - JHS UserLogDir scan algorithm sometime could skip directory with update in CloudFS (Azure FileSystem, S3, etc.)
- [MAPREDUCE-6577] - MR AM unable to load native library without MR_AM_ADMIN_USER_ENV set
- [MAPREDUCE-6533] - testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
- [MAPREDUCE-6374] - Distributed Cache File visibility should check permission of full path
- [MAPREDUCE-6633] - AM should retry map attempts if the reduce task encounters compression related errors.
- [MAPREDUCE-6657] - job history server can fail on startup when NameNode is in start phase
- [MAPREDUCE-6558] - multibyte delimiters with compressed input files generate duplicate records
- [MAPREDUCE-6647] - MR usage counters use the resources requested instead of the resources allocated
- [MAPREDUCE-6701] - application master log can not be available when clicking jobhistory's am logs link
- [MAPREDUCE-6698] - Increase timeout on TestUnnecessaryBlockingOnHistoryFileInfo.testTwoThreadsQueryingDifferentJobOfSameUser
- [MAPREDUCE-6689] - MapReduce job can infinitely increase number of reducer resource requests
- [MAPREDUCE-6514] - Job hangs as ask is not updated after ramping down of all reducers
- [MAPREDUCE-6684] - High contention on scanning of user directory under immediate_done in Job History Server
- [MAPREDUCE-6675] - TestJobImpl.testUnusableNode failed
- [MAPREDUCE-6677] - LocalContainerAllocator doesn't specify resource of the containers allocated.
- [MAPREDUCE-2398] - MRBench: setting the baseDir parameter has no effect
- [MAPREDUCE-6513] - MR job got hanged forever when one NM unstable for some time
- [MAPREDUCE-6485] - MR job hanged forever because all resources are taken up by reducers and the last map attempt never get resource to run
- [MAPREDUCE-6580] - Test failure : TestMRJobsWithProfiler
- [MAPREDUCE-6426] - TestShuffleHandler#testGetMapOutputInfo is failing
- [MAPREDUCE-6492] - AsyncDispatcher exit with NPE on TaskAttemptImpl#sendJHStartEventForAssignedFailTask
- [MAPREDUCE-5982] - Task attempts that fail from the ASSIGNED state can disappear
- [MAPREDUCE-6518] - Set SO_KEEPALIVE on shuffle connections
- [MAPREDUCE-6199] - AbstractCounters are not reset completely on deserialization
- [MAPREDUCE-6474] - ShuffleHandler can possibly exhaust nodemanager file descriptors
- [MAPREDUCE-6221] - Stringifier is left unclosed in Chain#getChainElementConf()
- [MAPREDUCE-6554] - MRAppMaster servicestart failing with NPE in MRAppMaster#parsePreviousJobHistory
- [MAPREDUCE-6635] - Unsafe long to int conversion in UncompressedSplitLineReader and IndexOutOfBoundsException
- [MAPREDUCE-6425] - ShuffleHandler passes wrong "base" parameter to getMapOutputInfo if mapId is not in the cache.
- [MAPREDUCE-6334] - Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler
- [MAPREDUCE-6166] - Reducers do not validate checksum of map outputs when fetching directly to disk
- [MAPREDUCE-6410] - Aggregated Logs Deletion doesn't work after refreshing Log Retention Settings in secure cluster
- [MAPREDUCE-6324] - Uber jobs fail to update AMRM token when it rolls over
- [MAPREDUCE-6136] - MRAppMaster doesn't shutdown file systems
- [MAPREDUCE-6621] - Memory Leak in JobClient#submitJobInternal()
- [MAPREDUCE-6618] - YarnClientProtocolProvider leaking the YarnClient thread.
- [MAPREDUCE-6210] - Use getApplicationAttemptId() instead of getApplicationID() for logging AttemptId in RMContainerAllocator.java
- [MAPREDUCE-6535] - TaskID default constructor results in NPE on toString()
- [MAPREDUCE-4785] - TestMRApp occasionally fails
- [MAPREDUCE-6452] - NPE when intermediate encrypt enabled for LocalRunner
- [MAPREDUCE-6433] - launchTime may be negative
- [MAPREDUCE-6045] - need close the DataInputStream after open it in TestMapReduce.java
- [MAPREDUCE-6251] - JobClient needs additional retries at a higher level to address not-immediately-consistent dfs corner cases
- [MAPREDUCE-6460] - TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
- [MAPREDUCE-6637] - Testcase Failure : TestFileInputFormat.testSplitLocationInfo
- [MAPREDUCE-6620] - Jobs that did not start are shown as starting in 1969 in the JHS web UI
- [MAPREDUCE-6528] - Memory leak for HistoryFileManager.getJobSummary()
- [MAPREDUCE-6252] - JobHistoryServer should not fail when encountering a missing directory
- [MAPREDUCE-6589] - TestTaskLog outputs a log under directory other than target/test-dir
- [MAPREDUCE-6357] - MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute
- [MAPREDUCE-6549] - multibyte delimiters with LineRecordReader cause duplicate records
- [MAPREDUCE-6550] - archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
- [MAPREDUCE-6495] - Docs for archive-logs tool
- [MAPREDUCE-6233] - org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk
- [MAPREDUCE-6302] - Preempt reducers after a configurable timeout irrespective of headroom
- [MAPREDUCE-6503] - archive-logs tool should use HADOOP_PREFIX instead of HADOOP_HOME
- [MAPREDUCE-6494] - Permission issue when running archive-logs tool as different users
- [MAPREDUCE-5649] - Reduce cannot use more than 2G memory for the final merge
- [MAPREDUCE-6361] - NPE issue in shuffle caused by concurrent issue between copySucceeded() in one thread and copyFailed() in another thread on the same host
- [MAPREDUCE-6303] - Read timeout when retrying a fetch error can be fatal to a reducer
- [MAPREDUCE-6480] - archive-logs tool may miss applications
- [MAPREDUCE-6273] - HistoryFileManager should check whether summaryFile exists to avoid FileNotFoundException causing HistoryFileInfo into MOVE_FAILED state
- [MAPREDUCE-6484] - Yarn Client uses local address instead of RM address as token renewer in a secure cluster when RM HA is enabled.
- [MAPREDUCE-5918] - LineRecordReader can return the same decompressor to CodecPool multiple times
- [MAPREDUCE-6481] - LineRecordReader may give incomplete record and wrong position/key information for uncompressed input sometimes.
- [MAPREDUCE-5948] - org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well
- [MAPREDUCE-6238] - MR2 can't run local jobs with -libjars command options which is a regression from MR1
- [MAPREDUCE-6413] - TestLocalJobSubmission is failing with unknown host
- [MAPREDUCE-6439] - AM may fail instead of retrying if RM shuts down during the allocate call
- [MAPREDUCE-6277] - Job can post multiple history files if attempt loses connection to the RM
- [MAPREDUCE-5817] - Mappers get rescheduled on node transition even after all reducers are completed
- [MAPREDUCE-6394] - Speed up Task processing loop in HsTasksBlock#render()
- [MAPREDUCE-6353] - Divide by zero error in MR AM when calculating available containers
- [MAPREDUCE-6376] - Add avro binary support for jhist files
- [MAPREDUCE-6121] - JobResourceUpdater#compareFs() doesn't handle HA namespaces
- [MAPREDUCE-5965] - Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"
- [MAPREDUCE-6387] - Serialize the recently added Task#encryptedSpillKey field at the end
- [MAPREDUCE-6339] - Job history file is not flushed correctly because isTimerActive flag is not set true when flushTimerTask is scheduled.
- [MAPREDUCE-6343] - JobConf.parseMaximumHeapSizeMB() fails to parse value greater than 2GB expressed in bytes
- [MAPREDUCE-6300] - Task list sort by task id broken
- [MAPREDUCE-3859] - CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
- [MAPREDUCE-6266] - Job#getTrackingURL should consistently return a proper URL
- [MAPREDUCE-6076] - Zero map split input length combine with none zero map split input length will cause MR1 job hung.
- [MAPREDUCE-5875] - Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
- [MAPREDUCE-6275] - Race condition in FileOutputCommitter v2 for user-specified task output subdirs
- [MAPREDUCE-6230] - MR AM does not survive RM restart if RM activated a new AMRM secret key
- [MAPREDUCE-6223] - TestJobConf#testNegativeValueForTaskVmem failures
- [MAPREDUCE-5656] - bzip2 codec can drop records when reading data in splits
- [MAPREDUCE-6063] - In sortAndSpill of MapTask.java, size is calculated wrongly when bufend < bufstart.
- [MAPREDUCE-6196] - Fix BigDecimal ArithmeticException in PiEstimator
- [MAPREDUCE-6162] - mapred hsadmin fails on a secure cluster
- [MAPREDUCE-6147] - Support mapreduce.input.fileinputformat.split.maxsize
- [MAPREDUCE-5968] - Work directory is not deleted when downloadCacheObject throws IOException
- [MAPREDUCE-6198] - NPE from JobTracker#resolveAndAddToTopology in MR1 cause initJob and heartbeat failure.
- [MAPREDUCE-4490] - JVM reuse is incompatible with LinuxTaskController (and therefore incompatible with Security)
- [MAPREDUCE-6170] - TestUlimit failure on JDK8
- [MAPREDUCE-5966] - MR1 FairScheduler use of custom weight adjuster is not thread safe for comparisons
- [MAPREDUCE-5979] - FairScheduler: zero weight can cause sort failures
- [MAPREDUCE-5375] - Delegation Token renewal exception in jobtracker logs
- [MAPREDUCE-5707] - JobClient does not allow setting RPC timeout for communications with JT/RM
- [MAPREDUCE-5862] - Line records longer than 2x split size aren't handled correctly
- [MAPREDUCE-2779] - JobSplitWriter.java can't handle large job.split file
- [MAPREDUCE-2324] - Job should fail if a reduce task can't be scheduled anywhere
- [MAPREDUCE-5702] - TaskLogServlet#printTaskLog has spurious HTML closing tags
- [MAPREDUCE-4383] - HadoopPipes.cc needs to include unistd.h
- [MAPREDUCE-5206] - JT can show the same job multiple times in Retired Jobs section
- [MAPREDUCE-5508] - JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
- [MAPREDUCE-5066] - JobTracker should set a timeout when calling into job.end.notification.url
- [MAPREDUCE-5198] - Race condition in cleanup during task tracker reinit with LinuxTaskController
- [MAPREDUCE-4914] - TestMiniMRDFSSort fails with openJDK7
- [MAPREDUCE-5450] - Unnecessary Configuration instantiation in IFileInputStream slows down merge - Port to branch-1
- [MAPREDUCE-4366] - mapred metrics shows negative count of waiting maps and reduces
- [MAPREDUCE-5351] - JobTracker memory leak caused by CleanupQueue reopening FileSystem
- [MAPREDUCE-5364] - Deadlock between RenewalTimerTask methods cancel() and run()
- [MAPREDUCE-5038] - old API CombineFileInputFormat missing fixes that are in new API
- [MAPREDUCE-2359] - Distributed cache doesn't use non-default FileSystems correctly
- [MAPREDUCE-5250] - Searching for ';' in JobTracker History throws ArrayOutOfBoundException
- [MAPREDUCE-4970] - Child tasks (try to) create security audit log files
- [MAPREDUCE-323] - Improve the way job history files are managed
- [MAPREDUCE-4576] - Large dist cache can block tasktracker heartbeat
- [MAPREDUCE-3824] - Distributed caches are not removed properly
- [MAPREDUCE-2409] - Distributed Cache does not differentiate between file/archive for files with the same path
- [MAPREDUCE-5218] - Annotate (comment) internal classes as Private
- [MAPREDUCE-5193] - A few MR tests use block sizes which are smaller than the default minimum block size
- [MAPREDUCE-5154] - staging directory deletion fails because delegation tokens have been cancelled
- [MAPREDUCE-5133] - TestSubmitJob.testSecureJobExecution is flaky due to job dir deletion race
- [MAPREDUCE-2817] - MiniRMCluster hardcodes 'mapred.local.dir' configuration to 'build/test/mapred/local'
- [MAPREDUCE-5008] - Merger progress miscounts with respect to EOF_MARKER
- [MAPREDUCE-5028] - Maps fail when io.sort.mb is set to high value
- [MAPREDUCE-5073] - TestJobStatusPersistency.testPersistency fails on JDK7
- [MAPREDUCE-5072] - TestDelegationTokenRenewal.testDTRenewal fails in MR1 on jdk7
- [MAPREDUCE-5070] - TestClusterStatus.testClusterMetrics fails on JDK7
- [MAPREDUCE-2187] - map tasks timeout during sorting
- [MAPREDUCE-5047] - keep.failed.task.files=true causes job failure on secure clusters
- [MAPREDUCE-4843] - When using DefaultTaskController, JobLocalizer not thread safe
- [MAPREDUCE-4643] - Make job-history cleanup-period configurable
- [MAPREDUCE-4888] - NLineInputFormat drops data in 1.1 and beyond
- [MAPREDUCE-4962] - jobdetails.jsp uses display name instead of real name to get counters
- [MAPREDUCE-4963] - StatisticsCollector improperly keeps track of "Last Day" and "Last Hour" statistics for new TaskTrackers
- [MAPREDUCE-2264] - Job status exceeds 100% in some cases
- [MAPREDUCE-4929] - mapreduce.task.timeout is ignored
- [MAPREDUCE-4562] - Support for "FileSystemCounter" legacy counter group name for compatibility reasons is creating incorrect counter name
- [MAPREDUCE-4923] - Add toString method to TaggedInputSplit
- [MAPREDUCE-4315] - jobhistory.jsp throws 500 when a .txt file is found in /done
- [MAPREDUCE-4925] - The pentomino option parser may be buggy
- [MAPREDUCE-4924] - flakey test: org.apache.hadoop.mapred.TestClusterMRNotification.testMR
- [MAPREDUCE-3475] - JT can't renew its own tokens
Improvement
- [MAPREDUCE-6829] - Add peak memory usage counter for each task
- [MAPREDUCE-6478] - Add an option to skip cleanupJob stage or ignore cleanup failure during commitJob().
- [MAPREDUCE-5485] - Allow repeating job commit by extending OutputCommitter API
- [MAPREDUCE-5981] - Log levels of certain MR logs can be changed to DEBUG
- [MAPREDUCE-5335] - Rename Job Tracker terminology in ShuffleSchedulerImpl
- [MAPREDUCE-6728] - Give fetchers hint when ShuffleHandler rejects a shuffling connection
- [MAPREDUCE-6776] - yarn.app.mapreduce.client.job.max-retries should have a more useful default
- [MAPREDUCE-6718] - add progress log to JHS during startup
- [MAPREDUCE-6473] - Job submission can take a long time during Cluster initialization
- [MAPREDUCE-6751] - Add debug log message when splitting is not possible due to unsplittable compression
- [MAPREDUCE-6741] - add MR support to redact job conf properties
- [MAPREDUCE-6652] - Add configuration property to prevent JHS from loading jobs with a task count greater than X
- [MAPREDUCE-6719] - The list of -libjars archives should be replaced with a wildcard in the distributed cache to reduce the application footprint in the state store
- [MAPREDUCE-6686] - Add a way to download the job config from the mapred CLI
- [MAPREDUCE-6297] - Task Id of the failed task in diagnostics should link to the task page
- [MAPREDUCE-5583] - Ability to limit running map and reduce tasks
- [MAPREDUCE-6384] - Add the last reporting reducer info for too many fetch failure diagnostics
- [MAPREDUCE-5932] - Provide an option to use a dedicated reduce-side shuffle log
- [MAPREDUCE-6100] - replace "mapreduce.job.credentials.binary" with MRJobConfig.MAPREDUCE_JOB_CREDENTIALS_BINARY for better readability.
- [MAPREDUCE-6443] - Add JvmPauseMonitor to Job History Server
- [MAPREDUCE-6622] - Add capability to set JHS job cache to a task-based limit
- [MAPREDUCE-6640] - mapred job -history command should be able to take Job ID
- [MAPREDUCE-6566] - Add retry support to mapreduce CLI tool
- [MAPREDUCE-6627] - Add machine-readable output to mapred job -history command
- [MAPREDUCE-6431] - JobClient should be an AutoClosable
- [MAPREDUCE-6059] - Speed up history server startup time
- [MAPREDUCE-6436] - JobHistory cache issue
- [MAPREDUCE-6265] - Make ContainerLauncherImpl.INITIAL_POOL_SIZE configurable to better control to launch/kill containers
- [MAPREDUCE-4653] - TestRandomAlgorithm has an unused "import" statement
- [MAPREDUCE-6267] - Refactor JobSubmitter#copyAndConfigureFiles into its own class
- [MAPREDUCE-5465] - Tasks are often killed before they exit on their own
- [MAPREDUCE-6057] - Remove obsolete entries from mapred-default.xml
- [MAPREDUCE-1305] - Running distcp with -delete incurs avoidable penalties
- [MAPREDUCE-4815] - Speed up FileOutputCommitter#commitJob for many output files
- [MAPREDUCE-4736] - Remove obsolete option [-rootDir] from TestDFSIO
- [MAPREDUCE-6194] - Bubble up final exception in failures during creation of output collectors
- [MAPREDUCE-6169] - MergeQueue should release reference to the current item from key and value at the end of the iteration to save memory.
- [MAPREDUCE-6143] - add configuration for mapreduce speculative execution in MR2
- [MAPREDUCE-6077] - Remove CustomModule examples in nativetask
- [MAPREDUCE-6074] - native-task: fix release audit, javadoc, javac warnings
- [MAPREDUCE-5974] - Allow specifying multiple MapOutputCollectors with fallback
- [MAPREDUCE-6069] - native-task: Style fixups and dead code removal
- [MAPREDUCE-6067] - native-task: fix some counter issues
- [MAPREDUCE-6055] - native-task: findbugs, interface annotations, and other misc cleanup
- [MAPREDUCE-6056] - nativetask: move system test working dir to target dir and cleanup test config xml files
- [MAPREDUCE-6058] - native-task: KVTest and LargeKVTest should check mr job is successful
- [MAPREDUCE-6054] - native-task: speed up test runs
- [MAPREDUCE-5977] - Fix or suppress native-task gcc warnings
- [MAPREDUCE-6025] - native-task: fix native library distribution
- [MAPREDUCE-6035] - native-task: sources/test-sources jar distribution
- [MAPREDUCE-6026] - native-task: fix logging
- [MAPREDUCE-6006] - native-task: add native tests to maven and fix bug in pom.xml
- [MAPREDUCE-5978] - native-task CompressTest failure on Ubuntu
- [MAPREDUCE-5976] - native-task should not fail to build if snappy is missing
- [MAPREDUCE-5984] - native-task: reuse lz4 sources in hadoop-common
- [MAPREDUCE-6005] - native-task: fix some valgrind errors
- [MAPREDUCE-5995] - native-task: revert changes which expose Text internals
- [MAPREDUCE-5991] - native-task should not run unit tests if native profile is not enabled
- [MAPREDUCE-6000] - native-task: Simplify ByteBufferDataReader/Writer
- [MAPREDUCE-5997] - native-task: Use DirectBufferPool from Hadoop Common
- [MAPREDUCE-5996] - native-task: Rename system tests into standard directory layout
- [MAPREDUCE-5994] - native-task: TestBytesUtil fails
- [MAPREDUCE-5985] - native-task: Fix build on macosx
- [MAPREDUCE-2841] - Task level native optimization
- [MAPREDUCE-6088] - TestTokenCache tests should use their own JobConf instances
- [MAPREDUCE-5777] - Support utf-8 text with BOM (byte order marker)
- [MAPREDUCE-5651] - Backport Fair Scheduler queue placement policies to branch-1
- [MAPREDUCE-3310] - Custom grouping comparator cannot be set for Combiners
- [MAPREDUCE-3169] - Create a new MiniMRCluster equivalent which only provides client APIs cross MR1 and MR2
- [MAPREDUCE-5609] - Add debug log message when sending job end notification
- [MAPREDUCE-5367] - Local jobs all use same local working directory
- [MAPREDUCE-5379] - Include token tracking ids in jobconf
- [MAPREDUCE-2494] - Make the distributed cache delete entries using LRU priority
- [MAPREDUCE-2495] - The distributed cache cleanup thread has no monitoring to check to see if it has died for some reason
- [MAPREDUCE-1568] - TrackerDistributedCacheManager should clean up cache in a background thread
- [MAPREDUCE-4777] - In TestIFile, testIFileReaderWithCodec relies on testIFileWriterWithCodec
- [MAPREDUCE-4977] - Documentation for pluggable shuffle and pluggable sort
- [MAPREDUCE-2931] - CLONE - LocalJobRunner should support parallel mapper execution
- [MAPREDUCE-2492] - [MAPREDUCE] The new MapReduce API should make available task's progress to the task
New Feature
- [MAPREDUCE-6871] - Allow users to specify racks and nodes for strict locality for AMs
- [MAPREDUCE-6415] - Create a tool to combine aggregated logs into HAR files
- [MAPREDUCE-6304] - Specifying node labels when submitting MR jobs
- [MAPREDUCE-5785] - Derive heap size or mapreduce.*.memory.mb automatically
- [MAPREDUCE-4049] - plugin for generic shuffle service
- [MAPREDUCE-4808] - Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations
- [MAPREDUCE-4807] - Allow MapOutputBuffer to be pluggable
Task
- [MAPREDUCE-6264] - Remove httpclient dependency from hadoop-mapreduce-client
- [MAPREDUCE-6388] - Remove deprecation warnings from JobHistoryServer classes
- [MAPREDUCE-4246] - Failure in deleting user directories in Secure hadoop
Test
- [MAPREDUCE-6831] - Flaky test TestJobImpl.testKilledDuringKillAbort
- [MAPREDUCE-6738] - TestJobListCache.testAddExisting failed intermittently in slow VM testbed
- [MAPREDUCE-6191] - TestJavaSerialization fails with getting incorrect MR job result
Common
Bug
- [HADOOP-14949] - TestKMS#testACLs fails intermittently
- [HADOOP-12622] - RetryPolicies (other than FailoverOnNetworkExceptionRetry) should put on retry failed reason or the log from RMProxy's retry could be very misleading.
- [HADOOP-14824] - Update ADLS SDK to 2.2.2 for MSI fix
- [HADOOP-14760] - Add missing override to LoadBalancingKMSClientProvider
- [HADOOP-13868] - New defaults for S3A multi-part configuration
- [HADOOP-14586] - StringIndexOutOfBoundsException breaks org.apache.hadoop.util.Shell on 2.7.x with Java 9
- [HADOOP-13588] - ConfServlet should respect Accept request header
- [HADOOP-13867] - FilterFileSystem should override rename(.., options) to take effect of Rename options called via FilterFileSystem implementations
- [HADOOP-9137] - Support connection limiting in IPC server
- [HADOOP-14646] - FileContextMainOperationsBaseTest#testListStatusFilterWithSomeMatches never runs
- [HADOOP-14533] - Size of args cannot be less than zero in TraceAdmin#run as its linkedlist
- [HADOOP-14540] - Replace MRv1 specific terms in HostsFileReader
- [HADOOP-14376] - Memory leak when reading a compressed file using the native library
- [HADOOP-13414] - Hide Jetty Server version header in HTTP responses
- [HADOOP-14024] - KMS JMX endpoint throws ClassNotFoundException
- [HADOOP-14464] - hadoop-aws doc header warning #5 line wrapped
- [HADOOP-14563] - LoadBalancingKMSClientProvider#warmUpEncryptedKeys swallows IOException
- [HADOOP-14511] - WritableRpcEngine.Invocation#toString NPE on null parameters
- [HADOOP-12906] - AuthenticatedURL should convert a 404/Not Found into a FileNotFoundException.
- [HADOOP-13815] - TestKMS#testDelegationTokensOpsSimple and TestKMS#testDelegationTokensOpsKerberized Fails in Trunk
- [HADOOP-13813] - TestDelegationTokenFetcher#testDelegationTokenWithoutRenewer is failing
- [HADOOP-11385] - Prevent cross site scripting attack on JMXJSONServlet
- [HADOOP-12054] - RPC client should not retry for InvalidToken exceptions
- [HADOOP-13552] - RetryInvocationHandler logs all remote exceptions
- [HADOOP-11628] - SPNEGO auth does not work with CNAMEs in JDK8
- [HADOOP-12751] - While using kerberos Hadoop incorrectly assumes names with '@' to be non-simple
- [HADOOP-12469] - distcp should not ignore the ignoreFailures option
- [HADOOP-14100] - Upgrade Jsch jar to latest version to fix vulnerability in old versions
- [HADOOP-13138] - Unable to append to a SequenceFile with Compression.NONE.
- [HADOOP-13700] - Remove unthrown IOException from TrashPolicy#initialize and #getInstance signatures
- [HADOOP-13441] - Document LdapGroupsMapping keystore password properties
- [HADOOP-11962] - Sasl message with MD5 challenge text shouldn't be LOG out even in debug level.
- [HADOOP-14268] - Fix markdown itemization in hadoop-aws documents
- [HADOOP-14369] - NetworkTopology calls expensive toString() when logging
- [HADOOP-14256] - [S3A DOC] Correct the format for "Seoul" example
- [HADOOP-14204] - S3A multipart commit failing, "UnsupportedOperationException at java.util.Collections$UnmodifiableList.sort"
- [HADOOP-14059] - typo in s3a rename(self, subdir) error message
- [HADOOP-14170] - FileSystemContractBaseTest is not cleaning up test directory clearly
- [HADOOP-14087] - S3A typo in pom.xml test exclusions
- [HADOOP-14028] - S3A BlockOutputStreams doesn't delete temporary files in multipart uploads or handle part upload failures
- [HADOOP-14092] - Typo in hadoop-aws index.md
- [HADOOP-14058] - Fix NativeS3FileSystemContractBaseTest#testDirWithDifferentMarkersWorks
- [HADOOP-13626] - Remove distcp dependency on FileStatus serialization
- [HADOOP-13163] - Reuse pre-computed filestatus in Distcp-CopyMapper
- [HADOOP-12611] - TestZKSignerSecretProvider#testMultipleInit occasionally fail
- [HADOOP-12181] - Fix intermittent test failure of TestZKSignerSecretProvider
- [HADOOP-11400] - GraphiteSink does not reconnect to Graphite after 'broken pipe'
- [HADOOP-14214] - DomainSocketWatcher::add()/delete() should not self interrupt while looping await()
- [HADOOP-14195] - CredentialProviderFactory$getProviders is not thread-safe
- [HADOOP-11878] - FileContext.java # fixRelativePart should check for not null for a more informative exception
- [HADOOP-14131] - kms.sh creates bogus dir for tomcat logs
- [HADOOP-14114] - S3A can no longer handle unencoded + in URIs
- [HADOOP-13826] - S3A Deadlock in multipart copy due to thread pool limits.
- [HADOOP-13903] - Improvements to KMS logging to help debug authorization errors
- [HADOOP-14017] - User friendly name for ADLS user and group
- [HADOOP-13804] - MutableStat mean loses accuracy if add(long, long) is used
- [HADOOP-14029] - Fix KMSClientProvider for non-secure proxyuser use case
- [HADOOP-13988] - KMSClientProvider does not work with WebHDFS and Apache Knox w/ProxyUser
- [HADOOP-13749] - KMSClientProvider combined with KeyProviderCache can result in wrong UGI being used
- [HADOOP-13805] - UGI.getCurrentUser() fails if user does not have a keytab associated
- [HADOOP-13976] - Path globbing does not match newlines
- [HADOOP-13958] - Bump up release year to 2017
- [HADOOP-11859] - PseudoAuthenticationHandler fails with httpcomponents v4.4
- [HADOOP-13119] - Web UI error accessing links which need authorization when Kerberos
- [HADOOP-14001] - Improve delegation token validity checking
- [HADOOP-13929] - ADLS connector should not check in contract-test-options.xml
- [HADOOP-14044] - Synchronization issue in delegation token cancel functionality
- [HADOOP-13433] - Race in UGI.reloginFromKeytab
- [HADOOP-13830] - Intermittent failure of ITestS3NContractRootDir#testRecursiveRootListing: "Can not create a Path from an empty string"
- [HADOOP-12655] - TestHttpServer.testBindAddress bind port range is wider than expected
- [HADOOP-13928] - TestAdlFileContextMainOperationsLive.testGetFileContext1 runtime error
- [HADOOP-11619] - FTPFileSystem should override getDefaultPort
- [HADOOP-12667] - s3a: Support createNonRecursive API
- [HADOOP-13164] - Optimize S3AFileSystem::deleteUnnecessaryFakeDirectories
- [HADOOP-13508] - FsPermission string constructor does not recognize sticky bit
- [HADOOP-13375] - o.a.h.security.TestGroupsCaching.testBackgroundRefreshCounters seems flaky
- [HADOOP-13864] - KMS should not require truststore password
- [HADOOP-10823] - TestReloadingX509TrustManager is flaky
- [HADOOP-13847] - KMSWebApp should close KeyProviderCryptoExtension
- [HADOOP-13838] - KMSTokenRenewer should close providers
- [HADOOP-13512] - ReloadingX509TrustManager should keep reloading in case of exception
- [HADOOP-13601] - Fix typo in a log messages of AbstractDelegationTokenSecretManager
- [HADOOP-12597] - In kms-site.xml configuration "hadoop.security.keystore.JavaKeyStoreProvider.password" should be updated with new name
- [HADOOP-13201] - Print the directory paths when ViewFs denies the rename operation on internal dirs
- [HADOOP-13072] - WindowsGetSpaceUsed constructor should be public
- [HADOOP-13362] - DefaultMetricsSystem leaks the source name when a source unregisters
- [HADOOP-12483] - Maintain wrapped SASL ordering for postponed IPC responses
- [HADOOP-11780] - Prevent IPC reader thread death
- [HADOOP-13056] - Print expected values when rejecting a server's determined principal
- [HADOOP-13406] - S3AFileSystem: Consider reusing filestatus in delete() and mkdirs()
- [HADOOP-13389] - TestS3ATemporaryCredentials.testSTS error when using IAM credentials
- [HADOOP-13387] - users always get told off for using S3 even when not using it.
- [HADOOP-13287] - TestS3ACredentials#testInstantiateFromURL fails if AWS secret key contains '+'.
- [HADOOP-13638] - KMS should set UGI's Configuration object properly
- [HADOOP-3733] - "s3:" URLs break when Secret Key contains a slash, even if encoded
- [HADOOP-13558] - UserGroupInformation created from a Subject incorrectly tries to renew the Kerberos ticket
- [HADOOP-13183] - S3A proxy tests fail after httpclient/httpcore upgrade.
- [HADOOP-13353] - LdapGroupsMapping getPassword shouldn't return null when IOException throws
- [HADOOP-11252] - RPC client does not time out by default
- [HADOOP-10062] - race condition in MetricsSystemImpl#publishMetricsNow that causes incorrect results
- [HADOOP-13116] - Jets3tNativeS3FileSystemContractTest does not run.
- [HADOOP-12801] - Suppress obsolete S3FileSystem tests.
- [HADOOP-12851] - S3AFileSystem Uptake of ProviderUtils.excludeIncompatibleCredentialProviders
- [HADOOP-12846] - Credential Provider Recursive Dependencies
- [HADOOP-11922] - Misspelling of threshold in log4j.properties for tests in hadoop-tools
- [HADOOP-11720] - [JDK8] Fix javadoc errors caused by incorrect or illegal tags in hadoop-tools
- [HADOOP-11412] - POMs mention "The Apache Software License" rather than "Apache License"
- [HADOOP-12609] - Fix intermittent failure of TestDecayRpcScheduler
- [HADOOP-13526] - Add detailed logging in KMS for the authentication failure of proxy user
- [HADOOP-13487] - Hadoop KMS should load old delegation tokens from Zookeeper on startup
- [HADOOP-12001] - Limiting LDAP search conflicts with posixGroup addition
- [HADOOP-13476] - CredentialProviderFactory fails at class loading from libhdfs (JNI)
- [HADOOP-10748] - HttpServer2 should not load JspServlet
- [HADOOP-13202] - Avoid possible overflow in org.apache.hadoop.util.bloom.BloomFilter#getNBytes
- [HADOOP-13494] - ReconfigurableBase can log sensitive information
- [HADOOP-13299] - JMXJsonServlet is vulnerable to TRACE
- [HADOOP-13254] - Create framework for configurable disk checkers
- [HADOOP-13437] - KMS should reload whitelist and default key ACLs when hot-reloading
- [HADOOP-11469] - KMS should skip default.key.acl and whitelist.key.acl when loading key acl
- [HADOOP-13297] - Add missing dependency in setting maven-remote-resource-plugin to fix builds
- [HADOOP-13350] - Additional fix to LICENSE and NOTICE
- [HADOOP-12893] - Verify LICENSE.txt and NOTICE.txt
- [HADOOP-13461] - NPE in KeyProvider.rollNewVersion
- [HADOOP-13192] - org.apache.hadoop.util.LineReader cannot handle multibyte delimiters correctly
- [HADOOP-12991] - Conflicting default ports in DelegateToFileSystem
- [HADOOP-12636] - Prevent ServiceLoader failure init for unused FileSystems
- [HADOOP-12613] - TestFind.processArguments occasionally fails
- [HADOOP-13457] - Remove hardcoded absolute path for shell executable
- [HADOOP-13434] - Add quoting to Shell class
- [HADOOP-12465] - Incorrect javadoc in WritableUtils.java
- [HADOOP-13189] - FairCallQueue makes callQueue larger than the configured capacity.
- [HADOOP-12589] - Fix intermittent test failure of TestCopyPreserveFlag
- [HADOOP-11872] - "hadoop dfs" command prints message about using "yarn jar" on Windows(branch-2 only)
- [HADOOP-11149] - Increase the timeout of TestZKFailoverController
- [HADOOP-13381] - KMS clients should use KMS Delegation Tokens from current UGI.
- [HADOOP-13443] - KMS should check the type of underlying keyprovider of KeyProviderExtension before falling back to default
- [HADOOP-12252] - LocalDirAllocator should not throw NPE with empty string configuration.
- [HADOOP-8437] - getLocalPathForWrite should throw IOException for invalid paths
- [HADOOP-8436] - NPE In getLocalPathForWrite ( path, conf ) when the required context item is not configured
- [HADOOP-13351] - TestDFSClientSocketSize buffer size tests are flaky
- [HADOOP-11361] - Fix a race condition in MetricsSourceAdapter.updateJmxCache
- [HADOOP-13316] - Enforce Kerberos authentication for required ops in DelegationTokenAuthenticator
- [HADOOP-13251] - Authenticate with Kerberos credentials when renewing KMS delegation token
- [HADOOP-13155] - Implement TokenRenewer to renew and cancel delegation tokens in KMS
- [HADOOP-13228] - Add delegation token to the connection in DelegationTokenAuthenticator
- [HADOOP-13079] - Add -q option to Ls to print ? instead of non-printable characters
- [HADOOP-13270] - BZip2CompressionInputStream finds the same compression marker twice in corner case, causing duplicate data blocks
- [HADOOP-13255] - KMSClientProvider should check and renew tgt when doing delegation token operations.
- [HADOOP-13132] - Handle ClassCastException on AuthenticationException in LoadBalancingKMSClientProvider
- [HADOOP-12787] - KMS SPNEGO sequence does not work with WEBHDFS
- [HADOOP-12659] - Incorrect usage of config parameters in token manager of KMS
- [HADOOP-11180] - Change log message "token.Token: Cannot find class for token kind kms-dt" to debug
- [HADOOP-13157] - Follow-on improvements to hadoop credential commands
- [HADOOP-12942] - hadoop credential commands non-obviously use password of "none"
- [HADOOP-13098] - Dynamic LogLevel setting page should accept case-insensitive log level string
- [HADOOP-13043] - Add LICENSE.txt entries for bundled javascript dependencies
- [HADOOP-13042] - Restore lost leveldbjni LICENSE and NOTICE changes
- [HADOOP-12761] - incremental maven build is not really incremental
- [HADOOP-12773] - HBase classes fail to load with client/job classloader enabled
- [HADOOP-12958] - PhantomReference for filesystem statistics can trigger OOM
- [HADOOP-8818] - Should use equals() rather than == to compare String or Text in MD5MD5CRC32FileChecksum and TFileDumper
- [HADOOP-13052] - ChecksumFileSystem mishandles crc file permissions
- [HADOOP-12406] - AbstractMapWritable.readFields throws ClassNotFoundException with custom writables
- [HADOOP-13030] - Handle special characters in passwords in KMS startup script
- [HADOOP-11409] - FileContext.getFileContext can stack overflow if default fs misconfigured
- [HADOOP-8751] - NPE in Token.toString() when Token is constructed using null identifier
- [HADOOP-12964] - Http server vulnerable to clickjacking
- [HADOOP-12954] - Add a way to change hadoop.security.token.service.use_ip
- [HADOOP-11467] - KerberosAuthenticator can connect to a non-secure cluster
- [HADOOP-12972] - Lz4Compressor#getLibraryName returns the wrong version number
- [HADOOP-7817] - RawLocalFileSystem.append() should give FSDataOutputStream with accurate .getPos()
- [HADOOP-11321] - copyToLocal cannot save a file to an SMB share unless the user has Full Control permissions.
- [HADOOP-12962] - KMS key names are incorrectly encoded when creating key
- [HADOOP-10945] - 4-digit octal umask permissions throws a parse error
- [HADOOP-12771] - Fix typo in JvmPauseMonitor#getNumGcWarnThreadholdExceeded
- [HADOOP-12313] - NPE in JvmPauseMonitor when calling stop() before start()
- [HADOOP-10015] - UserGroupInformation prints out excessive ERROR warnings
- [HADOOP-12749] - Create a threadpoolexecutor that overrides afterExecute to log uncaught exceptions/errors
- [HADOOP-12810] - FileSystem#listLocatedStatus causes unnecessary RPC calls
- [HADOOP-12706] - TestLocalFsFCStatistics#testStatisticsThreadLocalDataCleanUp times out occasionally
- [HADOOP-11722] - Some Instances of Services using ZKDelegationTokenSecretManager go down when old token cannot be deleted
- [HADOOP-12699] - TestKMS#testKMSProvider intermittently fails during 'test rollover draining'
- [HADOOP-11000] - HAServiceProtocol's health state is incorrectly transitioned to SERVICE_NOT_RESPONDING
- [HADOOP-10941] - Proxy user verification NPEs if remote host is unresolvable
- [HADOOP-12119] - hadoop fs -expunge does not work for federated namespace
- [HADOOP-12352] - Delay in checkpointing Trash can leave trash for 2 intervals before deleting
- [HADOOP-12213] - Interrupted exception can occur when Client#stop is called
- [HADOOP-10365] - BufferedOutputStream in FileUtil#unpackEntries() should be closed in finally block
- [HADOOP-12191] - Bzip2Factory is not thread safe
- [HADOOP-12100] - ImmutableFsPermission should not override applyUmask since that method doesn't modify the FsPermission
- [HADOOP-9658] - SnappyCodec#checkNativeCodeLoaded may unexpectedly fail when native code is not loaded
- [HADOOP-11868] - Invalid user logins trigger large backtraces in server log
- [HADOOP-12107] - long running apps may have a huge number of StatisticsData instances under FileSystem
- [HADOOP-11730] - Regression: s3n read failure recovery broken
- [HADOOP-12755] - Fix typo in defaultFS warning message
- [HADOOP-12718] - Incorrect error message by fs -put local dir without permission
- [HADOOP-12605] - Fix intermittent failure of TestIPC.testIpcWithReaderQueuing
- [HADOOP-12240] - Fix tests requiring native library to be skipped in non-native profile
- [HADOOP-11287] - Simplify UGI#reloginFromKeytab for Java 7+
- [HADOOP-11870] - [JDK8] AuthenticationFilter, CertificateUtil, SignerSecretProviders, KeyAuthorizationKeyProvider Javadoc issues
- [HADOOP-10134] - [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments
- [HADOOP-12559] - KMS connection failures should trigger TGT renewal
- [HADOOP-12584] - Disable browsing the static directory in HttpServer2
- [HADOOP-12573] - TestRPC.testClientBackOff failing
- [HADOOP-12348] - MetricsSystemImpl creates MetricsSourceAdapter with wrong time unit parameter.
- [HADOOP-11098] - [JDK8] Max Non Heap Memory default changed between JDK7 and 8
- [HADOOP-12588] - Fix intermittent test failure of TestGangliaMetrics
- [HADOOP-12682] - Fix TestKMS#testKMSRestart* failure
- [HADOOP-11218] - Add TLSv1.1,TLSv1.2 to KMS, HttpFS, SSLFactory
- [HADOOP-12604] - Exception may be swallowed in KMSClientProvider
- [HADOOP-12656] - MiniKdc throws "address in use" BindException
- [HADOOP-12615] - Fix NPE in MiniKMS.start()
- [HADOOP-12474] - MiniKMS should use random ports for Jetty server by default
- [HADOOP-12417] - TestWebDelegationToken failing with port in use
- [HADOOP-12482] - Race condition in JMX cache update
- [HADOOP-12468] - Partial group resolution failure should not result in user lockout
- [HADOOP-12577] - Bump up commons-collections version to 3.2.2 to address a security flaw
- [HADOOP-12304] - Applications using FileContext fail with the default file system configured to be wasb/s3/etc.
- [HADOOP-12464] - Interrupted client may try to fail-over and retry
- [HADOOP-9692] - Improving log message when SequenceFile reader throws EOFException on zero-length file
- [HADOOP-10406] - TestIPC.testIpcWithReaderQueuing may fail
- [HADOOP-12418] - TestRPC.testRPCInterruptedSimple fails intermittently
- [HADOOP-12451] - Setting HADOOP_HOME explicitly should be allowed
- [HADOOP-12448] - TestTextCommand: use mkdirs rather than mkdir to create test directory
- [HADOOP-12447] - Clean up some htrace integration issues
- [HADOOP-12440] - TestRPC#testRPCServerShutdown did not produce the desired thread states before shutting down
- [HADOOP-10852] - NetgroupCache is not thread-safe
- [HADOOP-11932] - MetricsSinkAdapter hangs when being stopped
- [HADOOP-11295] - RPC Server Reader thread can't shutdown if RPCCallQueue is full
- [HADOOP-12078] - The default retry policy does not handle RetriableException correctly
- [HADOOP-12186] - ActiveStandbyElector shouldn't call monitorLockNodeAsync multiple times
- [HADOOP-11604] - Prevent ConcurrentModificationException while closing domain sockets during shutdown of DomainSocketWatcher thread.
- [HADOOP-7165] - listLocatedStatus(path, filter) is not redefined in FilterFs
- [HADOOP-12175] - FsShell must load SpanReceiverHost to support tracing
- [HADOOP-12124] - Add HTrace support for FsShell
- [HADOOP-11526] - Memory leak in Bzip2Compressor and Bzip2Decompressor
- [HADOOP-11548] - checknative should display a nicer error message when openssl support is not compiled in
- [HADOOP-10798] - globStatus() should always return a sorted list of files
- [HADOOP-11201] - Hadoop Archives should support globs resolving to files
- [HADOOP-11491] - HarFs incorrectly declared as requiring an authority
- [HADOOP-9907] - Webapp http://hostname:port/metrics link is not working
- [HADOOP-12346] - Increase some default timeouts / retries for S3a connector
- [HADOOP-12317] - Applications fail on NM restart on some linux distro because NM container recovery declares AM container as LOST
- [HADOOP-12362] - Set hadoop.tmp.dir and hadoop.log.dir in pom
- [HADOOP-11209] - Configuration#updatingResource/finalParameters are not thread-safe
- [HADOOP-11876] - Refactor code to make it more readable, minor maybePrintStats bug
- [HADOOP-12200] - TestCryptoStreamsWithOpensslAesCtrCryptoCodec should be skipped in non-native profile
- [HADOOP-12017] - Hadoop archives command should use configurable replication factor when closing
- [HADOOP-11704] - DelegationTokenAuthenticationFilter must pass ipaddress instead of hostname to ProxyUsers#authorize()
- [HADOOP-12201] - Add tracing to FileSystem#createFileSystem and Globber#glob
- [HADOOP-12171] - Shorten overly-long htrace span names for server
- [HADOOP-11912] - Extra configuration key used in TraceUtils should respect prefix
- [HADOOP-11186] - documentation should talk about hadoop.htrace.spanreceiver.classes, not hadoop.trace.spanreceiver.classes
- [HADOOP-12159] - Move DistCpUtils#compareFs() to org.apache.hadoop.fs.FileUtil and fix for HA namespaces
- [HADOOP-12164] - Fix TestMove and TestFsShellReturnCode failed to get command name using reflection.
- [HADOOP-12103] - Small refactoring of DelegationTokenAuthenticationFilter to allow code sharing
- [HADOOP-8151] - Error handling in snappy decompressor throws invalid exceptions
- [HADOOP-11934] - Use of JavaKeyStoreProvider in LdapGroupsMapping causes infinite loop
- [HADOOP-11837] - AuthenticationFilter should destroy SignerSecretProvider in Tomcat deployments
- [HADOOP-11815] - HttpServer2 should destroy SignerSecretProvider when it stops
- [HADOOP-11754] - RM fails to start in non-secure mode due to authentication filter failure
- [HADOOP-11748] - The secrets of auth cookies should not be specified in configuration in clear text
- [HADOOP-11014] - Potential resource leak in JavaKeyStoreProvider due to unclosed stream
- [HADOOP-11969] - ThreadLocal initialization in several classes is not thread safe
- [HADOOP-11973] - Ensure ZkDelegationTokenSecretManager namespace znodes get created with ACLs
- [HADOOP-11402] - Negative user-to-group cache entries are never cleared for never-again-accessed users
- [HADOOP-11238] - Update the NameNode's Group Cache in the background when possible
- [HADOOP-11710] - Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
- [HADOOP-11891] - OsSecureRandom should lazily fill its reservoir
- [HADOOP-11802] - DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm
- [HADOOP-11724] - DistCp throws NPE when the target directory is root.
- [HADOOP-11670] - Regression: s3a auth setup broken
- [HADOOP-11183] - Memory-based S3AOutputstream
- [HADOOP-11521] - Make connection timeout configurable in s3a
- [HADOOP-11674] - oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static
- [HADOOP-11445] - Bzip2Codec: Data block is skipped when position of newly created stream is equal to start of split
- [HADOOP-11584] - s3a file block size set to 0 in getFileStatus
- [HADOOP-11522] - Update S3A Documentation
- [HADOOP-11570] - S3AInputStream.close() downloads the remaining bytes of the object from S3
- [HADOOP-11446] - S3AOutputStream should use shared thread pool to avoid OutOfMemoryError
- [HADOOP-10953] - NetworkTopology#add calls NetworkTopology#toString without holding the netlock
- [HADOOP-11363] - Hadoop maven surefire-plugin uses must set heap size
- [HADOOP-11350] - The size of header buffer of HttpServer is too small when HTTPS is enabled
- [HADOOP-10714] - AmazonS3Client.deleteObjects() need to be limited to 1000 entries per call
- [HADOOP-11482] - Clients that cache instances of KMSClientProvider fail Authentication when "addDelegationTokens()" method is called after authentication token validity expires
- [HADOOP-11348] - Remove unused variable from CMake error message for finding openssl
- [HADOOP-11368] - Fix SSLFactory truststore reloader thread leak in KMSClientProvider
- [HADOOP-11329] - Add JAVA_LIBRARY_PATH to KMS startup options
- [HADOOP-11343] - Overflow is not properly handled in calculating final iv for AES CTR
- [HADOOP-11355] - When accessing data in HDFS and the key has been deleted, a Null Pointer Exception is shown.
- [HADOOP-11342] - KMS key ACL should ignore ALL operation for default key ACL and whitelist key ACL
- [HADOOP-11344] - KMS kms-config.sh sets a default value for the keystore password even in non-ssl setup
- [HADOOP-11337] - KeyAuthorizationKeyProvider access checks need to be done atomically
- [HADOOP-11333] - Fix deadlock in DomainSocketWatcher when the notification pipe is full
- [HADOOP-11322] - key based ACL check in KMS always check KeyOpType.MANAGEMENT even actual KeyOpType is not MANAGEMENT
- [HADOOP-11300] - KMS startup scripts must not display the keystore / truststore passwords
- [HADOOP-11312] - Fix unit tests to not use uppercase key names
- [HADOOP-11157] - ZKDelegationTokenSecretManager never shuts down listenerThreadPool
- [HADOOP-11311] - Restrict uppercase key names from being created with JCEKS
- [HADOOP-11230] - Add missing dependency of bouncycastle for kms, httpfs, hdfs, MR and YARN
- [HADOOP-11289] - Fix typo in RpcUtil log message
- [HADOOP-11187] - NameNode - KMS communication fails after a long period of inactivity
- [HADOOP-11272] - Allow ZKSignerSecretProvider and ZKDelegationTokenSecretManager to use the same curator client
- [HADOOP-11267] - TestSecurityUtil fails when run with JDK8 because of empty principal names
- [HADOOP-10840] - Fix OutOfMemoryError caused by metrics system in Azure File System
- [HADOOP-10690] - Lack of synchronization on access to InputStream in NativeAzureFileSystem#NativeAzureFsInputStream#close()
- [HADOOP-10689] - InputStream is not closed in AzureNativeFileSystemStore#retrieve()
- [HADOOP-11156] - DelegateToFileSystem should implement getFsStatus(final Path f).
- [HADOOP-8912] - adding .gitattributes file to prevent CRLF and LF mismatches for source and text files
- [HADOOP-11035] - distcp on mr1(branch-1) fails with NPE using a short relative source path.
- [HADOOP-8329] - Build fails with Java 7
- [HADOOP-8552] - Conflict: Same security.log.file for multiple users.
Improvement
- [HADOOP-13017] - Implementations of InputStream.read(buffer, offset, bytes) to exit 0 if bytes==0
- [HADOOP-14844] - Remove requirement to specify TenantGuid for MSI Token Provider
- [HADOOP-14688] - Intern strings in KeyVersion and EncryptedKeyVersion
- [HADOOP-14521] - KMS client needs retry logic
- [HADOOP-14705] - Add batched interface reencryptEncryptedKeys to KMS
- [HADOOP-14251] - Credential provider should handle property key deprecation
- [HADOOP-13827] - Add reencryptEncryptedKey interface to KMS
- [HADOOP-14401] - maven-project-info-reports-plugin can be removed
- [HADOOP-14321] - Explicitly exclude S3A root dir ITests from parallel runs
- [HADOOP-11572] - s3a delete() operation fails during a concurrent delete of child entries
- [HADOOP-14260] - Configuration.dumpConfiguration should redact sensitive information
- [HADOOP-13628] - Support to retrieve specific property from configuration via REST API
- [HADOOP-14627] - Support MSI and DeviceCode token provider in ADLS
- [HADOOP-14678] - AdlFilesystem#initialize swallows exception when getting user name
- [HADOOP-14349] - Rename ADLS CONTRACT_ENABLE_KEY
- [HADOOP-14174] - Set default ADLS access token provider type to ClientCredential
- [HADOOP-14038] - Rename ADLS credential properties
- [HADOOP-14542] - Add IOUtils.cleanupWithLogger that accepts slf4j logger API
- [HADOOP-14629] - Improve exception checking in FileContext related JUnit tests
- [HADOOP-14440] - Add metrics for connections dropped
- [HADOOP-14515] - Specifically configure zookeeper-related log levels in KMS log4j
- [HADOOP-14524] - Make CryptoCodec Closeable so it can be cleaned up proactively
- [HADOOP-14523] - OpensslAesCtrCryptoCodec.finalize() holds excessive amounts of memory
- [HADOOP-13854] - KMS should log error details in KMSExceptionsProvider
- [HADOOP-13174] - Add more debug logs for delegation tokens and authentication
- [HADOOP-13720] - Add more info to the msgs printed in AbstractDelegationTokenSecretManager for better supportability
- [HADOOP-11709] - Time.NANOSECONDS_PER_MILLISECOND - use class-level final constant instead of method variable
- [HADOOP-11812] - Implement listLocatedStatus for ViewFileSystem to speed up split calculation
- [HADOOP-14407] - DistCp - Introduce a configurable copy buffer size
- [HADOOP-14242] - Make KMS Tomcat SSL property sslEnabledProtocols and clientAuth configurable
- [HADOOP-14141] - Store KMS SSL keystore password in catalina.properties
- [HADOOP-14241] - Add ADLS sensitive config keys to default list
- [HADOOP-14230] - TestAdlFileSystemContractLive fails to clean up
- [HADOOP-14197] - Fix ADLS doc for credential provider
- [HADOOP-14196] - Azure Data Lake doc is missing required config entry
- [HADOOP-14173] - Remove unused AdlConfKeys#ADL_EVENTS_TRACKING_SOURCE
- [HADOOP-14153] - ADL module has messed doc structure
- [HADOOP-14123] - Remove misplaced ADL service provider config file for FileSystem
- [HADOOP-14255] - S3A to delete unnecessary fake directory objects in mkdirs()
- [HADOOP-14417] - Update default SSL cipher list for KMS
- [HADOOP-14135] - Remove URI parameter in AWSCredentialProvider constructors
- [HADOOP-14120] - needless S3AFileSystem.setOptionalPutRequestParameters in S3ABlockOutputStream putObject()
- [HADOOP-14138] - Remove S3A ref from META-INF service discovery, rely on existing core-default entry
- [HADOOP-14099] - Split S3 testing documentation out into its own file
- [HADOOP-14081] - S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
- [HADOOP-14019] - fix some typos in the s3a docs
- [HADOOP-13871] - ITestS3AInputStreamPerformance.testTimeToOpenAndReadWholeFileBlocks performance awful
- [HADOOP-11794] - Enable distcp to copy blocks in parallel
- [HADOOP-13169] - Randomize file list in SimpleCopyListing
- [HADOOP-14104] - Client should always ask namenode for kms provider path.
- [HADOOP-14246] - Authentication Tokens should use SecureRandom instead of Random and 256 bit secrets
- [HADOOP-11379] - Fix new findbugs warnings in hadoop-auth*
- [HADOOP-13503] - Improve SaslRpcClient failure logging
- [HADOOP-12672] - RPC timeout should not override IPC ping interval
- [HADOOP-11599] - Client#getTimeout should use IPC_CLIENT_PING_DEFAULT when IPC_CLIENT_PING_KEY is not configured.
- [HADOOP-14083] - KMS should support old SSL clients
- [HADOOP-14127] - Add log4j configuration to enable logging in hadoop-distcp's tests
- [HADOOP-14102] - Relax error message assertion in S3A test ITestS3AEncryptionSSEC
- [HADOOP-14113] - review ADL Docs
- [HADOOP-14040] - Use shaded aws-sdk uber-JAR 1.11.86
- [HADOOP-13782] - Make MutableRates metrics thread-local write, aggregate-on-read
- [HADOOP-11447] - Add a more meaningful toString method to SampleStat and MutableStat
- [HADOOP-13742] - Expose "NumOpenConnectionsPerUser" as a metric
- [HADOOP-13336] - S3A to support per-bucket configuration
- [HADOOP-13496] - Include file lengths in Mismatch in length error for distcp
- [HADOOP-14050] - Add process name to kms process
- [HADOOP-13204] - Über-jira: S3a phase III: scale and tuning
- [HADOOP-13627] - Have an explicit KerberosAuthException for UGI to throw, text from public constants
- [HADOOP-14003] - Make additional KMS tomcat settings configurable
- [HADOOP-13855] - Fix a couple of the s3a statistic names to be consistent with the rest
- [HADOOP-13857] - S3AUtils.translateException to map (wrapped) InterruptedExceptions to InterruptedIOEs
- [HADOOP-13823] - s3a rename: fail if dest file exists
- [HADOOP-13801] - regression: ITestS3AMiniYarnCluster failing
- [HADOOP-12009] - Clarify FileSystem.listStatus() sorting order & fix FileSystemContractBaseTest:testListStatus
- [HADOOP-13900] - Remove snapshot version of SDK dependency from Azure Data Lake Store File System
- [HADOOP-13680] - fs.s3a.readahead.range to use getLongBytes
- [HADOOP-13502] - Split fs.contract.is-blobstore flag into more descriptive flags for use by contract tests.
- [HADOOP-13614] - Purge some superfluous/obsolete S3 FS tests that are slowing test runs down
- [HADOOP-13207] - Specify FileSystem listStatus, listFiles and RemoteIterator
- [HADOOP-13309] - Document S3A known limitations in file ownership and permission model.
- [HADOOP-12774] - s3a should use UGI.getCurrentUser.getShortname() for username
- [HADOOP-13727] - S3A: Reduce high number of connections to EC2 Instance Metadata Service caused by InstanceProfileCredentialsProvider.
- [HADOOP-13735] - ITestS3AFileContextStatistics.testStatistics() failing
- [HADOOP-13560] - S3ABlockOutputStream to support huge (many GB) file writes
- [HADOOP-13962] - Update ADLS SDK to 2.1.4
- [HADOOP-13956] - Read ADLS credentials from Credential Provider
- [HADOOP-13692] - hadoop-aws should declare explicit dependency on Jackson 2 jars to prevent classpath conflicts.
- [HADOOP-13953] - Make FTPFileSystem's data connection mode and transfer mode configurable
- [HADOOP-12977] - s3a to handle delete("/", true) robustly
- [HADOOP-13674] - S3A can provide a more detailed error message when accessing a bucket through an incorrect S3 endpoint.
- [HADOOP-13599] - s3a close() to be non-synchronized, so avoid risk of deadlock on shutdown
- [HADOOP-13540] - improve section on troubleshooting s3a auth problems
- [HADOOP-13911] - Remove TRUSTSTORE_PASSWORD related scripts from KMS
- [HADOOP-13257] - Improve Azure Data Lake contract tests.
- [HADOOP-13037] - Refactor Azure Data Lake Store as an independent FileSystem
- [HADOOP-13541] - explicitly declare the Joda time version S3A depends on
- [HADOOP-12325] - RPC Metrics : Add the ability track and log slow RPCs
- [HADOOP-13590] - Retry until TGT expires even if the UGI renewal thread encountered exception
- [HADOOP-7930] - Kerberos relogin interval in UserGroupInformation should be configurable
- [HADOOP-13641] - Update UGI#spawnAutoRenewalThreadForUserCreds to reduce indentation
- [HADOOP-13252] - Tune S3A provider plugin mechanism
- [HADOOP-12974] - Create a CachingGetSpaceUsed implementation that uses df
- [HADOOP-12975] - Add jitter to CachingGetSpaceUsed's thread
- [HADOOP-12973] - make DU pluggable
- [HADOOP-12453] - Support decoding KMS Delegation Token with its own Identifier
- [HADOOP-13130] - s3a failures can surface as RTEs, not IOEs
- [HADOOP-13669] - KMS Server should log exceptions before throwing
- [HADOOP-13684] - Snappy may complain Hadoop is built without snappy if libhadoop is not found.
- [HADOOP-13034] - Log message about input options in distcp lacks some items
- [HADOOP-10300] - Allowed deferred sending of call responses
- [HADOOP-13405] - doc for fs.s3a.acl.default indicates incorrect values
- [HADOOP-13208] - S3A listFiles(recursive=true) to do a bulk listObjects instead of walking the pseudo-tree of directories
- [HADOOP-13693] - Remove the message about HTTP OPTIONS in SPNEGO initialization message from kms audit log
- [HADOOP-13698] - Document caveat for KeyShell when underlying KeyProvider does not delete a key
- [HADOOP-13442] - Optimize UGI group lookups
- [HADOOP-13324] - s3a tests don't authenticate with S3 frankfurt (or other V4 auth only endpoints)
- [HADOOP-13188] - S3A file-create should throw error rather than overwrite directories
- [HADOOP-13317] - Add logs to KMS server-side to improve supportability
- [HADOOP-13212] - Provide an option to set the socket buffers in S3AFileSystem
- [HADOOP-13239] - Deprecate s3:// in branch-2
- [HADOOP-13203] - S3A: Support fadvise "random" mode for high performance readPositioned() reads
- [HADOOP-13241] - document s3a better
- [HADOOP-13237] - s3a initialization against public bucket fails if caller lacks any credentials
- [HADOOP-12807] - S3AFileSystem should read AWS credentials from environment variables
- [HADOOP-13171] - Add StorageStatistics to S3A; instrument some more operations
- [HADOOP-13131] - Add tests to verify that S3A supports SSE-S3 encryption
- [HADOOP-13162] - Consider reducing number of getFileStatus calls in S3AFileSystem.mkdirs
- [HADOOP-13145] - In DistCp, prevent unnecessary getFileStatus call when not preserving metadata.
- [HADOOP-13158] - S3AFileSystem#toString might throw NullPointerException due to null cannedACL.
- [HADOOP-13113] - Enable parallel test execution for hadoop-aws.
- [HADOOP-13047] - S3a Forward seek in stream length to be configurable
- [HADOOP-13122] - Customize User-Agent header sent in HTTP requests by S3A.
- [HADOOP-12891] - S3AFileSystem should configure Multipart Copy threshold and chunk size
- [HADOOP-12444] - Support lazy seek in S3AInputStream
- [HADOOP-12169] - ListStatus on empty dir in S3A lists itself instead of returning an empty list
- [HADOOP-12994] - Specify PositionedReadable, add contract tests, fix problems
- [HADOOP-12292] - Make use of DeleteObjects optional
- [HADOOP-11520] - Clean incomplete multi-part uploads in S3A tests
- [HADOOP-11381] - Fix findbugs warnings in hadoop-distcp, hadoop-aws, hadoop-azure, and hadoop-openstack
- [HADOOP-12782] - Faster LDAP group name resolution with ActiveDirectory
- [HADOOP-13380] - TestBasicDiskValidator should not write data to /tmp
- [HADOOP-13103] - Group resolution from LDAP may fail on javax.naming.ServiceUnavailableException
- [HADOOP-13298] - Fix the leftover L&N files in hadoop-build-tools/src/main/resources/META-INF/
- [HADOOP-13154] - S3AFileSystem printAmazonServiceException/printAmazonClientException appear copy & paste of AWS examples
- [HADOOP-13290] - Appropriate use of generics in FairCallQueue
- [HADOOP-12800] - Copy docker directory from 2.8 to 2.7/2.6 repos to enable pre-commit Jenkins runs
- [HADOOP-10048] - LocalDirAllocator should avoid holding locks while accessing the filesystem
- [HADOOP-11901] - BytesWritable fails to support 2G chunks due to integer overflow
- [HADOOP-12928] - Update netty to 3.10.5.Final to sync with zookeeper
- [HADOOP-13263] - Reload cached groups in background after expiry
- [HADOOP-11031] - Design Document for Credential Provider API
- [HADOOP-12963] - Allow using path style addressing for accessing the s3 endpoint
- [HADOOP-13199] - Add doc for distcp -filters
- [HADOOP-12982] - Document missing S3A and S3 properties
- [HADOOP-12711] - Remove dependency on commons-httpclient for ServletUtil
- [HADOOP-12772] - NetworkTopologyWithNodeGroup.getNodeGroup() can loop infinitely for invalid 'loc' values
- [HADOOP-12805] - Annotate CanUnbuffer with @InterfaceAudience.Public
- [HADOOP-12789] - log classpath of ApplicationClassLoader at INFO level
- [HADOOP-11613] - Remove commons-httpclient dependency from hadoop-azure
- [HADOOP-12841] - Update s3-related properties in core-default.xml
- [HADOOP-12825] - Log slow name resolutions
- [HADOOP-11687] - Ignore x-* and response headers when copying an Amazon S3 object
- [HADOOP-12280] - Skip unit tests based on maven profile rather than NativeCodeLoader.isNativeCodeLoaded
- [HADOOP-7139] - Allow appending to existing SequenceFiles
- [HADOOP-11404] - Clarify the "expected client Kerberos principal is null" authorization message
- [HADOOP-12901] - Add warning log when KMSClientProvider cannot create a connection to the KMS server
- [HADOOP-12828] - Print user when services are started
- [HADOOP-12668] - Support excluding weak Ciphers in HttpServer2 through ssl-server.conf
- [HADOOP-12829] - StatisticsDataReferenceCleaner swallows interrupt exceptions
- [HADOOP-12817] - Enable TLS v1.1 and 1.2
- [HADOOP-12764] - Increase default value of KMS maxHttpHeaderSize and make it configurable
- [HADOOP-10651] - Add ability to restrict service access using IP addresses and hostnames
- [HADOOP-12788] - OpensslAesCtrCryptoCodec should log which random number generator is used.
- [HADOOP-12759] - RollingFileSystemSink should eagerly rotate directories
- [HADOOP-12566] - Add NullGroupMapping
- [HADOOP-12683] - Add number of samples in last interval in snapshot of MutableStat
- [HADOOP-11984] - Enable parallel JUnit tests in pre-commit.
- [HADOOP-12049] - Control http authentication cookie persistence via configuration
- [HADOOP-11262] - Enable YARN to use S3A
- [HADOOP-12259] - Utility for dynamic port allocation
- [HADOOP-12625] - Add a config to disable the /logs endpoints
- [HADOOP-12568] - Update core-default.xml to describe posixGroups support
- [HADOOP-7713] - dfs -count -q should label output column
- [HADOOP-11171] - Enable using a proxy server to connect to S3a.
- [HADOOP-12269] - Update aws-sdk dependency to 1.10.6
- [HADOOP-11464] - Reinstate support for launching Hadoop processes on Windows using Cygwin.
- [HADOOP-11918] - Listing an empty s3a root directory throws FileNotFound.
- [HADOOP-11032] - Replace use of Guava's Stopwatch with Hadoop's StopWatch
- [HADOOP-10987] - Provide an iterator-based listing API for FileSystem
- [HADOOP-11506] - Configuration variable expansion regex expensive for long values
- [HADOOP-12413] - AccessControlList should avoid calling getGroupNames in isUserInList with empty groups.
- [HADOOP-12404] - Disable caching for JarURLConnection to avoid sharing JarFile with other users when loading resource from URL in Configuration class.
- [HADOOP-11422] - Check CryptoCodec is AES-CTR for Crypto input/output stream
- [HADOOP-12183] - Annotate the HTrace span created by FsShell with the command-line arguments passed by the user
- [HADOOP-11544] - Remove unused configuration keys for tracing
- [HADOOP-11261] - Set custom endpoint for S3A
- [HADOOP-12369] - Point hadoop-project/pom.xml java.security.krb5.conf within target folder
- [HADOOP-12367] - Move TestFileUtil's test resources to resources folder
- [HADOOP-12368] - Mark ViewFileSystemBaseTest and ViewFsBaseTest as abstract
- [HADOOP-1540] - Support file exclusion list in distcp
- [HADOOP-11827] - Speed-up distcp buildListing() using threadpool
- [HADOOP-12318] - Expose underlying LDAP exceptions in SaslPlainServer
- [HADOOP-11785] - Reduce number of listStatus operation in distcp buildListing()
- [HADOOP-12172] - FsShell mkdir -p makes an unnecessary check for the existence of the parent.
- [HADOOP-11659] - o.a.h.fs.FileSystem.Cache#remove should use a single hash map lookup
- [HADOOP-10597] - RPC Server signals backoff to clients when all request queues are full
- [HADOOP-11894] - Bump the version of Apache HTrace to 3.2.0-incubating
- [HADOOP-11971] - Move test utilities for tracing from hadoop-hdfs to hadoop-common
- [HADOOP-11242] - Record the time of calling in tracing span of IPC server
- [HADOOP-11714] - Add more trace log4j messages to SpanReceiverHost
- [HADOOP-11498] - Bump the version of HTrace to 3.1.0-incubating
- [HADOOP-12158] - Improve error message in TestCryptoStreamsWithOpensslAesCtrCryptoCodec when OpenSSL is not installed
- [HADOOP-12059] - S3Credentials should support use of CredentialProvider
- [HADOOP-10670] - Allow AuthenticationFilters to load secret from signature secret files
- [HADOOP-12043] - Display warning if defaultFs is not set when running fs commands.
- [HADOOP-11483] - HardLink.java should use the jdk7 createLink method
- [HADOOP-11711] - Provide a default value for AES/CTR/NoPadding CryptoCodec classes
- [HADOOP-11692] - Improve authentication failure WARN message to avoid user confusion
- [HADOOP-11421] - Add IOUtils#listDirectory
- [HADOOP-11416] - Move ChunkedArrayList into hadoop-common
- [HADOOP-11427] - ChunkedArrayList: fix removal via iterator and implement get
- [HADOOP-11430] - Add GenericTestUtils#disableLog, GenericTestUtils#setLogLevel
- [HADOOP-11620] - Add support for load balancing across a group of KMS for HA
- [HADOOP-11607] - Reduce log spew in S3AFileSystem
- [HADOOP-10476] - Bumping the findbugs version to 3.0.0
- [HADOOP-10530] - Make hadoop trunk build on Java7+ only
- [HADOOP-11455] - KMS and Credential CLI should request confirmation for deletion by default
- [HADOOP-11184] - Update Hadoop's lz4 to r123
- [HADOOP-9374] - Add tokens from -tokenCacheFile into UGI
- [HADOOP-10626] - Limit Returning Attributes for LDAP search
- [HADOOP-11188] - hadoop-azure: automatically expand page blobs when they become full
- [HADOOP-10809] - hadoop-azure: page blob support
- [HADOOP-8757] - Metrics should disallow names with invalid characters
- [HADOOP-11399] - Java Configuration file and .xml files should be automatically cross-compared
- [HADOOP-11301] - [optionally] update jmx cache to drop old metrics
- [HADOOP-11410] - make the rpath of libhadoop.so configurable
- [HADOOP-11173] - Improve error messages for some KeyShell commands
- [HADOOP-6857] - FsShell should report raw disk usage including replication factor
- [HADOOP-11323] - WritableComparator#compare keeps reference to byte array
- [HADOOP-11291] - Log the cause of SASL connection failures
- [HADOOP-10847] - Remove the usage of sun.security.x509.* in testing code
- [HADOOP-10786] - Fix UGI#reloginFromKeytab on Java 8
- [HADOOP-10535] - Make the retry numbers in ActiveStandbyElector configurable
- [HADOOP-10451] - Remove unused field and imports from SaslRpcServer
- [HADOOP-9756] - Additional cleanup of RPC code
- [HADOOP-8700] - Move the checksum type constants to an enum
New Feature
- [HADOOP-14504] - ProvidedFileStatusIterator#next() may throw IndexOutOfBoundsException
- [HADOOP-14433] - ITestS3GuardConcurrentOps.testConcurrentTableCreations fails without table name configured
- [HADOOP-13760] - S3Guard: add delete tracking
- [HADOOP-13453] - S3Guard: Instrument new functionality with Hadoop metrics.
- [HADOOP-14323] - ITestS3GuardListConsistency failure w/ Local, authoritative metadata store
- [HADOOP-14266] - S3Guard: S3AFileSystem::listFiles() to employ MetadataStore
- [HADOOP-14107] - ITestS3GuardListConsistency fails intermittently
- [HADOOP-14263] - TestS3GuardTool hangs/fails when offline: it's an IT test
- [HADOOP-14288] - TestDynamoDBMetadataStore is broken unless we can fail faster without a table version
- [HADOOP-14144] - s3guard: CLI diff non-empty after import on new table
- [HADOOP-14215] - DynamoDB client should waitForActive on existing tables
- [HADOOP-14172] - S3Guard: import does not import empty directory
- [HADOOP-14051] - S3Guard: link docs from index, fix typos
- [HADOOP-14282] - S3Guard: DynamoDBMetadata::prune() should self interrupt correctly
- [HADOOP-13926] - S3Guard: S3AFileSystem::listLocatedStatus() to employ MetadataStore
- [HADOOP-14236] - S3Guard: S3AFileSystem::rename() should move non-listed sub-directory entries in metadata store
- [HADOOP-14227] - S3Guard: ITestS3AConcurrentOps is not cleaning up test data
- [HADOOP-13966] - Add ability to start DDB local server in every test
- [HADOOP-14036] - S3Guard: intermittent duplicate item keys failure
- [HADOOP-14181] - Add validation of DynamoDB region
- [HADOOP-14168] - S3GuardTool tests should not run if S3Guard is not set up
- [HADOOP-13914] - s3guard: improve S3AFileStatus#isEmptyDirectory handling
- [HADOOP-13345] - S3Guard: Improved Consistency for S3A
- [HADOOP-14027] - Implicitly creating DynamoDB table ignores endpoint config
- [HADOOP-14129] - ITestS3ACredentialsInURL sometimes fails
- [HADOOP-14094] - Rethink S3GuardTool options
- [HADOOP-14125] - s3guard tool tests aren't isolated; can't run in parallel
- [HADOOP-14130] - Simplify DynamoDBClientFactory for creating Amazon DynamoDB clients
- [HADOOP-14110] - In S3AFileSystem, make getAmazonClient() package private; export getBucketLocation()
- [HADOOP-14041] - CLI command to prune old metadata
- [HADOOP-14096] - s3guard: regression in dirListingUnion
- [HADOOP-13904] - DynamoDBMetadataStore to handle DDB throttling failures through retry policy
- [HADOOP-14046] - Metastore destruction test creates table without version marker
- [HADOOP-14085] - Drop unnecessary type assertion and cast
- [HADOOP-14079] - Fix broken link in s3guard.md
- [HADOOP-14013] - S3Guard: fix multi-bucket integration tests
- [HADOOP-13876] - S3Guard: better support for multi-bucket access
- [HADOOP-13995] - s3guard cli: make tests easier to run and address failure
- [HADOOP-14020] - Optimize dirListingUnion
- [HADOOP-13985] - s3guard: add a version marker to every table
- [HADOOP-14049] - Honour AclBit flag associated to file/folder permission for Azure datalake account
- [HADOOP-13877] - S3Guard: fix TestDynamoDBMetadataStore when fs.s3a.s3guard.ddb.table is set
- [HADOOP-13589] - S3Guard: Allow execution of all S3A integration tests with S3Guard enabled.
- [HADOOP-13650] - S3Guard: Provide command line tools to manipulate metadata store.
- [HADOOP-13908] - S3Guard: Existing tables may not be initialized correctly in DynamoDBMetadataStore
- [HADOOP-13960] - Initialize DynamoDBMetadataStore without associated S3AFileSystem
- [HADOOP-13931] - S3Guard: Use BatchWriteItem in DynamoDBMetadataStore#put()
- [HADOOP-13934] - S3Guard: DynamoDBMetadataStore#move() could be throwing exception due to BatchWriteItem limits
- [HADOOP-13937] - Mock bucket locations in MockS3ClientFactory
- [HADOOP-13455] - S3Guard: Write end user docs, change table autocreate default.
- [HADOOP-13899] - tune dynamodb client & tests
- [HADOOP-13886] - s3guard: ITestS3AFileOperationCost.testFakeDirectoryDeletion failure
- [HADOOP-13893] - dynamodb dependency -> compile
- [HADOOP-13449] - S3Guard: Implement DynamoDBMetadataStore.
- [HADOOP-13793] - s3guard: add inconsistency injection, integration tests
- [HADOOP-13850] - s3guard to log choice of metadata store at debug
- [HADOOP-13651] - S3Guard: S3AFileSystem Integration with MetadataStore
- [HADOOP-13631] - S3Guard: implement move() for LocalMetadataStore, add unit tests
- [HADOOP-13452] - S3Guard: Implement access policy for intra-client consistency with in-memory metadata store.
- [HADOOP-13573] - S3Guard: create basic contract tests for MetadataStore implementations
- [HADOOP-13448] - S3Guard: Define MetadataStore interface.
- [HADOOP-13446] - Support running isolated unit tests separate from AWS integration tests.
- [HADOOP-13447] - Refactor S3AFileSystem to support introduction of separate metadata repository and tests.
- [HADOOP-13368] - DFSOpsCountStatistics$OpType#fromSymbol and s3a.Statistic#fromSymbol should be O(1) operation
- [HADOOP-13283] - Support reset operation for new global storage statistics and per FS storage stats
- [HADOOP-13305] - Define common statistics names across schemes
- [HADOOP-13291] - Probing stats in DFSOpsCountStatistics/S3AStorageStatistics should be correctly implemented
- [HADOOP-13288] - Guard null stats key in FileSystemStorageStatistics
- [HADOOP-13280] - FileSystemStorageStatistics#getLong(readOps) should return readOps + largeReadOps
- [HADOOP-13284] - FileSystemStorageStatistics must not attempt to read non-existent rack-aware read stats in branch-2.8
- [HADOOP-13065] - Add a new interface for retrieving FS and FC Statistics
- [HADOOP-13396] - Allow pluggable audit loggers in KMS
- [HADOOP-10971] - Add -C flag to make `hadoop fs -ls` print filenames only
- [HADOOP-8934] - Shell command ls should include sort options
- [HADOOP-12537] - S3A to support Amazon STS temporary credentials
- [HADOOP-12723] - S3A: Add ability to plug in any AWSCredentialsProvider
- [HADOOP-12548] - Read s3a creds from a Credential Provider
- [HADOOP-12847] - hadoop daemonlog should support https and SPNEGO for Kerberized cluster
- [HADOOP-12360] - Create StatsD metrics2 sink
- [HADOOP-12702] - Add an HDFS metrics sink
- [HADOOP-8989] - hadoop fs -find feature
- [HADOOP-9477] - Add posixGroups support for LDAP groups mapping service
- [HADOOP-11341] - KMS support for whitelist key ACLs
- [HADOOP-10728] - Metrics system for Windows Azure Storage Filesystem
- [HADOOP-9629] - Support Windows Azure Storage - Blob as a file system in Hadoop
Task
- [HADOOP-14324] - Refine S3 server-side-encryption key as encryption secret; improve error reporting and diagnostics
- [HADOOP-13139] - Branch-2: S3a to use thread pool that blocks clients
- [HADOOP-11814] - Reformat hadoop-annotations, o.a.h.classification.tools
- [HADOOP-11463] - Replace method-local TransferManager object with S3AFileSystem#transfers
- [HADOOP-11492] - Bump up curator version to 2.7.1
Test
- [HADOOP-12696] - Add Tests for S3FileSystem Contract
- [HADOOP-13395] - Enhance TestKMSAudit
- [HADOOP-11432] - Fix SymlinkBaseTest#testCreateLinkUsingPartQualPath2
- [HADOOP-12736] - TestTimedOutTestsListener#testThreadDumpAndDeadlocks sometimes times out
- [HADOOP-12715] - TestValueQueue#testgetAtMostPolicyALL fails intermittently
- [HADOOP-10668] - TestZKFailoverControllerStress#testExpireBackAndForth occasionally fails
- [HADOOP-11165] - TestUTF8 fails when run against java 8
YARN
Bug
- [YARN-7099] - ResourceHandlerModule.parseConfiguredCGroupPath only works for privileged yarn users.
- [YARN-6757] - Refactor the usage of yarn.nodemanager.linux-container-executor.cgroups.mount-path
- [YARN-6432] - FairScheduler: Reserve preempted resources for corresponding applications
- [YARN-5876] - TestResourceTrackerService#testGracefulDecommissionWithApp fails intermittently on trunk
- [YARN-6050] - AMs can't be scheduled on racks or nodes
- [YARN-5875] - TestTokenClientRMService#testTokenRenewalWrongUser fails
- [YARN-3078] - LogCLIHelpers lacks a blank space before string 'does not exist'
- [YARN-6643] - TestRMFailover fails rarely due to port conflict
- [YARN-3749] - We should make a copy of configuration when init MiniYARNCluster with multiple RMs
- [YARN-2890] - MiniYarnCluster should turn on timeline service if configured to do so
- [YARN-3742] - YARN RM will shut down if ZKClient creation times out
- [YARN-4593] - Deadlock in AbstractService.getConfig()
- [YARN-6615] - AmIpFilter drops query parameters on redirect
- [YARN-4209] - RMStateStore FENCED state doesn't work due to updateFencedState called by stateMachine.doTransition
- [YARN-2946] - DeadLocks in RMStateStore<->ZKRMStateStore
- [YARN-2136] - RMStateStore can explicitly handle store/update events when fenced
- [YARN-6249] - TestFairSchedulerPreemption fails inconsistently.
- [YARN-6380] - FSAppAttempt keeps redundant copy of the queue
- [YARN-6334] - TestRMFailover#testAutomaticFailover always passes even when it should fail
- [YARN-6510] - Fix procfs stat file warning caused by process names that include parentheses
- [YARN-6500] - Do not mount inaccessible cgroups directories in CgroupsLCEResourcesHandler
- [YARN-6453] - fairscheduler-statedump.log gets generated regardless of service
- [YARN-6360] - Prevent FS state dump logger from cramming other log files
- [YARN-6368] - Decommissioning an NM results in a -1 exit code
- [YARN-6433] - Only accessible cgroup mount directories should be selected for a controller
- [YARN-6448] - Continuous scheduling thread crashes while sorting nodes
- [YARN-6359] - TestRM#testApplicationKillAtAcceptedState fails rarely due to race condition
- [YARN-6264] - AM not launched when a single vcore is available on the cluster
- [YARN-1047] - Expose # of pre-emptions as a queue counter
- [YARN-3251] - Fix CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)
- [YARN-6218] - Fix TestAMRMClient when using FairScheduler
- [YARN-6231] - FairSchedulerTestBase helper methods should call scheduler.update to avoid flakiness
- [YARN-6215] - FairScheduler preemption and update should not run concurrently
- [YARN-6172] - FSLeafQueue demand update needs to be atomic
- [YARN-6222] - TestFairScheduler.testReservationMetrics is flaky
- [YARN-6210] - FS: Node reservations can interfere with preemption
- [YARN-6193] - FairScheduler might not trigger preemption when using DRF
- [YARN-6163] - FS Preemption is a trickle for severely starved applications
- [YARN-6171] - ConcurrentModificationException on FSAppAttempt.containersToPreempt
- [YARN-5798] - Set UncaughtExceptionHandler for all FairScheduler threads
- [YARN-3933] - FairScheduler: Multiple calls to completedContainer are not safe
- [YARN-6112] - UpdateCallDuration is calculated only when debug logging is enabled
- [YARN-6144] - FairScheduler: preempted resources can become negative
- [YARN-5830] - FairScheduler: Avoid preempting AM containers
- [YARN-4882] - Change the log level to DEBUG for recovering completed applications
- [YARN-3957] - FairScheduler NPE In FairSchedulerQueueInfo causing scheduler page to return 500
- [YARN-4752] - FairScheduler should preempt for a ResourceRequest and all preempted containers should be on the same node
- [YARN-2336] - Fair scheduler REST API returns JSON with a missing '[' bracket for deep queue trees
- [YARN-5182] - MockNodes.newNodes creates one more node per rack than requested
- [YARN-5920] - Fix deadlock in TestRMHA.testTransitionedToStandbyShouldNotHang
- [YARN-5859] - TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails
- [YARN-2306] - Add test for leakage of reservation metrics in fair scheduler
- [YARN-5752] - TestLocalResourcesTrackerImpl#testLocalResourceCache times out
- [YARN-5257] - Fix unreleased resources and null dereferences
- [YARN-3396] - Handle URISyntaxException in ResourceLocalizationService
- [YARN-2988] - Graph#save() may leak file descriptors
- [YARN-4232] - TopCLI console support for HA mode
- [YARN-5136] - Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
- [YARN-3269] - yarn.nodemanager.remote-app-log-dir could not be configured to a fully qualified path
- [YARN-5694] - ZKRMStateStore can prevent the transition to standby in branch-2.7 if the ZK node is unreachable
- [YARN-4201] - AMBlacklist does not work for minicluster
- [YARN-5942] - "Overridden" is misspelled as "overriden" in FairScheduler.md
- [YARN-5677] - RM should transition to standby when connection is lost for an extended period
- [YARN-5453] - FairScheduler#update may skip update demand resource of child queue/app if current demand reached maxResource
- [YARN-4743] - FairSharePolicy breaks TimSort assumption
- [YARN-4767] - Network issues can cause persistent RM UI outage
- [YARN-3582] - NPE in WebAppProxyServlet
- [YARN-5672] - FairScheduler: wrong queue name in log when adding application
- [YARN-5834] - TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value
- [YARN-5862] - TestDiskFailures.testLocalDirsFailures failed
- [YARN-5246] - NMWebAppFilter web redirects drop query parameters
- [YARN-3654] - ContainerLogsPage web UI should not have meta-refresh
- [YARN-4820] - ResourceManager web redirects in HA mode drops query parameters
- [YARN-5837] - NPE when getting node status of a decommissioned node after an RM restart
- [YARN-5001] - Aggregated Logs root directory is created with wrong group if nonexistent
- [YARN-3601] - Fix UT TestRMFailover.testRMWebAppRedirect
- [YARN-5462] - TestNodeStatusUpdater.testNodeStatusUpdaterRetryAndNMShutdown fails intermittently
- [YARN-5009] - NMLeveldbStateStoreService database can grow substantially leading to longer recovery times
- [YARN-4255] - container-executor does not clean up docker operation command files.
- [YARN-4017] - container-executor overuses PATH_MAX
- [YARN-4004] - container-executor should print output of docker logs if the docker container exits with non-0 exit status
- [YARN-5353] - ResourceManager can leak delegation tokens when they are shared across apps
- [YARN-5754] - Null check missing for earliest in FifoPolicy
- [YARN-5197] - RM leaks containers if running container disappears from node update
- [YARN-3375] - NodeHealthScriptRunner.shouldRun() check is performed 3 times when starting NodeHealthScriptRunner
- [YARN-4794] - Deadlock in NMClientImpl
- [YARN-4927] - TestRMHA#testTransitionedToActiveRefreshFail fails with FairScheduler
- [YARN-3239] - WebAppProxy does not support a final tracking url which has query fragments and params
- [YARN-3094] - reset timer for liveness monitors after RM recovery
- [YARN-5693] - Reduce loglevel to Debug in ContainerManagementProtocolProxy and AMRMClientImpl
- [YARN-4115] - Reduce loglevel of ContainerManagementProtocolProxy to Debug
- [YARN-2246] - Job History Link in RM UI is redirecting to the URL which contains Job Id twice
- [YARN-5655] - TestContainerManagerSecurity#testNMTokens is asserting
- [YARN-4940] - yarn node -list -all fails if RM starts with a decommissioned node
- [YARN-5107] - TestContainerMetrics fails
- [YARN-5549] - AMLauncher#createAMContainerLaunchContext() should not log the command to be launched indiscriminately
- [YARN-4556] - TestFifoScheduler.testResourceOverCommit fails
- [YARN-2977] - TestNMClient fails intermittently
- [YARN-4459] - container-executor should only kill process groups
- [YARN-5272] - Handle queue names consistently in FairScheduler
- [YARN-4321] - Incessant retries if NoAuthException is thrown by Zookeeper in non HA mode
- [YARN-2019] - Revisit the decision to make RM crash if any exception is thrown in ZKRMStateStore
- [YARN-5077] - Fix FSLeafQueue#getFairShare() for queues with zero fairshare
- [YARN-5048] - DelegationTokenRenewer#skipTokenRenewal may throw NPE
- [YARN-2356] - yarn status command for non-existent application/application attempt/container is too verbose
- [YARN-4411] - RMAppAttemptImpl#createApplicationAttemptReport throws IllegalArgumentException
- [YARN-4866] - FairScheduler: AMs can consume all vcores leading to a livelock when using FAIR policy
- [YARN-4916] - TestNMProxy.tesNMProxyRPCRetry fails.
- [YARN-4979] - FSAppAttempt demand calculation considers demands at multiple locality levels different
- [YARN-3344] - Fix warning - procfs stat file is not in the expected format
- [YARN-4984] - LogAggregationService shouldn't swallow exception in handling createAppDir(), which causes a thread leak.
- [YARN-4288] - NodeManager restart should keep retrying to register with RM while connection exceptions happen during RM failover.
- [YARN-3896] - RMNode transitioned from RUNNING to REBOOTED because its response id had not been reset synchronously
- [YARN-3804] - Both RMs are in standby state when Kerberos user is not in yarn.admin.acl
- [YARN-4795] - ContainerMetrics drops records
- [YARN-4344] - NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations
- [YARN-3802] - Two RMNodes for the same NodeId are used in RM sometimes after NM is reconnected.
- [YARN-3554] - Default value for maximum nodemanager connect wait time is too high
- [YARN-4414] - Nodemanager connection errors are retried at multiple levels
- [YARN-2952] - Incorrect version check in RMStateStore
- [YARN-1984] - LeveldbTimelineStore does not handle db exceptions properly
- [YARN-4723] - NodesListManager$UnknownNodeId ClassCastException
- [YARN-4127] - RM fails with noAuth error if switched from failover mode to non-failover mode
- [YARN-4850] - test-fair-scheduler.xml isn't valid xml
- [YARN-4935] - TestYarnClient#testSubmitIncorrectQueue fails with FairScheduler
- [YARN-2993] - Several fixes (missing acl check, error log msg ...) and some refinement in AdminService
- [YARN-4168] - Test TestLogAggregationService.testLocalFileDeletionOnDiskFull failing
- [YARN-3055] - The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer
- [YARN-3493] - RM fails to come up with error "Failed to load/recover state" when mem settings are changed
- [YARN-3780] - Should use equals when compare Resource in RMNodeImpl#ReconnectNodeTransition
- [YARN-2046] - Out of band heartbeats are sent only on container kill and possibly too early
- [YARN-3809] - Failed to launch new attempts because ApplicationMasterLauncher's threads all hang
- [YARN-3476] - Nodemanager can fail to delete local logs if log aggregation fails
- [YARN-3537] - NPE when NodeManager.serviceInit fails and stopRecoveryStore invoked
- [YARN-3457] - NPE when NodeManager.serviceInit fails and stopRecoveryStore called
- [YARN-3585] - NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled
- [YARN-4096] - App local logs are leaked if log aggregation fails to initialize for the app
- [YARN-3472] - Possible leak in DelegationTokenRenewer#allTokens
- [YARN-3281] - Add RMStateStore to StateMachine visualization list
- [YARN-3857] - Memory leak in ResourceManager with SIMPLE mode
- [YARN-4047] - ClientRMService getApplications has high scheduler lock contention
- [YARN-4313] - Race condition in MiniMRYarnCluster when getting history server address
- [YARN-3393] - Getting application(s) goes wrong when app finishes before starting the attempt
- [YARN-4598] - Invalid event: RESOURCE_FAILED at CONTAINER_CLEANEDUP_AFTER_KILL
- [YARN-4722] - AsyncDispatcher logs redundant event queue sizes
- [YARN-3382] - Some of UserMetricsInfo metrics are incorrectly set to root queue metrics
- [YARN-3131] - YarnClientImpl should check FAILED and KILLED state in submitApplication
- [YARN-2894] - When ACL's are enabled, if RM switches then application can not be viewed from web.
- [YARN-3336] - FileSystem memory leak in DelegationTokenRenewer
- [YARN-4761] - NMs reconnecting with changed capabilities can lead to wrong cluster resource calculations on fair scheduler
- [YARN-3160] - Non-atomic operation on nodeUpdateQueue in RMNodeImpl
- [YARN-3104] - RM generates new AMRM tokens every heartbeat between rolling and activation
- [YARN-3071] - Remove invalid char from sample conf in doc of FairScheduler
- [YARN-2243] - Order of arguments for Preconditions.checkNotNull() is wrong in SchedulerApplicationAttempt ctor
- [YARN-2945] - FSLeafQueue#assignContainer - document the reason for using both write and read locks
- [YARN-3753] - RM failed to come up with "java.io.IOException: Wait for ZKClient creation timed out"
- [YARN-2424] - LCE should support non-cgroups, non-secure mode
- [YARN-4005] - Completed container whose app is finished is not removed from NMStateStore
- [YARN-3695] - ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception.
- [YARN-2997] - NM keeps sending already-sent completed containers to RM until containers are removed from context
- [YARN-2731] - Fixed RegisterApplicationMasterResponsePBImpl to properly invoke maybeInitBuilder
- [YARN-4629] - Distributed shell breaks under strong security
- [YARN-4812] - TestFairScheduler#testContinuousScheduling fails intermittently
- [YARN-3604] - removeApplication in ZKRMStateStore should also disable watch.
- [YARN-4717] - TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IllegalArgumentException from cleanup
- [YARN-2893] - AMLauncher: sporadic job failures due to EOFException in readTokenStorageStream
- [YARN-3641] - NodeManager: stopRecoveryStore() shouldn't be skipped when exceptions happen in stopping NM's sub-services.
- [YARN-4613] - TestClientRMService#testGetClusterNodes fails occasionally
- [YARN-4701] - When task logs are not available, port 8041 is referenced instead of port 8042
- [YARN-4729] - SchedulerApplicationAttempt#getTotalRequiredResources can throw an NPE
- [YARN-3304] - ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
- [YARN-4348] - ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding blocking ZK's event thread
- [YARN-3798] - ZKRMStateStore shouldn't create new session without occurrence of SESSIONEXPIRED
- [YARN-4424] - Fix deadlock in RMAppImpl
- [YARN-3102] - Decommissioned Nodes not listed in Web UI
- [YARN-3266] - RMContext inactiveNodes should have NodeId as map key
- [YARN-3733] - Fix DominantRC#compare() does not work as expected if cluster resource is empty
- [YARN-3619] - ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException
- [YARN-2749] - Some testcases from TestLogAggregationService fail in trunk
- [YARN-3369] - Missing NullPointer check in AppSchedulingInfo causes RM to die
- [YARN-4155] - TestLogAggregationService.testLogAggregationServiceWithInterval failing
- [YARN-4573] - TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk
- [YARN-4477] - FairScheduler: Handle condition which can result in an infinite loop in attemptScheduling.
- [YARN-4546] - ResourceManager crash due to scheduling opportunity overflow
- [YARN-3446] - FairScheduler headroom calculation should exclude nodes in the blacklist
- [YARN-4440] - FSAppAttempt#getAllowedLocalityLevelByTime should init the lastScheduler time
- [YARN-4387] - Fix typo in FairScheduler log message
- [YARN-4256] - YARN fair scheduler vcores with decimal values
- [YARN-4066] - Large number of queues choke fair scheduler
- [YARN-3697] - FairScheduler: ContinuousSchedulingThread can fail to shutdown
- [YARN-3878] - AsyncDispatcher can hang while stopping if it is configured for draining events on stop
- [YARN-2991] - TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
- [YARN-3875] - FSSchedulerNode#reserveResource() doesn't print Application Id properly in log
- [YARN-3790] - usedResource from rootQueue metrics may get stale data for FS scheduler after recovering the container
- [YARN-3655] - FairScheduler: potential livelock due to maxAMShare limitation and container reservation
- [YARN-3395] - FairScheduler: Trim whitespaces when using username for queuename
- [YARN-3495] - Confusing log generated by FairScheduler
- [YARN-3415] - Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler queue
- [YARN-3010] - Fix recent findbug issue in AbstractYarnScheduler
- [YARN-3832] - Resource Localization fails on a cluster due to existing cache directories
- [YARN-3925] - ContainerLogsUtils#getContainerLogFile fails to read container log files from full disks.
- [YARN-3850] - NM fails to read files from full disks which can lead to container logs being lost and other issues
- [YARN-1912] - ResourceLocalizer started without any jvm memory control
- [YARN-4354] - Public resource localization fails with NPE
- [YARN-4380] - TestResourceLocalizationService.testDownloadingResourcesOnContainerKill fails intermittently
- [YARN-4393] - TestResourceLocalizationService#testFailedDirsResourceRelease fails intermittently
- [YARN-4398] - YARN recovery functionality causes the cluster to run slowly, with the cluster usage rate far below 100
- [YARN-4408] - NodeManager still reports negative running containers
- [YARN-4235] - FairScheduler PrimaryGroup does not handle empty groups returned for a user
- [YARN-3768] - ArrayIndexOutOfBoundsException with empty environment variables
- [YARN-4347] - Resource manager fails with Null pointer exception
- [YARN-4367] - SLS webapp doesn't load
- [YARN-4302] - SLS not able to start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
- [YARN-4041] - Slow delegation token renewal can severely prolong RM recovery
- [YARN-4284] - condition for AM blacklisting is too narrow
- [YARN-3564] - Fix random failures of TestContainerAllocation.testAMContainerAllocationWhenDNSUnavailable
- [YARN-4270] - Limit application resource reservation on nodes for non-node/rack specific requests
- [YARN-4180] - AMLauncher does not retry on failures when talking to NM
- [YARN-4204] - ConcurrentModificationException in FairSchedulerQueueInfo
- [YARN-3385] - Race condition: KeeperException$NoNodeException will cause RM shutdown during ZK node deletion.
- [YARN-2821] - Distributed shell app master becomes unresponsive sometimes
- [YARN-3602] - TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IOException from cleanup
- [YARN-3982] - container-executor parsing of container-executor.cfg broken in trunk and branch-2
- [YARN-3852] - Add docker container support to container-executor
- [YARN-1519] - check if sysconf is implemented before using it
- [YARN-2847] - Linux native container executor segfaults if default banned user detected
- [YARN-3990] - AsyncDispatcher may be overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
- [YARN-3823] - Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property
- [YARN-2194] - Cgroups cease to work in RHEL7
- [YARN-2809] - Implement workaround for linux kernel panic when removing cgroup
- [YARN-3535] - Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED
- [YARN-3453] - Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
- [YARN-3793] - Several NPEs when deleting local files on NM recovery
- [YARN-3064] - TestRMRestart/TestContainerResourceUsage/TestNodeManagerResync failure with allocation timeout
- [YARN-3835] - hadoop-yarn-server-resourcemanager test package bundles core-site.xml, yarn-site.xml
- [YARN-3842] - NMProxy should retry on NMNotYetReadyException
- [YARN-3143] - RM Apps REST API can return NPE or entries missing id and other fields
- [YARN-3762] - FairScheduler: CME on FSParentQueue#getQueueUserAclInfo
- [YARN-3675] - FairScheduler: RM quits when node removal races with continuous scheduling on the same node
- [YARN-3473] - Fix RM Web UI configuration for some properties
- [YARN-3485] - FairScheduler headroom calculation doesn't consider maxResources for Fifo and FairShare policies
- [YARN-3021] - YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
- [YARN-3464] - Race condition in LocalizerRunner kills localizer before localizing all resources
- [YARN-3516] - killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status.
- [YARN-3241] - FairScheduler handles "invalid" queue names inconsistently
- [YARN-3465] - Use LinkedHashMap to preserve order of resource requests
- [YARN-3024] - LocalizerRunner should give DIE action when all resources are localized
- [YARN-3351] - AppMaster tracking URL is broken in HA
- [YARN-2713] - "RM Home" link in NM should point to one of the RMs in an HA setup
- [YARN-2958] - RMStateStore seems to unnecessarily and wrongly store sequence number separately
- [YARN-3090] - DeletionService can silently ignore deletion task failures
- [YARN-3088] - LinuxContainerExecutor.deleteAsUser can throw NPE if native executor returns an error
- [YARN-3074] - Nodemanager dies when localizer runner tries to write to a full disk
- [YARN-3231] - FairScheduler: Changing queueMaxRunningApps interferes with pending jobs
- [YARN-3222] - RMNodeImpl#ReconnectNodeTransition should send scheduler events in sequential order
- [YARN-3194] - RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node
- [YARN-3238] - Connection timeouts to nodemanagers are retried at multiple levels
- [YARN-3296] - yarn.nodemanager.container-monitor.process-tree.class is configurable but ResourceCalculatorProcessTree class is marked Private
- [YARN-3089] - LinuxContainerExecutor does not handle file arguments to deleteAsUser
- [YARN-2978] - ResourceManager crashes with NPE while getting queue info
- [YARN-3103] - AMRMClientImpl does not update AMRM token properly
- [YARN-2743] - Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed
- [YARN-2340] - NPE thrown when RM restarts after queue is STOPPED. Thereafter RM cannot recover applications and remains in standby
- [YARN-2917] - Potential deadlock in AsyncDispatcher when system.exit called in AsyncDispatcher#dispatch and AsyncDispatcher#serviceStop from shutdown hook
- [YARN-2857] - ConcurrentModificationException in ContainerLogAppender
- [YARN-2972] - DelegationTokenRenewer thread pool never expands
- [YARN-3247] - TestQueueMappings should use CapacityScheduler explicitly
- [YARN-3242] - Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client
- [YARN-2865] - Application recovery continuously fails with "Application with id already present. Cannot duplicate"
- [YARN-3101] - In Fair Scheduler, fix canceling of reservations for exceeding max share
- [YARN-3256] - TestClientToAMTokens#testClientTokenRace is not running against all Schedulers even when using ParameterizedSchedulerTestBase
- [YARN-2990] - FairScheduler's delay-scheduling always waits for node-local and rack-local delays, even for off-rack-only requests
- [YARN-3082] - Non thread safe access to systemCredentials in NodeHeartbeatResponse processing
- [YARN-2964] - RM prematurely cancels tokens for jobs that submit jobs (oozie)
- [YARN-3079] - Scheduler should also update maximumAllocation when updateNodeResource.
- [YARN-2461] - Fix PROCFS_USE_SMAPS_BASED_RSS_ENABLED property in YarnConfiguration
- [YARN-3027] - Scheduler should use totalAvailable resource from node instead of availableResource for maxAllocation
- [YARN-2992] - ZKRMStateStore crashes due to session expiry
- [YARN-2975] - FSLeafQueue app lists are accessed without required locks
- [YARN-2675] - containersKilled metrics is not updated when the container is killed during localization
- [YARN-2910] - FSLeafQueue can throw ConcurrentModificationException
- [YARN-2931] - PublicLocalizer may fail until directory is initialized by LocalizeRunner
- [YARN-2905] - AggregatedLogsBlock page can infinitely loop if the aggregated log file is corrupted
- [YARN-2874] - Deadlock in "DelegationTokenRenewer" which blocks RM from executing any further apps
- [YARN-2414] - RM web UI: app page will crash if app is failed before any attempt has been created
- [YARN-1703] - Too many connections are opened for proxy server when applicationMaster UI is accessed.
- [YARN-2432] - RMStateStore should process the pending events before close
- [YARN-2811] - In Fair Scheduler, reservation fulfillments shouldn't ignore max share
- [YARN-2856] - Application recovery throws InvalidStateTransitonException: Invalid event: ATTEMPT_KILLED at ACCEPTED
- [YARN-2742] - FairSchedulerConfiguration should allow extra spaces between value and unit
- [YARN-2315] - FairScheduler: Set current capacity in addition to capacity
- [YARN-2816] - NM fails to start with NPE during container recovery
- [YARN-2735] - diskUtilizationPercentageCutoff and diskUtilizationSpaceCutoff are initialized twice in DirectoryCollection
- [YARN-570] - Time strings are formatted in different timezones
- [YARN-1553] - Do not use HttpConfig.isSecure() in YARN
Improvement
- [YARN-7261] - Add debug message for better download latency monitoring
- [YARN-7207] - Cache the RM proxy server address
- [YARN-6802] - Add Max AM Resource and AM Resource Usage to Leaf Queue View in FairScheduler WebUI
- [YARN-4995] - FairScheduler: Display per-queue demand on the scheduler page
- [YARN-6752] - Display reserved resources in web UI per application
- [YARN-6751] - Display reserved resources in web UI per queue
- [YARN-2780] - Log aggregated resource allocation in rm-appsummary.log
- [YARN-6381] - FSAppAttempt has several variables that should be final
- [YARN-6042] - Dump scheduler and queue state information into FairScheduler DEBUG log
- [YARN-6194] - Cluster capacity in SchedulingPolicy is updated only on allocation file reload
- [YARN-6125] - The application attempt's diagnostic message should have a maximum size
- [YARN-6061] - Add an UncaughtExceptionHandler for critical threads in RM
- [YARN-2301] - Improve yarn container command
- [YARN-6131] - FairScheduler: Lower update interval for faster tests
- [YARN-4805] - Don't go through all schedulers in ParameterizedTestBase
- [YARN-5181] - ClusterNodeTracker: add method to get list of nodes matching a specific resourceName
- [YARN-4719] - Add a helper library to maintain node state and allow common queries
- [YARN-4544] - All the log messages about rolling monitoring interval are shown with WARN level
- [YARN-3412] - RM tests should use MockRM where possible
- [YARN-5890] - FairScheduler should log information about AM-resource-usage and max-AM-share for queues
- [YARN-2913] - Fair scheduler should have ability to set MaxResourceDefault for each queue
- [YARN-4710] - Reduce logging application reserved debug info in FSAppAttempt#assignContainer
- [YARN-4911] - Bad placement policy in FairScheduler causes the RM to crash
- [YARN-5616] - Clean up WeightAdjuster
- [YARN-5082] - Limit ContainerId increase in fair scheduler if the num of node app reserved reached the limit
- [YARN-5736] - YARN container executor config does not handle white space
- [YARN-3722] - Merge multiple TestWebAppUtils into o.a.h.yarn.webapp.util.TestWebAppUtils
- [YARN-4132] - Separate configs for nodemanager to resourcemanager connection timeout and retries
- [YARN-4245] - Clean up container-executor binary invocation interface
- [YARN-2980] - Move health check script related functionality to hadoop-common
- [YARN-5483] - Optimize RMAppAttempt#pullJustFinishedContainers
- [YARN-4702] - FairScheduler: Allow setting maxResources for ad hoc queues
- [YARN-4958] - The file localization process should allow for wildcards to reduce the application footprint in the state store
- [YARN-4568] - Fix message when NodeManager runs into errors initializing the recovery directory
- [YARN-5035] - FairScheduler: Adjust maxAssign dynamically when assignMultiple is turned on
- [YARN-4878] - Expose scheduling policy and max running apps over JMX for Yarn queues
- [YARN-4579] - Allow DefaultContainerExecutor container log directory permissions to be configurable
- [YARN-3147] - Clean up RM web proxy code
- [YARN-2940] - Fix new findbugs warnings in rest of the hadoop-yarn components
- [YARN-3100] - Make YARN authorization pluggable
- [YARN-4784] - Fairscheduler: defaultQueueSchedulingPolicy should not accept FIFO
- [YARN-4541] - Change log message in LocalizedResource#handle() to DEBUG
- [YARN-4690] - Skip object allocation in FSAppAttempt#getResourceUsage when possible
- [YARN-4436] - DistShell ApplicationMaster.ExecBatScripStringtPath is misspelled
- [YARN-3077] - RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
- [YARN-4560] - Make scheduler error checking message more user friendly
- [YARN-3410] - YARN admin should be able to remove individual application records from RMStateStore
- [YARN-4095] - Avoid sharing AllocatorPerContext object in LocalDirAllocator between ShuffleHandler and LocalDirsHandlerService.
- [YARN-3503] - Expose disk utilization percentage and bad local and log dir counts on NM via JMX
- [YARN-4158] - Remove duplicate close for LogWriter in AppLogAggregatorImpl#uploadLogsForContainers
- [YARN-4072] - ApplicationHistoryServer, WebAppProxyServer, NodeManager and ResourceManager to support JvmPauseMonitor as a service
- [YARN-4031] - Add JvmPauseMonitor to ApplicationHistoryServer and WebAppProxyServer
- [YARN-4019] - Add JvmPauseMonitor to ResourceManager and NodeManager
- [YARN-1156] - Enhance NodeManager AllocatedGB and AvailableGB metrics for aggregation of decimal values
- [YARN-3713] - Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
- [YARN-4697] - NM aggregation thread pool is not bound by limits
- [YARN-4569] - Remove incorrect part of maxResources in FairScheduler documentation
- [YARN-4462] - FairScheduler: Disallow preemption from a queue
- [YARN-2768] - Avoid cloning Resource in FSAppAttempt#updateDemand
- [YARN-3259] - FairScheduler: Trigger fairShare updates on node events
- [YARN-3258] - FairScheduler: Need to add more logging to investigate allocations
- [YARN-2643] - Don't create a new DominantResourceCalculator on every FairScheduler.allocate call
- [YARN-4136] - LinuxContainerExecutor loses info when forwarding ResourceHandlerException
- [YARN-3727] - For better error recovery, check if the directory exists before using it for localization.
- [YARN-2891] - Failed Container Executor does not provide a clear error message
- [YARN-1393] - SLS: Add how-to-use instructions
- [YARN-4310] - FairScheduler: Log skipping reservation messages at DEBUG level
- [YARN-3920] - FairScheduler: Limit node reservations to large containers
- [YARN-3943] - Use separate threshold configurations for disk-full detection and disk-not-full detection.
- [YARN-2005] - Blacklisting support for scheduling AMs
- [YARN-3469] - ZKRMStateStore: Avoid setting watches that are not required
- [YARN-4086] - Allow Aggregated Log readers to handle HAR files
- [YARN-3950] - Add unique YARN_SHELL_ID environment variable to DistributedShell
- [YARN-3350] - YARN RackResolver spams logs with messages at info level
- [YARN-3961] - Expose pending, running and reserved containers of a queue in REST api and yarn top
- [YARN-3348] - Add a 'yarn top' tool to help understand cluster usage
- [YARN-2937] - Fix new findbugs warnings in hadoop-yarn-nodemanager
- [YARN-1287] - Consolidate MockClocks
- [YARN-2921] - Fix MockRM/MockAM#waitForState sleep too long
- [YARN-3467] - Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI
- [YARN-3491] - PublicLocalizer#addResource is too slow.
- [YARN-3363] - add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.
- [YARN-3428] - Debug log resources to be localized for a container
- [YARN-2868] - FairScheduler: Metric for latency to allocate first container for an application
- [YARN-3011] - NM dies because of the failure of resource localization
- [YARN-3122] - Metrics for container's actual CPU usage
- [YARN-668] - TokenIdentifier serialization should consider Unknown fields
- [YARN-2615] - ClientToAMTokenIdentifier and DelegationTokenIdentifier should allow extended fields
- [YARN-3022] - Expose Container resource information from NodeManager for monitoring
- [YARN-2984] - Metrics for container's actual memory usage
- [YARN-2641] - Decommission nodes on -refreshNodes instead of next NM-RM heartbeat
- [YARN-2957] - Create unit test to automatically compare YarnConfiguration and yarn-default.xml
- [YARN-2604] - Scheduler should consider max-allocation-* in conjunction with the largest node
- [YARN-2669] - FairScheduler: queue names shouldn't allow periods
- [YARN-2679] - Add metric for container launch duration
- [YARN-2802] - ClusterMetrics to include AM launch and register delays
- [YARN-2254] - TestRMWebServicesAppsModification should run against both CS and FS
- [YARN-2766] - ApplicationHistoryManager is expected to return a sorted list of apps/attempts/containers
New Feature
- [YARN-3006] - Improve the error message when attempting manual failover with auto-failover enabled
- [YARN-3223] - Resource update during NM graceful decommission
- [YARN-2] - Enhance CS to schedule accounting for both memory and cpu cores
- [YARN-4092] - RM HA UI redirection needs to be fixed when both RMs are in standby mode
- [YARN-3893] - Both RMs in active state when Admin#transitionToActive fails from refreshAll()
- [YARN-5566] - Client-side NM graceful decom is not triggered when jobs finish
- [YARN-5434] - Add -client|server argument for graceful decom
- [YARN-3226] - UI changes for decommissioning node
- [YARN-3212] - RMNode State Transition Update with DECOMMISSIONING state
- [YARN-3445] - Cache runningApps in RMNode for getting running apps on given NodeId
- [YARN-2605] - [RM HA] Rest api endpoints doing redirect incorrectly
- [YARN-4101] - RM should print alert messages if ZooKeeper and ResourceManager have connection issues
- [YARN-2079] - Recover NonAggregatingLogHandler state upon nodemanager restart
- [YARN-3844] - Make hadoop-yarn-project Native code -Wall-clean
- [YARN-3365] - Add support for using the 'tc' tool via container-executor
- [YARN-2619] - NodeManager: Add cgroups support for disk I/O isolation
- [YARN-3366] - Outbound network bandwidth : classify/shape traffic originating from YARN containers
- [YARN-3443] - Create a 'ResourceHandler' subsystem to ease addition of support for new resource types on the NM
- [YARN-1514] - Utility to benchmark ZKRMStateStore#loadState for ResourceManager-HA
- [YARN-2404] - Remove ApplicationAttemptState and ApplicationState class in RMStateStore class
- [YARN-2360] - Fair Scheduler: Display dynamic fair share for queues on the scheduler page
- [YARN-1898] - Standby RM's conf, stacks, logLevel, metrics, jmx and logs links are redirecting to Active RM
Task
- [YARN-5308] - FairScheduler: Move continuous scheduling related tests to TestContinuousScheduling
- [YARN-2902] - Killing a container that is localizing can orphan resources in the DOWNLOADING state
- [YARN-5704] - Provide config knobs to control enabling/disabling new/work in progress features in container-executor
- [YARN-3217] - Remove httpclient dependency from hadoop-yarn-server-web-proxy
- [YARN-90] - NodeManager should identify failed disks becoming good again
- [YARN-3136] - getTransferredContainers can be a bottleneck during AM registration
- [YARN-3505] - Node's Log Aggregation Report with SUCCEED should not be cached in RMApps
- [YARN-1402] - Related Web UI, CLI changes on exposing client API to check log aggregation status
- [YARN-1376] - NM needs to notify the log aggregation status to RM through Node heartbeat
- [YARN-2581] - NMs need to find a way to get LogAggregationContext
Test
- [YARN-4363] - In TestFairScheduler, testcase should not create FairScheduler redundantly
- [YARN-4555] - TestDefaultContainerExecutor#testContainerLaunchError fails on non-english locale environment
- [YARN-5608] - TestAMRMClient.setup() fails with ArrayOutOfBoundsException
- [YARN-4989] - TestWorkPreservingRMRestart#testCapacitySchedulerRecovery fails intermittently
- [YARN-5024] - TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers random failure
- [YARN-5343] - TestContinuousScheduling#testSortedNodes fails intermittently
- [YARN-2871] - TestRMRestart#testRMRestartGetApplicationList sometimes fails in trunk
- [YARN-4704] - TestResourceManager#testResourceAllocation() fails when using FairScheduler
- [YARN-2666] - TestFairScheduler.testContinuousScheduling fails Intermittently
- [YARN-1979] - TestDirectoryCollection fails when the umask is unusual
HDFS
Bug
- [HDFS-7582] - Enforce maximum number of ACL entries separately per access and default.
- [HDFS-12518] - Re-encryption should handle task cancellation and progress better
- [HDFS-11755] - Underconstruction blocks can be considered missing
- [HDFS-11445] - FSCK shows overall health status as corrupt even if only one replica is corrupt
- [HDFS-12494] - libhdfs SIGSEGV in setTLSExceptionStrings
- [HDFS-9396] - Total files and directories on jmx and web UI on standby is uninitialized
- [HDFS-6763] - Initialize file system-wide quota once on transitioning to active
- [HDFS-9003] - ForkJoin thread pool leaks
- [HDFS-12424] - Datatable sorting on the Datanode Information page in the Namenode UI is broken
- [HDFS-12458] - TestReencryptionWithKMS fails regularly
- [HDFS-11851] - getGlobalJNIEnv() may deadlock if exception is thrown
- [HDFS-12357] - Let NameNode bypass external attribute provider for special user
- [HDFS-12191] - Provide option to not capture the accessTime change of a file to snapshot if no other modification has been done to this file
- [HDFS-12369] - Edit log corruption due to hard lease recovery of not-closed file which has snapshots
- [HDFS-12400] - Provide a way for NN to drain the local key cache before re-encryption
- [HDFS-12359] - Re-encryption should operate with minimum KMS ACL requirements.
- [HDFS-12363] - Possible NPE in BlockManager$StorageInfoDefragmenter#scanAndCompactStorages
- [HDFS-12383] - Re-encryption updater should handle canceled tasks better
- [HDFS-12336] - Listing encryption zones still fails when deleted EZ is not a direct child of snapshottable directory
- [HDFS-11849] - JournalNode startup failure exception should be logged in log file
- [HDFS-11833] - HDFS architecture documentation describes outdated placement policy
- [HDFS-12157] - Do fsyncDirectory(..) outside of FSDataset lock
- [HDFS-5042] - Completed files lost after power failure
- [HDFS-11738] - Hedged pread takes more time when block moved from initial locations
- [HDFS-11945] - Internal lease recovery may not be retried for a long time
- [HDFS-11377] - Balancer hung due to no available mover threads
- [HDFS-11932] - BPServiceActor thread name is not correctly set
- [HDFS-11303] - Hedged read might hang infinitely if reading data from all DNs failed
- [HDFS-10468] - HDFS read ends up ignoring an interrupt
- [HDFS-11711] - DN should not delete the block On "Too many open files" Exception
- [HDFS-12278] - LeaseManager operations are inefficient in 2.8.
- [HDFS-7847] - Modify NNThroughputBenchmark to be able to operate on a remote NameNode
- [HDFS-12217] - HDFS snapshots don't capture all open files when one of the open files is deleted
- [HDFS-8312] - Trash does not descend into child directories to check for permissions
- [HDFS-11472] - Fix inconsistent replica size after a data pipeline failure
- [HDFS-12140] - Remove BPOfferService lock contention to get block pool id
- [HDFS-11674] - reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered
- [HDFS-7886] - TestFileTruncate#testTruncateWithDataNodesRestart sometimes times out
- [HDFS-11736] - OIV tests should not write outside 'target' directory.
- [HDFS-12089] - Fix ambiguous NN retry log message
- [HDFS-10816] - TestComputeInvalidateWork#testDatanodeReRegistration fails due to race between test and replication monitor
- [HDFS-11967] - TestJMXGet fails occasionally
- [HDFS-12112] - TestBlockManager#testBlockManagerMachinesArray sometimes fails with NPE
- [HDFS-12139] - HTTPFS liststatus returns incorrect pathSuffix for path of file
- [HDFS-8870] - Lease is leaked on write failure
- [HDFS-11960] - Successfully closed files can stay under-replicated.
- [HDFS-10506] - OIV's ReverseXML processor cannot reconstruct some snapshot details
- [HDFS-8856] - Make LeaseManager#countPath O(1)
- [HDFS-10220] - A large number of expired leases can make namenode unresponsive and cause failover
- [HDFS-10985] - o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions
- [HDFS-11708] - Positional read will fail if replicas moved to different DNs after stream is opened
- [HDFS-9672] - o.a.h.hdfs.TestLeaseRecovery2 fails intermittently
- [HDFS-11198] - NN UI should link DN web address using hostnames
- [HDFS-11340] - DataNode reconfigure for disks doesn't remove the failed volumes
- [HDFS-10301] - BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
- [HDFS-11741] - Long running balancer may fail due to expired DataEncryptionKey
- [HDFS-11732] - Backport HDFS-8498 to branch-2.7: Blocks can be committed with wrong size
- [HDFS-10797] - Disk usage summary of snapshots causes renamed blocks to get counted twice
- [HDFS-11515] - -du throws ConcurrentModificationException
- [HDFS-7959] - WebHdfs logging is missing on Datanode
- [HDFS-11702] - Remove indefinite caching of key provider uri in DFSClient
- [HDFS-8586] - Dead Datanode is allocated for write when client is from deadnode
- [HDFS-11714] - Newly added NN storage directory won't get initialized and cause space exhaustion
- [HDFS-10987] - Make Decommission less expensive when lot of blocks present.
- [HDFS-11352] - Potential deadlock in NN when failing over
- [HDFS-11391] - Numeric usernames do not work with WebHDFS FS (write access)
- [HDFS-11609] - Some blocks can be permanently lost if nodes are decommissioned while dead
- [HDFS-7939] - Two fsimage_rollback_* files are created which are not deleted after rollback.
- [HDFS-11709] - StandbyCheckpointer should handle a non-existing legacyOivImageDir gracefully
- [HDFS-11529] - Add libHDFS API to return last exception
- [HDFS-11724] - libhdfs compilation is broken on OS X
- [HDFS-9670] - DistCp throws NPE when source is root
- [HDFS-10642] - TestLazyPersistReplicaRecovery#testDnRestartWithSavedReplicas fails intermittently
- [HDFS-11689] - New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
- [HDFS-10715] - NPE when applying AvailableSpaceBlockPlacementPolicy
- [HDFS-9599] - TestDecommissioningStatus.testDecommissionStatus occasionally fails
- [HDFS-11499] - Decommissioning stuck because of failing recovery
- [HDFS-11441] - Add escaping to error message in KMS web UI
- [HDFS-11378] - Verify multiple DataNodes can be decommissioned/maintenance at the same time
- [HDFS-10808] - DiskBalancer does not execute multi-steps plan-redux
- [HDFS-10559] - DiskBalancer: Use SHA1 for Plan ID
- [HDFS-8307] - Spurious DNS Queries from hdfs shell
- [HDFS-10713] - Throttle FsNameSystem lock warnings
- [HDFS-9467] - Fix data race accessing writeLockHeldTimeStamp in FSNamesystem
- [HDFS-10966] - Enhance Dispatcher logic on deciding when to give up a source DataNode
- [HDFS-11280] - Allow WebHDFS to reuse HTTP connections to NN
- [HDFS-11379] - DFSInputStream may infinite loop requesting block locations
- [HDFS-11087] - NamenodeFsck should check if the output writer is still writable.
- [HDFS-11263] - ClassCastException when we use Bzipcodec for Fsimage compression
- [HDFS-11180] - Intermittent deadlock in NameNode when failover happens.
- [HDFS-10733] - NameNode terminated after full GC thinking QJM is unresponsive.
- [HDFS-11363] - Need more diagnosis info when seeing Slow waitForAckedSeqno
- [HDFS-3716] - Purger should remove stale fsimage ckpt files
- [HDFS-11160] - VolumeScanner reports write-in-progress replicas as corrupt incorrectly
- [HDFS-11271] - Typo in NameNode UI
- [HDFS-4366] - Block Replication Policy Implementation May Skip Higher-Priority Blocks for Lower-Priority Blocks
- [HDFS-11132] - Allow AccessControlException in contract tests when getFileStatus on subdirectory of existing files
- [HDFS-11056] - Concurrent append and read operations lead to checksum error
- [HDFS-11197] - Listing encryption zones fails when deleting an EZ that is on a snapshotted directory
- [HDFS-9885] - Correct the distcp counters name while displaying counters
- [HDFS-10915] - Fix time measurement bug in TestDatanodeRestart#testWaitForRegistrationOnRestart
- [HDFS-10878] - TestDFSClientRetries#testIdempotentAllocateBlockAndClose throws ConcurrentModificationException
- [HDFS-9444] - Add utility to find set of available ephemeral ports to ServerSocketUtil
- [HDFS-10627] - Volume Scanner marks a block as "suspect" even if the exception is network-related
- [HDFS-11053] - Unnecessary superuser check in versionRequest()
- [HDFS-11015] - Enforce timeout in balancer
- [HDFS-10656] - Optimize conversion of byte arrays back to path string
- [HDFS-10655] - Fix path related byte array conversion bugs
- [HDFS-10809] - getNumEncryptionZones causes NPE in branch-2.7
- [HDFS-9500] - datanodesSoftwareVersions map may count incorrectly during rolling upgrade
- [HDFS-8721] - Add a metric for number of encryption zones
- [HDFS-10674] - Optimize creating a full path from an inode
- [HDFS-10694] - BlockManager.processReport() should print blockReportId in each log message.
- [HDFS-9839] - Reduce verbosity of processReport logging
- [HDFS-10276] - HDFS should not expose path info that user has no permission to see.
- [HDFS-8405] - Fix a typo in NamenodeFsck
- [HDFS-10883] - `getTrashRoot`'s behavior is not consistent in DFS after enabling EZ.
- [HDFS-9724] - Degraded performance in WebHDFS listing as it does not reuse ObjectMapper
- [HDFS-8542] - WebHDFS getHomeDirectory behavior does not match specification
- [HDFS-8037] - CheckAccess in WebHDFS silently accepts malformed FsActions parameters
- [HDFS-7224] - Allow reuse of NN connections via webhdfs
- [HDFS-10312] - Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction.
- [HDFS-10763] - Open files can leak permanently due to inconsistent lease update
- [HDFS-9279] - Decommissioned capacity should not be considered for configured/used capacity
- [HDFS-10870] - Wrong dfs.namenode.acls.enabled default in HdfsPermissionsGuide.apt.vm
- [HDFS-10505] - OIV's ReverseXML processor should support ACLs
- [HDFS-9934] - ReverseXML oiv processor should bail out if the XML file's layoutVersion doesn't match oiv's
- [HDFS-10556] - DistCpOptions should be validated automatically
- [HDFS-10397] - Distcp should ignore -delete option if -diff option is provided instead of exiting
- [HDFS-10313] - Distcp needs to enforce the order of snapshot names passed to -diff
- [HDFS-8809] - HDFS fsck reports under construction blocks as "CORRUPT"
- [HDFS-6962] - ACL inheritance conflicts with umaskmode
- [HDFS-10216] - distcp -diff relative path exception
- [HDFS-10722] - Fix race condition in TestEditLog#testBatchedSyncWithClosedLogs
- [HDFS-10423] - Increase default value of httpfs maxHttpHeaderSize
- [HDFS-9038] - DFS reserved space is erroneously counted towards non-DFS used.
- [HDFS-10747] - o.a.h.hdfs.tools.DebugAdmin usage message is misleading
- [HDFS-742] - A down DataNode makes Balancer hang by repeatedly asking NameNode for its partial block list
- [HDFS-9790] - HDFS Balancer should exit with a proper message if upgrade is not finalized
- [HDFS-8923] - Add -source flag to balancer usage message
- [HDFS-10960] - TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails at disk error verification after volume remove
- [HDFS-10760] - DataXceiver#run() should not log InvalidToken exception as an error
- [HDFS-10962] - TestRequestHedgingProxyProvider is flaky
- [HDFS-10609] - Uncaught InvalidEncryptionKeyException during pipeline recovery may abort downstream applications
- [HDFS-10738] - Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
- [HDFS-8224] - Schedule a block for scanning if its metadata file is corrupt
- [HDFS-9764] - DistCp doesn't print value for several arguments including -numListstatusThreads
- [HDFS-10879] - TestEncryptionZonesWithKMS#testReadWrite fails intermittently
- [HDFS-10832] - Propagate ACL bit and isEncrypted bit in HttpFS FileStatus permissions
- [HDFS-10641] - TestBlockManager#testBlockReportQueueing fails intermittently
- [HDFS-10599] - DiskBalancer: Execute CLI via Shell
- [HDFS-9781] - FsDatasetImpl#getBlockReports can occasionally throw NullPointerException
- [HDFS-10553] - DiskBalancer: Rename Tools/DiskBalancer class to Tools/DiskBalancerCLI
- [HDFS-9063] - Correctly handle snapshot path for getContentSummary
- [HDFS-10415] - TestDistributedFileSystem#MyDistributedFileSystem attempts to set up statistics before initialize() is called
- [HDFS-10291] - TestShortCircuitLocalRead failing
- [HDFS-10277] - PositionedReadable test testReadFullyZeroByteFile failing in HDFS
- [HDFS-4660] - Block corruption can happen during pipeline recovery
- [HDFS-9696] - Garbage snapshot records lingering forever
- [HDFS-10270] - TestJMXGet:testNameNode() fails
- [HDFS-4210] - Throw helpful exception when DNS entry for JournalNode cannot be resolved
- [HDFS-10567] - Improve plan command help message
- [HDFS-8269] - getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
- [HDFS-10549] - Correctly revoke file leases when closing files
- [HDFS-7517] - Remove redundant non-null checks in FSNamesystem#getBlockLocations
- [HDFS-10458] - getFileEncryptionInfo should return quickly for non-encrypted cluster
- [HDFS-9958] - BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.
- [HDFS-9533] - seen_txid in the shared edits directory is modified during bootstrapping
- [HDFS-9530] - ReservedSpace is not cleared for abandoned Blocks
- [HDFS-8780] - Fetching live/dead datanode list with arg true for removeDecommissionNode returns list with decom node.
- [HDFS-10691] - FileDistribution fails in hdfs oiv command due to ArrayIndexOutOfBoundsException
- [HDFS-8682] - Should not remove decommissioned node while calculating the number of live/dead decommissioned nodes.
- [HDFS-10474] - hftp copy fails when file name with Chinese+special char in branch-2
- [HDFS-10457] - DataNode should not auto-format block pool directory if VERSION is missing
- [HDFS-9413] - getContentSummary() on standby should throw StandbyException
- [HDFS-8897] - Balancer should handle fs.defaultFS trailing slash in HA
- [HDFS-10544] - Balancer doesn't work with IPFailoverProxyProvider
- [HDFS-10693] - metaSave should print blocks, not LightWeightHashSet
- [HDFS-10335] - Mover$Processor#chooseTarget() always chooses the first matching target storage group
- [HDFS-10245] - Fix the findbug warnings in branch-2.7
- [HDFS-10623] - Remove unused import of httpclient.HttpConnection from TestWebHdfsTokens.
- [HDFS-9476] - TestDFSUpgradeFromImage#testUpgradeFromRel1BBWImage occasionally fails
- [HDFS-10347] - Namenode report bad block method doesn't log the bad block or datanode.
- [HDFS-9048] - DistCp documentation is out of date
- [HDFS-9033] - dfsadmin -metasave prints "NaN" for cache used%
- [HDFS-10681] - DiskBalancer: query command should report Plan file path apart from PlanID
- [HDFS-8633] - Fix setting of dfs.datanode.readahead.bytes in hdfs-default.xml to match DFSConfigKeys
- [HDFS-8600] - TestWebHdfsFileSystemContract.testGetFileBlockLocations fails in branch-2.7
- [HDFS-7934] - Update RollingUpgrade rollback documentation: should use bootstrapstandby for standby NN
- [HDFS-10653] - Optimize conversion from path string to components
- [HDFS-10643] - Namenode should use loginUser(hdfs) to generateEncryptedKey
- [HDFS-4176] - EditLogTailer should call rollEdits with a timeout
- [HDFS-10716] - In Balancer, the target task should be removed when its size < 0.
- [HDFS-10336] - TestBalancer failing intermittently because of not resetting UserGroupInformation completely
- [HDFS-6434] - Default permission for creating file should be 644 for WebHdfs/HttpFS
- [HDFS-10588] - False alarm in datanode log - ERROR - Disk Balancer is not enabled
- [HDFS-9276] - Failed to Update HDFS Delegation Token for long running application in HA mode
- [HDFS-10688] - BPServiceActor may run into a tight loop for sending block report when hitting IOException
- [HDFS-10598] - DiskBalancer does not execute multi-steps plan.
- [HDFS-9765] - TestBlockScanner#testVolumeIteratorWithCaching fails intermittently
- [HDFS-9939] - Increase DecompressorStream skip buffer size
- [HDFS-10600] - PlanCommand#getThrsholdPercentage should not use throughput value.
- [HDFS-10512] - VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks
- [HDFS-9137] - DeadLock between DataNode#refreshVolumes and BPOfferService#registrationSucceeded
- [HDFS-9141] - Thread leak in Datanode#refreshVolumes
- [HDFS-9466] - TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
- [HDFS-10396] - Using -diff option with DistCp may get "Comparison method violates its general contract" exception
- [HDFS-10463] - TestRollingFileSystemSinkWithHdfs needs some cleanup
- [HDFS-6376] - Distcp data between two HA clusters requires another configuration
- [HDFS-10525] - Fix NPE in CacheReplicationMonitor#rescanCachedBlockMap
- [HDFS-8581] - ContentSummary on / skips further counts on yielding lock
- [HDFS-10516] - Fix bug when warming up EDEK cache of more than one encryption zone
- [HDFS-10481] - HTTPFS server should correctly impersonate as end user to open file
- [HDFS-10360] - DataNode may format directory and lose blocks if current/VERSION is missing
- [HDFS-10381] - DataStreamer DataNode exclusion log message should be warning
- [HDFS-10324] - Trash directory in an encryption zone should be pre-created with correct permissions
- [HDFS-10372] - Fix for failing TestFsDatasetImpl#testCleanShutdownOfVolume
- [HDFS-9428] - Fix intermittent failure of TestDNFencing.testQueueingWithAppend
- [HDFS-8142] - DistributedFileSystem encryption zone commands should resolve relative paths
- [HDFS-9917] - IBR accumulates more objects when SNN was down for some time.
- [HDFS-10344] - DistributedFileSystem#getTrashRoots should skip encryption zone that does not have .Trash
- [HDFS-10260] - TestFsDatasetImpl#testCleanShutdownOfVolume often fails
- [HDFS-9874] - Long living DataXceiver threads cause volume shutdown to block.
- [HDFS-9812] - Streamer threads leak if failure happens when closing DFSOutputStream
- [HDFS-2043] - TestHFlush failing intermittently
- [HDFS-9478] - Reason for failing ipc.FairCallQueue construction should be thrown
- [HDFS-10186] - DirectoryScanner: Improve logs by adding full path of both actual and expected block directories
- [HDFS-10320] - Rack failures may result in NN termination
- [HDFS-10178] - Permanent write failures can happen if pipeline recoveries occur for the first packet
- [HDFS-10271] - Extra bytes are getting released from reservedSpace for append
- [HDFS-7261] - storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState()
- [HDFS-10182] - Hedged read might overwrite user's buf
- [HDFS-9555] - LazyPersistFileScrubber should still sleep if there are errors in the clear progress
- [HDFS-10319] - Balancer should not try to pair storages with different types
- [HDFS-10239] - Fsshell mv fails if port usage doesn't match in src and destination paths
- [HDFS-9752] - Permanent write failures may happen to slow writers during datanode rolling upgrades
- [HDFS-2956] - calling fetchdt without a --renewer argument throws NPE
- [HDFS-3519] - Checkpoint upload may interfere with a concurrent saveNamespace
- [HDFS-9589] - Block files which have been hardlinked should be duplicated before the DataNode appends to them
- [HDFS-6520] - hdfs fsck -move passes invalid length value when creating BlockReader
- [HDFS-10267] - Extra "synchronized" on FsDatasetImpl#recoverAppend and FsDatasetImpl#recoverClose
- [HDFS-8496] - Calling stopWriter() with FSDatasetImpl lock held may block other threads
- [HDFS-8855] - Webhdfs client leaks active NameNode connections
- [HDFS-9730] - Storage ID update does not happen when there is a layout change
- [HDFS-10223] - peerFromSocketAndKey performs SASL exchange before setting connection timeouts
- [HDFS-10197] - TestFsDatasetCache failing intermittently due to timeout
- [HDFS-7452] - skip StandbyException log for getCorruptFiles()
- [HDFS-7166] - SbNN Web UI shows #Under replicated blocks and #pending deletion blocks
- [HDFS-8211] - DataNode UUID is always null in the JMX counter
- [HDFS-9947] - Block#toString should not output information from derived classes
- [HDFS-7697] - Mark the PB OIV tool as experimental
- [HDFS-9766] - TestDataNodeMetrics#testDataNodeTimeSpend fails intermittently
- [HDFS-9881] - DistributedFileSystem#getTrashRoot returns incorrect path for encryption zones
- [HDFS-9858] - RollingFileSystemSink can throw an NPE on non-secure clusters
- [HDFS-9844] - Correct path creation in getTrashRoot to handle root dir
- [HDFS-9549] - TestCacheDirectives#testExceedsCapacity is flaky
- [HDFS-9584] - NPE in distcp when ssl configuration file does not exist in class path.
- [HDFS-9608] - Disk IO imbalance in HDFS with heterogeneous storages
- [HDFS-9799] - Reimplement getCurrentTrashDir to remove incompatibility
- [HDFS-9780] - RollingFileSystemSink doesn't work on secure clusters
- [HDFS-8153] - Error Message points to wrong parent directory in case of path component name length error
- [HDFS-9426] - Rollingupgrade finalization is not backward compatible
- [HDFS-9294] - DFSClient deadlock when close file and failed to renew lease
- [HDFS-7725] - Incorrect "nodes in service" metrics caused all writes to fail
- [HDFS-8767] - RawLocalFileSystem.listStatus() returns null for UNIX pipefile
- [HDFS-9305] - Delayed heartbeat processing causes storm of subsequent heartbeats
- [HDFS-9178] - Slow datanode I/O can cause a wrong node to be marked bad
- [HDFS-9634] - webhdfs client side exceptions don't provide enough details
- [HDFS-9661] - Deadlock in DN.FsDatasetImpl between moveBlockAcrossStorage and createRbw
- [HDFS-7163] - WebHdfsFileSystem should retry reads according to the configured retry policy.
- [HDFS-4937] - ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
- [HDFS-8676] - Delayed rolling upgrade finalization can cause heartbeat expiration and write failures
- [HDFS-9106] - Transfer failure during pipeline recovery causes permanent write failures
- [HDFS-5215] - dfs.datanode.du.reserved is not considered while computing available space
- [HDFS-9600] - do not check replication if the block is under construction
- [HDFS-8576] - Lease recovery should return true if the lease can be released and the file can be closed
- [HDFS-8595] - TestCommitBlockSynchronization fails in branch-2.7
- [HDFS-9574] - Reduce client failures during datanode restart
- [HDFS-8522] - Change heavily recorded NN logs from INFO to DEBUG level
- [HDFS-9701] - DN may deadlock when hot-swapping under load
- [HDFS-9406] - FSImage may get corrupted after deleting snapshot
- [HDFS-9514] - TestDistributedFileSystem.testDFSClientPeerWriteTimeout failing; exception being swallowed
- [HDFS-8785] - TestDistributedFileSystem is failing in trunk
- [HDFS-9289] - Make DataStreamer#block thread safe and verify genStamp in commitBlock
- [HDFS-7373] - Clean up temporary files after fsimage transfer failures
- [HDFS-9612] - DistCp worker threads are not terminated after jobs are done.
- [HDFS-8311] - DataStreamer.transfer() should timeout the socket InputStream.
- [HDFS-9655] - NN should start JVM pause monitor before loading fsimage
- [HDFS-9347] - Invariant assumption in TestQuorumJournalManager.shutdown() is wrong
- [HDFS-9458] - TestBackupNode always binds to port 50070, which can cause bind failures.
- [HDFS-9174] - Fix findbugs warnings in FSOutputSummer.tracer and DirectoryScanner$ReportCompiler.currentThread
- [HDFS-9274] - Default value of dfs.datanode.directoryscan.throttle.limit.ms.per.sec should be consistent
- [HDFS-9176] - TestDirectoryScanner#testThrottling often fails.
- [HDFS-7819] - Log WARN message for the blocks which are not in Block ID based layout
- [HDFS-9619] - SimulatedFSDataset sometimes can not find blockpool for the correct namenode
- [HDFS-9356] - Decommissioning node does not have Last Contact value in the UI
- [HDFS-9357] - NN UI renders icons of decommissioned DN incorrectly
- [HDFS-8779] - WebUI fails to display block IDs that are larger than 2^53 - 1
- [HDFS-9193] - Fix incorrect references the usages of the DN in dfshealth.js
- [HDFS-8388] - Time and Date format need to be in sync in NameNode UI page
- [HDFS-8292] - Move conditional in fmt_time from dfs-dust.js to status.html
- [HDFS-8214] - Secondary NN Web UI shows wrong date for Last Checkpoint
- [HDFS-7953] - NN Web UI fails to navigate to paths that contain #
- [HDFS-8470] - fsimage loading progress should update inode, delegation token and cache pool count.
- [HDFS-9358] - TestNodeCount#testNodeCount timed out
- [HDFS-9565] - TestDistributedFileSystem.testLocatedFileStatusStorageIdsTypes is flaky due to race condition
- [HDFS-9445] - Datanode may deadlock while handling a bad volume
- [HDFS-9519] - Some coding improvement in SecondaryNameNode#main
- [HDFS-6348] - SecondaryNameNode not terminating properly on runtime exceptions
- [HDFS-6533] - TestBPOfferService#testBasicFunctionalitytest fails intermittently
- [HDFS-9313] - Possible NullPointerException in BlockManager if no excess replica can be chosen
- [HDFS-9083] - Replication violates block placement policy.
- [HDFS-9073] - Fix failures in TestLazyPersistLockedMemory#testReleaseOnEviction
- [HDFS-9072] - Fix random failures in TestJMXGet
- [HDFS-9067] - o.a.h.hdfs.server.datanode.fsdataset.impl.TestLazyWriter is failing in trunk
- [HDFS-9470] - Encryption zone on root not loaded from fsimage after NN restart
- [HDFS-9220] - Reading small file (< 512 bytes) that is open for append fails due to incorrect checksum
- [HDFS-6101] - TestReplaceDatanodeOnFailure fails occasionally
- [HDFS-6694] - TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
- [HDFS-9236] - Missing sanity check for block size during block recovery
- [HDFS-8772] - Fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails
- [HDFS-9364] - Unnecessary DNS resolution attempts when creating NameNodeProxies
- [HDFS-9249] - NPE is thrown if an IOException is thrown in NameNode constructor
- [HDFS-9351] - checkNNStartup() needs to be called when fsck calls FSNamesystem.getSnapshottableDirs()
- [HDFS-9231] - fsck doesn't list correct file path when Bad Replicas/Blocks are in a snapshot
- [HDFS-1522] - Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one constant
- [HDFS-9329] - TestBootstrapStandby#testRateThrottling is flaky because fsimage size is smaller than IO buffer size
- [HDFS-9332] - Fix Precondition failures from NameNodeEditLogRoller while saving namespace
- [HDFS-9268] - fuse_dfs chown crashes when uid is passed as -1
- [HDFS-7798] - Checkpointing failure caused by shared KerberosAuthenticator
- [HDFS-9290] - DFSClient#callAppend() is not backward compatible for slightly older NameNodes
- [HDFS-3059] - ssl-server.xml causes NullPointer
- [HDFS-9286] - HttpFs does not parse ACL syntax correctly for operation REMOVEACLENTRIES
- [HDFS-9273] - ACLs on root directory may be lost after NN restart
- [HDFS-6753] - Initialize checkDisk when DirectoryScanner not able to get files list for scanning
- [HDFS-7533] - Datanode sometimes does not shutdown on receiving upgrade shutdown command
- [HDFS-7282] - Fix intermittent TestShortCircuitCache and TestBlockReaderFactory failures resulting from TemporarySocketDirectory GC
- [HDFS-9100] - HDFS Balancer does not respect dfs.client.use.datanode.hostname
- [HDFS-9147] - Fix the setting of visibleLength in ExternalBlockReader
- [HDFS-7609] - Avoid retry cache collision when Standby NameNode loading edits
- [HDFS-8270] - create() always retried with hardcoded timeout when file already exists with open lease
- [HDFS-9092] - Nfs silently drops overlapping write requests and causes data copying to fail
- [HDFS-9001] - DFSUtil.getNsServiceRpcUris() can return too many entries in a non-HA, non-federated cluster
- [HDFS-8245] - Standby namenode doesn't process DELETED_BLOCK if the add block request is in edit log.
- [HDFS-9107] - Prevent NN's unrecoverable death spiral after full GC
- [HDFS-8273] - FSNamesystem#Delete() should not call logSync() when holding the lock
- [HDFS-8219] - setStoragePolicy with folder behavior is different after cluster restart
- [HDFS-7885] - Datanode should not trust the generation stamp provided by client
- [HDFS-8147] - Mover should not schedule two replicas to the same DN storage
- [HDFS-7999] - FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time
- [HDFS-7930] - commitBlockSynchronization() does not remove locations
- [HDFS-7871] - NameNodeEditLogRoller can keep printing "Swallowing exception" message
- [HDFS-6833] - DirectoryScanner should not register a deleting block with memory of DataNode
- [HDFS-9080] - update htrace version to 4.0.1
- [HDFS-7425] - NameNode block deletion logging uses incorrect appender.
- [HDFS-8568] - TestClusterId#testFormatWithEmptyClusterIdOption is failing
- [HDFS-7733] - NFS: readdir/readdirplus return null directory attribute on failure
- [HDFS-7763] - fix zkfc hung issue due to not catching exception in a corner case
- [HDFS-9133] - ExternalBlockReader and ReplicaAccessor need to return -1 on read when at EOF
- [HDFS-8046] - Allow better control of getContentSummary
- [HDFS-8950] - NameNode refresh doesn't remove DataNodes that are no longer in the allowed list
- [HDFS-8429] - Avoid stuck threads if there is an error in DomainSocketWatcher that stops the thread
- [HDFS-9123] - Copying from the root to a subdirectory should be forbidden
- [HDFS-8626] - Reserved RBW space is not released if creation of RBW File fails
- [HDFS-8863] - The remaining space check in BlockPlacementPolicyDefault is flawed
- [HDFS-7009] - Active NN and standby NN have different live nodes
- [HDFS-8226] - Non-HA rollback compatibility broken
- [HDFS-7997] - The first non-existing xattr should also throw IOException
- [HDFS-7714] - Simultaneous restart of HA NameNodes and DataNode can cause DataNode to register successfully with only one NameNode.
- [HDFS-7742] - favoring decommissioning node for replication can cause a block to stay underreplicated for long periods
- [HDFS-8451] - DFSClient probe for encryption testing interprets empty URI property for "enabled"
- [HDFS-8404] - Pending block replication can get stuck using older genstamp
- [HDFS-6945] - BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed
- [HDFS-8321] - CacheDirectives and CachePool operations should throw RetriableException in safemode
- [HDFS-8174] - Update replication count to live rep count in fsck report
- [HDFS-6291] - FSImage may be left unclosed in BootstrapStandby#doRun()
- [HDFS-3384] - DataStreamer thread should be closed immediately when failed to setup a PipelineForAppendOrRecovery
- [HDFS-8179] - DFSClient#getServerDefaults returns null within 1 hour of system start
- [HDFS-7785] - Improve diagnostics information for HttpPutFailedException
- [HDFS-8001] - RpcProgramNfs3 : wrong parsing of dfs.blocksize
- [HDFS-8995] - Flaw in registration bookkeeping can make DN die on reconnect
- [HDFS-8551] - Fix hdfs datanode CLI usage message
- [HDFS-8552] - Fix hdfs CLI usage message for namenode and zkfc
- [HDFS-8565] - Typo in dfshealth.html - "Decomissioning"
- [HDFS-5802] - NameNode does not check for inode type before traversing down a path
- [HDFS-8407] - hdfsListDirectory must set errno to 0 on success
- [HDFS-7807] - libhdfs htable.c: fix htable resizing, add unit test
- [HDFS-8051] - FsVolumeList#addVolume should release volume reference if not put it into BlockScanner.
- [HDFS-8486] - DN startup may cause severe data loss
- [HDFS-7637] - Fix the check condition for reserved path
- [HDFS-8964] - When validating the edit log, do not read at or beyond the file offset that is being written
- [HDFS-7916] - 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop
- [HDFS-8846] - Add a unit test for INotify functionality across a layout version upgrade
- [HDFS-7929] - inotify unable to fetch pre-upgrade edit log segments once upgrade starts
- [HDFS-8845] - DiskChecker should not traverse the entire tree
- [HDFS-8930] - Block report lease may leak if the 2nd full block report comes when NN is still in safemode
- [HDFS-8867] - Enable optimized block reports
- [HDFS-8596] - TestDistributedFileSystem et al tests are broken in branch-2 due to incorrect setting of "datanode" attribute
- [HDFS-8572] - DN always uses HTTP/localhost@REALM principals in SPNEGO
- [HDFS-8173] - NPE thrown at DataNode shutdown when HTTP server was not able to create
- [HDFS-7816] - Unable to open webhdfs paths with "+"
- [HDFS-7945] - The WebHdfs system on DN does not honor the length parameter
- [HDFS-7818] - OffsetParam should return the default value instead of throwing NPE when the value is unspecified
- [HDFS-6662] - WebHDFS cannot open a file if its path contains "%"
- [HDFS-7406] - SimpleHttpProxyHandler puts incorrect "Connection: Close" header
- [HDFS-7277] - Remove explicit dependency on netty 3.2 in BKJournal
- [HDFS-8055] - NullPointerException when topology script is missing.
- [HDFS-8548] - Minicluster throws NPE on shutdown
- [HDFS-8850] - VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks
- [HDFS-8806] - Inconsistent metrics: number of missing blocks with replication factor 1 not properly cleared
- [HDFS-8163] - Using monotonicNow for block report scheduling causes test failures on recently restarted systems
- [HDFS-7608] - hdfs dfsclient newConnectedPeer has no write timeout
- [HDFS-8151] - Always use snapshot path as source when invalid snapshot names are used for diff based distcp
- [HDFS-8036] - Use snapshot path as source when using snapshot diff report in DistCp
- [HDFS-8072] - Reserved RBW space is not released if client terminates while writing block
- [HDFS-7501] - TransactionsSinceLastCheckpoint can be negative on SBNs
- [HDFS-7932] - Speed up the shutdown of datanode during rolling upgrade
- [HDFS-8358] - TestTraceAdmin fails
- [HDFS-7194] - Fix findbugs "inefficient new String constructor" warning in DFSClient#PATH
- [HDFS-8213] - DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace
- [HDFS-8063] - Fix intermittent test failures in TestTracing
- [HDFS-8026] - Trace FSOutputSummer#writeChecksumChunks rather than DFSOutputStream#writeChunk
- [HDFS-7963] - Fix expected tracing spans in TestTracing along with HDFS-7054
- [HDFS-7198] - Fix findbugs "unchecked conversion" warning in DFSClient#getPathTraceScope
- [HDFS-7227] - Fix findbugs warning about NP_DEREFERENCE_OF_READLINE_VALUE in SpanReceiverHost
- [HDFS-7202] - Should be able to omit package name of SpanReceiver on "hadoop trace -add"
- [HDFS-8681] - BlockScanner is incorrectly disabled by default
- [HDFS-8656] - Preserve compatibility of ClientProtocol#rollingUpgrade after finalization
- [HDFS-7894] - Rolling upgrade readiness is not updated in jmx until query command is issued.
- [HDFS-8665] - Fix replication check in DFSTestUtils#waitForReplication
- [HDFS-8646] - Prune cached replicas from DatanodeDescriptor state on replica invalidation
- [HDFS-7990] - IBR delete ack should not be delayed
- [HDFS-8337] - Accessing httpfs via webhdfs doesn't work from a jar with kerberos
- [HDFS-6300] - Prevent multiple balancers from running simultaneously
- [HDFS-7833] - DataNode reconfiguration does not recalculate valid volumes required, based on configured failed volumes tolerated.
- [HDFS-7722] - DataNode#checkDiskError should also remove Storage when error is found.
- [HDFS-8380] - Always call addStoredBlock on blocks which have been shifted from one storage to another
- [HDFS-7980] - Incremental BlockReport will dramatically slow down the startup of a namenode
- [HDFS-7566] - Remove obsolete entries from hdfs-default.xml
- [HDFS-8305] - HDFS INotify: the destination field of RenameOp should always end with the file name
- [HDFS-7869] - Inconsistency in the return information while performing rolling upgrade
- [HDFS-8127] - NameNode Failover during HA upgrade can cause DataNode to finalize upgrade
- [HDFS-3443] - Fix NPE when namenode transition to active during startup by adding checkNNStartup() in NameNodeRpcServer
- [HDFS-6673] - Add delimited format support to PB OIV tool
- [HDFS-7884] - NullPointerException in BlockSender
- [HDFS-4448] - Allow HA NN to start in secure mode with wildcard address configured
- [HDFS-7915] - The DataNode can sometimes allocate a ShortCircuitShm slot and fail to tell the DFSClient about it because of a network error
- [HDFS-7931] - DistributedFileSystem should not look for keyProvider in cache if Encryption is disabled
- [HDFS-8099] - Change "DFSInputStream has been closed already" message to debug log level
- [HDFS-7996] - After swapping a volume, BlockReceiver reports ReplicaNotFoundException
- [HDFS-7587] - Edit log corruption can happen if append fails with a quota violation
- [HDFS-7961] - Trigger full block report after hot swapping disk
- [HDFS-7960] - The full block report should prune zombie storages even if they're not empty
- [HDFS-7881] - TestHftpFileSystem#testSeek fails in branch-2
- [HDFS-7596] - NameNode should prune dead storages from storageMap
- [HDFS-7756] - Restore method signature for LocatedBlock#getLocations()
- [HDFS-7647] - DatanodeManager.sortLocatedBlocks sorts DatanodeInfos but not StorageIDs
- [HDFS-7788] - Post-2.6 namenode may not start up with an image containing inodes created with an old release.
- [HDFS-7830] - DataNode does not release the volume lock when adding a volume fails.
- [HDFS-7575] - Upgrade should generate a unique storage ID for each volume
- [HDFS-7682] - DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content
- [HDFS-7389] - Named user ACL cannot stop the user from accessing the FS entity.
- [HDFS-7719] - BlockPoolSliceStorage#removeVolumes fails to remove some in-memory state associated with volumes
- [HDFS-6651] - Deletion failure can leak inodes permanently
- [HDFS-7611] - deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
- [HDFS-7470] - SecondaryNameNode needs twice the memory when calling reloadFromImageFile
- [HDFS-7615] - Remove longReadLock
- [HDFS-7686] - Re-add rapid rescan of possibly corrupt block feature to the block scanner
- [HDFS-7721] - The HDFS BlockScanner may run fast during the first hour
- [HDFS-7734] - Class cast exception in NameNode#main
- [HDFS-7632] - MiniDFSCluster configures DataNode data directories incorrectly if using more than 1 DataNode and more than 2 storage locations per DataNode.
- [HDFS-7548] - Corrupt block reporting delayed until datablock scanner thread detects it
- [HDFS-7097] - Allow block reports to be processed during checkpointing on standby name node
- [HDFS-7457] - DatanodeID generates excessive garbage
- [HDFS-7603] - The background replication queue initialization may not let others run
- [HDFS-7443] - Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are present in the same volume
- [HDFS-7704] - DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
- [HDFS-7496] - Fix FsVolume removal race conditions on the DataNode by reference-counting the volume instances
- [HDFS-7366] - BlockInfo should take replication as a short in the constructor
- [HDFS-7358] - Clients may get stuck waiting when using ByteArrayManager
- [HDFS-7552] - change FsVolumeList toString() to fix TestDataNodeVolumeFailureToleration
- [HDFS-7503] - Namenode restart after large deletions can cause slow processReport (due to logging)
- [HDFS-7315] - DFSTestUtil.readFileBuffer opens extra FSDataInputStream
- [HDFS-7399] - Lack of synchronization in DFSOutputStream#Packet#getLastByteOffsetBlock()
- [HDFS-7741] - Remove unnecessary synchronized in FSDataInputStream and HdfsDataInputStream
- [HDFS-7610] - Fix removal of dynamically added DN volumes
- [HDFS-7494] - Checking of closed in DFSInputStream#pread() should be protected by synchronization
- [HDFS-7263] - Snapshot read can reveal future bytes for appended files.
- [HDFS-7698] - Fix locking on HDFS read statistics and add a method for clearing them.
- [HDFS-7744] - Fix potential NPE in DFSInputStream after setDropBehind or setReadahead is called
- [HDFS-7530] - Allow renaming of encryption zone roots
- [HDFS-7423] - various typos and message formatting fixes in nfs daemon and doc
- [HDFS-7707] - Edit log corruption due to delayed block removal again
- [HDFS-7718] - Store KeyProvider in ClientContext to avoid leaking key provider threads when using FileContext
- [HDFS-6425] - Large postponedMisreplicatedBlocks has impact on blockReport latency
- [HDFS-7560] - ACLs removed by removeDefaultAcl() will be back after NameNode restart/failover
- [HDFS-7497] - Inconsistent report of decommissioning DataNodes between dfsadmin and NameNode webui
- [HDFS-7489] - Incorrect locking in FsVolumeList#checkDirs can hang datanodes
- [HDFS-6917] - Add an hdfs debug command to validate blocks, call recoverlease, etc.
- [HDFS-7146] - NFS ID/Group lookup requires SSSD enumeration on the server
- [HDFS-4882] - Prevent the Namenode's LeaseManager from looping forever in checkLeases
- [HDFS-7225] - Remove stale block invalidation work when DN re-registers with different UUID
- [HDFS-7213] - processIncrementalBlockReport performance degradation
- [HDFS-7235] - DataNode#transferBlock should report blocks that don't exist using reportBadBlock
- [HDFS-7301] - TestMissingBlocksAlert should use MXBeans instead of old web UI
- [HDFS-6824] - Additional user documentation for HDFS encryption.
- [HDFS-7209] - Populate EDEK cache when creating encryption zone
- [HDFS-5685] - DistCp will fail to copy with -delete switch
- [HDFS-7077] - Separate CipherSuite from crypto protocol version
- [HDFS-6776] - Using distcp to copy data between insecure and secure cluster via webhdfs doesn't work
Improvement
- [HDFS-8818] - Allow Balancer to run faster
- [HDFS-11384] - Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike
- [HDFS-8541] - Mover should exit with NO_MOVE_PROGRESS if there is no move progress
- [HDFS-8540] - Mover should exit with NO_MOVE_BLOCK if no block can be moved
- [HDFS-12642] - Log block and datanode details in BlockRecoveryWorker
- [HDFS-8865] - Improve quota initialization performance
- [HDFS-11345] - Document the configuration key for FSNamesystem lock fairness
- [HDFS-10480] - Add an admin command to list currently open files
- [HDFS-11210] - Enhance key rolling to guarantee new KeyVersion is returned from generateEncryptedKeys after a key is rolled
- [HDFS-9503] - Replace -namenode option with -fs for NNThroughputBenchmark
- [HDFS-9421] - NNThroughputBenchmark replication test NPE with -namenode option
- [HDFS-12042] - Lazy initialize AbstractINodeDiffList#diffs for snapshots to reduce memory consumption
- [HDFS-12078] - Add time unit to the description of property dfs.namenode.stale.datanode.interval in hdfs-default.xml
- [HDFS-11891] - DU#refresh should print the path of the directory when an exception is caught
- [HDFS-11881] - NameNode consumes a lot of memory for snapshot diff report generation
- [HDFS-11861] - ipc.Client.Connection#sendRpcRequest should log request name
- [HDFS-11914] - Add more diagnosis info for fsimage transfer failure.
- [HDFS-11383] - Intern strings in BlockLocation and ExtendedBlock
- [HDFS-8674] - Improve performance of postponed block scans
- [HDFS-11421] - Make WebHDFS' ACLs RegEx configurable
- [HDFS-11579] - Make HttpFS Tomcat SSL property sslEnabledProtocols and clientAuth configurable
- [HDFS-7433] - Optimize performance of DatanodeManager's node map
- [HDFS-11816] - Update default SSL cipher list for HttpFS
- [HDFS-2219] - Fsck should work with fully qualified file paths.
- [HDFS-11687] - Add new public encryption APIs required by Hive
- [HDFS-11531] - Expose hedged read metrics via libHDFS API
- [HDFS-11402] - HDFS Snapshots should capture point-in-time copies of OPEN files
- [HDFS-6757] - Simplify lease manager with INodeID
- [HDFS-10300] - TestDistCpSystem should share MiniDFSCluster
- [HDFS-11466] - Change dfs.namenode.write-lock-reporting-threshold-ms default from 1000ms to 5000ms
- [HDFS-11418] - HttpFS should support old SSL clients
- [HDFS-10896] - Move lock logging logic from FSNamesystem into FSNamesystemLock
- [HDFS-10817] - Add Logging for Long-held NN Read Locks
- [HDFS-10798] - Make the threshold of reporting FSNamesystem lock contention configurable
- [HDFS-9145] - Tracking methods that hold FSNamesystemLock for too long
- [HDFS-8883] - NameNode Metrics : Add FSNameSystem lock Queue Length
- [HDFS-10534] - NameNode WebUI should display DataNode usage histogram
- [HDFS-10941] - Improve BlockManager#processMisReplicatesAsync log
- [HDFS-11390] - Add process name to httpfs process
- [HDFS-8640] - Make reserved RBW space visible through JMX
- [HDFS-10292] - Add block id when client got Unable to close file exception
- [HDFS-11292] - log lastWrittenTxId etc info in logSyncAll
- [HDFS-11306] - Print remaining edit logs from buffer if edit log can't be rolled.
- [HDFS-11275] - Check groupEntryIndex and throw a helpful exception on failures when removing ACL.
- [HDFS-9205] - Do not schedule corrupt blocks for replication
- [HDFS-10807] - Doc about upgrading to a version of HDFS with snapshots may be confusing
- [HDFS-11069] - Tighten the authorization of datanode RPC
- [HDFS-9019] - Adding informative message to sticky bit permission denied exception
- [HDFS-7933] - fsck should also report decommissioning replicas.
- [HDFS-7537] - fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
- [HDFS-10756] - Expose getTrashRoot to HTTPFS and WebHDFS
- [HDFS-10823] - Implement HttpFSFileSystem#listStatusIterator
- [HDFS-10837] - Standardize serialization of WebHDFS DirectoryListing
- [HDFS-10784] - Implement WebHdfsFileSystem#listStatusIterator
- [HDFS-6565] - Use jackson instead of jetty json in hdfs-client
- [HDFS-7384] - 'getfacl' command and 'getAclStatus' output should be in sync
- [HDFS-9223] - Code cleanup for DatanodeDescriptor and HeartbeatManager
- [HDFS-11120] - TestEncryptionZones should waitActive
- [HDFS-11080] - Update HttpFS to use ConfigRedactor
- [HDFS-9951] - Use string constants for XML tags in OfflineImageReconstructor
- [HDFS-10298] - Document the usage of distcp -diff option
- [HDFS-9638] - Improve DistCp Help and documentation
- [HDFS-9630] - DistCp minor refactoring and clean up
- [HDFS-7411] - Refactor and improve decommissioning logic into DecommissionManager
- [HDFS-9005] - Provide configuration support for upgrade domain
- [HDFS-7964] - Add support for async edit logging
- [HDFS-7413] - Some unit tests should use NameNodeProtocols instead of FSNameSystem
- [HDFS-8709] - Clarify automatic sync in FSEditLog#logEdit
- [HDFS-11009] - Add a tool to reconstruct block meta file from CLI
- [HDFS-10628] - Log HDFS Balancer exit message to its own log
- [HDFS-9214] - Support reconfiguring dfs.datanode.balance.max.concurrent.moves without DN restart
- [HDFS-8826] - Balancer may not move blocks efficiently in some cases
- [HDFS-9257] - improve error message for "Absolute path required" in INode.java to contain the rejected path
- [HDFS-11012] - Unnecessary INFO logging on DFSClients for InvalidToken
- [HDFS-10963] - Reduce log level when network topology cannot find enough datanodes.
- [HDFS-2390] - dfsadmin -setBalancerBandwidth does not validate negative values
- [HDFS-10876] - Dispatcher#dispatch should log IOException stacktrace
- [HDFS-10875] - Optimize du -x to cache intermediate result
- [HDFS-4396] - Add START_MSG/SHUTDOWN_MSG for ZKFC
- [HDFS-7478] - Move org.apache.hadoop.hdfs.server.namenode.NNConf to FSNamesystem
- [HDFS-7420] - Delegate permission checks to FSDirectory
- [HDFS-7415] - Move FSNameSystem.resolvePath() to FSDirectory
- [HDFS-9601] - NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block
- [HDFS-8986] - Add option to -du to calculate directory space usage excluding snapshots
- [HDFS-10822] - Log DataNodes in the write pipeline
- [HDFS-10625] - VolumeScanner to report why a block is found bad
- [HDFS-9906] - Remove spammy log spew when a datanode is restarted
- [HDFS-7463] - Simplify FSNamesystem#getBlockLocationsUpdateTimes
- [HDFS-8101] - DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at runtime
- [HDFS-8521] - Add @VisibleForTesting annotation to BlockPoolSlice#selectReplicaToDelete
- [HDFS-2580] - NameNode#main(...) can make use of GenericOptionsParser.
- [HDFS-10703] - HA NameNode Web UI should show last checkpoint time
- [HDFS-10225] - DataNode hot swap drives should disallow storage type changes.
- [HDFS-9805] - TCP_NODELAY not set before SASL handshake in data transfer pipeline
- [HDFS-9700] - DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for DataTransferProtocol
- [HDFS-9782] - RollingFileSystemSink should have configurable roll interval
- [HDFS-7597] - DelegationTokenIdentifier should cache the TokenIdentifier to UGI mapping
- [HDFS-9732] - Improve DelegationTokenIdentifier.toString() for better logging
- [HDFS-9085] - Show renewer information in DelegationTokenIdentifier#toString
- [HDFS-9259] - Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario
- [HDFS-8829] - Make SO_RCVBUF and SO_SNDBUF size configurable for DataTransferProtocol sockets and allow configuring auto-tuning
- [HDFS-3702] - Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
- [HDFS-9198] - Coalesce IBR processing in the NN
- [HDFS-7600] - Refine hdfs admin classes to reuse common code
- [HDFS-9710] - Change DN to send block receipt IBRs in batches
- [HDFS-9726] - Refactor IBR code to a new class
- [HDFS-4946] - Allow preferLocalNode in BlockPlacementPolicyDefault to be configurable
- [HDFS-8946] - Improve choosing datanode storage for block placement
- [HDFS-8884] - Fail-fast check in BlockPlacementPolicyDefault#chooseTarget
- [HDFS-8131] - Implement a space balanced block placement policy
- [HDFS-8073] - Split BlockPlacementPolicyDefault.chooseTarget(..) so it can be easily overridden.
- [HDFS-8713] - Convert DatanodeDescriptor to use SLF4J logging
- [HDFS-10264] - Logging improvements in FSImageFormatProtobuf.Saver
- [HDFS-9629] - Update the footer of Web UI to show year 2016
- [HDFS-9669] - TcpPeerServer should respect ipc.server.listen.queue.size
- [HDFS-10297] - Increase default balance bandwidth and concurrent moves
- [HDFS-8578] - On upgrade, Datanode should process all storage/data dirs in parallel
- [HDFS-9405] - Warmup NameNode EDEK caches in background thread
- [HDFS-9157] - [OEV and OIV] : Unnecessary parsing for mandatory arguments if "-h" option is specified as the only option
- [HDFS-9159] - [OIV] : return value of the command is not correct if invalid value specified in "-p (processor)" option
- [HDFS-6133] - Make Balancer support exclude specified path
- [HDFS-8831] - Trash Support for deletion in HDFS encryption zone
- [HDFS-9795] - OIV Delimited should show which files are ACL-enabled.
- [HDFS-9637] - Tests for RollingFileSystemSink
- [HDFS-7962] - Remove duplicated logs in BlockManager
- [HDFS-9541] - Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater than 2 GB
- [HDFS-9260] - Improve the performance and GC friendliness of NameNode startup and full block reports
- [HDFS-8143] - HDFS Mover tool should exit after some retry when failed to move blocks.
- [HDFS-9221] - HdfsServerConstants#ReplicaState#getState should avoid calling values() since it creates a temporary array
- [HDFS-9264] - Minor cleanup of operations on FsVolumeList#volumes
- [HDFS-9715] - Check storage ID uniqueness on datanode startup
- [HDFS-9721] - Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference
- [HDFS-9350] - Avoid creating temporary strings in Block.toString() and getBlockName()
- [HDFS-9576] - HTrace: collect position/length information on read operations
- [HDFS-9434] - Recommission a datanode with 500k blocks may pause NN for 30 seconds
- [HDFS-6888] - Allow selective audit logging of ops
- [HDFS-9569] - Log the name of the fsimage being loaded for better supportability
- [HDFS-6249] - Output AclEntry in PBImageXmlWriter
- [HDFS-8873] - Allow the directoryScanner to be rate-limited
- [HDFS-6407] - Add sorting and pagination in the datanode tab of the NN Web UI
- [HDFS-8816] - Improve visualization for the Datanode tab in the NN UI
- [HDFS-7483] - Display information per tier on the Namenode UI
- [HDFS-7390] - Provide JMX metrics per storage type
- [HDFS-8209] - Support different number of datanode directories in MiniDFSCluster.
- [HDFS-7832] - Show 'Last Modified' in Namenode's 'Browse Filesystem'
- [HDFS-7683] - Combine usages and percent stats in NameNode UI
- [HDFS-9160] - [OIV-Doc] : Missing details of "delimited" for processor options
- [HDFS-9281] - Change TestDeleteBlockPool to not explicitly use File to check block pool existence.
- [HDFS-9474] - TestPipelinesFailover should not fail when printing debug message
- [HDFS-8647] - Abstract BlockManager's rack policy into BlockPlacementPolicy
- [HDFS-9491] - Tests should get the number of pending async deletes via FsDatasetTestUtils
- [HDFS-9267] - TestDiskError should get stored replicas through FsDatasetTestUtils.
- [HDFS-9490] - MiniDFSCluster should change block generation stamp via FsDatasetTestUtils
- [HDFS-9314] - Improve BlockPlacementPolicyDefault's picking of excess replicas
- [HDFS-8722] - Optimize datanode writes for small writes and flushes
- [HDFS-9252] - Change TestFileTruncate to use FsDatasetTestUtils to get block file size and genstamp.
- [HDFS-9282] - Make data directory count and storage raw capacity related tests FsDataset-agnostic
- [HDFS-9308] - Add truncateMeta() and deleteMeta() to MiniDFSCluster
- [HDFS-9363] - Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica.
- [HDFS-9331] - Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem entirely allocated for DFS use
- [HDFS-9297] - Update TestBlockMissingException to use corruptBlockOnDataNodesByDeletingBlockFile()
- [HDFS-9255] - Consolidate block recovery related implementation into a single class
- [HDFS-9439] - Include status of closeAck into exception message in DataNode#run
- [HDFS-8056] - Decommissioned dead nodes should continue to be counted as dead after NN restart
- [HDFS-9292] - Make TestFileCorruption independent of the underlying FsDataset implementation.
- [HDFS-9291] - Fix TestInterDatanodeProtocol to be FsDataset-agnostic.
- [HDFS-9312] - Fix TestReplication to be FsDataset-agnostic.
- [HDFS-6482] - Use block ID-based block layout on datanodes
- [HDFS-8808] - dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
- [HDFS-9251] - Refactor TestWriteToReplica and TestFsDatasetImpl to avoid explicitly creating Files in tests code.
- [HDFS-7210] - Avoid two separate RPCs namenode.append() and namenode.getFileInfo() for an append call from DFSClient
- [HDFS-9250] - Add Precondition check to LocatedBlock#addCachedLoc
- [HDFS-7758] - Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
- [HDFS-9188] - Make block corruption related tests FsDataset-agnostic.
- [HDFS-9238] - Update TestFileCreation#testLeaseExpireHardLimit() to avoid using DataNodeTestUtils#getFile()
- [HDFS-7454] - Reduce memory footprint for AclEntries in NameNode
- [HDFS-8361] - Choose SSD over DISK in block placement
- [HDFS-9065] - Include commas on # of files, blocks, total filesystem objects in NN Web UI
- [HDFS-5795] - RemoteBlockReader2#checkSuccess() should print error status
- [HDFS-9132] - Pass genstamp to ReplicaAccessorBuilder
- [HDFS-8384] - Allow NN to startup if there are files having a lease but are not under construction
- [HDFS-7773] - Additional metrics in HDFS to be accessed via jmx.
- [HDFS-7314] - When the DFSClient lease cannot be renewed, abort open-for-write files rather than the entire DFSClient
- [HDFS-328] - Improve fs -setrep error message for invalid replication factors
- [HDFS-2360] - Ugly stacktrace when quota exceeds
- [HDFS-6184] - Capture NN's thread dump when it fails over
- [HDFS-8443] - Document dfs.namenode.service.handler.count in hdfs-site.xml
- [HDFS-8735] - Inotify : All events classes should implement toString() API.
- [HDFS-9021] - Use a yellow elephant rather than a blue one in diagram
- [HDFS-8860] - Remove unused Replica copyOnWrite code
- [HDFS-7978] - Add LOG.isDebugEnabled() guard for some LOG.debug(..)
- [HDFS-8965] - Harden edit log reading code against out of memory errors
- [HDFS-8862] - BlockManager#excessReplicateMap should use a HashMap
- [HDFS-8792] - BlockManager#postponedMisreplicatedBlocks should use a LightWeightHashSet to save memory
- [HDFS-7270] - Add congestion signaling capability to DataNode write protocol
- [HDFS-7404] - Remove o.a.h.hdfs.server.datanode.web.resources
- [HDFS-7279] - Use netty to implement DatanodeWebHdfsMethods
- [HDFS-7280] - Use netty 4 in WebImageViewer
- [HDFS-8924] - Add pluggable interface for reading replicas in DFSClient
- [HDFS-8828] - Utilize Snapshot diff report to build diff copy list in distcp
- [HDFS-8821] - Explain message "Operation category X is not supported in state standby"
- [HDFS-8887] - Expose storage type and storage ID in BlockLocation
- [HDFS-8659] - Block scanner INFO message is spamming logs
- [HDFS-8589] - Fix unused imports in BPServiceActor and BlockReportLeaseManager
- [HDFS-7923] - The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
- [HDFS-7491] - Add incremental blockreport latency to DN metrics
- [HDFS-7979] - Initialize block report IDs with a random number
- [HDFS-7435] - PB encoding of block reports is very inefficient
- [HDFS-5782] - BlockListAsLongs should take lists of Replicas rather than concrete classes
- [HDFS-7858] - Improve HA Namenode Failover detection on the client
- [HDFS-7535] - Utilize Snapshot diff report for distcp
- [HDFS-8133] - Improve readability of deleted block check
- [HDFS-7223] - Tracing span description of IPC client is too long
- [HDFS-7675] - Remove unused member DFSClient#spanReceiverHost
- [HDFS-7186] - Document the "hadoop trace" command.
- [HDFS-7890] - Improve information on Top users for metrics in RollingWindowsManager and lower log level
- [HDFS-8546] - Use try with resources in DataStorage and Storage
- [HDFS-7546] - Document, and set an accepting default for dfs.namenode.kerberos.principal.pattern
- [HDFS-8582] - Support getting a list of reconfigurable config properties and do not generate spurious reconfig warnings
- [HDFS-8549] - Abort the balancer if an upgrade is in progress
- [HDFS-8204] - Mover/Balancer should not schedule two replicas to the same DN
- [HDFS-7772] - Document hdfs balancer -exclude/-include option in HDFSCommands.html
- [HDFS-316] - Balancer should run for a configurable # of iterations
- [HDFS-8535] - Clarify that dfs usage in dfsadmin -report output includes all block replicas.
- [HDFS-7604] - Track and display failed DataNode storage locations in NameNode.
- [HDFS-7917] - Use file to replace data dirs in test to simulate a disk failure.
- [HDFS-7645] - Rolling upgrade is restoring blocks from trash multiple times
- [HDFS-7312] - Update DistCp v1 to optionally not use tmp location (branch-1 only)
- [HDFS-7761] - Clean up unnecessary code logic in LocatedBlock
- [HDFS-7668] - Convert site documentation from apt to markdown
- [HDFS-7182] - JMX metrics aren't accessible when NN is busy
- [HDFS-7386] - Replace check "port number < 1024" with shared isPrivilegedPort method
- [HDFS-3342] - SocketTimeoutException in BlockSender.sendChunks could have a better error message
- [HDFS-7430] - Rewrite the BlockScanner to use O(1) memory and use multiple threads
- [HDFS-7712] - Switch blockStateChangeLog to use slf4j
- [HDFS-7706] - Switch BlockManager logging to use slf4j
- [HDFS-5631] - Expose interfaces required by FsDatasetSpi implementations
- [HDFS-7640] - print NFS Client in the NFS log
- [HDFS-7310] - Mover can give first priority to local DN if it has target storage type available in local DN
- [HDFS-7398] - Reset cached thread-local FSEditLogOp's on every FSEditLog#logEdit
- [HDFS-7323] - Move the get/setStoragePolicy commands out from dfsadmin
- [HDFS-6803] - Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context
- [HDFS-6735] - A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
- [HDFS-7531] - Improve the concurrent access on FsVolumeList
- [HDFS-7790] - Do not create optional fields in DFSInputStream unless they are needed
- [HDFS-7336] - Unused member DFSInputStream.buffersize
- [HDFS-7694] - FSDataInputStream should support "unbuffer"
- [HDFS-7771] - fuse_dfs should permit FILE: on the front of KRB5CCNAME
- [HDFS-7579] - Improve log reporting during block report rpc failure
- [HDFS-7331] - Add Datanode network counts to datanode jmx page
- [HDFS-7067] - ClassCastException while using a key created by keytool to create encryption zone.
- [HDFS-7513] - HDFS inotify: add defaultBlockSize to CreateEvent
- [HDFS-7426] - Change nntop JMX format to be a JSON blob
- [HDFS-6779] - Add missing version subcommand for hdfs
- [HDFS-7446] - HDFS inotify should have the ability to determine what txid it has read up to
- [HDFS-7356] - Use DirectoryListing.hasMore() directly in nfs
- [HDFS-7419] - Improve error messages for DataNode hot swap drive feature
- [HDFS-7409] - Allow dead nodes to finish decommissioning if all files are fully replicated
- [HDFS-6741] - Improve permission denied message when FSPermissionChecker#checkOwner fails
- [HDFS-7165] - Separate block metrics for files with replication count 1
- [HDFS-7283] - Bump DataNode OOM log from WARN to ERROR
- [HDFS-7257] - Add the time of last HA state transition to NN's /jmx page
- [HDFS-7266] - HDFS Peercache enabled check should not lock on object
- [HDFS-7252] - small refinement to the use of isInAnEZ in FSNamesystem
- [HDFS-7242] - Code improvement for FSN#checkUnreadableBySuperuser
- [HDFS-7278] - Add a command that allows sysadmins to manually trigger full block reports from a DN
- [HDFS-6168] - Remove deprecated methods in DistributedFileSystem
- [HDFS-4817] - make HDFS advisory caching configurable on a per-file basis
New Feature
- [HDFS-10899] - Add functionality to re-encrypt EDEKs
- [HDFS-12117] - HttpFS does not seem to support SNAPSHOT related methods for WebHDFS REST Interface
- [HDFS-11412] - Maintenance minimum replication config value allowable range should be [0, DefaultReplication]
- [HDFS-11411] - Avoid OutOfMemoryError in TestMaintenanceState test runs
- [HDFS-11265] - Extend visualization for Maintenance Mode under Datanode tab in the NameNode UI
- [HDFS-7701] - Support reporting per storage type quota and usage with hadoop/hdfs shell
- [HDFS-11296] - Maintenance state expiry should be an epoch time and not jvm monotonic
- [HDFS-11259] - Update fsck to display maintenance state info
- [HDFS-9391] - Update webUI/JMX to display maintenance state info
- [HDFS-9390] - Block management for maintenance states
- [HDFS-10918] - Add a tool to get FileEncryptionInfo from CLI
- [HDFS-9392] - Admins support for maintenance state
- [HDFS-9820] - Improve distcp to support efficient restore to an earlier snapshot
- [HDFS-9389] - Add maintenance states to AdminStates
- [HDFS-9926] - ozone : Add volume commands to CLI
- [HDFS-7912] - Erasure Coding: track BlockInfo instead of Block in UnderReplicatedBlocks and PendingReplicationBlocks
- [HDFS-10552] - DiskBalancer "-query" results in NPE if no plan for the node
- [HDFS-10517] - DiskBalancer: Support help command
- [HDFS-9461] - DiskBalancer: Add Report Command
- [HDFS-10541] - Diskbalancer: When no actions in plan, error message says "Plan was generated more than 24 hours ago"
- [HDFS-10501] - DiskBalancer: Use the default datanode port if port is not provided.
- [HDFS-10500] - Diskbalancer: Print out information when a plan is not generated.
- [HDFS-10403] - DiskBalancer: Add cancel command
- [HDFS-9547] - DiskBalancer : Add user documentation
- [HDFS-10402] - DiskBalancer: Add QueryStatus command
- [HDFS-10399] - DiskBalancer: Add JMX for DiskBalancer
- [HDFS-10496] - DiskBalancer: ExecuteCommand checks planFile in a wrong way
- [HDFS-10478] - DiskBalancer: resolve volume path names
- [HDFS-10476] - DiskBalancer: Plan command output directory should be a sub-directory
- [HDFS-8008] - Support client-side back off when the datanodes are congested
- [HDFS-9546] - DiskBalancer : Add Execute command
- [HDFS-9545] - DiskBalancer : Add Plan Command
- [HDFS-9543] - DiskBalancer : Add Data mover
- [HDFS-9735] - DiskBalancer : Refactor moveBlockAcrossStorage to be used by disk balancer
- [HDFS-9720] - DiskBalancer : Add configuration parameters
- [HDFS-9709] - DiskBalancer : Add tests for disk balancer using a Mock Mover class.
- [HDFS-9703] - DiskBalancer : getBandwidth implementation
- [HDFS-9702] - DiskBalancer : getVolumeMap implementation
- [HDFS-9817] - Use SLF4J in new classes
- [HDFS-9856] - Suppress Jenkins warning for sample JSON file
- [HDFS-9683] - DiskBalancer : Add cancelPlan implementation
- [HDFS-9681] - DiskBalancer : Add QueryPlan implementation
- [HDFS-9671] - DiskBalancer : SubmitPlan implementation
- [HDFS-9647] - DiskBalancer : Add getRuntimeSettings
- [HDFS-9645] - DiskBalancer : Add Query RPC
- [HDFS-9595] - DiskBalancer : Add cancelPlan RPC
- [HDFS-9588] - DiskBalancer : Add submitDiskbalancer RPC
- [HDFS-9611] - DiskBalancer : Replace htrace json imports with jackson
- [HDFS-9526] - DiskBalancer: change htrace...JsonIgnore to codehaus...JsonIgnore
- [HDFS-9469] - DiskBalancer : Add Planner
- [HDFS-9449] - DiskBalancer : Add connectors
- [HDFS-9420] - DiskBalancer : Add DataModels
- [HDFS-9835] - OIV: add ReverseXML processor which reconstructs an fsimage from an XML file
- [HDFS-9804] - Allow long-running Balancer to login with keytab
- [HDFS-9244] - Support nested encryption zones
- [HDFS-8805] - Archival Storage: getStoragePolicy should not need superuser privilege
- [HDFS-6663] - Admin command to track file and locations from block id
- [HDFS-7054] - Make DFSOutputStream tracing more fine-grained
- [HDFS-7189] - Add trace spans for DFSClient metadata operations
- [HDFS-7623] - Add htrace configuration properties to core-default.xml and update user doc about how to enable htrace
- [HDFS-7055] - Add tracing to DFSInputStream
- [HDFS-6488] - Support HDFS superuser in NFSv3 gateway
- [HDFS-7449] - Add metrics to NFS gateway
- [HDFS-7424] - Add web UI for NFS gateway
- [HDFS-6982] - nntop: top-like tool for name node users
- [HDFS-7035] - Make adding a new data directory to the DataNode an atomic operation and improve error handling
- [HDFS-6877] - Avoid calling checkDisk when an HDFS volume is removed during a write.
- [HDFS-7254] - Add documentation for hot swapping DataNode drives
- [HDFS-7222] - Expose DataNode network errors as a metric
- [HDFS-6826] - Plugin interface to enable delegation of HDFS authorization assertions
- [HDFS-5511] - improve CacheManipulator interface to allow better unit testing
Task
- [HDFS-5928] - show namespace and namenode ID on NN dfshealth page
- [HDFS-9377] - Fix findbugs warnings in FSDirSnapshotOp
- [HDFS-2486] - Review issues with UnderReplicatedBlocks
Test
- [HDFS-11272] - Refine the assert messages in TestFSDirAttrOp
- [HDFS-9745] - TestSecureNNWithQJM#testSecureMode sometimes fails with timeouts
- [HDFS-8039] - Fix TestDebugAdmin#testRecoverLease and testVerfiyBlockChecksumCommand on Windows
- [HDFS-9949] - Add a test case to ensure that the DataNode does not regenerate its UUID when a storage directory is cleared
- [HDFS-9888] - Allow resetting KerberosName in unit tests
- [HDFS-8038] - PBImageDelimitedTextWriter#getEntry output HDFS path in platform-specific format.
- [HDFS-9300] - TestDirectoryScanner.testThrottle() is still a little flaky
- [HDFS-9626] - TestBlockReplacement#testBlockReplacement fails occasionally
- [HDFS-7553] - Fix TestDFSUpgradeWithHA failure due to BindException
- [HDFS-9429] - Tests in TestDFSAdminWithHA intermittently fail with EOFException
- [HDFS-9339] - Extend full test of KMS ACLs
- [HDFS-9295] - Add a thorough test of the full KMS code path
- [HDFS-9410] - Some tests should always reset sysout and syserr
- [HDFS-7448] - TestBookKeeperHACheckpoints fails in trunk build
Impala
Bug
- [IMPALA-2607] - S3: need to determine the right default for fs.s3a.connection.maximum