CDH 4 Release Notes
The following list covers all Apache Pig JIRAs included in CDH 4
that are not included in the Apache Pig base version 0.10.0. The
pig-0.10.0-cdh4.2.0.CHANGES.txt
file lists all changes included in CDH 4. The patch for each
change is in the cloudera/patches directory of the release tarball.
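To look up the patch for a particular JIRA once the tarball is unpacked, a small script can scan the patches directory for the issue ID. The sketch below is illustrative only: the function name `find_patches` is hypothetical, and because the naming scheme of the patch files is not stated here, it checks both the file name and the file contents for the ID.

```python
import re
from pathlib import Path


def find_patches(patch_dir, jira_id):
    """Return names of files under patch_dir that mention jira_id.

    Assumption: each patch mentions its JIRA ID (e.g. "PIG-3147")
    in its file name or somewhere in its text.
    """
    # Reject a trailing digit so that PIG-314 does not match PIG-3147.
    pattern = re.compile(re.escape(jira_id) + r"(?!\d)")
    matches = []
    for path in sorted(Path(patch_dir).rglob("*")):
        if not path.is_file():
            continue
        # Patches are text; replace undecodable bytes instead of failing.
        text = path.read_text(encoding="utf-8", errors="replace")
        if pattern.search(path.name) or pattern.search(text):
            matches.append(path.name)
    return matches
```

For example, `find_patches("cloudera/patches", "PIG-3147")` would return the names of any files in that directory that reference PIG-3147, or an empty list if none do.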
Changes Not In Apache Pig 0.10.0
Pig
Bug
- [PIG-3147] - Spill failing with "java.lang.RuntimeException: InternalCachedBag.spill() should not be called"
- [PIG-3071] - update hcatalog jar and path to hbase storage handler jar in pig script
- [PIG-3022] - TestRegisteredJarVisibility.testRegisteredJarVisibility fails with hadoop-2.0.x
- [PIG-3125] - Fix zebra compilation error
- [PIG-3029] - TestTypeCheckingValidatorNewLP has some path reference issues for cross-platform execution
- [PIG-3107] - bin and autocomplete are missing in src release
- [PIG-3051] - java.lang.IndexOutOfBoundsException failure with LimitOptimizer + ColumnPruning
- [PIG-3106] - Missing license header in several java file
- [PIG-3085] - Errors and lacks in document "Built In Functions"
- [PIG-3050] - Fix FindBugs multithreading warnings
- [PIG-3066] - Fix TestPigRunner in trunk
- [PIG-3101] - Increase io.sort.mb in YARN MiniCluster
- [PIG-3096] - Make PigUnit thread safe
- [PIG-3099] - Pig unit test fixes for TestGrunt(1), TestStore(2), TestEmptyInputDir(3)
- [PIG-3020] - "Duplicate uid in schema" error when joining two relations derived from the same load statement
- [PIG-2341] - Need better documentation on Pig/HBase integration
- [PIG-3033] - test-patch failed with javadoc warnings
- [PIG-2924] - PigStats should not be assuming all Storage classes to be file-based storage
- [PIG-2978] - TestLoadStoreFuncLifeCycle fails with hadoop-2.0.x
- [PIG-2980] - documentation for DateTime datatype
- [PIG-2966] - Test failures on CentOS 6 because MALLOC_ARENA_MAX is not set
- [PIG-2885] - TestJobSubmission and TestHBaseStorage don't work with HBase 0.94 and ZK 3.4.3
- [PIG-3039] - Not possible to use custom version of jackson jars
- [PIG-2937] - generated field in nested foreach does not inherit the variable name as the field name
- [PIG-3045] - Specifying sorting field(s) at nightly.conf - fix sortArgs
- [PIG-2979] - Pig.jar doesn't work with hadoop-2.0.x
- [PIG-2405] - svn tags/release-0.9.1: some unit test case failed with open JDK
- [PIG-3035] - With latest version of hadoop23 pig does not return the correct exception stack trace from backend
- [PIG-2832] - org.apache.pig.pigunit.pig.PigServer does not initialize udf.import.list of PigContext
- [PIG-2953] - "which" utility does not exist on Windows
- [PIG-2960] - Increase the timeout for unit test
- [PIG-2942] - DevTests, TestLoad has a false failure on Windows
- [PIG-2801] - grunt "sh" command should invoke the shell implicitly instead of calling exec directly with the command tokens
- [PIG-2798] - pig streaming tests assume interpreters are auto-resolved
- [PIG-2796] - Local temporary paths are not always valid HDFS path names.
- [PIG-2795] - Fix test cases that generate pig scripts with "load " + pathStr to encode "\" in the path
- [PIG-3018] - Refactor TestScriptLanguage to remove duplication and write script in different files
- [PIG-2973] - TestStreaming test times out
- [PIG-3001] - TestExecutableManager.testAddJobConfToEnv fails randomly
- [PIG-3017] - Pig's object serialization should use compression
- [PIG-2968] - ColumnMapKeyPrune fails to prune a subtree inside foreach
- [PIG-2913] - org.apache.pig.test.TestPigServerWithMacros fails sometimes because it picks up previous minicluster configuration file
- [PIG-2999] - Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing
- [PIG-2998] - Fix TestScriptLanguage and TestMacroExpansion
- [PIG-2975] - TestTypedMap.testOrderBy failing with incorrect result
- [PIG-2950] - Fix tiny documentation error in BagToString builtin.
- [PIG-2967] - Fix Glob_local test failure for Pig E2E Test Framework
- [PIG-2940] - HBaseStorage store fails in secure cluster
- [PIG-2931] - $ signs in the replacement string make parameter substitution fail
- [PIG-1283] - COUNT on null bag causes failure
- [PIG-2958] - Pig tests do not appear to have a logger attached
- [PIG-2926] - TestPoissonSampleLoader failing on rhel environment
- [PIG-2985] - TestRank1,2,3 fail with hadoop-2.0.x
- [PIG-2943] - DevTests, Refactor Windows checks to use new Util.WINDOWS method for code health
- [PIG-2794] - Pig test: add utils to simplify testing on Windows
- [PIG-2928] - Fix e2e test failures in trunk: FilterBoolean_23/24
- [PIG-2908] - Fix unit tests to work with jdk7
- [PIG-2971] - Add new parameter to specify the streaming environment
- [PIG-2963] - Illustrate command and POPackageLite
- [PIG-2965] - RANDOM should allow seed initialization for ease of testing
- [PIG-2961] - BinInterSedesRawComparator broken by TUPLE_number patch
- [PIG-2932] - Setting high default_parallel causes IOException in local mode
- [PIG-2737] - TestIndexedStorage is failing, should be refactored
- [PIG-2944] - ivysettings.xml does not let you override .m2/repository
- [PIG-2935] - Catch NoSuchMethodError when StoreFuncInterface's new cleanupOnSuccess method isn't implemented.
- [PIG-2920] - e2e tests override PERL5LIB environment variable
- [PIG-2917] - SpillableMemoryManager memory leak for WeakReference
- [PIG-2938] - All unit tests that use MR2 MiniCluster are broken in trunk
- [PIG-2936] - Tuple serialization bug
- [PIG-2929] - Improve documentation around AVG, CONCAT, MIN, MAX
- [PIG-2930] - ant test doesn't compile in trunk
- [PIG-2791] - Pig does not work with ViewFileSystem
- [PIG-2833] - org.apache.pig.pigunit.pig.PigServer does not initialize set default log level of pigContext
- [PIG-2852] - Old Hadoop API being used sometimes when running parallel in local mode
- [PIG-2712] - Pig does not call OutputCommitter.abortJob() on the underlying OutputFormat
- [PIG-2918] - Avoid Spillable bag overhead where possible
- [PIG-2744] - Handle Pig command line with XML special characters
- [PIG-2637] - Command-line option -e throws TokenMgrError exception
- [PIG-2887] - Macro cannot handle negative number
- [PIG-2844] - ant makepom is misconfigured
- [PIG-2912] - Pig should clone JobConf while creating JobContextImpl and TaskAttemptContextImpl in Hadoop23
- [PIG-2901] - Errors and lacks in document "Pig Latin Basics"
- [PIG-2905] - Improve documentation around REPLACE
- [PIG-2781] - LOSort isEqual method
- [PIG-2886] - Add Scan TimeRange to HBaseStorage
- [PIG-2896] - Pig does not fail anymore if two macros are declared with the same name
- [PIG-2895] - jodatime jar missing in pig-withouthadoop.jar
- [PIG-2893] - fix DBStorage compile issue
- [PIG-2821] - HBaseStorage should work with secure hbase
- [PIG-2890] - Revert PIG-2578
- [PIG-1314] - Add DateTime Support to Pig
- [PIG-2848] - TestBuiltInBagToTupleOrString fails now that mock.Storage enforces not overwriting output
- [PIG-2785] - NoClassDefFoundError after upgrading to pig 0.10.0 from 0.9.0
- [PIG-2884] - JobControlCompiler mis-logs after reducer estimation
- [PIG-2556] - CSVExcelStorage load: quoted field with newline as first character sees newline as record end
- [PIG-2662] - skew join does not honor its config parameters
- [PIG-2876] - Bump up Xerces version
- [PIG-2871] - Refactor signature for PigReducerEstimator
- [PIG-2866] - PigServer fails with macros without a script file
- [PIG-2851] - Add flag to ant to run tests with a debugger port
- [PIG-2860] - TestAvroStorageUtils.testGetConcretePathFromGlob fails on some version of hadoop
- [PIG-2859] - Fix few e2e test failures
- [PIG-2861] - PlanHelper imports org.python.google.common.collect.Lists instead of org.google.common.collect.Lists
- [PIG-2837] - AvroStorage throws StackOverFlowError
- [PIG-2856] - AvroStorage doesn't load files in the directories when a glob pattern matches both files and directories.
- [PIG-2569] - Fix org.apache.pig.test.TestInvoker.testSpeed
- [PIG-2854] - AvroStorage doesn't work with Avro 1.7.1
- [PIG-2729] - Macro expansion does not use pig.import.search.path - UnitTest borked
- [PIG-2779] - Refactoring the code for setting number of reducers
- [PIG-2814] - Fix issues with Sample operator documentation
- [PIG-2849] - Errors in document "Getting Started"
- [PIG-2841] - Inconsistent URL in Docs
- [PIG-2843] - Typo in Documentation
- [PIG-2839] - mock.Storage overwrites output with the last relation written when storing UNION
- [PIG-2840] - Fix SchemaTuple bugs
- [PIG-2842] - TestNewPlanOperatorPlan fails when new Configuration() picks up a previous minicluster conf file
- [PIG-2827] - TOP exception bug
- [PIG-2825] - StoreFunc signature setting in LogicalPlan broken
- [PIG-2800] - pig.additional.jars path separator should align with File.pathSeparator instead of being hard-coded to ":"
- [PIG-2797] - Tests should not create their own file URIs through string concatenation, should use Util.generateURI instead
- [PIG-2820] - relToAbsolutePath is not replayed properly when Grunt reparses the script after PIG-2699
- [PIG-2815] - class loader management in PigContext
- [PIG-2813] - Test regressions from PIG-2632
- [PIG-2780] - MapReduceLauncher should break early when one of the jobs throws an exception
- [PIG-2809] - TestUDFContext broken by PIG-2699
- [PIG-2807] - TestParser TestPigStorage TestNewPlanOperatorPlan broken by PIG-2699
- [PIG-2806] - Fix merge join test regression
- [PIG-2782] - Specifying sorting field(s) at nightly.conf
- [PIG-2783] - Fix Iterator_1 e2e test for Hadoop 23
- [PIG-2790] - After Pig-2699 the script schema (LOAD ... USING ... AS {script schema}) is passed after getSchema is called
- [PIG-2787] - change the module name in ivy to lowercase to match the maven repo
- [PIG-2766] - Pig-HCat Usability
- [PIG-2777] - Docs are broken due to malformed xml after PIG-2673
- [PIG-2775] - Register jar does not go to classpath in some cases
- [PIG-2748] - Change the names of the jar produced in the build folder to match maven conventions
- [PIG-2746] - Pig doesn't detect all forms of compression extensions properly
- [PIG-2761] - With hadoop23 importing modules inside python script does not work
- [PIG-2745] - Pig e2e test RubyUDFs fails in MR mode when running from tarball
- [PIG-2699] - Reduce the number of instances of Load and Store Funcs down to 2+1. It should be 1 in the front-end and 1 in the backend
- [PIG-2759] - Typo in document "Built In Functions"
- [PIG-2593] - Filter by a boolean value does not work
- [PIG-2665] - Bundled Jython jar in Pig 0.10.0-RC breaks module import in Python scripts with embedded Pig Latin
- [PIG-2736] - Support implicit cast from bytearray to boolean
- [PIG-2741] - Python script throws a NameError: name 'Configuration' is not defined in case cache dir is not created
- [PIG-2669] - Pig release should include pig-default.properties after rebuild
- [PIG-2739] - PyList should map to Bag automatically in Jython
- [PIG-2508] - PIG can unpredictably ignore deprecated Hadoop config options
- [PIG-2691] - Duplicate TOKENIZE schema
- [PIG-2730] - TFileStorage getStatistics incorrectly throws an exception instead of returning null
- [PIG-2717] - Tuple field mangled during flattening
- [PIG-2173] - piggybank datetime conversion javadocs not properly formatted
- [PIG-2721] - Wrong output generated while loading bags as input
- [PIG-2727] - PigStorage Source tagging does not need pig.splitCombination to be turned off
- [PIG-2714] - Pig documentation on TOP function has issues
- [PIG-2693] - LoadFunc.setLocation should be called before LoadMetadata.getStatistics
- [PIG-2639] - Utils.getSchemaFromString should automatically give name to all types, but fails on boolean
- [PIG-2666] - LoadFunc.setLocation() is not called when pig script only has Order By
- [PIG-2680] - TOBAG output schema reporting
- [PIG-2685] - error in EvalFunc ctor when implementing Algebraic UDF whose return type is parameterized
- [PIG-2640] - Usage message gives wrong information for Pig additional jars
- [PIG-2652] - Skew join and order by don't trigger reducer estimation
- [PIG-2671] - e2e harness: Reference local test path via :LOCALTESTPATH:
- [PIG-2670] - glitches on copyright years in documentation
- [PIG-2616] - JobControlCompiler.getInputSizeFromLoader must handle exceptions from LoadFunc.getStatistics.
- [PIG-2257] - AvroStorage doesn't recognize schema_file field when JSON isn't used in the constructor
- [PIG-2644] - Piggybank's HadoopJobHistoryLoader throws NPE when reading broken history file
- [PIG-2627] - Custom partitioner not set when POSplit is involved in Plan
- [PIG-2596] - Jython UDF does not handle boolean output
- [PIG-2578] - Multiple Store-commands mess up mapred.output.dir.
- [PIG-2649] - org.apache.pig.parser.ParserValidationException does not expose the cause exception
- [PIG-2642] - StoreMetadata.storeSchema can't access files in the output directory (Hadoop 0.23)
- [PIG-2621] - Documentation inaccurate regarding Pig Properties in trunk
- [PIG-2540] - AvroStorage can't read schema on amazon s3 in elastic mapreduce
- [PIG-2550] - Custom tuple results in "Unexpected datatype 110 while reading tuplefrom binary file" while spilling
- [PIG-2442] - Multiple Stores in pig streaming causes infinite waiting
- [PIG-1270] - Push limit into loader
- [PIG-2609] - e2e harness: make hdfs base path configurable (outside default.conf)
- [PIG-2608] - Typo in PigStorage documentation for source tagging
- [PIG-2505] - AvroStorage won't read any file not ending in .avro
- [PIG-2585] - Enable ignored e2e test cases
- [PIG-2563] - IndexOutOfBoundsException: while projecting fields from a bag
- [PIG-2411] - AvroStorage UDF in PiggyBank fails to STORE a bag of single-field tuples as Avro arrays
- [PIG-438] - Handle realiasing of existing Alias (A=B;)
- [PIG-2576] - Change in behavior for UDFContext.getUDFContext().getJobConf() in front-end
- [PIG-2573] - Automagically setting parallelism based on input file size does not work with HCatalog
- [PIG-2590] - running ant tar and rpm targets on same copy of pig source results in problems
- [PIG-2581] - HashFNV inconsistent/non-deterministic due to default platform encoding
- [PIG-2588] - e2e harness: use pig command for cluster deploy
- [PIG-2570] - LimitOptimizer fails with dynamic LIMIT argument
- [PIG-2543] - PigStats.isSuccessful returns false if embedded pig script has sh commands
- [PIG-2572] - e2e harness deploy fails when using pig that does not bundle hadoop
- [PIG-2568] - PigOutputCommitter hide exception in commitJob
- [PIG-2514] - REGEX_EXTRACT not returning correct group with non greedy regex
- [PIG-2509] - Util.getSchemaFromString fails with java.lang.NullPointerException when a tuple in a bag has no name (as when used in MongoStorage UDF)
- [PIG-2559] - Embedded pig in python; invoking sys.exit(0) causes script failure
- [PIG-2564] - Build fails - Hadoop 0.23.1-SNAPSHOT no longer available
- [PIG-2535] - Bug in new logical plan results in no output for join
- [PIG-2534] - Pig generating infinite map outputs
- [PIG-2532] - Registered classes fail deserialization in frontend
- [PIG-2549] - org.apache.pig.piggybank.storage.avro - Broken documentation link for AvroStorage
- [PIG-2502] - Make "hcat.bin" configurable in e2e test
- [PIG-2530] - Reusing alias name in nested foreach causes incorrect results
- [PIG-2322] - varargs functions do not get passed the arguments in Python embedding
- [PIG-2533] - Pig MR job exceptions masked on frontend
- [PIG-2497] - Order of execution of fs, store and sh commands in Pig is not maintained
- [PIG-2493] - UNION causes casting issues
- [PIG-2501] - Changes needed to contrib/piggybank/java/build.xml in order to build piggybank.jar with Hadoop 0.23
- [PIG-2499] - Pig TestGrunt.testShellCommand occasionally fails
- [PIG-2484] - Fix several e2e test failures/aborts for 23
- [PIG-2426] - ProgressableReporter.progress(String msg) is an empty function
- [PIG-2413] - e2e test should support testing against two cluster
- [PIG-2326] - Pig minicluster tests can not be run from eclipse
- [PIG-2479] - changingPattern should be used with checkmodified in ivysettings.xml
- [PIG-2347] - Fix Pig Unit tests for hadoop 23
- [PIG-2477] - TestBuiltin testLFText/testSFPig failing against 23 due to invalid test setup -- InvalidInputException
- [PIG-2462] - getWrappedSplit is incorrectly returning the first split instead of the current split.
- [PIG-2472] - piggybank unit tests write directly to /tmp
- [PIG-2342] - Pig tutorial documentation needs to update about building tutorial
- [PIG-2267] - Make the name of the columns in schema optional
- [PIG-2458] - Can't have spaces in parameter substitution
- [PIG-2410] - Piggybank does not compile in 23
- [PIG-2418] - rpm release package does not take PIG_CLASSPATH
- [PIG-2430] - An EvalFunc which overrides getArgToFuncMapping with FuncSpec with constructor arguments is not properly instantiated with said arguments
- [PIG-2457] - JsonLoaderStorage tests is broken for e2e
- [PIG-2432] - Eclipse .classpath file is out of date
- [PIG-2363] - _logs for streaming commands bug in new parser
- [PIG-2453] - Fetching schema can be very slow for multi-thousand LOADs
- [PIG-2291] - PigStats.isSuccessful returns false if embedded pig script has dump
- [PIG-2427] - getSchemaFromString throws away the name of the tuple that is in a bag
- [PIG-2425] - Aggregate Warning does not work as expected on Embedding Pig in Java 0.9.1
- [PIG-2331] - BinStorage in LOAD statement failing when input has curly braces
- [PIG-2415] - A fix for 0.23 local mode: put "yarn-default.xml" into the configuration
- [PIG-2391] - Bzip_2 test is broken
- [PIG-2402] - inIllustrator condition in PigMapReduce is wrong for hadoop 23
- [PIG-2370] - SkewedPartitioner results in Kerberos error
- [PIG-2374] - streaming regression with dotNext
- [PIG-2387] - BinStorageRecordReader causes negative progress
- [PIG-2354] - Several fixes for bin/pig
- [PIG-2385] - Store statements not getting processed
- [PIG-2358] - JobStats.getHadoopCounters() is never set and always returns null
- [PIG-2384] - Generic Invokers should use PigContext to resolve classes
- [PIG-2327] - bin/pig doesn't have any hooks for picking up ZK installation deployed from tarballs
- [PIG-2184] - Not able to provide positional reference to macro invocations
- [PIG-2311] - Class cast exception thrown in STRSPLIT even after casting properly
- [PIG-2209] - JsonMetadata fails to find schema for glob paths
- [PIG-2355] - ant clean does not clean e2e test build artifacts
- [PIG-2352] - e2e test harness' use of environment variables causes unintended effects between tests
- [PIG-2165] - Need a way to deal with params and param_file in embedded pig in python
- [PIG-2339] - HCatLoader loads all the partitions in a partitioned table even though a filter clause on the partitions is specified in the Pig script
- [PIG-2346] - TypeCastInsert should not insert Foreach if there is no as statement
- [PIG-2313] - NPE in ILLUSTRATE trying to get StatusReporter in STORE
- [PIG-2275] - NullPointerException from ILLUSTRATE
- [PIG-2119] - DuplicateForEachColumnRewrite makes assumptions about the position of LOGGenerate in the plan
Improvement
- [PIG-3044] - Trigger POPartialAgg compaction under GC pressure
- [PIG-2934] - HBaseStorage filter optimizations
- [PIG-3019] - Need a target in build.xml for source releases
- [PIG-2898] - Parallel execution of e2e tests
- [PIG-2976] - Reduce HBaseStorage logging
- [PIG-2947] - Documentation for Rank operator
- [PIG-2964] - Add helper method getJobList() to PigStats.JobGraph. Extend visibility of couple methods on same class
- [PIG-2946] - Documentation of "history" and "clear" commands
- [PIG-2877] - Make SchemaTuple work in foreach (and thus, in loads)
- [PIG-2923] - Lazily register bags with SpillableMemoryManager
- [PIG-2909] - Add a new option for ignoring corrupted files to AvroStorage load func
- [PIG-2915] - Builtin TOP udf is sensitive to null input bags
- [PIG-2882] - Use Deque instead of Stack
- [PIG-2835] - Optimizing the conversion from bytes to Integer/Long
- [PIG-2888] - Improve performance of POPartialAgg
- [PIG-2850] - Pig should support loading macro files as resources stored in JAR files
- [PIG-2875] - Add recursive record support to AvroStorage
- [PIG-2858] - Improve PlanHelper to allow finding any PhysicalOperator in a plan
- [PIG-2492] - AvroStorage should recognize globs and commas
- [PIG-2706] - Add clear to list of grunt commands
- [PIG-2763] - Groovy UDFs
- [PIG-2808] - Add *.project to .gitignore
- [PIG-2750] - add artifacts to the ivy.xml for other jars Pig generates
- [PIG-2770] - Allow easy inclusion of custom build targets
- [PIG-2697] - pretty print schema
- [PIG-2673] - Allow Merge join to follow an ORDER statement
- [PIG-2166] - UDFs to join a bag
- [PIG-2711] - e2e harness: cache benchmark results between test runs
- [PIG-2732] - Let's get rid of the deprecated Tuple methods
- [PIG-2658] - Add start time for pig script in generated Map-Reduce job conf
- [PIG-2735] - Add a pig.version.suffix property in build.xml to easily override with a build number
- [PIG-2705] - outputSchema modification from scripting UDFs
- [PIG-2724] - Make Tuple Iterable
- [PIG-2733] - Add *.patch, *.log, *.orig, *.rej, *.class to gitignore
- [PIG-2638] - Optimize BinInterSedes treatment of longs
- [PIG-2709] - PigAvroRecordReader should specify which file has a problem when throwing IOException
- [PIG-2600] - Better Map support
- [PIG-2702] - Make Pig local mode (and tests) faster by working around the hard coded sleep(5000) in hadoop's JobControl
- [PIG-2696] - Enhance Job Stat to print out median map and reduce time
- [PIG-2659] - add source location of the aliases in the physical plan
- [PIG-2583] - Add Grunt command to list the statements in cache
- [PIG-2688] - Log the aliases being processed for the current job
- [PIG-2664] - Allow PPNL impls to get more job info during the run
- [PIG-2663] - Expose helpful ScriptState methods
- [PIG-2677] - Add target to build.xml to generate clover summary reports
- [PIG-2587] - Compute LogicalPlan signature and store in job conf
- [PIG-2574] - Make reducer estimator plugable
- [PIG-2541] - Automatic record provenance (source tagging) for PigStorage
- [PIG-2601] - Additional document for 0.10
- [PIG-2623] - Support S3 paths for registering scripting UDFs.
- [PIG-2604] - Pig should print its build info at runtime
- [PIG-2182] - Add more append support to DataByteArray
- [PIG-2565] - Support IMPORT for macros stored in S3 Buckets
- [PIG-2538] - Add helper wrapper classes for StoreFunc
- [PIG-2518] - Add ability to clean ivy cache in build.xml
- [PIG-2010] - Bundle registered jars via distributed cache
- [PIG-2515] - Make CustomFormatToISO return null on Exception in parsing dates
- [PIG-2491] - Pig docs still mention hadoop-site.xml
- [PIG-2503] - Make @MonitoredUDF inherited
- [PIG-2504] - Incorrect sample provided for REGEX_EXTRACT
- [PIG-2496] - Cache resolved classes in PigContext
- [PIG-2349] - Ant build repeats ivy-buildJar several times
- [PIG-2282] - Automatically update Eclipse .classpath file when new libs are added to the classpath through Ivy
- [PIG-2431] - Upgrade bundled hadoop version to 1.0.0
- [PIG-2468] - Speed up TestBuiltin
- [PIG-2467] - Speed up TestCommit
- [PIG-2400] - Document hash based aggregation support
- [PIG-2460] - Use guava 11 instead of r06
- [PIG-2447] - piggybank: get hive dependency from maven
- [PIG-2437] - Use Ivy to get automaton.jar
- [PIG-2448] - Convert more tests to use LOCAL mode
- [PIG-2438] - Do not hardcode commons-lang version in build.xml
- [PIG-2422] - Add log messages for Jython schema definitions
- [PIG-2403] - Reduce code duplication in SUM, MAX, MIN udfs
- [PIG-2382] - Modify .gitignore to ignore pig-withouthadoop.jar
- [PIG-2380] - Expose version information more cleanly
- [PIG-2365] - Current TOP implementation needlessly results in a null bag name
- [PIG-2151] - Add annotation to specify output schema in Java UDFs
- [PIG-2337] - Provide UDF with input schema
- [PIG-2230] - Display empty param instead of usage error message
New Feature
- [PIG-2579] - Support for multiple input schemas in AvroStorage
- [PIG-2879] - Pig current releases lack a UDF startsWith. This UDF tests if a given string starts with the specified prefix.
- [PIG-2900] - Streaming should provide conf settings in the environment
- [PIG-2353] - RANK function like in SQL
- [PIG-1891] - Enable StoreFunc to make intelligent decision based on job success or failure
- [PIG-2855] - Provide a method to measure time spent in UDFs
- [PIG-2765] - Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator
- [PIG-2726] - Handling legitimate NULL values
- [PIG-2651] - Provide a much easier to use accumulator interface
- [PIG-2710] - Implement Naive CUBE operator
- [PIG-2066] - Accumulators should be able to early-terminate
- [PIG-2547] - Easier UDFs: Convenient EvalFunc super-classes
- [PIG-2660] - PPNL should get notified of plan before it gets executed
- [PIG-2650] - Convenience mock Loader and Storer to simplify unit testing of Pig scripts
- [PIG-2317] - Ruby/Jruby UDFs
- [PIG-2589] - Additional e2e test for 0.10 new features
- [PIG-2548] - Support for providing parameters to python script
- [PIG-2525] - Support pluggable PigProgressNotificationListeners on the command line
- [PIG-2456] - Pig should have a pigrc to specify default script cache
- [PIG-2482] - Integrate HCat DDL command into Pig
- [PIG-2359] - Support more efficient Tuples when schemas are known
- [PIG-2443] - [Piggybank] Add UDFs to check if a String is an Integer And if a String is Numeric
- [PIG-2332] - JsonLoader/JsonStorage
- [PIG-2328] - Add builtin UDFs for building and using bloom filters
- [PIG-2125] - Make Pig work with hadoop .NEXT
- [PIG-2338] - Need signature for EvalFunc
Task
- [PIG-3034] - Remove Penny code from Pig repository
- [PIG-2817] - Documentation for Groovy UDFs
- [PIG-2488] - Move Python unit tests to e2e tests
- [PIG-2444] - Remove the Zebra *.xml documentation files from the TRUNK and Branch-10
- [PIG-2300] - Pig Docs - release 0.10.0 (and 0.9.1)
Test
- [PIG-3076] - make TestScalarAliases more reliable
- [PIG-2982] - add unit tests for DateTime type that test setting timezone
- [PIG-2708] - split MiniCluster based tests out of org.apache.pig.test.TestInputOutputFileValidator
- [PIG-2740] - get rid of "java[77427:1a03] Unable to load realm info from SCDynamicStore" log lines when running pig tests
- [PIG-2245] - Add end to end test for tokenize