Package: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 21712
Depends: adduser, sun-java6-jre
Recommends: hadoop-native
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 12456720
SHA256: 499ba140d3f0f02913fc1ac369cf307730fa9fba1f6cf58a0c63e39176e92550
SHA1: de529c73312f3cd9186db1171583beb6dd900bd1
MD5sum: 69d17abbf5b36c4488b559c29fd29a76
Description: A software platform for processing vast amounts of data
 Hadoop is a software platform that lets one easily write and run applications
 that process vast amounts of data.
 .
 Here's what makes Hadoop especially useful:
  * Scalable: Hadoop can reliably store and process petabytes.
  * Economical: It distributes the data and processing across clusters
    of commonly available computers. These clusters can number into
    the thousands of nodes.
  * Efficient: By distributing the data, Hadoop can process it in parallel
    on the nodes where the data is located. This makes it extremely rapid.
  * Reliable: Hadoop automatically maintains multiple copies of data and
    automatically redeploys computing tasks based on failures.
 .
 Hadoop implements MapReduce, using the Hadoop Distributed File System (HDFS).
 MapReduce divides applications into many small blocks of work. HDFS creates
 multiple replicas of data blocks for reliability, placing them on compute
 nodes around the cluster. MapReduce can then process the data where it is
 located.

Package: hadoop-conf-pseudo
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 224
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-conf-pseudo_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 93314
SHA256: d23f6e49ac15598fca003bbcbe0ef8e669a0bf3c652b44dedf35a0a0579c1841
SHA1: 5c70040243952d31493cf446aced8e470d3365bf
MD5sum: f411bc1419ff1d3e9163cd6f9b2e31f7
Description: Pseudo-distributed Hadoop configuration
 Contains configuration files for a "pseudo-distributed" Hadoop deployment.
 In this mode, each of the hadoop components runs as a separate Java process,
 but all on the same machine.

Package: hadoop-datanode
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-datanode_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 79588
SHA256: cf334e2d281d63a419697927652b9362b5d23eb7989eba170704ae758244f5f6
SHA1: 5d26277bb7aad54e5d3ba4a409542e4cd419b74c
MD5sum: 390cc08c2f3338f9b0af6597238065e5
Description: Data Node for Hadoop
 The Data Nodes in the Hadoop Cluster are responsible for serving up
 blocks of data over the network to Hadoop Distributed Filesystem
 (HDFS) clients.

Package: hadoop-doc
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 37052
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: doc
Filename: pool/contrib/h/hadoop/hadoop-doc_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 5715068
SHA256: 3c070450218b35776b693e4c89b3887b31814752edd73b2d8aa9e392c4cea2eb
SHA1: 23b2d8d1711d0cfde0cce3d1dc0f378a48fc9f03
MD5sum: 015f175abddb5a6641c3dca2792c3523
Description: Documentation for Hadoop
 This package contains the Java Documentation for Hadoop and its relevant
 APIs.

Package: hadoop-jobtracker
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-jobtracker_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 79624
SHA256: 40764fef03f394f8fd35ebae0d264649131f388fcd35849720e0c62fa07e7cf3
SHA1: efc91e642fde937f9e134641717c240923bcd9fb
MD5sum: c368924a0e3ce968e5b103ea84c3c38d
Description: Job Tracker for Hadoop
 The jobtracker is a central service which is responsible for managing
 the tasktracker services running on all nodes in a Hadoop Cluster.
 The jobtracker allocates work to the tasktracker nearest to the data
 with an available work slot.

Package: hadoop-namenode
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-namenode_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 79586
SHA256: 542fda47805219b107a09c7e2ac2767391c6c9f6ffaf3b32c4c34aa466bfaad4
SHA1: 672de184ccba1a759e51f7cb30004bea3628eeab
MD5sum: 8f366eff47614ab1d8be3c19c26b2151
Description: Name Node for Hadoop
 The Hadoop Distributed Filesystem (HDFS) requires one unique server, the
 namenode, which manages the block locations of files on the filesystem.

Package: hadoop-native
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: amd64
Maintainer: Todd Lipcon
Installed-Size: 232
Depends: libc6 (>= 2.4), hadoop (= 0.18.3-6cloudera0.3.0~intrepid), liblzo2-2, libz1
Enhances: hadoop
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-native_0.18.3-6cloudera0.3.0~intrepid_amd64.deb
Size: 95026
SHA256: 42985a4355d5166ec04cc583287abb2196bdadb670df40b453e196a3f12ace81
SHA1: 16d39b9ba47b837b3720fb13c0034d66aa2cd210
MD5sum: ffe079e30cb555190ca8730e02ac83a1
Description: Native libraries for Hadoop (e.g., compression)
 This optional package contains native libraries that increase the
 performance of Hadoop's compression.

Package: hadoop-pipes
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: amd64
Maintainer: Todd Lipcon
Installed-Size: 396
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-pipes_0.18.3-6cloudera0.3.0~intrepid_amd64.deb
Size: 138476
SHA256: f5ecf6c98c2dd57d17db3a16bb3decfd077b841b8d013b8cf286277d7091be9f
SHA1: 93a4982a2a1e240c14015e62f337490d108540f4
MD5sum: 937d043f4a370001014925594a864a66
Description: Interface to author Hadoop MapReduce jobs in C++
 Contains Hadoop Pipes, a library which allows Hadoop MapReduce jobs to be
 written in C++.

Package: hadoop-secondarynamenode
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-secondarynamenode_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 79614
SHA256: 076799a6f45f071c1888a31a1816bfe8e9c996c8dc425181dfead658df6ecc2f
SHA1: c4b43b941e887b45fc5d92cfd7160506ec7859b4
MD5sum: f165a03bb06814a6829b67aea03d17a7
Description: Secondary Name Node for Hadoop
 The Secondary Name Node is responsible for checkpointing file system images.
 It is _not_ a failover pair for the namenode, and may safely be run on the
 same machine.

Package: hadoop-source
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 45564
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-source_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 16285110
SHA256: d01adbbcb2e6506493d4c06050a007e446acfe23ba2faee94a28d55c3ec946c9
SHA1: 863ba6195ffcd092a8cb730f399cf28c17607ace
MD5sum: 4e833f7eb00a70900d9e1d6c050572c5
Description: Source code for Hadoop
 This package contains the source code for Hadoop and its contrib modules.

Package: hadoop-tasktracker
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-tasktracker_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 79592
SHA256: 3d419decd814fbad7951918f807f8f124294c166dd3b1446a7b9941dd2fe8fea
SHA1: e18ab1dc8a28227233646320a2f59cfed4ccb22b
MD5sum: f1fc24f671fc324d4848fb095c61323d
Description: Task Tracker for Hadoop
 The Task Tracker is the Hadoop service that accepts MapReduce tasks and
 computes results. Each node in a Hadoop cluster that should be doing
 computation should run a Task Tracker.

Package: hive
Version: 0.3.0-0cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 10996
Depends: java6-jre | java6-sdk, hadoop
Homepage: http://hadoop.apache.org/hive/
Priority: extra
Section: misc
Filename: pool/contrib/h/hive/hive_0.3.0-0cloudera0.3.0~intrepid_all.deb
Size: 9498732
SHA256: 41b240a2fb6be4e3edf3dd940fe8451f59fe5d0c97a3523b6297a2c0ecf1509e
SHA1: 767c6034f2d8aa77c540b7b5fadde6ba28e0d0bd
MD5sum: dcb8a5ab1f87dacb08aa01dee97bb144
Description: A data warehouse infrastructure built on top of Hadoop
 Hive is a data warehouse infrastructure built on top of Hadoop that
 provides tools to enable easy data summarization, ad hoc querying, and
 analysis of large datasets stored in Hadoop files. It provides a
 mechanism to put structure on this data, and it also provides a simple
 query language called Hive QL, which is based on SQL and which enables
 users familiar with SQL to query this data. At the same time, this
 language also allows traditional map/reduce programmers to plug in
 their custom mappers and reducers to do more sophisticated analysis
 which may not be supported by the built-in capabilities of the language.

Package: libhdfs0
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: amd64
Maintainer: Todd Lipcon
Installed-Size: 160
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid), libc6 (>= 2.4)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/libhdfs0_0.18.3-6cloudera0.3.0~intrepid_amd64.deb
Size: 91458
SHA256: ee757fe6e2657c0c3828c79c478c4615b8c7e5abc78ee3c0ab8d7c1b814f33e9
SHA1: 53c25f65bc265ea9188f3372f25d90583ca28887
MD5sum: a54b95363fe854045ac795aab16893d6
Description: JNI Bindings to access Hadoop HDFS from C
 See http://wiki.apache.org/hadoop/LibHDFS

Package: libhdfs0-dev
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: amd64
Maintainer: Todd Lipcon
Installed-Size: 156
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid), libhdfs0 (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: libdevel
Filename: pool/contrib/h/hadoop/libhdfs0-dev_0.18.3-6cloudera0.3.0~intrepid_amd64.deb
Size: 86784
SHA256: a7aca966d5b1e592e1bdb9026487a15141cfbd16c9e59e9cee833d604572360e
SHA1: 585913444553345f35932d2d56a1ea84f0d08519
MD5sum: 650a83bdcf30a8b477b71b1f33df9709
Description: Development support for libhdfs0
 Includes examples and header files for accessing HDFS from C

Package: pig
Version: 0.2.0-0cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 42348
Depends: java6-jre | java6-sdk, hadoop
Homepage: http://hadoop.apache.org/pig/
Priority: extra
Section: misc
Filename: pool/contrib/p/pig/pig_0.2.0-0cloudera0.3.0~intrepid_all.deb
Size: 16051374
SHA256: f5fe7acffb10b9cb15253b39f9078c143f87b97bb66b5da3615f6f97ac27cd09
SHA1: 54799fd6d613718444c655ffbc2f546c7b72b096
MD5sum: 4565c93740002c4f17b922130f5cc98b
Description: A platform for analyzing large data sets using Hadoop
 Pig is a platform for analyzing large data sets that consists of a high-level
 language for expressing data analysis programs, coupled with infrastructure
 for evaluating these programs. The salient property of Pig programs is that
 their structure is amenable to substantial parallelization, which in turn
 enables them to handle very large data sets.
 .
 At the present time, Pig's infrastructure layer consists of a compiler that
 produces sequences of Map-Reduce programs, for which large-scale parallel
 implementations already exist (e.g., the Hadoop subproject). Pig's language
 layer currently consists of a textual language called Pig Latin, which has
 the following key properties:
 .
  * Ease of programming
    It is trivial to achieve parallel execution of simple, "embarrassingly
    parallel" data analysis tasks. Complex tasks comprised of multiple
    interrelated data transformations are explicitly encoded as data flow
    sequences, making them easy to write, understand, and maintain.
  * Optimization opportunities
    The way in which tasks are encoded permits the system to optimize their
    execution automatically, allowing the user to focus on semantics rather
    than efficiency.
  * Extensibility
    Users can create their own functions to do special-purpose processing.

Package: python-hive
Source: hive
Version: 0.3.0-0cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 688
Depends: python, python-support (>= 0.7.1)
Provides: python2.4-hive, python2.5-hive
Homepage: http://hadoop.apache.org/hive/
Priority: extra
Section: python
Filename: pool/contrib/h/hive/python-hive_0.3.0-0cloudera0.3.0~intrepid_all.deb
Size: 49602
SHA256: e42ce87ffb866691205797799a01a45d0852e4d6bebb857609162830b0e63943
SHA1: c35775544014700e6cee68cf89b8c7b3966077af
MD5sum: 2ee0e5e1bd694e1565ed3cda887137fd
Description: Python client library to talk to the Hive Metastore
 This is a generated Thrift client to talk to the Hive Metastore.