Package: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 21712
Depends: adduser, sun-java6-jre
Recommends: hadoop-native
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 12456720
SHA256: 499ba140d3f0f02913fc1ac369cf307730fa9fba1f6cf58a0c63e39176e92550
SHA1: de529c73312f3cd9186db1171583beb6dd900bd1
MD5sum: 69d17abbf5b36c4488b559c29fd29a76
Description: A software platform for processing vast amounts of data
 Hadoop is a software platform that lets one easily write and run
 applications that process vast amounts of data.
 .
 Here's what makes Hadoop especially useful:
  * Scalable: Hadoop can reliably store and process petabytes.
  * Economical: It distributes the data and processing across clusters
    of commonly available computers. These clusters can number into
    the thousands of nodes.
  * Efficient: By distributing the data, Hadoop can process it in
    parallel on the nodes where the data is located. This makes it
    extremely rapid.
  * Reliable: Hadoop automatically maintains multiple copies of data
    and automatically redeploys computing tasks based on failures.
 .
 Hadoop implements MapReduce, using the Hadoop Distributed File System
 (HDFS). MapReduce divides applications into many small blocks of work.
 HDFS creates multiple replicas of data blocks for reliability, placing
 them on compute nodes around the cluster. MapReduce can then process
 the data where it is located.

Package: hadoop-conf-pseudo
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 224
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-conf-pseudo_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 93314
SHA256: d23f6e49ac15598fca003bbcbe0ef8e669a0bf3c652b44dedf35a0a0579c1841
SHA1: 5c70040243952d31493cf446aced8e470d3365bf
MD5sum: f411bc1419ff1d3e9163cd6f9b2e31f7
Description: Pseudo-distributed Hadoop configuration
 Contains configuration files for a "pseudo-distributed" Hadoop
 deployment. In this mode, each of the hadoop components runs as a
 separate Java process, but all on the same machine.

Package: hadoop-datanode
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-datanode_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 79588
SHA256: cf334e2d281d63a419697927652b9362b5d23eb7989eba170704ae758244f5f6
SHA1: 5d26277bb7aad54e5d3ba4a409542e4cd419b74c
MD5sum: 390cc08c2f3338f9b0af6597238065e5
Description: Data Node for Hadoop
 The Data Nodes in the Hadoop Cluster are responsible for serving up
 blocks of data over the network to Hadoop Distributed Filesystem
 (HDFS) clients.

Package: hadoop-doc
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 37052
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: doc
Filename: pool/contrib/h/hadoop/hadoop-doc_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 5715068
SHA256: 3c070450218b35776b693e4c89b3887b31814752edd73b2d8aa9e392c4cea2eb
SHA1: 23b2d8d1711d0cfde0cce3d1dc0f378a48fc9f03
MD5sum: 015f175abddb5a6641c3dca2792c3523
Description: Documentation for Hadoop
 This package contains the Java Documentation for Hadoop and its
 relevant APIs.

Package: hadoop-jobtracker
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-jobtracker_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 79624
SHA256: 40764fef03f394f8fd35ebae0d264649131f388fcd35849720e0c62fa07e7cf3
SHA1: efc91e642fde937f9e134641717c240923bcd9fb
MD5sum: c368924a0e3ce968e5b103ea84c3c38d
Description: Job Tracker for Hadoop
 The jobtracker is a central service which is responsible for managing
 the tasktracker services running on all nodes in a Hadoop Cluster.
 The jobtracker allocates work to the tasktracker nearest to the data
 with an available work slot.

Package: hadoop-namenode
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-namenode_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 79586
SHA256: 542fda47805219b107a09c7e2ac2767391c6c9f6ffaf3b32c4c34aa466bfaad4
SHA1: 672de184ccba1a759e51f7cb30004bea3628eeab
MD5sum: 8f366eff47614ab1d8be3c19c26b2151
Description: Name Node for Hadoop
 The Hadoop Distributed Filesystem (HDFS) requires one unique server,
 the namenode, which manages the block locations of files on the
 filesystem.

Package: hadoop-native
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: i386
Maintainer: Todd Lipcon
Installed-Size: 220
Depends: libc6 (>= 2.4), hadoop (= 0.18.3-6cloudera0.3.0~intrepid), liblzo2-2, libz1
Enhances: hadoop
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-native_0.18.3-6cloudera0.3.0~intrepid_i386.deb
Size: 93884
SHA256: 20445767d9cb347e310552b61903cd10c113fefeb903af617dcb91431bfb9ffb
SHA1: cd2bd1daadb5081a2a65c3c6fa5b5be09396453b
MD5sum: 0090f17e88fc70b575984c170cb66bb1
Description: Native libraries for Hadoop (e.g., compression)
 This optional package contains native libraries that increase the
 performance of Hadoop's compression.

Package: hadoop-pipes
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: i386
Maintainer: Todd Lipcon
Installed-Size: 320
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-pipes_0.18.3-6cloudera0.3.0~intrepid_i386.deb
Size: 134842
SHA256: 9a60dd8aeff68ec0134e8bb31012c7bdad2af92729ecbfedb8bf538bc26d9bcb
SHA1: 7e784cd84d310cdf4241cf48cfa774225e625a2e
MD5sum: 10ed90f8837324d77a01ff8a2f82cbaa
Description: Interface to author Hadoop MapReduce jobs in C++
 Contains Hadoop Pipes, a library which allows Hadoop MapReduce jobs
 to be written in C++.

Package: hadoop-secondarynamenode
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-secondarynamenode_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 79614
SHA256: 076799a6f45f071c1888a31a1816bfe8e9c996c8dc425181dfead658df6ecc2f
SHA1: c4b43b941e887b45fc5d92cfd7160506ec7859b4
MD5sum: f165a03bb06814a6829b67aea03d17a7
Description: Secondary Name Node for Hadoop
 The Secondary Name Node is responsible for checkpointing file system
 images. It is _not_ a failover pair for the namenode, and may safely
 be run on the same machine.

Package: hadoop-source
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 45564
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-source_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 16285110
SHA256: d01adbbcb2e6506493d4c06050a007e446acfe23ba2faee94a28d55c3ec946c9
SHA1: 863ba6195ffcd092a8cb730f399cf28c17607ace
MD5sum: 4e833f7eb00a70900d9e1d6c050572c5
Description: Source code for Hadoop
 This package contains the source code for Hadoop and its contrib
 modules.

Package: hadoop-tasktracker
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-tasktracker_0.18.3-6cloudera0.3.0~intrepid_all.deb
Size: 79592
SHA256: 3d419decd814fbad7951918f807f8f124294c166dd3b1446a7b9941dd2fe8fea
SHA1: e18ab1dc8a28227233646320a2f59cfed4ccb22b
MD5sum: f1fc24f671fc324d4848fb095c61323d
Description: Task Tracker for Hadoop
 The Task Tracker is the Hadoop service that accepts MapReduce tasks
 and computes results. Each node in a Hadoop cluster that should be
 doing computation should run a Task Tracker.

Package: hive
Version: 0.3.0-0cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 10996
Depends: java6-jre | java6-sdk, hadoop
Homepage: http://hadoop.apache.org/hive/
Priority: extra
Section: misc
Filename: pool/contrib/h/hive/hive_0.3.0-0cloudera0.3.0~intrepid_all.deb
Size: 9498732
SHA256: 41b240a2fb6be4e3edf3dd940fe8451f59fe5d0c97a3523b6297a2c0ecf1509e
SHA1: 767c6034f2d8aa77c540b7b5fadde6ba28e0d0bd
MD5sum: dcb8a5ab1f87dacb08aa01dee97bb144
Description: A data warehouse infrastructure built on top of Hadoop
 Hive is a data warehouse infrastructure built on top of Hadoop that
 provides tools to enable easy data summarization, ad hoc querying and
 analysis of large datasets stored in Hadoop files. It provides a
 mechanism to put structure on this data and it also provides a simple
 query language called Hive QL which is based on SQL and which enables
 users familiar with SQL to query this data. At the same time, this
 language also allows traditional map/reduce programmers to plug in
 their custom mappers and reducers to do more sophisticated analysis
 which may not be supported by the built-in capabilities of the
 language.

Package: libhdfs0
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: i386
Maintainer: Todd Lipcon
Installed-Size: 156
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid), libc6 (>= 2.4)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/libhdfs0_0.18.3-6cloudera0.3.0~intrepid_i386.deb
Size: 90242
SHA256: cb5cbccad5c725d7811f18f45ccd79dfdaf7b80a79f96ae29a3b751ce750d653
SHA1: f8e3781761898823bff57c382ebbf8ee8b984dd2
MD5sum: 0bfc8da64b4ec29373e56d1b3253c5c6
Description: JNI Bindings to access Hadoop HDFS from C
 See http://wiki.apache.org/hadoop/LibHDFS

Package: libhdfs0-dev
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~intrepid
Architecture: i386
Maintainer: Todd Lipcon
Installed-Size: 156
Depends: hadoop (= 0.18.3-6cloudera0.3.0~intrepid), libhdfs0 (= 0.18.3-6cloudera0.3.0~intrepid)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: libdevel
Filename: pool/contrib/h/hadoop/libhdfs0-dev_0.18.3-6cloudera0.3.0~intrepid_i386.deb
Size: 86788
SHA256: 25417e2177cf5811cf40a4f25dc97e6e4928508ada96a38b3bdd1974a813542f
SHA1: 0993f67d4f6be2b7df34931ab304be82323a207f
MD5sum: a1a069986e29be971983b4b371c2a9fa
Description: Development support for libhdfs0
 Includes examples and header files for accessing HDFS from C

Package: pig
Version: 0.2.0-0cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 42348
Depends: java6-jre | java6-sdk, hadoop
Homepage: http://hadoop.apache.org/pig/
Priority: extra
Section: misc
Filename: pool/contrib/p/pig/pig_0.2.0-0cloudera0.3.0~intrepid_all.deb
Size: 16051374
SHA256: f5fe7acffb10b9cb15253b39f9078c143f87b97bb66b5da3615f6f97ac27cd09
SHA1: 54799fd6d613718444c655ffbc2f546c7b72b096
MD5sum: 4565c93740002c4f17b922130f5cc98b
Description: A platform for analyzing large data sets using Hadoop
 Pig is a platform for analyzing large data sets that consists of a
 high-level language for expressing data analysis programs, coupled
 with infrastructure for evaluating these programs. The salient
 property of Pig programs is that their structure is amenable to
 substantial parallelization, which in turn enables them to handle
 very large data sets.
 .
 At the present time, Pig's infrastructure layer consists of a compiler
 that produces sequences of Map-Reduce programs, for which large-scale
 parallel implementations already exist (e.g., the Hadoop subproject).
 Pig's language layer currently consists of a textual language called
 Pig Latin, which has the following key properties:
 .
  * Ease of programming
    It is trivial to achieve parallel execution of simple,
    "embarrassingly parallel" data analysis tasks. Complex tasks
    comprised of multiple interrelated data transformations are
    explicitly encoded as data flow sequences, making them easy to
    write, understand, and maintain.
  * Optimization opportunities
    The way in which tasks are encoded permits the system to optimize
    their execution automatically, allowing the user to focus on
    semantics rather than efficiency.
  * Extensibility
    Users can create their own functions to do special-purpose
    processing.

Package: python-hive
Source: hive
Version: 0.3.0-0cloudera0.3.0~intrepid
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 688
Depends: python, python-support (>= 0.7.1)
Provides: python2.4-hive, python2.5-hive
Homepage: http://hadoop.apache.org/hive/
Priority: extra
Section: python
Filename: pool/contrib/h/hive/python-hive_0.3.0-0cloudera0.3.0~intrepid_all.deb
Size: 49602
SHA256: e42ce87ffb866691205797799a01a45d0852e4d6bebb857609162830b0e63943
SHA1: c35775544014700e6cee68cf89b8c7b3966077af
MD5sum: 2ee0e5e1bd694e1565ed3cda887137fd
Description: Python client library to talk to the Hive Metastore
 This is a generated Thrift client to talk to the Hive Metastore.