Package: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 21704
Depends: adduser, sun-java6-jre
Recommends: hadoop-native
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop_0.18.3-6cloudera0.3.0~hardy_all.deb
Size: 12453782
SHA256: dfa688b108005b4248bce8f1b6d73c6e75906c236b803ba29198eccde224e0f7
SHA1: 377402c7fca9a4d04f90cebd72a1f3c22cfc522f
MD5sum: 9329029946f72d5c76d179600ef117dd
Description: A software platform for processing vast amounts of data
 Hadoop is a software platform that lets one easily write and run
 applications that process vast amounts of data.
 .
 Here's what makes Hadoop especially useful:
  * Scalable: Hadoop can reliably store and process petabytes.
  * Economical: It distributes the data and processing across clusters
    of commonly available computers. These clusters can number into
    the thousands of nodes.
  * Efficient: By distributing the data, Hadoop can process it in
    parallel on the nodes where the data is located. This makes it
    extremely rapid.
  * Reliable: Hadoop automatically maintains multiple copies of data
    and automatically redeploys computing tasks based on failures.
 .
 Hadoop implements MapReduce, using the Hadoop Distributed File System
 (HDFS). MapReduce divides applications into many small blocks of work.
 HDFS creates multiple replicas of data blocks for reliability, placing
 them on compute nodes around the cluster. MapReduce can then process
 the data where it is located.

Package: hadoop-conf-pseudo
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 212
Depends: hadoop (= 0.18.3-6cloudera0.3.0~hardy)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-conf-pseudo_0.18.3-6cloudera0.3.0~hardy_all.deb
Size: 93186
SHA256: 5f95cb9b3c3905c644c84ea062a8dcbb8abad9b467ed6fea9820c1992f62e07c
SHA1: d3127e5f9b52129f9fe9a60d965b79d53595a6ed
MD5sum: 2f04ed5ed6c812a0b0324a6ee2017c9d
Description: Pseudo-distributed Hadoop configuration
 Contains configuration files for a "pseudo-distributed" Hadoop
 deployment. In this mode, each of the hadoop components runs as a
 separate Java process, but all on the same machine.

Package: hadoop-datanode
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~hardy)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-datanode_0.18.3-6cloudera0.3.0~hardy_all.deb
Size: 79586
SHA256: 664749e5c65d22b9f8895544cfd980a4a169b277475ce30cc568516cd9b286fa
SHA1: 7fb28ac7d23c46b0acf785fc5a66aad514499772
MD5sum: bbbe55655e1feaaad4d7da34cc249b33
Description: Data Node for Hadoop
 The Data Nodes in the Hadoop Cluster are responsible for serving up
 blocks of data over the network to Hadoop Distributed Filesystem
 (HDFS) clients.

Package: hadoop-doc
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 37040
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: doc
Filename: pool/contrib/h/hadoop/hadoop-doc_0.18.3-6cloudera0.3.0~hardy_all.deb
Size: 5709022
SHA256: 38f0412488d4ec6a0f2fb26e8be3f7b3597cfbc0c924c02948812138e2363ca0
SHA1: eaed807d4eca9c2032198906d202e48d0a888c46
MD5sum: 152e919e39ef7cd374e827f4ff24fab4
Description: Documentation for Hadoop
 This package contains the Java Documentation for Hadoop and its
 relevant APIs.

Package: hadoop-jobtracker
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~hardy)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-jobtracker_0.18.3-6cloudera0.3.0~hardy_all.deb
Size: 79620
SHA256: d0a0d473bec21e227a1fed740ab4856d74d94d6f6c62440c1a4b1c397ffd9155
SHA1: 42dd6898df7043c2259612f22acac7a0a52341a0
MD5sum: 93d7fb67bcce0de552038c26317a43f0
Description: Job Tracker for Hadoop
 The jobtracker is a central service which is responsible for managing
 the tasktracker services running on all nodes in a Hadoop Cluster.
 The jobtracker allocates work to the tasktracker nearest to the data
 with an available work slot.

Package: hadoop-namenode
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~hardy)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-namenode_0.18.3-6cloudera0.3.0~hardy_all.deb
Size: 79584
SHA256: e83c16e25852f2dd21746b3ad3e782e31d198fbeeb3c31ffe3f3f8329e1bd156
SHA1: 97c2e9d7ebba10c0dcfb844f6c618e98cc2e63fb
MD5sum: e6b0dbe0ae3ed205600d6e390b58d19f
Description: Name Node for Hadoop
 The Hadoop Distributed Filesystem (HDFS) requires one unique server,
 the namenode, which manages the block locations of files on the
 filesystem.

Package: hadoop-native
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: i386
Maintainer: Todd Lipcon
Installed-Size: 204
Depends: hadoop (= 0.18.3-6cloudera0.3.0~hardy), libc6 (>= 2.4), liblzo2-2, libz1
Enhances: hadoop
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-native_0.18.3-6cloudera0.3.0~hardy_i386.deb
Size: 93170
SHA256: a9feb1f142521d027b1a2b5a190f722c2b7f91dcebd0d72822e7a6173e189278
SHA1: 0e7b0a9aaac7643f826b573a38f93d9a241fa3fe
MD5sum: b8c829f2a54c9eba72b45fd8233da072
Description: Native libraries for Hadoop (e.g., compression)
 This optional package contains native libraries that increase the
 performance of Hadoop's compression.

Package: hadoop-pipes
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: i386
Maintainer: Todd Lipcon
Installed-Size: 356
Depends: hadoop (= 0.18.3-6cloudera0.3.0~hardy)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-pipes_0.18.3-6cloudera0.3.0~hardy_i386.deb
Size: 140134
SHA256: 5b587edab769b99b642e4ff06a5dc251cb899642b481631168824ad3659180cd
SHA1: 95f6c5ae02eb25f5698a27ba79a8f1532800aad6
MD5sum: a4b356d12f17f785be61788a6ad34154
Description: Interface to author Hadoop MapReduce jobs in C++
 Contains Hadoop Pipes, a library which allows Hadoop MapReduce jobs
 to be written in C++.

Package: hadoop-secondarynamenode
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~hardy)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-secondarynamenode_0.18.3-6cloudera0.3.0~hardy_all.deb
Size: 79616
SHA256: ee2d1933f90520261519fbdae73f3a3148d41395dfa8d72f3f48b6885a3e08dd
SHA1: 3c40d7afb7c4e9b7e93f1f7aed8acf42dc3e8c86
MD5sum: dbea2563607059457425afe1fe193488
Description: Secondary Name Node for Hadoop
 The Secondary Name Node is responsible for checkpointing file system
 images. It is _not_ a failover pair for the namenode, and may safely
 be run on the same machine.

Package: hadoop-source
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 45564
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-source_0.18.3-6cloudera0.3.0~hardy_all.deb
Size: 16287264
SHA256: c1c1f2631a8bced0c771ba9ae061422fb48f7baa37fb90bd051508530ff8c870
SHA1: 642f6571ab4973cf9a49ff9d67ffd33d5e3a86f0
MD5sum: 5205f4eefd04a0426acbea3517a7df9f
Description: Source code for Hadoop
 This package contains the source code for Hadoop and its contrib
 modules.

Package: hadoop-tasktracker
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 132
Depends: hadoop (= 0.18.3-6cloudera0.3.0~hardy)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/hadoop-tasktracker_0.18.3-6cloudera0.3.0~hardy_all.deb
Size: 79592
SHA256: 2a2b23d801859a0ded800d975d41b5982edce32d934a1f2e9d353849f6e50983
SHA1: 4361756b1ae32c8e36214b511272eb1f34e519cb
MD5sum: 5f861b50556821af25a60f3403861a32
Description: Task Tracker for Hadoop
 The Task Tracker is the Hadoop service that accepts MapReduce tasks
 and computes results. Each node in a Hadoop cluster that should be
 doing computation should run a Task Tracker.

Package: hive
Version: 0.3.0-0cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 10996
Depends: hadoop, java6-jre | java6-sdk
Homepage: http://hadoop.apache.org/hive/
Priority: extra
Section: misc
Filename: pool/contrib/h/hive/hive_0.3.0-0cloudera0.3.0~hardy_all.deb
Size: 9505524
SHA256: e19f808e3c8f95d6992c2820c606ed8967584f7eb30a0a0d8cd0d1e416d58544
SHA1: 9ded4d470860fa5e36b7b506bc24261d4051cf4f
MD5sum: 5cd7069c37fbc795c5c35ac74a6323be
Description: A data warehouse infrastructure built on top of Hadoop
 Hive is a data warehouse infrastructure built on top of Hadoop that
 provides tools to enable easy data summarization, ad hoc querying and
 analysis of large datasets stored in Hadoop files. It provides a
 mechanism to put structure on this data and it also provides a simple
 query language called Hive QL which is based on SQL and which enables
 users familiar with SQL to query this data. At the same time, this
 language also allows traditional map/reduce programmers to plug in
 their custom mappers and reducers to do more sophisticated analysis
 which may not be supported by the built-in capabilities of the
 language.

Package: libhdfs0
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: i386
Maintainer: Todd Lipcon
Installed-Size: 156
Depends: hadoop (= 0.18.3-6cloudera0.3.0~hardy), libc6 (>= 2.4)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: misc
Filename: pool/contrib/h/hadoop/libhdfs0_0.18.3-6cloudera0.3.0~hardy_i386.deb
Size: 90254
SHA256: 2530b0debb018e20f655543abbb0d7115ddfe60d41a43acd0e1032089cabfda2
SHA1: e33fc18e17faee3ea4b83b67f3d02a2b8006b5c4
MD5sum: 70025f50c0b816d08803173fdd1fa113
Description: JNI Bindings to access Hadoop HDFS from C
 See http://wiki.apache.org/hadoop/LibHDFS

Package: libhdfs0-dev
Source: hadoop
Version: 0.18.3-6cloudera0.3.0~hardy
Architecture: i386
Maintainer: Todd Lipcon
Installed-Size: 156
Depends: hadoop (= 0.18.3-6cloudera0.3.0~hardy), libhdfs0 (= 0.18.3-6cloudera0.3.0~hardy)
Homepage: http://hadoop.apache.org/core/
Priority: extra
Section: libdevel
Filename: pool/contrib/h/hadoop/libhdfs0-dev_0.18.3-6cloudera0.3.0~hardy_i386.deb
Size: 86366
SHA256: 2b424214a9b524718aee40a9666eb2f4d3c676cc7a6d9d7f6b96e1e2d8faf3e6
SHA1: 4e2b0342cb4c748ef4b65c38cd9aebdcce1a2d78
MD5sum: 0d2304b62c66088e0d920ab21f080756
Description: Development support for libhdfs0
 Includes examples and header files for accessing HDFS from C

Package: pig
Version: 0.2.0-0cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 42348
Depends: hadoop, java6-jre | java6-sdk
Homepage: http://hadoop.apache.org/pig/
Priority: extra
Section: misc
Filename: pool/contrib/p/pig/pig_0.2.0-0cloudera0.3.0~hardy_all.deb
Size: 16036552
SHA256: 8a7cd5a1f07dfcdb97584321c566d7cf4c1d2a36046c259c347f06120c92cc90
SHA1: 34708fbd88f63b406aff474bdea909bfb30fca5d
MD5sum: f4ca7c60871504961a6330d469db93bb
Description: A platform for analyzing large data sets using Hadoop
 Pig is a platform for analyzing large data sets that consists of a
 high-level language for expressing data analysis programs, coupled
 with infrastructure for evaluating these programs. The salient
 property of Pig programs is that their structure is amenable to
 substantial parallelization, which in turn enables them to handle
 very large data sets.
 .
 At the present time, Pig's infrastructure layer consists of a compiler
 that produces sequences of Map-Reduce programs, for which large-scale
 parallel implementations already exist (e.g., the Hadoop subproject).
 Pig's language layer currently consists of a textual language called
 Pig Latin, which has the following key properties:
 .
  * Ease of programming
    It is trivial to achieve parallel execution of simple,
    "embarrassingly parallel" data analysis tasks. Complex tasks
    comprised of multiple interrelated data transformations are
    explicitly encoded as data flow sequences, making them easy to
    write, understand, and maintain.
  * Optimization opportunities
    The way in which tasks are encoded permits the system to optimize
    their execution automatically, allowing the user to focus on
    semantics rather than efficiency.
  * Extensibility
    Users can create their own functions to do special-purpose
    processing.

Package: python-hive
Source: hive
Version: 0.3.0-0cloudera0.3.0~hardy
Architecture: all
Maintainer: Todd Lipcon
Installed-Size: 688
Depends: python, python-support (>= 0.7.1)
Provides: python2.3-hive, python2.4-hive, python2.5-hive
Homepage: http://hadoop.apache.org/hive/
Priority: extra
Section: python
Filename: pool/contrib/h/hive/python-hive_0.3.0-0cloudera0.3.0~hardy_all.deb
Size: 49672
SHA256: 417f0ba324c51d6c883fdb00af3133bf5e4912b759a75a5e1ccf2169b223b6e4
SHA1: 6bd902eb545e04b4988e50330b5fc295a4197a64
MD5sum: 863460233f55b2b21e459ac1dd71aaf2
Description: Python client library to talk to the Hive Metastore
 This is a generated Thrift client to talk to the Hive Metastore.