
Arun Murthy

Open Source Grid Computing: Apache Hadoop

Hadoop is a framework for running applications on large clusters built of commodity hardware. The framework transparently provides applications with both reliability and data motion. Hadoop implements a computational paradigm named Map/Reduce, in which the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system (HDFS) that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both Map/Reduce and the distributed file system are designed so that the framework handles node failures automatically. Hadoop is a sub-project of Apache Lucene; it has been demonstrated to run reliably and with good performance on clusters of 2000 nodes, and it continues to scale further.
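To make the Map/Reduce idea concrete, here is a minimal single-process sketch in Python. It is not Hadoop's API and it does not distribute anything: the names map_words, reduce_counts, run_task and map_reduce, the word-count job, and the three-attempt retry policy are all assumptions chosen for illustration. It only shows the shape of the paradigm: input split into independent fragments, each fragment processed by a task that can safely be re-executed after a failure, and the results grouped by key and combined.

```python
from collections import defaultdict


def map_words(fragment):
    """Map step: emit a (word, 1) pair for every word in one input fragment."""
    return [(word.lower(), 1) for word in fragment.split()]


def reduce_counts(word, counts):
    """Reduce step: combine every value emitted for a single key."""
    return sum(counts)


def run_task(task, *args, max_attempts=3):
    """Run one fragment of work, re-executing it if it fails.

    The retry limit is a hypothetical policy for this sketch, standing in
    for a framework that reschedules a failed task on another node.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task(*args)
        except RuntimeError:
            if attempt == max_attempts:
                raise


def map_reduce(fragments, mapper=map_words, reducer=reduce_counts):
    """Apply mapper to each fragment, group by key, then reduce each group."""
    grouped = defaultdict(list)
    # Map phase: every fragment is an independent task.
    for fragment in fragments:
        for key, value in run_task(mapper, fragment):
            grouped[key].append(value)
    # Reduce phase: every key is an independent task.
    return {key: run_task(reducer, key, grouped[key]) for key in sorted(grouped)}
```

Calling map_reduce(["the quick fox", "the lazy dog"]) counts "the" twice and every other word once. Because a task depends only on its own fragment, running it again after a failure yields the same result, which is the property that lets a framework handle node failures without involving the application.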

-- AndrewLynn - 02 Nov 2007
