An Overview of Apache’s Hadoop

Posted by linuxaria on Jun 14, 2012 6:43 AM EDT
Linux-news.org; By Chandra Heitzman

Developed by the Apache Software Foundation, Hadoop is an open-source, Java-based platform for processing massive amounts of data in a distributed computing environment. Hadoop’s key innovations lie in its ability to store and access massive data sets across thousands of computers and to present that data coherently. Although data warehouses can store data on a similar scale, they are costly and do not allow for effective exploration of huge amounts of discordant data. Hadoop addresses this limitation by taking a data query and distributing it over multiple computer clusters. By spreading the workload across thousands of loosely networked computers (nodes), Hadoop can examine and present petabytes of heterogeneous data in a meaningful format. Even so, the software is fully scalable and can run on a single server or a small network.
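The distribute-then-aggregate pattern described above is Hadoop's MapReduce model. The following is a minimal sketch in plain Python, not actual Hadoop API code: it simulates the map, shuffle, and reduce phases on a hypothetical word-count query, with each input chunk standing in for a data block held by a different node.

```python
from collections import defaultdict

def map_phase(chunk):
    # Each "node" emits (word, 1) pairs for its own slice of the data.
    return [(word, 1) for word in chunk.split()]

def shuffle(mapped):
    # Group intermediate pairs by key, as the framework does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Each reducer aggregates the values for the keys assigned to it.
    return {key: sum(values) for key, values in groups.items()}

# Four chunks standing in for data blocks spread across four nodes.
chunks = ["big data", "big clusters", "data data", "clusters"]
mapped = [map_phase(c) for c in chunks]
result = reduce_phase(shuffle(mapped))
print(result)  # → {'big': 2, 'data': 3, 'clusters': 2}
```

In real Hadoop the map and reduce functions run in parallel on the nodes that already hold the data blocks, so the query moves to the data rather than the data to the query.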


