An Amazingly Critical Software Program for Super-Sized Datasets

How much would you pay for this software program:

  • It determines what 300 million Yahoo users see each month, and helps Yahoo customize its home page content.
  • Facebook uses it to manage 40 billion stored photographs.
  • Microsoft changed its internal policies so that its team could develop on this software.

The software is Hadoop, named after a stuffed toy elephant. Hadoop was developed by a consultant, Doug Cutting, based on papers that Google published about its extremely valuable MapReduce technology. According to Google, MapReduce distributes searches and other information-intensive processing across batteries of commodity computers, and is intended to let even inexperienced programmers easily use a large distributed system: terabytes of data on thousands of machines.

The price is zero — it is open source under the relatively lenient Apache License.
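
For readers curious what the MapReduce model looks like in practice, here is a minimal single-machine sketch in Python of the classic word-count example. It is not Hadoop's API and not Google's implementation. The function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions, and a real framework would run the map and reduce steps in parallel across many machines rather than in one loop.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the two steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence (hypothetical driver function)."""
    pairs = []
    for document in documents:
        # In a distributed system each document could be mapped on a different machine.
        pairs.extend(map_phase(document))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the elephant", "The toy elephant"]))
```

The point of the model is that the programmer writes only the small map and reduce functions; splitting the data, moving it between machines, and recovering from failures are left to the framework.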

[Source: “Hadoop, a Free Software Program, Finds Uses Beyond Search,” by Ashlee Vance, The New York Times, March 17, 2009. The name “Hadoop” and depictions of the Hadoop logo and mascot are reserved for use by The Apache Software Foundation.]

