Google Sorts 1 Petabyte of Data in 6 Hours

Google (GOOG) says its engineers have sorted 1 petabyte of data in about 6 hours.

Rich Miller

November 24, 2008

Google has rewritten the record book and perhaps extended the benchmark for sorting massive volumes of data. The company said Friday that it had sorted 1 terabyte of data in just 68 seconds, eclipsing the previous mark of 209 seconds established in July by Yahoo. Google's run used 1,000 computers running MapReduce, while Yahoo's used a 910-node Hadoop cluster.

Then, just for giggles, they expanded the challenge: "Sometimes you need to sort more than a terabyte, so we were curious to find out what happens when you sort more and gave one petabyte (PB) a try," wrote Grzegorz Czajkowski of the Google Systems Infrastructure Team. "It took six hours and two minutes to sort 1PB (10 trillion 100-byte records) on 4,000 computers. We're not aware of any other sorting experiment at this scale and are obviously very excited to be able to process so much data so quickly."
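For a rough sense of scale, the reported figures can be turned into throughput numbers. The short Python sketch below is illustrative arithmetic only, not anything published by Google or Yahoo. It assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes), which matches the "10 trillion 100-byte records" description of the petabyte run. The `throughput` helper and the `runs` table are hypothetical names introduced here. The per-machine value is a simple average, and the clusters almost certainly differed in hardware, so the numbers are not a like-for-like comparison.

```python
# Back-of-the-envelope throughput from the figures quoted in the article.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 PB = 10**15 bytes).
TB = 10**12
PB = 10**15

# Sanity check on the quoted record count: 10 trillion records of 100 bytes each.
assert 10 * 10**12 * 100 == PB


def throughput(total_bytes, seconds, machines):
    """Return (aggregate bytes/s, average bytes/s per machine)."""
    aggregate = total_bytes / seconds
    return aggregate, aggregate / machines


# (bytes sorted, elapsed seconds, number of machines) as reported.
runs = {
    "Google, 1 TB": (TB, 68, 1000),
    "Yahoo, 1 TB": (TB, 209, 910),
    "Google, 1 PB": (PB, 6 * 3600 + 2 * 60, 4000),  # six hours and two minutes
}

for name, (size, secs, nodes) in runs.items():
    agg, per_machine = throughput(size, secs, nodes)
    print(f"{name}: {agg / 1e9:.1f} GB/s aggregate, "
          f"{per_machine / 1e6:.1f} MB/s per machine")
```

Under these assumptions, the petabyte run moved data at a higher aggregate rate than the terabyte run but at a somewhat lower average rate per machine.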

Read more on the Official Google Blog.
