Microsoft Research team shatters data sorting record, wrenches trophy from Yahoo

Bruise-inducing high-fives, anyone? They're handing them out in Redmond, according to one mildly injured researcher, after breaking a data sorting record Yahoo set in 2009. The ruckus surrounds a benchmark called MinuteSort, which measures how much data can be sorted in 60 seconds. Microsoft's Distributed Systems group used a new file system architecture, dubbed Flat Datacenter Storage, over a full bisection bandwidth network to burn through the competition.

Not only did the nine-person crew best the old record by nearly a factor of three, it also gave itself a handicap -- sorting 1,401 GB of data at 2 GB/s over a remote file system, forcing the system to crunch data at a slower speed than the technique is capable of. It's not all about bragging rights, however: Bing has its eye on the newfangled file system in hopes of boosting its RPM. Microsoft suspects the tech could also pick up the pace of machine learning and churn through large data sets in a jiffy. You can catch Microsoft Research's detailed explanation in all its glory at the source.

Update: Commenter Mark Streich points out that while 2 GB/s may sound fast, it's certainly not speedy enough to sort 1,401 gigabytes in a single minute. To achieve that performance, simultaneous input and output speeds could hit 2 GB/s on each computer used.
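
For the curious, the commenter's point is easy to check with back-of-the-envelope arithmetic. The sketch below (Python, our own illustration rather than anything from Microsoft) works out the aggregate throughput needed to move 1,401 GB in 60 seconds, and how many machines that would take if each one sustained 2 GB/s. The function names are hypothetical, and the machine count is only a rough lower bound under that one assumption, not the size of the cluster Microsoft actually used.

```python
import math

# Figures quoted in the article.
DATA_GB = 1401          # data sorted, in gigabytes
WINDOW_S = 60           # MinuteSort time limit, in seconds
PER_MACHINE_GBPS = 2.0  # per-computer rate mentioned in the update

def required_aggregate_gbps(data_gb, window_s):
    """Average rate needed to move data_gb once within window_s seconds."""
    return data_gb / window_s

def min_machines(aggregate_gbps, per_machine_gbps):
    """Smallest whole number of machines whose combined rate covers the total.

    Assumption (ours): every machine sustains per_machine_gbps for the
    whole run and the data only has to pass through once.
    """
    return math.ceil(aggregate_gbps / per_machine_gbps)

aggregate = required_aggregate_gbps(DATA_GB, WINDOW_S)
machines = min_machines(aggregate, PER_MACHINE_GBPS)

print(f"Aggregate rate needed: {aggregate:.2f} GB/s")
print(f"Machines needed at {PER_MACHINE_GBPS:.0f} GB/s each: at least {machines}")
```

In other words, a single 2 GB/s pipe falls short by more than a factor of ten, which is why the figure only makes sense as a per-computer number.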