The final results of the 2011 Student Cluster Competition are in the books and can be revealed. Well, they should have been revealed last night, but a series of meetings, bad hotel Wi-Fi, and perhaps alcohol played a role in the delay. In upcoming articles, you’ll see more videos and analysis of the results. But for now, I’ll sketch out the big picture.
In the closest imaginable finish, Team Taiwan from National Tsing Hua University took the overall student clustering crown. They did it on a hybrid Acer system consisting of 72 Xeon cores across six nodes and six NVIDIA Tesla C2070 GPU accelerators. On the memory side, they had 48 GB per node, a bit low compared to other teams.
Just barely behind Team Taiwan was China’s National University of Defense Technology (NUDT). The SC committee hasn’t (ever) released actual figures, but I was skulking around while they were being compiled and can say that the Taiwan-China scores were so close that literally hours of checking, re-checking, and then checking once again were required to make sure the results were correct.
Team China also ran a hybrid CPU-GPU system. But the Chinese system, while sporting the same six NVIDIA Tesla C2070 accelerators as Team Taiwan, had only 24 Xeon cores. China had more memory per node at 96 GB each, but less memory overall at 192 GB total vs. 288 GB for Taiwan.
What’s obvious is that both of these teams did a great job of adapting the scientific workloads for use with GPUs. While some of the apps, like PFA, were already GPU-enabled, POP and some of the others were much less so and required some real work to make them run efficiently.
Both teams also did a great job of sandbagging me – always saying that they weren’t sure whether their apps would take advantage of the GPUs, and that they weren’t expecting much.
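For readers wondering what that porting work actually involves, here's a miniature, purely illustrative CUDA sketch. To be clear, this isn't any team's actual code, and the kernel body is a hypothetical stand-in for real physics. But the shape of the job is the same: stage the data on the GPU, launch a kernel over the grid, and copy the results back.

// Illustrative sketch only: hypothetical names, not POP or any team's code.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// A stand-in for one hot loop of a scientific code: y = y + a*x.
__global__ void axpy(float *y, const float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] += a * x[i];
}

int main(void)
{
    const int n = 1 << 20;              // one million grid points
    const size_t bytes = n * sizeof(float);

    // Host arrays, as the original CPU-only code would hold them.
    float *hx = (float *)malloc(bytes);
    float *hy = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    // The porting work in miniature: allocate device memory,
    // stage the data, launch the kernel, copy the answer back.
    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    axpy<<<blocks, threads>>>(dy, dx, 3.0f, n);

    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f (expect 5.0)\n", hy[0]);

    cudaFree(dx); cudaFree(dy);
    free(hx); free(hy);
    return 0;
}

Now multiply that by tens of thousands of lines of legacy Fortran and MPI, plus the need to keep the CPU-GPU data traffic from eating the speedup alive, and you get a sense of what these students pulled off in their months of prep.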
Team Texas was a close third and had the highest score of any non-GPU competitor. Their mineral oil gambit seems to have paid off, giving them enough computing power (11 nodes, 132 Xeon cores, and 48 GB per node) to both fit within the 26-amp power limit and put up a good fight against the winners.
Team Boston, a new competitor, took fourth place with a massive hybrid system pairing 336 AMD cores with four NVIDIA Tesla M2090 GPUs. This is a very good finish for a new team and also the highest placement for an AMD Interlagos-based box in the competition.
Right behind Boston was Team Russia with yet another hybrid system, sporting 84 Xeon cores, a whopping twelve NVIDIA Tesla M2070 cards, and a Microsoft operating system. While they didn’t take the overall crown, Team Russia nailed a win on LINPACK with their 1.9 teraflop result.
Close on their heels was Team Colorado, running a traditional CPU-based system with 256 Xeon cores and more memory (128 GB per node) than any other team.
Purdue’s gambit of sandbagging the Monday HPCC/LINPACK portion of the competition in order to apply maximum hardware to the scientific applications didn’t pay off as hoped. We have a video posted that discusses the whats and whys behind Purdue’s strategy.
Rounding out the competition was Team Costa Rica, who completed every challenge but didn’t excel in enough of them to climb past the leaders. They were thrown a last-minute curve ball when they had to change hardware vendors on setup day. The HPC Advisory Council – in the person of Gilad Shainer – stepped in to make sure that Costa Rica had everything they needed to compete. And compete they did, making a great impression on judges and spectators alike with their attitude, technical prowess, and desire to succeed.
I’ll be writing a bit more analysis of the competition and what we’ve learned from it when I get back home – or maybe after the aspirin kicks in. But briefly, the huge impact of GPUs can’t be denied. Hybrids took the top two spots and four of the top five places.
The student teams were able to either find GPU-enabled codes for the scientific workloads or convert existing code to take advantage of them. That’s what made the difference this year. More later… meds now…