Barnburner. Slugfest. Photo finish. Just a few cliches to let you know what’s coming up as we break down the finish of the 2022 Winter Classic Invitational Student Cluster Competition.
Twelve teams from Historically Black Colleges and Universities and Hispanic Serving Institutions battled for the Winter Classic championship from early February until the final judging interviews on April 14th.
The unique structure of the virtual competition had all of the student teams hosted, challenge by challenge, by one of four mentor organizations, each of which allocated hardware and taught the students about the benchmark or application they’d be optimizing. It was a level playing field, with every team running on the same hardware configuration and receiving the same coaching from the mentors.
Here’s how the mentor organizations and competition challenges lined up:
- HPE mentoring LINPACK/HPCG
- Surprise HPC Pop Quiz given by competition organizers Intersect360 Research
- NASA Ames mentoring a subset of the NAS Parallel benchmarks
- Oak Ridge National Laboratory mentoring a machine learning challenge
- AWS mentoring the OpenFOAM Challenge
- BioTeam served as our emergency HPC SWAT team, jumping in when needed to provide much-needed help and guidance
- The Judging Interview, conducted by the competition organizers and a panel of experts from the mentor organizations
Each of the challenges was worth a maximum of 100 points, with scores normalized to the top team’s result. In all, student teams could earn a maximum of 700 points for the competition.
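The exact arithmetic isn’t spelled out above, but “normalized to the top team’s result” boils down to something like the sketch below: the best result in the field earns the full 100 points and every other team gets a proportional share. The function name and numbers are purely illustrative.

```python
def normalized_points(result, best_result, higher_is_better=True, max_points=100.0):
    """Illustrative guess at per-challenge scoring: the best result in the
    field earns max_points and every other team a proportional share."""
    ratio = result / best_result if higher_is_better else best_result / result
    return max_points * ratio

# Made-up LINPACK-style example (higher is better): a result 1.5% off the
# winning number banks roughly 98.5 of the 100 available points.
print(round(normalized_points(9.85, 10.0), 2))  # -> 98.5
```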
The Battle
The competition opened with LINPACK, always a crowd favorite. The scores were on the high side, with seven teams huddled at 88% or better. The Texas Tech Red Raiders pulled down the top score, with Tennessee State only 1.5% behind and UTEP in third, just 1.63% off the winning mark.
Things got a lot more interesting with HPCG, however. Team Tennessee State shocked the field by reducing the HPCG problem size to the point where it fit entirely into system cache, notching a score 31% higher than second-place Florida A&M (FAMU).
Their novel approach sparked several conversations among the competition officials. Was reducing the problem size ‘fair game’ or not? There wasn’t a rule against it, the team used the required four nodes, and they ran the problem for at least 30 minutes, so it was a kosher run and the score went into the books.
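For readers who haven’t run it, HPCG takes its problem size from a small hpcg.dat input file: the third line sets the local grid dimensions for each process (they have to stay multiples of 8), and the fourth line sets the minimum run time in seconds. A cache-resident run in the spirit of Tennessee State’s would use something like the file below; the dimensions are purely illustrative, not the team’s actual settings, while the 1800 seconds matches the competition’s 30-minute minimum.

```
HPCG benchmark input file
Winter Classic cache-resident run (illustrative sizes, not the team's actual settings)
16 16 16
1800
```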
The HPC Pop Quiz was added to the competition due to scheduling conflicts, and also just to mess with the students. Why give them an additional week off if we don’t have to, right? The test featured 20 questions about HPC history, technology, and current events. We gave them study materials and, to keep them from simply searching up the answers, we timed the test, a fiendish twist. Each team member took the quiz individually, and the longer they took on the test, the lower their score.
UTEP came out on top in this challenge, adding 100 points to their total and moving from sixth place to a solid fourth. The TTU Red Raiders finished a very close second, grabbing 98.78 points, with Tennessee State barely off the pace at 96.34. End result: Tennessee State holds onto first place, with the Red Raiders in second and FAMU a very close third.
Next up was NASA with a subset of their popular NAS Parallel Benchmarks. Fayetteville State jumped out to an early lead with the top score on BT-MZ, but FAMU responded by pounding the field with their winning result on LU-MZ. However, Tennessee State made the SP-MZ benchmark their own and, by dint of their high finishes on the other NAS Parallel runs, won the overall NASA module by 1.82 points over FAMU and 4.5 points over the other team from Texas Tech, the Matadors. End result: Tennessee State keeps the lead, FAMU moves into second, and the Red Raiders fall to third.
Oak Ridge National Lab threw the teams a curve ball with their machine learning exercise. Students ran on ORNL’s Ascent cluster, which has the same architecture as Summit, albeit a good bit smaller. This challenge was less about optimizing code and more about running ML routines and answering a series of tough questions.
Nearly all the teams were complete ML newbies, but they caught on quickly as the week progressed. Four teams took home the full 100 points on the ORNL module: UTEP, Tennessee State, Prairie View A&M, and the TTU Red Raiders. The Matadors, also from TTU, were right on their heels, adding 95 points to their total.
At the end of the ORNL module, Tennessee State holds onto a 50+ point lead, the Red Raiders slide into second, UTEP is in third, and FAMU is holding fourth. Only 35 points separate second place and sixth place – things are very tight.
The AWS OpenFOAM Challenge changes everything
It was a deceptively simple proposition: run an OpenFOAM model of a motorcycle and optimize your code while keeping the end result, the drag coefficient, within a narrow range. The students had to run their code on both an AMD-based and an Intel-based cluster, and run it 500 times to verify their results.
Like all of the mentors, AWS presented a mound of material on how to use the clusters and how to run OpenFOAM, and provided live support throughout the week. Nearly all of the teams were up and running early and generating solid results.
But one team was generating better than ‘solid’ results. They knocked OpenFOAM out of the park and beat it like a rental car. Team Prairie View from Texas wasn’t just looking at compiler flags and MPI options. They went at the problem from another angle, looking to see how far they could coarsen the mesh while still landing the drag coefficient in the required range. This was the key to the kingdom, OpenFOAM-wise.
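To make the idea concrete: in a typical OpenFOAM external-flow case like the motorcycle model, the mesh starts from a coarse background grid defined in blockMeshDict, which snappyHexMesh then refines around the bike, so the block divisions and refinement levels in those dictionaries are the obvious knobs to turn. The excerpt below is only an illustrative sketch of where such a change lives; the numbers are invented and are not Prairie View’s actual settings.

```
// system/blockMeshDict (excerpt): background mesh for the external-flow case.
// Fewer divisions here, plus lower surface refinement levels in
// system/snappyHexMeshDict, mean far fewer cells and much shorter solve times,
// provided the computed drag coefficient stays inside the allowed band.
blocks
(
    hex (0 1 2 3 4 5 6 7) (10 4 4) simpleGrading (1 1 1)  // coarsened from a finer split such as (20 8 8)
);
```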
With this optimization, Prairie View turned in run times of 83 seconds on the AMD cluster and 64 seconds on the Intel cluster. The second-best result, from Tennessee State, was 563 seconds on AMD and 775 seconds on Intel. With scores normalized to the top result, Prairie View gained the full 100 points while Tennessee State added a paltry 11.52 to their total. The other teams took home even smaller hauls, ranging from three to eight points and change.
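That number tracks with the normalization sketched earlier, assuming time-based scores are computed as the best time over each team’s time on each cluster and then averaged. The per-cluster averaging is an educated guess, but it lands almost exactly on the published score:

```python
# Prairie View's times divided by Tennessee State's, per cluster, then averaged.
amd_ratio   = 83 / 563    # AMD cluster
intel_ratio = 64 / 775    # Intel cluster
print(round(100 * (amd_ratio + intel_ratio) / 2, 2))  # -> 11.5, vs. the 11.52 reported
```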
This scoring disparity had a shocking impact on the leaderboard. Prairie View vaulted from fifth place to second, only 2.21 points behind front-runner Tennessee State. The third- through sixth-place teams probably wouldn’t be able to catch either Prairie View or Tennessee State, but they had their own battles to contend with: the margin between third and sixth was a tight 36 points out of the 600 points possible to date.
There’s only one event left: the judging interviews, worth a maximum of 100 points. These interviews will decide it all. We couldn’t have written a better script for the final week of the competition.
Interviews were conducted by the competition organizers joined by a panel of HPC experts from the mentor organizations. During their interview, student teams gave presentations discussing their trials, tribulations, and the results of their HPC benchmarking and optimization efforts.
The interviews were highly competitive, and all of the teams interviewed well. But with everything on the line, Team Prairie View edged out Team Tennessee State in the interview and won the Overall Championship by a narrow 30 points out of 700 possible. Wow.
Tennessee State finished second with the Texas Tech Red Raiders taking home third place. The rest of the field remained very close with UTEP holding fourth place, the Texas Tech Matadors grabbing fifth, and Florida A&M finishing sixth.
We closed out the 2022 competition last Friday with an online Gala Awards Ceremony. It featured an opening welcome from recent Turing Award winner Jack Dongarra and award announcements from representatives of HPE, NASA, Oak Ridge, AWS, and Oracle Cloud Infrastructure.
In addition to announcing the application results and final standings, we also gave out $30,000 in Brueckner Awards to competitors, which felt great.
You can see the Gala Awards Ceremony, in all its glory, in the video below: