The development of a new clean energy source has the potential to revolutionize our world. The Swiss Plasma Center at EPFL (École polytechnique fédérale de Lausanne) is trying to do just that: using intense magnetic fields to confine hydrogen at temperatures up to 100 million degrees, scientists aim to create the conditions for fusion reactions to occur, as they do in the stars, thus releasing a huge amount of clean energy and solving the world’s energy problems in the process.
As part of the EUROfusion program, the Swiss Plasma Center is involved in the development of ITER, the world’s largest scientific experiment under construction, which aims to prove the feasibility of large-scale fusion reactions and pave the way for DEMO, the demonstration fusion reactor. If it succeeds, fusion could solve the world’s energy problems without generating any greenhouse gases or long-lived radioactive waste. Physical simulations of these experiments are an essential part of this process.
My job as director of operations for Scientific IT and Applications Support at EPFL is to provide high-performance computing (HPC) resources to scientific projects like this one. Paolo Ricci, a professor at the Swiss Plasma Center, explains that “the field of fusion power entails not just building massive experiments such as ITER that are at the forefront of technology, but also performing cutting-edge theoretical research to better understand, interpret and predict physical phenomena. These predictions are based on large-scale simulations that require the world’s most powerful computers. Researchers need operational support to perform such calculations.” Starting on July 1, EPFL will host a EUROfusion Advanced Computing Hub, which will support the development of fusion simulation software in Europe, and I will direct its operations.
To run these massive simulations, Professor Ricci and his group developed software called GBS. The goal of GBS simulations is to describe and understand the physics of the narrow layer, just a few centimeters thick, that separates the 100-million-degree plasma core from the machine walls, which must be kept at a much lower temperature of just a few hundred degrees. This temperature gradient, probably the strongest in the universe, is dominated by extremely complex nonlinear, multiscale, and multiphysics phenomena. An accurate description of this region is crucial to understanding the performance of tokamaks and is thus required for the optimal operation of ITER.
Deploying large-scale energy simulations on Google Cloud
Accurately simulating medium to large tokamaks, the devices where fusion reactions occur, is computationally very demanding and requires a Tier-0 (currently, petaflops-capable) supercomputer. However, Tier-0 supercomputers are scarce, and access to them is limited. It is therefore crucial to understand how simulation codes like GBS perform on Google Cloud, so that the broader scientific community can gain access to the technology.
Using Google Cloud’s HPC VM images, we are able to deploy a fully localized compute cluster using Terraform recipes from the slurm-gcp project maintained by SchedMD. Users access the cluster’s front end with their EPFL LDAP account, and we use Spack, a widely used package manager for supercomputers, to install the same computing environment as the one we provide on-premises. Overall, we can now deploy, in less than 15 minutes, a flexible and powerful HPC infrastructure that is virtually identical to the one we maintain at EPFL, and dynamically offload on-prem workloads to it in times of high demand.
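To make the workflow above a little more concrete, here is a minimal sketch of how a batch job for such a Slurm cluster could be generated. It is an illustration under stated assumptions, not the scripts we actually run: the Spack spec `gbs`, the command `gbs input.in`, the job name, and the node and task counts are all hypothetical placeholders, and the script assumes Spack's shell integration is already set up on the compute nodes.

```python
def render_job_script(nodes: int, tasks_per_node: int, spack_spec: str,
                      command: str, time_limit: str = "01:00:00") -> str:
    """Render a Slurm batch script as a string.

    Only standard sbatch directives are used; every value passed in
    is chosen by the caller and is illustrative here.
    """
    if nodes <= 0 or tasks_per_node <= 0:
        raise ValueError("nodes and tasks_per_node must be positive")
    lines = [
        "#!/bin/bash",
        "#SBATCH --job-name=gbs-sketch",  # hypothetical job name
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={time_limit}",
        "",
        "# Assumption: Spack's shell integration is available on the nodes.",
        f"spack load {spack_spec}",
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values: package name, input file, and sizes are placeholders.
print(render_job_script(nodes=32, tasks_per_node=30,
                        spack_spec="gbs", command="gbs input.in"))
```

Because the same Spack environment is installed on-premises and in the cloud, a script of this shape could in principle be submitted unchanged to either cluster, which is what makes offloading practical.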
We tested the performance of GBS with two tokamaks, TCV and JT-60SA, using Google Cloud’s HPC VM images and observed excellent scaling, even with the very demanding large-size tokamak. In terms of “time to solution,” we compared one iteration of the solver running on a Tier-0 supercomputer with the same iteration running on Google Cloud. Using the Google Cloud HPC VM images, we achieved comparable results up to 150 nodes, which is very impressive considering the added flexibility Google Cloud offers.
Using the geometry of TCV (Tokamak à Configuration Variable), our results show excellent scalability: we obtained a 33x speedup for the TCV tokamak simulation, with near-perfect scaling up to 32 nodes.
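For readers who want to run the same kind of scaling analysis on their own measurements, the sketch below computes speedup and parallel efficiency from wall-clock times per solver iteration. It is a minimal example, not our benchmarking code: the timings are made-up placeholders rather than GBS results, and because this article does not state the baseline behind the 33x figure, treating the smallest run as the reference is an assumption.

```python
from typing import Dict, List, Tuple


def scaling_table(timings: Dict[int, float]) -> List[Tuple[int, float, float]]:
    """Return (nodes, speedup, efficiency) rows for a strong-scaling test.

    The run with the fewest nodes is used as the reference. Efficiency
    is the measured speedup divided by the ideal (linear) speedup.
    """
    if not timings:
        raise ValueError("timings must not be empty")
    base_nodes = min(timings)
    base_time = timings[base_nodes]
    rows = []
    for nodes in sorted(timings):
        speedup = base_time / timings[nodes]
        ideal = nodes / base_nodes
        rows.append((nodes, speedup, speedup / ideal))
    return rows


# Hypothetical seconds per solver iteration, NOT measured GBS data.
example_timings = {1: 64.0, 2: 32.5, 4: 16.4, 8: 8.3, 16: 4.2, 32: 2.1}

for nodes, speedup, efficiency in scaling_table(example_timings):
    print(f"{nodes:>3} nodes  speedup {speedup:6.2f}  efficiency {efficiency:5.2f}")
```

An efficiency close to 1.0 corresponds to near-perfect scaling. Values slightly above 1.0 can occur in practice, for example when the per-node working set starts to fit in cache.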
To test the performance of the HPC VM images further, we also performed the same turbulence simulation using a configuration based on JT-60SA, a large-scale advanced tokamak that will operate in Japan with a geometry similar to ITER. Because of its size, simulations of this tokamak become very demanding, at around one billion unknowns, but we still obtained very good results up to 150 nodes.
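To give a feel for what one billion unknowns spread over up to 150 nodes means, the back-of-the-envelope sketch below estimates the share of the problem each node holds. It rests on assumptions not taken from GBS itself: values stored in double precision (8 bytes), a perfectly even partition across nodes, and a hypothetical `copies` parameter standing in for however many full-size arrays a solver keeps in memory.

```python
from typing import Tuple


def per_node_footprint(unknowns: int, nodes: int,
                       bytes_per_value: int = 8,
                       copies: int = 1) -> Tuple[float, float]:
    """Return (unknowns per node, bytes per node) for an even partition.

    bytes_per_value=8 assumes double precision; copies is a hypothetical
    knob for the number of full-size arrays held in memory.
    """
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    unknowns_per_node = unknowns / nodes
    bytes_per_node = unknowns_per_node * bytes_per_value * copies
    return unknowns_per_node, bytes_per_node


TOTAL_UNKNOWNS = 1_000_000_000  # "around one billion unknowns"

for nodes in (1, 32, 150):
    per_node, size = per_node_footprint(TOTAL_UNKNOWNS, nodes)
    print(f"{nodes:>4} nodes: {per_node / 1e6:8.1f} M unknowns, "
          f"{size / 1e9:6.3f} GB per array copy")
```

Under these assumptions, each of 150 nodes holds roughly 6.7 million unknowns, or about 53 MB per array copy. As the per-node share shrinks, communication between nodes tends to account for a larger fraction of each iteration, which is why good results at this node count depend on the network as well as on the processors.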
Solving the world’s energy problems is a complex challenge, and to meet it, our work must be scalable, adaptable, and able to take advantage of the most advanced computing technologies. Google Cloud provides the performance and flexibility needed to complement the powerful Tier-0 supercomputers we use today.
You can learn more about HPC on Google Cloud here.
By: Dr. Gilles Fourestey (Director of Operations, EPFL) and Dr. Grazia Frontoso (Customer Engineering Manager, Google Cloud)
Source: Google Cloud Blog