Big data means big issues for exascale visualization
Visualization of a Rayleigh-Taylor instability.
Exascale computing will pose even greater challenges than that, even though operators would seem to be in a better-than-ever driver’s seat. “One bottleneck will have to do with data movement between (processor) nodes and even from primary memory to cache to registers. And it will be a real problem if we have to store it or read it back. All that data movement will take too much time. Secondly, it will take power – the central issue at the exascale.”
Childs cites Argonne National Laboratory’s Mira, an IBM Blue Gene/Q system that uses about 4 megawatts of electricity to calculate at a little more than 10 petaflops. That’s much more efficient than many earlier computers but still energy-hungry when scaled up to exascale. Most experts believe 20 megawatts is the limit for computer power consumption, Childs says. So an exascale version of Mira would be 100 times faster but could draw only five times as much power.
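The arithmetic behind that comparison can be sketched in a few lines of Python. This is a back-of-the-envelope illustration using only the rounded figures quoted above (about 10 petaflops at about 4 megawatts, and a 20-megawatt cap); the function and variable names are invented for the example, and real machines will not match these numbers exactly.

```python
# Rounded figures quoted in the article; treat them as approximations.
MIRA_PFLOPS = 10.0           # Mira's speed, in petaflops
MIRA_MEGAWATTS = 4.0         # Mira's power draw, in megawatts
EXASCALE_PFLOPS = 1000.0     # 1 exaflop = 1,000 petaflops
POWER_CAP_MEGAWATTS = 20.0   # power limit cited by Childs


def gflops_per_watt(pflops, megawatts):
    """Energy efficiency in gigaflops per watt.

    1 petaflop = 1e6 gigaflops and 1 megawatt = 1e6 watts,
    so the two conversion factors cancel.
    """
    return (pflops * 1e6) / (megawatts * 1e6)


speedup = EXASCALE_PFLOPS / MIRA_PFLOPS              # how much faster
power_ratio = POWER_CAP_MEGAWATTS / MIRA_MEGAWATTS   # how much more power

mira_eff = gflops_per_watt(MIRA_PFLOPS, MIRA_MEGAWATTS)
exa_eff = gflops_per_watt(EXASCALE_PFLOPS, POWER_CAP_MEGAWATTS)
efficiency_gain = exa_eff / mira_eff                 # required improvement

print(f"Speedup needed:        {speedup:.0f}x")
print(f"Power budget increase: {power_ratio:.0f}x")
print(f"Mira efficiency:       {mira_eff:.1f} gigaflops per watt")
print(f"Exascale efficiency:   {exa_eff:.1f} gigaflops per watt")
print(f"Efficiency gain:       {efficiency_gain:.0f}x")
```

Under these rounded inputs, a hundredfold speedup on a fivefold power budget works out to a twentyfold gain in flops per watt, which is why every stage of the workflow, visualization included, comes under pressure to save energy.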
“In every part of the simulation process, people will be asking ‘How can we do this in a power-efficient way?’ And visualization and analysis is no exception. We also have to rethink how we will do things. For us, it really comes down to minimizing data movement, because that costs power.”
Breaking with tradition
In visualization, that will mean abandoning the traditional, time-consuming practice of regularly storing simulation snapshots – at full resolution – on disk, then later reading them back on separate resources to apply visualization algorithms and create images that help interpret data.
Though he’s only starting his five-year research program, Childs expects his “Data Exploration at the Exascale” Early Career project will focus on creating techniques that avoid regularly saving the full simulation for visualization and analysis. Running at exascale will make his task more complicated: Like the simulation itself, data processing will have to be executed in a billion-way concurrent environment while minimizing information flow to trim power costs.
To do that, visualization and analysis researchers have seized on in situ processing: running visualization and analysis routines concurrently with the simulation and operating directly on the data. In situ visualization eliminates the need to record simulation data but works best when application scientists know at the start what they want to study. Without that foreknowledge, their only choices are to pause the simulation and explore the data or accept that the data will be lost forever.
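A toy Python sketch can make the in situ idea concrete. It is not Childs' method or any real simulation code: the `step` function is a hypothetical stand-in for a solver, and the histogram is just one example of an analysis chosen in advance. The point is only that the analysis runs inside the simulation loop on data already in memory, and that only the small result is kept.

```python
import random


def step(field, rng):
    """Hypothetical stand-in for one simulation time step: jitter each cell."""
    return [v + rng.uniform(-0.1, 0.1) for v in field]


def histogram(field, lo, hi, bins):
    """Analysis routine: reduce the full field to a handful of bin counts."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in field:
        # Clamp out-of-range values into the first or last bin.
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1
    return counts


def run_in_situ(n_cells=10_000, n_steps=50, every=10, seed=0):
    """Run the toy simulation, analyzing in place every `every` steps."""
    rng = random.Random(seed)
    field = [rng.random() for _ in range(n_cells)]
    products = []
    for t in range(1, n_steps + 1):
        field = step(field, rng)
        if t % every == 0:
            # Analyze the live data; the full field is never written out.
            products.append((t, histogram(field, -1.0, 2.0, 8)))
    return products


products = run_in_situ()
values_kept = sum(len(h) for _, h in products)
values_if_snapshots = 10_000 * len(products)
print(f"Analysis steps:                   {len(products)}")
print(f"Values kept in situ:              {values_kept}")
print(f"Values full snapshots would store: {values_if_snapshots}")
```

The sketch also shows the trade-off described above: the choice of a histogram, its range and its bin count are all fixed before the run starts, and once a time step has passed, the raw field that produced each histogram is gone.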