Exascale Science
September 2014

Cool to the core

The next Department of Energy supercomputer: a high-speed, energy-sipping bridge to exascale speed.

The brains behind the Cori supercomputer: the Knights Landing chip, with up to 72 processor cores. (Image courtesy of Intel newsroom.)

There are many superlatives Sudip Dosanjh could use to talk about the National Energy Research Scientific Computing (NERSC) Center’s next-generation supercomputer, scheduled for delivery in 2016. Called NERSC-8 – and nicknamed Cori – it will be the world’s first to use Intel’s new Knights Landing chip, the company’s most powerful to date, each delivering more than three teraflops (trillion calculations per second) of performance. Overall, Cori will pack a 10-fold increase in application performance over Hopper (NERSC-6).

But it’s not primarily Cori’s anticipated stellar performance that makes NERSC Director Dosanjh see the supercomputer as a great leap toward exascale computing. It’s that Cori will do it with less energy.

“Our science users are telling us they need to get to the level of hundreds of petaflops (quadrillion calculations per second) within this decade, and the only way we can get to that level of computing is to make this transition to energy-efficient architectures,” says Dosanjh, who profiled Cori at the recent SciDAC-3 Principal Investigators meeting.

“With Cori we will begin transitioning the broad range of (DOE) Office of Science codes that run at NERSC to energy-efficient many-core computer architectures.”

No supercomputer’s design has been more driven by energy efficiency than Cori’s.

At the heart of its green design is the Knights Landing chip, a supercomputing game-changer that provides more performance while using significantly less energy per operation.

Although the proof will be in the actual operation, it’s estimated that Knights Landing will deliver between 14 and 16 gigaflops per watt. The world’s most efficient supercomputers currently achieve about 4 gigaflops per watt, the Green500 list reports.
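
To put those efficiency figures in perspective, the short Python sketch below converts gigaflops per watt into the power needed to sustain a given computing rate. It is a back-of-the-envelope illustration only: the function name is hypothetical, the 30-petaflop target is borrowed from Cori's projected peak, and the chip-level estimate is applied as if it covered a whole system, which ignores memory, interconnect and cooling.

```python
# Back-of-the-envelope power estimate from an efficiency figure.
# Efficiency values come from the article; applying a chip-level figure
# to a whole system is a simplifying assumption made for illustration.

def power_megawatts(petaflops: float, gigaflops_per_watt: float) -> float:
    """Power in megawatts needed to sustain `petaflops` at the given efficiency."""
    gigaflops = petaflops * 1e6              # 1 petaflop = 1e6 gigaflops
    watts = gigaflops / gigaflops_per_watt
    return watts / 1e6                       # watts -> megawatts


if __name__ == "__main__":
    cases = [
        ("about 4 gigaflops per watt (most efficient systems today)", 4.0),
        ("14 gigaflops per watt (Knights Landing, low estimate)", 14.0),
        ("16 gigaflops per watt (Knights Landing, high estimate)", 16.0),
    ]
    for label, efficiency in cases:
        megawatts = power_megawatts(30.0, efficiency)
        print(f"{label}: {megawatts:.2f} MW for 30 petaflops")
```

Under these assumptions, the same 30 petaflops that would draw 7.5 megawatts at 4 gigaflops per watt would need roughly 2 megawatts at the estimated Knights Landing efficiency.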

A focus on more cores, rather than higher clock speed, is central to this approach.

Historically, HPC architectures tried to increase code performance by boosting the processor’s clock speed – the number of clock ticks, and thus calculation steps, a processor completes each second.

“For the foreseeable future clock speeds aren’t going to increase because of energy constraints,” Dosanjh says.

Each clock tick also uses energy and releases heat. The cost to power and cool a machine using traditional processors can run to the tens of millions of dollars annually. That leads to a blistering energy barrier for supercomputers beyond petascale speed.

Instead of focusing on clock speed, Cori’s Knights Landing chips will boost performance primarily by using simpler, stripped-down cores that consume less energy and produce less heat.

“With simpler cores you can put more of them on the same chip,” Dosanjh says. There are more than 60 cores on each of Cori’s Knights Landing chips, compared with the 12 cores on Hopper’s AMD Magny-Cours chips. “So you get a lot more computing using just a little more power.”

Cori also will achieve more with less because the Knights Landing chip is self-hosted – that is, the operating system runs directly on the chip.

Many of today’s fastest supercomputers boost performance with accelerators or co-processors such as GPUs (graphics processing units). But this comes at a big cost: more complex programming models and performance challenges because data shuffles along relatively slow connections between co-processors and processors.

“Having a chip that’s both energy efficient and self-hosted – that’s a big leap forward in terms of the programmability of the system,” Dosanjh says.

Cori’s approximately 9,300 Knights Landing chips, in an architecture designed by supercomputer company Cray, will provide NERSC’s more than 5,000 users approximately 30 petaflops of peak performance for applications ranging from materials simulations to Earth climate models to galaxy evolution calculations.
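
The peak figure follows from the per-chip number quoted earlier. As a rough check, and assuming exactly three teraflops per chip (the article says "more than three," so this is a lower bound), the arithmetic can be sketched as follows. The function and constant names are hypothetical.

```python
# Rough check of the system peak from the per-chip figure.
CHIPS = 9_300             # approximate chip count from the article
TERAFLOPS_PER_CHIP = 3.0  # "more than three teraflops", so a lower bound


def system_peak_petaflops(chips: int, teraflops_per_chip: float) -> float:
    """Aggregate peak in petaflops (1 petaflop = 1,000 teraflops)."""
    return chips * teraflops_per_chip / 1_000


if __name__ == "__main__":
    peak = system_peak_petaflops(CHIPS, TERAFLOPS_PER_CHIP)
    print(f"{peak:.1f} petaflops")
```

The result, just under 28 petaflops, is consistent with a peak of approximately 30 petaflops once each chip exceeds three teraflops.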

Cori also will represent a major step forward in improved performance in data-intensive computing, a science mission as critical as simulation at NERSC, Dosanjh says.

Each month the center is deluged with several petabytes of data generated at light sources, particle accelerators, observatories and genomics centers.

“The performance of many applications that run at NERSC is limited by memory bandwidth or (by) our ability to move data,” Dosanjh says – not by the ability to perform floating-point operations.

The Knights Landing chip is a technological pole vault over this memory wall – the inability to efficiently get data to processors, leaving them idling.

Cori’s chips will come supported by two additional types of memory. They will be the world’s first to use a new three-dimensional on-package memory, projected to be five times faster than current memory. They also will have flash memory to accelerate data input and output.
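
One common way to picture the memory wall is the roofline model, a standard simplification that the article itself does not use. It says a code's attainable speed is the lower of the chip's arithmetic peak and the memory bandwidth multiplied by how many operations the code performs per byte moved. The sketch below applies it with the article's three-teraflop chip and fivefold memory speedup. The baseline bandwidth of 100 gigabytes per second is a hypothetical round number, not a published specification.

```python
# Roofline-style sketch of the memory wall.
# Peak and the 5x factor come from the article; the baseline bandwidth
# is a hypothetical value chosen only to make the arithmetic concrete.

def attainable_gflops(peak_gflops: float, bandwidth_gb_s: float,
                      flops_per_byte: float) -> float:
    """Roofline bound: compute peak or memory traffic, whichever is lower."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


PEAK_GFLOPS = 3_000.0            # about three teraflops per chip
BASELINE_GB_S = 100.0            # hypothetical conventional-memory bandwidth
FAST_GB_S = 5 * BASELINE_GB_S    # on-package memory, "five times faster"

if __name__ == "__main__":
    for intensity in (0.25, 1.0, 4.0, 16.0, 64.0):
        slow = attainable_gflops(PEAK_GFLOPS, BASELINE_GB_S, intensity)
        fast = attainable_gflops(PEAK_GFLOPS, FAST_GB_S, intensity)
        print(f"{intensity:>5} flops/byte: {slow:7.0f} -> {fast:7.0f} gigaflops")
```

With these assumed numbers, a code that performs one operation per byte moved is limited by memory in both cases and speeds up fivefold with the faster memory. Only codes that do many operations per byte reach the chip's arithmetic peak.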

“If you know ahead of time that this job is going to run and that it needs data out on disk, you can read this data into the flash memory before the job starts to run, so your processor isn’t sitting idle waiting for data,” Dosanjh explains.
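
The staging idea Dosanjh describes can be sketched in a few lines of Python. This is a toy illustration, not NERSC's or Cray's software: two temporary directories stand in for disk and flash, and the names `stage_in`, `run_job` and `run_queue` are hypothetical. While one job computes, a background thread copies the next job's input to the fast tier.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def stage_in(source: Path, fast_dir: Path) -> Path:
    """Copy a job's input from slow storage to the fast tier; return the new path."""
    target = fast_dir / source.name
    shutil.copy(source, target)
    return target


def run_job(input_path: Path) -> int:
    """Stand-in for a compute job: it just sums the bytes of its input."""
    return sum(input_path.read_bytes())


def run_queue(inputs, fast_dir):
    """Run jobs in order, staging the next job's input while the current one runs."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as stager:
        pending = stager.submit(stage_in, inputs[0], fast_dir) if inputs else None
        for i in range(len(inputs)):
            staged = pending.result()  # blocks only if staging is still in progress
            if i + 1 < len(inputs):
                # Start copying the next input before this job begins computing.
                pending = stager.submit(stage_in, inputs[i + 1], fast_dir)
            results.append(run_job(staged))
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as disk, tempfile.TemporaryDirectory() as flash:
        inputs = []
        for n in range(3):
            path = Path(disk) / f"job{n}.dat"
            path.write_bytes(bytes([n + 1]) * 10)
            inputs.append(path)
        print(run_queue(inputs, Path(flash)))
```

The design choice is simply to overlap data movement with computation. A real system would do this through its job scheduler and storage software rather than inside the application.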

At the SciDAC-3 meeting, Dosanjh emphasized that getting the most out of Cori will require tweaking and optimizing the approximately 650 applications scientists run at NERSC.

“The big challenges will be exposing a lot more concurrency, or parallelization, in codes to take advantage of the many nodes, and effectively using the two additional layers of memory,” Dosanjh says.
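
Why exposing more concurrency matters can be illustrated with Amdahl's law, a textbook formula that is not part of Dosanjh's remarks. If only a fraction of a code can run in parallel, the serial remainder caps the speedup no matter how many cores are available. The sketch below compares 12 cores (as on a Hopper chip) with 60 (roughly a Knights Landing chip). The parallel fractions are hypothetical, and the formula treats every core as equally fast, which the simpler Knights Landing cores are not.

```python
# Amdahl's law: ideal speedup on `cores` identical cores when only
# `parallel_fraction` of the work can be spread across them.

def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for fraction in (0.50, 0.90, 0.99):      # hypothetical parallel fractions
        on_12 = amdahl_speedup(fraction, 12)
        on_60 = amdahl_speedup(fraction, 60)
        print(f"{fraction:.0%} parallel: {on_12:5.2f}x on 12 cores, "
              f"{on_60:5.2f}x on 60 cores")
```

Under this idealized model, a code that is 90 percent parallel can never exceed a tenfold speedup, so moving from 12 to 60 cores pays off fully only for codes whose serial portion has been driven close to zero.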

He notes that the NERSC Exascale Science Applications Program is set to create a pool of Cori early adopters.

From more than 50 proposals, the program will select 20 science user teams, eight of which will receive an embedded NERSC post-doctoral researcher. All 20 teams will have access to early hardware test beds and what Dosanjh calls “deep dungeon” training sessions with Intel technicians so the user teams can adapt their codes for Cori.

Dosanjh anticipates that Cori will fit in perfectly at its new home at the Lawrence Berkeley National Laboratory’s Computational Research and Theory building, now under construction and slated for completion in 2015.

Like Cori, the dedicated supercomputer building is designed with energy efficiency at its core, including using San Francisco Bay air circulated through heat exchangers to cool its supercomputers.

Waste computer heat also will be the building’s primary source of heating.

Dosanjh and the other NERSC staff plan to move to the new building early next year, before Edison (the NERSC-7 computer) and Cori are up and running.

He adds, “It could be a little chilly that first winter.”