
Building an operating system from the ground up

Posted April 16, 2007

In a project that harks back to the days of computer pioneer John von Neumann, scientists at Sandia National Laboratories in Albuquerque, N.M., are breaking down the entire concept of an operating system (OS) and rebuilding it.

Like von Neumann’s original computer architecture, the Config-OS framework they’re designing combines a family of core components to build special-purpose operating systems.  Project coordinator Ron Brightwell and his team want to build an OS backbone sturdy enough to run a variety of applications on computers capable of 1 quadrillion calculations per second and beyond, while still delivering good performance and the features programmers need.

The project is part of the FAST-OS initiative sponsored by the Department of Energy’s Office of Advanced Scientific Computing Research.  It combines the massively parallel scientific computing systems experience of Brightwell and his team at Sandia, a DOE facility, with the expertise of their long-time academic partner Barney Maccabe at the University of New Mexico and the visionary work of Thomas Sterling at Louisiana State University.

Sterling is perhaps best known for his pioneering work on Beowulf, the first system to cluster inexpensive personal computers into a parallel processing machine.  Now he’s developing platform designs for beyond-petascale computing.  Brightwell says the partnership will ensure that Config-OS (as it’s called for now) will accommodate new computer architectures well into the next decade.

Those future architectures are likely to be based on the parallel programming model, which has been the dominant high-end computing scheme since the early 1990s.  Parallel computing solves big problems much more quickly by breaking them into pieces so multiple processors can work on them simultaneously.  Today, parallel computing systems have grown to thousands of processors, and that trend is likely to continue, creating computers with tens of thousands of processors over the next decade.  A critical challenge now is developing operating systems that can effectively control these systems, with their very large processor counts, when they are used to solve our nation’s most complex science and engineering problems.

When high-performance parallel computing was first developed, programmers created operating systems specific to each platform, Brightwell explains.  Some programmers elected to build a simple, “lightweight” OS that didn’t compromise system speed.  Others chose to use existing full-featured systems developed for time-shared environments, where each server simultaneously juggled several users and applications.  The relatively inexpensive Beowulf clusters and the Linux open-source operating system enabled parallel programming with a larger set of OS services and features.  However, Brightwell points out, many of those services deal with file systems and virtual memory, which are irrelevant for large parallel systems like Sandia’s Red Storm – the first computer in the Cray XT3 product line.

