Volume 7, Issue 1 -
Spring/Summer 1999

Volume 6, Issue 3
Fall 1998

Volume 6, Issue 2
Spring/Summer 1998

Volume 6, Issue 1
Winter 1998

Volume 5, Issue 4
Fall 1997

Volume 5, Issue 3
Summer 1997

Volume 5, Issue 2
Spring 1997

Volume 5, Issue 1
Winter 1997

Volume 4, Issue 4
Fall 1996

Volume 4, Issue 3
Summer 1996

Volume 4, Issue 2
Spring 1996

Volume 4, Issue 1
Winter 1996

Volume 3, Issue 4
Fall 1995

Volume 3, Issue 3
Summer 1995

Volume 3, Issue 2
Spring 1995

Volume 3, Issue 1
January 1995

Volume 2, Issue 4
October 1994

Volume 2, Issue 3
July 1994

Volume 2, Issue 2
April 1994

Volume 2, Issue 1
January 1994

Volume 1, Issue 4
October 1993

Volume 1, Issue 3
July 1993

Volume 1, Issue 2
April 1993

Volume 1, Issue 1
January 1993

Caltech Structured Multithreaded Programming Project

K. Mani Chandy and John Thornley, Caltech

The idea of dividing a program into multiple threads of control to speed up execution on a multiprocessor computer is well established; however, this kind of multithreaded programming has made very little impact on mainstream business and personal computing, or even on most areas of science and engineering. The recent advent of inexpensive multiprocessor PCs and commodity operating system support for lightweight threads opens the door to an important role for high-performance multithreading in all areas of computing. Dual-processor and quad-processor PCs are now available for a few thousand dollars, and mass-produced multiprocessors with 8, 16, and more processors will soon follow. Commodity operating systems (e.g., Windows NT and Unix) already can support hundreds of fine-grained threads with low overhead, even on single-processor machines, and future releases will be even more efficient.

Examples of mainstream applications that could use multithreading to improve performance include spreadsheets, CAD/CAM, three-dimensional rendering, photo/video editing, voice recognition, games, simulation, and resource management. In addition, the range of multithreaded science and engineering applications could be greatly expanded.

The Caltech Structured Multithreaded Programming Project is focusing on how to divide the effort of efficient multithreaded programming between the programmer, compiler, run-time system, and hardware. The group is concentrating on mainstream applications, commodity hardware, and commodity operating systems.

The Difficulty of Multithreaded Programming

The biggest obstacle to incorporating multithreading technologies into mainstream business and personal computing is the difficulty of developing efficient multithreaded programs. General-purpose thread libraries (e.g., Win32 threads, Pthreads, and Java threads) are well suited to systems programming applications of threads, such as control systems, database systems, and distributed systems. However, the interface provided by general-purpose thread libraries is not well suited to applications where threads are used for the purpose of speeding up program execution. General-purpose thread management is unstructured and synchronization operations are complex and error-prone. The nondeterministic interactions of multiple threads introduce many problems, including race conditions and deadlock, that do not occur in sequential programming. In many regards, general-purpose thread libraries can be viewed as the assembly language of high-performance multithreaded programming.

Automatic Parallelization is Not the Solution

Automatic parallelization is one approach to overcoming the difficulty of multithreaded programming. The programmer writes a sequential program and delegates responsibility to the compiler for transforming the sequential program into an efficient multithreaded program. This approach is very appealing because of its extreme simplicity from the programmer's point of view. Unfortunately, automatic parallelization currently is an infeasible task for typical mainstream applications, which involve complex data and control structures, and consist of large numbers of separately-compiled modules. Data and control flow analysis is an extremely difficult problem, and efficient multithreaded programs often require completely different algorithms from efficient sequential programs. Despite decades of research, there is no indication that automatic parallelization is close to becoming a general approach to efficient multithreaded programming.

The Caltech Approach: Explicit Multithreading Using Sequential Reasoning

The Caltech group's approach is to have the programmer write an explicitly multithreaded program, using structured thread creation and synchronization constructs based on standard sequential constructs. In this model, a multithreaded program is simply a sequential program (e.g., in C or C++) annotated with pragmas that indicate where ordinary sequential blocks and for-loops can be executed in a multithreaded manner. Explicit operations on Boolean flags and integer counters allow sophisticated synchronization within multithreaded blocks and for-loops. Subject to a few simple restrictions, multithreaded execution is guaranteed to be deterministic and produce the same results as sequential execution. This is very useful in development, testing, and debugging. Nondeterminacy can be introduced in a controlled manner, when it is important to the efficiency of a multithreaded algorithm.

The relationship between this structured multithreaded programming model and a general-purpose thread library is analogous to the relationship between a high-level sequential language (e.g., C or C++) and assembly code. The model is no more powerful than a general-purpose thread library, just as C++ is no more powerful than assembly code. However, it provides multithreaded programming structures that are easier to reason about than unrestricted thread library calls, just as while-loops and if-statements are easier to reason about than unrestricted assembly code branch instructions. These multithreaded programming structures restrict nondeterminacy, thereby allowing sequential reasoning about multithreaded execution. However, unlike automatic parallelization, there is a direct mapping from the program to the operational details that determine multithreaded performance. In particular, the programmer directly controls every thread creation and synchronization operation.

Sthreads: A System for Structured Multithreaded Programming

As a demonstration of this structured multithreaded programming model, the researchers are developing and experimenting with a programming system called Sthreads (Structured Threads). Sthreads is a library and pragma-based notation for the C/C++ programming language. The key attributes of Sthreads are as follows:
  • Pragmas for structured thread creation, based on ordinary blocks and for-loops.
  • Library calls for structured synchronization, based on Boolean flags and integer counters.
  • Multithreaded execution is deterministic and produces the same results as sequential execution.
  • Lock synchronization for nondeterminacy.
  • Barrier synchronization for efficiency.

Integration with Microsoft Visual C++ on Windows NT

The Sthreads library is easily implemented as a thin layer on top of general-purpose thread libraries, and the Sthreads pragma preprocessor is a simple source-to-source transformation tool that involves no complex program analysis. Therefore, the entire system is highly portable. However, since the primary interest of the group is mainstream computing, its focus is on supporting Sthreads in the context of Microsoft Visual C++ on Windows NT. As a result, Visual C++ programmers can use Sthreads without having to learn a new programming language, development environment, or tool set.

Sequential Development and Multithreaded Performance

One of the major strengths of Sthreads is that it allows efficient multithreaded programs to be developed using standard sequential techniques, prior to multithreaded execution for performance. Sthreads applications can be built either as sequential applications (by ignoring the pragmas) or multithreaded applications (as directed by the pragmas). For deterministic algorithms, most development, testing, and debugging is performed in sequential mode, and then performance analysis and production runs are performed in multithreaded mode. Absence of race conditions and deadlock can be verified in the context of sequential execution. As long as the programmer obeys a few simple rules, the results of multithreaded execution will be indistinguishable from those of sequential execution. Sthreads allows nondeterminacy to be added to a program in a controlled and limited manner, so that nondeterministic programs can be developed with large deterministic components.

Applications: A Practical Test of the Model

Although the structured multithreaded programming model is known to be based on sound theory, the best test of its practical value comes from experience with real applications. To demonstrate and further investigate the strengths and weaknesses of the programming model and Sthreads, the researchers are developing the following multithreaded applications:
  • Molecular Dynamics Simulation: Time-stepped simulation of the interactions of millions of atoms.<>
  • Software-Enabled Control: Trajectory planning and dynamic control for flocks of unmanned aircraft.
  • Route Optimization: Finding the safest trajectory for an aircraft flying through a risky airspace.
  • Terrain Masking: Finding the maximum flight altitude over an uneven terrain without being observed.
  • Threat Analysis: Timed-stepped simulation of options for intercepting incoming ballistic threats.
  • Volume Rendering: Render a two-dimensional visual image of a three-dimensional data set.
All of these applications involve complex data structures and control flow. None of them are straightforward to parallelize efficiently. The thrust of the investigation is on developing the best algorithms to obtain the best performance on commodity shared-memory multiprocessors running commodity operating systems. Unlike most research on high-performance multithreaded programming, this project involves the performance of these applications on an unloaded machine, and also under realistic background load conditions when the machine is shared with other users and other applications.

For more information on the Caltech Structured Multithreaded Programming Project, see http://threads.cs.caltech.edu.