|Volume 7, Issue 1 -
Caltech Structured Multithreaded Programming Project
K. Mani Chandy and John Thornley, Caltech
The idea of dividing a program into multiple threads of control to speed up execution on a multiprocessor computer is well established; however, this kind of multithreaded programming has made very little impact on mainstream business and personal computing, or even on most areas of science and engineering. The recent advent of inexpensive multiprocessor PCs and commodity operating system support for lightweight threads opens the door to an important role for high-performance multithreading in all areas of computing. Dual-processor and quad-processor PCs are now available for a few thousand dollars, and mass-produced multiprocessors with 8, 16, and more processors will soon follow. Commodity operating systems (e.g., Windows NT and Unix) already can support hundreds of fine-grained threads with low overhead, even on single-processor machines, and future releases will be even more efficient.
Examples of mainstream applications that could use multithreading to improve performance include spreadsheets, CAD/CAM, three-dimensional rendering, photo/video editing, voice recognition, games, simulation, and resource management. In addition, the range of multithreaded science and engineering applications could be greatly expanded.
The Caltech Structured Multithreaded Programming Project is focusing on how to divide the effort of efficient multithreaded programming between the programmer, compiler, run-time system, and hardware. The group is concentrating on mainstream applications, commodity hardware, and commodity operating systems.
The Difficulty of Multithreaded ProgrammingThe biggest obstacle to incorporating multithreading technologies into mainstream business and personal computing is the difficulty of developing efficient multithreaded programs. General-purpose thread libraries (e.g., Win32 threads, Pthreads, and Java threads) are well suited to systems programming applications of threads, such as control systems, database systems, and distributed systems. However, the interface provided by general-purpose thread libraries is not well suited to applications where threads are used for the purpose of speeding up program execution. General-purpose thread management is unstructured and synchronization operations are complex and error-prone. The nondeterministic interactions of multiple threads introduce many problems, including race conditions and deadlock, that do not occur in sequential programming. In many regards, general-purpose thread libraries can be viewed as the assembly language of high-performance multithreaded programming.
Automatic Parallelization is Not the SolutionAutomatic parallelization is one approach to overcoming the difficulty of multithreaded programming. The programmer writes a sequential program and delegates responsibility to the compiler for transforming the sequential program into an efficient multithreaded program. This approach is very appealing because of its extreme simplicity from the programmer's point of view. Unfortunately, automatic parallelization currently is an infeasible task for typical mainstream applications, which involve complex data and control structures, and consist of large numbers of separately-compiled modules. Data and control flow analysis is an extremely difficult problem, and efficient multithreaded programs often require completely different algorithms from efficient sequential programs. Despite decades of research, there is no indication that automatic parallelization is close to becoming a general approach to efficient multithreaded programming.
The Caltech Approach: Explicit Multithreading Using Sequential ReasoningThe Caltech group's approach is to have the programmer write an explicitly multithreaded program, using structured thread creation and synchronization constructs based on standard sequential constructs. In this model, a multithreaded program is simply a sequential program (e.g., in C or C++) annotated with pragmas that indicate where ordinary sequential blocks and for-loops can be executed in a multithreaded manner. Explicit operations on Boolean flags and integer counters allow sophisticated synchronization within multithreaded blocks and for-loops. Subject to a few simple restrictions, multithreaded execution is guaranteed to be deterministic and produce the same results as sequential execution. This is very useful in development, testing, and debugging. Nondeterminacy can be introduced in a controlled manner, when it is important to the efficiency of a multithreaded algorithm.
The relationship between this structured multithreaded programming model and a general-purpose thread library is analogous to the relationship between a high-level sequential language (e.g., C or C++) and assembly code. The model is no more powerful than a general-purpose thread library, just as C++ is no more powerful than assembly code. However, it provides multithreaded programming structures that are easier to reason about than unrestricted thread library calls, just as while-loops and if-statements are easier to reason about than unrestricted assembly code branch instructions. These multithreaded programming structures restrict nondeterminacy, thereby allowing sequential reasoning about multithreaded execution. However, unlike automatic parallelization, there is a direct mapping from the program to the operational details that determine multithreaded performance. In particular, the programmer directly controls every thread creation and synchronization operation.
Sthreads: A System for Structured Multithreaded ProgrammingAs a demonstration of this structured multithreaded programming model, the researchers are developing and experimenting with a programming system called Sthreads (Structured Threads). Sthreads is a library and pragma-based notation for the C/C++ programming language. The key attributes of Sthreads are as follows:
Integration with Microsoft Visual C++ on Windows NTThe Sthreads library is easily implemented as a thin layer on top of general-purpose thread libraries, and the Sthreads pragma preprocessor is a simple source-to-source transformation tool that involves no complex program analysis. Therefore, the entire system is highly portable. However, since the primary interest of the group is mainstream computing, its focus is on supporting Sthreads in the context of Microsoft Visual C++ on Windows NT. As a result, Visual C++ programmers can use Sthreads without having to learn a new programming language, development environment, or tool set.
Sequential Development and Multithreaded PerformanceOne of the major strengths of Sthreads is that it allows efficient multithreaded programs to be developed using standard sequential techniques, prior to multithreaded execution for performance. Sthreads applications can be built either as sequential applications (by ignoring the pragmas) or multithreaded applications (as directed by the pragmas). For deterministic algorithms, most development, testing, and debugging is performed in sequential mode, and then performance analysis and production runs are performed in multithreaded mode. Absence of race conditions and deadlock can be verified in the context of sequential execution. As long as the programmer obeys a few simple rules, the results of multithreaded execution will be indistinguishable from those of sequential execution. Sthreads allows nondeterminacy to be added to a program in a controlled and limited manner, so that nondeterministic programs can be developed with large deterministic components.
Applications: A Practical Test of the ModelAlthough the structured multithreaded programming model is known to be based on sound theory, the best test of its practical value comes from experience with real applications. To demonstrate and further investigate the strengths and weaknesses of the programming model and Sthreads, the researchers are developing the following multithreaded applications:
For more information on the Caltech Structured Multithreaded Programming Project, see http://threads.cs.caltech.edu.