dHPF: A PLATFORM FOR DATA-PARALLEL COMPILER RESEARCH


Vikram Adve, Ken Kennedy, John Mellor-Crummey, Michael Paleczny, Ajay Sethi, Rice University

The Fortran D project at Rice University is developing a data-parallel compiler (dHPF) to serve as the basis for long-term research on compilers and tools for machine-independent parallel programming. The project aims to expand data-parallel compilation technology in three directions: higher performance on "regular" applications, such as dense-matrix codes; optimization techniques for emerging architectures, such as distributed shared-memory and clusters of shared-memory multiprocessors; and new high-level language support for irregular applications and out-of-core computations. In a companion project, the group is developing programming tools for performance analysis and automatic data layout selection.

The Fortran D95 language is the input language to the dHPF compiler. Fortran D95 is a research variant of High Performance Fortran (HPF), incorporating most of the important features in the official HPF 1.1 Subset as well as experimental language features, such as directives for out-of-core layouts of large arrays.
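
For readers unfamiliar with the directive style, the fragment below illustrates the kind of HPF 1.1 Subset directives Fortran D95 accepts (the array names, sizes, and processor arrangement are invented for this sketch; the experimental out-of-core directives are not shown because their syntax is specific to the research language):

      INTEGER, PARAMETER :: N = 1024
      REAL A(N,N), B(N,N)
      INTEGER I, J
!HPF$ PROCESSORS P(4,4)
!HPF$ DISTRIBUTE A(BLOCK,BLOCK) ONTO P
!HPF$ ALIGN B(I,J) WITH A(I,J)

      DO J = 2, N-1
         DO I = 2, N-1
            A(I,J) = 0.25 * (B(I-1,J) + B(I+1,J) + B(I,J-1) + B(I,J+1))
         END DO
      END DO

The directives are comments to an ordinary Fortran compiler; an HPF compiler such as dHPF interprets them and derives the parallelism and communication they imply.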

The project so far has focused on developing a compiler aimed at providing high performance for regular applications using a Message Passing Interface (MPI) communication substrate. The compiler supports important optimizations found in other data-parallel compilers, including communication vectorization, communication coalescing, reduction recognition, and overlap area analysis. In addition to these optimizations, three features distinguish this work from other data-parallel compilers.
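
To make communication vectorization and overlap areas concrete, consider the hand-written sketch below (an illustration of the optimization, not actual dHPF output; the message placement is described in comments rather than generated MPI calls). The source loop adds a row-shifted copy of a (BLOCK,*)-distributed array:

      INTEGER, PARAMETER :: N = 512
      REAL A(N,N), B(N,N)
      INTEGER I, J
!HPF$ DISTRIBUTE A(BLOCK,*)
!HPF$ ALIGN B(I,J) WITH A(I,J)

      DO J = 1, N
         DO I = 2, N
            A(I,J) = A(I,J) + B(I-1,J)
         END DO
      END DO

! A naive translation would receive the single non-local element B(I-1,J)
! inside the J loop, producing N small messages at every processor boundary.
! Communication vectorization hoists the receive outside both loops, so each
! processor obtains its neighbor's boundary row B(lb-1,1:N) in one message;
! overlap area analysis reserves a "ghost" row adjacent to the local block to
! hold it, and communication coalescing merges this transfer with any other
! message that carries the same data.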

First, the compiler supports a much wider class of computation partitionings than the traditional owner-computes rule, while still allowing different statements within the same loop to have different partitionings. These generalizations are made possible by sophisticated techniques for communication analysis and code generation, and provide a flexible framework for experimenting with computation partitioning algorithms. The current implementation chooses partitionings for statements in a loop to collectively minimize communication.
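
A small, constructed example (not drawn from the dHPF papers) of why partitionings other than owner-computes can pay off:

      INTEGER, PARAMETER :: N = 1000
      REAL A(N), B(N), C(N)
      INTEGER I
!HPF$ DISTRIBUTE A(BLOCK)
!HPF$ ALIGN B(I) WITH A(I)
!HPF$ ALIGN C(I) WITH A(I)

      DO I = 1, N-1
         A(I+1) = B(I) + C(I)
      END DO

! Owner-computes runs iteration I on the owner of A(I+1), which at every
! block boundary must receive two off-processor values, B(I) and C(I).
! Partitioning the statement on the owner of B(I) instead requires sending
! only the single result A(I+1) across each boundary, halving the traffic.
! With several statements in a loop, each can be given its own partitioning,
! and the combination is chosen to minimize communication overall.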

Second, the compiler uses a powerful integer set manipulation package, the Omega library from the University of Maryland, to implement large parts of communication analysis and code generation in terms of simple abstract operations on integer sets. Because of the generality of Omega, powerful new communication and code generation optimizations have become practical.
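
Schematically (an illustration of the style of reasoning rather than the compiler's actual formulation), communication analysis for a statement like the one in the example above reduces to operations on sets of array elements described by affine constraints, which is exactly what Omega manipulates:

      Refs(p)  = elements of B referenced by iterations assigned to processor p
      Owned(p) = elements of B in p's block of the distribution
      Recv(p)  = Refs(p) - Owned(p)                      (data p must receive)
      Send(p)  = Owned(p) intersected with the Refs sets of all other processors

Code generation then enumerates such sets as loop nests that pack, send, receive, and unpack exactly the required elements.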

Third, during performance optimizations that affect storage requirements for non-local data (specifically, communication placement and overlap area selection), the compiler ensures that limits on available memory are not exceeded. Such memory-conscious compilation strategies are important because feasible problem sizes for real applications are often limited by the available memory per processor.

The compiler supports experimental language directives to enable out-of-core computation on large data sets. Using these directives, which suggest a computation and I/O organization for data on multiple disks, the compiler transforms an ordinary in-memory program into a staged computation on slabs of out-of-core data. This transformation considers communication, computation, and I/O costs while observing memory limits. The compiler currently uses the PASSION runtime library from Syracuse University for parallel I/O.
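
The shape of that transformation can be sketched as follows for an element-wise update of an N x N array X that is too large to hold in memory (a hand-written outline assuming a column-wise slab decomposition; the real compiler also inserts communication, picks the slab size from the memory limit, and issues PASSION runtime calls, none of which are reproduced here):

      INTEGER, PARAMETER :: N = 20000, SLAB = 256
      REAL XBUF(N, SLAB)          ! in-memory buffer for one slab of columns
      INTEGER I, J, JS, JE

      DO JS = 1, N, SLAB
         JE = MIN(JS + SLAB - 1, N)
!        read columns JS:JE of the out-of-core array X into XBUF
!        (the generated code calls the PASSION parallel I/O runtime here)
         DO J = JS, JE
            DO I = 1, N
               XBUF(I, J-JS+1) = 2.0 * XBUF(I, J-JS+1)
            END DO
         END DO
!        write the updated slab back to disk, again through PASSION
      END DO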

In the near future, the Fortran D group expects to focus its research on compiler techniques for new classes of parallel architectures and evolving HPF language features. A major goal of the group's research is to develop compiler techniques that target the emerging class of distributed shared-memory multiprocessors such as the Convex Exemplar SPP-2000 and the SGI Origin 2000 systems. The researchers are also currently experimenting with compiler and language support for multi-block codes, which are computations on irregularly-coupled blocks of regular dense meshes.

The initial distribution of the dHPF compiler will be publicly available from the group's World Wide Web site in November 1996. This compiler version will include support for regular computation on message-passing systems using MPI, and initial support for out-of-core computations. The first release of the compiler will have very limited support for parallelizing loops containing procedure calls; in particular, calls to functions other than Fortran intrinsics will inhibit parallelization. This limitation will be relaxed in future releases.

Other members of the Fortran D research group who have contributed to the compiler design and implementation are Zoran Budimlic, Arun Chauhan, Chen Ding, Gil Hansen, Bo Lu, Collin McCurdy, Kathy Fletcher, Nat McIntosh, Monica Mevencamp, Dejan Mircevski, Nenad Nedeljkovic, Yoshiki Seo, Lisa Thomas, and Lei Zhou.

For more information, see http://www.cs.rice.edu/~dsystem/.

