Volume 7, Issue 1
Stencil Compiler for the Thinking Machines CM-2 and CM-200
Ralph G. Brickner, Los Alamos National Laboratory; William George, Clemson University; S. Lennart Johnsson and Alan Ruttenberg, Thinking Machines Corporation
A stencil is a weighted sum of circularly shifted CM Fortran arrays. The stencil compiler optimizes the data motion between processing nodes, minimizes the data motion within a node, and minimizes the data motion between registers and local memory in a node. The compiler developed for this project makes novel use of the communication system and has highly optimized register use.
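To make the definition concrete, the following minimal sketch evaluates a stencil as a weighted sum of circularly shifted copies of one array. It is plain Python rather than CM Fortran, and it only illustrates the arithmetic, not the compiler's generated code. The helper names (`cshift`, `apply_stencil`) and the five-point weights are assumptions chosen for illustration; the shift sign follows the convention of the Fortran CSHIFT intrinsic, where a positive shift brings in the element at the next higher index.

```python
def cshift(grid, shift, axis):
    """Circularly shift a 2-D list of lists.

    Sign convention as in Fortran's CSHIFT: along the chosen axis,
    result[i] = grid[(i + shift) mod n].
    """
    rows, cols = len(grid), len(grid[0])
    if axis == 0:
        return [grid[(r + shift) % rows][:] for r in range(rows)]
    return [[row[(c + shift) % cols] for c in range(cols)] for row in grid]


def apply_stencil(grid, terms):
    """Return the weighted sum of circularly shifted copies of `grid`.

    `terms` maps a (row_shift, col_shift) pair to its weight.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for (dr, dc), weight in terms.items():
        shifted = cshift(cshift(grid, dr, 0), dc, 1)
        for r in range(rows):
            for c in range(cols):
                out[r][c] += weight * shifted[r][c]
    return out


# Hypothetical five-point stencil (illustrative weights, not from the paper).
FIVE_POINT = {(0, 0): -4.0, (-1, 0): 1.0, (1, 0): 1.0, (0, -1): 1.0, (0, 1): 1.0}

grid = [[0.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]
result = apply_stencil(grid, FIVE_POINT)
```

Because every shift is circular, points on the array boundary take their neighbors from the opposite edge, so no special boundary handling appears in the sum.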
The stencil compiler has three major software components: customized microcode, a compiler to generate calls to the microcode and supply the precise register and memory access patterns required, and a runtime system that resolves some data distribution issues unknown at compile time and applies the microcode to a set of user arrays at run time.
Performance of the stencil compiler depends strongly on several factors, including the number of stencil points, the stencil size and shape, the data type, the coefficient type and rank, and the geometries of the actual CM Fortran arrays passed in as arguments. One crucial factor is whether a given stencil will "fit" into the floating-point unit (FPU) and sequencer register set at all and, if so, what the maximum "multiwidth" is. Here, "multiwidth" refers to the number of stencils that can be stacked together along one axis in order to reuse FPU registers along that axis. In general, the greater the multiwidth, the better the performance. If a multiwidth-one multistencil cannot be made to fit into the register set, the stencil compiler returns with an informational message. The user is then free to decompose the stencil into smaller stencils by hand. Future work will address automating such decomposition.
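The benefit of a larger multiwidth can be illustrated with a simple counting model. The sketch below is an assumption-laden toy, not the compiler's actual register allocator: it assumes one register per distinct source element plus one accumulator per stacked result, and the function names and the `register_budget` parameter are hypothetical. It shows how stacking stencil results along one axis lets neighboring results share source elements, and how a fixed register budget caps the multiwidth.

```python
def multistencil_footprint(offsets, width, axis=1):
    """Distinct source offsets touched by `width` stencil results
    stacked along `axis`."""
    footprint = set()
    for k in range(width):
        for off in offsets:
            shifted = list(off)
            shifted[axis] += k
            footprint.add(tuple(shifted))
    return footprint


def registers_needed(offsets, width, axis=1):
    """Toy model (an assumption): one register per distinct source
    element plus one accumulator per stacked result."""
    return len(multistencil_footprint(offsets, width, axis)) + width


def max_multiwidth(offsets, register_budget, axis=1):
    """Largest width that fits the hypothetical budget, or None if even
    a multiwidth-one multistencil does not fit."""
    width = 0
    while registers_needed(offsets, width + 1, axis) <= register_budget:
        width += 1
    return width or None


# Offsets of a five-point stencil (illustrative).
FIVE_POINT_OFFSETS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

# Source elements loaded per result, for widths 1 and 4.
loads_w1 = len(multistencil_footprint(FIVE_POINT_OFFSETS, 1)) / 1
loads_w4 = len(multistencil_footprint(FIVE_POINT_OFFSETS, 4)) / 4
```

Under this model, a five-point stencil at multiwidth one touches five source elements per result, while at multiwidth four the fourteen distinct elements are shared among four results, or 3.5 per result. A call such as `max_multiwidth(FIVE_POINT_OFFSETS, 5)` returns `None`, which corresponds to the case where the stencil does not fit and must be decomposed.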
The compiler is available as part of the Connection Machine Scientific Software Library, Release 3.1.