Chris Elford, Tara Madhyastha, Dan Reed, University of Illinois

Using data from the CRPC's Scalable I/O (SIO) characterization effort to guide design, researchers at the University of Illinois have developed a portable parallel file system (PPFS) to study the interaction of application access patterns, file caching and prefetching algorithms, and application file data distributions.

PPFS consists of a group of cooperating data servers and has a rich application interface that allows applications to advertise access patterns and to control caching, prefetching, and data placement policies at multiple levels. Each parallel application node can invoke PPFS library calls to share parallel files and to perform parallel file input/output. Data servers simplify implementation and promote portability and extensibility by distributing data among multiple underlying UNIX files.
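To make the idea of distributing one logical file over several underlying UNIX files concrete, here is a minimal sketch of a round-robin (striped) layout. The article does not say which distributions PPFS actually provides, so the striping scheme, the function name `locate`, and its parameters are all assumptions made for illustration.

```python
def locate(offset, num_servers, stripe_size):
    """Map a logical byte offset to (server index, offset in that server's file).

    Hypothetical round-robin striping; not taken from the PPFS sources.
    """
    if offset < 0 or num_servers <= 0 or stripe_size <= 0:
        raise ValueError("offset must be >= 0; num_servers and stripe_size must be > 0")
    stripe = offset // stripe_size        # which stripe of the logical file
    server = stripe % num_servers         # stripes are dealt out round-robin
    local_stripe = stripe // num_servers  # stripes already stored on that server
    return server, local_stripe * stripe_size + offset % stripe_size


if __name__ == "__main__":
    # Four data servers, 1 KB stripes: show where a few logical offsets land.
    for off in (0, 1024, 5000):
        print(off, locate(off, num_servers=4, stripe_size=1024))
```

Under this layout, consecutive stripes land on different servers, so a large sequential request can be serviced by several servers at once. Other distributions would change only the arithmetic inside `locate`.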

Experiments using large research codes on the Intel Paragon XP/S have shown that tuning PPFS file system policies to application needs, rather than forcing the application to use inappropriate and inefficient file access modes, is the key to performance. In short, simple access pattern hints and cache policy controls can yield large performance benefits. However, to achieve these gains with PPFS, the application writer must understand both the application access pattern and the PPFS input/output cost model in order to specify appropriate policy controls and file data distributions. Manual policy selection therefore places a substantial cognitive burden on the programmer; ideally, policy selection should be automatic.

To automate this performance optimization, the researchers have taken two approaches to adaptive selection of file system policies. The first approach, based on their belief that tuning policies to access patterns is the key to performance, is to classify access patterns automatically using a trained artificial neural network and to select appropriate policies based on these classifications. Experimental data show that performance improves significantly with this approach.
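The classify-then-select idea can be sketched as follows. Note that this sketch replaces the trained neural network with a few hand-written rules, purely to show the shape of the pipeline. The pattern labels, the `classify` function, and the entries in `POLICIES` are hypothetical and do not come from PPFS.

```python
def classify(offsets, sizes):
    """Label a request trace as 'sequential', 'strided', 'random', or 'unknown'.

    Rule-based stand-in for the trained classifier described in the article.
    """
    # Gap between the end of each request and the start of the next one.
    gaps = [offsets[i + 1] - (offsets[i] + sizes[i]) for i in range(len(offsets) - 1)]
    if not gaps:
        return "unknown"
    if all(g == 0 for g in gaps):
        return "sequential"
    if gaps[0] > 0 and all(g == gaps[0] for g in gaps):
        return "strided"
    return "random"


# Hypothetical policy table; the values are illustrative, not measured.
POLICIES = {
    "sequential": {"prefetch_blocks": 4, "replacement": "MRU"},
    "strided": {"prefetch_blocks": 2, "replacement": "LRU"},
    "random": {"prefetch_blocks": 0, "replacement": "LRU"},
    "unknown": {"prefetch_blocks": 0, "replacement": "LRU"},
}


def select_policy(offsets, sizes):
    """Pick caching and prefetching settings from the classified pattern."""
    return POLICIES[classify(offsets, sizes)]


if __name__ == "__main__":
    print(select_policy([0, 100, 200], [100, 100, 100]))
```

A learned classifier would take the place of `classify` while the table lookup stays the same, which is what makes the two stages easy to evolve separately.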

In addition to policy selection based on dynamic classification of file access patterns, PPFS also allows dynamic policy selection based on observed performance. PPFS uses the Pablo library to capture and compute periodic, quantitative performance sensor metrics such as PPFS data rates and cache hit ratios. These dynamic sensor metrics allow the user to select appropriate caching and prefetching policies and to monitor the performance changes caused by these policy shifts. For example, if cache hit ratios are low, the user can have PPFS automatically increase prefetching, thereby improving cache hit ratios. When application access patterns or resource availability change, the resulting shifts in the observed sensor metrics trigger additional policy refinement.
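The sensor-driven feedback loop described above might look like the sketch below. The thresholds, the doubling rule, and the function name `adjust_prefetch` are assumptions chosen for illustration. The article states only that low cache hit ratios can trigger more prefetching, so the back-off on high hit ratios is an added design choice, not something the source describes.

```python
def adjust_prefetch(depth, hits, accesses, low=0.5, high=0.9, max_depth=16):
    """Return a new prefetch depth from one period's cache sensor readings.

    Hypothetical controller: all thresholds and step sizes are assumptions.
    """
    if accesses == 0:
        return depth  # no data this period, so leave the policy alone
    ratio = hits / accesses
    if ratio < low:
        # Low hit ratio: prefetch more aggressively, up to a cap.
        return min(max_depth, max(1, depth * 2))
    if ratio > high:
        # Very high hit ratio: back off one step to save memory and bandwidth.
        return max(0, depth - 1)
    return depth


if __name__ == "__main__":
    depth = 1
    # Each pair is (cache hits, total accesses) reported for one sensor period.
    for hits, accesses in [(10, 100), (30, 100), (70, 100), (95, 100)]:
        depth = adjust_prefetch(depth, hits, accesses)
        print(f"hit ratio {hits / accesses:.2f} -> prefetch depth {depth}")
```

Calling the function once per sensor period means the policy keeps adapting as access patterns or resource availability shift.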

The group is currently designing a new parallel file system library that will conform to the low-level SIO parallel input/output Application Programming Interface (API). The portable implementation will support the SIO hint and collective input/output extensions. The new design will also incorporate the adaptive ability of PPFS to select file system policies based on both qualitative and quantitative performance data.

Additional information about PPFS, along with a beta software distribution for a variety of parallel platforms, is available at http://www-pablo.cs.uiuc.edu/Projects/PPFS/. Supported platforms include the Intel Paragon (NX), the IBM SP/2 (MPI), the SGI Power Challenge Array (MPI), the Convex Exemplar (MPI and PVM), and workstation clusters (MPI and PVM).
