PACI Projects at the CRPC


Last year, the CRPC became a major player in the National Science Foundation's Partnerships for Advanced Computational Infrastructure (PACI) program. This program replaces and extends the previous Supercomputer Centers program, which is currently being phased out. The goals of PACI include providing access for scientists to high performance computational resources, enabling effective use of advanced computational infrastructure, promoting early use of these advances, fostering interdisciplinary research enabled by computational advances, facilitating the development of intellectual capital, and broadening the national computational infrastructure activity base. In short, PACI's mission is to broaden the reach of HPCC in the scientific community. A key part of that mission is to make parallel machines truly usable, a goal that the CRPC has advanced from its very beginnings.

NSF funded two PACI partnerships in 1997: the National Computational Science Alliance ("the Alliance"), headquartered at the University of Illinois at Urbana-Champaign, and the National Partnership for Advanced Computational Infrastructure (NPACI), headquartered at the University of California, San Diego. CRPC sites and affiliated sites are heavily involved in both partnerships, conducting computationally intensive research and educational programs that will help develop and exploit the infrastructure the partnerships create.

PACI is funded at approximately $340 million over five years. For more information, see http://alliance.ncsa.uiuc.edu/, http://www.cise.nsf.gov/acir/paci.html, and http://www.npaci.edu.

Alliance Projects

The Alliance includes more than 50 university, national laboratory, and industrial partner sites across the United States. (Originally, there was at least one partner in every state.) The project vision is to prototype a national computational infrastructure called the National Technology Grid. The Grid will enable the science and engineering community to take advantage of rapidly improving high-performance computing and communications technologies and will make those developments available to broad sectors of society. Components of the Grid will include the NCSA supercomputer center, mid-range centers, high-speed networking, and visualization resources. Equally important will be the human parts of the Grid: the scientists, engineers, and educators who will provide education, outreach, training, and development expertise to the national community.

The Alliance organizes its activities around several teams. Three Enabling Technologies (ET) teams work on computer science issues in Parallel Supercomputing, Distributed Computation, and Data and Collaboration. Six Application Technologies (AT) teams bring computational science to bear on specific scientific areas. The other team names, "Education, Outreach, and Training (EOT)" and "Partners for Advanced Computational Services (PACS)," are fairly self-explanatory. Members of the CRPC are involved with ET, EOT, and PACS.

Enabling Technologies Teams

Three Enabling Technologies teams are integrating the elements of the Grid so that its distributed components operate as if they are a single computing system. They are creating software to link parallel computing systems remotely with other researchers, computers, databases, visualization environments, and remote instruments.

CRPC Director Ken Kennedy (Rice University) is a team leader for the Parallel Computing Team, which is developing a toolkit composed of portable programming languages, libraries, and other advanced tools to support parallel systems, including distributed shared memory (DSM). Among the first projects undertaken by the team is the development of new compiler capabilities for High Performance Fortran (HPF) and High Performance C++ (HPC++) running on DSM systems. Ultimately, these tools will enable scientists and engineers to build codes portable to all the architectures in the Grid.

Other CRPC researchers involved with the Parallel Computing effort include John Dennis and Chuck Koelbel (Rice University), Jack Dongarra (University of Tennessee), Dennis Gannon (Indiana University), Lennart Johnsson (University of Houston [UH]), and Dan Reed (University of Illinois at Urbana-Champaign).

The Distributed Computing Team is establishing an operating environment in which the Alliance's diverse networks and resources can work together. It is developing an integrated software system that treats the distributed resources of the Grid as a single virtual machine, or metacomputer, yet accommodates each application's unique requirements for bandwidth and resources. One of its first projects is to prototype a master scheduler for the Grid. The basic infrastructure for this will be adapted from Globus, the software originally created for the Information Wide Area Year (I-WAY) by ANL and the University of Southern California's Information Sciences Institute. (See "Research Focus," Spring 1997 Parallel Computing Research.)

Led by Rick Stevens (ANL), this team includes CRPC researchers Ian Foster (ANL), Geoffrey Fox (Syracuse University), Dennis Gannon (Indiana University), and CRPC External Advisory Committee (EAC) member Marina Chen (Boston University).

The third Enabling Technologies group is the Data Collaboration Team, which is working on a toolkit for creating and controlling parallel computing file systems that will make possible large speed increases in input/output (I/O). As well as speeding the flow of data, the team is working on visualization and virtual reality techniques to enable researchers to better understand large data sets. Currently, high-performance computers can operate at speeds exceeding one trillion operations per second, but I/O operations, in which data is moved in and out of secondary and tertiary storage systems, run closer to 10 million bytes per second. This team will bring I/O operations more in line with processor performance. The team is also deploying information management methods to organize, characterize, and access data efficiently so it can be shared more easily, and is enhancing collaboration efforts so that researchers can manipulate and explore data simultaneously via virtual reality environments.
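The gap between the two rates quoted above can be made concrete with a little arithmetic. The following is a minimal sketch that treats the quoted figures as order-of-magnitude values; the 1-gigabyte data set size and the function names are assumptions chosen for illustration, not figures or tools from the Alliance.

```python
# Rates quoted in the text, treated as order-of-magnitude values.
COMPUTE_OPS_PER_SEC = 1e12  # "one trillion operations per second"
IO_BYTES_PER_SEC = 10e6     # "10 million bytes per second"


def ops_per_byte(compute_rate, io_rate):
    """Operations the processor could perform in the time needed to move one byte."""
    return compute_rate / io_rate


def transfer_seconds(num_bytes, io_rate):
    """Time in seconds to move num_bytes at the given I/O rate."""
    return num_bytes / io_rate


# Ratio of compute throughput to I/O throughput.
gap = ops_per_byte(COMPUTE_OPS_PER_SEC, IO_BYTES_PER_SEC)

# Hypothetical data set size (an assumption, not from the article): 1 gigabyte.
DATASET_BYTES = 1e9
load_time = transfer_seconds(DATASET_BYTES, IO_BYTES_PER_SEC)

print(f"Operations possible per byte of I/O: {gap:,.0f}")
print(f"Seconds to move the data set: {load_time:,.0f}")
```

At these rates the processor could perform roughly 100,000 operations in the time it takes to move a single byte, which is the imbalance the team aims to reduce.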

Led by Dan Reed, the Data Collaboration Team includes Rick Stevens and CRPC EAC member Paul Woodward (University of Minnesota).

Partners for Advanced Computational Services

The NCSA has recruited 15 organizations to be Partners for Advanced Computational Services (PACS), which serve as the gateway to the Grid. Some of these sites are contributing computing or visualization resources, and others are serving as satellite sites for the NCSA's education, outreach, and training programs. Most PACS are serving as training grounds, and all act as conduits for promoting and disseminating technologies and educational materials developed by the NCSA teams.

The Argonne PACS involvement, led by Rick Stevens, Tom Morgan, and Remy Evard of ANL, provides access to ANL's computing hardware for qualified researchers. Along with this, ANL provides a testbed for distributed computing and visual supercomputing methods. These testbeds help develop the new technologies needed by the Grid. As a direct result of Stevens' involvement in the ET teams, ANL is also a key site for deploying distributed computing technology to other partners.

Rice's PACS involvement, led by Chuck Koelbel, focuses on deployment of the parallel computing technologies developed in Kennedy's ET team. This includes testing and packaging of the software for DSM, as well as training in those technologies. Access to Rice's HP/Convex Exemplar X-2000 is also available to Alliance researchers.

NPACI Projects

NPACI involves 37 academic and research institutions located in 18 states across the country. Driven by real applications needs, NPACI is developing software infrastructure to link high performance computers, data servers, and archival storage systems to enable easier use of the aggregate computing power. Development work is focused in specific thrust areas that join applications scientists, computer scientists, and technology developers and leverage separately funded research projects to ensure rapid deployment and robustness of the resulting infrastructure. This work is complemented by an extensive education, outreach, and training program and collaboration with industry.

NPACI partners are categorized into four groups: Research and Education, Resource, Associate, and Industrial. The CRPC is one of 11 Research and Education partners who provide computational and data resources and leadership in one or more thrust areas to contribute to the development of new technologies and enhance the infrastructure. The technology thrust areas are Data-Intensive Computing, Interaction Environments, Metasystems, and Programming Tools and Environments. Application thrust areas are Earth Systems Science, Engineering, Molecular Science, and Neuroscience. The CRPC is focusing on Programming Tools and Environments, Data-Intensive Computing, Earth Systems Science, and the collaborative EOT program of the NCSA and NPACI, described below.

Programming Tools and Environments

Led by CRPC researcher Joel Saltz (University of Maryland), the Programming Tools and Environments research is creating tools to help scientists use high-end computing resources to solve scientific problems. Collaborators in this area are developing tools for irregular, adaptive, and unstructured problems; tools for data navigation and processing of multi-resolution data sets stored in secondary and tertiary memory; compiler support for out-of-core applications and adaptive problems; new parallel linear algebra libraries; and more. Partners include CRPC sites Caltech, Rice, and the University of Texas.

Data-Intensive Computing

The Data-Intensive Computing team is developing digital library systems that will support publication of scientific data sets, information retrieval, and data analysis services. Identifying data sets by their attributes eliminates manual data-handling tasks and so increases the ability to analyze massive data sets. These efforts are built on current research at partner sites, including data collections at Caltech and the University of Maryland.

Earth Systems Science

The Earth Systems Science area is advancing the study of the Earth's natural systems and the complex interactions between humans and those systems. The partners, which include the University of Maryland and the University of Texas, are focusing on earth systems modeling and orchestrating multiscale models across distributed computers; ecological and environmental modeling that incorporates remote-sensing data; and an Earth systems digital library that will create new opportunities for scientists, educators, students, and policy makers. NPACI is supporting climate researchers investigating the 1997-1998 El Niño phenomenon. To develop better predictions of conditions provoked by El Niño at global, regional, and watershed scales, the researchers have assembled a set of computationally intensive multiscale, multi-resolution models to be run on NPACI machines. One result will be a collection of global and regional climate and weather models residing within NPACI.

Education, Outreach, and Training Teams at NPACI and Alliance

To ensure that citizens from kindergarten through adulthood can productively employ the next generation of communication and computing tools, the PACI program formed Education, Outreach, and Training (EOT), the first program to closely link both the Alliance and NPACI. The EOT has targeted education, universal access, and communities as its three areas of focus. Members of the CRPC are involved in the first two areas.

Co-led by Fox, the Enhancing Education Team is exploring ways in which many of the new technologies originating in PACI partner efforts can enhance learning in all stages of life. Initially, the team is focusing on the classroom, particularly on developing self-paced, interactive tools for science and engineering. Prototype programs that are being expanded upon include educational programs that link computational scientists with teachers to develop innovative curricula, web-based textbooks, and tools for visualization, simulation, remote instrumentation, and collaboration. A series of "SuperWebs" is being developed for education using the data and application tools created by the Application and Enabling Technologies Teams. These will support a wide array of learning resources, such as mentoring programs, teleworkshops, interactive simulations, chat groups for asynchronous and synchronous discussions, and online help.

CRPC Director of Human Resources and Education Richard Tapia (Rice University) is the leader of the Universal Access Team. This effort is aimed at increasing the participation of women, underrepresented minorities, and disabled people in high-performance computing and communications careers. The team's programs are targeting students, teachers, administrators, academic and industrial mentors, and policymakers at local, state, and national levels to ensure that the viewpoints and talents of diverse groups will help shape the future of computational research. Richard Aló (UHD) is a member of this team.

