Jack Dongarra to lead NSF-funded Scalable Intracampus Research Grid (SInRG)
October 14, 1999
Knoxville, TN -- The National Science Foundation (NSF) has awarded $2 million over five years to a large group of researchers at the University of Tennessee, Knoxville (UTK) for the creation of an experimental technology grid on the UTK campus. The purpose of this infrastructure, sometimes called a "computational power grid" by analogy with the electrical power grid, is to support leading-edge research on technologies and applications for high performance distributed computing and information systems.
The project, called the Scalable Intracampus Research Grid (SInRG), will deploy an infrastructure that mirrors, within the boundaries of the Knoxville campus, both the underlying technologies and the interdisciplinary research collaborations that are characteristic of the national technology grid that the US research community is now developing.
Such computational power grids use special system software, sometimes known as middleware, to integrate high performance networks, computers and storage systems into unified systems that can provide advanced computing and information services - such as data staging, remote instrument control, and resource aggregation - in a pervasive and dependable way for an entire community.
The national technology grid is now growing out of the convergent efforts of NSF's Partnerships for Advanced Computational Infrastructure (PACI) and several other government agencies, including NASA, DOD, and DOE. While the SInRG infrastructure will become a node on this national grid at some point, its primary purpose is to provide a technological and organizational microcosm in which key research challenges underlying grid-based computing can be attacked with better communication and control than wide-area environments usually permit.
Jack Dongarra, professor and distinguished scientist at UTK and the Oak Ridge National Laboratory, leads the large team of computer scientists and research partners from other disciplines that will build and use SInRG. The project team is made up of two basic groups. One is focused on research for SInRG's middleware and the other is engaged in interdisciplinary research leading to applications that will leverage SInRG's power.
The members of the middleware group - Dongarra, Jim Plank, Rich Wolski, and Micah Beck - bring complementary research interests and component software to the task at hand, including software for remote scientific computing, scheduling distributed computation, resource monitoring and performance prediction, and flexible management of distributed storage.
"Of course we're excited by the opportunity to build on and experiment with the integration of our different projects to create SInRG's system software," Dongarra said. "But we also plan to leverage the work that's being done by the PACI's and the other parts of the national grid community. Our work with SInRG will be one part, an important one we hope, of a much larger story about the transformation of computing and information systems that computational grids will bring about."
The SInRG applications research group mirrors another aspect of the national effort - it's based on interdisciplinary collaboration between computer scientists and researchers from other domains with extremely challenging computational problems they need to solve. The initial set of SInRG applications will build on long-running partnerships between a group of computer science co-PIs - including Bob Ward, Jens Gregor, Mike Thomason, Mike Langston, Padma Raghavan, and Michael Berry - and leading researchers from other departments, including Chemical Engineering (Peter Cummings), Radiology (Gary Smith), Electrical Engineering (Don Bouldin) and Computational Ecology (Lou Gross).
"The grid community recognizes that having the requirements of advanced applications drive the development of grid technology is crucial," said Wolski, who participates extensively in the work of the PACIs. "The fact that SInRG has several well established research collaborations to build on and that all the collaborators are on the same campus, and mostly in the same department as SInRG's middleware group, is tremendously advantageous for the project."
All but a fraction of SInRG's funding will go to purchase special Grid Service Clusters (GSCs), which are hardware ensembles specifically designed and configured to fit SInRG's multifaceted research agenda. Each GSC will consist of a compute engine (e.g., a large commodity cluster), a mass storage device, and a fast data switch that integrates them and connects them to the campus's high performance network.
"You can think of a GSC as a next generation workgroup cluster in an advanced local area network," said Micah Beck, who was a member of the team that developed the concept of a GSC. "Each of them will be assigned to one of the collaborating teams and customized to meet their special needs. But they'll also be designed from the ground up to be nodes on the grid, so that computational power can be moved around as needed and the resources can be flexibly shared by the whole SInRG community."
Other features of the SInRG project correspond to further aspects of the national grid effort. For instance, SInRG GSCs will be physically located at various sites around the UT campus, primarily those of the partnering groups from other disciplines. Each will be managed with some degree of autonomy by these groups, with oversight and coordination from the CS co-PIs who collaborate with them.
The campus network will provide the underlying fabric that makes it possible to use all this distributed hardware as a single collective resource, a unified computational power grid. In addition, other researchers in the CS department, such as Jesse Poore with his work on the economical production of high-quality software, will use the campus grid to test and validate their research.
"By the end of the five years, we will have seven Grid Service Clusters spread among six different locations around the campus, including one across the Tennessee River at the UT Medical Center," said CS Department Head and SInRG co-PI Bob Ward. "In one form or another we expect to encounter all the problems that the community building the national technology grid will see and we're looking forward to the special opportunity that this unique infrastructure will give us to address them."
UTK's Division of Information Infrastructure is also participating in SInRG and will have a GSC that is partially funded by the project. Dewitt Latimer, UTK's Director of Computing and Networking Services, says his commitment to the project is motivated by the need to find a new approach to supporting high performance computing for the campus research community.
"Central university computing facilities will be hard pressed to fund large-scale super computers in the future," he noted. "SInRG's concept represents an exciting new model of resource sharing that will permit large-scale computing to be undertaken within more modest institutional budgets."
Following this model, project leaders anticipate that over time the success of SInRG will encourage other UTK researchers to join the project, adding new GSCs and other resources to the grid while drawing on its power as needed in return.