Source: Sallyport, February/March 1992
By David D. Medina
For a moment in November, Rice computer expert Ken Kennedy '67 sat still
in his office, unraveling from travel fatigue. He had just spent 10 days
in Albuquerque, where he worked as program chair of SUPERCOMPUTING '91,
the most important high-performance computer meeting of the year. He'd
been out of the country five weeks already, traveling to England,
Austria, Israel and Amsterdam. Stateside, he'd flown to Hawaii; San
Diego; Evanston, Ill.; Washington, D.C.; Yorktown Heights, N.Y.; and San
Two travel bags sat next to his desk, and airline ticket stubs protruded
from his open briefcase. Haggard and stricken with a cold, Kennedy
struggled to answer questions about his work. He sat still only for a
moment. A week later he was on the road again.
Why can't Kennedy keep still? Because it seems that in the highly
competitive world of supercomputers, nearly everybody wants his
expertise. In addition to chairing the university's computer science
department, he directs its interdisciplinary cluster of computer
experts, the Computer and Information Technology Institute (CITI).
That's probably enough work for any one person. But what keeps Kennedy
really hopping is directing the Center for Research on Parallel
Computation (CRPC). CRPC is funded with a federal research grant of $22.9
million, the biggest Rice has ever administered. That money supports a
national team of 46 scientists and dozens of graduate students at four
universities (Rice, the California Institute of Technology, the
University of Tennessee and Syracuse University) and two national
laboratories (Los Alamos National Laboratory and Argonne National
Laboratory). The goal of CRPC is to perfect software for the next
generation of supercomputers: teraflop machines.
These supercomputers will be the fastest in the world, able to compute a
trillion (tera) arithmetic operations (flop) a second. That's 1,000
times faster than the fastest conventional computer today, which cranks
out a mere billion calculations a second.
We're talking super fast. It takes 32,000 years for a trillion seconds
to tick away. At a trillion calculations a second, a teraflop machine
will make calculations in 3.6 seconds that currently take an hour. It
will solve problems in a few hours that now require 100 days.
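The arithmetic behind those comparisons is direct:
\[
\frac{10^{12}\ \text{operations/s}}{10^{9}\ \text{operations/s}} = 1{,}000,
\qquad
\frac{3{,}600\ \text{s (one hour)}}{1{,}000} = 3.6\ \text{s},
\qquad
\frac{100\ \text{days}}{1{,}000} = 2.4\ \text{hours}.
\]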
Most current computers are sequential computers. They solve problems one
step at a time. The fastest of these sequential computers can stream
large quantities of numbers through a circuit called a pipeline, which
works like an assembly line: one stage of the machine does the same
small job over and over, then passes its result on to the next stage.
The heart of the sequential computer is the processor, a
single silicon chip smaller than a fingernail and capable of executing
simple instructions. Personal computers typically have one processor;
most supercomputers have several. The Y-MP C90, introduced in November
by Cray Research Inc., has 16.
Teraflop machines will have several thousand processors. Instead of
functioning sequentially, they will function in parallel. Rather than
waiting for a piece of the puzzle to be solved before moving on to the
next step, parallel computers work simultaneously on different parts of
a single problem.
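A rough sketch in FORTRAN, the workhorse language of scientific
computing, illustrates the idea. Everything below is invented for
illustration, and the outer loop over processors only simulates, one
block at a time, work that a real parallel machine would do all at once:

C     Illustrative sketch only (not CRPC code): dividing one loop's
C     work among NPROC processors. On a real parallel machine each
C     processor would execute its own block of iterations, and all
C     the blocks would run simultaneously.
      PROGRAM PARDEM
      INTEGER N, NPROC, P, LO, HI, I
      PARAMETER (N = 1000, NPROC = 4)
      REAL A(N)
      DO 20 P = 0, NPROC - 1
C        Block of iterations assigned to processor P.
         LO = P * (N / NPROC) + 1
         HI = (P + 1) * (N / NPROC)
         DO 10 I = LO, HI
            A(I) = REAL(I) * 2.0
   10    CONTINUE
   20 CONTINUE
      PRINT *, 'A(1) =', A(1), '  A(N) =', A(N)
      END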
Such speed is necessary because current supercomputers have reached
their limits and don't compute fast enough to solve some of the complex
problems that scientists and engineers are facing. Weather forecasting,
for example, is still unreliable because current supercomputers simply
can't process all the variables that come into play. Teraflop machines
could more easily handle the vast array of data.
Teraflop machines could also help airplane designers solve the
Navier-Stokes equations, the complex formulas that describe how air
flows over surfaces. The equations are vital in designing aircraft
wings, but
with the current generation of supercomputers, complex flight conditions
have to be simplified.
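In their incompressible form, for example, the equations relate a
fluid's velocity u, pressure p, density ρ and viscosity μ (aircraft
design actually demands the still harder compressible version):
\[
\rho \left( \frac{\partial \mathbf{u}}{\partial t}
+ (\mathbf{u} \cdot \nabla)\,\mathbf{u} \right)
= -\nabla p + \mu\, \nabla^{2} \mathbf{u},
\qquad
\nabla \cdot \mathbf{u} = 0 .
\]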
Teraflop machines will improve car and space shuttle designs, help in
genetic research and assist in discovering oil. Governments,
universities and big business will pay dearly for one. Analysts estimate
their cost to range from $50 million to $200 million.
The federal government wants the teraflop machine and wants it fast.
Under the terms of its federal grants, by the end of the decade CRPC
will produce prototypes of software that will make teraflop machines
easy to program.
No wonder Kennedy can't waste time. He's on a deadline. In his opinion,
whichever country perfects teraflop computing and integrates it into
science and technology will dominate the world's economy.
Kennedy follows basketball, enjoys the visual arts and serves on the
board of Houston's Society for the Performing Arts.
The son of an army officer, he was born in Washington, D.C., and moved
16 times before graduating from Carlisle Senior High School in Carlisle,
Penn. He entered Rice in 1962, well before universities offered computer
majors. He thrived in Math 100, a required course taught by the
legendary Arlen Brown that most freshmen dreaded. The mythically hard
homework three nights a week was heaven to him. "I really thought it was
terrific. I enjoyed it tremendously," he says.
Kennedy intended to major in physics, but in his junior year he switched
to mathematics because, he says, the physics laboratory assignments were
getting the best of him. The only thing that saved him from doing badly
was his first computer: an IBM 1620 punch-card model slower than most
current pocket calculators. He became fascinated with the wonders it
could perform.
"I thought it was fun because it took abstract mathematical concepts and
gave them some touch of reality," he says.
Kennedy's fascination with solving logical mysteries led him to search
for programming errors. What started off as a hobby eventually turned
into an academic discipline. After graduating from Rice summa cum laude
in 1966, Kennedy went to New York University to pursue a doctorate in
mathematics.
When Jacob T. Schwartz, his faculty mentor at NYU, decided to become a
computer scientist, Kennedy followed him into the uncharted field. In
1971, he and another student were the first to earn doctorates in
computer science at NYU. A few months later, he got his first and only
job, teaching at Rice, where he has now been for 21 years.
In 1990, Kennedy was elected to the prestigious National Academy of
Engineering for his work in parallel processing. This September, he was
invited to serve as an expert on the President's Council of Advisors on
Science and Technology.
The road to Kennedy's role in parallel computing lay through what are
known as compilers, programs that translate other programs into forms a
machine can execute. Kennedy deepened his compiler expertise during a
one-year sabbatical in 1978 at IBM Research in Yorktown Heights, N.Y.
There he exchanged ideas with David Kuck of the University of Illinois.
The two men led competing groups working to build a compiler that could
take programs written in an ordinary computer language and run them on a
variety of vector machines.
Most current supercomputers are vector machines, so called because they
use vector processors. A vector processor speeds up a long computation
by grouping identical operations and streaming whole arrays of numbers
through its pipeline: a batch of multiplications is performed as one
operation, then a batch of additions, and so on, enabling the machine to
compute more quickly.
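In FORTRAN terms, the kind of loop a vector machine thrives on looks
like the sketch below (illustrative only). Because every trip through
the loop is independent, the hardware can stream the whole arrays
through its arithmetic pipeline:

C     Illustrative sketch of a vectorizable loop (not CRPC code).
C     Each iteration is independent of the others, so a vector
C     machine streams B and C through its pipeline in bulk instead
C     of handling one element at a time.
      PROGRAM VECDEM
      INTEGER I, N
      PARAMETER (N = 1000)
      REAL A(N), B(N), C(N)
      DO 10 I = 1, N
         B(I) = REAL(I)
         C(I) = REAL(N - I)
   10 CONTINUE
C     On a vector machine the loop below becomes a handful of whole-
C     array operations: one batch of multiplications, then additions.
      DO 20 I = 1, N
         A(I) = B(I) * C(I) + B(I)
   20 CONTINUE
      PRINT *, 'A(1) =', A(1), '  A(N) =', A(N)
      END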
Kennedy compares vector processing to a squad of British infantry with
ranks of soldiers firing in sequence. While one group is reloading,
another is firing.
Parallel processing works in a more flexible and efficient way: It
permits the soldiers to fire and reload as they see fit. "You get the
speed when you have completely independent calculations," Kennedy says.
It's a simple idea, but the big problem is how to tell the processors
what to do in a way that's fast and easy. Software is needed to keep the
messages moving in coordination with each other as they cross between
different processors.
Kennedy fell short of his goal to build a compiler that uses ordinary
language, but he did manage to automate much of the programming through
the scientific computer language FORTRAN. Using fairly easy rules, a
scientist can
restructure a program written for one vector machine and use it for
another instead of having to redo the program.
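One classic rule of that sort is loop interchange, sketched below as a
general illustration (not code from Kennedy's compiler). FORTRAN stores
matrices column by column, so moving the row index to the inner loop
makes each pass run down contiguous memory, the pattern a vector machine
streams fastest:

C     Illustrative sketch of loop interchange (not Kennedy's code).
      PROGRAM SWPDEM
      INTEGER I, J, N
      PARAMETER (N = 100)
      REAL A(N,N)
C     Before: the inner loop varies the column index J, so it hops
C     across memory with a stride of N elements.
      DO 20 I = 1, N
         DO 10 J = 1, N
            A(I,J) = REAL(I + J)
   10    CONTINUE
   20 CONTINUE
C     After interchange: the inner loop varies the row index I and
C     walks down a contiguous column, which vectorizes well.
      DO 40 J = 1, N
         DO 30 I = 1, N
            A(I,J) = A(I,J) * 2.0
   30    CONTINUE
   40 CONTINUE
      PRINT *, 'A(N,N) =', A(N,N)
      END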
The same principle is being applied to parallel computers.
"We would like to come up with a vehicle," he says, "that is not very
different from the way people are programming today."
That's going to take some work. It's two to three times harder to write
a parallel program than a vector program, he says.
The work requires lots of people. Kennedy is working with Rice's
computer science experts Keith Cooper, Linda Torczon, Robert Hood,
Charles Koelbel, John Mellor-Crummey and Mary Hall. The team
collaborates with experts from Syracuse University, the University of
Tennessee and the California Institute of Technology.
In two and a half years, Kennedy's team has designed an extension of
FORTRAN called FORTRAN D that many computer experts consider the right
way to program a variety of parallel computers. Rice is a year away from
having a prototype that the industry will be able to use, Kennedy says.
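Published FORTRAN D examples pair an ordinary loop with declarations
that tell the compiler how to spread arrays across processors. The
fragment below follows that general shape; treat the spelling of the
declarations as an illustration rather than a definitive sample of the
language:

C     Illustrative FORTRAN D-style fragment (the declaration syntax
C     here is reconstructed for illustration; it requires a FORTRAN D
C     compiler, not standard FORTRAN 77).
      INTEGER I
      REAL A(1024), B(1024)
      DECOMPOSITION D(1024)
      ALIGN A, B WITH D
      DISTRIBUTE D(BLOCK)
C     The programmer writes one ordinary loop; the compiler uses the
C     declarations above to carve the work across processors and to
C     insert whatever messages the machine needs.
      DO 10 I = 2, 1023
         A(I) = (B(I-1) + B(I+1)) / 2.0
   10 CONTINUE
      END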
Essentially, Kennedy and his colleagues are creating a whole new kind of
software.
"We will need to replace 30 years' worth of algorithms that can take
advantage of high degrees of parallelism in machines," Kennedy says.
Algorithms are the mathematical core of software. They consist mostly of
ordinary arithmetic calculations, such as subtraction, addition,
multiplication and division. Most algorithms have been written with
single processors in mind. Kennedy's task at CRPC is to develop
algorithms that can handle large numbers of processors effectively.
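A small, invented example shows the shift in thinking. Adding up a list
with a running total is inherently sequential, because each step needs
the one before it; pairwise summation reorganizes the same arithmetic so
that, within each round, every addition is independent and could run on
a separate processor:

C     Illustrative sketch of pairwise summation (not CRPC code).
C     Each round halves the array; the additions within a round are
C     independent, so M processors could perform them at once.
      PROGRAM SUMDEM
      INTEGER I, M, N
      PARAMETER (N = 1024)
      REAL A(N)
      DO 10 I = 1, N
         A(I) = 1.0
   10 CONTINUE
      M = N
   20 CONTINUE
      IF (M .GT. 1) THEN
         M = M / 2
         DO 30 I = 1, M
            A(I) = A(I) + A(I + M)
   30    CONTINUE
         GO TO 20
      END IF
      PRINT *, 'SUM OF 1024 ONES =', A(1)
      END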
Fourteen members of Rice's mathematical sciences department are working
on algorithms, along with researchers from four universities and two
national laboratories.
The United States' chief competition in the race for parallel computing
is, not surprisingly, Japan. Two years ago the Japanese government
announced it was embarking on a five-year program to develop software
for massively parallel systems.
Sensing the urgency, President Bush's Council of Advisors on Science and
Technology has asked Congress to increase spending on supercomputing
during the next five years, aiming at a funding target of $1.1 billion a
year.
The money is being spent to develop hardware and software for
supercomputers, establish an educational computer network system and
provide training to handle high-performance computers. Supercomputers
would be used to tackle the federal government's "Grand Challenges," a
list of 20 major scientific problems that confront the nation, from
controlling air pollution to detecting cancer-causing genes to studying
the earth's biosphere.
Last year, the federal government spent $500 million on supercomputers.
For 1992, Congress has already approved $650 million. Private industry
is competing to build teraflop machines by 1996. Industry will then
benefit from the programming solutions that CRPC is creating.
"We're ahead, and we need to stay ahead," Kennedy says. In order to stay
ahead, Kennedy coordinates a complex mix of grants, institutions and
groups of institutions.
One of his tasks is to direct the Computer and Information Technology
Institute, which consists of 45 Rice faculty members from nine academic
departments: computer science, geology, electrical and computer
engineering, space physics, mathematical sciences, psychology,
mechanical engineering, linguistics and chemical engineering.
Administrative offices are housed in a wing of the Fondren Library.
With electrical engineering professor Sidney Burrus and chemical
engineering professor David Hellums, Kennedy helped found CITI in 1987.
It had become clear to researchers that single-discipline investigations
were no longer sufficient to deal with larger scientific problems. The
federal government encouraged the formation of interdisciplinary
programs in the sciences through its granting policies. Grants were used
to spur universities into industrial and technological collaborations
that would lead to new inventions. In 1987 the National Science
Foundation began funding CITI with $3 million in grants for parallel
computing and infrastructure.
CITI's biggest boost came in 1989, when the NSF selected Rice to lead a
consortium of universities and government laboratories in trying to make
supercomputers easier to use. Rice received a five-year $22.9 million
grant to establish the Center for Research on Parallel Computation.
It was a coup for Rice. Of the 330 proposals for such research centers,
only 11 were chosen as "centers of excellence." This
November, after CRPC received a glowing review from the NSF, the
center's grant was extended to eight years. Last year the state of Texas
kicked in $5 million for five years.
Rice University President George Rupp has proposed that a building be
constructed on campus to house all the university's efforts in computer
research under one roof. For now, Rice's supercomputing projects are
scattered in several sites on campus.
The university's four parallel computers are situated in the computer
center in the Mudd Building. The largest of these, an Intel iPSC/860,
looks like an oversized refrigerator and has 32 processors. Through
CRPC, Rice researchers have access to substantially more powerful
machines, including the Connection Machine 2 at Los Alamos' Advanced
Computing Laboratory, a big black cube about seven feet across that
contains 2,000 processors.
Using what Kennedy says is the best collection of hardware in the
country, the CRPC team intends to finish its gigantic project by the end
of the decade. The federal government has set an 11-year goal that
Kennedy believes the CRPC can meet. After that goal is completed,
Kennedy says, the CRPC can rename itself, go after more government
grants and major projects, or simply dissolve itself.
"But I expect to have new challenges by the time that rolls around," he