Source: Sallyport, February/March 1992
By David D. Medina
For a moment in November, Rice computer expert Ken Kennedy '67 sat still
in his office, unraveling from travel fatigue. He had just spent 10 days
in Albuquerque, where he worked as program chair of SUPERCOMPUTING '91,
the most important high-performance computer meeting of the year. He'd
been out of the country five weeks already, traveling to England,
Austria, Israel and Amsterdam. Stateside, he'd flown to Hawaii; San
Diego; Evanston, Ill.; Washington, D.C.; Yorktown Heights, N.Y.; and San
Two travel bags sat next to his desk, and airline ticket stubs protruded
from his open briefcase. Haggard and stricken with a cold, Kennedy
struggled to answer questions about his work. He sat still only for a
moment. A week later he was on the road again.
Why can't Kennedy keep still? Because it seems that in the highly
competitive world of supercomputers, nearly everybody wants his
expertise. In addition to chairing the university's computer science
department, he directs its interdisciplinary cluster of computer
experts, the Computer and Information Technology Institute (CITI).
That's probably enough work for any one person. But what keeps Kennedy
really hopping is directing the Center for Research on Parallel
Computation (CRPC). CRPC is funded with a federal research grant of $22.9
million, the biggest Rice has ever administered. That money supports a
national team of 46 scientists and dozens of graduate students at four
universities (Rice, the California Institute of Technology, the
University of Tennessee and Syracuse University) and two national
laboratories (Los Alamos National Laboratory and Argonne National
Laboratory). The goal of CRPC is to perfect software for the next
generation of supercomputers: teraflop machines.
These supercomputers will be the fastest in the world, able to compute a
trillion (tera) arithmetic operations (flop) a second. That's 1,000
times faster than the fastest conventional computer today, which cranks
out a mere billion calculations a second.
We're talking super fast. It takes 32,000 years for a trillion seconds
to tick away. At a trillion calculations a second, a teraflop machine
will make calculations in 3.6 seconds that currently take an hour. It
will solve problems in a few hours that now require 100 days.
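The arithmetic behind those comparisons is direct:
\[
\frac{10^{12}\ \text{operations/s}}{10^{9}\ \text{operations/s}} = 1{,}000,
\qquad
\frac{3{,}600\ \text{s (one hour)}}{1{,}000} = 3.6\ \text{s},
\qquad
\frac{100\ \text{days}}{1{,}000} = 2.4\ \text{hours}.
\]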
Most current computers are sequential computers. They solve problems one
step at a time. The fastest of these sequential computers can stream
large quantities of numbers through a circuit called a pipeline, which
works like an assembly line: one stage of the machine does the same
small job over and over, then passes its result on to the next stage.
The heart of the sequential computer is the processor, a
single silicon chip smaller than a fingernail and capable of executing
simple instructions. Personal computers typically have one processor;
most supercomputers have several. The Y-MP C90, introduced in November
by Cray Research Inc., has 16.
Teraflop machines will have several thousand processors. Instead of
functioning sequentially, they will function in parallel. Rather than
waiting for a piece of the puzzle to be solved before moving on to the
next step, parallel computers work simultaneously on different parts of
a single problem.
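A rough sketch in FORTRAN, the workhorse language of scientific
computing, illustrates the idea. Everything below is invented for
illustration, and the outer loop over processors only simulates, one
block at a time, work that a real parallel machine would do all at once:

C     Illustrative sketch only (not CRPC code): dividing one loop's
C     work among NPROC processors. On a real parallel machine each
C     processor would execute its own block of iterations, and all
C     the blocks would run simultaneously.
      PROGRAM PARDEM
      INTEGER N, NPROC, P, LO, HI, I
      PARAMETER (N = 1000, NPROC = 4)
      REAL A(N)
      DO 20 P = 0, NPROC - 1
C        Block of iterations assigned to processor P.
         LO = P * (N / NPROC) + 1
         HI = (P + 1) * (N / NPROC)
         DO 10 I = LO, HI
            A(I) = REAL(I) * 2.0
   10    CONTINUE
   20 CONTINUE
      PRINT *, 'A(1) =', A(1), '  A(N) =', A(N)
      END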
Such speed is necessary because current supercomputers have reached
their limits and don't compute fast enough to solve some of the complex
problems that scientists and engineers are facing. Weather forecasting,
for example, is still unreliable because current supercomputers simply
can't process all the variables that come into play. Teraflop machines
could more easily handle the vast array of data.
Teraflop machines could also help airplane designers solve the
Navier-Stokes equations, the complex formulas that describe how air
flows over surfaces. The equations are vital in designing aircraft
wings, but
with the current generation of supercomputers, complex flight conditions
have to be simplified.
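In their incompressible form, for example, the equations relate a
fluid's velocity u, pressure p, density ρ and viscosity μ (aircraft
design actually demands the still harder compressible version):
\[
\rho \left( \frac{\partial \mathbf{u}}{\partial t}
+ (\mathbf{u} \cdot \nabla)\,\mathbf{u} \right)
= -\nabla p + \mu\, \nabla^{2} \mathbf{u},
\qquad
\nabla \cdot \mathbf{u} = 0 .
\]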
Teraflop machines will improve car and space shuttle designs, help in
genetic research and assist in discovering oil. Governments,
universities and big business will pay dearly for one. Analysts estimate
their cost to range from $50 million to $200 million.
The federal government wants the teraflop machine and wants it fast.
Under the terms of its federal grants, by the end of the decade CRPC
will produce prototypes of software that will make teraflop machines
easy to program.
No wonder Kennedy can't waste time. He's on a deadline. In his opinion,
whichever country perfects teraflop computing and integrates it into
science and technology will dominate the world's economy.
Kennedy follows basketball, enjoys the visual arts and serves on the
board of Houston's Society for the Performing Arts.
The son of an army officer, he was born in Washington, D.C., and moved
16 times before graduating from Carlisle Senior High School in Carlisle,
Penn. He entered Rice in 1962, well before universities offered computer
majors. He thrived in Math 100, a required course taught by the
legendary Arlen Brown that most freshmen dreaded. The mythically hard
homework three nights a week was heaven to him. "I really thought it was
terrific. I enjoyed it tremendously," he says.
Kennedy intended to major in physics, but in his junior year he switched
to mathematics because, he says, the physics laboratory assignments were
getting the best of him. The only thing that saved him from doing badly
was his first computer: an IBM 1620 punch-card model slower than most
current pocket calculators. He became fascinated with the wonders it
could perform.
"I thought it was fun because it took abstract mathematical concepts and
gave them some touch of reality," he says.
Kennedy's fascination with solving logical mysteries led him to search
for programming errors. What started off as a hobby eventually turned
into an academic discipline. After graduating from Rice summa cum laude
in 1966, Kennedy went to New York University to pursue a doctorate in
mathematics.
When Jacob T. Schwartz, his faculty mentor at NYU, decided to become a
computer scientist, Kennedy followed him into the uncharted field. In
1971, he and another student were the first to earn doctorates in
computer science at NYU. A few months later, he got his first and only
job, teaching at Rice, where he has now been for 21 years.
In 1990, Kennedy was elected to the prestigious National Academy of
Engineering for his work in parallel processing. This September, he was
invited to serve as an expert on the President's Council of Advisors on
Science and Technology.
The road to Kennedy's role in parallel computing lay through what are
known as compilers, programs that translate other programs into forms a
machine can execute. Kennedy deepened his compiler expertise during a
one-year sabbatical in 1978 at IBM Research in Yorktown Heights, N.Y.
There he exchanged ideas with David Kuck of the University of Illinois.
The two men led competing groups working to build a compiler that could
take programs written in an ordinary computer language and run them on a
variety of vector machines.
Most current supercomputers are vector machines, so called because they
use vector processors. A vector processor speeds up a long computation
by grouping identical operations and streaming whole arrays of numbers
through its pipeline: a batch of multiplications is performed as one
operation, then a batch of additions, and so on, enabling the machine to
compute more quickly.
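In FORTRAN terms, the kind of loop a vector machine thrives on looks
like the sketch below (illustrative only). Because every trip through
the loop is independent, the hardware can stream the whole arrays
through its arithmetic pipeline:

C     Illustrative sketch of a vectorizable loop (not CRPC code).
C     Each iteration is independent of the others, so a vector
C     machine streams B and C through its pipeline in bulk instead
C     of handling one element at a time.
      PROGRAM VECDEM
      INTEGER I, N
      PARAMETER (N = 1000)
      REAL A(N), B(N), C(N)
      DO 10 I = 1, N
         B(I) = REAL(I)
         C(I) = REAL(N - I)
   10 CONTINUE
C     On a vector machine the loop below becomes a handful of whole-
C     array operations: one batch of multiplications, then additions.
      DO 20 I = 1, N
         A(I) = B(I) * C(I) + B(I)
   20 CONTINUE
      PRINT *, 'A(1) =', A(1), '  A(N) =', A(N)
      END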
Kennedy compares vector processing to a squad of British infantry with
ranks of soldiers firing in sequence. While one group is reloading,
another is firing.
Parallel processing works in a more flexible and efficient way: It
permits the soldiers to fire and reload as they see fit. "You get the
speed when you have completely independent calculations," Kennedy says.
It's a simple idea, but the big problem is how to tell the processors
what to do in a way that's fast and easy. Software is needed to keep the
messages moving in coordination with each other as they cross between
different processors.
Kennedy fell short of his goal to build a compiler that uses ordinary
language, but he did manage to automate much of the programming through
the scientific computer language FORTRAN. Using fairly easy rules, a
scientist can
restructure a program written for one vector machine and use it for
another instead of having to redo the program.
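One classic rule of that sort is loop interchange, sketched below as a
general illustration (not code from Kennedy's compiler). FORTRAN stores
matrices column by column, so moving the row index to the inner loop
makes each pass run down contiguous memory, the pattern a vector machine
streams fastest:

C     Illustrative sketch of loop interchange (not Kennedy's code).
      PROGRAM SWPDEM
      INTEGER I, J, N
      PARAMETER (N = 100)
      REAL A(N,N)
C     Before: the inner loop varies the column index J, so it hops
C     across memory with a stride of N elements.
      DO 20 I = 1, N
         DO 10 J = 1, N
            A(I,J) = REAL(I + J)
   10    CONTINUE
   20 CONTINUE
C     After interchange: the inner loop varies the row index I and
C     walks down a contiguous column, which vectorizes well.
      DO 40 J = 1, N
         DO 30 I = 1, N
            A(I,J) = A(I,J) * 2.0
   30    CONTINUE
   40 CONTINUE
      PRINT *, 'A(N,N) =', A(N,N)
      END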
The same principle is being applied to parallel computers.
"We would like to come up with a vehicle," he says, "that is not very
different from the way people are programming today."
That's going to take some work. It's two to three times harder to write
a parallel program than a vector program, he says.
The work requires lots of people. Kennedy is working with Rice's
computer science experts Keith Cooper, Linda Torczon, Robert Hood,
Charles Koelbel, John Mellor-Crummey and Mary Hall. The team
collaborates with experts from Syracuse University, the University of
Tennessee and the California Institute of Technology.
In two and a half years, Kennedy's team has designed an extension of
FORTRAN called FORTRAN D that many computer experts consider the right
way to program a variety of parallel computers. Rice is a year away from
having a prototype that the industry will be able to use, Kennedy says.
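Published FORTRAN D examples pair an ordinary loop with declarations
that tell the compiler how to spread arrays across processors. The
fragment below follows that general shape; treat the spelling of the
declarations as an illustration rather than a definitive sample of the
language:

C     Illustrative FORTRAN D-style fragment (the declaration syntax
C     here is reconstructed for illustration; it requires a FORTRAN D
C     compiler, not standard FORTRAN 77).
      INTEGER I
      REAL A(1024), B(1024)
      DECOMPOSITION D(1024)
      ALIGN A, B WITH D
      DISTRIBUTE D(BLOCK)
C     The programmer writes one ordinary loop; the compiler uses the
C     declarations above to carve the work across processors and to
C     insert whatever messages the machine needs.
      DO 10 I = 2, 1023
         A(I) = (B(I-1) + B(I+1)) / 2.0
   10 CONTINUE
      END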
Essentially, Kennedy and his colleagues are creating a whole new kind of
software.
"We will need to replace 30 years' worth of algorithms that can take
advantage of high degrees of parallelism in machines," Kennedy says.
Algorithms are the mathematical core of software. They consist mostly of
ordinary arithmetic calculations, such as subtraction, addition,
multiplication and division. Most algorithms have been written with
single processors in mind. Kennedy's task at CRPC is to develop
algorithms that can handle large numbers of processors effectively.
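A small, invented example shows the shift in thinking. Adding up a list
with a running total is inherently sequential, because each step needs
the one before it; pairwise summation reorganizes the same arithmetic so
that, within each round, every addition is independent and could run on
a separate processor:

C     Illustrative sketch of pairwise summation (not CRPC code).
C     Each round halves the array; the additions within a round are
C     independent, so M processors could perform them at once.
      PROGRAM SUMDEM
      INTEGER I, M, N
      PARAMETER (N = 1024)
      REAL A(N)
      DO 10 I = 1, N
         A(I) = 1.0
   10 CONTINUE
      M = N
   20 CONTINUE
      IF (M .GT. 1) THEN
         M = M / 2
         DO 30 I = 1, M
            A(I) = A(I) + A(I + M)
   30    CONTINUE
         GO TO 20
      END IF
      PRINT *, 'SUM OF 1024 ONES =', A(1)
      END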
Fourteen members of Rice's mathematical sciences department are working
on algorithms, along with researchers from four universities and two
national laboratories.
The United States' chief competition in the race for parallel computing
is, not surprisingly, Japan. Two years ago the Japanese government
announced it was embarking on a five-year program to develop software
for massively parallel systems.
Sensing the urgency, President Bush's Council of Advisors on Science and
Technology has asked Congress to increase spending on supercomputing
during the next five years, aiming at a funding target of $1.1 billion a
year.
The money is being spent to develop hardware and software for
supercomputers, establish an educational computer network system and
provide training to handle high-performance computers. Supercomputers
would be used to tackle the federal government's "Grand Challenges," a
list of 20 major scientific problems that confront the nation, from
controlling air pollution to detecting cancer-causing genes to studying
the earth's biosphere.
Last year, the federal government spent $500 million on supercomputers.
For 1992, Congress has already approved $650 million. Private industry
is competing to build teraflop machines by 1996. Industry will then
benefit from the programming solutions that CRPC is creating.
"We're ahead, and we need to stay ahead," Kennedy says. In order to stay
ahead, Kennedy coordinates a complex mix of grants, institutions and
groups of institutions.
One of his tasks is to direct the Computer and Information Technology
Institute, which consists of 45 Rice faculty members from nine academic
departments: computer science, geology, electrical and computer
engineering, space physics, mathematical sciences, psychology,
mechanical engineering, linguistics and chemical engineering.
Administrative offices are housed in a wing of the Fondren Library.
With electrical engineering professor Sidney Burrus and chemical
engineering professor David Hellums, Kennedy helped found CITI in 1987.
It had become clear to researchers that single-discipline investigations
were no longer sufficient to deal with larger scientific problems. The
federal government encouraged the formation of interdisciplinary
programs in the sciences through its granting policies. Grants were used
to spur universities into industrial and technological collaborations
that would lead to new inventions. In 1987 the National Science
Foundation began funding CITI with $3 million in grants for parallel
computing and infrastructure.
CITI's biggest boost came in 1989, when the NSF selected Rice to lead a
consortium of universities and government laboratories in trying to make
supercomputers easier to use. Rice received a five-year $22.9 million
grant to establish the Center for Research on Parallel Computation.
It was a coup for Rice. Of the 330 proposals for such research centers,
only 11 were chosen as "centers of excellence." This
November, after CRPC received a glowing review from the NSF, the
center's grant was extended to eight years. Last year the state of Texas
kicked in $5 million for five years.
Rice University President George Rupp has proposed that a building be
constructed on campus to house all the university's efforts in computer
research under one roof. For now, Rice's supercomputing projects are
scattered in several sites on campus.
The university's four parallel computers are situated in the computer
center in the Mudd Building. The largest of these, an Intel iPSC/860,
looks like an oversized refrigerator and has 32 processors. Through
CRPC, Rice researchers have access to substantially more powerful
machines, including the Connection Machine 2 at Los Alamos' Advanced
Computing Laboratory, a big black cube about seven feet across that
contains 2,000 processors.
Using what Kennedy says is the best collection of hardware in the
country, the CRPC team intends to finish its gigantic project by the end
of the decade. The federal government has set an 11-year goal that
Kennedy believes the CRPC can meet. After that goal is completed,
Kennedy says, the CRPC can rename itself, go after more government
grants and major projects, or simply dissolve itself.
"But I expect to have new challenges by the time that rolls around," he