NASA AMES' BAILEY RESPONDS TO KENNEDY ARTICLE

I just read Ken Kennedy's article "Parallel Computing: What We Did Wrong and What We Did Right" (Parallel Computing Research, January 1995, p. 2). This topic is an important one in a field that changes so fast that we often lose sight of the historical perspective. I generally agree with Ken's assertion that the most pressing problem right now is software. However, I'm not so sure that this has always been the case.

Even two or three years ago, the biggest obstacle was hardware, particularly the disappointing real-world performance of many MPP systems. It was only with the rise of systems like the IBM SP-2, the Cray T3D, and the SGI Power Challenge that things started looking up. There has been considerable convergence in the hardware arena lately. For example, it is now generally acknowledged that the best parallel systems are those that have a moderate number of high-powered nodes, as opposed to tens of thousands of weak nodes. However, this is 20/20 hindsight.

It is also recognized now that parallel systems should be designed so that the programmer does not have to worry about network topology and should only have to be aware of local data and remote data. As a result, much literature in the field, including papers on hypercube-specific algorithms, has been rendered irrelevant.

Another example of 20/20 hindsight has been the progress in system reliability (a hardware and software issue). Frequent crashes on earlier systems were extremely frustrating and drove some MPP users back to conventional systems. Today, however, some MPPs have achieved reliability levels rivaling those of conventional systems, mostly because of proven workstation technology.

Yes, the principal issue now is software, especially programming tools for mainstream scientific and engineering users. This need is evident from the fact that most programmers still use "message passing," a euphemism for completely explicit data allocation and communication. I am cheering for High Performance Fortran (HPF), although we will probably have to wait at least a year before HPF processors are widely available and generally competitive with manual message-passing efforts.

Maybe real progress in software, though, had to await progress in architecture, performance, and reliability, so software researchers could focus their efforts and third-party software companies could port their applications. The progress we have recently seen in both hardware and software is clear evidence that the technology of highly parallel systems is finally maturing and will be available to the public soon. It's about time.

- David H. Bailey

David H. Bailey is with the Numerical Aerodynamic Simulation program at NASA Ames Research Center. His research has included studies in numerical analysis, computational number theory, and supercomputer performance. He is one of the authors of the widely cited NAS Parallel Benchmarks and has received the IEEE Sid Fernbach award and the Chauvenet and Merten Hasse Prizes from the Mathematical Association of America.
