Pilot Study Teaches NSF Costly Lesson

Source: Science magazine, September 6, 1996
By Jeffrey Mervis

When a panel of the National Academy of Sciences issued an assessment last month of one of the most visible research programs at the National Science Foundation, the outcome was music to NSF's ears. The Committee on Science, Engineering, and Public Policy (COSEPUP) gave a strong endorsement to NSF's Science and Technology Centers (STCs) program, a $60-million-a-year effort launched in 1989, and recommended that it be continued (Science, 16 August, p. 866). Although NSF officials were pleased with the result, the review process itself pleased virtually nobody. Indeed, the assessment turned out to be a $727,000 lesson in how not to measure the value to society of basic research.

NSF officials had hoped the review would do double duty. They needed a top-to-bottom assessment of the STCs to help them decide whether to renew the program before the first centers complete their 11-year funding cycle in 2000. But they also wanted to make the review a model for how to assess the NSF's entire $3 billion research and education portfolio. NSF and every other federal agency will soon be required to make such sweeping evaluations under the 1993 Government Performance and Results Act (GPRA), which directs agencies to justify their budgets based on the value of what they accomplish (Science, 6 January 1995, p. 20).

NSF's original plans called for one organization to conduct a 2-year study in two steps: a thorough evaluation of the STC program, which would feed information to an expert panel that would offer advice on the future of the program. But center directors were worried that a contractor might not be able to assemble the necessary talent for a blue-ribbon assessment of their programs. "This program was created out of an academy panel [the so-called 1987 Zare report], and we felt there should be an equally distinguished panel looking at its future," says Ken Kennedy, director of the Center for Research on Parallel Computation based at Rice University. So last summer NSF divvied up the job, awarding COSEPUP $184,000 to assemble the expert panel and giving a $543,000 contract to Abt Associates Inc. of Cambridge, Massachusetts, to collect information on the program. (Abt's four-volume report was submitted to NSF in June.)

The academy hoped its expert panel would be able to shape Abt's efforts to gather a mass of information on how well the centers were meeting their triple mission of pursuing frontier research, improving science education, and transferring knowledge to industry. Unfortunately, Abt had already developed its survey and begun to collect data by the time the academy panel was formed. Besides the mismatch in pace between the two organizations, there was a scheduling squeeze: NSF was forced to move up Abt's deadline because it needed to submit the findings this summer to another advisory panel, which was preparing a final recommendation to the National Science Board. The board, NSF's oversight body, is expected to make a decision in November.

The result was a procedural nightmare. "The panel strongly recommends against NSF's use of a process like the one used in the STC program evaluation as a model for future evaluations," COSEPUP concluded in its report. "We need to recognize that this was an approach that didn't work even though [NSF] spent huge amounts of money on it," says William Brinkman, vice president for physical sciences at Bell Laboratories and chair of the COSEPUP panel. "The fundamental structure was wrong."

And the price was definitely not right. "We realized, in retrospect, that there was no way we could afford to do this across the whole foundation," says Anne Petersen, NSF's deputy director and chief financial officer.

To compound these problems, NSF didn't flesh out its approach to GPRA program assessment until December, when the STC review was nearing completion, and it opted for a less quantitative approach than it originally proposed. That made the highly quantitative STC study less relevant as a model for the more sweeping GPRA review. "When the STC evaluation began, we thought there might be a way to do things in a more quantitative way," says Petersen. "But now I think the pitfalls outweigh any benefits." While it is useful to collect detailed information about such aspects of the program as the publication citation rates of scientists, the number of students trained, and the extent of industrial partnerships, says Petersen, GPRA requires agencies "to look at the big picture."

Stephen Fitzsimmons, a vice president at Abt and principal associate on the study, agrees that GPRA is a tall order for agencies. "The government can say, 'Thou shalt have a set of indicators [to measure research outcomes].' But that doesn't mean you'll get them. It will take some time to develop a sound approach to assessing fundamental research," he says. "I don't know how to do it."

Petersen says she empathizes with the center directors, who felt they were being used as guinea pigs for an experiment whose methodology had not been worked out. But NSF has come away with one important lesson from the exercise: "From now on, our GPRA reviews will be done in-house, through an expanded use of existing committees," says Petersen.
