Additional Parallelism Through Stripmining

Independent of the parallelism of the CFD code, the derivative computation can be parallelized by the simple expedient of computing derivatives with respect to different sets of independent variables on different processors. This method, called stripmining, has already been described in "Parallel calculation of sensitivity derivatives for aircraft design using automatic differentiation" by Bischof, Green, Haigler and Knauff.
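The partitioning can be pictured as splitting an identity "seed" matrix, with one column per independent variable, into column strips, one strip per processor. The following is a minimal Python sketch of that idea only, not ADIFOR-generated code. The function name `seed_strips` and the requirement that the number of inputs divide evenly among processors are assumptions made for illustration.

```python
def seed_strips(n_inputs, n_procs):
    """Split the n_inputs x n_inputs identity seed matrix into column strips.

    Hypothetical helper for illustration: strip `rank` holds the columns of
    the identity matrix for the independent variables assigned to that
    processor. Assumes n_inputs is an exact multiple of n_procs.
    """
    if n_procs <= 0 or n_inputs % n_procs != 0:
        raise ValueError("n_inputs must be a positive multiple of n_procs")
    width = n_inputs // n_procs  # derivatives computed per processor
    strips = []
    for rank in range(n_procs):
        cols = range(rank * width, (rank + 1) * width)
        # Row i, local column j is 1 when input i is the j-th independent
        # variable handled by this processor, and 0 otherwise.
        seed = [[1.0 if i == c else 0.0 for c in cols]
                for i in range(n_inputs)]
        strips.append(seed)
    return strips


if __name__ == "__main__":
    # 6 inputs on 2 processors: each processor gets a 6 x 3 seed matrix.
    for rank, seed in enumerate(seed_strips(6, 2)):
        print("processor", rank)
        for row in seed:
            print("  ", row)
```

Each processor would then run the same derivative code with its own strip, so the processors need no communication during the derivative computation itself.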

ADIFOR-augmented code supports stripmining with minimal programmer effort: in ADIFOR parlance, only the initialization of the seed matrices must be altered. Note, however, that each processor doing derivative work redundantly recomputes the function value. To measure the impact of this redundant computation on the efficiency of stripmining, we ran several simple experiments on the SP2, using an example that requires derivatives with respect to 6 input parameters. The results are as follows:

Processors   Derivatives per processor   Time (seconds/step)
1            6                           13.27
2            3                            8.21
6            1                            4.02

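Notice that using 6 processors instead of 1 reduces the total time required to compute the derivatives by about 70% (from 13.27 to 4.02 seconds/step). This corresponds to a speedup of about 3.3 rather than the ideal 6, which is consistent with the function value being recomputed on every processor. Furthermore, since memory usage is approximately linear in the number of derivatives, using 6 processors instead of 1 reduces the memory required on each processor by about a factor of 6.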
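The arithmetic behind these figures can be checked with a short Python sketch. The timings are those reported above; the function name `scaling` and the choice of metrics (speedup, parallel efficiency, fractional time reduction) are illustrative assumptions, not part of the original experiments.

```python
# Measured times in seconds/step, keyed by processor count (from the table above).
timings = {1: 13.27, 2: 8.21, 6: 4.02}


def scaling(timings):
    """Return {procs: (speedup, efficiency, time_reduction)} relative to 1 processor."""
    base = timings[1]
    rows = {}
    for procs, t in sorted(timings.items()):
        speedup = base / t             # how many times faster than 1 processor
        efficiency = speedup / procs   # fraction of ideal linear speedup
        reduction = 1.0 - t / base     # fractional decrease in time per step
        rows[procs] = (speedup, efficiency, reduction)
    return rows


if __name__ == "__main__":
    for procs, (s, e, r) in scaling(timings).items():
        print(f"{procs} processors: speedup {s:.2f}, "
              f"efficiency {e:.0%}, time reduction {r:.0%}")
```

For 6 processors this gives a speedup of about 3.3 and a parallel efficiency of about 55%, which matches the roughly 70% reduction in time quoted above.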