ADIFOR-augmented code supports stripmining with minimal programmer effort---in ADIFOR parlance, only the initialization of the seed matrices must be altered. We note, however, that this method redundantly recomputes the function value on each processor doing derivative work. To measure the impact of this redundant computation on the efficiency of stripmining, we ran several simple experiments on the SP2. In the example, we need to compute derivatives with respect to 6 input parameters. The results are as follows:
| Processors | Derivatives per processor | Time (seconds/step) |
|---|---|---|
| 1 | 6 | 13.27 |
| 2 | 3 | 8.21 |
| 6 | 1 | 4.02 |
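Notice that using 6 processors instead of 1 reduces the time required to compute the derivatives by about 70% (from 13.27 to 4.02 seconds per step). This corresponds to a speedup of about 3.3 rather than the ideal 6, which is consistent with each processor redundantly recomputing the function value. Furthermore, since memory usage is approximately linear in the number of derivatives, using 6 processors instead of 1 reduces the memory required on each processor by about a factor of 6.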
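The stripmining idea and the scaling figures above can be illustrated with a minimal Python sketch. It is not ADIFOR code: the function names `seed_strips` and `scaling` are hypothetical, the seed matrix is assumed to be the 6 x 6 identity (one derivative direction per input parameter), the row/column layout is chosen only for illustration, and the sketch assumes the number of inputs divides evenly among the processors. Only the three timings are taken from the measurements reported above.

```python
def seed_strips(n_inputs, n_procs):
    """Split an n_inputs x n_inputs identity seed matrix into column strips.

    Each strip is the seed matrix one processor would use: it selects the
    subset of derivative directions that processor is responsible for.
    (Hypothetical helper; assumes an even split.)
    """
    if n_inputs % n_procs != 0:
        raise ValueError("this sketch assumes n_inputs is divisible by n_procs")
    width = n_inputs // n_procs
    strips = []
    for rank in range(n_procs):
        cols = range(rank * width, (rank + 1) * width)
        # Entry (i, c) is 1.0 when input i is one of this strip's directions.
        strip = [[1.0 if i == c else 0.0 for c in cols] for i in range(n_inputs)]
        strips.append(strip)
    return strips


# Timings reported above: number of processors -> seconds per step.
TIMINGS = {1: 13.27, 2: 8.21, 6: 4.02}


def scaling(timings):
    """Return {procs: (speedup, parallel efficiency)} relative to 1 processor."""
    base = timings[1]
    result = {}
    for procs, seconds in timings.items():
        speedup = base / seconds
        result[procs] = (speedup, speedup / procs)
    return result


if __name__ == "__main__":
    for rank, strip in enumerate(seed_strips(6, 2)):
        print(f"processor {rank}: seed strip with {len(strip[0])} directions")
    for procs, (speedup, efficiency) in scaling(TIMINGS).items():
        print(f"{procs} processors: speedup {speedup:.2f}, efficiency {efficiency:.0%}")
```

Under these assumptions, the reported timings give speedups of roughly 1.6 and 3.3 on 2 and 6 processors, that is, parallel efficiencies of about 81% and 55%. The decline in efficiency is what one would expect when every processor repeats the function evaluation regardless of how few derivative directions it carries.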