The smallest run shown, 65,536 triangles on a single processor, takes 0.44 seconds per step. The largest run shown, 65,536 triangles on each of 1,500 processors (98.3 million triangles in total), takes 0.73 seconds per step. Because the problem size grows with the processor count, this corresponds to a scaled speedup of 915, or a parallel efficiency of about 60 percent.
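The arithmetic behind these figures can be sketched as follows. This is a minimal illustration, assuming the usual weak-scaling definitions (efficiency is the single-processor time divided by the parallel time at the same per-processor load, and speedup is that efficiency times the processor count). The function and variable names are hypothetical, and the timings are the rounded values quoted in the text.

```python
def scaled_speedup(t_single: float, t_parallel: float, n_procs: int) -> tuple[float, float]:
    """Return (scaled speedup, parallel efficiency) for a fixed per-processor load."""
    # With the per-processor work held constant, perfect scaling would keep
    # the time per step unchanged, so efficiency is the ratio of the two times.
    efficiency = t_single / t_parallel
    return n_procs * efficiency, efficiency


TRIANGLES_PER_PROC = 65_536
N_PROCS = 1_500
T_SINGLE = 0.44    # seconds per step on 1 processor (rounded, from the text)
T_PARALLEL = 0.73  # seconds per step on 1,500 processors (rounded, from the text)

total_triangles = TRIANGLES_PER_PROC * N_PROCS
speedup, efficiency = scaled_speedup(T_SINGLE, T_PARALLEL, N_PROCS)

print(f"total triangles: {total_triangles:,}")
print(f"scaled speedup:  {speedup:.0f}")
print(f"efficiency:      {efficiency:.1%}")
```

With the rounded timings this gives a speedup of roughly 904 and an efficiency of roughly 60 percent. The small gap to the quoted 915 is presumably due to the published timings being rounded to two digits, although the text does not say so.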
The observed parallel performance is excellent. It also compares favorably with the result of 1 second per timestep for 8,000 objects per processor reported for the parallel RCB scheme described in [Hend96].
The parallel implementation also scales down to smaller models that require fast response times, such as interactive applications. Thirty-two processors of a 195 MHz Origin2000 system can handle 300,000 triangles at a good interactive rate of 30 milliseconds per step.