
Performance of Post-mortem Simulator

To evaluate the parallel performance of the simulator itself, we used the BigSim emulator on 32 real processors to run a 2D Jacobi program on 8000 simulated processors. This emulation generated trace log files, which we then loaded into the POSE timestamp correction simulator. Figure 8 shows the speedup of the POSE simulator on 1 to 64 processors. The simulator processed 5,085,836 events with an average grain size of 198 microseconds.
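As a rough sanity check on the scale of this run, the two reported figures can be multiplied to estimate the total event work. This is only a sketch: it assumes that the average grain size is the mean execution time of a single event, which the text does not state explicitly, and the variable names are illustrative.

```python
# Figures reported in the text.
NUM_EVENTS = 5_085_836        # events processed by the simulator
AVG_GRAIN_US = 198            # average grain size, in microseconds

# Assumption: grain size = mean execution time of one event, so the
# product approximates the total time spent executing events.
total_event_work_us = NUM_EVENTS * AVG_GRAIN_US
total_event_work_s = total_event_work_us / 1_000_000

print(f"estimated total event work: {total_event_work_s:.1f} s")
```

Under that assumption, the pure event work comes to roughly 1007 seconds, before any overhead for synchronization or timestamp sorting.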

Figure 8: POSE Timestamp Correction Simulator Speedup

The figure shows two plots: real speedup and self speedup. Self speedup measures how the program speeds up relative to itself; i.e., the baseline is the time of the parallel POSE simulation run on a single processor. Real speedup instead uses an ideal sequential time as its baseline: an estimate of how long it would take to execute just the events of the simulation, with no overhead for timestamp sorting. This is the time the program would take if we knew in advance exactly which events to execute and in what order, so it is a lower bound on the sequential time for the simulation. As the figure shows, self speedup is nearly linear up to 32 processors and tapers off after that, while real speedup is more modest but improves steadily as we add processors.
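The two speedup definitions differ only in the baseline that is divided by the parallel run time. The following minimal sketch illustrates this; the function names and all timing values are hypothetical placeholders, not measurements from this experiment.

```python
def self_speedup(t_parallel_on_1, t_parallel_on_p):
    """Baseline: the parallel simulator itself run on one processor."""
    return t_parallel_on_1 / t_parallel_on_p


def real_speedup(t_ideal_sequential, t_parallel_on_p):
    """Baseline: ideal sequential time (events only, no sorting overhead)."""
    return t_ideal_sequential / t_parallel_on_p


# HYPOTHETICAL timings in seconds, keyed by processor count.
# These are illustrative values only, not data from Figure 8.
parallel_time = {1: 400.0, 2: 210.0, 4: 108.0, 8: 56.0}
ideal_sequential_time = 100.0  # hypothetical lower bound on sequential time

speedups = {
    p: (self_speedup(parallel_time[1], t),
        real_speedup(ideal_sequential_time, t))
    for p, t in parallel_time.items()
}

for p in sorted(speedups):
    s, r = speedups[p]
    print(f"{p:>3} processors: self speedup = {s:.2f}, real speedup = {r:.2f}")
```

Because the ideal sequential time is a lower bound on any sequential execution, it is no larger than the one-processor parallel time, so real speedup never exceeds self speedup at a given processor count.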


Gengbin Zheng 2004-01-21