Synchronization Strategies for POSE
===================================

cons (Basic Conservative): relies on GVT for advancement; no lookahead,
    no bells, no whistles [NOT MAINTAINED]

opt (Basic Optimistic): performs events in best-first order on the
    processor, one at a time

opt2 (Optimized Optimistic): same as opt, but when an object gets
    control, it will execute all events that have the same lowest
    timestamp

opt3 (Constrained Opt2): same as opt2, but will not execute events
    beyond a fixed speculative window

spec (Speculative Window): no attempt made at best-first ordering; when
    an object gets control, it will execute all events within the fixed
    speculative window

adapt (Basic Adaptive): same as spec, only the size of the speculative
    window changes based on rollback behavior: it starts at a minimum
    size and expands by one for every successful set of forward
    executions (up to a maximum size); when a rollback occurs, the
    speculative window size is reset to the minimum

adapt2 (Experimental Adaptive): changes to regular adapt reflecting
    what I'm currently experimenting with...

adapt5 (added by Ryan Mokos in 2009-2010; this documentation added in
    July 2011)

The following are excerpts from a rough draft of the 2010 NSF annual
report that discuss the algorithmic choices available in adapt5.

In BigNetSim, calculations are performed by each parallel object, the
objects communicate with each other and synchronize their work, and
then the virtual time of the simulation is moved forward. This
iterative process continues until the simulation is complete. Because
BigNetSim is an *optimistic* parallel simulator, the parallel objects
usually perform speculative execution when doing their calculations
(executing events that will occur at a future virtual time) to
increase grain size and thus execution efficiency (more work done for
each synchronization operation). If this speculative work is correct,
the object is done for that iteration. If not, the object has to undo
its calculations and perform them again, resulting in longer execution
times. Therefore, a balance must be struck: we want to perform the
maximum amount of speculative execution while keeping the amount of
work that must be undone to a minimum. To accomplish this, a virtual
time leash (or speculative window) is used to control how much
speculative work is allowed to occur.

Analysis of the algorithm previously in use revealed that the time
leash was always being set to 1, leading to little speculative work
and many synchronization iterations. We thus decided to develop a new
algorithm based on older algorithmic ideas and further analysis of the
problem.

BigNetSim simulates various components of a network, including
processing cores (procs), nodes, switches, and channels, all of which
are represented by unique parallel objects. Each proc executes a
sequence of events (each of which represents a block of computation
time), some of which send messages to other procs. These messages are
first packetized and then sent through the network of switches and
channels. This modeling scheme leads to greatly varying behavior among
the different types of objects. For example, consider Table 1, which
shows the differences in virtual time between consecutive events on
two different objects at the start of a MILC simulation: a proc and a
switch. The difference between the proc's first and second events is
0 ns, the difference between the second and third events is
187,991,750 ns, and so on.
           Proc        Switch
              0             0
      187991750     187991376
       12012626             1
           4693            11
       32935783      12012619
       51839515             1
       14169597             6
        2825424          4686
           1573             1
       15261816             6
        5991642      32935771
       11975431             1
           3771            11
       45977764      51839503
         654698             1
         420133            11
          17902      14169590
         958661             1
        6966380             6
           6392       2825412
        1208322             1
          21457            11

Table 1: Difference in virtual time (ns) between consecutive events on
a proc and a switch at the beginning of a MILC simulation

As one might expect, the proc's events occur at irregular intervals
determined by the timing of the computational blocks, which are
contained in its trace file. The switch's events are much more regular
and periodic, with three events very close in virtual time followed by
a large gap (in this part of the simulation). This "burst" behavior is
indicative of packets being sent through the switch, each of which has
a few events associated with it: receive, process/route, and send.
While these patterns can become more complicated later in the
simulation--especially during periods of network contention and if the
periodicity changes--the same basic structure seems to remain.

Additionally, we attempted to determine how the number of
synchronization iterations and the number of rollbacks (events that
are undone) affect the simulation execution time by varying a
hard-coded time leash value (some results are shown in Table 2). In
the end, only a loose correlation was found. However, the study did
suggest that some rollbacks--on the order of 2-3 per iteration on
average--were not only acceptable but desirable. Severely restricting
the time leash to greatly discourage rollbacks led to so many extra
iterations that the extra time spent in those iterations overshadowed
the time saved by having fewer rollbacks. Thus, we decided to restrict
the number of rollbacks allowed, but not attempt to eliminate them.

Time leash    Iterations    Rollbacks    Rollbacks/Iter    Exec. Time (sec)
       500     1,086,151      220,436         0.2               113.6
     1,000       626,029      135,299         0.22               93.3
     2,500       283,553       90,938         0.32               76.9
     5,000       150,975      102,857         0.68               71.8
     7,500       103,195      108,164         1.05               69
    10,000        79,021      112,325         1.42               68.4
    12,500        64,379      123,748         1.92               70.2
    15,000        54,529      129,032         2.37               68.8
    17,500        47,577      141,140         2.97               72.3
    20,000        42,427      160,732         3.79               71.7
    22,500        38,533      190,073         4.93               73.8
    25,000        35,439      220,217         6.21               72.8
    27,500        32,975      265,164         8.04               73.1

Table 2: Results of running a 32-VP MILC simulation with different
hard-coded time leashes

Examination of these patterns and previous synchronization algorithms
led to several attempts at improving performance. All attempts use at
least one of the following quantities to set the time leash value (a
rough sketch of how these statistics might be maintained follows the
list):

A. Recent_Average_Rollback_Leash - the average value to which the time
   leash was set the last few times a rollback occurred

B. Recent_Average_Event_Sparsity - the number of virtual time units
   per event over the last several events

C. Average_Rollbacks_Per_Iteration - the total number of rollbacks
   divided by the total number of iterations so far in the simulation

D. Average_Events_Per_Rollback - the total number of events divided by
   the total number of rollbacks so far in the simulation

E. Average_of_3rd_and_4th_Differences - the most recent 16 virtual
   time differences (like those shown in Table 1) are recorded for the
   object; the highest two values are ignored, and the next two
   highest values are averaged to set the time leash
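The following is a rough, hypothetical C++ sketch of how a per-object
strategy might maintain quantities A-D (E is sketched separately after
Attempt 4 below); the class and method names are illustrative
assumptions and are not taken from the adapt5 source.

    // Hypothetical illustration only -- not the actual adapt5 code.
    // One way the per-object statistics A-D could be maintained as
    // events execute and rollbacks occur.
    #include <deque>
    #include <numeric>

    class LeashStats {
    public:
      // Called after each executed event with the virtual time
      // difference from the previous event on this object.
      void recordEvent(double timeDiff) {
        ++totalEvents;
        recentDiffs.push_back(timeDiff);
        if (recentDiffs.size() > WINDOW) recentDiffs.pop_front();
      }

      // Called whenever a rollback occurs, with the leash value that
      // was set when the rollback happened.
      void recordRollback(double leashAtRollback) {
        ++totalRollbacks;
        recentRollbackLeashes.push_back(leashAtRollback);
        if (recentRollbackLeashes.size() > WINDOW)
          recentRollbackLeashes.pop_front();
      }

      // Called once per GVT/synchronization iteration.
      void recordIteration() { ++totalIterations; }

      // A: average leash value at the last few rollbacks
      double recentAverageRollbackLeash() const {
        return average(recentRollbackLeashes);
      }
      // B: virtual time units per event over the last several events
      double recentAverageEventSparsity() const {
        return average(recentDiffs);
      }
      // C: total rollbacks / total iterations
      double averageRollbacksPerIteration() const {
        return totalIterations ? (double)totalRollbacks / totalIterations
                               : 0.0;
      }
      // D: total events / total rollbacks
      double averageEventsPerRollback() const {
        return totalRollbacks ? (double)totalEvents / totalRollbacks : 0.0;
      }

    private:
      static const size_t WINDOW = 16;  // "last few" -- size is a guess
      static double average(const std::deque<double>& v) {
        return v.empty()
                   ? 0.0
                   : std::accumulate(v.begin(), v.end(), 0.0) / v.size();
      }
      std::deque<double> recentDiffs, recentRollbackLeashes;
      unsigned long totalEvents = 0, totalRollbacks = 0,
                    totalIterations = 0;
    };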
These are the four attempts made so far:

Attempt 1:

This algorithm simply uses the formula

    Recent_Average_Event_Sparsity * Average_Events_Per_Rollback

in an attempt to set the time leash to the average number of time
units per rollback (i.e., set the leash so that it just reaches the
next rollback, on average).

Attempt 2:

This works as follows:

    Every time an event is executed:
        Add Recent_Average_Event_Sparsity to the time leash
        If Average_Rollbacks_Per_Iteration > a set constant (around 2)
            Constrain the leash to
                (a_multiplier_constant * Recent_Average_Event_Sparsity *
                 Average_Events_Per_Rollback) /
                Average_Rollbacks_Per_Iteration
        Else
            Constrain the leash to a reasonably high number, like 1000

The idea behind this algorithm is to increase the time leash by the
average time between events (extending the leash to include the next
event) and then constrain it: if too many rollbacks have occurred so
far (i.e., more than about 2 per iteration), the leash is constrained
to a value that should avoid rollbacks (a fraction of the average time
between rollbacks); if the number of rollbacks per iteration is low,
the leash is given more room (constrained only to 1000).

Attempt 3:

This works as follows (a rough sketch of this rule appears at the end
of this document):

    If a rollback occurs
        If leash > Recent_Average_Rollback_Leash
            Set leash = Recent_Average_Rollback_Leash
        Else
            Set leash = Recent_Average_Rollback_Leash / 2
    Else (no rollback)
        Add Recent_Average_Event_Sparsity to the time leash

    If Average_Rollbacks_Per_Iteration > a set constant (around 2)
        Constrain the leash to Recent_Average_Rollback_Leash / 2
    Else
        Constrain the leash to
            a_multiplier_constant * Recent_Average_Event_Sparsity *
            Average_Events_Per_Rollback

When a rollback occurs, this algorithm first sets the time leash to
(at most) the average leash value at recent rollbacks (i.e., it tries
to set the leash so that it includes no more than one rollback); when
there is no rollback, it increases the time leash by the average time
between events (extending the leash to include the next event). The
algorithm then constrains the leash: if too many rollbacks have
occurred so far (i.e., more than about 2 per iteration), the leash is
limited to half the average leash value at recent rollbacks (to try to
avoid another rollback); if the number of rollbacks per iteration is
low, the leash is given more room (constrained to several times the
average number of time units per rollback).

Attempt 4:

This simply uses Average_of_3rd_and_4th_Differences to set the leash
value (a sketch of this heuristic also appears at the end of this
document). A number of variations on this method are also being
examined, such as using a different number of virtual time
differences, averaging different differences, and not averaging at
all. The basic idea behind this algorithm is that we'd like to execute
an entire event "burst" (e.g., the handling of a packet) in each
iteration. This algorithm attempts to find a time leash value that
will include all events between the peak values shown in the Switch
column of Table 1.

All of these new algorithms are currently undergoing testing and
tuning, such as adding or changing multiplicative constants in some of
the formulas, to determine which works best across multiple
applications.
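As an illustration of the Attempt 3 update rule (building on the
hypothetical LeashStats sketch above), the logic might look roughly
like the following; the constant values and function names are
assumptions, not the actual adapt5 implementation.

    // Hypothetical illustration of the Attempt 3 rule -- not the
    // actual adapt5 code.  Uses the LeashStats sketch from above.
    #include <algorithm>

    double updateLeashAttempt3(double leash, bool rolledBack,
                               const LeashStats& stats,
                               double rollbacksPerIterLimit = 2.0,
                               double multiplier = 4.0 /* a guess */) {
      if (rolledBack) {
        // Aim for a leash that includes no more than one rollback.
        double avgRollbackLeash = stats.recentAverageRollbackLeash();
        leash = (leash > avgRollbackLeash) ? avgRollbackLeash
                                           : avgRollbackLeash / 2.0;
      } else {
        // Extend the leash to include (on average) the next event.
        leash += stats.recentAverageEventSparsity();
      }

      // Constrain the leash depending on how rollback-heavy the
      // simulation has been so far.
      double cap;
      if (stats.averageRollbacksPerIteration() > rollbacksPerIterLimit)
        cap = stats.recentAverageRollbackLeash() / 2.0;
      else
        cap = multiplier * stats.recentAverageEventSparsity() *
              stats.averageEventsPerRollback();
      return std::min(leash, cap);
    }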
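Similarly, here is a minimal sketch of the
Average_of_3rd_and_4th_Differences heuristic behind Attempt 4; the
window size of 16 and the "ignore the two largest, average the next
two" rule come from item E above, while the class and method names are
assumptions.

    // Hypothetical illustration of Attempt 4's heuristic -- not the
    // actual adapt5 code.  Keeps the most recent 16 virtual time
    // differences, ignores the two largest, and averages the next two
    // largest (the 3rd and 4th differences) to set the leash.
    #include <algorithm>
    #include <deque>
    #include <functional>
    #include <vector>

    class DifferenceWindow {
    public:
      void recordDifference(double diff) {
        diffs.push_back(diff);
        if (diffs.size() > 16) diffs.pop_front();
      }

      // Average of the 3rd- and 4th-largest recent differences, or 0
      // if fewer than 4 differences have been recorded.
      double averageOf3rdAnd4thDifferences() const {
        if (diffs.size() < 4) return 0.0;
        std::vector<double> sorted(diffs.begin(), diffs.end());
        std::sort(sorted.begin(), sorted.end(), std::greater<double>());
        return (sorted[2] + sorted[3]) / 2.0;  // skip the two largest
      }

    private:
      std::deque<double> diffs;
    };

The intent, per the description above, is a leash that is smaller than
the largest inter-burst gaps in the Switch column of Table 1 but large
enough to cover the small intra-burst differences, so that a whole
packet burst can be executed in one iteration.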