Optimizing Data Locality for Fork/Join Programs Using Constrained Work Stealing
    
    International Conference for High Performance Computing, Networking, Storage and Analysis  (SC) 2014
    Publication Type: Talk
    Summary
     We present an approach to improving data locality across different
 phases of fork/join programs scheduled using work stealing.  The
 approach consists of: (1) user-specified and automated approaches to
 constructing a steal tree, the schedule of steal operations,
 and (2) constrained work-stealing algorithms that constrain
 the actions of the scheduler to mirror a given steal tree.  These
 are combined to construct work-stealing schedules that maximize data
 locality across computation phases while ensuring load balance
 within each phase. These algorithms are also used to demonstrate
 dynamic coarsening, an optimization that improves spatial
 locality and reduces sequential overheads by combining many finer-grained
 tasks into coarser tasks while ensuring sufficient concurrency for
 locality-optimized load balance.  Implementation and evaluation in
 Cilk demonstrate performance improvements of up to 2.5x on 80 cores.
 We also demonstrate that dynamic coarsening can combine the
 performance benefits of coarse task specification with the
 adaptability of finer tasks.