A Scalable Double In-memory Checkpoint and Restart Scheme towards Exascale
    
    Workshop on Fault-Tolerance for HPC at Extreme Scale (FTXS) 2012
    Publication Type: Talk
    Summary
    This talk described recent progress in optimizing the in-memory checkpoint/restart fault-tolerance scheme so that it scales, with good performance, to 64K cores of a Blue Gene/P machine.