OpenAtom
Version1.5a
|
The Ortho object is basically used to accept the rows of the "S" matrix, use the orthoTransform subroutine to obtain the "T" matrix, and send the rows of the "T" matrix back to all the GSpace states. More...
Namespaces | |
cp | |
Class for paircalc config data. | |
Classes | |
class | CLA_Matrix |
class | CLA_Matrix_interface |
class | CLA_Matrix_msg |
class | CLA_MM3D_mult_init_msg |
class | CLA_MM3D_Map |
class | CLA_MM3D_multiplier |
class | Ortho |
For definition of CkDataMsg. More... | |
Macros | |
#define | MM_ALG_MIN 1 |
#define | MM_ALG_2D 1 |
#define | MM_ALG_3D 2 |
#define | MM_ALG_MAX 2 |
#define | SUCCESS 0 |
#define | ERR_INVALID_ALG -1 |
#define | ERR_INVALID_DIM -2 |
Functions | |
int | make_multiplier (CLA_Matrix_interface *A, CLA_Matrix_interface *B, CLA_Matrix_interface *C, CProxy_ArrayElement bindA, CProxy_ArrayElement bindB, CProxy_ArrayElement bindC, int M, int K, int N, int m, int k, int n, int strideM, int strideK, int strideN, CkCallback cbA, CkCallback cbB, CkCallback cbC, CkGroupID gid, int algorithm, int gemmSplitOrtho) |
template<typename T > | |
void | transpose (T *data, int m, int n) |
Ortho::Ortho () | |
Default empty constructor. For? | |
Ortho::Ortho (CkMigrateMessage *m) | |
Ortho::Ortho (int m, int n, CLA_Matrix_interface matA1, CLA_Matrix_interface matB1, CLA_Matrix_interface matC1, CLA_Matrix_interface matA2, CLA_Matrix_interface matB2, CLA_Matrix_interface matC2, CLA_Matrix_interface matA3, CLA_Matrix_interface matB3, CLA_Matrix_interface matC3, orthoConfig &_cfg, CkArrayID step2Helper, int timeKeep, CkGroupID _oMCastGID, CkGroupID _oRedGID) | |
New functions necessary for using CLA_Matrix /////////////////////////////////////////////////////////////////////////// ///////////////////////////////////////////////////////////////////////////. | |
void | Ortho::makeSections (const pc::pcConfig &cfgSymmPC, const pc::pcConfig &cfgAsymmPC, CkArrayID symAID, CkArrayID asymAID) |
Trigger the creation of appropriate sections of paircalcs to talk to. Also setup internal comm sections. More... | |
void | Ortho::start_calc (CkReductionMsg *msg) |
Symmetric PCs contribute data that is summed via this reduction to deposit a portion of the S matrix with this ortho, triggering S->T. | |
void | Ortho::do_iteration (void) |
Triggers the matrix multiplies in step 1 of an ortho iteration. More... | |
void | Ortho::do_iteration (orthoMtrigger *m) |
Used when array broadcasts in ortho are delegated to comlib/CkMulticast so as to not involve all PEs in bcast. | |
void | Ortho::step_2 () |
Triggers step 2, and optionally step 3 (if ortho helpers are being used) | |
void | Ortho::recvStep2 (CkDataMsg *msg) |
Receives the results of the call to OrthoHelper from step_2. More... | |
void | Ortho::step_3 () |
Triggers step 3 in the S->T process. More... | |
void | Ortho::tolerance_check () |
Computes square of the residuals and contributes to a reduction rooted at Ortho(0,0)::collect_error() More... | |
void | Ortho::collect_error (CkReductionMsg *msg) |
Computes the RMS error and either launches the next ortho iteration (if needed) or calls collect_results. | |
void | Ortho::collect_results (orthoMtrigger *m) |
Used when ortho redn/bcasts are delegated to comlib/CkMulticast because charm array broadcasts involve all PEs. | |
void | Ortho::collect_results (void) |
Computes walltimes, prints simulation status msgs and calls resume() if more openatom iterations are needed. More... | |
void | Ortho::resume () |
Sends results (T matrix) to the symm PCs (and also the asymms if this is dynamics) More... | |
void | Ortho::sendOrthoTtoAsymm () |
Used in dynamics, to send the results of S->T to the asymm paircalc instance which will use them. | |
void | Ortho::acceptSectionLambda (CkReductionMsg *msg) |
void | Ortho::gamma_done () |
Used in dynamics, to accept computed gamma and send it to the asymm PC instance. Also sends T if it hasnt yet been sent. More... | |
double | Ortho::array_diag_max (int sizem, int sizen, internalType *array) |
S should be equal to 2I. This returns max value of deviation from that in this ortho's portion of the S matrix. More... | |
void | Ortho::maxCheck (CkReductionMsg *msg) |
Called on ortho(0,0). Checks if PsiV update is needed based on the max deviation in S received via a redn across all orthos. Notifies GSpaceDriver if so. Called periodically in start_calc only for dynamics. | |
void | Ortho::resumeV (CkReductionMsg *msg) |
Once all GSpaceDriver chares are notified, they resume Ortho execution via a redn broadcast to this method. More... | |
void | Ortho::print_results (void) |
Dumps the T matrix to an appropriately named file. | |
virtual void | Ortho::pup (PUP::er &p) |
pack/unpack method More... | |
void | Ortho::orthoCookieinit (initCookieMsg *msg) |
void | Ortho::all_ready () |
called from each CLA_Matrix array (3 per multiplication, 3 mults) | |
void | Ortho::ready () |
Startup/Init synchronization. When all elements (PC, CLA_Matrix etc) are ready, first iteration is triggered. | |
static void | Ortho::step_2_cb (void *obj) |
Static methods used as callbacks. Could be replaced by CkCallbacks. | |
static void | Ortho::step_3_cb (void *obj) |
static void | Ortho::gamma_done_cb (void *obj) |
static void | Ortho::tol_cb (void *obj) |
CLA_Matrix::CLA_Matrix (int M, int K, int N, int m, int k, int n, int strideM, int strideN, int strideK, int part, CProxy_CLA_Matrix other1, CProxy_CLA_Matrix other2, CkCallback ready, int gemmSplitOrtho) | |
CLA_Matrix::CLA_Matrix (CProxy_CLA_MM3D_multiplier p, int M, int K, int N, int m, int k, int n, int strideM, int strideK, int strideN, int part, CkCallback cb, CkGroupID gid, int gemmSplitOrtho) | |
virtual void | CLA_Matrix::pup (PUP::er &p) |
virtual void | CLA_Matrix::ResumeFromSync () |
void | CLA_Matrix::multiply (double alpha, double beta, internalType *data, void(*fptr)(void *), void *usr_data) |
void | CLA_Matrix::receiveA (CLA_Matrix_msg *m) |
void | CLA_Matrix::receiveB (CLA_Matrix_msg *m) |
void | CLA_Matrix::multiply () |
void | CLA_Matrix::readyC (CkReductionMsg *m) |
void | CLA_Matrix::ready (CkReductionMsg *m) |
void | CLA_Matrix::mult_done (CkReductionMsg *m) |
CLA_Matrix_msg::CLA_Matrix_msg (internalType *data, int d1, int d2, int fromX, int fromY) | |
CLA_MM3D_multiplier::CLA_MM3D_multiplier (int m, int k, int n) | |
void | CLA_MM3D_multiplier::initialize_reduction (CLA_MM3D_mult_init_msg *m) |
void | CLA_MM3D_multiplier::receiveA (CLA_Matrix_msg *msg) |
void | CLA_MM3D_multiplier::receiveB (CLA_Matrix_msg *msg) |
void | CLA_MM3D_multiplier::multiply (internalType *A, internalType *B) |
Variables | |
bool | Ortho::parallelStep2 |
bool | Ortho::step2done |
bool | Ortho::step3done |
orthoConfig | Ortho::cfg |
int | Ortho::timeKeep |
internalType * | Ortho::orthoT |
internalType * | Ortho::ortho |
int | Ortho::numGlobalIter |
int | Ortho::iterations |
CProxySection_Ortho | Ortho::multiproxy |
CkSectionInfo | Ortho::orthoCookie |
int | Ortho::num_ready |
bool | Ortho::got_start |
int | Ortho::lbcaught |
PCSectionManager | Ortho::symmSectionMgr |
Section of symmetric PC chare array used by an Ortho chare. | |
PCSectionManager | Ortho::asymmSectionMgr |
Section of asymmetric PC chare array used by an Ortho chare. | |
CkGroupID | Ortho::oMCastGID |
Group IDs for the multicast manager groups. | |
CkGroupID | Ortho::oRedGID |
CProxy_OrthoHelper | Ortho::step2Helper |
The proxy of the step 2 helper chare array. | |
bool | Ortho::toleranceCheckOrthoT |
internalType * | Ortho::A |
internalType * | Ortho::B |
internalType * | Ortho::C |
internalType * | Ortho::tmp_arr |
int | Ortho::step |
int | Ortho::m |
int | Ortho::n |
double | Ortho::invsqr_tolerance |
int | Ortho::invsqr_max_iter |
CLA_Matrix_interface | Ortho::matA1 |
CLA_Matrix_interface | Ortho::matB1 |
CLA_Matrix_interface | Ortho::matC1 |
CLA_Matrix_interface | Ortho::matA2 |
CLA_Matrix_interface | Ortho::matB2 |
CLA_Matrix_interface | Ortho::matC2 |
CLA_Matrix_interface | Ortho::matA3 |
CLA_Matrix_interface | Ortho::matB3 |
CLA_Matrix_interface | Ortho::matC3 |
The Ortho object is basically used to accept the rows of the "S" matrix, use the orthoTransform subroutine to obtain the "T" matrix, and send the rows of the "T" matrix back to all the GSpace states.
The S_to_T(...) method is invoked when the S-Calculators send their contributions either through the S_Reducer group or otherwise. Once the conversion from S to T is done, the preparation for the broadcast back to the S-Calculators is done.
This method also takes of performing the load-balancing step for the experimental cases where that is in play.
2D version of ortho permits better parallelization of the diagonal and to facilitate pair calculator communication.
Each pair calculator will perform its multiply and reduce the result within its "column" AKA quadrant via the Ortho array. Resulting in sub matrices on Ortho which could be pasted together on ortho[0,0] for the complete matrix.
For the dynamics (non minization) case we have an additional multiplication of gamma= lambda*orthoT. And we pass both gamma and orthoT to the PC instead.
For dynamics we also have a tolerance check for orthoT. The tolerance check is a min reduction on OrthoT. If it falls out of the tolerance range we have to correct the velocities. We notify the PC of this via a flag in the finishsection call. It will then be handled by gspace. We will keep the tolerance flag and set orthoT to the unit AKA identity matrix for the gamma calculation.
We will perform that tolerance check every config.toleranceInterval steps.
This complicates interaction with PC a bit since we can no longer just multiply our index by the sGrainSize to determine it.
void Ortho::acceptSectionLambda | ( | CkReductionMsg * | msg | ) |
ompleted gamma will call finishPairCalcSection2
Definition at line 438 of file ortho.C.
References asymmSectionMgr, cp::ortho::orthoConfig::grainSize, cp::ortho::orthoConfig::isDynamics, cp::ortho::PCSectionManager::msgPriority, and cp::ortho::PCSectionManager::sendResults().
|
inline |
S should be equal to 2I. This returns max value of deviation from that in this ortho's portion of the S matrix.
OrthoT tolerance check util return max value.
Definition at line 279 of file ortho.h.
Referenced by start_calc().
void Ortho::collect_results | ( | void | ) |
Computes walltimes, prints simulation status msgs and calls resume() if more openatom iterations are needed.
Output debug information then increment iteration counter
Load balance controller
Definition at line 261 of file ortho.C.
References print_results(), and resume().
void Ortho::do_iteration | ( | void | ) |
Triggers the matrix multiplies in step 1 of an ortho iteration.
start step 1 on proxy 1, the callback will be for step 2 S2 = 3 * I - T * S1 currently A has T, B has S1, need to construct 3*I in C
Definition at line 717 of file ortho.C.
References cp::ortho::orthoConfig::grainSize, and step_2_cb().
Referenced by resumeV(), and start_calc().
void Ortho::gamma_done | ( | ) |
Used in dynamics, to accept computed gamma and send it to the asymm PC instance. Also sends T if it hasnt yet been sent.
CkPrintf("[d d] sending ortho g g g g gamma g g g
thisIndex.y,orthoT[0],orthoT[1],orthoT[m*n-2],orthoT[m*n-1],B[0],B[1],B[m*n-2],B[m*n-1]);
Definition at line 589 of file ortho.C.
References asymmSectionMgr, cp::ortho::orthoConfig::grainSize, cp::ortho::PCSectionManager::msgPriority, and cp::ortho::PCSectionManager::sendResults().
void Ortho::makeSections | ( | const pc::pcConfig & | cfgSymmPC, |
const pc::pcConfig & | cfgAsymmPC, | ||
CkArrayID | symAID, | ||
CkArrayID | asymAID | ||
) |
Trigger the creation of appropriate sections of paircalcs to talk to. Also setup internal comm sections.
For runs using a large numPE, Orthos chares are typically mapped onto a small fraction of the cores However, array broadcasts in charm++ involve all PEs (due to some legacy quirk present because of any-time migration support). Hence, to avoid this anti-scaling overhead, we delegate array collectives to comlib / CkMulticast by building a section that includes all chare elements! :)
The (0,0) Ortho chare sets up these sections and delegates them
Create an array section that includes the whole Ortho chare array
If reductions are being delegated to a comm library
Ask the rest of the section (the whole array) to init their CkSectionInfo cookies that identify the mgr etc
If multicasts are being delegated to a comm library
Use the appropriate library for multicasts
Symmetric PC sections should trigger S -> T computations in Ortho via this method
Asymmetric sections should simply drop off lambda at this method
Setup the symmetric instance paircalc array section for communication with the symm PC chares
Setup the asymmetric instance paircalc array section for gather/scatter of lambda data from/to the asymm PC chares
Definition at line 526 of file ortho.C.
References asymmSectionMgr, cp::ortho::orthoConfig::grainSize, cp::ortho::PCSectionManager::init(), cp::ortho::orthoConfig::numStates, oMCastGID, cp::ortho::PCSectionManager::setupArraySection(), and symmSectionMgr.
|
virtual |
pack/unpack method
CBase_Ortho::pup(p);
Definition at line 842 of file ortho.C.
References cp::ortho::orthoConfig::numStates.
|
inline |
Receives the results of the call to OrthoHelper from step_2.
Result of 0.5 * S3 * S2 arrives from helper.
void Ortho::resume | ( | ) |
Sends results (T matrix) to the symm PCs (and also the asymms if this is dynamics)
Resume execution by finishing the backward path of the Psi calculator.
orthos talking to non-phantoms in the symm pc instance will have to perform an extra transpose of their data. Hence, they do this only when they have to. We avoid this when phantoms are turned off by rigging the pc sections to make the mirror orthos deliver the data to our non-phantoms. Refer PCSectionManager::setupArraySection for more info.
Definition at line 290 of file ortho.C.
References cp::ortho::PCSectionManager::computePCStateIndices(), cp::ortho::orthoConfig::grainSize, cp::ortho::orthoConfig::isDynamics, sendOrthoTtoAsymm(), cp::ortho::PCSectionManager::sendResults(), and symmSectionMgr.
Referenced by collect_results().
void Ortho::resumeV | ( | CkReductionMsg * | msg | ) |
Once all GSpaceDriver chares are notified, they resume Ortho execution via a redn broadcast to this method.
Resume execution with the vpsi tolerance correction flag on.
Definition at line 425 of file ortho.C.
References do_iteration().
void Ortho::step_3 | ( | ) |
Triggers step 3 in the S->T process.
T = 0.5 * S2 * S3 (do S3 = T before) (on proxy 3) currently A has T, B has S1 (old), C has S2, tmp_arr has new S1.
Definition at line 785 of file ortho.C.
References cp::ortho::orthoConfig::grainSize.
Referenced by step_2().
void Ortho::tolerance_check | ( | ) |
Computes square of the residuals and contributes to a reduction rooted at Ortho(0,0)::collect_error()
calculate error and reset pointers (for step 1) current: T -> A, S1 -> tmp_arr, S2 -> C, S3 -> B
Definition at line 807 of file ortho.C.
References cp::ortho::PCSectionManager::orthomCastGrpID, and symmSectionMgr.