OpenAtom  Version1.5a
PairCalculator Class Reference
Inheritance diagram for PairCalculator:
CBase_PairCalculator

Public Member Functions

 PairCalculator (CProxy_InputDataHandler< CollatorType, CollatorType > inProxy, const pc::pcConfig _cfg)
 Entry Method. (obviously) More...
 
 PairCalculator (CkMigrateMessage *)
 Constructor for migration.
 
 ~PairCalculator ()
 Destructor (nothing needs to be done?) More...
 
CollatorTypeleftHandler () const
 Returns a pointer to the collator that will buffer the left matrix data (only for use by the corresponding InputDataHandler chare)
 
CollatorTyperightHandler () const
 Returns a pointer to the collator that will buffer the right matrix data (only for use by the corresponding InputDataHandler chare)
 
void phantomDone ()
 Entry Method. To handle correct startup for symm PC w/phantoms. More...
 
void acceptLeftData (paircalcInputMsg *msg)
 Entry Method. Method to send in the complete block of the left matrix. More...
 
void acceptRightData (paircalcInputMsg *msg)
 Entry Method. Method to send in the complete block of the right matrix. More...
 
void launchComputations (paircalcInputMsg *aMsg)
 NOT an entry method. Called locally from the acceptData* methods to launch the appropriate number-crunching method. More...
 
void multiplyForward (bool flag_dp)
 Forward path multiply driver. Prepares matrices, calls DGEMM, contributes results to Ortho subTiles and also passes relevant data to phantom PC chares. More...
 
void multiplyForwardRDMA ()
 Entry Method. Simply redirects call to multiplyForward()
 
void contributeSubTiles (internalType *fullOutput)
 Piece up a tile and send all the pieces as this PC's contribution to the Ortho chares. More...
 
void sendTiles (bool flag_dp)
 Contribute orthoGrainSized tiles of data (that are ready?) to the corresponding ortho chares. More...
 
void collectTile (bool doMatrix1, bool doMatrix2, bool doOrthoT, int orthoX, int orthoY, int orthoGrainSizeX, int orthoGrainSizeY, int numRecdBW, int matrixSize, internalType *matrix1, internalType *matrix2)
 Receive data from ortho chares and copy into matrix.
 
void initGRed (initGRedMsg *msg)
 Entry Method. Initializes the section cookie and the reduction client. Called on startup as the chare world is being created. More...
 
void acceptOrthoT (multiplyResultMsg *msg)
 Entry Method. During dynamics, each Ortho calls this on the Asymm loop PC instances to send its share of T back to avoid a race condition between Gamma and T. More...
 
void multiplyResult (multiplyResultMsg *msg)
 Entry Method. Backward path multiplication. More...
 
void multiplyPsiV ()
 Dynamics: PsiV Tolerance correction loop called on symmetric instances. Technically, backward path. More...
 
void bwMultiplyDynOrthoT ()
 Multiplies Fpsi by T (from Ortho) More...
 
void multiplyResultI (multiplyResultMsg *msg)
 Entry Method. Simply forwards the call to multiplyResult(). Dont seem to be any instances in source which call this method. Check. More...
 
void bwbarrier (CkReductionMsg *msg)
 Entry Method. a debugging tool: a barrier at the end of the backward path before anything is sent over to GSP. More...
 
void bwMultiplyHelper (int size, internalType *matrix1, internalType *matrix2, internalType *amatrix, internalType *amatrix2, bool unitcoef, int m_in, int n_in, int k_in, int BNAoffset, int BNCoffset, int BTAoffset, int BTCoffset, int orthoX, int orthoY, double beta, int ogx, int ogy)
 multiplyPsiV() and multiplyResult() call this to perform the matrix multiply math on the backward path. This calls DGEMM routines for the same. More...
 
void bwSendHelper (int orthoX, int orthoY, int sizeX, int sizeY, int ogx, int ogy)
 Called on the normal backward path (non-psiV) to set up the data sends to GSpace.
 
void sendBWResult (sendBWsignalMsg *msg)
 Entry Method. Send the results via multiple reductions as triggered by a prioritized message. More...
 
void sendBWResultDirect (sendBWsignalMsg *msg)
 Entry Method. More...
 
void sendBWResultColumn (bool other, int startGrain, int endGrain)
 Send the result for this column.
 
void sendBWResultColumnDirect (bool other, int startGrain, int endGrain)
 
void initResultSection (initResultMsg *msg)
 Entry Method. Initialize the section cookie for each slice of the result. More...
 
void pup (PUP::er &)
 PUP method. More...
 
void reorder (int *offsetMap, int *revOffsetMap, double *data, double *scratch)
 
void dumpMatrix (const char *, double *, int, int, int xstart=0, int ystart=0, int xtra1=0, int xtra2=0)
 
void dumpMatrix (const char *, complex *, int, int, int xstart=0, int ystart=0, int xtra1=0, int xtra2=0)
 
void dumpMatrixComplex (const char *, complex *, int, int, int xstart=0, int ystart=0, int iter=0)
 
void dgemmSplitBwdM (int m, int n, int k, char *trans, char *transT, double *alpha, double *A, double *B, double *bt, double *C)
 
void dgemmSplitFwdStreamMK (int m, int n, int k, char *trans, char *transT, double *alpha, double *A, int *lda, double *B, int *ldb, double *C, int *ldc)
 
void dgemmSplitFwdStreamNK (int m, int n, int k, char *trans, char *transT, double *alpha, double *A, int *lda, double *B, int *ldb, double *C, int *ldc)
 
void ResumeFromSync ()
 
void lbsync ()
 Entry Method. Something to sync?
 

Private Member Functions

void enqueueBWsend (bool unitcoef, int priority=1)
 Schedules the entry methods that send out the results to GSpace with appropriate priority. More...
 
void cleanupAfterBWPath ()
 Cleans up at end of an iteration (fw-bw computation loop); frees mem, resets counters etc. More...
 
void copyIntoTiles (double *source, double **dest, int sourceRows, int sourceCols, int *offsetsRow, int *offsetsCol, int *touched, int tileSize, int tilesPerRow)
 Copy the results from outdata1 and outdata2 into the tiles. More...
 

Private Attributes

pc::pcConfig cfg
 A private copy of the input configurations.
 
CProxy_InputDataHandler
< CollatorType, CollatorType
myMsgHandler
 A handle to the co-located chare array that handles data input.
 
CollatorTypeleftCollator
 Data collators for the left and right matrix blocks.
 
CollatorTyperightCollator
 
bool isLeftReady
 Flags indicating if the left and right matrix blocks have been received.
 
bool isRightReady
 
int numRecd
 number of messages received
 
int numRecdBW
 number of messages received BW
 
int numRecdBWOT
 number of messages received BW orthoT
 
int numExpected
 number of messages expected all
 
int numExpectedX
 number of messages expected x-axis
 
int numExpectedY
 number of messages expected y-axis
 
int grainSizeX
 number of states per chare x-axis
 
int grainSizeY
 number of states per chare y-axis
 
int orthoGrainSizeRemX
 sgrainSizeX % orthoGrainSize
 
int orthoGrainSizeRemY
 sgrainSizeY % orthoGrainSize
 
int blkSize
 number points in gspace plane
 
int numPoints
 number of points in this chunk
 
int streamCaughtR
 number of rows caught so far R stream
 
int streamCaughtL
 number of rows caught so far L stream
 
int numRecLeft
 number of rows so far total left
 
int numRecRight
 number of rows so far total right
 
int gemmSplitFWk
 number of rows in split FW dgemm
 
int gemmSplitFWm
 number of columns in split FW dgemm
 
int gemmSplitBW
 number of rows in split BW dgemm
 
int * LeftOffsets
 index numbers of caught stream elements
 
int * RightOffsets
 index numbers of caught stream elements
 
int * LeftRev
 reverse index numbers of caught stream elements
 
int * RightRev
 reverse index numbers of caught stream elements
 
double ** outTiles
 in output streaming we populate the tiles directly
 
int * touchedTiles
 tracker to detect when tiles are full
 
bool notOnDiagonal
 being on or off diagonal changes many things
 
bool symmetricOnDiagonal
 diagonal symmetric special case
 
bool expectOrthoT
 Is true only in asymmetric, dynamics scenario. For PC instances in the asymmetric chare array, orthoT should arrive before end of fwd path.
 
bool amPhantom
 consolidate thisIndex.x<thisIndex.y && cfg.isSymmetric && phantomsym
 
CkArrayID cb_aid
 bw path callback array ID
 
int cb_ep
 bw path callback entry point
 
int cb_ep_tol
 bw path callback entry point for psiV tolerance
 
bool existsLeft
 inDataLeft allocated
 
bool existsRight
 inDataRight allocated
 
bool existsOut
 outData allocated
 
bool existsNew
 newData allocated
 
bool resumed
 have resumed from load balancing
 
inputTypemynewData
 results of bw multiply
 
inputTypeothernewData
 results of sym off diagonal multiply,
 
internalType * inDataLeft
 or the C=-1 inRight orthoT +c in dynamics More...
 
internalType * inDataRight
 the input pair to be transformed
 
paircalcInputMsgmsgLeft
 
paircalcInputMsgmsgRight
 Incoming messages with left and right matrix data that are kept around so that we can directly compute on them.
 
internalType * outData
 results of fw multiply
 
int actionType
 matrix usage control [NORMAL, KEEPORTHO, PSIV]
 
double * allCaughtLeft
 unordered rows of FW input
 
double * allCaughtRight
 unordered rows of FW input
 
internalType * inResult1
 accumulate ortho or lambda
 
internalType * inResult2
 used in gamma calc (non minimization)
 
int rck
 count of received cookies
 
CkGroupID mCastGrpIdOrtho
 group id for multicast manager ortho
 
int numOrthoCol
 sGrainSizeX/orthoGrainSize
 
int numOrthoRow
 sGrainSizeY/orthoGrainSize
 
int numOrtho
 number of orthos in our grain = numOrthoCol*numOrthoRow
 
CkGroupID mCastGrpId
 group id for multicast manager bw
 
CkSectionInfo * resultCookies
 array of bw path section cookies
 
CkSectionInfo * otherResultCookies
 extra array of bw path section cookies for sym off diag, or dynamics
 
CkCallback * orthoCB
 forward path callbacks
 
CkSectionInfo * orthoCookies
 forward path reduction cookie
 
int * columnCount
 count of processed rows in BW by column
 
int * columnCountOther
 count of processed rows in BW by column
 

Detailed Description

Definition at line 305 of file ckPairCalculator.h.

Constructor & Destructor Documentation

PairCalculator::PairCalculator ( CProxy_InputDataHandler< CollatorType, CollatorType inProxy,
const pc::pcConfig  _cfg 
)

Entry Method. (obviously)

constructor

Todo:
: This is a hack to ensure that phantoms which get matrix blocks

–—— Setup the forward path input message handling –——

Set isDataReady flags to false only for those inputs this chare will be getting

Phantoms will get only a right matrix input from their mirror chares

Non-phantoms will always get at least a left matrix input

Chares on the chare array diagonals will get ONLY a left matrix (which will also be the right).

Setup the callbacks

Create a string that holds the chare ID and pass it to the message handlers

Create the message handlers for the left and right input matrix blocks

This is the first point during execution when I can supply my InputDataHandler with pointers to the msg handlers, hence it is (now) safe to insert the [w,x,y,z]th element of the InputDataHandler chare array (as it will immediately clamor for access to these handlers)

Definition at line 80 of file ckPairCalculator.C.

References allCaughtLeft, allCaughtRight, amPhantom, cp::paircalc::pcConfig::areBWTilesCollected, cp::paircalc::pcConfig::arePhantomsOn, cb_aid, cb_ep, cb_ep_tol, cfg, columnCount, columnCountOther, cp::paircalc::pcConfig::conserveMemory, existsLeft, existsNew, existsOut, existsRight, expectOrthoT, gemmSplitBW, cp::paircalc::pcConfig::gemmSplitFWk, gemmSplitFWk, cp::paircalc::pcConfig::grainSize, grainSizeX, grainSizeY, cp::paircalc::pcConfig::gSpaceAID, cp::paircalc::pcConfig::gSpaceEP, inDataLeft, inDataRight, inResult1, inResult2, cp::paircalc::pcConfig::isBWstreaming, cp::paircalc::pcConfig::isDynamics, cp::paircalc::pcConfig::isLBon, isLeftReady, cp::paircalc::pcConfig::isSymmetric, leftCollator, msgRight, myMsgHandler, mynewData, notOnDiagonal, numExpected, numExpectedX, numExpectedY, numOrtho, numOrthoCol, numOrthoRow, numPoints, numRecd, numRecdBW, numRecdBWOT, numRecLeft, numRecRight, cp::paircalc::pcConfig::numStates, orthoCB, orthoCookies, cp::paircalc::pcConfig::orthoGrainSize, orthoGrainSizeRemX, orthoGrainSizeRemY, othernewData, otherResultCookies, outData, cp::paircalc::pcConfig::PsiVEP, resultCookies, resumed, streamCaughtL, streamCaughtR, symmetricOnDiagonal, and touchedTiles.

PairCalculator::~PairCalculator ( )

Member Function Documentation

void PairCalculator::acceptLeftData ( paircalcInputMsg msg)

Entry Method. Method to send in the complete block of the left matrix.

accept data for left side of pair

Assert that data is a valid pointer

Assert that numRows is as expected

Check data validity

Once the basic checks have passed, and if we're debugging print status info

Set member data pertinent to the left block

If all data is ready

Trigger the computation

Definition at line 478 of file ckPairCalculator.C.

References paircalcInputMsg::blkSize, cfg, paircalcInputMsg::data(), existsLeft, cp::paircalc::MessageDataCollator< msgType, dataType >::getSampleMsg(), inDataLeft, isLeftReady, cp::paircalc::pcConfig::isSymmetric, launchComputations(), leftCollator, paircalcInputMsg::numCols(), numExpectedX, numPoints, numRecd, and paircalcInputMsg::numRows().

void PairCalculator::acceptOrthoT ( multiplyResultMsg msg)

Entry Method. During dynamics, each Ortho calls this on the Asymm loop PC instances to send its share of T back to avoid a race condition between Gamma and T.

Basically calls collectTile to accept incoming ortho data from an Ortho chare.

Note
This method is used only during dynamics in the asymm loop. Talk to EB for more info

Do not delete msg. Its a nokeep. /////////////////////////////////////////////////////////////////////////// ollect orthoT from matrix2

?

Todo:
Document this after talking with EB. Shouldnt it be an error if orthoX/Y > maxorthostateindex?
Note
: multiplyForward() already checks for numRecdBWOT and calls bwMultiplyDynOrthoT(). There is no need to call it again here. It appears, this would have been a correctness bug that would have showed up if gamma did beat OrthoT. Verify.

Definition at line 1030 of file ckPairCalculator.C.

References actionType, cfg, collectTile(), expectOrthoT, grainSizeX, grainSizeY, cp::paircalc::pcConfig::isSymmetric, numOrtho, numOrthoCol, numOrthoRow, numRecd, numRecdBWOT, cp::paircalc::pcConfig::numStates, cp::paircalc::pcConfig::orthoGrainSize, orthoGrainSizeRemX, and orthoGrainSizeRemY.

void PairCalculator::acceptRightData ( paircalcInputMsg msg)

Entry Method. Method to send in the complete block of the right matrix.

accept data for right side of pair

Assert that data is a valid pointer

Assert that numRows is as expected

Check data validity

Once the basic checks have passed, and if we're debugging print status info

Set member data pertinent to the right block

If all data is ready

Phantom chares already have correct info in the incoming msg. Only non-phantoms have to be handled

Trigger the computation

Definition at line 538 of file ckPairCalculator.C.

References amPhantom, paircalcInputMsg::blkSize, cfg, paircalcInputMsg::data(), existsRight, cp::paircalc::MessageDataCollator< msgType, dataType >::getSampleMsg(), inDataRight, isLeftReady, cp::paircalc::pcConfig::isSymmetric, launchComputations(), msgRight, paircalcInputMsg::numCols(), numExpectedY, numPoints, numRecd, and paircalcInputMsg::numRows().

void PairCalculator::bwbarrier ( CkReductionMsg *  msg)

Entry Method. a debugging tool: a barrier at the end of the backward path before anything is sent over to GSP.

Todo:
: Comments says this is a minimization only hack! Figure out!

heap hack for minimzation only

Definition at line 1701 of file ckPairCalculator.C.

References enqueueBWsend().

void PairCalculator::bwMultiplyDynOrthoT ( )

Multiplies Fpsi by T (from Ortho)

Ensure that we're the correct paircalc instance

Ensure that we really have received all the OrthoT pieces

Reset the counter

Definition at line 1972 of file ckPairCalculator.C.

References blkSize, cfg, existsNew, expectOrthoT, grainSizeX, grainSizeY, inDataRight, inResult2, cp::paircalc::pcConfig::isSymmetric, mynewData, notOnDiagonal, cp::paircalc::pcConfig::numChunks, numExpectedX, numExpectedY, numOrtho, numPoints, numRecdBWOT, othernewData, and streamCaughtR.

void PairCalculator::bwMultiplyHelper ( int  size,
internalType *  matrix1,
internalType *  matrix2,
internalType *  amatrix,
internalType *  amatrix2,
bool  unitcoef,
int  m_in,
int  n_in,
int  k_in,
int  BNAoffset,
int  BNCoffset,
int  BTAoffset,
int  BTCoffset,
int  orthoX,
int  orthoY,
double  beta,
int  orthoGrainSizeX,
int  orthoGrainSizeY 
)

multiplyPsiV() and multiplyResult() call this to perform the matrix multiply math on the backward path. This calls DGEMM routines for the same.

PRE: amatrix contains the entire matrix for the multiplication.

matrix contains just what was sent in the trigger message.

omes in with a minus sign

Definition at line 1716 of file ckPairCalculator.C.

References actionType, amPhantom, cp::paircalc::pcConfig::areBWTilesCollected, cp::paircalc::pcConfig::arePhantomsOn, blkSize, cfg, collectTile(), existsNew, existsOut, existsRight, cp::paircalc::pcConfig::grainSize, grainSizeX, grainSizeY, inDataLeft, inDataRight, cp::paircalc::pcConfig::isSymmetric, mynewData, notOnDiagonal, cp::paircalc::pcConfig::numChunks, numExpectedX, numExpectedY, numOrtho, numPoints, numRecdBW, cp::paircalc::pcConfig::orthoGrainSize, othernewData, outData, streamCaughtR, and symmetricOnDiagonal.

Referenced by multiplyPsiV(), and multiplyResult().

void PairCalculator::cleanupAfterBWPath ( )
private

Cleans up at end of an iteration (fw-bw computation loop); frees mem, resets counters etc.

Resets counters, zeroes out data structures, frees mem when appropriate.

Readies other components (collators) for the next iteration. Ends the backward path substep timing. Paircalc behavior under different cfg.conserveMemory settings are defined solely in this method.

Reset the backward path tile counter

For all mem usage modes

If we're stream computing, zero out these containers

Phantom chares will get a fresh right matrix every time in an incoming msg. Hence, delete it irrespective of cfg.conserveMemory or RDMA settings

For only the speed-optimized (high-mem) mode

<

Todo:
: is this redundant? isnt othernewData always allocated?

<

Todo:
: is this redundant? isnt othernewData always allocated?

For the normal and strict low-mem modes

These deletes lived in sendBWResult() and sendBWResultDirect()

For the strictest of the low-mem footprint modes,

Delete the input matrices and they'll get reallocated on the next pass. Only do this if we dont use RDMA because we dont want to setup an RDMA channel every iteration

If this chare received a right matrix message and it still exists

Delete the storage for the output matrix too

Definition at line 1483 of file ckPairCalculator.C.

References actionType, amPhantom, cfg, columnCount, columnCountOther, cp::paircalc::pcConfig::conserveMemory, existsLeft, existsNew, existsOut, existsRight, cp::paircalc::MessageDataCollator< msgType, dataType >::expectNext(), inDataLeft, inDataRight, cp::paircalc::pcConfig::isBWstreaming, cp::paircalc::pcConfig::isSymmetric, leftCollator, msgRight, mynewData, notOnDiagonal, numExpectedX, numExpectedY, numOrthoCol, numPoints, numRecdBW, othernewData, and outData.

Referenced by multiplyResult(), sendBWResult(), and sendBWResultDirect().

void PairCalculator::contributeSubTiles ( internalType *  fullOutput)

Piece up a tile and send all the pieces as this PC's contribution to the Ortho chares.

Done: necessary changes: 1: border tiles are not uniform size 2: offset calculation must handle non-uniformity solutions, use sGrainSize for all row skips use orthoGrainSizeX or orthoGrainSizeY as needed for row and column iteration

Definition at line 948 of file ckPairCalculator.C.

References cfg, grainSizeX, grainSizeY, cp::paircalc::pcConfig::isSymmetric, mCastGrpIdOrtho, numOrtho, numOrthoCol, numOrthoRow, orthoCB, orthoCookies, cp::paircalc::pcConfig::orthoGrainSize, orthoGrainSizeRemX, and orthoGrainSizeRemY.

Referenced by multiplyForward().

void PairCalculator::copyIntoTiles ( double *  source,
double **  dest,
int  sourceRows,
int  sourceCols,
int *  offsetsRow,
int *  offsetsCol,
int *  touched,
int  tileSize,
int  tilesPerRow 
)
private

Copy the results from outdata1 and outdata2 into the tiles.

Iterate through the source array, look up the destination row in offsetsRow, destination col in offsetsCol. This will be the destination row and column for the output if the output were considered as a single matrix. Use tileSize to map these values into the destination tile.

Definition at line 2573 of file ckPairCalculator.C.

References numOrtho.

void PairCalculator::enqueueBWsend ( bool  unitcoef,
int  priority = 1 
)
inlineprivate

Schedules the entry methods that send out the results to GSpace with appropriate priority.

Create a signal message

Collapse this into 1 flag

Either reduce the results or sum them direct in GSpace

Definition at line 1087 of file ckPairCalculator.C.

References amPhantom, cp::paircalc::pcConfig::arePhantomsOn, cfg, cp::paircalc::pcConfig::isOutputReduced, cp::paircalc::pcConfig::isSymmetric, notOnDiagonal, and cp::paircalc::pcConfig::shouldDelayBWsend.

Referenced by bwbarrier(), multiplyPsiV(), and multiplyResult().

void PairCalculator::initGRed ( initGRedMsg msg)

Entry Method. Initializes the section cookie and the reduction client. Called on startup as the chare world is being created.

initialize the ortho reduction trees and cookies

Do not delete msg. Its a nokeep.

Note
: numRecd here is just used as some counter during the init phase. Not related to its usual purpose

Definition at line 370 of file ckPairCalculator.C.

References cp::paircalc::pcConfig::arePhantomsOn, cfg, cp::paircalc::pcConfig::instanceIndex, cp::paircalc::pcConfig::isSymmetric, mCastGrpIdOrtho, notOnDiagonal, numOrtho, numOrthoCol, numRecd, cp::paircalc::pcConfig::numStates, orthoCB, orthoCookies, cp::paircalc::pcConfig::orthoGrainSize, and cp::paircalc::pcConfig::uponSetupCompletion.

void PairCalculator::initResultSection ( initResultMsg msg)

Entry Method. Initialize the section cookie for each slice of the result.

initialize the multicast tree and cookies for backward path

Do not delete msg. Its a nokeep.

Definition at line 431 of file ckPairCalculator.C.

References cfg, cp::paircalc::pcConfig::grainSize, cp::paircalc::pcConfig::isSymmetric, mCastGrpId, otherResultCookies, rck, and resultCookies.

void PairCalculator::launchComputations ( paircalcInputMsg aMsg)

NOT an entry method. Called locally from the acceptData* methods to launch the appropriate number-crunching method.

once the data has arrived, launch the forward path

Ensure that we're really ready to launch computations

< Flag to make sure we cleanup only if FW path starts

expectOrthoT is false in any scenario other than asymmetric, dynamics. numRecdBWOT is equal to numOrtho only when it is asymm, dynamics and T has been received completely (from Ortho). So this condition, invokes multiplyForward() on all cases except (asymm, dynamics when T has not been received)

In that exception scenario, we dont do anything now. Later, when all of T is received, both multiplyForward() and bwMultiplyDynOrthoT() are called. Look for these calls in acceptOrthoT().

Do nothing for the phantom chare, non-psiv loops. Computation will be triggered only in the backward path

Reset the counters and flags for the next iteration

If asymm,dyn and T has not yet arrived completely, fw path has not yet been triggered. In this case numRecd will be reset in acceptOrthoT() after the fw path has been triggered. numRecd is reset here for all other cases.

All non-phantoms should expect left matrix data again

All non(symm, on-diagonal) chares should expect right matrix data again

Definition at line 602 of file ckPairCalculator.C.

References actionType, amPhantom, paircalcInputMsg::blkSize, blkSize, cfg, cp::paircalc::MessageDataCollator< msgType, dataType >::expectNext(), expectOrthoT, isLeftReady, cp::paircalc::pcConfig::isSymmetric, leftCollator, numExpected, numExpectedX, numExpectedY, numOrtho, numRecd, numRecdBWOT, and symmetricOnDiagonal.

Referenced by acceptLeftData(), and acceptRightData().

void PairCalculator::multiplyForward ( bool  flag_dp)

Forward path multiply driver. Prepares matrices, calls DGEMM, contributes results to Ortho subTiles and also passes relevant data to phantom PC chares.

Forward path multiply.

Essentially does a simple GEMM on the input matrices.

A = [numExpectedX, numPoints] (left matrix, input) B = [numExpectedY, numPoints] (right matrix, input) C = [numExpectedX, numExpectedY] (output matrix, also called S matrix)

In terms of the input matrix dimensions, we need: [numExpectedX, numExpectedY] = [numExpectedX, numPoints] X [numPoints, numExpectedY]

GEMM performs: C = alpha * op(A).op(B) + beta * C where the matrix dimensions are: op(A) = [m,k] op(B) = [k,n] C = [m,n] with m = numExpectedX k = numPoints n = numExpectedY

At first glance, it appears we need to transpose matrix B to make this work. However, since GEMMs are fortran routines, they have a transposed view of the data i.e, a matrix M[r,c] in C++ appears to be a matrix M'[c,r] if the raw array is directly passed into fortran.

The input arrays passed directly into fortran routines will appear to be of dimensions [numPoints, numExpectedX] and [numPoints, numExpectedY]. Hence, to achieve the desired multiply, we transpose the first matrix (A). This gives us the solution matrix we want in one step.

If this is an asymmetric loop, dynamics case AND Ortho has already sent T, call bwMultiplyDynOrthoT() as we must also multiply orthoT by Fpsi

Note
: This if condition originally lived in acceptPairData(). Has been shoveled here to reduce branching over there.

Definition at line 831 of file ckPairCalculator.C.

References cp::paircalc::pcConfig::arePhantomsOn, blkSize, cfg, contributeSubTiles(), paircalcInputMsg::data(), existsOut, existsRight, expectOrthoT, grainSizeX, grainSizeY, cp::paircalc::pcConfig::isSymmetric, msgRight, notOnDiagonal, numExpectedX, numExpectedY, numOrtho, numPoints, numRecdBWOT, outData, and symmetricOnDiagonal.

Referenced by multiplyForwardRDMA().

void PairCalculator::multiplyPsiV ( )

Dynamics: PsiV Tolerance correction loop called on symmetric instances. Technically, backward path.

Tolerance correction PsiV Backward path multiplication This is the same as the regular one matrix backward path with the following exceptions:

  • inDataLeft and inDataRight contain PsiV
  • outData contains the orthoT from the previous (standard) backward path invocation

We do not need to go through the all of multiplyresult for PsiV. All we really need is the setup for multiplyHelper

Schedule the entry methods that will send the bw results out

Definition at line 1134 of file ckPairCalculator.C.

References actionType, amPhantom, cp::paircalc::pcConfig::arePhantomsOn, blkSize, bwMultiplyHelper(), cfg, paircalcInputMsg::data(), enqueueBWsend(), existsRight, grainSizeX, grainSizeY, cp::paircalc::pcConfig::isSymmetric, msgRight, notOnDiagonal, numExpectedY, numPoints, and outData.

void PairCalculator::multiplyResult ( multiplyResultMsg msg)

Entry Method. Backward path multiplication.

Backward path multiplication.

Involves more GEMMs

The basic idea is to apply the orthonormalized 'correction' matrix to the input matrices from the forward path and ship them back to GSpace.

For the symmetric instance, since the input matrices (left and right) are the same, there's only one input matrix to apply the correction to. We achieve better load balance by making the phantom chares also participate in this process. This is the reason the phantoms exist. And this is why input data is shipped off to the phantoms at the end of the forward path.

For symm instance: phantoms do: inDataRight . transpose(amatrix) non-phantoms do: inDataLeft . amatrix

For asymm instance: inDataLeft . amatrix

  • inDataRight . amatrix2

Increment the number of tiles received in the backward path

Get the indices of the ortho that chare that sent this data to us

If I am a phantom chare and have not yet received the right matrix data from my non-phantom mirror

Note
: ASSUMING TMATRIX IS REAL (LOSS OF GENERALITY)

<

Note
: may be overridden later

If the grain size for paircalc and ortho are the same

else, if this is a PsiV loop

else, if PC is collecting all the tiles from Ortho

If we have all the input we need (received all the tiles, or streaming the computations)

If we're stream computing without any barriers, simply send whatever results are ready now

If we're done with all the paircalc work in this loop (iteration), then cleanup

If all the computations are done, and we're either collecting tiles or are barriered, then we need to send our results to GSpace now

Note
: We do not enter this if we're streaming without barriers, as in that case data will be sent out as it is ready
: cfg.orthoGrainSize == cfg.grainSize implies this PC chare will get only one msg in the bw path. This is covered by the check numRecdBW==numOrtho, hence is just for paranoia.
Todo:
: It appears actionType = PSIV will not enter multiplyResult. Needs to be verified as lots of conditions here chk for actionType==PSIV

If we're barriered on the backward path

If we're not streaming, delete inResult*

Definition at line 1207 of file ckPairCalculator.C.

References actionType, amPhantom, cp::paircalc::pcConfig::areBWTilesCollected, cp::paircalc::pcConfig::arePhantomsOn, bwMultiplyHelper(), bwSendHelper(), cfg, cleanupAfterBWPath(), collectTile(), cp::paircalc::pcConfig::conserveMemory, enqueueBWsend(), existsRight, expectOrthoT, cp::paircalc::pcConfig::grainSize, grainSizeX, grainSizeY, inResult1, inResult2, cp::paircalc::pcConfig::isBWbarriered, cp::paircalc::pcConfig::isBWstreaming, cp::paircalc::pcConfig::isSymmetric, notOnDiagonal, numOrtho, numOrthoCol, numOrthoRow, numPoints, numRecdBW, cp::paircalc::pcConfig::numStates, cp::paircalc::pcConfig::orthoGrainSize, orthoGrainSizeRemX, orthoGrainSizeRemY, and symmetricOnDiagonal.

Referenced by multiplyResultI().

void PairCalculator::multiplyResultI ( multiplyResultMsg msg)

Entry Method. Simply forwards the call to multiplyResult(). Dont seem to be any instances in source which call this method. Check.

PairCalculator::multiplyResult(int size, double *matrix1, double *matrix2)

Do not delete msg. Its a nokeep.

Definition at line 1119 of file ckPairCalculator.C.

References multiplyResult().

void PairCalculator::phantomDone ( )

Entry Method. To handle correct startup for symm PC w/phantoms.

setup of phantom is done

Definition at line 424 of file ckPairCalculator.C.

References cfg, cp::paircalc::pcConfig::instanceIndex, numOrtho, and cp::paircalc::pcConfig::uponSetupCompletion.

void PairCalculator::pup ( PUP::er &  p)

PUP method.

pack and unpack

Todo:
: Fix this to allocate or grab a msgLeft and msgRight. inDataLeft/Right is no longer allocated directly
Todo:
: Fix this to pack msgLeft and msgRight directly. inDataLeft/Right is no longer allocated directly msgLeft and msgRight will already be packed and available as msgs. They just wont be packed together with the rest of paircalc. Is there any way we could simply hand the msgs back to charm and ask it to deliver them to me after I am done migrating.? Or am I talking crap? if(existsLeft) p(inDataLeft, numExpectedX * numPoints * 2); if(existsRight) p(inDataRight, numExpectedY* numPoints * 2);

Definition at line 228 of file ckPairCalculator.C.

References actionType, amPhantom, cp::paircalc::pcConfig::areBWTilesCollected, cb_aid, cb_ep, cb_ep_tol, cfg, columnCount, columnCountOther, cp::paircalc::pcConfig::conserveMemory, existsLeft, existsNew, existsOut, existsRight, expectOrthoT, cp::paircalc::pcConfig::grainSize, grainSizeX, grainSizeY, inDataLeft, inDataRight, cp::paircalc::pcConfig::isBWstreaming, cp::paircalc::pcConfig::isLBon, cp::paircalc::pcConfig::isSymmetric, mCastGrpId, mCastGrpIdOrtho, mynewData, notOnDiagonal, cp::paircalc::pcConfig::numChunks, numExpected, numExpectedX, numExpectedY, numOrtho, numOrthoCol, numOrthoRow, numPoints, numRecd, numRecdBW, numRecdBWOT, cp::paircalc::pcConfig::numStates, orthoCB, orthoCookies, orthoGrainSizeRemX, orthoGrainSizeRemY, othernewData, otherResultCookies, outData, rck, resultCookies, resumed, and symmetricOnDiagonal.

void PairCalculator::sendBWResult ( sendBWsignalMsg msg)

Entry Method. Send the results via multiple reductions as triggered by a prioritized message.

This is an entry method to allow us to delay this outbound communication to minimize brain dead BG/L interference we have a signal to prioritize this.

If we're done with all the paircalc work in this loop (iteration), then cleanup

Definition at line 2459 of file ckPairCalculator.C.

References actionType, amPhantom, cb_aid, cb_ep, cb_ep_tol, cfg, cleanupAfterBWPath(), grainSizeX, grainSizeY, cp::paircalc::pcConfig::isSymmetric, mCastGrpId, mynewData, numOrtho, numPoints, numRecdBW, othernewData, otherResultCookies, and resultCookies.

void PairCalculator::sendBWResultDirect ( sendBWsignalMsg msg)

Entry Method.

Determine the callback to use

Off-diagonal PCs in the symm (and asymm, in dynamics) instance, have to handle othernewdata

If we're done with all the paircalc work in this loop (iteration), then cleanup

Definition at line 2364 of file ckPairCalculator.C.

References actionType, amPhantom, cb_aid, cb_ep, cb_ep_tol, cfg, cleanupAfterBWPath(), grainSizeX, grainSizeY, cp::paircalc::pcConfig::isSymmetric, mynewData, numOrtho, numPoints, numRecdBW, othernewData, and cp::paircalc::pcConfig::resultMsgPriority.

void PairCalculator::sendTiles ( bool  flag_dp)

Contribute orthoGrainSized tiles of data (that are ready?) to the corresponding ortho chares.

Todo:
: The only use of flag_dp has been commented out. Check if this argument is still needed, else weed it out.

Definition at line 738 of file ckPairCalculator.C.

References cfg, cp::paircalc::pcConfig::isSymmetric, mCastGrpIdOrtho, numOrtho, numOrthoCol, orthoCB, orthoCookies, cp::paircalc::pcConfig::orthoGrainSize, orthoGrainSizeRemX, orthoGrainSizeRemY, outTiles, and touchedTiles.

Member Data Documentation

internalType* PairCalculator::inDataLeft
private

or the C=-1 inRight orthoT +c in dynamics

the input pair to be transformed

Definition at line 462 of file ckPairCalculator.h.

Referenced by acceptLeftData(), bwMultiplyHelper(), cleanupAfterBWPath(), PairCalculator(), pup(), and ~PairCalculator().


The documentation for this class was generated from the following files: