The primary goal of the parallel debugger is to provide an integrated debugging environment which allows the programmer to examine the changing state of the parallel program during the course of its execution.
The CHARM++ debugging system has a number of useful features for CHARM++ programmers. The system includes a Java GUI client program which runs on the programmer's desktop, and a CHARM++ parallel program which acts as a server. The client and server need not be on the same machine, and communicate over the network using a secure protocol described in http://charm.cs.uiuc.edu/manuals/html/converse/5_CONVERSE_Client_Server_In.html
The system provides the following features:
The debugging client provides these features via extensive support built into the CHARM++ runtime.
Every version of CHARM++ ships with a precompiled version of the debugger, located under charm/java/bin/charmdebug. This compiled version is built with Java 1.4.2.
If you need to rebuild the debugger, check out a fresh copy of charm and then run:
cd charm/java; ant clean; ant;
This will recreate charm/bin/charmdebug for your Java version.
No instrumentation is required to use the CHARM++ debugger. Because the debugger is CCS based, you can use it on any CHARM++ application to set and step through entry point breakpoints and to examine CHARM++ structures.
Nevertheless, for some features to be present some additional options might be required at either compile or link time:
The Record Replay feature is independent of the charmdebug application. It is a mechanism for detecting bugs that occur only occasionally, depending on the order in which messages are processed. The program under consideration is first run in record mode, which produces a trace. When the program is later run in replay mode, it uses the trace from a previous record run to ensure that messages are processed in the same order as in the recorded run. The idea is to use a message sequence number; a theorem states that the serial numbers will be the same if the messages are processed in the same order. [#!rashmithesis!#]
To enable the required tracing for record and replay, a CHARM++ program is linked with the option ``-tracemode recordreplay'' and run with the ``+record'' option, which records messages in order in a file for each processor. The same execution order can be replayed using the ``+replay'' runtime option, which can be used at the same time as the other debugging tools in CHARM++.
Note! If your CHARM++ is built with CMK_OPTIMIZE on, all tracing will be disabled. So, use an unoptimized CHARM++ to do your debugging.
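The record and replay idea described above can be illustrated with a small standalone sketch. It is not the CHARM++ implementation: the names MsgId, Recorder and Replayer are hypothetical, and a message is assumed to be identified by a (source PE, sequence number) pair. In record mode the processing order is logged; in replay mode a message that arrives too early is held back until every message recorded before it has been processed.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical message tag: (source PE, per-source sequence number).
using MsgId = std::pair<int, int>;

// Record mode: remember the order in which messages were processed.
class Recorder {
public:
    void process(const MsgId &id) { trace_.push_back(id); }
    const std::vector<MsgId> &trace() const { return trace_; }
private:
    std::vector<MsgId> trace_;
};

// Replay mode: release messages strictly in the recorded order.
class Replayer {
public:
    explicit Replayer(std::vector<MsgId> trace) : trace_(std::move(trace)) {}

    // Called when a message arrives. Returns the messages that may now be
    // processed, in recorded order (empty if the next expected one is missing).
    std::vector<MsgId> arrive(const MsgId &id) {
        assert(pending_.count(id) == 0);  // each message is assumed to arrive once
        pending_.insert(id);
        std::vector<MsgId> ready;
        // erase() returns 1 only if the next expected message has arrived.
        while (next_ < trace_.size() && pending_.erase(trace_[next_]) == 1) {
            ready.push_back(trace_[next_]);
            ++next_;
        }
        return ready;
    }

    bool done() const { return next_ == trace_.size(); }

private:
    std::vector<MsgId> trace_;   // order captured by the record run
    std::size_t next_ = 0;       // index of the next message to release
    std::set<MsgId> pending_;    // messages that arrived too early
};
```

A replay run would construct a Replayer from the trace of an earlier Recorder and pass every incoming message through arrive(), processing only what it returns.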
To run an application locally via the debugger on 4 PEs with command line options for your pgm (e.g. opt1 opt2):
charmdebug pgm +p4 4 opt1 opt2
If the application should be run in a remote cluster behind a firewall, the previous command line will become:
charmdebug -host cluster.inst.edu -user myname -sshtunnel pgm +p4 4 opt1 opt2
Charmdebug can also be executed without any parameters. The user can then choose the application to launch and its command line parameters from within the File menu as shown in Figure 1.
Note: charmdebug command line launching only works on net-* builds of CHARM++
To replay a previously recorded session:
charmdebug pgm +p4 opt1 opt2 +replay
When using the charm debugger to launch your application, it will automatically set these to defaults appropriate for most situations.
Note: If you are using the charm debugger, it is usually best to control the sequential (i.e. gdb) debuggers from within its GUI.
The preceding pair of options, +record and +replay, is used to produce the ``instant replay'' feature. This feature is valuable for catching errors which only occur sporadically. Such bugs, which arise from the nondeterminacy of parallel execution, can be fiendishly difficult to replicate in a debugging environment. Typical usage is to keep running the application with +record until the bug occurs, and then to run the application under the debugger with the +replay option.
Charmdebug is currently limited to applications started directly by the debugger due to implementation peculiarities. It will be extended to support connection to remote running applications in the near future.
Due to the current implementation, the debugging tool is limited to net-* versions. Other builds of CHARM++ might have unexpected behaviour. In the near future this will be extended at least to the mpi-* versions.
As per Rashmi's thesis: There are some unique issues for replay in the context of Charm because it provides high-level support for dynamic load balancing, quiescence detection and information sharing. Many of the load balancing strategies in Charm have a spontaneous component. The strategy periodically checks the sizes of the queues on the local processor. A replay load balancing strategy implements the known load redistribution. The behavior of the old balancing strategy is therefore not replayed; only its effect is. Since minimal tracing is used by the replay mechanism, the amount of perturbation due to tracing is reduced. The replay mechanism is proposed as a debugging support to replay asynchronous message arrival orders.
Moreover, if your application crashes without a clean shutdown, the log may be lost with the application.
Once the debugger's GUI loads, the programmer starts program execution by clicking the Start button. When the debugger is launched from the command line, the application starts automatically. The program freezes at the onset and displays the user and system entry points as a list of check boxes. The user can set breakpoints by clicking on the corresponding entry points and then resume execution by clicking the Continue button. The program freezes whenever a breakpoint is reached; Figure 2 shows a snapshot of the debugger at that point.
Clicking the Freeze button during execution freezes the program, while the Continue button resumes it. The Quit button aborts execution at any time. Entities (for instance, array elements) and their contents on any processor can be viewed at any point during execution, as illustrated in Figure 3.
Specific individual processes of the CHARM++ program can be attached to instances of gdb as shown in Figure 4. The programmer chooses which PEs to connect gdb processes to via the checkboxes on the right side. Note! While the program is suspended in gdb for step debugging, the high-level features such as object inspection will not work.
CHARM++ objects can be examined via the View Entities on PE : Display selector. It allows the user to choose from Charm Objects, Array Elements, Messages in Queue, Readonly Variables, Readonly Messages, Entry Points, Chare Types, Message Types and Mainchares. The right-side selector sets the PE on which the display request will be made. The user may then click on an entity to see its details.
The menu option Action → Memory allows the user to display the entire memory layout of a specific processor. An example is shown in Figure 5. This layout is colored and the colors have the following meaning:
Currently it is not possible to change this color association. The bottom part of the view shows the stack trace at the moment when the highlighted (yellow) memory slot was allocated. By left clicking on a particular slot, this slot is fixed in highlight mode. This allows a more accurate inspection of its stack trace when this is large and does not fit the window.
Info → Show Statistics will display a small information box like the one in Figure 6.
A useful tool of this view is the memory leak search, located in the menu Action → Search Leaks. The processor under inspection runs a reachability test on every allocated memory slot to find out whether there is a pointer to it. If there is none, the slot is partially colored in green to indicate that it is a leak. The user can then inspect these slots further. Figure 7 shows some leaks being detected.
If the memory window is kept open while the application is unfrozen and makes progress, the loaded image will become obsolete. To cope with this, the ``Update'' button refreshes the view to the current allocation status. All the leaks that had already been found will still be partially colored in green, while newly allocated slots will not, even if they are leaking. To update the leak status, re-run the Search Leaks tool.
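The reachability test can be pictured with the following standalone sketch. It is only an illustration under stated assumptions: the heap is simulated as a list of slots holding pointer-sized words, the set of root values (standing in for stacks and globals) is supplied by the caller, and any word whose value falls inside a slot is conservatively treated as a pointer to it. The names Slot, findSlot and searchLeaks are hypothetical, and the test the debugger actually performs may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A simulated allocated memory slot.
struct Slot {
    std::uintptr_t base;                // simulated start address
    std::vector<std::uintptr_t> words;  // contents, one entry per pointer-sized word
    bool reachable;                     // set by searchLeaks()
};

// Index of the slot whose address range contains addr, or -1 if none does.
inline int findSlot(const std::vector<Slot> &heap, std::uintptr_t addr) {
    for (std::size_t i = 0; i < heap.size(); ++i) {
        std::uintptr_t end =
            heap[i].base + heap[i].words.size() * sizeof(std::uintptr_t);
        if (addr >= heap[i].base && addr < end) return static_cast<int>(i);
    }
    return -1;
}

// Marks every slot reachable from the roots and returns the indices of the
// slots that no pointer leads to (the leak candidates).
inline std::vector<int> searchLeaks(std::vector<Slot> &heap,
                                    const std::vector<std::uintptr_t> &roots) {
    std::vector<int> work;  // reachable slots whose contents are not scanned yet
    auto visit = [&](std::uintptr_t value) {
        int i = findSlot(heap, value);
        if (i >= 0 && !heap[i].reachable) {
            heap[i].reachable = true;
            work.push_back(i);
        }
    };

    for (Slot &s : heap) s.reachable = false;
    for (std::uintptr_t r : roots) visit(r);
    while (!work.empty()) {
        int i = work.back();
        work.pop_back();
        assert(heap[i].reachable);
        for (std::uintptr_t w : heap[i].words) visit(w);
    }

    std::vector<int> leaks;
    for (std::size_t i = 0; i < heap.size(); ++i)
        if (!heap[i].reachable) leaks.push_back(static_cast<int>(i));
    return leaks;
}
```

Note that a slot pointing only to itself is still reported, because nothing reachable from the roots refers to it.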
Finally, when a specific slot is highlighted, the menu Action → Inspect opens a new window displaying the content of the memory in that slot, as interpreted by the debugger (see next subsection for more details on this).
Without any rewriting of the application code, CharmDebug is capable of loading a raw area of memory and parsing it with a given type name. The result (as shown in Fig. 8) is a browsable tree. The initial type of a memory area is given by its virtual table pointer (CHARM++ objects are virtual and therefore loadable). For memory slots that do not contain classes with virtual methods, no display is possible.
When the view is open and displaying a type, right clicking on a leaf that contains a pointer to another memory location brings up a popup menu that allows the user to ask for its dereference (shown in Fig. 8). In this case, CharmDebug loads this raw data as well and parses it with the type name of the pointer. The dereferenced data is inlined, and the leaf becomes an internal node of the browse tree.
The following classes in the PUP framework were used in implementing debugging support in charm.
class PUP::er - This class is the abstract superclass of all the other classes in the framework. The pup method of a particular class takes a reference to a PUP::er as parameter. This class has methods for dealing with all the basic C++ data types. All these methods are expressed in terms of a generic pure virtual method. Subclasses only need to provide the generic method.
class PUP::toText - This is a subclass of the PUP::toTextUtil class which is a subclass of the PUP::er class. It copies the data of an object to a C string, including the terminating NULL.
class PUP::sizerText - This is a subclass of the PUP::toTextUtil class which is a subclass of the PUP::er class. It returns the number of characters including the terminating NULL and is used by the PUP::toText object to allocate space for building the C string.
The code below shows a simple class declaration that includes a pup method.
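#include "pup.h"  // PUP framework header from the CHARM++ distribution

class foo {
 private:
  bool isBar;
  int x;
  char y;
  unsigned long z;
  float q[3];

 public:
  // Describes every data member to the given PUP::er.
  void pup(PUP::er &p) {
    p(isBar);
    p(x); p(y); p(z);
    p(q, 3);  // array form: pointer plus element count
  }
};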
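The relationship between PUP::sizerText and PUP::toText described above (a sizing pass followed by a writing pass over the same pup method) can be imitated with a small standalone sketch. It deliberately does not use the real PUP headers: MiniEr, MiniSizerText, MiniToText, Foo and fooToText are hypothetical stand-ins, and the space-separated output format is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for the abstract superclass: typed entry points that are all
// expressed in terms of one generic pure virtual method.
class MiniEr {
public:
    virtual ~MiniEr() {}
    void operator()(bool &v) { text(v ? "true" : "false"); }
    void operator()(int &v) { text(std::to_string(v)); }
    void operator()(char &v) { text(std::string(1, v)); }
    void operator()(float *a, int n) {
        for (int i = 0; i < n; ++i) text(std::to_string(a[i]));
    }
protected:
    virtual void text(const std::string &s) = 0;  // the generic method
};

// First pass: only counts characters, including the terminating NUL.
class MiniSizerText : public MiniEr {
public:
    std::size_t size() const { return chars_ + 1; }
protected:
    void text(const std::string &s) override { chars_ += s.size() + 1; }
private:
    std::size_t chars_ = 0;
};

// Second pass: writes the same text into a caller-provided buffer.
class MiniToText : public MiniEr {
public:
    explicit MiniToText(char *buf) : cur_(buf) { *cur_ = '\0'; }
protected:
    void text(const std::string &s) override {
        std::memcpy(cur_, s.data(), s.size());
        cur_ += s.size();
        *cur_++ = ' ';   // separator after each value
        *cur_ = '\0';    // keep the buffer a valid C string
    }
private:
    char *cur_;
};

// A class with a pup method, modeled on the declaration above.
struct Foo {
    bool isBar;
    int x;
    char y;
    float q[3];
    void pup(MiniEr &p) { p(isBar); p(x); p(y); p(q, 3); }
};

// Size first, allocate, then write: the same pup method drives both passes.
inline std::string fooToText(Foo &f) {
    MiniSizerText sizer;
    f.pup(sizer);
    std::vector<char> buf(sizer.size());
    MiniToText writer(buf.data());
    f.pup(writer);
    assert(std::strlen(buf.data()) + 1 == sizer.size());
    return std::string(buf.data());
}
```

The point of the design is that the class describes its fields once, in pup, and each subclass decides what to do with them.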
The Converse Client-Server (CCS) module enables Converse [#!InterOpIPPS96!#] programs to act as parallel servers, responding to requests from non-Converse programs. The CCS module is split into two parts - client and server. The server side is used by a Converse program while the client side is used by arbitrary non-Converse programs. A CCS client accesses a running Converse program by talking to a server-host which receives the CCS requests and relays them to the appropriate processor. The server-host is charmrun [#!charmman!#] for net-versions and is the first processor for all other versions.
In the case of the net-version of CHARM++, a Converse program is started as a server by running the CHARM++ program with the additional runtime option ``++server''. This opens the CCS server on an arbitrary TCP port. The TCP port number can be specified using the command-line option ``++server-port''. A CCS client connects to a CCS server, asks a server PE to execute a pre-registered handler, and receives the response data. The function CcsConnect takes a pointer to a CcsServer as an argument and connects to the given CCS server. The functions CcsNumNodes, CcsNumPes and CcsNodeSize, implemented as part of the client interface in CHARM++, return information about the parallel machine. The function CcsSendRequest takes a handler ID and the destination processor number as arguments and asks the server to execute the particular handler on the specified processor. CcsRecvResponse receives a response to the previous request in-place. It also takes a timeout, the number of seconds to wait for a response: if the timeout expires the function returns 0, otherwise it returns the number of bytes received.
Once a request arrives on a CCS server socket, the CCS server runtime looks up the appropriate registered handler and calls it. If no handler is found, the runtime prints a diagnostic and ignores the message. If the CCS module is disabled in the core, all CCS routines become macros returning 0. The function CcsRegisterHandler is used to register handlers in the CCS server. A handler ID string and a function pointer are passed as parameters, and the runtime builds a table that maps handler ID strings to the corresponding function pointers. Various built-in functions are provided which can be called from within a CCS handler. The debugger behaves as a CCS client, invoking appropriate handlers which make use of some of these functions. Some of the built-in functions are as follows.
CcsSendReply - This function sends the data provided as an argument back to the client as a reply. This function can only be called from a CCS handler invoked remotely.
CcsDelayReply - This call is made to allow a CCS reply to be delayed until after the handler has completed.
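The handler lookup described above (a table from handler ID strings to functions, with a diagnostic when no handler is registered) can be sketched in isolation as follows. This is not the CCS API: Handler, HandlerTable, registerHandler and dispatch are hypothetical names, and the request and reply are modeled as plain strings rather than network messages.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a handler: request bytes in, reply bytes out.
using Handler = std::function<std::string(const std::string &)>;

class HandlerTable {
public:
    // Associates a handler ID string with a function.
    void registerHandler(const std::string &id, Handler fn) {
        assert(fn && "handler must be callable");
        table_[id] = std::move(fn);
    }

    // Looks up the handler for id and calls it. If none is registered, a
    // diagnostic is printed, the request is ignored, and reply is untouched.
    bool dispatch(const std::string &id, const std::string &request,
                  std::string &reply) const {
        auto it = table_.find(id);
        if (it == table_.end()) {
            std::fprintf(stderr, "no handler registered for '%s'; request ignored\n",
                         id.c_str());
            return false;
        }
        reply = it->second(request);
        return true;
    }

private:
    std::unordered_map<std::string, Handler> table_;
};
```

In this picture the debugger plays the role of the caller of dispatch, naming the handler it wants by its ID string.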
The CCS runtime system provides several built-in CCS handlers, which are available to any Converse program. All CHARM++ programs are essentially Converse programs. ccs_getinfo takes an empty message and responds with information about the parallel job. Similarly the handler ccs_killport allows a client to be notified when a parallel run exits.
November 23, 2009
CharmDebug Homepage
Charm Homepage