The Graphs window (see figure 2) lets you analyze your data by breaking it into any number of intervals and looking at what goes on in each of those intervals.
When the Graph Window first appears, a dialog box also appears. It asks for the following information (please refer to 4.4 for information on special features you can use involving the various fields):
Standard PROJECTIONS dialog options and buttons are also available (see 4.4 for details).
The following menu items are available:
The time needed to analyze your data depends on several factors, including the number of processors, the number of entries, and the number of intervals you have selected. A progress meter shows the amount of data loaded so far. The meter does not, however, report rendering progress, which is determined mainly by the number of intervals selected. As a rule of thumb, limit the number of intervals to 1,000 or fewer.
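As a rough illustration of the rule of thumb above, the following Python sketch picks an interval size that keeps the number of intervals at or below 1,000 for a given time range. The function names and the use of microseconds as the time unit are assumptions made for this example; they are not part of PROJECTIONS.

```python
import math

MAX_INTERVALS = 1000  # rule-of-thumb limit quoted in the text


def interval_size_for(start_us, end_us, requested_size_us, max_intervals=MAX_INTERVALS):
    """Return an interval size (in microseconds) that yields at most max_intervals.

    The requested size is kept when it already satisfies the limit;
    otherwise it is enlarged just enough to meet it.
    """
    span = end_us - start_us
    if span <= 0 or requested_size_us <= 0:
        raise ValueError("need a positive time range and a positive interval size")
    smallest_allowed = math.ceil(span / max_intervals)
    return max(requested_size_us, smallest_allowed)


def interval_count(start_us, end_us, size_us):
    """Number of intervals needed to cover the range with the given size."""
    return math.ceil((end_us - start_us) / size_us)
```

For example, a 5-second range with a requested 1 ms interval would be widened to 5 ms intervals, giving exactly 1,000 of them.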
The Graph Window has 3 components in its display:
Clicking on ``Apply'' updates the graph with your choices. Clicking on ``Select All'' chooses the entire processor range. When you select more than one processor's worth of data to display, the graph shows the desired information summed across all selected processors. The exception is processor utilization data, which is always displayed as an average across all selected processors.
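The difference between summed and averaged display can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above; the function name and the list-of-lists data layout are assumptions, not the tool's internal representation.

```python
def combine_across_processors(per_processor, is_utilization=False):
    """Combine per-interval values from several processors into one series.

    per_processor: one list of per-interval values for each selected processor.
    Values are summed per interval, except utilization, which is averaged.
    """
    if not per_processor:
        return []
    n_intervals = len(per_processor[0])
    if any(len(row) != n_intervals for row in per_processor):
        raise ValueError("every processor needs the same number of intervals")
    # zip(*rows) walks the data interval by interval instead of processor by processor
    totals = [sum(column) for column in zip(*per_processor)]
    if is_utilization:
        return [total / len(per_processor) for total in totals]
    return totals
```

With two processors reporting utilizations of 50% and 100% in the same interval, the combined value is 75%, whereas message counts of 1 and 3 would be shown as 4.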
The Timeline window (see figure 3) lets you look at what a specific processor is doing at each moment of the program.
When opening a Timeline view, a dialog box appears. The box asks for the following information (Please refer to 4.4 for information on special features you can use involving the various fields):
Standard PROJECTIONS dialog options and buttons are also available (see 4.4 for details).
Warning: You might have to click ``update'' before being able to click OK on the dialog box. This is a known bug.
The following menu options are available:
Other color schemes are provided that can be used for some applications. The colors set as described above are the default coloring scheme. Other options for coloring the events are by event id (chare array index), by user-supplied parameter, or by memory usage. To color by a user-supplied parameter such as timestep, the C function traceUserSuppliedData(int value); should be called within some entry methods. When it is called in an entry method, that entry method invocation can be colored by the parameter. The user-supplied data can also be viewed in the tooltip that appears when the cursor hovers over an entry method invocation in the window. To color by memory usage, the C function traceMemoryUsage(); should be called in all entry methods. The call records the current memory usage. Red indicates high memory usage, and green indicates low memory usage. The actual memory usage can also be viewed in the tooltips that appear when the cursor is over an event. Memory usage is only available when using a Charm++ version that uses gnu memory.
The Timeline Window consists of two parts:
This is where the timelines are displayed and is the largest portion of the window. The time axis is displayed at the top of the panel. The left side of the panel shows the processor labels, each containing a processor number and two further numbers. These two numbers represent the percentage of the loaded timeline during which work occurs. The first of the two numbers is the ``non-idle'' time, i.e. the portion of the time in the timeline not spent in idle regions. This includes both time for entry methods and other uninstrumented time, likely spent in the Charm++ runtime. The second number is the percentage of the time used by the entry methods for the selected range.
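The two label percentages can be illustrated with a small Python sketch. It assumes that the loaded range, the idle time, and the entry method time are all available in microseconds for one processor; the function name and inputs are hypothetical and only show the arithmetic described above.

```python
def label_percentages(range_us, idle_us, entry_method_us):
    """Return (non_idle_percent, entry_method_percent) for one processor.

    range_us:         length of the loaded timeline range
    idle_us:          time spent in idle regions within that range
    entry_method_us:  time spent executing entry methods within that range
    """
    if range_us <= 0:
        raise ValueError("the loaded range must be positive")
    non_idle = 100.0 * (range_us - idle_us) / range_us
    entry_methods = 100.0 * entry_method_us / range_us
    return non_idle, entry_methods
```

The gap between the two numbers corresponds to uninstrumented time: in a 1,000 microsecond range with 250 microseconds idle and 600 microseconds in entry methods, the label would read 75 and 60, leaving 15% unaccounted for by entry methods.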
The timeline itself consists of colored bars for each event. Placing the cursor over any of these bars will display information about the event including: the name, the begin time, the end time, the total time, the time spent packing, the number of messages it created, and which processor created the event.
Left-clicking on an event bar opens a pop-up window. This window contains detailed information about the messages sent by the clicked event.
Right clicking on an event bar will cause a line to be drawn to the beginning of the event bar from the point where the message causing the event originated. This option may not be applicable for threaded events. If the message originated on a processor not currently included in the visualization, the other processor will be loaded, and then the message line will be drawn. A warning message will appear if the message origination point is outside the time duration, and hence no line will be drawn.
User events are displayed as thin bars above the ordinary event bars in the display area.
Message pack times and send points can be displayed below the event bars. The message sends are small white tick marks, while the message pack times are small pink bars usually occurring immediately after the message send point. If zoomed in to a point where each microsecond takes more than one pixel, the message send point and the following packing time may appear disconnected. This is an inherent problem with the granularity used for the logfiles.
Most of the controls in this panel are self-explanatory, but one deserves mention.
View User Event - Checking this box will bring up a new window showing the string description, begin time, end time and duration of all user events on each processor. You can access information on user events on different processors by accessing the numbered tabs near the top of the display.
Various features appear when the user moves the mouse cursor over the top axis. A vertical line will appear to highlight a specific time. The exact time will be displayed at the bottom of the window. Additionally, a user can select a range by clicking while a time is highlighted and dragging to the left or right of that point. As a selection is being made, a vertical white line will mark the beginning and end of the range. Between these lines, the background color for the display will change to gray to better distinguish the selection from the surrounding areas. After a selection is made, its duration is displayed at the bottom. A user can zoom into the selection by clicking the ``Zoom Selected'' button. To release a selection, single-click anywhere along the axis. Clicking ``Load Selected'' when a selection is active will cause the timeline range to be reloaded. To zoom out, the ``<<'' or ``Reset'' button can be used.
To zoom into the selected area via this interface, click on either the ``Zoom Selected'' or the ``Load Selected'' button. The difference between the two is that ``Load Selected'' zooms into the selected area and discards any events that are outside the time range. This is more efficient than ``Zoom Selected'', as the latter draws all the events on a virtual canvas and then zooms into the canvas. The disadvantage of using ``Load Selected'' is that it becomes impossible to zoom back out without re-specifying the time range via the ``Select Ranges'' button.
Performance-wise, this is the most memory-intensive part of the visualization tool. The load and zoom times are proportional to the number of events displayed. The user should be aware of how event-intensive the application is over the desired time-period before proceeding to use this view. If PROJECTIONS takes too long to load a timeline, cancel the load and choose a smaller time range or fewer processors. We expect to add features to alleviate this problem in future releases.
When the window first comes up, a dialog box appears asking for the processor(s) you want to look at as well as the time range you want to look at. This dialog functions in exactly the same way as for the Timeline tool (see section 4.3.2).
The following menu options are available in this view:
The following components are supported in this view:
If you mouse-over a portion of the bar (with the exception of the black area on top), a pop-up window will appear telling you the name of the item, what percent of the usage it has, and the processor it is on.
The ``Pie Chart'' button generates a pie chart representation (see figure 6) of the same information using averaged statistics but without idle time and communication CPU overheads.
The ``Change Colors'' button lists all entry methods displayed on the main display and their assigned colors. It allows you to change those assigned colors to aid in highlighting entry methods.
The resource consumption of this view is moderate. Load times and visualization times should be relatively fast, but dismissing the tool may result in a very slight delay while PROJECTIONS reclaims memory through Java's garbage collection system.
The communication tool (see figure 7) visualizes communication properties on each processor over a user-specified time range.
The dialog box of the tool allows you to specify the time period within which to load communication characteristics information. This dialog box is exactly the same as that of the Timeline tool (see section 4.3.2).
The main component employs the standard capabilities provided by PROJECTIONS' standard graph (see 4.4).
The control panel allows you to switch between the following communication characteristics:
This view uses memory proportional to the number of processors selected.
The communication over time tool (see figure 8) visualizes communication properties summed over all processors and displays them over a user-specified time range on the x-axis.
The dialog box of the tool allows you to specify the time period within which to load communication characteristics information. This dialog box is exactly the same as that of the Communication tool (see section 4.3.4).
The main component employs the standard capabilities provided by PROJECTIONS' standard graph (see 4.4).
The control panel allows you to switch between the following communication characteristics:
This view has no known problems loading any range or volume of data.
This window (see figure 9) shows a log file translated from its raw numeric form into a verbose, readable version. A dialog box asks which processor you want to look at. After choosing and pressing OK, the translated version appears. Note that this is not a standard processor field. This tool will only load exactly one processor's data.
Each line has:
This tool has the following menu options:
The tool has 2 buttons. ``Open File'' reloads the dialog box (described above) and allows the user to select a new processor's data to be loaded. ``Close Window'' closes the current window.
This module (see figure 10) allows you to examine the performance property distribution of all your entry points (EPs). It gives a histogram of the number of EPs whose values for the following properties fall into each property bin:
The dialog box for this view asks the user for the following information (please refer to 4.4 for information on special features you can use involving the various fields):
The dialog box reports the selection of bins as specified by the user by displaying the minimum bin size (in units - microseconds or bytes) to the maximum bin size. ``units'' refer to microseconds for time-based histograms or bytes for histograms representing message sizes.
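The binning behind such a histogram can be sketched as follows. This is a minimal Python illustration, assuming fixed-width bins that begin at a starting value plus one extra overflow bin at the end for values beyond the last regular bin; the parameter names are hypothetical and do not correspond to the actual dialog fields.

```python
def histogram(values, num_bins, bin_size, start=0):
    """Count values into num_bins fixed-width bins plus one overflow bin.

    values:   measurements in units (microseconds or bytes)
    num_bins: number of regular bins
    bin_size: width of each regular bin, in the same units
    start:    lower bound of the first bin; smaller values are ignored
    """
    if num_bins < 1 or bin_size <= 0:
        raise ValueError("need at least one bin and a positive bin size")
    counts = [0] * (num_bins + 1)  # the last slot is the overflow bin
    for value in values:
        if value < start:
            continue
        index = int((value - start) // bin_size)
        counts[min(index, num_bins)] += 1
    return counts
```

With three bins of 10 units each, the values 5, 15, 25 and 1000 would land in one bin each, with 1000 falling into the overflow bin.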
Standard graph features can be employed for the main display of this view (see section 4.4).
The following menu items are available in this tool:
The following options are available in the control panel in the form of toggle buttons:
The use of the tool is somewhat counterintuitive. The dialog box is created immediately, and when the tool window is created, it defaults to a time-based histogram. You may change this histogram to a message-size-based histogram by selecting the ``Message Size'' radio button, which then updates the graph using the same parameters provided in the dialog box. This issue will be fixed in upcoming editions of PROJECTIONS.
The following features are, as of this writing, not implemented. They will be ready in a later release of PROJECTIONS.
The ``Select Entries'' button is intended to bring up a color selection and filtering window that allows you to filter entry methods out of the count. This offers more control over the analysis (e.g. when you already know EP 5 takes 20-30ms and you want to know whether other entry points also take 20-30ms).
The ``Out-of-Range EPs'' button is intended to bring up a table detailing all the entry methods that fall into the overflow (last) bin. This list will, by default, be listed in descending order of time taken by the entry methods.
The performance of this view is affected by the number of bins the user wishes to analyze. We recommend that the user limit the analysis to 1,000 bins or fewer.
Overview (see figure 11(a)) gives users an overview of the utilization of all processors during the execution over a user-specified time range.
The dialog box of the tool allows you to specify the time period within which to load overview information. This dialog box is exactly the same as that of the Timeline tool (see section 4.3.2).
[Overview with dominant Entry Method colors]
This tool provides support for the following menu options:
The view currently hard codes the number of intervals to 7,000 independent of the time-range desired.
Each processor has a row of colored bars in the display, different colors indicating different utilization at that time (white representing 100% utilization). The processor usage of a specific processor at a specific time is displayed in the status bar below the graph. Vertical and horizontal zoom is enabled by two zooming bars to the right of and below the graph. Panning is possible by clicking on any part of the display and dragging the mouse.
The ``by EP colors'' radio button provides more detail by replacing the utilization colors with the colors of the most significant entry method execution time in that time-interval on that processor represented by the cells (as illustrated in figure 11(b)).
The Overview tool uses memory proportional to the number of processors selected. If an out-of-memory error is encountered, try again by skipping processors (e.g. 0-8191:2 instead of 0-8191). This should show the general application structure almost as well as using the full processor range.
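To make the effect of skipping processors concrete, here is a small Python sketch that expands a processor range string. The syntax is inferred only from the example 0-8191:2 above (a range with an optional stride after the colon); the comma-separated form and the function name are assumptions for illustration and may not match the processor field described in section 4.4.

```python
def parse_processor_range(spec):
    """Expand a range string such as "0-8191:2" into a list of processor numbers.

    Assumed syntax: comma-separated parts, each either "N", "A-B" or "A-B:STRIDE".
    """
    processors = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        stride = 1
        if ":" in part:
            part, stride_text = part.split(":", 1)
            stride = int(stride_text)
        if "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
        else:
            low = high = int(part)
        # range() excludes its upper bound, so add 1 to include "high"
        processors.extend(range(low, high + 1, stride))
    return processors
```

Under these assumptions, 0-8191:2 selects 4,096 processors instead of 8,192, which roughly halves the memory needed by a view whose memory use is proportional to the number of processors.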
This window (see figure 12) animates the processor usage over a specified range of time and a specified interval size.
The dialog box to load animation information is exactly the same as that of the Graph tool (see section 4.3.1).
A color temperature bar serves as a legend for the processor utilizations displayed as the animation progresses. The data for each time interval is rendered as one frame. Text at the top of each frame shows the execution time the frame currently represents and the size of an interval.
Each selected processor is laid out in a 2-D plot as close to a square as possible. The view employs a color temperature ranging from blue (cool - low utilization) to bright red (hot - high utilization) to represent utilization.
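The layout and coloring described above can be approximated with the following Python sketch. It assumes a near-square grid chosen from the square root of the processor count and a simple linear blend from blue to red; the real view may choose its grid and color ramp differently, and both function names are hypothetical.

```python
import math


def grid_shape(num_processors):
    """Return (rows, columns) of a grid that is as close to square as possible."""
    if num_processors < 1:
        raise ValueError("need at least one processor")
    columns = math.ceil(math.sqrt(num_processors))
    rows = math.ceil(num_processors / columns)
    return rows, columns


def temperature_color(utilization):
    """Map a utilization in percent (0-100) to an (R, G, B) color.

    0% maps to pure blue (cool), 100% maps to pure red (hot),
    with a linear blend in between. Out-of-range input is clamped.
    """
    fraction = min(max(utilization, 0.0), 100.0) / 100.0
    return (round(255 * fraction), 0, round(255 * (1 - fraction)))
```

For instance, 10 selected processors would be arranged in 3 rows of 4 cells, with the last two cells left empty.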
You may manually update the frames by using the backward and forward buttons to visualize the preceding or next frame respectively. The ``Auto'' button toggles automatic animation given the desired refresh rate.
The ``Frame Refresh Delay'' field allows you to select the real time delay between frames. It is a time-based field (see section 4.4 for special features in using time-based fields).
The ``Set Ranges'' button allows you to set new parameters for this view via the dialog box.
This view has no known performance issues.
The Time Profile view (see figure 13) is a visualization of the amount of time contributed by each entry method summed across all processors and displayed by user-adjustable time intervals.
Time Profile's dialog box is exactly the same as that of the Graph tool (see section 4.3.1).
Standard graph features can be employed for the main display of this view (see section 4.4).
Under the tool options, one may:
Time Profile also reacts to the presence of data about AMPI functions (See section 2.1.3). When such data is detected, an extra tabbed window displays a graph similar to entry method profiles, but for AMPI functions only.
This tool's performance is tied to the number of intervals desired by the user. We recommend that the user stick to visualizing 1,000 intervals or fewer.
The User Events view is essentially a usage profile (See section 4.3.3) of bracketed user events (if any) that were recorded over a specified time range. The x-axis holds bars of data associated with each processor while the y-axis represents the time spent by each user event. Each user event is assigned a color.
It is important to note that user-events can be arbitrarily nested. The view currently displays information based on raw data without regard to the way the events are nested. Memory usage is proportional to the number of processors to be displayed.
For performance logs generated from large numbers of processors, it is often difficult to view in detail the behavior of poorly behaved processors. This view attempts to present information similar to usage profile but only for processors whose behavior is ``extreme''.
``Extreme'' processors are identified through the application of heuristics specific to the attribute that analysts wish to study, applied to a specific activity type. You can specify the number of ``extreme'' processors to be picked out by PROJECTIONS by entering the appropriate number in the field ``Outlier Threshold''. The default is to pick 10% of the total number of processors up to a cap of 20. As an example, an analyst may wish to find ``extreme'' processors with respect to the idle time of normal CHARM++ trace events.
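The default threshold and a very simple selection step can be sketched in Python as follows. The rounding (integer division) and the "largest value first" ranking are assumptions made for this illustration; the actual heuristics in PROJECTIONS depend on the chosen attribute and activity and are not reproduced here.

```python
def default_outlier_threshold(total_processors, percent=10, cap=20):
    """Default number of extreme processors: 10% of the total, capped at 20.

    Rounding down is an assumption of this sketch.
    """
    return min(cap, total_processors * percent // 100)


def pick_extreme(metric_by_processor, count):
    """Return the indices of the `count` processors with the largest metric.

    metric_by_processor: one value per processor, e.g. idle time.
    Ranking by the largest value is a stand-in for the tool's heuristics.
    """
    ranked = sorted(
        range(len(metric_by_processor)),
        key=lambda p: metric_by_processor[p],
        reverse=True,
    )
    return ranked[:count]
```

With 100 processors the default would be 10; from 200 processors upward the cap of 20 applies.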
Figure 15 shows the choices available to this tool. Specific to this view are two pull-down menus: Attribute and Activity.
There are four Activity options:
There are four Attribute options:
At the same time, a k-means clustering algorithm is applied to the data to help identify processors with exemplar behavior that is representative of each cluster (or equivalence class) identified by the algorithm. You can control the value of k by filling in the appropriate number in the field ``Number of Clusters''. The default value is 5.
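For readers unfamiliar with k-means, the following Python sketch applies the standard Lloyd iteration to a single value per processor and then picks, for each cluster, the processor closest to the cluster center as its exemplar. It is a simplified stand-in: the single-value input, the deterministic initialization, and the function names are assumptions of this example, and the data PROJECTIONS actually clusters may have more dimensions.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar values into k groups with Lloyd's algorithm.

    Returns (centers, clusters), where clusters[c] lists the indices
    of the values assigned to center c.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic start: spread the initial centers over the sorted data.
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for index, value in enumerate(values):
            nearest = min(range(k), key=lambda c: abs(value - centers[c]))
            clusters[nearest].append(index)
        # Move each center to the mean of its members; keep empty clusters in place.
        new_centers = [
            sum(values[i] for i in members) / len(members) if members else centers[c]
            for c, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


def exemplars(values, centers, clusters):
    """For each non-empty cluster, return the member closest to its center."""
    return [
        min(members, key=lambda i: abs(values[i] - centers[c]))
        for c, members in enumerate(clusters)
        if members
    ]
```

A larger k yields more, smaller equivalence classes and therefore more exemplar bars in the resulting chart.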
Applying the required heuristics to the appropriate attribute and activity types produces a chart similar to figure 16. This is essentially a usage profile that shows, over the user's selected time range, from left to right:
The tool helps the user reduce the number of processor bars that must be visually examined in order to identify candidates for more detailed study. To further this goal, if the analyst has the timeline view (see section 4.3.2) open, a mouse-click on any of the processor activity profile bars (except for group-averaged bars) will load that processor's detailed timeline (over the time range specified in the timeline view) into the timeline view itself.
In the case of AMPI functions, the events are properly nested. The information displayed is currently that of inclusive time (i.e. if function B's calls are nested within function A's, the time spent in function B contributes to both function B's and function A's displayed performance information). There are plans to implement the presentation of AMPI function information based on exclusive time (i.e. the time within a function is computed by subtracting the time spent in any calls to nested functions from the measured time spent in that function).
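The distinction between inclusive and exclusive time can be illustrated with a short Python sketch. It assumes each call is recorded as a (name, begin, end) tuple and that calls are properly nested, as stated above; the data layout and function name are hypothetical and do not reflect the log format.

```python
def inclusive_and_exclusive(calls):
    """Compute per-function inclusive and exclusive time from nested calls.

    calls: list of (name, begin, end) tuples that are properly nested.
    Returns two dictionaries keyed by function name.
    """
    open_calls = []  # stack of calls that have started but not yet ended
    records = []
    # Sort by begin time; on ties the longer (outer) call comes first.
    for name, begin, end in sorted(calls, key=lambda c: (c[1], -c[2])):
        while open_calls and open_calls[-1]["end"] <= begin:
            open_calls.pop()
        record = {"name": name, "end": end, "duration": end - begin, "children": 0}
        if open_calls:
            # The innermost open call is the direct parent of this call.
            open_calls[-1]["children"] += record["duration"]
        open_calls.append(record)
        records.append(record)

    inclusive, exclusive = {}, {}
    for r in records:
        inclusive[r["name"]] = inclusive.get(r["name"], 0) + r["duration"]
        own_time = r["duration"] - r["children"]
        exclusive[r["name"]] = exclusive.get(r["name"], 0) + own_time
    return inclusive, exclusive
```

Note that the inclusive times of nested functions overlap, so they can add up to more than the wall-clock time, while exclusive times add up to the time covered by the outermost calls.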
The AMPI Usage Profile view presents a graph similar to Function Tool's (See section 4.3.14) with several modifications:
The NoiseMiner view (see figures 17 and 18) displays statistics about abnormally long entry methods. Its purpose is to detect symptoms consistent with Operating System Interference, Computational Noise, or Software Interference. The abnormally long events are filtered and clustered across multiple dimensions to produce a concise summary. The view displays both the duration of the events and the rate at which they occur. The initial dialog box allows a selection of processors and a time range. The user should select a time range that ignores any startup phase where events have chaotic durations. The tool makes only a single pass through the log files using a small bounded amount of memory, so the user should select as large a time range as possible.
The tool uses stream mining techniques to produce its results by making only one pass through the input data while using a limited amount of memory. This allows NoiseMiner to be very fast and scalable.
The initial result window shows a list of zero or more noise components. Each noise component is a cluster of events whose durations are abnormally long. The noise duration for each event is computed by comparing the actual duration of the event with an expected duration of the event. Each noise component contains events of different types across one or more processors, but all the events within the noise component have similar noise durations.
Clicking on the ``view'' button for a noise component opens a window similar to figure 18. This second window displays up to 36 miniature timelines, each for a different event associated with the noise component.
NoiseMiner works by storing histograms of each entry method's duration. The histogram bins contain a window of recent occurrences as well as an average duration and count. After the data stream has been parsed into the histogram bins, the histogram bins are clustered to determine the expected entry method duration. The histograms are then normalized by the expected duration so that they represent the abnormally stretched amounts for the entry methods. Then the histogram bins are clustered by duration and across processors. Any clusters that do not contribute much to the overall runtime are dropped.
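The idea of a noise duration, i.e. the amount by which an event exceeds its expected duration, can be shown with a deliberately simplified Python sketch. It uses the median of the observed durations as the expected duration, which is a stand-in chosen for this example; NoiseMiner itself derives the expected duration by clustering histogram bins, which is not reproduced here. The function names and the threshold parameter are hypothetical.

```python
from statistics import median


def noise_durations(durations_us):
    """Return, for each event, how much longer it ran than expected.

    The expected duration is approximated by the median of all
    observed durations of the same entry method (an assumption).
    """
    expected = median(durations_us)
    return [max(0, d - expected) for d in durations_us]


def abnormal_events(durations_us, min_noise_us):
    """Indices of events whose noise duration exceeds min_noise_us."""
    noise = noise_durations(durations_us)
    return [i for i, n in enumerate(noise) if n > min_noise_us]
```

Unlike this sketch, which keeps all durations in memory, the streaming approach described above only retains bounded histogram state per entry method.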
June 29, 2008