Charm++ FEM Framework
Tutorial
Orion Sky Lawlor
olawlor@uiuc.edu
2003/10/20

Roadmap
Why use FEM?
FEM Concepts
FEM Basic Calls
FEM Advanced Calls
Extra Features

"Why use FEM?"
Why use FEM?

Why use the FEM Framework?
Makes parallelizing a serial code faster and easier
Handles mesh partitioning
Handles communication
Handles load balancing (via Charm)
Allows extra features
IFEM Matrix Library
NetFEM Visualizer
Collision Detection Library

Why not use the FEM Framework?
Does not help you write serial code
But it does help with parallel
Another thing to learn
But it has a manual and examples
Another thing to break
But it runs on native MPI as well as AMPI

Scalability of FEM Framework

“Overhead” of Multipartitioning

Load balancer in action

FEM Framework Users: Frac3D
Frac3D fracture mechanics
Basic structures code, but with “cohesive elements” for fracture
Authors: Scott Brietenfeld, Philippe Geubelle

FEM Framework Users: CSAR
Rocflu fluids solver, a part of GENx
Finite-volume fluid dynamics code
Uses FEM ghost elements
Author: Andreas Haselbacher

FEM Framework Users: DG
Dendritic Growth
Simulate metal solidification process
Solves mechanical, thermal, fluid, and interface equations
Implicit, uses BiCG
Adaptive 3D mesh
Authors: Jung-ho Jeong, John Danzig

NetFEM Client: pretty pictures

"FEM Concepts"
FEM Concepts

FEM Basics
FEM programs manipulate elements and nodes
Element is a portion of problem domain, surrounded by nodes
Node is one point in the domain

Serial FEM Mesh

Partitioned Mesh

FEM Parallel Model: Shared Nodes
“Shared Node” model
Element computations based on values of surrounding nodes
Node values are sum of surrounding elements
Example: Mechanical Simulation
Element stresses are computed from locations of element’s nodes
Sum up forces at nodes

FEM Mesh: Node Communication
Summing forces from other processors only takes one call:
FEM_Update_field
Adds values from shared nodes

FEM Parallel Model: Ghosts
“Ghost” model
Element computations based only on values of surrounding nodes and elements
Example: Fluid Dynamics
Element pressures and velocities come from neighboring element pressures and velocities, and node locations

FEM Mesh: Ghosts

FEM Program Structure

Parallelization
Partition the FEM Mesh into multiple chunks
Distribute elements, replicate shared nodes and/or add ghosts
Keep track of communication
Partition so that communication is minimized

Parallel FEM Program

FEM Framework Program
Consists of at least two user-written subroutines
init
driver
init is called on processor 0
driver is called on every chunk

init

driver

Structure of an FEM Application

A Serial Program

A Parallel Framework Program

Real Names of Pieces

FEM Mesh Access Calls
New and
Improved!

Old FEM Mesh Access Routines
FEM_Set_elem
FEM_Set_elem_data
FEM_Set_elem_conn
FEM_Set_node
FEM_Set_node_data
FEM_Set_sparse
FEM_Set_sparse_elem

New FEM Mesh Access Routine (!)
void FEM_Mesh_data(
int mesh,
int entity,
int attr,
void *data,
int first, int length,
int datatype,int width
);

New Mesh Access: Example
FEM_Mesh_data(
mesh,
FEM_NODE,
FEM_DATA+23,
coord,
0,nNodes,
FEM_DOUBLE,3
);

New Mesh Access: Advantages
Novice users have only one routine to learn
User’s read and write routines can be similar or even identical (just like PUP)
Smart users can easily abstract over parameters
E.g., “higher-order function” that sets both ghosts and real elements for any entity type
Library developers can easily, naturally add:
New entities: FEM_SPARSE, FEM_GHOST
New data: FEM_COORD, FEM_NODE_PRIMARY

New Mesh Access: Disadvantages
8 parameters is too many to keep straight
Invalid combinations detected at runtime, instead of compile time as with the old way
E.g., FEM_NODE doesn’t have FEM_CONN, so there’s no FEM_Set_node_conn
Solution: Keep the old routines
But implement them in terms of the new routine
Mix and match with new routine
Needed for backward compatability, too

FEM Communication Calls

Node Fields
Framework handles combining data for shared nodes and keeps them in sync
Framework does not understand meaning of node fields, only their location and types
Framework needs to be informed of locations and types of fields
Create_field once, Update_field every timestep

Create a Field

Update Field: Shared Nodes

FEM Ghost Layers

Ghost Elements: Overview
Most FEM programs communicates via shared nodes, using FEM_Update_field
Some computations require read-only copies of remote elements—“ghosts”
Stencil-type finite volume computations
The push form of matrix-vector product
Many kinds of mesh modification
Ghosts are a recent addition to the FEM framework

Ghosts: 2D Example

Building Ghosts:
Add ghost elements layer-by-layer from init
A chunk will include ghosts of all the elements it is connected to by “tuples”—sets of nodes
For 2D, a tuple might be a 2-node edge
For 3D, a tuple might be a 4-node face
You specify a ghost layer with FEM_Add_ghost_layer(tupleSize,ghostNodes)
ghostNodes indicates whether to add ghost nodes as well as ghost elements.

Building Ghosts:
FEM_Add_ghost_elem(e,t,elem2tuple)
e is the element type
t is the number of tuples per element
elem2tuple maps an element to its tuples:
A tupleSize by t array of integers
Contains element-local node numbers
Repeat this call for each ghost element type

Ghosts: Node adjacency

Ghosts: Edge adjacency

Extracting and Using Ghosts
Ghosts are always given larger numbers than non-ghosts—that is, ghosts are at the end
FEM_Get_node_ghost() and FEM_Get_elem_ghost(e)
Return the index of the first ghost node or element
FEM_Update_ghost_field(fid,e,data)
Obtain other processor’s data (formatted like fid) for each ghost element of type e

Update Field: Ghosts

Ghost Example: Mesh

Ghost Example: Ghost Elements

FEM Installation

Where to Get It ?

How to Build It ?

How to Compile & Link ?
Use “charmc”: available under bin
a multi-lingual compiler driver, understands f90
Knows where modules and libraries are
Portable across machines and compilers
Linking
use “-language femf” : for F90
Use “–language fem” : for C/C++
See example Makefiles
charm/pgms/charm++/fem/…

How to Run ?
Charmrun
A portable parallel job execution script
Specify number of processors: +pN
Specify number of chunks: +vpN
Special “nodelist” file for net-* versions

Example

Advanced FEM Calls

Advanced: FEM Migration

Advanced: Complicated Fields

Node Fields

FEM_Create_Field

Create_field Example

Create_field Example

Advanced: Symmetry Ghosts

Ghosts and Symmetries
In addition to cross-processor ghosts, can build ghosts to model problem symmetries
Translational and rotational periodicities
Mirror symmetry
FEM_Add_linear_periodicity(nFaces,nPer, facesA,facesB, nNodes,nodeLocs)
Identify these two lists of faces under linear periodicity, and build ghosts to match

Symmetry Ghosts: 2D Example

Ghost Elements

Symmetry-Ghost Elements

Ghosts and Symmetries: Update

Advanced: FEM on MPI

Extra FEM Features

Collision Detection

Charm++ Collision Detection
Detect collisions (intersections) between objects scattered across processors

Collision Detection Algorithm:

Serial Scaling

Parallel Scaled Problem

Mesh Adaptation

Parallel Mesh Refinement
To refine, split the longest edge:
But if split neighbor has a longer edge, split his edge first
Refinement propagates across mesh, but preserves mesh quality
Initial 2D parallel implementation built on Charm++
Interfaces with FEM Framework

IFEM: Iterative FEM
Linear Solver Interface
  Brand New!

IFEM: Intro to Matrix-based FEM
Normal computations run element-by-element
Only consider one element at a time
Matrix-based computations are holistic
One big matrix represents the whole system
Solve the matrix == solve the problem
Row of matrix solves for “Degree of Freedom”
E.g., a node’s X coordinate
Typically a row has only a few nonzero entries
Around 106 degrees of freedom--106 rows

106 Rows is a Big  Matrix
Normal matrix operations are O(n3)
Can’t use Gaussian elimination, Jacobi, QR, …
Almost everybody uses iterative solvers
Based on a fast “guess-and-check” method
Each iteration consists of a matrix-vector product
Typically converges in 10 to 100 iterations
Popular iterative solvers:
Conjugate gradient (CG)
Bidirectional Conjugate Gradient (BiCG)
Generalized modified residual (GMRES)

ILSI:  Iterative Linear Solver Interface
Provides solvers with operations they need:
Parallel matrix-vector product (written by user)
Parallel dot-product
class ILSI_Comm {
virtual void matrixVectorProduct(const double *src,double *dest) =0;
virtual double dotProduct(const double *a,const double *b) =0;
};
Iterative solvers use these operations to solve the matrix
Writing iterative solvers: an active research area
Krylov Subspaces, Residuals, Quadratic forms…

Matrix Example:
Hooke’s spring law:
F=-kx
F1=-k(D1-D2)
Linear relation between node force and node displacement
Element corresponds to little 2x2 matrix

Matrix Example: 2 Elements

Matrix Example: Derivation

Matrix Example: FEM

IFEM: Bottom Line
If you can do matrix-vector product, you can run iterative solvers like conjugate gradient
Matrix-vector product amounts to the usual FEM calculation (displacements to forces), plus the usual FEM communication (update shared nodes)
You don’t even need to store the matrix
Users can easily solve iterative problems with the FEM framework and IFEM solvers!

NetFEM

NetFEM Client: pretty pictures

NetFEM: Easy Visualization
Can interact with a running FEM application
Uses Charm++ CCS (Client-server) library
Easy to “publish” attributes

NetFEM: Easy visualization

NetFEM: Zoom in

NetFEM: Outline Elements

NetFEM: Point Nodes

NetFEM Server Side: Overview
To allow the NetFEM client to connect, you add NetFEM registration calls to your server
Register nodes and element types
Register data items: scalars or spatial vectors associated with each node or element
You provide the display name and units for each data item
Link your program with “-module netfem”
Run with “++server”, and connect!

NetFEM Server Side: Setup
n=NetFEM_Begin(FEM_My_partition(),timestep, dim,NetFEM_POINTAT)
Call this each time through your timeloop; or skip
timestep identifies this data update
dim is the spatial dimension—must be 2 or 3
Returns a NetFEM handle n used by everything else
NetFEM_End(n)
Finishes update n

NetFEM Server Side: Nodes
NetFEM_Nodes(n,nnodes,coord,”Position (m)”)
Registers node locations with NetFEM—future vectors and scalars will be associated with nodes
n is the handle returned by NetFEM_Begin
nnodes is the number of nodes
coord is a dim by nnodes array of doubles
The string describes the coordinate system and meaning of nodes
Currently, there can only be one call to nodes

NetFEM: Node Displacement

NetFEM Server Side: Elements
NetFEM_Elements(n,nelem,nodeper, conn,”Triangles”)
Registers elements with NetFEM—future vectors and scalars will be associated with these elements
n is the handle returned by NetFEM_Begin
nelem is the number of elements
nodeper is the number of nodes per element
conn is a nodeper by nelem array of node indices
The string describes the kind of element
Repeat to register several kinds of element
Perhaps: Triangles, squares, pentagons, …

NetFEM: Element Stress

NetFEM Server Side: Vectors
NetFEM_Vector(n,val,”Displacement (m)”)
Registers a spatial vector with each node or element
Whichever kind was registered last
n is the handle returned by NetFEM_Begin
val is a dim by nitems array of doubles
There’s also a more general NetFEM_Vector_field in the manual
The string describes the meaning and units of the vectors
Repeat to register multiple sets of vectors
Perhaps: Displacement, velocity, acceleration, rotation, …

NetFEM: Element Velocity

NetFEM Server Side: Scalars
NetFEM_Scalar(n,val,s,”Displacement (m)”)
Registers s scalars with each node or element
Whichever kind was registered last
n is the handle returned by NetFEM_Begin
val is an s by nitems array of doubles
There’s also a more general NetFEM_Scalar_field in the manual
s is the number of doubles for each node or element
The string describes the meaning and units of the scalars
Repeat to register multiple sets of scalars
Perhaps: Stress, plasticity, node type, damage, …

NetFEM Server Side: 2D Example

NetFEM: Conclusion
Easy, general way to get output from an FEM computation
Client configures itself based on server
Client can be run anywhere (from home!)
Server performance impact minimal (1ms!)
Future work:
Support multiple chunks per processor
Non-network, file-based version
Movie mode

Conclusions: Charm++ FEM
Easy to parallelize existing codes
Flexible: shared nodes or ghosts
High performance
Extra features
Visualization with NetFEM
Collision
Matrix methods with IFEM

Old FEM Mesh Routines

Framework Calls
FEM_Set_Mesh
Called from initialization to set the serial mesh
Framework partitions mesh into chunks
FEM_Create_Field
Registers a node data field with the framework, supports user data types
FEM_Update_Field
Updates node data field across all processors
Handles all parallel communication
Other parallel calls (Reductions, etc.)

FEM_*_Elem

FEM_*_Node

Element Connectivity

Additional Data for Nodes and Elements

Utility