


Orion Sky Lawlor 

olawlor@uiuc.edu 

2003/10/20 




Why use FEM? 

FEM Concepts 

FEM Basic Calls 

FEM Advanced Calls 

Extra Features 








Makes parallelizing a serial code faster and
easier 

Handles mesh partitioning 

Handles communication 

Handles load balancing (via Charm) 

Allows extra features 

IFEM Matrix Library 

NetFEM Visualizer 

Collision Detection Library 





Does not help you write serial code 

But it does help with the parallel code 

Another thing to learn 

But it has a manual and examples 

Another thing to break 

But it runs on native MPI as well as AMPI 







Frac3D fracture mechanics 

Basic structures code, but with “cohesive
elements” for fracture 

Authors: Scott Brietenfeld, Philippe Geubelle 




Rocflu fluids solver, a part of GENx 

Finite-volume fluid dynamics code 

Uses FEM ghost elements 

Author: Andreas Haselbacher 




Dendritic Growth 

Simulate metal solidification process 

Solves mechanical, thermal, fluid, and interface
equations 

Implicit, uses BiCG 

Adaptive 3D mesh 

Authors: Jungho Jeong, John Danzig 








FEM programs manipulate elements and nodes 

Element is a portion of problem domain,
surrounded by nodes 

Node is one point in the domain 







“Shared Node” model 

Element computations based on values of
surrounding nodes 

Node values are sum of surrounding elements 



Example: Mechanical Simulation 

Element stresses are computed from locations of
element’s nodes 

Sum up forces at nodes 






Summing forces from other processors only takes
one call: 

FEM_Update_field 



Adds values from shared nodes 





“Ghost” model 

Element computations based only on values of
surrounding nodes and elements 



Example: Fluid Dynamics 

Element pressures and velocities come from
neighboring element pressures and velocities, and node locations 









Partition the FEM Mesh into multiple chunks 

Distribute elements, replicate shared nodes
and/or add ghosts 

Keep track of communication 

Partition so that communication is minimized 






Consists of at least two user-written
subroutines 

init 

driver 



init is called on processor 0 

driver is called on every chunk 











FEM_Set_elem 

FEM_Set_elem_data 

FEM_Set_elem_conn 





FEM_Set_node 

FEM_Set_node_data 





FEM_Set_sparse 

FEM_Set_sparse_elem 




void FEM_Mesh_data( 
  int mesh,        // mesh handle 
  int entity,      // entity type, e.g. FEM_NODE 
  int attr,        // attribute, e.g. FEM_DATA+23 
  void *data,      // user data array 
  int first, int length,   // range of entities 
  int datatype, int width  // e.g. FEM_DOUBLE, values per entity 
); 






FEM_Mesh_data( 
  mesh, 
  FEM_NODE, 
  FEM_DATA+23, 
  coord, 
  0, nNodes, 
  FEM_DOUBLE, 3 
); 







Novice users have only one routine to learn 

Users’ read and write routines can be similar or
even identical (just like PUP) 

Smart users can easily abstract over parameters 

E.g., a “higher-order function” that sets both
ghosts and real elements for any entity type 

Library developers can easily, naturally add: 

New entities: FEM_SPARSE, FEM_GHOST 

New data: FEM_COORD, FEM_NODE_PRIMARY 





8 parameters is too many to keep straight 

Invalid combinations detected at runtime,
instead of compile time as with the old way 

E.g., FEM_NODE doesn’t have FEM_CONN, so there’s
no FEM_Set_node_conn 



Solution: Keep the old routines 

But implement them in terms of the new routine 

Mix and match with new routine 

Needed for backward compatibility, too 





Framework handles combining data for shared
nodes and keeps them in sync 

Framework does not understand meaning of node
fields, only their location and types 

Framework needs to be informed of locations and
types of fields 

Create_field once, Update_field every timestep 








Most FEM programs communicate via shared nodes,
using FEM_Update_field 

Some computations require read-only copies of
remote elements—“ghosts” 

Stencil-type finite volume computations 

The push form of matrix-vector product 

Many kinds of mesh modification 

Ghosts are a recent addition to the FEM
framework 





Add ghost elements layer-by-layer from init 

A chunk will include ghosts of all the elements
it is connected to by “tuples”—sets of nodes 

For 2D, a tuple might be a 2node edge 

For 3D, a tuple might be a 4node face 

You specify a ghost layer with
FEM_Add_ghost_layer(tupleSize,ghostNodes) 

ghostNodes indicates whether to add ghost nodes
as well as ghost elements. 





FEM_Add_ghost_elem(e,t,elem2tuple) 

e is the element type 

t is the number of tuples per element 

elem2tuple maps an element to its tuples: 

A tupleSize by t array of integers 

Contains element-local node numbers 

Repeat this call for each ghost element type 







Ghosts are always given larger numbers than
non-ghosts—that is, ghosts are at the end 

FEM_Get_node_ghost() and FEM_Get_elem_ghost(e) 

Return the index of the first ghost node or
element 





FEM_Update_ghost_field(fid,e,data) 

Obtain other processors’ data (formatted like fid)
for each ghost element of type e 











Use “charmc”: available under bin 

A multi-language compiler driver that understands F90 

Knows where modules and libraries are 

Portable across machines and compilers 

Linking 

Use “-language femf” for F90 

Use “-language fem” for C/C++ 

See example Makefiles 

charm/pgms/charm++/fem/… 





Charmrun 

A portable parallel job execution script 

Specify number of processors: +pN 

Specify number of chunks: +vpN 

Special “nodelist” file for net* versions 














In addition to cross-processor ghosts, can build
ghosts to model problem symmetries 

Translational and rotational periodicities 

Mirror symmetry 



FEM_Add_linear_periodicity(nFaces,nPer,
facesA,facesB, nNodes,nodeLocs) 

Identify these two lists of faces under linear
periodicity, and build ghosts to match 













Detect collisions (intersections) between
objects scattered across processors 









To refine, split the longest edge: 

But if a split neighbor has a longer edge, split
its edge first 

Refinement propagates across mesh, but preserves
mesh quality 

Initial 2D parallel implementation built on
Charm++ 

Interfaces with FEM Framework 








Normal computations run element-by-element 

Only consider one element at a time 

Matrix-based computations are holistic 

One big matrix represents the whole system 

Solve the matrix == solve the problem 

Each row of the matrix solves for one “degree of freedom” 

E.g., a node’s X coordinate 

Typically a row has only a few nonzero entries 

Around 10^6 degrees of freedom, hence around
10^6 rows 





Normal matrix operations are O(n^3) 

Can’t use Gaussian elimination, Jacobi, QR, … 

Almost everybody uses iterative solvers 

Based on a fast “guess-and-check” method 

Each iteration consists of a matrix-vector
product 

Typically converges in 10 to 100 iterations 

Popular iterative solvers: 

Conjugate gradient (CG) 

Biconjugate gradient (BiCG) 

Generalized minimal residual (GMRES) 





Provides solvers with operations they need: 

Parallel matrix-vector product (written by user) 

Parallel dot-product 

class ILSI_Comm { 
  virtual void matrixVectorProduct(const double *src, double *dest) =0; 
  virtual double dotProduct(const double *a, const double *b) =0; 
}; 



Iterative solvers use these operations to solve
the matrix 

Writing iterative solvers: an active research
area 

Krylov Subspaces, Residuals, Quadratic forms… 





Hooke’s spring law: 

F = kx 

F1 = k(D1 - D2) 

Linear relation between node force and node
displacement 

Each element corresponds to a little 2x2 matrix 








If you can do a matrix-vector product, you can run
iterative solvers like conjugate gradient 

Matrix-vector product amounts to the usual FEM
calculation (displacements to forces), plus the usual FEM communication
(update shared nodes) 

You don’t even need to store the matrix 

Users can easily solve iterative problems with
the FEM framework and IFEM solvers! 







Can interact with a running FEM application 

Uses Charm++ CCS (Client-server) library 

Easy to “publish” attributes 









To allow the NetFEM client to connect, you add
NetFEM registration calls to your server 

Register nodes and element types 

Register data items: scalars or spatial vectors
associated with each node or element 

You provide the display name and units for each
data item 

Link your program with “module netfem” 

Run with “++server”, and connect! 





n=NetFEM_Begin(FEM_My_partition(),timestep, dim,NetFEM_POINTAT) 

Call this each time through your time loop (or
skip some steps) 

timestep identifies this data update 

dim is the spatial dimension—must be 2 or 3 

Returns a NetFEM handle n used by everything
else 

NetFEM_End(n) 

Finishes update n 







NetFEM_Nodes(n,nnodes,coord,”Position (m)”) 

Registers node locations with NetFEM—future
vectors and scalars will be associated with nodes 

n is the handle returned by NetFEM_Begin 

nnodes is the number of nodes 

coord is a dim by nnodes array of doubles 

The string describes the coordinate system and
meaning of nodes 

Currently, there can only be one call to nodes 








NetFEM_Elements(n,nelem,nodeper, conn,”Triangles”) 

Registers elements with NetFEM—future vectors
and scalars will be associated with these elements 

n is the handle returned by NetFEM_Begin 

nelem is the number of elements 

nodeper is the number of nodes per element 

conn is a nodeper by nelem array of node indices 

The string describes the kind of element 

Repeat to register several kinds of element 

Perhaps: Triangles, squares, pentagons, … 









NetFEM_Vector(n,val,”Displacement (m)”) 

Registers a spatial vector with each node or
element 

Whichever kind was registered last 

n is the handle returned by NetFEM_Begin 

val is a dim by nitems array of doubles 

There’s also a more general NetFEM_Vector_field
in the manual 

The string describes the meaning and units of
the vectors 



Repeat to register multiple sets of vectors 

Perhaps: Displacement, velocity, acceleration,
rotation, … 









NetFEM_Scalar(n,val,s,”Displacement (m)”) 

Registers s scalars with each node or element 

Whichever kind was registered last 

n is the handle returned by NetFEM_Begin 

val is an s by nitems array of doubles 

There’s also a more general NetFEM_Scalar_field
in the manual 

s is the number of doubles for each node or
element 

The string describes the meaning and units of
the scalars 

Repeat to register multiple sets of scalars 

Perhaps: Stress, plasticity, node type, damage,
… 








Easy, general way to get output from an FEM
computation 

Client configures itself based on server 

Client can be run anywhere (from home!) 

Server performance impact minimal (1ms!) 

Future work: 

Support multiple chunks per processor 

Non-network, file-based version 

Movie mode 





Easy to parallelize existing codes 

Flexible: shared nodes or ghosts 

High performance 

Extra features 

Visualization with NetFEM 

Collision detection 

Matrix methods with IFEM 








FEM_Set_Mesh 

Called from initialization to set the serial
mesh 

Framework partitions mesh into chunks 

FEM_Create_Field 

Registers a node data field with the framework,
supports user data types 

FEM_Update_Field 

Updates node data field across all processors 

Handles all parallel communication 

Other parallel calls (Reductions, etc.) 





