#################################################################
         N U M E R I C A L   A N A L Y S I S   S E M I N A R
   Department of Mathematics, University of Maryland, College Park
  #################################################################

                 F A L L   1 9 9 3   S C H E D U L E


-----------------------------------------------------------------
Sept. 16:  Mechanics of DNA supercoiling
-----------------------------------------------------------------  
Dr. Yang Yang, Naval Research Lab.

DNA usually occurs as a double-helix with two complementary nucleotide
chains winding around a common axis. In cells, the common axis of the
DNA often folds into a supercoiled form that is linked to its
biological activities. The theories of mechanics which haven been 
widely used in design of buildings, bridges and airplanes may provide 
important quantitative analysis for these biological processes. 
In the present approach a long DNA is modeled as a thin elastic rod.
Even for such a relatively simple model we still face a great deal
of difficulty in overcoming non-linearity and self-contact resulting 
from DNA supercoiling. A finite element scheme based on Kirchhoff's
rod theory is newly formulated to address these problems. Until now
most of these problems have not been fully understood. There is no
shortage of interesting mathematics and physics problems here.

-----------------------------------------------------------------
Sept. 23: Amending the AR Method for Frequency Detection
          by a Fixed Point Idea
-----------------------------------------------------------------
Prof. Benjamin Kedem, 
      Dept. of Math. and Systems Research Center, UMCP

It is well known that the AR method for multiple frequency
estimation known as Prony's method, gives asymptotically biased
estimates.  In an attempt to improve on this problem,
Dragosevic and Stankovic (1989) couple the AR method
with an iterative scheme, discussed by Kay (1984), and an
all-pole filter with poles inside the unit circle.
But the asymptotic bias still persists.

It is shown, however,
that if the least squares estimator of the AR parameters
is parametrized by using a parametric filter which possesses a
certain ``fundamental property", a consistent estimator is
obtained from the fixed point of the parametrized least squares
(random) estimator.

In particular, the all-pole filter considered by
Dragosevic and Stankovic can be reparametrized so
that it too possesses the ``fundamental property" and thus leads to
consistent estimates. Experimental results with the reparametrized
all-pole filter show that the modified method has high resolution,
and that its overall performance is quite remarkable.

------------------------------------------------------------------------
Sept. 30: Extrapolation Quadrature over Triangles
------------------------------------------------------------------------
Prof. J. N. Lyness, Argonne National Laboratory

In this talk, I shall treat parts of the theory of extrapolation 
quadrature. First I shall provide a brief one-dimensional background. 
Then I shall treat numerical quadrature over triangles. In each case 
I shall treat first the theory relating to regular integrands and
then the corresponding theory for singular integrands with emphasis on
the "full corner singularity". Some recent results relating to 
integration over curved surfaces will be mentioned.

------------------------------------------------------------------------
Oct. 7: Compilers and Runtime Support for Unstructured Scientific Problems
------------------------------------------------------------------------  
Prof. J. Saltz, Dept. of Computer Science, UMCP

We are developing methods that make it possible to produce portable compilers
targeted at scalable architectures that generate efficient code for sparse,
irregular, adaptive and block structured computations.

The development of sophisticated runtime support forms the foundation of our
research effort.  We have developed the CHAOS library, a runtime support
library that couples partitioners to compilers, partitions loop iterations,
remaps data and index arrays, and generates optimized communication schedules.
We have implemented address translation methods which support irregular
mappings by using highly optimized page-based translation tables.  We have also
developed a partitioner library, (PIECES library) containing a wide range of
efficient partitioners which are based on  data access patterns of a problem,
computational load information and problem geometry.

Using the CHAOS library and the Syracuse Fortran 90D compiler we have developed
a prototype distributed memory compiler able to generate efficient code for
templates extracted from adaptive problems.  One particularly innovative
feature associated with this compiler is our simple but robust method of
verifying that the communication preprocessing associated with a given data
access pattern can be reused.

This compiler also generates code for problems with irregularly coupled regular
meshes. We have used this compiler to parallelize a kernel abstracted from a
block structured Navier-Stokes solver and a full semi-coarsening multigrid
code.  We have also applied CHAOS directly to parallelize full adaptive
applications codes.  A large scale test of our methods involved porting a 120K
line molecular dynamics code, CHARMM, onto the Intel iPSC/860. We obtained
results that were better than any other known parallel implementation of
CHARMM.

------------------------------------------------------------------------
Oct. 14:  Successive Band Reduction and Tridiagonalization 
------------------------------------------------------------------------
Dr. Xiaobai Sun, Argonne National Laboratory

Reduction to tridiagonal form is a major step in many algorithms 
for symmetric eigenvalue problems. The conventional methods for 
tridigonalization eliminate subdiagonals either one at a time or 
all at a time. We have developed a new approach that, by eliminating 
several (but not necessarily all) subdiagonals at a time, leads to 
algorithms with lower computational and storage complexity and 
increased scope for parallelism. This class of algorithms also naturally 
lends itself to the use of block orthogonal transformations which 
decrease the overall communication complexity and exploit the 
architectural features of high-performance superscalar processors. 
In addition, the successive band reduction algorithm can also be very 
advantageously used to compute the invariant subspaces of matrices arising 
in the so-called invariant subspace decomposition algorithm, where the 
eigenvalue problem is repeatedly decoupled by computed invariant subspaces. 
We will present computational results on Delta, SP-1, and Paragon, 
where we implemented our successive band reduction algorithm using a 
two-dimensional block torus wrap mapping. 

------------------------------------------------------------------------
Oct. 21: Numerical Solution of 4th Order Sturm-Liouville Problems
------------------------------------------------------------------------
Prof. M. Marletta, University of Leicester, England

The most successful methods for solving second order Sturm-Liouville 
Problems rely on the well-developed Sturm Oscillation Theory for these 
problems. In this talk I will present some oscillation results which 
may be used as the basis of solution methods for 4th order 
Sturm-Liouville eigenvalue problems. Some of these are a consequence 
of the Hamiltonian structures underlying the problems; others are 
unique to fourth order problems. This is joint work with Leon Greenberg.

------------------------------------------------------------------------
Oct. 28: Global Search and Continuation in Structural Elasticity
------------------------------------------------------------------------
Dr. Gabor Domokos, Technical University of Budapest, Hungary

The state-change of elastic bar structures subject to slowly varying
load is parametrized by physical time, however, since the speed of load
variation is not specified, physical time is not a good choice for the
independent variable. There is a "quasi-time" in statics which is quite
different from physical time.  The first goal of this talk is to define
this quasi-time mathematically for elastic bar structures.
 As a tool for this definition we introduce the phase space of the
boundary value problem (BVP), being analogous to the phase space of the
initial value problem (IVP). The investigation of the principal
similarities and differences between the IVP and the BVP phase space
lead us to the construction of global search and continuation
algorithms which neither rely on iteration techniques, nor do they
accumulate errors.
    We will demonstrate the global behaviour of some sample problems by
applying the aforementioned numerical methods.

------------------------------------------------------------------------
Nov. 4: Parallel Function Decomposition and Space Decomposition Method
------------------------------------------------------------------------
Dr. Xue-Cheng Tai, Univ. of Jyvaskyla, Finland

Some methods which we call function decomposition methods and space
decomposition methods are developed. These methods deal with a convex
programming problem, i.e., a minimization problem of a convex function
over a space or a convex set.  If the function can be decomposed into
the sum of subspaces, then parallel methods can be used for the
minimization. The intention of this research is to give a general
convergence theory for domain decomposition methods and to show a
relationship between the domain decomposition method and the
traditional splitting method. Especially, we intend to use this theory
to extend domain decomposition methods for some nonlinear problems.

To use the developed algorithms for practical problems, there are
different ways to decompose a function and to decompose a space. Many
partial differential equations can be formulated as a minimization
problem in some way. Therefore we get some parallel methods for partial
differential equations.  The method is not restricted to linear
problems, nonlinear problems are naturally included in the theory.
Applications to linear and quasilinear self-adjoint elliptic equations,
to strongly nonlinear elliptic equations like the p-Laplace equation,
to the Stokes equation, and to variational problems like the obstacle
problem are discussed. With the applications we show that the the
classical alternating direction method and local one-dimensional method
are one way to decompose a function. The domain decomposition methods
are ways to decompose a function and also ways to decompose a space.
One observation is that the Schwarz alternating method for the
overlapping domain decomposition is a Gauss-Seidel method. If we
replace the Gauss-Seidel method by the Jacobi method we will get
parallel domain decomposition methods. Convergence of the Jacobi method
for general space decomposition is proved here.

------------------------------------------------------------------------
Nov.11: Finite Element Solution to the Exteriour Helmholtz Problem: 
        A One Dimensional Study of the h- and h-p-Versions
------------------------------------------------------------------------
Dr. Frank Ihlenburg, Institute for Physical Science and Technology, UMCP

While proofs of stability and convergence for the FE-solution of the
exteriour Helmholtz problem have been given previously, there exists a
certain gap between the asymptotic theorems of numerical analysis and
"rules of the thumb" that are applied by engineers to choose the
meshwidth in practical computations.  On the one hand, the assumptions
of the analytical results do not lead to practically applicable rules
if the wavenumber is high. On the other hand, the rules used so far in
applications of the h-version are reported to fail in the very case of
high wavenumber. The talk will
* present analytical results for the weak formulation of the Helmholtz problem
* investigate by numerical experimentation if the assumptions of the theorems 
    are due to technicalities or inherent to the problem
* draw practical conclusions for the application of the h- and h-p-version in 
    the case of high wavenumbers.

------------------------------------------------------------------------
Nov.18: Scientific Libraries on Scalable Architectures
------------------------------------------------------------------------
Dr. Lennart Johnsson, Thinking Machines Corporation

Maximizing locality of reference and minimizing network contention while 
maintaining load balance is key to efficient computations on massively
parallel processors.  The memory hierarchy in each node includes registers,
cache, DRAM pages, and the maintenance of memory maps if not all of memory
is mapped at any given time.  In addition there is remote memory with varying
access characteristics.  The efficient management of memory hierarchies of
this type pose a serious challange, often beyond state-of-the-art compilers
and run-time systems.  Thus, memory management has become a significant
portion of scientific libraries for distributed memory architectures.  Tools
for data (re)distribution is an important component for good performance on
unstructured grid problems.  

Libraries for distributed memory archietctures must accept data structures 
distributed in arbitrary ways in memory and must exploit concurrency in 
computations on single instances of objects as well as the concurrency offered
by computations on multiple instances.  The machine size and configuration 
must be transparent to the programmer.  We will discuss how these challenges 
have been addressed in the Connection Machine Scientific Software Library 
(CMSSL).

------------------------------------------------------------------------
Dec. 2: Inexact and Preconditioned Uzawa Algorithms for Saddle Point Problems
------------------------------------------------------------------------
Prof. Howard Elman, Dept. of Computer Science, UMCP

We develop and analyze variants of the Uzawa algorithm for solving symmetric
indefinite linear systems of the type arising from discretization of the
Stokes equations.  Each step of this algorithm requires the solution of a
symmetric positive-definite system of linear equations.  We prove that if this
computation is replaced by an approximate solution produced by an arbitrary
iterative method, then with relatively modest requirements on the accuracy of
the approximate solution, the resulting inexact Uzawa algorithm is convergent
with convergence rate close to that of the exact algorithm.  In addition, for
mixed finite element discretizations of the Stokes equations, we show how to
precondition the system using a submatrix of the pressure mass matrix so that
convergence of both the exact and inexact Uzawa algorithms is improved.  The
analysis is supplemented by numerical results demonstrating the effectiveness
of the inexact and preconditioned algorithms.

-----------------------------------------------------------------------------
Dec. 9: Wavelet Bases for Boundary Element Methods
-----------------------------------------------------------------------------
Prof. Tobias von Petersdorff, Department of Mathmematics, UMCP

Boundary Element Methods lead to linear sytems with a full matrix and 
therefore require at least O(N^2) operations for N degrees of freedom on the 
boundary.
Using a wavelet-type basis for the boundary element spaces yields a matrix 
where most elements are so small that they can be neglected without affecting 
the rates of convergence. We obtain matrices which are sparse and have a 
uniformly bounded condition number. We analyze the case of a polygonal domain 
and describe an algorithm which gives optimal convergence rates with only 
O(N log(N)^k) operations. Numerical computations confirm that special 
consideration of the vertices is necessary for an optimal method.