close

Вход

Забыли?

вход по аккаунту

?

Coarsening of mesh models for representation of rigid objects in finite element analysis

код для вставкиСкачать
INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
DOMAIN DECOMPOSITION METHODS FOR PARALLEL
SOLUTION OF SHAPE SENSITIVITY ANALYSIS PROBLEMS
MANOLIS PAPADRAKAKIS* AND YIANNIS TSOMPANAKIS
Institute of Structural Analysis and Seismic Research, National Technical University of Athens, Athens 15773, Greece
ABSTRACT
This paper presents the implementation of advanced domain decomposition techniques for parallel solution
of large-scale shape sensitivity analysis problems. The methods presented in this study are based on the
FETI method proposed by Farhat and Roux which is a dual domain decomposition implementation. Two
variants of the basic FETI method have been implemented in this study: (i) FETI-1 where the rigid-body
modes of the floating subdomains are computed explicitly. (ii) FETI-2 where the local problem at each
subdomain is solved by the PCG method and the rigid-body modes are computed explicitly. A two-level
iterative method is proposed particularly tailored to solve re-analysis type of problems, where the dual
domain decomposition method is incorporated in the preconditioning step of a subdomain global PCG
implementation. The superiority of this two-level iterative solver is demonstrated with a number of
numerical tests in serial as well as in parallel computing environments. Copyright 1999 John Wiley
& Sons, Ltd.
KEY WORDS: shape sensitivity analysis; domain decomposition methods; preconditioned conjugate gradient; multiple
right-hand sides; reanalysis problems
INTRODUCTION
Shape optimization aims to improve the shape of a structure, defined with a number of
parameters called design variables, by minimizing an objective function subject to certain
constraint functions. The shape optimization algorithm proceeds with the following steps: (i)
a finite element mesh is generated, (ii) displacements, stresses, frequencies, etc. are evaluated
depending on the type of optimization problem, (iii) the gradients, or the sensitivities of the
functions are computed by perturbing each design variable by a small amount, (iv) the optimization problem is solved and the new shape of the structure is defined. These steps are repeated until
convergence has occurred. The most time-consuming part of a gradient-based optimization
process is devoted to the sensitivity analysis phase. For this reason several techniques have been
developed for the efficient calculation of the sensitivities in an optimization problem. The
semi-analytical and the finite-difference approaches are the two most widely used types of
sensitivity analysis techniques. From the algorithmic point of view the semi-analytical technique
results in a typical linear solution problem with multiple right-hand sides in which the stiffness
* Correspondence to: Manolis Papadrakakis, Department of Civil Engineering, National Technical University, Zografou
Campus, GR 157 73 Athens, Greece. E-mail: mpapadra@central.ntua.gr
Contract/grant sponsor: European Union; Contract/grant number: HC&M/9203390
CCC 0029—5981/99/020281—23$17.50
Copyright 1999 John Wiley & Sons, Ltd.
Received 24 June 1996
Revised 20 February 1998
282
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
matrix remains the same, while the finite-difference technique results in a typical re-analysis
problem in which the stiffness matrix is modified due to the perturbations of the design variables.
Shape optimization problems are usually computationally intensive tasks, where 60—90
per cent of the computations are spent for the solution of equilibrium equations required for the
finite element analysis and sensitivity analysis. Although it is widely recognized that hybrid
solution methods, based on a combination of direct and iterative solvers for solving linear finite
element equations, outperform their direct counterparts, in sequential as well as parallel computing environments, little effort has been devoted until now to their implementation in the field of
structural optimization. In a recent paper by Papadrakakis et al., the performance of various
iterative solution methods, based on PCG and Lanczos algorithms, in sequential computing
environment was demonstrated and compared with the conventional direct skyline solver in
a number of topology and shape optimization problems. In the present study two variants of the
basic FETI method of Farhat and Roux are implemented for solving sensitivity analysis
problems using the semi-analytical approach, while an innovative two-level parallel solution
method is proposed for solving sensitivity analysis problems using the global finite-difference
approach.
In the two variants of the basic FETI method the rigid-body modes of the floating subdomains
are computed explicitly, while in the second variant a PCG iterative solver with a strong
preconditioner is also used for the solution of the local problem in each subdomain. These two
variants have a beneficial effect on the robustness of basic FETI method for ill-conditioned
problems, while the second variant operates under reduced storage requirements. In the present
study a two-level iterative method is proposed specially tailored for solving reanalysis type of
problems in general, a special case of which is the problem arising when the global finitedifference sensitivity analysis approach is used. For this type of problems the two variants of the
basic FETI method are incorporated in the preconditioning step of a global subdomain implementation of the PCG method. This two-level subdomain implementation is applicable in serial
and parallel computing environments resulting in a drastic improvement of computing time
compared to the conventional one-level methods.
SHAPE SENSITIVITY ANALYSIS
Sensitivity analysis is the most important and time-consuming part of gradient-based structural
optimization. Several techniques have been developed which can be mainly distinguished by their
numerical efficiency and their implementation aspects. The primary objective of sensitivity
analysis is to compute the derivatives of the displacement field with respect to perturbations of the
primary design variables. The methods for sensitivity analysis can be divided into discrete and
variational methods. In the variational approach the sensitivity coefficients can be determined
by applying basic variational theorems. In this case, the sensitivities are given as boundary and
surface integrals which are solved after the structure has been discretized, whereas in the discrete
approach they are evaluated using the finite element equations. The implementation for discrete methods is simpler than the one for variational techniques. A further classification of the
discrete methods is the following: (i) Global finite-difference method where the derivatives needed
for the solution of the optimization problem are computed numerically. (ii) Semi-analytical
method where the calculation of the sensitivities is performed via analytical and numerical
expressions. (iii) Analytical method where the derivatives of the objective and constraint functions
are obtained analytically.
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
283
The decision on which method to implement depends strongly on the type of problem, the
organization of the computer program and the access to the source code. The implementation of
analytical and semi-analytical methods is more complex and requires access to the source code,
whereas when a finite-difference method is applied the formulation is much simpler. Depending
on the type of problem and the approach used, the sensitivity analysis together with the finite
element analysis can take 60—90 per cent of the total computational effort required to solve the
whole optimization problem. An efficient and reliable sensitivity analysis module that can take
full advantage of the innovative computer architectures with parallel processing capabilities
could therefore result in a considerable reduction of the overall computational effort of the
optimization procedure. In the present study the emphasis is given on the application of efficient
domain decomposition methods in parallel computing environments for solving the systems of
the algebraic equations encountered in the two most commonly used types of sensitivity analysis,
namely the global finite difference and the semi-analytical approaches.
¹he Global Finite-Difference (GFD) approach
The GFD approach provides a simple way of computing the sensitivity coefficients. This
method requires the solution of the linear system of equations Ku"f, where u is the
displacement vector, for the original design variables s, and for each perturbed design variable
sN"s#*s , where *s is the magnitude of the perturbation, usually taken in the range of
I
I
I
I
10\—10\ of the value of the design variable. The design sensitivities for the displacements
*u/*s are computed using a forward difference scheme:
I
*u u(s #*s )!u(s )
I
I
*u/*s + " I
I *s
*s
I
I
(1)
where u(s #*s ) is evaluated by solving the following reanalysis type of problem:
I
I
K(s #*s )u(s #*s )"f (s #*s )
I
I
I
I
I
I
(2)
The Semi-Analytical (SA) approach
The SA approach is based on the chain rule differentiation of the finite element equations
Ku"f :
K
*f
*u *K
# u"
*s
*s
*s
I
I
I
(3)
*u
"f *
I
*s
I
(4)
which when rearranged results in
K
where
*f
*K
f *" !
u
I
*s
*s
I
I
Copyright 1999 John Wiley & Sons, Ltd.
(5)
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
284
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
f * represents a pseudo-load vector. The derivatives of *K/*s and *f /*s are computed for each
I
I
I
design variable by recalculating the new values of K(s #*s ) and f (s #*s ) for a small
I
I
I
I
perturbation *s of the design variable s , while the stiffness matrix remains unchanged throughI
I
out the whole sensitivity analysis.
The Conventional Semi-Analytical (CSA) approach. In the CSA approach, the values of the
derivatives in equation (3) are calculated by applying the forward-difference approximation
*K K(s #*s )!K(s )
I
I
I
*K/*s + "
I *s
*s
I
I
(6)
* f f (s #*s )!f (s )
I
I
*f/*s + " I
I *s
*s
I
I
(7)
For maximum efficiency of the semi-analytical approach only those elements which are affected
by the perturbation of a certain design variable are involved into the calculations of *f/*s and
I
*K/*s .
I
The *Exact+ Semi-Analytical (ESA) approach. The CSA approach may suffer some drawbacks in
particular types of shape optimization problems. This is due to the fact that in the numerical
differentiation of the element stiffness matrix with respect to shape design variables the components of the pseudo-load vector associated with the rigid-body rotation do not vanish. The
solution suggested by Olhoff et al. alleviates the problem by performing an ‘exact’ numerical
differentiation of the elemental stiffness matrix as follows:
*k
L *k *a
H
"
*a *s
*s
I H H I
(8)
where n is the number of elemental nodal coordinates affected by the perturbation of the design
variable s and a is the nodal co-ordinate of the element which can be either an x- or
I
H
a y-co-ordinate. This means that by perturbing a design variable all nodes of an element on the
perturbed boundary are perturbed as well and the summation is carried out only for the
perturbed nodal co-ordinates. The term *a /*s is computed using the forward-difference
H I
scheme while the derivative *k/*a is computed by differentiating the element stiffness matrix
H
expression.
SOLUTION METHODS IN SENSITIVITY ANALYSIS PROBLEMS
The implementation of hybrid solution schemes in structural optimization, which are based on
a combination of direct and preconditioned iterative solvers, has not yet received the attention it
deserves from the research community, even though in finite element linear solution problems
and particularly when dealing with large-scale applications their efficiency is well documented. In
a recent study by Papadrakakis et al., a class of efficient hybrid solution methods was applied in
the context of shape sensitivity analysis and topology optimization problems in sequential
computing environment.
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
285
In this work domain decomposition methods are applied for solving the sensitivity analysis
part of shape optimization problems, in both sequential and parallel computing environments,
after being properly modified to address the special features of the particular type of problems at
hand. The most computational intensive part of sensitivity analysis, using either semi-analytical
or finite-difference sensitivity analysis approaches, is the solution of finite element equilibrium
equations (4) or (2), respectively. In the first case, the coefficient matrix remains constant and only
the right-hand-side vector is changing, which is the typical case for solving linear systems with
multiple right-hand sides, while in the second case the coefficient matrix is slightly modified
therefore a typical reanalysis problem needs to be solved.
Solving sensitivity analysis problems with the semi-analytical approach
One of the main shortcomings of iterative solution methods is encountered when a sequence of
right-hand sides has to be processed. In such cases direct methods possess a clear advantage over
iterative methods since the most computationally intensive part, associated with the factorization
of the stiffness matrices, is not repeated and only a backward and forward substitution is required
for each subsequent right-hand side. The Lanczos method has been used in the past for treating
a sequence of right-hand sides. An efficient implementation of Lanczos method was proposed by
Papadrakakis and Smerou which handles all approximations to the solution vectors simultaneously without the necessity for storing the tridiagonal matrix and the orthonormal basis. This
approach, however, cannot be implemented in problems with multiple right-hand sides that are
not known at the beginning of the iterative procedure. The most efficient method for this type of
problems is the dual domain decomposition method FETI with the projection—re-orthogonalization scheme for handling problems with multiple or repeated right-hand sides. Subsequently, the
basic FETI method, its two variants and the re-orthogonalization procedure will be briefly
presented.
The basic FETI method. The FETI method proposed by Farhat and Roux is considered to be
very promising method for solving large-scale problems in both shared and distributed memory
computer architectures due to its very satisfactory numerical and parallel scalability features.
This method operates on totally disconnected subdomains, while the governing equilibrium
equations are derived by invoking stationarity of the energy functional subject to displacement
constraints which enforce the compatibility conditions on the subdomain interface. The augmented equations are solved for the Lagrange multipliers after eliminating the unknown displacements. The resulting interface problem is in general indefinite due to the presence of floating
subdomains which do not have enough prescribed displacements to eliminate the local rigid-body
modes.
The application of the FETI method results in the following interface problem:
KC BC2
BC 0
fC
uC
j
"
0
(9)
where KCuC"f C are the unassembled equations for all subdomains and BC"[B B ) ) ) BQ ]
are signed Boolean matrices which localize the subdomain displacements on the interface and
‘s’ is the total number of subdomains. The vector of Lagrange multipliers j represents the
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
286
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
interaction forces between the subdomains along their common boundary that impose the
continuity of the structure.
The system of equation (9) is indefinite but has a unique solution provided that the global
problem is adequately restrained. Care has to be taken for the handling of the floating subdomains which are characterized by a singular stiffness matrix KH. The system of equation (9) may
be explicitly written for fixed subdomains as
uH"KH\( f H!BH2j)
(10)
uH"KH>( f H!BH2j)#RHcH
(11)
and for floating subdomains as
where KH> is a generalized inverse of KH for floating subdomains, RH corresponds to the
rigid-body modes of the floating subdomain j, and cH specifies a linear combination of these.
If the stiffness matrix KH of a floating subdomain is partitioned as
KH KH
2
KH KH
where KH has full rank, then KH> and RH are given by
\
\
KH
0
!KH
KH
and RH"
KH>" I
0
0
KH"
(12)
(13)
The additional equations required for determining cH are provided by the zero energy condition
of the rigid-body modes
RH2KHuH"0
(14)
The combination of the compatibility conditions of equation (9) for the displacements uH on the
interface d.o.f. of the subdomains with equations (10), (11) and (14) gives the following interface
problem:
F
!G
'
'
!G2 0
'
j
c
f
"
f
H
(15)
A
where
Q
Q
F " BHKH>BH2, c"[c ) ) ) cQD]2, f " BHKH>f H
'
H
H
H
2
G "[B ) R ) ) ) BQD ) RQD ], f "[ f ) R ) ) ) f QD2 ) RQD ]2
'
A
and s is the total number of floating subdomains. The solution of this indefinite problem can be
D
performed by a Preconditioned Conjugate Projected Gradient (PCPG) algorithm. The preconditioning matrix adopted in this study is the lumped-type preconditioner which does not require
any additional storage, involves only matrix vector products on the interface level and often
outperforms the optimal but expensive Dirichlet preconditioner which requires additional storage equivalent to the storage of the factorized stiffness matrices of each subdomain.
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
287
Two variants of the basic FETI method. In the basic FETI method the solution of the local
problem of equations (10) or (11) is required associated with both internal and interface d.o.f. of
each subdomain. The solution of this problem by a direct Cholesky solver is most advantageous
for two reasons: (i) A solution of equations (10) or (11) has to be performed at each PCPG
iteration for calculating, via fast forward and backward substitutions, the product KH>+dH,
K
( j"1, . . . , s), where +dH, is the direction vector in mth iteration. (ii) The rigid-body modes
K
RH of the floating subdomains are computed as a by-product of the factorization procedure
according to equation (13).
An alternative way of treating the solution of the local problems is to use a PCG solver where
the preconditioning matrix is computed by an incomplete, or even a complete factorization of the
stiffness matrix but stored in single precision arithmetic. In this approach the mixed precision
PCG implementation proposed in Reference 10 with a strong preconditioner is adopted in which
all computations are performed in single precision arithmetic except for the matrix—vector
multiplication, occurring during the recursive evaluation of the residual vector, which is performed in double precision arithmetic.
The stiffness matrix is stored in double precision arithmetic but in compact form which
demands only a small fraction of the memory needed by the skyline storage scheme especially for
large-scale 3-D problems. The storage requirements of this mixed precision PCG are at most half
the storage of the direct Cholesky (in the case of a complete factorized preconditioner since it is
stored in single precision arithmetic) plus the memory for the compact stored stiffness matrix and
the four auxiliary vectors of the PCG method, thus resulting in a great reduction of the memory
demands. This implementation proved to be a robust and reliable solution procedure even for
handling large and ill-conditioned problems, while it is computer storage-effective. It was also
demonstrated to be more cost-effective than double precision arithmetic calculations for the same
storage demands.
By adopting this alternative treatment for the solution of the local problem at each subdomain
the rigid-body modes cannot be obtained as a by-product of the factorization procedure as
previously described for the basic FETI method. Bitzarakis et al. have proposed an analytical handling of the rigid-body modes for all types of finite elements which permits the incorporation of the PCG algorithm with a strong preconditioner for the solution of the local problem in
the basic FETI method. Under this formulation the floating subdomains are fixed with constraints equal to the number of local singularities which is less or equal to 3 for 2-D problems and
is less or equal to 6 for 3-D ones. For a 2-D problem discretized with plane stress/strain finite
elements, the rigid-body modes associated with 2 translations and 1 rotation produce the
following displacements at node i of subdomain j:
RH"[R R R ]2
G
(16)
with
1
R " 0 ,
0
0
!y
G
R " 1 , R " x
G
0
0
where x , y are the co-ordinates of node i. The rigid body modes matrix RH required for the
G G
computation of G is formed by
'
RH"[RH RH . . . RH ]2
(17)
L
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
288
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
where n is the total number of nodes of the subdomain. Similar expressions for RH may be defined
G
for plate, shell and 3-D elasticity problems. Whenever the boundary conditions of the structure
impose a restriction in some of the translational/rotational movements of a floating subdomain,
then only the remaining ‘active’ rigid-body modes of this subdomain need to be calculated and
used in the computations. The detection of the correct number of rigid-body modes, which
corresponds to the size of the null space of the stiffness matrix of floating subdomains, is a very
important issue for the FETI method. The analytical handling of the rigid-body modes requires
a slightly complicated algorithm in the case of partially supported subdomains. In the rather
extreme case of subdomains that contain non-rigid-body zero-energy modes the presented
procedure is not directly applicable and needs further refinement.
The benefits of using the analytical calculation of the rigid-body modes are twofold. The first is
related to the accuracy of the computations involved in the evaluation of the rigid-body modes
and their linear combinations c, while the second is associated with the implementation of
a reduced storage solver for the solution of the local problem. During the factorization procedure
of the basic FETI method the restraints are imposed on the last degrees of freedom of the floating
subdomains and as a result the computation of the rigid-body modes is infected by round-off
errors which become more pronounced for ill-conditioned problems. Thus, the explicit calculation of the rigid-body modes results to a more stabilized procedure which becomes more
evident in large-scale 3-D and ill-conditioned problems.
¹he re-orthogonalization procedure for treating repeated right-hand sides. A re-orthogonalization procedure has been proposed recently by Farhat et al. for extending the PCG method to
problems with multiple and/or repeated right-hand sides based on the K-conjugate property of
the search directions (d "d2 Kd "0 for mOi). The implementation of the re-orthogonalization
K
K G
technique is impractical when applied to the full problem KuH"f H due to excessive storage
requirements. However, this methodology has been efficiently combined with the FETI method
where the size of the interface problem can be order(s) of magnitude less than the size of the global
problem. Thus, the cost of re-orthogonalization is negligible compared to the cost of the
solution of the local problems associated with the matrix-vector products of F with the interface
'
direction vectors of the FETI method, while the additional memory requirements are not
excessive.
The modified search direction of the PCPG algorithm is given by
K d2F d
d "d
! G ' K> d
G
K>
K>
d2F d
G G ' G
(18)
which enforces explicitly the orthogonality condition d F d "0, i"1, . . . , m. The initial
K> ' G
estimate jG> of the solution vector of the subsequent right-hand side
[ f H> f H> ]2
H
A
of equation (15) is given by
jH>"D x#x
I
(19)
where D2F D x"D2 ( f H>!F x) and x"G (G2G )\f H>.
I ' I
I H
'
' ' '
A
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
289
Solving sensitivity analysis problems with the global finite-difference approach
The hybrid solution schemes proposed in Reference 2 for treating reanalysis type of problems,
based on the global formulation and solution of the problem of equation (4), proved to be very
efficient compared with the standard skyline solver in a sequential computing environment. Their
parallel implementation, however, is hindered by the inherent scalability difficulties encountered
during the preconditioning step which incorporates forward and backward substitutions of
a fully factorized stiffness matrix. In order to alleviate this deficiency the Global Subdomain
Implementation (GSI) of a subdomain-by-subdomain PCG algorithm is implemented in this
study on the global stiffness matrix. The dominant matrix-vector operations of the stiffness and
the preconditioning matrices are performed in parallel on the basis of the same multi-element
group partitioning of the entire domain.
In order to exploit the parallelizable features of the GSI(PCG) method and to take advantage
of the efficiency of a fully factorized preconditioning matrix, the following two-level methodology
is proposed based on the combination of the global subdomain implementation and the FETI
method. The GSI(PCG) method is employed, using a multi-element group partitioning of the
entire finite element domain, in which the solution required during the preconditioning step is
performed by the FETI method, or any of its variants, operating on the same mesh partitioning of
the GSI(PCG) method. In the proposed methodology the preconditioning step of the GSI(PCG)
method
z
"C\r
(20)
K>
I K>
is performed by FETI. For the solution of this problem two methodologies, namely the
GSI(PCG)-FETI and the GSI(NCG)-FETI are proposed. The second approach is based on
a Neumann series expansion of the preconditioning step.
The GSI (PCG )-FETI method. In the GSI(PCG)-FETI method the iterations are performed on
the global level with the GSI(PCG) method, using an incomplete Cholesky factorization of the
stiffness matrix as preconditioner. Thus, the incomplete factorization of the stiffness matrix
K #*K can be written as LDL2"K #*K!E, where E is an error matrix which does not
have to be formed. Matrix E is usually defined by the computed positions of ‘small’ elements in
¸ which do not satisfy a specified magnitude criterion and therefore are discarded. For the
typical reanalysis problem
(K #*K)u"f
(21)
matrix E is taken as *K, so that the preconditioning matrix becomes the complete factorized
initial stiffness matrix: C "K . Therefore, the solution of the preconditioning step of the
I
GSI(PCG) algorithm, which has to be performed at each GSI(PCG) iteration, can be effortlessly
executed, once K is factorized, by a forward and backward substitution.
With the parallel implementation of the two-level GSI(PCG)-FETI method the preconditioning step can be solved in parallel by the interface dual method for treating the repeated solutions
required in equation (20), using the same decomposition of the domain employed by the external
GSI(PCG) method. The procedure continues this way for every re-analysis problem, while the
direction vectors of FETI are being re-orthogonalized in order to further decrease the number of
PCPG iterations of the interface problem within the preconditioning step. The solution of
equation (20) is performed n * n times via the FETI method, where n and n correspond to the
G P
G
P
number of GSI(PCG) iterations and the number of reanalysis steps, respectively.
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
290
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
The GSI (NCG )-FETI method. The combination of Neumann series expansion and PCG
method on the global level for the solution of re-analysis problems in shape and topology
optimization was investigated in a previous study. In this work the Neumann series expansion is
used to improve the quality of the preconditioning step of the two-level method by computing
a better approximation to the inverse preconditioning matrix. The preconditioning matrix is now
defined as the complete stiffness matrix (K #*K), but the solution for z
of equation (20) is
K>
performed approximately using a truncated Neumann series expansion.
Thus, the preconditioned vector z
of equation (20) is obtained at each iteration by
K>
!1
(22)
z
"(I#K!1
0 *K)\K0 r
K>
K>
where the term in parenthesis can be expressed in a Neumann expansion giving
z
"(I!P#P!P#) ) ) )K\ r
(23)
K>
K>
with P"K\ *K. The preconditioned residual vector of equation (23) can now be represented by
the following series:
z
"z !z #z !z #) ) )
K>
(24)
with
z "K\ r
(25)
K>
z "K\ (*Kz ), i"1, 2, . . . ,
(26)
G\
The incorporation of the Neumann series expansion in the preconditioned step of the PCG
algorithm can be seen from two different perspectives. From the PCG point of view, an
improvement of the quality of the preconditioning matrix is achieved by computing a better
approximation to the solution of u"(K #*K)\ f during the preconditioning step, than the
one provided by the preconditioning matrix K\. From the point of view of the Neumann series
expansion, the inaccuracy entailed by the truncated series is alleviated by the conjugate gradient
iterative procedure.
In the present study the following parallel implementation of the two-level GSI(NCG)-FETI
method for solving equation (21) is used: A GSI(NCG) method is performed as described in the
previous section, in which the preconditioning step is performed according to equation (24).
Equations (25) and (26) can be solved in a parallel computing environment by the FETI method
utilizing the same decomposition of the domain adopted for the external GSI(NCG) method. The
procedure continues this way for every reanalysis problem, while the direction vectors of FETI
are being re-orthogonalized in order to keep the number of PCPG iterations of the interface
problem small. The solution of equations (25) and (26) is a typical multiple right hand-side
problem and it is performed n ;n ; n times via the FETI method, where n , n , n correspond to
H G
P
H G P
the number of terms in the Neumann series expansion, the number of GSI(NCG) iterations and
the number of re-analysis steps, respectively.
NUMERICAL TESTS
Before presenting the performance of the methods discussed for the solution of shape optimization problems it would be appropriate to demonstrate the performance of the two variants of the
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
291
Figure 1. 3-D cantilever beam
basic FETI method for solving 2-D and 3-D ill-conditioned problems. All computational results
reported in this section were run on a SG Power Challenge XL shared memory machine with
R8000 processors.
The three versions of the FETI method which are tested and compared are the following: The
basic FETI method where the computation of the rigid-body modes of the floating subdomains is
performed as a by-product of the factorization procedure. The FETI-1, where the computation of
the rigid-body modes is performed explicitly via equations (16) and the position of the pseudoboundary conditions is located by an automatic procedure that avoids close positioning of the
restrained d.o.f. The FETI-2, where the rigid-body modes are computed explicitly via equations
(16) and the local problem is solved via the PCG algorithm where the preconditioner is the
completely factorized subdomain stiffness matrix stored in single precision arithmetic. In all test
cases the basic FETI method and its variants are applied with re-orthogonalization unless
otherwise stated and the lumped-type preconditioner is used for the PCPG algorithm.
Linear solution test examples
3-D cantilever beam problem. The three versions of FETI are applied first to the 3-D cantilever
beam of Figure 1 which is discretized with 31 104 solid-cube elements resulting in 105 000 d.o.f. In
order to deteriorate the conditioning of the problem the Poisson ratio is taken l"0)499. The
characteristic d.o.f. for 6, 12, 18, 24 and 36 subdomains are depicted in Table I. The convergence
tolerance for this example was taken as 3;10\.
Table I shows the performance of the methods in 6 processors when the structure is divided
with one way dissection into 6, 12, 18, 24 and 36 subdomains. The global problem size is fixed and
the number of subdomains is increased in an effort to reduce the cost of the local factorizations
and solutions and reduce storage requirements. It can be observed that an improvement in the
computing time is accomplished by the FETI-2 variant. This improvement becomes more evident
when no re-orthogonalization is performed. It was found that using more subdivisions at each
subdomain is equally advantageous for the basic FETI method and for the variant FETI-2. It can
also be seen that the benefits from the reduced computer storage demands of the FETI-2 method
are still significant when more subdomains are allocated at each processor, while the gains in
computational efficiency remain almost the same.
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
292
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
Table I. 3-D cantilever beam: Characteristic d.o.f. and performance of the methods in six processors
Performance of the methods
with re-orthogonalization
Number of
subdomains
Interf
d.o.f.
Local
d.o.f.
6
3,705
18,525
12
8,151
8,892
18
12,597
6,669
24
17,043
5,187
36
25,935
2,964
Method
Iter
PCPG
time (s)
Local
time (s)
Total
time (s)
Storage
(Mb)
FETI
FETI-1
FETI-2
FETI
FETI-1
FETI-2
FETI
FETI-1
FETI-2
FETI
FETI-1
FETI-2
FETI
FETI-1
FETI-2
117
117
118
154
154
151
178
178
178
216
212
213
312
298
299
25
25
26
94
94
92
158
158
158
259
254
255
694
661
663
2,176
2,174
2,072
1,490
1,489
1,379
1,059
1,057
982
942
925
864
991
922
867
2,282
2,280
2,178
1,660
1,659
1,548
1,290
1,288
1,212
1,276
1,254
1,196
1,763
1,660
1,609
682
682
391
509
509
312
393
393
263
351
350
254
361
355
292
Figure 2. (a) 2-D plane strain problem; (b) 8 subdomains with optimal aspect ratio
2-D plane strain problem. The basic FETI method and its variants are also applied to the 2-D
plane strain problem of Figure 2(a) which is discretized with 20 385 elements resulting in 41 188
d.o.f. as shown in Figure 2(b). In order to increase the conditioning of the problem the Poisson
ratio is taken as l"0)4999, while the convergence tolerance for this example is taken as 10\.
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
293
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
Table II. 2-D plane strain problem: number of iterations of basic FETI and
FETI-1 for the optimal decomposition to 8 subdomains
Tolerance
10E!1
10E!2
10E!3
10E!4
10E!5
10E!6
Method
FETI
FETI-1
206
206
245
245
With re-orthogonalization
278
*
*
277
297
325
*
339
FETI
FETI-1
1,165
824
1,648
1,146
No re-orthogonalization
2,324
*
*
1,512
1,717
2,131
*
2,356
* No convergence
Figure 3. 2-D plane strain cantilever beam with different support conditions
This problem is subdivided using TOPDOMDEC to 8 subdomains with optimal aspect ratio
resulting in 1254 interface d.o.f., as shown in Figure 2(b). The performance of the methods is
presented in Table II. It can be observed that the analytical computation of the rigid-body modes
in FETI-1 is beneficial even for applications with optimal subdomain aspect ratio, since the basic
FETI method cannot manage to achieve half the required tolerance even with re-orthogonalization. Without re-orthogonalization the basic FETI presents a much slower convergence rate
than FETI-1 variant.
The improved convergence behaviour of FETI-1 variant depicted in Table II is related to the
accuracy with which the rigid-body modes and the linear combination of these are computed. In
the case of the basic FETI method the computation of the rigid-body modes is infected by the
round-off errors which are developed during the factorization of the stiffness matrix of the
floating subdomains. This infection becomes more pronounced for ill-conditioned problems and
the improvement with the FETI-1 variant becomes much more evident when no re-orthogonalization is performed.
2-D plane strain cantilever beam problem. The basic FETI method and its variant are also
applied to the 2-D cantilever beam of Figure 3 which is discretized with 5000 elements with 10 200
d.o.f. Two test cases were considered for this example with two types of supports, as shown in
Figures 3(a) and 3(b), in order to investigate their influence on the convergence behaviour of the
methods. For the first test case the beam is optimally subdivided into 8 subdomains with 514
interface d.o.f. In Figure 3(b) the two supports are very close together in order to demonstrate the
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
294
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
Table III. 2-D cantilever beam—Test case 1:
number of iterations of basic FETI and FETI-1
Poisson
ratio
0)3
Method
FETI
FETI-1
51
51
distant supports
116
*
116
231
FETI
FETI-1
57
57
close supports
*
*
118
240
0)49
0)4999
* No convergence
Table IV. 2-D cantilever beam—Test case 2:
number of iterations of basic FETI and FETI-1
Subdomains
2
4
8
Method
FETI
FETI-1
*
60
l"0)4999
*
241
116
240
FETI
FETI-1
*
65
l"0)499999
*
316
172
314
*No convergence
effect of large rigid-body movements associated with small strains on the convergence behaviour
of the FETI method. As can be seen in Table III the location of the supports affects the
convergence behaviour of the methods. Furthermore, the basic FETI method is much more
sensitive to the position of the restraints since the influence of the infected rigid-body modes by
the round-off errors appears to be more decisive for badly restrained structures.
Table IV depicts the performance of the methods for the cantilever beam of Figure 4 for
different number of subdivisions. The results confirm the superiority of FETI-1 over the basic
FETI method which is affected by the accuracy with which the rigid-body modes are computed
during the factorization of the subdomain stiffness matrices. This accuracy is reduced as the size
of the subdomains become larger.
Shape optimization test examples
Two benchmark shape optimization test examples are used to demonstrate the efficiency of the
proposed methods using the shape optimization code ADOPT. The first example is a long,
slender domain which has a rather dense global stiffness matrix with narrow bandwidth, whereas
the global stiffness matrix of the second example has a sparse pattern with a relatively large
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
295
Figure 4. 2-D plane strain cantilever beam with different number of subdivisions
bandwidth. The performance of the proposed solution methods is investigated and compared first
in serial computing mode with the conventional direct skyline and PCG, NCG and Lanczos
solvers proposed in Reference 2. NCG is a PCG algorithm in which the preconditioning step is
performed via a Neumann series expansion of the stiffness matrix. Furthermore, the parallel
performance of basic FETI method and its variants is investigated in both types of sensitivity
analysis problems, while the two-level PCG method is applied for the GFD sensitivity analysis
test cases. The convergence tolerance for all solution methods was taken as 10\. For these test
examples plane stress conditions and isotropic material properties were assumed (elastic modulus
E"210 000 N/mm and Poisson’s ratio l"0)3).
The following abbreviations are used: Direct is the conventional skyline direct solver; PCG(t)
and Lanczos(t) are the PCG and Lanczos solvers, respectively, with the preconditioning matrix
produced via a complete, or an incomplete Cholesky factorization controlled by the rejection
parameter t. A value of t between 0 and 1 corresponds to an incomplete Cholesky preconditioner, while t"0 gives the complete factorized matrix. NCG-i is the NCG solver with i terms of
the Neumann series expansion. Finally, the two variants of the two-level methodology, combined
with the FETI-2 variant, namely the GSI(PCG)-FETI-2 and GSI(NCG)-FETI-2 methods, are
compared with the other solvers, both in serial and parallel computing modes, in the case of GFD
sensitivity analysis problems.
Connecting rod problem. The problem definition is given in Figure 5. The linearly varying
line load between key points 4 and 6 has a maximum value of p"500 N/mm. The objective is to
minimize the volume of the structure, subject to an equivalent stress limit of p "1200 N/mm.
The design model, which makes use of symmetry, consists of 12 key points, 4 primary design
variables (7, 10, 11, 12) and 6 secondary design variables (7, 8, 9, 10, 11, 12). The stress constraint
is imposed as a global constraint for all the Gauss points and as key point constraint for the
key points 2, 3, 4, 5, 6 and 12. The movement directions are indicated by the dashed arrows.
Key points 8 and 9 are linked to 7 so that the shape of the arc is preserved throughout
the optimization procedure. The problem is analysed with a fine mesh having 39 200 d.o.f.
resulting in a dense global stiffness matrix with relatively narrow bandwidth. The characteristic
d.o.f. for 4 and 8 subdomains, as depicted in Figure 6, are given in Table V. The ESA and the GFD
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
296
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
Figure 5. Connecting rod: (a) initial shape; (b) final shape
Figure 6. Connecting rod: subdivision in 4 and 8 subdomains
sensitivity analysis methods are used to compute the sensitivities with perturbation value
*s"10\.
Table VI demonstrates the performance of the methods operated in a sequential mode for the
case of ESA sensitivity analysis. In the basic FETI method and its variants the operations are
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
297
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
Table V. Connecting rod: characteristic d.o.f.
for 4 and 8 subdomains
Subdomains
Total d.o.f.
Internal d.o.f.*
Interface d.o.f.
4
8
39,200
10,038
474
39,200
5,118
1,098
* Of the larger subdomain
Table VI. Connecting rod: performance of
the methods in sequential mode with ESA
sensitivity analysis
Method
(4 subdomains1 processor)
Time
(s)
Storage
(Mbytes)
186
220
209
205
224
257
234
198
47
35
16
34
15
14
14
8
Direct skyline
NCG-1
Lanczos (1E!9)
PCG (0)
PCG (1E!9)
FETI
FETI-1
FETI-2
Table VII. Connecting rod: performance of the methods in parallel mode with
ESA sensitivity analysis
Right-handsides
Method
(4 processors)
FETI
FETI-2
Method
(8 processors)
FETI
FETI-2
1
2
3
4
5
Time
(s)
Storage
(Mbytes)
10
10
8
8
86
69
14
8
13
13
11
11
59
48
9
6
Iterations
26
26
12
12
10
10
Iterations
36
36
16
16
15
15
carried out in 4 subdomains. Following the results depicted in Table I it is expected that the
performance of the basic FETI method and its variants will be further improved for an increased
number of subdomains. Table VII demonstrates the performance of the basic FETI method and
FETI-2 variant, for the case of ESA sensitivity analysis, operated on parallel computing mode in
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
298
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
Table VIII. Connecting rod: performance of the
methods in serial mode with GFD sensitivity analysis
Method
(4 subdomains1 processor)
Time
(s)
Direct skyline
NCG-1
PCG (0)
PCG (1E!9)
FETI
FETI-2
GSI(PCG)-FETI-2
GSI(NCG)-FETI-2
Storage
(Mbytes)
844
335
341
374
1,085
823
368
360
47
36
34
15
14
8
9
10
Table IX. Connecting rod: performance of the methods in parallel mode with GFD sensitivity analysis
Right-hand
sides
1
2
3
Method (4 processors)
FETI
FETI-2
GSI(PCG)-FETI-2
GSI(NCG)-FETI-2
5
26
26
26
26
time
(s)
Storage
(Mbytes)
6
6
349
264
124
122
14
8
9
10
7
7
247
185
88
87
9
6
7
8
Iterations
26
26
26
26
26
26
14
14
26
26
12
12
10
10
Method (8 processors)
FETI
FETI-2
GSI(PCG)-FETI-2
GSI(NCG)-FETI-2
4
9
9
8
8
8
8
7
7
Iterations
36
36
36
36
36
36
17
17
36
36
15
14
12
12
36
36
11
11
10
10
36
36
8
9
8
8
4 and 8 processors using 4 and 8 subdomains, respectively. Tables VIII and IX depict the
performance of the methods, for the case of GFD sensitivity analysis, operated on sequential and
parallel computing modes, respectively. In Tables VII and IX the iteration history is also depicted
for five right-hand sides which correspond to the initial finite element solution and the sensitivity
analysis for the four design variables of the problem. In the case of the two level methods the
number of iterations corresponds to the number of FETI iterations for the two global iterations
required for convergence.
Square plate problem with a central cut-out. The problem definition of this example is given in
Figure 7 where, due to symmetry, only a quarter of the plate is modelled. The plate is under
biaxial tension with one side loaded with a distributed loading p"0)1538 N/mm and the other
side loaded only with half of this value, as shown in Figure 7. The objective is to minimize the
volume of the structure subject to an equivalent stress limit of p "7)0 N/mm. The design
model consists of 8 key points and 5 primary design variables (2, 3, 4, 5, 6) which can move along
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
299
Figure 7. Square plate: (a) initial shape; (b) final shape
Figure 8. Square plate: subdivision in 4 and 8 subdomains
radial lines. The movement directions are indicated by the dashed arrows. The stress constraints
are imposed as a global constraint for all the Gauss points and as key point constraints for the key
points 2, 3, 4, 5, 6 and 8. The problem is analysed with a fine mesh with 38 800 d.o.f. resulting in
a sparse global stiffness matrix with relatively large bandwidth. The characteristic d.o.f. for 4 and
8 subdomains, as depicted in Figure 8, are given in Table X. The ESA and the GFD methods are
used to compute the sensitivities with *s"10\.
Table XI demonstrates the performance of the methods operated in a sequential mode for the
case of ESA sensitivity analysis. In the basic FETI method and its variants the operations are
carried out in 4 subdomains. Table XII demonstrates the performance of the basic FETI method
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
300
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
Table X. Square plate: characteristic d.o.f. for
4 and 8 subdomains
Subdomains
Total d.o.f.
Internal d.o.f.*
Interface d.o.f.
4
8
38,800
9,738
998
38,800
5,122
2,290
* Of the larger subdomain
Table XI. Square plate: performance of the methods
in sequential mode with ESA sensitivity analysis
Method
(4 subdomains1 processor)
Direct skyline
Lanczos (0)
Lanczos (1E!9)
PCG (0)
PCG (1E!9)
FETI
FETI-1
FETI-2
Time
(s)
Storage
(Mbytes)
502
524
417
514
425
486
466
414
95
63
29
61
27
43
43
26
Table XII. Square plate: performance of the methods in parallel mode with ESA
sensitivity analysis
Right-hand
sides
1
2
4
5
6
Time
(s)
Storage
(Mbytes)
method (4 processors)
FETI(no reorth)
FETI
FETI-2(no reorth)
FETI-2
67
33
67
33
60
16
60
16
Iterations
65
51
13
10
65
51
13
10
53
9
53
9
53
8
53
8
420
150
306
120
41
43
24
26
method (8 processors)
FETI(no reorth)
FETI
FETI-2(no reorth)
FETI-2
271
64
269
64
266
24
267
24
Iterations
253 219
18
14
253 220
18
14
199
11
199
11
204
11
205
11
398
92
290
70
20
23
13
16
3
and the FETI-2 variant, for the case of ESA sensitivity analysis, operated on parallel computing
mode in 4 and 8 processors using 4 and 8 subdomains, respectively. The benefits from the use of
the re-orthogonalization are also evident both in terms of PCPG iterations and computing time.
Tables XIII and XIV depict the performance of the methods, for the case of GFD sensitivity
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
301
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
Table XIII. Square plate: performance of the
methods in serial mode with GFD sensitivity
analysis
Method
(4 subdomains)
Time
(s)
Storage
(Mbytes)
Direct skyline
NCG-1
PCG (0)
PCG (1E!9)
FETI
FETI-2
GSI(PCG)-FETI-2
GSI(NCG)-FETI-2
2,790
732
745
714
2,108
1,782
795
779
95
65
61
27
43
26
27
29
Table XIV. Square plate: performance of the methods in parallel mode with GFD sensitivity analysis
Right-hand
sides
1
2
3
Method (4 processors)
FETI
FETI-2
GSI(PCG)-FETI-2
GSI(NCG)-FETI-2
33
33
33
33
33
33
33
33
Method (8 processors)
FETI
FETI-2
GSI(PCG)-FETI-2
GSI(NCG)-FETI-2
64
64
64
64
17
17
13
14
12
12
64
64
24
24
64
64
19
18
17
16
4
Iterations
33
33
10 10
8
11 11
9
Iterations
64
64
14 12 11
14 13 11
5
6
33
33
33
33
8
7
7
7
7
6
64
64
9
10
Time
(s)
Storage
(Mbytes)
6
6
667
574
256
249
43
26
27
29
7
7
396
332
165
163
23
16
17
18
64
64
9
9
8
8
analysis, operated on sequential and parallel computing modes, respectively. In Tables XII and
XIV the iteration history is also depicted for six right-hand sides, which correspond to the initial
finite element solution and the sensitivity analysis for the five design variables of the problem. In
the case of the two level methods the number of iterations corresponds to the number of FETI
iterations for the two global iterations required for convergence.
DISCUSSION OF THE RESULTS AND CONCLUSIONS
The variant FETI-1, where the rigid-body modes are computed explicitly, increases the robustness of the basic FETI method for ill-conditioned problems. Although re-orthogonalization
ameliorates this deficiency of the basic FETI method, it is still present in 3-D large-scale and/or
2-D ill-conditioned problems, where the basic FETI method was found incapable to converge to
stricter tolerances for a class of ill-conditioned problems. The second variant FETI-2, where the
local problem is solved via the PCG algorithm, retains the characteristics of the first variant and
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
302
M. PAPADRAKAKIS AND Y. TSOMPANAKIS
operates on reduced storage requirements which are particularly important when solving largescale 3-D problems. The benefits from the reduced computer storage demands of the FETI-2
variant are retained when more subdivisions are allocated at each processor since the beneficial
effect of solving local problems with smaller bandwidths is equally exploited by the basic FETI
method and by FETI-2 variant.
In the case of sensitivity analysis with the ‘exact’ semi-analytical approach, the two FETI
variants outperform the basic FETI method, both in sequential and in parallel computing modes,
as well as the direct skyline solver, the global PCG and Lanczos methods in sequential computing
mode. In the connecting rod problem with a small bandwidth, the direct skyline solver is slightly
faster than the FETI-2 variant with re-orthogonalization, but requires 6 times more computer
storage. In the square plate problem with a relatively sparse stiffness matrix the FETI-2 method
with re-orthogonalization is 1)25 times faster than the direct skyline solver and requires 6 times
less computer storage. For this example the global PCG and Lanczos methods with incomplete
Cholesky factorization preconditioners outperform the skyline solver, both in terms of computing
time and storage. In all cases the re-orthogonalization procedure proved to be extremely
beneficial to the performance of the domain decomposition methods.
In the case of sensitivity analysis with the global finite-difference approach, the superiority of
the hybrid solution methods is more pronounced. The global PCG and Lanczos methods are 2)5
and 4 times faster than the direct solver in the dense and sparse example, respectively, and reduce
storage requirements by 65 per cent in both cases. The proposed two-level methods, namely the
GSI(PCG)-FETI-2 and GSI(NCG)-FETI-2 (with two terms in the Neumann series expansion),
appear to perform equally well, in the sequential mode, compared to the global PCG and
Lanczos methods with incomplete Cholesky factorization preconditioners, while they outperform
the direct skyline solver and the one-level FETI and its two variants by a factor ranging from 2.3
to 3.5 for both examples. In the parallel computing mode the superiority of the two-level methods
is retained over the one-level domain decomposition methods by a factor of more than 2 in both
examples considered. The benefits of using the FETI-2 variant instead of the basic FETI method
is even more pronounced in this case.
It has to be pointed out that the overall optimization time with the global finite-difference
sensitivity analysis is drastically reduced using the parallel version of the proposed two-level
GSI(PCG)-FETI methods. More specifically, using the direct skyline solver, the optimization
time with the ‘exact’ semi-analytical sensitivity analysis is 5 times faster than the corresponding
time required by the global finite-difference sensitivity analysis, whereas using the parallel
two-level PCG methods this ratio is reduced to 2. It is anticipated that for large-scale threedimensional problems and for more design variables the superiority of the proposed methods
will become even more evident. Therefore by using the two-level methods in a parallel computing environment, the performance of GFD sensitivity analysis can become competitive to the
SA type of sensitivity analysis particularly in large-scale 3-D shape optimization problems.
Bearing in mind the simplicity and ease of the implementation of the GFD compared to that
of the SA, the former approach could be a prospective alternative to other sensitivity analysis
approaches.
ACKNOWLEDGEMENTS
This work has been supported by HC & M/9203390 project of the European Union. The authors
wish to thank E. Hinton and J. Sienz for providing the shape optimization code ADOPT. The
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION
303
authors are also grateful to D. Harbis for his assistance with the linear solution test examples that
are presented in this work.
REFERENCES
1. C. Farhat and F.-X. Roux, ‘A method of finite element tearing and interconnecting and its parallel solution algorithm’,
Int. J. Numer Meth. Engng., 32, 1205—1227 (1991).
2. M. Papadrakakis, Y. Tsompanakis, E. Hinton and J. Sienz, ‘Advanced solution methods in topology optimization
and shape sensitivity analysis’, J. Engng. Comput., 13(5), 57—90 (1996).
3. M. Papadrakakis, ‘Domain decomposition techniques for computational structural mechanics’, in M. Papadrakakis
(ed.), Parallel Solution Methods in Computational Mechanics, Wiley, New York, 1997, pp. 87—141.
4. J. Sienz, ‘Integrated structural modelling, adaptive analysis and shape optimization’, Ph.D. ¹hesis, Dept. of Civil Eng.,
Univ. of Wales, Swansea, UK, 1994.
5. K. U. Bletzinger, S. Kimmich and E. Ramm, ‘Efficient modeling in shape optimal design’, Comput. Systems Engng.,
2(5/6), 483—495 (1991).
6. N. Olhoff, J. Rasmussen and E. Lund, ‘Method of exact numerical differentiation for error estimation in finite element
based semi-analytical shape sensitivity analyses’, Special Report No. 10, Institute of Mechanical Engineering, Aalborg
University, Aalborg, DK, 1992.
7. M. Papadrakakis and S. Smerou, ‘A new implementation of the Lanczos method in linear problems’, Int. J. Numer.
Meth. Engng., 29, 141—159 (1990).
8. C. Farhat and F.-X. Roux, ‘Implicit parallel processing in structural mechanics’, Comput. Mech. Adv., 2, 1—124 (1994).
9. M. Papadrakakis and S. Bitzarakis, ‘A dual substructure method for solving large-scale problems on parallel
computers’, Report 95-1, Institute of Structural Analysis and Seismic Research, NTUA, Athens, Greece, 1995.
10. N. Bitoulas and M. Papadrakakis, ‘An optimised computer implementation of the incomplete Cholesky factorization’, Comput. Systems Engng., 5(3), 265—274 (1994).
11. S. Bitzarakis, M. Papadrakakis and A. Kotsopoulos, ‘Parallel solution techniques in computational structural
mechanics’, Comput. Meth. Appl. Mech. Eng., 148, 75—104 (1997).
12. C. Farhat, L. Crivelli and F.-X. Roux, ‘Extending substructure based iterative solvers to multiple load and repeated
analyses’, Comput. Meth. Appl. Mech. Engng., 195—209 (1994).
13. M. Sharp and C. Farhat, ‘TOPDOMDEC—A totally object oriented program for visualisation, domain decomposition and parallel processing’, ºser’s Manual, PGSoft and University of Colorado, Boulder, USA, 1994.
14. E. Hinton and J. Sienz, ‘Studies with a robust and reliable structural shape optimization tool’, in B. H. V. Topping
(ed.), Developments in Computational ¹echniques for Structural Engineering, CIVIL-COMP Press, Edinburgh, 1995,
pp. 343—358.
Copyright 1999 John Wiley & Sons, Ltd.
Int. J. Numer. Meth. Engng. 44, 281—303 (1999)
Документ
Категория
Без категории
Просмотров
4
Размер файла
376 Кб
Теги
model, elements, mesh, rigid, finite, coarsening, analysis, objects, representation
1/--страниц
Пожаловаться на содержимое документа