# Coarsening of mesh models for representation of rigid objects in finite element analysis

код для вставкиСкачатьINTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING Int. J. Numer. Meth. Engng. 44, 281—303 (1999) DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION OF SHAPE SENSITIVITY ANALYSIS PROBLEMS MANOLIS PAPADRAKAKIS* AND YIANNIS TSOMPANAKIS Institute of Structural Analysis and Seismic Research, National Technical University of Athens, Athens 15773, Greece ABSTRACT This paper presents the implementation of advanced domain decomposition techniques for parallel solution of large-scale shape sensitivity analysis problems. The methods presented in this study are based on the FETI method proposed by Farhat and Roux which is a dual domain decomposition implementation. Two variants of the basic FETI method have been implemented in this study: (i) FETI-1 where the rigid-body modes of the floating subdomains are computed explicitly. (ii) FETI-2 where the local problem at each subdomain is solved by the PCG method and the rigid-body modes are computed explicitly. A two-level iterative method is proposed particularly tailored to solve re-analysis type of problems, where the dual domain decomposition method is incorporated in the preconditioning step of a subdomain global PCG implementation. The superiority of this two-level iterative solver is demonstrated with a number of numerical tests in serial as well as in parallel computing environments. Copyright 1999 John Wiley & Sons, Ltd. KEY WORDS: shape sensitivity analysis; domain decomposition methods; preconditioned conjugate gradient; multiple right-hand sides; reanalysis problems INTRODUCTION Shape optimization aims to improve the shape of a structure, defined with a number of parameters called design variables, by minimizing an objective function subject to certain constraint functions. The shape optimization algorithm proceeds with the following steps: (i) a finite element mesh is generated, (ii) displacements, stresses, frequencies, etc. are evaluated depending on the type of optimization problem, (iii) the gradients, or the sensitivities of the functions are computed by perturbing each design variable by a small amount, (iv) the optimization problem is solved and the new shape of the structure is defined. These steps are repeated until convergence has occurred. The most time-consuming part of a gradient-based optimization process is devoted to the sensitivity analysis phase. For this reason several techniques have been developed for the efficient calculation of the sensitivities in an optimization problem. The semi-analytical and the finite-difference approaches are the two most widely used types of sensitivity analysis techniques. From the algorithmic point of view the semi-analytical technique results in a typical linear solution problem with multiple right-hand sides in which the stiffness * Correspondence to: Manolis Papadrakakis, Department of Civil Engineering, National Technical University, Zografou Campus, GR 157 73 Athens, Greece. E-mail: mpapadra@central.ntua.gr Contract/grant sponsor: European Union; Contract/grant number: HC&M/9203390 CCC 0029—5981/99/020281—23$17.50 Copyright 1999 John Wiley & Sons, Ltd. Received 24 June 1996 Revised 20 February 1998 282 M. PAPADRAKAKIS AND Y. TSOMPANAKIS matrix remains the same, while the finite-difference technique results in a typical re-analysis problem in which the stiffness matrix is modified due to the perturbations of the design variables. Shape optimization problems are usually computationally intensive tasks, where 60—90 per cent of the computations are spent for the solution of equilibrium equations required for the finite element analysis and sensitivity analysis. Although it is widely recognized that hybrid solution methods, based on a combination of direct and iterative solvers for solving linear finite element equations, outperform their direct counterparts, in sequential as well as parallel computing environments, little effort has been devoted until now to their implementation in the field of structural optimization. In a recent paper by Papadrakakis et al., the performance of various iterative solution methods, based on PCG and Lanczos algorithms, in sequential computing environment was demonstrated and compared with the conventional direct skyline solver in a number of topology and shape optimization problems. In the present study two variants of the basic FETI method of Farhat and Roux are implemented for solving sensitivity analysis problems using the semi-analytical approach, while an innovative two-level parallel solution method is proposed for solving sensitivity analysis problems using the global finite-difference approach. In the two variants of the basic FETI method the rigid-body modes of the floating subdomains are computed explicitly, while in the second variant a PCG iterative solver with a strong preconditioner is also used for the solution of the local problem in each subdomain. These two variants have a beneficial effect on the robustness of basic FETI method for ill-conditioned problems, while the second variant operates under reduced storage requirements. In the present study a two-level iterative method is proposed specially tailored for solving reanalysis type of problems in general, a special case of which is the problem arising when the global finitedifference sensitivity analysis approach is used. For this type of problems the two variants of the basic FETI method are incorporated in the preconditioning step of a global subdomain implementation of the PCG method. This two-level subdomain implementation is applicable in serial and parallel computing environments resulting in a drastic improvement of computing time compared to the conventional one-level methods. SHAPE SENSITIVITY ANALYSIS Sensitivity analysis is the most important and time-consuming part of gradient-based structural optimization. Several techniques have been developed which can be mainly distinguished by their numerical efficiency and their implementation aspects. The primary objective of sensitivity analysis is to compute the derivatives of the displacement field with respect to perturbations of the primary design variables. The methods for sensitivity analysis can be divided into discrete and variational methods. In the variational approach the sensitivity coefficients can be determined by applying basic variational theorems. In this case, the sensitivities are given as boundary and surface integrals which are solved after the structure has been discretized, whereas in the discrete approach they are evaluated using the finite element equations. The implementation for discrete methods is simpler than the one for variational techniques. A further classification of the discrete methods is the following: (i) Global finite-difference method where the derivatives needed for the solution of the optimization problem are computed numerically. (ii) Semi-analytical method where the calculation of the sensitivities is performed via analytical and numerical expressions. (iii) Analytical method where the derivatives of the objective and constraint functions are obtained analytically. Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION 283 The decision on which method to implement depends strongly on the type of problem, the organization of the computer program and the access to the source code. The implementation of analytical and semi-analytical methods is more complex and requires access to the source code, whereas when a finite-difference method is applied the formulation is much simpler. Depending on the type of problem and the approach used, the sensitivity analysis together with the finite element analysis can take 60—90 per cent of the total computational effort required to solve the whole optimization problem. An efficient and reliable sensitivity analysis module that can take full advantage of the innovative computer architectures with parallel processing capabilities could therefore result in a considerable reduction of the overall computational effort of the optimization procedure. In the present study the emphasis is given on the application of efficient domain decomposition methods in parallel computing environments for solving the systems of the algebraic equations encountered in the two most commonly used types of sensitivity analysis, namely the global finite difference and the semi-analytical approaches. ¹he Global Finite-Difference (GFD) approach The GFD approach provides a simple way of computing the sensitivity coefficients. This method requires the solution of the linear system of equations Ku"f, where u is the displacement vector, for the original design variables s, and for each perturbed design variable sN"s#*s , where *s is the magnitude of the perturbation, usually taken in the range of I I I I 10\—10\ of the value of the design variable. The design sensitivities for the displacements *u/*s are computed using a forward difference scheme: I *u u(s #*s )!u(s ) I I *u/*s + " I I *s *s I I (1) where u(s #*s ) is evaluated by solving the following reanalysis type of problem: I I K(s #*s )u(s #*s )"f (s #*s ) I I I I I I (2) The Semi-Analytical (SA) approach The SA approach is based on the chain rule differentiation of the finite element equations Ku"f : K *f *u *K # u" *s *s *s I I I (3) *u "f * I *s I (4) which when rearranged results in K where *f *K f *" ! u I *s *s I I Copyright 1999 John Wiley & Sons, Ltd. (5) Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 284 M. PAPADRAKAKIS AND Y. TSOMPANAKIS f * represents a pseudo-load vector. The derivatives of *K/*s and *f /*s are computed for each I I I design variable by recalculating the new values of K(s #*s ) and f (s #*s ) for a small I I I I perturbation *s of the design variable s , while the stiffness matrix remains unchanged throughI I out the whole sensitivity analysis. The Conventional Semi-Analytical (CSA) approach. In the CSA approach, the values of the derivatives in equation (3) are calculated by applying the forward-difference approximation *K K(s #*s )!K(s ) I I I *K/*s + " I *s *s I I (6) * f f (s #*s )!f (s ) I I *f/*s + " I I *s *s I I (7) For maximum efficiency of the semi-analytical approach only those elements which are affected by the perturbation of a certain design variable are involved into the calculations of *f/*s and I *K/*s . I The *Exact+ Semi-Analytical (ESA) approach. The CSA approach may suffer some drawbacks in particular types of shape optimization problems. This is due to the fact that in the numerical differentiation of the element stiffness matrix with respect to shape design variables the components of the pseudo-load vector associated with the rigid-body rotation do not vanish. The solution suggested by Olhoff et al. alleviates the problem by performing an ‘exact’ numerical differentiation of the elemental stiffness matrix as follows: *k L *k *a H " *a *s *s I H H I (8) where n is the number of elemental nodal coordinates affected by the perturbation of the design variable s and a is the nodal co-ordinate of the element which can be either an x- or I H a y-co-ordinate. This means that by perturbing a design variable all nodes of an element on the perturbed boundary are perturbed as well and the summation is carried out only for the perturbed nodal co-ordinates. The term *a /*s is computed using the forward-difference H I scheme while the derivative *k/*a is computed by differentiating the element stiffness matrix H expression. SOLUTION METHODS IN SENSITIVITY ANALYSIS PROBLEMS The implementation of hybrid solution schemes in structural optimization, which are based on a combination of direct and preconditioned iterative solvers, has not yet received the attention it deserves from the research community, even though in finite element linear solution problems and particularly when dealing with large-scale applications their efficiency is well documented. In a recent study by Papadrakakis et al., a class of efficient hybrid solution methods was applied in the context of shape sensitivity analysis and topology optimization problems in sequential computing environment. Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION 285 In this work domain decomposition methods are applied for solving the sensitivity analysis part of shape optimization problems, in both sequential and parallel computing environments, after being properly modified to address the special features of the particular type of problems at hand. The most computational intensive part of sensitivity analysis, using either semi-analytical or finite-difference sensitivity analysis approaches, is the solution of finite element equilibrium equations (4) or (2), respectively. In the first case, the coefficient matrix remains constant and only the right-hand-side vector is changing, which is the typical case for solving linear systems with multiple right-hand sides, while in the second case the coefficient matrix is slightly modified therefore a typical reanalysis problem needs to be solved. Solving sensitivity analysis problems with the semi-analytical approach One of the main shortcomings of iterative solution methods is encountered when a sequence of right-hand sides has to be processed. In such cases direct methods possess a clear advantage over iterative methods since the most computationally intensive part, associated with the factorization of the stiffness matrices, is not repeated and only a backward and forward substitution is required for each subsequent right-hand side. The Lanczos method has been used in the past for treating a sequence of right-hand sides. An efficient implementation of Lanczos method was proposed by Papadrakakis and Smerou which handles all approximations to the solution vectors simultaneously without the necessity for storing the tridiagonal matrix and the orthonormal basis. This approach, however, cannot be implemented in problems with multiple right-hand sides that are not known at the beginning of the iterative procedure. The most efficient method for this type of problems is the dual domain decomposition method FETI with the projection—re-orthogonalization scheme for handling problems with multiple or repeated right-hand sides. Subsequently, the basic FETI method, its two variants and the re-orthogonalization procedure will be briefly presented. The basic FETI method. The FETI method proposed by Farhat and Roux is considered to be very promising method for solving large-scale problems in both shared and distributed memory computer architectures due to its very satisfactory numerical and parallel scalability features. This method operates on totally disconnected subdomains, while the governing equilibrium equations are derived by invoking stationarity of the energy functional subject to displacement constraints which enforce the compatibility conditions on the subdomain interface. The augmented equations are solved for the Lagrange multipliers after eliminating the unknown displacements. The resulting interface problem is in general indefinite due to the presence of floating subdomains which do not have enough prescribed displacements to eliminate the local rigid-body modes. The application of the FETI method results in the following interface problem: KC BC2 BC 0 fC uC j " 0 (9) where KCuC"f C are the unassembled equations for all subdomains and BC"[B B ) ) ) BQ ] are signed Boolean matrices which localize the subdomain displacements on the interface and ‘s’ is the total number of subdomains. The vector of Lagrange multipliers j represents the Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 286 M. PAPADRAKAKIS AND Y. TSOMPANAKIS interaction forces between the subdomains along their common boundary that impose the continuity of the structure. The system of equation (9) is indefinite but has a unique solution provided that the global problem is adequately restrained. Care has to be taken for the handling of the floating subdomains which are characterized by a singular stiffness matrix KH. The system of equation (9) may be explicitly written for fixed subdomains as uH"KH\( f H!BH2j) (10) uH"KH>( f H!BH2j)#RHcH (11) and for floating subdomains as where KH> is a generalized inverse of KH for floating subdomains, RH corresponds to the rigid-body modes of the floating subdomain j, and cH specifies a linear combination of these. If the stiffness matrix KH of a floating subdomain is partitioned as KH KH 2 KH KH where KH has full rank, then KH> and RH are given by \ \ KH 0 !KH KH and RH" KH>" I 0 0 KH" (12) (13) The additional equations required for determining cH are provided by the zero energy condition of the rigid-body modes RH2KHuH"0 (14) The combination of the compatibility conditions of equation (9) for the displacements uH on the interface d.o.f. of the subdomains with equations (10), (11) and (14) gives the following interface problem: F !G ' ' !G2 0 ' j c f " f H (15) A where Q Q F " BHKH>BH2, c"[c ) ) ) cQD]2, f " BHKH>f H ' H H H 2 G "[B ) R ) ) ) BQD ) RQD ], f "[ f ) R ) ) ) f QD2 ) RQD ]2 ' A and s is the total number of floating subdomains. The solution of this indefinite problem can be D performed by a Preconditioned Conjugate Projected Gradient (PCPG) algorithm. The preconditioning matrix adopted in this study is the lumped-type preconditioner which does not require any additional storage, involves only matrix vector products on the interface level and often outperforms the optimal but expensive Dirichlet preconditioner which requires additional storage equivalent to the storage of the factorized stiffness matrices of each subdomain. Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION 287 Two variants of the basic FETI method. In the basic FETI method the solution of the local problem of equations (10) or (11) is required associated with both internal and interface d.o.f. of each subdomain. The solution of this problem by a direct Cholesky solver is most advantageous for two reasons: (i) A solution of equations (10) or (11) has to be performed at each PCPG iteration for calculating, via fast forward and backward substitutions, the product KH>+dH, K ( j"1, . . . , s), where +dH, is the direction vector in mth iteration. (ii) The rigid-body modes K RH of the floating subdomains are computed as a by-product of the factorization procedure according to equation (13). An alternative way of treating the solution of the local problems is to use a PCG solver where the preconditioning matrix is computed by an incomplete, or even a complete factorization of the stiffness matrix but stored in single precision arithmetic. In this approach the mixed precision PCG implementation proposed in Reference 10 with a strong preconditioner is adopted in which all computations are performed in single precision arithmetic except for the matrix—vector multiplication, occurring during the recursive evaluation of the residual vector, which is performed in double precision arithmetic. The stiffness matrix is stored in double precision arithmetic but in compact form which demands only a small fraction of the memory needed by the skyline storage scheme especially for large-scale 3-D problems. The storage requirements of this mixed precision PCG are at most half the storage of the direct Cholesky (in the case of a complete factorized preconditioner since it is stored in single precision arithmetic) plus the memory for the compact stored stiffness matrix and the four auxiliary vectors of the PCG method, thus resulting in a great reduction of the memory demands. This implementation proved to be a robust and reliable solution procedure even for handling large and ill-conditioned problems, while it is computer storage-effective. It was also demonstrated to be more cost-effective than double precision arithmetic calculations for the same storage demands. By adopting this alternative treatment for the solution of the local problem at each subdomain the rigid-body modes cannot be obtained as a by-product of the factorization procedure as previously described for the basic FETI method. Bitzarakis et al. have proposed an analytical handling of the rigid-body modes for all types of finite elements which permits the incorporation of the PCG algorithm with a strong preconditioner for the solution of the local problem in the basic FETI method. Under this formulation the floating subdomains are fixed with constraints equal to the number of local singularities which is less or equal to 3 for 2-D problems and is less or equal to 6 for 3-D ones. For a 2-D problem discretized with plane stress/strain finite elements, the rigid-body modes associated with 2 translations and 1 rotation produce the following displacements at node i of subdomain j: RH"[R R R ]2 G (16) with 1 R " 0 , 0 0 !y G R " 1 , R " x G 0 0 where x , y are the co-ordinates of node i. The rigid body modes matrix RH required for the G G computation of G is formed by ' RH"[RH RH . . . RH ]2 (17) L Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 288 M. PAPADRAKAKIS AND Y. TSOMPANAKIS where n is the total number of nodes of the subdomain. Similar expressions for RH may be defined G for plate, shell and 3-D elasticity problems. Whenever the boundary conditions of the structure impose a restriction in some of the translational/rotational movements of a floating subdomain, then only the remaining ‘active’ rigid-body modes of this subdomain need to be calculated and used in the computations. The detection of the correct number of rigid-body modes, which corresponds to the size of the null space of the stiffness matrix of floating subdomains, is a very important issue for the FETI method. The analytical handling of the rigid-body modes requires a slightly complicated algorithm in the case of partially supported subdomains. In the rather extreme case of subdomains that contain non-rigid-body zero-energy modes the presented procedure is not directly applicable and needs further refinement. The benefits of using the analytical calculation of the rigid-body modes are twofold. The first is related to the accuracy of the computations involved in the evaluation of the rigid-body modes and their linear combinations c, while the second is associated with the implementation of a reduced storage solver for the solution of the local problem. During the factorization procedure of the basic FETI method the restraints are imposed on the last degrees of freedom of the floating subdomains and as a result the computation of the rigid-body modes is infected by round-off errors which become more pronounced for ill-conditioned problems. Thus, the explicit calculation of the rigid-body modes results to a more stabilized procedure which becomes more evident in large-scale 3-D and ill-conditioned problems. ¹he re-orthogonalization procedure for treating repeated right-hand sides. A re-orthogonalization procedure has been proposed recently by Farhat et al. for extending the PCG method to problems with multiple and/or repeated right-hand sides based on the K-conjugate property of the search directions (d "d2 Kd "0 for mOi). The implementation of the re-orthogonalization K K G technique is impractical when applied to the full problem KuH"f H due to excessive storage requirements. However, this methodology has been efficiently combined with the FETI method where the size of the interface problem can be order(s) of magnitude less than the size of the global problem. Thus, the cost of re-orthogonalization is negligible compared to the cost of the solution of the local problems associated with the matrix-vector products of F with the interface ' direction vectors of the FETI method, while the additional memory requirements are not excessive. The modified search direction of the PCPG algorithm is given by K d2F d d "d ! G ' K> d G K> K> d2F d G G ' G (18) which enforces explicitly the orthogonality condition d F d "0, i"1, . . . , m. The initial K> ' G estimate jG> of the solution vector of the subsequent right-hand side [ f H> f H> ]2 H A of equation (15) is given by jH>"D x#x I (19) where D2F D x"D2 ( f H>!F x) and x"G (G2G )\f H>. I ' I I H ' ' ' ' A Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION 289 Solving sensitivity analysis problems with the global finite-difference approach The hybrid solution schemes proposed in Reference 2 for treating reanalysis type of problems, based on the global formulation and solution of the problem of equation (4), proved to be very efficient compared with the standard skyline solver in a sequential computing environment. Their parallel implementation, however, is hindered by the inherent scalability difficulties encountered during the preconditioning step which incorporates forward and backward substitutions of a fully factorized stiffness matrix. In order to alleviate this deficiency the Global Subdomain Implementation (GSI) of a subdomain-by-subdomain PCG algorithm is implemented in this study on the global stiffness matrix. The dominant matrix-vector operations of the stiffness and the preconditioning matrices are performed in parallel on the basis of the same multi-element group partitioning of the entire domain. In order to exploit the parallelizable features of the GSI(PCG) method and to take advantage of the efficiency of a fully factorized preconditioning matrix, the following two-level methodology is proposed based on the combination of the global subdomain implementation and the FETI method. The GSI(PCG) method is employed, using a multi-element group partitioning of the entire finite element domain, in which the solution required during the preconditioning step is performed by the FETI method, or any of its variants, operating on the same mesh partitioning of the GSI(PCG) method. In the proposed methodology the preconditioning step of the GSI(PCG) method z "C\r (20) K> I K> is performed by FETI. For the solution of this problem two methodologies, namely the GSI(PCG)-FETI and the GSI(NCG)-FETI are proposed. The second approach is based on a Neumann series expansion of the preconditioning step. The GSI (PCG )-FETI method. In the GSI(PCG)-FETI method the iterations are performed on the global level with the GSI(PCG) method, using an incomplete Cholesky factorization of the stiffness matrix as preconditioner. Thus, the incomplete factorization of the stiffness matrix K #*K can be written as LDL2"K #*K!E, where E is an error matrix which does not have to be formed. Matrix E is usually defined by the computed positions of ‘small’ elements in ¸ which do not satisfy a specified magnitude criterion and therefore are discarded. For the typical reanalysis problem (K #*K)u"f (21) matrix E is taken as *K, so that the preconditioning matrix becomes the complete factorized initial stiffness matrix: C "K . Therefore, the solution of the preconditioning step of the I GSI(PCG) algorithm, which has to be performed at each GSI(PCG) iteration, can be effortlessly executed, once K is factorized, by a forward and backward substitution. With the parallel implementation of the two-level GSI(PCG)-FETI method the preconditioning step can be solved in parallel by the interface dual method for treating the repeated solutions required in equation (20), using the same decomposition of the domain employed by the external GSI(PCG) method. The procedure continues this way for every re-analysis problem, while the direction vectors of FETI are being re-orthogonalized in order to further decrease the number of PCPG iterations of the interface problem within the preconditioning step. The solution of equation (20) is performed n * n times via the FETI method, where n and n correspond to the G P G P number of GSI(PCG) iterations and the number of reanalysis steps, respectively. Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 290 M. PAPADRAKAKIS AND Y. TSOMPANAKIS The GSI (NCG )-FETI method. The combination of Neumann series expansion and PCG method on the global level for the solution of re-analysis problems in shape and topology optimization was investigated in a previous study. In this work the Neumann series expansion is used to improve the quality of the preconditioning step of the two-level method by computing a better approximation to the inverse preconditioning matrix. The preconditioning matrix is now defined as the complete stiffness matrix (K #*K), but the solution for z of equation (20) is K> performed approximately using a truncated Neumann series expansion. Thus, the preconditioned vector z of equation (20) is obtained at each iteration by K> !1 (22) z "(I#K!1 0 *K)\K0 r K> K> where the term in parenthesis can be expressed in a Neumann expansion giving z "(I!P#P!P#) ) ) )K\ r (23) K> K> with P"K\ *K. The preconditioned residual vector of equation (23) can now be represented by the following series: z "z !z #z !z #) ) ) K> (24) with z "K\ r (25) K> z "K\ (*Kz ), i"1, 2, . . . , (26) G\ The incorporation of the Neumann series expansion in the preconditioned step of the PCG algorithm can be seen from two different perspectives. From the PCG point of view, an improvement of the quality of the preconditioning matrix is achieved by computing a better approximation to the solution of u"(K #*K)\ f during the preconditioning step, than the one provided by the preconditioning matrix K\. From the point of view of the Neumann series expansion, the inaccuracy entailed by the truncated series is alleviated by the conjugate gradient iterative procedure. In the present study the following parallel implementation of the two-level GSI(NCG)-FETI method for solving equation (21) is used: A GSI(NCG) method is performed as described in the previous section, in which the preconditioning step is performed according to equation (24). Equations (25) and (26) can be solved in a parallel computing environment by the FETI method utilizing the same decomposition of the domain adopted for the external GSI(NCG) method. The procedure continues this way for every reanalysis problem, while the direction vectors of FETI are being re-orthogonalized in order to keep the number of PCPG iterations of the interface problem small. The solution of equations (25) and (26) is a typical multiple right hand-side problem and it is performed n ;n ; n times via the FETI method, where n , n , n correspond to H G P H G P the number of terms in the Neumann series expansion, the number of GSI(NCG) iterations and the number of re-analysis steps, respectively. NUMERICAL TESTS Before presenting the performance of the methods discussed for the solution of shape optimization problems it would be appropriate to demonstrate the performance of the two variants of the Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION 291 Figure 1. 3-D cantilever beam basic FETI method for solving 2-D and 3-D ill-conditioned problems. All computational results reported in this section were run on a SG Power Challenge XL shared memory machine with R8000 processors. The three versions of the FETI method which are tested and compared are the following: The basic FETI method where the computation of the rigid-body modes of the floating subdomains is performed as a by-product of the factorization procedure. The FETI-1, where the computation of the rigid-body modes is performed explicitly via equations (16) and the position of the pseudoboundary conditions is located by an automatic procedure that avoids close positioning of the restrained d.o.f. The FETI-2, where the rigid-body modes are computed explicitly via equations (16) and the local problem is solved via the PCG algorithm where the preconditioner is the completely factorized subdomain stiffness matrix stored in single precision arithmetic. In all test cases the basic FETI method and its variants are applied with re-orthogonalization unless otherwise stated and the lumped-type preconditioner is used for the PCPG algorithm. Linear solution test examples 3-D cantilever beam problem. The three versions of FETI are applied first to the 3-D cantilever beam of Figure 1 which is discretized with 31 104 solid-cube elements resulting in 105 000 d.o.f. In order to deteriorate the conditioning of the problem the Poisson ratio is taken l"0)499. The characteristic d.o.f. for 6, 12, 18, 24 and 36 subdomains are depicted in Table I. The convergence tolerance for this example was taken as 3;10\. Table I shows the performance of the methods in 6 processors when the structure is divided with one way dissection into 6, 12, 18, 24 and 36 subdomains. The global problem size is fixed and the number of subdomains is increased in an effort to reduce the cost of the local factorizations and solutions and reduce storage requirements. It can be observed that an improvement in the computing time is accomplished by the FETI-2 variant. This improvement becomes more evident when no re-orthogonalization is performed. It was found that using more subdivisions at each subdomain is equally advantageous for the basic FETI method and for the variant FETI-2. It can also be seen that the benefits from the reduced computer storage demands of the FETI-2 method are still significant when more subdomains are allocated at each processor, while the gains in computational efficiency remain almost the same. Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 292 M. PAPADRAKAKIS AND Y. TSOMPANAKIS Table I. 3-D cantilever beam: Characteristic d.o.f. and performance of the methods in six processors Performance of the methods with re-orthogonalization Number of subdomains Interf d.o.f. Local d.o.f. 6 3,705 18,525 12 8,151 8,892 18 12,597 6,669 24 17,043 5,187 36 25,935 2,964 Method Iter PCPG time (s) Local time (s) Total time (s) Storage (Mb) FETI FETI-1 FETI-2 FETI FETI-1 FETI-2 FETI FETI-1 FETI-2 FETI FETI-1 FETI-2 FETI FETI-1 FETI-2 117 117 118 154 154 151 178 178 178 216 212 213 312 298 299 25 25 26 94 94 92 158 158 158 259 254 255 694 661 663 2,176 2,174 2,072 1,490 1,489 1,379 1,059 1,057 982 942 925 864 991 922 867 2,282 2,280 2,178 1,660 1,659 1,548 1,290 1,288 1,212 1,276 1,254 1,196 1,763 1,660 1,609 682 682 391 509 509 312 393 393 263 351 350 254 361 355 292 Figure 2. (a) 2-D plane strain problem; (b) 8 subdomains with optimal aspect ratio 2-D plane strain problem. The basic FETI method and its variants are also applied to the 2-D plane strain problem of Figure 2(a) which is discretized with 20 385 elements resulting in 41 188 d.o.f. as shown in Figure 2(b). In order to increase the conditioning of the problem the Poisson ratio is taken as l"0)4999, while the convergence tolerance for this example is taken as 10\. Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 293 DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION Table II. 2-D plane strain problem: number of iterations of basic FETI and FETI-1 for the optimal decomposition to 8 subdomains Tolerance 10E!1 10E!2 10E!3 10E!4 10E!5 10E!6 Method FETI FETI-1 206 206 245 245 With re-orthogonalization 278 * * 277 297 325 * 339 FETI FETI-1 1,165 824 1,648 1,146 No re-orthogonalization 2,324 * * 1,512 1,717 2,131 * 2,356 * No convergence Figure 3. 2-D plane strain cantilever beam with different support conditions This problem is subdivided using TOPDOMDEC to 8 subdomains with optimal aspect ratio resulting in 1254 interface d.o.f., as shown in Figure 2(b). The performance of the methods is presented in Table II. It can be observed that the analytical computation of the rigid-body modes in FETI-1 is beneficial even for applications with optimal subdomain aspect ratio, since the basic FETI method cannot manage to achieve half the required tolerance even with re-orthogonalization. Without re-orthogonalization the basic FETI presents a much slower convergence rate than FETI-1 variant. The improved convergence behaviour of FETI-1 variant depicted in Table II is related to the accuracy with which the rigid-body modes and the linear combination of these are computed. In the case of the basic FETI method the computation of the rigid-body modes is infected by the round-off errors which are developed during the factorization of the stiffness matrix of the floating subdomains. This infection becomes more pronounced for ill-conditioned problems and the improvement with the FETI-1 variant becomes much more evident when no re-orthogonalization is performed. 2-D plane strain cantilever beam problem. The basic FETI method and its variant are also applied to the 2-D cantilever beam of Figure 3 which is discretized with 5000 elements with 10 200 d.o.f. Two test cases were considered for this example with two types of supports, as shown in Figures 3(a) and 3(b), in order to investigate their influence on the convergence behaviour of the methods. For the first test case the beam is optimally subdivided into 8 subdomains with 514 interface d.o.f. In Figure 3(b) the two supports are very close together in order to demonstrate the Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 294 M. PAPADRAKAKIS AND Y. TSOMPANAKIS Table III. 2-D cantilever beam—Test case 1: number of iterations of basic FETI and FETI-1 Poisson ratio 0)3 Method FETI FETI-1 51 51 distant supports 116 * 116 231 FETI FETI-1 57 57 close supports * * 118 240 0)49 0)4999 * No convergence Table IV. 2-D cantilever beam—Test case 2: number of iterations of basic FETI and FETI-1 Subdomains 2 4 8 Method FETI FETI-1 * 60 l"0)4999 * 241 116 240 FETI FETI-1 * 65 l"0)499999 * 316 172 314 *No convergence effect of large rigid-body movements associated with small strains on the convergence behaviour of the FETI method. As can be seen in Table III the location of the supports affects the convergence behaviour of the methods. Furthermore, the basic FETI method is much more sensitive to the position of the restraints since the influence of the infected rigid-body modes by the round-off errors appears to be more decisive for badly restrained structures. Table IV depicts the performance of the methods for the cantilever beam of Figure 4 for different number of subdivisions. The results confirm the superiority of FETI-1 over the basic FETI method which is affected by the accuracy with which the rigid-body modes are computed during the factorization of the subdomain stiffness matrices. This accuracy is reduced as the size of the subdomains become larger. Shape optimization test examples Two benchmark shape optimization test examples are used to demonstrate the efficiency of the proposed methods using the shape optimization code ADOPT. The first example is a long, slender domain which has a rather dense global stiffness matrix with narrow bandwidth, whereas the global stiffness matrix of the second example has a sparse pattern with a relatively large Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION 295 Figure 4. 2-D plane strain cantilever beam with different number of subdivisions bandwidth. The performance of the proposed solution methods is investigated and compared first in serial computing mode with the conventional direct skyline and PCG, NCG and Lanczos solvers proposed in Reference 2. NCG is a PCG algorithm in which the preconditioning step is performed via a Neumann series expansion of the stiffness matrix. Furthermore, the parallel performance of basic FETI method and its variants is investigated in both types of sensitivity analysis problems, while the two-level PCG method is applied for the GFD sensitivity analysis test cases. The convergence tolerance for all solution methods was taken as 10\. For these test examples plane stress conditions and isotropic material properties were assumed (elastic modulus E"210 000 N/mm and Poisson’s ratio l"0)3). The following abbreviations are used: Direct is the conventional skyline direct solver; PCG(t) and Lanczos(t) are the PCG and Lanczos solvers, respectively, with the preconditioning matrix produced via a complete, or an incomplete Cholesky factorization controlled by the rejection parameter t. A value of t between 0 and 1 corresponds to an incomplete Cholesky preconditioner, while t"0 gives the complete factorized matrix. NCG-i is the NCG solver with i terms of the Neumann series expansion. Finally, the two variants of the two-level methodology, combined with the FETI-2 variant, namely the GSI(PCG)-FETI-2 and GSI(NCG)-FETI-2 methods, are compared with the other solvers, both in serial and parallel computing modes, in the case of GFD sensitivity analysis problems. Connecting rod problem. The problem definition is given in Figure 5. The linearly varying line load between key points 4 and 6 has a maximum value of p"500 N/mm. The objective is to minimize the volume of the structure, subject to an equivalent stress limit of p "1200 N/mm. The design model, which makes use of symmetry, consists of 12 key points, 4 primary design variables (7, 10, 11, 12) and 6 secondary design variables (7, 8, 9, 10, 11, 12). The stress constraint is imposed as a global constraint for all the Gauss points and as key point constraint for the key points 2, 3, 4, 5, 6 and 12. The movement directions are indicated by the dashed arrows. Key points 8 and 9 are linked to 7 so that the shape of the arc is preserved throughout the optimization procedure. The problem is analysed with a fine mesh having 39 200 d.o.f. resulting in a dense global stiffness matrix with relatively narrow bandwidth. The characteristic d.o.f. for 4 and 8 subdomains, as depicted in Figure 6, are given in Table V. The ESA and the GFD Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 296 M. PAPADRAKAKIS AND Y. TSOMPANAKIS Figure 5. Connecting rod: (a) initial shape; (b) final shape Figure 6. Connecting rod: subdivision in 4 and 8 subdomains sensitivity analysis methods are used to compute the sensitivities with perturbation value *s"10\. Table VI demonstrates the performance of the methods operated in a sequential mode for the case of ESA sensitivity analysis. In the basic FETI method and its variants the operations are Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 297 DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION Table V. Connecting rod: characteristic d.o.f. for 4 and 8 subdomains Subdomains Total d.o.f. Internal d.o.f.* Interface d.o.f. 4 8 39,200 10,038 474 39,200 5,118 1,098 * Of the larger subdomain Table VI. Connecting rod: performance of the methods in sequential mode with ESA sensitivity analysis Method (4 subdomains1 processor) Time (s) Storage (Mbytes) 186 220 209 205 224 257 234 198 47 35 16 34 15 14 14 8 Direct skyline NCG-1 Lanczos (1E!9) PCG (0) PCG (1E!9) FETI FETI-1 FETI-2 Table VII. Connecting rod: performance of the methods in parallel mode with ESA sensitivity analysis Right-handsides Method (4 processors) FETI FETI-2 Method (8 processors) FETI FETI-2 1 2 3 4 5 Time (s) Storage (Mbytes) 10 10 8 8 86 69 14 8 13 13 11 11 59 48 9 6 Iterations 26 26 12 12 10 10 Iterations 36 36 16 16 15 15 carried out in 4 subdomains. Following the results depicted in Table I it is expected that the performance of the basic FETI method and its variants will be further improved for an increased number of subdomains. Table VII demonstrates the performance of the basic FETI method and FETI-2 variant, for the case of ESA sensitivity analysis, operated on parallel computing mode in Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 298 M. PAPADRAKAKIS AND Y. TSOMPANAKIS Table VIII. Connecting rod: performance of the methods in serial mode with GFD sensitivity analysis Method (4 subdomains1 processor) Time (s) Direct skyline NCG-1 PCG (0) PCG (1E!9) FETI FETI-2 GSI(PCG)-FETI-2 GSI(NCG)-FETI-2 Storage (Mbytes) 844 335 341 374 1,085 823 368 360 47 36 34 15 14 8 9 10 Table IX. Connecting rod: performance of the methods in parallel mode with GFD sensitivity analysis Right-hand sides 1 2 3 Method (4 processors) FETI FETI-2 GSI(PCG)-FETI-2 GSI(NCG)-FETI-2 5 26 26 26 26 time (s) Storage (Mbytes) 6 6 349 264 124 122 14 8 9 10 7 7 247 185 88 87 9 6 7 8 Iterations 26 26 26 26 26 26 14 14 26 26 12 12 10 10 Method (8 processors) FETI FETI-2 GSI(PCG)-FETI-2 GSI(NCG)-FETI-2 4 9 9 8 8 8 8 7 7 Iterations 36 36 36 36 36 36 17 17 36 36 15 14 12 12 36 36 11 11 10 10 36 36 8 9 8 8 4 and 8 processors using 4 and 8 subdomains, respectively. Tables VIII and IX depict the performance of the methods, for the case of GFD sensitivity analysis, operated on sequential and parallel computing modes, respectively. In Tables VII and IX the iteration history is also depicted for five right-hand sides which correspond to the initial finite element solution and the sensitivity analysis for the four design variables of the problem. In the case of the two level methods the number of iterations corresponds to the number of FETI iterations for the two global iterations required for convergence. Square plate problem with a central cut-out. The problem definition of this example is given in Figure 7 where, due to symmetry, only a quarter of the plate is modelled. The plate is under biaxial tension with one side loaded with a distributed loading p"0)1538 N/mm and the other side loaded only with half of this value, as shown in Figure 7. The objective is to minimize the volume of the structure subject to an equivalent stress limit of p "7)0 N/mm. The design model consists of 8 key points and 5 primary design variables (2, 3, 4, 5, 6) which can move along Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION 299 Figure 7. Square plate: (a) initial shape; (b) final shape Figure 8. Square plate: subdivision in 4 and 8 subdomains radial lines. The movement directions are indicated by the dashed arrows. The stress constraints are imposed as a global constraint for all the Gauss points and as key point constraints for the key points 2, 3, 4, 5, 6 and 8. The problem is analysed with a fine mesh with 38 800 d.o.f. resulting in a sparse global stiffness matrix with relatively large bandwidth. The characteristic d.o.f. for 4 and 8 subdomains, as depicted in Figure 8, are given in Table X. The ESA and the GFD methods are used to compute the sensitivities with *s"10\. Table XI demonstrates the performance of the methods operated in a sequential mode for the case of ESA sensitivity analysis. In the basic FETI method and its variants the operations are carried out in 4 subdomains. Table XII demonstrates the performance of the basic FETI method Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 300 M. PAPADRAKAKIS AND Y. TSOMPANAKIS Table X. Square plate: characteristic d.o.f. for 4 and 8 subdomains Subdomains Total d.o.f. Internal d.o.f.* Interface d.o.f. 4 8 38,800 9,738 998 38,800 5,122 2,290 * Of the larger subdomain Table XI. Square plate: performance of the methods in sequential mode with ESA sensitivity analysis Method (4 subdomains1 processor) Direct skyline Lanczos (0) Lanczos (1E!9) PCG (0) PCG (1E!9) FETI FETI-1 FETI-2 Time (s) Storage (Mbytes) 502 524 417 514 425 486 466 414 95 63 29 61 27 43 43 26 Table XII. Square plate: performance of the methods in parallel mode with ESA sensitivity analysis Right-hand sides 1 2 4 5 6 Time (s) Storage (Mbytes) method (4 processors) FETI(no reorth) FETI FETI-2(no reorth) FETI-2 67 33 67 33 60 16 60 16 Iterations 65 51 13 10 65 51 13 10 53 9 53 9 53 8 53 8 420 150 306 120 41 43 24 26 method (8 processors) FETI(no reorth) FETI FETI-2(no reorth) FETI-2 271 64 269 64 266 24 267 24 Iterations 253 219 18 14 253 220 18 14 199 11 199 11 204 11 205 11 398 92 290 70 20 23 13 16 3 and the FETI-2 variant, for the case of ESA sensitivity analysis, operated on parallel computing mode in 4 and 8 processors using 4 and 8 subdomains, respectively. The benefits from the use of the re-orthogonalization are also evident both in terms of PCPG iterations and computing time. Tables XIII and XIV depict the performance of the methods, for the case of GFD sensitivity Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 301 DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION Table XIII. Square plate: performance of the methods in serial mode with GFD sensitivity analysis Method (4 subdomains) Time (s) Storage (Mbytes) Direct skyline NCG-1 PCG (0) PCG (1E!9) FETI FETI-2 GSI(PCG)-FETI-2 GSI(NCG)-FETI-2 2,790 732 745 714 2,108 1,782 795 779 95 65 61 27 43 26 27 29 Table XIV. Square plate: performance of the methods in parallel mode with GFD sensitivity analysis Right-hand sides 1 2 3 Method (4 processors) FETI FETI-2 GSI(PCG)-FETI-2 GSI(NCG)-FETI-2 33 33 33 33 33 33 33 33 Method (8 processors) FETI FETI-2 GSI(PCG)-FETI-2 GSI(NCG)-FETI-2 64 64 64 64 17 17 13 14 12 12 64 64 24 24 64 64 19 18 17 16 4 Iterations 33 33 10 10 8 11 11 9 Iterations 64 64 14 12 11 14 13 11 5 6 33 33 33 33 8 7 7 7 7 6 64 64 9 10 Time (s) Storage (Mbytes) 6 6 667 574 256 249 43 26 27 29 7 7 396 332 165 163 23 16 17 18 64 64 9 9 8 8 analysis, operated on sequential and parallel computing modes, respectively. In Tables XII and XIV the iteration history is also depicted for six right-hand sides, which correspond to the initial finite element solution and the sensitivity analysis for the five design variables of the problem. In the case of the two level methods the number of iterations corresponds to the number of FETI iterations for the two global iterations required for convergence. DISCUSSION OF THE RESULTS AND CONCLUSIONS The variant FETI-1, where the rigid-body modes are computed explicitly, increases the robustness of the basic FETI method for ill-conditioned problems. Although re-orthogonalization ameliorates this deficiency of the basic FETI method, it is still present in 3-D large-scale and/or 2-D ill-conditioned problems, where the basic FETI method was found incapable to converge to stricter tolerances for a class of ill-conditioned problems. The second variant FETI-2, where the local problem is solved via the PCG algorithm, retains the characteristics of the first variant and Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) 302 M. PAPADRAKAKIS AND Y. TSOMPANAKIS operates on reduced storage requirements which are particularly important when solving largescale 3-D problems. The benefits from the reduced computer storage demands of the FETI-2 variant are retained when more subdivisions are allocated at each processor since the beneficial effect of solving local problems with smaller bandwidths is equally exploited by the basic FETI method and by FETI-2 variant. In the case of sensitivity analysis with the ‘exact’ semi-analytical approach, the two FETI variants outperform the basic FETI method, both in sequential and in parallel computing modes, as well as the direct skyline solver, the global PCG and Lanczos methods in sequential computing mode. In the connecting rod problem with a small bandwidth, the direct skyline solver is slightly faster than the FETI-2 variant with re-orthogonalization, but requires 6 times more computer storage. In the square plate problem with a relatively sparse stiffness matrix the FETI-2 method with re-orthogonalization is 1)25 times faster than the direct skyline solver and requires 6 times less computer storage. For this example the global PCG and Lanczos methods with incomplete Cholesky factorization preconditioners outperform the skyline solver, both in terms of computing time and storage. In all cases the re-orthogonalization procedure proved to be extremely beneficial to the performance of the domain decomposition methods. In the case of sensitivity analysis with the global finite-difference approach, the superiority of the hybrid solution methods is more pronounced. The global PCG and Lanczos methods are 2)5 and 4 times faster than the direct solver in the dense and sparse example, respectively, and reduce storage requirements by 65 per cent in both cases. The proposed two-level methods, namely the GSI(PCG)-FETI-2 and GSI(NCG)-FETI-2 (with two terms in the Neumann series expansion), appear to perform equally well, in the sequential mode, compared to the global PCG and Lanczos methods with incomplete Cholesky factorization preconditioners, while they outperform the direct skyline solver and the one-level FETI and its two variants by a factor ranging from 2.3 to 3.5 for both examples. In the parallel computing mode the superiority of the two-level methods is retained over the one-level domain decomposition methods by a factor of more than 2 in both examples considered. The benefits of using the FETI-2 variant instead of the basic FETI method is even more pronounced in this case. It has to be pointed out that the overall optimization time with the global finite-difference sensitivity analysis is drastically reduced using the parallel version of the proposed two-level GSI(PCG)-FETI methods. More specifically, using the direct skyline solver, the optimization time with the ‘exact’ semi-analytical sensitivity analysis is 5 times faster than the corresponding time required by the global finite-difference sensitivity analysis, whereas using the parallel two-level PCG methods this ratio is reduced to 2. It is anticipated that for large-scale threedimensional problems and for more design variables the superiority of the proposed methods will become even more evident. Therefore by using the two-level methods in a parallel computing environment, the performance of GFD sensitivity analysis can become competitive to the SA type of sensitivity analysis particularly in large-scale 3-D shape optimization problems. Bearing in mind the simplicity and ease of the implementation of the GFD compared to that of the SA, the former approach could be a prospective alternative to other sensitivity analysis approaches. ACKNOWLEDGEMENTS This work has been supported by HC & M/9203390 project of the European Union. The authors wish to thank E. Hinton and J. Sienz for providing the shape optimization code ADOPT. The Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999) DOMAIN DECOMPOSITION METHODS FOR PARALLEL SOLUTION 303 authors are also grateful to D. Harbis for his assistance with the linear solution test examples that are presented in this work. REFERENCES 1. C. Farhat and F.-X. Roux, ‘A method of finite element tearing and interconnecting and its parallel solution algorithm’, Int. J. Numer Meth. Engng., 32, 1205—1227 (1991). 2. M. Papadrakakis, Y. Tsompanakis, E. Hinton and J. Sienz, ‘Advanced solution methods in topology optimization and shape sensitivity analysis’, J. Engng. Comput., 13(5), 57—90 (1996). 3. M. Papadrakakis, ‘Domain decomposition techniques for computational structural mechanics’, in M. Papadrakakis (ed.), Parallel Solution Methods in Computational Mechanics, Wiley, New York, 1997, pp. 87—141. 4. J. Sienz, ‘Integrated structural modelling, adaptive analysis and shape optimization’, Ph.D. ¹hesis, Dept. of Civil Eng., Univ. of Wales, Swansea, UK, 1994. 5. K. U. Bletzinger, S. Kimmich and E. Ramm, ‘Efficient modeling in shape optimal design’, Comput. Systems Engng., 2(5/6), 483—495 (1991). 6. N. Olhoff, J. Rasmussen and E. Lund, ‘Method of exact numerical differentiation for error estimation in finite element based semi-analytical shape sensitivity analyses’, Special Report No. 10, Institute of Mechanical Engineering, Aalborg University, Aalborg, DK, 1992. 7. M. Papadrakakis and S. Smerou, ‘A new implementation of the Lanczos method in linear problems’, Int. J. Numer. Meth. Engng., 29, 141—159 (1990). 8. C. Farhat and F.-X. Roux, ‘Implicit parallel processing in structural mechanics’, Comput. Mech. Adv., 2, 1—124 (1994). 9. M. Papadrakakis and S. Bitzarakis, ‘A dual substructure method for solving large-scale problems on parallel computers’, Report 95-1, Institute of Structural Analysis and Seismic Research, NTUA, Athens, Greece, 1995. 10. N. Bitoulas and M. Papadrakakis, ‘An optimised computer implementation of the incomplete Cholesky factorization’, Comput. Systems Engng., 5(3), 265—274 (1994). 11. S. Bitzarakis, M. Papadrakakis and A. Kotsopoulos, ‘Parallel solution techniques in computational structural mechanics’, Comput. Meth. Appl. Mech. Eng., 148, 75—104 (1997). 12. C. Farhat, L. Crivelli and F.-X. Roux, ‘Extending substructure based iterative solvers to multiple load and repeated analyses’, Comput. Meth. Appl. Mech. Engng., 195—209 (1994). 13. M. Sharp and C. Farhat, ‘TOPDOMDEC—A totally object oriented program for visualisation, domain decomposition and parallel processing’, ºser’s Manual, PGSoft and University of Colorado, Boulder, USA, 1994. 14. E. Hinton and J. Sienz, ‘Studies with a robust and reliable structural shape optimization tool’, in B. H. V. Topping (ed.), Developments in Computational ¹echniques for Structural Engineering, CIVIL-COMP Press, Edinburgh, 1995, pp. 343—358. Copyright 1999 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng. 44, 281—303 (1999)

1/--страниц