INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, VOL. 39, 3933-3951 (1996)
AN ITERATIVE SOLUTION SCHEME FOR SYSTEMS OF
WEAKLY CONNECTED BOUNDARY ELEMENT
EQUATIONS
K. DAVEY AND I. ROSINDALE
Department of Mechanical Engineering, UMIST, PO Box 88, Manchester M60 1QD, U.K.
SUMMARY
In this paper an iterative scheme of first degree is developed for solving linear systems of equations. The
systems investigated are those which are derived from boundary integral equations and are of the form
Sum_{j=1}^{N} H_ij x_j = c_i, i = 1, 2, ..., N, where the H_ij are matrices, and the x_j and c_i are column vectors. In addition,
N denotes the number of domains and for i != j, H_ij is considered to be small in some sense. These systems,
denoted as weakly connected, are solved using first-order iterative techniques initially developed by the
authors for solving single-domain problems. The techniques are extended to solve multi-domain problems.
Novel solution strategies are investigated and procedures are developed which are computationally efficient.
Computation times are determined for the iterative procedures and for elimination techniques, indicating the
benefits of iterative techniques over direct methods for problems of this nature.
KEY WORDS linear systems; indirect methods; boundary elements
1. INTRODUCTION
Techniques for solving systems of linear simultaneous equations can be categorized as either
direct or iterative. An iterative method can be computationally more efficient than a direct
method provided the scheme converges and the rate of convergence is sufficiently rapid. Moreover, if the solution of a system of equations is approximately known, which is often the case for
physical problems, then the number of iterations required can be small. Fast convergence with
knowledge of the solution makes iterative procedures substantially more attractive than elimination techniques.
A number of authors^{1,2} have discounted the use of stationary iterative schemes, such as
Gauss-Seidel^{3,4} and Richardson,^5 for the solution of systems whose coefficients are
formulated from integral equations. This is due to the nature of the resulting coefficient matrix in
boundary element analysis, which is unsymmetric, with no guarantee that it is either
diagonally dominant or positive definite. However, some success has been achieved by transforming unsymmetric boundary element systems into equivalent symmetric^5 or equivalent row
diagonally dominant^6 systems. Unfortunately, in such cases, the cost of employing the transformation algorithm was found to be prohibitively high. In general, direct methods such as Gauss
elimination are preferred for solving systems of boundary element equations.
Iterative techniques with which a measure of success has been gained include the Lanczos,^{7,8}
Conjugate Gradient,^{9,3,4} Bi-Conjugate Gradient,^{9,10} and Generalised Minimal Residual^{11}
methods. A number of semi-iterative techniques have also been developed.^{12,13}
CCC 0029-5981/96/233933-19
(C) 1996 by John Wiley & Sons, Ltd.
Received 17 November 1994
Revised 3 January 1996
The authors have examined the first-order stationary iterative scheme^{14} E x^{(k+1)} = D x^{(k)} + d,
where E and D are square matrices. It was shown that by considering matrix D embedded in
a vector space^{15} and reducing its size with respect to a chosen matrix norm,
convergence rates can be successfully improved. Equation ordering and parameter
matrices were used to reduce the size of D. However, only single-domain problems were
considered. In multi-domain BEM analysis, the resulting coefficient matrix is unsymmetric
and block banded in structure.^{9,16} Techniques which are used for solving such systems often
take advantage of the numerous blocks of zeros which occur within the system matrix to
reduce storage requirements and computation. Efficient techniques for solving block-banded
systems have been developed by Crotty,^{16} Tomlin,^{17} Lachat et al.,^{18,19}
Das,^{20} Kane,^{21} and Stabrowski.^{22} Unfortunately, these techniques rely heavily on Gauss
elimination.
In this paper, systems of boundary integral equations of the form
Sum_{j=1}^{N} H_ij x_j = c_i, i = 1, 2, ..., N are investigated, where the H_ij are matrices, and the x_j and c_i are column vectors. In
addition, N denotes the number of domains and for i != j, H_ij is considered to be small in
some sense. Furthermore, H_ii is assumed to be a square invertible matrix. Systems of this
type result if the solution domain is divided into regions (subdomains) where large changes
in the field variable in one domain do not cause similar magnitudes of change in others.
Examples of problems of this type frequently occur in engineering and applied physics.
Heat transfer between regions through a resistive barrier is one example. Deformation of
blocks connected together by a relatively pliable material is another. The iterative schemes
considered are of the form E_i x_i^{(k+1)} = D_i x_i^{(k)} + d_i, where E_i and D_i are square matrices
and d_i and x_i are column vectors, with i indicating the domain operated on. A number
of solution strategies are considered. One approach is to use Gauss-Seidel for iteration
between the domains and the first-order scheme developed by the authors^{14} to solve
the system of equations for each domain. Moreover, a scheme that incorporates
the features of the first-order scheme and the block Gauss-Seidel procedure is considered.
This generalized scheme is shown to give higher rates of convergence for the examples
considered.
Although it is recognized in the literature that the order in which equations and domains are processed affects convergence, no simple tests are available to determine the
optimum order prior to the application of the solution procedure. Consequently, in
this paper domain order is investigated, and a method for determining the near-optimum
domain order is described. The philosophy behind the domain ordering procedure is an
extension of that utilised by the authors to determine equation order for single-domain
problems.^{14}
The mathematical foundations upon which the first-order scheme developed by the
authors^{14} is based are presented. Proofs are provided which indicate that the matrix
norm minimization strategies of equation ordering and parameter matrix selection^{14} ultimately
reduce the number of iterations required for convergence. These proofs reinforce the validity
of the tests for the determination of near-optimum domain ordering. The strategy for parameter matrix selection is discussed in detail and the structure of the optimum parameter matrix,
i.e. the matrix that enables convergence in one iteration, is introduced. The structure of the
parameter matrix effectively limits the minimization strategies and the implications of this are
discussed.
The iterative procedures are tested on a boundary element model that is used to predict the
temperatures on the surfaces of a die in the pressure die-casting process. Solution times are
obtained for the iterative procedures and for elimination techniques.
2. BACKGROUND THEORY
The system of equations considered can be represented in matrix form

Hx = c                                                              (1)

where H is a non-singular square matrix, and x and c are column matrices. It is assumed that
H has unit diagonal coefficients, which can always be arranged by row scaling. Consider the
multiplication of H with a non-singular square matrix w and the decomposition of H to give

w(I - L - U)x = wc                                                  (2)

where U and L are strictly upper and strictly lower triangular matrices, respectively. Equation (2)
can be manipulated to give an iterative procedure of the form

(I - wL)x^{(k+1)} = (I - w + wU)x^{(k)} + wc                        (3)

where k denotes the kth iterate.
Note that if w is set equal to I then (3) reduces to the Gauss-Seidel procedure. Furthermore, if
w is set equal to bI, where b is a parameter, then (3) reduces to a procedure denoted as successive
over- or under-relaxation^3 depending on the numerical value of the parameter. Consider replacing
x^{(k)} with x + e^{(k)} in (3), where x is the exact solution and e is the error. This gives

e^{(k+1)} = (I - wL)^{-1}(I - w + wU)e^{(k)} = Me^{(k)}             (4)

where M is known as the iteration matrix. A necessary and sufficient condition for the convergence of (3), for all vectors c, is that the eigenvalues of M have absolute values which are less than one,
i.e. the spectral radius of M, p(M), is strictly less than one.
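Scheme (3) is straightforward to prototype. The following NumPy sketch is not from the paper; the matrix H, the iteration count, and all names are illustrative. It builds the splitting H = I - L - U and applies the iteration; with w = I it reduces to Gauss-Seidel, as noted above.

```python
import numpy as np

def first_order_scheme(H, c, omega, x0, iters=50):
    """Apply scheme (3): (I - wL) x_{k+1} = (I - w + wU) x_k + wc,
    where H = I - L - U with L/U strictly lower/upper triangular
    (H is assumed row-scaled to a unit diagonal)."""
    n = H.shape[0]
    I = np.eye(n)
    L = -np.tril(H, -1)               # strictly lower part of the splitting
    U = -np.triu(H, 1)                # strictly upper part of the splitting
    A = I - omega @ L                 # left-hand matrix of (3)
    B = I - omega + omega @ U         # iteration matrix numerator of (3)
    x = x0.copy()
    for _ in range(iters):
        x = np.linalg.solve(A, B @ x + omega @ c)
    return x

# omega = I recovers the Gauss-Seidel procedure
H = np.array([[1.0, 0.2, 0.1],
              [0.3, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
c = np.array([1.0, 2.0, 3.0])
x = first_order_scheme(H, c, np.eye(3), np.zeros(3))
```

The illustrative H here is diagonally dominant, so the Gauss-Seidel special case is guaranteed to converge; the paper's interest is precisely in matrices without that guarantee.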
3. GENERAL CONCEPTS
In this section, the criterion for the determination of the parameter matrix w is discussed along with
the overall strategy for improving the convergence of iterative scheme (3). The basic idea is to
choose the parameter matrix to reduce the size of the matrix I - w + wU subject to the
restriction that I - wL is non-singular and of a form that is easily invertible. In fact, it is desirable,
provided this latter restriction is attained, and where possible, to select w so that I - w + wU = 0.
This is because equation (3) then reduces to

(I - wL)x^{(k+1)} = wc                                              (5)

and consequently convergence is assured in one iteration. However, it should be emphasized that
it is essential that the matrix I - wL is non-singular and of a simpler form than H, making it
computationally more desirable to solve (5) rather than (1). In practice, it is impractical to select
w so that I - w + wU = 0 as this would require inversion. Consequently, w is selected to reduce
the Frobenius norm,^{15} ||I - w + wU||_F. The reason for using the Frobenius norm, rather than
another norm, is that it is differentiable everywhere. In the propositions that follow, the form of
w is not restricted unless explicitly stated.
Proposition 1. If H is non-singular and the Frobenius norm ||I - w0 + w0 U||_F = 0, then
I - w0 L is non-singular.
Proof.

||I - w0 + w0 U||_F = 0  ==>  I - w0 + w0 U = 0  ==>  I = w0(I - U)     (6)

==>  I - w0 L = w0(I - U - L) = w0 H                                    (7)

Equation (6) implies that w0 is non-singular since det w0 = det(I - U)^{-1} = 1. Moreover,
I - w0 L is equal to the product of two invertible matrices and so is non-singular. []

This proposition ensures that equation (5) is solvable with a unique solution. The parameter
matrix w0 = (I - U)^{-1} is considered to be the optimum since I - w0 + w0 U = 0 and equation
(3) converges in the minimum number of iterations possible, i.e. one. Note that the optimum
matrix w0 is an upper triangular matrix with ones on the diagonal. This is an important
observation as it provides some restriction on the choice of w.
Proposition 2. If H is non-singular, there exists a positive real number d, with ||I - w + wU||_F
< d, such that I - wL and w are non-singular.

The following proof makes use of the infinity-norm ||w||_inf = max_i Sum_j |w_ij| along with the
inequality ||w||_inf <= n^{1/2} ||w||_F [13], where w is an n x n matrix. Moreover, use is made of the
continuity of the determinant function det: M_n(R) -> R, where M_n(R) is the set of n x n matrices
with real coefficients.
Proof. Consider

||w - w0||_inf = ||w - (I - U)^{-1}||_inf
             = ||(I - w(I - U))(I - U)^{-1}||_inf
             <= ||(I - U)^{-1}||_inf ||I - w + wU||_inf
             <= n^{1/2} ||(I - U)^{-1}||_inf ||I - w + wU||_F           (8)

Moreover, consider

||(I - wL) - (I - w0 L)||_inf = ||(w0 - w)L||_inf
                              <= n^{1/2} ||L||_inf ||(I - U)^{-1}||_inf ||I - w + wU||_F    (9)

The inequalities indicate that as ||I - w + wU||_F -> 0, w -> w0 and I - wL -> I - w0 L = w0 H.
In addition to this, det w -> det w0 = det(I - U)^{-1} = 1 and det(I - wL) -> det(w0 H) = det w0
det H = det H. Since det is a continuous function there exists an open e1-ball (e1-neighbourhood)
centred at w0 such that det w != 0. Likewise an open e2-ball centred at w0 H so that
det(I - wL) != 0 exists. It immediately follows that d1 exists with ||I - w + wU||_F < d1, so that
||w - w0||_inf < e1 with w non-singular. In addition, d2 exists with ||I - w + wU||_F < d2, so that
||(I - wL) - w0 H||_inf < e2 with (I - wL) non-singular. Thus, specifying d = min(d1, d2) completes the proof. []
This proposition indicates the desirability of choosing the parameter matrix to reduce the norm
||I - w + wU||_F since this ensures that, provided the norm is sufficiently small, I - wL is
non-singular.
Proposition 3. If H is non-singular, there exists a positive real number e with ||I - w + wU||_F < e
so that equation (3) converges.
Proof. Convergence of (3) is assured if the spectral radius p(M) of the iteration matrix M is less
than 1, i.e. p(M) < 1. Gershgorin's theorem provides the inequality p(M) <= ||M||_inf, where
||M||_inf = max_{1<=i<=n} Sum_j |m_ij|. Thus,

p(M) <= ||M||_inf = ||(I - wL)^{-1}(I - w + wU)||_inf
                 <= ||(I - wL)^{-1}||_inf ||I - w + wU||_inf
                 <= n^{1/2} ||(I - wL)^{-1}||_inf ||I - w + wU||_F      (10)

Moreover, note that

||(I - wL)^{-1}||_inf = ||(I - wL)^{-1} - (I - w0 L)^{-1} + (I - w0 L)^{-1}||_inf
                     <= ||(I - wL)^{-1} - (I - w0 L)^{-1}||_inf + ||(I - w0 L)^{-1}||_inf    (11)

Proposition 2 ensures that d2 exists with ||I - w + wU||_F < d2, so that ||(I - wL) - w0 H||_inf
< e2 with (I - wL) non-singular. Thus, for ||I - w + wU||_F < d2 there exists an e2-ball centred at
w0 H within which (I - wL)^{-1} exists. It follows that the uniform boundedness
principle^{15} can be applied to inequality (11) to give ||(I - wL)^{-1}||_inf < P, where P is a positive
number independent of w. This inequality holds over an e3-ball centred at w0 H with e3 <= e2.
Note, for convenience, it is assumed that the d2-ball and e3-ball coincide as this can always be
made possible either by decreasing d2 or increasing P. Hence, specifying e = min(d2, e3), where
e3 = (n^{1/2} P)^{-1}, provides the inequality

p(M) <= n^{1/2} P e < 1                                              (12)

Hence, the spectral radius is less than 1 and convergence is assured. []
Proposition 4. If H is non-singular and (3) converges in K iterations with K > 1, then there exists
a positive e with ||I - w + wU||_F < e so that equation (3) converges in L iterations with L < K.

Proof. The proof of this proposition follows immediately from Proposition 1, since if the form
of w is unrestricted it can be made equal to w0 = (I - U)^{-1} and convergence can be achieved in
one iteration. []

Propositions 1-4 indicate the desirability of reducing ||I - w + wU||_F. The following two
sections examine strategies for the reduction of ||I - w + wU||_F with the proviso that I - wL is
a lower triangular matrix.
4. EQUATION ORDERING
In this section, the idea of using equation and column ordering of the original matrix H to
reduce ||I - w + wU||_F is examined. Note that after ordering the parameter matrix is
chosen to minimize ||I - w + wU||_F, and if w = I then I - w + wU = U, and the following
inequality holds:

||I - w + wU||_F <= ||U||_F <= n^{1/2} ||U||_inf                     (13)

The basic idea is to use row (equation) and column ordering to place the smaller terms of H in
U to minimize either ||U||_F or ||U||_inf. Once the reordering is complete the parameter matrix is
selected to minimize ||I - w + wU||_F. It should be recognized that there is no guarantee that
||I - w + wU||_F will be smaller with reordering, but this is the most likely outcome. The authors
examined the effects of reordering in Reference 14 where it was found, for the test cases
considered, that changing rows and columns to minimize ||U||_inf gave the best result. The
mechanics of the reordering scheme is quite simple and is described in detail in Reference 14. It
should be recognized, however, that the ordering procedure may not give rise to the smallest value
of ||U||_inf achievable. However, the procedure is simple to implement and computationally
efficient, and is therefore recommended by the authors.
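The ordering scheme of Reference 14 is not reproduced here. Purely to illustrate the objective, and at factorial cost suitable only for tiny systems, one can search the symmetric row/column permutations exhaustively for the one minimizing the norm of U; this brute-force sketch is the author's illustration, not the procedure of Reference 14.

```python
import itertools
import numpy as np

def best_symmetric_ordering(H):
    """Exhaustively search row/column permutations of H (applied
    symmetrically, so the unit diagonal is preserved) for the one
    minimizing the Frobenius norm of the strictly upper part U.
    Illustrative only: O(n!) cost, not the scheme of Reference 14."""
    n = H.shape[0]
    best_p, best_norm = None, np.inf
    for p in itertools.permutations(range(n)):
        Hp = H[np.ix_(p, p)]
        norm_U = np.linalg.norm(np.triu(Hp, 1))   # ||U||_F of reordered H
        if norm_U < best_norm:
            best_p, best_norm = p, norm_U
    return best_p, best_norm
```

For H = [[1, 0.9], [0.1, 1]], swapping the two equations and unknowns moves the large 0.9 coefficient below the diagonal, leaving only 0.1 in U.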
5. PARAMETER MATRIX
The class of parameter matrices considered here is restricted to be of the upper bidiagonal form

w = (w_ij), where the w_{i,i+1} are arbitrary, w_ii = 1, with all other coefficients set to zero    (14)

Note that this matrix is of the same form as the optimum matrix w0, i.e. an upper triangular
matrix with ones on the principal diagonal. This is the only form for which I - wL is a lower
triangular matrix, thus enabling back substitution. The fact that L is strictly lower triangular has allowed
the terms w_{i,i+1}, i = 1, ..., n - 1, to be included.
The basic idea behind the selection of the parameter matrix is to allow further reduction of the
iteration matrix M. The authors demonstrated that ordering to minimize ||U||_inf and selecting
the parameter matrix to minimize ||I - w + wU||_F gives the best results for single-domain
problems.^{14} The coefficients w_{i,i+1} are therefore chosen to minimize ||I - w + wU||_F, subject to
certain constraints. In particular, the choice of w should ensure that w^{-1} and (I - wL)^{-1} exist.
Consider then the minimization of ||I - w + wU||_F^2, which is of the form^{14}

||I - w + wU||_F^2 = Sum_{i=1}^{n-1} (w_{i,i+1} + h_{i,i+1})^2
                   + Sum_{i=1}^{n-1} Sum_{j=i+2}^{n} (h_ij + w_{i,i+1} h_{i+1,j})^2    (15)
Since ||I - w + wU||_F^2 is a strictly convex function,^{23} it has a unique absolute minimum. This can
be determined by setting

d||I - w + wU||_F^2 / dw_{v,v+1} = 0,   v = 1, 2, ..., n - 1         (16)
This gives rise to the following set of equations:

(w_{v,v+1} + h_{v,v+1}) + Sum_{j=v+2}^{n} (h_vj + w_{v,v+1} h_{v+1,j}) h_{v+1,j} = 0,   v = 1, 2, ..., n - 1    (17)

These equations can be used to determine the w_{v,v+1}, i.e.

w_{v,v+1} = - (h_{v,v+1} + Sum_{j=v+2}^{n} h_{v+1,j} h_vj) / (1 + Sum_{j=v+2}^{n} h_{v+1,j}^2),   v = 1, 2, ..., n - 1    (18)

The success of scheme (3) depends upon the ability of the equation ordering procedure and
parameter matrix (14) to minimize ||I - w + wU||_F sufficiently to enable convergence.
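Equation (18) translates directly into code. The sketch below (illustrative, with 0-based indexing; not taken from the paper) computes the superdiagonal of w and, for n = 2, reproduces the fact that the bidiagonal form can drive ||I - w + wU||_F to zero.

```python
import numpy as np

def parameter_matrix(H):
    """Upper bidiagonal parameter matrix of form (14), with the
    superdiagonal chosen from equation (18) to minimize
    ||I - w + wU||_F. H is assumed row-scaled to a unit diagonal."""
    n = H.shape[0]
    w = np.eye(n)
    for v in range(n - 1):   # paper's v = 1, ..., n-1 in 0-based form
        num = H[v, v + 1] + H[v + 1, v + 2:] @ H[v, v + 2:]
        den = 1.0 + H[v + 1, v + 2:] @ H[v + 1, v + 2:]
        w[v, v + 1] = -num / den
    return w
```

For a 2 x 2 system the single coefficient w_{12} = -h_{12} zeroes I - w + wU entirely, so scheme (3) converges in one iteration, consistent with Proposition 1.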
6. MULTI-DOMAIN PROBLEMS
The three domain problem (see Figure 1) is used as a vehicle for the development of the iterative
procedures. The procedures are, however, general and easily applicable to a problem with an
Figure 1. Abstract three domain problem
arbitrary number of domains. The system of equations for the three domain system can be
represented in matrix form
     [H11 H12 H13] [x1]   [c1]
Hx = [H21 H22 H23] [x2] = [c2]                                       (19)
     [H31 H32 H33] [x3]   [c3]
where x1, x2 and x3 are column vectors representing the unknowns in domains 1, 2 and 3,
respectively. Note, the off-diagonal submatrices H_ij (i != j) contain the coefficients which represent
the connectivity between the domains, and in general are not fully populated, consisting of bands
of zeros.^{9,16} From a computational viewpoint it is undesirable to store these matrices in
two-dimensional arrays. However, it will be shown that the iterative schemes will allow the
coefficients of the off-diagonal matrices to be stored in a compacted form in a single-dimensional
array or on a computer storage medium. The diagonal matrices Hii are unsymmetric and fully
populated and there is no guarantee that they are either diagonally dominant or positive definite.
These matrices can be stored in two-dimensional arrays.
7. BLOCK GAUSS-SEIDEL ITERATION
Manipulating equation (19) gives a block Gauss-Seidel procedure of the form

[H11  0   0 ] [x1](k+1)   [c1]   [0 H12 H13] [x1](k)
[H21 H22  0 ] [x2]      = [c2] - [0  0  H23] [x2]                    (20)
[H31 H32 H33] [x3]        [c3]   [0  0   0 ] [x3]

where k is the number of 'block' iterations. This number represents how many times each block of
equations has been processed. Replacing x^{(k)} with x + e^{(k)}, where x is the exact solution and e is
the error, gives

e^{(k+1)} = M_GS e^{(k)}                                             (21)
where M_GS is the iteration matrix. It is clear on examination of equations (20) and (21) that if H12,
H13 and H23 are equal to zero matrices then convergence would be obtained in one (block)
iteration. In practice, this would only occur if the domains were totally disconnected. However, if
the domains are weakly connected then H12, H13 and H23 will be close to zero matrices, and this
would possibly result in convergence in a small number of iterations. Another consideration is the
order of the submatrices of H. For example, the scheme
[H11  0   0 ] [x1](k+1)   [c1]   [0 H13 H12] [x1](k)
[H31 H33  0 ] [x3]      = [c3] - [0  0  H32] [x3]                    (22)
[H21 H23 H22] [x2]        [c2]   [0  0   0 ] [x2]
may give a different rate of convergence from (20). In general, the reordering of the matrices will
give different performances. A simple test is required to determine the optimum order in advance
of application of the iterative procedure. The philosophy behind the domain ordering approach
adopted is an extension of that utilized to determine equation order for single-domain systems of
equations. Domain ordering is used to place the smaller off-diagonal submatrices in the upper
part of H, in an attempt to minimize the norm

|| [0 H12 H13]
   [0  0  H23] ||                                                    (23)
   [0  0   0 ]   F
Note, the diagonal matrices H_ii are assumed to have their diagonal terms set to one. This
provides some normalization to combat the effects of scaling. It is apparent that each block
iteration requires the inversion of the fully populated matrices H11, H22 and H33. A direct or an
iterative scheme could be used to invert these systems. The scheme adopted in this paper is the
iterative technique developed by the authors for solving single-domain problems,^{14} described
briefly here in Sections 2-5. The reason for choosing this technique is that, when tested on
single-domain boundary element systems, the scheme outperformed all other solution techniques
considered.^{14}
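The block Gauss-Seidel sweep (20) can be sketched as follows (illustrative NumPy; for brevity the diagonal blocks are solved directly here, whereas the paper inverts them with the first-order scheme of Sections 2-5):

```python
import numpy as np

def block_gauss_seidel(H, c, x, sweeps=50):
    """One realization of scheme (20) for an N-domain system.
    H is an N x N list-of-lists of NumPy blocks, c and x are lists
    of block right-hand sides and unknowns; the most recent value
    of every other domain is used as each block is solved."""
    N = len(H)
    x = [xi.copy() for xi in x]
    for _ in range(sweeps):
        for i in range(N):
            # subtract the off-diagonal couplings H_ij x_j, j != i
            rhs = c[i] - sum(H[i][j] @ x[j] for j in range(N) if j != i)
            x[i] = np.linalg.solve(H[i][i], rhs)
    return x
```

With weakly connected domains (small off-diagonal blocks) each sweep contracts the error strongly, which is the situation the paper targets.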
8. EXTENSION OF FIRST-ORDER SCHEME
Consider the first row of equation (19), i.e.

H11 x1 = -H12 x2 - H13 x3 + c1                                       (24)

Multiplying this equation with a parameter matrix w1 (14) gives

w1 H11 x1 = -w1 H12 x2 - w1 H13 x3 + w1 c1                           (25)

Decomposing w1 H11 as in (2) and iterating as in (3), with the latest available values of the other domains on the right-hand side, gives, for the three domains in turn,

(I - w1 L11) x1^{(k+1)} = (I - w1 + w1 U11) x1^{(k)} - w1 H12 x2^{(k)} - w1 H13 x3^{(k)} + w1 c1          (26)

(I - w2 L22) x2^{(k+1)} = (I - w2 + w2 U22) x2^{(k)} - w2 H21 x1^{(k+1)} - w2 H23 x3^{(k)} + w2 c2        (27)

(I - w3 L33) x3^{(k+1)} = (I - w3 + w3 U33) x3^{(k)} - w3 H31 x1^{(k+1)} - w3 H32 x2^{(k+1)} + w3 c3      (28)

Equations (26)-(28) can be conveniently expressed in matrix form, i.e.

[I - w1 L11      0           0     ] [x1](k+1)   [I - w1 + w1 U11     -w1 H12          -w1 H13     ] [x1](k)   [w1 c1]
[  w2 H21    I - w2 L22      0     ] [x2]      = [      0         I - w2 + w2 U22      -w2 H23     ] [x2]    + [w2 c2]    (29)
[  w3 H31      w3 H32    I - w3 L33] [x3]        [      0               0          I - w3 + w3 U33 ] [x3]      [w3 c3]

The parameter matrices w1, w2 and w3 are chosen to minimize ||I - wi + wi Uii||_F, i = 1, 2, 3.
Thus, the same parameter matrices and equation ordering procedures are used for (29) and (20).
However, in an attempt to reduce the number of block iterations, the domains could be ordered to
minimize

|| [I - w1 + w1 U11     -w1 H12          -w1 H13     ]
   [      0         I - w2 + w2 U22      -w2 H23     ] ||            (30)
   [      0               0          I - w3 + w3 U33 ]   F

However, note that the diagonal blocks of (30) form the block diagonal matrix

[I - w1 + w1 U11        0               0        ]
[      0          I - w2 + w2 U22       0        ]                   (31)
[      0                0         I - w3 + w3 U33]

so minimization can be achieved by domain ordering to minimize equation (23) and selecting the
parameter matrices to minimize ||I - wi + wi Uii||_F, i = 1, 2, 3.
The difference between schemes (29) and (20) is that, for scheme (29), the coefficients of x are
updated by successive corrections, whereas for scheme (20) they are corrected on a block-by-block
basis. This observation suggests that a more general approach could be devised which incorporates the two procedures.
9. GENERALIZED MULTI-DOMAIN ITERATIVE PROCEDURE
An iterative scheme for the solution of equation (19) which incorporates the block Gauss-Seidel
and the extended stationary scheme is as follows:
(I - w1 L11) x1^{(k,p+1)} = (I - w1 + w1 U11) x1^{(k,p)} - w1 H12 x2^{(k,0)} - w1 H13 x3^{(k,0)} + w1 c1        (32)

where p = 0, 1, 2, ..., P - 1 and x1^{(k+1,0)} = x1^{(k,P)},

(I - w2 L22) x2^{(k,q+1)} = (I - w2 + w2 U22) x2^{(k,q)} - w2 H21 x1^{(k+1,0)} - w2 H23 x3^{(k,0)} + w2 c2      (33)

where q = 0, 1, 2, ..., Q - 1 and x2^{(k+1,0)} = x2^{(k,Q)},

(I - w3 L33) x3^{(k,r+1)} = (I - w3 + w3 U33) x3^{(k,r)} - w3 H31 x1^{(k+1,0)} - w3 H32 x2^{(k+1,0)} + w3 c3    (34)

where r = 0, 1, 2, ..., R - 1 and x3^{(k+1,0)} = x3^{(k,R)}.

The integers P, Q and R indicate the number of iterations performed on each individual
block of equations (inner iterative loop), and k denotes the block iterate (outer iterative loop). In the first instance x1^{(0,0)}, x2^{(0,0)} and x3^{(0,0)} are set to initial trial values.
Note, if P, Q and R are set equal to one, the extended first-order procedure is
obtained. If these integers are not specified and equations (32)-(34) are allowed to converge, then the block Gauss-Seidel procedure is obtained. An error analysis of these equations
yields
e1^{(k,p+1)} = M1 e1^{(k,p)} - (I - w1 L11)^{-1} w1 (H12 e2^{(k,0)} + H13 e3^{(k,0)})        (35)

e2^{(k,q+1)} = M2 e2^{(k,q)} - (I - w2 L22)^{-1} w2 (H21 e1^{(k+1,0)} + H23 e3^{(k,0)})      (36)

e3^{(k,r+1)} = M3 e3^{(k,r)} - (I - w3 L33)^{-1} w3 (H31 e1^{(k+1,0)} + H32 e2^{(k+1,0)})    (37)

where Mi = (I - wi Lii)^{-1} (I - wi + wi Uii), i = 1, 2, or 3. Performing P, Q, and R iterations on
equations (32)-(34), respectively, gives

e1^{(k+1,0)} = M1^P e1^{(k,0)} - (Sum_{i=0}^{P-1} M1^i) P1 w1 (H12 e2^{(k,0)} + H13 e3^{(k,0)})        (38)

e2^{(k+1,0)} = M2^Q e2^{(k,0)} - (Sum_{i=0}^{Q-1} M2^i) P2 w2 (H21 e1^{(k+1,0)} + H23 e3^{(k,0)})      (39)

e3^{(k+1,0)} = M3^R e3^{(k,0)} - (Sum_{i=0}^{R-1} M3^i) P3 w3 (H31 e1^{(k+1,0)} + H32 e2^{(k+1,0)})    (40)
Equations (38)-(40) can be expressed in matrix form

[   I        0      0] [e1](k+1,0)   [M1^P   -S1 H12   -S1 H13] [e1](k,0)
[ S2 H21     I      0] [e2]        = [ 0       M2^Q    -S2 H23] [e2]                 (41)
[ S3 H31  S3 H32    I] [e3]          [ 0        0        M3^R ] [e3]

where S1 = (Sum_{i=0}^{P-1} M1^i) P1 w1, S2 = (Sum_{i=0}^{Q-1} M2^i) P2 w2,
S3 = (Sum_{i=0}^{R-1} M3^i) P3 w3, and Pi = (I - wi Lii)^{-1}, i = 1, 2, or 3. Thus, for the iterative procedure defined by (32)-(34) the
domains should be ordered to minimize

|| [M1^P   S1 H12   S1 H13]
   [ 0      M2^Q    S2 H23] ||                                       (42)
   [ 0       0       M3^R ]   F

Again minimization can be achieved by domain ordering to minimize equation (23) and selecting
the parameter matrices to minimize ||I - wi + wi Uii||_F, i = 1, 2, 3.
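The generalized procedure of this section can be sketched compactly as follows (illustrative NumPy; for brevity each wi = I, i.e. plain Gauss-Seidel inside each block, whereas the paper uses the bidiagonal parameter matrices of Section 5):

```python
import numpy as np

def generalized_multidomain(H, c, x, inner, outer=40):
    """Generalized multi-domain iteration in the spirit of (32)-(34):
    'outer' is the block iterate k, inner[i] plays the role of P, Q, R
    for domain i. Each H[i][i] is assumed row-scaled to a unit
    diagonal, so H_ii = I - L_ii - U_ii as in Section 2."""
    N = len(H)
    x = [xi.copy() for xi in x]
    for _ in range(outer):
        for i in range(N):
            # off-block coupling, frozen during the inner iterations
            rhs = c[i] - sum(H[i][j] @ x[j] for j in range(N) if j != i)
            n_i = len(rhs)
            L = -np.tril(H[i][i], -1)
            U = -np.triu(H[i][i], 1)
            for _ in range(inner[i]):   # P (or Q, R) inner sweeps
                x[i] = np.linalg.solve(np.eye(n_i) - L, U @ x[i] + rhs)
    return x
```

Setting every inner[i] = 1 recovers the extended first-order procedure of Section 8; letting the inner loops run to convergence recovers block Gauss-Seidel, mirroring the remarks after (34).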
10. ELIMINATION TECHNIQUES
A direct method commonly employed for solving linear systems of algebraic equations is Gauss
elimination. The technique involves using row operations to reduce the system matrix H to an
upper triangular form. The solution is then obtained by simple back substitution. Partial (row)
pivoting is usually employed in the forward elimination stage to enhance the stability of the
scheme. The operations count for this procedure is O(2n^3/3) floating point operations (flops) for
the forward elimination, plus O(n^2) flops for the back substitution, where n is the number of
unknowns. A direct technique often employed in cases where the right-hand side vector c is
changing, but the coefficient matrix H is constant, is LU decomposition. This procedure involves
factorizing the coefficient matrix H into a lower triangular matrix L and an upper triangular
matrix U. If Crout's method^{24} is used to perform the factorization, then the coefficients of H can
be replaced by those of L and U, and thus no extra storage penalty is incurred. The factorized
system can be written in matrix form

Hx = (LU)x = L(Ux) = Ly = c                                          (43)
Once the factorization has been performed the solution can be obtained by first solving Ly = c
for the vector y, then Ux = y for x; y is obtained by forward substitution, x by back substitution.
The factorization requires O(2n^3/3) flops, and the forward and back substitutions O(n^2) flops
each. Thus, for a reasonably sized linear system, consisting of n equations and n unknowns, Gauss
elimination and LU decomposition are approximately equivalent in terms of the number of
operations required to obtain the solution. For a multi-domain problem such as that given by
equation (19), an efficient procedure would be to factorize each diagonal block Hii into an
equivalent LU form, and then to use block Gauss-Seidel for iteration between the domains. This
would require O(2n1^3/3 + 2n2^3/3 + 2n3^3/3) flops for the factorization, where ni is the number
of unknowns in domain i, and where n = n1 + n2 + n3. Each block iterative sweep would
then require O(2n1^2 + 2n2^2 + 2n3^2) flops for the forward plus back substitutions, plus
O(4n1n2 + 4n1n3 + 4n2n3) flops for the multiplication of the off-diagonal blocks.
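The counts above can be compared directly. The small sketch below uses hypothetical helper names; the formulas are those quoted in this section, assuming fully populated off-diagonal blocks.

```python
def block_scheme_flops(n_blocks, sweeps):
    """Blockwise LU factorization plus 'sweeps' block Gauss-Seidel
    iterations, using the operation counts quoted in Section 10."""
    factor = sum(2 * n**3 / 3 for n in n_blocks)        # per-block LU
    substitute = sum(2 * n**2 for n in n_blocks)        # forward + back solves
    couple = sum(4 * ni * nj                            # off-diagonal products
                 for a, ni in enumerate(n_blocks)
                 for b, nj in enumerate(n_blocks) if a < b)
    return factor + sweeps * (substitute + couple)

def gauss_elimination_flops(n):
    """Dense Gauss elimination on the assembled n x n system."""
    return 2 * n**3 / 3 + n**2
```

For model 1 of Section 12 (zones of 270 and 194 unknowns), the blockwise route stays well below dense elimination on the assembled 464-equation system even after dozens of block sweeps.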
11. COMPUTATION AND STORAGE
This section is concerned with the implementation of the multi-domain solution schemes
described in the previous sections, namely the block Gauss-Seidel, extended first-order, and
generalized multi-domain procedures. Efficient methods for computation and storage are discussed.
In terms of computation it is expedient to formulate the matrices (I - wi Lii) and (I - wi + wi Uii),
i = 1, 2, ..., N prior to commencing iteration, as this avoids the need to repeatedly formulate the
coefficients of these matrices within the iterative cycle. In practice the coefficients of the diagonal
blocks Hii can be overwritten by these coefficients so as to minimize core storage requirements.
For the off-diagonal block matrices wi Hij (i != j) there are two options:
(i) wi Hij (i != j) can be formulated explicitly, prior to commencing iteration.
(ii) wi Hij (i != j) can be formulated implicitly, once per block iteration, as and when
required. In this case it is expedient to first formulate Hij xj^{(k)}. This vector can then be multiplied
by wi.
It is a simple matter to show that it is only expedient to formulate wi Hij (i != j) prior to iteration
if the number of block iterations required to obtain the solution exceeds the number of unknowns
associated with domain j. Therefore, in most practical situations option (ii) will be the most
efficient scheme. Formulating wi Hij (i != j) explicitly, prior to iteration, may also reduce the
sparsity of the system, as terms may be introduced into positions in Hij (i != j) previously occupied
by zeros. This can be avoided if wi Hij (i != j) is formulated implicitly, within the iterative loop.
For large problems, it may be impractical to operate solvers entirely in core as this may result
in a large core memory requirement. Utilizing a backing store, such as a disk, to store coefficients
greatly reduces this requirement. The block structured coefficient matrix lends itself very nicely to
efficient storage schemes. One efficient method is to hold the fully populated diagonal blocks in
core, but to write the coefficients from the off diagonal blocks Hij(i # j ) to disk file. These off
diagonal block coefficients can be first assembled into a one-dimensional array in core, then
written off to a sequential file. For an N-domain problem it is advantageous to use N sequential files
as this ensures that the off-diagonal block coefficients are easily accessible. Advantage can be
taken of the sparseness of these blocks so that only non-zero coefficients need to be stored. If core
memory is limited then a number of the fully populated diagonal blocks can be written off to file.
The minimum core memory requirement would be an array of size capable of holding the largest
diagonal block. If core memory is severely restricted it is possible to further partition the diagonal
blocks to obtain submatrices of manageable size.
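Option (ii) above pairs naturally with the bidiagonal form (14): applying wi to a vector costs O(n), so wi(Hij xj) needs one block matrix-vector product plus a cheap correction. A sketch with illustrative names (not from the paper):

```python
import numpy as np

def apply_bidiagonal(w_super, v):
    """Multiply the upper bidiagonal parameter matrix (14) by a
    vector in O(n): (wv)_i = v_i + w_{i,i+1} v_{i+1}."""
    out = v.copy()
    out[:-1] += w_super * v[1:]
    return out

def coupling_term(w_super, H_ij, x_j):
    """Option (ii) of Section 11: form H_ij x_j once per block
    iteration, then apply w_i cheaply, avoiding the storage and
    fill-in of the explicit product w_i H_ij."""
    return apply_bidiagonal(w_super, H_ij @ x_j)
```

Here `w_super` holds only the n - 1 superdiagonal coefficients of wi, which is also how a compacted single-dimensional storage scheme of the kind described above would hold them.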
12. RESULTS AND DISCUSSION
In this section, a number of examples are tested to ascertain which of the iterative schemes
considered is the most efficient, and to demonstrate the effect of domain ordering on convergence
rates. The examples considered are three-dimensional potential problems. A boundary element
model that describes the thermal behaviour of the die blocks and component within the casting
process is utilized. This boundary element model comprises the simultaneous solution of both
the integral formulations of Laplace's and Poisson's equations, and yields the 'steady-state' or
time-averaged temperatures on the surfaces of the die and component over a typical casting cycle.
Consider a simple two-slide die as shown in Figure 2. Convective-type boundary conditions of the
form

q(x) = h(Ti - T)                                                     (44)

apply over the boundaries of subregions R1 and R2, the die blocks, and R3, the component. Ti is
the steady-state surface temperature on domain i, T is the temperature of the surrounding
medium, and h is the time-averaged heat transfer coefficient between the surface and the
surrounding medium. Incorporation of the boundary conditions into the integral formulations of
Laplace's and Poisson's equations and discretization of the cavity, interfaces, outer die boundaries, component, and cooling channels with surface elements yields a set of linear simultaneous
equations. These equations can be written in matrix form as equation (19), with vectors x1, x2,
and x3 containing the unknown temperatures at nodes on the surfaces of subregions R1, R2, and
R3, respectively.
Four dies are modelled, incorporating two types of elements. The die cavity, component, and
outer surfaces of the die are meshed with linear triangular elements, and linear pipe elements are
used to represent the cooling channels within the die blocks.^{16} A fine mesh is used on the cavity
surfaces and the component, with coarser meshes on the outer surfaces of the dies and the cooling
channels. Typical values are used for the heat transfer coefficients. The four examples considered
are 2, 3, 4, and 9 domain problems, respectively. The 2-zone model generates 464 equations, the
3-zone model 1039, the 4-zone model 1287, and the 9-zone model 7718. Figure 3 shows the surface
meshes for the four test components. The tests were carried out on a SUN SPARCstation IPX
computer.
The test cases are considered for each of the models to ascertain which is the computationally
most efficient scheme. For each of the cases the initial trial vector x^{(0)} is set to 0, which is
Figure 2. Two-slide die. Key to superscripts: o, outer die boundary; cc, die cooling channel; c, die cavity/casting
interface; d, die block interface
Figure 3. Surface meshes for the four test components: (a) model 1; (b) model 2; (c) model 3; (d) model 4
sufficiently remote from the exact solution. The iterative procedure is terminated if the residual
norm3 is less than 0.0001. The results are presented in Tables I-V. In these tables the number of
'block' iterations is given; this number represents how many times each block of equations has
been processed. The total number of iterations given is the sum of the numbers of iterations
performed on the equations within the individual blocks. The computation time is also given. The
tests were performed entirely in core.
In the first set of tests the following test cases are considered:
(1) Extended first-order scheme.
(2)-(8) Generalized multi-domain iterative procedure. Test cases (2)-(8) are variations of the
generalized multi-domain iterative procedure, distinguished by the maximum number
of iterations performed on each block of equations. The maximum number of iterations
performed on each block is set equal to i for test case (i), i = 2, 3, . . . , 8. However,
it should be recognized that for a particular block the iterative procedure may converge
prior to attaining the prescribed maximum number of iterations, in which case fewer
than i iterations are performed.
3947
AN ITERATIVE SOLUTION SCHEME
Table I. Performance data for model 1

Test case    Block iterations    Total iterations    CPU time (s)
    1               46                  92                34.94
    2               28                 110                28.36
    3               21                 125                26.82
    4               18                 141                27.34
    5               17                 160                30.02
    6               16                 174                31.45
    7               16                 191                33.76
    8               15                 200                34.31
    9               15                 282                42.79
   10               15                   -                27.25
   11                -                   -                51.50

Note: Model 1. Number of zones = 2, Zone 1 = 270 equations, Zone 2 = 194
equations. Total number of unknowns = 464
Table II. Performance data for model 2

Test case    Block iterations    Total iterations    CPU time (s)
    1               46                 138               155.27
    2               30                 180               131.33
    3               25                 224               135.10
    4               24                 281               147.71
    5               22                 319               154.57
    6               21                 355               163.04
    7               21                 395               173.79
    8               21                 430               185.35
    9               21                 560               225.34
   10               21                   -               164.40
   11                -                   -               587.68

Note: Model 2. Number of zones = 3, Zone 1 = 265 equations, Zone 2 = 295
equations, Zone 3 = 479 equations. Total number of unknowns = 1039
Table III. Performance data for model 3

Test case    Block iterations    Total iterations    CPU time (s)
    1               39                 156               181.61
    2               25                 200               146.34
    3               22                 263               152.83
    4               21                 329               167.27
    5               21                 398               187.74
    6               20                 440               195.95
    7               20                 486               210.95
    8               20                 529               225.25
    9               20                 673               281.91
   10               20                   -               221.27
   11                -                   -              1135.64

Note: Model 3. Number of zones = 4, Zone 1 = 306 equations, Zone 2 = 309
equations, Zone 3 = 510 equations, Zone 4 = 162 equations. Total number of
unknowns = 1287
Table IV. Performance data for model 4

Test case    Block iterations    Total iterations    CPU time (s)
    1               55                 495              6678.07
    2               32                 537              4757.15
    3               28                 677              4658.21
    4               25                 804              4710.22
    5               25                 955              5369.16
    6               23                1061              5429.92
    7               21                1175              5388.69
    8               21                1293              5940.27
    9               21                1685              6175.21
   10               21                   -             13129.53
   11                -                   -            236575.31

Note: Model 4. Number of zones = 9, Zone 1 = 724 equations, Zone 2 = 668
equations, Zone 3 = 531 equations, Zone 4 = 530 equations, Zone 5 = 182
equations, Zone 6 = 294 equations, Zone 7 = 1651 equations, Zone 8 = 1645
equations, Zone 9 = 1493 equations. Total number of unknowns = 7718.
Table V. Influence of domain order on convergence rates

Model    Domain order                 Block        Total        CPU       ∞-norm    F-norm
                                      iterations   iterations   time (s)
  1      2, 1                             28           110        28.36      8.12      6.32
  1      1, 2                             28           111        29.26     11.76      8.91
  2      1, 2, 3                          30           180       131.33     17.44     11.12
  2      3, 2, 1                          30           180       131.54     28.58     20.34
  2      2, 1, 3                          30           180       131.47     21.27     13.98
  2      1, 3, 2                          30           180       131.48     24.75     17.48
  3      1, 4, 2, 3                       25           200       146.34     15.36     18.44
  3      3, 2, 4, 1                       26           207       152.97     36.41     28.48
  3      2, 4, 3, 1                       26           207       151.17     30.96     27.16
  3      4, 1, 2, 3                       25           200       147.36     17.58     20.50
  4      9, 5, 1, 6, 2, 4, 3, 8, 7        32           537      4757.15    136.67     92.33
  4      7, 8, 3, 4, 2, 6, 1, 5, 9        34           563      5037.05    229.92    130.31
  4      1, 2, 3, 4, 5, 6, 7, 8, 9        33           551      4898.74    167.41    105.42
  4      9, 8, 7, 6, 5, 4, 3, 2, 1        33           553      4924.14    197.94    115.73
(9) Block Gauss-Seidel iteration.
In addition, two direct techniques are considered, namely:
(10) LU decomposition with forward and back-substitution. Each diagonal block matrix Hii is
factorized into an LU form using Crout's algorithm.24 Partial (row) pivoting is employed
to ensure stability of the decomposition procedure. Block Gauss-Seidel is then used for
iteration between the domains. Each block iteration requires one forward and one
back-substitution for each diagonal block Hii.
(11) Gauss elimination with back substitution. Partial (row) pivoting is incorporated in the
forward elimination stage to enhance stability. This scheme is considered for the system
matrix H as a whole.
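The distinction between these solution strategies can be sketched in a few lines of code. The sketch below is illustrative only: the weakly coupled test matrix, the two-block partition, and the simple Jacobi-style inner relaxation (standing in for the extended first-order inner scheme) are assumptions made for the sake of a self-contained example, not the authors' implementation.

```python
import numpy as np

def block_iterate(H, c, blocks, max_inner, tol=1e-8, max_block_iters=500):
    # Generalized multi-domain procedure (sketch): cycle over the blocks,
    # performing at most max_inner relaxation sweeps on each block before
    # moving on; a block stops early if its own residual already meets tol.
    # Freshly updated values from other blocks are used immediately.
    x = np.zeros(len(c))
    d = np.diag(H)
    for k in range(max_block_iters):
        for lo, hi in blocks:
            for _ in range(max_inner):
                r = c[lo:hi] - H[lo:hi] @ x      # residual of this block
                if np.linalg.norm(r) < tol:
                    break
                x[lo:hi] += r / d[lo:hi]         # Jacobi-style inner sweep
        if np.linalg.norm(c - H @ x) < tol:      # global residual test
            return x, k + 1
    return x, max_block_iters

# Hypothetical weakly connected system: strong diagonal blocks, small coupling.
rng = np.random.default_rng(0)
n = 12
H = np.eye(n) * 6 + rng.standard_normal((n, n)) * 0.2
c = rng.standard_normal(n)
x, block_iters = block_iterate(H, c, [(0, 6), (6, 12)], max_inner=2)
err = np.linalg.norm(x - np.linalg.solve(H, c))  # compare with direct solve
```

Varying max_inner mimics the flavour of test cases (2)-(8); setting max_inner = 1 is in the spirit of test case (1), and replacing the inner sweeps by an exact solve of each diagonal block gives block Gauss-Seidel, test case (9).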
The domain solution order chosen for tests (1)-(10) is the one which gives rise to the minimum
∞-norm for each of the models. It can be seen on examination of Tables I-IV that for the
four examples considered convergence is obtained for test cases (1)-(10), and that in each case the
CPU time is significantly lower than that for Gauss elimination. Test case (2), the generalized
multi-domain procedure with the maximum number of iterations per block set to two, gave the
shortest CPU times for models 2 and 3, the 3- and 4-zone models. Test case (3) gave the shortest
CPU times for models 1 and 4, the 2- and 9-zone models. It is of interest to note that although
fewer iterations in total were performed for test case (1) than for test cases (2) and (3), this scheme is
computationally less efficient. The reason for this stems from the fact that test case (1) requires
significantly more block iterations. This means that additional calculations are required in the
multiplication of the off-diagonal blocks with x(k), with k ranging from 1 to the number of block
iterations. Compared to Gauss elimination, test case (3) was some 1.9 times faster for model 1 (464
equations) and 50.8 times faster for model 4 (7718 equations); test case (2) was 4.5 times faster for
model 2 (1039 equations) and 7.8 times faster for model 3 (1287 equations). Compared to LU
decomposition, test case (3) was marginally faster for model 1 and 2.8 times faster for model 4; test
case (2) was 1.3 times faster for model 2 and 1.5 times faster for model 3. Reasoning as to why
schemes (2) and (3) outperformed the other iterative schemes can be obtained from consideration
of the error propagation at any stage during iteration. To illustrate this, consider the three-zone
problem and the correction of a particular block of variables, say, for example, the first
block, x1. Equation (32) defines the iterative correction for this block. If iteration is successful
then the error ε1 in x1 will be reduced. The errors in x2 (ε2) and x3 (ε3) are unchanged whilst the
correction of x1 takes place. It is apparent that there is a limit to the amount by which x1 can be
corrected without further correcting ε2 and ε3. This can be seen more clearly by considering
equation (38), i.e.
As P → ∞, M^P → 0 and

(Σ_{i=0}^{P-1} M^i)(I - ω1 L11)^{-1} ω1 → H11^{-1}

The first term on the right-hand side of equation (38), M^P ε1^(k,0), represents the error in x1,
which indicates how well the first block of equations is being satisfied. The second term,
(Σ_{i=0}^{P-1} M^i)(I - ω1 L11)^{-1} ω1 (H12 ε2^(k,0) + H13 ε3^(k,0)), represents the error in x1 induced by the
errors in x2 and x3, i.e. ε2^(k,0) and ε3^(k,0). It is clear that this error does not necessarily reduce with an
increase in P; in fact, it converges to H11^{-1}(H12 ε2^(k,0) + H13 ε3^(k,0)).
Once x1 has been corrected the updated values are used immediately in the subsequent
correction of x2 (equation (33)) and x3 (equation (34)). If the errors in x1 have been markedly reduced then this will be
beneficial in the subsequent correction of x2 and x3. The results in Tables I-IV suggest that the
first two or three iterations on a particular block of equations yield a relatively large correction of
the variables and that the rate of correction reduces with subsequent iterations. Thus, for
examples of this nature, constraining the number of iterations per block of equations is beneficial.
In the examples tested a restriction of two to three iterations per block was the most efficient. This
allows a much larger correction of the variables than a single iteration, whereas little benefit is
gained by performing more iterations.
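This error floor is easy to exhibit numerically. The sketch below is an illustration under assumed data, with a plain diagonal relaxation standing in for the first-order scheme: block 1 is iterated while x2 is frozen at a stale value carrying an error e2, and however many inner iterations are spent, the error remaining in x1 converges to H11^{-1} H12 e2.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
H11 = np.eye(n) * 4 + rng.standard_normal((n, n)) * 0.1   # dominant block
H12 = rng.standard_normal((n, n)) * 0.05                  # weak coupling
x1_true = rng.standard_normal(n)
x2_true = rng.standard_normal(n)
c1 = H11 @ x1_true + H12 @ x2_true

# Freeze x2 at a stale value with error e2, then iterate only on block 1.
e2 = rng.standard_normal(n) * 0.1
x2 = x2_true + e2
x1 = np.zeros(n)
d = np.diag(H11)
for _ in range(300):                        # many inner iterations on block 1
    x1 += (c1 - H12 @ x2 - H11 @ x1) / d    # simple diagonal relaxation

# The inner iteration converges, but only to x1_true - H11^{-1} H12 e2:
# the residual error induced by the stale neighbouring block.
floor = np.linalg.solve(H11, H12 @ e2)
gap = np.linalg.norm((x1 - x1_true) + floor)
```

After convergence, gap is at round-off level: the stale neighbour block, not the number of inner iterations, sets the attainable accuracy of x1.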
The second set of tests is performed to indicate the influence of domain ordering on
convergence rates. Test case (2), the generalized multi-domain procedure with the maximum
number of iterations per block set to 2, is utilized for these tests as this scheme gave rise to the
shortest CPU times for two of the four examples considered previously. The relevant norms are
calculated via equation (23). For the two-zone model both possible zone ordering sequences are
investigated, whereas for the three-, four-, and nine-zone models the zone orders which give both
the lowest and highest values of the norms are used. In addition, two zone orders which give
intermediate values of the norms are examined. The results are presented in Table V.
It can be seen on examination of this table that ordering to minimize the ∞-norm gave the
smallest value for this norm, as expected, in all the examples tested. This minimized value was
smaller than the value obtained when the domains were ordered to maximize the ∞-norm, and
also smaller than the values obtained when domain orders yielding intermediate norm values
were employed. However, the differences in norm size did not give rise to corresponding
differences in the performances of the test cases considered. For the two-zone model the benefit
gained by ordering to minimize the ∞-norm was marginal (one iteration). In fact, this is expected
since it is a simple matter to prove that for a two-domain problem ordering will have little
influence, because in this case ordering will not alter the eigenvalues of the iteration matrix. For
the three-zone model ordering made no difference to the convergence of the iterative scheme. For
the four-zone model the domain orders which gave rise to the two lowest values of the ∞-norm
converged in the least number of iterations and the shortest CPU times. Inspection of the
∞-norm for these two cases reveals that ordering to minimize this norm did in fact give the
smallest value, although the difference between the two norms was relatively small. For the
nine-zone model ordering to minimize the ∞-norm led to convergence in the least number of
iterations and the shortest CPU time.
The results obtained for the two- and three-zone models indicate that in cases where the
number of zones is small there is little scope for improvement, and so block reordering will not
necessarily improve convergence rates. However, the results obtained for the four- and nine-zone
models are more encouraging, with smaller norms giving improved convergence rates. It is
expected that the more domains there are, the greater the importance of domain ordering. In many
respects, an N-domain problem has similar characteristics to those of a single-domain problem
with n equations. Both the two-domain and the two-equation problem are unaffected by ordering.
As n increases, equation ordering can be successfully used to improve convergence. Thus, since an
N-domain problem reduces to an n-equation problem as the number of unknowns in each domain
approaches one, it is expected that domain ordering is more useful for larger values of N.
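The two-domain invariance can be checked directly: for order (1, 2) the nonzero eigenvalues of the block Gauss-Seidel iteration matrix are those of H22^{-1} H21 H11^{-1} H12, while for order (2, 1) they are those of H11^{-1} H12 H22^{-1} H21, and the products AB and BA always share the same nonzero eigenvalues. A numerical sketch with an assumed weakly coupled test system:

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2 = 4, 3
H11 = np.eye(n1) * 5 + rng.standard_normal((n1, n1)) * 0.1
H22 = np.eye(n2) * 5 + rng.standard_normal((n2, n2)) * 0.1
H12 = rng.standard_normal((n1, n2)) * 0.5    # off-diagonal coupling blocks
H21 = rng.standard_normal((n2, n1)) * 0.5

# Error propagation per block Gauss-Seidel cycle, for each domain order.
A = np.linalg.solve(H22, H21) @ np.linalg.solve(H11, H12)  # order (1, 2)
B = np.linalg.solve(H11, H12) @ np.linalg.solve(H22, H21)  # order (2, 1)
rho_12 = max(abs(np.linalg.eigvals(A)))
rho_21 = max(abs(np.linalg.eigvals(B)))      # same spectral radius
```

Both orderings give the same spectral radius, and hence the same asymptotic convergence rate, which is why Table V shows only a marginal difference for the two-zone model.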
13. CONCLUSIONS
This paper is concerned with the development of stationary iterative schemes of first degree for
solving linear systems of equations derived from boundary integral equations and of the form
Σ_j Hij xj = ci, i = 1, 2, . . . , N, where N denotes the number of domains, Hij are matrices, and
xj and ci are column vectors. In addition, for i ≠ j, Hij is considered to be small in some sense. The
first-order iterative techniques initially developed by the authors for solving single-domain
boundary element systems have been adapted and extended to solve multi-domain problems. An
iterative scheme incorporating the block Gauss-Seidel and the extended first-order schemes and
restricting the iterations per block of equations in the range 2-3 was shown to be the most
efficient scheme. It has been shown that the order in which the domains are considered becomes
more important as the number of domains increases, and strategies for determining optimum
domain order have been developed. In one particular example, a system with 7718 unknowns
converged in a time which was some 51 times faster than that taken for Gauss elimination, and 2.8
times faster than that taken for LU decomposition.
ACKNOWLEDGEMENTS
The authors are grateful to EPSRC and Dynacast International Ltd. for providing financial
assistance to carry out this work which forms part of a project to develop Boundary and Finite
Element programs to analyse the pressure die-casting process.
REFERENCES
1. W. J. Mansur, F. C. Araujo and J. E. B. Malaghini, 'Solution of BEM systems of equations via iterative techniques',
Int. j. numer. methods eng., 33, 1823-1841 (1992).
2. R. L. Mullen and J. J. Rencis, ‘Iterative methods for solving boundary element equations’, Comput. Struct., 25,
713-723 (1987).
3. A. Jennings, Matrix Computation for Engineers and Scientists, Wiley, New York, 1977.
4. L. A. Hageman and D. M. Young, Applied Iterative Methods, Academic Press, New York, 1981.
5. L. F. Richardson, 'The approximate arithmetical solution by finite differences of physical problems involving
differential equations with an application to the stresses in a masonry dam', Philos. Trans. Roy. Soc. London A, 210,
307-357 (1910).
6. T. J. Urekew and J. J. Rencis, ‘The importance of diagonal dominance in the iterative solution of equations generated
from the boundary element method‘, Int. j. numer. methods eng., 36, 3509-3527 (1993).
7. C. Lanczos, 'An iterative method for the solution of the eigenvalue problem of linear differential and integral
operators', Nat. Bur. Std. J. Res., 45, 255-282 (1950).
8. C. Lanczos, 'Solution of systems of linear equations by minimized iterations', Nat. Bur. Std. J. Res., 49, 33-53 (1952).
9. J. H. Kane, D. E. Keyes and K. Guruprasad, 'Iterative solution techniques in boundary element analysis', Int.
j. numer. methods eng., 31, 1511-1536 (1991).
10. R. Fletcher, ‘Conjugate gradient methods for indefinite systems’, Lecture Notes in Mathematics, Vol. 506, Springer,
Berlin, 1976.
11. Y. Saad and M. H. Schultz, 'GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear
systems', SIAM J. Sci. Statist. Comput., 7, 856-869 (1986).
12. J. A. Bettess, ‘Economical solution techniques for boundary integral matrices’, Int. j . numer. methods eng., 19,
1073-1077 (1983).
13. M. Rezayat, ‘Fast decomposition of matrices generated by the boundary element method’, Int. j . numer. methods eng.,
33, 1109-1118 (1992).
14. K. Davey and I. Rosindale, ‘An iterative solution scheme for systems of boundary element equations’, Int. j. numer.
methods eng., 37, 1399-1411 (1994).
15. G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd edn, The Johns Hopkins University Press, London, 1989.
16. J. M. Crotty, 'A block equation solver for large unsymmetric matrices arising in the boundary integral equation
method', Int. j. numer. methods eng., 18, 997-1017 (1982).
17. G. R. Tomlin, ‘Numerical analysis of continuum problems in zoned anisotropic media’, Ph.D. Thesis, University of
Southampton, 1972.
18. J. C. Lachat, ‘A further development of the boundary integral technique for elastostatics’, Ph.D. Thesis, University of
Southampton, 1975.
19. J. C. Lachat and J. O. Watson, 'Progress in the use of boundary integral equations, illustrated by examples', Comput.
Methods Appl. Mech. Eng., 10, 273-289 (1977).
20. P. C. Das, ‘A disc based block elimination technique used for the solution of nonsymmetric fully populated matrix
systems’, Proc. Int. Symp. on Recent Developments in BEM, University of Southampton, 1978, pp. 391-404.
21. J. M. Kane, ‘Optimisation of continuum structures using a boundary element formulation’, Ph.D. Dissertation,
University of Connecticut, 1986.
22. M. M. Stabrowski, ‘A block equation solver for large unsymmetric linear equation systems with dense coefficient
matrices’, Int. j. numer. methods eng., 24, 289-300 (1987).
23. M. J. D. Powell, Approximation Theory and Methods, Cambridge University Press, Cambridge, 1981.
24. W. H. Press, B. P. Flannery, S. A. Teukolsky and W. T. Vetterling, Numerical Recipes: The Art of Scientific Computing
(FORTRAN Version), Cambridge University Press, Cambridge, 1989.
25. K. Davey and S. Hinduja, 'Modelling the pressure die casting process with the boundary element method: steady state
analysis', Int. j. numer. methods eng., 30, 1275-1299 (1990).
26. R. A. Adey, 'Electrostatics problems', in C. A. Brebbia (ed.), Boundary Element Techniques in Computer Aided
Engineering, NATO ASI Series, Martinus Nijhoff, Dordrecht, 1984.