Dev. Chem. Eng. Mineral Process. 13(3/4), pp. 211-220, 2005.
Stochastic Model Reduction by Maximizing Independence
Hui Zhang* and You-Xian Sun
National Laboratory of Industrial Control Technology, Institute of Modern Control Engineering, Department of Control Science and Engineering, Zhejiang University, Hangzhou 310027, P.R. China
By analysing the information descriptions in state space models of linear stochastic systems, this paper proposes two model reduction methods, based on the principles of maximizing independence and maximizing conditional independence among the components of the reduced state vector, respectively. Both methods are based on state aggregation. Independence and conditional independence are measured by the Kullback-Leibler information distance. It is demonstrated that the maximum conditional independence method is applicable not only to stable systems, but also to unstable systems. Simulation results illustrate the efficiency of the present methods.
Introduction
Information theoretic methods are attracting more and more attention in the field of control theory [11, 12, 16]. The present paper focuses on the problem of model reduction in the framework of information theory. For stochastic systems, many model reduction approaches are available in the literature, such as state aggregation [2], the covariance equivalent realization method [14], stochastic balanced truncation [6] and the L2 method [13]. There are also some approaches based on information theory, such as the method of minimizing the Kullback-Leibler information distance (KLID) between the full- and reduced-order models [11].
In this paper, the concept of KLID [10] is adopted to deduce two methods of model reduction; however, KLID is used not as a measure of distance between the full- and reduced-order models, but as a measure of statistical independence among the components of the reduced state vector. KLID is discussed and two model reduction methods based on state aggregation, the maximum independence method and the maximum conditional independence method, are derived. Illustrative examples are included.
Measure of Independence
It is frequently of interest to measure statistical independence (or dependence)
between two stochastic processes or among a random vector in such areas as
engineering, econometrics, and medicine. A typical example is the discrimination
function of features in the field of pattern recognition [4]. Among these areas, several
measures are available in the literature, such as feedback measure 171 and information
* Author for correspondence (zhanghui-iipc@zju.edu.cn).
theoretic measures [4, 5]. This section introduces the KLID measure.
Information of a random variable can be measured by entropy. Let v be a random vector with probability density p_v. The entropy of v is H(v) = −E[ln p_v(v)], where E denotes mathematical expectation. As proposed in the field of independent component analysis (ICA) [1, 4], statistical independence can be measured by the KLID between the joint probability density of a random vector and the product of its marginal densities. Suppose v = [v_1 v_2 ⋯ v_n]ᵀ ∈ Rⁿ is a random vector with probability density p_v, and the densities of the components of v are p_vi, i = 1, 2, …, n. Then the KLID between p_v and ∏_{i=1}^n p_vi is defined as:

I(v) = E_v[ln(p_v(v) / ∏_{i=1}^n p_vi(v_i))]    ...(1)

where E_v denotes expectation with respect to p_v. I(v) can be rewritten in terms of entropy:

I(v) = ∑_{i=1}^n H(v_i) − H(v)    ...(2)

The quantity I(v) provides a measure of independence of the components of v: it is the difference between the sum of the information of the components and the information of the vector. It is always non-negative and vanishes if, and only if, p_v = ∏_{i=1}^n p_vi, i.e. the components are mutually independent. Little dependence among the components of v implies little redundancy in the information supplied by v.
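For a zero-mean Gaussian vector with covariance matrix Σ, the entropies in Equation (2) have closed forms, the (2πe) terms cancel, and I(v) = (1/2)[∑_i ln Σ_ii − ln det Σ]. A minimal numerical sketch of this Gaussian case (our illustration, not from the paper; the function name is ours, NumPy assumed):

```python
import numpy as np

def gaussian_klid_independence(cov):
    """I(v) of Eq. (2) for a zero-mean Gaussian vector with covariance `cov`:
    sum of marginal entropies minus joint entropy (the 2*pi*e constants cancel)."""
    cov = np.asarray(cov, dtype=float)
    sign, logdet = np.linalg.slogdet(cov)
    assert sign > 0, "covariance must be positive definite"
    return 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)

# Independent components give I(v) = 0; correlated components give I(v) > 0.
print(gaussian_klid_independence(np.eye(3)))                      # 0.0
print(gaussian_klid_independence([[1.0, 0.5], [0.5, 1.0]]))       # ≈ 0.1438
```

The second value is −(1/2) ln det of the correlation matrix, i.e. −(1/2) ln 0.75, which previews the eigenvalue form of Equation (13) below.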
Model Reduction via Maximizing Independence
Problem Statement
Consider a linear time-invariant, stable stochastic system modeled by:

δx(t) = A x(t) + B w(t)
y(t) = C x(t) + v(t)    ...(3)

where x(t) ∈ Rⁿ, w(t) ∈ Rᵐ, y(t), v(t) ∈ Rᵖ, and A, B, C are constant matrices with appropriate dimensions. The operator δ is defined as δx(t) = x(t + 1), t ∈ Z (Z denotes the set of integers) for a discrete-time system, or as δx(t) = ẋ(t), t ∈ R (R denotes the set of real numbers) for a continuous-time system. w(t) and v(t) are mutually independent zero-mean white Gaussian random vectors with covariance matrices Q and R, respectively, and are uncorrelated with x(0). To approximate the system described by the full-order model (3), we wish to find a stable reduced-order model:
δx_r(t) = A_r x_r(t) + B_r w(t)
y_r(t) = C_r x_r(t) + v(t)    ...(4)

where x_r(t) ∈ Rˡ, l < n, and A_r, B_r, C_r are constant matrices with appropriate dimensions. The same processes w and v are retained [11].
In state aggregation [2], x_r(t) is referred to as the aggregated state:

x_r(t) = Λ x(t)    ...(5)

where Λ ∈ R^{l×n} is the aggregation matrix, rank Λ = l, and the elements of (Λ Λᵀ)⁻¹ are all non-negative. From Equations (3), (4) and (5):

Λ A = A_r Λ    ...(6)

Let Λ⁺ denote the Moore–Penrose inverse of Λ; then:

A_r = Λ A Λ⁺,  B_r = Λ B,  C_r = C Λ⁺    ...(7)

Several approaches to choosing the aggregation matrix have been introduced in the literature. Here, we choose the aggregation matrix by maximizing the independence or conditional independence of the reduced state vector, measured by KLID.
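Once Λ is fixed, Equations (5)-(7) amount to a few matrix products; a sketch with a hypothetical 4-state system and an arbitrary rank-2 aggregation matrix (our illustration; any full-row-rank Λ works here):

```python
import numpy as np

def aggregate(A, B, C, Lam):
    """Eq. (7): A_r = Lam A Lam+, B_r = Lam B, C_r = C Lam+,
    where Lam+ is the Moore-Penrose inverse of the aggregation matrix."""
    Lp = np.linalg.pinv(Lam)
    return Lam @ A @ Lp, Lam @ B, C @ Lp

# Hypothetical 4-state system, reduced to l = 2 aggregated states.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 1))
C = rng.normal(size=(1, 4))
Lam = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0]])   # orthonormal rows, so Lam+ = Lam.T
Ar, Br, Cr = aggregate(A, B, C, Lam)
print(Ar.shape, Br.shape, Cr.shape)      # (2, 2) (2, 1) (1, 2)
```

With orthonormal rows, Λ Λᵀ = I, so Λ⁺ = Λᵀ and A_r is simply the top-left 2×2 block of A; the eigenvector-based choices of Equations (14) and (20) below have exactly this orthonormal-row structure.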
Method 1: Maximizing Independence of the State Vector
The dynamic character of a system is defined by the structure and parameters of its model; the "information" of the dynamics, however, is contained in the system states. The dynamic information of the full-order description of Equation (3) is contained in x(t), while the dynamic information of the reduced-order description of Equation (4) is contained in x_r(t). Suppose the covariance matrices of x(t) and x_r(t) are Π(t) = E{x(t) xᵀ(t)} and Π_r(t) = E{x_r(t) x_rᵀ(t)}, respectively.
In this paper we focus on the steady-state information. Because the full-order system of Equation (3) is stable, its steady-state covariance Π = lim_{t→∞} Π(t) is the unique positive definite solution to the Lyapunov equation:

A Π + Π Aᵀ + B Q Bᵀ = 0    ...(8)

when Equation (3) is a continuous-time system, or the unique positive definite solution to:

A Π Aᵀ + B Q Bᵀ = Π    ...(9)

when Equation (3) is a discrete-time system. For the same reason, the steady-state covariance of the system of Equation (4) can be defined by Π_r = lim_{t→∞} Π_r(t).
Since the systems of Equations (3) and (4) are Gaussian, we can define the steady-state entropies of systems (3) and (4), respectively, as:

H(x) = (n/2) ln(2πe) + (1/2) ln det Π    ...(10)
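Equation (9) is the standard discrete-time Lyapunov equation and can be solved with off-the-shelf routines; a sketch for a hypothetical stable 2-state system with Q = I (our illustration; SciPy assumed):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Hypothetical stable discrete-time system (all eigenvalues inside unit circle).
A = np.array([[0.8, 0.1],
              [0.0, 0.5]])
B = np.array([[1.0],
              [1.0]])
Pi = solve_discrete_lyapunov(A, B @ B.T)   # steady covariance of Eq. (9)

# The residual of Eq. (9) should vanish, and Pi should be positive definite.
print(np.max(np.abs(A @ Pi @ A.T + B @ B.T - Pi)))
print(np.linalg.eigvalsh(Pi))
```

For the continuous-time case of Equation (8), `scipy.linalg.solve_lyapunov(A, -B @ B.T)` plays the same role.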
H(x_r) = (l/2) ln(2πe) + (1/2) ln det Π_r = (l/2) ln(2πe) + (1/2) ln det(Λ Π Λᵀ)    ...(11)
Let us investigate the statistical dependence among the components of the steady value of x_r using measure (1). Let x_r = [x_r1, …, x_rl]ᵀ, and let δ_i, i = 1, …, l denote the diagonal elements of Π_r. Then the measure of independence among the elements of x_r can be written as:

I(x_r) = (1/2) ln ∏_{i=1}^l δ_i − (1/2) ln det(Λ Π Λᵀ) = (1/2) ln [det (Λ Π Λᵀ)′ / det(Λ Π Λᵀ)]    ...(12)

where M′ denotes the matrix M with its off-diagonal elements set to zero.
From the viewpoint of information, the approximating performance of the reduced-order model is determined by the amount of information of x retained in x_r. When the independence among the components of x_r is maximized, i.e. the redundancy or duplication in the information supplied by x_r is minimized, we get the most efficient information retention. Thus, less dependence among the components of x_r implies better approximating performance. Maximizing the independence among the elements of x_r is equivalent to minimizing I(x_r).
Let λ_i, i = 1, …, l denote the eigenvalues of the correlation matrix of x_r, [ρ_ij], i, j = 1, …, l, where ρ_ij = corr(x_ri, x_rj). Then Equation (12) can be rewritten as:

I(x_r) = −(1/2) ∑_{i=1}^l ln λ_i    ...(13)

Hence, in order to minimize I(x_r), we have to maximize ∑_{i=1}^l ln λ_i. It can be shown [9] that ∑_{i=1}^l ln λ_i is maximal when λ_i, i = 1, …, l are the l largest eigenvalues of the correlation matrix of x. Then I(x_r) is minimal when the aggregation matrix Λ is chosen as:

Λ = [q_1, …, q_l]ᵀ    ...(14)

where q_1, …, q_l are the ortho-normal eigenvectors corresponding to the l largest eigenvalues of the correlation matrix of x, Φ = [φ_ij], i, j = 1, …, n, φ_ij = corr(x_i, x_j).
Now, the reduced-order model with maximum independence can be obtained from Equations (5), (7) and (14). The correlation matrix Φ (providing the basis upon which the aggregation matrix is computed) can be obtained from Π, which is the unique positive definite solution to Equation (8) or (9). It is obvious that the aggregation matrix given by Equation (14) satisfies the conditions rank Λ = l, and that the elements of (Λ Λᵀ)⁻¹ are all non-negative.
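Putting Equations (9), (14), (5) and (7) together, Method 1 for a discrete-time system is a short computation; a sketch on a hypothetical stable 4-state system (our illustration; NumPy/SciPy assumed, with `eigh` used because the correlation matrix is symmetric):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def method1_reduce(A, B, C, Q, l):
    """Maximum-independence reduction (Method 1), discrete-time case."""
    Pi = solve_discrete_lyapunov(A, B @ Q @ B.T)   # steady covariance, Eq. (9)
    d = np.sqrt(np.diag(Pi))
    Phi = Pi / np.outer(d, d)                      # correlation matrix of x
    _, V = np.linalg.eigh(Phi)                     # eigenvalues ascending
    Lam = V[:, ::-1][:, :l].T                      # Eq. (14): top-l eigenvectors
    Lp = np.linalg.pinv(Lam)                       # = Lam.T (orthonormal rows)
    return Lam @ A @ Lp, Lam @ B, C @ Lp           # Eq. (7)

# Hypothetical stable 4-state system reduced to l = 2 states.
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
A *= 0.8 / np.max(np.abs(np.linalg.eigvals(A)))    # force spectral radius 0.8
B = rng.normal(size=(4, 1))
C = rng.normal(size=(1, 4))
Ar, Br, Cr = method1_reduce(A, B, C, np.eye(1), l=2)
print(Ar.shape, Br.shape, Cr.shape)                # (2, 2) (2, 1) (1, 2)
```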
Method 2: Maximizing the Conditional Independence of the State Vector
Another concept of independence adopted in ICA is conditional independence [1], which can also be measured by KLID. In this section, we first consider the conditional information of the state vectors of the full- and reduced-order models when the system outputs are observed, and then deduce another method of model reduction based on the principle of maximizing conditional independence.
We assume that the system discussed in this section is discrete-time, and set t = k, k ∈ Z. The results for continuous-time systems can be obtained in the same way. In order to distinguish this method from the notation of Method 1, we denote the aggregation matrix by V in this section; then:

x_r(k) = V x(k),  A_r = V A V⁺,  B_r = V B,  C_r = C V⁺    ...(15)
Suppose we have a sequence of output observations of the true system, Yᵏ = {y(1), y(2), …, y(k)}. Let x̂(k+1) = E[x(k+1) | Yᵏ] be the one-step-ahead Kalman estimate of x(k+1) based on the given Yᵏ (with k large enough), and let x̃(k+1) = x(k+1) − x̂(k+1), Σ(k+1) = E[x̃(k+1) x̃ᵀ(k+1)]. For system (4), let x̂_r(k+1) = E[x_r(k+1) | Yᵏ], x̃_r(k+1) = x_r(k+1) − x̂_r(k+1), and Σ_r(k+1) = E[x̃_r(k+1) x̃_rᵀ(k+1)].
Suppose x̃ = lim_{k→∞} x̃(k) and x̃_r = lim_{k→∞} x̃_r(k) are the steady estimation errors, and Σ = lim_{k→∞} Σ(k) and Σ_r = lim_{k→∞} Σ_r(k) are the covariance matrices of the steady estimation errors of the full- and reduced-order models of Equations (3) and (4), respectively. It is known that Σ satisfies the following Riccati equation:

Σ = A(Σ − Σ Cᵀ(C Σ Cᵀ + R)⁻¹ C Σ)Aᵀ + B Q Bᵀ    ...(16)
Let Y = {y(1), y(2), …, y(∞)}. From estimation and information theory [3], we get the conditional entropies of the full and reduced states, respectively:

H(x | Y) = H(x̃) = (n/2) ln(2πe) + (1/2) ln det Σ    ...(17)

H(x_r | Y) = H(x̃_r) = (l/2) ln(2πe) + (1/2) ln det Σ_r    ...(18)

Let x̃_r = [x̃_r1, …, x̃_rl]ᵀ, and let ξ_i, i = 1, …, l denote the diagonal elements of Σ_r. Then the conditional independence among the components of x_r given Y is measured by:

I(x_r | Y) = (1/2) ln ∏_{i=1}^l ξ_i − (1/2) ln det(V Σ Vᵀ)    ...(19)
Maximizing the conditional independence of x_r is equivalent to minimizing I(x_r | Y). It can be concluded, in the same way as in Method 1, that in order to maximize the conditional independence among the components of the reduced state vector, we have to set the aggregation matrix as:

V = [p_1, …, p_l]ᵀ    ...(20)

where p_1, …, p_l are the ortho-normal eigenvectors corresponding to the l largest eigenvalues of the correlation matrix of x̃, Ψ = [ψ_ij], i, j = 1, …, n, ψ_ij = corr(x̃_i, x̃_j).
The reduced-order model with maximum conditional independence can be obtained using Equations (15) and (20). The correlation matrix Ψ (providing the basis upon which the aggregation matrix is computed) can be obtained from Σ, which is the unique positive definite solution to Equation (16). It is obvious that the aggregation matrix given by Equation (20) also satisfies the conditions rank V = l, and that the elements of (V Vᵀ)⁻¹ are all non-negative. In the case of a continuous-time system, the computation of the reduced-order model by maximizing conditional independence is the same as in the discrete-time case, although the Riccati equation is different.
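Method 2 differs from Method 1 only in replacing the Lyapunov solve by the filtering Riccati equation (16); a sketch (our illustration; SciPy's `solve_discrete_are` is applied in its dual, filtering form via A → Aᵀ, C → Cᵀ, and all names are ours):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def method2_reduce(A, B, C, Q, R, l):
    """Maximum-conditional-independence reduction (Method 2), discrete-time case."""
    # Filtering form of the DARE, Eq. (16), obtained by duality.
    Sig = solve_discrete_are(A.T, C.T, B @ Q @ B.T, R)
    d = np.sqrt(np.diag(Sig))
    Psi = Sig / np.outer(d, d)            # correlation matrix of the error x~
    _, U = np.linalg.eigh(Psi)
    V = U[:, ::-1][:, :l].T               # Eq. (20): top-l eigenvectors
    Vp = np.linalg.pinv(V)
    return V @ A @ Vp, V @ B, C @ Vp      # Eq. (15)

# Hypothetical 4-state system reduced to l = 2 states.
rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
A *= 0.8 / np.max(np.abs(np.linalg.eigvals(A)))
B = rng.normal(size=(4, 1))
C = rng.normal(size=(1, 4))
Ar, Br, Cr = method2_reduce(A, B, C, np.eye(1), np.eye(1), 2)
print(Ar.shape, Br.shape, Cr.shape)       # (2, 2) (2, 1) (1, 2)
```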
Analysis and Discussion
In this section it is again assumed that the system under discussion is discrete-time. The results for continuous-time systems can be obtained in the same way.
Stability
A common property of the two methods is that the resulting reduced-order models preserve the stability of the original model.
It is well known that model (3) is stable if, and only if, for any positive definite matrix W there exists a positive definite solution P to the Lyapunov equation:

A P Aᵀ + W = P    ...(21)
Since Λ A = A_r Λ, we obtain the following equation by multiplying Equation (21) by Λ on the left and by Λᵀ on the right:

A_r (Λ P Λᵀ) A_rᵀ + Λ W Λᵀ = Λ P Λᵀ    ...(22)

Let A′ = [Λᵀ q_{l+1} ⋯ q_n]ᵀ, where q_{l+1}, …, q_n are the ortho-normal eigenvectors corresponding to the n − l smallest eigenvalues of the correlation matrix Φ. It can be seen that Λ P Λᵀ and Λ W Λᵀ are the l × l matrices in the top left-hand corner of A′ P (A′)ᵀ and A′ W (A′)ᵀ, respectively. Hence Λ P Λᵀ and Λ W Λᵀ are positive definite if P and W are positive definite. From Equation (22) we can conclude that, when the full-order model is stable, the reduced-order model from Method 1 is also stable. The conclusion for Method 2 is the same.
On Unstable Systems
However, there is a difference between the two methods. The aggregation matrix in Method 1 can be computed only when the steady-state covariance of the full-order system exists. Therefore, Method 1 is applicable only when the full system is stable.
In Method 2, the covariance of the state estimation error Σ(k) converges to a constant matrix Σ when the system is stable. However, even in the case of unstable systems, the steady covariance Σ exists and satisfies the Riccati equation (16) if the pair [A, C] is observable and the pair [A, BH] is stabilizable, where H Hᵀ = Q. Hence, Method 2 is applicable to both stable and unstable systems.
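The practical content of this condition is that the Riccati equation (16) remains solvable even when A has eigenvalues outside the unit circle; a hypothetical sketch (ours, SciPy assumed):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical unstable discrete-time system: spectral radius of A is 1.2 > 1,
# but (A, C) is observable and (A, B) is stabilizable.
A = np.array([[1.2, 1.0],
              [0.0, 0.5]])
B = np.array([[1.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
Q = np.eye(1)
R = np.eye(1)

# Eq. (16) in its dual (filtering) form: the steady error covariance exists.
Sig = solve_discrete_are(A.T, C.T, B @ Q @ B.T, R)
print(np.linalg.eigvalsh(Sig))   # both eigenvalues positive
```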
Illustrative Examples
Several examples have been employed to illustrate the efficiency of the present methods; the following two are presented.
Example 1. The original 8th-order stable discrete-time model [8]:

A: G(z) = (1.68z⁷ + 1.116z⁶ − 0.21z⁵ + 0.152z⁴ − 0.516z³ − 0.262z² + 0.044z − 0.006) / (z⁸ − 5.046z⁷ − 3.38z⁶ + 0.63z⁵ − 0.456z⁴ + 1.548z³ + 0.786z² − 0.132z + 0.018)
Before applying the methods in this paper to this model, we transformed the transfer function G to the state space form of Equation (3), where the noises w and v are assumed to be mutually independent Gaussian sequences with covariances Q = R = 1.
Applying Method 1 to system A, we get the 3rd-order model:

A_r1: G_r1(z) = (0.1933z² − 0.1606z + 0.01141) / (z³ − 1.923z² + 1.049z − 0.08672)

Figure 1 shows Bode plots of models A and A_r1 (the frequency responses are only plotted for frequencies smaller than the Nyquist frequency).
The 3rd-order reduced model from Method 2 is:

A_r2: G_r2(z) = (0.1615z² − 0.1253z + 0.005995) / (z³ − 1.798z² + 0.8903z − 0.05204)

Figure 2 shows Bode plots of models A and A_r2.
Simulation results of this example illustrate the efficiency of the present methods for stable systems. It can be seen that the reduced-order models A_r1 and A_r2 are also stable, and possess good approximating performance.
Example 2. To illustrate the efficiency of the maximum conditional independence method in the case of unstable systems, we consider a continuous-time unstable model described by Equation (3) with parameters as in [15]:
Figure 1. Bode plots of models A (solid line) and A_r1 (dashed line).
Figure 2. Bode plots of models A (solid line) and A_r2 (dashed line).
B:
A = [[0, 0, 0, 0, −144],
     [1, 0, 0, 0, −86],
     [0, 1, 0, 0, 35],
     [0, 0, 1, 0, 3],
     [0, 0, 0, 1, −5]],  B = [70 114 55 12 1]ᵀ,  C = [0 0 0 0 1].

The noises w and v are also assumed to be mutually independent Gaussian processes with covariances Q and R, respectively, Q = R = 1. The poles of model B are
Figure 3. Singular value plots of B (solid line) and B_r (dashed line).
{2.0565 ± 1.4622i, −3.9435 ± 1.7012i, −1.2261}. In this example, we examine the approximating performance of the reduced-order model by comparing its singular values to those of the original model.
The obtained 3rd-order approximating model is:

B_r:
ẋ_r(t) = [[35.3574, 54.9201, −62.2864],
          [19.5113, 32.2019, −36.2318],
          [39.8363, 61.6846, −71.4941]] x_r(t) + [−120.2451 −4.6853 −72.6087]ᵀ w(t)
y(t) = [0.3861 0.6047 −0.6965] x_r(t) + v(t)

The poles of B_r are {2.9232 ± 1.3322i, −9.7813}. Figure 3 shows the singular values of the full-order model B and the reduced-order model B_r. This example illustrates the usefulness of Method 2 in the case of unstable systems.
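As a consistency check, the companion matrix of model B above has characteristic polynomial s⁵ + 5s⁴ − 3s³ − 35s² + 86s + 144, whose roots reproduce the listed pole set; likewise, the trace of the printed B_r matrix equals the sum of its printed poles. A quick numerical verification (our sketch, NumPy assumed):

```python
import numpy as np

# Companion matrix of model B as given above.
A = np.array([[0, 0, 0, 0, -144],
              [1, 0, 0, 0,  -86],
              [0, 1, 0, 0,   35],
              [0, 0, 1, 0,    3],
              [0, 0, 0, 1,   -5]], dtype=float)
poles = np.sort_complex(np.linalg.eigvals(A))
print(poles)   # ≈ -3.9435 ± 1.7012i, -1.2261, 2.0565 ± 1.4622i

# State matrix of the printed reduced model B_r.
Ar = np.array([[35.3574, 54.9201, -62.2864],
               [19.5113, 32.2019, -36.2318],
               [39.8363, 61.6846, -71.4941]])
print(np.trace(Ar))   # ≈ 2*2.9232 - 9.7813 = -3.9349
```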
Conclusions
By analysing the information and conditional information descriptions in state space models of linear stochastic systems, this paper has proposed two model reduction methods, with criteria of maximum independence and maximum conditional independence among the components of the reduced state vector, respectively. Both methods are based on state aggregation. Simulation results illustrated the efficiency of the present methods. It was demonstrated that when the original model is stable, the reduced-order models are also stable. In addition, the maximum conditional independence method is applicable not only to stable systems, but also to unstable systems.
Acknowledgement
This work is supported by China 973 Project (No. 2002CB312200).
References
1. Akaho, S. 2002. Conditionally independent component analysis for supervised feature extraction. Neurocomputing, 49: 139-150.
2. Aoki, M. 1968. Control of large-scale dynamic systems by aggregation. IEEE Trans. on Automatic Control, 13: 246-253.
3. Caines, P.E. 1988. Linear Stochastic Systems. John Wiley, New York.
4. Comon, P. 1994. Independent component analysis, a new concept? Signal Processing, 36: 287-314.
5. Darbellay, G.A., and Wuertz, D. 2000. The entropy as a tool for analyzing statistical dependences in financial time series. Physica A, 287: 429-439.
6. Desai, U.B., and Pal, D. 1984. A transformation approach to stochastic model reduction. IEEE Trans. on Automatic Control, 29: 1097-1100.
7. Geweke, J. 1982. Measurement of linear dependence and feedback between multiple time series. J. American Statistical Association, 77(378): 304-313.
8. Kangsanant, T. 1988. Model reduction of discrete-time systems via power series transformation. Proc. 1988 International Conference on Control, 13-15 April, Oxford, UK: pp. 241-252.
9. Kapur, J.N., and Kesavan, H.K. 1992. Entropy Optimization Principles with Applications. Academic Press, Inc., UK: pp. 215-216.
10. Kullback, S. 1959. Information Theory and Statistics. John Wiley, New York.
11. Leland, R. 1999. Reduced-order models and controllers for continuous-time stochastic systems: An information theory approach. IEEE Trans. on Automatic Control, 44: 1714-1719.
12. Tian, Y.C. 1993. Applications of Information Entropy in Nonlinear Systems. PhD thesis, Zhejiang University, P.R. China.
13. Tjärnström, F., and Ljung, L. 2002. L2 model reduction and variance reduction. Automatica, 38: 1517-1530.
14. Wagie, D.A., and Skelton, R.S. 1986. A projection approach to covariance equivalent realizations of discrete systems. IEEE Trans. on Automatic Control, 31: 1114-1120.
15. Xiao, C.S., Feng, Z.M., and Shan, X.M. 1992. On the solution of the continuous-time Lyapunov matrix equation in two canonical forms. IEE Proceedings-D, 139(3): 286-290.
16. Zhang, H. 2003. Information Descriptions and Approaches in Control Systems. PhD thesis, Zhejiang University, P.R. China.
Received 14 November 2003; Accepted after revision: 17 August 2004.