Dev. Chem. Ens. Mineral Process., 9(IL2),pp. IOI-IO8.2001. A New Method for Measurement Error Covariance Estimation Yuhong Zhao', Zhongwen Gu and Chunhui Zhou National Laboratory of Industrial Controi Technology, Institute of Systems Engineering, Zhejiang University, Hangzhou 310027, P.R. China A new robust indirect algorithm for measurement error covariance estimation is proposed in this paper. The residual covariance matrix is estimated using Hampel's three-part redescending M-estimators. Then measurement error covariance can be obtained from ihe residual covariance. Unlike the conventional indirect estimations of measurement error covariance, which are very sensitive to gross errors, credible results can be achieved either with or without :he presence of external causes. implementation results show the robustness of the proposed method. Introduction Reliable process data play an important role in modern chemical plants for the purpose of process control, optimization implementation or performance evaluation. However, the measurements are contaminated inevitably by random errors and gross errors, whch may result fiom the miscalibration and failure of the measuring instruments, or fiom leaks of production equipment and pipe h e s , so that the data do not generally satisfy the process constrains. Thus before the measurements can be used successfully, data rectification is fiequently required which is receiving more attention in the chemical e n g i n e g and mineral processing fields. In the data rectification process, gross errors are detected and eliminated and then measurements are adjusted to satisfy the material and energy balances. Almost without exception alI rectification strategies start with the hypothesis that the measurement errors are random and normally distributed with zero mean and a known covariance matrix. In practice, the covariance matrix is usually unknom or known approximately. By far the most commonly used statistical technique for covariance matrix estimation is the direct method (la) LI I where X is sample of N repeated n-dimensional measurements X i . The unbiased 'Authorfor correspondence (e-mail: yhzhao@iipc.ziu.edu.cn). 10I Y.Zhao, Z Gu and C.Zhou maximum likelihood estimator can be produced only under the assumption of an independent error distribution with constant covariance. This drawback is eliminated by the indirect method whch makes use of the covariance matrix of constraint residuals. Almasy and Mah [l] gave a method for estimating the covariance matrix of measurement errors by using the constraint residuals calculated from available process data. Thls method gives an analytical solution which is very sensitive to the correlated measurements. Keller et al. [23 presented a method based on the relation deduced from the statistical properties of material balance consiraints, whch is more robust with regard to the correlated measurements and more suitable in practical situations. However these algorithms are sensitive to the presence of gross errors in the data set because of the use of a conventional method to calculate the covariance matrix of the residuals. Chen and Romagnoli [3] proposed a robust indirect method based on Huber M-estimator. This approach is an iterative calculation of covariance matrix of .the residuals with weights assigned to the observations according to their &stances from the current estimated location. Since Huber type weights are used in the method, the dluence by moderately large external causes can not be eliminated completely. A new indirect method based on Hampel's three-part redescendmg M-estimators is proposed using Chen and Romagnoli's approach. In the next section, the formulation of the indirect method for the error covariance estimation is introduced. Then the detail of the proposed algorithm is presented and two examples are simulated to show the performance of the algorithm. Finally, conclusions are drawn and the further research is indxated. Problem Formulation Consider a linear system described by the following equation: AX',= O (2) where A is a m x n full-row rank matrix with m 5 n , X't is the true vector of the ndimensional process variables at time k; and the observation equation is: X,= X; + e, (3) where X , is the vector of measurement at time k and e, is the measurement error vector at time k. From equations (2) and (3), the residuals R, is given by: R, = AX,= AX't + Ae, = Ae, (4) Under the assumption that e, is Gaussian with zero mean and positive definite " covariance V ( V = ( v ~ ) "),~then: E(e,) = 0 102 (5) A New Methodfor Measurement Error Covariance Estimation E(ekekT) =V (6) E(Rk)= E(Aek)= AE(e,) = 0 (7) The covariance matrix of the residuals can be obtained as following: H = cov(R,) = E(R,R:) = E(Ae,ekTAT) = AE(ekekT)AT = AVA' (8) Given A and V , H is uniquely defined by equation (9, but the converse is not true. However, since the measurements are normally made by independent instruments, the measurement errors should be uncorrelated. If instruments share some common elements such as power supplies, errors can be assumed to be weakly correlated. The implication is that H is diagonal or diagonally dominant. Hence the estimation of H can be obtain for specified off-diagonal elements. Using Kronecker matrix products and vecoperator [l], equation (8) can be rewritten as: vec(H) = (A 63 A)vec(V) (9) where Kronecker product of matrix A=(a,)," and B,,, is defined as a ( m * s ) x ( n . t )matrix: AQDB= [a, . B ) ( 10) where ay .B denotes the product of matrix B by the scalar a,, . vec(A) is defined as: where A,,i= 1,2;.-,n denotes the ith column of matrix A. The indirect method uses the residuals covariance matrix H to estimate the measurement error covariance matrix V. Since V is a symmetrical matrix, let v, vp4 --- v u f , where v,,,, , vU, ( p < q k < Z ) are offD=[vll vu diagonal elements, then equation (9) can be rewritten as: --- vec( H)= GD where Lam14 a,& a,A, anpAq+awAp*.. a,A,+a,,A, I03 Y. Zhao, Z.Gu and C. Zhou and A, ( i = l,...,n ) is the ith column of matrix A. Solving the least squares problem: min{vec(H) - GDIvec(H)-GOT the maximum likelihood estimation of D is given by: D = (GTG)-'Grvec(H) (14) Under the assumption of uncorrelated error distribution, simplified result can be obtained. Robust Estimation Based on Hampel's Three-part Redescending M-estimators Hampel's three-part redescendmg M-estimator belongs to M-estimator or generalized maximum likelihood estimator, which can be stated as: 2[w* )(X,- W W , - M ) T I / ( N - 1) (u, =V (16) ,=I where W,, W, are arbitrary weights; V is positive definite matrix, whose Cholesky decomposition is: V = SS' ; and ui = (KS)-'(X,- M ), K > 0, K is a tuning parameter varying with different M-estimators and K = 1 here. Note that when W, = W,= 1 , equation (15) and (16) become the form of conventional MLE. The iterative procedure can be written as: where p,q =l,...,n and k notes iterative step. The chosen weight functions are illustrated in Figure 1. Figure 1. Illustration of the weightfunctions. 104 A New Method for Measurement Error Covariance Estimation W,(U) = W I 2 ( 4 (20) c is the rejection point, over which the rnfluence of the gross errors can be eliminated completely. The selection of (a,b,c) is discussed in detail in Huang's work [4]. The iterative algorithm is described as follows: (1) Compute the residuals: R=AX (21) (2) Compute the covariance matrix of R by iteration: (2.1) Specify (a,b, c ) and the threshold for convergence and initialize the location and scale parameters: Mpo =median(X,) p=T,...,n; i=l,..-,N H,O = g ( x , -M,o)(x, - M ~ o )p , ~q = l,.-,n ; i =I,..., N (22) (23) ,=I Compute the Cholesky decomposition of H o: H o= SSTand let B o = S-' . (2.2) Iterative calculation: Compute the Cholesky decomposition of C C = DD' and let: p = D-' B = PB" (2.3) Judge the convergence: If equation (27) is satisfied, H = B-'(B-')T. Else let M o= M and B o = B , return (2.2). 105 Y. Zhm, Z Gu and C. Zhou It is worthwhile mentioning that the positive definiteness of matrix C can be maintained in the iteration procedure. Simulation a = 3, b = 5, c = 12 and e = 6 = lo-' are used in the following simulation. (1) Diagonal case A system with two entering and two leaving streams is studied in this section, whose system matrix is described as [2]: 0.1 0.6 - 0.2 - 0.7 0.1 0.3 -0.6 -0.2 All stream flowrates are assumed to be measured and their true values are: x-= [0.1739 5.0435 1.2175 4.003' The variance matrix of the measurement errors is: V = diag(0.003 1 0.0025 0.0006 0.0392) A multivariate normal distribute data set with the mean X' and variance V is generated to simulate the process measurements. The sample size is 1000. The performances of the conventional indirect method, of the indirect method based on Huber M-estimator and of the proposed algorithm are investigated in the cases of with and without gross errors respectively. The computation results are shown in Table 1. The three methods give similar results without the presence of gross errors. The conventional indirect method cannot derive a reliable result due to the presence of gross errors, while the proposed method and the indmect method based on Huber Mestimator can work well. In the case of large gross errors, the results from the proposed method are more reliable than that from the indirect method based on Huber M-estimator. (2) Non-diagonal case A network consisting of 7 nodes and 12 streams [2] is given in Figure 2. The measurement errors of stream sensors 2 and 5, and 6 and 11 are to be correlated. The system matrix is: 1 - 1 0 0 0 0 - 1 0 0 0 1 - 1 0 0 0 0 - 1 0 0 0 1 - 1 0 0 0 0 0 A= 0 0 0 1 - 1 0 0 0 - 1 0 0 0 0 1 - 1 0 0 0 0 0 0 0 0 0 1 1 0 -0 0 0 0 0 0 0 0 1 106 0 0 0 0 1 0 1 0 0 0 0 0 - 1 0 0 0 0 0 0 0 - 1 8.10 20.00 20.00 6.00 9.00 10.00 10.00 30.00 30.00 20.00 20.00 7.50 15.00 10.00 True Values 0.0400 0.0029 0.0025 0.0006 0.0409 0.0006 0.0030 0.0026 0.0029 0.0025 0.0006 0.0403 0.0404 p,ppLp PSpzI 0.0030 0.0030 0.0027 0.0007 0.0404 0.0030 0.0026 0.0006 0.0398 0.0380 0.0029 0.0390 0.003 1 0.0030 0.0024 0.0006 0.0396 30.89 30.19 22.08 18.42 7.29 14.86 10.56 9.92 9.34 9.01 21.30 19.60 5.5 I 9. LO 30.02 30.63 22.50 18.37 7.66 14.80 10.34 9.50 9.24 9.22 19.96 18.78 5.86 9.53 30.80 29.98 21.92 18.36 7.28 14.81 10.43 9.93 9.27 9.03 21.08 19.54 5.46 9.08 9.19 m 15.26 10.60 9.76 9.56 8.71 21.34 19.75 ep? 21.59 18.77 &2!2 31.58 30.07 30.94 22.24 18.46 8.13 14.78 10.33 10.38 9.58 9.15 20.06 18.74 6.33 9.36 30.87 30.37 2 I .63 18.70 7.98 14.76 10.45 9.95 9.35 8.92 21.13 19.50 6.00 8.88 22.33 18.57 19.96 20.05 9.57 47.03 14.64 10.32 10.36 9.51 9.17 20.07 18.64 6.29 9.39 m 1194 71.62 16.84 10.80 9.27 10.48 7.60 21.46 20.32 30.03 33.80 30.82 30.04 21.81 18.38 7.31 14.79 10.43 9.90 9.45 8.95 21.16 19.46 5.49 9.08 Table 2. Comparisons of the results in non-diagonal case. With gross errors Without gross errors X,,= 760, X,,= 360 X,,= 600, X , , = 200 CIE IEHu IEHa CIE IEHu IEHa CIE IEHu IEHa 0.0030 0.0026 0.0006 0.0407 Annotation: CIE: conventional indirect estimation; IEHu: indirect estimation based on Huber M-estimator: IEHa: indirect estimation based on Hampel’s three-part redescending M-estimator;underline: unreliable estimation result. Off-diagonal elements Diagonal elements Diagonal elements True Values Table 1. Comparisons of the results in diagonal case. With gross errors Without gross errors XI, = 1.8126 XI, = 7.2126 CIE IEHu IEHa CIE IEHu IEHa CIE IEHu IEHa s a g. 5 9 5’ B 5 2. 2 5 Q e 9 i 3 E s 0 3 c” e 2: cs Y. Zhm, 2 Gu and C. Zhou 4 Figure 2. Process network. The diagonal elements of the covariance matrix are: &O 30 20 20 7.5 15.00 10.00 10.00 10.00 8.10 20.00 20.001 and the off-diagonal elements are: V,, = V,,2= 6.00, V,.,, = V,,., = 9.00. The study is similar to the uncorrelated measurement error case. The results from the three methods are shown in Table 2. Similar conclusions can be drawn from the simulation results. The indirect method based M-estimator can give perfect results both with and without gross errors, though the proposed method perfom better than the indirect method based on Huber M-estimator in the case of large gross errors. However the conventional indirect estimator gives completely incorrect results for the correlated coefficients when the corresponding measurements are contaminatedby gross errors. Conclusions Estimation of measurement error covariance matrix is very useful in data rectification. A new robust indirect algorithm based on Hampel's three-part redescendmg M-estimators is presented. The mfluence of gross errors can either be elmhated or limited, so that the estimator can give reliable result even in the case of the presence of gross errors. Two examples demonstrate the robustness of the algorithm. Since the algorithm is restricted to the linear constraint, a further study will be focused on the robust estimation subject to nonlinear constraints. References Almasy, G.A., and Mah, R.S.H. 1984. Estimation of Measurement Error Variances from h c e s s Data. Ind. Eng. Chem. Process Des. Dev., 23(4), 779-784. 2. Keller, J.Y.; Zasadzinski, M., and h u a c h , M. 1992. Analytical Estimator of Measurement Error Variances in Data Reconciliation. Computers chem. Engng., 16(3), 185-188. 3. Chen, J.; Bandoni, A. and Romagnoli, J.A. 1997. Robust Estimation of Measurement Error VariancdCovariance from Process Sampling Data. Computers chem. Engng., 21(6), 593-600. 4. Huang, Y.C. 1990. Data Exploration and Robust Estimation. (In Chinese) Mapping Ress. Beijing. 1. Received: 20 June 1999; Accepted after revision: 14 May 2000. 108

1/--страниц