# Using ARX and NARX approaches for modeling and prediction of the process behavior: application to a reactor-exchanger

ASIA-PACIFIC JOURNAL OF CHEMICAL ENGINEERING
Asia-Pac. J. Chem. Eng. 2008; 3: 597–605. Published online 15 October 2008 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/apj.118

Special Theme Research Article

Yahya Chetouani*
Université de Rouen, Département Génie Chimique, Rue Lavoisier, 76821 Mont Saint Aignan Cedex, France

Received 30 June 2008; Revised 25 February 2008; Accepted 26 February 2008

ABSTRACT: Chemical industries are often characterized by nonlinear processes, and it is therefore difficult to obtain nonlinear models that accurately describe a plant in all regimes. The main contribution of this work is to establish a reliable model of the process behavior. This model should reflect the normal behavior of the process and allow it to be distinguished from abnormal behavior. Consequently, black-box identification based on the neural network (NN) approach, by means of a nonlinear autoregressive with exogenous input (NARX) model, has been chosen in this study. A comparison with an autoregressive with exogenous input (ARX) model based on the least squares criterion is carried out. This study also shows the choice and the performance of the ARX and NARX models in the training and test phases. Statistical criteria are used to validate these approaches against the experimental data. The identified neural model is implemented by training a multilayer perceptron artificial neural network (MLP-ANN) with input–output experimental data. An analysis of the number of inputs and hidden neurons, and of their influence on the behavior of the neural predictor, is carried out. A reactor-exchanger is used to illustrate the proposed ideas. Satisfactory agreement between identified and experimental data is found, and the results show that the neural model predicts the evolution of the process dynamics more accurately.
© 2008 Curtin University of Technology and John Wiley & Sons, Ltd.

KEYWORDS: reliability; modeling; neural network; NARX; ARX

*Correspondence to: Yahya Chetouani, Université de Rouen, Département Génie Chimique, Rue Lavoisier, 76821 Mont Saint Aignan Cedex, France. E-mail: Yahya.Chetouani@univ-rouen.fr

## INTRODUCTION

In the last few years, ever-growing interest has been shown in production quality standards and pollution phenomena in industrial environments. However, process development and the continuous demand for productivity have led to increasingly complex industrial units. The dynamic nature and nonlinear behavior of such units pose challenging control system design problems when products of constant purity are to be recovered. In chemical industries, it is absolutely necessary to control the process, and any drift or anomaly must be detected as soon as possible in order to prevent risks and accidents. Moreover, detecting a fault appearance on-line is justified by the need to solve problems effectively within a short time.[1–3] The anomaly detection module is intended to supervise the functioning state of the system.[4] This module has to generate on-line information concerning the state of the automated system. This state is characterized not only by control and measurement variables (temperature, rate, etc.) but also by the general behavior of the process and its history, showing in time whether the behavior of the system is normal or presents drifts. In the context of numerical control, fault detection and isolation (FDI) proves a vital complement to adaptive means of dealing with perturbations in nonlinear, highly nonstationary systems. Under normal conditions, the fault detection module allows all information to be processed and managed in direct liaison with the general behavior of the process. Otherwise, it detects any anomaly and alerts the operator by setting off the appropriate alarms.
The intrinsically highly nonlinear behavior of industrial processes, especially when a chemical reaction is involved, poses a major problem for the formulation of good predictions and the design of reliable control systems.[5] Owing to the relevant number of degrees of freedom, to the nonlinear coupling of different phenomena and to the complexity of the processes, mathematical modeling of the process is computationally heavy and may produce an unsatisfactory correspondence between experimental and simulated data. Similar problems also arise from the uncertainty in the parameters of the process, such as the reaction rate, activation energy, reaction enthalpy and heat transfer coefficient, and from their unpredictable variations. In fact, note that most of the chemical and thermophysical variables both strongly depend on, and instantaneously influence, the temperature of the reaction mass.[4] One way of addressing this problem is the use of a reliable model for the on-line prediction of the system's dynamic evolution, and designing empirical models such as black-box models is then unavoidable. Various process identification techniques have already been proposed.[6] Owing to their inherent ability to model and learn 'complexities', artificial neural networks (ANNs) have found wide application in various areas of chemical engineering and related fields.[7,8] Engell et al.[9] discussed general aspects of the control of reactive separation processes. They used a semi-batch reactive distillation process, and a comparison was carried out between conventional control structures and model-based predictive control using a neural net plant model. Savkovic-Stevanovic[10] used a neural network (NN) for product composition control of a distillation plant. The NN controller design is based on inverse dynamic modeling of the process. The back-propagation (BP) algorithm is applied to the dynamic nonlinear relationship between product composition and reflux flow rate.
The results obtained illustrate the feasibility of using a neural net to learn a nonlinear dynamic model of a distillation column from plant input–output data, and to control it. Assaf et al.[11] modeled an ethylene oxidation fixed-bed reactor with a phenomenological model. They compared the results given by this model with those given by a neural model for possible thermal runaway situations of a highly exothermic process. The final objective is to build a reliable inference alarm algorithm for fast detection and prevention of such situations. Nanayakkara et al.[12] presented a novel NN to control an ammonia refrigerant evaporator. The objective is to control the evaporator heat flow rate and the secondary fluid outlet temperature, while keeping the degree of refrigerant superheat at the evaporator outlet, by manipulating the refrigerant and evaporator secondary fluid flow rates. In a previous paper,[13] a reduced and reliable neural model, which reproduces the dynamics of a nonlinear process such as a distillation column under steady-state or unsteady-state regimes, was developed. The main contribution of the current work is to obtain a powerful model capable of reproducing the dynamic behavior of a process such as a reactor-exchanger. The present study focuses on the development and implementation of a nonlinear autoregressive with exogenous input (NARX) neural model for one-step-ahead forecasting of the reactor-exchanger dynamics. The performance of this stochastic model is then evaluated using performance criteria. A comparison with an autoregressive with exogenous input (ARX) model based on the least squares criterion is carried out. Experiments were performed in a reactor-exchanger, and the experimental data were used both to define and to validate these models. This study is also carried out in order to evaluate the time delay of the process.
It shows that several ARX models could be selected. Statistical information criteria and goodness-of-fit criteria are used in order to make a judicious choice of the identified ARX model. Finally, the results show that the NARX neural model is more representative than the ARX model for modeling the dynamic behavior of the studied process. The identification procedure, the experimental setup and the prediction results are described in the following sections.

## INPUT–OUTPUT MODELING APPROACH

Modeling is an essential precursor to the parameter estimation process. Identification strategies of various kinds based on input–output measurements are commonly used in situations in which it is not necessary to achieve a deep mathematical knowledge of the system under study, but it is sufficient to predict the system evolution.[14,15] This is often the case in control applications, where satisfactory predictions of the system to be controlled and sufficient robustness to parameter uncertainty are the only requirements. In chemical systems, parameter variations and uncertainty play a fundamental role in the system dynamics and are very difficult to model accurately.[5] Therefore, the identification approach based on input–output measurements can be applied.

### NARX neural modeling

In order to provide a closer approximation to the actual process in some situations, a nonlinear NARX model is employed,[16,17] which is identified by means of ANNs. The NARX model was obtained by using multilayer perceptron ANNs (MLP-ANNs)[18,19] to accurately describe the process behavior. This approach bypasses the exact determination of the model parameters and their unpredictable variations, as well as the need for deep physical knowledge of the process and its governing equations. The nonlinear model of a finite-dimensional system[20] with order $(n_y, n_u)$ and scalar variables $y$ and $u$ is defined by

$$y(t) = \varphi(y(t-1), \ldots, y(t-n_y), u(t-1), \ldots, u(t-n_u)) \tag{1}$$

where $y(t)$ is the autoregressive (AR) variable or system output, $u(t)$ is the exogenous (X) variable or system input, $n_y$ and $n_u$ are the AR and X orders, respectively, and $\varphi$ is a nonlinear function.

The NN realizing Eqn (1) consists of highly interconnected layers of neuron-like nodes. It has an input and an output layer, and any optional layers included between these are termed hidden layers. Figure 1 shows a typical feed-forward network architecture with one hidden layer. The term 'feed-forward' means that the connections between nodes only allow signals to be sent to the next layer of nodes and not to the previous one.[21] The number of nodes in a hidden layer is determined by the user and can vary from zero to any finite number. The numbers of nodes in the input and output layers are determined by the numbers of input and output variables, respectively. This structure is based on a result by Cybenko,[22] who proved that a NN with one hidden layer of sigmoid or hyperbolic tangent units and an output layer of linear units is capable of approximating any continuous function.

Figure 1. Feed-forward network for prediction.

The sigmoid activation function is

$$f(z) = \frac{1}{1 + e^{-z}} \tag{2}$$

where $z$ is the sum of the weighted inputs and the bias term. The determination of these weights for the node connections allows the ANN to learn the information about the system to be modeled. The input data are presented to the network via the input layer. These data are propagated through the network to the output layer to obtain the network output. The network error is then determined by comparing the network output with the actual output. If the error is not smaller than a desired performance, the weights are adjusted and the training data are presented to the network again to determine a new network error. One of the most well-known training algorithms is the BP algorithm.[23] In this algorithm, as with any other gradient approach, large values of the learning rate speed up the learning process but lead to instability, and convergence can only be expected for small values of the learning rate. The momentum factor is used to damp down oscillations in the learning process. The procedure is repeated until the network error reaches the desired performance; the network is then said to have converged, and the last set of weights is retained as the network parameters.

#### Calculation of the NN output

The following steps explain the calculation of the NN output from the input vector.[13,14]

1. Assign $\hat{w}^T(k)$ to the input vector $x^T(k)$ and apply it to the input units, where $\hat{w}^T(k)$ is the regression vector given by

$$\hat{w}^T(t) = [y(t-1), \ldots, y(t-n_y), u(t-1), \ldots, u(t-n_u)] \tag{3}$$

2. Calculate the input to the hidden-layer units:

$$net_j^h(k) = \sum_{i=1}^{p} W_{ji}^h(k)\, x_i(k) + b_j^h \tag{4}$$

where $p$ is the number of input nodes of the network, i.e. $p = n_y + n_u$; $j$ denotes the $j$th hidden unit; $W_{ji}^h$ is the connection weight between the $i$th input unit and the $j$th hidden unit; and $b_j^h$ is the bias term of the $j$th hidden unit.

3. Calculate the output from a node in the hidden layer:

$$z_j = f_j^h(net_j^h(k)) \tag{5}$$

where $f_j^h$ is the sigmoid function defined by Eqn (2).

4. Calculate the input to the output nodes:

$$net_l^q(k) = \sum_{j=1}^{h} W_{lj}^q(k)\, z_j(k) \tag{6}$$

where $l$ denotes the $l$th output unit and $W_{lj}^q(k)$ is the connection weight between the $j$th hidden unit and the $l$th output unit.

5. Calculate the outputs from the output nodes:

$$\hat{v}_l(k) = f_l^q(net_l^q(k)) \tag{7}$$

where $f_l^q$ is the linear activation function defined by

$$f_l^q(net_l^q(k)) = net_l^q(k) \tag{8}$$

#### Back-propagation training algorithm

The error function $E$ is defined as

$$E = \frac{1}{2}\sum_{l=1}^{q} (v_l(k) - \hat{v}_l(k))^2 \tag{9}$$

where $q$ is the number of output units and $v_l(k)$ is the $l$th element of the output vector of the network. Within each time interval from $k$ to $k+1$, the BP algorithm tries to minimize the error $E$ by adjusting the weights of the network connections, i.e. $W_{ji}^h$ and $W_{lj}^q$. The BP algorithm uses the following procedure (Eqns (10)–(13)):

$$W_{ji}^h(k+1) = W_{ji}^h(k) + \alpha \Delta W_{ji}^h(k) - \eta \frac{\partial E}{\partial W_{ji}^h(k)} \tag{10}$$

$$W_{lj}^q(k+1) = W_{lj}^q(k) + \alpha \Delta W_{lj}^q(k) - \eta \frac{\partial E}{\partial W_{lj}^q(k)} \tag{11}$$

where $\eta$ and $\alpha$ are the learning rate and the momentum factor, respectively; $\Delta W_{ji}^h$ and $\Delta W_{lj}^q$ are the amounts of the previous weight changes; and the gradients are given by

$$\frac{\partial E}{\partial W_{ji}^h(k)} = -\left[z_j(k)(1-z_j(k))\, x_i(k)\right] \sum_{l=1}^{q} \left[(v_l(k) - \hat{v}_l(k))\, W_{lj}^q(k)\right] \tag{12}$$

$$\frac{\partial E}{\partial W_{lj}^q(k)} = -(v_l(k) - \hat{v}_l(k))\, z_j(k) \tag{13}$$

The implementation of the NN for forecasting is as follows:

1. Initialize the weights using small random values, and set the learning rate and momentum factor for the NN.
2. Apply the input vector given by Eqn (3) to the input units.
3. Calculate the forecast value and the error using the data available at the $(k-1)$th sample (Eqns (3)–(8)).
4. Calculate the error between the forecast value and the measured value.
5. Propagate the error backwards to update the weights (Eqns (10)–(13)).
6. Go back to step 2.

For weight initialization, the Nguyen–Widrow method[24] is best suited for use with the sigmoid/linear network, which is often used for function approximation.

### ARX modeling

The second method adopted for modeling the reactor-exchanger is based on parametric identification of an ARX model. The choice of this strategy is justified by its simplicity of implementation.
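The forward pass (Eqns (2)–(8)) and the BP weight updates with momentum (Eqns (10)–(13)) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the class layout and variable names are assumptions made here, and the hidden biases are kept fixed at zero for brevity.

```python
import numpy as np

def sigmoid(z):
    # Eqn (2): f(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

class MLP:
    """One-hidden-layer perceptron: sigmoid hidden units, linear output units."""
    def __init__(self, p, h, q, eta=0.1, alpha=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.Wh = rng.uniform(-0.5, 0.5, (h, p))   # input -> hidden weights
        self.bh = np.zeros(h)                      # hidden biases (fixed here)
        self.Wq = rng.uniform(-0.5, 0.5, (q, h))   # hidden -> output weights
        self.eta, self.alpha = eta, alpha          # learning rate, momentum
        self.dWh = np.zeros_like(self.Wh)          # previous weight changes
        self.dWq = np.zeros_like(self.Wq)

    def forward(self, x):
        self.x = x
        self.z = sigmoid(self.Wh @ x + self.bh)    # Eqns (4)-(5)
        return self.Wq @ self.z                    # Eqns (6)-(8), linear output

    def backprop(self, v, v_hat):
        e = v - v_hat                              # output error
        gWq = -np.outer(e, self.z)                                        # Eqn (13)
        gWh = -np.outer((self.Wq.T @ e) * self.z * (1 - self.z), self.x)  # Eqn (12)
        self.dWq = self.alpha * self.dWq - self.eta * gWq                 # Eqn (11)
        self.dWh = self.alpha * self.dWh - self.eta * gWh                 # Eqn (10)
        self.Wq += self.dWq
        self.Wh += self.dWh
```

Each training pattern is pushed through `forward` and the resulting error through `backprop`, repeating over the dataset until the error reaches the desired performance, as described above.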
The evolution of the estimated output makes it possible to follow the evolution of the process dynamics and to reflect the presence of a fault through the variation of the estimated parameters.[25] The aim of this contribution is to analyze the modeling improvement in comparison with the NARX modeling. ARX modeling has been the subject of studies in several fields, such as chemical engineering,[26,27] agriculture and biological science,[28,29] medicine,[30] energy and power,[31] and energy economics.[32] The ARX approach requires the determination of the model orders and the time delay. Training and test phases validate the identified model. The ARX structure describes the effects of the input $u(t)$ on the process output $y(t)$. The ARX model is represented by the following expression:

$$y(t) = -a_1 y(t-1) - \cdots - a_{n_a} y(t-n_a) + b_1 u(t-1-n_k) + \cdots + b_{n_b} u(t-n_b-n_k) + e(t) \tag{14}$$

where $e(t)$ refers to the noise, supposed to be Gaussian; the $a_i$ and $b_i$ are the model parameters; $n_a$ and $n_b$ indicate the orders of the output polynomial $A(q)$ and the input polynomial $B(q)$, respectively; and $n_k$ is the time delay between $y(t)$ and $u(t)$. The polynomial representation of Eqn (14) is

$$A(q)y(t) = B(q)u(t-n_k) + e(t) \tag{15}$$

where $A(q)$ and $B(q)$ are given by

$$A(q) = 1 + a_1 q^{-1} + \cdots + a_{n_a} q^{-n_a} \tag{16}$$

$$B(q) = b_1 q^{-1-n_k} + \cdots + b_{n_b} q^{-n_b-n_k} \tag{17}$$

and $q^{-1}$ is the shift operator, such that

$$u(t-1) = q^{-1} u(t) \tag{18}$$

$A(q)$ and $B(q)$ are estimated by least squares identification.[33,34]

## EXPERIMENTAL RESULTS

### Experimental device

The reactor-exchanger (Fig. 2) is a glass-jacketed reactor with a tangential input for the heat transfer fluid. The main aim of this input mode is to give the cooling fluid a higher velocity in the flow direction. The reactor is equipped with an electrical calibration heater and an input system. It is also equipped with Pt 100 temperature probes.
The heating–cooling system, which uses a single heat transfer fluid, works within the temperature range of −15 to +200 °C. Supervision software allows the parameters and their setpoint values to be adjusted; it displays and stores data during the experiment for further exploitation. The input of the reactor-exchanger, $u(t)$, represents the heat transfer fluid temperature used for heating–cooling the water; $y(t)$ represents the outlet temperature of the reactor-exchanger.

Figure 2. Experimental device: a reactor-exchanger.

The process is excited by an input signal rich in frequencies and amplitudes in order to obtain a dataset appropriate for the estimation. The set of experimental data is composed of two measurement vectors: the input $u(t)$ and the output $y(t) = f(u(t))$ of the reactor-exchanger. This nonlinear relationship between $u(t)$ and $y(t)$ has to be determined by the neural regression depicted by Eqn (1). For this reason, the excitation frequency is varied randomly between a minimal and a maximal value [f min = 1/(12 s), f max = 1/(30 min)]. On the other hand, the heat transfer fluid temperature is varied in the interval 15–90 °C. This interval includes all the cooling temperatures used in practice, i.e. from the minimal to the maximal cooling temperature. The duration of the experiment is 22 h, and the sampling time is fixed at 2 s. Before starting the parameter estimation, the available data are divided into two separate sets. The first subset is the training subset, which is used for computing the gradient and updating the network weights. The second subset is the validation set. The first one is sufficiently informative and covers the whole spectrum.
The second one contains sufficient elements to make the validation as credible as possible. All data were standardized (zero mean and unit standard deviation).

### Establishment of ARX models

A set of models is built by fixing $n_a = 1, \ldots, 5$, $n_b = 1, \ldots, 5$ and $n_k = 1, \ldots, 10$. Models that have $n_a$ lower than $n_b$ are rejected in order to respect the physical aspect of the process. Consequently, a set of 150 models is obtained and estimated, taking into account the stability of each model through the Lyapunov criterion.[33]

### Estimate of the time delay

There are several methods for estimating the time delay.[18,33,34] In this paper, the adopted approach is based on the evaluation of the quadratic criterion:[34]

$$V(\theta) = \frac{1}{N}\sum_{t=1}^{N} \varepsilon(t,\theta)^2 \tag{19}$$

where $\varepsilon(t,\theta) = y(t) - \hat{y}(t)$ is the prediction error and $\hat{y}(t)$ the associated predictor. The quadratic criterion is calculated for each time delay value $n_k = 1, \ldots, 10$. This method is applied to two simple ARX models ($n_a = n_b = 1$ and $n_a = n_b = 2$). The choice of these simple models allows the criterion evolution to be observed as a function of the time delay, without the delay being compensated by a high-complexity model. The criterion evolution for these simple models is shown in Fig. 3. Examining the curve for $n_a = n_b = 2$, a minimum of the criterion is easy to observe for $n_k = 5, 6, 7, 8$; this is supported clearly for $n_k = 6, 7$ in the curve for $n_a = n_b = 1$. Therefore, the time delay values $n_k = 6$ and $n_k = 7$ are both considered first.
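The least squares ARX fit of Eqn (14) and the time-delay scan of the quadratic criterion (Eqn (19)) can be sketched as follows. This is a minimal sketch, not the authors' code; a synthetic system with a known delay stands in for the reactor measurements.

```python
import numpy as np

def fit_arx(y, u, na, nb, nk):
    """Least-squares ARX fit of Eqn (14); returns parameters and V(theta), Eqn (19)."""
    n0 = max(na, nb + nk)
    rows, targets = [], []
    for t in range(n0, len(y)):
        # regressor: [-y(t-1) ... -y(t-na), u(t-1-nk) ... u(t-nb-nk)]
        row = [-y[t - i] for i in range(1, na + 1)]
        row += [u[t - i - nk] for i in range(1, nb + 1)]
        rows.append(row)
        targets.append(y[t])
    Phi, Y = np.array(rows), np.array(targets)
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    eps = Y - Phi @ theta                   # prediction error
    return theta, float(np.mean(eps ** 2))  # V(theta) = (1/N) sum eps^2

# Synthetic system with a true delay of nk = 3 samples (illustration only)
rng = np.random.default_rng(1)
u = rng.standard_normal(2000)
y = np.zeros(2000)
for t in range(5, 2000):
    y[t] = 0.7 * y[t - 1] + 0.5 * u[t - 4] + 0.01 * rng.standard_normal()

# Scan the delay as in the paper (na = nb = 1, nk = 1..10)
V = {nk: fit_arx(y, u, 1, 1, nk)[1] for nk in range(1, 11)}
best = min(V, key=V.get)
```

On such data the criterion dips sharply at the true delay, which is the behavior exploited in Fig. 3 to retain only $n_k = 6$ and $n_k = 7$.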
Figure 3. Criterion evolution according to the time delay $n_k$.

Each model with a time delay other than $n_k = 6$ or $n_k = 7$ is then rejected.

### Goodness of fit

The goodness-of-fit criterion (GFC) allows a judicious selection of models. This criterion, proposed by Hagenblad et al.,[35] is based on the analysis of the prediction error and the output variance. It is given by the following expression:

$$Q = 100 \times \left(1 - \frac{\sum_{k=1}^{N} (\hat{y}(k) - y(k))^2}{\sum_{k=1}^{N} \left(y(k) - \frac{1}{n}\sum_{i=1}^{n} y(i)\right)^2}\right) \tag{20}$$

Figure 4 shows the GFC evolution for the different models $M_{n_a.n_b}$. Models M3.2, M4.2 and M5.5 have a good quality of adjustment compared with the other models (important peaks). Model M5.5 is not chosen because it is too large. The peak of model M4.2 is more important than that of model M3.2; consequently, model M4.2 is more representative of the dynamic behavior than model M3.2, and this holds for both time delay values ($n_k = 6$ and $n_k = 7$). In conclusion, the model M4.2.7, which has $n_k = 7$, is the most suitable one for reproducing the process dynamics of the reactor-exchanger.

Figure 4. Criterion evolution according to the different models $M_{n_a.n_b}$.

### Establishment of NARX neural models

In order to establish a suitable model order for a particular system, NNs of increasing model order can be trained and their performance on the training data compared using the loss function (LF), or mean squared error, expressed by the following equation:

$$\mathrm{LF} = \frac{1}{N}\sum_{t=1}^{N} \varepsilon^2(t) \tag{21}$$

where $\varepsilon(t) = y(t) - \hat{y}(t)$ represents the prediction error and $N$ is the data length. The number of hidden nodes is varied between 1 and 15. In fact, a minimal number of inputs is avoided to ensure the model flexibility, and a maximal number of inputs is excluded to avoid overfitting of the model. Training on the database gives the evolution of the loss function. In order to clearly show the minimum of the LF for each model according to the number of hidden nodes, the LF evolution is separated into two different figures (Figs 5 and 6). $M_{n_y.n_u.n_h}$ represents a neural model whose input layer is composed of $n_y$ outputs and $n_u$ inputs, with $n_h$ hidden nodes. These figures show the LF on the same training data for the different NN models according to the number of hidden nodes.

Figure 5. Evolution of the loss function for low complexity models.

Figure 6. Evolution of the loss function for high complexity models.

Model M3.2.10 exhibits the lowest LF; however, this model may not be the best choice, because there is a trade-off between model complexity (i.e. size) and accuracy. A small decrease in the LF may be rejected if it comes at the expense of enlarging the model. Thus, the decision procedure for selecting a parsimonious model using the LF is to decide, for each increase in model order, whether any reduction in the LF is worth the expense of a larger model. This difficult trade-off between model accuracy and complexity can be clarified by using model parsimony indices from linear estimation theory, such as Akaike's information criterion (AIC), Rissanen's minimum description length (MDL) and the Bayesian information criterion (BIC). The validation phase thus makes it possible to distinguish the model correctly describing the dynamic behavior of the process. These statistical criteria are defined as follows:

$$\mathrm{AIC} = \ln\left(\frac{N\,\mathrm{LF}}{2}\right) + \frac{2 n_w}{N} \tag{22}$$

$$\mathrm{MDL} = \ln\left(\frac{N\,\mathrm{LF}}{2}\right) + \frac{2 n_w \ln(N)}{N} \tag{23}$$

$$\mathrm{BIC} = \ln\left(\frac{N\,\mathrm{LF}}{2}\right) + \frac{n_w \ln(N)}{N} \tag{24}$$

where $n_w$ is the number of model parameters (weights in a NN). Hence, the AIC, MDL and BIC are weighted functions of the LF, which penalize reductions in the prediction errors obtained at the expense of increasing model complexity (i.e. model order and number of parameters). Strict application of these statistical criteria means that the model structure with the minimum AIC, MDL or BIC is selected as the parsimonious structure. However, in practice, engineering judgment may need to be exercised. Figure 7 shows the evolution of the AIC, MDL and BIC criteria at the LF minimum for each model.

Figure 7. Evolution of the criteria for the LF minimum.

A strict application of the indices would select the models M2.2.3 and M3.2.10, because they exhibit the lowest of the three indices over all the model structures compared. Although, in this case, the AIC, MDL and BIC criteria do not point unambiguously to a particular model, their interpretation provides further support for the choice of the M3.2.10 model indicated by the LF. On the basis of engineering judgment, the model M2.2.3 would be preferred, without significant loss of accuracy.

### Residual analysis

Once the training and the test of the ARX and NARX models have been completed, they should be ready to simulate the system dynamics. Model validation tests should be performed to validate the identified model. Billings et al.[36] proposed some correlation-based model validity tests.
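The parsimony indices of Eqns (22)–(24), built on the loss function of Eqn (21), can be computed as below. This is a sketch under the equation forms given above, with illustrative residual sequences rather than the paper's data; the weight counts are invented for the example.

```python
import numpy as np

def parsimony_indices(eps, nw):
    """AIC, MDL and BIC (Eqns (22)-(24)) from residuals eps and nw parameters."""
    eps = np.asarray(eps, dtype=float)
    N = len(eps)
    LF = np.mean(eps ** 2)                      # Eqn (21)
    base = np.log(N * LF / 2.0)
    aic = base + 2 * nw / N                     # Eqn (22)
    mdl = base + 2 * nw * np.log(N) / N         # Eqn (23)
    bic = base + nw * np.log(N) / N             # Eqn (24)
    return aic, mdl, bic

# Illustration: a bigger model must cut the LF enough to justify its extra weights
rng = np.random.default_rng(0)
eps_small = 0.10 * rng.standard_normal(1000)    # compact model, few weights
eps_big = 0.09 * rng.standard_normal(1000)      # slightly better fit, many weights
small = parsimony_indices(eps_small, nw=13)
big = parsimony_indices(eps_big, nw=151)
```

Here the larger model's small LF reduction does not offset its parameter penalty, which is exactly the trade-off the paper resolves in favor of M2.2.3.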
In order to validate the identified model, it is necessary to evaluate the properties of the errors that affect the prediction of the model outputs, defined as the differences between the experimental and simulated time series. In general, the characteristics of the error are considered satisfactory when the error behaves as white noise, i.e. it has zero mean and is uncorrelated.[5,36] If both these conditions are satisfied, the identified model has captured the deterministic part of the system dynamics, which is therefore accurately modeled. To this aim, it is necessary to verify that the autocorrelation function of the normalized error $\varepsilon(t)$, namely $\phi_{\varepsilon\varepsilon}(\tau)$, assumes the value 1 for $\tau = 0$ and 0 elsewhere; in other words, the function must behave as an impulse. This autocorrelation is defined as follows:[36,37]

$$\phi_{\varepsilon\varepsilon}(\tau) = E[\varepsilon(t-\tau)\varepsilon(t)] = \delta(\tau) \quad \forall \tau \tag{25}$$

where $\varepsilon$ is the model residual, $E(X)$ is the expected value of $X$, and $\tau$ is the lag. This condition is, of course, ideal and, in practice, it is sufficient to verify that $\phi_{\varepsilon\varepsilon}(\tau)$ remains in a confidence band, usually fixed at 95%, which means that $\phi_{\varepsilon\varepsilon}(\tau)$ must remain inside the range $\pm 1.96/\sqrt{N}$, with $N$ the number of test data on which $\phi_{\varepsilon\varepsilon}(\tau)$ is calculated. Billings et al.[36] also proposed tests examining the cross correlation between the model residuals and the inputs. This cross correlation is defined by the following equation:

$$\phi_{u\varepsilon}(\tau) = E[u(t-\tau)\varepsilon(t)] = 0 \quad \forall \tau \tag{26}$$

To implement these tests (Eqns (25) and (26)), $u$ and $\varepsilon$ are normalized to give zero-mean sequences of unit variance. The sampled cross-correlation function between two such data sequences $u(t)$ and $\varepsilon(t)$ is then calculated as

$$\phi_{u\varepsilon}(\tau) = \frac{\sum_{t=1}^{N-\tau} u(t)\,\varepsilon(t+\tau)}{\left[\sum_{t=1}^{N} u^2(t) \sum_{t=1}^{N} \varepsilon^2(t)\right]^{1/2}} \tag{27}$$

If Eqns (25) and (26) are satisfied, then the model residuals are a random sequence, not predictable from the inputs, and the model is considered adequate. These correlation-based tests are used here to validate the NN model. The results are presented in Fig. 8, in which the dash–dot lines are the 95% confidence bands. The evolution of the cross correlation of the ARX and NARX models is inside the 95% confidence bands; in addition, the NARX cross correlation is the lowest, which indicates that the residual signal is independent of the input. The autocorrelation of the ARX model exceeds the threshold (95%) for a few points, which explains why this model cannot completely restore the nonlinear part of the signal. For the NARX neural model, all points are inside the 95% confidence bands. Therefore, this model is considered a reliable one for describing the dynamic behavior of the process. This validation phase is used with the neural weights found in the training phase, and there is good agreement between the learned neural model and the experiment in the validation phase. This result is important because it shows the ability of a NN with only one hidden layer to interpolate any nonlinear function.[22] Moreover, the analysis of the computational results has confirmed that the NARX neural model is more powerful than the ARX model. Figure 9 shows the difference between the experimental output and that simulated by the neural model M2.2.3. Analyzing this figure, it emerges that the NARX model M2.2.3 ensures satisfactory performance, as it is indeed able to correctly identify the dynamics of the reactor-exchanger. The main advantage of the proposed neural approach consists in the natural ability of NNs to model nonlinear dynamics in a fast and simple way, and in the possibility of treating the process to be modeled as an input–output black box, with little or no mathematical information on the system.
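The whiteness and cross-correlation checks of Eqns (25)–(27), with the $\pm 1.96/\sqrt{N}$ band, can be sketched as follows. This is a minimal sketch with illustrative noise sequences, not the experimental residuals; function names are mine.

```python
import numpy as np

def normalize(x):
    # zero-mean, unit-variance sequence, as required before Eqn (27)
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def cross_corr(u, eps, max_lag=20):
    """Sampled correlation function of Eqn (27) for lags 0..max_lag."""
    u, eps = normalize(u), normalize(eps)
    denom = np.sqrt(np.sum(u ** 2) * np.sum(eps ** 2))
    return np.array([np.sum(u[: len(u) - tau] * eps[tau:]) / denom
                     for tau in range(max_lag + 1)])

def passes_whiteness(eps, max_lag=20):
    """Eqn (25): residual autocorrelation stays inside +/- 1.96 / sqrt(N)."""
    phi = cross_corr(eps, eps, max_lag)
    band = 1.96 / np.sqrt(len(eps))
    return bool(np.all(np.abs(phi[1:]) < band))   # phi[0] = 1 by construction

rng = np.random.default_rng(42)
white = rng.standard_normal(5000)                 # white-noise-like residuals
```

In practice, as with the ARX model above, even acceptable residuals may leave the band at a few isolated lags, so the test is usually read as "almost all points inside the 95% band" rather than as a strict pass/fail.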
Figure 8. Results of model validation tests: auto- and cross-correlations of the ARX and NARX models with 95% confidence bands.

## CONCLUSION

This work aims to model process dynamics by means of input–output experimental measurement models (ARX and NARX). The dynamics modeling of this reactor-exchanger provides a useful solution for the formulation of a reliable model. In this case, the experimental results showed that the NARX model is able to give more satisfactory descriptions of the experimental data than the ARX model. Finally, the obtained neural model will be useful as a reference model for the FDI of anomalies occurring in the process dynamics.

Figure 9. Prediction error of the output temperature.

Acknowledgments

The author is grateful to two anonymous referees for their comments on a previous draft of this paper.

## REFERENCES

[1] Y. Chetouani. Process Saf. Environ. Prot., 2006; 84, 27–32.
[2] Y. Chetouani. Int. J. Reliab. Saf., 2007; 1, 411–427.
[3] J. Villermaux. New horizons in chemical engineering. In Proceedings of the Plenary Reading in Fifth World Congress of Chemical Engineering, San Diego, 1996.
[4] Y. Chetouani. Chem. Eng. Process., 2004; 43, 1579–1585.
[5] L. Cammarata, A. Fichera, A. Pagano. Appl. Energy, 2002; 72, 513–528.
[6] I.J. Leontaritis, S.A. Billings. Int. J. Control, 1985; 41, 303–328.
[7] D.M. Himmelblau. Korean J. Chem. Eng., 2000; 17, 373–392.
[8] R. Sharma, K. Singh, D.
Singhal, R. Ghosh. Chem. Eng. Process., 2004; 43, 841–847.
[9] S. Engell, G. Fernholz. Chem. Eng. Process., 2003; 42, 201–210.
[10] J. Savkovic-Stevanovic. Comput. Chem. Eng., 1996; 20, 925–930.
[11] E.M. Assaf, R.C. Giordano, C.A. Nascimento. Chem. Eng. Sci., 1996; 3, 107–112.
[12] V.K. Nanayakkara, Y. Ikegami, H. Uehara. Int. J. Refrigeration, 2002; 25, 813–826.
[13] Y. Chetouani. Int. J. Comput. Sci. Appl., 2007; 4, 119–133.
[14] E.H.K. Fung, Y.K. Wong, H.F. Ho, M.P. Mignolet. Appl. Math. Model., 2003; 27, 611–627.
[15] J. Mu, D. Rees, G.P. Liu. Control Eng. Pract., 2005; 13, 1001–1015.
[16] F. Previdi. Control Eng. Pract., 2002; 10, 91–99.
[17] S.J. Qin, T.J. McAvoy. Comput. Chem. Eng., 1996; 20, 147–159.
[18] S. Chen, S.A. Billings. Int. J. Control, 1989; 49, 1013–1032.
[19] K.S. Narendra, K. Parthasarathy. IEEE Trans. Neural Netw., 1990; 1, 4–21.
[20] L. Ljung. System Identification: Theory for the User, Prentice Hall, Englewood Cliffs, NJ, 1999.
[21] M.R. Warnes, J. Glassey, G.A. Montague, B. Kara. Process Biochem., 1996; 31, 147–155.
[22] G. Cybenko. Math. Control Signals Syst., 1989; 4, 303–312.
[23] D.E. Rumelhart, G.E. Hinton, R.J. Williams. Nature, 1986; 323, 533–536.
[24] D. Nguyen, B. Widrow. Int. Joint Conf. Neural Netw., 1990; 3, 21–26.
[25] R. Iserman. Automatica, 1993; 29, 815–835.
[26] D.E. Rivera, S.V. Gaikwad. J. Process Control, 1995; 5, 213–224.
[27] S. Rohani, M. Haeri, H.C. Wood. Comput. Chem. Eng., 1999; 23, 279–286.
[28] M.L. Fravolini, A. Ficola, M. La Cava. J. Food Eng., 2003; 60, 289–299.
[29] H.U. Frausto, J.G. Pieters, J.M. Deltour. Biosyst. Eng., 2003; 84, 147–157.
[30] Y. Liu, A.A. Birch, R. Allen. Med. Eng. Phys., 2003; 25, 647–653.
[31] H. Yoshida, S. Kumar. Renewable Energy, 2001; 22, 53–59.
[32] J.V. Ringwood, P.C. Austin, W. Monteith. Energy Econ., 1993; 15, 285–296.
[33] L. Ljung. System Identification: Theory for the User, Prentice-Hall, New Jersey, 1987.
[34] L. Ljung. System Identification Toolbox User's Guide, The MathWorks, Natick, MA, 2000.
[35] A. Hagenblad, L. Ljung. Maximum likelihood identification of Wiener models with a linear regression initialization. In Proceedings of the 37th IEEE Conference on Decision and Control, Tampa, FL, 1998.
[36] S.A. Billings, W.S.F. Voon. Int. J. Control, 1986; 44, 235–244.
[37] J. Zhang, J. Morris. Fuzzy Sets Syst., 1996; 79, 127–140.
