Asia-Pac. J. Chem. Eng. 2008; 3: 597–605
Published online 15 October 2008 in Wiley InterScience
DOI: 10.1002/apj.118
Special Theme Research Article
Using ARX and NARX approaches for modeling and
prediction of the process behavior: application to a
reactor-exchanger
Yahya Chetouani*
Université de Rouen, Département Génie Chimique, Rue Lavoisier, 76821, Mont Saint Aignan Cedex, France
Received 30 June 2008; Revised 25 February 2008; Accepted 26 February 2008
ABSTRACT: Chemical industries are often characterized by nonlinear processes. Therefore, it is often difficult to obtain
nonlinear models that accurately describe a plant in all regimes. The main contribution of this work is to establish
a reliable model of a process behavior. The use of this model should reflect the normal behavior of the process and
allow distinguishing it from an abnormal one. Consequently, the black-box identification based on the neural network
(NN) approach by means of a nonlinear autoregressive with exogenous input (NARX) model has been chosen in this
study. A comparison with an autoregressive with exogenous input (ARX) model based on the least squares criterion is
carried out. This study also shows the choice and the performance of ARX and NARX models in the training and test
phases. Statistical criteria are used for the validation of the experimental data of these approaches. The identified neural
model is implemented by training a multilayer perceptron artificial neural network (MLP-ANN) with input–output
experimental data. An analysis of the number of inputs and hidden neurons and their influence on the behavior of the neural
predictor is carried out. In order to illustrate the proposed ideas, a reactor-exchanger is used. Satisfactory agreement
between identified and experimental data is found and results show that the neural model predicts the evolution of the
process dynamics in a better way. © 2008 Curtin University of Technology and John Wiley & Sons, Ltd.
KEYWORDS: reliability; modeling; neural network; NARX; ARX
INTRODUCTION
In the last few years, ever-growing interest has been
shown in production quality standards and pollution
phenomena in industrial environments. However, process development and the continuous demand for productivity have led to an increasing complexity of industrial
units. The dynamic nature and the nonlinear behavior
of such units pose challenging control system design problems
when products of constant purity are to be recovered. In
chemical industries, it is absolutely necessary to control
the process and any drift or anomaly must be detected
as soon as possible in order to prevent risks and accidents. Moreover, detecting a fault appearance on-line is
justified by the need to effectively solve the problems
within a short time.[1 – 3] The anomaly detection module is intended to supervise the functioning state of the
system.[4] This module has to generate on-line information concerning the state of the automated system. This
state is characterized not only by control and measurement variables (temperature, rate, etc.) but also by the
general behavior of the process and its history, showing
in time whether the behavior of the system is normal or
presents drifts. In the context of numerical control, fault
detection and isolation (FDI) proves a vital complement
to the adaptive means of dealing with perturbations in
nonlinear, highly nonstationary systems. Under normal
conditions, the fault detection module allows all information to be processed and managed in direct liaison
with its general behavior. Otherwise, it detects any anomaly and alerts the operator by setting off the appropriate alarms.
The intrinsically nonlinear behavior of industrial processes, especially when a chemical reaction is
involved, poses a major problem for the formulation of good
predictions and the design of reliable control systems.[5]
Owing to the large number of degrees of freedom, to
the nonlinear coupling of different phenomena and to
the processes complexity, the mathematical modeling of
the process is computationally heavy and may produce
an unsatisfactory correspondence between experimental
and simulated data. Similar problems also arise from the
uncertainty in the parameters of the process, such as the
reaction rate, activation energy, reaction enthalpy, heat
transfer coefficient, and their unpredictable variations.
In fact, note that most of the chemical and thermophysical variables both strongly depend on and instantaneously influence the temperature of the reaction mass.[4]
One way of addressing this problem is the use of a
reliable model for the on-line prediction of the system dynamic evolution. Hence, designing empirical models such as black-box models is unavoidable. Various techniques of process identification have already been proposed.[6] Owing to their inherent ability to
model and learn ‘complexities’, artificial neural networks (ANNs) have found wide applications in various
areas of chemical engineering and related fields.[7,8]
Engell et al.[9] discussed general aspects of the control
of reactive separation processes. They used a semi-batch
reactive distillation process. A comparison was carried
out between conventional control structures and model-based predictive control by using a neural net plant
model. Savkovic-Stevanovic[10] used a neural network
(NN) for product composition control of a distillation
plant. The NN controller design is based on the process inverse dynamic modeling. The back-propagation
(BP) algorithm is applied to dynamic nonlinear relationship between product composition and reflux flow
rate. The obtained results illustrate the feasibility of
using neural net for learning nonlinear dynamic model
distillation column from plant input–output data and
control. Assaf et al .[11] modeled an ethylene oxidation
fixed-bed reactor by a phenomenological model. They
compared the results given by this model and those
given by the neural model for possible thermal runaway situations of a highly exothermic process. The final
objective is to build a reliable inference alarm algorithm for fast detection and prevention of this situation.
Nanayakkara et al .[12] presented a novel NN to control
an ammonia refrigerant evaporator. The objective is to
control evaporator heat flow rate and secondary fluid
outlet temperature while keeping the degree of refrigerant superheat at the evaporator outlet by manipulating
refrigerant and evaporator secondary fluid flow rates.
In a previous paper,[13] a reduced and reliable model based on a neural model, which reproduces the dynamics of a nonlinear process such as a distillation column under steady-state or unsteady-state regimes, was developed.
The main contribution of the current work is to obtain a powerful model capable of reproducing the dynamic behavior of a process such as a reactor-exchanger. The
present study focuses on the development and implementation of a nonlinear autoregressive with exogenous
input (NARX) neural model for the one-step ahead forecasting of the reactor-exchanger dynamics. The performance of this stochastic model was then evaluated using
the performance criteria. A comparison with an autoregressive with exogenous input (ARX) model based on
the least squares criterion is carried out. Experiments
were performed in a reactor-exchanger and experimental data were used both to define and to validate these
models. This study is carried out in order to evaluate
the time delay of the process. It shows that several ARX
models could be selected. Statistical information criteria and the quality of the adjustment criteria are used
in order to make a judicious choice of the ARX identified model. Finally, results show that the NARX neural
model is more representative than the ARX model for
modeling the dynamic behavior of the studied process.
The identification procedure, the experimental setup
and prediction results are described in the following sections.
Modeling is an essential precursor in the parameter estimation process. Identification strategies of various kinds
by means of input–output measurements are commonly
used in many situations in which it is not necessary
to achieve a deep mathematical knowledge of the system under study, but it is sufficient to predict the system evolution.[14,15] This is often the case in control
applications, where satisfactory predictions of the system that are to be controlled and sufficient robustness
to parameter uncertainty are the only requirements. In
chemical systems, parameter variations and uncertainty
play a fundamental role on the system dynamics and
are very difficult to model accurately.[5] Therefore, the identification approach based on input–output
measurements can be applied.
NARX neural modeling
In order to provide a closer approximation to the actual
process in some situations, a nonlinear NARX model
is employed,[16,17] which is identified by means of
ANNs. The NARX model was obtained by using multilayer perceptron ANNs (MLP-ANNs)[18,19] to accurately describe the process behavior. This approach
allows bypassing the exact determination of model
parameters and their unpredictable variations, as well as
the need for deep physical knowledge of the process and its governing equations. The nonlinear model
of a finite dimensional system[20] with orders (ny, nu) and scalar variables y and u is defined by

y(t) = φ(y(t − 1), . . . , y(t − ny), u(t − 1), . . . , u(t − nu))    (1)
where y(t) is the autoregressive (AR) variable or system output and u(t) is the exogenous (X) variable or system input. ny and nu are the AR and X orders, respectively. φ is a nonlinear function.
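As a concrete illustration (not from the paper), the regression vector of Eqn (1) can be assembled from a recorded input–output series; the function name and the NumPy usage are assumptions of this sketch.

```python
import numpy as np

def narx_regressors(y, u, ny, nu):
    """Build the NARX regression matrix: each row is
    [y(t-1),...,y(t-ny), u(t-1),...,u(t-nu)] with target y(t)."""
    start = max(ny, nu)
    X, T = [], []
    for t in range(start, len(y)):
        row = [y[t - i] for i in range(1, ny + 1)]   # past outputs
        row += [u[t - i] for i in range(1, nu + 1)]  # past inputs
        X.append(row)
        T.append(y[t])
    return np.array(X), np.array(T)
```

Each row of `X` is one instance of the regression vector, and `T` holds the corresponding one-step-ahead targets.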
This NN (Eqn (1)) consists of highly interconnected
layers of neuron-like nodes. It has an input and an
output layer and any optional layers that are included
between these are termed hidden layers. Figure 1 shows
a typical feed-forward network architecture with one
hidden layer. The term ‘feed-forward’ means that the
connections between nodes only allow signals to be
sent to the next layer of nodes and not to the previous layer.
The number of nodes in a hidden layer is determined
by the user and can vary from zero to any finite number.
The numbers of nodes in the input and output layers
are determined by the numbers of input and output
variables, respectively. This structure is based
on a result by Cybenko[22] who proved that a NN
with one hidden layer of sigmoid or hyperbolic tangent
units and an output layer of linear units is capable of
approximating any continuous function.
f(z) = 1/(1 + e−z)    (2)

where z is the sum of the weighted inputs and the bias term. The determination of these weights for the node connections allows the ANN to learn the information about the system to be modeled. The input data are presented to the network via the input layer. These data are propagated through the network to the output layer to obtain the network output. The network error is then determined by comparing the network output with the actual output. If the error is not smaller than a desired performance, the weights are adjusted and the training data are presented to the network again to determine a new network error. One of the most well-known training algorithms is the BP algorithm.[23] In this algorithm, as with any other gradient approach, large values of the learning rate will speed up the learning process but lead to instability, and convergence can only be expected for small values of the learning rate. The momentum factor is used to damp down oscillations in the learning process. The training is repeated until the network error reaches the desired performance. In this case, the network is said to have converged and the last set of weights is retained as the network parameters.

Calculation of the NN output

The following steps explain the calculation of the NN output based on the input vector.[13,14]
1. Assign ŵT(k) to the input vector xT(k) and apply it to the input units, where ŵT(k) is the regression vector given by the following equation:

ŵT(k) = [y(k − 1), . . . , y(k − ny), u(k − 1), . . . , u(k − nu)]    (3)

2. Calculate the input to the hidden layer units as

netjh(k) = Σi=1..p [Wjih(k) xi(k)] + bjh    (4)

where p is the number of input nodes of the network, i.e. p = ny + nu; j denotes the jth hidden unit; Wjih is the connection weight between the ith input unit and the jth hidden unit; bjh is the bias term of the jth hidden unit.
3. Calculate the output from a node in the hidden layer as follows:

zj(k) = fjh(netjh(k))    (5)

where fjh is the sigmoid function defined by Eqn (2).
4. Calculate the input to the output nodes as follows:

netl(k) = Σj [Wlj(k) zj(k)]    (6)

where the sum runs over the hidden units; l denotes the lth output unit; Wlj(k) is the connection weight between the jth hidden unit and the lth output unit.
5. Calculate the outputs from the output nodes as

v̂l(k) = fl(netl(k))    (7)

where fl is the linear activation function defined by

fl(netl(k)) = netl(k)    (8)
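The steps above amount to a single forward pass through the MLP; the following is an illustrative sketch, with NumPy and the array shapes as assumptions.

```python
import numpy as np

def sigmoid(z):
    # Eqn (2): f(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, Wh, bh, Wo):
    """One forward pass: sigmoid hidden layer (Eqns (4)-(5)),
    linear output layer (Eqns (6)-(8))."""
    net_h = Wh @ x + bh   # Eqn (4): weighted inputs plus bias
    z = sigmoid(net_h)    # Eqn (5): hidden activations
    net_o = Wo @ z        # Eqn (6): weighted hidden outputs
    return net_o          # Eqns (7)-(8): linear output, f_l(net) = net
```

With zero hidden weights every hidden unit outputs 0.5, which gives a quick sanity check on the shapes and activations.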
Back-propagation training algorithm
The error function E is defined as

E(k) = (1/2) Σl=1..q (vl(k) − v̂l(k))²    (9)

where q is the number of output units and vl(k) is the lth element of the output vector of the network.

Figure 1. Feed-forward network for prediction.
Within each time interval from k to k + 1, the back-propagation (BP) algorithm tries to minimize the error
for the output value as defined by E by adjusting
the weights of the network connections, i.e. Wji h and
Wlj. The BP algorithm uses the following procedure (Eqns (10), (11), (12) and (13)):

Wjih(k + 1) = Wjih(k) + αΔWjih(k) − η ∂E/∂Wjih(k)    (10)

Wlj(k + 1) = Wlj(k) + αΔWlj(k) − η ∂E/∂Wlj(k)    (11)

where η and α are the learning rate and the momentum factor, respectively; ΔWjih(k) and ΔWlj(k) are the amounts of the previous weight changes; ∂E/∂Wjih(k) and ∂E/∂Wlj(k) are given by

∂E/∂Wjih(k) = −[zj(k)(1 − zj(k))xi(k)] Σl=1..q [(vl(k) − v̂l(k))Wlj(k)]    (12)

∂E/∂Wlj(k) = −(vl(k) − v̂l(k))zj(k)    (13)
The implementation of the NN for forecasting is as follows:
1. Initialize the weights using small random values
and set the learning rate and momentum factor for
the NN.
2. Apply the input vector given by Eqn (3) to the input units.
3. Calculate the forecast value of the error using the
data available at (k − 1)th sample (Eqns (3), (4), (5),
(6), (7) and (8)).
4. Calculate the error between the forecast value and
the measured value.
5. Propagate the error backwards to update the weights
(Eqns (10), (11), (12) and (13)).
6. Go back to step 2.
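A minimal sketch of one training iteration (Eqns (10)–(13)) follows, assuming NumPy, a single hidden layer and a linear output; the function name is illustrative, and bias updates and the stopping test are omitted for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bp_step(x, v, Wh, bh, Wo, dWh_prev, dWo_prev, eta=0.1, alpha=0.8):
    """One BP weight update with learning rate eta and momentum alpha."""
    z = sigmoid(Wh @ x + bh)        # hidden activations, Eqns (4)-(5)
    v_hat = Wo @ z                  # linear output, Eqns (6)-(8)
    err = v - v_hat
    # Eqn (13): dE/dWlj = -(vl - v_hat_l) zj
    grad_Wo = -np.outer(err, z)
    # Eqn (12): dE/dWji = -[zj(1 - zj) xi] * sum_l (vl - v_hat_l) Wlj
    delta_h = z * (1.0 - z) * (Wo.T @ err)
    grad_Wh = -np.outer(delta_h, x)
    # Eqns (10)-(11): momentum term plus gradient step
    dWh = alpha * dWh_prev - eta * grad_Wh
    dWo = alpha * dWo_prev - eta * grad_Wo
    return Wh + dWh, Wo + dWo, dWh, dWo
```

Repeating this step over the training set until the error of Eqn (9) reaches the desired performance corresponds to the loop of steps 2–6 above.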
For weights initialization, the Nguyen–Widrow initialization method[24] is best suited for use with the sigmoid/linear network, which is often used for function approximation.
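A sketch of the Nguyen–Widrow idea for a sigmoid hidden layer follows; the exact scaling conventions vary between implementations, so treat this as illustrative rather than the paper's implementation.

```python
import numpy as np

def nguyen_widrow_init(n_in, n_hidden, rng=None):
    """Nguyen-Widrow style initialization: uniform random weights are
    rescaled so each hidden unit's weight vector has norm
    beta = 0.7 * n_hidden**(1/n_in), spreading the units' active
    regions over the input space."""
    rng = rng or np.random.default_rng(0)
    beta = 0.7 * n_hidden ** (1.0 / n_in)
    W = rng.uniform(-0.5, 0.5, size=(n_hidden, n_in))
    W *= beta / np.linalg.norm(W, axis=1, keepdims=True)  # rescale rows
    b = rng.uniform(-beta, beta, size=n_hidden)           # spread biases
    return W, b
```

Compared with purely random small weights, this spreads the sigmoid transition regions across the (standardized) input range, which typically speeds up BP convergence.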
ARX modeling
The second adopted method for modeling of the reactor-exchanger is based on a parametric identification of an
ARX model. The choice of this strategy is justified by
the fact that it is simple to implement. The evolution
of the estimated output makes it possible to follow the evolution of the process dynamics and to reflect the presence of a fault
by the variation of the estimated parameters.[25] The
aim of this contribution is to analyze the modeling
improvement in comparison to the NARX modeling.
ARX modeling was the subject of studies in several
fields such as chemical engineering,[26,27] agriculture
and biological science,[28,29] medicine,[30] energy and power,[31] and energy economics.[32]
The ARX approach requires the determination of the model
orders and the time delay. Training and test phases validate
the identified model. The ARX structure describes the
input effects u(t) on the process output y(t). The ARX
model is represented by the following expression:
y(t) = −a1 y(t − 1) − · · · − ana y(t − na) + b1 u(t − 1 − nk) + · · · + bnb u(t − nb − nk) + e(t)    (14)
where e(t) refers to the noise supposed to be Gaussian.
ana and bnb are the model parameters. na and nb indicate
the order of the polynomials of the output A(q) and
the input B (q), respectively. The parameter nk is the
time delay between y(t) and u(t). The polynomial
representation of Eqn (14) is given as follows:
A(q)y(t) = B(q)u(t − nk) + e(t)    (15)
where A(q) and B (q) are given as follows:
A(q) = 1 + a1 q−1 + · · · + ana q−na    (16)

B(q) = b1 q−1 + · · · + bnb q−nb    (17)
q−1 is the backward shift operator, such that u(t − 1) = q−1 u(t). A(q) and B(q) are estimated by the least squares method.
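The least-squares estimation of the ARX parameters of Eqn (14) can be sketched as follows; `fit_arx` and its signature are illustrative, not from the paper.

```python
import numpy as np

def fit_arx(y, u, na, nb, nk):
    """Least-squares fit of the ARX model of Eqn (14):
    y(t) = -a1 y(t-1) - ... - a_na y(t-na)
           + b1 u(t-1-nk) + ... + b_nb u(t-nb-nk) + e(t)."""
    start = max(na, nb + nk)
    Phi = np.array([[-y[t - i] for i in range(1, na + 1)]
                    + [u[t - nk - j] for j in range(1, nb + 1)]
                    for t in range(start, len(y))])
    theta, *_ = np.linalg.lstsq(Phi, y[start:], rcond=None)
    return theta  # [a1, ..., a_na, b1, ..., b_nb]
```

On noise-free data generated by a known ARX model, this recovers the true parameters to machine precision.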
Experimental device
The reactor-exchanger (Fig. 2) is a glass-jacketed reactor with a tangential input for the heat transfer fluid. The
main aim of this input mode is to give the heat transfer fluid a higher velocity in the direction of the
flow. It is equipped with an electrical calibration heating and an input system. It is also equipped with Pt
100 temperature probes. The heating–cooling system,
which uses a single heat transfer fluid, works within
the temperature range of −15 to +200 ◦C. Supervision software allows the fitting of the parameters and
their setpoint values. It displays and stores data during
the experiment as well as for its further exploitation.
The input of the reactor-exchanger u(t) represents
the heat transfer fluid temperature allowing the heating–cooling of the water; y(t) represents the outlet
temperature of the reactor-exchanger. So, the process
Figure 2. Experimental device: a reactor-exchanger.
is excited by an input signal rich in frequencies and
amplitudes in order to obtain a dataset suitable for the estimation. The set of experimental
data is composed of two measurement vectors: the
input u(t) and the output y(t) = f(u(t)) of the reactor-exchanger. This nonlinear relationship between u(t) and
y(t) has to be determined by the neural regression
depicted by Eqn (1). For this reason, the input frequency
is varied randomly within a range between a minimal value and a maximal value
[fmin = 1/(30 min), fmax = 1/(12 s)]. On the other
hand, the heat transfer fluid temperature is varied in the interval (15–90 ◦C). This interval
includes all the cooling temperatures used in practice,
i.e. from the minimal temperature to the maximal temperature of cooling. The duration of the experiment is
22 h. The sampling time is fixed at 2 s. Before starting the estimation of parameters, the available data are
divided into two separate sets. The first subset is the
training subset, which is used for computing the gradient and updating the network weights. The second
subset is the validation set. The first one is sufficiently
informative and covers the whole spectrum. The second one contains sufficient elements to make the validation as credible as possible. All data were standardized
(zero mean and unit standard deviation).
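The standardization and the split into training and validation subsets can be sketched as follows; the 70/30 proportion is an assumption of this sketch, since the paper does not state the split sizes.

```python
import numpy as np

def standardize_split(u, y, train_frac=0.7):
    """Standardize the record to zero mean and unit standard deviation,
    then split it into training and validation subsets."""
    u = (u - u.mean()) / u.std()   # standardize input record
    y = (y - y.mean()) / y.std()   # standardize output record
    n = int(train_frac * len(u))
    return (u[:n], y[:n]), (u[n:], y[n:])
```

Standardizing before the split keeps both subsets on the same scale, which is what the sigmoid hidden units and the Nguyen–Widrow initialization expect.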
Establishment of ARX models
A set of models is built by fixing na = [1, . . . , 5],
nb = [1, . . . , 5] and nk = [1, . . . , 10]. Models that have
na lower than nb are rejected in order to respect the
physical aspect of the process. Consequently, a set
of 150 models is achieved and estimated, taking into
account the stability of each model via the Lyapunov criterion.
Estimate of the time delay
There are several methods for estimating the time
delay.[18,33,34] In this paper, the adopted approach is
based on the evaluation of the quadratic criterion.[34]
This criterion is as follows:

V(θ) = (1/N) Σt=1..N ε(t, θ)²    (18)
ε(t, θ ) = y(t) − ŷ(t) and ŷ(t) represent, respectively,
the prediction error and the associated predictor. The
quadratic criterion value is calculated according to
the time delay value nk = [1, . . . , 10]. This method
is applied to two simple ARX models (na = nb = 1)
and (na = nb = 2). The choice of these simple models
allows observing the criterion evolution according to
the time delay but without compensating it (time delay)
by a high complexity model. The criterion evolution for
those simple models is shown in Fig. 3.
By examining the curve (na = nb = 2), it is easy
to observe the presence of the minimal value of the
statistical criterion for nk = [5, 6, 7, 8]. However, this
presence is supported clearly for nk = [6, 7] in the curve
(na = nb = 1). Therefore, it is better to first retain both candidate time delay values, nk = 6 and nk = 7.
Figure 3. Criterion evolution according to the time delay nk.
Then, each model with a time delay other than nk = [6, 7] will be rejected.
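The delay-estimation procedure above can be sketched as a sweep of the quadratic criterion of Eqn (18) over candidate delays; the helper name and the synthetic usage are assumptions.

```python
import numpy as np

def delay_criterion(y, u, na, nb, delays):
    """V(theta) = (1/N) sum of squared prediction errors (Eqn (18))
    for an ARX(na, nb) model refit at each candidate delay nk."""
    scores = {}
    for nk in delays:
        start = max(na, nb + nk)
        Phi = np.array([[-y[t - i] for i in range(1, na + 1)]
                        + [u[t - nk - j] for j in range(1, nb + 1)]
                        for t in range(start, len(y))])
        theta, *_ = np.linalg.lstsq(Phi, y[start:], rcond=None)
        eps = y[start:] - Phi @ theta        # prediction errors
        scores[nk] = float(np.mean(eps ** 2))
    return scores
```

The delay minimizing the criterion is then retained, mirroring the selection of nk = [6, 7] from the curves of Fig. 3.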
Goodness of fit
The goodness-of-fit criterion (GFC) allows a judicious
selection of models. This criterion proposed by Hagenblad et al .[35] is based on the analysis of the prediction
error and the output variance. It is given by the following expression:

Q = 100 × [1 − Σk=1..N (ŷ(k) − y(k))² / Σk=1..N (y(k) − (1/N) Σi=1..N y(i))²]    (19)
Figure 4 shows the GFC evolution according to
different models Mna .nb . Models M 3.2, M 4.2 and M 5.5
have a good quality of adjustment compared to other
models (high peaks). Model M 5.5 is not chosen because it is too large. The peak of the model
M 4.2 is higher than that of the model M 3.2.
Consequently, the model M 4.2 is more representative
for the dynamic behavior than the model M 3.2 and
thus for the two time delay values (nk = 6 and nk = 7).
In conclusion, the model (M 4.2.7), which has nk = 7,
is the most suitable one for reproducing the process
dynamics of the reactor-exchanger.
Figure 4. Criterion evolution according to the different models Mna.nb.
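The GFC of Eqn (19) is straightforward to compute; this sketch assumes NumPy arrays of equal length, and the function name is illustrative.

```python
import numpy as np

def goodness_of_fit(y, y_hat):
    """Goodness-of-fit criterion Q (Eqn (19)): 100 for a perfect fit,
    dropping toward 0 as the model does no better than the output mean."""
    num = np.sum((y_hat - y) ** 2)          # squared prediction error
    den = np.sum((y - np.mean(y)) ** 2)     # output variance term
    return 100.0 * (1.0 - num / den)
```

Higher Q marks a better adjustment, which is how the peaks of models M 3.2, M 4.2 and M 5.5 in Fig. 4 are compared.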
Establishment of NARX neural models

In order to establish a suitable model order for a
particular system, NNs of increasing model order can
be trained and their performance on the training data
compared using the loss function (LF) or mean squared
error. This function is expressed by the following equation:

LF = (1/N) Σt=1..N ε²(t)    (20)

where ε(t) = y(t) − ŷ(t) represents the prediction error
and N is the data length. The choice of the hidden
nodes is carried out between 1 and 15 nodes. In fact,
the minimal number of inputs is avoided to ensure the
model flexibility. The maximum number of inputs is
also excluded to avoid the overfitting of the model. The
training on the database gives the evolution of the loss function.

In order to clearly show the minimum of the LF
for each model according to the number of hidden
nodes, we separate the LF evolution into two different
figures (Figs 5 and 6). Mny.nu.nh represents a neural
model whose input layer is composed of ny
outputs and nu inputs, with nh hidden nodes. These figures
show the LF on the same training data for different
NN models according to the hidden nodes.

Figure 5. Evolution of the loss function for low complexity models.

Figure 6. Evolution of the loss function for high complexity models.

Model
M 3.2.10 exhibits the lowest LF; however, this model
may not be the best choice, because there is a trade-off
between the model complexity (i.e. size) and accuracy.
A small decrease in the LF may be rejected if it
is at the expense of enlarging the model size. Thus,
the decision procedure for selecting a parsimonious
model using the LF is to decide, for each increase
in model order, whether any reductions in the LF are
worth the expense of a larger model. The difficult
trade-off between model accuracy and complexity can
be clarified by using model parsimony indices from
linear estimation theory, such as Akaike’s information
criterion (AIC), Rissanen’s minimum description length
(MDL) and Bayesian information criterion (BIC). The
validation phase thus makes it possible to distinguish
the model that correctly describes the dynamic behavior
of the process. These statistical criteria are defined as follows:
AIC = ln(LF) + 2nw/N    (21)

MDL = ln(LF) + 2nw ln(N)/N    (22)

BIC = ln(LF) + nw ln(N)/N    (23)
where nw is the number of model parameters (weights
in a NN).
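For illustration, the three indices can be computed directly from the LF; note that the penalty terms are partly garbled in this copy of the paper, so the forms below are assumptions based on common definitions.

```python
import numpy as np

def parsimony_indices(lf, nw, N):
    """AIC, MDL and BIC as penalized functions of the loss function LF
    (Eqns (21)-(23), with the penalty forms assumed here)."""
    aic = np.log(lf) + 2.0 * nw / N              # Akaike's criterion
    mdl = np.log(lf) + 2.0 * nw * np.log(N) / N  # minimum description length
    bic = np.log(lf) + nw * np.log(N) / N        # Bayesian criterion
    return aic, mdl, bic
```

All three grow with the number of weights nw, so a small LF reduction obtained by a much larger network is penalized, which is the parsimony argument used below.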
Hence, the AIC, MDL and BIC are weighted functions of the LF, which penalize for reductions in the
prediction errors at the expense of increasing model
complexity (i.e. model order and number of parameters). Strict application of these statistical criteria means
that the model structure with the minimum AIC, MDL
or BIC is selected as a parsimonious structure. However, in practice, engineering judgment may need to be
exercised. Figure 7 clearly shows the evolution of AIC,
MDL and BIC criteria according to the LF minimum
for each model.

Figure 7. Evolution of the criteria for the LF minimum.
A strict application of the indices would select the
models M 2.2.3 and M 3.2.10, because they exhibit the
lowest of the three indices among all the model structures
compared. Although, in this case, the AIC, MDL and
BIC criteria do not provide a clear indication of a
particular model, the interpretation of their results
provides further support for the choice of the
M 3.2.10 model indicated by the LF. On the basis of
the engineering judgment, the model M 2.2.3 would be
preferred without significant loss of accuracy.
Residual analysis
Once the training and the test of the ARX and NARX
models have been completed, they should be ready to
simulate the system dynamics. Model validation tests
should be performed to validate the identified model.
Billings et al.[36] proposed some correlation-based
model validity tests. In order to validate the identified
model, it is necessary to evaluate the properties of the
errors that affect the prediction of the outputs of the
model, which can be defined as the differences between
experimental and simulated time series. In general, the
characteristics of the error are considered satisfactory
when the error behaves as white noise, i.e. it has a
zero mean and is not correlated.[5,36] In fact, if both
these conditions are satisfied, it means that the identified
model has captured the deterministic part of the system
dynamics, which is therefore accurately modeled. To
this aim, it is necessary to verify that the autocorrelation
function of the normalized error ε(t), namely φ ε ε (τ ),
assumes the value 1 for τ = 0 and 0 elsewhere; in other
words, it is required that the function behaves as an
impulse. This autocorrelation is defined as follows[36,37] :
φεε(τ) = E[ε(t − τ)ε(t)] = δ(τ) ∀τ    (24)
where ε is the model residual, E (X ) is the expected
value of X , and τ is the lag.
This condition is, of course, ideal and, in practice,
it is sufficient to verify that φεε(τ) remains in
a confidence band usually fixed at 95%, which
means that φεε(τ) must remain inside the range
±1.96/√N, with N the number of the test data on which
φεε(τ) is calculated. Billings et al.[36] also proposed
tests for looking into the cross correlation among model
residuals and inputs. This cross correlation is defined by
the following equation:
φuε(τ) = E[u(t − τ)ε(t)] = 0 ∀τ    (25)
To implement these tests (Eqns (25) and (26)), u
and ε are normalized to give zero mean sequences
of unit variance. The sampled cross-correlation function
between two such data sequences u(t) and ε(t) is then
calculated as

φuε(τ) = Σt u(t)ε(t + τ) / [Σt u²(t) Σt ε²(t)]^(1/2)    (26)

If Eqns (25) and (26) are satisfied, then the
model residuals are a random sequence, are not
predictable from the inputs and, hence, the model will be
considered adequate. These correlation-based tests
are used here to validate the NN model. The results are
presented in Fig. 8. In these plots, the dash-dot lines are
the 95% confidence bands.

The evolution of the cross correlation of the ARX and
NARX models is inside the 95% confidence bands.
In addition, the NARX cross correlation is the lowest.
This explains the independence of the residual signal
from the input one. The autocorrelation of the ARX
model exceeds the threshold (95%) for a few points. This
explains why this model cannot completely restore the
nonlinear part of the signal. For the NARX neural
model, all points are inside the 95% confidence bands.
Therefore, this model is considered a reliable one
for describing the dynamic behavior of the process.
This validation phase uses the neural weights
found in the training phase. There is a good agreement
between the learned neural model and the experiment in
the validation phase. This result is important because it
shows the ability of the NN with only one hidden layer
to interpolate any nonlinear function.[22] Moreover, the
analysis of the computational results has confirmed that
the NARX neural model is more powerful than the
ARX model. Figure 9 shows the difference between the
experimental output and that simulated by the neural
model M 2.2.3.

Analyzing this figure, it emerges that the NARX
model M 2.2.3 ensures satisfactory performance, as it
is indeed able to correctly identify the dynamics of the
reactor-exchanger. The main advantage of the proposed
neural approach consists in the natural ability of NNs
to model nonlinear dynamics in a fast and simple
way, and in the possibility of treating the process to be
modeled as an input–output black box, with little or no
mathematical information on the system.
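The correlation tests above can be sketched as follows; the normalization and the ±1.96/√N band follow the text, while the function name and the lag convention are assumptions.

```python
import numpy as np

def correlation_tests(eps, u, max_lag=20):
    """Residual autocorrelation (Eqn (24)) and input-residual cross
    correlation (Eqn (26)) on standardized sequences, with the bound
    of the 95% confidence band."""
    eps = (eps - eps.mean()) / eps.std()   # zero mean, unit variance
    u = (u - u.mean()) / u.std()
    N = len(eps)
    band = 1.96 / np.sqrt(N)               # 95% confidence bound
    auto = np.array([np.dot(eps[:N - tau], eps[tau:]) / N
                     for tau in range(max_lag + 1)])
    cross = np.array([np.dot(u[:N - tau], eps[tau:]) / N
                      for tau in range(max_lag + 1)])
    return auto, cross, band
```

For an adequate model, `auto` should look like an impulse (1 at lag 0) and both sequences should stay essentially inside `±band`, as in Fig. 8 for the NARX model.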
Figure 8. Results of model validation tests (NARX and ARX auto-correlations and cross-correlations).

CONCLUSION

This work aims to model the process dynamics by means of
input–output experimental measurement models (ARX
and NARX). The modeling of the dynamics of this reactor-exchanger provides a useful solution for the formulation
of a reliable model. In this case, the experimental results
showed that the NARX model is able to give more
satisfactory descriptions of the experimental data than
the ARX model. Finally, the obtained neural model will
be useful as a reference model for the FDI of faults that can
occur in the process dynamics.

Figure 9. Prediction error of the output temperature.
ACKNOWLEDGEMENTS

The author is grateful to two anonymous referees for
their comments on a previous draft of this paper.
REFERENCES
[1] Y. Chetouani. Process Saf. Environ. Prot., 2006; 84, 27–32.
[2] Y. Chetouani. Int. J. Reliab. Saf., 2007; 1, 411–427.
[3] J. Villermaux. New horizons in chemical engineering, In
Proceedings of the Plenary Reading in Fifth World Congress
of Chemical Engineering, San Diego, 1996.
[4] Y. Chetouani. Chem. Eng. Process., 2004; 43, 1579–1585.
[5] L. Cammarata, A. Fichera, A. Pagano. Appl. Energy, 2002;
72, 513–528.
[6] I.J. Leontaritis, S.A. Billings. Int. J. Control, 1985; 41,
[7] D.M. Himmelblau. Korean J. Chem. Eng., 2000; 17, 373–392.
[8] R. Sharma, K. Singh, D. Singhal, R. Ghosh. Chem. Eng.
Process., 2004; 43, 841–847.
[9] S. Engell, G. Fernholz. Chem. Eng. Process., 2003; 42,
[10] J. Savkovic-Stevanovic. Comput. Chem. Eng., 1996; 20,
[11] E.M. Assaf, R.C. Giordano, C.A. Nascimento. Chem. Eng.
Sci., 1996; 3, 107–112.
[12] V.K. Nanayakkara, Y. Ikegami, H. Uehara. Int. J. Refrigeration, 2002; 25, 813–826.
[13] Y. Chetouani. Int. J. Comput. Sci. Appl., 2007; 4, 119–133.
[14] E.H.K. Fung, Y.K. Wong, H.F. Ho, M.P. Mignolet. Appl.
Math. Model., 2003; 27, 611–627.
[15] J. Mu, D. Rees, G.P. Liu. Control Eng. Pract., 2005; 13,
[16] F. Previdi. Control Eng. Pract., 2002; 10, 91–99.
[17] S.J. Qin, T.J. McAvoy. Comput. Chem. Eng., 1996; 20,
[18] S. Chen, S.A. Billings. Int. J. Control, 1989; 49, 1013–1032.
[19] K.S. Narendra, K. Parthasarathy. IEEE Trans. Neural Netw.,
1990; 1, 4–21.
[20] L. Ljung. System Identification, Theory for the User, Prentice
Hall, Englewood Cliffs, NJ, 1999.
[21] M.R. Warnes, J. Glassey, G.A. Montague, B. Kara. Process
Biochem., 1996; 31, 147–155.
[22] G. Cybenko. Math. Control Signals Syst., 1989; 4, 303–312.
[23] D.E. Rumelhart, G.E. Hinton, R.J. Williams. Nature, 1986;
323, 533–536.
[24] D. Nguyen, B. Widrow. Int. Joint Conf. Neural Netw., 1990;
3, 21–26.
[25] R. Iserman. Automatica, 1993; 29, 815–835.
[26] D.E. Rivera, S.V. Gaikwad. J. Process Control, 1995; 5,
[27] S. Rohani, M. Haeri, H.C. Wood. Comput. Chem. Eng., 1999;
23, 279–286.
[28] M.L. Fravolini, A. Ficola, M. La Cava. J. Food Eng., 2003;
60, 289–299.
[29] H.U. Frausto, J.G. Pieters, J.M. Deltour. Biosyst. Eng., 2003;
84, 147–157.
[30] Y. Liu, A.A. Birch, R. Allen. Med. Eng. Phys., 2003; 25,
[31] H. Yoshida, S. Kumar. Renewable Energy, 2001; 22, 53–59.
[32] J.V. Ringwood, P.C. Austin, W. Monteith. Energy Econ.,
1993; 15, 285–296.
[33] L. Ljung. System Identification, Theory for the User, Prentice-Hall, Englewood Cliffs, NJ, 1987.
[34] L. Ljung. System Identification Toolbox User’s Guides, The
Math Works, Natick, MA, 2000.
[35] A. Hagenblad, L. Ljung. Maximum likelihood identification
of Wiener models with a linear regression initialization, In
Proceedings 37th IEEE Conference on Decision and Control ,
Tampa, Florida, 1998.
[36] S.A. Billings, W.S.F. Voon. Int. J. Control, 1986; 44,
[37] J. Zhang, J. Morris. Fuzzy Sets Syst., 1996; 79, 127–140.