INTERNATIONAL JOURNAL OF CIRCUIT THEORY AND APPLICATIONS, VOL. 24, 111-120 (1996)
EVALUATION OF CNN TEMPLATE ROBUSTNESS TOWARDS VLSI IMPLEMENTATION†
PETER KINGET AND MICHIEL STEYAERT
ESAT-MICAS, Departement Electrotechniek, Katholieke Universiteit Leuven, Kardinaal Mercierlaan 94, B-3001 Leuven, Belgium
SUMMARY
In this paper a method for the evaluation of static robustness towards random variations in cellular neural network
(CNN) templates is proposed. From this evaluation, circuit accuracy specifications for a VLSI implementation are
derived which allow the designer to optimize the performance. Moreover, from this evaluation method, guidelines for
robust template designs are derived and parametric testing templates are developed.
1. INTRODUCTION
For the implementation of cellular neural networks (CNNs)¹ the cell area has to be minimized in order to
obtain the maximal cell density. However, using circuit components of small area results in large
component variations, so that accuracy specifications have to be available to the hardware designer which
guarantee correct operation of the network. Up to now very limited data on the robustness of CNNs
towards random variations have been available in the open literature.
In this paper a method for the evaluation of the static robustness of CNN templates is proposed. In
Section 2 the technological implementation limitations are discussed in detail. The robustness evaluation
method is described in Section 3. First the robustness of a single cell towards variations is calculated and
from this the behaviour of the whole network is evaluated using statistical estimations. The validity of the
approximations and assumptions is checked by Monte Carlo simulations in Section 4. This evaluation
method can be used to generate the specifications for the building blocks of a CNN circuit implementation
as illustrated in Section 5. Furthermore, the evaluation method is also very helpful in the development and
performance optimization of templates. Finally it can be applied to the generation of test templates that can
be used to parametrically test CNN chips for their accuracy.
2. HARDWARE IMPLEMENTATION LIMITATIONS AND ERRORS
For the VLSI implementation of a signal processing or computational system, accuracy specifications are
of prime importance. They determine the optimal implementation style and the ultimate performance limits
for power and area consumption and processing speed. In digital VLSI circuits the signals are discrete in
time and amplitude. The transistors are only used as switches. Whereas the advantage of digital circuits is
their unlimited attainable accuracy, the main disadvantages are their high power drain, high area
consumption and reduced processing speed. In analogue circuits the full signal-processing capabilities of
transistors are used, which results in higher speed, low power consumption and small size but the attainable
accuracies are lower. Especially in application fields where a high processing speed is required and only
limited accuracy is necessary, analogue VLSI circuits still yield much better performance than digital
implementations.³
† Part of this research has been reported in the Proceedings of the 1994 IEEE International Workshop on Cellular Neural Networks and Their Applications, held in Rome.
CCC 0098-9886/96/010111-10 © 1996 by John Wiley & Sons, Ltd. Received 15 January 1995; revised 6 April 1995.
In a continuous time cellular neural network with a self-feedback greater than unity, a cell’s output can
only be stable in a high or low state, but the cell’s state transition is graded. Two types of errors in the
computation can occur: (i) static errors where the component variations are such that a cell has an incorrect
behaviour independently of the evolution of its neighbours; or (ii) dynamical errors, comparable with
races in digital circuits, which originate from internal dynamical errors in the cell or from an incorrect
interaction between cells due to a large mismatch in the time constants. In this paper a method of evaluating
the influence of static errors on the correct operation of a CNN is proposed.
The error causes can be systematic or random. The inherent non-linearity of transistors causes
deterministic distortion errors. Systematic errors can be eliminated by correct biasing, good signal
amplitude choices or compensation schemes. The random variation in component values and the noise
sources in active components cause random errors, which limit the minimal signal that can be processed
correctly by a circuit.
For the analogue implementation of computational circuits the matching behaviour of components is the
main effect limiting the attainable performance. The component values and properties have a normal
distribution and the variance of the component spreading is inversely proportional to the area of the
components used and proportional to their mutual distance.² For many circuits the distance effect is
negligible and the circuit accuracy is proportional to the square root of its area. The proportionality
coefficient is process-dependent and can be related to physical and technological effects. An improvement in
the matching is not directly guaranteed by a downscaling of the technology, although the matching tends to
improve. The speed of a circuit will be limited by the capacitive loading of the nodes. The (parasitic)
capacitive loading is proportional to the area of the component and the speed can only be increased by
increasing the signal or bias current and thus the power consumption. The maximal attainable speed for a
given power will be increased by downscaling the technology. An analogue VLSI circuit designer is thus
confronted with important trade-offs: area versus accuracy, power versus speed and speed or power versus
accuracy. The overall performance scaling with a downscaling of the technology is not straightforward and
will be dependent on the specific processing technologies used.
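The area versus accuracy trade-off described above can be made concrete with a small sketch. It assumes the simplest matching model consistent with the text (standard deviation inversely proportional to the square root of the component area, distance effect neglected). The function names and the coefficient `a_coeff` are hypothetical and only stand in for the process-dependent proportionality constant.

```python
import math


def mismatch_sigma(area: float, a_coeff: float = 1.0) -> float:
    """Standard deviation of a component value for a given component area.

    Assumed model: sigma = a_coeff / sqrt(area), i.e. the variance is
    inversely proportional to the area. a_coeff is a hypothetical
    process-dependent constant, not a value taken from the paper.
    """
    if area <= 0.0:
        raise ValueError("area must be positive")
    return a_coeff / math.sqrt(area)


def area_for_sigma(sigma: float, a_coeff: float = 1.0) -> float:
    """Invert the model: area needed to reach a target standard deviation."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return (a_coeff / sigma) ** 2


# Halving the allowed spread costs four times the area in this model.
ratio = area_for_sigma(0.05) / area_for_sigma(0.10)
print(f"area ratio for half the spread: {ratio:.1f}")
```

Under this assumed model every factor of two in accuracy costs a factor of four in area, which is why a lower bound on the required accuracy matters for cell density.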
The main application field of CNNs up to now has been in image processing. These applications require
a high operation speed but limited precision. For the design of a CNN the main objective is to minimize the
area of a cell and to reduce its power consumption in order to integrate as many cells as possible on a single
die. It is clear that these specifications can only be achieved with VLSI circuits in combination with low-accuracy specifications. Moreover, the area and speed can only be fully optimized if the lower bounds on
the accuracy are known to the designer.
3. TEMPLATE ROBUSTNESS
CNN templates can be divided into two classes. Non-propagating templates such as the edge detector (see
Table I) or noise filtering¹ have an A-template with all zeros except the self-feedback. Each cell evolves
independently of the evolution of the output of its neighbouring cells. For this type of template a correct
behaviour of each individual cell will guarantee a correct network behaviour. For information-propagating
templates such as the connected component detector⁵ or holefiller⁶ (see Table I) the cell's evolution
depends on the evolution of its neighbours and the dynamics are more complex. However, the activity of
the network is restricted to a small number of cells. This implies that if an individual cell behaves correctly
and no races or dynamical errors occur, the network should behave correctly. The validity of this
assumption will be shown via Monte Carlo simulations. However, the correct operation of every single cell
is a necessary condition for the correct operation of the network.
3.1. Correct operation of a single cell
The behaviour of one cell with its neighbours remaining stable can be studied by drawing its dynamic
routes for all possible combinations of its neighbouring states.¹ The cell state ($x_c$) evolution equation can
be rewritten as
$$\frac{dx_c}{dt} = -G x_c + A_c f(x_c) + k \qquad (1)$$
where $k$ contains the constant contributions of the I-template, the B-template and the A-template with the
self-feedback ($A_c$) excluded. From these diagrams the stable and unstable equilibrium points of the cell
can be determined. In Figure 1 the dynamic routes for a single cell in a connected component detector⁵ are
displayed.
Table I. CNN templates. Columns: template name and the A-, B- and I-templates. Rows: AND,⁴ CCD,⁵ CCD-LARGE, EDGE,¹ HOLE,⁶ HOLE-MOD and SHADOW. (The template entries are illegible in this copy.)
Figure 1. Dynamic routes for a cell c in the connected component detector for different states of its two neighbours. The stable
equilibria are indicated by full circles, the unstable one by an open circle
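The equilibrium points read off a dynamic route can also be computed directly. The sketch below assumes the usual piecewise-linear CNN output function f(x) = 0.5(|x + 1| - |x - 1|) and, as default values, the standard connected component detector with self-feedback 2 and G = 1; these defaults and the function names are assumptions for illustration, not values recovered from Table I.

```python
def saturation(x: float) -> float:
    """Piecewise-linear CNN output function (assumed standard form)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def state_derivative(x: float, k: float, a_self: float = 2.0, g: float = 1.0) -> float:
    """Right-hand side of the cell equation: -G*x + A_c*f(x) + k."""
    return -g * x + a_self * saturation(x) + k


def equilibria(k: float, a_self: float = 2.0, g: float = 1.0):
    """Return sorted (state, kind) pairs where the dynamic route crosses zero."""
    points = []
    x_high = (a_self + k) / g          # candidate in the positive saturation region
    if x_high >= 1.0:
        points.append((x_high, "stable"))
    x_low = (k - a_self) / g           # candidate in the negative saturation region
    if x_low <= -1.0:
        points.append((x_low, "stable"))
    slope = a_self - g                 # slope of the route in the linear region
    if slope != 0.0:
        x_mid = -k / slope
        if -1.0 < x_mid < 1.0:
            points.append((x_mid, "unstable" if slope > 0.0 else "stable"))
    return sorted(points)


# For A = [1, 2, -1] the neighbours contribute k = -2, 0 or 2.
for k in (-2.0, 0.0, 2.0):
    print(k, equilibria(k))
```

With both neighbours equal (k = 0) the cell is bistable, while k = 2 or k = -2 leaves a single stable equilibrium, which forces the transition that the connected component detector relies on.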
In Figure 2 the influence of a deviation in the different template values is shown. The deviation $\Delta k$ in the
$k$-value results in a uniform shift of the dynamic routes. $\Delta k$ can originate from a deviation in the A- (except
$A_c$), B- or I-template values or in the output levels of the neighbours. A deviation of the cell conductance
value $G$ affects the slope of the dynamic routes in all regions, whereas a change $\Delta A_c$ in the A-template
self-feedback coefficient results in a slope change in the linear region and a constant shift in the saturation
regions. In Figure 2 the effect of positive values for the $\Delta$'s is shown but since the $\Delta$'s are random
variables, they can have negative or positive values.
Figure 2. Influence of template coefficient variations on the operation of a CCD: (a) the influence of a deviation in the $k$ value, which can originate from an error in the A, B or I template values or the output levels of the neighbours; (b) the influence of a deviation in the cell conductance value; (c) the influence of a deviation in the self-feedback value in the A template
The effect of these deviations on the correct cell behaviour can be evaluated from the change in the
equilibrium points of the cells for the different neighbouring states. From Figures 1 and 2 it is clear that the
dynamic routes can shift without a change in the cell behaviour as long as the shift ($\Delta_{route}$) at the unit state 1
or -1 is smaller than the maximal allowed deviation ($\Delta_{template}$), which is indicated by the arrows in Figure 1.
For the connected component detector, $\Delta_{template} = 1$. The variation in the dynamic route is dependent on the
variation in the template values, which have a normal distribution since the circuit components are also
normally distributed. Moreover, since the different template coefficients are implemented by different
circuits and the variation in transistors is uncorrelated,² the deviations will be statistically independent. The
total shift of the dynamic route in the unit state is
$$\Delta_{route} = \Delta G + \sum_i \Delta A_i + \sum_i \Delta B_i + \Delta I \qquad (2)$$
The standard deviation $\sigma_{route}$ of the shift of the dynamic route in the unit state is calculated from the
standard deviation of the template values:⁷
$$\sigma_{route} = \sqrt{\sigma_G^2 + \sum_i \sigma_{A_i}^2 + \sum_i \sigma_{B_i}^2 + \sigma_I^2} \qquad (3)$$
The probability $P_{correct}$ that a cell is operating correctly with the given deviation in the template coefficients
is the probability that the deviation $\Delta_{route}$ of the dynamic route in the unit state is smaller than the maximal
allowed deviation $\Delta_{template}$ for the given template. The deviation $\Delta_{route}$ follows a normal distribution with
mean zero and standard deviation $\sigma_{route}$ and the probability $P_{correct}$ is calculated from⁷
$$P_{correct} = P(|\Delta_{route}| < \Delta_{template}) = 2\,\mathrm{erf}\!\left(\frac{\Delta_{template}}{\sigma_{route}}\right) \qquad (4)$$
$$\mathrm{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_0^x \exp\!\left(-\frac{t^2}{2}\right) dt \qquad (5)$$
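As a numerical illustration of this single-cell calculation, the sketch below adds the variances of independent template deviations and evaluates the two-sided probability that the route shift stays within the allowed deviation. Note that Python's `math.erf` uses the 2/sqrt(pi) convention, so the probability is written as erf(delta / (sqrt(2) sigma)); the function names and the example spread of 0.29 per coefficient are illustrative assumptions.

```python
import math


def sigma_route(sigma_g, sigmas_a, sigmas_b, sigma_i):
    """Spread of the route shift for independent, zero-mean deviations.

    Variances of independent contributions add, so the result is the
    root of the summed squares of the individual standard deviations.
    """
    total = sigma_g ** 2 + sigma_i ** 2
    total += sum(s * s for s in sigmas_a)
    total += sum(s * s for s in sigmas_b)
    return math.sqrt(total)


def p_correct(delta_template: float, sigma: float) -> float:
    """P(|shift| < delta_template) for a zero-mean normal shift."""
    return math.erf(delta_template / (math.sqrt(2.0) * sigma))


# CCD-like cell: G plus three non-zero A entries, B and I ideal (assumed example).
s_route = sigma_route(0.29, [0.29, 0.29, 0.29], [], 0.0)
print(f"sigma_route = {s_route:.3f}, P_correct = {p_correct(1.0, s_route):.4f}")
```

With four equal contributions of 0.29 the spread is 0.58 and the cell is correct with a probability of roughly 0.915.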
3.2. Yield of an N-cell CNN
For a network with $N$ cells and a probability $P_{correct}$ that a single cell is functioning correctly, the
probability that the network is correct or the yield of the network chips is $P_{correct}^N$, assuming that all cells have
to operate correctly for a correct network. In Figure 3 the yield of the network for a template with
$\Delta_{template} = 1$ is displayed as a function of the spread $\sigma_{route}$ of the dynamic route for different network sizes.
The larger the network, the smaller is the spread allowed for a given yield. From the designer viewpoint it
is important to remark that below a certain threshold in $\sigma_{route}$ the yield of the network becomes very high.
This threshold determines the accuracy specifications that guarantee an economic VLSI implementation of
the CNN.
Figure 3. Yield of a network as a function of $\sigma_{route}$: solid line, 32 × 32 network; dotted line, 128 × 128 network; dash-dotted line, 512 × 512 network
Inversely, for a given desired yield of the network chips, a given size $N$ of the network and a given
maximal allowed deviation $\Delta_{template}$ of the dynamic route, the allowed spread $\sigma_{route}$ of the cell's
dynamic route can be calculated by inverting equation (5). This spread can be transformed to the
allowed standard deviation $\sigma_A$, $\sigma_B$ and $\sigma_I$ of the template values using equation (3). These are used to
derive the specifications for the different circuit components of the cell realization as explained in
Section 5.2.
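The forward and inverse yield calculations can be sketched in a few lines. The example assumes that all cells fail independently and that the route shift is zero-mean normal, so the inversion reduces to a normal quantile; `statistics.NormalDist` from the Python standard library provides it. The function names are illustrative.

```python
from statistics import NormalDist


def network_yield(p_cell: float, n_cells: int) -> float:
    """Yield of a network in which every cell must operate correctly."""
    return p_cell ** n_cells


def required_p_correct(target_yield: float, n_cells: int) -> float:
    """Per-cell probability needed to reach the target network yield."""
    return target_yield ** (1.0 / n_cells)


def allowed_sigma_route(target_yield: float, n_cells: int,
                        delta_template: float = 1.0) -> float:
    """Largest route spread that still meets the target yield.

    Two-sided criterion: P(|shift| < delta) = p, so the normal quantile
    is taken at (1 + p) / 2.
    """
    p = required_p_correct(target_yield, n_cells)
    z = NormalDist().inv_cdf(0.5 * (1.0 + p))
    return delta_template / z


for side in (32, 128, 512):
    n = side * side
    print(f"{side} x {side}: sigma_route <= {allowed_sigma_route(0.90, n):.3f}")
```

For a 128 × 128 network, a yield of 90% and an allowed deviation of 1, this gives a spread of about 0.22, and the allowed spread shrinks only slowly as the network grows.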
4. SIMULATION RESULTS
A deviation in the template coefficients will have an influence on the dynamic behaviour or the state
evolution of the cell, since the current injected in the capacitor will deviate from its ideal value. For non-propagating templates this will only result in a change in the convergence time but will have no influence on
the correct computations. However, when the information is propagating through the network, dynamical
effects can influence the correct computations. The evaluation method only takes into account the effect of
static errors on the correct behaviour of a CNN. To evaluate whether the static effects of the template
deviations are dominant over their dynamic effects, Monte Carlo simulations were performed for
information-propagating templates.
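For comparison with the full dynamic simulations, the purely static prediction can itself be checked by sampling. The sketch below is a deliberately simplified stand-in, not the simulation used in this paper: it draws one normally distributed route shift per cell and declares a chip correct when every shift stays within the allowed deviation. All names and default values are assumptions.

```python
import random


def simulate_static_yield(n_cells: int, sigma_route: float,
                          delta_template: float = 1.0,
                          n_chips: int = 2000, seed: int = 0) -> float:
    """Fraction of sampled chips in which no cell exceeds the allowed shift.

    Static model only: cell dynamics, races and time-constant mismatch
    are ignored, so this checks the statistics, not the network behaviour.
    """
    rng = random.Random(seed)
    good = 0
    for _ in range(n_chips):
        shifts = (rng.gauss(0.0, sigma_route) for _ in range(n_cells))
        if all(abs(shift) < delta_template for shift in shifts):
            good += 1
    return good / n_chips


# 4 x 1 network with a route spread of 0.581 (target yield about 70%).
estimate = simulate_static_yield(4, 0.581, n_chips=5000)
print(f"estimated yield: {estimate:.3f}")
```

Any gap between such a static estimate and a full dynamic simulation would point to dynamical errors, which is exactly what the simulations in this section are meant to detect.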
In Table II the desired yield and the corresponding probability $P_{correct}$ for a cell to behave correctly are
tabulated. With the resulting calculated allowed $\sigma$ value of the template values, Monte Carlo simulations of
the CNN are performed. For the CCD template (Table I) the simulation is performed with the edges and all
cells initialized to -1 except for the first cell. For this test image all cells (except the first and the last one)
have to make transitions from -1 to 1 and from 1 to -1, which implies that all dynamic routes are checked.
In Figure 4 one correct run out of the 150 simulations of the 100 × 1 CCD is displayed, showing the correct
evolution of all cells with varying template values for an input image of [1 -1 ... -1]. The variation in
the template values over the cells results in a variation in the equilibrium points of the cells and their end
states which can clearly be observed.
Table II. The necessary probability of correct operation of a cell ($P_{correct}$) calculated for a given yield and the
corresponding simulated yield performance. The last column ($\sigma_{err}$) is the standard deviation of the simulated yield
and the number of simulations is indicated in parentheses. In the third column the allowed absolute deviation
$\sigma_{abs}$ of the template values is given. All values are percentages
                                 Calculations                        Monte Carlo simulation
Network                          Yield        P_correct  sigma_abs   Yield   sigma_err (no. of simulations)
4 × 1 CCD                        70.0         91.50      29.0        66      2.9 (250)
4 × 1 CCD                        80.0         94.60      26.0        78      2.5 (250)
4 × 1 CCD                        90.0         97.40      22.5        87      2.5 (250)
4 × 1 CCD                        99.0         99.75      16.5        99.6    0.6 (250)
100 × 1 CCD                      90.0         99.89      15.26       91.7    2.4 (150)
32 × 32 modified holefiller      95.0 (90.0)  99.99      9.73        97.3    1.6 (150)
Figure 4. Simulation of a 100 × 1 CCD with varying template values over the cells for an input image [1 -1 ... -1] (cell evolution plotted against time)
For the modified holefiller template HOLE-MOD (see Section 5.1 and Table I) the input image is set to a
white image with black edges. All cells are initialized to black for this template. If the network operates
correctly all cells remain black. If at least one cell becomes white due to errors, the whole image
becomes white. With this test image, however, not all dynamic routes are tested. In fact the network will
only behave incorrectly if the shift in the dynamic routes is larger than $\Delta_{template}$ (= 1) downwards. This is
only the case for half of the incorrect networks and, for a calculated yield of 90%, the yield obtained by the
simulations is then
$$1 - \frac{1 - 0.90}{2} = 0.95 \qquad (6)$$
or 95%.
The accuracy of the simulated yield estimate is given by
$$\sigma_{err} = \sqrt{\frac{(1 - \mathrm{Yield}) \times \mathrm{Yield}}{\text{no. of simulations}}} \qquad (7)$$
Taking into account the accuracy of the simulated yield estimates, all simulation results agree well with the
calculations. For the tested templates the influence of the variations in the template values on the static
behaviour is dominant over their impact on the dynamic behaviour.
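The two small calculations used to compare simulated and calculated yields can be written out as follows: the binomial standard error of a yield estimate, and the correction for a test image that only detects shifts in one direction. The function names are illustrative.

```python
import math


def yield_std_error(yield_fraction: float, n_simulations: int) -> float:
    """Standard deviation of a yield estimated from independent simulations."""
    return math.sqrt((1.0 - yield_fraction) * yield_fraction / n_simulations)


def one_sided_yield(two_sided_yield: float) -> float:
    """Expected simulated yield when only half of the failures are observable."""
    return 1.0 - (1.0 - two_sided_yield) / 2.0


print(f"std error at 70% yield, 250 runs: {100 * yield_std_error(0.70, 250):.1f}%")
print(f"observable yield for a 90% design: {100 * one_sided_yield(0.90):.0f}%")
```

The first value reproduces the 2.9% listed for the 4 × 1 connected component detector at 70% yield, and the second the 95% expected for the modified holefiller test.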
5. APPLICATIONS
5.1. Robust template design
The method is very helpful for evaluating the feasibility of a hardware implementation of a template.
At the top of Figure 5 the two critical dynamic routes for the classical holefiller
template⁶ are drawn. The upper route is for k = 1 corresponding to the case of a black input pixel and all
but one neighbouring states white. The lower route is for k = 0 corresponding to a white input pixel and all
neighbouring states black or a black input pixel and all neighbouring states white. If the cells all start with
an initial state of black and the input image is presented at the inputs, the network will fill up the holes in
the image. However, for k = 1 the equilibrium is unstable and the slightest variation will result in a wrong
behaviour. Such a template cannot be used in a hardware implementation since variations will always exist.
If the I-template value is changed from I = -1 to 0, the cell behaviour becomes much more insensitive to
non-idealities. Now $\Delta_{template} = 1$ is achieved and the higher robustness makes a hardware implementation
feasible (see Section 5.2). Moreover, the holefiller will now work for black images on white backgrounds
and for white images on black backgrounds as reported in Reference 8.
Figure 5. (a) The critical dynamic routes for the original holefiller template; stable equilibria are indicated with a filled circle and unstable with an open circle. (b) The critical dynamic routes for the modified template; the maximal allowed deviation for a correct operation is increased from 0 to 1
In this way the robustness evaluation can be used to optimize the template robustness or to (re)design
templates. This procedure can be compared with classical design centring approaches.⁹ There the
acceptability region for the parameters is first determined. Then a goal function or quality measure is
defined and the parameters are redesigned within the acceptability region towards optimal quality. In the
presented approach the quality of a template is determined by its robustness, which is related to $\Delta_{template}$.
The influence of the different template entries on $\Delta_{template}$ has to be determined as well as the acceptability
range for the parameters so that the network function remains intact. With this information the template can
be redesigned for high robustness.
We will briefly illustrate this procedure for a connected component function. Templates of the form
A = [a, a + 1, -a], B = 0 and I = 0 with a > 0 have dynamic routes that are isomorphic to the standard CCD
template and all perform the CCD function. $\Delta_{template}$ is equal to a. Thus, to obtain optimal robustness, a has
to be chosen as large as possible. In practice the a-value will then be limited by a practical limitation such
as the limited state swing.
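The statement that the allowed deviation of the connected component family A = [a, a + 1, -a] equals a can be checked numerically. The sketch below takes the allowed deviation to be the smallest distance between the dynamic route and zero at the unit states +1 and -1, over all saturated neighbour combinations; this reading of the definition, the choice G = 1 and the function names are assumptions.

```python
def saturation(x: float) -> float:
    """Piecewise-linear CNN output function (assumed standard form)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def route(x: float, k: float, a_self: float, g: float = 1.0) -> float:
    """Value of the dynamic route: -G*x + A_c*f(x) + k."""
    return -g * x + a_self * saturation(x) + k


def ccd_delta_template(a: float, g: float = 1.0) -> float:
    """Smallest route margin at the unit states for A = [a, a + 1, -a]."""
    a_self = a + 1.0
    margins = []
    for y_left in (-1.0, 1.0):
        for y_right in (-1.0, 1.0):
            k = a * y_left - a * y_right   # neighbour contribution, B = 0 and I = 0
            for x_unit in (-1.0, 1.0):
                margins.append(abs(route(x_unit, k, a_self, g)))
    return min(margins)


for a in (0.5, 1.0, 2.0):
    print(f"a = {a}: delta_template = {ccd_delta_template(a)}")
```

The margin grows linearly with a, so doubling a doubles the shift a cell can tolerate, at the cost of a larger state swing.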
5.2. Generation of accuracy specifications
In Section 3.2 a procedure was introduced to determine the upper limit for $\sigma_{route}$ for a given network size,
$\Delta_{template}$ and desired yield. This specification can be translated into accuracy specifications for the different
template circuits starting from equation (3). Assuming that all circuits have the same relative accuracy $\sigma_{rel}$,
independent of the template value, $\sigma_{rel}$ can be calculated from $\sigma_{route}$ by
$$\sigma_{rel} = \frac{\sigma_{route}}{M_{rel}} \qquad (8)$$
$$M_{rel} = \sqrt{G^2 + \sum_i A_i^2 + \sum_i B_i^2 + I^2} \qquad (9)$$
For a template circuit implementation with a constant absolute error $\sigma_{abs}$ independent of the template value,
$\sigma_{abs}$ is given by
$$\sigma_{abs} = \frac{\sigma_{route}}{M_{abs}} \qquad (10)$$
$$M_{abs} = \sqrt{\text{no. of template values} \neq 0} \qquad (11)$$
In Table III the necessary accuracies for several templates are listed for a network of 128 × 128 cells
with a yield of 90%. Values of $\sigma_{route}$ and $M_{rel}$ and the corresponding $\sigma_{rel}$ are calculated. From these results,
interesting conclusions can be drawn. The more complex the template, the more sensitive it becomes to
mismatches (see e.g. edge detector). This can be explained intuitively, since a cell must be able to
distinguish a smaller signal change from a signal with a larger common mode, which is also reflected in the
lower value of $\Delta_{template}$. This is known by analogue designers to be a difficult task. Moreover, by optimizing
the template values, a template with a similar behaviour can be obtained with much higher robustness, as
illustrated by the CCD-LARGE and HOLE-MOD templates.
In the last column a normalized estimate for the area of the synapses in the cell circuit is given. The area
of a template or synapse circuit for a fixed function chip implementation is proportional to the number of
non-zero entries in the template and inversely proportional to the squared allowed error $\sigma_{rel}^2$. For a programmable
CNN implementation the worst-case specifications have to be taken into account which results in large
circuit areas. Moreover, the implementation of a programmable synapse or template circuit requires more
area than fixed value synapses.
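The translation from a route spread to circuit accuracies can be sketched as below. It assumes that the relative-error case divides the route spread by the root of the summed squares of all template values (including the cell conductance G), that the absolute-error case divides by the root of the number of non-zero values, and that the normalized synapse area is the number of synapses divided by the squared relative error. The function names and the CCD values G = 1, A = [1, 2, -1] are assumptions for illustration.

```python
import math


def m_rel(template_values) -> float:
    """Root of the summed squares of all template values (G included)."""
    return math.sqrt(sum(v * v for v in template_values))


def m_abs(template_values) -> float:
    """Root of the number of non-zero template values."""
    return math.sqrt(sum(1 for v in template_values if v != 0))


def sigma_rel(sigma_route: float, template_values) -> float:
    """Allowed relative error when every circuit has the same relative accuracy."""
    return sigma_route / m_rel(template_values)


def sigma_abs(sigma_route: float, template_values) -> float:
    """Allowed absolute error when every circuit has the same absolute accuracy."""
    return sigma_route / m_abs(template_values)


def synapse_area_estimate(n_synapses: int, rel_error: float) -> float:
    """Normalized area: number of synapses over the squared relative error."""
    return n_synapses / rel_error ** 2


ccd = [1.0, 1.0, 2.0, -1.0]            # G followed by the A-template entries
rel = sigma_rel(0.2217, ccd)           # route spread for 128 x 128 cells, 90% yield
print(f"M_rel = {m_rel(ccd):.2f}, sigma_rel = {100 * rel:.1f}%")
print(f"normalized synapse area = {synapse_area_estimate(3, rel):.0f}")
```

For these assumed CCD values the result is a relative accuracy of about 8.4% and an area estimate of roughly 3 × 142, in line with the CCD row of Table III.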
Table III. The allowed $\sigma_{route}$ calculated for a 128 × 128 network and for a yield of 90%. From
these specifications the maximal relative error $\sigma_{rel}$ for the template circuits can be calculated from
equation (8). In the last column a normalized estimate of the area of the synapses is given
Template     Delta_template   sigma_route   M_rel   sigma_rel (%)   Area
AND          1.5              0.33          2.3     14              2 × 51
CCD          1                0.22          2.6     8.4             3 × 142
CCD-LARGE    2                0.44          4.2     10.5            3 × 91
EDGE         0.25             0.06          3.4     1.9             11 × 2770
HOLE         0                0.00          5       0               ∞
HOLE-MOD     1                0.22          5       4.4             5 × 516
SHADOW       2                0.44          3.6     12.3            3 × 66
5.3. Parametric test templates
In the fabrication process of VLSI chips the testing after fabrication is of extreme importance to
guarantee high quality chips. Functional testing with test inputs is performed to eliminate hard errors such
as stuck-at faults. Besides, parametric testing is performed to check whether the circuits meet the
specification ranges on their inputs and outputs.
Test templates can be designed to test a fabricated network chip parametrically. By using e.g. a CCD
function test template of the form A = [a, a + 1, -a], B = 0 and I = 0 with a test image of white pixels and
one column of black pixels on the left, the smallest value of a can be measured for which the network still
operates correctly. This value is the lower bound on $\Delta_{template}$ that the A-template circuits of the chip can
process. Similar test templates can be designed for the B-template circuits starting from e.g. the edge
detector. With this strategy a parametric test procedure can be set up for CNN chips.
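A parametric sweep of this kind can be mimicked in software. The sketch below uses a hypothetical toy chip model in which each cell has a fixed random route shift and the chip passes a test template with margin a when every shift is smaller than a; a real measurement would replace `operates_correctly` by loading the test template into the chip and checking its output. All names and numbers are assumptions.

```python
import random


def make_toy_chip(n_cells: int, sigma_route: float, seed: int = 0):
    """Return (operates_correctly, worst_shift) for a simulated chip.

    Toy model: the chip works for margin a if the largest absolute
    route shift over all cells is smaller than a.
    """
    rng = random.Random(seed)
    worst_shift = max(abs(rng.gauss(0.0, sigma_route)) for _ in range(n_cells))

    def operates_correctly(a: float) -> bool:
        return a > worst_shift

    return operates_correctly, worst_shift


def smallest_working_a(operates_correctly, a_values):
    """Smallest tested value of a for which the chip still operates correctly."""
    working = [a for a in a_values if operates_correctly(a)]
    return min(working) if working else None


chip, worst = make_toy_chip(n_cells=100, sigma_route=0.1, seed=1)
grid = [0.05 * i for i in range(1, 41)]      # test values of a from 0.05 to 2.0
measured = smallest_working_a(chip, grid)
print(f"smallest working a = {measured}, true worst shift = {worst:.3f}")
```

The measured value bounds the worst-case shift of the chip from above, with a resolution set by the step of the sweep.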
6. CONCLUSIONS
In this paper a method is proposed for the evaluation of the static robustness of CNN templates towards
random variations in the template values, which are inevitable in a VLSI implementation. These robustness
specifications are translated into circuit specifications that guarantee a correct network operation with a
given yield. With these specifications a circuit designer can optimize VLSI circuit implementations towards
minimal area and power consumption and maximal speed.
The evaluation method is also applicable to the optimization of templates. It is used in the optimization
and choice of template values towards a good VLSI and hardware implementability. Furthermore, test
templates can be derived to parametrically test fabricated cellular neural network chips.
ACKNOWLEDGEMENT
The authors are funded by the Belgian National Fund for Scientific Research.
REFERENCES
1. L. O. Chua and L. Yang, 'Cellular neural networks: theory and applications', IEEE Trans. Circuits and Systems, CAS-35,
1257-1290 (1988).
2. M. J. M. Pelgrom, A. C. J. Duinmaijer and A. P. G. Welbers, 'Matching properties of MOS transistors', IEEE J. Solid-State
Circuits, SC-24, 1433-1439 (1989).
3. E. A. Vittoz, 'Future of analog in the VLSI environment', Proc. IEEE Int. Symp. on Circuits and Systems, May 1990, pp.
1372-1375.
4. T. Roska and L. Kék (eds.), 'Analogic CNN program library', Rep. DNS-5-1994, Analogical and Neural Computing
Laboratory, Computer and Automation Institute, Hungarian Academy of Sciences, Budapest, 1994.
5. T. Matsumoto, L. O. Chua and H. Suzuki, 'CNN cloning template: connected component detector', IEEE Trans. Circuits and
Systems, CAS-37, 633-635 (1990).
6. T. Matsumoto, L. O. Chua and R. Furukawa, 'CNN cloning template: hole-filler', IEEE Trans. Circuits and Systems, CAS-37,
635-638 (1990).
7. A. Papoulis, Probability, Random Variables and Stochastic Processes, McGraw-Hill, New York, 1991.
8. L. O. Chua and P. Thiran, 'An analytical method for designing simple cellular neural networks', IEEE Trans. Circuits and
Systems, CAS-38, 1332-1341 (1991).
9. R. K. Brayton and R. Spence, Sensitivity and Optimization, Elsevier, Amsterdam, 1980.