Efficient harmonic balance modeling of large microwave circuits

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
EFFICIENT HARMONIC BALANCE MODELING OF LARGE
MICROWAVE CIRCUITS
by
STEVEN GLEN SKAGGS
A thesis submitted to the Graduate Faculty of
North Carolina State University
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy
DEPARTMENT OF ELECTRICAL AND COMPUTER
ENGINEERING
Raleigh
1999
APPROVED BY:
Chair of Advisory Committee
UMI Number: 9933902
Copyright 1999 by
Skaggs, Steven Glen
All rights reserved.
UMI Microform 9933902
Copyright 1999, by UMI Company. All rights reserved.
This microform edition is protected against unauthorized
copying under Title 17, United States Code.
UMI
300 North Zeeb Road
Ann Arbor, MI 48103
Abstract
Skaggs, Steven Glen
Efficient Harmonic Balance Modeling of Large Microwave Circuits.
(Under the direction of Michael B. Steer.)
The purpose of this research is to provide improvements in the harmonic balance
technique for the simulation of large microwave circuits. The harmonic balance
technique becomes less efficient with an increase in circuit size, as the nonlinear
system of equations to be solved becomes large. Since a Jacobian matrix of rank
N costs $O(N^3)$ floating point operations to decompose, it is desirable to reduce
both the number of times the Jacobian must be evaluated and the amount of
processing required for matrix decomposition. For Newton-Raphson based harmonic
balance, the Jacobian may be approximated in such a way that decomposition is
less expensive. Additionally, the approximated Jacobian is often superior to the
original formulation of the Jacobian in terms of the number of iterations required for
simulation convergence. Similarly, Krylov subspace based methods may be improved
by using Jacobian preconditioners. This linear iterative technique has been shown
to be more efficient for large moderately nonlinear microwave circuits. This study
explores the effects of different preconditioners for linear iterative solvers in the
harmonic balance technique.
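The cost argument in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy residual (a tridiagonal linear part plus a cubic nonlinearity), the helper names `residual`, `jacobian`, `solve_dense`, and `newton`, and the choice of a block-diagonal approximation. It is not the thesis's simulator. The idea it shows is that if only k diagonal blocks of size N/k are kept, each Newton step factors k small matrices, roughly k(N/k)^3 = N^3/k^2 operations instead of N^3.

```python
# Toy illustration only: the system, sizes, and names below are assumptions,
# not taken from the thesis or its simulator.

def residual(x):
    """F(x) = A x + 0.1 x^3 - 1, with A = tridiag(-1, 4, -1)."""
    n = len(x)
    f = []
    for i in range(n):
        lin = 4.0 * x[i]
        if i > 0:
            lin -= x[i - 1]
        if i < n - 1:
            lin -= x[i + 1]
        f.append(lin + 0.1 * x[i] ** 3 - 1.0)
    return f

def jacobian(x):
    """Exact Jacobian of residual(): linear part plus a nonlinear diagonal."""
    n = len(x)
    j = [[0.0] * n for _ in range(n)]
    for i in range(n):
        j[i][i] = 4.0 + 0.3 * x[i] ** 2
        if i > 0:
            j[i][i - 1] = -1.0
        if i < n - 1:
            j[i][i + 1] = -1.0
    return j

def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting, O(n^3)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def newton(n, block, tol=1e-10, max_iter=200):
    """Newton iteration keeping only diagonal blocks of the Jacobian.

    block == n uses the full Jacobian; smaller blocks discard all
    entries outside the diagonal blocks. Returns (solution, updates).
    """
    x = [0.0] * n
    for it in range(1, max_iter + 1):
        f = residual(x)
        if max(abs(v) for v in f) < tol:
            return x, it - 1
        j = jacobian(x)
        for start in range(0, n, block):
            idx = range(start, min(start + block, n))
            sub = [[j[r][c] for c in idx] for r in idx]
            dx = solve_dense(sub, [f[r] for r in idx])
            for k, r in enumerate(idx):
                x[r] -= dx[k]
    raise RuntimeError("no convergence")

x_full, it_full = newton(8, block=8)   # full Jacobian
x_blk, it_blk = newton(8, block=2)     # four 2x2 diagonal blocks
print("updates with full Jacobian:", it_full)
print("updates with block Jacobian:", it_blk)
```

In this toy problem the block approximation needs more updates than the full Jacobian, but each update is cheaper. Whether the trade pays off depends on the circuit and the drive level, which is what the thesis's experiments examine.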
Biographical Summary
Steven Glen Skaggs was born December 12, 1966 in Lafayette, IN. He received
the Bachelor of Science degree in electrical engineering summa cum laude from North
Carolina State University in 1989. In 1991, he received the Master of Science
degree, also from North Carolina State University. His Master's thesis research was the
extraction of microwave transistor model parameters using tree annealing optimization.
From 1994 to 1995, he was employed by Compact Software in Paterson, NJ
as a senior microwave circuit engineer, working on a commercial harmonic balance
simulator. Since August of 1995, he has been employed by Avant! Corporation, first
as a software developer and more recently as a product specialist. His research
interests include harmonic balance simulation of microwave circuits and device modeling
and parameter extraction.
Acknowledgements
First of all, I must thank my wife Leah and my son Luke for their sacrifice during
all this, especially in the last year. I am looking forward to having more time to
spend with you two. This work is dedicated to you.
Next, I am deeply indebted to Michael Steer for his guidance and direction
during my graduate and even undergraduate education. Michael also served as my
“conscience” on occasion when I let my full time job responsibilities prevent me from
progressing on my thesis. Speaking of my conscience, thanks to Dad for keeping up
with my progress. Let's go play some golf now.
Thanks also to those who have served as my supervisors while I have been
working full time. Thanks, Jason Gerber, Mark Basel, Jeff Byrd, John Studders, and
Keith Lanier. You guys were all supportive and understanding about my graduate
studies. Thanks also to Carlos Christoffersen and Shunmin Wang for your assistance
in the past year.
And finally, thanks to my many friends and co-workers who never laughed in my
face when I talked about “finally finishing my PhD.”
Table of Contents
List of Figures
List of Tables
1 Introduction
  1.1 Motivation
  1.2 Thesis Overview
2 Review of Harmonic Balance Techniques
  2.1 Posing the Harmonic Balance Analysis Problem
  2.2 Relaxation Methods
  2.3 Newton Methods
  2.4 Inexact Newton Methods
    2.4.1 Iterative Linear Solvers
    2.4.2 Incomplete LU Decomposition
  2.5 Summary
3 Development of a Harmonic Balance Simulator
  3.1 Newton-based Harmonic Balance Analysis
    3.1.1 Forming the Jacobian Matrix
  3.2 Improvements in Numerical Techniques for Harmonic Balance
    3.2.1 Sparse Matrix Techniques
    3.2.2 Oversampling
    3.2.3 The State Variable Approach
    3.2.4 The Chord Method
    3.2.5 Jacobian Approximation and Preconditioning Techniques
4 Newton-based Harmonic Balance with Approximate Jacobian Matrices
  4.1 Distributed Amplifier
    4.1.1 Using the Full Jacobian Matrix
    4.1.2 Using Block Jacobian Matrix Techniques
    4.1.3 Using the Linear Jacobian along with the Diagonal of the Nonlinear Jacobian
    4.1.4 Using the Linear Jacobian Only
    4.1.5 Using a Threshold Value for Nonlinear Jacobian Contributions
    4.1.6 Summary
  4.2 Nonlinear Transmission Lines
    4.2.1 A 10 diode NLTL
    4.2.2 A 47 diode NLTL
    4.2.3 Summary
5 Harmonic Balance Simulation Using Inexact Newton Methods with a Preconditioned Jacobian Matrix
  5.1 Simulation of a Distributed Amplifier
    5.1.1 Low Input Power
    5.1.2 High Input Power
  5.2 Simulation of a Short NLTL
    5.2.1 Low Input Voltage
    5.2.2 High Input Voltage
  5.3 Simulation of a Longer NLTL
  5.4 Summary
6 Conclusions and Future Research
  6.1 Discussion
  6.2 Suggestions for Further Research
List of Figures
2.0.1 Circuit partitioned into linear and nonlinear subcircuits.
3.2.1 Diagonal blocking scheme using 16 blocks along the matrix diagonal. The shaded areas indicate the location of matrix entries which will be used.
3.2.2 Diagonal blocking scheme using 8 blocks along the matrix diagonal. The shaded areas indicate the location of matrix entries which will be used. Each small shaded block represents all derivatives of the error function at a given frequency with respect to all the state variables at a second given frequency.
3.2.3 Diagonal blocking scheme using 4 blocks along the matrix diagonal. The shaded areas indicate the location of matrix entries which will be used. Each small shaded block represents all derivatives of the error function at a given frequency with respect to all the state variables at a second given frequency.
3.2.4 Diagonal blocking scheme using 2 blocks along the matrix diagonal. The shaded areas indicate the location of matrix entries which will be used. Each small shaded block represents all derivatives of the error function at a given frequency with respect to all the state variables at a second given frequency.
3.2.5 Level one off-diagonal blocking scheme. The shaded areas indicate the location of matrix entries which will be used. Each small shaded block represents all derivatives of the error function at a given frequency with respect to all the state variables at a second given frequency.
3.2.6 Level two off-diagonal blocking scheme. The shaded areas indicate the location of matrix entries which will be used. Each small shaded block represents all derivatives of the error function at a given frequency with respect to all the state variables at a second given frequency.
3.2.7 Level three off-diagonal blocking scheme. The shaded areas indicate the location of matrix entries which will be used. Each small shaded block represents all derivatives of the error function at a given frequency with respect to all the state variables at a second given frequency.
4.1.1 Distributed Amplifier Circuit
4.1.2 Magnitude of the output voltage spectrum of the distributed amplifier in dBm with input power levels of 0 (○), 10 (+), and 20 dBm (□). The x-axis represents frequency in Hz.
4.1.3 Runtime in machine cycles for simulation of the distributed amplifier for different input power levels.
4.1.4 Number of Newton-Raphson iterations required for convergence of simulation of the distributed amplifier circuit with respect to input power level.
4.1.5 Magnitude of nonlinear Jacobian contributions with respect to proximity to the diagonal of the Jacobian in the simulation of the distributed amplifier circuit. The x-axis is the absolute difference between the row and column indices of the nonlinear entry, while the y-axis is the absolute value of the entry. The entries of the Jacobian at 0 dBm input power are represented by ○, while the entries of the Jacobian at 20 dBm input power are represented by +.
4.1.6 Runtime in machine cycles for harmonic balance of the distributed amplifier circuit with respect to input power level. The upper curve corresponds to calculating the Jacobian at every step in the Newton-Raphson solving process, while the lower curve corresponds to using the same Jacobian throughout the solving process.
4.1.7 Number of iterations required for simulation of the distributed amplifier with different blocking schemes.
4.1.8 Simulation runtime of the distributed amplifier for different blocking schemes. Runtime is given in machine cycles.
4.1.9 Number of iterations required for simulation of the distributed amplifier using nonlinear Jacobian contributions on the matrix diagonal only.
4.1.10 Simulation runtime required for simulation of the distributed amplifier using nonlinear Jacobian contributions on the matrix diagonal only.
4.1.11 Number of iterations required for simulation of the distributed amplifier when only linear Jacobian contributions are used.
4.1.12 Simulation runtime (in machine cycles) required for the distributed amplifier when only linear Jacobian contributions are used.
4.1.13 Simulation runtime (in machine cycles) required for the distributed amplifier at low power when Jacobian entries are subject to threshold levels. The x-axis represents the base 10 logarithm of the threshold value.
4.1.14 Number of iterations required for convergence of the Newton-Raphson method for the distributed amplifier at low power when Jacobian entries are subject to threshold levels. The x-axis represents the base 10 logarithm of the threshold value.
4.1.15 Simulation runtime (in machine cycles) required for the distributed amplifier at high power when Jacobian entries are subject to threshold levels. The x-axis represents the base 10 logarithm of the threshold value.
4.1.16 Number of iterations required for convergence of the Newton-Raphson method for the distributed amplifier at high power when Jacobian entries are subject to threshold levels. The x-axis represents the base 10 logarithm of the threshold value.
4.2.1 Nonlinear transmission line unit cell.
4.2.2 Measured (solid line) and simulated (dashed line) output waveform for the 47 diode NLTL.
4.2.3 Magnitude of voltage output spectrum of the 10 diode NLTL with 1 (○), 3 (+), 6 (□), and 9 (×) volt input AC voltages.
4.2.4 Number of iterations and ratio of unused to used Jacobian entries for simulation of the 10 diode NLTL with 1 volt AC input voltage. The dashed line represents the ratio of the magnitude of unused Jacobian entries to the total magnitude of all Jacobian entries, while the solid line represents the corresponding number of iterations required for convergence.
4.2.5 Number of iterations and ratio of unused to used Jacobian entries for simulation of the 10 diode NLTL with 3 volt AC input voltage. The dashed line represents the ratio of the magnitude of unused Jacobian entries to the total magnitude of all Jacobian entries, while the solid line represents the corresponding number of iterations required for convergence.
4.2.6 Number of iterations and ratio of unused to used Jacobian entries for simulation of the 10 diode NLTL with 6 volt AC input voltage. The dashed line represents the ratio of the magnitude of unused Jacobian entries to the total magnitude of all Jacobian entries, while the solid line represents the corresponding number of iterations required for convergence.
4.2.7 Number of iterations and ratio of unused to used Jacobian entries for simulation of the 10 diode NLTL with 9 volt AC input voltage. The dashed line represents the ratio of the magnitude of unused Jacobian entries to the total magnitude of all Jacobian entries, while the solid line represents the corresponding number of iterations required for convergence.
4.2.8 Simulation runtime and ratio of unused to used Jacobian entries for the 47 diode NLTL with 1 volt AC input voltage. The dashed line represents the ratio of the magnitude of unused Jacobian entries to the total magnitude of all Jacobian entries, while the solid line represents the corresponding runtime required for convergence.
4.2.9 Simulation runtime and ratio of unused to used Jacobian entries for the 47 diode NLTL with 2 volt AC input voltage. The dashed line represents the ratio of the magnitude of unused Jacobian entries to the total magnitude of all Jacobian entries, while the solid line represents the corresponding runtime required for convergence.
4.2.10 Simulation runtime and ratio of unused to used Jacobian entries for the 47 diode NLTL with 3 volt AC input voltage. The dashed line represents the ratio of the magnitude of unused Jacobian entries to the total magnitude of all Jacobian entries, while the solid line represents the corresponding runtime required for convergence.
4.2.11 Simulation runtime and ratio of unused to used Jacobian entries for the 47 diode NLTL with 4 volt AC input voltage. The dashed line represents the ratio of the magnitude of unused Jacobian entries to the total magnitude of all Jacobian entries, while the solid line represents the corresponding runtime required for convergence.
5.1.1 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the distributed amplifier circuit. The input power level is 0 dBm.
5.1.2 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the distributed amplifier circuit. The input power level is 5 dBm.
5.1.3 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the distributed amplifier circuit. The input power level is 10 dBm.
5.1.4 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the distributed amplifier circuit. The input power level is 15 dBm.
5.2.1 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 1 volt.
5.2.2 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 2 volts.
5.2.3 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 3 volts.
5.2.4 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 4 volts.
5.2.5 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 5 volts.
5.2.6 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 6 volts.
5.3.1 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the 20 diode NLTL. The input AC voltage level is 1 volt.
5.3.2 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the 20 diode NLTL. The input AC voltage level is 2 volts.
5.3.3 Simulation runtime (solid line) and ratio of the magnitude of all unused Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line) versus matrix sparsity for the 20 diode NLTL. The input AC voltage level is 3 volts.
List of Tables
3.2.1 Relative Approximate Runtime for LU Decomposition of a Rank 135 Jacobian Matrix
4.1.1 Average magnitude of nonlinear Jacobian contributions within each frequency partition with input power level of 0 dBm. Each row corresponds to the frequency of the unknown while each column corresponds to the frequency of the error function. The final Jacobian used in the simulation is shown, for which the average of all the nonlinear Jacobian contributions is 1.349e-3. The first row and column are frequencies in GHz.
4.1.2 Average magnitude of nonlinear Jacobian contributions within each frequency partition with input power level of 20 dBm. Each row corresponds to the frequency of the unknown while each column corresponds to the frequency of the error function. The final Jacobian used in the simulation is shown, for which the average of all the nonlinear Jacobian contributions is 1.482e-3. The first row and column are frequencies in GHz.
4.2.1 Blocking approximation schemes. A number by itself refers to the number of diagonal blocks used. A number preceded by the letter “o” indicates an off-diagonal blocking scheme.
4.2.2 Number of Jacobian calculations required for convergence and simulation runtime for different blocking schemes for the 10 diode NLTL circuit with an input AC voltage level of 1 volt.
4.2.3 Number of Jacobian calculations required for convergence and simulation runtime for different blocking schemes for the 10 diode NLTL circuit with an input AC voltage level of 3 volts.
4.2.4 Number of Jacobian calculations required for convergence and simulation runtime for different blocking schemes for the 10 diode NLTL circuit with an input AC voltage level of 6 volts.
4.2.5 N um ber of Jacobian. calculations required for convergence and sim u­
lation runtim e for different blocking schem es for th e 10 diode NLTL
circuit with an input AC voltage level of 9 volts........................................... 81
4.2.6
N um ber of Jacobian calculations required for convergence and sim u­
lation runtim e for different blocking schem es for the 47 diode NLTL
circuit with an input AC voltage level of 1 v olt............................................. 83
4.2.7
N um ber of Jacobian calculations required for convergence and sim u­
lation runtim e for different blocking schem es for th e 47 diode NLTL
circuit with an input AC voltage level of 2 volts........................................... 84
4.2.8
N um ber of Jacobian calculations required for convergence and sim u­
lation runtim e for different blocking schem es for th e 47 diode NLTL
circuit with an input AC voltage level of 3 volts........................................... 87
4.2.9 N um ber of Jacobian calculations required for convergence and sim u­
lation runtim e for different blocking schem es for th e 47 diode NLTL
circuit with an in p u t AC voltage level of 4 volts........................................... 88
5.1.1 N um ber of Jacobian calculations required for convergence and sim ­
u lation runtim e for different blocking schem es for the d istrib u te d
am plifier circuit w ith an in p u t power level of 0 dB m ................................... 93
5.1.2 N um ber of jacobian calculations required for convergence and sim ­
ulatio n runtim e for different blocking schemes for the d istrib u ted
am plifier circuit w ith an input power level of 5 dB m ................................... 95
5.1.3 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the distributed amplifier circuit
with an input power level of 10 dBm.
5.1.4 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the distributed amplifier circuit
with an input power level of 15 dBm.
5.2.1 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 1 volt.
5.2.2 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 2 volts. N/C denotes no convergence.
5.2.3 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 3 volts. N/C denotes no convergence.
5.2.4 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 4 volts. N/C denotes no convergence.
5.2.5 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 5 volts. N/C denotes no convergence.
5.2.6 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 6 volts. N/C denotes no convergence.
5.3.1 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 20 diode NLTL circuit with an
input voltage of 1 volt.
5.3.2 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 20 diode NLTL circuit with an
input voltage of 2 volts.
5.3.3 Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 20 diode NLTL circuit with an
input voltage of 3 volts.
6.1.1 Runtime required for harmonic balance simulation of a 20 diode NLTL
using the exact Newton method and the inexact Newton method. The runtimes
given are the best obtained from all available Jacobian approximations and
preconditioners.
Chapter 1
Introduction
1.1 Motivation

As communication technologies continue to advance, there is a growing need for an
increased use of the frequency spectrum. The move to higher frequency systems
provides many benefits, including smaller antenna sizes, wider bandwidth, and higher
resolution for imaging. Traditionally, travelling-wave tubes have been the only
devices available which can produce the necessary tens or hundreds of watts of power
at millimeter wave frequencies. Unfortunately, these devices suffer the
disadvantages of large size and weight, and also require high voltage power supplies. On the
other hand, solid state devices designed for millimeter wave applications are small
and do not need high voltage power supplies, but provide limited amounts of power.
Therefore, in order to produce the power required for high frequency applications,
the power from many solid state devices must be combined.

In order to shorten the design cycle of power-combining circuits, efficient CAD
tools are required. Harmonic balance analysis is now an integral part of designing
nonlinear microwave and millimeter wave circuits. With an increase in the number
of devices and the level of input power to be simulated, harmonic balance simulators
are being pushed to their limits.

This thesis will present advances in CAD techniques for designing quasi-optical
systems. In particular, improvements will be developed for numerical techniques for
harmonic balance analysis, both in terms of simulation runtime and memory storage
requirements.

1.2 Thesis Overview

Chapter 2 is a review of the state of the art of the harmonic balance technique. The
review discusses the latest published techniques, including the use of iterative linear
solvers in Krylov subspace techniques.

In Chapter 3, the building blocks of a general-purpose harmonic balance simulator
are developed. There is an emphasis on the numerical techniques necessary
to optimize the speed and accuracy of the simulator. Consideration is also given
to the memory storage requirements for simulation of large nonlinear circuits.
Several different schemes for Jacobian matrix approximation and preconditioning are
presented. These schemes have been implemented in both a traditional
Newton-Raphson solver and a linear iterative solver.

The application of Jacobian approximations for Newton-Raphson-based harmonic
balance simulation is demonstrated in Chapter 4. A distributed amplifier
circuit and nonlinear transmission line circuits of different lengths are simulated
using the different approximations. These techniques are analyzed in terms of their
effects upon simulator performance.

Chapter 5 presents the use of inexact Newton methods in the harmonic balance
simulation of the distributed amplifier and nonlinear transmission line circuits. The
approximations used for Newton-Raphson-based simulations are used as
preconditioners for the iterative linear solver. These preconditioners are compared with
respect to their impact upon simulator performance.

Chapter 6 draws conclusions from this study and presents suggestions for future
work.
Chapter 2
Review of Harmonic Balance Techniques
Harmonic balance analysis is widely used for steady-state simulation of nonlinear
microwave circuits. This technique involves partitioning a circuit into two
subcircuits, one containing only linear devices and the other containing only the nonlinear
devices, as shown in Figure 2.0.1. The steady-state voltage and current waveforms
are represented in the frequency domain by a truncated Fourier spectrum, the form
of which is assumed a priori. The problem then becomes one of determining the
voltages and currents at the interface nodes.

In this chapter, formulation of the harmonic balance equations and different
techniques for solving them will be discussed.

[Figure 2.0.1: Circuit partitioned into linear and nonlinear subcircuits. Block
labels: LINEAR SUBCIRCUIT, NONLINEAR ELEMENT.]
2.1 Posing the Harmonic Balance Analysis Problem

If the circuit in Figure 2.0.1 has N interface nodes and if v is the vector of node
voltage waveforms, then the application of Kirchhoff's current law (KCL) to each
node results in a system of equations [1]

\[ f(v,t) = i(v(t)) + \frac{d}{dt} q(v(t)) + \int_{-\infty}^{t} y(t-\tau)\, v(\tau)\, d\tau + i_s(t) = 0 \tag{2.1.1} \]

where i(v(t)) is the sum of nonlinear currents entering the interface nodes, q(v(t))
is the sum of charges entering the interface nodes, y is the impulse response of
the linear subcircuit, and $i_s$ is the set of independent external source currents.
This formulation is well suited to finding the nonlinear response to the circuit state
variables as the nonlinearities are generally defined in the time domain. However,
a convolution operation is necessary for the calculation of the linear response. If
this equation is transformed to the frequency domain, the linear response is easily
calculated by multiplication with the modified nodal admittance matrix. In that
case, Equation 2.1.1 becomes

\[ F(V) = I(V) + \Omega Q(V) + Y V + I_S = 0 \tag{2.1.2} \]

where $\Omega$ is a matrix of frequency components representing the time differentiation
in the frequency domain. This is the equation that harmonic balance techniques try
to solve. The nonlinear response to the interface state variables is calculated in the
time domain and transformed to the frequency domain via the fast Fourier
transform (FFT), while the linear response is conveniently calculated in the frequency
domain. Generally some form of Newton's method is employed to solve the system
of nonlinear equations F(V) = 0. While this equation was derived with reference to
node voltages as the unknown variables for ease of notation, the harmonic balance
equations can easily accommodate currents as the state variables as well. In the
formulation of Rizzoli et al., the state variables and equations are quite general and
reference both voltage and current unknowns at the subcircuit interface [2].

A discrete set of analysis frequencies is chosen a priori for steady-state
solution. For single-tone analysis, the frequencies chosen are the fundamental excitation
tone and some number of its harmonics, including dc. Multi-tone analysis involves
the noncommensurate excitation tones, dc, and intermodulation products up to a
given order. For a circuit with N interface nodes and K frequency components,
the system E(X) of NK equations in NK unknowns must be solved. For small
nonlinear circuits excited with low input power sources, the nonlinear problem to
be solved is not terribly difficult, since N is small and a small K is sufficient for low
input power. However, as N and K increase, the traditional Newton methods for
solving Equation 2.1.2 rapidly become more expensive. Higher drive levels produce
more coupling between signal frequencies, and thus a richer frequency spectrum
is necessary to describe the signal, which increases K. Higher power levels also
mean a less well-conditioned system of equations. Naturally, N will increase with
an increase in the number of nonlinear devices to be simulated. The ability of a
given solution technique to solve Equation 2.1.2 for increasing N and K values is of
paramount importance. Additionally, the solution technique of choice must be able
to converge even when the system becomes increasingly ill-conditioned due to high
driving power levels.

The focus of this chapter is the set of different techniques that have been
employed to solve this system of equations. The merits and liabilities of each technique
will be discussed.

2.2 Relaxation Methods

One of the simplest techniques for solving the harmonic balance system of equations
is to use the fixed-point relaxation method, or "p-factor" method, of Hicks and Khan
[3]. This method involves the iterative equation

\[ i_k^{j+1}(\omega) = p\, i_l(\omega) + (1 - p)\, i_k^{j}(\omega) \tag{2.2.1} \]

where $i_l(\omega)$ is the linear current and $i_k^{j}$ is the nonlinear current at the kth node
and the jth iteration. No Jacobian matrix need be calculated or inverted, so the
computation cost is minimal for this method. An important element of this method
is choosing the right value for p, 0 < p < 1, which can greatly affect convergence.
Hicks and Khan and others [4], [5] have explored the convergence properties of this
technique. A multiple-reflection method used by Kerr [6] was shown to be a special
case of the p-factor method. Camacho-Peñalosa [5] developed an algorithm for
determining the optimal p factor.
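
The role of the p factor can be illustrated on a single-node, DC-only toy problem. The sketch below is not the full frequency-domain algorithm of Hicks and Khan; it assumes a hypothetical loop made of a voltage source, a series resistor, and an ideal exponential diode, and all component values and the function name are illustrative choices. At each pass the current delivered by the linear part is blended with the previous current estimate using the weight p.

```python
import math

def relax_diode_circuit(vs=0.7, r=1.0, i_sat=1e-12, v_t=0.025852,
                        p=0.5, tol=1e-12, max_iter=200):
    """Damped fixed-point iteration on the current of a source-resistor-diode loop.

    All component values are hypothetical and chosen only for illustration.
    Returns the converged current and the number of iterations used.
    """
    i = 0.0                                    # initial guess for the loop current
    for iteration in range(1, max_iter + 1):
        v = v_t * math.log(1.0 + i / i_sat)    # diode voltage that carries current i
        i_lin = (vs - v) / r                   # current the linear part delivers at v
        i_new = p * i_lin + (1.0 - p) * i      # relaxation update with factor p
        if abs(i_new - i) < tol:
            return i_new, iteration
        i = i_new
    raise RuntimeError("relaxation did not converge")

current, iterations = relax_diode_circuit()
print(f"loop current {current:.6f} A after {iterations} iterations")
```

No derivative is evaluated anywhere in the loop, which is what makes the method cheap; the price is that convergence depends on the choice of p.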

While the relaxation method is a simple, fast, and efficient algorithm, it is not
a robust method. Its performance generally degrades rapidly with an increase in
the number of nonlinear elements being analyzed. Also, unlike more computationally
intensive methods, it is not well suited for analysis of circuits excited with two or
more noncommensurate frequencies.

2.3 Newton Methods

The Newton-Raphson technique is widely used in traditional harmonic balance
simulators. This technique provides quadratic convergence provided the initial
solution guess is sufficiently close to the minimum of the error function. The
Newton-Raphson method requires the calculation of the Jacobian matrix
$J(X) = \partial F / \partial X$, where F(X) is the KCL error as described in Equation 2.1.2
and X is the vector of unknowns. The well-known Newton update equation is given by

\[ X^{j+1} = X^{j} - \alpha J^{-1}(X^{j}) F(X^{j}) \tag{2.3.1} \]

where j denotes the jth Newton-Raphson iteration, $\alpha$ is a scalar damping factor,
J(X) is the Jacobian matrix, and F(X) is the KCL error function. The iterative
process continues until $\|F(X^{j})\|$ is sufficiently small. Note that the most expensive
part of this algorithm in terms of CPU time and memory usage is the formulation,
inversion, and solution of the Jacobian matrix. Inverting the matrix is an $O(N^3)$
operation for an N x N matrix. There are several ways to reduce the impact of this
expensive process. Chang, Heron, and Steer introduced the block Jacobian method
[8], in which only blocks along the diagonal of the Jacobian were used. This allows
the individual blocks to be inverted rather than the entire Jacobian matrix. Due to
the structure of the Jacobian matrix, the linear contributions are all located inside
these blocks. They also used the Samanskii method, which involves reuse of the
same inverted Jacobian for more than one iteration. These methods are considered
to be best suited to mildly nonlinear systems due to the inherent inaccuracies of the
Jacobian.
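
The damped Newton update and the Samanskii-style reuse of a stale Jacobian can be sketched for the smallest possible case, a single unknown. The example below is an illustration under stated assumptions rather than the simulator's implementation: the one-node circuit (source, resistor, diode), its values, the starting point, and the `reuse` parameter are all hypothetical. With `reuse=1` the derivative is refreshed at every step (plain Newton); with a larger value it is held fixed for several steps.

```python
import math

def newton_diode(vs=5.0, r=1000.0, i_sat=1e-12, v_t=0.025852, v0=0.6,
                 alpha=1.0, reuse=1, tol=1e-12, max_iter=500):
    """Damped Newton iteration; the derivative is refreshed every `reuse` steps.

    Returns (solution, number of updates, number of Jacobian evaluations).
    """
    def f(v):            # KCL error at the single node
        return (v - vs) / r + i_sat * (math.exp(v / v_t) - 1.0)

    def jac(v):          # 1x1 "Jacobian" df/dv
        return 1.0 / r + (i_sat / v_t) * math.exp(v / v_t)

    v, jac_evals, j = v0, 0, None
    for iteration in range(max_iter):
        if abs(f(v)) < tol:
            return v, iteration, jac_evals
        if iteration % reuse == 0:       # refresh only every `reuse` iterations
            j = jac(v)
            jac_evals += 1
        v = v - alpha * f(v) / j         # update of Equation 2.3.1 with N = 1
    raise RuntimeError("Newton iteration did not converge")

for reuse in (1, 3):
    v, steps, evals = newton_diode(reuse=reuse)
    print(f"reuse={reuse}: v={v:.6f} V, {steps} updates, {evals} Jacobian evaluations")
```

Reusing the derivative trades extra, cheaper iterations for fewer Jacobian evaluations, which is the trade-off the block and Samanskii schemes exploit on full-size problems.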

Rizzoli used sparse matrix techniques to improve simulation performance. He
used a Jacobian template such that some specific matrix elements were automatically
set to zero so that specialized matrix solvers could be used [10]. While this technique
takes advantage of the faster solving capability, its ability to handle high power levels
seems suspect. Higher drive levels generally result in a more dense Jacobian, and
important information could be lost with this technique.

As for handling high power circuit simulations, several techniques have been
suggested. One of the most commonly used is the norm-reducing technique of Yeager
and Dutton [11], which adjusts the Newton update direction such that the resulting
Newton iteration step is an optimal one. Rizzoli used this technique in
conjunction with parametric modeling of nonlinear devices to extend the power handling
capabilities of his simulator [12] [13] [14]. The parametric modeling technique is
useful with the state variable approach described in [14]. Instead of defining the
system of unknowns to be strictly voltages or currents, the unknowns are chosen to
be parametric state variables that map to voltages or currents depending on their
value. For example, a parametric model of a pn junction diode can be defined as

\[ v(t) = \begin{cases} V_1 + \dfrac{1}{\alpha} \ln[1 + \alpha(x - V_1)] & \text{if } V_1 < x \\ x & \text{if } x \le V_1 \end{cases} \tag{2.3.2} \]

\[ i(t) = \begin{cases} I_s \exp(\alpha V_1)\,[1 + \alpha(x - V_1)] - I_s & \text{if } V_1 < x \\ I_s[\exp(\alpha x) - 1] & \text{if } x \le V_1 \end{cases} \tag{2.3.3} \]

where x is the parametric state variable and $V_1$ is a threshold voltage determined
empirically. Due to the diode's exponential nonlinearity, this type of modeling is
helpful in preventing wild guesses in the Newton solver which could occur at the
early stages of harmonic balance simulation.
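
A short numerical sketch shows why the parametric model tames the exponential. It assumes the piecewise form of Equations 2.3.2 and 2.3.3 with v = x below the threshold; the parameter values and the function name are hypothetical. Above the threshold the current grows only linearly in x, while (v, i) still lies on the ordinary diode curve.

```python
import math

def parametric_diode(x, i_sat=1e-12, alpha=38.7, v1=0.7):
    """Return (v, i) for state variable x using a piecewise parametric diode model.

    i_sat, alpha, and v1 are illustrative values, not taken from the source.
    """
    if x <= v1:
        v = x                                    # below threshold: x is the voltage
        i = i_sat * (math.exp(alpha * x) - 1.0)
    else:
        lin = 1.0 + alpha * (x - v1)             # linear extension beyond threshold
        v = v1 + math.log(lin) / alpha           # voltage grows only logarithmically
        i = i_sat * math.exp(alpha * v1) * lin - i_sat
    return v, i

for x in (0.5, 0.7, 1.0, 5.0):
    v, i = parametric_diode(x)
    print(f"x={x:4.1f}  v={v:.4f} V  i={i:.4e} A")
```

Even a wild Newton guess such as x = 5 maps to a moderate voltage and a finite current, instead of overflowing the exponential.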

Another problem to be addressed is the performance of Newton methods in
harmonic balance when a circuit with a large number of nonlinear elements is simulated.
The increase in CPU time with increasing problem size is significant due to the $O(N^3)$
process of direct inversion of the Jacobian. While the inexact Newton methods to
be discussed later seem best suited to this problem, Rizzoli has also proposed a
so-called "hierarchical harmonic-balance" technique [15]. The unknown variables are
subdivided into master and slave sets, and a two-level Newton iteration technique
is employed to solve the circuit. The applicability of this algorithm seems to be
dependent on the circuit topology.

2.4 Inexact Newton Methods

The main drawback to using Newton's method for solving Equation 2.1.2 is the cost
of inverting the Jacobian. Actually the Jacobian is usually not inverted, but LU
decomposed to solve

\[ J y = b \tag{2.4.1} \]

where b is the error vector for the given iterate. Once the LU factorization is
done, the same decomposed matrix can be reused as many times as desired in the
Samanskii method. Nevertheless, although the inverted Jacobian is not formed
explicitly, the decomposition of the Jacobian is still an $O(N^3)$ process.

2.4.1 Iterative Linear Solvers

To avoid this computational bottleneck, several authors [16], [17], [18] have
proposed using iterative linear solvers for solving Equation 2.4.1. Melville et al. first
proposed this in [16], describing the use of the QMR algorithm [19]. The advantage
of using iterative linear solvers for Equation 2.4.1 is that only vector-matrix
multiplication is necessary, such that the solving time increases only slightly faster than
linearly with an increasing number of unknowns. Melville forms the following equation
for Jacobian calculation:

\[ J = G P T P^{-1} + C P T D P^{-1} \tag{2.4.2} \]

where G is a diagonal matrix of time domain derivatives of nonreactive circuit
elements, C is a diagonal matrix of time domain derivatives of reactive elements, T
is a linear operator representing the time-to-frequency Fourier transform, D
represents the time differentiation operator, and P is a data permutation operator. For a
circuit with n nodes and N harmonic balance analysis frequencies, we can calculate
the order of operations necessary to solve Equation 2.4.1. Multiplications of the
diagonal matrices G and C with a vector can be accomplished in O(nN) operations.
Applications of T may be implemented through the FFT, costing O(nN log N)
operations. The time differentiation operator D as well as the P and $P^{-1}$ matrices
can be applied in O(nN) operations [16]. Thus, multiplications between the Jacobian
matrix and a vector require O(nN log N) operations.
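
The idea that the Jacobian never has to be formed to multiply it by a vector can be checked on a toy case. The sketch below treats only the conductance term for a single node, ignores the permutation operator P and the real/imaginary splitting, and uses a naive DFT in place of the FFT so that it needs nothing beyond the standard library; the function names and sample values are hypothetical. The matrix-free product (transform to time, multiply by g(t) sample by sample, transform back) is compared with the equivalent dense matrix.

```python
import cmath

def dft(x):
    """Naive DFT with 1/N scaling, standing in for an FFT."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)) / n
            for k in range(n)]

def idft(spec):
    """Inverse of dft(): rebuild time samples from the spectrum."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n))
            for m in range(n)]

def jacobian_times_vector(g_time, v_spec):
    """Apply the conductance part of the Jacobian to v_spec without forming it."""
    v_time = idft(v_spec)                               # frequency -> time
    i_time = [g * v for g, v in zip(g_time, v_time)]    # diagonal multiply in time
    return dft(i_time)                                  # time -> frequency

def explicit_jacobian(g_time):
    """Dense equivalent: entry (k, l) is the (k - l)-th spectral component of g(t)."""
    n = len(g_time)
    g_spec = dft(g_time)
    return [[g_spec[(k - l) % n] for l in range(n)] for k in range(n)]

g = [1.0, 2.0, 0.5, 3.0, 1.5, 0.25, 2.5, 1.0]    # sampled small-signal conductance
V = [complex(k + 1, -k) for k in range(8)]       # arbitrary test spectrum
fast = jacobian_times_vector(g, V)
J = explicit_jacobian(g)
dense = [sum(J[k][l] * V[l] for l in range(8)) for k in range(8)]
print(max(abs(a - b) for a, b in zip(fast, dense)))
```

With an FFT in place of the naive transforms, the matrix-free route costs O(N log N) per node, whereas the dense product costs O(N^2).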

The drawback of iterative linear methods, however, is that they do not converge
reliably. To assist in convergence, some sort of preconditioner is employed. Melville
used a preconditioner to modify Equation 2.4.1 to

\[ \tilde{J}^{-1} J x = \tilde{J}^{-1} b \tag{2.4.3} \]

which has the same solution. Ideally, the preconditioner $\tilde{J}$ should be a good
approximation to J and also easy to invert.
Instead of solving Equation 2.4.3, he solved

\[ J z = b. \tag{2.4.4} \]

Melville reported the use of the Jacobian matrix linearized around the dc operating
point of the circuit as the preconditioner. This can be easily found from the circuit
admittance matrix, which must be formed at the outset of the harmonic balance
simulation for calculation of the linear response at each Newton iteration. The
Jacobian matrix is organized such that Jacobian contributions from the linear circuit
are located in N blocks along the diagonal when N analysis frequencies are used.
Thus, this preconditioner may be inverted block-wise, which is significantly less
expensive than a full
matrix inversion, especially for a circuit with a large number of unknowns. Melville
reported that this technique was best suited for large circuits operating in a mildly
nonlinear regime.
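
The benefit of a block-diagonal preconditioner that is inverted block by block can be seen on a small dense system. The sketch below deliberately swaps in a preconditioned Richardson iteration for QMR or GMRES, because it fits in a few lines of standard-library Python; the 4 x 4 matrix, the 2 x 2 block size, and all function names are hypothetical. The off-diagonal blocks are small, mimicking a mildly nonlinear regime.

```python
def inv2(block):
    """Invert a 2x2 block given as [[a, c], [d, e]]."""
    (a, c), (d, e) = block
    det = a * e - c * d
    return [[e / det, -c / det], [-d / det, a / det]]

def apply_block_inverse(blocks_inv, r):
    """Multiply r by the inverse of a block-diagonal matrix, one 2x2 block at a time."""
    out = []
    for k, binv in enumerate(blocks_inv):
        r0, r1 = r[2 * k], r[2 * k + 1]
        out += [binv[0][0] * r0 + binv[0][1] * r1,
                binv[1][0] * r0 + binv[1][1] * r1]
    return out

def preconditioned_richardson(J, b, tol=1e-12, max_iter=100):
    """Solve J x = b with x <- x + P^-1 (b - J x), P = block diagonal of J."""
    n = len(b)
    blocks_inv = [inv2([[J[i][i], J[i][i + 1]], [J[i + 1][i], J[i + 1][i + 1]]])
                  for i in range(0, n, 2)]
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        r = [b[i] - sum(J[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) < tol:
            return x, iteration
        dx = apply_block_inverse(blocks_inv, r)      # only small blocks are inverted
        x = [xi + di for xi, di in zip(x, dx)]
    raise RuntimeError("no convergence")

J = [[4.0, 1.0, 0.2, 0.1],
     [1.0, 3.0, 0.1, 0.2],
     [0.2, 0.1, 5.0, 1.0],
     [0.1, 0.2, 1.0, 4.0]]
b = [1.0, 2.0, 3.0, 4.0]
x, iterations = preconditioned_richardson(J, b)
print(x, iterations)
```

As the off-diagonal blocks grow relative to the diagonal ones, the iteration slows and eventually fails, which mirrors the observation that a purely block-diagonal preconditioner suits mildly nonlinear circuits.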

In [16], Feldmann and Melville extended their technique to handle more general
nonlinear circuits. In fact, the main addition to their earlier Krylov subspace based
algorithm was to improve on their preconditioning matrix. Instead of using the
linear Jacobian contributions only, the nonlinear contributions which appear in the
diagonal blocks of the Jacobian are used as well. It is reported here that for stronger
nonlinearities, some information in off-diagonal blocks must also be included in the
preconditioner, resulting in the loss of the block-diagonal structure. While the
block-diagonal structure is very helpful in factorization, the matrix is still much more
sparse than the original Jacobian matrix, resulting in performance improvements.

Rizzoli also explored the use of inexact Newton methods for large microwave
circuits [18]. He contrasts the exact and inexact Newton methods by writing the
exact Newton update

\[ X_{i+1} = X_i + n_i \tag{2.4.5} \]

where $n_i$ is the exact Newton update. For a large number of circuit unknowns N,
Rizzoli states that it may be more efficient to use an inexact Newton update $s_i$, such
that

\[ \|E(X_i) + J(X_i) s_i\| \le f_i \|E(X_i)\| \tag{2.4.6} \]

and

\[ X_{i+1} = X_i + s_i \tag{2.4.7} \]

where $f_i$ ($0 \le f_i < 1$) is a forcing term. When $f_i = 0$, the update reduces to
the exact update given in Equation 2.4.5. Otherwise, $f_i$ serves as a measure of
how much the inexact Newton update differs from the exact update. In Rizzoli's
implementation, the choice of $f_i$ is updated at each iteration. He reports using an
initial $f_0 = 0.5$ and calculating subsequent values via the equation

\[ f_i = \frac{\|E(X_i) - E(X_{i-1}) - J(X_{i-1}) s_{i-1}\|}{\|E(X_{i-1})\|} \tag{2.4.8} \]

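
The forcing-term update and the acceptance test can be written as two small helper functions. This is a sketch assuming the forms of Equations 2.4.6 and 2.4.8 as reconstructed here; the function names and the scalar test problem E(x) = x^2 - 2 are hypothetical. The numerator of the forcing term measures how far the true error at the new point departs from its linear prediction, so it vanishes for a linear system.

```python
import math

def norm(v):
    """Euclidean norm of a list of real numbers."""
    return math.sqrt(sum(c * c for c in v))

def forcing_term(e_now, e_prev, j_prev_s_prev):
    """Mismatch between E(X_i) and its linear prediction, normalised by |E(X_{i-1})|."""
    num = [a - b - c for a, b, c in zip(e_now, e_prev, j_prev_s_prev)]
    return norm(num) / norm(e_prev)

def accepts(e, j_s, f):
    """Inexact Newton condition: is the linear-model residual at most f * |E|?"""
    return norm([a + b for a, b in zip(e, j_s)]) <= f * norm(e)

# Scalar illustration: E(x) = x^2 - 2, J(x) = 2x, step s = 0.4 taken from x = 1.
x_prev, s = 1.0, 0.4
e_prev = [x_prev ** 2 - 2.0]
j_s = [2.0 * x_prev * s]
e_now = [(x_prev + s) ** 2 - 2.0]
print(accepts(e_prev, j_s, 0.5))             # step is acceptable for f_0 = 0.5
print(forcing_term(e_now, e_prev, j_s))      # next forcing term, close to s**2
```

A small forcing term asks the linear solver for an accurate step only when the linear model is already predicting the nonlinear error well.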
Once the forcing term $f_i$ is chosen, the Newton equation is approximately solved
until Equation 2.4.6 is satisfied. The GMRES iterative solver was chosen to perform
this approximate solution. As with the work presented by Melville and Feldmann,
the convergence properties of the iterative solver were improved by selecting an
appropriate preconditioner. Rizzoli replaces the exact Newton equation

\[ J(X_i) n_i = -E(X_i) \tag{2.4.9} \]

with

\[ J(X_i) P_i^{-1} P_i n_i = -E(X_i) \tag{2.4.10} \]

where $P_i$ is a nonsingular matrix of rank N. Similar to Melville and Feldmann, Rizzoli
stresses the importance of $P_i$ being a close approximation to J which is easier to
invert. Once a preconditioner has been defined, the initial guess for $s_i$ is defined
by

\[ s_i^{(0)} = -P_i^{-1} E(X_i) \tag{2.4.11} \]

and a set of real vectors of length N is defined:

\[ \begin{aligned} K_i^{(1)} &= \left[ I_N - J(X_i) P_i^{-1} \right] E(X_i) \\ K_i^{(q)} &= J(X_i) P_i^{-1} K_i^{(q-1)} \quad (q > 1) \end{aligned} \tag{2.4.12} \]

where $I_N$ is the identity matrix. The vector space spanned by the vectors $K_i^{(q)}$,
$1 \le q \le Q$, is called a Krylov subspace of dimension Q. The Qth-order approximation
of $s_i$ is then given by

\[ s_i^{(Q)} = s_i^{(0)} + P_i^{-1} \sum_{q=1}^{Q} \alpha_q K_i^{(q)} \tag{2.4.13} \]

making the corresponding residual

\[ r_i^{(Q)} = E(X_i) + J(X_i) s_i^{(Q)} = K_i^{(1)} + \sum_{q=1}^{Q} \alpha_q K_i^{(q+1)}. \tag{2.4.14} \]

A least squares method is used to find the $\alpha_q$ coefficients such that $\|r_i^{(Q)}\|$ is
minimized. If Equation 2.4.6 is satisfied, $s_i^{(Q)}$ is taken as the approximate Newton
update. For sufficiently large Q this is guaranteed to be the case since
$\lim_{Q \to \infty} s_i^{(Q)} = n_i$ [20]. Rizzoli reports using a smaller-dimension Krylov
subspace for the first few approximate updates, gradually increasing the dimension as
the final solution is approached [21], and states that generally Q < 50 is sufficient
for most harmonic balance problems [22].
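
The construction above can be traced by hand for the lowest order, Q = 1, where the least squares problem has a closed-form solution for the single coefficient. The sketch below assumes the reconstructed forms of Equations 2.4.11 to 2.4.14, a diagonal preconditioner, and a hypothetical 2 x 2 Jacobian; function and variable names are illustrative. It checks that the residual built from the Krylov vectors matches E + J s and is no larger than the residual of the initial guess.

```python
def matvec(a, v):
    """Dense matrix-vector product."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def krylov_update_q1(jac, p_inv_diag, e):
    """First-order (Q = 1) approximate Newton update with a diagonal preconditioner."""
    def apply_p_inv(v):
        return [d * c for d, c in zip(p_inv_diag, v)]

    s0 = [-c for c in apply_p_inv(e)]                   # initial guess, -P^-1 E
    k1 = [a + b for a, b in zip(e, matvec(jac, s0))]    # K(1) = [I - J P^-1] E
    pk1 = apply_p_inv(k1)
    k2 = matvec(jac, pk1)                               # K(2) = J P^-1 K(1)
    alpha = -dot(k2, k1) / dot(k2, k2)                  # minimises |K(1) + alpha K(2)|
    s1 = [a + alpha * b for a, b in zip(s0, pk1)]       # first-order update
    r1 = [a + alpha * b for a, b in zip(k1, k2)]        # corresponding residual
    return s0, s1, k1, r1

jac = [[4.0, 1.0], [2.0, 3.0]]
e = [1.0, -2.0]
s0, s1, k1, r1 = krylov_update_q1(jac, [1.0 / 4.0, 1.0 / 3.0], e)
print("residual of initial guess:", dot(k1, k1) ** 0.5)
print("residual after Q = 1     :", dot(r1, r1) ** 0.5)
```

Higher orders add one Krylov vector and one coefficient per step, and the single-coefficient formula is replaced by a small least squares solve.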

Telichevesky et al. also have reported using iterative methods for solving
Equation 2.4.1 [23]. They point out that if GMRES is used for solving a linear system, the
matrix representing the system need not be explicitly formed as only matrix-vector
products are necessary. Again, the need for a preconditioner to help in convergence
for this method is stressed. The block Jacobian as a preconditioner is reported
as being very effective for mildly nonlinear problems. However, the authors claim
that for more strongly nonlinear systems, the loss of information when off-diagonal
blocks are discarded often is too great to be overcome. In this situation, the authors
use higher order finite-difference techniques in the time domain to precondition the
Jacobian information.

2.4.2 Incomplete LU Decomposition

The use of Incomplete LU (ILU) decomposition in circuit simulators has been
reported by Eickhoff and Engl [24]. While the application in this paper is not
harmonic balance simulation, the problem of solving a circuit with many nodes via
approximate Newton techniques is discussed. The Jacobian matrix is split such
that J = Q + R where

\[ Q = L U \tag{2.4.15} \]

and remainder $R \ne 0$. The ILU factors L and U usually are defined as the standard
LU decomposition without any of the fill-in elements generated during the
decomposition process. Eickhoff and Engl have defined a method in which some of the fill-ins
are used. The "level" of a given fill-in element is defined by the manner in which
it was generated. The main diagonal entries are assigned level 0, while off-diagonal
entries are assigned level 1. A fill-in element generated from two off-diagonal entries
is assigned level 2, and so on. The level assigned to a fill-in element is equal to the
sum of the levels of its parents. The maximum level L of entries allowed then defines
an L-level ILU scheme.
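
The level rule can be exercised symbolically, without any numerical values, by tracking only which positions are kept. The sketch below is an illustration of the rule as stated in the text (diagonal entries level 0, original off-diagonal entries level 1, fill-in level equal to the sum of its parents' levels); the function name and the 4 x 4 "arrow" sparsity pattern are hypothetical. A fill-in is recorded only if its level does not exceed the allowed maximum.

```python
def ilu_levels(pattern, n, max_level):
    """Symbolic L-level ILU: return {(i, j): level} for the entries that are kept."""
    level = {(i, j): (0 if i == j else 1) for (i, j) in pattern}
    for k in range(n):                                   # eliminate column k
        rows = [i for i in range(k + 1, n) if (i, k) in level]
        cols = [j for j in range(k + 1, n) if (k, j) in level]
        for i in rows:
            for j in cols:
                new = level[(i, k)] + level[(k, j)]      # sum of the parents' levels
                if new <= max_level and new < level.get((i, j), max_level + 1):
                    level[(i, j)] = new                  # keep this fill-in
    return level

n = 4
# Arrow pattern: full first row and column plus the diagonal (worst case for fill-in).
arrow = ({(i, i) for i in range(n)} | {(0, j) for j in range(n)}
         | {(j, 0) for j in range(n)})
print(len(ilu_levels(arrow, n, 1)), "entries kept with L = 1")
print(len(ilu_levels(arrow, n, 2)), "entries kept with L = 2")
```

For this pattern, L = 1 keeps only the original nonzeros, while L = 2 admits every fill-in created by eliminating the first column, so the factors become full.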

The ILU method then is a technique in which some Jacobian matrix elements are
neglected in the solving process. In contrast to the preconditioning schemes already
mentioned, this technique can be thought of as preconditioning applied during the
matrix decomposition stage. Determining the appropriate ILU level corresponds to
finding the appropriate preconditioner.

2.5 Summary

The harmonic balance technique is a mature algorithm for solving nonlinear high
frequency circuits. The challenges of this technique are those problems involving a
large number of nonlinear devices, strongly nonlinear devices, and circuits driven at
high power levels. Common to all these challenges is the problem of solving a large
nonlinear system of equations via some form of Newton's method. As this iterative
process progresses, a large linear system of equations must be solved multiple times
along the way. Whether these linear problems are solved directly or indirectly, the
most successful methods employ some sort of preconditioner to approximate the
Jacobian matrix. A preconditioner must be found that is sufficiently close to the
actual Jacobian, but at the same time is easier to invert. Finding the appropriate
preconditioner is of the utmost importance to all the techniques discussed.
Chapter 3
Development of a Harmonic Balance Simulator
The harmonic balance techniques developed in this study have been implemented
in a simulator called TRANSIM. This simulator has the ability to perform several
different types of analysis, including both a Newton-based and an iterative linear
solver-based harmonic balance analysis. The focus of this chapter is the
development of a Newton-based harmonic balance simulator, but many of the numerical
techniques have been implemented in an iterative linear solver as well.

3.1 Newton-based Harmonic Balance Analysis

As discussed in Chapter 2, the nonlinear circuit is first partitioned into its linear and
nonlinear subcircuits. This results in a nonlinear subcircuit with N ports, where
N is the number of nonlinear terminals not connected to ground, or the common
node of the circuit. Each of the N ports connects to a terminal of a device of the
nonlinear subcircuit.

The DC solution at the interface nodes is usually found first and used as the
initial guess for the interface state variables. Since no FFT calculations are required
and the number of state variables is greatly reduced, the DC solution is generally
easy to find through Newton's method.

The linear response to each interface voltage is found by matrix-vector
multiplication

\[ I_L = Y V \tag{3.1.1} \]

where $I_L$ is the vector of linear currents, Y is the modified nodal admittance matrix,
and V is the vector of interface voltages. In order to find the nonlinear response to
the interface voltages, an inverse FFT is first performed on each frequency-domain
voltage to obtain the time-domain vector v(t), which contains the time-domain
voltage values at time points evenly spaced through one period of the fundamental
frequency. The time-domain nonlinear currents in i(t) are then found through
evaluation of the nonlinear device models and converted back to the frequency domain
via the FFT. The Kirchhoff's current law (KCL) error function is then defined in the
frequency domain as

\[ F(V) = I_L(V) + I_{NL}(V). \tag{3.1.2} \]

This nonlinear system of equations must then be solved such that

\[ \|F(V)\| < \epsilon \tag{3.1.3} \]

for some sufficiently small user-defined $\epsilon$.
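
The evaluation of the KCL error described above can be sketched for a single interface node. The example is a simplification under stated assumptions, not TRANSIM's implementation: the circuit (a voltage source behind a resistor driving an exponential diode), its values, and the function names are hypothetical; a two-sided complex spectrum is used instead of separate real and imaginary parts; the source is folded into the linear current; and a naive DFT stands in for the FFT.

```python
import cmath
import math

def dft(x):
    """Naive DFT with 1/N scaling (an FFT would be used in practice)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)) / n
            for k in range(n)]

def idft(spec):
    """Inverse of dft(): rebuild the time samples from the spectrum."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n))
            for m in range(n)]

def kcl_error(v_spec, vs_spec, r=50.0, i_sat=1e-12, v_t=0.025852):
    """F(V) = I_L(V) + I_NL(V) for one node: source and resistor feeding a diode."""
    i_lin = [(vk - sk) / r for vk, sk in zip(v_spec, vs_spec)]     # linear, per frequency
    v_time = [z.real for z in idft(v_spec)]                        # inverse transform to v(t)
    i_time = [i_sat * (math.exp(v / v_t) - 1.0) for v in v_time]   # device model in time
    i_nl = dft(i_time)                                             # back to frequency domain
    return [a + b for a, b in zip(i_lin, i_nl)]

n = 8
V = [0.5 + 0j] + [0j] * (n - 1)      # trial solution: DC component only
Vs = [1.0 + 0j] + [0j] * (n - 1)     # DC source
F = kcl_error(V, Vs)
print("error norm:", math.sqrt(sum(abs(f) ** 2 for f in F)))
```

A Newton solver would call such a function repeatedly, adjusting V until the error norm falls below the user-defined tolerance.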

3.1.1 Forming the Jacobian Matrix

The Jacobian matrix consists of derivatives of the error function in Equation 3.1.2
with respect to the unknown interface quantities V. The derivatives of $I_L$ with
respect to V axe easily found through E q u a tio n 3.1.1. T h e derivative of a linear
current w ith respect to th e interface voltages will be the linear a d m itta n c e seen at
th e corresponding p o rt of the linear circuit. T his inform ation is readily available
in th e modified nodal ad m ittan ce m atrix . T h e nonlinear Jaco b ian contributions
< 9 In l/# V axe not as straightforw ard to calculate. As reported in [7], th e frequency
dom ain derivatives can be found directly from th e tim e-dom ain derivatives and
the F F T . The real a n d im aginary com ponents of the com plex variables are stored
separately as real num bers, requiring the co m p u tatio n of four derivatives to describe
th e relationship betw een current and voltage a t any given frequency. For example,
th e derivative d ’L /d 'V w here I and V are com plex quantities is represented by the
m atrix
332(1)
332(V)
33(1)
332(V)
332(1)
33(V)
33(1)
33(V )
(3.1.4)
w here 3? and Sr represent th e real and im ag in ary parts, respectively. N ote th a t DC
q uantities are stric tly real and therefore do n ot require as m uch storage space in the
Jacobian. For exam ple, for DC current q u an tities, the derivatives of th e im aginary
current are not necessary, but the derivatives of the real cu rren t w ith respect to
b oth real and im aginary AC voltage contributions m ust be found. O f course, only
one num ber is required to represent the derivative of a DC q u a n tity w ith respect to
an o th er DC quantity, since both are real num bers.
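As a worked illustration of the four real derivatives (a sketch that is not taken from the source), assume a purely linear element whose complex admittance at one frequency is $y = g + jb$; the symbols $g$ and $b$ are introduced here only for the example. Writing out $I = yV$ in real and imaginary parts gives:

```latex
\begin{aligned}
\Re(I) &= g\,\Re(V) - b\,\Im(V), & \Im(I) &= b\,\Re(V) + g\,\Im(V),\\[1ex]
\frac{\partial \Re(I)}{\partial \Re(V)} &= g, &
\frac{\partial \Im(I)}{\partial \Re(V)} &= b,\\[1ex]
\frac{\partial \Re(I)}{\partial \Im(V)} &= -b, &
\frac{\partial \Im(I)}{\partial \Im(V)} &= g.
\end{aligned}
```

In this linear case only two distinct numbers appear. A nonlinear element generally produces four independent entries, which is why all four derivatives are stored.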
For a circuit with N interface nodes, there are N unknown DC quantities. If there
are M - 1 AC analysis frequencies, there are an additional 2(M - 1) unknown AC
quantities per node, since the AC quantities are expressed in terms of real and
imaginary parts. Thus, for a circuit with N interface nodes and M analysis frequencies
including DC, there will be 2N(M - 1) + N or N(2M - 1) unknown quantities.
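The count above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `count_unknowns` is hypothetical and does not come from TRANSIM.

```python
def count_unknowns(n_nodes: int, n_freqs: int) -> int:
    """Number of real unknowns for n_nodes interface nodes and
    n_freqs analysis frequencies (DC included in n_freqs)."""
    dc_unknowns = n_nodes                      # one real value per node at DC
    ac_unknowns = 2 * n_nodes * (n_freqs - 1)  # real and imaginary part per AC frequency
    return dc_unknowns + ac_unknowns


# Nine unknowns and eight analysis frequencies, as in the distributed amplifier example
print(count_unknowns(9, 8))
```

For nine interface quantities and eight analysis frequencies the result equals 9 x (2 x 8 - 1) = 135.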
3.2 Improvements in Numerical Techniques for Harmonic Balance
Once the Jacobian has been calculated, the Newton update can be obtained through
Equation 2.3.1. However, since this equation requires the inversion of the Jacobian,
it can be quite expensive to evaluate directly. Also, the extensive use of the FFT in
the calculation of both the Jacobian and the nonlinear system response can introduce
numerical error to the simulation. Both of these situations point out the need for
fast and accurate numerical techniques for harmonic balance simulation.
Several numerical techniques have been implemented to improve both the accuracy
and runtime of the simulator. For improvement of simulation speed and
memory storage requirements, a sparse matrix package has been implemented in
TRANSIM. The simulator also has the capability to reuse the decomposed Jacobian
matrix multiple times to further improve simulation runtime. Additionally, the
state-variable methods discussed in Chapter 2 are used in TRANSIM to improve
the convergence abilities of the simulator. To avoid the numerical noise introduced
by repeated use of the FFT, a technique called oversampling is used. Finally, there
are several different matrix preconditioning techniques available in TRANSIM. A
detailed description of these techniques follows.
3.2.1 Sparse Matrix Techniques
Since the Jacobian matrix can be excessively large, a sparse matrix package was
implemented in TRANSIM. This package stores only the non-zero entries with their
row and column indices. Obviously, one of the advantages to this approach is a
reduction in the amount of memory necessary to store the Jacobian, but there are
also advantages in terms of calculation speed. No operations are performed on the
zero-valued matrix components as they do not exist in memory. In general, the
overhead associated with allocating and managing the sparse matrix structures is
trivial compared to the improvements in runtime and memory storage.
For example, a distributed amplifier circuit with nine unknown variables and
eight analysis frequencies has a Jacobian matrix with rank 135 (N(2M - 1)). The
number of nonzero elements for the Jacobian is 2822, requiring the storage of 2822
double precision quantities along with 2822 row indices and 2822 column indices.
Each double precision number requires eight bytes, and each integer requires four,
resulting in a total of 45152 bytes to hold all the Jacobian information. If the
zero-valued entries were also stored in memory, we would need space for 135 x
135 = 18225 double precision numbers, or 145800 bytes. This means that a dense
matrix would need more than three times as much memory as the sparse matrix
requires.
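The storage arithmetic in this example can be reproduced as follows. The sketch assumes the layout described above (one 8-byte value plus a 4-byte row index and a 4-byte column index per nonzero entry); the function names are hypothetical.

```python
def sparse_bytes(nnz: int, value_bytes: int = 8, index_bytes: int = 4) -> int:
    """Memory for a coordinate-style sparse matrix: each nonzero entry
    stores its value, a row index, and a column index."""
    return nnz * (value_bytes + 2 * index_bytes)


def dense_bytes(order: int, value_bytes: int = 8) -> int:
    """Memory for a dense square matrix of the given order."""
    return order * order * value_bytes


sparse = sparse_bytes(2822)   # 2822 nonzero Jacobian entries
dense = dense_bytes(135)      # full 135 x 135 matrix
print(sparse, dense, dense / sparse)
```

The ratio comes out slightly above three, consistent with the statement that the dense matrix needs more than three times the memory.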
The sparse matrix package in TRANSIM also contains optimized routines for
LU decomposition and solving. The total simulation runtime for the distributed
amplifier circuit using sparse matrices is about 26 seconds on a Sun Microsystems
Ultra 1 workstation. When standard matrix solving techniques are used, the runtime
is about one minute. This runtime improvement is due exclusively to the use of
sparse matrices.
3.2.2 Oversampling
As reported in [9], aliasing error can be introduced into harmonic balance simulation
due to the truncated spectrum used in finding the steady-state solution. Frequency
domain voltages are transformed via the FFT to the time domain and applied to
the nonlinear circuit elements to determine the nonlinear current response. If an
insufficient number of harmonics is used in the FFT, the voltages applied to the
nonlinear elements will be incorrect, causing the current response to be incorrect.
This alters the error surface so that the simulation may be inaccurate. Additionally,
even if the voltage is accurately expressed with one set of analysis frequencies, the
nonlinear current response may be inaccurately represented due to a larger required
bandwidth.
This is particularly important when highly nonlinear circuits are simulated at
high drive levels. One way to avoid this problem is to increase the number of analysis
frequencies, but this will increase the size of the Jacobian matrix. Since we need
to invert or at least approximately invert the Jacobian, which is an O(N^3) process,
increasing the number of analysis frequencies can be quite expensive. However,
the oversampling technique implemented in TRANSIM increases only the number
of frequencies used during the FFT. The quantities to be transformed via FFT
are padded with zeros. For example, if there are eight analysis frequencies, one
might oversample by a factor of two. The quantity being transformed would have
frequencies nine through sixteen set to zero. The time-domain quantity then would
have twice as many timepoints, providing a more accurate waveform for time-domain
simulation. When the time-domain response is transformed back to the sixteen
analysis frequencies, only the first eight frequencies are considered for calculating
the harmonic balance error and Jacobian.
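The zero-padding procedure described above can be sketched with NumPy. This is an illustration under stated assumptions, not TRANSIM's implementation: the function name `nonlinear_spectrum`, the one-sided Fourier-coefficient convention, and the memoryless `device` callable are all choices made for the example.

```python
import numpy as np


def nonlinear_spectrum(V, device, oversample=2):
    """Return the first len(V) Fourier coefficients of device(v(t)).

    V[k] is the complex Fourier series coefficient at harmonic k (V[0] is DC),
    so that v(t) = sum over k of V[k] * exp(j*k*w*t) plus complex conjugates.
    device is a memoryless function applied sample by sample (an assumption).
    """
    V = np.asarray(V, dtype=complex)
    n_freqs = len(V)
    n_padded = oversample * n_freqs          # frequencies used in the FFT only
    V_padded = np.zeros(n_padded, dtype=complex)
    V_padded[:n_freqs] = V                   # extra harmonics are set to zero

    n_time = 2 * (n_padded - 1)              # time points over one period
    v_t = np.fft.irfft(V_padded, n=n_time) * n_time   # undo NumPy's 1/n scaling
    i_t = device(v_t)                        # evaluate the nonlinearity in time
    I_padded = np.fft.rfft(i_t) / n_time     # back to Fourier coefficients

    return I_padded[:n_freqs]                # keep only the analysis frequencies


# v(t) = cos(wt) driving a cubic nonlinearity; cos^3 = 3/4 cos(wt) + 1/4 cos(3wt)
V = np.array([0.0, 0.5, 0.0])
cubic = lambda v: v ** 3
print(nonlinear_spectrum(V, cubic, oversample=1)[1])  # third harmonic aliases onto the first
print(nonlinear_spectrum(V, cubic, oversample=2)[1])  # aliasing removed
```

With three analysis frequencies and no oversampling, the third harmonic generated by the cubic term folds back onto the fundamental. Padding by a factor of two gives the FFT enough bandwidth to hold it, and the truncated result is then the correct coefficient 3/8.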
3.2.3 The State Variable Approach
Thus far the harmonic balance technique has been discussed in terms of the voltages
being the unknown quantities and the resulting linear and nonlinear currents
forming the error function. As reported in Chapter 2, the state variables need not
be restricted to node voltages only. For strongly nonlinear devices such as diodes, a
parametric model may be used that defines both voltage and current as a function
of a state variable $x$. A parametric model would be defined as
\[ v(t) = u\left[x(t), \frac{dx}{dt}, \ldots, \frac{d^n x}{dt^n}, x_D(t)\right] \tag{3.2.1} \]
\[ i(t) = w\left[x(t), \frac{dx}{dt}, \ldots, \frac{d^n x}{dt^n}, x_D(t)\right] \tag{3.2.2} \]
where $x_D(t)$ is a vector of time-delayed state variables. An example of a parametric
model was given in Equation 2.3.2 for a pn junction diode. Note that for high
values of the state variable, the device current increases linearly while the voltage
increases logarithmically. The current-voltage relationship remains the same, but
the relationship between the unknown state variable and the device current is no
longer exponential.
TRANSIM uses state variables to define all its nonlinear element models. For
some devices such as nonlinear resistors the state variable is merely the device
voltage, while for others such as diodes, a parametric model is used. The main
benefit of using the parametric model is that it can eliminate wild guesses in the
early stages of the simulation.
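The behavior described for the diode can be illustrated with a piecewise parametric model. The form below is a commonly used parametrization and is an assumption here; it may differ in detail from Equation 2.3.2, and the parameter names and values (`i_s`, `alpha`, `v1`) are hypothetical.

```python
import math


def diode_parametric(x, i_s=1e-14, alpha=38.7, v1=0.5):
    """Map a state variable x to (voltage, current) for an ideal pn junction.

    Below the breakpoint v1 the state variable is the junction voltage.
    Above it, current grows linearly in x and voltage grows logarithmically,
    chosen so that i = i_s * (exp(alpha * v) - 1) still holds exactly.
    """
    if x <= v1:
        v = x
        i = i_s * (math.exp(alpha * x) - 1.0)
    else:
        lin = 1.0 + alpha * (x - v1)          # linear growth term
        v = v1 + math.log(lin) / alpha        # logarithmic voltage
        i = i_s * (math.exp(alpha * v1) * lin - 1.0)
    return v, i


for x in (0.3, 0.5, 1.0, 50.0):
    v, i = diode_parametric(x)
    print(x, v, i)
```

Even a very large trial value of the state variable maps to a moderate junction voltage and a finite current, which is how the parametric form avoids the overflow that an exponential in the voltage would produce for a poor Newton guess.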
3.2.4 The Chord Method
Because calculation and inversion of the Jacobian is computationally expensive, it is
desirable to avoid this cost whenever possible. Frequently, the Jacobian need not be
recalculated at the next iteration of Newton's method. The reuse of the calculated
and inverted Jacobian matrix is called the chord method. This method can greatly
expedite the simulation of mildly nonlinear circuits. The tradeoff in accuracy of the
Jacobian can become significant for more strongly nonlinear circuits, requiring more
frequent Jacobian recalculation.
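A minimal sketch of the chord method follows. The refresh rule used here (recompute the Jacobian when a step reduces the residual norm by less than a chosen fraction) is one possible policy and is an assumption of the example, as are the function name and the use of an explicit inverse in place of stored LU factors.

```python
import numpy as np


def chord_newton(F, J, x0, tol=1e-10, min_reduction=0.05, max_iter=200):
    """Newton iteration that reuses an old Jacobian inverse (chord method).

    F(x) returns the residual vector and J(x) the Jacobian matrix.
    Returns the solution estimate and the number of Jacobian evaluations.
    """
    x = np.asarray(x0, dtype=float)
    J_inv = np.linalg.inv(J(x))       # stands in for a stored LU decomposition
    n_jacobians = 1
    r = F(x)
    err = np.linalg.norm(r)
    for _ in range(max_iter):
        if err < tol:
            break
        x = x - J_inv @ r             # chord step with the stored Jacobian
        r = F(x)
        new_err = np.linalg.norm(r)
        if new_err > (1.0 - min_reduction) * err:
            # progress has stalled: refresh the Jacobian at the current point
            J_inv = np.linalg.inv(J(x))
            n_jacobians += 1
        err = new_err
    return x, n_jacobians


# Small test system with a root at (1, 2)
F = lambda x: np.array([x[0] ** 2 + x[1] - 3.0, x[0] + x[1] ** 2 - 5.0])
J = lambda x: np.array([[2.0 * x[0], 1.0], [1.0, 2.0 * x[1]]])
solution, n_jacobians = chord_newton(F, J, [1.2, 1.8])
print(solution, n_jacobians)
```

For a mildly nonlinear problem such as this one, the iteration typically converges with very few Jacobian evaluations, at the price of more (but much cheaper) iterations than full Newton.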
3.2.5 Jacobian Approximation and Preconditioning Techniques
The most computationally expensive part of harmonic balance analysis is solving
a large linear system of equations at each iteration of Newton's method. Several
authors previously mentioned have used preconditioners to approximate the Jacobian
for iterative linear techniques. The same sort of approximations can also be
made for Newton-based simulation. The selection of an appropriate preconditioner
is of great importance for reducing the amount of computing resources needed for
simulation.
In TRANSIM, selecting the preconditioner or approximate Jacobian amounts to
selecting which Jacobian elements to eliminate before matrix decomposition. Several
different schemes have been implemented, but all of these schemes affect only the
nonlinear contributions to the Jacobian. The linear contributions are necessary
for simulation convergence and are never eliminated in any of the preconditioning
schemes.
For the purposes of this discussion, the term "matrix sparsity" will be used to
quantify the percentage of matrix elements which will be ignored by an approximation
scheme. If a given approximation uses 25 percent of the possible matrix entries,
the matrix sparsity is said to be 75 percent. If only 10 percent of all matrix entries
are used, the sparsity is 90 percent, and so on.
Using only linear contributions to the Jacobian
One preconditioning technique is to use only the linear contributions to the Jacobian.
Since the linear contributions to the Jacobian are constant with respect to
the values of the unknowns at the interface nodes, the Jacobian need be calculated
and decomposed only once at the outset of simulation. Additionally, the nonlinear
Jacobian entries need not be calculated at all. As explained earlier, the nonlinear
Jacobian entries are calculated via the FFT, so removing the need for these calculations
can significantly improve simulation runtime. The decomposed matrix can be
reused via the chord method at every iteration of Newton's method. Also, because
all the Jacobian information is stored in blocks along the diagonal, the blocks can
be decomposed individually, saving time for this O(N^3) process.
For example, the distributed amplifier circuit with nine unknowns and eight
analysis frequencies produces a Jacobian of rank 135. The linear Jacobian contributions
are located in eight blocks along the diagonal. The first block contains the DC
Jacobian contributions and has rank 9. The other seven blocks each are rank 18.
Making the approximation that the LU decomposition time for a matrix is equal to
KN^3, where K is a constant, we can estimate the runtime improvement made by
inverting the matrix block-wise. Table 3.2.1 shows the runtime for each of the eight
blocks along with the runtime for decomposing the entire matrix. The block-wise
method of decomposing the matrix is roughly 98 percent faster than decomposing
the entire matrix.
Block No.                Runtime
1                        K(9^3)
2                        K(18^3)
3                        K(18^3)
4                        K(18^3)
5                        K(18^3)
6                        K(18^3)
7                        K(18^3)
8                        K(18^3)
total (block-wise)       K(41553)
total (entire matrix)    K(2460375)
Table 3.2.1: Relative Approximate Runtime for LU Decomposition of a Rank 135
Jacobian Matrix
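The totals in Table 3.2.1 follow directly from the cubic cost model. The short check below assumes only that model (cost proportional to the cube of the block order, in units of the constant K); the names are hypothetical.

```python
def lu_cost(order: int) -> int:
    """Relative LU decomposition cost, in units of the constant K."""
    return order ** 3


# One DC block of order 9 and seven AC blocks of order 18
block_orders = [9] + [18] * 7

blockwise_cost = sum(lu_cost(order) for order in block_orders)
full_cost = lu_cost(sum(block_orders))        # entire 135 x 135 matrix
saving = 1.0 - blockwise_cost / full_cost

print(blockwise_cost, full_cost, round(100 * saving, 1))
```

The block-wise cost is under two percent of the full-matrix cost, which is the basis for the "roughly 98 percent" figure quoted above.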
This technique is easily implemented and can work for weakly nonlinear circuits,
but it is not robust. Circuits with strong nonlinearities require at least some
nonlinear information in the Jacobian for convergence of Newton's method to be
achieved.
Relaxation Method
Another technique for approximating the Jacobian matrix is to use only those
nonlinear Jacobian contributions which occur on the matrix diagonal. This is equivalent
to the relaxation method discussed in Chapter 2. No interactions between circuit
quantities at different frequencies are considered with this method, nor are
interactions between circuit quantities at different nodes. Adding only diagonal entries
to the linear Jacobian does not add any complexity to the process of decomposing
the matrix, since all the diagonal elements will already contain contributions from
the linear circuit. However, now that only some nonlinear contributions are being
used, the Jacobian may need to be recalculated and decomposed. Also, because
the harmonic balance method calculates the nonlinear circuit response and derivatives
in the time domain for one complete period of the fundamental frequency, all
frequency domain derivatives are calculated at once. Thus, the nonlinear Jacobian
contributions must be calculated for the entire matrix. The Jacobian information is still
contained in blocks along the diagonal, which allows for faster LU decomposition.
While this method is more robust than not using any nonlinear information at all
in the Jacobian, it still is not sufficient for strongly nonlinear circuits, as no frequency
coupling information is considered. Also, the relationship between unknown circuit
variables and the error quantities is considered only for quantities at the same
node. For example, the relationship between drain current and gate-source voltage
for a MESFET is not considered in this technique, since this relationship does not
pertain to quantities at the same node.
Block Jacobian Methods
The block Jacobian method expands the range of nonlinear Jacobian contributions
considered. The block size can be varied such that more frequency coupling can
be considered, allowing for a more accurate representation of the Jacobian, but
maintaining the block structure which reduces LU decomposition runtime.
In order to understand the block Jacobian technique fully, it is helpful to have
some insight into the structure of the Jacobian matrix. The contributions of the
linear elements occur only between variables at the same frequency. Only the nonlinear
elements can produce currents at frequencies different than that of the exciting
voltages. If the Jacobian is structured such that the relationships between circuit
parameters are grouped by frequency, all linear contributions to the Jacobian will be
contained in blocks along the matrix diagonal. The only entries in the off-diagonal
blocks will be those contributed by circuit nonlinearities. For example, a circuit
with M interface nodes and K analysis frequencies would have a matrix structure
of the following form
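\[
\begin{bmatrix}
\dfrac{\partial F_{00r}}{\partial X_{00r}} & \dfrac{\partial F_{10r}}{\partial X_{00r}} & \cdots & \dfrac{\partial F_{M0r}}{\partial X_{00r}} & \dfrac{\partial F_{01r}}{\partial X_{00r}} & \dfrac{\partial F_{01i}}{\partial X_{00r}} & \cdots & \dfrac{\partial F_{MKi}}{\partial X_{00r}} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
\dfrac{\partial F_{00r}}{\partial X_{M0r}} & \dfrac{\partial F_{10r}}{\partial X_{M0r}} & \cdots & \dfrac{\partial F_{M0r}}{\partial X_{M0r}} & \dfrac{\partial F_{01r}}{\partial X_{M0r}} & \dfrac{\partial F_{01i}}{\partial X_{M0r}} & \cdots & \dfrac{\partial F_{MKi}}{\partial X_{M0r}} \\
\dfrac{\partial F_{00r}}{\partial X_{01r}} & \dfrac{\partial F_{10r}}{\partial X_{01r}} & \cdots & \dfrac{\partial F_{M0r}}{\partial X_{01r}} & \dfrac{\partial F_{01r}}{\partial X_{01r}} & \dfrac{\partial F_{01i}}{\partial X_{01r}} & \cdots & \dfrac{\partial F_{MKi}}{\partial X_{01r}} \\
\dfrac{\partial F_{00r}}{\partial X_{01i}} & \dfrac{\partial F_{10r}}{\partial X_{01i}} & \cdots & \dfrac{\partial F_{M0r}}{\partial X_{01i}} & \dfrac{\partial F_{01r}}{\partial X_{01i}} & \dfrac{\partial F_{01i}}{\partial X_{01i}} & \cdots & \dfrac{\partial F_{MKi}}{\partial X_{01i}} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
\dfrac{\partial F_{00r}}{\partial X_{MKi}} & \dfrac{\partial F_{10r}}{\partial X_{MKi}} & \cdots & \dfrac{\partial F_{M0r}}{\partial X_{MKi}} & \dfrac{\partial F_{01r}}{\partial X_{MKi}} & \dfrac{\partial F_{01i}}{\partial X_{MKi}} & \cdots & \dfrac{\partial F_{MKi}}{\partial X_{MKi}}
\end{bmatrix}
\tag{3.2.3}
\]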
where $\partial F_{nfq}/\partial X_{NFQ}$ is the derivative of F at node n and frequency f with respect
to X at node N and frequency F, where q and Q are either r or i, denoting the real
and imaginary parts of F and X. As explained above, the linear admittance matrix
contributes only to those matrix elements where f = F. These elements occur only
on blocks along the diagonal of the matrix. The matrix can then be rewritten in
terms of the different frequency blocks as
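\[
\begin{bmatrix}
B_{0,0} & B_{0,1} & B_{0,2} & \cdots & B_{0,K} \\
B_{1,0} & B_{1,1} & B_{1,2} & \cdots & B_{1,K} \\
B_{2,0} & B_{2,1} & B_{2,2} & \cdots & B_{2,K} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
B_{K,0} & B_{K,1} & B_{K,2} & \cdots & B_{K,K}
\end{bmatrix}
\tag{3.2.4}
\]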
where each block $B_{i,j}$ contains all derivatives of the error function at the ith frequency
with respect to state variables at the jth frequency.
Block Jacobian techniques fall into two different categories, diagonal blocking
and off-diagonal blocking. Diagonal blocking occurs when only blocks that occur on
the matrix diagonal are used, as shown in Figures 3.2.1 through 3.2.4. Note that
the used elements from each row and column of the matrix are represented by only
one block.
For the strictly diagonal block method, the number of blocks used can range
from two up to the number of analysis frequencies, increasing by powers of two.
Each block contains information pertaining to the same number of frequency
components. For example, if there are eight analysis frequencies, blocking schemes using
two, four, and eight blocks are available. The eight block scheme would result in
blocks with one frequency each, while the blocks in the four block scheme would
contain information about two frequencies. Additionally, the DC components of the
Jacobian may be included, but this prevents the block-wise decomposition of the
matrix, as the DC components are located along the top and left sides of the matrix
as shown in Equations 3.2.3 and 3.2.4.
Figure 3.2.1: Diagonal blocking scheme using 16 blocks along the matrix diagonal.
The shaded areas indicate the location of matrix entries which will be used.
Figure 3.2.2: Diagonal blocking scheme using 8 blocks along the matrix diagonal.
The shaded areas indicate the location of matrix entries which will be used. Each
small shaded block represents all derivatives of the error function at a given
frequency with respect to all the state variables at a second given frequency.
Figure 3.2.3: Diagonal blocking scheme using 4 blocks along the matrix diagonal.
The shaded areas indicate the location of matrix entries which will be used. Each
small shaded block represents all derivatives of the error function at a given
frequency with respect to all the state variables at a second given frequency.
Figure 3.2.4: Diagonal blocking scheme using 2 blocks along the matrix diagonal.
The shaded areas indicate the location of matrix entries which will be used. Each
small shaded block represents all derivatives of the error function at a given
frequency with respect to all the state variables at a second given frequency.
Figure 3.2.5: Level one off-diagonal blocking scheme. The shaded areas indicate the
location of matrix entries which will be used. Each block represents all derivatives
of the error function at a given frequency with respect to all the state variables at
a second given frequency.
The other blocking schemes are based on the proximity of a given block to the
diagonal blocks when the highest level of diagonal blocking is used. In this case,
there are K diagonal blocks for a circuit with K analysis frequencies. If the block
row and column indices are i and j respectively, a level q off-diagonal blocking
scheme would include all blocks for which |i - j| ≤ q. Figures 3.2.5 through 3.2.7
show the structure of the Jacobian matrix for levels one through three off-diagonal
blocking schemes.
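The two families of blocking schemes can be sketched as boolean masks over the grid of frequency blocks. This is an illustration only: the function names are hypothetical, the DC block is treated like any other frequency, and the off-diagonal band is assumed to include every block with |i - j| no greater than the level.

```python
import numpy as np


def diagonal_block_mask(n_freqs: int, n_blocks: int) -> np.ndarray:
    """True where frequency block (i, j) is kept by diagonal blocking.

    n_blocks equal-sized groups of frequencies lie along the diagonal,
    so n_freqs must be divisible by n_blocks.
    """
    if n_freqs % n_blocks != 0:
        raise ValueError("n_freqs must be divisible by n_blocks")
    group = np.arange(n_freqs) // (n_freqs // n_blocks)
    return group[:, None] == group[None, :]


def off_diagonal_mask(n_freqs: int, level: int) -> np.ndarray:
    """True where |i - j| <= level (band around the block diagonal)."""
    idx = np.arange(n_freqs)
    return np.abs(idx[:, None] - idx[None, :]) <= level


def sparsity_percent(mask: np.ndarray) -> float:
    """Percentage of blocks ignored by the approximation."""
    return 100.0 * (1.0 - mask.mean())


print(sparsity_percent(diagonal_block_mask(8, 4)))   # four blocks of two frequencies
print(sparsity_percent(off_diagonal_mask(8, 1)))     # level one band
```

With eight frequencies and four diagonal blocks, one quarter of the block positions are kept, which corresponds to a block-level sparsity of 75 percent in the sense defined earlier.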
Figure 3.2.6: Level two off-diagonal blocking scheme. The shaded areas indicate the
location of matrix entries which will be used. Each small shaded block represents
all derivatives of the error function at a given frequency with respect to all the state
variables at a second given frequency.
Figure 3.2.7: Level three off-diagonal blocking scheme. The shaded areas indicate
the location of matrix entries which will be used. Each small shaded block represents
all derivatives of the error function at a given frequency with respect to all the state
variables at a second given frequency.
For circuits with moderate nonlinearities, the diagonal block Jacobian method
generally outperforms the methods previously discussed. When the number of blocks
is equal to the number of analysis frequencies, the extra nonlinear contributions
make little difference in the overall runtime and memory storage requirements
compared to the linear matrix. The stronger the nonlinearities, the larger the blocks
must be for convergence to be obtained, and the increase in necessary runtime is
proportional to the increase in block size. Using the off-diagonal schemes in these
situations can help provide convergence with an improvement in runtime over using
the full Jacobian matrix, though the loss of the block-diagonal structure limits the
improvement available.
The block techniques have a similar effect to the norm-reducing techniques
described in [11]. Error function contributions at a given frequency are most strongly
dependent upon state variables at the same and nearby frequencies. The block
Jacobian methods ensure that these relationships are emphasized, heavily biasing the
direction of the Newton step. The result is that the system solution is often found
more quickly than when the full Jacobian matrix is used.
Thresholding Techniques
The thresholding technique involves selecting a minimum limit a matrix entry must
satisfy to be included in the Jacobian preconditioner. The matrix entries which
do not meet this criterion are discarded. Obviously, some care must be taken in
selecting a threshold value. TRANSIM allows this value to be specified by an
absolute value or by some percentage of the average value of Jacobian entries. It
is important to note that the largest Jacobian values generally occur within the
diagonal blocks. For this reason, thresholding techniques generally do not provide
a noticeable improvement over choosing an effective blocking scheme.
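A thresholding preconditioner of this kind might be sketched as follows. The function name and arguments are hypothetical, and the interpretation of "average value" as the mean magnitude of the nonzero nonlinear entries is an assumption; the linear contributions are always kept, in line with the earlier statement that they are never eliminated.

```python
import numpy as np


def threshold_jacobian(J_linear, J_nonlinear, threshold, relative=False):
    """Approximate Jacobian keeping only sufficiently large nonlinear entries.

    threshold is an absolute magnitude, or, when relative is True, a
    percentage of the mean magnitude of the nonzero nonlinear entries.
    """
    magnitude = np.abs(J_nonlinear)
    if relative:
        nonzero = magnitude[magnitude > 0]
        mean_magnitude = nonzero.mean() if nonzero.size else 0.0
        limit = threshold / 100.0 * mean_magnitude
    else:
        limit = threshold
    kept = np.where(magnitude >= limit, J_nonlinear, 0.0)  # discard small entries
    return J_linear + kept                                  # linear part is never dropped


J_lin = np.diag([1.0, 1.0])
J_nl = np.array([[0.5, 0.01], [0.02, 0.4]])
print(threshold_jacobian(J_lin, J_nl, 0.1))
print(threshold_jacobian(J_lin, J_nl, 50.0, relative=True))
```

In this small example both the absolute limit of 0.1 and the relative limit of 50 percent discard the two small off-diagonal entries and keep the diagonal ones.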
Chapter 4
Newton-based Harmonic Balance with Approximate Jacobian Matrices
In this chapter we will examine the simulation of two types of nonlinear circuits in
detail and analyze the performance of Newton-based harmonic balance simulation
using Jacobian approximations discussed in Chapter 3. The different Jacobian
approximations will be compared and a general rule of thumb for determining the best
approximation for each type of circuit will be discussed.
4.1 Distributed Amplifier
The first circuit to be simulated is the distributed amplifier circuit shown in
Figure 4.1.1. This circuit uses three MESFET devices distributed along transmission
lines. Simulation performance was measured with an input frequency of 4 GHz and
input power levels of 0, 5, 10, 15, and 20 dBm. The magnitude of the output voltage
spectrum is shown in Figure 4.1.2.
The unknown state variables for this circuit are the gate and drain voltages for
each of the three transistors and the current through the three voltage sources, for
a total of nine unknown variables. As shown in Chapter 3, the Jacobian matrix
will have rank N(2M - 1) for a circuit with N unknown quantities and M analysis
frequencies. With eight analysis frequencies, this results in a Jacobian matrix of
rank 135.
Figure 4.1.1: Distributed Amplifier Circuit
In each simulation, all linear contributions to the Jacobian are used.
Each approximation method has different criteria for determining which nonlinear
Jacobian contributions are used. Once a Jacobian has been formulated and reduced
to an LU matrix, it is reused via the chord method until the residual error is reduced
by less than five percent, at which time the Jacobian is recalculated.
All simulations of this circuit were run on a Sun Microsystems Sparc-20. Runtime
performance is measured in machine cycles as reported by Quantify, a product of
Pure-Atria for analyzing software performance.
Figure 4.1.2: Magnitude of the output voltage spectrum of the distributed amplifier
in dBm with input power levels of 0 (O), 10 (+), and 20 dBm (□). The x-axis
represents frequency in GHz.
4.1.1 Using the Full Jacobian Matrix
Using the full Jacobian matrix is the case of not using an approximation at all.
Simulations performed in this manner will provide a baseline for measuring
performance improvements. Additionally, insight into the structure of the full matrix is
needed to determine how best to approximate the Jacobian.
The total harmonic balance runtime for each of the simulated input powers is
shown in Figure 4.1.3, and the corresponding number of Newton-Raphson iterations
is shown in Figure 4.1.4.
[Plot: machine cycles versus input power (dBm), for input powers from 0 to 20 dBm.]
Figure 4.1.3: Runtime in machine cycles for simulation of the distributed amplifier
for different input power levels.
[Plot: number of Newton-Raphson iterations versus input power (dBm), for input powers from 0 to 20 dBm.]
Figure 4.1.4: Number of Newton-Raphson iterations required for convergence of
simulation of the distributed amplifier circuit with respect to input power level.
As expected, the simulation runtime increases with the amount of input power
applied to the circuit in simulation. Larger input power generally results in a more
dense Jacobian matrix and a larger magnitude in the nonlinear contributions to the
Jacobian. Thus, the computation and LU decomposition of the Jacobian is more
expensive for higher input power. In the case of the circuit at hand, except for the
20 dBm input power level, the Jacobian is calculated only once for each input power
level considered. This initial Jacobian is reused throughout the simulation because
the resulting harmonic balance error at each iteration is always at least 5% lower
than the error at the previous iteration. The Jacobian is recalculated and decomposed
once for the 20 dBm input power simulation, resulting in the large increase
in simulation time seen in Figure 4.1.3. The first Jacobian calculation assumes only
the DC solution at the linear-nonlinear circuit interface, so this Jacobian will be the
same regardless of the AC input power level. Thus, the difference in performance is
due to the relative inaccuracy of the initial Jacobian at each of the following iterations.
For low input power, the nonlinear contributions are small enough that the
initial Jacobian is a relatively accurate representation of any Jacobian recalculated
during the iterative Newton-Raphson process. This representation is not quite as
accurate for the higher input power levels, resulting in slower convergence.
The magnitudes of the output voltage spectra for input powers of 0, 10, and
20 dBm are shown in Figure 4.1.2. Clearly, the output spectrum of the amplifier
increases superlinearly with an increase in input power. This is especially evident at
the higher analysis frequencies. A high input power level results in a flatter output
spectrum with larger values at all frequencies.
Further insight into the structure of the Jacobian matrix for this particular circuit
may be gained by examining the Jacobian at different iterations and at different
power levels throughout the solution process. For this examination the Jacobian
matrix was recalculated at each step in the iterative process. Regardless of input power,
the nonlinear Jacobian contributions occurred in the same matrix locations for this
circuit. This implies that the behavior of the circuit is not strongly nonlinear, as
increasing input power does not result in a more dense Jacobian matrix. However,
as shown in Figure 4.1.5, the magnitude of off-diagonal Jacobian entries increases
considerably with an increase in input power, meaning that the relationship
between circuit quantities at different frequencies is stronger as input power increases.
For higher input power levels, where the circuit quantities become larger at higher
analysis frequencies, the off-diagonal entries become more important in obtaining
convergence.
Each block of the Jacobian represents the derivative of the error function at one
particular frequency $f_e$ with respect to the state variables at a particular frequency
$f_x$. The diagonal blocks represent the case where $f_e = f_x$. Table 4.1.1 and
Table 4.1.2 show the average nonlinear contribution to the Jacobian within each block
at input power levels of 0 dBm and 20 dBm respectively. The diagonal blocks and
their neighbors generally have averages greater than the average of all nonlinear
contributions. The linear Jacobian contributions, not included in these tables, would
greatly increase the average magnitude in the diagonal blocks. With an increase in
[Plot: magnitude of nonlinear Jacobian entries versus proximity to matrix diagonal.]
Figure 4.1.5: Magnitude of nonlinear Jacobian contributions with respect to proximity
to the diagonal of the Jacobian in the simulation of the distributed amplifier
circuit. The x-axis is the absolute difference between the row and column indices of
the nonlinear entry, while the y-axis is the absolute value of the entry. The entries
of the Jacobian at 0 dBm input power are represented by O, while the entries of the
Jacobian at 20 dBm input power are represented by +.
GHz   0          4          8          12         16         20         24         28
0     3.984e-3   3.152e-3   1.289e-3   1.245e-3   7.597e-5   1.095e-3   1.325e-3   9.465e-4
4     2.658e-3   2.019e-3   1.712e-3   1.132e-3   6.654e-4   5.875e-4   7.174e-4   6.864e-4
8     2.612e-4   1.309e-3   1.438e-3   2.275e-3   1.50e-3    1.00e-3    4.345e-5   8.151e-4
12    6.701e-4   8.487e-4   1.909e-3   2.026e-3   2.345e-3   1.398e-3   8.417e-4   4.689e-4
16    6.777e-5   3.01e-4    9.486e-4   1.908e-3   1.844e-3   2.702e-3   1.755e-3   1.125e-3
20    6.707e-4   3.566e-4   5.752e-4   1.115e-3   2.374e-3   2.334e-3   2.908e-3   1.682e-3
24    2.55e-4    2.702e-4   2.789e-5   5.258e-4   1.222e-3   2.453e-3   2.268e-3   3.175e-3
28    2.38e-4    3.395e-4   4.693e-4   3.169e-4   6.827e-4   1.406e-3   2.885e-3   2.702e-3

Table 4.1.1: Average magnitude of nonlinear Jacobian contributions within each
frequency partition with input power level of 0 dBm. Each row corresponds to the
frequency of the unknown while each column corresponds to the frequency of the
error function. The final Jacobian used in the simulation is shown, for which the
average of all the nonlinear Jacobian contributions is 1.349e-3. The first row and
column are frequencies in GHz.
input power, the average values within the blocks generally increase, particularly
for the off-diagonal blocks. This corresponds to the increase in the contributions to
the error function at higher frequencies resulting from higher input power.
Another important figure of merit for evaluating the performance of the simulator
is the number of Jacobian calculations and the number of times the calculated
Jacobians are reused during simulation. The calculation and LU decomposition of
the Jacobian are computationally intensive, and thus can be very expensive. When
the Jacobian is reused, the amount by which the error is reduced with each reuse is
generally less than the amount by which the error would be reduced by recalculating
and decomposing the Jacobian. However, this cost is more than offset by the cost
savings of not recalculating and decomposing the Jacobian, as seen in Figure 4.1.6,
which includes the runtime for recalculating the Jacobian at every iteration.
GHz   0          4          8          12         16         20         24         28
0     3.934e-3   3.144e-3   1.353e-3   7.707e-4   7.88e-4    2.002e-3   2.016e-3   1.312e-3
4     2.620e-3   1.976e-3   1.779e-3   1.302e-3   9.912e-4   1.131e-3   1.249e-3   1.017e-3
8     3.547e-4   1.403e-3   1.656e-3   2.474e-3   1.591e-3   7.727e-4   5.142e-4   1.504e-3
12    3.103e-4   9.784e-4   1.970e-3   2.285e-3   2.376e-3   1.337e-3   7.563e-4   9.873e-4
16    6.740e-5   4.996e-4   8.666e-4   1.866e-3   1.927e-3   2.833e-3   1.88e-3    8.13e-4
20    1.108e-3   6.924e-4   4.088e-4   1.073e-3   2.407e-3   2.571e-3   3.04e-3    1.625e-3
24    2.965e-4   4.651e-4   2.836e-4   3.966e-4   1.145e-3   2.455e-3   2.397e-3   3.312e-3
28    3.940e-4   4.936e-4   8.034e-4   6.693e-4   3.795e-4   1.367e-3   2.972e-3   2.954e-3

Table 4.1.2: Average magnitude of nonlinear Jacobian contributions within each
frequency partition with input power level of 20 dBm. Each row corresponds to the
frequency of the unknown while each column corresponds to the frequency of the
error function. The final Jacobian used in the simulation is shown, for which the
average of all the nonlinear Jacobian contributions is 1.482e-3. The first row and
column are frequencies in GHz.

The number of iterations required for convergence when the Jacobian is not reused is
not significantly lower than with our method of recalculating the Jacobian only when the
error is not reduced by five percent or more, and the runtime is considerably higher.
The reason for this is that the Jacobian simply does not change much between
iterations.
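
The reuse rule described above can be illustrated with a minimal sketch. It is not the simulator's code: the two-variable test system, the function names, and the choice to accept any step that lowers the error are assumptions made for illustration. Only the five percent criterion comes from the text.

```python
def lu_factor(a):
    """LU-factor a square matrix (list of lists) with partial pivoting."""
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(lu[r][k]))
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for r in range(k + 1, n):
            lu[r][k] /= lu[k][k]
            for c in range(k + 1, n):
                lu[r][c] -= lu[r][k] * lu[k][c]
    return lu, perm


def lu_solve(factors, b):
    """Solve A x = b using the factors returned by lu_factor."""
    lu, perm = factors
    n = len(lu)
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):                      # forward substitution (unit lower part)
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):            # back substitution (upper part)
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y


def newton_with_reuse(residual, jacobian, x, tol=1e-10, min_reduction=0.05, max_iter=200):
    """Newton-Raphson that refactors the Jacobian only when progress stalls."""
    factors = lu_factor(jacobian(x))
    n_factorizations = 1
    err = max(abs(r) for r in residual(x))
    for iteration in range(1, max_iter + 1):
        step = lu_solve(factors, residual(x))
        x_new = [xi - si for xi, si in zip(x, step)]
        err_new = max(abs(r) for r in residual(x_new))
        stalled = err_new > (1.0 - min_reduction) * err
        if err_new < err:                   # assumption: accept any improving step
            x, err = x_new, err_new
        if err <= tol:
            return x, iteration, n_factorizations
        if stalled:                         # less than a 5% reduction: refresh
            factors = lu_factor(jacobian(x))
            n_factorizations += 1
    raise RuntimeError("no convergence within max_iter iterations")


def residual(v):
    """Hypothetical mildly nonlinear test system."""
    x, y = v
    return [x * x + y * y - 4.0, x * y - 1.0]


def jacobian(v):
    x, y = v
    return [[2.0 * x, 2.0 * y], [y, x]]


solution, iterations, factorizations = newton_with_reuse(residual, jacobian, [2.0, 0.5])
print(solution, iterations, factorizations)
```

For a system whose Jacobian changes little between iterates, the factorization count stays small while the iteration count grows only modestly, which is the tradeoff the runtime comparison describes.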
Due to the mild to moderate nonlinearity of the distributed amplifier circuit, it
is evident that this circuit is amenable to approximation techniques. The nonlinear
Jacobian entries occur in the same matrix locations across the entire range of input
power considered, and the initial Jacobian matrix is sufficient for simulation
convergence across most input power levels as well. This implies that some approximation
of the Jacobian may be made without sacrificing the convergence abilities of the
simulator.
[Plot: runtime in machine cycles versus input power (dBm).]
Figure 4.1.6: Runtime in machine cycles for harmonic balance of the distributed
amplifier circuit with respect to input power level. The upper curve corresponds
to calculating the Jacobian at every step in the Newton-Raphson solving process,
while the lower curve corresponds to using the same Jacobian throughout the solving
process.
4.1.2 Using Block Jacobian Matrix Techniques
The first approximation technique used for simulating the distributed amplifier circuit
was the block Jacobian method as described by Chang. The blocking schemes
used were 2, 4, and 8 blocks for simulations using eight analysis frequencies.
Recall the Jacobian structure as described in Chapter 3. The matrix is ordered
such that frequency information is grouped together in large blocks, with the blocks
along the diagonal representing the relationship between circuit variables which
represent the same frequency. Off-diagonal blocks represent the relationship between
circuit variables at different frequencies. Since there are eight frequencies considered,
a scheme of eight blocks ignores all relationships between variables at different
frequencies, while larger blocks include more information about these relationships.
For example, a scheme of two blocks with eight analysis frequencies groups dc, the
fundamental analysis frequency, and the second and third harmonics together in one
block, and the fourth through seventh harmonics in the other block. Any relationship
between circuit parameters at frequencies within the same block is used, but
no relationship between parameters at frequencies from different blocks will be
considered in the analysis. For a weakly nonlinear circuit, the blocks may be smaller,
since the output spectrum is generally not as rich as the output spectrum of a
more strongly nonlinear circuit. However, as input power increases, the Jacobian
entries in the off-diagonal regions become larger and may be necessary to obtain
convergence. Thus, the block size may need to be increased.
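
The grouping of frequencies into diagonal blocks can be sketched as a mask over frequency indices. This is an illustrative sketch only: the function name is hypothetical, frequencies are indexed from 0 (dc) upward, and equal-sized groups are assumed.

```python
def diagonal_block_mask(n_freq, n_blocks):
    """Return keep[i][j], True when frequencies i and j share a diagonal block.

    Assumes frequency index 0 is dc, 1 is the fundamental, and so on, and that
    the frequencies are split into n_blocks equal groups.
    """
    if n_freq % n_blocks != 0:
        raise ValueError("this sketch requires n_blocks to divide n_freq")
    group_size = n_freq // n_blocks
    return [[(i // group_size) == (j // group_size) for j in range(n_freq)]
            for i in range(n_freq)]


# Two blocks over eight analysis frequencies: dc through the third harmonic
# in one block, the fourth through seventh harmonics in the other.
for row in diagonal_block_mask(8, 2):
    print("".join("X" if keep else "." for keep in row))
```

With `n_blocks` equal to `n_freq` the mask reduces to the same-frequency blocks only, so every cross-frequency relationship is dropped.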
The resulting number of iterations required for convergence for each blocking
scheme is shown below in Figure 4.1.7. Corresponding runtimes are shown in
Figure 4.1.8.
[Plot: number of iterations versus input power (dBm); curves: no blocking, 2 blocks, 4 blocks, 8 blocks.]
Figure 4.1.7: Number of iterations required for simulation of the distributed
amplifier with different blocking schemes.
As Figure 4.1.8 indicates, the eight-block method seemed to perform best among
the block methods. The discarded Jacobian entries apparently were not significant
enough to impact runtime negatively. In fact, the simulation runtimes decreased
almost across the board with a decrease in the amount of nonlinear information
included in the Jacobian. Thus, it appears that the most important nonlinear
[Plot: runtime in machine cycles versus input power (dBm); curves: no blocking, 2 blocks, 4 blocks, 8 blocks.]
Figure 4.1.8: Simulation runtime of the distributed amplifier for different blocking
schemes. Runtime is given in machine cycles.
entries are located in the diagonal same-frequency blocks. At higher input power
levels, the improvement seen by using smaller diagonal blocks can be offset somewhat
by the need to recalculate and decompose the Jacobian. At the highest power level,
the Jacobian was recalculated once for the 4 block scheme. Nevertheless, due to
block-wise LU decomposition and the smaller number of iterations required, the smaller
blocking schemes still were the most successful. The runtimes for the eight block
method were lower than all others except for the 15 dBm input power level, where
the Jacobian was recalculated once due to insufficient reduction of the residual
error. Simulations using the Jacobians created by the other blocking schemes, which
contain more entries that are technically accurate, are nevertheless slowed by these extra
entries during LU decomposition and solving. Additionally, the direction of the
Newton update is improved by removing the matrix entries in off-diagonal blocks,
as evidenced by the lower number of iterations. The runtime advantages gained by
doing block-wise LU decomposition indicate that there is little to gain by including
non-diagonal blocks which would prevent block-wise decomposition. The method of
using off-diagonal blocks may prove useful for more strongly nonlinear circuits.
4.1.3 Using the Linear Jacobian along with the Diagonal of the Nonlinear Jacobian
The next approximation technique implemented was the use of only the nonlinear
Jacobian contributions which occur along the diagonal of the Jacobian.
This is equivalent to using the relaxation technique of Hicks and Khan. This technique has
been found to be useful at low input power, but performance rapidly deteriorates
with increasing input power, as seen in Figure 4.1.9 and Figure 4.1.10. Far more
iterations are required as input power increases, due to the increasing inaccuracy of
the approximation. At the 10 dBm level and above, the runtime resulting from this
technique is higher than that obtained by using the full Jacobian matrix.
[Plot: number of iterations versus input power (dBm).]
Figure 4.1.9: Number of iterations required for simulation of the distributed
amplifier using nonlinear Jacobian contributions on the matrix diagonal only.
It is important to note that the largest Jacobian entries are not being used
with this technique. Figure 4.1.5 shows that there are many off-diagonal nonlinear
[Plot: machine cycles (GCycles) versus input power (dBm).]
Figure 4.1.10: Simulation runtime required for simulation of the distributed amplifier
using nonlinear Jacobian contributions on the matrix diagonal only.
Jacobian contributions which are much larger than the nonlinear contributions along
the diagonal. However, it has now been shown that these larger elements are not
necessary for convergence, even at high input levels. Also, since the linear Jacobian
contributions are present only in the smallest size diagonal blocks, this technique
has the added efficiency of being able to use block-wise matrix decomposition. Of
course, for this example the price associated with the high number of necessary
iterations makes this technique unattractive for high input levels.
4.1.4 Using the Linear Jacobian only
For weakly nonlinear circuits, the nonlinear contributions to the Jacobian are not
always necessary for convergence of the Newton-Raphson algorithm. At low input
powers especially, the linear contributions alone may be sufficient for convergence
in just a few steps. For the distributed amplifier circuit, the linear portion of the
Jacobian is sufficient for convergence at all input power levels investigated. The
number of reuses increases significantly with increased input power. Figure 4.1.11
shows the number of reuses of the linear Jacobian required for convergence with
respect to input power level. The corresponding runtimes are shown in Figure 4.1.12.
Once again, the runtimes at the 10 dBm input power level and above are as
high as or higher than those using the full Jacobian. Clearly this is a method which
works best at low input power. Compared to using only the nonlinear contributions
along the diagonal, this technique takes more than twice the number of iterations
[Plot: number of iterations versus input power (dBm).]
Figure 4.1.11: Number of iterations required for simulation of the distributed
amplifier when only linear Jacobian contributions are used.
[Plot: runtime in machine cycles versus input power (dBm).]
Figure 4.1.12: Simulation runtime (in machine cycles) required for the distributed
amplifier when only linear Jacobian contributions are used.
to obtain convergence at the highest power level. In this case, because the Jacobian
need be calculated only once and reused throughout the simulation, the relaxation
technique costs almost nothing extra as compared to using only linear contributions.
Both techniques result in a matrix which can be inverted block-wise, and since the
nonlinear contributions are calculated only once, the associated cost of calculating
these contributions is negligible.
4.1.5 Using a Threshold Value for Nonlinear Jacobian Contributions
Another technique for approximating the Jacobian matrix is to set a threshold value
for nonlinear Jacobian contributions. All nonlinear contributions with a magnitude
greater than the threshold value are used in the approximate Jacobian, while the
other nonlinear contributions are discarded. From experimentation, it was found
that the linear Jacobian contributions are critical for simulation convergence and
must not be removed from the matrix regardless of their magnitudes. Of course,
the optimal threshold value for the nonlinear Jacobian contributions must be
determined, and this value depends on circuit parameters.
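
As a minimal sketch of the thresholding rule just described, the function below keeps every linear contribution and adds a nonlinear contribution only when its magnitude exceeds the threshold. The function name, the dense list-of-lists storage, and the sample numbers are assumptions for illustration and are not taken from the simulator.

```python
def threshold_jacobian(j_linear, j_nonlinear, threshold):
    """Combine the full linear part with the large nonlinear entries only.

    Linear contributions are always kept, regardless of magnitude. A nonlinear
    contribution is kept only if its magnitude is greater than `threshold`.
    """
    n = len(j_linear)
    return [[j_linear[r][c]
             + (j_nonlinear[r][c] if abs(j_nonlinear[r][c]) > threshold else 0.0)
             for c in range(n)]
            for r in range(n)]


# Hypothetical 2x2 example: only the 8e-3 and 7e-3 entries survive a 6e-3 threshold.
j_lin = [[2.0, 0.0], [0.0, 2.0]]
j_nl = [[8e-3, 1e-3], [7e-3, 2e-3]]
print(threshold_jacobian(j_lin, j_nl, 6e-3))
print(threshold_jacobian(j_lin, j_nl, float("inf")))  # linear part only
```

An infinite threshold discards every nonlinear contribution, and a threshold of zero keeps all nonzero ones, so the two extremes correspond to the linear-only and full Jacobians.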
It has already been shown in Sections 4.1.2, 4.1.3, and 4.1.4 that removing
some of the nonlinear contributions to the Jacobian can have a positive impact on
the simulation runtime. At some point, however, removing too much data can cause
a significant increase in runtime due to inaccuracy of the Jacobian. The thresholding
method attempts to avoid this problem by including only the most significant
nonlinear Jacobian contributions.
The distributed amplifier circuit was simulated with several different threshold
values at the same input power levels discussed previously. Obviously, different input
power levels will generate Jacobian entries of different magnitudes, so it would seem
that different threshold levels would apply.
Finding Appropriate Threshold Values
For low input power, in this case 0 and 5 dBm, the threshold of infinity will actually
work well for this circuit, as seen when only the linear Jacobian contributions were
used. Figure 4.1.13 shows a logarithmic scale plot of the runtime for 0 and 5 dBm
input power at different threshold levels as well as the runtime when no thresholding
is used. The x-axis is the base 10 logarithm of the threshold value, and the y-axis
is the runtime in machine cycles.
Clearly there is a range of threshold values which perform best for the low input
levels, starting where the base 10 logarithm of the threshold value is about -2.2.
This corresponds to a threshold value of about 6e-3. It has already been shown in
Section 4.1.4 that the nonlinear contributions to the Jacobian are unnecessary for
fast runtimes at low power levels, so it is expected that raising the threshold even
higher has little impact on simulation runtime.
No matter which threshold values are used, the best runtime for low input power
is still slightly higher than that obtained when the highest level of diagonal blocking
[Plot: runtime in machine cycles versus log(threshold value); curves: 0 dBm, 5 dBm, 0 dBm (no thresholding), 5 dBm (no thresholding).]
Figure 4.1.13: Simulation runtime (in machine cycles) required for the distributed
amplifier at low power when Jacobian entries are subject to threshold levels. The
x-axis represents the base 10 logarithm of the threshold value.
[Plot: number of iterations versus log(threshold value); curves: 0 dBm, 5 dBm.]
Figure 4.1.14: Number of iterations required for convergence of the Newton-Raphson
method for the distributed amplifier at low power when Jacobian entries are subject
to threshold levels. The x-axis represents the base 10 logarithm of the threshold
value.
(8 blocks) was used. As Figure 4.1.5 implies, unless the threshold value is very
high, there are nonlinear Jacobian entries very far from the matrix diagonal, which
prevents block-wise decomposition of the matrix. The runtime savings seen with the
thresholding technique are due to two factors. The number of elements to be factored
in the LU decomposition process is reduced, resulting in a slight runtime improvement.
However, the main reason for the runtime improvement is the reduction in
the number of iterations required for convergence. By choosing only the largest
nonlinear contributions to the Jacobian, the thresholding technique emphasizes the
strongest relationships between the state variables and the error function. This
emphasis biases the direction of the Newton step, resulting in fewer Newton-Raphson
iterations, as seen in Figure 4.1.14.
At higher input power levels, the tradeoff between Jacobian accuracy and simulation
runtime is more evident. Figure 4.1.15 shows a logarithmic scale plot of the
runtime for 10, 15, and 20 dBm input power at different threshold levels as well as
the runtime at these power levels when no thresholding is used. As was the case
with low power input, the minimum for all the curves occurs at a threshold level
of 6e-3. However, runtime does increase with an increase in the threshold level.
At these higher input power levels, the nonlinear Jacobian contributions are larger,
with more significant entries further away from the matrix diagonal than for lower
power levels. As the threshold value increases, simulation runtime approaches that
of the technique which uses no nonlinear Jacobian contributions, as the number of
nonlinear contributions to the Jacobian is greatly reduced. The number of iterations
required for simulation convergence is shown in Figure 4.1.16.
Of the 2484 nonlinear contributions to the Jacobian, 150 are above the optimal
threshold level of 6e-3 for an input power level of 20 dBm. Of those, 106 occur in
the blocks for which $f_e = f_x$, and 143 occur either in these diagonal blocks or blocks
directly adjacent to the diagonal blocks. Thus, even at high input power levels, the
largest Jacobian contributions are located in or near the diagonal blocks. For the 0
dBm input level, 114 of 2484 nonlinear contributions to the Jacobian are above the
threshold level, 21 of which do not occur in the diagonal blocks. All 21 of these occur
in blocks directly adjacent to the diagonal blocks. Note that at 0 dBm input power,
a threshold of 0.01 provides convergence with the same runtime. The only nonlinear
Jacobian contributions which exceed this threshold occur in the diagonal blocks,
but not all the contributions in the diagonal blocks are used. Again, because the
Jacobian inversion is not done block-wise, the runtime for the threshold technique
is still slightly higher than that of the 8-block method.
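
To put the counts above in proportion, the short sketch below converts them to percentages. The counts are those quoted in the text. The helper name is hypothetical, and 93 is simply 114 minus the 21 entries that fall outside the diagonal blocks.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


TOTAL_NONLINEAR = 2484

# 20 dBm input power, threshold 6e-3
kept_20 = 150
print(percent(kept_20, TOTAL_NONLINEAR))   # share of entries above threshold
print(percent(106, kept_20))               # of those, share in diagonal blocks
print(percent(143, kept_20))               # share in diagonal or adjacent blocks

# 0 dBm input power, threshold 6e-3
kept_0 = 114
print(percent(kept_0, TOTAL_NONLINEAR))    # share of entries above threshold
print(percent(kept_0 - 21, kept_0))        # share in diagonal blocks
```

At both power levels only a few percent of the nonlinear entries exceed the threshold, and most of those lie in the diagonal blocks.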
4.1.6 Summary
From observing simulator performance for all the approximation methods investigated
in this chapter, it appears that the best technique for this circuit is to use the
diagonal blocks of the Jacobian only. This technique uses roughly the same nonlinear
Jacobian entries as the optimal threshold value, and thus emphasizes the same
[Plot: runtime in machine cycles versus log(threshold value); curves: 10 dBm, 15 dBm, 20 dBm, 10 dBm and 15 dBm (no thresholding), 20 dBm (no thresholding).]
Figure 4.1.15: Simulation runtime (in machine cycles) required for the distributed
amplifier at high power when Jacobian entries are subject to threshold levels. The
x-axis represents the base 10 logarithm of the threshold value.
[Plot: number of iterations versus log(threshold value); curves: 10 dBm, 15 dBm, 20 dBm.]
Figure 4.1.16: Number of iterations required for convergence of the Newton-Raphson
method for the distributed amplifier at high power when Jacobian entries are subject
to threshold levels. The x-axis represents the base 10 logarithm of the threshold
value.
important Jacobian entries, but it has the added advantage of simplifying the LU
decomposition of the matrix. While the matrix was decomposed only once for most
of the simulations, the block Jacobian method still provided slightly faster runtimes
than the threshold method by providing an equally good Newton update direction
and a faster method for decomposing the matrix. The methods which involve using
only linear Jacobian contributions or only the nonlinear contributions which occur
on the matrix diagonal are useful only for mildly nonlinear circuits and only at low
input power levels. Even in these cases, the diagonal block methods performed at
least as well as these techniques. For this circuit, the best Jacobian approximation
is the diagonal block method which uses the same number of blocks as analysis
frequencies.
4.2 Nonlinear Transmission Lines
A nonlinear transmission line (NLTL) consists of reverse biased diodes distributed
along a transmission line at regular intervals. A large single-tone AC voltage is used
to excite the circuit at its input, producing a very short duration voltage spike at
its output. A unit cell of this type of transmission line is shown in Figure 4.2.1.
Several unit cells are connected in series to form a true NLTL.
Due to the strong nonlinearity of the circuit elements and the high input voltage
levels at which the circuit is driven, this circuit is an excellent test for a harmonic
balance simulator. The output frequency spectrum of this type of circuit will be
Figure 4.2.1: Nonlinear transmission line unit cell.
very dense due to the high input power and strong nonlinearity of the circuit. Also,
the number of unit cells is generally large, resulting in a large number of
unknown variables. As a result, the Jacobian matrix will be large. Figure 4.2.2
shows the output waveform of a 47 diode NLTL driven at 14 volts AC.
Transmission lines of varying lengths were simulated with approximate Jacobian
matrices. The Jacobian approximations used for this circuit include the block
methods already discussed as well as blocking techniques which include some
off-diagonal blocks. The off-diagonal blocks allow for a more regular treatment of the
cross-frequency information than the diagonal block Jacobian methods. The different
off-diagonal methods include blocks up to some predetermined number of blocks
away from the diagonal blocks. For example, if the off-diagonal level is two, then
all blocks $B_{ij}$ where $|i - j| \le 2$ are used. Additionally, the off-diagonal blocking
technique is used in place of the threshold technique. It was determined in
Section 4.1 that the advantage gained by using the threshold technique was not due to
an improvement in the matrix decomposition but rather due to an improvement in
[Plot: output waveform versus time (ps); legend: experimental data, harmonic balance.]
Figure 4.2.2: Measured (solid line) and simulated (dashed line) output waveform for
the 47 diode NLTL.
the direction of the Newton-Raphson update. The significant nonlinear elements in
the off-diagonal blocks of the Jacobian generally occur in blocks close to the matrix
diagonal. In this case the lower level off-diagonal block Jacobian techniques may be
used. For future reference, the blocking schemes are defined in Table 4.2.1.
Blocking Scheme   Description
1                 One block used, i.e., without approximation.
2                 Two diagonal blocks used.
4                 Four diagonal blocks used.
8                 Eight diagonal blocks used.
16                Sixteen diagonal blocks used.
32                Thirty-two diagonal blocks used.
o1                Level one off-diagonal blocking used.
o2                Level two off-diagonal blocking used.

Table 4.2.1: Blocking approximation schemes. A number by itself refers to the
number of diagonal blocks used. A number preceded by the letter “o” indicates an
off-diagonal blocking scheme.
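
The scheme labels in Table 4.2.1 can be turned into a rule for which frequency blocks are retained, as in the sketch below. Several points are assumptions rather than statements from the text: the function name, equal-sized diagonal groups, and the reading that off-diagonal level k keeps blocks with |i - j| <= k, so that level one keeps each diagonal block and its immediate neighbors.

```python
def keeps_block(scheme, i, j, n_freq):
    """Return True if block (i, j) is retained under a Table 4.2.1 scheme label.

    `scheme` is a label such as "1", "8", "32", "o1" or "o2"; i and j are
    frequency indices in range(n_freq).
    """
    if scheme.startswith("o"):
        level = int(scheme[1:])
        return abs(i - j) <= level          # assumed reading of the level
    n_blocks = int(scheme)
    group_size = n_freq // n_blocks         # assumes n_blocks divides n_freq
    return i // group_size == j // group_size


# Retained-block pattern for level one off-diagonal blocking, 8 frequencies.
for i in range(8):
    print("".join("X" if keeps_block("o1", i, j, 8) else "." for j in range(8)))
```

The printed pattern for "o1" is block tridiagonal, whereas a purely numeric label produces disjoint square groups along the diagonal.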
4.2.1 A 10 diode NLTL
A circuit of ten unit cells was simulated at several different power levels. The output
power spectrum is shown below in Figure 4.2.3 for input voltages of 1, 3, 6, and 9
volts. Clearly, the output spectrum changes dramatically with an increase in drive
level. The spectrum decreases monotonically for the low input voltages, but becomes
non-monotonic and flatter as the input level is increased.
The simulator performance was measured with respect to the cpu runtime required
for convergence and the number of iterations required. The number of times
the Jacobian is recalculated during the simulation is another important performance
[Plot: output voltage spectrum magnitude versus frequency (GHz).]
Figure 4.2.3: Magnitude of voltage output spectrum of the 10-Diode NLTL with
1 (O), 3 (+), 6 (d), and 9 (x) volt input AC voltages.
measurement, and this is generally reflected in the runtime required for convergence.
Results are presented with respect to the sparsity of the matrix approximation. Matrix
sparsity was defined in Section 3.2.5 as the percentage of the matrix which is
being neglected in the approximation. For example, using no approximation would
be 0% sparsity, while using the two-block method is 50% sparsity. Thus, matrix
sparsity indicates the structure of the Jacobian matrix approximation.
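
The sparsity figures quoted above follow from counting neglected blocks, as the sketch below shows. It assumes that all frequency blocks have the same size, which may not hold exactly in the simulator (the dc block, for instance, could be smaller). The function names and the 32-frequency example are likewise assumptions.

```python
def diagonal_sparsity(n_blocks):
    """Fraction of the matrix neglected when n_blocks equal diagonal blocks are kept."""
    return 1.0 - 1.0 / n_blocks


def off_diagonal_sparsity(n_freq, level):
    """Fraction neglected when blocks with |i - j| <= level are kept.

    Assumes n_freq equal-sized frequency blocks per side.
    """
    kept = sum(1 for i in range(n_freq) for j in range(n_freq) if abs(i - j) <= level)
    return 1.0 - kept / n_freq ** 2


for n_blocks in (1, 2, 4, 8, 16, 32):
    print(n_blocks, diagonal_sparsity(n_blocks))

print("o1", off_diagonal_sparsity(32, 1))
```

One block gives 0% and two blocks give 50%, matching the examples in the text. Under the equal-size assumption the off-diagonal values are only approximate.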
The number of Newton-Raphson iterations and the “Jacobian entry ratio” are
reported for each simulation. The Jacobian entry ratio is defined as the ratio between
the sum of the magnitudes of all Jacobian entries discarded by each approximation
and the sum of the magnitudes of all Jacobian entries before the approximation is
applied. In other words, each time the Jacobian is calculated, the sum of the
magnitudes of all Jacobian entries is calculated. The Jacobian approximation is then
applied, resulting in some matrix entries being discarded. The sum of the magnitudes
of all discarded entries is then calculated. The ratio of the two magnitudes is
then the “Jacobian entry ratio.” This ratio is an indication of how much information
is being discarded for each approximation.
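
A minimal sketch of this ratio is given below, taking it as the summed magnitude of the discarded entries divided by the summed magnitude of all entries, as the step-by-step description states. The function name, the boolean `keep` mask, and the sample matrix are hypothetical.

```python
def jacobian_entry_ratio(jac, keep):
    """Ratio of discarded entry magnitude to total entry magnitude.

    `jac` is a square matrix (list of lists) and `keep[r][c]` is True for
    entries retained by the approximation. Entries may be real or complex.
    """
    total = sum(abs(value) for row in jac for value in row)
    discarded = sum(abs(jac[r][c])
                    for r in range(len(jac))
                    for c in range(len(jac[r]))
                    if not keep[r][c])
    return discarded / total


# Hypothetical example: keep only the diagonal of a 2x2 matrix.
jac = [[4.0, -1.0], [0.5, 2.0]]
keep = [[True, False], [False, True]]
print(jacobian_entry_ratio(jac, keep))
```

A ratio near zero means the approximation drops little of the matrix by magnitude, even when the sparsity is high.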
Low Input Voltage
Results for an input voltage of 1 volt are shown in Figure 4.2.4. The Jacobian
approximation techniques rarely performed better than using the full Jacobian
matrix at this input level. The performance for all approximations was very similar.
Though the nonlinearities of this circuit are strong, this drive level is low enough
th a t m ost of th e Jaco b ian elem ents of significant size are located close to the m a trix
diagonal.
However, for lower input levels, there seems to be a range of Jacobian sparsity
which provides slight performance improvements. As seen in Figure 4.2.4, 1 volt of
excitation is small enough that setting large sections of the Jacobian to zero has little
effect on the number of iterations required for simulator convergence. Due to the
low level of input power, these sections are close to zero anyway. The only situations
where the Jacobian approximations had any effect on the number of iterations were
at high levels of sparsity. The first spike in the iterations curve corresponds to using
eight diagonal blocks in the Jacobian approximation. The first dip in the curve
corresponds to using level one off-diagonal blocking as shown in Figure 3.2.5. The
final two points on the graph correspond to 16 and 32 diagonal blocks. However,
all of the Jacobian approximations provided a runtime improvement over not using
an approximation. When large sections of the matrix are set to zero, the matrix
decomposition time decreases accordingly.
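A rough operation count suggests why decomposition time falls as blocks are zeroed. The sketch below assumes dense LU factorization with the standard leading-order cost of (2/3)n^3 operations and equal-size diagonal blocks; whether the simulator's solver behaves like this dense model is an assumption, and the function names are hypothetical.

```python
def dense_lu_flops(n):
    """Leading-order operation count for LU factorization of a dense n x n matrix."""
    return (2.0 / 3.0) * n ** 3


def block_diagonal_lu_flops(n, num_blocks):
    """Leading-order count when the matrix is num_blocks independent dense blocks."""
    size = n / num_blocks
    return num_blocks * dense_lu_flops(size)


n = 630  # rank this work quotes for the 10 diode NLTL with 32 analysis frequencies
for b in (1, 2, 4, 8, 16, 32):
    factor = dense_lu_flops(n) / block_diagonal_lu_flops(n, b)
    print(f"{b:2d} blocks: factorization work reduced by a factor of about {factor:.0f}")
```

Under this model the factorization work drops as b squared. The measured runtime changes reported in this section are far smaller than that, which is consistent with factorization being only one part of the total simulation cost.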
At an input AC voltage level of 3 volts, a richer Jacobian matrix is formed. Thus,
the blocking methods result in a larger amount of information being lost than for an
input level of 1 volt. As seen in Figure 4.2.5, the blocking methods result in a higher
number of iterations only for the highest level of sparsity. Again, the technique
which provided the greatest benefit was the level one off-diagonal approximation.
Unlike the more weakly nonlinear distributed amplifier circuit, there is a point at
[Plot not reproduced. Horizontal axis: Jacobian matrix sparsity.]
Figure 4.2.4: Number of iterations and ratio of unused to used Jacobian entries for
simulation of the 10 diode NLTL with 1 volt AC input voltage. The dashed line
represents the ratio of the magnitude of unused Jacobian entries to the total magnitude
of all Jacobian entries, while the solid line represents the corresponding number of
iterations required for convergence.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         2            213         19          N/A
2         0.5       2            210         19          1.4
4         0.75      2            213         19          0.0
8         0.875     2            219         21          -2.8
16        0.9375    2            220         21          -3.3
32        0.96875   2            218         22          -2.3
o1        0.907     2            209         18          1.9
o2        0.8506    2            209         19          1.9
o5        0.6843    2            211         19          0.9
o16       0.2298    2            209         19          1.9

Table 4.2.2: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input AC voltage level of 1 volt.
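The runtime improvement column is not defined explicitly in the text. The tabulated values are consistent with the relative change against the full-Jacobian runtime, so the following sketch assumes that formula; the function name is hypothetical and the runtimes are the 1 volt values for the 10 diode NLTL.

```python
def runtime_improvement_percent(t_full, t_approx):
    """Percent runtime saved relative to the full-Jacobian run (negative means slower)."""
    return 100.0 * (t_full - t_approx) / t_full


t_full = 213  # cpu seconds with the full Jacobian (blocking scheme 1)
runtimes = {"2": 210, "8": 219, "o1": 209}  # cpu seconds for three blocking schemes
improvements = {
    scheme: round(runtime_improvement_percent(t_full, t), 1)
    for scheme, t in runtimes.items()
}
print(improvements)
```

Under that assumption the three schemes give 1.4, -2.8, and 1.9 percent, matching the tabulated entries.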
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         2            253         36          N/A
2         0.5       2            248         35          2.0
4         0.75      2            245         33          3.2
8         0.875     2            244         33          3.6
16        0.9375    2            251         35          0.8
32        0.96875   2            270         40          -6.7
o1        0.907     2            225         31          11.1
o2        0.8506    2            245         36          3.2
o5        0.6843    2            251         36          0.8
o16       0.2298    2            254         36          -0.4

Table 4.2.3: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input AC voltage level of 3 volts.
[Plot not reproduced. Horizontal axis: Jacobian matrix sparsity.]
Figure 4.2.5: Number of iterations and ratio of unused to used Jacobian entries
for simulation of the 10 diode NLTL with 3 volt AC input voltage. The dashed
line represents the ratio of the magnitude of unused Jacobian entries to the total
magnitude of all Jacobian entries, while the solid line represents the corresponding
number of iterations required for convergence.
which removing Jacobian information in off-diagonal blocks has a negative impact
on simulation runtime, even at low input power levels. For the 10-diode line, this
level of sparsity appears to be reached by the highest-level diagonal blocking
approximations. Both the 16 and 32 block approximations resulted in a longer runtime
than that obtained by not approximating the Jacobian at all. Removing too much
information from the Jacobian results in inaccurate Newton updates to the vector of
unknowns and slows convergence. Even at low input power levels, the amount of
processing time saved in factoring a simpler Jacobian matrix was offset significantly by
the number of iterations required for convergence.
High Input Voltage
Once the input AC voltage was increased to 6 volts, the Jacobian matrix became
dense enough that significant amounts of information were lost even for the most
conservative approximations to the Jacobian. In fact, the only approximations to
make even a modest improvement in simulator performance were the level 16 and 4
off-diagonal blocking schemes. From examining Figure 4.2.6, it is apparent that the
number of iterations required for convergence is roughly proportional to the amount
of information being removed by the different approximation techniques. Also, any
advantage in matrix decomposition time gained by using the approximations is lost
due to an increase in the number of iterations required for convergence. For some
approximations, the number of required Jacobian calculations is larger as well, as
seen in Table 4.2.4.
[Plot not reproduced. Horizontal axis: Jacobian matrix sparsity.]
Figure 4.2.6: Number of iterations and ratio of unused to used Jacobian entries
for simulation of the 10 diode NLTL with 6 volts AC input voltage. The dashed
line represents the ratio of the magnitude of unused Jacobian entries to the total
magnitude of all Jacobian entries, while the solid line represents the corresponding
number of iterations required for convergence.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         3            342         37          N/A
2         0.5       3            363         45          -6.1
4         0.75      3            396         55          -15.8
8         0.875     3            427         65          -24.9
16        0.9375    17           1933        222         -465
32        0.96875   16           1835        215         -437
o1        0.907     4            549         71          -60.5
o2        0.8506    4            524         68          -53.2
o5        0.6843    3            340         37          0.6
o16       0.2298    3            333         35          2.6

Table 4.2.4: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input AC voltage level of 6 volts.
Similar results are seen for an input AC voltage level of 9 volts, as shown in
Figure 4.2.7 and Table 4.2.5. Again, too much information is being removed from
the Jacobian matrix by the different approximation techniques for the techniques
to be successful. Only the level 16 off-diagonal approximation was able to provide
a slight runtime improvement.
Figures 4.2.4 through 4.2.7 show how the Jacobian matrix becomes more dense
as the input drive level is increased. As the voltage is increased, the ratio of the
magnitude of unused Jacobian entries to the total magnitude of all Jacobian entries
rises more quickly with an increase in the level of matrix sparsity. It is also
interesting to note the relationship between the effectiveness of the approximation
techniques and the form of the output voltage spectrum. As shown in Figure 4.2.3,
[Plot not reproduced. Horizontal axis: Jacobian matrix sparsity.]
Figure 4.2.7: Number of iterations and ratio of unused to used Jacobian entries
for simulation of the 10 diode NLTL with 9 volt AC input voltage. The dashed
line represents the ratio of the magnitude of unused Jacobian entries to the total
magnitude of all Jacobian entries, while the solid line represents the corresponding
number of iterations required for convergence.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         5            588         59          N/A
2         0.5       5            629         76          -7
4         0.75      14           1621        173         -176
8         0.875     N/C          N/C         N/C         N/C
16        0.9375    N/C          N/C         N/C         N/C
32        0.96875   N/C          N/C         N/C         N/C
o1        0.907     N/C          N/C         N/C         N/C
o2        0.8506    N/C          N/C         N/C         N/C
o5        0.6843    8            951         91          -61.7
o16       0.2298    5            561         56          4.6

Table 4.2.5: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input AC voltage level of 9 volts.
the output spectrum of this circuit is monotonically decreasing for input levels of
1 and 3 volts, while the spectrum is much broader for the higher input levels of 6
and 9 volts. The relationship between the higher order harmonics becomes more
important as their relative magnitudes increase. Information corresponding to this
relationship is located away from the diagonal of the Jacobian, requiring more
off-diagonal information for an approximation to be useful.
4.2.2 A 47 diode NLTL
A 47 diode NLTL was built and measured by Rodwell's research group at the
University of California at Santa Barbara [25]. The circuit was measured with an input
signal of 27 dBm at 9 GHz. Harmonic balance simulations of this circuit at both
low and high input power levels were performed using TRANSIM, with high power
results reported in [26]. Matrix approximations were used with some success at low
input power, but the effectiveness of approximate Jacobians deteriorated rapidly
with an increase in input power.
Low input voltage
The 47 diode NLTL was simulated using 16 analysis frequencies with input AC
voltages of 1, 2, 3, and 4 volts. The associated system of equations to be solved is
of rank 1457. At an AC input level of 1 volt, the different approximation techniques
performed similarly. All of these simulations required only two Jacobian evaluations
to be performed, but simulation runtimes and the number of iterations required
for convergence were slightly different. As seen in Figure 4.2.8 and Table 4.2.6,
the shortest simulation runtime and the fewest iterations occurred
for the 16 block diagonal method, which was the most sparse. The block diagonal
methods generally provided faster runtimes and required fewer iterations than
off-diagonal methods in this case. Again, the relative uniformity in the performance
of the different Jacobian approximations can be traced to the fact that most of
the Jacobian information is located in the same-frequency blocks along the matrix
diagonal. As seen in Figure 4.2.8, the next best approximation corresponded to a
very small ratio of the sum of the magnitudes of all unused Jacobian entries to the
sum of the magnitudes of all Jacobian entries. This is the level three off-diagonal
method.
[Plot not reproduced. Horizontal axis: matrix sparsity.]
Figure 4.2.8: Simulation runtime and ratio of unused to used Jacobian entries for
the 47 diode NLTL with 1 volt AC input voltage. The dashed line represents the
ratio of the magnitude of unused Jacobian entries to the total magnitude of all
Jacobian entries, while the solid line represents the corresponding runtime required
for convergence.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         2            1416        26          N/A
2         0.5       2            1424        26          -0.6
4         0.75      2            1364        25          3.7
8         0.875     2            1368        24          3.4
16        0.9375    2            1314        21          7.1
o1        0.907     2            1406        26          0.7
o2        0.8506    2            1370        26          3.2
o3        0.7932    2            1410        26          0.4

Table 4.2.6: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 47 diode NLTL circuit with an
input AC voltage level of 1 volt.
At the 2 volt AC level, the 16 block diagonal method again proved most effective
in reducing simulation runtime, as seen in Figure 4.2.9 and Table 4.2.7. Each method
once again required only 2 Jacobian calculations for simulation convergence, but
at this level the difference between the 16 block diagonal method and the other
approximations was more significant. Note that the level three off-diagonal method,
with a sparsity of about 0.79, is not able to provide a runtime improvement over not
using a Jacobian approximation. This level of off-diagonal blocking was the second best
approximation at the one volt level. Also, the Jacobian entry ratio for the 16 block
diagonal method is more than twice as high as it was for a 1 volt AC input, indicating
that the magnitude of the Jacobian entries in off-diagonal blocks is increasing with
an increase in drive level.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         2            1845        52          N/A
2         0.5       2            1891        54          -2.5
4         0.75      2            1869        53          -1.3
8         0.875     2            1728        45          6.3
16        0.9375    2            1488        31          19.4
o1        0.907     2            1725        44          6.5
o2        0.8506    2            1824        50          1.1
o3        0.7932    2            1878        52          -1.8

Table 4.2.7: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 47 diode NLTL circuit with an
input AC voltage level of 2 volts.
At an AC input voltage of 3 volts, the number of Jacobians required for simulation
convergence varies somewhat for the various Jacobian approximations. Again,
[Plot not reproduced. Horizontal axis: matrix sparsity.]
Figure 4.2.9: Simulation runtime and ratio of unused to used Jacobian entries for
the 47 diode NLTL with 2 volt AC input voltage. The dashed line represents the
ratio of the magnitude of unused Jacobian entries to the total magnitude of all
Jacobian entries, while the solid line represents the corresponding runtime required
for convergence.
[Plot not reproduced. Horizontal axis: matrix sparsity.]
Figure 4.2.10: Simulation runtime and ratio of unused to used Jacobian entries for
the 47 diode NLTL with 3 volt AC input voltage. The dashed line represents the
ratio of the magnitude of unused Jacobian entries to the total magnitude of all
Jacobian entries, while the solid line represents the corresponding runtime required
for convergence.
as seen in Figure 4.2.10 and Table 4.2.8, the 16 block diagonal approximation was
the best performer, requiring only 2 Jacobian evaluations throughout the simulation.
The 8 and 4 block diagonal techniques along with the level 1 and 3 off-diagonal
techniques required 3 Jacobians for convergence, while the level 2 off-diagonal
approximation required 4 Jacobian evaluations. The 2 block diagonal approximation
required 5 Jacobian evaluations, as did the method of using the full Jacobian matrix.
When the input AC level is increased to 4 volts, simulation convergence cannot
be reached using the full Jacobian matrix with 16 analysis frequencies. The level
two and level three off-diagonal approximations also fail to provide convergence.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         5            3370        52          N/A
2         0.5       5            3560        55          -5.6
4         0.75      3            2375        53          29.5
8         0.875     3            2360        51          30.0
16        0.9375    2            2087        63          38.1
o1        0.907     3            2171        40          35.6
o2        0.8506    4            2684        43          20.4
o3        0.7932    3            2349        51          30.3

Table 4.2.8: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 47 diode NLTL circuit with an
input AC voltage level of 3 volts.
However, the other Jacobian approximations do provide convergence, and the results
compare very favorably to simulation with more analysis frequencies. This
shows that the approximations can help avoid the need for increasing the number of
analysis frequencies or the use of excessive oversampling as discussed in Chapter 3.
Runtimes and Jacobian entry ratios are shown for the different Jacobian approximations
with an AC input level of 4 volts in Figure 4.2.11 and Table 4.2.9. Note that
at the highest level of matrix sparsity, almost half the magnitude of the Jacobian is
being neglected.
Very high input voltage
In order to obtain convergence at 27 dBm input power, a continuation method was
necessary. The input voltage level was increased in 2 volt increments up to the 14
[Plot not reproduced. Horizontal axis: matrix sparsity.]
Figure 4.2.11: Simulation runtime and ratio of unused to used Jacobian entries for
the 47 diode NLTL with 4 volt AC input voltage. The dashed line represents the
ratio of the magnitude of unused Jacobian entries to the total magnitude of all
Jacobian entries, while the solid line represents the corresponding runtime required
for convergence.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         N/C          N/C         N/C         N/C
2         0.5       18           23053       26          N/A
4         0.75      18           13397       25          N/A
8         0.875     7            5399        24          N/A
16        0.9375    3            2730        21          N/A
o1        0.907     6            4752        26          N/A
o2        0.8506    N/C          N/C         N/C         N/C
o3        0.7932    N/C          N/C         N/C         N/C

Table 4.2.9: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 47 diode NLTL circuit with an
input AC voltage level of 4 volts.
volt level which corresponds to 27 dBm. At the early stages of this process, a small
number of analysis frequencies were used. The number of analysis frequencies was
increased throughout the continuation process with increases in input voltage, with
40 analysis frequencies necessary for simulation convergence when the circuit was
finally driven at 14 volts AC. The measured and simulated output waveforms of this
circuit are shown in Figure 4.2.2. The Jacobian approximations discussed for the
distributed amplifier and the 10 diode soliton line were unable to provide simulation
convergence at this level.
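The idea of stepping the source level can be illustrated on a much smaller problem. The sketch below is not the harmonic balance continuation used in this work: it assumes a single diode fed through a resistor at DC, with hypothetical device values, and only shows how reusing each solution as the next starting guess keeps Newton's method within its iteration budget.

```python
import math

I_S, V_T, R_S = 1e-12, 0.025, 50.0  # hypothetical diode and source values


def residual(v, v_src):
    """Current balance at the diode node: resistor current plus diode current."""
    return (v - v_src) / R_S + I_S * (math.exp(v / V_T) - 1.0)


def residual_slope(v):
    """Derivative of the residual with respect to the node voltage."""
    return 1.0 / R_S + (I_S / V_T) * math.exp(v / V_T)


def newton(v_start, v_src, max_iter=200, tol=1e-9):
    """Plain Newton iteration; returns (solution, iterations) or (None, max_iter)."""
    v = v_start
    for k in range(1, max_iter + 1):
        v -= residual(v, v_src) / residual_slope(v)
        if abs(residual(v, v_src)) < tol:
            return v, k
    return None, max_iter


def continuation(v_target, step):
    """Ramp the source in fixed steps, reusing each solution as the next guess."""
    v, v_src, total_iters = 0.0, 0.0, 0
    while v_src < v_target:
        v_src = min(v_src + step, v_target)
        v, iters = newton(v, v_src)
        if v is None:
            return None, total_iters + iters
        total_iters += iters
    return v, total_iters


direct_solution, direct_iters = newton(0.0, 14.0)
stepped_solution, stepped_iters = continuation(14.0, 2.0)
print("direct:", direct_solution, direct_iters)
print("stepped:", stepped_solution, stepped_iters)
```

In this toy case the direct attempt at 14 volts exhausts its 200-iteration budget, while the 2 volt ramp converges. Increasing the number of analysis frequencies along the ramp, as described above, has no counterpart in this scalar example.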
4.2.3 Summary
The performance of the Newton-Raphson based harmonic balance technique for
simulating nonlinear transmission lines clearly may be improved by approximating the
Jacobian matrix, especially for low input drive levels. Not only is the time required
to decompose the matrix reduced, but the number of iterations required to achieve
convergence is also reduced. With the appropriate selection of a Jacobian
approximation, the direction of the resulting Newton step may be superior to that provided
by the full Jacobian matrix. By focusing the Newton update direction upon the
diagonal blocks of the matrix, the Jacobian approximations emphasize the most
important relationships between the state variables and the error function. As seen
with the 47 diode line, an appropriate Jacobian approximation may even extend the
ability of the simulator to converge with a given set of analysis frequencies. This
can greatly reduce the number of state variables to be solved, resulting in a much
smaller system of nonlinear equations with a corresponding reduction in matrix
decomposition and solution time.
Chapter 5
Harmonic Balance Simulation Using Inexact Newton Methods with a Preconditioned Jacobian Matrix
This chapter examines the use of inexact Newton methods for harmonic balance
simulation of the circuits discussed in Chapter 4. The block Jacobian approximations
described in Chapter 4 are used as preconditioners for iterative linear solution
of Equation 2.4.1. The quasi-Newton process used here is described in Section 2.4.1
and is the well known GMRES technique.
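For orientation, the standard form of a preconditioned inexact Newton step can be written as follows. Equation 2.4.1 is not reproduced in this section, so the notation here is assumed rather than quoted: F is the harmonic balance error function, x_k the current iterate, J the Jacobian, M_k the block approximation of J used as preconditioner, and eta_k a forcing term in [0, 1). Left preconditioning is also an assumption.

```latex
% Assumed notation, not taken from Equation 2.4.1.
\begin{align}
  M_k^{-1} J(x_k)\, s_k &\approx -M_k^{-1} F(x_k)
      && \text{(solved iteratively, e.g. by GMRES)} \\
  \lVert F(x_k) + J(x_k)\, s_k \rVert &\le \eta_k \, \lVert F(x_k) \rVert
      && \text{(inexact Newton stopping condition)} \\
  x_{k+1} &= x_k + s_k
\end{align}
```

The closer M_k is to J(x_k), the fewer linear iterations are typically needed per step, while a sparser M_k is cheaper to factor. That tradeoff is what the sparsity sweeps in this chapter explore.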
5.1 Simulation of a Distributed Amplifier
The distributed amplifier shown in Figure 4.1.1 was simulated using an iterative
linear solver to determine the quasi-Newton step at each iteration. Thirty-two analysis
frequencies were used in the simulations. Convergence was obtained only up to an
input power level of 15 dBm with this technique, as opposed to 20 dBm with the
exact Newton method. Results were obtained for input power levels of 0, 5, 10, and
15 dBm with an input frequency of 4 GHz.
The unknown circuit quantities are the gate and drain voltages for each of the
three transistors. With thirty-two analysis frequencies, the Jacobian will have rank
378. The Jacobian preconditioning techniques used for this circuit are basically
the same as the approximations used with the Newton-based harmonic balance
simulations described in Chapter 4.
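The quoted ranks can be reproduced with a simple count. The sketch assumes that each unknown circuit quantity contributes one real unknown at DC and two (real and imaginary parts) at every other analysis frequency, and that the diode lines carry one unknown per diode. These assumptions are not stated in this section, but they match the ranks given in this work for all three circuits. The function name is hypothetical.

```python
def harmonic_balance_unknowns(num_circuit_quantities, num_analysis_frequencies):
    """DC contributes one real unknown per quantity; every other frequency two."""
    per_quantity = 1 + 2 * (num_analysis_frequencies - 1)
    return num_circuit_quantities * per_quantity


amplifier_rank = harmonic_balance_unknowns(6, 32)   # 3 transistors x (gate, drain)
nltl10_rank = harmonic_balance_unknowns(10, 32)     # 10 diode NLTL, assumed 1 per diode
nltl47_rank = harmonic_balance_unknowns(47, 16)     # 47 diode NLTL, assumed 1 per diode
print(amplifier_rank, nltl10_rank, nltl47_rank)
```

This gives 378, 630, and 1457, the ranks quoted for the distributed amplifier, the 10 diode NLTL, and the 47 diode NLTL respectively.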
5.1.1 Low Input Power
Figure 5.1.1 shows the runtime in cpu seconds and the Jacobian entry ratio versus
the sparsity of the Jacobian preconditioner for an input level of 0 dBm. All
preconditioning techniques show an improvement in runtime over using no preconditioner
at all, i.e., matrix sparsity of zero. The runtime seems to be inversely related to
the amount of Jacobian information deleted by each preconditioning technique. Of
course, as seen in the figure, the magnitude of the deleted Jacobian entries is quite
small relative to the overall magnitude of all the Jacobian entries. While all the
preconditioners provided a runtime improvement, the best improvement was slightly
less than a 10 percent decrease in runtime. This moderately nonlinear circuit driven
at low power does not produce much of a distinction between the different
preconditioners, as most of the Jacobian information is located in the diagonal blocks
representing same frequency circuit quantities. The runtime variation is due to the
effect of the different matrix structures upon decomposition time for the
preconditioner. Table 5.1.1 shows that the number of Jacobian evaluations is the same for
all preconditioners, while the number of iterations required for convergence is equal
to or greater than the case of not preconditioning the Jacobian.
[Plot not reproduced. Horizontal axis: matrix sparsity.]
Figure 5.1.1: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the distributed amplifier circuit. The input power level is
0 dBm.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         3            235         16          N/A
2         0.5       3            223         16          5.1
4         0.75      3            218         16          7.2
8         0.875     3            214         17          8.9
16        0.9375    3            222         19          5.5
32        0.96875   3            222         20          5.5
o1        0.907     3            223         18          5.1
o2        0.8506    3            222         16          5.5

Table 5.1.1: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the distributed amplifier circuit with
an input power level of 0 dBm.
After increasing the input power to 5 dBm, an increase in the Jacobian entries
in off-diagonal blocks is seen, resulting in greater performance variation among the
different preconditioners. As shown in Figure 5.1.2 and Table 5.1.2, performance
is very similar to the 0 dBm input case up to a certain level of matrix sparsity,
but runtime is higher for the level two off-diagonal preconditioner than when no
preconditioner is used. The best runtime was provided by the level one off-diagonal
scheme. With only one level of off-diagonal blocks used, only two Jacobians are
needed for convergence as opposed to three for all other preconditioners.
At the lower input power levels, the diagonal block preconditioners seem to
provide a somewhat consistent improvement corresponding to the increase in the
sparsity of the preconditioners. Also, it is interesting to note that the 16 and 32 block
methods provide very similar performance at these levels. At the same time, the
different off-diagonal preconditioners seem to be less effective than the block diagonal
preconditioners of similar sparsity. This is due to the more difficult decomposition
for off-diagonal techniques.
5.1.2 High Input Power
For an input power level of 10 dBm, more variation is seen in the performance of
the preconditioners. As seen in Figure 5.1.3 and Table 5.1.3, the best runtime was
provided by the 16 block preconditioner, but the 32, 8, and 4 block preconditioners
provided a similar runtime improvement. The off-diagonal preconditioners, while
[Plot not reproduced. Horizontal axis: matrix sparsity.]
Figure 5.1.2: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the distributed amplifier circuit. The input power level is
5 dBm.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         3            241         26          N/A
2         0.5       3            228         26          5.4
4         0.75      3            222         27          7.9
8         0.875     3            223         35          7.5
16        0.9375    3            219         27          9.1
32        0.96875   3            214         18          11.2
o1        0.907     2            162         36          32.8
o2        0.8506    3            256         27          -6.2

Table 5.1.2: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the distributed amplifier circuit with
an input power level of 5 dBm.
they provide a slight runtime improvement of 8.4 percent, are still not as effective
as the block diagonal schemes.
[Plot not reproduced. Horizontal axis: matrix sparsity.]
Figure 5.1.3: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the distributed amplifier circuit. The input power level is
10 dBm.
When the input power is increased to 15 dBm, the preconditioners show much
more inconsistent performance, as seen in Table 5.1.4 and Figure 5.1.4. The 16
block preconditioner, which was the best performer for an input power level of 10
dBm, was unable to provide a runtime improvement over not using a preconditioner
at all. The 32 block diagonal preconditioner performed best in this case, providing
a runtime decrease of almost 15 percent. The level one off-diagonal scheme was the
worst performer for the 15 dBm input simulations, while the level two off-diagonal
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         3            238         26          N/A
2         0.5       3            224         26          5.9
4         0.75      3            213         22          10.5
8         0.875     3            210         21          11.8
16        0.9375    3            207         18          13
32        0.96875   3            214         24          10.1
o1        0.907     3            218         21          8.4
o2        0.8506    3            218         21          8.4

Table 5.1.3: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the distributed amplifier circuit with
an input power level of 10 dBm.
scheme provided only a slight runtime improvement. The two preconditioners which
resulted in poorer runtimes than those obtained by not using a preconditioner both
required an extra Jacobian calculation.
[Plot not reproduced. Horizontal axis: matrix sparsity.]
Figure 5.1.4: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the distributed amplifier circuit. The input power level is
15 dBm.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         2            175         45          N/A
2         0.5       2            165         45          5.7
4         0.75      2            161         46          8.0
8         0.875     2            158         43          9.7
16        0.9375    3            208         18          -18.9
32        0.96875   2            149         34          14.9
o1        0.907     3            224         16          -28.0
o2        0.8506    2            165         45          5.7

Table 5.1.4: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the distributed amplifier circuit with
an input power level of 15 dBm.
5.2 Simulation of a Short NLTL
The 10-diode NLTL described in Chapter 4 was simulated with input AC voltages
of 1 through 6 volts at one volt increments. The analysis frequencies chosen for
these simulations were DC, the fundamental frequency of 9 GHz, and the next 30
harmonics for a total of 32 analysis frequencies. Thus, the rank of the Jacobian
matrix is 630. Each of the blocking preconditioning techniques was used for these
simulations.
5.2.1 Low input voltage
For an input voltage of only one volt, all preconditioning methods worked fairly
well. All runtimes were shorter than that obtained by using no preconditioner, and
as seen in Table 5.2.1, the range of improvement in runtime is from 17.8 percent
for the 32 block diagonal preconditioner to 44 percent for the 16 block diagonal
preconditioner. Each of the off-diagonal preconditioners provided about 30 percent
runtime improvement versus not using a preconditioner, but they were not as
effective as the block diagonal preconditioners with similar levels of sparsity, as seen
in the table. This is most likely due to the inherent advantages of decomposing a
block-diagonal matrix. As seen in the table, the level two and level three off-diagonal
preconditioners provide convergence with the same number of Jacobian evaluations
and iterations as the 16 block diagonal preconditioner, yet do not provide quite as
great a runtime benefit.
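The runtime improvement column in the tables of this chapter is consistent with a relative reduction measured against the run without a preconditioner (blocking scheme 1). The following sketch recomputes the percentages from the runtimes of Table 5.2.1. The function and variable names are illustrative and not taken from the simulator.

```python
def runtime_improvement(baseline_secs: float, runtime_secs: float) -> float:
    """Percent reduction in runtime relative to the unpreconditioned run.

    A negative value means the preconditioned run was slower.
    """
    if baseline_secs <= 0:
        raise ValueError("baseline runtime must be positive")
    return 100.0 * (baseline_secs - runtime_secs) / baseline_secs


# Runtimes (cpu secs) from Table 5.2.1, 10 diode NLTL, 1 volt input.
baseline = 259
runtimes = {"2": 186, "4": 158, "8": 151, "16": 145,
            "32": 213, "o1": 181, "o2": 184, "o3": 190}

improvements = {scheme: round(runtime_improvement(baseline, secs), 1)
                for scheme, secs in runtimes.items()}
for scheme, percent in improvements.items():
    print(f"{scheme:>3}: {percent:5.1f} %")
```

The same function reproduces the negative entries in the other tables, for example a runtime of 208 seconds against a baseline of 175 seconds.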
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         4            259         19          N/A
2         0.5       4            186         19          28.2
4         0.75      4            158         19          39.0
8         0.875     4            151         24          41.7
16        0.9375    4            145         24          44.0
32        0.96875   6            213         32          17.8
o1        0.907     4            181         23          30.1
o2        0.8506    4            184         19          29.0
o3        0.7932    4            190         19          26.6

Table 5.2.1: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 1 volt.
Figure 5.2.1: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 1 volt.
With an increase in voltage to 2 volts, the preconditioners with the greatest
amount of sparsity are no longer able to provide simulation convergence. Figure 5.2.2
also shows that the ratio of unused to used Jacobian entries rises more quickly with
increasing sparsity than it did for a 1 volt input voltage. The runtime improvement
at this input level ranges from 16 to 40 percent, as seen in Table 5.2.2. In this
case the greatest runtime improvement was provided by the preconditioner which
uses four blocks along the diagonal. As seen in the figure, this level of sparsity
results in a much larger ratio of unused Jacobian entries to the total magnitude
of the Jacobian as compared to most of the other preconditioners which provided
simulation convergence. It is also interesting to note that the level three off-diagonal
preconditioner uses the most Jacobian information of any of the preconditioners,
yet still provides a 27 percent improvement in runtime. The level one off-diagonal
preconditioner provides a slight runtime improvement, but the improvement is not
as significant as it was for a 1 volt input. With an increase to 2 volts, a loss in
Jacobian accuracy of the level one off-diagonal preconditioner becomes evident, as
this preconditioner requires an additional Jacobian evaluation. The level two and
three off-diagonal preconditioners perform much as they did for a 1 volt input,
providing runtime decreases of 29.4 and 27.1 percent respectively.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         6            402         28          N/A
2         0.5       6            290         28          27.9
4         0.75      6            241         29          40.1
8         0.875     N/C          N/C         N/C         N/C
16        0.9375    N/C          N/C         N/C         N/C
32        0.96875   N/C          N/C         N/C         N/C
o1        0.907     7            339         38          15.7
o2        0.8506    6            284         28          29.4
o3        0.7932    6            293         28          27.1

Table 5.2.2: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 2 volts. N/C denotes no convergence.

Figure 5.2.2: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 2 volts.

For a 3 volt input AC voltage, the trends observed in increasing voltage from 1 to
2 volts continued, as shown in Figure 5.2.3 and Table 5.2.3. The ratio of unused to
used Jacobian entries continued to rise more quickly with increasing matrix sparsity
than it did for lower input levels. Also, the preconditioners which are most sparse
were unable to provide simulation convergence. In this case, the level 1 off-diagonal
preconditioner was also unable to provide simulation convergence.
Among the preconditioners which were able to provide convergence, the runtime
improvement ranges from 13.8 to 40.9 percent. Again, the 4 block preconditioner
seemed to provide the best runtime improvement, and this level of sparsity
corresponds to the highest ratio between unused Jacobian entries and the total magnitude
of all Jacobian entries. It is interesting to note that the percent improvement
provided by this preconditioner was somewhat constant for all three voltage levels. In
fact, when a particular preconditioner was able to provide simulation convergence,
the runtime improvement was relatively consistent for all three voltage levels.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         6            406         29          N/A
2         0.5       6            287         30          29.3
4         0.75      6            240         31          40.9
8         0.875     N/C          N/C         N/C         N/C
16        0.9375    N/C          N/C         N/C         N/C
32        0.96875   N/C          N/C         N/C         N/C
o1        0.907     N/C          N/C         N/C         N/C
o2        0.8506    6            294         34          27.6
o3        0.7932    7            350         37          13.8

Table 5.2.3: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 3 volts. N/C denotes no convergence.
Figure 5.2.3: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 3 volts.

Another important fact is that the diodes in the circuit are not being driven into
forward bias at these input levels, nor is the reverse bias large enough to result in
diode breakdown. Instead, the diodes are merely acting as nonlinear capacitances
with a nonlinear current-voltage relationship. Nevertheless, even with a low AC
input voltage of two or three volts, the most sparse preconditioners were not able
to provide simulation convergence. This indicates that there is enough frequency
coupling even at these low input levels to affect the convergence of the simulator.
5.2.2 High input voltage
When the input voltage was increased to 4 volts, the same effects associated with
increasing the input voltage level were seen. The ratio of the magnitude of the
unused Jacobian entries to the total magnitude of all Jacobian entries continued to rise
more quickly with an increase in matrix sparsity. Figure 5.2.4 and Table 5.2.4 show
that at 4 volts, the level 3 off-diagonal preconditioner is no longer the scheme which
retains the most Jacobian information. In fact, this is the preconditioner which
performs best at this input level, due to a smaller number of Jacobian evaluations
than any other preconditioner. The 4 block diagonal preconditioner remains the
scheme which uses the least amount of Jacobian information, and it provides a 40.7
percent runtime improvement.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         6            403         28          N/A
2         0.5       6            289         30          28.3
4         0.75      6            239         30          40.7
8         0.875     N/C          N/C         N/C         N/C
16        0.9375    N/C          N/C         N/C         N/C
32        0.96875   N/C          N/C         N/C         N/C
o1        0.907     N/C          N/C         N/C         N/C
o2        0.8506    7            333         35          17.4
o3        0.7932    4            185         21          54.1

Table 5.2.4: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 4 volts. N/C denotes no convergence.

Figure 5.2.4: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 4 volts.

At the input level of 5 volts, the off-diagonal preconditioning schemes seem to
work best. The best runtime improvement was obtained by the level one off-diagonal
preconditioner, but the level two and level three off-diagonal preconditioners also
performed well, as seen in Figure 5.2.5 and Table 5.2.5. Note the relatively high
values of the Jacobian ratios for these preconditioners. At the higher input levels,
these ratios begin to have a relationship with runtime that is directly proportional as
opposed to the inversely proportional relationship for low input voltages. Of course
the notable exception occurs for the level one off-diagonal preconditioning scheme,
which also produced the fastest runtime.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         7            469         31          N/A
2         0.5       7            339         38          27.7
4         0.75      7            282         38          39.9
8         0.875     N/C          N/C         N/C         N/C
16        0.9375    N/C          N/C         N/C         N/C
32        0.96875   N/C          N/C         N/C         N/C
o1        0.907     3            140         30          70.1
o2        0.8506    6            287         32          38.8
o3        0.7932    5            248         29          47.1

Table 5.2.5: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 5 volts. N/C denotes no convergence.
Figure 5.2.5: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 5 volts.

Once the input voltage was raised to a level of 6 volts, it became clear that the
iterative linear approach was nearing its limits in its ability to provide simulation
convergence. It was very difficult to find a preconditioner that was accurate enough
to provide convergence and simple enough to provide runtime improvement. All of
the previously used preconditioning schemes either did not provide convergence or
provided a much slower runtime than not using a preconditioner at all, as shown
in Figure 5.2.6 and Table 5.2.6. Higher level off-diagonal schemes were used until
one was found which provided convergence at a faster runtime. The level eight
off-diagonal scheme provided convergence at a 40 percent faster runtime than not using
a preconditioner. The level of sparsity for this preconditioner is very close to the
sparsity level of the two block diagonal preconditioner, yet the runtimes obtained
by the two schemes are vastly different. Also, the ratio of Jacobian entries increases
fairly rapidly in changing from the two block diagonal to the level eight off-diagonal
preconditioner. While both of these preconditioners consider some level of frequency
coupling, the level eight off-diagonal preconditioner includes all frequency coupling
between all circuit quantities within eight harmonics of each other. The 2 block
diagonal preconditioner, however, leaves out a majority of frequency coupling in the
middle of the output spectrum.

Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         4            482         42          N/A
2         0.5       103          4911        341         -919
4         0.75      72           2860        291         -493
8         0.875     131          4484        404         -830
16        0.9375    N/C          N/C         N/C         N/C
32        0.96875   N/C          N/C         N/C         N/C
o1        0.907     N/C          N/C         N/C         N/C
o2        0.8506    N/C          N/C         N/C         N/C
o3        0.7932    N/C          N/C         N/C         N/C
o8        0.5362    5            290         31          39.8

Table 5.2.6: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 10 diode NLTL circuit with an
input voltage of 6 volts. N/C denotes no convergence.

Figure 5.2.6: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the 10 diode NLTL. The input AC voltage level is 6 volts.
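The difference between the two schemes can be made concrete by listing which pairs of analysis frequencies each one retains. The sketch below works at the level of frequency indices only, assumes that an n-block diagonal scheme groups the 32 analysis frequencies into n contiguous, equally sized groups, and takes a level-k off-diagonal scheme to retain every pair of frequencies at most k harmonics apart. The function names are illustrative.

```python
def block_diagonal_pairs(num_freqs: int, num_blocks: int) -> set:
    """Frequency pairs (i, j) retained by an n-block diagonal scheme."""
    if num_blocks < 1 or num_freqs % num_blocks != 0:
        raise ValueError("num_freqs must be a positive multiple of num_blocks")
    width = num_freqs // num_blocks
    return {(i, j)
            for i in range(num_freqs)
            for j in range(num_freqs)
            if i // width == j // width}


def off_diagonal_pairs(num_freqs: int, level: int) -> set:
    """Frequency pairs (i, j) retained when coupling within `level` harmonics is kept."""
    return {(i, j)
            for i in range(num_freqs)
            for j in range(num_freqs)
            if abs(i - j) <= level}


def block_sparsity(pairs: set, num_freqs: int) -> float:
    """Fraction of frequency pairs that the scheme discards."""
    return 1.0 - len(pairs) / float(num_freqs * num_freqs)


two_block = block_diagonal_pairs(32, 2)
level_eight = off_diagonal_pairs(32, 8)

# Neighbouring harmonics in the middle of the spectrum:
print((15, 16) in two_block, (15, 16) in level_eight)
# Widely separated harmonics in the lower half of the spectrum:
print((0, 15) in two_block, (0, 15) in level_eight)
print(block_sparsity(two_block, 32), block_sparsity(block_diagonal_pairs(32, 32), 32))
```

Under these assumptions the 2 block scheme drops the coupling between harmonics 15 and 16 while keeping the coupling between DC and harmonic 15, whereas the level eight scheme does the opposite. The block-level sparsity of an n-block diagonal scheme comes out as 1 - 1/n, which agrees with the values tabulated for the diagonal schemes.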
5.3 Simulation of a Longer NLTL
Next the length of the NLTL was doubled to 20 diodes. The circuit was simulated
at the same input voltages as the shorter line. For this circuit, convergence was not
obtained for the higher input voltages.
As seen in Table 5.3.1 and Figure 5.3.1, the best performing preconditioners are
the 16 and 32 block diagonal schemes. These two preconditioners require only three
Jacobian calculations for convergence as compared to four Jacobian calculations for
the other preconditioners. Additionally, these are the preconditioners which contain
the least amount of Jacobian information.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         4            1317        21          N/A
2         0.5       4            733         20          44.3
4         0.75      4            504         20          61.7
8         0.875     4            395         19          70.0
16        0.9375    3            270         20          79.5
32        0.96875   3            272         24          79.3
o1        0.907     4            665         20          49.5
o2        0.8506    4            721         18          45.3
o3        0.7932    4            755         19          42.7

Table 5.3.1: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 20 diode NLTL circuit with an
input voltage of 1 volt.

Figure 5.3.1: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the 20 diode NLTL. The input AC voltage level is 1 volt.

Table 5.3.2 and Figure 5.3.2 show the performance of the preconditioners when
the input to the 20 diode NLTL is 2 volts. In this case, the 16 and 32 diagonal block
schemes again performed best. The performance improvement was quite dramatic,
with a 95 percent runtime improvement observed for both preconditioners. The
number of Jacobian calculations required for convergence was much smaller for these
schemes.

Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         30           11493       119         N/A
2         0.5       31           6584        134         42.7
4         0.75      N/C          N/C         N/C         N/C
8         0.875     N/C          N/C         N/C         N/C
16        0.9375    5            491         13          95.7
32        0.96875   6            523         31          95.4
o1        0.907     N/C          N/C         N/C         N/C
o2        0.8506    N/C          N/C         N/C         N/C
o3        0.7932    N/C          N/C         N/C         N/C

Table 5.3.2: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 20 diode NLTL circuit with an
input voltage of 2 volts.

Figure 5.3.2: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the 20 diode NLTL. The input AC voltage level is 2 volts.
Blocking  Matrix    Number of    Runtime     Number of   Runtime
Scheme    Sparsity  Jacobian     (cpu secs)  Iterations  Improvement (%)
                    Evaluations
1         0         41           15956       185         N/A
2         0.5       39           8356        174         47.6
4         0.75      N/C          N/C         N/C         N/C
8         0.875     N/C          N/C         N/C         N/C
16        0.9375    N/C          N/C         N/C         N/C
32        0.96875   7            634         40          96.0
o1        0.907     N/C          N/C         N/C         N/C
o2        0.8506    N/C          N/C         N/C         N/C
o3        0.7932    N/C          N/C         N/C         N/C

Table 5.3.3: Number of Jacobian calculations required for convergence and simulation
runtime for different blocking schemes for the 20 diode NLTL circuit with an
input voltage of 3 volts.
As seen in Table 5.3.3, the 32 diagonal block preconditioner again provided the
most dramatic runtime improvement for a 3 volt input, but the 16 diagonal block
scheme did not provide simulation convergence. In fact, only the 32 and 2 diagonal
block schemes were able to solve the system of equations. The Jacobian ratios
for these schemes were much higher than for lower input simulations as shown in
Figure 5.3.3.
Figure 5.3.3: Simulation runtime (solid line) and ratio of the magnitude of all unused
Jacobian entries to the magnitude of all Jacobian entries calculated (dashed line)
versus matrix sparsity for the 20 diode NLTL. The input AC voltage level is 3 volts.
5.4 Summary
Clearly, the choice of preconditioner for the inexact Newton method can have a
profound effect upon simulation performance. At low input drive levels, modest runtime
improvements were obtained from preconditioning, and most preconditioners
examined here exhibit similar behavior. This is because at low drive levels, the Jacobian
information needed for simulation convergence is located within the diagonal blocks
of the matrix. As the drive level was increased, the choice of preconditioners became
more crucial.
For the distributed amplifier circuit, the 32 block diagonal preconditioner
provided runtime improvements at all input levels. Although this method did not
always provide the best runtime compared to other methods, it was more consistent
than the other methods and was the best performer at the highest drive level. At low
input power levels, the only advantage provided by the preconditioners was that the
matrix decomposition was less expensive for the diagonal block techniques than for
the off-diagonal techniques. As input power was increased, matrix decomposition
time became less important, and the best performing preconditioners were those
that provided a better direction for the quasi-Newton update.
For the 10 diode NLTL, however, the 32 block diagonal preconditioner was one
of the worst performers, presumably because of the strong nonlinearity of the
circuit. Frequency coupling information is completely ignored by this preconditioner.
The other diagonal block preconditioners which contain some cross-frequency
information provided the best simulation performance for this circuit, especially at
low power. The off-diagonal blocking techniques performed well at low power but
were not as effective as the other diagonal blocking schemes. This is due to their
structure which is not as efficiently handled as the diagonal block structure.
However, at higher input levels, the off-diagonal preconditioners performed best, as the
frequency coupling information is more important. Block diagonal methods are not
as effective for frequency coupling because they consider the coupling unevenly.
The 20 diode NLTL was simulated only at low input levels. It is apparent that
frequency coupling does not play as large a part in the simulation convergence for
this circuit at low input levels, as the 32 and 16 block diagonal preconditioners
performed best. The performance improvements provided by the preconditioners in
this case were much greater than for smaller circuits.
The inexact Newton method with a preconditioned Jacobian matrix proved most
effective at lower input levels and with the largest circuits investigated. While
runtime improvements were seen for the distributed amplifier circuit, the exact Newton
method examined in Chapter 4 still provided the fastest simulation runtimes. Of
course, this is to be expected since the inexact Newton method is known to be best
suited for larger circuits. The inexact Newton method also failed to converge for
the largest circuit investigated, the 47 diode NLTL discussed in Chapter 4.
Chapter 6
Conclusions and Future Research
Harmonic balance analysis has been shown to be an effective technique for
steady-state analysis of a wide variety of microwave circuits. This study has shown that
many of the computational drawbacks of the harmonic balance technique can be
reduced through judicious choice of numerical techniques.
6.1 Discussion
In Chapter 3, Jacobian matrix approximations for Newton-Raphson based
simulation were developed. These approximations are also used as preconditioners for
iterative linear solvers used in inexact Newton solvers. The Jacobian approximations
take advantage of the special block structure of the matrix which allows the
information associated with different pairs of frequencies to be considered separately.
It was shown in Chapter 4 that when approximation techniques can be used,
the most effective approximations are often those which lead to the use of the least
amount of Jacobian information, especially at low drive levels. Using a diagonal
blocking scheme with the largest possible number of blocks frequently proved to be
the most efficient approximation. Not only does this simplify the matrix decomposition,
but it forces the Newton update direction to focus on the relationships between
the error function contributions and state variables at the same frequency. These
are often the most important relationships between the state variables and the error
function. As the drive level increases, off-diagonal blocks of the Jacobian become
more important and must be included in order to obtain convergence. For the 47
diode nonlinear transmission line, it was also shown that Jacobian approximations
can reduce the number of analysis frequencies required for convergence. This results
in fewer state variables and thus a much smaller Jacobian matrix.
In Chapter 5, the preconditioners used for an iterative linear solver were also
block-based, again with the most effective preconditioners being those which used
the same-frequency diagonal blocks of the Jacobian. In cases of higher drive levels,
where this preconditioner did not provide simulation convergence, preconditioners
which use some cross-frequency information proved useful. The iterative linear solver
proved not to be as effective for simulating at high drive levels, but when there are
a large number of unknown variables at low to moderate input levels, this technique
proved superior to the exact Newton method. Table 6.1.1 shows a runtime
comparison for the 20 diode NLTL excited at 1, 2, and 3 volts. Thirty-two analysis
frequencies were used in the simulation, resulting in a Jacobian matrix of rank 1260.
Clearly there is a distinct advantage to using an iterative linear approach for
larger circuits, but the ability of this method to provide convergence at high drive
levels is suspect. As the drive level increases and the iterative linear approach
becomes less effective, the runtime ratio between the two techniques decreases slightly.
Input     Exact Newton   Inexact Newton
Voltage   Runtime        Runtime
1         964            270
2         1069           491
3         1223           634

Table 6.1.1: Runtime required for harmonic balance simulation of a 20 diode NLTL
using the exact Newton method and the inexact Newton method. The runtimes
given are the best obtained from all available Jacobian approximations and
preconditioners.
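The runtime ratio between the two techniques can be computed directly from the entries of Table 6.1.1. The short sketch below does only that arithmetic. The variable names are illustrative.

```python
# Best runtimes from Table 6.1.1 for the 20 diode NLTL, keyed by input voltage.
exact_newton = {1: 964, 2: 1069, 3: 1223}
inexact_newton = {1: 270, 2: 491, 3: 634}

# Ratio of exact Newton runtime to inexact Newton runtime at each input voltage.
runtime_ratio = {volts: round(exact_newton[volts] / inexact_newton[volts], 2)
                 for volts in exact_newton}

for volts, ratio in runtime_ratio.items():
    print(f"{volts} V: exact / inexact = {ratio}")
```

The ratio stays above one at all three input voltages and becomes smaller as the input voltage rises.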
6.2 Suggestions for Further Research
This study shows the importance of the choice of Jacobian approximations and
preconditioners for harmonic balance simulation. For the circuits at hand, the diagonal
blocking techniques in which the number of blocks is either the same as or half
the number of analysis frequencies have been shown to be the best choice,
especially at lower drive levels. As the drive level is increased, the off-diagonal blocks
become more helpful. These blocking techniques should be applied to a wider variety
of microwave circuits to verify the generality of these conclusions. One circuit of
particular interest is a grid amplifier for quasi-optical power combining applications.
This is a distributed circuit with a large number of nonlinear devices, providing a
large Jacobian matrix with many nodes per frequency. With two state variables per
nonlinear device, the individual blocks of the Jacobian will be larger and more dense
than those of the nonlinear transmission lines discussed in this work, but the
preconditioning techniques should still be quite effective. Any additional frequency coupling
may be handled by using off-diagonal blocks.
Also, modern digitally modulated communications circuits are of interest.
Analysis of these circuits requires a very large number of analysis frequencies, producing
a correspondingly large Jacobian matrix. Recent publications [21], [22] have shown
that Krylov subspace techniques are particularly effective in simulating these
circuits, but little is known about how to choose an appropriate preconditioner.
The high-power simulation of the 47 diode NLTL required continuation methods
in order for simulation convergence to be achieved. The continuation method
implemented in TRANSIM uses a reduced frequency spectrum at low input power levels.
As the input level increases, the number of analysis frequencies used also increases.
This process could be improved by changing the Jacobian approximation method
during continuation. For low input levels, the approximation should be the diagonal
block Jacobian with the largest possible number of blocks, i.e., the same number of
blocks as analysis frequencies. As the drive level increases along with the number
of analysis frequencies, the off-diagonal blocks could then be used. Eventually, a
large number of off-diagonal blocks will become necessary before the simulation is
complete.
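One way to picture this suggestion is a continuation loop that starts with the sparsest approximation and moves to a denser one only when the current one fails to converge. The sketch below is hypothetical throughout: the scheme names, the `solve` callback, and the toy convergence rule are stand-ins for illustration and do not represent TRANSIM's interface or any simulation result.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def run_continuation(
    drive_levels: Sequence[float],
    schemes: Sequence[str],
    solve: Callable[[float, str], bool],
) -> List[Tuple[float, Optional[str]]]:
    """Step through drive levels, escalating the Jacobian approximation on failure.

    `schemes` is ordered from sparsest to densest. `solve(level, scheme)` is a
    hypothetical callback returning True when the simulation converges.
    """
    history: List[Tuple[float, Optional[str]]] = []
    start = 0  # index of the sparsest scheme still worth trying
    for level in drive_levels:
        chosen: Optional[str] = None
        for index in range(start, len(schemes)):
            if solve(level, schemes[index]):
                chosen = schemes[index]
                start = index  # do not return to a sparser scheme later
                break
        history.append((level, chosen))
        if chosen is None:
            break  # no available scheme converged; stop the continuation
    return history


schemes = ["block-diagonal", "off-diagonal-1", "off-diagonal-2"]


def toy_solve(level: float, scheme: str) -> bool:
    # Toy rule only: higher drive levels need a denser approximation.
    return schemes.index(scheme) >= level - 1


history = run_continuation([1, 2, 3], schemes, toy_solve)
print(history)
```

Keeping the index of the last successful scheme avoids retrying approximations that have already failed at a lower drive level, which matches the expectation that more off-diagonal blocks are needed as the drive level rises.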
References
[1] R. Gilmore and M.B. Steer, "Nonlinear circuit analysis using the method of
harmonic balance - a review of the art," International Journal of Microwave
and Millimeter-Wave Computer-Aided Engineering, Vol. 1, No. 1, 1991, pp.
22-37.
[2] V. Rizzoli and A. Neri, "State of the Art and Present Trends in Nonlinear
Microwave CAD Techniques," IEEE Transactions on Microwave Theory and
Techniques, Vol. 36, February 1988, pp. 343-65.
[3] R. Hicks and P. Khan, "Numerical analysis of nonlinear solid-state device
excitation in microwave circuits," IEEE Transactions on Microwave Theory and
Techniques, Vol. 30, March 1982, pp. 251-9.
[4] A. Kerr, "Noise and loss in balanced and subharmonically pumped mixers: part
1 - theory," IEEE Transactions on Microwave Theory and Techniques, Vol. 27,
December 1979, pp. 938-43.
[5] G. Camacho-Penalosa, "Numerical steady-state analysis of nonlinear microwave
circuits with periodic excitation," IEEE Transactions on Microwave Theory and
Techniques, Vol. 31, September 1983, pp. 724-30.
[6] A. Kerr, "A technique for determining the local oscillator waveforms in a
microwave mixer," IEEE Transactions on Microwave Theory and Techniques, Vol.
23, October 1975, pp. 828-31.
[7] P.L. Heron and M.B. Steer, "Jacobian Calculation Using the Multidimensional
Fast Fourier Transform in the Harmonic Balance Analysis of Nonlinear
Circuits," IEEE Transactions on Microwave Theory and Techniques, Vol. 38, April
1990, pp. 429-31.
[8] C.R. Chang, P.L. Heron, and M.B. Steer, "Harmonic balance and frequency
domain simulation of nonlinear microwave circuits using the block Newton
method," IEEE Transactions on Microwave Theory and Techniques, Vol. 38,
April 1990, pp. 431-4.
[9] P.L. Heron, C.R. Chang, and M.B. Steer, "Control of Aliasing in the Harmonic
Balance Simulation of Nonlinear Microwave Circuits," 1989 IEEE MTT-S
International Symposium Digest, June 1989, pp. 355-358.
[10] V. Rizzoli et al., "The exploitation of sparse matrix techniques in conjunction
with the piecewise harmonic-balance method for nonlinear microwave circuit
analysis," 1990 IEEE MTT-S International Symposium Digest, June 1990, pp.
1295-1298.
[11] H. Yeager and R. Dutton, "Improvement in norm-reducing Newton methods
for circuit simulation," IEEE Transactions on Computer Aided Design of
Integrated Circuits and Systems, May 1989, pp. 538-546.
[12] V. Rizzoli and A. Neri, "Expanding the power-handling capabilities of
harmonic-balance analysis by a parametric formulation of the MESFET
model," Electronics Letters, Vol. 26, August 16, 1990, pp. 1359-1361.
[13] V. Rizzoli et al., “A highly efficient p-n junction model for use in
harmonic-balance simulation,” 19th European Microwave Conference, 1989, pp. 979-984.
[14] V. Rizzoli et al., “State-of-the-art harmonic-balance simulation of forced
nonlinear microwave circuits by the piecewise technique,” IEEE Transactions on
Microwave Theory and Techniques, Vol. 40, January 1992, pp. 12-28.
[15] V. Rizzoli et al., “A hierarchical harmonic-balance technique for the
efficient simulation of large-size nonlinear microwave circuits,” 25th European
Microwave Conference, 1995, pp. 615-619.
[16] R. Melville, P. Feldmann, J. Roychowdhury, “Efficient multi-tone distortion
analysis of analog integrated circuits,” IEEE 1995 Custom Integrated Circuits
Conference, pp. 241-244.
[17] P. Feldmann, B. Melville, D. Long, “Efficient frequency domain analysis of
large nonlinear analog circuits,” 1996 IEEE MTT-S International Microwave
Symposium Digest, June 1996, pp. 461-464.
[18] V. Rizzoli et al., “Harmonic-balance simulation of strongly nonlinear very
large-size microwave circuits by inexact Newton methods,” 1996 IEEE MTT-S
International Symposium Digest, June 1996, pp. 1357-1360.
[19] R. Freund, G. Golub, and N. Nachtigal, “Iterative solution of linear systems,”
Acta Numerica, 1991, pp. 57-100.
[20] Y. Saad and M. Schultz, “GMRES: a generalized minimal residual method for
solving nonsymmetric linear systems,” SIAM Journal on Scientific and Statistical
Computing, Vol. 7, July 1986, pp. 856-869.
[21] V. Rizzoli et al., “Nonlinear processing of digitally modulated carriers by the
inexact-Newton harmonic-balance technique,” Electronics Letters, Vol. 33,
October 9, 1997, pp. 1760-1761.
[22] V. Rizzoli et al., “Fast and robust inexact Newton approach to the
harmonic-balance analysis of nonlinear microwave circuits,” IEEE Microwave and Guided
Wave Letters, Vol. 7, October 1997, pp. 359-361.
[23] R. Telichevesky, K. Kundert, I. Elfadel, and J. White, “Fast simulation
algorithms for RF Circuits,” IEEE 1996 Custom Integrated Circuits Conference,
pp. 437-444.
[24] K. Eickhoff and W. Engl, “Levelized incomplete LU factorization and its
application to large-scale circuit simulation,” IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, Vol. 14, June 1995, pp. 720-727.
[25] M. Case, “Nonlinear Transmission Lines for Picosecond Pulse, Impulse and
Millimeter-Wave Harmonic Generation,” PhD dissertation, University of
California at Santa Clara, 1993.
[26] C. Christoffersen, M. Ozkar, et al., “State Variable-Based Transient Analysis
Using Convolution,” accepted for publication.