Patent Translate
Powered by EPO and Google
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
TECHNICAL FIELD The present invention relates to a method of synthesizing a three-dimensional
sound field.
BACKGROUND OF THE INVENTION The processing of audio signals to reproduce a three-dimensional
sound field for replay to a listener having two ears has been a long-standing goal for
inventors. One approach has been to surround the listener with multiple sound sources, such
as loudspeakers, using multiple sound reproduction channels. Another approach has been to make
recordings for headphone listening using a dummy head with microphones located in the
auditory canals of its artificial ears. A particularly promising approach to binaural synthesis
of such a sound field is described in European patent EP-B-0689756, which uses only one pair
of loudspeakers and two signal channels. The sound field nevertheless carries directional
information that allows the listener to perceive a sound source as appearing anywhere on a sphere
surrounding the listener's head, which is located at the center of the sphere.
A monophonic sound source can be processed digitally via a head response transfer function
(HRTF), and the resulting stereo pair of signals carries natural three-dimensional sound cues,
as shown in FIG. HRTFs can be implemented with a pair of filters, one associated with the
left-ear response and the other with the right-ear response, often referred to as a binaural
placement filter. These sound cues are introduced naturally by the acoustic properties of the
head and ears when we listen to sounds in real life. They include the interaural amplitude
difference (IAD), the interaural time difference (ITD) and spectral shaping by the outer ear.
When this stereo signal pair is introduced efficiently into the appropriate ears of the listener,
for example by means of headphones, the listener perceives the original sound to be at a position
in space corresponding to the spatial location associated with the particular HRTF used for the
signal processing.
When listening through loudspeakers instead of headphones, as shown in FIG. 2, the signals are
not conveyed efficiently into the ears, and "transaural acoustic crosstalk" is present which
inhibits the three-dimensional sound cues. This means that the left ear hears part of what the
right ear is hearing (after a small additional time delay of about 0.25 ms), as shown in FIG. In
order to prevent this from happening, it is known to generate a suitable "crosstalk cancellation"
or "crosstalk compensation" signal from the opposite loudspeaker. These signals are equal in
magnitude to the crosstalk signals and inverted (opposite in phase), and are designed to cancel
them. There are also more advanced schemes that compensate for the second-order (and
higher-order) effects of the cancellation signals, which themselves contribute secondary
crosstalk; these methods are known in the art. A typical prior-art configuration (M. R.
Schroeder, "Models of hearing", Proc. IEEE, vol. 63, issue 9, 1975, pp. 1332-1350) is shown in FIG.
When HRTF processing and crosstalk cancellation are carried out in sequence (FIG. 5) and
correctly, using high-quality HRTF source data, the effects can be quite remarkable. For example,
it is possible to move the image of a sound source around the listener in a complete horizontal
circle, beginning in front, moving round behind the listener, and returning to the front via the
left-hand side. It is also possible to make the sound source move in a vertical circle around the
listener, or to make it appear to come from any selected position in space. However, some
particular positions are more difficult to synthesize than others, some for what are believed to
be practical reasons and some for psychoacoustic reasons.
For example, moving a sound source directly upwards or downwards is more effective at the side of
the listener (azimuth 90°) than directly in front (azimuth 0°). This is probably because there is
more left-right difference information for the brain to work with. Similarly, it is difficult to
differentiate between a sound source directly in front of the listener (azimuth 0°) and a source
directly behind the listener (azimuth 180°). This is because there is no time-domain information
for the brain to work with (ITD = 0), and the only other information available to the brain,
spectral data, is quite similar at both of these positions. In practice, more high-frequency (HF)
energy is perceived when the source is in front of the listener, because high frequencies from a
frontal source are reflected into the auditory canal from the rear wall of the outer ear, whereas
those from a rear source cannot diffract sufficiently around the pinna. In practice, the limiting
feature in the reproduction of three-dimensional sound from two loudspeakers is the transaural
crosstalk cancellation, for which there are three main factors:
1. HRTF quality. The quality of the 30° HRTF (of FIG. 3) from which the cancellation
algorithm (of FIG. 4) is derived is important. Both the artificial head from which it is derived
and the measurement methodology must be adequate.
2. Signal processing algorithm. The algorithm must be implemented effectively.
3. HF effects. In theory, "perfect" crosstalk cancellation is possible, but not in practice.
Setting aside the differences between individual listeners and the artificial head from which the
algorithm HRTFs are derived, the difficulty relates to high-frequency components above a few kHz.
When optimal cancellation is arranged to occur at each ear of the listener, the crosstalk wave and
the cancellation wave combine to form a node. However, the node exists only at a single point in
space, and as one moves away from it the two signals are no longer correctly timed relative to
each other, so the cancellation is imperfect. With sufficient misalignment, the signals can
actually combine to produce a resultant signal that, at some frequencies, is larger than the
original unwanted crosstalk itself. In practice, however, the head acts as an effective barrier
to the higher frequencies because of its size relative to the wavelengths in question, so
transaural crosstalk is naturally limited and the problem is not as severe as might be expected.
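The position sensitivity described in item 3 can be illustrated numerically. The following is a
minimal sketch and not part of the original disclosure. It assumes an idealized crosstalk wave and
an equal-magnitude, inverted cancellation wave, both pure sinusoids, with the cancellation wave
arriving late by a hypothetical timing error. Under those assumptions the residual amplitude,
relative to the uncancelled crosstalk, is 2|sin(pi * f * tau)|. The function name and the 50
microsecond example value are illustrative choices only.

```python
import math


def residual_ratio(freq_hz: float, timing_error_s: float) -> float:
    """Residual amplitude of crosstalk plus a mistimed cancellation wave,
    relative to the crosstalk alone (idealized equal-magnitude sinusoids).

    sin(w*t) - sin(w*(t - tau)) has peak amplitude 2*|sin(w*tau/2)|.
    """
    return 2.0 * abs(math.sin(math.pi * freq_hz * timing_error_s))


# Hypothetical timing error of 50 microseconds, i.e. roughly 17 mm of
# extra path length at 343 m/s. Chosen purely for illustration.
tau = 50e-6
for f in (200.0, 1000.0, 5000.0, 10000.0):
    print(f"{f:7.0f} Hz: residual = {residual_ratio(f, tau):.2f} x crosstalk")
```

With this timing error the residual is small at low frequencies but exceeds the original crosstalk
above about 3.3 kHz (where the ratio passes 1), and reaches twice the crosstalk at 10 kHz, which
is consistent with the text's observation that misaligned signals can add rather than cancel.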
Several attempts have been made to limit the spatial dependence of crosstalk cancellation
systems at these higher frequencies. Cooper and Bauck (US Pat. No. 4,893,342) introduced a
high-cut filter into their crosstalk cancellation scheme so that the HF components (above about
8 kHz) are not actually cancelled at all but are simply fed directly to the loudspeakers, as in
ordinary stereo. The problem with this is that the brain interprets the position of the HF sounds
(i.e. localizes them) as being at the loudspeakers themselves, because both ears hear correlated
signals from each individual loudspeaker. It is true that these frequencies are difficult to
localize accurately, but the overall effect is nevertheless to produce HF sounds that originate
in front of the listener for all required spatial positions, and this inhibits the illusion when
trying to synthesize rear-positioned sounds.
Even when crosstalk is optimally cancelled at high frequencies, the listener's head cannot be
guaranteed to be correctly positioned, so the uncancelled HF components are again localized by
the brain at the loudspeakers themselves and therefore appear to originate in front of the
listener, making rear synthesis difficult to achieve. The following additional practical aspects
also hinder optimal transaural crosstalk cancellation.
1. Loudspeakers often do not have a good frequency response.
2. Audio systems may not have well-matched L-R gains.
3. The computer configuration (software presets) may be set so as to have an incorrect L-R
balance.
Many sound sources used in computer games contain predominantly low-frequency energy (e.g.
explosions and "crash" effects), and transaural crosstalk cancellation is adequate for these
long-wavelength sources, so the above limitations are not necessarily serious.
However, it is very difficult to perform effective crosstalk cancellation if the source contains
predominantly higher-frequency components, such as bird song, and especially if it consists of
relatively pure sinusoidal sounds. Bird song, insect calls and the like are used to great effect
in creating ambience in a game, which often requires such effects to be placed in the rear
hemisphere. This is particularly difficult to do using currently known methods.
Further, improved methods of sound reproduction are described in U.S. Pat. No. 4,219,696, U.S.
Pat. No. 4,524,451 and U.S. Pat. No. 4,845,775 which illustrate the background art in this
technical field.
SUMMARY OF THE INVENTION According to the present invention, there is provided a method of
synthesizing a three-dimensional sound field using a system having a pair of front loudspeakers
located in front of a preferred position of a listener and a pair of rear loudspeakers located
behind the preferred position, the method comprising:
a) determining the desired position of a placed sound source in the three-dimensional sound
field relative to the preferred position;
b) providing a binaural pair of signals, comprising left and right channels, corresponding to
the placed sound source in the three-dimensional sound field;
c) controlling the gain of the left channel signal of the binaural pair using front signal gain
control means and rear signal gain control means to provide respective gain-controlled front left
and rear left signals;
d) controlling the gain of the right channel signal of the binaural pair using front signal gain
control means and rear signal gain control means to provide respective gain-controlled front
right and rear right signals;
e) controlling the ratio of front signal gain to rear signal gain as a function of the desired
position of the placed sound source relative to the preferred position; and
f) performing transaural crosstalk compensation on the gain-controlled front signal pair and
rear signal pair using respective transaural crosstalk compensation means, and using these
compensated signal pairs to drive the corresponding loudspeakers in use.
The present invention relates to the reproduction of three-dimensional sound from a
multi-speaker system, in particular a four-speaker system, and provides more effective rear
placement of virtual sound sources.
Although present two-speaker three-dimensional sound systems have advantages over multi-speaker
systems, for the obvious reasons of cost, wiring complexity and the need for extra audio drivers,
the present invention takes advantage of the fact that a proportion of multimedia users already
own, or will own, a four-speaker configuration to cater for alternative formats such as Dolby
Digital(TM). (It should be noted, however, that such formats are only two-dimensional "surround"
systems which, unlike the present invention, do not allow true three-dimensional source
placement.) The present invention allows conventional two-speaker three-dimensional source
material to be replayed over such four (or more) speaker systems, providing true
three-dimensional virtual source placement. The invention is particularly valuable for achieving
effective rear placement of virtual sound sources that are rich in HF (high frequencies), thereby
providing the listener with enhanced three-dimensional sound. This is achieved in a very simple,
but effective, manner.
As shown in FIG. 12, it is useful to define a spatial reference system with respect to the
listener. FIG. 12 shows the listener's head and shoulders surrounded by a reference sphere of
unit dimensions. The horizontal plane cutting the sphere is shown in FIG. 12 together with the
horizontal axes. The front-rear axis is P-P' and the lateral axis is Q-Q', both passing through
the center of the listener's head. The convention chosen here for azimuth, as in the prior art,
is that it is measured from the front pole (P) towards the rear pole (P'), with positive values
on the right side of the listener and negative values on the left side. For example, the right
pole Q' is at an azimuth of +90° and the left pole Q is at -90°. The rear pole P' is at +180°
(and equally at -180°). The median plane bisects the listener's head vertically in the front-back
direction (running along the axis P-P'). The elevation angle is measured directly upward (or
downward, as appropriate) from the horizontal plane.
In principle, a two-channel three-dimensional sound signal can be replayed effectively through
(a) a pair of speakers in front (±30°); (b) a pair of speakers at the rear (±150°), as described
in UK Patent GB2311706B; or (c) both of these. However, when the crosstalk cancellation is less
than fully effective, for the reasons mentioned previously such as poor L-R balance, the virtual
sound image either moves towards the speaker positions or is "smeared out" between the speakers.
Under extreme conditions, the image breaks down and becomes unclear. The following two examples
illustrate this point.
EXAMPLE 1 If, for example, a frontal virtual sound source at an azimuth of +45° is reproduced by
a conventional (front) pair of speakers at ±30°, and the transaural crosstalk cancellation is
less than optimal for any of the above reasons, the sound image is pulled towards the speaker
positions, in particular towards the near-ear speaker (i.e. the right speaker position, +30°).
Although this is clearly undesirable, the positional "error" from +45° to +30° is relatively
small. However, if the virtual sound source is at the rear, for example at +150°, the same effect
occurs but the "error" is very large (+150° to +30°), which destroys the image and pulls the rear
image to the front of the listener.
EXAMPLE 2 If, for example, a rear virtual sound source at an azimuth of +135° is reproduced by a
rear pair of speakers at ±150° (FIG. 6), and the transaural crosstalk cancellation is likewise
less than optimal, the sound image is likewise pulled towards the speaker positions, in
particular towards the near-ear speaker (i.e. the right speaker position, +150°). In this case,
the positional "error" from +135° to +150° is relatively small. However, if the virtual sound
source is at the front, for example at +30°, the same effect occurs but the "error" is very large
(+30° to +150°), which destroys the image and pulls the frontal image behind the listener.
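The size of the positional "error" in the two examples can be checked with a few lines of
arithmetic. The sketch below is illustrative only and is not part of the disclosed method. It
assumes, as a simplification, that a fully "pulled" image lands exactly on the nearer speaker of a
symmetric pair; the function names are hypothetical.

```python
def angular_distance(a_deg: float, b_deg: float) -> float:
    """Smallest absolute angle between two azimuths, in degrees (0..180)."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def pull_error(source_az: float, speaker_az: float) -> float:
    """Positional error if the image collapses onto the nearer speaker
    of a symmetric pair at +speaker_az and -speaker_az (an assumption)."""
    return min(angular_distance(source_az, s) for s in (speaker_az, -speaker_az))


# (source azimuth, speaker pair half-angle) for the cases in Examples 1 and 2.
for source, speakers in ((45, 30), (150, 30), (135, 150), (30, 150)):
    print(f"source {source:+4d} deg, speakers +/-{speakers} deg: "
          f"error {pull_error(source, speakers):.0f} deg")
```

Under this simplification the error is 15° when the source lies in the same hemisphere as the
speaker pair and 120° when it lies in the opposite hemisphere, matching the "relatively small" and
"very large" errors described above.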
From the above two examples, it can be inferred that a rear speaker pair is better than a front
pair for reproducing rear virtual images, and that a front speaker pair is better than a rear
pair for reproducing front images. However, consider now a third option, in which the front and
rear pairs are used together, at the same volume and at the same distance from the listener.
Under these conditions, when the transaural crosstalk cancellation is less than optimal, the
sound image is pulled towards both the front and rear speaker positions, and the resulting
breakdown of the sound image leaves it confused and vague.
In contrast to these unsatisfactory methods, the present invention takes advantage of this
"image pull" effect by selectively directing frontal virtual sound sources to the front speaker
pair and rear virtual sound sources to the rear speaker pair. As a result, if the crosstalk
cancellation is less than fully effective, the virtual source is pulled into the correct
hemisphere rather than being destroyed. This steering is realized, for example, by an algorithm
that uses the azimuth angle of each virtual sound source to determine the proportions of the L-R
signal pair that are sent to the front and rear speakers respectively. The explanation is as
follows.
a) A four-speaker configuration as shown in FIG. 7 is arranged in the horizontal plane, with the
speakers placed at ±30° and ±150°, symmetrically about the median plane. (These parameters can of
course be chosen to suit a variety of different listening arrangements.)
b) The left channel signal source is sent to both speakers on the left via front and rear gain
control means respectively, followed by front and rear transaural crosstalk cancellation means.
c) The right channel signal source is sent to both speakers on the right via front and rear gain
control means respectively, followed by front and rear transaural crosstalk cancellation means.
d) The front and rear gain control means are controlled simultaneously and in a complementary
manner, preferably so as to provide overall unity gain (or close to it) for the combined front
and rear elements, so that little or no change in sound intensity is perceived when the position
of the sound image is moved around the listener.
A schematic of the invention is shown in FIG. (For clarity, only a single sound source is shown;
multiple sources can of course be used, as described below.) Referring to FIG. 8, signal
processing is performed as follows.
1. The sound source is sent to an HRTF "binaural placement" filter according to the details of
FIG. 1 to generate both L and R channels for subsequent processing.
2. The L and R channel pair is sent to (a) front gain control means and (b) rear gain control
means.
3. The front and rear gain control means control the gains of the front and rear channel pairs
respectively: one particular gain factor is applied equally to the front L and R channel pair,
and another particular gain factor is applied equally to the rear L and R channel pair.
4. The L and R outputs of the front gain control means are sent to the front crosstalk
cancellation means, which in turn drives the respective front speakers.
5. The L and R outputs of the rear gain control means are sent to the rear crosstalk
cancellation means, which in turn drives the respective rear speakers.
6. The gains of the front and rear gain control means are controlled so as to be determined by
the azimuth of the virtual sound source according to a simple predetermined algorithm.
7. The sum of the gains of the front and rear gain control means is typically unity. (This need
not be so, however, if personal preference calls for a front-biased or rear-biased effect.)
If multiple sound sources are to be created in accordance with the present invention, each source
must be processed individually up to the transaural crosstalk cancellation (TCC) stage according
to the signal paths shown in FIG. The respective front right, front left, rear right and rear
left signals must then be summed and sent to the front and rear TCC stages at nodes FR, FL, RR
and RL (of FIG. 8).
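The gain and summing stages described above can be sketched in a few lines. This is a minimal
illustration under stated assumptions rather than the patented implementation: the binaural L and
R signals for each source are assumed to have been produced already by the HRTF placement filter,
the rear gain is assumed to be the complement of the front gain (unity sum), and the TCC stages
that follow nodes FL, FR, RL and RR are deliberately left out. The function name and data layout
are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# One source: (binaural left samples, binaural right samples, front gain 0..1).
Source = Tuple[Sequence[float], Sequence[float], float]


def mix_to_nodes(sources: Sequence[Source]) -> Dict[str, List[float]]:
    """Apply front/rear gains per source and sum all sources at the four
    nodes FL, FR, RL, RR that feed the front and rear TCC stages."""
    length = max(len(left) for left, _, _ in sources)
    nodes = {name: [0.0] * length for name in ("FL", "FR", "RL", "RR")}
    for left, right, front_gain in sources:
        rear_gain = 1.0 - front_gain  # assumption: gains sum to unity
        for i, (l, r) in enumerate(zip(left, right)):
            # The same gain factor is applied to both channels of a pair.
            nodes["FL"][i] += front_gain * l
            nodes["FR"][i] += front_gain * r
            nodes["RL"][i] += rear_gain * l
            nodes["RR"][i] += rear_gain * r
    return nodes


# Two toy sources: one fully frontal, one mostly to the rear.
frontal = ([1.0, 0.0], [0.5, 0.0], 1.0)
rearward = ([0.0, 1.0], [0.0, 1.0], 0.25)
print(mix_to_nodes([frontal, rearward]))
```

Note that only one binaural placement is needed per source; the front and rear paths differ only
in the gain applied before summation.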
There are many methods that can be used as the algorithm controlling the azimuth dependence of
the front and rear gain control means. The following examples use the descriptive term
"crossfade", because the overall effect is to fade progressively between the front and rear
speakers in an azimuth-dependent manner. These examples were chosen to show the most useful
variants of the algorithm, which has three main factors: (a) linearity, (b) crossfade region and
(c) crossfade modulus. They are shown in FIG. 9, FIG. 10 and FIG.
FIG. 9(a) shows the simplest crossfade algorithm, in which the front gain factor is unity at 0°
and decreases linearly with azimuth angle to zero at 180°. The rear gain factor is the complement
of this. At 90° azimuth, the front and rear gain factors are equal (0.5). FIG. 9(b) shows a
linear crossfade algorithm similar to that of FIG. 9(a), but chosen so that the crossfade begins
at 90°. Thus the front gain factor is unity between 0° and 90°, and then decreases linearly with
azimuth angle to zero at 180°. Again, the rear gain factor is the complement of this.
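The two linear crossfades of FIG. 9 can be written as a single function with one parameter. The
following is a minimal sketch, assuming azimuths in degrees within ±180°, left-right symmetry
(only the magnitude of the azimuth matters), and a rear gain equal to one minus the front gain.
The function and parameter names are illustrative and do not appear in the source.

```python
def front_gain_linear(azimuth_deg: float, fade_start_deg: float = 0.0) -> float:
    """Front gain for a linear crossfade.

    fade_start_deg = 0  corresponds to the description of FIG. 9(a);
    fade_start_deg = 90 corresponds to the description of FIG. 9(b).
    """
    a = min(abs(azimuth_deg), 180.0)  # left and right treated alike
    if a <= fade_start_deg:
        return 1.0
    return (180.0 - a) / (180.0 - fade_start_deg)


def rear_gain_linear(azimuth_deg: float, fade_start_deg: float = 0.0) -> float:
    """Rear gain as the complement of the front gain (unity sum assumed)."""
    return 1.0 - front_gain_linear(azimuth_deg, fade_start_deg)


for az in (0, 45, 90, 135, 180):
    print(az,
          front_gain_linear(az), rear_gain_linear(az),             # FIG. 9(a)
          front_gain_linear(az, 90.0), rear_gain_linear(az, 90.0))  # FIG. 9(b)
```

With a fade start of 0° the gains cross at 0.5 at 90° azimuth, as stated above; with a fade start
of 90° the crossing point moves to 135°.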
FIG. 10(a) shows an algorithm similar to that of FIG. 9(b), but with the crossfade to the rear
channel limited to 80%. Thus the front gain factor is unity between 0° and 90°, and then
decreases linearly with azimuth angle to 0.2 at 180°. Again, the rear gain factor is the
complement of this. FIG. 10(b) shows a format somewhat similar to that of FIG. 9(a), except that
the crossfade function is non-linear. A raised cosine function is used for the crossfade, with
the advantage that there are no abrupt transition points at which the rate of change of the
crossfade suddenly reverses (e.g. at 0° and 180° in the previous examples).
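The variants above differ only in where the crossfade starts, how deep it goes, and whether its
shape is linear or a raised cosine, so they can be sketched as one parameterized function. This is
an illustrative sketch under the same assumptions as before (degrees within ±180°, left-right
symmetry, rear gain equal to one minus front gain); the parameter names are hypothetical, and the
raised-cosine form used here, (1 - cos(pi * t)) / 2, is one common choice rather than a formula
given in the source.

```python
import math


def front_gain(azimuth_deg: float,
               fade_start_deg: float = 0.0,
               depth: float = 1.0,
               raised_cosine: bool = False) -> float:
    """Front gain for a crossfade with adjustable region, modulus and shape.

    depth is the fraction faded to the rear at 180 degrees (0.8 = "80%").
    """
    a = min(abs(azimuth_deg), 180.0)
    if a <= fade_start_deg:
        return 1.0
    # Progress through the crossfade region, 0 at its start and 1 at 180 deg.
    t = (a - fade_start_deg) / (180.0 - fade_start_deg)
    if raised_cosine:
        # Zero slope at both ends, so no abrupt reversal of the rate of change.
        t = (1.0 - math.cos(math.pi * t)) / 2.0
    return 1.0 - depth * t


# Linear, starting at 90 deg, limited to 80% (as described for FIG. 10(a)).
print([round(front_gain(az, 90.0, 0.8), 3) for az in (0, 90, 135, 180)])
# Raised cosine over the full range (as described for FIG. 10(b)).
print([round(front_gain(az, raised_cosine=True), 3) for az in (0, 45, 90, 135, 180)])
```

The rear gain in each case is `1.0 - front_gain(...)`, so with an 80% limit the rear gain never
exceeds 0.8 and the front speakers always carry at least 20% of the signal.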
FIG. 11(a) shows a non-linear crossfade in which the crossfade starts at 90° (analogous to the
linear method of FIG. 9(b)), and the accompanying panel shows a similar non-linear crossfade
limited to 80%. In the above examples, the algorithm controlling the front and rear gain control
means is a function of the azimuth angle only and does not depend on the elevation angle.
However, such an algorithm has the disadvantage that, at high elevations, small changes in the
position of the virtual source can result in large changes in the gains sent to the front and
rear speakers. For this reason, it is desirable to use an algorithm that varies the gain smoothly
(i.e. continuously) as a function of both angles. As an example,
f(Φ, θ) = (1 − cos(θ) cos(Φ)) / 2 can be used, where Φ is the elevation angle and θ is the
azimuth angle.
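The behaviour of this function near the zenith can be checked numerically. In the sketch below, f
is read as the rear gain factor, with the front gain taken as 1 − f; this reading is an
interpretation (f is 0 directly in front and 1 directly behind in the horizontal plane) and is not
stated explicitly in the text. Function names are illustrative.

```python
import math


def rear_gain(azimuth_deg: float, elevation_deg: float) -> float:
    """f(elevation, azimuth) = (1 - cos(azimuth) * cos(elevation)) / 2,
    interpreted here as the rear gain factor (an assumption)."""
    theta = math.radians(azimuth_deg)
    phi = math.radians(elevation_deg)
    return (1.0 - math.cos(theta) * math.cos(phi)) / 2.0


def front_gain(azimuth_deg: float, elevation_deg: float) -> float:
    """Complementary front gain, assuming the two gains sum to unity."""
    return 1.0 - rear_gain(azimuth_deg, elevation_deg)


# In the horizontal plane this is a raised-cosine crossfade in azimuth.
print(rear_gain(0, 0), rear_gain(90, 0), rear_gain(180, 0))

# Near the zenith, front (0 deg) and rear (180 deg) azimuths are only a
# small movement apart in space, and the gains are correspondingly close.
print(rear_gain(0, 89), rear_gain(180, 89))
```

At 89° elevation the rear gain changes by only cos(89°), about 0.017, between azimuths of 0° and
180°, whereas an azimuth-only crossfade would swing from 0 to 1 for the same small movement of
the source across the top of the sphere.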
The front and rear transaural crosstalk cancellation parameters can be configured separately, if
desired, to suit speaker angles that are not complementary. For example, the front pair may be at
±30° and the rear pair at ±120° rather than ±150°. The front and rear transaural crosstalk
cancellation parameters can also, if desired, be configured separately to accommodate different
distances between the listener and the rear speakers and between the listener and the front
speakers, as described in copending UK patent application no. 9816059.1 and US patent
application no.
While a set of head response transfer functions (HRTFs) covering the entire 360° can be used,
using front-hemisphere HRTFs for both the front and rear hemispheres has the advantage of
reducing storage space or processing power. This is because a sound source located at the rear is
reproduced through the rear speakers, so that if a rear-hemisphere HRTF were used, twice the
required spectral modification would be produced: the listener's own head provides its inherent
rear spectral modification in addition to that introduced by the HRTF. Thus, the head response
transfer function used for a sound source whose desired position is at an azimuth angle of
(180 − θ) behind the preferred position of the listener is preferably substantially the same as
the head response transfer function for a sound source whose desired position is at an azimuth
angle of θ in front of the preferred position of the listener. For example, where an azimuth of
150° is desired, the HRTF used is substantially the same as the HRTF for an azimuth of 30°, and
so on.
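The front-hemisphere reuse described above amounts to a simple mapping from the desired azimuth
to the azimuth used for the HRTF lookup. The sketch below is illustrative only; it assumes
azimuths in degrees within ±180°, treats the left and right sides symmetrically, and uses a
hypothetical function name. How the HRTF data itself is stored and retrieved is outside its scope.

```python
def hrtf_lookup_azimuth(azimuth_deg: float) -> float:
    """Map a desired azimuth to the front-hemisphere azimuth whose HRTF is
    reused, mirroring rear positions (180 - theta) onto front ones (theta)."""
    a = abs(azimuth_deg)
    if a <= 90.0:
        return azimuth_deg            # already in the front hemisphere
    mirrored = 180.0 - a              # e.g. 150 deg -> 30 deg
    return mirrored if azimuth_deg > 0 else -mirrored


for az in (30, 90, 135, 150, 180, -150):
    print(f"desired {az:+4d} deg -> HRTF for {hrtf_lookup_azimuth(az):+.0f} deg")
```

Only HRTFs for azimuths between -90° and +90° then need to be stored; the front and rear gain
control determines whether the resulting binaural pair is reproduced in front of or behind the
listener.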
The present invention can be configured to operate with additional pairs of speakers simply by
adding the appropriate gain and TCC stages to the configuration shown in FIG.
Again, only a single binaural placement stage is required for each source, and each TCC stage
receives the sum of the contributions from the respective gain stages. For example, a third pair
of loudspeakers located at ±90° to the sides (making a total of six) requires no additional
binaural placement stage, one extra pair of gain stages for each source, and a single extra TCC
stage for the additional pair of speakers, configured for the proper angle (90° in this example)
and distance.
It may be desirable to combine a regular stereo feed or a multi-channel surround sound feed with
the placed sound sources provided by the present invention. To accomplish this, the signals for
each speaker provided by the present invention are simply added to the signals from the other
source before being sent to the speakers, producing the desired combination.