Patent Translate
Powered by EPO and Google
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
PROBLEM TO BE SOLVED: To reduce the processing requirements of the overall convolution system
while substantially maintaining the overall quality of the convolution process. Kind Code: A1
A method of processing a series of input audio signals, representing a series of virtual audio
sources placed at predetermined positions around a listener, to generate a reduced set of audio
output signals for reproduction on speaker devices placed around the listener. The method
comprises: convolving each input audio signal with the initial head portion of a corresponding
impulse response, which maps substantially the initial sound and early reflections of the
corresponding virtual audio source onto the corresponding speaker device, to form a series of
initial responses; forming a combined mix from the audio input signals; forming a combined
convolution tail from the tail portions of the corresponding impulse responses; convolving the
combined mix with the combined convolution tail to form a combined tail response; and, for each
audio output signal, combining the corresponding series of initial responses with the
corresponding combined tail response to form the audio output signal. [Selected figure] Figure 3
Audio signal processing method and apparatus
International PCT Application No. PCT/AU93/00330, entitled "High-Precision and High-Efficiency
Digital Filter," filed by the applicant, discloses a convolution process with very low latency
that makes effective, long convolution with detailed impulse response functions practical.
It is known that using convolution with an impulse response function to add "color" to an audio
signal can, for example when the signal is played over headphones, provide an "out of head"
listening experience.
Unfortunately, even when advanced algorithmic techniques such as the Fast Fourier Transform (FFT)
are used, the convolution process often requires too much computation time. The computational
requirements increase further when multiple channels must be convolved independently, as is
often the case when full surround sound capability is needed. Modern DSP processors may be
unable to provide the resources for full convolution of the signals, especially when real-time
limits are imposed on the convolution latency.
Thus, there is a general need to reduce the processing requirements of the overall convolution
system while substantially maintaining the overall quality of the convolution process.
According to a first aspect of the present invention, there is provided a method of processing a
series of input audio signals, representing a series of virtual audio sources arranged at
predetermined positions around a listener, to generate a reduced set of audio output signals for
playback on speaker devices placed around the listener.
The method comprises the steps of: (a) for each input audio signal and each audio output signal,
(i) convolving the input audio signal with the initial head portion of a corresponding impulse
response that maps substantially the initial sound and early reflections of the corresponding
virtual audio source to the corresponding speaker device, to form a series of initial responses;
(b) for each input audio signal and each audio output signal, (i) forming a combined mix from the
audio input signals, (ii) determining a single convolution tail, and (iii) convolving the
combined mix with the single convolution tail to form a combined tail response; and (c) for each
audio output signal, (i) combining the corresponding series of initial responses with the
corresponding combined tail response to form the audio output signal. The single convolution
tail can be formed by combining the tails of the corresponding impulse responses. Alternatively,
the single convolution tail can be a selected one of the virtual speaker tail impulse responses.
Ideally, the method further comprises preprocessing the impulse response functions by: (a)
constructing a set of corresponding impulse response functions; (b) dividing the impulse
response functions into a number of segments; and (c) for a predetermined number of the
segments, reducing the impulse response values at the ends of the segment.
Preferably, the input audio signals are transformed into the frequency domain and the
convolution is performed in the frequency domain. The impulse response functions can be
simplified in the frequency domain by zeroing the higher frequency coefficients and eliminating
the multiplication steps for the coefficients that have been zeroed.
The convolution is preferably performed using a low latency convolution process. The low latency
convolution process preferably comprises the steps of: converting first predetermined block-sized
portions of the input audio signal into corresponding frequency domain input coefficient blocks;
converting second predetermined block-sized portions of the impulse response signal into
corresponding frequency domain impulse coefficient blocks; combining each of the frequency
domain input coefficient blocks with predetermined ones of the corresponding frequency domain
impulse coefficient blocks in a predetermined manner to produce combined output blocks; adding
together predetermined ones of the combined output blocks to generate a frequency domain output
response for each of the audio output signals; converting the frequency domain output responses
into corresponding time domain audio output signals; and outputting the time domain audio
output signals.
According to another aspect of the present invention, there is provided a method of processing a
series of input audio signals, representing a series of virtual audio sources arranged at
predetermined positions around a listener, to generate a reduced set of audio output signals for
reproduction on speaker devices arranged around the listener. The method comprises the steps of:
(a) generating a series of impulse response functions that substantially map the virtual audio
sources to the corresponding speaker devices; (b) dividing the impulse response functions into a
number of segments; (c) for a predetermined number of the segments, reducing the impulse
response values at the ends of the segments to generate modified impulse responses; and (d) for
each input audio signal and each audio output signal, (i) convolving the input audio signal with
the portion of the corresponding modified impulse response that maps the corresponding virtual
audio source to the corresponding speaker device.
In accordance with another aspect of the present invention, there is provided a method of
simultaneously convolving multiple audio signals representing audio signals from various first
sources, so that an audio environment can be simulated and projected from a second set of output
sources. The method comprises the steps of: (a) independently filtering each of the multiple
audio signals with a first, initial portion of an impulse response function that substantially
maps the first sources, when placed in the audio environment, to the second set of output
sources; and (b) providing combined reverberation tail filtering of the multiple audio signals
with a reverberation tail filter formed from the later parts of the impulse response functions.
The filtering can be done via convolution in the frequency domain, and preferably the audio
signal is first transformed into the frequency domain. The series of input audio signals can
include a left front channel signal, a right front channel signal, a front center channel signal, a
left rear channel signal and a right rear channel signal. The audio output signal can include left
and right output signals.
The invention can be implemented in many different ways: for example, by utilizing a skip
prevention processor unit located within a CD-ROM player unit; by utilizing a dedicated
integrated circuit comprising a modified form of digital-to-analog converter; by utilizing a
dedicated or programmable digital signal processor; or by utilizing a DSP processor
interconnected between an analog-to-digital converter and a digital-to-analog converter.
Alternatively, the invention can be implemented using a separately detachable external device
connected between a sound output signal generator and a pair of headphones, the sound output
signal being output in digital form and processed by the external device.
Another modification can include changing the impulse response functions in a predetermined
manner using a variable control. While there are other forms which fall within the scope of the
present invention, preferred forms of the invention will now be described, by way of example
only, with reference to the accompanying drawings.
Description of the Preferred and Other Embodiments
In the preferred embodiment, it is desired to approximate the full-length convolution of a
series of input signals with impulse response functions for each ear, with the outputs summed
for the left and right ears and reproduced over headphones.
Referring to FIG. 1, there is shown the full convolution process for a six-input Dolby surround
sound set of signals, comprising front left, front center, front right, left surround, right
surround, and low frequency effects channels, each denoted generally 2.
Left and right impulse response functions are used for each channel. Thus, for the left front
channel 3, the corresponding left front impulse response function 4 is convolved 6 with the left
front signal. The left front impulse response function 4 is the impulse response that would be
received by the left ear for an ideal spike output from a left front channel speaker placed at
the ideal position. The output 7 is summed 10 into the headphone left channel signal.
Similarly, the corresponding impulse response 5 for the right ear of the left front channel
speaker is convolved 8 with the left front signal to produce the output 9, which is summed 11
into the right channel. A similar process is performed for each of the other signals. Thus, the
arrangement of FIG. 1 requires approximately 12 convolution steps for six input signals. Such a
large number of convolutions can be quite onerous for a DSP chip, especially when the desired
long convolutions are used.
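The arrangement of FIG. 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the dictionary layout and the use of direct time-domain convolution are all choices made for this example. It shows each input channel being convolved once per ear and summed into the two headphone signals, so six inputs cost twelve convolutions.

```python
def convolve(signal, impulse):
    """Direct (time-domain) convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out


def mix_into(total, part):
    """Add one convolution result into a running sum (None means empty)."""
    return part if total is None else [a + b for a, b in zip(total, part)]


def binauralize_full(channels, responses):
    """Full convolution in the manner of FIG. 1 (hypothetical helper).

    channels:  dict of channel name -> samples (all the same length)
    responses: dict of channel name -> (left_ear_ir, right_ear_ir),
               all impulse responses the same length
    Returns (left, right, number_of_convolutions).
    """
    left = right = None
    count = 0
    for name, samples in channels.items():
        left_ir, right_ir = responses[name]
        left = mix_into(left, convolve(samples, left_ir))
        right = mix_into(right, convolve(samples, right_ir))
        count += 2  # one convolution per ear for every input channel
    return left, right, count
```

The convolution count grows as two per input channel, which is the cost the later figures set out to reduce.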
Referring now to FIG. 2, a standard "overlap and save" convolution process is shown. This
process is fully described in standard texts such as "Digital Signal Processing" by John
Proakis and Dimitris Manolakis, Macmillan Publishing Company, 1992.
In the traditional overlap and save method shown in FIG. 2, the input signal 21 is digitized and
split into blocks 22 of N samples, where N is usually a power of two. Similarly, an impulse
response 23 of length N, usually determined by taking measurements of the desired environment,
is zero padded to a length of 2N. A first mapping 24 is applied to the 2N values of the impulse
response 23 to form N complex numbers with real and imaginary coefficients. An FFT is then used
to generate N frequency coefficients. Step 24 may be performed once before the start of
processing, and the corresponding frequency domain coefficients 25 stored for later use.
A block of length 2N of the input audio is then taken, and a fast Fourier transform is again
used to determine the frequency domain data 28 corresponding to the 2N real input values. The
two sets of data are then multiplied 30 element by element to generate frequency domain data 31.
Next, an inverse Fourier transform is performed to generate 2N real values, of which the first N
34 are discarded and the second N 35 become the output values 36 of the output audio. The
process illustrated in FIG. 2 is known as the standard frequency domain convolution process.
Unfortunately, because the input data must be collected into blocks, and because the FFT process
takes a finite time that depends on the value of N (the processing time is O(N log N)), there is
a fixed latency or delay between the first 2N input values being input to the first FFT 27 and
the subsequent output from the inverse FFT 32. This delay or latency is often highly
undesirable, especially when real-time requirements must be met.
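The overlap and save steps described above can be sketched as follows. This is an illustration under stated assumptions: it uses a plain 2N-point complex FFT written with the standard library rather than the real-to-N-complex mapping 24 of FIG. 2, and the names `fft`, `overlap_save` and `direct_convolve` are hypothetical helpers introduced for this example.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (length must be a power of two, unnormalized)."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def direct_convolve(signal, impulse):
    """Reference time-domain convolution, used only for checking."""
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out


def overlap_save(signal, impulse, n):
    """Overlap and save with block size n (a power of two), len(impulse) <= n."""
    if len(impulse) > n:
        raise ValueError("impulse response must fit in one block")
    # Zero pad the impulse response to 2n and transform it once, up front.
    coeffs = fft(list(impulse) + [0.0] * (2 * n - len(impulse)))
    total = len(signal) + len(impulse) - 1
    blocks = -(-total // n)  # ceiling division
    # n zeros of history in front, zeros behind to fill the last block.
    padded = [0.0] * n + list(signal) + [0.0] * (blocks * n - len(signal))
    out = []
    for b in range(blocks):
        spectrum = fft(padded[b * n:b * n + 2 * n])  # 2n inputs, overlapping by n
        product = [a * c for a, c in zip(spectrum, coeffs)]
        block = fft(product, inverse=True)
        # Discard the first n values; the second n are valid output samples.
        out.extend(v.real / (2 * n) for v in block[n:])
    return out[:total]
```

Each output block can only be produced after a full block of n new input samples has arrived, which is the latency the paragraph above refers to.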
In the aforementioned PCT Application No. PCT/AU93/00330, a method was disclosed which enables a
very low latency convolution process suitable for real-time use. Reference is made to the
aforementioned PCT specification; the low latency process is discussed only briefly here, with
reference to FIG. 3, which illustrates the basic steps 40 of the low latency process. The audio
input is first converted to the frequency domain by an FFT frequency domain overlap process 41.
The frequency domain data is stored 42 and then passed on to subsequent storage blocks, e.g. 43,
44, after each convolution "cycle". The frequency domain data, e.g. 42, is first multiplied 50,
element by element, by frequency domain coefficients 51 corresponding to the first portion of
the impulse response function.
At the same time, the previously delayed frequency domain data 43 is multiplied 54 by the
frequency domain coefficients 53 corresponding to a later portion of the impulse response
function. This step is repeated for the remaining portions of the impulse response function. The
outputs are summed 56 element by element to produce overall frequency domain data, which is
inverse fast Fourier transformed, with half of the data discarded 57, to produce the audio
output 58. The arrangement of FIG. 3 allows extremely long convolutions to be performed with
low latency.
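The delay-line structure of FIG. 3 can be sketched as a uniformly partitioned overlap and save convolution. This is a simplified illustration, not the algorithm of the cited PCT application: equal-sized partitions, the plain complex FFT, and the names `fft` and `partitioned_convolve` are assumptions made for this example.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (length must be a power of two, unnormalized)."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def partitioned_convolve(signal, impulse, n):
    """Convolve with a long impulse response using blocks of n samples."""
    # Split the impulse response into n-tap segments, pad each to 2n and
    # transform once; coeffs[0] covers the first part of the response.
    segments = [impulse[i:i + n] for i in range(0, len(impulse), n)]
    coeffs = [fft(list(s) + [0.0] * (2 * n - len(s))) for s in segments]
    total = len(signal) + len(impulse) - 1
    blocks = -(-total // n)  # ceiling division
    padded = [0.0] * n + list(signal) + [0.0] * (blocks * n - len(signal))
    delay_line = []  # stored input spectra, newest first
    out = []
    for b in range(blocks):
        # One forward FFT per cycle; older spectra just move down the line.
        delay_line.insert(0, fft(padded[b * n:b * n + 2 * n]))
        del delay_line[len(coeffs):]
        acc = [0j] * (2 * n)
        for spectrum, h in zip(delay_line, coeffs):
            for k in range(2 * n):
                acc[k] += spectrum[k] * h[k]  # multiply and sum per element
        block = fft(acc, inverse=True)
        out.extend(v.real / (2 * n) for v in block[n:])  # discard first half
    return out[:total]
```

In this sketch the block size n, and hence the latency, stays fixed however many segments the impulse response is split into; only the number of multiply-and-sum passes per cycle grows.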
The general process of FIG. 3 can be used to perform the same overall process as FIG. 1. This is
illustrated in FIG. 4. There, each of the six input channels, e.g. 60, is first transformed into
the frequency domain using an FFT overlap process 61. Each channel is then combined 62 in the
frequency domain with the frequency domain coefficients corresponding to the impulse response
functions. The frequency domain process 62 may also include summing the frequency components to
form the left and right outputs 64, 65. Finally, inverse frequency domain discard processes 66,
67 are used to generate the left and right channel outputs.
By reducing the number of convolutions required, the computational requirements can be reduced
significantly. Referring now to FIG. 5, one form of simplification implemented in the preferred
embodiment is shown. There, the center channel 70 is multiplied by a gain factor 71 and added
72, 73 to the left and right channels respectively. Similarly, portions of the low frequency
effects channel 71 are added to each of the other channels.
These signals are then applied to Fourier transform overlap processors 75, 78. The number of
channels subjected to the computationally intensive Fourier transform process is thereby reduced
from six to four. Next, a Fourier domain process 84 is used to generate outputs 79, 89, from
which the left and right channels are derived using inverse Fourier transform and discard
processes 82, 83.
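The six-to-four channel reduction of FIG. 5 can be sketched as follows. The text does not give the gain values, so the defaults below are placeholders, and the function name and argument order are assumptions made for this example.

```python
def downmix_six_to_four(front_left, center, front_right,
                        left_surround, right_surround, lfe,
                        center_gain=0.7, lfe_gain=0.5):
    """Fold the center and low frequency effects channels into four channels.

    All inputs are equal-length sample lists. The gain defaults are
    illustrative placeholders, not values taken from the source text.
    """
    def mix(main, with_center):
        return [
            m + (center_gain * c if with_center else 0.0) + lfe_gain * e
            for m, c, e in zip(main, center, lfe)
        ]

    return (
        mix(front_left, True),       # center is added to front left...
        mix(front_right, True),      # ...and to front right
        mix(left_surround, False),   # surrounds receive only the LFE part
        mix(right_surround, False),
    )
```

Only the four returned signals then need a forward transform, instead of the original six.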
Referring now to FIG. 6, the overall idealized end result of the process of FIG. 5 is
illustrated schematically for the left ear. There, four input signals 90 are each applied to a
corresponding full-length finite impulse response filter 91 to 94 and then summed 95 to form a
corresponding output signal 96. Unfortunately, in order to obtain high levels of realism, very
long impulse response functions must often be used. For example, an impulse response function
with a length of about 7,000 taps is not uncommon for a standard 48 kHz audio signal. Again,
with the arrangement of FIG. 6, extending the length of the filters results in excessive
computational requirements.
Analysis of the details of the impulse response coefficients, together with some
experimentation, has shown that all the cues needed for accurate localization of the sound
source are contained within the time of the direct sound and the first few reflections, and
that the rest of the impulse response is needed only to convey the "size" and "liveness" of the
acoustic environment. Using this observation, it is possible to separate the direct or "head"
portion of each response (e.g., the first 1024 taps) from the reverberation or "tail" portion.
The "tail" portions can all be summed together, and the resulting filter can be excited with the
sum of the individual input signals. This simplified implementation is shown schematically in
FIG. 7. The head filters 101-104 can be short 1024 tap filters, and the signals are summed 105
and fed to an extended tail filter, which can have around 6000 taps, and the results are summed
109 and output. This process is repeated for the right ear. By using a combined tail, the
computational requirements are reduced in two ways. First, there is a clear reduction in the
number of terms in the convolution sums that need to be calculated in real time. This reduction
is by a factor equal to the number of input channels. Second, the computation latency of the
tail filter calculation need only be short enough to align the first tap of the tail filter with
the last tap of each head filter. When block filter implementation techniques such as
overlap/add, overlap/save, or the low latency convolution algorithm of the aforementioned PCT
application are used, this means that the tail can optionally be implemented with larger blocks
than the head, and hence at a lower frame rate.
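The head and combined tail arrangement of FIG. 7 can be sketched for one ear as follows. This is an illustration under stated assumptions: direct time-domain convolution stands in for the block-based frequency domain filters, the combined tail is formed by summing the individual tails as the text describes, and all function names are hypothetical. For more than one input the result is an approximation of the full convolution, not an identity.

```python
def convolve(signal, impulse):
    """Direct (time-domain) convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out


def combined_tail_ear(inputs, responses, head_len):
    """One ear of a FIG. 7 style filter bank.

    inputs:    list of equal-length sample lists, one per channel
    responses: list of impulse responses, each longer than head_len
    """
    heads = [ir[:head_len] for ir in responses]
    tails = [ir[head_len:] for ir in responses]
    tail_len = max(len(t) for t in tails)
    # Single combined tail: the sum of the individual tail portions.
    combined_tail = [sum(t[i] for t in tails if i < len(t))
                     for i in range(tail_len)]
    out = [0.0] * (len(inputs[0]) + head_len + tail_len - 1)
    # Each channel keeps its own short head filter (localization cues).
    for samples, head in zip(inputs, heads):
        for i, v in enumerate(convolve(samples, head)):
            out[i] += v
    # The tail filter runs once, on the sum of all the inputs.
    mix = [sum(frame) for frame in zip(*inputs)]
    for i, v in enumerate(convolve(mix, combined_tail)):
        out[head_len + i] += v  # first tail tap follows the last head tap
    return out


def taps_per_output_sample(channels, head_len, tail_len):
    """Filter taps evaluated per output sample: (full, combined tail)."""
    return channels * (head_len + tail_len), channels * head_len + tail_len
```

With four inputs, 1024-tap heads and a 6144-tap tail, `taps_per_output_sample(4, 1024, 6144)` gives 28672 taps for the full filters against 10240 with the combined tail.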
Referring now to FIG. 8, there is shown in detail an overall flowchart of the frequency domain
processing when implementing the combined tail system of FIG. 7. The arrangement 110 of FIG. 8
is intended to operate as a frequency domain process such as 84 in FIG. 5. The overall system
includes summations 111 and 112, which output to the left and right channels, respectively. The
four inputs are front left, front right, rear left and rear right, and are processed
symmetrically. The first input is stored in the delay storage block 113 and is also multiplied
by frequency domain coefficients derived from the first portion of the impulse response, and the
output is fed to the adder 111. The right channel is processed symmetrically to generate the
right channel output and will not be discussed further here.
After a "cycle", the contents of delay block 113 are passed on to delay block 120. It will be
apparent to those skilled in the art of DSP programming that this may be a simple remapping of
data block pointers. During the next cycle, coefficients 121 are multiplied 122 with the data in
block 120 and the output is sent to the left channel adder 111. The two sets of coefficients 115
and 121 correspond to the head portion of the impulse response function. Each channel has head
functions that are individualized for the left and right output channels.
The outputs from delay blocks 120, 125, 126 and 127 are sent to adder 130 and the sum is stored
in delay block 131. The delay block 131 and the subsequent delay blocks, e.g. 132, 133,
implement the combined tail filter. The first segment, stored in delay block 131, is multiplied
136 by the coefficients 137 and sent to the left channel summation 111. In the next cycle, the
contents of delay block 131 are passed on to block 132, and a similar process is carried out for
each of the remaining delay blocks, e.g. 133. Again, the right channel is processed
symmetrically.
It will be apparent from the foregoing discussion that many impulse response functions, or
portions thereof, are used in the construction of the preferred embodiment. An optimization of
the process of generating the frequency domain coefficient blocks will now be discussed,
initially with reference to FIG. 9. To determine the required frequency domain coefficients, the
impulse response 140 is divided into a number of segments 141 of length N. Each segment is
padded 142 with an additional N zero-valued data values, after which an FFT mapping 143 to N
complex data values is applied to convert the values into N frequency domain coefficients 144.
This process can be repeated to obtain the subsequent frequency domain coefficients 145, 146,
147.
The segmentation process of FIG. 9 can generate unnaturally high frequency components. This is
a direct result of the segmentation process and its interaction with the fast Fourier
transform. The fast Fourier transform must yield frequency components that approximate the
discontinuities at the ends of the data (the FFT is periodic, modulo the data size). The
resulting FFT often has very high frequency components that exist substantially as a result of
this discontinuity. In the preferred embodiment, it is desirable to process the segments so as
to reduce the high frequency components to the point where a significant amount of computation
can be discarded because a set of the frequency domain components is zero. This process of
creating band limited frequency domain coefficient blocks is discussed with reference to FIG.
10.
The initial impulse response 150 is again divided into segments 151 of length N. Each segment is
then padded 152 to a length of 2N. The data 152 is then multiplied by a "windowing" function 153
which includes graduated end portions 156,167. The two end portions are designed to map the ends
of the data sequence 151 to zero values while retaining the information between them. The
resulting output 159 contains zero values at points 160 and 161. Next, the output 159 is
subjected to a real FFT process to generate frequency domain coefficients 165, which comprise
some larger coefficients 167 in the lower frequency region of the Fourier transform together
with some small components 166 that can be discarded. The final subset 169 of frequency domain
components is therefore used as the frequency domain components representing the corresponding
portion of the impulse response data.
Discarding the components 166 means that only a restricted form of convolution processing need
be performed during the convolution process: it is unnecessary to multiply by the entire set of
N complex coefficients, because a significant number of the N complex coefficients are zero.
This yields an efficiency gain, since the computational requirements of the convolution process
are reduced. Furthermore, because discarding coefficients reduces both data and coefficient
storage, the memory requirements of the algorithm can also be reduced significantly.
In the preferred embodiment, N equals 512, the head filters are 1024 taps in length and the tail
filters are 6144 taps in length. Thus, each head filter is composed of two blocks of
coefficients (as shown in FIG. 8), and each tail filter is composed of twelve blocks of
coefficients. In the preferred embodiment, all head filters and the first four blocks of each
tail filter are implemented using complete sets of Fourier transform coefficients; the next four
blocks of each tail filter are implemented with coefficient blocks in which only the lower half
of the frequency components is present; and the last four blocks of each tail filter are
implemented with coefficient blocks in which only the lower quarter of the frequency components
is present.
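The saving from the band limited tail blocks can be estimated with a short calculation. This is a rough count under stated assumptions: N complex coefficients per full block (as in the description of FIG. 9), four input channels (as in FIG. 5), one output ear, and a hypothetical function name; it counts only the element-by-element multiplications, not the transforms.

```python
def multiplies_per_cycle(n=512, inputs=4, head_blocks=2,
                         tail_fractions=(1.0,) * 4 + (0.5,) * 4 + (0.25,) * 4):
    """Complex multiplications per cycle for one ear.

    tail_fractions gives, for each tail block, the fraction of the
    frequency coefficients that is kept (1.0 means a complete block).
    Returns (count with complete tail blocks, count with band limiting).
    """
    head = inputs * head_blocks * n              # every input has its own head
    tail_full = len(tail_fractions) * n          # one shared tail, all bins kept
    tail_limited = sum(int(n * f) for f in tail_fractions)
    return head + tail_full, head + tail_limited


print(multiplies_per_cycle())  # (10240, 7680)
```

Under these assumptions the twelve tail blocks cost 3584 complex multiplications per cycle instead of 6144, while the head filters are unchanged at 4096.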
The preferred embodiment can be extended to situations where higher sample rate audio inputs
are used but it is desirable to keep the computational requirements low. For example, the use
of a 96 kHz sample rate for digital samples is now commonplace in the art, and it is therefore
desirable to enable convolution with impulse responses sampled at this rate as well. Referring
to FIG. 11, there is shown one form of extension that uses impulse responses at a lower sample
rate. In this arrangement, the 96 kHz input signal 170 is sent to a delay buffer 171. The input
signal is also low-pass filtered 172, decimated 173 by a factor of two to a 48 kHz rate, and
then FIR filtered in accordance with the system outlined above. The sample rate is then doubled
175, followed by a low-pass filter 176. The signal output from the low-pass filter 176 is added
177 to the delayed input signal from the delay buffer 171, which has been multiplied by a gain
factor A. The output of the adder 177 forms the convolved digital output.
Normally, if the desired 96 kHz impulse response is denoted h_96(t), the 48 kHz FIR
coefficients, denoted h_48(t), can be derived as
h_48(t) = LowPass[h_96(t)]
This notation is intended to mean that the original impulse response h_96(t) is low-pass
filtered. However, in the improved method of FIG. 11, if the desired response is denoted h_96(t)
and the delayed impulse is denoted A·δ(t - τ), the 48 kHz FIR coefficients h_48(t) can be
derived as
h_48(t) = LowPass[h_96(t) - A·δ(t - τ)]
The delay τ and the gain factor A are chosen so that the signal obtained from the gain element
178 has the correct arrival time and magnitude to produce the high frequency components required
in the direct-arrival portion of the 96 kHz acoustic impulse response. Also, instead of a delay
and gain arrangement, a sparse FIR can be used to generate multiple wideband and frequency-shaped
echoes.
Thus, it can be seen that the preferred embodiment provides a reduced complexity convolution
system while preserving many of the characteristics of a full convolution system. The preferred
embodiment takes a multi-channel digital input signal, such as Dolby Pro Logic, Dolby Digital
(AC-3) or DTS, or another surround sound input signal, and outputs it using one or more sets of
headphones. Using the techniques described above, binaural processing of the input signals
makes it possible to enhance the headphone listening experience for a wide range of source
material, so that it is heard "out of the head" and with an increased sense of surround sound.
Given a processing technique that can provide such an out-of-head effect, the processing can be
incorporated into systems using a number of different embodiments. For example, many different
physical implementations are possible, and the end result can be realized using analog or
digital signal processing techniques or a combination of both. In a purely digital
implementation, it is assumed that the input data is obtained in digital time-sampled form. If
the embodiment is implemented as part of a digital audio device such as a compact disc (CD),
minidisc, digital video disc (DVD) or digital audio tape (DAT) player, the input data is already
available in this form. If the unit is implemented as a physical device in its own right, it may
include a digital receiver (SPDIF or similar, either optical or electrical). If the invention
is implemented such that only an analog input signal is available, the analog signal needs to
be digitized using an analog-to-digital converter (ADC).
The digital input signal is then processed by some form of digital signal processor (DSP).
Examples of available DSPs are:
1. A semi-custom or full-custom integrated circuit designed as a DSP dedicated to the task.
2. A programmable DSP chip, e.g. the Motorola DSP56002.
3. One or more programmable logic devices.
When the embodiment is used with a particular set of headphones, filtering of the impulse
response functions can be used to compensate for unwanted frequency response characteristics of
those headphones. After processing, a digital-to-analog converter (DAC) is used to convert the
stereo digital output signals to analog signals, which are amplified as necessary and routed to
the stereo headphone outputs, possibly through other circuits. This last step takes place
inside the audio device if the embodiment is built in, or as part of a separate device if the
embodiment is implemented as a separate device.
ADCs and/or DACs can also be incorporated into the same integrated circuit as the processor. It
is also possible to implement the embodiment so that part or all of the processing is performed
in the analog domain. The embodiment preferably has some way of switching the "binauralizer"
effect on and off, and may include a way of switching between equalizer settings for various
headphone sets, or of controlling other variations of the processing, perhaps including output
volume.
In the first embodiment, shown in FIG. 12, the processing steps are incorporated into a portable
CD or DVD player as a replacement for the skip prevention IC. Many currently available CD
players have an "anti-skip" feature that buffers data read from the CD in random access memory.
If a "skip" is detected, i.e. the audio stream is interrupted because the unit has been knocked
off the track, the unit can re-read the data from the CD while playing the data from the RAM.
This skip protection is sometimes implemented as a dedicated DSP with on-chip RAM or as an
external dedicated DSP.
This embodiment is implemented so that it can be used as a replacement for the skip prevention
processor with minimal modification of current designs. The implementation is most likely
feasible as a full custom integrated circuit that serves both as the current skip prevention
processor and to perform the "out of head" processing. Part of the RAM already included for
skip prevention can be used to run the "out of head" algorithm for HRTF-type processing. Many
of the building blocks of the skip prevention processor are also used for the processing
described in the present invention. An example of such an arrangement is shown in FIG. 12.
In this embodiment, a custom DSP 200 is provided as a replacement for the skip prevention DSP in
the CD or DVD player 202. The custom DSP 200 receives input data from the disc and outputs a
stereo signal to the digital-to-analog converter 201, which provides an analog output that is
amplified 204, 205 to provide the left and right speaker outputs. The custom DSP can include
on-board RAM 206 or external RAM 207 as needed. A binauralizer switch 28 can be provided to
switch the binauralizer effect in and out.
In the second embodiment, shown in FIG. 13, the process is incorporated into a digital audio
device 210 (such as a CD, minidisc, DVD, or DAT player) as a replacement for the DAC. In this
implementation, the signal processing is performed by a dedicated integrated circuit 211
incorporating a DAC. The integrated circuit can be substantially pin compatible with current
DACs, and can therefore be easily incorporated into digital audio devices with only minor
modifications to current designs.
The custom IC 211 includes an on-board DSP core 212 and a conventional digital-to-analog
converter 213. The custom IC receives the normal digital data output and processes it via the
DSP 212 and the digital-to-analog converter 213 to provide the stereo outputs. Again, a
binauralizer switch 214 can be provided to control the binauralization effect as required.
In the third embodiment, shown in FIG. 14, the process is incorporated into a digital audio
device 220 (such as a CD, minidisc, DVD or DAT player) as an additional step 221 in the digital
signal chain. In this implementation, the signal processing is performed by a dedicated or
programmable DSP 221 mounted inside the digital audio device and inserted into the stereo
digital signal chain before the DAC 222.
In the fourth embodiment, shown in FIG. 15, the process is incorporated into an audio device
(such as a personal cassette player or stereo radio receiver 230) as an additional step in the
analog signal chain. This embodiment uses an ADC 232 to make use of the analog input signal.
This embodiment can possibly be manufactured on a single integrated circuit 231 incorporating
the ADC 232, DSP 233 and DAC 234. It can also incorporate some analog processing. It can be
easily added to the analog signal chain in current designs of cassette players and similar
devices.
In the fifth embodiment, shown in FIG. 16, the process is implemented as an external device for
use with a stereo input in digital form. As mentioned above, this embodiment may be a physical
unit in its own right or may be integrated into a set of headphones. It can be battery powered,
with the option of drawing power from an external DC plug pack power supply. The device receives
a digital stereo input in optical or electrical form, as available on some CD and DVD players or
similar devices. The input format is SPDIF or similar, and the unit can support surround sound
formats such as Dolby Digital AC-3 or DTS. It can also have an analog input, as described below.
The processing is performed by the DSP core 241 inside the custom IC 242, followed by the DAC
243. If the DAC cannot drive the headphones directly, additional amplifiers 246, 247 are added
after the DAC. This embodiment of the present invention can be implemented on a custom
integrated circuit 242 that incorporates a DSP, a DAC and possibly a headphone amplifier.
Alternatively, the present embodiment can be implemented as a stand-alone physical unit or
integrated into a headphone set. It can be battery powered, with the option of drawing power from
an external DC plug pack power supply. The device receives an analog stereo input and converts it
to digital data via an ADC. This data is then processed using a DSP and converted back to analog
via a DAC. Some or all of this processing may alternatively be performed in the analog domain. This
embodiment can be manufactured on a custom integrated circuit incorporating an ADC, DSP,
DAC and the necessary analog processing circuitry, and possibly headphone amplifiers.
This embodiment may incorporate a distance or "zoom" control which allows the listener to change
the perceived distance or environment of the sound source. In the preferred embodiment this
control is implemented as a slider. When the control is at its minimum, the sound is perceived as
coming from close to the ears, and may in fact be plain stereo sound without binaural effects.
When the control is set to maximum, the sound is perceived as coming from a distance. The
control can be moved between the minimum and the maximum to adjust how far "outside the head"
the sound is perceived to be. By starting with the control at the minimum position and sliding it
toward the maximum, the user can adjust to the binaural experience more quickly than with a
simple binaural on/off switch. Implementation of
such control can include utilizing different filter response sets for different distances.
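One way to picture a zoom control backed by different filter response sets is a lookup from
slider position to a stored set, with the minimum position meaning plain stereo. The sketch
below is only an assumption about how such a mapping might look: the function name, the
normalized 0.0 to 1.0 slider range and the equal-width quantization are all hypothetical and
are not specified in this document.

```python
from typing import Optional, Sequence, TypeVar

FilterSet = TypeVar("FilterSet")


def select_filter_set(position: float,
                      filter_sets: Sequence[FilterSet]) -> Optional[FilterSet]:
    """Map a slider position in [0.0, 1.0] to one stored filter response set.

    filter_sets is assumed to be ordered from nearest to farthest perceived
    distance. None means "no binaural processing" (plain stereo).
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("slider position must lie in [0.0, 1.0]")
    if position == 0.0 or not filter_sets:
        # Minimum setting: bypass the binaural filters entirely.
        return None
    # Divide the slider travel into equal bands, one per stored set, and
    # clamp so that position == 1.0 selects the farthest set.
    index = min(int(position * len(filter_sets)), len(filter_sets) - 1)
    return filter_sets[index]
```

A smoother control could crossfade between the two nearest sets instead of switching
abruptly; that refinement is left out here to keep the example short.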
An implementation example having a slider mechanism is shown in FIG. It is also possible to
provide additional control for switching the audio environment. Furthermore, it is possible to
implement the embodiment as a general integrated circuit solution adapted to various
applications including those mentioned above. This same integrated circuit can be incorporated
into almost any audio device with a headphone output. It is also the basic building block of every
physical unit created specifically as an implementation of the present invention. Such integrated
circuits may include some or all of an ADC, DSP, DAC, memory, I²S stereo digital audio input,
S/PDIF digital audio input, headphone amplifier, and control pins that allow the device to
operate in various modes (e.g., analog or digital input).
Those skilled in the art will appreciate that numerous variations and/or modifications can be
made to the invention as set forth in the specific examples without departing from the spirit or
scope of the invention as generally described. Accordingly, the present embodiments are to be
considered in all respects as illustrative and not restrictive.
FIG. 1 schematically illustrates the process of performing a full convolution process and mapping
a series of signals to two headphone output channels.
FIG. 2 illustrates a traditional overlap and save FFT process.
FIG. 3 illustrates the low latency process used in the preferred embodiment.
FIG. 4 schematically illustrates a general frequency domain convolution process.
FIG. 5 illustrates a first simplification of the process of FIG. 4.
FIG. 6 illustrates an ideal processing of a series of input signals to the left ear of the
headphones.
FIG. 7 illustrates a first simplification of the processing requirements of FIG. 6.
FIG. 8 illustrates in further detail the frequency domain implementation of the configuration of
FIG. 7 utilizing low latency convolution.
FIG. 9 illustrates a standard synthesis process for deriving frequency domain coefficients.
FIG. 10 illustrates a modification of frequency domain coefficient generation.
FIG. 11 illustrates the extension of the preferred embodiment to higher frequency audio data.
FIG. 12 illustrates an embodiment using the audio processing circuit as an alternative to the skip
prevention feature of current CD players.
FIG. 13 illustrates an embodiment in which the present audio processing circuit is used in the
same IC package as a digital to analog converter.
FIG. 14 illustrates an embodiment using the present audio processing circuit in front of a digital
to analog converter in a signal chain.
FIG. 15 illustrates an embodiment using the audio processing circuit in an arrangement with analog
to digital and digital to analog converters.
FIG. 16 illustrates an extension to the circuit of FIG. 15 including an optional digital input.
FIG. 17 shows several possible physical embodiments of the invention.