JP2015165658
Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2015165658
Abstract: The present invention provides an artificial hearing headset that enhances directional sound from an external sound source. The headset has a pair of headphones and microphone arrays 114a and 114b. Microphone array signals VLF, VLR, VRF, and VRR are converted into beamformed directional signals ULF, ULR, URF, and URR; noise reduction blocks (a common noise reduction mask) 314a, 314b are used to suppress the diffuse signal components; and multiple head-related transfer function (HRTF) pairs 316a, 316b, 316c, 316d are used to convert the result to binaural format. [Selected figure] Figure 3
Artificial hearing headset
[0001]
The present application relates to an artificial hearing headset that improves the directional
sound of an external sound source while suppressing diffuse sound.
[0002]
Artificial hearing refers to electronic devices designed to improve the perception of music and
speech.
Common artificial hearing devices include cochlear implants, hearing aids, and other devices that
provide a sense of sound to hearing-impaired persons. Many headphones today include a
noise cancellation feature that blocks or suppresses external noise that adversely affects the
10-05-2019
1
concentration of the user or the ability to hear the audio reproduced from the electronic device
connected to the headphones. These noise cancellation features typically suppress all external
sounds, including both diffuse and directional sounds, effectively rendering the headphone wearer deaf.
[0003]
One or more embodiments of the present disclosure relate to a headset comprising a pair of
headphones including a left headphone with a left speaker and a right headphone with a right
speaker. The microphone array pair can include a left microphone array integrated with the left
headphone and a right microphone array integrated with the right headphone. Each of the
microphone array pairs may include at least a front microphone and a rear microphone to
receive external audio from an external sound source. The headset can further include a digital
signal processor configured to receive left and right microphone array signals associated with
external audio. A digital signal processor generates a pair of directional signals from each of the
left and right microphone array signals, suppresses diffuse sound from the directional signal
pairs, and generates a parametric model of head-related transfer functions (HRTF) pairs of
directional signals. It may be further configured to apply to each pair and add the HTRF output
signal from each pair of HRTF pairs to produce a left headphone output signal and a right
headphone output signal.
[0004]
A pair of headphones can play audio content from an electronic audio source. Each pair of
directional signals can include forward and backward pointing beam signals. A digital signal
processor can apply noise reduction to directional signal pairs using a common mask to suppress
uncorrelated signal components.
[0005]
The left microphone array signal can include at least a left front microphone signal vector and a
left back microphone signal vector. Additionally, the digital signal processor can calculate left
cardioid signal pairs from the left front and rear microphone signal vectors. In addition, the
digital signal processor calculates real-valued time-dependent and frequency-dependent masks based on the left cardioid signal pair and the left microphone array signal, and multiplies those masks by the left front and left rear microphone signal vectors, respectively, to obtain left front-pointing and rear-pointing beam signals.
[0006]
The right microphone array signal includes at least a right front microphone signal vector and a right back microphone signal vector. Additionally, the digital signal processor can calculate the right cardioid signal pair from the right front and rear microphone signal vectors. In addition, the digital signal processor calculates real-valued time-dependent and frequency-dependent masks based on the right cardioid signal pair and the right microphone array signal, and multiplies those masks by the right front and right rear microphone signal vectors, respectively, to obtain right front-pointing and rear-pointing beam signals.
[0007]
One or more additional embodiments of the present disclosure relate to a method for enhancing
directional sound from an audio source external to the headset. The headset can include left
headphones with a left microphone array and right headphones with a right microphone array.
The method can include receiving a pair of microphone array signals corresponding to an
external audio source. The microphone array signal pairs can include left and right microphone
array signals. The method may also include generating a pair of directional signals from each of the microphone array signal pairs and suppressing the diffuse signal components from the directional signal pairs. The method may further include applying a parametric model of HRTF pairs to each pair of directional signals and adding the HRTF output signals from each HRTF pair to generate a left headphone output signal and a right headphone output signal.
[0008]
Suppressing the diffuse signal component from the directional signal pair can include applying
noise reduction to the directional signal pair using a common mask to suppress uncorrelated
signal components.
[0009]
The left microphone array signal can include at least a left front microphone signal vector and a
left back microphone signal vector.
Generating directional signal pairs from the left microphone array signals may include calculating left cardioid signal pairs from the left front and rear microphone signal vectors, calculating real-valued time-dependent and frequency-dependent masks based on the left cardioid signal pair and the left microphone array signal, and multiplying the left front and rear microphone signal vectors by the time-dependent and frequency-dependent masks, respectively, to obtain left front-pointing and rear-pointing beam signals.
[0010]
The right microphone array signal may include at least a right front microphone signal vector
and a right back microphone signal vector. Generating directional signal pairs from the right microphone array signals can include calculating right cardioid signal pairs from the right front and rear microphone signal vectors, calculating real-valued time-dependent and frequency-dependent masks based on the right cardioid signal pair and the right microphone array signal, and multiplying the right front and rear microphone signal vectors by the time-dependent and frequency-dependent masks, respectively, to obtain right front-pointing and rear-pointing beam signals.
[0011]
Suppressing the diffuse signal component from the directional signal pair can include applying
noise reduction to the directional signal pair using a common mask to suppress uncorrelated
signal components.
[0012]
Furthermore, one or more additional embodiments of the present disclosure relate to methods
for improving directional sound from an audio source external to the headset.
The headset can include left headphones with a left microphone array and right headphones with
a right microphone array. Each microphone array can include at least a front microphone and a
back microphone. For each microphone array, the method can include receiving a microphone array signal corresponding to an external audio source. The microphone array signal may include at least a front microphone signal vector corresponding to the front microphone and a rear microphone signal vector corresponding to the rear microphone. The method may further include calculating a forward-pointing beam signal and a backward-pointing beam signal from the front and rear microphone signal vectors, and applying a noise reduction mask to the forward-pointing and backward-pointing beam signals to suppress uncorrelated signal components, thereby obtaining a noise-reduced forward-pointing beam signal and a noise-reduced backward-pointing beam signal. The method may also include applying a forward head-related transfer function (HRTF) pair to the noise-reduced forward-pointing beam signal to obtain a forward direct HRTF output signal and a forward indirect HRTF output signal, and applying a backward HRTF pair to the noise-reduced backward-pointing beam signal to obtain a backward direct HRTF output signal and a backward indirect HRTF output signal. In addition, the method may include adding the forward direct HRTF output signal and the backward direct HRTF output signal to obtain at least a portion of a first headphone signal, and adding the forward indirect HRTF output signal and the backward indirect HRTF output signal to obtain at least a portion of a second headphone signal.
[0013]
The method may further include adding the first headphone signal associated with the left microphone array to the second headphone signal associated with the right microphone array to form a left headphone output signal, and adding the first headphone signal associated with the right microphone array to the second headphone signal associated with the left microphone array to form a right headphone output signal.
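The cross-feed summation described above can be sketched as follows. This is a hedged illustration of the signal combination only; the function and parameter names are hypothetical and not from the patent.

```python
import numpy as np

def mix_headphone_outputs(direct_left, indirect_left, direct_right, indirect_right):
    """Combine the per-array 'first' (direct) and 'second' (indirect)
    headphone signals into final left/right outputs, as described above."""
    # Left output: left array's direct signal plus the right array's
    # cross-fed (indirect) signal; mirrored for the right output.
    left_out = direct_left + indirect_right
    right_out = direct_right + indirect_left
    return left_out, right_out
```

The same addition is performed sample by sample (or per STFT bin) in practice; NumPy arrays are used here only to keep the sketch compact.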
[0014]
Computing the forward pointing beam signal and the backward pointing beam signal from the
forward and backward microphone signal vectors can include computing a cardioid signal pair
from the forward and backward microphone signal vectors.
This can include calculating real-valued time-dependent and frequency-dependent masks based on the cardioid signal pair and the microphone array signal, and multiplying the forward and backward microphone signal vectors by the time-dependent and frequency-dependent masks, respectively, to obtain forward-pointing and backward-pointing beam signals.
[0015]
The time and frequency dependent masks may be calculated as the absolute value of the
normalized cross spectral density of the front and back microphone signal vectors calculated by
time averaging. In addition, the time-dependent and frequency-dependent masks can be further modified using a non-linear mapping to narrow or widen the forward-pointing and backward-pointing beam signals. For example, the present invention provides the following items.
(Item 1) A headset comprising: a pair of headphones including a left headphone having a left speaker and a right headphone having a right speaker; a pair of microphone arrays including a left microphone array integrated with the left headphone and a right microphone array integrated with the right headphone, each including at least a front microphone and a rear microphone for receiving external audio from an external sound source; and a digital signal processor configured to receive left and right microphone array signals associated with the external audio, the digital signal processor being further configured to generate a pair of directional signals from each of the left and right microphone array signals, suppress diffuse sound from the directional signal pairs, apply a parametric model of head-related transfer function (HRTF) pairs to each pair of directional signals, and add the HRTF output signals from each HRTF pair to produce a left headphone output signal and a right headphone output signal.
(Item 2) The headset according to item 1, wherein the pair of headphones is further configured to play audio content from an electronic audio source.
(Item 3) The headset according to any one of the above items, wherein each pair of directional signals comprises front-pointing and rear-pointing beam signals.
(Item 4) The headset according to any one of the above items, wherein the left microphone array signal includes at least a left front microphone signal vector and a left back microphone signal vector.
(Item 5) The headset according to any one of the above items, wherein the digital signal processor configured to generate the directional signal pair from the left microphone array signal is configured to calculate a left cardioid signal pair from the left front and rear microphone signal vectors, calculate real-valued time-dependent and frequency-dependent masks based on the left cardioid signal pair and the left microphone array signal, and multiply the time-dependent and frequency-dependent masks by the respective left front and rear microphone signal vectors to obtain left front-pointing and rear-pointing beam signals.
(Item 6) The headset according to any one of the above items, wherein the right microphone array signal includes at least a right front microphone signal vector and a right back microphone signal vector.
(Item 7) The headset according to any one of the above items, wherein the digital signal processor configured to generate the directional signal pair from the right microphone array signal is configured to calculate a right cardioid signal pair from the right front and rear microphone signal vectors, calculate real-valued time-dependent and frequency-dependent masks based on the right cardioid signal pair and the right microphone array signal, and multiply the time-dependent and frequency-dependent masks by the respective right front and rear microphone signal vectors to obtain right front-pointing and rear-pointing beam signals.
(Item 8) The headset according to any one of the above items, wherein the digital signal processor configured to suppress diffuse noise from the directional signal pair is configured to apply noise reduction to the directional signal pair using a common mask to suppress uncorrelated signal components.
(Item 9) A method for improving directional sound from an audio source external to a headset, the headset including a left headphone having a left microphone array and a right headphone having a right microphone array, the method comprising: receiving a pair of microphone array signals corresponding to the external audio source, the microphone array signal pair including a left microphone array signal and a right microphone array signal; generating a pair of directional signals from each of the microphone array signal pairs; suppressing a diffuse signal component from the directional signal pairs; applying a parametric model of HRTF pairs to each pair of directional signals; and adding the HRTF output signals from each HRTF pair to generate a left headphone output signal and a right headphone output signal.
(Item 10) The method according to the above item, wherein the left microphone array signal includes at least a left front microphone signal vector and a left back microphone signal vector.
(Item 11) The method according to any one of the above items, wherein generating the directional signal pair from the left microphone array signal includes calculating a left cardioid signal pair from the left front and rear microphone signal vectors, calculating real-valued time-dependent and frequency-dependent masks based on the left cardioid signal pair and the left microphone array signal, and multiplying the time-dependent and frequency-dependent masks by the respective left front and rear microphone signal vectors to obtain left front-pointing and rear-pointing beam signals.
(Item 12) The method according to any one of the above items, wherein the right microphone array signal includes at least a right front microphone signal vector and a right back microphone signal vector.
(Item 13) The method according to any one of the above items, wherein generating the directional signal pair from the right microphone array signal comprises calculating a right cardioid signal pair from the right front and rear microphone signal vectors, calculating real-valued time-dependent and frequency-dependent masks based on the right cardioid signal pair and the right microphone array signal, and multiplying the time-dependent and frequency-dependent masks by the respective right front and rear microphone signal vectors to obtain right front-pointing and rear-pointing beam signals.
(Item 14) The method according to any one of the above items, wherein suppressing the diffuse signal component from the directional signal pair includes applying noise reduction to the directional signal pair using a common mask to suppress uncorrelated signal components.
(Item 15) The method according to any one of the above items, wherein each pair of directional signals comprises forward-pointing and backward-pointing beam signals.
(Item 16) A method for enhancing directional sound from an audio source external to a headset, the headset including a left headphone having a left microphone array and a right headphone having a right microphone array, each microphone array comprising at least a front microphone and a back microphone, the method comprising, for each microphone array: receiving a microphone array signal corresponding to the external audio source, the microphone array signal including at least a front microphone signal vector corresponding to the front microphone and a rear microphone signal vector corresponding to the rear microphone; calculating a front-pointing beam signal and a rear-pointing beam signal from the front and rear microphone signal vectors; applying a noise reduction mask to the front-pointing and rear-pointing beam signals to suppress uncorrelated signal components, thereby obtaining a noise-reduced front-pointing beam signal and a noise-reduced rear-pointing beam signal; applying a forward head-related transfer function (HRTF) pair to the noise-reduced front-pointing beam signal to obtain a forward direct HRTF output signal and a forward indirect HRTF output signal; applying a backward HRTF pair to the noise-reduced rear-pointing beam signal to obtain a backward direct HRTF output signal and a backward indirect HRTF output signal; adding the forward direct HRTF output signal and the backward direct HRTF output signal to obtain at least a portion of a first headphone signal; and adding the forward indirect HRTF output signal and the backward indirect HRTF output signal to obtain at least a portion of a second headphone signal.
(Item 17) The method according to any one of the above items, further comprising adding the first headphone signal associated with the left microphone array to the second headphone signal associated with the right microphone array to form a left headphone output signal, and adding the first headphone signal associated with the right microphone array to the second headphone signal associated with the left microphone array to form a right headphone output signal.
(Item 18) The method according to any one of the above items, wherein calculating the front-pointing beam signal and the rear-pointing beam signal from the front and rear microphone signal vectors comprises calculating a cardioid signal pair from the front and rear microphone signal vectors, calculating real-valued time-dependent and frequency-dependent masks based on the cardioid signal pair and the microphone array signal, and multiplying the time-dependent and frequency-dependent masks by the respective front and rear microphone signal vectors to obtain front-pointing and rear-pointing beam signals.
(Item 19) The method according to any one of the above items, wherein the time-dependent and frequency-dependent masks are calculated as the absolute value of the normalized cross spectral density of the front and back microphone signal vectors, computed by time averaging.
(Item 20) The method according to item 19, wherein the time-dependent and frequency-dependent masks are further modified using a non-linear mapping to narrow or widen the front-pointing and rear-pointing beam signals.
Abstract: An artificial hearing headset for improving directional sound from an external audio source. The headset includes a pair of headphones, each having a microphone array, that keeps the listener acoustically connected to the environment through multiple microphones even while the listener hears content presented from an electronic audio source through the headphones. The microphone array signals are first converted to beamformed directional signals. The diffuse signal components may be suppressed using a common noise reduction mask. The signals are then converted to binaural format using multiple HRTF pairs.
[0016]
FIG. 1 is an environmental view showing an exemplary artificial hearing headset worn by a person in accordance with one or more embodiments of the present disclosure. FIG. 2 is a simplified exemplary schematic view of an artificial hearing headset in accordance with one or more embodiments of the present disclosure. FIG. 3 is an exemplary signal processing block diagram in accordance with one or more embodiments of the present disclosure. FIG. 4 is another exemplary signal processing block diagram in accordance with one or more embodiments of the present disclosure. FIG. 5 is a simplified exemplary process flow diagram of a microphone array signal processing method in accordance with one or more embodiments of the present disclosure. FIG. 6 is another simplified exemplary process flow diagram of a microphone array signal processing method in accordance with one or more embodiments of the present disclosure.
[0017]
In the following detailed description, reference is made to the accompanying drawings that form
a part thereof. In the drawings, similar symbols typically identify similar components, unless
context dictates otherwise. The division of examples into the functional blocks, modules, or units shown in the figures should not be construed as indicating that these functional blocks, modules, or units are necessarily implemented as physically separate units. The functional blocks, modules, or units shown or described may be implemented as separate units, circuits, chips, functions, modules, or circuit elements. One or more functional blocks or units may also be implemented in a common circuit, chip, circuit element, or unit.
[0018]
The illustrative embodiments described in the detailed description, drawings, and claims are not
limiting. Other embodiments may be utilized and other changes may be made without departing
from the spirit and scope of the subject matter presented herein. Aspects of the present disclosure may be arranged, substituted, combined, and designed in a wide variety of different configurations, as generally described herein and illustrated in the drawings, all of which are expressly contemplated to be part of the present disclosure.
[0019]
FIG. 1 shows an environmental view representing an exemplary artificial hearing headset 100
worn by a person 102 having a left ear 104 and a right ear 106 in accordance with one or more
embodiments of the present disclosure. The headset 100 can include a pair of headphones 108
including a left headphone 108a and a right headphone 108b, which transmit sound waves 110, 112 to the respective ears 104, 106 of the person 102. Each headphone 108 can include a microphone array 114 such that, when the headset 100 is worn, the left microphone array 114a is disposed on the left side of the user's head and the right microphone array 114b is disposed on the right side of the user's head. The microphone arrays 114 may be integrated with their
respective headphones 108. Additionally, each microphone array 114 can include a plurality of
microphones 116 including at least a front microphone and a back microphone. For example, the
left microphone array 114a can include at least a left front microphone 116a and a left back
microphone 116c, while the right microphone array 114b can include at least a right front
microphone 116b and a right rear microphone 116d. The plurality of microphones 116 may be
omnidirectional, but other types of directional microphones with different polar patterns, such as unidirectional or bidirectional microphones, may be used.
[0020]
The pair of headphones 108 may be tightly sealed noise-canceling headphones, on-ear headphones, in-ear earphones, or the like. Thus, the listener may be acoustically isolated while listening to content such as music or audio presented from the electronic audio source 118 via the headphones 108, and may be audibly connected to the external world only through the microphones 116. Signal processing may be applied to the microphone signals so as to preserve the natural hearing of a desired external sound source, such as a voice coming from a certain direction, while suppressing unwanted diffuse sound such as audience or crowd noise, aircraft cabin noise, traffic noise, and the like. According to one or more embodiments, directional hearing can be enhanced beyond natural hearing, for example, to pick out distant sound sources from noise that could not normally be heard. Thus, the artificial hearing headset 100 can provide "superhuman hearing" or an "acoustic magnifier".
[0021]
FIG. 2 is a simplified exemplary schematic diagram of a headset 100 in accordance with one or
more embodiments of the present disclosure. As shown in FIG. 2, the headset 100 can include an
analog-to-digital converter (ADC) 210 associated with each microphone 116 to convert analog
audio signals to digital form. The headset can further include a digital signal processor (DSP) 212
to process the digitized microphone signals. For ease of explanation, generic references throughout this disclosure to microphone signals or microphone array signals can, unless otherwise specified, refer to those signals in either analog or digital form, and in either the time or frequency domain.
[0022]
Each headphone 108 can include a speaker 214 for generating sound waves 110, 112 in
response to input audio signals. For example, the left headphone 108a may include a left speaker 214a to receive a left headphone output signal LH from the DSP 212, and the right headphone 108b may include a right speaker 214b to receive a right headphone output signal RH from the DSP 212. The headset 100 can further include a digital-to-analog converter (DAC) and/or a speaker driver (not shown) associated with each speaker 214. The headphone speakers 214 may
be further configured to receive an audio signal from an electronic audio source 118, such as an
audio playback device, a cell phone, or the like. The headset 100 can include a wire 120 (FIG. 1)
and an adapter (not shown) connectable to the audio source 118 to receive audio signals from
the source. Additionally or alternatively, the headset 100 may receive audio signals from the
electronic audio source 118 wirelessly. Although not illustrated, audio signals from electronic
audio sources can undergo their own signal processing before being transmitted to the speaker
214. Headset 100 may be configured to simultaneously transmit sound waves representing
sound from external sound source 216 and sound from electronic sound source 118. Thus, the
headset 100 can be generally useful to any user who wants to listen to music or telephone conversations while remaining connected to the environment.
[0023]
FIG. 3 shows an exemplary signal processing block diagram that may be implemented at least in
part by the DSP 212 to process the microphone array signal v. The ADC 210 is not shown in FIG.
3 to highlight the DSP signal processing blocks. The same signal processing blocks are utilized for each ear, and the resulting pairs are added at the output to form the final headphone signals. As shown, the signal processing blocks are divided into identical signal processing units 308, including a left microphone array signal processing unit 308a and a right microphone array signal processing unit 308b. For ease of explanation, the identical portion 308 of the signal processing algorithm applied to one of the microphone array signals is described generically (i.e., without a left or right designation) unless otherwise indicated. The generic notation for references to signals associated with a microphone array 114 generally includes either (A) an "F" or "+" designation in the subscript of the signal identifier, representing front or forward, or (B) an "R" or "-" designation in the subscript of the signal identifier, representing rear or backward. In contrast, a specific reference to a signal associated with the left microphone array 114a includes an additional "L" designation in the subscript of the signal identifier to indicate the position of the left ear. Similarly, a specific reference to a signal associated with the right microphone array 114b includes an additional "R" designation in the subscript of the signal identifier to indicate the position of the right ear.
[0024]
Using this notation, the front microphone signal of a microphone array 114 may be labeled generically vF, while a specific reference to the left front microphone signal associated with the left microphone array 114a is labeled vLF, and a specific reference to the right front microphone signal associated with the right microphone array 114b may be labeled vRF. The generic reference notation is used where applicable, since many of the exemplary formulas defined below apply equally to the signals received from either the left microphone array 114a or the right microphone array 114b. However, the signals labeled in FIG. 3 use the specific reference notation so that both the left signal processor 308a and the right signal processor 308b are shown.
[0025]
The microphone 116 generates a time domain signal stream. Referring to FIG. 3, the microphone
array signal v includes at least a front microphone signal vector vF and a rear microphone signal
vector vR. The algorithm operates in the frequency domain using the short-time Fourier transform
(STFT) 306. The left STFT 306a forms the left microphone array signal V in the frequency
domain, while the right STFT 306b forms the right microphone array signal V in the frequency
domain. The microphone array signal V in the frequency domain includes at least a front
microphone signal vector VF and a rear microphone signal vector VR. In the first signal
processing stage, the front microphone processing block 310 (e.g. left front microphone
processing block 310a or right front microphone processing block 310b) and the rear
microphone processing block 312 (e.g. left rear microphone processing block 312a or right rear
microphone processing) Blocks 312b) each receive a front microphone signal vector VF and a
rear microphone signal vector VR. Each microphone processing block 310, 312 essentially
functions as a beamformer to generate a forward pointing directional signal UF and a backward
pointing directional signal UR from two microphones 116 of each microphone array 114. In
order to generate directional signals for the microphone array 114, the pair of cardioid signals X+/− may initially be calculated using known delay-and-subtract equations, as described below in Equations 1 and 2.
[0026]
[0027]
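The delay-and-subtract operation of Equations 1 and 2 can be sketched in Python. This is a hedged sketch of the standard technique the text refers to, written in the time domain with an integer sample delay d; the function name and formulation are assumptions, since the equation images are not reproduced here.

```python
import numpy as np

def cardioid_pair(v_front, v_rear, delay_samples):
    """Delay-and-subtract cardioid pair X+ / X- from a two-microphone array.

    Assumed form of the 'known subtraction delay equations':
        x_plus[n]  = v_front[n] - v_rear[n - d]   (forward-facing cardioid)
        x_minus[n] = v_rear[n]  - v_front[n - d]  (rear-facing cardioid)
    where d matches the acoustic travel time across the array axis.
    """
    d = int(delay_samples)
    # Delay each signal by d samples (zero-padded at the start).
    delayed_rear = np.concatenate([np.zeros(d), v_rear[:len(v_rear) - d]])
    delayed_front = np.concatenate([np.zeros(d), v_front[:len(v_front) - d]])
    x_plus = v_front - delayed_rear
    x_minus = v_rear - delayed_front
    return x_plus, x_minus
```

With this form, a source arriving from the front reaches the front microphone d samples earlier, so the rear-facing cardioid X− nulls it while the forward-facing cardioid X+ passes it.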
[0028]
To obtain a cardioid response pattern, delay values may be selected to match the travel time of
the acoustic signal across the array axis.
The DSP delay may be quantized by the duration of a single sample.
At a sample rate of 48 kHz, for example, the minimum delay is about 21 microseconds. The speed of sound in air fluctuates with temperature. Using 70 °F as an example, the speed of sound in air is about 344 m/s. Thus, the sound wave moves about 7 mm in 21 μs.
delay of 4 to 5 samples at a sample rate of 48 kHz may be used for the distance between
microphones of approximately 28 mm to 35 mm. The shape of the cardioid response pattern to
the beamformed directional signal may be manipulated by changing the delay or distance
between the microphones.
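The delay selection described above is simple arithmetic; the following sketch restates it using the figures from the text (48 kHz sample rate, about 344 m/s speed of sound):

```python
# Arithmetic from the text: the delay (in samples) that matches the
# acoustic travel time across the array axis, at a 48 kHz sample rate
# and ~344 m/s speed of sound (air at about 70 °F).
SAMPLE_RATE_HZ = 48_000
SPEED_OF_SOUND_M_PER_S = 344.0

def delay_in_samples(mic_spacing_m: float) -> float:
    """Travel time across the array axis, expressed in samples."""
    return (mic_spacing_m / SPEED_OF_SOUND_M_PER_S) * SAMPLE_RATE_HZ

# A 28 mm spacing needs about 4 samples of delay; 35 mm needs about 5.
print(round(delay_in_samples(0.028)))  # 4
print(round(delay_in_samples(0.035)))  # 5
```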
[0029]
In certain embodiments, the cardioid signals X+/− may be used directly as the forward pointing
directional signal UF and the backward pointing directional signal UR, respectively. According to
one or more additional embodiments, rather than using the cardioid signals X+/− directly,
real-valued time- and frequency-dependent masks m+/− may be applied. Applying a mask is a
form of non-linear signal processing. According to one or more embodiments, a real-valued time-
and frequency-dependent mask m+/− may be calculated, for example, using Equation 3 below.
[0030]
m± = | ⟨X± V*⟩ / √( ⟨|X±|²⟩ ⟨|V|²⟩ ) |   (Equation 3)
[0031]
Here ⟨AB*⟩(i) = (1 − α)⟨AB*⟩(i − 1) + α A(i)B*(i) represents a recursively generated time
average, where α = 0.01 to 0.05, i is the time index, and * denotes the complex conjugate.
[0032]
As shown, the DSP 212 can calculate the real-valued time- and frequency-dependent mask m+/−
as the absolute value of the normalized cross spectral density calculated by time averaging.
In Equation 3, V can be either VF or VR.
The front pointing directional signal UF and the rear pointing directional signal UR can then be
obtained by multiplying each microphone signal vector V elementwise by m+ for the front
pointing beam or by m− for the rear pointing beam, as shown in Equations 4 and 5 below.
[0033]
UF = m+ ∘ V   (Equation 4)
[0034]
UR = m− ∘ V   (Equation 5)
[0035]
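Because Equations 1-5 appear as images in the original and are not reproduced here, the sketch below assumes their standard forms as described in the surrounding text: delay-subtract cardioids, a mask equal to the magnitude of the recursively time-averaged normalized cross spectral density, and elementwise application of the mask.

```python
import numpy as np

def cardioid_pair(VF, VR, omega, tau):
    """Assumed Equations 1-2: subtraction-delay cardioid signals for one
    STFT frame (VF, VR are complex spectra; omega in rad/s; tau in s)."""
    delay = np.exp(-1j * omega * tau)
    x_plus = VF - delay * VR   # forward-facing cardioid
    x_minus = VR - delay * VF  # rear-facing cardioid
    return x_plus, x_minus

def directional_signals(VF_frames, VR_frames, omega, tau, alpha=0.03):
    """Assumed Equations 3-5: real-valued time- and frequency-dependent
    masks m+/- from recursive time averaging (alpha = 0.01..0.05),
    applied elementwise to the microphone signal vector (here VF)."""
    n_frames, n_bins = VF_frames.shape
    # Recursive averages <X V*>, <|X|^2>, <|V|^2> for each cardioid.
    pxv = np.zeros((2, n_bins), dtype=complex)
    pxx = np.zeros((2, n_bins))
    pvv = np.zeros(n_bins)
    UF = np.zeros_like(VF_frames)
    UR = np.zeros_like(VF_frames)
    for i in range(n_frames):
        V = VF_frames[i]
        x = cardioid_pair(V, VR_frames[i], omega, tau)
        pvv = (1 - alpha) * pvv + alpha * np.abs(V) ** 2
        m = []
        for k in range(2):
            pxv[k] = (1 - alpha) * pxv[k] + alpha * x[k] * np.conj(V)
            pxx[k] = (1 - alpha) * pxx[k] + alpha * np.abs(x[k]) ** 2
            m.append(np.abs(pxv[k]) / np.sqrt(pxx[k] * pvv + 1e-12))
        UF[i] = m[0] * V  # Equation 4: front pointing directional signal
        UR[i] = m[1] * V  # Equation 5: rear pointing directional signal
    return UF, UR
```

For a source arriving from the front (the rear microphone sees a delayed copy of the front microphone), the rear cardioid X− nulls out and m− stays near zero, so UR is suppressed while UF passes.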
Thus, the mask m+/−, a number between 0 and 1, can function as a spatial filter that enhances
some signals spatially and leaves others unenhanced.
In addition, with this method, the mask function can be further modified with a non-linear
mapping F, as represented by Equation 6 below.
[0036]
m±′ = F(m±)   (Equation 6)
[0037]
For example, if a beam narrower than a standard cardioid is required (for example, for
superdirective beamforming), this function can further attenuate low values of m, which indicate
a low correlation between the original microphone signal V and the difference signal X.
In the extreme case, a "binary mask" may be used.
The binary mask may be expressed as a step function that sets all values below a threshold to
zero. Manipulating the mask function to narrow the beam may add distortion, while widening the
beam may reduce distortion.
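The exact function F of Equation 6 is not reproduced in this extraction; the two mappings below are hypothetical examples of the behavior described: a power law that further attenuates low mask values to narrow the beam, and the extreme binary-mask case.

```python
import numpy as np

def sharpen(m, exponent=2.0):
    """Hypothetical F: raising m (in 0..1) to a power further attenuates
    low, weakly correlated mask values, narrowing the beam."""
    return np.power(m, exponent)

def binary_mask(m, threshold=0.5):
    """Extreme case: a step function setting values below threshold to zero."""
    return np.where(m >= threshold, 1.0, 0.0)

m = np.array([0.1, 0.4, 0.6, 0.9])
sharp = sharpen(m)        # low values pushed further toward zero
binary = binary_mask(m)   # 0 below the threshold, 1 at or above it
```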
[0038]
The subsequent noise reduction block 314 in FIG. 3 (e.g., the left noise reduction block 314a or
the right noise reduction block 314b) can apply a second, common mask mNR to the resulting
forward pointing directional signal UF and backward pointing directional signal UR in order to
suppress uncorrelated signal components indicative of diffuse (i.e., not directional) sound. The
common noise reduction mask mNR can be calculated by Equation 7, shown below.
[0039]
[0040]
For diffuse sound, the value of the common mask mNR may be close to zero.
For discrete sounds, the value of the common mask mNR can be close to one. Once acquired, the
common mask mNR can then be applied to the beamformed directional signals to generate noise
reduced directional signals, including the noise reduced forward pointing beam signal YF and the
noise reduced backward pointing beam signal YR, as shown in Equations 8 and 9.
[0041]
YF = mNR ∘ UF   (Equation 8)
[0042]
YR = mNR ∘ UR   (Equation 9)
[0043]
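Equation 7 itself is not reproduced in this extraction. The sketch below therefore assumes a coherence-style mask between the two beam signals, which at least reproduces the behavior described in the text: values near zero for diffuse (uncorrelated) sound and near one for a discrete directional source.

```python
import numpy as np

def common_noise_mask(UF_frames, UR_frames, alpha=0.05):
    """Hypothetical mNR: magnitude of the normalized, recursively
    time-averaged cross spectrum between the front and rear beams.
    Uncorrelated (diffuse) content averages toward zero; a discrete
    source present in both beams averages toward one."""
    n_frames, n_bins = UF_frames.shape
    pfr = np.zeros(n_bins, dtype=complex)  # <UF UR*>
    pff = np.zeros(n_bins)                 # <|UF|^2>
    prr = np.zeros(n_bins)                 # <|UR|^2>
    m_nr = np.zeros((n_frames, n_bins))
    for i in range(n_frames):
        f, r = UF_frames[i], UR_frames[i]
        pfr = (1 - alpha) * pfr + alpha * f * np.conj(r)
        pff = (1 - alpha) * pff + alpha * np.abs(f) ** 2
        prr = (1 - alpha) * prr + alpha * np.abs(r) ** 2
        m_nr[i] = np.abs(pfr) / np.sqrt(pff * prr + 1e-12)
    return m_nr

# Equations 8-9 then apply the mask elementwise:
# YF = m_nr * UF ; YR = m_nr * UR
```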
The resulting noise reduced front pointing beam signal YF and noise reduced rear pointing beam
signal YR for both the left microphone array 114a and the right microphone array 114b can then
be converted back to the time domain using the inverse STFT 315, which includes the left inverse
STFT 315a and the right inverse STFT 315b.
The inverse STFT 315 generates the forward pointing beam signal yF and the backward pointing
beam signal yR in the time domain.
The time domain beam signals may then be spatialized using parametric models of the head
related transfer function pairs 316. A head related transfer function (HRTF) is a response that
characterizes how an ear receives sound from a point in space. A pair of HRTFs for the two ears
can be used to synthesize binaural sound that appears to come from a particular point in space.
As an example, parametric models of the left ear HRTFs at −45° (front) and −135° (rear) and of
the right ear HRTFs at +45° (front) and +135° (rear) may be utilized.
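The HRTF-pair idea can be sketched with toy impulse responses: a direct (near-ear) filter and an indirect (far-ear) filter applied to one time-domain beam signal. The two filters below are placeholders, not measured or parametric HRTFs.

```python
import numpy as np

def apply_hrtf_pair(y, h_direct, h_indirect):
    """Filter a time-domain beam signal y with a direct and an indirect
    impulse response, yielding near-ear and far-ear contributions."""
    return np.convolve(y, h_direct), np.convolve(y, h_indirect)

y_lf = np.array([1.0, 0.5, 0.0, -0.5])   # e.g. the left front beam yLF
h_direct = np.array([0.9])               # near ear: louder, no delay
h_indirect = np.array([0.0, 0.0, 0.5])   # far ear: quieter, 2-sample delay
near, far = apply_hrtf_pair(y_lf, h_direct, h_indirect)
# near is a scaled copy of y_lf; far is a delayed, attenuated copy,
# which is what creates the interaural level and time differences.
```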
[0044]
Each HRTF pair 316 can include a direct HRTF and an indirect HRTF. With particular reference
to the left microphone array signal processing unit 308a shown in FIG. 3, the left front HRTF
pair 316a can be applied to the left noise reduced front pointing beam signal yLF to obtain the
left front direct HRTF output signal HD,LF and the left front indirect HRTF output signal HI,LF.
Similarly, the left rear HRTF pair 316c can be applied to the left noise reduced rear pointing
beam signal yLR to obtain the left rear direct HRTF output signal HD,LR and the left rear
indirect HRTF output signal HI,LR. The left front direct HRTF output signal HD,LF and the left
rear direct HRTF output signal HD,LR may be added to obtain at least a first portion of the left
headphone output signal LH. Meanwhile, the left front indirect HRTF output signal HI,LF and the
left rear indirect HRTF output signal HI,LR may be added to obtain at least a first portion of the
right headphone output signal RH.
[0045]
With particular reference to the right microphone array signal processor 308b, the right front
HRTF pair 316b can be applied to the right noise reduced front pointing beam signal yRF to
obtain the right front direct HRTF output signal HD,RF and the right front indirect HRTF output
signal HI,RF. Similarly, the right rear HRTF pair 316d can be applied to the right noise reduced
rear pointing beam signal yRR to obtain the right rear direct HRTF output signal HD,RR and the
right rear indirect HRTF output signal HI,RR. The right front direct HRTF output signal HD,RF
and the right rear direct HRTF output signal HD,RR may be added to obtain at least a second
portion of the right headphone output signal RH. Meanwhile, the right front indirect HRTF
output signal HI,RF and the right rear indirect HRTF output signal HI,RR may be added to
obtain at least a second portion of the left headphone output signal LH.
[0046]
Collectively, the final left headphone output signal LH and right headphone output signal RH sent
to the respective left headphone speaker 214a and right headphone speaker 214b may be
expressed using Equations 10 and 11 below.
[0047]
LH = HD,LF + HD,LR + HI,RF + HI,RR   (Equation 10)
[0048]
RH = HD,RF + HD,RR + HI,LF + HI,LR   (Equation 11)
[0049]
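The combination described in paragraphs [0044] and [0045] pairs each ear's same-side direct outputs with the opposite side's indirect outputs; a minimal sketch with placeholder signal arrays:

```python
import numpy as np

# Placeholder HRTF output signals (each would be a block of audio samples);
# the constant values 0..7 simply make the sums easy to check.
n = 4
names = ["HD_LF", "HD_LR", "HI_LF", "HI_LR",
         "HD_RF", "HD_RR", "HI_RF", "HI_RR"]
h = {name: np.full(n, float(i)) for i, name in enumerate(names)}

# Equation 10: left headphone = left direct parts + right indirect parts.
LH = h["HD_LF"] + h["HD_LR"] + h["HI_RF"] + h["HI_RR"]
# Equation 11: right headphone = right direct parts + left indirect parts.
RH = h["HD_RF"] + h["HD_RR"] + h["HI_LF"] + h["HI_LR"]
```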
FIG. 4 is an exemplary signal processing application utilizing HRTF pairs 416a-d according to
the parametric model disclosed in U.S. Patent Application Publication No. 2013/0243200 A1,
published Sep. 19, 2013, which is incorporated herein by reference.
As shown, each HRTF pair 416a-d includes one or more summing filters (e.g., "Hsrear"), cross
filters (e.g., "Hcfront", "Hcrear", etc.), or interaural delay filters (e.g., "Tfront", "Trear", etc.) to
convert the directional signals yLF, yLR, yRF, yRR into their respective direct and indirect HRTF
output signals.
[0050]
FIG. 5 is a simplified process flow diagram of a microphone array signal processing method 500
in accordance with one or more embodiments of the present disclosure.
At step 505, the headset 100 can receive a microphone array signal v.
More specifically, the DSP 212 can receive the left microphone array signals vLF and vLR and the
right microphone array signals vRF and vRR, and convert these signals into the frequency
domain. From the microphone array signals, DSP 212 may then generate a pair of beamformed
directional signals UF, UR for each microphone array 114, as provided in step 510. At step 515,
the DSP 212 may implement noise reduction to suppress diffuse noise by applying a common
mask mNR. The resulting noise reduced directional signals Y may be transformed back to the
time domain (not shown). Next, HRTF pairs 316 may be applied to each noise reduced
directional signal y, as provided at step 520, to convert the audio signals to a binaural format. At
step 525, the final left headphone output signal LH and right headphone output signal RH can
be generated by pairwise adding the signal outputs from the respective left microphone array
signal processing unit 308a and right microphone array signal processing unit 308b, as
described above with respect to FIG. 3.
[0051]
FIG. 6 is a more detailed exemplary process flow diagram of a microphone array signal
processing method 600 in accordance with one or more embodiments of the present disclosure.
As described above with respect to FIG. 3, the same steps may be employed in processing both
left and right microphone array signals. In step 605, the headset 100 can receive the left
microphone array signal vLF, vLR and the right microphone array signal vRF, vRR. The left
microphone array signals vLF, vLR may represent the audio received from the external sound
source 216 at the left front microphone 116a and the left rear microphone 116c. Similarly, the right
microphone array signals vRF, vRR may represent the sound received from the external sound
source 216 at the right front microphone 116b and the right rear microphone 116d. Each input
microphone signal may be converted from analog form to digital form, as provided at step 610.
Further, at step 615, the digitized left and right microphone array signals may be transformed
into the frequency domain using, for example, the short-term Fourier transform (STFT) 306. The
left front microphone signal vector VLF, the left rear microphone signal vector VLR, the right
front microphone signal vector VRF, and the right rear microphone signal vector VRR may be
obtained as a result of the conversion to the frequency domain.
[0052]
In step 620, the DSP 212 can calculate a pair of cardioid signals X+/− for each of the left front
and rear microphone signal vectors VLF, VLR and the right front and rear microphone signal
vectors VRF, VRR. The cardioid signals X+/− may be calculated using a subtractive delay
beamformer, as shown in Equations 1 and 2. A time- and frequency-dependent mask m+/− may
then be calculated for each pair of cardioid signals X+/−, as provided at step 625. For example,
the DSP 212 can calculate the time- and frequency-dependent masks m+/− using the left
cardioid signals and the left microphone signal vectors, as shown by Equation 3. The DSP 212
can also calculate separate time- and frequency-dependent masks m+/− using the right cardioid
signals and the right microphone signal vectors. As demonstrated in step 630, the time- and
frequency-dependent masks m+/− are then applied to their respective microphone signal
vectors V using Equations 4 and 5 to generate the left front pointing beam signal ULF, the left
rear pointing beam signal ULR, the right front pointing beam signal URF, and the right rear
pointing beam signal URR.
The beamformed signals may be subjected to noise reduction at step 635 to suppress
uncorrelated signal components. To achieve this, the common mask mNR may be applied to the
left front pointing beam signal ULF, the left rear pointing beam signal ULR, the right front
pointing beam signal URF, and the right rear pointing beam signal URR using Equations 8 and 9.
The common mask mNR can suppress diffuse sound, thereby emphasizing directional sound, and
can be calculated as described above with respect to Equation 7.
[0053]
At step 640, the resulting noise reduced beam signals Y may be transformed back to the time
domain using the inverse STFT 315. The resulting time domain beam signals y may then be
converted to binaural form at step 645 using the parametric models of the HRTF pairs 316. For
example, the DSP 212 can apply parametric models of the left ear HRTF pairs 316a, 316c to
spatialize the noise reduced left front pointing beam signal yLF and left rear pointing beam
signal yLR for the left microphone array 114a. Similarly, the DSP 212 can apply parametric
models of the right ear HRTF pairs 316b, 316d to spatialize the noise reduced right front
pointing beam signal yRF and right rear pointing beam signal yRR for the right microphone
array 114b. At step 650, the various left HRTF output signals and right HRTF output signals can
then be pairwise added to generate the respective left headphone output signal LH and right
headphone output signal RH, as described above with respect to Equations 10 and 11.
[0054]
Although exemplary embodiments are described above, these embodiments are not intended to
describe every conceivable form of the present invention. Rather, it is understood that the terms
used herein are words of description rather than limitation, and that various changes may be
made without departing from the spirit and scope of the subject matter presented herein. In
addition, the features of the various implemented embodiments may be combined to form
further embodiments of the present disclosure.