JP2010124370
Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2010124370
A sound signal is processed in the frequency domain to generate a sound signal with relatively
reduced noise. A signal processing apparatus 10 comprises orthogonal transform units 212 and
214 that convert each of two sound signals, among the time-axis sound signals input from at
least two sound input units MIC1 and MIC2, into spectral signals on the frequency axis; a phase
difference calculation unit 222 that obtains the phase difference between the two converted
spectral signals on the frequency axis; and a filter unit 300 that, when the phase difference is
within a predetermined range, phase-shifts the first spectral signal for each frequency to
generate a phase-shifted spectral signal, and combines the phase-shifted spectral signal with the
second spectral signal to generate a filtered spectral signal. [Selected figure] Figure 3A.
Signal processing apparatus, signal processing method, and signal processing program
[0001]
The present invention relates to the processing of sound signals, and in particular to the
processing of sound signals in the frequency domain.
[0002]
A microphone array device can give directivity to received sound by processing the sound
signals picked up and converted by its array of microphones.
[0003]
In a microphone array device, processing the sound signals from a plurality of microphones to
improve the S/N (signal-to-noise) ratio makes it possible to suppress unnecessary noise, that is,
sound waves arriving from directions different from the receiving direction of the target sound,
or arriving from the suppression direction.
04-05-2019
1
[0004]
A known noise component suppression device comprises: means for frequency-analyzing the
input signal detected at each of a plurality of sound receiving positions to obtain frequency
components per channel; first beamformer processing means for obtaining the target voice
component by filtering the frequency components of each channel with filter coefficients that
reduce sensitivity outside the desired direction, thereby suppressing noise from outside the
speaker direction; second beamformer processing means for obtaining the noise component by
filtering that reduces the sensitivity of each channel's frequency components in the speaker
direction, thereby suppressing the speaker voice; estimation means for estimating the noise
direction from the filter coefficients of the first beamformer processing means and the target
voice direction from the filter coefficients of the second beamformer processing means; means
by which the first beamformer processing means corrects the assumed arrival direction of the
target voice according to the target voice direction estimated by the estimation means, and
means for correcting the assumed arrival direction of noise according to the noise direction
estimated by the estimation means; means for performing spectral subtraction processing based
on the outputs of the first and second beamformer processing means; means for obtaining a
directionality index from the time difference and amplitude difference of the incoming sound;
and means for controlling the spectral subtraction processing based on the directionality index
and the estimated target voice direction.
As a result, noise suppression processing that requires only a small amount of computation and
can eliminate sudden noise becomes possible.
[0005]
Certain known directional sound collectors receive sound input from sound sources present in
multiple directions and convert it into signals on the frequency axis.
A suppression function for suppressing the converted frequency-axis signal is calculated, and
the signal is corrected by multiplying its amplitude component by the calculated suppression
function.
The phase component of each converted frequency-axis signal is calculated, and the difference
of the phase components is computed for each frequency. Based on the calculated difference, a
probability value indicating the likelihood that the sound source is present in a predetermined
direction is identified, and based on that probability value, a suppression function is calculated
that suppresses sound input from sound sources other than the one in the predetermined
direction. As a result, when signals from sound sources present in multiple directions, such as an
audio signal containing noise, are input, a large number of microphones need not be installed,
and the audio signal emitted by the sound source in the predetermined direction can simply be
emphasized while ambient noise is suppressed.
Patent Document 1: Japanese Patent Application Publication No. 2001-100800. Patent Document
2: Japanese Patent Application Publication No. 2007-318528. Non-Patent Document: "Small
Feature: Microphone Array", Journal of the Acoustical Society of Japan, Vol. 51, No. 5, 1995,
pp. 384-414.
[0006]
In a voice processing apparatus having a plurality of sound input units, each sound signal is
processed in the time domain by sample delay and subtraction so that the suppression direction
is opposite to the receiving direction of the target sound. This processing can sufficiently
suppress noise arriving from the suppression direction. However, when background noise, such
as the traveling noise heard inside a car, arrives from a plurality of directions, the arrival
directions of background noise deviate from the single suppression direction in more than one
way, so the noise cannot be sufficiently suppressed. On the other hand, increasing the number of
sound input units improves the noise suppression capability, but increases both the cost and the
size of the sound input section.
[0007]
The inventor has recognized that, in a device having a plurality of sound input units, noise can
be suppressed more sufficiently if noise suppression is performed by synchronizing and
subtracting two sound signals in the frequency domain according to the sound source direction
of the sound signals at the sound input units.
[0008]
An object of the present invention is to process a sound signal in the frequency domain to
generate a sound signal with relatively reduced noise.
[0009]
According to a feature of the present invention, a signal processing apparatus having at least
two sound input units comprises: an orthogonal transform unit that converts each of two sound
signals, among the time-axis sound signals input from the at least two sound input units, into a
spectral signal on the frequency axis; a phase difference calculation unit that obtains the phase
difference between the two converted spectral signals on the frequency axis; and a filter unit
that, when the phase difference is within a predetermined range, phase-shifts each component of
the first of the two spectral signals for each frequency to generate a phase-shifted spectral
signal, and combines the phase-shifted spectral signal with the second of the two spectral
signals to generate a filtered spectral signal.
[0010]
The present invention also relates to a method and program for realizing the above-mentioned
signal processing device.
[0011]
According to the present invention, it is possible to generate a sound signal with relatively
reduced noise.
[0012]
Embodiments of the present invention will be described with reference to the drawings.
Similar components are given the same reference numerals in the drawings.
[0013]
FIG. 1 shows the arrangement of an array of at least two microphones MIC1, MIC2, . . .
[0014]
In general, the plurality of microphones MIC1, MIC2, . . . are spaced apart from one another at a
known distance d on a straight line.
Here, as a typical example, it is assumed that at least two adjacent microphones MIC1 and MIC2
are arranged on a straight line at a distance d from each other.
The distances between adjacent microphones need not be equal, and may be different known
distances, as long as the sampling theorem is satisfied, as described below.
[0015]
In the embodiment, an example using two microphones MIC1 and MIC2 among a plurality of
microphones will be described.
[0016]
In FIG. 1, the target sound source SS is located to the left of the microphone MIC1 on the
straight line, and the direction of the target sound source SS is the sound receiving direction, or
target direction, of the microphone array MIC1, MIC2.
Typically, the target sound source SS is the speaker's mouth, and the sound receiving direction
is the direction of the speaker's mouth. A predetermined angle range near the sound receiving
direction may be set as the sound receiving angle range. Also, typically, the direction opposite to
the sound receiving direction (+π) may be set as the main suppression direction of noise, and a
predetermined angle range near the main suppression direction may be set as the noise
suppression angle range. The noise suppression angle range may be determined for each
frequency f.
[0017]
The distance d between the microphones MIC1 and MIC2 is preferably set to satisfy the
condition d < (sonic velocity c) / (sampling frequency fs), so as to satisfy the sampling theorem,
or Nyquist theorem. In FIG. 1, the directivity characteristics, or directivity patterns (for
example, cardioid shapes), of the microphone array MIC1, MIC2 are shown by closed dashed
curves. The input signal received and processed by the microphone array MIC1, MIC2 depends on
the incident angle θ (= −π/2 to +π/2) of the sound wave relative to the straight line on which
the microphones MIC1 and MIC2 are arranged, but does not depend on the radial incident
direction (0 to 2π) in the plane perpendicular to that straight line.
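As a quick numerical check of this spacing condition, the maximum allowable distance can be computed directly; the sketch below assumes c = 340 m/s (a common room-temperature value, not stated in the document) and the 8 kHz sampling frequency used later in [0025].

```python
# Maximum microphone spacing allowed by the sampling theorem: d < c / fs.
def max_spacing(c=340.0, fs=8000.0):
    """Upper bound on the inter-microphone distance d, in metres."""
    return c / fs

d_max = max_spacing()
print(round(d_max * 100, 2))  # prints 4.25: d must be below 4.25 cm at fs = 8 kHz
```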
[0018]
The voice of the target sound source SS is detected at the right-side microphone MIC2 with a
delay time τ = d/c relative to the left-side microphone MIC1. Conversely, noise 1 from the main
suppression direction is detected at the left-side microphone MIC1 with a delay time τ = d/c
relative to the right-side microphone MIC2. Noise 2, arriving from a suppression direction
shifted within the suppression range around the main suppression direction, is detected at the
left-side microphone MIC1 delayed by τ = d·sin θ / c relative to the right-side microphone
MIC2. The angle θ is the arrival direction of noise 2 in the assumed suppression direction. In
FIG. 1, an alternate long and short dash line indicates the wave front of noise 2. The arrival
direction of noise 1, the case θ = +π/2, is the main suppression direction of the input signal.
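The delay relation τ = d·sin θ / c can be illustrated numerically; the spacing d = 4 cm below is an illustrative value, not taken from the document.

```python
import math

# Arrival-time difference between the two microphones spaced d apart for a
# plane wave from angle theta (theta = +pi/2 is the main suppression direction).
def delay_seconds(d, theta, c=340.0):
    return d * math.sin(theta) / c

# Broadside sound (theta = 0) arrives simultaneously; end-fire noise
# (theta = +pi/2) is delayed by the full d / c.
tau = delay_seconds(0.04, math.pi / 2)
print(tau)  # about 1.18e-4 s, i.e. roughly one sample period at fs = 8 kHz
```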
[0019]
Noise 1 from the main suppression direction (θ = +π/2) can be suppressed by delaying the input
signal IN2(t) of the right-side microphone MIC2 by τ = d/c and subtracting it from the input
signal IN1(t) of the left-side microphone MIC1. However, this cannot suppress noise 2 coming
from an angular direction (0 < θ < +π/2) shifted from the main suppression direction.
[0020]
The inventor has recognized that noise from directions in the suppression range can be
sufficiently suppressed by phase-synchronizing the spectrum of one of the input signals of the
microphones MIC1 and MIC2 with the spectrum of the other, according to the phase difference
between the two input signals for each frequency, and then taking the difference between the
two spectra.
[0021]
FIG. 2 shows a schematic configuration of a microphone array device 100 including the
microphones MIC1 and MIC2 of FIG. 1, according to an embodiment of the present invention.
The microphone array device 100 comprises the microphones MIC1 and MIC2, amplifiers 122 and
124, low-pass filters (LPFs) 142 and 144, a digital signal processor (DSP) 200, and a memory 202
including RAM and the like. The microphone array device 100 may be, for example, an in-vehicle
device having a voice recognition function, or an information device such as a car navigation
device, a hands-free telephone, or a mobile telephone.
[0022]
As an optional additional configuration, the microphone array device 100 may be coupled to, or
include, a speaker direction detection sensor 192 and a direction determination unit 194. The
processor 10 and the memory 12 may be included in one device together with the utilization
application 400, or may be included in another information processing device.
[0023]
The speaker direction detection sensor 192 may be, for example, a digital camera, an ultrasonic
sensor, or an infrared sensor. As an alternative configuration, the direction determination unit
194 may be implemented on the processor 10 operating according to a program for direction
determination stored in the memory 12.
[0024]
Analog input signals converted from sound by the microphones MIC1 and MIC2 are supplied to,
and amplified by, the amplifiers 122 and 124, respectively. The outputs of the amplifiers 122 and
124 are respectively coupled to the inputs of the low-pass filters 142 and 144, which have a
cut-off frequency fc (for example, 3.9 kHz). Although only low-pass filters are used here,
band-pass filters may be used, or high-pass filters may be used in combination.
[0025]
The outputs of the low-pass filters 142 and 144 are respectively coupled to the inputs of
analog-to-digital converters 162 and 164 operating at a sampling frequency fs (for example,
8 kHz, with fs > 2fc), and are converted into digital input signals. The time-domain digital input
signals IN1(t) and IN2(t) from the analog-to-digital converters 162 and 164 are respectively
coupled to the inputs of the digital signal processor (DSP) 200.
[0026]
The digital signal processor 200 converts the time-domain digital input signals IN1(t) and IN2(t)
into signals in the frequency domain, processes them using the memory 202 to suppress noise
from directions in the suppression range, and generates a processed time-domain digital output
signal INd(t).
[0027]
As mentioned above, the digital signal processor 200 may be coupled to the direction
determination unit 194 or the processor 10.
In this case, the digital signal processor 200 suppresses noise from the suppression range on the
side opposite to the sound reception range, according to information representing the sound
reception range supplied from the direction determination unit 194 or the processor 10.
[0028]
The direction determination unit 194 or the processor 10 may process a setting signal entered
by the user's key input to generate the information representing the sound reception range.
Alternatively, the direction determination unit 194 or the processor 10 may detect or recognize
the presence of the speaker based on detection data or image data captured by the sensor 192,
determine the direction of the speaker, and generate the information representing the sound
reception range.
[0029]
The digital output signal INd(t) is used, for example, for voice recognition or for a mobile
telephone call. The digital output signal INd(t) is provided to the subsequent utilization
application 400, where it is, for example, digital-to-analog converted by the digital-to-analog
converter 404 and low-pass filtered by the low-pass filter 406 to generate an analog signal, or
stored in the memory 414 and used by the speech recognition unit 416 for speech recognition.
The speech recognition unit 416 may be a processor implemented as hardware, or may be
implemented in software as a processor operating according to a program stored in the memory
414, which includes, for example, ROM and RAM.
[0030]
The digital signal processor 200 may be a signal processing circuit implemented as hardware, or
may be implemented in software as a signal processing circuit operating according to a program
stored in the memory 202, which includes, for example, ROM and RAM.
[0031]
In FIG. 1, the microphone array device 100 sets an angle range around the target sound source
direction θ = −π/2, for example −π/2 ≤ θ < 0, as the sound reception range, and an angle range
around the main suppression direction θ = +π/2, for example +π/6 < θ ≤ +π/2, as the
suppression range.
Further, the microphone array device 100 sets the angle range between the sound reception
range and the suppression range, for example 0 ≤ θ ≤ +π/6, as the transition (switching)
range.
[0032]
FIGS. 3A and 3B show an example of a schematic configuration of the microphone array device
100, which can relatively reduce noise by noise suppression using the arrangement of the
microphone array MIC1, MIC2 of FIG. 1.
[0033]
The digital signal processor 200 includes fast Fourier transformers 212 and 214, whose inputs
are coupled to the outputs of the analog-to-digital converters 162 and 164, a synchronization
coefficient generation unit 220, and a filter unit 300.
In this embodiment, the fast Fourier transform is used for the frequency transform, or
orthogonal transform, but other transforms into the frequency domain (for example, the discrete
cosine transform or the wavelet transform) may be used.
[0034]
The synchronization coefficient generation unit 220 includes a phase difference calculation unit
222, which calculates the phase difference between the complex spectra for each frequency f,
and a synchronization coefficient calculation unit 224. The filter unit 300 includes a
synchronization unit 332 and a subtraction unit 334.
[0035]
The time-domain digital input signals IN1(t) and IN2(t) from the analog-to-digital converters
162 and 164 are provided to the inputs of the fast Fourier transformers (FFTs) 212 and 214,
respectively. The fast Fourier transformers 212 and 214 multiply signal sections of the digital
input signals IN1(t) and IN2(t) by an overlapping window function, and Fourier transform, or
orthogonally transform, the products in a known manner to generate the complex spectra IN1(f)
and IN2(f) in the frequency domain. Here, IN1(f) = A1·e^{j(2πft+φ1(f))} and
IN2(f) = A2·e^{j(2πft+φ2(f))}, where f is the frequency, A1 and A2 are amplitudes, j is the
imaginary unit, and φ1(f) and φ2(f) are delay phases that are functions of the frequency f. As
the overlapping window function, for example, a Hamming window, a Hanning window, a
Blackman window, a 3-sigma Gaussian window, or a triangular window can be used.
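The framing and transform described above can be sketched as follows; the frame length of 256 samples is an illustrative assumption, and the Hanning window is one of the windows listed in the paragraph.

```python
import numpy as np

# One analysis frame: multiply the time-domain block by an overlapping window
# (Hanning here) and take the FFT to obtain a complex spectrum, as the fast
# Fourier transformers 212/214 do.
def to_spectrum(frame):
    n = len(frame)
    return np.fft.rfft(frame * np.hanning(n))

fs = 8000
t = np.arange(256) / fs
frame = np.sin(2 * np.pi * 1000 * t)   # 1 kHz test tone
spec = to_spectrum(frame)
peak_bin = int(np.argmax(np.abs(spec)))
print(peak_bin * fs / 256)             # prints 1000.0: spectral peak at the tone
```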
[0036]
The phase difference calculation unit 222 obtains the phase difference DIFF(f) (in radians, rad)
of the phase spectrum components, which indicates the sound source direction for each
frequency f, between the two adjacent microphones MIC1 and MIC2 separated by the distance d,
by the following equation:
DIFF(f) = tan^−1(IN2(f) / IN1(f)) = tan^−1(A2·e^{j(2πft+φ2(f))} / A1·e^{j(2πft+φ1(f))})
= tan^−1((A2/A1)·e^{j(φ2(f)−φ1(f))})
Here, the noise at a specific frequency f is approximated as coming from only one noise source.
Also, if the amplitudes A1 and A2 of the input signals of the microphones MIC1 and MIC2 can be
approximated as equal (|IN1(f)| = |IN2(f)|), the value A2/A1 may be approximated as 1.
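The per-frequency phase difference can be computed as the argument of the complex ratio of the two spectra, a minimal sketch of the equation above; the test frequencies and the one-sample delay are illustrative.

```python
import numpy as np

# Phase difference DIFF(f) between the two spectra for each frequency bin,
# as the argument of the complex ratio IN2(f)/IN1(f), in (-pi, pi].
def phase_diff(in1, in2):
    return np.angle(in2 / in1)

# A pure delay of tau seconds appears as DIFF(f) = -2*pi*f*tau (mod 2*pi).
f = np.array([500.0, 1000.0])
tau = 1.0 / 8000                                 # one sample at fs = 8 kHz
in1 = np.exp(1j * 2 * np.pi * f * 0.01)          # arbitrary common phase
in2 = in1 * np.exp(-1j * 2 * np.pi * f * tau)    # delayed copy
print(np.round(phase_diff(in1, in2), 4))         # prints [-0.3927 -0.7854]
```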
[0037]
FIG. 4 shows the phase difference DIFF(f) (−π ≤ DIFF(f) ≤ π) for each frequency, calculated by
the phase difference calculation unit 222 according to the arrangement of the microphone array
MIC1, MIC2, . . .
[0038]
The phase difference calculation unit 222 supplies the value of the phase difference DIFF(f) of
the phase spectrum components for each frequency f between the two adjacent input signals
IN1(f) and IN2(f) to the synchronization coefficient calculation unit 224.
[0039]
For a specific frequency f, the synchronization coefficient calculation unit 224 assumes that
noise from the suppression range θ (for example, +π/6 < θ ≤ +π/2) present in the input signal
at the position of the microphone MIC1 arrives in the input signal of the microphone MIC2 as the
same noise delayed by the phase difference DIFF(f).
Furthermore, in the transition range θ (for example, 0 ≤ θ ≤ +π/6), the synchronization
coefficient calculation unit 224 gradually changes, or switches, between the processing method
of the sound reception range and the noise suppression processing level of the suppression
range at the position of the microphone MIC1.
[0040]
The synchronization coefficient calculation unit 224 calculates the synchronization coefficient
C(f) based on the phase difference DIFF(f) of the phase spectrum components for each frequency
f, according to the following equations.
[0041]
(A) The synchronization coefficient calculation unit 224 sequentially calculates the
synchronization coefficient C(f) for each temporal analysis frame (window) i of the fast Fourier
transform, where i (= 0, 1, 2, . . .) is the temporal order number of the analysis frame. When the
phase difference DIFF(f) is a value in the suppression range (for example, +π/6 < θ ≤ +π/2),
the synchronization coefficient C(f, i) = Cn(f, i) is:
for the initial order number i = 0: C(f, 0) = Cn(f, 0) = IN1(f, 0) / IN2(f, 0)
for order numbers i > 0: C(f, i) = Cn(f, i) = α·C(f, i−1) + (1−α)·IN1(f, i) / IN2(f, i)
[0042]
Here, IN1(f, i)/IN2(f, i) represents the ratio of the complex spectrum of the input signal of the
microphone MIC1 to the complex spectrum of the input signal of the microphone MIC2, that is,
their amplitude ratio and phase difference. It can also be said that IN1(f, i)/IN2(f, i) represents
the inverse of the ratio of the complex spectrum of the input signal of the microphone MIC2 to
the complex spectrum of the input signal of the microphone MIC1. α is a constant in the range
0 ≤ α < 1 indicating the combining ratio of the delay phase shift amounts of the previous
analysis frames used for synchronization, and 1−α represents the combining ratio of the delay
phase shift amount of the current analysis frame added for synchronization. Thus, the current
synchronization coefficient C(f, i) is the sum of the synchronization coefficient of the previous
analysis frame and the ratio of the complex spectrum of the input signal of the microphone MIC1
to that of the microphone MIC2 in the current analysis frame, weighted in the ratio α : (1−α).
[0043]
(B) When the phase difference DIFF(f) is a value within the sound reception range (for example,
−π/2 ≤ θ < 0), the synchronization coefficient C(f) = Cs(f) is:
C(f) = Cs(f) = exp(−j2πf/fs), or C(f) = Cs(f) = 0 (no synchronized subtraction)
[0044]
(C) When the phase difference DIFF(f) corresponds to an angle θ in the transition range (for
example, 0 ≤ θ ≤ +π/6), the synchronization coefficient C(f) = Ct(f) is a weighted average of
Cs(f) and Cn(f) of the above (A) and (B) according to the angle θ:
C(f) = Ct(f) = Cs(f) × (θ−θtmin)/(θtmax−θtmin) + Cn(f) × (θtmax−θ)/(θtmax−θtmin)
Here, θtmax represents the angle of the boundary between the transition range and the
suppression range, and θtmin represents the angle of the boundary between the transition
range and the sound reception range.
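The weighted average of case (C) can be written directly from the equation above; the default boundary angles follow the document's example transition range 0 ≤ θ ≤ +π/6, and the function name is illustrative.

```python
import math

# Transition-range coefficient Ct(f), implementing the document's equation:
# Ct = Cs*(theta - theta_tmin)/span + Cn*(theta_tmax - theta)/span.
def transition_coeff(cs, cn, theta, theta_tmin=0.0, theta_tmax=math.pi / 6):
    span = theta_tmax - theta_tmin
    return cs * (theta - theta_tmin) / span + cn * (theta_tmax - theta) / span

# At the midpoint of the transition range the two coefficients get equal weight.
mid = transition_coeff(0.0, 1.0, math.pi / 12)
print(round(mid, 6))  # prints 0.5
```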
[0045]
In this manner, the synchronization coefficient generation unit 220 generates the
synchronization coefficient C(f) according to the complex spectra IN1(f) and IN2(f), and supplies
the complex spectra IN1(f) and IN2(f) and the synchronization coefficient C(f) to the filter unit
300.
[0046]
In the filter unit 300, the synchronization unit 332 synchronizes the complex spectrum IN2(f)
with the complex spectrum IN1(f) by calculating the following multiplication, generating the
synchronized spectrum INs2(f):
INs2(f) = C(f) × IN2(f)
[0047]
The subtraction unit 334 subtracts the complex spectrum INs2(f), multiplied by a coefficient
β(f), from the complex spectrum IN1(f) according to the following equation, generating the
complex spectrum INd(f) in which noise is suppressed:
INd(f) = IN1(f) − β(f) × INs2(f)
Here, the coefficient β(f) is a preset value in the range 0 ≤ β(f) ≤ 1.
The coefficient β(f) is a function of the frequency f and adjusts the degree of the synchronized
subtraction. For example, in order to strongly suppress noise arriving from the suppression
range while limiting distortion of the signal arriving from the sound reception range, β(f) may
be set larger when the arrival direction represented by the phase difference DIFF(f) is in the
suppression range than when it is in the sound reception range.
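The two filtering steps of [0046] and [0047] can be sketched together; the toy spectra below are illustrative, and β = 1 is the strongest subtraction the document allows.

```python
import numpy as np

# Filter unit 300: synchronize IN2(f) with IN1(f) using C(f), then subtract
# with weight beta to suppress noise from the suppression range.
# INs2(f) = C(f) * IN2(f);  INd(f) = IN1(f) - beta * INs2(f)
def filter_frame(in1, in2, c, beta):
    ins2 = c * in2
    return in1 - beta * ins2

# If C(f) exactly equals IN1/IN2 (a pure noise frame) and beta = 1, the
# output is cancelled completely.
in1 = np.array([1 + 1j, 2 - 1j])
in2 = np.array([0.5 + 0j, 1 + 1j])
residual = np.max(np.abs(filter_frame(in1, in2, in1 / in2, 1.0)))
print(residual < 1e-12)   # prints True
```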
[0048]
The digital signal processor 200 further includes an inverse fast Fourier transformer (IFFT) 382.
The inverse fast Fourier transformer 382 receives the spectrum INd(f) from the synchronization
coefficient calculation unit 224, performs the inverse Fourier transform on it, performs overlap
addition, and generates the time-domain output signal INd(t) at the position of the microphone
MIC1.
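The inverse transform and overlap addition can be sketched as follows; the document does not fix the frame length or overlap factor, so the 50% overlap and the constant test frames here are assumptions.

```python
import numpy as np

# Reconstruct the time-domain output by inverse FFT of each processed spectrum
# and overlap-add of half-overlapping frames.
def overlap_add(spectra, n, hop):
    out = np.zeros(hop * (len(spectra) - 1) + n)
    for i, spec in enumerate(spectra):
        out[i * hop : i * hop + n] += np.fft.irfft(spec, n)
    return out

n, hop = 8, 4
frames = [np.fft.rfft(np.ones(n) * 0.5) for _ in range(3)]
out = overlap_add(frames, n, hop)
print(out[4:12])  # interior samples sum two overlapping frames: 0.5 + 0.5 = 1.0
```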
[0049]
The output of the inverse fast Fourier transformer 382 is coupled to the input of the utilization
application 400 located downstream.
[0050]
The digital output signal INd(t) is used, for example, for voice recognition or for a mobile
telephone call.
The digital output signal INd(t) is provided to the subsequent utilization application 400, where
it is, for example, digital-to-analog converted by the digital-to-analog converter 404 and
low-pass filtered by the low-pass filter 406 to generate an analog signal, or stored in the
memory 414 and used by the speech recognition unit 416 for speech recognition.
[0051]
The components 212, 214, 220 to 224, 300 to 334, and 382 of FIGS. 3A and 3B may also be
viewed as a flow diagram implemented by the digital signal processor (DSP) 200, whether
implemented as an integrated circuit or in a program.
[0052]
FIG. 5 shows a flow chart for the generation of the complex spectrum performed by the digital
signal processor (DSP) 200 of FIG. 3A according to the program stored in the memory 202. This
flowchart corresponds to the functionality implemented by the components 212, 214, 220, 300,
and 382 of FIG. 3A.
[0053]
Referring to FIGS. 3A and 5, in step 502, the digital signal processor 200 (fast Fourier transform
units 212 and 214) inputs and captures the two time-domain digital input signals IN1(t) and
IN2(t) supplied from the analog-to-digital converters 162 and 164, respectively.
[0054]
At step 504, the digital signal processor 200 (fast Fourier transform units 212, 214) multiplies
each of the two digital input signals IN1 (t) and IN2 (t) by an overlap window function.
[0055]
In step 506, the digital signal processor 200 (fast Fourier transform units 212 and 214) Fourier
transforms the digital input signals IN1(t) and IN2(t) to generate the complex spectra IN1(f) and
IN2(f) in the frequency domain.
[0056]
In step 508, the digital signal processor 200 (the phase difference calculation unit 222 of the
synchronization coefficient generation unit 220) calculates the phase difference
DIFF(f) = tan^−1(IN2(f)/IN1(f)) between the spectra IN1(f) and IN2(f).
[0057]
In step 510, the digital signal processor 200 (the synchronization coefficient calculation unit
224 of the synchronization coefficient generation unit 220) calculates the coefficient C(f), the
ratio of the complex spectrum of the input signal of the microphone MIC1 to that of the
microphone MIC2, based on the phase difference DIFF(f), according to the following equations as
described above.
[0058]
(A) When the phase difference DIFF(f) is a value in the suppression angle range, the
synchronization coefficient C(f, i) = Cn(f, i) = α·C(f, i−1) + (1−α)·IN1(f, i)/IN2(f, i).
(B) When the phase difference DIFF(f) is a value in the sound receiving angle range, the
synchronization coefficient C(f) = Cs(f) = exp(−j2πf/fs), or C(f) = Cs(f) = 0.
(C) When the phase difference DIFF(f) is a value within the transition angle range, the
synchronization coefficient C(f) = Ct(f) is the weighted average of Cs(f) and Cn(f).
[0059]
In step 514, the digital signal processor 200 (the synchronization unit 332 of the filter unit 300)
calculates INs2(f) = C(f) × IN2(f) to synchronize the complex spectrum IN2(f) with the complex
spectrum IN1(f), generating the synchronized spectrum INs2(f).
[0060]
In step 516, the digital signal processor 200 (the subtraction unit 334 of the filter unit 300)
subtracts the complex spectrum INs2(f), multiplied by the coefficient β(f), from the complex
spectrum IN1(f) (INd(f) = IN1(f) − β(f) × INs2(f)), generating the noise-suppressed complex
spectrum INd(f).
[0061]
In step 518, the digital signal processor 200 (inverse fast Fourier transform unit 382) receives
the spectrum INd(f) from the synchronization coefficient calculation unit 224, performs the
inverse Fourier transform on it, performs overlap addition, and generates the time-domain
output signal INd(t) at the position of the microphone MIC1.
[0062]
Thereafter, the procedure returns to step 502.
Steps 502 to 518 are repeated to process the input over the required time period.
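The loop of steps 502 to 518 can be condensed into a per-frame sketch. For brevity the per-bin range test is reduced to the suppression-range case (A) applied to every bin; the function, its parameters, and the test signal are illustrative, not from the document.

```python
import numpy as np

# One pass through steps 502-518 for a single analysis frame.
def process_frame(x1, x2, c_prev, alpha=0.9, beta=1.0):
    n = len(x1)
    w = np.hanning(n)
    in1 = np.fft.rfft(x1 * w)                       # steps 502-506: window + FFT
    in2 = np.fft.rfft(x2 * w)
    ratio = in1 / np.where(np.abs(in2) > 0, in2, 1)
    coeff = alpha * c_prev + (1 - alpha) * ratio    # step 510, case (A)
    ins2 = coeff * in2                              # step 514: synchronize
    ind = in1 - beta * ins2                         # step 516: subtract
    return np.fft.irfft(ind, n), coeff              # step 518: back to time domain

# Identical inputs with a converged coefficient (ratio = 1) cancel completely,
# the behaviour expected for pure suppression-range noise.
x = np.sin(2 * np.pi * np.arange(128) / 16)
y, c = process_frame(x, x, c_prev=np.ones(65))
print(np.max(np.abs(y)) < 1e-10)   # prints True
```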
[0063]
In this manner, according to the above-described embodiment, the input signals of the
microphones MIC1 and MIC2 can be processed in the frequency domain to relatively reduce the
noise in the input signals.
Processing the input signals in the frequency domain, as described above, can detect the phase
difference with higher accuracy than processing them in the time domain, and can therefore
produce higher-quality speech with reduced noise.
The processing of the input signals from the two microphones described above can be applied to
any combination of two of the plurality of microphones (FIG. 1).
[0064]
According to the above-described embodiment, when processing certain recorded speech data
including background noise, a suppression gain of about 6 dB is obtained, compared to the usual
suppression gain of about 3 dB.
[0065]
FIGS. 6A and 6B show the sound reception range, the suppression range, and the transition
ranges set based on the data of the sensor 192 or on key input data.
The sensor 192 detects the position of the speaker's body.
The direction determination unit 194 sets the sound reception range so as to cover the speaker's
body in accordance with the detected position, and sets the transition ranges and the
suppression range according to the sound reception range.
The setting information is supplied to the synchronization coefficient calculation unit 224 of the
synchronization coefficient generation unit 220. As described above, the synchronization
coefficient calculation unit 224 calculates the synchronization coefficient according to the set
sound reception range, suppression range, and transition ranges.
[0066]
In FIG. 6A, the face of the speaker is located on the left side of the sensor 192, and the sensor
192 detects the center position θ of the speaker's face area A at an angle θ = θ1 = −π/4, for
example, as an angular position in the sound receiving range. In this case, the direction
determination unit 194 sets the angle range of the sound reception range, narrower than the
angle π, so as to include the entire face area A based on the detected data θ = θ1. The direction
determination unit 194 sets the angular width of each transition range adjacent to the sound
reception range to a predetermined angle, for example π/4, and further sets the angle of the
suppression range, located on the opposite side of the sound reception range, to the remaining
angle.
[0067]
In FIG. 6B, the face of the speaker is located below or in front of the sensor 192, and the
sensor 192 detects the center position θ of the speaker's face area A at, for example, the
angle θ = θ2 = 0 as an angular position within the sound reception range. In this case, the
direction determining unit 194 sets the angular range of the sound reception range narrower than
the angle π so as to include the entire face area A, based on the detected data θ = θ2. The
direction determining unit 194 sets the full angular width of each transition range adjacent to
the sound reception range to, for example, a predetermined angle π/4. The direction determining
unit 194 further sets the angle of the entire suppression range, located on the side opposite
the sound reception range, to the remaining angle. Instead of the position of the face, the
position of the speaker's body may be detected.
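The range geometry of FIGS. 6A and 6B can be sketched as follows (the function name, the margin-free sizing, and the return layout are assumptions; the patent only requires the reception range to cover face area A while staying narrower than π, with transition ranges of a predetermined angle such as π/4):

```python
import numpy as np

def set_ranges(theta_center, face_half_width, trans=np.pi / 4):
    """Derive reception/transition/suppression ranges from a detected face.

    theta_center: detected angular position of the face center
    (e.g. -pi/4 in FIG. 6A, 0 in FIG. 6B).
    face_half_width: half the angular extent of face area A.
    """
    rx_lo = theta_center - face_half_width
    rx_hi = theta_center + face_half_width
    assert rx_hi - rx_lo < np.pi, "reception range must stay narrower than pi"
    reception = (rx_lo, rx_hi)
    transitions = ((rx_lo - trans, rx_lo), (rx_hi, rx_hi + trans))
    # everything else, wrapping around the circle, is the suppression range
    suppression = (rx_hi + trans, rx_lo - trans + 2 * np.pi)
    return reception, transitions, suppression
```

With θ = −π/4 and a face half-width of π/8, the reception range spans π/4, each transition range spans π/4, and the suppression range takes the remaining 5π/4.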
[0068]
When the sensor 192 is a digital camera, the direction determining unit 194 performs image
recognition on the image data acquired from the digital camera to determine the face area A and
its center position θ. The direction determining unit 194 then sets the sound reception range,
the transition ranges, and the suppression range based on the face area A and its center
position θ.
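The patent does not specify how the detected face-center pixel is mapped to the angle θ; one simple illustrative assumption is a linear mapping over the camera's horizontal field of view, centered on θ = 0:

```python
import numpy as np

def pixel_to_angle(x_center, image_width, fov=np.pi / 2):
    """Map the detected face-center pixel column to an angle theta.

    Assumes the camera's horizontal field of view `fov` is centered on
    theta = 0 and that the angle varies linearly with pixel column
    (an illustrative assumption, not taken from the patent).
    """
    return (x_center / image_width - 0.5) * fov
```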
[0069]
In this manner, the direction determining unit 194 can variably set the sound reception range,
the suppression range, and the transition range according to the position of the speaker's face
or body detected by the sensor 192. As an alternative configuration, the direction determining
unit 194 may variably set the sound reception range, the suppression range, and the transition
range according to key input. By setting the sound reception range and the suppression range
variably in this manner, the sound reception range can be made as narrow as possible and the
suppression range as wide as possible, so that unnecessary noise at each frequency is suppressed
to the greatest possible extent.
[0070]
The embodiments described above are merely typical examples; combinations of the components of
the respective embodiments, and variations and modifications thereof, will be apparent to those
skilled in the art. It will be appreciated that various modifications of the above-described
embodiments can be made without departing from the scope of the invention.
[0071]
FIG. 1 shows the arrangement of an array of at least two microphones, each serving as a sound
input unit, used in an embodiment of the present invention.
FIG. 2 shows a schematic configuration of a microphone array device including the microphones of
FIG. 1 according to an embodiment of the present invention. FIGS. 3A and 3B show an example of a
schematic configuration of a microphone array device capable of relatively reducing noise by
noise suppression using the microphone arrangement of FIG. 1. FIG. 4 shows the phase difference
of the phase spectrum component for each frequency, calculated by the phase difference
calculation unit, for the microphone arrangement of FIG. 1.
FIG. 5 shows a flowchart for the generation of the complex spectrum performed by the digital
signal processor (DSP) of FIG. 3A according to a program stored in memory. FIGS. 6A and 6B show
the set states of the sound reception range, the suppression range, and the transition range set
based on the sensor data or the key input data.
Explanation of Reference Signs
[0072]
100 microphone array apparatus
MIC1, MIC2 microphones
122, 124 amplifiers
142, 144 low-pass filters
162, 164 analog-to-digital converters
212, 214 fast Fourier transformers
200 digital signal processor
220 synchronization coefficient generation unit
222 phase difference calculation unit
224 synchronization coefficient calculation unit
300 filter unit
332 synchronization unit
334 subtraction unit
382 inverse fast Fourier transformer