JP2015126279

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2015126279
PROBLEM TO BE SOLVED: To provide an audio signal processing device capable of obtaining a
sufficient audio processing result even when the microphone spacing is large. SOLUTION: The
invention relates to an apparatus that suppresses, by coherence filter processing, noise
components contained in an input audio signal acquired by a pair of microphones. The apparatus
comprises means for calculating coherence filter coefficients, means for estimating the arrival
direction of the interfering sound contained in the input audio signal, means for obtaining a
correction gain corresponding to the estimated arrival direction and correcting the low-band
coherence filter coefficients with that gain, and means for performing coherence filter processing
by applying the full-band coherence filter coefficients including the corrected low-band
coefficients. [Selected figure] Figure 4
Audio signal processing device and program
[0001]
The present invention relates to an audio signal processing apparatus and program that handle,
for example, the audio signals of telephones, teleconference devices, and the like (in this
specification, signals such as voice signals and acoustic signals are collectively called "audio
signals"). It can be applied to communication devices and communication software.
[0002]
The coherence filter method is one way of suppressing the noise component contained in a
captured audio signal.
03-05-2019
1
As described in Patent Document 1, the coherence filter method suppresses noise components
whose arrival directions deviate strongly from the front by multiplying the input signal, for each
frequency, by the cross-correlation of two signals having dead angles (nulls) on the left and
right, respectively.
[0003]
In an audio signal processing apparatus applying the coherence filter method, the input signals
acquired by the two microphones are used to generate a first directivity signal containing only
components arriving from the left of the front and a second directivity signal containing only
components arriving from the right. The cross-correlation of the first and second directivity
signals, computed for each frequency component, is then multiplied into the input signal as a
coefficient value (the coherence filter coefficient), yielding a noise-suppressed signal. The
arithmetic expression for the coherence filter coefficient is described in Patent Document 1.
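The per-frequency coefficient described above can be sketched in code. The exact arithmetic expression is deferred to Patent Document 1, so the sketch below assumes a common normalized cross-spectrum form; the function name and the normalization are illustrative assumptions, not the patent's formula.

```python
import numpy as np

def coherence_filter_coef(B1, B2, eps=1e-12):
    """Per-frequency coherence filter coefficient from the two
    directivity signals B1(f), B2(f) (complex spectra).

    Assumed form (not the patent's exact equation):
        coef(f) = |B1(f) conj(B2(f))| / (0.5 (|B1|^2 + |B2|^2))
    which is 1 when B1 and B2 coincide (frontal arrival) and 0 when
    one of them falls entirely in its dead angle."""
    num = np.abs(B1 * np.conj(B2))
    den = 0.5 * (np.abs(B1) ** 2 + np.abs(B2) ** 2) + eps
    return num / den
```

By construction the coefficient lies between 0 and 1, so multiplying it into the input spectrum attenuates bins whose two directivity signals disagree.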
[0004]
Here, the coherence, which is the average over frequency components of the coherence filter
coefficients (the per-frequency correlation values of the first and second directivity signals), is a
parameter that reflects the arrival direction of the interfering sound (a voice or sound that
interferes with the voice to be extracted). Patent Document 2 describes using the coherence in an
interference-sound interval to estimate the arrival direction of the interfering sound.
[0005]
Patent Document 1: JP 2013-126026 A
Patent Document 2: JP 2013-182044 A
[0006]
Non-Patent Document 1: Asano Tadashi, edited by the Acoustical Society of Japan, "Array Signal
Processing of Sound", Corona Publishing, February 2011, first edition.
[0007]
A voice signal processing device applying the coherence filter method may be used, for example,
as the voice input/recognition device of a smartphone.
It is also conceivable to place the two microphones apart in the longitudinal direction of the
smartphone (for example, about 14 cm) rather than in its width direction (for example, about
3 cm). In this case the distance between the two microphones (the microphone spacing) becomes
quite large.
[0008]
When the microphone spacing is extended beyond a certain value, the directivity of the low-band
components of the first and second directivity signals becomes sharper.
FIG. 9 illustrates this: it shows the directivity for a certain low-frequency component (for
example, about 500 Hz) of the first directivity signal. The dashed curve shows the case where the
microphone spacing is narrow, and the solid curve the case where it is wide; with the wide
spacing, the directivity in the direction shifted 90° to the right of the front is several times
stronger, i.e., the directivity is sharper. The circles in FIG. 9 are drawn as a scale indicating the
strength of the directivity.
[0009]
When the coherence filter coefficient is calculated from first and second directivity signals
having such sharp directivity, its behavior changes greatly compared with the narrow-spacing
case. For example, when the interfering sound arrives from the left, the first directivity signal
becomes small because the sound falls in its dead angle, while the left-side component of the
second directivity signal becomes larger than in the narrow-spacing case. The characteristic
difference between the first and second directivity signals therefore becomes extremely large,
and the correlation (the coherence filter coefficient) becomes extremely small. When such
coherence filter coefficients are applied to noise suppression, a strong suppression effect is
obtained, but the target voice component is also heavily distorted and the sound quality degrades
significantly.
[0010]
In addition, when the microphone spacing is increased, an error component known as spatial
aliasing is mixed in, and the shape of the formed directivity is deformed (see page 76 of
Non-Patent Document 1). FIG. 10 shows an example of the directivity change caused by spatial
aliasing, for a certain frequency of the first directivity signal. The dotted curve shows the
directivity when the microphone spacing is narrow, which has a cardioid shape. The solid curve
shows the directivity when the spacing is wide; due to spatial aliasing it takes a peculiar shape
with dead angles at several azimuths.
[0011]
When the coherence filter coefficient is calculated from first and second directivity signals
having such directivity, and the coherence is then computed from the coefficients, the behavior
of the coherence also changes significantly. Because the directivity is nearly left-right
symmetric, if the interfering sound arrives, for example, 45° to the left, both the first and
second directivity signals capture it; the correlation therefore becomes large even though an
interfering sound is present, and as a result the coherence becomes large.
[0012]
As described above, when the microphone spacing becomes large and spatial alias components
are mixed in, an accurate estimate cannot be obtained in the determination of the arrival
direction.
[0013]
Therefore, there is a need for an audio signal processing apparatus and program that can obtain
sufficient audio processing results even if the microphone spacing is large.
[0014]
A first aspect of the present invention is an audio signal processing apparatus that suppresses,
by coherence filter processing, noise components included in an input audio signal acquired by a
pair of microphones, the apparatus comprising: (1) coherence filter coefficient calculation means
for calculating coherence filter coefficients; (2) arrival direction estimation means for
estimating the arrival direction of the interfering sound included in the input audio signal; (3)
low-band filter coefficient correction means for obtaining a correction gain corresponding to the
estimated arrival direction and correcting the low-band coherence filter coefficients with that
gain; and (4) filter processing execution means for performing coherence filter processing by
applying the full-band coherence filter coefficients including the corrected low-band coherence
filter coefficients.
[0015]
A second aspect of the present invention is an audio signal processing program for suppressing,
by coherence filter processing, noise components contained in an input audio signal captured by
a pair of microphones, the program causing a computer to function as: (1) coherence filter
coefficient calculation means for calculating coherence filter coefficients; (2) arrival direction
estimation means for estimating the arrival direction of the interfering sound included in the
input audio signal; (3) low-band filter coefficient correction means for obtaining a correction
gain corresponding to the estimated arrival direction and correcting the low-band coherence
filter coefficients with that gain; and (4) filter processing execution means for performing
coherence filter processing by applying the full-band coherence filter coefficients including the
corrected low-band coherence filter coefficients.
[0016]
A third aspect of the present invention is an audio signal processing apparatus that estimates
the arrival direction of the interfering sound included in an input audio signal obtained by a
pair of microphones, the apparatus comprising: (1) coherence filter coefficient calculation means
for calculating at least the coherence filter coefficients of a limited band; (2) limited-band
coherence calculation means for calculating the limited-band coherence from those coefficients;
and (3) arrival direction acquisition means in which various values of the limited-band
coherence and arrival directions are associated in advance, and which obtains the arrival
direction associated with the calculated limited-band coherence and outputs it as an estimate.
[0017]
A fourth aspect of the present invention is an audio signal processing program for estimating
the arrival direction of the interfering sound contained in an input audio signal captured by a
pair of microphones, the program causing a computer to function as: (1) coherence filter
coefficient calculation means for calculating at least the coherence filter coefficients of a
limited band; (2) limited-band coherence calculation means for calculating the limited-band
coherence by applying those coefficients; and (3) an arrival direction acquisition unit in which
various values of the limited-band coherence and arrival directions are associated in advance,
and which obtains the arrival direction associated with the calculated limited-band coherence
and outputs it as an estimate.
[0018]
According to the present invention, it is possible to realize an audio signal processing apparatus
and program capable of obtaining sufficient audio processing results even if the microphone
spacing is large.
[0019]
FIG. 1 is an explanatory diagram showing the behavior of the low-band coherence filter
coefficient, for two arrival directions, when the microphone spacing is wide.
FIG. 2 is an explanatory diagram showing the behavior of the low-band coherence filter
coefficient, for two arrival directions, when the microphone spacing is narrow.
FIG. 3 is a block diagram showing the overall configuration of the audio signal processing
device according to the first embodiment.
FIG. 4 is a block diagram showing the detailed configuration of the coherence filter processing
unit in the audio signal processing device of the first embodiment.
FIG. 5 is an explanatory diagram showing the conversion table, used by the correction gain
determination unit of the first embodiment, that associates arrival directions with correction
gains.
FIG. 6 is a block diagram showing the detailed configuration of the arrival direction estimation
unit within the coherence filter processing unit of the first embodiment.
FIG. 7 is an explanatory diagram showing the conversion table, used by the arrival direction
estimation unit of the first embodiment, that associates the mid-range coherence with arrival
directions.
FIG. 8 is an explanatory diagram showing the behavior of the mid-range coherence for each
arrival direction of the interfering sound.
FIG. 9 is an explanatory diagram showing that the directivity of the low-frequency components
of the directivity signals sharpens when the microphone spacing is widened beyond a certain
value.
FIG. 10 is an explanatory diagram showing an example of the change in directional
characteristics caused by spatial aliasing.
FIG. 11 is an explanatory diagram of the terms describing angles.
[0020]
(A) First Embodiment A first embodiment of the audio signal processing apparatus and program
according to the present invention will now be described in detail with reference to the
drawings. The audio signal processing apparatus of the first embodiment applies the coherence
filter method to suppress the noise component contained in an audio signal.
[0021]
(A-1) Concept of the First Embodiment As described above, the phenomenon in which the
directivity of the low-frequency components of the directivity signals sharpens as the
microphone spacing widens is unavoidable. In the first embodiment, when the microphone
spacing is wide enough to sharpen the low-frequency directivity, the coherence filter coefficients
are corrected toward values at which sound quality and suppression performance are roughly
balanced, so that excessive suppression is not performed.
[0022]
FIGS. 1 and 2 compare the behavior of the low-band coherence filter coefficient between wide
and narrow microphone spacings for two arrival directions (diagonally front and lateral); FIG. 1
shows the wide-spacing case and FIG. 2 the narrow-spacing case. In this specification,
"diagonal front", "intermediate", and "lateral" refer to directions deviated from the front by
angles as shown in the explanatory drawing of angle terms. As FIG. 1 shows, when the
microphone spacing is wide the coherence filter coefficient takes extremely small values and
hardly changes with the arrival direction of the interfering sound. On the other hand, as FIG. 2
shows, when the spacing is narrow the coefficient becomes large, and the difference in its range
depending on the arrival direction also becomes pronounced.
[0023]
Based on these characteristics, when the microphone spacing is wide, the coefficient is brought
close to the coherence filter coefficient value of the narrow-spacing case, at which suppression
performance and sound quality are roughly balanced. Specifically, for each arrival direction, the
ratio of the coefficient value for a narrow spacing to the coefficient value for a wide spacing is
calculated and stored in advance, and this ratio is multiplied into the low-band coherence filter
coefficient as a correction gain. As described above, since the range of the low-band coherence
filter coefficients can be bounded to some extent for both narrow and wide spacings, the ratio of
the two can be set in advance.
[0024]
The first embodiment addresses the case where the microphone spacing is wide (for example,
ten-odd centimeters), and improves the sound quality by correcting the low-band coherence
filter coefficients so that their characteristics approach those obtained when the microphone
spacing is narrow.
[0025]
(A-2) Configuration of the First Embodiment FIG. 3 is a block diagram showing the
configuration of the audio signal processing device according to the first embodiment.
Here, the portion excluding the pair of microphones m1 and m2 can be built in hardware, or
realized as software (an audio signal processing program) executed by a CPU. Whichever
implementation is adopted, it can be represented functionally as in FIG. 3.
[0026]
In FIG. 3, the audio signal processing apparatus 10 according to the first embodiment has a pair
of microphones m1 and m2, an FFT (fast Fourier transform) unit 11, a coherence filter
processing unit 12, and an IFFT (inverse fast Fourier transform) unit 13.
[0027]
The pair of microphones m1 and m2 are placed apart by a predetermined (or arbitrary) distance
wide enough to cause the problem described above, and each captures the surrounding sound.
Each of the microphones m1 and m2 is omnidirectional (or has very slight directivity toward the
front). The audio signals (input signals) captured by m1 and m2 are converted into digital
signals s1(n) and s2(n) through corresponding A/D converters (not shown) and supplied to the
FFT unit 11. Here n is an index representing the order of sample input, expressed as a positive
integer; in this text, a smaller n denotes an older input sample and a larger n a newer one. The
band of the audio signal (input signal) is, for example, 0 Hz to 8000 Hz; the low band and the
mid band described later are partial bands within it.
[0028]
The FFT unit 11 receives the input signal sequences s1(n) and s2(n) from the microphones m1
and m2 and applies a fast Fourier transform (or discrete Fourier transform) to them, so that the
input signals s1 and s2 can be expressed in the frequency domain. In performing the transform,
analysis frames FRAME1(K) and FRAME2(K), each consisting of a predetermined N samples of
the input signals s1(n) and s2(n), are constructed and used. Equation (1) below shows an
example of constructing the analysis frame FRAME1(K) from the input signal s1(n); the analysis
frame FRAME2(K) is formed in the same way.
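The framing and transform steps above can be sketched as follows. Non-overlapping frames of N samples and 1-indexed K are assumptions consistent with the text; the exact sample range of equation (1) may differ.

```python
import numpy as np

def analysis_frame(s, K, N):
    """K-th analysis frame of N samples from the input sequence s
    (K is 1-indexed, as in the text; non-overlapping frames are an
    assumption of this sketch)."""
    start = (K - 1) * N
    return s[start:start + N]

def to_freq(frame):
    """Frequency-domain signal X(f, K) of one analysis frame."""
    return np.fft.fft(frame)
```

Applying `to_freq` to the K-th frames of s1(n) and s2(n) yields the frequency-domain signals X1(f, K) and X2(f, K) supplied to the coherence filter processing unit.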
[0029]
Here, K is an index representing the order of frames, and is expressed by a positive integer. In the
text, the smaller K is the older analysis frame, and the larger K is the newer analysis frame.
03-05-2019
9
Further, in the following description, it is assumed that the index representing the latest analysis
frame to be analyzed is K, unless otherwise specified.
[0030]
The FFT unit 11 performs fast Fourier transform processing on each analysis frame to convert it
into the frequency-domain signals X1(f, K) and X2(f, K), and supplies the obtained signals to the
coherence filter processing unit 12. Here f is an index representing frequency. X1(f, K) is not a
single value; as shown in equation (2), it consists of spectral components at a plurality of
frequencies f1 to fm. Furthermore, X1(f, K) is a complex number with a real part and an
imaginary part. The same applies to X2(f, K) and to B1(f, K) and B2(f, K) described later.
[0031]
X1(f, K) = {X1(f1, K), X1(f2, K), ..., X1(fm, K)} (2) In the coherence filter processing unit 12
described later, of the frequency-domain signals X1(f, K) and X2(f, K), X1(f, K) is treated as the
main signal and X2(f, K) as the sub signal (see equation (7)); however, X2(f, K) may instead be
treated as main and X1(f, K) as sub.
[0032]
The coherence filter processing unit 12, whose detailed configuration is shown in FIG. 4 and
described later, executes the coherence filter processing, obtains the noise-suppressed signal
Y(f, K), and supplies it to the IFFT unit 13.
[0033]
The IFFT unit 13 performs inverse fast Fourier transform on the noise-suppressed signal Y (f, K)
to obtain an output signal y (n) which is a time domain signal.
[0034]
FIG. 4 is a block diagram showing the detailed configuration of the coherence filter processing
unit 12.
[0035]
In FIG. 4, the coherence filter processing unit 12 includes an input signal reception unit 21, a
directivity forming unit 22, a filter coefficient calculation unit 23, a coherence calculation unit
24, an arrival direction estimation unit 25, a correction gain determination unit 26, a filter
coefficient correction unit 27, a filter processing unit 28, and a post-filtering signal
transmission unit 29.
[0036]
The input signal receiving unit 21 receives the frequency domain signals X1 (f, K) and X2 (f, K)
output from the FFT unit 11.
[0037]
The directivity forming unit 22 forms the first and second directivity signals B1(f, K) and
B2(f, K). An existing method can be applied to form them; for example, the operations of
equations (3) and (4) can be used.
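One existing method for forming two signals with dead angles on opposite sides is a delay-and-subtract beamformer, sketched below. The patent's actual equations (3) and (4) are not reproduced in this translation, so the steering delay tau = d/c (microphone spacing d, sound speed c) and the default values are assumptions of this sketch.

```python
import numpy as np

def form_directivity(X1, X2, fs, d=0.03, c=340.0):
    """Form first/second directivity signals B1, B2 with a dead angle
    (null) toward opposite endfire directions, by delaying one channel
    by the inter-mic travel time tau = d / c and subtracting.
    X1, X2: complex spectra of one analysis frame; fs: sample rate."""
    N = len(X1)
    f = np.fft.fftfreq(N, d=1.0 / fs)      # frequency of each FFT bin
    tau = d / c                            # inter-mic delay, endfire
    phase = np.exp(-1j * 2 * np.pi * f * tau)
    B1 = X1 - X2 * phase                   # null toward one side
    B2 = X2 - X1 * phase                   # null toward the other side
    return B1, B2
```

A sound arriving exactly from one endfire direction (X2 equal to X1 times the delay phase) is cancelled in B2 but not in B1, which is the dead-angle behavior the text relies on.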
[0038]
The filter coefficient calculation unit 23 calculates the coherence filter coefficient coef(f, K)
according to equation (5), based on the first and second directivity signals B1(f, K) and B2(f, K).
[0039]
The coherence calculation unit 24 calculates the coherence COH(K), an index value for
estimating the arrival direction of the interfering sound, according to equation (6) based on the
coherence filter coefficients coef(f, K).
As shown in equation (6), COH(K) is the value obtained by arithmetically averaging coef(f, K)
over the mid band (for example, about 2000 Hz to 4000 Hz) or over all frequencies.
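The averaging of equation (6) can be sketched as below; the band edges are the example values from the text, and passing the full band reproduces the all-frequency average used with the existing estimation method.

```python
import numpy as np

def band_coherence(coef, freqs, lo=2000.0, hi=4000.0):
    """COH(K): arithmetic mean of coef(f, K) over the chosen band.
    coef: per-bin coefficients; freqs: bin frequencies in Hz.
    Defaults follow the text's example mid band of ~2000-4000 Hz."""
    band = (freqs >= lo) & (freqs <= hi)
    return float(np.mean(coef[band]))
```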
[0040]
The arrival direction estimation unit 25 estimates the arrival direction of the interfering sound
and obtains the estimated arrival direction Angle.
Here, the arrival direction means the angle deviated from the front direction; a deviation of the
same magnitude to the right or to the left yields the same value, and information about which
side the sound deviated to is not included.
As described later, determining the correction gain gain(K) does not require that side
information.
[0041]
Although the arrival direction estimation unit 25 may perform the estimation without using the
coherence COH(K), the case of estimating with COH(K) is described below.
An existing method such as that of Patent Document 2 may be applied to estimate the arrival
direction from COH(K); in the first embodiment, however, the estimation follows a new
COH(K)-based method executed by the detailed configuration of FIG. 6, described later.
When the existing estimation method of Patent Document 2 is applied, the coherence calculation
unit 24 described above calculates COH(K) as the arithmetic average of the coherence filter
coefficients coef(f, K) over all frequencies. When the estimation method of FIG. 6 is applied, the
coherence calculation unit 24 instead calculates COH(K) as the arithmetic average of coef(f, K)
over the mid-band frequencies.
[0042]
The correction gain determination unit 26 obtains the correction gain gain(K) for the low-band
(for example, 1000 Hz or less) coherence filter coefficients coef(f, K), based on the estimated
arrival direction Angle. One implementation of the correction gain determination unit 26 uses
the table shown in FIG. 5, which associates the estimated arrival direction Angle with the
correction gain gain(K): when Angle belongs to one range the correction gain gain(K) is α, when
it belongs to another range it is β, and when it belongs to the range φ it is γ. The correction
gain gain(K) is set, for each arrival direction, to the ratio of the coherence filter coefficient for a
narrow microphone spacing to that for a wide spacing; multiplying the wide-spacing coherence
filter coefficients by gain(K) therefore converts them into the coefficients of the narrow-spacing
case.
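The lookup and low-band correction can be sketched as follows. Every number in the table (the angle ranges and the gains standing in for α, β, γ) is a placeholder: the real values are the measured narrow-to-wide coefficient ratios for the actual device, per FIG. 5.

```python
import numpy as np

# Hypothetical table in the spirit of FIG. 5: side-agnostic angle
# ranges (degrees from the front) -> correction gain.  All numbers
# are placeholders, not values from the patent.
GAIN_TABLE = [((0.0, 30.0), 4.0),    # roughly diagonal front -> "alpha"
              ((30.0, 60.0), 6.0),   # intermediate           -> "beta"
              ((60.0, 90.0), 8.0)]   # lateral                -> "gamma"

def correction_gain(angle):
    """gain(K) for an estimated arrival direction Angle (degrees)."""
    for (lo, hi), g in GAIN_TABLE:
        if lo <= angle <= hi:
            return g
    return 1.0  # outside the table: leave coefficients unchanged

def correct_low_band(coef, freqs, angle, cutoff=1000.0):
    """Multiply only the low-band (<= ~1000 Hz) coefficients by the
    direction-dependent gain; higher bands pass through untouched."""
    out = np.asarray(coef, dtype=float).copy()
    low = freqs <= cutoff
    out[low] *= correction_gain(angle)
    return out
```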
[0043]
When the arrival direction estimation unit 25 can only estimate values quantized to
predetermined angles, such as diagonal front, intermediate (the angle midway between diagonal
front and lateral), and lateral, it suffices for the table to list only those estimable angles as the
estimated arrival direction Angle. Alternatively, the correction gain determination unit 26 may
use a function that calculates the correction gain gain(K) from the estimated arrival direction
Angle.
[0044]
The filter coefficient correction unit 27 corrects the low pass coherence filter coefficient coef (f,
K) by the correction gain gain (K).
[0045]
The filter processing unit 28 applies the coherence filter coefficients coef(f, K) after low-band
correction to perform the coherence filter processing on the main frequency-domain signal
X1(f, K), as shown in equation (7), obtaining the noise-suppressed (filtered) signal Y(f, K).
Equation (7) represents a per-frequency operation (multiplication).
[0046]
Y(f, K) = X1(f, K) × coef(f, K) (7) Here, the physical meaning of the coherence filter processing
is worth noting. The coherence filter coefficient coef(f, K) (and likewise the coefficient after
low-band correction) is the cross-correlation of signal components having dead angles on the
left and right, so when the correlation is large the component has no bias in arrival direction,
i.e., it is a voice component arriving from the front; when the correlation is small, the
component's arrival direction is biased to the right or left. Multiplying by the coherence filter
coefficient coef(f, K) can therefore be regarded as a process that suppresses noise components
arriving from the sides.
[0047]
The post-filtering signal transmission unit 29 provides the post-noise suppression signal Y (f, K)
to the IFFT unit 13 in the subsequent stage. The post-filtering signal transmission unit 29
increases K by 1 to start processing of the next frame.
[0048]
FIG. 6 is a block diagram showing the detailed configuration of the arrival direction estimation
unit 25 described above. In FIG. 6, the arrival direction estimation unit 25 includes a coherence
reception unit 31, an inquiry unit 32, a storage unit 33, and an arrival direction transmission unit
34.
[0049]
The coherence reception unit 31 receives the coherence COH(K) calculated by the coherence
calculation unit 24. Here, COH(K) is the value obtained by arithmetically averaging the
coherence filter coefficients coef(f, K) over (the frequencies of) the mid band, and is hereinafter
also referred to as the mid-range coherence.
[0050]
The storage unit 33 stores a conversion table in which the midrange coherence COH (K) and the
arrival direction Angle are associated with each other.
[0051]
The inquiry unit 32 extracts the arrival direction Angle corresponding to the mid-range
coherence COH(K) received by the coherence reception unit 31.
[0052]
The arrival direction transmission unit 34 outputs the arrival direction Angle extracted by the
inquiry unit 32 to the correction gain determination unit 26.
[0053]
FIG. 7 is an explanatory diagram of the contents of the conversion table in the storage unit 33.
In the example of FIG. 7, when the mid-range coherence COH(K) is at least A and less than B,
"diagonal front" is associated as the arrival direction Angle; when COH(K) is at least B and less
than C, "lateral" is associated; and when COH(K) is greater than C and less than D,
"intermediate" is associated.
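The FIG. 7 mapping can be sketched as below. The thresholds A through D are placeholder values (in practice they would be measured for the actual device), and the deliberately non-monotonic ordering of directions (diagonal front, then lateral, then intermediate as COH grows) follows the text.

```python
def estimate_angle(coh, A=0.2, B=0.4, C=0.6, D=0.8):
    """Arrival direction from the mid-range coherence COH(K),
    following the range mapping of FIG. 7.  A < B < C < D are
    hypothetical thresholds, not values from the patent."""
    if A <= coh < B:
        return "diagonal front"
    if B <= coh < C:
        return "lateral"
    if C < coh < D:
        return "intermediate"
    return None  # outside the calibrated ranges
```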
[0054]
The conversion table is configured to obtain the arrival direction from the magnitude
relationships peculiar to the mid-range coherence filter coefficients, which include the spatial
alias.
[0055]
FIG. 8 is an explanatory diagram showing the behavior of the mid-range coherence for each
arrival direction of the interfering sound.
In the mid band of 2000 Hz to 4000 Hz, as shown in FIG. 10 described above, the directivity has
dead angles at several azimuths and is shaped nearly symmetrically.
Differences in the mid-range coherence COH(K) depending on the arrival direction therefore
readily appear: COH(K) is small for one arrival direction and large for another. The conversion
table of FIG. 7 is constructed on this basis.
Note that the range of the mid-range coherence for each arrival direction, as in "diagonal front
< lateral < intermediate", does not follow a monotonically increasing or decreasing relationship
between COH(K) and the arrival direction angle. It is a finding of the present inventors that the
arrival direction can nevertheless be estimated from the mid-range coherence COH(K) despite
the absence of such a monotonic relationship.
[0056]
(A-3) Operation of the First Embodiment Next, with reference to the drawings, the operation of
the audio signal processing apparatus 10 of the first embodiment is described in order: the
overall operation, the detailed operation of the coherence filter processing unit 12, and the
detailed operation of the arrival direction estimation unit 25.
[0057]
The signals s1(n) and s2(n) input from the pair of microphones m1 and m2 are converted from
the time domain into the frequency-domain signals X1(f, K) and X2(f, K) by the FFT unit 11,
respectively, and given to the coherence filter processing unit 12.
The coherence filter processing unit 12 performs the coherence filter processing, and the
resulting noise-suppressed signal Y(f, K) is provided to the IFFT unit 13. In the IFFT unit 13,
the frequency-domain signal Y(f, K) is converted into the time-domain signal y(n) by an inverse
fast Fourier transform, and this time-domain signal y(n) is output.
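The overall per-frame flow of FIG. 3 (FFT, coherence filtering, IFFT) can be sketched as below. Windowing and overlap-add, which a practical implementation would likely add, are omitted here as an assumption; `coef_fn` stands in for the whole coefficient-computation-and-correction path of the coherence filter processing unit.

```python
import numpy as np

def process_frame(s1_frame, s2_frame, coef_fn):
    """One frame of the FIG. 3 pipeline: FFT both inputs, apply the
    coherence filter to the main channel X1 per equation (7), and
    IFFT back to the time domain.  coef_fn maps (X1, X2) to per-bin
    coefficients (the corrected coefficients in the first embodiment)."""
    X1 = np.fft.fft(s1_frame)
    X2 = np.fft.fft(s2_frame)
    coef = coef_fn(X1, X2)
    Y = X1 * coef                  # equation (7): per-frequency product
    return np.fft.ifft(Y).real    # time-domain output y(n)
```

With all-ones coefficients the frame passes through unchanged, which is a convenient sanity check when wiring in a real coefficient path.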
[0058]
Next, the detailed operation of the coherence filter processing unit 12 will be described. The
above-described FIG. 4 showing the detailed configuration of the coherence filter processing unit
12 can also be regarded as a flowchart showing the processing of the coherence filter processing
unit 12. Although the processing of a certain frame will be described below, the processing
described below is repeated for each frame.
[0059]
When a new frame is obtained and the frequency domain signals X1(f, K) and X2(f, K) of the new frame (current frame K) are given from the FFT unit 11, the first and second directivity signals B1(f, K) and B2(f, K) are calculated according to equations (3) and (4). Based on these directivity signals B1(f, K) and B2(f, K), the coherence filter coefficients coef(f, K) are calculated according to equation (5). Furthermore, based on the coherence filter coefficients coef(f, K), the mid-range coherence COH(K) is calculated according to equation (6) as an index value from which the arrival direction of the interference sound can be estimated.
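Equations (3) to (6) are not reproduced in this excerpt, so the following sketch only illustrates the kind of computation involved: a delay-and-subtract directivity pair is assumed for B1 and B2, a normalized cross-term for coef(f, K), and a simple band average for COH(K). The delay term and the middle-band indices are placeholders.

```python
import numpy as np

EPS = 1e-10
MID = slice(16, 64)  # assumed index range of the "middle band" for eq. (6);
                     # the actual band edges are not given in this excerpt

def directivities(X1, X2):
    """Stand-ins for the directivity signals of eqs. (3) and (4).

    A delay-and-subtract pair is assumed: each signal places a null
    on one side of the microphone array.
    """
    f = np.arange(len(X1))
    phase = np.exp(-1j * np.pi * f / len(X1))  # assumed inter-mic delay term
    B1 = X1 - X2 * phase   # null toward one side
    B2 = X2 - X1 * phase   # null toward the other side
    return B1, B2

def coherence_filter_coef(B1, B2):
    """Stand-in for eq. (5): per-frequency coefficient coef(f, K) in [0, 1]."""
    num = np.abs(B1 * np.conj(B2))
    den = 0.5 * (np.abs(B1) ** 2 + np.abs(B2) ** 2) + EPS
    return num / den       # near 1 for frontal sound, small for lateral noise

def mid_range_coherence(coef):
    """Stand-in for eq. (6): COH(K), the average of coef(f, K) over the middle band."""
    return float(np.mean(coef[MID]))
```

By the arithmetic-geometric mean inequality the coefficient stays within [0, 1], and identical inputs on both channels (a frontal source with no noise) drive the mid-range coherence toward 1.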
[0060]
Thereafter, the arrival direction estimation unit 25 refers to the conversion table shown in FIG. 7
with the mid-range coherence COH (K) as a key, and estimates the arrival direction Angle of the
interference sound.
[0061]
Then, the correction gain determination unit 26 refers to the conversion table of FIG. 5 using the estimated arrival direction Angle as a key, and obtains the correction gain gain(K) for the low-range coherence filter coefficients coef(f, K). The coefficient correction unit 27 then corrects the low-range coherence filter coefficients coef(f, K) with the correction gain gain(K).
[0062]
Thereafter, the filter processing unit 28 performs the coherence filter processing on the main frequency domain signal X1(f, K) according to equation (7), based on the coherence filter coefficients coef(f, K) whose low range has been corrected. The resulting noise-suppressed signal (filtered signal) Y(f, K) is provided to the IFFT unit 13, the frame variable K is incremented by one, and processing for the next frame is started.
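The gain look-up (unit 26), the low-range correction (unit 27), and the filtering (unit 28) can be sketched as below. The table values, the low-band edge, and the clipping to 1.0 are illustrative assumptions: the actual contents of FIG. 5 and the exact form of eq. (7) are not reproduced in this excerpt, and eq. (7) is assumed here to be the usual multiplicative form Y(f, K) = coef(f, K) · X1(f, K).

```python
import numpy as np

LOW = slice(0, 16)  # assumed "low range" bin indices to be corrected

# Illustrative stand-in for the FIG. 5 conversion table:
# arrival direction Angle (degrees from front) -> correction gain gain(K).
ANGLE_POINTS = [0.0, 30.0, 60.0, 90.0]
GAIN_POINTS = [1.0, 1.5, 2.2, 3.0]   # placeholder values, not from FIG. 5

def correction_gain(angle):
    """Unit 26: look up gain(K) for the estimated arrival direction Angle."""
    return float(np.interp(angle, ANGLE_POINTS, GAIN_POINTS))

def apply_coherence_filter(X1, coef, angle):
    """Units 27 and 28: correct the low-range coefficients, then filter X1."""
    coef = coef.copy()
    # Boost the over-suppressed low range; clipping at 1.0 is an assumption.
    coef[LOW] = np.minimum(coef[LOW] * correction_gain(angle), 1.0)
    return coef * X1  # assumed multiplicative form of eq. (7)
```

A gain greater than 1 raises the low-range coefficients toward the values they would take with a narrow microphone spacing, which counteracts the excessive low-range suppression described above.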
[0063]
Next, the detailed operation in the arrival direction estimation unit 25 will be described.
[0064]
When the coherence reception unit 31 receives the mid-range coherence COH(K) calculated by the coherence calculation unit 24, the inquiry unit 32 of the arrival direction estimation unit 25 extracts from the storage unit 33 the arrival direction Angle corresponding to COH(K), and this arrival direction Angle is output.
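The store-and-query behavior of units 31 to 33 can be sketched as a simple table lookup. The COH-to-Angle pairs below are placeholders (FIG. 7's actual contents are not reproduced in this excerpt). Since, as noted above, COH(K) is not a monotonic function of the angle, the sketch returns the angle of the nearest stored COH entry rather than interpolating monotonically.

```python
# Placeholder stand-in for the FIG. 7 conversion table (storage unit 33):
# mid-range coherence COH(K) -> arrival direction Angle (degrees from front).
COH_TO_ANGLE = [
    (0.95, 0.0),   # nearly frontal sound
    (0.70, 30.0),
    (0.40, 60.0),
    (0.55, 90.0),  # deliberately non-monotonic: COH rises again toward medial
]

def estimate_angle(coh):
    """Inquiry unit 32: return the Angle whose stored COH is closest to COH(K)."""
    _, angle = min(COH_TO_ANGLE, key=lambda entry: abs(entry[0] - coh))
    return angle
```

Nearest-entry matching works even when the table is non-monotonic, provided the stored COH values are distinct enough to separate the candidate directions.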
[0065]
The above is the outline of the operation of the first embodiment.
[0066]
(A-4) Effects of the First Embodiment As described above, according to the first embodiment, even when the microphone spacing is widened and the directivity formed in the low range becomes excessively strong, the coherence filter coefficients can be corrected to values close to those obtained with a narrow microphone spacing, based on the difference between the behavior of the coherence filter coefficients at a narrow microphone spacing and their characteristic behavior at a wide microphone spacing. As a result, the sound quality does not deteriorate due to excessive suppression processing. In addition, the restriction on the microphone spacing is relaxed, and the designer can configure the microphone array with an arbitrary spacing.
[0067]
Further, according to the first embodiment, even when the microphone spacing is widened and spatial aliasing occurs, the arrival direction can be estimated based on the characteristic behavior of the mid-range coherence filter coefficients. This relaxes the restriction on the microphone spacing and allows the designer to configure the microphone array with an arbitrary spacing.
[0068]
These effects make it possible to expect improved call sound quality in communication apparatuses, such as video conference apparatuses, mobile phones, and smartphones, to which the audio signal processing apparatus or program of the first embodiment is applied.
[0069]
(B) Other Embodiments Various modified embodiments have been mentioned in the above description of the first embodiment; further modified embodiments such as those exemplified below are also possible.
[0070]
In the first embodiment, the low-range coherence filter coefficients are corrected with the correction gain; however, in addition to the correction gain, an adjustment coefficient may be multiplied in order to adjust the noise suppression performance and the sound quality. For example, the adjustment coefficient may be made variable by operating an adjustment control (a predetermined key on a keyboard may be used), or only whether or not the adjustment coefficient is applied may be specified.
[0071]
In the first embodiment, the distance between the pair of microphones m1 and m2 is fixed; however, at least one of the microphones may be movable so that the distance between the microphones m1 and m2 is variable. In this case, the conversion table to be applied may be switched according to the microphone spacing. For example, a plurality of conversion tables may be prepared, such as one for microphone spacings of 8 cm to 10 cm, one for spacings of 10 cm to 12 cm, and one for spacings of 12 cm to 14 cm, and the conversion table to be applied may be selected according to the spacing. Here, the user may input the microphone spacing, or the movable positions of the microphones may be provided in steps and the microphone spacing obtained automatically from sensors provided at each step.
[0072]
In the first embodiment, the arrival direction Angle is obtained using the mid-range coherence COH(K) as a key, and the correction gain gain(K) is then obtained using the arrival direction Angle as a key. Instead, a conversion table in which the mid-range coherence COH(K) is directly associated with the correction gain gain(K) may be prepared, and the correction gain gain(K) may be obtained directly using the mid-range coherence COH(K) as a key. The same modification is also applicable when the other arrival direction estimation methods described in the first embodiment are applied.
[0073]
In the first embodiment, the correction gain is obtained from the estimated arrival direction. In addition to this use, an arrival direction estimated for another purpose may also be applied. For example, flooring processing may be performed on the high-range coherence filter coefficients by applying a flooring threshold determined according to the arrival direction of the disturbance sound (see the specification and drawings of Japanese Patent Application No. 2013-154825).
[0074]
In the first embodiment, the low range to be corrected by applying the correction gain is the same for any arrival direction; however, the width of the low range to be corrected may be changed according to the arrival direction. For example, when the arrival direction is X, the low range may extend up to 1000 Hz, and when the arrival direction is Y, it may extend up to 1100 Hz. In addition to or instead of this, the width of the low range may be changed according to the microphone spacing.
[0075]
In the first embodiment, the middle range is the same regardless of the microphone spacing; however, at least one of the width of the middle range and its center frequency may be changed depending on the microphone spacing.
[0076]
The arrival direction estimation method described in the first embodiment is not limited to noise suppression by the coherence filter method, and can be applied to various kinds of signal processing that require information on the arrival direction of the disturbance sound. For example, it can be applied to noise suppression processing other than the coherence filter method, sound source separation processing, speech coding processing, and the like.
[0077]
Depending on the application, the calculation of the coherence filter coefficients may be limited to the middle range.
[0078]
Here, in the arrival direction estimation method described in the first embodiment, the arrival direction is estimated as the (absolute value of the) deviation angle from the front. When it is necessary to determine whether the sound arrives from the right side or from the left side, such information may be calculated according to equation (8): if the obtained value is positive, arrival from the right is determined, and if the obtained value is negative, arrival from the left is determined.
[0079]
In the first embodiment, processing performed on frequency domain signals may, where possible, be performed on time domain signals instead; conversely, processing performed on time domain signals may, where possible, be performed on frequency domain signals.
[0080]
In the first embodiment, the coherence filter method alone is applied as the noise suppression technique; however, other noise suppression techniques, such as the voice switch method, the Wiener filter method, and the frequency subtraction method, may be used in combination with it.
[0081]
In each of the above embodiments, the audio signal processing apparatus and program immediately process the signals captured by the pair of microphones; however, the audio signals to be processed by the present invention are not limited to these. For example, the present invention can also be applied to processing a pair of audio signals read from a recording medium, or a pair of audio signals transmitted from a remote apparatus.
[0082]
DESCRIPTION OF SYMBOLS 10 ... audio signal processing apparatus, 11 ... FFT unit, 12 ... coherence filter processing unit, 13 ... IFFT unit, m1, m2 ... microphones, 21 ... input signal reception unit, 22 ... directivity forming unit, 23 ... filter coefficient calculation unit, 24 ... coherence calculation unit, 25 ... arrival direction estimation unit, 26 ... correction gain determination unit, 27 ... filter coefficient correction unit, 28 ... filter processing unit, 29 ... filtered signal transmission unit, 31 ... coherence reception unit, 32 ... inquiry unit, 33 ... storage unit, 34 ... arrival direction transmission unit.