Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2011053062
To estimate the distance between a sound source and a single microphone array. A sound source
distance measuring apparatus according to the present invention comprises a single microphone
array consisting of a plurality of microphones, a plurality of frequency domain conversion units,
a direct-to-reverberant ratio estimation unit, a distance/direct-to-reverberant ratio database
that records the relationship between distance and the direct-to-reverberant ratio, and a
distance determination unit. The frequency domain conversion units each receive the sound
signal picked up by one of the microphones and convert it into a signal in the frequency domain.
The direct-to-reverberant ratio estimation unit takes the frequency-domain signals output by the
frequency domain conversion units as input and estimates the direct-to-reverberant ratio of the
received sound. The distance determination unit takes the direct-to-reverberant ratio as input,
refers to the distance/direct-to-reverberant ratio database, and outputs the sound source
distance estimate corresponding to that ratio. [Selected figure] Figure 4
Direct-to-reverberant ratio estimation device, sound source distance measurement device, noise
removal device, method of each device, device program
[0001]
The present invention relates to a sound source distance measuring device that estimates the
distance from a microphone array to a sound source using a single microphone array, applicable,
for example, to hands-free operation of a device by voice input. It also relates to a
direct-to-reverberant ratio estimation device used in the sound source distance measuring
device, a noise removal device using the sound source distance measuring device, a method of
each device, and a device program.
04-05-2019
1
[0002]
The conventional approach to measuring the distance between a microphone and a sound source,
disclosed in Non-Patent Document 1, will be briefly described with reference to FIG.
The idea is to estimate the distance between the microphone and the sound source by the
principle of triangulation. Two microphone arrays 1 and 2 are disposed at an interval of a
distance D12 on a straight line orthogonal to the traveling direction of the sound wave from the
sound source.
[0003]
From the known distance D12 and the angles θ1 and θ2 estimated by the microphone arrays,
the distance D between the sound source and the straight line formed by the microphone arrays
1 and 2 is estimated.
[0004]
M. Omologo and P. Svaizer, "Use of the Crosspower-Spectrum Phase in Acoustic Event Location,"
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 5, NO. 3, MAY 1997.
[0005]
Heretofore, the only method of measuring the distance between the microphone and the sound
source has been based on the above-described triangulation principle.
To measure the distance D accurately by triangulation, the view angles θ1 and θ2 from
the microphone arrays 1 and 2 need to be sufficiently different, so a fairly large installation
space is required.
Also, the distance between the microphone arrays needs to be known, and multiple microphone
arrays are essential. Setting up an environment for measuring the sound source distance is
therefore complicated.
[0006]
The present invention has been made in view of these problems. To enable the distance between
the microphone and the sound source to be estimated even with a single microphone array, it
proposes a new method using the direct-to-reverberant ratio (described later). Its purpose is to
provide a sound source distance measuring device using the direct-to-reverberant ratio, a
direct-to-reverberant ratio estimating device constituting the sound source distance measuring
device, a noise removing device using the sound source distance measuring device, and the
method and program of each device.
[0007]
The sound source distance measuring device according to the present invention comprises a
single microphone array composed of a plurality of microphones, a plurality of frequency domain
conversion units, a direct-to-reverberant ratio estimation unit, a distance/direct-to-reverberant
ratio database recording the relationship between the direct-to-reverberant ratio and the
distance, and a distance determination unit. The frequency domain conversion units each receive
the sound signal picked up by one of the microphones and convert it into a signal in the
frequency domain. The direct-to-reverberant ratio estimation unit takes the frequency-domain
signals output by the frequency domain conversion units as input and estimates the
direct-to-reverberant ratio of the received sound. The distance determination unit takes the
direct-to-reverberant ratio as input, refers to the distance/direct-to-reverberant ratio
database, and estimates the distance corresponding to that ratio.
[0008]
Further, the direct-to-reverberant ratio estimation device of the present invention comprises
the same components as the sound source distance measuring device of the present invention
up to the ratio estimation: a single microphone array consisting of a plurality of microphones, a
plurality of frequency domain conversion units, and a direct-to-reverberant ratio estimation unit.
[0009]
Further, the noise removal device of the present invention further includes a processing target
signal generation unit, a target signal adjustment unit, and an inverse frequency domain
conversion unit in the configuration of the sound source distance measurement device of the
present invention.
The processing target signal generation unit combines the frequency-domain signals output
from the plurality of frequency domain conversion units to generate a processing target signal.
The target signal adjustment unit receives the processing target signal and the
direct-to-reverberant ratio, and generates a processed signal in which the amplitude of the
processing target signal is adjusted according to the value of the ratio. The inverse frequency
domain conversion unit converts the processed signal into a time domain signal.
[0010]
According to the sound source distance estimation device of the present invention, the distance
between the sound source and the microphone array can be estimated with one small-scale
microphone array, so space can be saved. In addition, there is no need for information that must
be measured and set in advance and that changes with each installation, such as the distance
between microphone arrays.
[0011]
Further, the direct-to-reverberant ratio estimating device of the present invention provides the
direct-to-reverberant ratio, which serves as an index for sound source distance estimation. Also,
the noise removal device according to the present invention estimates the direct-to-reverberant
ratio and filters the received signal according to its value. The direct-to-reverberant ratio is the
ratio of direct sound to indirect sound (reverberant sound) included in the received sound, and
it changes monotonically with the distance between the microphone and the sound source. By
filtering the received sound according to this value, it is possible to emphasize or suppress only
the components of a sound source determined to be within a certain distance range. As a result,
it is possible to pick up (remove noise from) only the sound of a sound source located at a
specific distance with one microphone array.
[0012]
A diagram showing an example of a scene in which the sound source distance measuring apparatus 100 of the present invention is used.
A diagram showing the propagation paths of sound indoors.
A diagram showing the relationship between the direct-to-reverberant ratio and the distance between the microphone and the sound source.
A diagram showing an example of the functional configuration of the sound source distance estimation apparatus 400 of the present invention.
A diagram showing the operation flow of the sound source distance estimation apparatus 400.
FIG. 6 is a diagram showing an example of the functional configuration of the direct-to-reverberant ratio estimation unit 43.
A diagram showing an example of the functional configuration of the noise removal apparatus 700 of the present invention.
FIG. 16 shows an operation flow of the noise removal apparatus 800.
A diagram showing the experimental conditions of the effect confirmation experiment.
A diagram showing an example of the direct-to-reverberant ratio.
A diagram showing the conventional concept of measuring the distance between a microphone and a sound source disclosed in Non-Patent Document 1.
[0013]
Hereinafter, embodiments of the present invention will be described with reference to the
drawings. The same reference numerals are given to the same components in the drawings, and
the description will not be repeated. Also, in the following description, the symbols “¯”, “^”,
etc. used in the text should properly be written directly above the preceding character, but
owing to the limitations of text notation they are written immediately after that character. In
the formulas, these symbols are written at their original positions.
[0014]
Before describing the embodiments, the idea of the present invention will be described.
[0015]
The present invention estimates the distance between a microphone array and a sound source
using a single microphone array.
FIG. 1 illustrates a scene in which the sound source distance estimation apparatus 400 of the
present invention is used. One microphone array 11 and a speaker 12 are present in a room 10
having reverberation characteristics. The microphone array 11 and the speaker 12 are placed
some distance apart.
[0016]
In this situation, we would like to estimate the distance D between the speaker 12 and the
microphone array 11. To do so, the present invention estimates the sound source distance using
the direct-to-reverberant ratio.
[0017]
The direct-to-reverberant ratio is the ratio of direct sound to indirect sound (reverberant
sound) included in the received sound. FIG. 2 shows the propagation paths of sound from the
sound source 21 to the microphone 22 when the microphone is placed indoors and sound is
recorded. The direct sound is the sound wave, indicated by a thick solid line, that travels
directly from the sound source 21 to the microphone 22. The reverberant sound, on the other
hand, is the sound wave, indicated by a broken line, that reaches the microphone 22 after the
sound emitted from the sound source 21 has been reflected by a wall, the floor, the ceiling, or
the like.
[0018]
FIG. 3 shows the relationship between the direct-to-reverberant ratio and the distance between
the microphone and the sound source. The horizontal axis in FIG. 3 is the distance from the
microphone to the sound source, and the vertical axis is the direct-to-reverberant ratio. In
general, indirect sound has a roughly constant magnitude that does not depend on the distance
from the microphone. The direct sound, in contrast, decreases monotonically as the distance
from the microphone increases. The direct-to-reverberant ratio, obtained by dividing the direct
sound by the indirect sound, therefore decreases monotonically with increasing distance, just as
the direct sound does.
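As an illustration of why the ratio falls monotonically with distance, the following worked relation assumes (this is not stated in the text) that the direct sound power follows the free-field inverse-square law and that the reverberant power P_r is constant throughout the room. The symbol d_c, the distance at which the two powers are equal, is introduced here only for illustration.

```latex
% Assumption: direct power ~ 1/d^2, reverberant power constant.
P_d(d) = \frac{P_0}{d^{2}}, \qquad P_r = \text{const}
\quad\Longrightarrow\quad
E(d) = \frac{P_d(d)}{P_r} = \left(\frac{d_c}{d}\right)^{2},
\qquad d_c = \sqrt{P_0 / P_r}

% In decibels the ratio drops by about 6 dB per doubling of distance:
10\log_{10} E(d) = 20\log_{10}\frac{d_c}{d}
```

Under these assumptions E(d) is strictly decreasing in d, so each value of the ratio corresponds to exactly one distance, which is the property the distance estimation relies on.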
[0019]
By using this direct-to-reverberant ratio, the sound source distance estimation apparatus of the
present invention makes it possible to estimate the distance between the microphone array and
the sound source from the sound received by one microphone array. The direct-to-reverberant
ratio estimation device of the present invention outputs the direct-to-reverberant ratio.
Moreover, the noise removal apparatus of the present invention removes noise from the received
signal according to the direct-to-reverberant ratio output by the direct-to-reverberant ratio
estimation apparatus.
[0020]
FIG. 4 shows an example of the functional configuration of a sound source distance estimation
apparatus 400 according to the present invention. The operation flow is shown in FIG. The sound
source distance estimation apparatus 400 includes one microphone array 41, a plurality of
frequency domain conversion units 421 to 42M, a direct-to-reverberant ratio estimation unit 43,
a distance/direct-to-reverberant ratio database 44 (hereinafter referred to as the
distance/direct-to-reverberant ratio DB 44), and a distance determination unit 45. Each
functional component except the microphone array 41 is realized, for example, by a
predetermined program being read into a computer including a ROM, a RAM, a CPU, and the
like, and the CPU executing the program.
[0021]
The microphone array 41 comprises a plurality of microphones m1, ..., mM. The frequency
domain conversion units 421, ..., 42M receive the signals xm(n) picked up by the microphones
m1, ..., mM, respectively, and convert each received signal into a signal in the frequency
domain (step S42). The frequency domain conversion units 421, ..., 42M sample the received
signal xm(n) at a sampling frequency of, for example, 16 kHz to convert it into a digital signal.
Then, for example, 256 samples are grouped into one frame, a discrete Fourier transform is
performed on each frame, and the frequency component Xm(ω, l) is output. Here ω is the
frequency and l is the frame number. The A/D converter that converts the received signal
xm(n) into a digital signal is omitted here.
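The framing and transform of step S42 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text gives only the example values of 16 kHz and 256 samples per frame, so the use of non-overlapping frames, the absence of a window function, and the function names `frames`, `dft` and `to_frequency_domain` are all assumptions made here.

```python
import cmath
import math


def frames(signal, frame_len=256):
    """Split a sampled signal into consecutive non-overlapping frames.

    Assumption: no overlap and no window; the text does not specify either.
    Trailing samples that do not fill a whole frame are dropped.
    """
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def dft(frame):
    """Plain O(N^2) discrete Fourier transform of one frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def to_frequency_domain(signal, frame_len=256):
    """Return X[l][k]: DFT bin k of frame l for one microphone channel."""
    return [dft(f) for f in frames(signal, frame_len)]
```

With a 16 kHz sampling rate and 256-sample frames, bin k corresponds to k × 62.5 Hz, so a 1 kHz tone falls exactly on bin 16. In practice an FFT routine would replace the direct summation.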
[0022]
The direct-to-reverberant ratio estimation unit 43 takes the frequency-domain signals Xm(ω, l)
output by the frequency domain conversion units 421, ..., 42M as input and estimates the
direct-to-reverberant ratio E of the received signal (step S43).
[0023]
The distance/direct-to-reverberant ratio DB 44 records the relationship between the
direct-to-reverberant ratio E and the distance between the microphone array and the sound source.
The distance determination unit 45 receives the direct-to-reverberant ratio as input, refers to
the distance/direct-to-reverberant ratio DB 44, and estimates the distance corresponding to that
ratio (step S45). The operations from step S42 to step S45 are repeated until all of the received
signals xm(n) have been processed.
[0024]
The above operation makes it possible, for example, to enhance with one microphone array only
sounds within a specific distance range and to suppress sounds outside that range for noise
removal. Hereinafter, the present invention will be described in more detail by showing a more
specific functional configuration example of each part.
[0025]
[Direct-to-Reverberant Ratio Estimating Unit] FIG. 6 shows an example of a functional
configuration of the direct-to-reverberant ratio estimating unit 43. The direct-to-reverberant
ratio estimating unit 43 includes a spatial correlation matrix calculating unit 431, a signal
power estimating unit 432, and a direct-to-reverberant ratio calculating unit 433. The spatial
correlation matrix calculating unit 431 receives the frequency-domain signals X1(ω, l), ...,
XM(ω, l) output by the frequency domain conversion units 421, ..., 42M, arranges them into a
vector, and calculates the spatial correlation matrix R(ω) shown in equation (1).
[0026]
[0027]
Here, T represents transposition of a matrix, H represents conjugate transposition, and L
represents the number of frames to be averaged.
The spatial correlation matrix R (ω) is input to the signal power estimation means 432.
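Equation (1) itself is not reproduced in this text. A form consistent with the definitions given above (vectorized input, transposition T, conjugate transposition H, averaging over L frames) would be the usual frame-averaged outer product shown below. This is a reconstruction under that assumption, not the patent's verbatim formula; the bold vector symbol is notation introduced here.

```latex
% Assumed reconstruction of equation (1) from the surrounding definitions.
\mathbf{X}(\omega, l) = \bigl[\, X_1(\omega, l),\; \dots,\; X_M(\omega, l) \,\bigr]^{T}

R(\omega) = \frac{1}{L} \sum_{l=1}^{L} \mathbf{X}(\omega, l)\, \mathbf{X}^{H}(\omega, l)
```

Written this way, R(ω) is an M × M Hermitian matrix whose (i, j) component R_ij(ω) is the averaged cross-spectrum between microphones i and j.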
[0028]
The signal power estimation unit 432 uses each component Rij(ω) of the spatial correlation
matrix R(ω) output by the spatial correlation matrix calculation means 431, each component
dij(ω) of the matrix Rd(ω) (equation (3)), which is determined by the microphone arrangement
of the microphone array given in advance and the direction of the sound source, and each
component rij(ω) of the matrix Rr(ω) (equation (4)). From these components it forms the
matrix A(ω) shown in equation (5) and B(ω) shown in equation (6).
[0029]
[0030]
Here, Dmn is the distance between the mth microphone and the nth microphone, and θ is the
direction of the sound source as viewed from the front of the microphone array.
Here, the shape of the microphone array is a linear arrangement, and the front of the
microphone array means the normal direction of the line in which the microphones are arranged.
[0031]
[0032]
Then, by setting up and solving the simultaneous equations shown in equation (7), the vector
P(ω) (equation (8)), composed of the direct sound power Pd(ω) and the reverberant sound
power Pr(ω), is obtained, and the direct sound power Pd(ω) and the reverberant sound power
Pr(ω) are each output.
[0033]
[0034]
The matrix Rd (ω) when the arrangement of the microphone array is an arrangement other than
a straight line can be expressed in the form shown in a more general expression (9).
[0035]
[0036]
Here, Dmn(θ) represents the difference in distance between the m-th microphone and the n-th
microphone when viewed from the direction of the angle θ.
Further, the simultaneous equations of equation (7) are solved, for example, by multiplying
from the left by the pseudo-inverse matrix A+(ω) of A(ω) (equation (10)), as shown in
equation (11).
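The power estimation can be sketched as a small least-squares fit. Because equations (3) to (11) are not reproduced in this text, the sketch below makes its own assumptions: each component of R(ω) is modeled as Pd·d_mn + Pr·r_mn, the direct term d_mn is a plane-wave phase factor for a linear array, the reverberant term r_mn is the sin(x)/x coherence of a spherically diffuse field, and the speed of sound is 340 m/s. The function names and the sign convention of the phase are likewise illustrative.

```python
import cmath
import math

C = 340.0  # assumed speed of sound [m/s]


def steering_coherence(pos_m, pos_n, omega, theta):
    """Assumed direct-sound term d_mn: plane wave arriving from angle theta (rad)."""
    return cmath.exp(-1j * omega * (pos_m - pos_n) * math.sin(theta) / C)


def diffuse_coherence(pos_m, pos_n, omega):
    """Assumed reverberant term r_mn: spherically diffuse field, sin(x)/x."""
    x = omega * abs(pos_m - pos_n) / C
    return 1.0 if x == 0 else math.sin(x) / x


def estimate_powers(R, positions, omega, theta):
    """Least-squares fit of R[m][n] = Pd*d_mn + Pr*r_mn for real (Pd, Pr).

    For a Hermitian R this gives the same result as applying the
    pseudo-inverse of the stacked two-column matrix to the stacked R_mn.
    """
    a11 = a12 = a22 = 0.0
    b1 = b2 = 0.0
    for m, pm in enumerate(positions):
        for n, pn in enumerate(positions):
            d = steering_coherence(pm, pn, omega, theta)
            r = diffuse_coherence(pm, pn, omega)
            # Accumulate the 2x2 normal equations (A^H A) P = A^H B.
            a11 += abs(d) ** 2
            a12 += (d.conjugate() * r).real
            a22 += r * r
            b1 += (d.conjugate() * R[m][n]).real
            b2 += (r * R[m][n]).real
    det = a11 * a22 - a12 * a12
    pd = (a22 * b1 - a12 * b2) / det
    pr = (a11 * b2 - a12 * b1) / det
    return pd, pr
```

For example, with microphones at -0.04, 0 and 0.04 m, a matrix R synthesized with Pd = 2 and Pr = 0.5 is recovered by `estimate_powers`. The fit becomes ill-conditioned when the two model terms are nearly proportional, for instance at very low frequencies where both approach 1.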
[0037]
[0038]
The direct-to-reverberant ratio calculation means 433 calculates the direct-to-reverberant
ratio E by equation (12) from the direct sound power Pd(ω) and the reverberant sound power
Pr(ω), and outputs it.
[0039]
[0040]
The direct-to-reverberant ratio estimation unit 43 described above, one microphone array 41,
and the frequency domain conversion units 421 to 42M together constitute the
direct-to-reverberant ratio estimation device 71, which outputs the direct-to-reverberant
ratio E. The direct-to-reverberant ratio may also be obtained from the eigenvalues given by
eigenvalue decomposition of the spatial correlation matrix R(ω).
[0041]
Information on the relationship between the distance and the direct-to-reverberant ratio is
recorded in advance in the distance/direct-to-reverberant ratio DB 44.
This information may be stored as pairs (d1, E1), (d2, E2), ... of corresponding distances and
ratios, between which values are found by linear interpolation, or as a functional expression
d = f(E) indicating the relationship between the distance and the ratio, for example an
approximation function obtained from the set of pairs (d1, E1), (d2, E2), ....
The function f(E) is described, for example, in the reference M. Tohyama et al., "The Nature
and Technology of Acoustic Space," Academic Press, 1995.
[0042]
The distance determination unit 45 takes the direct-to-reverberant ratio E input from the
direct-to-reverberant ratio estimation unit 43, refers to the relationship between the distance
and the ratio recorded in the distance/direct-to-reverberant ratio DB 44, and outputs the sound
source distance estimated value d^ corresponding to the ratio E.
[0043]
When the pairs (d1, E1), (d2, E2), ... of corresponding distances and direct-to-reverberant
ratios are stored in the distance/direct-to-reverberant ratio DB 44, the sound source distance
estimated value d^ is obtained and output by the following steps.
[0044]
First step: Among E1, E2, ... stored in the distance/direct-to-reverberant ratio DB 44, the two
ratios Em and En adjacent to the direct-to-reverberant ratio E obtained by the
direct-to-reverberant ratio estimation unit 43 are determined.
[0045]
Second step: The distances dm and dn corresponding to the ratios Em and En, respectively, are
obtained from the distance/direct-to-reverberant ratio DB 44.
[0046]
Third step: From the distances dm and dn, a sound source distance estimated value d ^ is
obtained by linear interpolation as shown in equation (13).
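The three steps can be sketched as a table lookup followed by linear interpolation. Equation (13) is not reproduced in this text, so the interpolation formula in the code is the standard one and is an assumption, as are the function name `estimate_distance`, the table layout, and the choice to clamp to the nearest stored distance when E lies outside the stored range.

```python
def estimate_distance(table, e):
    """Estimate the source distance for a direct-to-reverberant ratio e.

    table: list of (distance, ratio) pairs, assumed strictly monotonic in ratio.
    Returns the linearly interpolated distance. Values of e outside the
    stored range are clamped to the nearest entry (an assumption made here).
    """
    pts = sorted(table, key=lambda p: p[1])  # ascending ratio
    if e <= pts[0][1]:
        return pts[0][0]
    if e >= pts[-1][1]:
        return pts[-1][0]
    # Step 1: find the two stored ratios Em, En that bracket e.
    for (d_m, e_m), (d_n, e_n) in zip(pts, pts[1:]):
        if e_m <= e <= e_n:
            # Steps 2 and 3: take their distances and interpolate linearly.
            return d_m + (d_n - d_m) * (e - e_m) / (e_n - e_m)


# Illustrative table only: distances in metres, ratios in dB.
table = [(0.3, 10.0), (0.5, 6.0), (1.0, 0.0)]
d_hat = estimate_distance(table, 8.0)
```

With the illustrative table above, a ratio of 8 dB lies halfway between the 6 dB and 10 dB entries, so the estimate lies halfway between 0.5 m and 0.3 m.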
[0047]
[0048]
Further, when the functional expression d = f(E) is stored in the distance/direct-to-reverberant
ratio DB 44, the distance determination unit 45 calculates the sound source distance estimated
value d^ from the direct-to-reverberant ratio E input from the direct-to-reverberant ratio
estimation unit 43, and outputs it.
[0049]
As shown in equation (12), the direct-to-reverberant ratio calculation means 433 divides the
sum ΣωPd(ω) of the direct sound power over all frequencies ω by the sum ΣωPr(ω) of the
indirect sound power over all frequencies ω, and takes the result as the direct-to-reverberant
ratio E. Some received signals, however, have components concentrated in a specific frequency
band. When the ratio E of such a signal is calculated over all frequencies by the calculation
means 433, the estimation accuracy of the ratio E is degraded.
[0050]
Therefore, the estimation accuracy of the direct-to-reverberant ratio can be improved by using
the direct-to-reverberant ratio calculating means 433' (FIG. 6), which calculates the ratio E
over a specific frequency region Ω as shown in equation (14).
[0051]
[0052]
Here, the frequency region Ω is determined, for example, by selecting a frequency band in
which signal components are concentrated.
For example, from the output Xm(ω, l) of the frequency domain conversion unit 42m connected
to an arbitrary m-th microphone, Ω is determined either by selecting the frequencies ω at
which the absolute value of Xm(ω, l) is larger than a preset threshold Pth, as shown in
equation (15), or by selecting the K frequencies ω with the largest absolute values of
Xm(ω, l).
[0053]
[0054]
Here, Pth is, for example, the average value of all the frequencies of | Xm (ω, l) |.
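The band selection and the band-limited ratio can be sketched as follows. Equations (12), (14) and (15) are not reproduced in this text, so the code follows the verbal description only: Ω is the set of bins whose magnitude exceeds the mean magnitude (or, alternatively, the K largest bins), and E is the summed direct power divided by the summed reverberant power over the chosen bins. The function names and the linear (non-dB) scale of the returned ratio are assumptions.

```python
def select_band(magnitudes, top_k=None):
    """Return the bin indices forming the frequency region Omega.

    magnitudes: |Xm(w, l)| for one frame of one microphone.
    If top_k is given, keep the top_k largest bins; otherwise keep the
    bins above the threshold Pth, taken here as the mean magnitude.
    """
    if top_k is not None:
        order = sorted(range(len(magnitudes)),
                       key=lambda k: magnitudes[k], reverse=True)
        return sorted(order[:top_k])
    p_th = sum(magnitudes) / len(magnitudes)
    return [k for k, mag in enumerate(magnitudes) if mag > p_th]


def direct_to_reverberant_ratio(pd, pr, band=None):
    """Sum of direct power over sum of reverberant power.

    band=None sums over all bins (the full-band case); otherwise only
    the bins listed in band are used (the band-limited case).
    """
    bins = range(len(pd)) if band is None else band
    return sum(pd[k] for k in bins) / sum(pr[k] for k in bins)


# Illustrative values: energy concentrated in bins 1 and 2.
mags = [0.1, 5.0, 4.0, 0.2]
pd = [1.0, 8.0, 6.0, 1.0]
pr = [1.0, 2.0, 2.0, 1.0]
omega_band = select_band(mags)
e_band = direct_to_reverberant_ratio(pd, pr, omega_band)
```

In this illustrative case the weak bins 0 and 3 pull the full-band ratio down, while the band-limited ratio uses only the bins where the signal is actually present.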
[0055]
FIG. 7 shows an example of a functional configuration of the noise removal apparatus 700 of the
present invention.
The operation flow is shown in FIG.
The noise removal apparatus 700 includes the direct-to-reverberant ratio estimation apparatus 71
described in the first embodiment, the processing target signal generation unit 72, the target
signal adjustment unit 73, and the inverse frequency domain conversion unit 74.
[0056]
The processing target signal generation unit 72 receives as input the frequency-domain signals
Xm(ω, l) output by the frequency domain conversion units 421 to 42M in the
direct-to-reverberant ratio estimation device 71, and outputs the processing target signal
X(ω, l) (step S72).
The processing target signal X(ω, l) is a signal obtained by combining the frequency-domain
signals Xm(ω, l) by, for example, addition means (not shown).
Prior to the addition, each frequency-domain signal Xm(ω, l) may be multiplied by a weight.
[0057]
The target signal adjustment unit 73 receives the processing target signal X(ω, l) output from
the processing target signal generation unit 72 and the direct-to-reverberant ratio E output
from the direct-to-reverberant ratio estimation device 71, and generates the processed signal
Y(ω, l) by adjusting the amplitude of X(ω, l) according to the value of the ratio (step S73).
The inverse frequency domain conversion unit 74 converts the processed signal Y(ω, l) into a
time domain signal y(n) (step S74).
[0058]
The target signal adjustment unit 73 includes, for example, a distance calculation unit 721, a
filter formation unit 722, and a multiplication unit 723.
The distance calculating means 721 holds a functional expression d = f(E) indicating the
relationship between the direct-to-reverberant ratio E and the distance between the microphone
array 41 and the sound source, and calculates the sound source distance estimated value d^
corresponding to the input ratio E (distance calculation step S721).
[0059]
The filter forming means 722 forms a filter G(ω, l) that, as shown in equation (16), emphasizes
the time-frequency components for which the sound source distance estimated value d^ takes a
value between two different threshold values df and dn. The filter thus emphasizes only the
sound sources in the band-like region between the two distances.
[0060]
[0061]
Here, regarding l and ω of G(ω, l), the same G(ω, l) is applied to all of the L frames averaged
by equation (1) in the spatial correlation matrix calculating means 431 and to all of the
frequencies included in the frequency region Ω (equation (14)) used by the
direct-to-reverberant ratio calculation means 433', both of which belong to the processing of
the direct-to-reverberant ratio estimation unit 43 described above.
Further, in equation (16), the values of G(ω, l) do not necessarily have to be 1 and 0, and
may be values having sufficiently different magnitudes such as 0.9 and 0.1.
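The distance-based filter can be sketched as a simple gate. Equation (16) is not reproduced in this text, so the sketch assumes a two-valued gain that takes the pass value when d^ lies between the near threshold dn and the far threshold df and the stop value otherwise. Whether the bounds are inclusive, and the function names, are assumptions made here.

```python
def distance_gate(d_hat, d_near, d_far, pass_gain=1.0, stop_gain=0.0):
    """Gain G for one block of time-frequency components.

    Returns pass_gain when the estimated distance d_hat lies within
    [d_near, d_far] (bounds assumed inclusive), otherwise stop_gain.
    Softer values such as 0.9 and 0.1 can be passed instead of 1 and 0.
    """
    return pass_gain if d_near <= d_hat <= d_far else stop_gain


def apply_gate(x_bins, gain):
    """Multiply every frequency bin of the processing target signal by the gain."""
    return [gain * x for x in x_bins]


# Keep sources estimated to be between 0.3 m and 1.0 m from the array.
g_in = distance_gate(0.5, 0.3, 1.0)
g_out = distance_gate(1.5, 0.3, 1.0)
y_bins = apply_gate([1 + 1j, 2.0], g_out)
```

Because a single distance estimate is produced per block of L frames and per frequency region, the same gain is applied to every bin in that block, which is why `apply_gate` takes one scalar.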
[0062]
The multiplying means 723 multiplies the processing target signal X(ω, l) by the filter G(ω, l)
to generate the processed signal Y(ω, l).
The processed signal Y(ω, l) is therefore a signal in which the sound of a sound source located
between the two distances, that is, in a specific distance range from the microphone array 41,
has been emphasized or suppressed.
The processed signal Y(ω, l) is converted by the inverse frequency domain conversion unit 74
into the time domain signal y(n).
[0063]
[Experimental Results] To confirm the effect of the present invention, a computer simulation
was performed in which two sound sources were disposed at different positions in the same
direction as viewed from the microphone array, and the sound of the sound source farther from
the microphone array was suppressed.
[0064]
The simulation conditions are shown in FIG. 9.
A room with a floor plan of 4 m x 6 m and a height of 2.5 m was assumed.
A microphone array was used in which three microphones were linearly arranged at an interval
of 4 cm, giving an overall array length of 8 cm.
The central microphone of the array was placed at a height of 1.5 m and 1 m from the 4 m wall.
Then, a sound source emitting white noise following a normal distribution was placed in the
direction 10° from the central axis of the central microphone, the distance from the
microphone array was varied, and the direct-to-reverberant ratio was estimated each time.
[0065]
In FIG. 10, the horizontal axis represents the distance [cm] between the microphone array and
the sound source, and the vertical axis represents the direct-to-reverberant ratio [dB]. The
ratio estimated by the method of the present invention is plotted with ○. The actual
direct-to-reverberant ratio determined from the impulse response is plotted with □. Although
the estimate shows a tendency different from the actual value at 20 cm or less, it shows the
same tendency as the actual value at distances of 30 cm or more.
[0066]
FIG. 10 also shows clearly that the distance can be determined from the value of the
direct-to-reverberant ratio.
[0067]
In this way, even with a single compact microphone array, it is possible to estimate the distance
between the microphone and the sound source.
The idea of the present invention can be applied to a direct-to-reverberant ratio estimation
device, a sound source distance estimation device using the direct-to-reverberant ratio
estimation device, and a noise removal device.
[0068]
Note that the processes described in the above method and apparatus need not be performed only
in chronological order according to the order of description. They may also be performed in
parallel or individually, depending on the processing capability of the apparatus that executes
the processes or as needed.
[0069]
Further, when the processing means in the above-mentioned device is realized by a computer, the
processing content of the function that each device should have is described by a program.
Then, by executing this program on a computer, the processing means in each device is realized
on the computer.
[0070]
The program describing the processing content can be recorded on a computer-readable
recording medium. Any computer-readable recording medium may be used, such as a magnetic
recording device, an optical disc, a magneto-optical recording medium, or a semiconductor
memory. Specifically, for example, a hard disk device, a flexible disk, a magnetic tape, or the
like can be used as the magnetic recording device; a DVD (Digital Versatile Disc), a DVD-RAM
(Random Access Memory), a CD-ROM (Compact Disc Read Only Memory), a CD-R (Recordable) /
RW (Rewritable), or the like as the optical disc; an MO (Magneto-Optical disc) or the like as
the magneto-optical recording medium; and an EEP-ROM (Electronically Erasable and
Programmable Read Only Memory) or the like as the semiconductor memory.
[0071]
Further, this program is distributed, for example, by selling, transferring, lending, etc. a portable
recording medium such as a DVD, a CD-ROM or the like in which the program is recorded.
Furthermore, the program may be stored in a storage device of a server computer, and the
program may be distributed by transferring the program from the server computer to another
computer via a network.
[0072]
Further, each means may be configured by executing a predetermined program on a computer,
or at least a part of the processing content may be realized as hardware.