JP2015159458

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2015159458
Abstract: The present invention provides an apparatus for creating a sound source estimation
image that can reliably estimate a sound source even when a plurality of sound sources of
different frequencies are located close to one another. Kind Code: A1. The apparatus comprises:
sound source direction estimation means 32 for estimating a sound source direction for each
frequency from the difference in arrival time of sound pressure signals input to a sound
collection unit including a plurality of microphones M1 to M5; shooting means 12 for shooting
an image of the sound source direction; sound source estimation image creation means 33 for
creating a sound source estimation image, which is an image in which a figure indicating the
estimated sound source direction is drawn, by combining the sound source direction data
estimated by the sound source direction estimation means 32 with the image data, as a video
signal, of the sound source direction shot by the shooting means 12; and band image creating
means 34 for creating, from the created sound source estimation image, a band image, which is
an image in which a figure representing the sound source direction is drawn for each octave
band. [Selected figure] Figure 1
Device for generating sound source estimation image
[0001]
The present invention relates to an apparatus for creating an image for estimating a sound
source, using information of sounds collected by a plurality of microphones and information of
an image photographed by a photographing means.
[0002]
Conventionally, a plurality of microphones constitute a plurality of linearly arranged microphone
03-05-2019
1
pairs that intersect each other. The sound source direction is estimated from the ratio of the
arrival time difference, corresponding to the phase difference, between the two microphones
forming one pair to the arrival time difference between the two microphones forming the other
pair. An image collecting means such as a CCD camera captures an image of the estimated sound
source direction, and the captured image data is combined with the sound source direction data
to generate a sound source estimation image in which the estimated sound source direction and
sound pressure level are displayed graphically. A technique has been disclosed in which this
sound source estimation image is displayed on a display screen or the like so that the sound
source can be grasped visually (see, for example, Patent Document 1).
[0003]
Patent Document 1: JP 2011-238985 A
[0004]
However, in the above-mentioned conventional method, a figure showing the sound source
direction and the sound pressure level is displayed for every frequency in a single image, so
that when sound sources of different frequency bands are located close to one another, these
sound sources sometimes cannot be distinguished.
[0005]
The present invention has been made in view of this conventional problem, and its purpose is
to provide an apparatus for creating a sound source estimation image capable of reliably
estimating a sound source even when a plurality of sound sources of different frequencies are
located close to one another.
[0006]
The present invention comprises: sound collecting means, provided with a plurality of
microphones, for collecting a sound pressure signal of sound propagated from a sound source;
sound source direction estimation means for estimating a sound source direction for each
frequency from the arrival time differences of the sound pressure signals input to the
microphones; imaging means for capturing an image of the sound source direction; and sound
source estimation image creation means for creating a sound source estimation image, which is
an image in which a figure indicating the estimated sound source direction is drawn, by
combining the sound source direction data estimated by the sound source direction estimation
means with the image data, as a video signal, of the sound source direction captured by the
imaging means. The invention is characterized in that it further comprises band image creating
means for creating, from the sound source estimation image, a band image, which is an image in
which a figure indicating the sound source direction is drawn for each octave band.
This makes the difference between frequencies easy to see, so that even when a plurality of
sound sources of different frequencies are located close to one another, the sound sources can
be reliably estimated.
[0007]
The present invention is further characterized in that it comprises superposed band image
creating means for creating a superposed band image, which is an image obtained by
superimposing two or more band images of adjacent frequency bands created by the band image
creating means.
Since the frequency range of the superposed band image is wider than that of a single band
image, the sound source can be estimated with a small number of images.
Further, when the frequency range of interest spans a plurality of bands, the sound source can
be identified effectively by estimating it from the superposed band image instead of from the
individual band images.
[0008]
Further, according to the present invention, the sound collecting means comprises first and
second microphone pairs, each disposed at a predetermined interval on one of two straight lines
intersecting each other, and a fifth microphone that is not located on the plane formed by the
two microphone pairs. The sound source direction estimation means estimates the sound source
direction using the phase differences between the microphones constituting the two microphone
pairs and the phase differences between the microphones constituting four further microphone
pairs, each consisting of the fifth microphone and one of the four microphones of the two
pairs.
As a result, the horizontal angle θ and the elevation angle φ can be estimated efficiently and
accurately with a small number of microphones, so that a highly accurate sound source
estimation image can be created.
The summary of the invention does not enumerate all necessary features of the present
invention, and a subcombination of these feature groups can also be an invention.
[0009]
The drawings show, in order: a functional block diagram of the configuration of the sound
source estimation image creating apparatus according to an embodiment of the invention; the
arrangement of the microphones constituting the sound collecting means; an example of a sound
source estimation image; the method of creating band images, together with examples of band
images; and a flowchart of the band image creation method according to the present embodiment.
[0010]
Hereinafter, an embodiment of the present invention will be described with reference to the
drawings. FIG. 1 is a diagram showing the configuration of a sound source estimation image
creating apparatus 1 according to the present embodiment. The sound source estimation image
creating apparatus 1 includes a sound / image collecting unit 10, a data processing unit 20, a
storage / calculation unit 30, and a display means 40. The sound / image collecting unit 10
includes a sound collecting means 11, a CCD camera (hereinafter referred to as a camera) 12 as
an image collecting means, a microphone fixing unit 13, a camera support 14, columns 15, a
rotating table 16, and a base 17. The sound collecting means 11 includes a plurality of
microphones M1 to M5 and measures the sound pressure signal of sound propagated from a sound
source (not shown). As shown in FIG. 2, the four microphones M1 to M4 are arranged at a
predetermined interval L on two mutually orthogonal straight lines (here, the x and y axes) to
constitute two microphone pairs (M1, M3) and (M2, M4), and the fifth microphone M5 is arranged
at a position not on the plane formed by the microphones M1 to M4 (here, on the z axis). In
this example, the microphone M5 is disposed at the apex of a quadrangular pyramid whose base is
the square formed by the microphones M1 to M4. Four further microphone pairs (M5, M1) to
(M5, M4) are thereby formed.
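The arrangement described above can be written down as coordinates. The following is a minimal sketch only: the function and variable names are hypothetical, the apex height of M5 is left as a free parameter because the text states only that M5 lies off the plane of M1 to M4 on the z axis, and the side of each axis on which a given microphone sits is an assumption.

```python
import math

def microphone_positions(spacing, apex_height):
    """Return {name: (x, y, z)} in metres for the five microphones.

    M1/M3 lie on the x axis and M2/M4 on the y axis, each pair separated
    by `spacing` (the interval L) and centred on the origin. M5 lies on the
    z axis at `apex_height`. Signs are assumed: M1 -> M3 is taken as +x and
    M2 -> M4 as +y.
    """
    half = spacing / 2.0
    return {
        "M1": (-half, 0.0, 0.0),
        "M3": (half, 0.0, 0.0),
        "M2": (0.0, -half, 0.0),
        "M4": (0.0, half, 0.0),
        "M5": (0.0, 0.0, apex_height),
    }

# The two in-plane pairs plus the four pairs formed with M5.
MIC_PAIRS = [("M1", "M3"), ("M2", "M4"),
             ("M5", "M1"), ("M5", "M2"), ("M5", "M3"), ("M5", "M4")]

def pair_distance(positions, pair):
    """Euclidean distance between the two microphones of a pair."""
    a, b = (positions[name] for name in pair)
    return math.dist(a, b)
```

With a spacing of 0.3 m and an apex height of 0.2 m (illustrative values, not taken from the source), each in-plane pair is 0.3 m apart and each pair involving M5 is 0.25 m apart.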
[0011]
The shooting direction of the camera 12 passes through the intersection of the two orthogonal
straight lines on which the two microphone pairs (M1, M3) and (M2, M4) are disposed and, as
shown by the white arrow in the drawing, is set at 45° to each of the two straight lines. The
shooting direction of the camera 12 may instead be another direction, such as the direction
from the microphone M1 to the microphone M3 (the x-axis direction in FIG. 2). In this example,
the sound / image collecting unit 10 is installed so that the shooting direction of the camera
12 points in the direction in which the sound source is estimated to be present. The
microphones M1 to M5 are installed on the microphone fixing unit 13, the camera 12 is installed
on the camera support 14, and the microphone fixing unit 13 and the camera support 14 are
connected by three columns 15. That is, the sound collecting means 11 and the camera 12 are
integrated. The microphones M1 to M5 are disposed above the camera 12. The base 17 is a support
member consisting of three legs, and the rotating table 16 is installed on the base 17. The
camera support 14 is mounted on a rotating member 16r of the rotating table 16. Thus, by
rotating the rotating member 16r, the sound collecting means 11 and the camera 12 can be
rotated as a unit.
[0012]
The data processing unit 20 includes a sound data input / output means 21 and an image input /
output means 22. The sound data input / output means 21 includes an amplifier 21a and an A/D
converter 21b. The amplifier 21a includes a low-pass filter; it removes high-frequency noise
components from the sound pressure signals collected by the microphones M1 to M5, amplifies the
signals, and outputs them to the A/D converter 21b. The A/D converter 21b A/D converts the
sound pressure signals and sends the resulting sound pressure waveform data to the data storage
means 31 of the storage / calculation unit 30. The image input / output means 22 receives the
video signal captured by the camera 12, A/D converts it, and sends the resulting image data to
the data storage means 31.
[0013]
The storage / calculation unit 30 includes a data storage means 31, a sound source direction
estimation means 32, a sound source estimation image creation means 33, and a band image
creation means 34. Each means constituting the storage / calculation unit 30 is implemented,
for example, by the software and memory of a personal computer. The data storage means 31
stores the sound pressure waveform data and the image data. The sound source direction
estimation means 32 calculates the horizontal angle θ and the elevation angle φ, which define
the sound source direction, for each frequency f using the sound pressure waveform data stored
in the data storage means 31. It also measures the sound pressure level input to the microphone
M5 and takes this as the sound pressure level of the sound propagated from the sound source.
The method of calculating the horizontal angle θ and the elevation angle φ will be described
later. The sound source estimation image creation means 33 combines the sound source data for
each frequency calculated by the sound source direction estimation means 32, namely the sound
source direction (horizontal angle θ and elevation angle φ) and the sound pressure level, with
the image data of the shooting direction to create a sound source estimation image G, in which
a figure (here, a circle) representing the sound source direction is drawn for each frequency
f, and sends the image to the band image creation means 34. FIG. 3 shows an example of the
sound source estimation image G. Here the sound / image collecting unit 10 was installed at the
entrance of a factory (not shown) in order to estimate which part of a machine tool H in the
factory generates noise. The horizontal axis in the figure is the horizontal angle θ, and the
vertical axis is the elevation angle φ. In this example, the ranges of the horizontal angle θ
and the elevation angle φ are −80° ≦ θ ≦ +80° and −50° ≦ φ ≦ +50°, respectively. In the sound
source estimation image G of the machine tool H, many figures indicating the sound source
direction can be seen in the region D surrounded by the square frame in the figure.
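To draw a figure at the estimated direction, each (θ, φ) pair has to be placed on the image. The following is a minimal sketch under a strong assumption: the image is taken to span exactly −80° to +80° horizontally and −50° to +50° vertically, and angles are mapped to pixels linearly. The source does not describe the mapping actually used, and the function name and parameters are hypothetical.

```python
def angle_to_pixel(theta_deg, phi_deg, width, height,
                   theta_range=(-80.0, 80.0), phi_range=(-50.0, 50.0)):
    """Map (theta, phi) in degrees to (column, row) pixel coordinates.

    Assumes a linear angle-to-pixel relation (an illustration, not the
    source's method). Returns None for directions outside the displayed range.
    """
    t0, t1 = theta_range
    p0, p1 = phi_range
    if not (t0 <= theta_deg <= t1 and p0 <= phi_deg <= p1):
        return None
    col = (theta_deg - t0) / (t1 - t0) * (width - 1)
    # Row 0 is the top of the image, so larger elevation means a smaller row.
    row = (p1 - phi_deg) / (p1 - p0) * (height - 1)
    return round(col), round(row)
```

For a 161 × 101 pixel image, the direction (0°, 0°) lands on the centre pixel (80, 50) under this mapping.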
[0014]
As shown in FIG. 4(a), the band image creation means 34 creates, from the sound source
estimation image G, a plurality of band images Gp, each of which is an image for one octave
band. Specifically, 1/3 octave bands are used, with center frequencies fp (p = 1 to 17) of, for
example, 125 Hz, 160 Hz, 200 Hz, ..., 3.15 kHz, 4 kHz, and 5 kHz, so that 17 band images Gp are
obtained. For a 1/3 octave band, the band image Gp with center frequency fp is an image
displaying the sound sources of sounds whose frequencies lie in the band [fp / 1.125,
1.125 · fp]. One-octave bands may be used instead. FIG. 4(b) is the band image G7 with center
frequency fp = 500 Hz created from the sound source estimation image G shown in FIG. 3, and
FIG. 4(c) is the band image G13 with center frequency fp = 2 kHz. Comparing FIG. 3 with
FIG. 4(b) and FIG. 4(c), in the sound source estimation image G of FIG. 3 the figures
indicating the sound source direction overlap in the region D surrounded by the square frame,
which makes the sound source difficult to identify. In the band images G7 and G13 of FIG. 4(b)
and FIG. 4(c), by contrast, sound sources with different frequencies can be seen to be present,
as indicated by the arrows in each figure. Therefore, if the band images Gp are created from
the sound source estimation image G and displayed, as in the present example, the difference
for each frequency becomes easy to see, so that even when a plurality of sound sources with
different frequencies are located close to one another, the sound sources can be estimated
reliably. The display means 40 includes a display screen 40M such as a liquid crystal display,
and displays the band images Gp created by the band image creation means 34 on the display
screen 40M.
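The band split described above can be sketched as a grouping of per-frequency detections by band index p. This is an illustration only: the list of 17 center frequencies fills in the standard nominal 1/3-octave values between the ones named in the text, the band limits use the factor 1.125 stated in the text, and representing a detection as a dict with a `"freq_hz"` key is a hypothetical choice.

```python
# Nominal 1/3-octave centre frequencies from 125 Hz to 5 kHz (p = 1 to 17).
CENTER_FREQS_HZ = [125, 160, 200, 250, 315, 400, 500, 630, 800,
                   1000, 1250, 1600, 2000, 2500, 3150, 4000, 5000]
BAND_EDGE_FACTOR = 1.125  # band is [fp / 1.125, 1.125 * fp], as in the text

def band_index(freq_hz):
    """Return the 1-based band index p whose band contains freq_hz, else None.

    Because the nominal centre frequencies are rounded, neighbouring bands
    defined this way can overlap slightly or leave small gaps. This sketch
    returns the first matching band and None for a frequency in a gap.
    """
    for p, fp in enumerate(CENTER_FREQS_HZ, start=1):
        if fp / BAND_EDGE_FACTOR <= freq_hz <= fp * BAND_EDGE_FACTOR:
            return p
    return None

def split_into_band_images(detections):
    """Group detections into {p: [detections]} for p = 1 to 17."""
    bands = {p: [] for p in range(1, len(CENTER_FREQS_HZ) + 1)}
    for det in detections:
        p = band_index(det["freq_hz"])
        if p is not None:
            bands[p].append(det)
    return bands
```

With this numbering, a 500 Hz detection falls in band 7 and a 2 kHz detection in band 13, matching the band images G7 and G13 in the text.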
[0015]
Next, a method of creating the band images Gp with the sound source estimation image creating
apparatus 1 will be described with reference to the flowchart. First, after connecting the
sound / image collecting unit 10, the data processing unit 20, the storage / calculation unit
30, and the display means 40, the sound / image collecting unit 10 is set up at the measurement
point (step S10). Then, after the shooting direction of the camera 12 has been pointed at the
planned measurement location and it has been confirmed on the display screen 40M that the
camera 12 is capturing that location, the microphones M1 to M5 collect the sound while the
camera 12 simultaneously captures an image of the planned measurement location (step S11). If
the planned measurement location, as viewed from the measurement point, is wider than the field
of view of the image, the sound and the image may be collected while the rotating table 16 is
rotated back and forth about the center of the planned measurement location at a slow speed
of, for example, 3°/sec. A rotation range of about ±60° is appropriate. Alternatively, the
measurement may be made with the sound / image collecting unit 10 fixed, after which the
rotating table 16 is rotated as appropriate to change the shooting direction to the direction
in which the sound source is estimated to lie, and the measurement is then repeated.
[0016]
Next, the sound pressure signals output by the microphones M1 to M5 are amplified and A/D
converted, and the resulting sound pressure waveform data is stored in the data storage means
31; the video signal of the camera 12 is likewise A/D converted, and the resulting image data
is stored in the data storage means 31 (step S12). Next, the sound pressure waveform data
stored in the data storage means 31 is read out and used to calculate the sound source data,
namely the sound source direction and the sound pressure level (step S13). To obtain the sound
source direction, the sound pressure waveform data is frequency-analyzed by FFT, the phase
differences between the microphones M1 to M5 are obtained for each frequency, and the direction
of the sound source is estimated for each frequency from the obtained phase differences. In the
present example, the horizontal angle θ and the elevation angle φ, which define the sound
source direction, are determined using, instead of the phase difference, the arrival time
difference Dij, a physical quantity proportional to the phase difference. Specifically, the
sound source direction viewed from the measurement point is estimated from the phase difference
(arrival time difference Dij) of each microphone pair (Mi, Mj). The horizontal angle θ and the
elevation angle φ are expressed by equations (1) and (2). Here, the arrival time difference Dij
is the time difference between the sound pressure signal arriving at the microphone Mi and the
sound pressure signal arriving at the microphone Mj paired with the microphone Mi. The cross
spectrum Pij(f) of the signals input to the two microphones Mi and Mj is determined, and Dij is
then calculated by equation (3) using the phase angle information Ψ (rad) at the target
frequency f. The sound source direction and the sound pressure level are obtained for each
frequency. The magnitude of the signal input to the microphone M5 is taken as the magnitude of
the sound pressure signal.
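Equations (1) to (3) are not available in this text, so the following sketch does not reproduce them. It illustrates the same idea with generic techniques under stated assumptions: the arrival time difference is taken as the cross-spectrum phase divided by 2πf, and the direction is found by a least-squares fit of a far-field plane wave to all pair delays. The sign convention for Dij, the angle convention, the speed of sound, and all function names are assumptions.

```python
import cmath
import math

SPEED_OF_SOUND = 340.0  # m/s, assumed value

def dft_bin(samples, fs, f):
    """Complex spectrum of `samples` at frequency f (Hz); fs is the sampling rate."""
    w = -2j * math.pi * f / fs
    return sum(x * cmath.exp(w * n) for n, x in enumerate(samples))

def arrival_time_difference(sig_i, sig_j, fs, f):
    """D_ij in seconds from the phase of the cross spectrum at frequency f.

    Assumed convention: D_ij > 0 when the sound reaches Mj later than Mi.
    Unambiguous only while |2*pi*f*D_ij| < pi (no phase wrapping).
    """
    cross = dft_bin(sig_i, fs, f) * dft_bin(sig_j, fs, f).conjugate()
    return cmath.phase(cross) / (2.0 * math.pi * f)

def _det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def solve3(m, v):
    """Solve the 3x3 linear system m x = v by Cramer's rule."""
    d = _det3(m)
    return [_det3([[v[r] if c == k else m[r][c] for c in range(3)]
                   for r in range(3)]) / d
            for k in range(3)]

def estimate_direction(positions, pairs, delays, c=SPEED_OF_SOUND):
    """Least-squares far-field direction (theta, phi) in degrees.

    For a plane wave from unit vector u (pointing at the source),
    D_ij = (r_i - r_j) . u / c. theta is measured in the x-y plane from the
    +x axis and phi upward from that plane (an assumed convention).
    """
    rows = [[a - b for a, b in zip(positions[i], positions[j])] for i, j in pairs]
    rhs = [c * d for d in delays]
    # Normal equations (A^T A) u = A^T b
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(3)] for p in range(3)]
    atb = [sum(r[p] * b for r, b in zip(rows, rhs)) for p in range(3)]
    ux, uy, uz = solve3(ata, atb)
    theta = math.degrees(math.atan2(uy, ux))
    phi = math.degrees(math.atan2(uz, math.hypot(ux, uy)))
    return theta, phi
```

Here `positions` maps microphone names to (x, y, z) coordinates in metres and `pairs` lists the (Mi, Mj) name pairs in the same order as `delays`. In practice the delay for each pair would come from `arrival_time_difference` at the frequency of interest, which repeats the estimate once per frequency as the text describes.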
[0017]
Next, on a map whose horizontal axis is the horizontal angle θ and whose vertical axis is the
elevation angle φ, the sound source estimation image G is created, in which circles showing the
sound source direction (θ, φ) and the frequency and magnitude of the sound pressure signal are
displayed as shown in the drawing, and the image is stored in the data storage means 31 (step
S14). The diameter of each circle represents the sound pressure level, and the pattern or color
of the circle represents the frequency of the sound pressure signal. Then, as shown in FIG. 4,
a plurality of band images Gp, which are images for each octave band, are created from the
sound source estimation image G (step S15). Finally, the band images Gp are displayed on the
display screen 40M of the display means 40 (step S16). A single band image Gp may be displayed
on the display screen 40M, or a plurality of band images Gp may be displayed side by side on
the same screen.
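The relation between a measured signal and the circle drawn for it can be sketched as follows. The sound pressure level formula (20·log10 of the RMS pressure over the 20 µPa reference) is the standard definition; everything else is an assumption for illustration, including that the samples are calibrated in pascals, the linear level-to-diameter scaling, its limits, and the function names.

```python
import math

P_REF = 20e-6  # Pa, standard reference sound pressure in air

def sound_pressure_level_db(pressure_samples):
    """Sound pressure level in dB re 20 uPa of samples calibrated in pascals."""
    mean_square = sum(p * p for p in pressure_samples) / len(pressure_samples)
    return 20.0 * math.log10(math.sqrt(mean_square) / P_REF)

def circle_for_source(theta_deg, phi_deg, freq_hz, level_db,
                      min_db=40.0, max_db=100.0, max_diameter=30.0):
    """Describe one circle of the estimation image.

    The diameter grows linearly with level between min_db and max_db and is
    clamped outside that range (an assumed scaling, not from the source).
    The frequency is carried along so a renderer can pick a colour or pattern.
    """
    frac = (level_db - min_db) / (max_db - min_db)
    frac = min(max(frac, 0.0), 1.0)
    return {"center": (theta_deg, phi_deg),
            "diameter": frac * max_diameter,
            "freq_hz": freq_hz}
```

A renderer would then draw each returned circle on the (θ, φ) map, choosing the pattern or color from `freq_hz`.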
[0018]
Although the present invention has been described above using an embodiment, the technical
scope of the present invention is not limited to the scope described in that embodiment. It is
obvious to those skilled in the art that various changes or improvements can be made to the
above embodiment. It is also apparent from the scope of the claims that embodiments to which
such changes or improvements have been made can be included in the technical scope of the
present invention.
[0019]
For example, in the above embodiment the sound source is reliably estimated using the band
images Gp. Alternatively, superposed band image creating means may be provided to create and
display a superposed band image Gp,q (p, q = 1 to 17, with p < q), which is an image obtained
by superposing two or more of the band images Gp created by the band image creating means 34
whose frequency bands are adjacent. Examples of the superposed band image Gp,q include a
low-band superposed band image G2,13, created by superposing the band images G2, G3, ..., G13
with center frequencies of 160 Hz to 2 kHz, and a high-band superposed band image G10,17,
created by superposing the band images G10, G11, ... with center frequencies of 1 kHz to 5 kHz.
Because the superposed band image Gp,q covers a wider frequency range than a single band image
Gp, the sound source can be estimated with a small number of images. In addition, when the
frequency range of interest spans a plurality of octave bands, the sound source can be
identified effectively by estimating it from the superposed band image Gp,q.
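Superposing adjacent band images amounts to merging the contents of bands p through q. A minimal sketch, assuming (hypothetically) that the band images are held as a dict mapping the band index to a list of detections rather than as pixel data:

```python
def superpose_band_images(band_images, p, q):
    """Merge adjacent band images Gp..Gq (p < q) into one list of detections.

    `band_images` maps a 1-based band index to a list of detections; bands
    with no entry are treated as empty. The data layout is an assumption.
    """
    if not p < q:
        raise ValueError("p must be smaller than q")
    merged = []
    for k in range(p, q + 1):
        merged.extend(band_images.get(k, []))
    return merged
```

For instance, `superpose_band_images(band_images, 2, 13)` would correspond to the low-band image G2,13 and `superpose_band_images(band_images, 10, 17)` to the high-band image G10,17.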
[0020]
DESCRIPTION OF SYMBOLS
1 sound source estimation image creating apparatus, 10 sound / image collecting unit, 11 sound
collecting means, 12 CCD camera (camera), 13 microphone fixing unit, 14 camera support, 15
columns, 16 rotating table, 16r rotating member, 17 base, 20 data processing unit, 21 sound
data input / output means, 21a amplifier, 21b A/D converter, 22 image input / output means, 30
storage / calculation unit, 31 data storage means, 32 sound source direction estimation means,
33 sound source estimation image creation means, 34 band image creation means, 40 display
means, 40M display screen, M1 to M5 microphones.