JP2012238964

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2012238964
[PROBLEMS] To provide a sound separation device capable of appropriately separating a sound
from a proximity sound source and a sound from a distant sound source. A sound separation
device (15) includes a first microphone (FFM) that converts an input sound into a first sound
signal; a second microphone (NFM) that converts an input sound into a second sound signal and
has a larger distance attenuation rate than the first microphone; and a sound signal processing
unit (13) that optimizes a separation matrix by independent component analysis from the input
first sound signal and second sound signal and, using the optimized separation matrix, separates
a third sound signal as the sound signal from the proximity sound source and a fourth sound
signal as the sound signal from the distant sound source.
Sound separation device and camera unit provided with the same
[0001]
The present invention relates to a sound separation device that separates and extracts only a
near sound or a far sound from a mixed sound in which the near sound and the far sound are
mixed. The invention also relates to a camera unit comprising such a sound separation device.
[0002]
Conventionally, using the technology of Independent Component Analysis (ICA; Independent
03-05-2019
1
Component Analysis), the target sound has been separated and extracted from a mixed sound in
which the sound from the sound source to be detected (the target sound) and the sound from a
noise source are mixed. One example of a sound source to be detected is a speaker's voice.
[0003]
For example, Patent Document 1 discloses a sound signal processing apparatus in which mixed
sound is input to a nondirectional microphone, and either the sound from the sound source to be
detected or the sound from the noise source is mainly input to a unidirectional microphone, so
that blind source separation (BSS) can be performed in real time. Blind source separation refers
to a method of using ICA to optimize a separation matrix for separating the target sound from
the mixed sound, and then separating and extracting the target sound from the mixed sound using
the optimized separation matrix.
[0004]
Patent Document 1: JP 2005-227512 A
[0005]
By the way, in recent years, electronic devices capable of shooting moving images (for example,
portable video camera devices, mobile phones, portable game machines, etc.) have been actively
used.
These electronic devices generally include a camera unit that performs audio recording
processing simultaneously with moving image shooting. The camera unit is generally provided
with an autofocus function for focusing on a subject, and a zoom function for varying the
magnification of the subject.
[0006]
In the autofocus function and the zoom function, the lens system is moved using a DC motor, a
stepping motor, or the like. As the lens system moves, motor noise and other mechanical noise
are generated. Moreover, when moving image shooting is performed by the camera unit, focus and
zoom processing operate continuously, so motor noise and operation noise may be recorded. In
addition to these sounds, unnecessary sounds such as those made when the operator handles the
camera may be recorded, and it is desirable that such unnecessary sounds (noise sounds) be
recorded as little as possible.
[0007]
In this regard, it is conceivable to apply the technology of the sound signal processing device
shown in Patent Document 1, for example, to a camera unit so that only the target sound from
which the noise sound is removed is recorded. However, when the technique of Patent Document
1 is applied to a camera unit for the above purpose, the following problems occur.
[0008]
FIG. 11 is a diagram for explaining the problem of the prior art, and shows the directional
characteristic of each microphone when a nondirectional microphone and a unidirectional
microphone are mounted in a camera unit. In FIG. 11, the camera unit is located at the center O.
The circular area RR1 surrounded by a solid line indicates the directivity characteristic of the
nondirectional microphone, which collects sound from all directions evenly and with high
sensitivity. The heart-shaped region RR2 surrounded by a broken line indicates the directivity
characteristic of the unidirectional microphone, which collects sound from a specific direction
(the direction of C) relative to the center O with high sensitivity.
[0009]
At the time of moving image shooting, the sound generated at a position away from the camera
unit, such as the voice of the subject, is generally the target sound (sound to be detected),
while the sound generated near the camera unit (for example, the operation sound accompanying
movement of the lens system, handling sounds, and the like) is often unnecessary sound (noise
sound).
[0010]
A unidirectional microphone captures sound from a specific direction, so it collects sound from
any source lying in that direction, whether the source is near the camera unit or far from it.
If, following the prior art, the motor of the camera unit lies in the direction in which the
unidirectional microphone is sensitive, the sound from the noise source is mainly collected, but
sounds originating far away in the same direction are collected as well. For this reason, when
sound source separation is performed in this configuration, part of the distant sound remains as
noise sound, or the separation matrix does not converge and separation fails.
[0011]
In view of the above points, an object of the present invention is to provide a sound separation
device capable of appropriately separating the sound from the proximity sound source and the
sound from the distant sound source. Another object of the present invention is to provide a
camera unit provided with such a sound separation device, capable of removing a noise sound
generated near the camera unit and appropriately recording a target sound.
[0012]
In order to achieve the above object, a sound separation device according to the present
invention comprises: a first microphone that converts an input sound into a first sound signal;
a second microphone that converts an input sound into a second sound signal and has a larger
distance attenuation rate than the first microphone; and a sound signal processing unit that
optimizes a separation matrix by independent component analysis from the input first sound
signal and second sound signal and, using the optimized separation matrix, separates a third
sound signal as the sound signal from the proximity sound source and a fourth sound signal as
the sound signal from the distant sound source.
[0013]
According to this configuration, it is possible to properly separate the sound from the proximity
sound source and the sound from the distant sound source.
For this reason, the present invention is, for example, a technique suitable for a camera unit or
the like that performs voice recording processing simultaneously with moving image shooting.
[0014]
In the sound separation device of the above configuration, the second microphone is preferably a
differential microphone, and for example, a differential microphone having a primary gradient
characteristic can be used. According to this configuration, it is possible to realize a sound
separation device that can separate and extract only the sound from the near sound source or the
far sound source with high accuracy.
[0015]
In the sound separation device with the above configuration, in the case where the second
microphone is a differential microphone, it is preferable that the differential microphone has a
single diaphragm that vibrates due to sound pressure. According to this configuration, the second
microphone can be miniaturized, and the sound separation device can be easily mounted on an
electronic device.
[0016]
In the sound separation device with the above configuration, the first microphone may be a
nondirectional microphone. This configuration is suitable when a wide range is assumed as a
region in which a distant sound source is present.
[0017]
In the sound separation device of the above configuration, preferably, the first microphone and
the second microphone are formed in one package. According to this configuration, since the
distance between the two microphones can be made very close, it becomes possible to more
appropriately separate and extract the target sound.
[0018]
In addition, in order to achieve the above object, a camera unit of the present invention is
characterized by including the sound separation device of the above configuration. Specifically,
the camera unit configured as described above preferably further includes an imaging unit that
images a subject and converts imaging information into a video signal, and a storage unit that
stores the video signal and the fourth sound signal.
[0019]
In this configuration, when moving image shooting is performed by the camera unit, noise
generated by the main body of the camera unit and in its vicinity can be removed, and ambient
sound originating away from the camera unit can be appropriately recorded as the target sound.
[0020]
In the camera unit having the above configuration, the imaging unit may include a lens unit that
forms an image of incident light from the subject direction and a lens driving unit that drives a
movable lens included in the lens unit, and the sound signal processing unit may perform the
optimization processing of the separation matrix while the lens driving unit is operating and
not perform it while the lens driving unit is not operating.
[0021]
According to this configuration, among the sounds generated in the vicinity of the camera unit,
the sound generated particularly by the lens driving unit can be effectively separated and
removed as the noise sound to obtain the target sound.
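The switching described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the invention: the function name `separate_frame`, the `lens_drive_active` flag, and the `update_fn` callback (a stand-in for whatever ICA update rule is used) are all hypothetical names introduced here.

```python
import numpy as np

def separate_frame(W, Y, lens_drive_active, update_fn):
    """Separate one frame of two-channel samples, adapting W only while the lens drive runs.

    W: (2, 2) separation matrix; Y: (2, n) array of microphone samples.
    update_fn(W, Y) -> new W is a hypothetical stand-in for an ICA update step.
    """
    if lens_drive_active:
        # The noise source is active, so this frame carries information about it: adapt W.
        W = update_fn(W, Y)
    # While the lens drive is idle, the previously optimized W is reused unchanged.
    return W, W @ Y
```

Passing the update rule in as a callback keeps the gating logic independent of the particular ICA algorithm.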
[0022]
According to the sound separation device of the present invention, it is possible to properly
separate the sound from the proximity sound source and the sound from the distant sound
source.
Further, in a camera unit provided with the sound separation device of the present invention,
noise such as mechanical noise generated in the vicinity of the camera unit can be removed so
that the target sound (ambient sound originating away from the camera unit) is appropriately
recorded.
[0023]
FIG. 1: Block diagram showing the configuration of the camera unit of the present embodiment.
FIG. 2: Schematic perspective view showing the configuration of the camera unit of the present
embodiment.
FIG. 3: Schematic diagram showing the configuration of the near-field microphone provided in the
camera unit of the present embodiment.
FIG. 4: Schematic diagram showing the configuration of the far-field microphone.
FIG. 5: Graph showing the relationship between sound pressure P and distance R from the sound
source.
FIG. 6: Graph showing near-field characteristics of the far-field microphone and the near-field
microphone.
FIG. 7: Graph for explaining the distance attenuation characteristics of the near-field
microphone and the far-field microphone.
FIG. 8: Diagram showing the directional characteristic of each microphone provided in the camera
unit of the present embodiment.
FIG. 9: Diagram for explaining a modification of the present embodiment; schematic
cross-sectional view of a configuration in which the near-field microphone and the far-field
microphone are formed in one package.
FIG. 10: Diagram for explaining a modification of the present embodiment; block diagram of a
sound separation device in which whether to optimize the separation matrix can be switched
depending on whether the lens drive unit is being driven.
FIG. 11: Diagram for explaining the problem of the prior art, showing the directional
characteristic of each microphone when a nondirectional microphone and a unidirectional
microphone are mounted in a camera unit.
[0024]
Hereinafter, embodiments of a sound separation device of the present invention and a camera
unit provided with the same will be described in detail with reference to the drawings.
[0025]
FIG. 1 is a block diagram showing the configuration of the camera unit of the present
embodiment.
FIG. 2 is a schematic perspective view showing the configuration of the camera unit of the
present embodiment.
As shown in FIG. 1, the camera unit 1 according to the present embodiment includes an imaging
unit 11 capable of shooting moving images, a sound collection unit 12 capable of collecting
ambient sound at the time of moving image shooting, a sound signal processing unit 13 that
processes the sound collected by the sound collection unit 12, and a storage unit 14 that
records the video signal output from the imaging unit 11 and the sound signal output from the
sound signal processing unit 13.
[0026]
A portion 15 (a portion surrounded by a broken line in FIG. 1) including the sound collection unit
12 and the sound signal processing unit 13 is an embodiment of the sound separation device of
the present invention.
[0027]
The imaging unit 11 is attached to the main body 10 of the camera unit 1 as shown in FIG. 2 and
includes a lens unit 111 that forms an image of incident light from the subject direction.
The lens unit 111 may be configured by a single lens, or may be configured by a plurality of lens
groups.
In addition, the lens unit 111 includes a movable lens movable in the optical axis direction so as
to enable auto focus adjustment and zoom adjustment.
[0028]
The imaging unit 11 includes a lens driving unit 112 that drives a movable lens included in the
lens unit 111.
In FIG. 2, a part of the lens drive unit 112 is shown. The lens drive unit 112 has a drive source
such as a DC motor, a stepping motor, an ultrasonic motor, or a piezoelectric element, for
example. When focus adjustment or zoom adjustment is performed, the lens drive unit 112 drives
this drive source and moves, for example, a holder that holds the movable lens along a guide.
The operation of the lens drive unit 112 is controlled by a control unit (not shown). When the
lens drive unit 112 is driven, motor noise, the operating sound accompanying movement of the
holder, and the like are generated.
[0029]
The imaging unit 11 includes an imaging processing unit 113 in which an imaging surface is
disposed at a position where incident light from a subject direction is imaged by the lens unit
111, and photoelectric conversion of incident light is performed to output a video signal. The
imaging processing unit 113 can be, for example, a charge coupled device (CCD) image sensor, a
complementary metal oxide semiconductor (CMOS) image sensor, or the like. The video signal
output from the imaging processing unit 113 is sent to the recording processing unit 141 of the
storage unit 14 and subjected to recording processing.
[0030]
The sound collection unit 12 includes a near-field microphone NFM that mainly collects the
sound from the proximity sound source (a sound source near the camera unit 1) and converts it
into an electric signal, and a far-field microphone FFM that converts into an electric signal a
mixed sound consisting of the sound from the proximity sound source and the sound from the
distant sound source (in this embodiment, any sound source other than the proximity sound
source).
[0031]
As the far-field microphone FFM, a microphone capable of collecting the sound of a subject is
used; for example, a nondirectional microphone is selected. As the near-field microphone NFM, a
microphone with a large distance attenuation rate is used. As the near-field microphone
NFM, for example, a differential microphone having a gradient characteristic equal to or higher
than the primary gradient can be used, and it is preferable to select a microphone that mainly
collects near sound while suppressing distant sound. The far-field microphone FFM is an example
of the first microphone of the present invention, and the near-field microphone NFM is an
example of the second microphone of the present invention.
[0032]
The near-field microphone NFM and the far-field microphone FFM are disposed adjacent to each
other in the main body 10 of the camera unit 1, mounted on a mounting substrate (not shown). In
FIG. 2, these two microphones are indicated by broken lines because they are inside the main
body 10. The main body 10 of the camera unit 1 is provided with openings for introducing sound
into the microphones NFM and FFM. The positions of these microphones may be determined as
appropriate; in the present embodiment, they are arranged on the front surface of the main body
10. Here, so that the differential microphone used as the near-field microphone NFM can
efficiently collect the operation sound of the lens drive unit, it is desirable to install it so
that the direction of highest sensitivity of its directivity characteristic (the principal axis
direction) faces the lens drive unit.
[0033]
FIG. 3 is a schematic view showing an example of the configuration of the near-field microphone
provided in the camera unit of the present embodiment. FIG. 3(a) is a schematic perspective
view, and FIG. 3(b) is a cross-sectional view taken at position A-A in FIG. 3(a). The near-field
microphone NFM has a structure in which a lid body 211 is placed on a microphone substrate
201 on which a micro electro mechanical system (MEMS) chip 221 and an application specific
integrated circuit (ASIC) 222 are mounted.
[0034]
The MEMS chip 221 is a capacitor type microphone chip manufactured by processing silicon (Si)
with semiconductor process technology, and includes a diaphragm 221a displaced by input sound
pressure and a fixed electrode 221b disposed opposite the diaphragm 221a. A change in the input
sound pressure changes the distance between the diaphragm 221a and the fixed electrode 221b,
which in turn changes the capacitance of the capacitor. The MEMS chip 221 is configured such
that sound pressure is transmitted to both surfaces (upper surface and lower surface) of the
diaphragm 221a, and the fixed electrode 221b is provided with vent holes penetrating from its
front surface to its rear surface so that it is not vibrated by the sound pressure. The ASIC 222
is an integrated circuit including a circuit for converting a change in capacitance of the MEMS
chip 221 into an electric signal (sound signal), a power supply circuit for applying a bias
voltage to the diaphragm 221a or the fixed electrode 221b, and the like.
[0035]
Although the ASIC 222 is provided separately from the MEMS chip 221 in this embodiment, the
integrated circuit mounted on the ASIC 222 may be monolithically formed on a silicon substrate
on which the MEMS chip 221 is formed.
[0036]
A first opening 202 and a second opening 203 are provided on the upper surface 201 a of the
microphone substrate 201 on which the MEMS chip 221 and the ASIC 222 are mounted.
The first opening 202 and the second opening 203 communicate with each other through the
substrate internal space 204. Note that such a microphone substrate 201 may be obtained by
bonding a plurality of substrates.
[0037]
The MEMS chip 221 is disposed such that the diaphragm 221a is substantially parallel to the
microphone substrate 201, and is disposed so as to close the first opening 202 from the
substrate upper surface 201a side. Further, on the lower surface 201b of the microphone
substrate 201, a connection terminal 205 for external connection is formed.
[0038]
In the upper surface 211 a of the lid 211, a first sound hole 212 is formed on one end side in the
longitudinal direction, and a second sound hole 213 is formed on the other end side. In the
present embodiment, the two sound holes 212 and 213 are in the form of elongated holes, but
the shape is not limited to this shape, and the shape may be changed as appropriate.
[0039]
Further, in the lid body 211, a first space portion 214 connected to the first sound hole 212
and a second space portion 215, separated from the first space portion 214 and connected to the
second sound hole 213, are formed. The lid body 211 is mounted on the microphone substrate 201
such that the first space portion 214 is separated from the substrate internal space 204 by the
MEMS chip 221, and such that the second space portion 215 communicates with the substrate
internal space 204 via the second opening 203.
[0040]
The near-field microphone NFM configured as described above has a first sound path P1 that
guides external sound from the first sound hole 212 through the first space portion 214 to the
upper surface of the diaphragm 221a, and a second sound path P2 that guides external sound from
the second sound hole 213 through the second space portion 215, the second opening 203, the
substrate internal space 204, and the first opening 202, in this order, to the lower surface of
the diaphragm 221a.
[0041]
The near-field microphone NFM is configured so that the diaphragm 221a vibrates according to
the difference between the sound pressure pf applied to its upper surface and the sound pressure
pb applied to its lower surface, and the input sound is thereby converted into an electrical
signal (sound signal). That is, the near-field microphone NFM is configured as a first-order
gradient differential microphone. Although the invention is not limited to this, in the present
embodiment the sound paths P1 and P2 are made substantially the same length so that no phase
difference arises between the two sound paths.
[0042]
FIG. 4 is a schematic view showing the configuration of the far-field microphone provided in
the camera unit of the present embodiment. FIG. 4(a) is a schematic perspective view, and
FIG. 4(b) is a cross-sectional view taken at position B-B in FIG. 4(a).
[0043]
The far-field microphone FFM has a structure in which a MEMS chip 321 and an ASIC 322 are
mounted on the upper surface 301a of a microphone substrate 301, and a lid 311 is placed over
the microphone substrate 301 so as to cover the MEMS chip 321 and the ASIC 322. A connection
terminal 302 for external connection is formed on the lower surface 301b of the microphone
substrate 301.
[0044]
A sound hole 312 is formed in the upper surface 311a of the lid 311, and a space portion 313
connected to the sound hole 312 is formed in the lid 311. The far-field microphone FFM
configured in this way has a sound path P that guides external sound from the sound hole 312
through the space portion 313 to the upper surface of the diaphragm 321a. The lower surface side
of the diaphragm 321a is closed off by the microphone substrate 301 to form a closed space.
[0045]
The MEMS chip 321 and the ASIC 322 have the same configuration as the near-field microphone
NFM, so the description is omitted.
[0046]
Here, the characteristics of the near-field microphone NFM and the far-field microphone FFM will
be described.
Prior to this explanation, the nature of the sound wave will be described. FIG. 5 is a graph
showing the relationship between the sound pressure P and the distance R from the sound
source. As shown in FIG. 5, a sound wave attenuates as it travels through a medium such as
air, and its sound pressure (the strength, or amplitude, of the sound wave) decreases. The
sound pressure attenuates in inverse proportion to the distance from the sound source, and the
relationship between the sound pressure P and the distance R can be expressed as the following
equation (1), where k is a proportionality constant.

$$P = \frac{k}{R} \qquad (1)$$
[0047]
According to equation (1), the output signal of the far-field microphone FFM is inversely
proportional to the distance from the sound source. In the near-field microphone NFM, on the
other hand, the output is proportional to the difference between the sound pressures input from
the first sound hole 212 and the second sound hole 213. The output of the near-field microphone
NFM will be described in detail below with reference to FIGS. 3 and 5.
[0048]
Let Δd be the distance between the first sound hole 212 and the second sound hole 213 of the
near-field microphone NFM. When the microphone is placed close to the sound source, with R1 the
distance from the sound source to the first sound hole 212 and R2 the distance from the sound
source to the second sound hole 213, the differential pressure generated at the diaphragm 221a
is (P1 - P2), where P1 and P2 are the sound pressures at distances R1 and R2. When the
microphone is placed far from the sound source, with R3 the distance from the sound source to
the first sound hole 212 and R4 the distance from the sound source to the second sound hole 213,
the differential pressure generated at the diaphragm 221a is (P3 - P4). The output of the
near-field microphone NFM is therefore equivalent to the slope of the graph in FIG. 5, giving a
characteristic equivalent to differentiation with respect to the distance R.
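The contrast between the near and far cases can be checked numerically. The sketch below assumes the source lies on the axis through both sound holes, so the second hole is simply Δd farther away than the first; the constant k = 1, the 5 mm hole spacing, and the 2 cm and 1 m source distances are illustrative values chosen here, not figures from the description.

```python
def sound_pressure(k, r):
    """Sound pressure at distance r from a point source, P = k / r (equation (1))."""
    return k / r

def differential_pressure(k, r, delta_d):
    """Pressure difference between two sound holes at distances r and r + delta_d."""
    return sound_pressure(k, r) - sound_pressure(k, r + delta_d)

# Assumed, illustrative values: k = 1, hole spacing 5 mm, sources at 2 cm and 1 m.
K, DELTA_D = 1.0, 0.005
near = differential_pressure(K, 0.02, DELTA_D)   # steep part of the P-R curve
far = differential_pressure(K, 1.0, DELTA_D)     # flat part of the P-R curve
print(f"near: {near:.4f}, far: {far:.6f}, ratio: {near / far:.0f}")
```

With these values the plain sound pressures at the two source distances differ by a factor of 50, whereas the differential pressures differ by a factor of roughly 2000, which is why the differential microphone favors nearby sources so strongly.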
[0049]
FIG. 7 is a graph for explaining the distance attenuation characteristics of the near-field
microphone and the far-field microphone. The horizontal axis represents the distance R from the
sound source on a logarithmic scale, and the vertical axis represents the sound pressure level
(dB) applied to the diaphragm of the microphone.
[0050]
In the far-field microphone FFM, the diaphragm 321a vibrates due to the sound pressure applied
to its upper surface, so the output level of the microphone attenuates as $1/R$.
In the near-field microphone NFM, on the other hand, the diaphragm 221a vibrates due to the
difference between the sound pressures applied to its upper and lower surfaces, so the output
level of the microphone attenuates as $1/R^{2}$, the characteristic obtained by differentiating
the characteristic of the far-field microphone FFM with respect to the distance R.
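Why the differential output behaves like a derivative can be seen from a short calculation. As a sketch, assume a point source on the axis through the two sound holes, so that the holes lie at distances R and R + Δd from the source; this geometry is an assumption made here for illustration.

```latex
\begin{aligned}
\Delta P &= P(R) - P(R + \Delta d)
          = \frac{k}{R} - \frac{k}{R + \Delta d}
          = \frac{k\,\Delta d}{R\,(R + \Delta d)} \\
         &\approx \frac{k\,\Delta d}{R^{2}}
          = -\Delta d \,\frac{dP}{dR}
          \qquad (R \gg \Delta d)
\end{aligned}
```

The approximation holds when the source distance is much larger than the hole spacing; very close to the source the exact expression on the first line applies.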
[0051]
As shown in FIG. 7, the output of the near-field microphone NFM has a larger attenuation rate
with respect to the distance from the sound source than the output of the far-field microphone
FFM. That is, compared with the far-field microphone FFM, the near-field microphone NFM
efficiently collects the sound generated in the vicinity of the microphone while suppressing
distant sound.
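The difference in attenuation rate can be expressed in decibels. The sketch below uses idealized models, with the far-field microphone output proportional to 1/R and the near-field microphone output proportional to 1/R², and normalizes all constants to 1; the function names are hypothetical and the normalization is an assumption made for illustration.

```python
import math

def level_db(amplitude):
    """Convert a pressure amplitude ratio to a level in decibels."""
    return 20.0 * math.log10(amplitude)

def ffm_level_db(r):
    """Far-field microphone model: output amplitude proportional to 1/r."""
    return level_db(1.0 / r)

def nfm_level_db(r):
    """Near-field (first-order differential) microphone model: amplitude proportional to 1/r**2."""
    return level_db(1.0 / r**2)

# Level drop when the distance doubles, for each microphone model.
ffm_drop = ffm_level_db(1.0) - ffm_level_db(2.0)
nfm_drop = nfm_level_db(1.0) - nfm_level_db(2.0)
print(f"FFM: {ffm_drop:.2f} dB per doubling, NFM: {nfm_drop:.2f} dB per doubling")
```

Under these models the far-field microphone loses about 6 dB each time the distance doubles and the near-field microphone about 12 dB, so on a logarithmic distance axis the near-field line falls twice as steeply.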
[0052]
The sound pressure of a sound generated in the vicinity of the near-field microphone NFM
attenuates greatly between the first sound hole 212 and the second sound hole 213, so there is a
large difference between the sound pressure transmitted to the upper surface of the diaphragm
221a and the sound pressure transmitted to its lower surface. In contrast, a sound from a
distant sound source hardly attenuates between the first sound hole 212 and the second sound
hole 213, so the difference between the sound pressure transmitted to the upper surface of the
diaphragm 221a and the sound pressure transmitted to its lower surface is very small. Here, it
is assumed that the distance from the sound source to the first sound hole 212 differs from the
distance from the sound source to the second sound hole 213.
[0053]
Since the sound pressure difference of the sound from the distant sound source received by the
diaphragm 221a is very small, the sound pressure of the sound from the distant sound source is
substantially canceled by the diaphragm 221a. On the other hand, since the sound pressure
difference of the sound of the proximity sound source received by the diaphragm 221a is large,
the sound pressure of the sound from the proximity sound source is not canceled by the
diaphragm 221a. For this reason, the signal obtained by the vibration of the diaphragm 221a can
be regarded as the signal of the sound from the proximity sound source.
[0054]
FIG. 6 shows the directivity characteristics of the near-field microphone NFM and the far-field
microphone FFM. FIG. 6(a) shows the directivity of the near-field microphone NFM, and FIG. 6(b)
shows the directivity of the far-field microphone FFM. FIG. 6(a) shows the case where the first
sound hole 212 and the second sound hole 213 of the near-field microphone NFM are arranged in
the 0° and 180° directions, and FIG. 6(b) shows the case where the sound hole 312 of the
far-field microphone FFM is disposed at the origin.
[0055]
First, the directivity characteristics of the near-field microphone NFM shown in FIG. 6(a) will be
described. If the distance from the sound source to the near-field microphone NFM is constant,
the sound pressure applied to the diaphragm 221a is maximum when the sound source is in the
direction of 0 ° or 180 °. This is because the difference between the distance from the sound
source to the first sound hole 212 and the distance from the sound source to the second sound
hole 213 is maximum.
[0056]
On the other hand, when the sound source is in the direction of 90 ° or 270 °, the sound
pressure applied to the diaphragm 221a is minimized (approximately zero). This is because the
distance from the sound source to the first sound hole 212 and the distance from the sound
source to the second sound hole 213 are equal.
[0057]
That is, when a differential microphone with a primary gradient is used as the near-field
microphone NFM, it exhibits so-called bidirectionality: its sensitivity is high for sound waves
incident from the 0° and 180° directions and low for sound waves incident from the 90° and
270° directions.
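The bidirectional pattern follows directly from the geometry of the two sound holes. The sketch below places the holes symmetrically about the origin on the 0°/180° axis and evaluates the pressure difference for a point source at a fixed distance and varying angle. The function name and the numerical values (0.5 m source distance, 5 mm spacing, k = 1) are assumptions for illustration only.

```python
import math

def differential_response(theta_deg, r, delta_d, k=1.0):
    """Pressure difference between two sound holes for a point source at angle theta.

    The holes sit at (+delta_d/2, 0) and (-delta_d/2, 0); the source is at
    distance r from their midpoint, in direction theta measured from the hole axis.
    """
    theta = math.radians(theta_deg)
    sx, sy = r * math.cos(theta), r * math.sin(theta)
    r1 = math.hypot(sx - delta_d / 2, sy)  # distance to the first hole
    r2 = math.hypot(sx + delta_d / 2, sy)  # distance to the second hole
    return k / r1 - k / r2

# Assumed values: source 0.5 m away, holes 5 mm apart.
for angle in (0, 45, 90, 135, 180, 270):
    print(angle, abs(differential_response(angle, 0.5, 0.005)))
```

The magnitude is largest on the hole axis and essentially vanishes broadside to it, where both holes are equidistant from the source.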
[0058]
Next, the directivity characteristic of the far-field microphone FFM shown in FIG. 6(b) will be
described. If the distance from the sound source to the diaphragm 321a is constant, the sound
pressure applied to the diaphragm 321a is constant regardless of the direction of the sound
source. That is, the far-field microphone FFM is nondirectional, collecting sound waves incident
from all directions with equal sensitivity.
[0059]
Returning to FIG. 1, the sound signal processing unit 13 included in the camera unit 1 will be
described. The sound signal processing unit 13 includes a first A/D conversion unit 131 and a
second A/D conversion unit 132, each of which converts an analog sound signal into a digital
sound signal. The first A/D conversion unit 131 samples the sound signal output from the
near-field microphone NFM (corresponding to the second sound signal of the present invention)
at predetermined time intervals and converts it into a digital signal Y1(t). The second A/D
conversion unit 132 samples the sound signal output from the far-field microphone FFM
(corresponding to the first sound signal of the present invention) at predetermined time
intervals and converts it into a digital signal Y2(t).
[0060]
The sound signal processing unit 13 also includes an ICA (Independent Component Analysis)
processing unit 133 that sequentially processes the digital signals output in time division
from the first A/D conversion unit 131 and the second A/D conversion unit 132. The basic ICA
processing uses conventional techniques. The ICA processing unit 133 performs FFT (Fast Fourier
Transform) processing on the digital sound signals input from the two A/D conversion units 131
and 132, and then obtains (optimizes) a separation matrix using an independent component
analysis technique in the frequency domain. Here, the separation matrix is successively updated
so as to maximize the statistical independence between the separated signals, so that it
converges to an optimal solution.
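The successive update of the separation matrix can be illustrated with one commonly used ICA rule. The sketch below is not the method of this document, which works in the frequency domain after FFT processing and does not name a specific update rule. It assumes a real-valued, time-domain natural-gradient update with a tanh nonlinearity, and the names ica_step and mu are hypothetical.

```python
import math


def ica_step(W, frames, mu=0.01):
    """Apply one natural-gradient ICA update to a 2x2 separation matrix.

    Assumed rule: W <- W + mu * (I - E[phi(x) x^T]) W, with x = W y
    and phi = tanh. `frames` is a non-empty list of (y1, y2) samples.
    """
    n = len(frames)
    # Estimate E[phi(x) x^T] over the block of observed samples.
    C = [[0.0, 0.0], [0.0, 0.0]]
    for y1, y2 in frames:
        x = (W[0][0] * y1 + W[0][1] * y2,
             W[1][0] * y1 + W[1][1] * y2)
        phi = (math.tanh(x[0]), math.tanh(x[1]))
        for i in range(2):
            for j in range(2):
                C[i][j] += phi[i] * x[j] / n
    # G = I - C measures how far the outputs are from the update's fixed point.
    G = [[(1.0 if i == j else 0.0) - C[i][j] for j in range(2)] for i in range(2)]
    # Return W + mu * (G @ W) as a new matrix, leaving the input untouched.
    return [[W[i][j] + mu * sum(G[i][k] * W[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]
```

Calling ica_step repeatedly on successive blocks of (Y1, Y2) samples corresponds to the successive updating described above. Whether it converges depends on the step size mu and on the statistics of the sources.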
[0061]
At a certain time t, let S1(t) and S2(t) be the sounds output from two independent sound
sources. These sounds are collected by two microphones, and the signals obtained by A/D
conversion of each microphone output are denoted Y1(t) and Y2(t), respectively. In this case,
the following equation (2) holds, where A is a 2 × 2 mixing matrix.
$$\begin{pmatrix} Y_1(t) \\ Y_2(t) \end{pmatrix} = A \begin{pmatrix} S_1(t) \\ S_2(t) \end{pmatrix} \qquad (2)$$
[0062]
Assuming that W is the inverse matrix of A, the following equation (3) holds.
$$\begin{pmatrix} S_1(t) \\ S_2(t) \end{pmatrix} = W \begin{pmatrix} Y_1(t) \\ Y_2(t) \end{pmatrix} \qquad (3)$$
Here, W is the separation matrix. Using the technique of independent component analysis, the
separation matrix W is optimized so that the statistical independence of the sounds S1(t) and
S2(t) output from the two sound sources is maximized. In the present embodiment, the two
independent sound sources correspond to the proximity sound source in the vicinity of the
camera unit 1 and the distant sound source (a sound source other than the proximity sound
source) at a position distant from the camera unit 1. One of the two microphones corresponds to
the near-field microphone NFM, and the other corresponds to the far-field microphone FFM.
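The relationship between the mixing matrix A and the separation matrix W can be checked with a small numerical example. This is a sketch under the idealized assumption that A is known and invertible. In the device itself, A is unknown and W is estimated by ICA. The matrix entries and sample values below are made up for illustration.

```python
def invert_2x2(A):
    """Return the inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular; the sources cannot be separated")
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(M, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])


# Hypothetical mixing matrix: row 0 is the near-field microphone, which picks up
# mostly the proximity source S1; row 1 is the far-field microphone.
A = [[1.0, 0.05],
     [0.5, 0.5]]
W = invert_2x2(A)

s = (0.8, -0.3)   # source samples (S1, S2) at one instant
y = matvec(A, s)  # what the two microphones observe (Y1, Y2)
x = matvec(W, y)  # separated signals (X1, X2), equal to s up to rounding
```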
[0063]
The ICA processing unit 133 uses the optimized separation matrix W to separate and extract the
separated signals X1(t) and X2(t) from the signals input from the two microphones NFM and FFM
(specifically, the signals after processing such as A/D conversion). Here, the separated signal
X1(t) is a signal estimated to be the signal of the sound (S1(t)) from the proximity sound
source, and corresponds to the third sound signal of the present invention. The separated
signal X2(t) is a signal estimated to be the signal of the sound (S2(t)) from the distant sound
source, and corresponds to the fourth sound signal of the present invention.
[0064]
The ICA processing unit 133 outputs the separated signal X2(t), which is estimated to be the
target sound, to the recording processing unit 142 of the storage unit 14, and does not output
the separated signal X1(t), which is estimated to be the noise sound, to the recording
processing unit 142. The recording processing unit 142 sequentially records, in time division,
the separated signal X2(t) sent from the ICA processing unit 133.
[0065]
Next, among the camera units 1 configured as described above, the operation of the sound
separation device 15 will be described.
[0066]
FIG. 8 is a diagram showing the directivity characteristics of each microphone provided in the
camera unit of the present embodiment.
In FIG. 8, the camera unit 1 is located at the center O. In FIG. 8, the solid line R1 indicates the
directivity of the far-field microphone FFM, and the 8-shaped broken line R2 indicates the
directivity of the near-field microphone NFM.
[0067]
As described above, the near-field microphone NFM excels at collecting sound from the proximity
sound source near the camera unit 1 (near the center O of FIG. 8), and the far-field microphone
FFM excels at collecting sound from a wide range, including sound from the distant sound source
at a position far from the camera unit 1.
[0068]
The near-field microphone NFM mainly collects the sound (S1) generated in the vicinity of the
camera unit 1, for example mechanical sound generated from the main body 10 of the camera unit
1 (such as the sound generated when the lens drive unit 112 drives the lens), operation sound
generated when the operator operates the camera unit 1, and the voice of the operator.
The far-field microphone FFM is installed to collect, in addition to the above three sounds,
sound including the ambient sound (S2) far from the camera unit 1.
[0069]
At this time, the output of the near-field microphone NFM can be expressed as (a1 · S1 + a2 · S2),
and the output of the far-field microphone FFM can be expressed as (a3 · S1 + a4 · S2). Here, a1,
a2, a3 and a4 are coefficients, and a1 >> a2 holds.
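Written in matrix form, these coefficients give the mixing matrix and, when it is invertible, the ideal separation matrix. This is a sketch assuming that Y1 is the near-field microphone signal, that Y2 is the far-field microphone signal, and that the coefficients are constant:

$$A = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix}, \qquad W = A^{-1} = \frac{1}{a_1 a_4 - a_2 a_3} \begin{pmatrix} a_4 & -a_2 \\ -a_3 & a_1 \end{pmatrix}$$

The inverse exists only when $a_1 a_4 \neq a_2 a_3$, that is, when the two microphones weight the two sources in different proportions. With $a_1 \gg a_2$ for the near-field microphone, this holds as long as the far-field microphone does not weight S1 over S2 in the same proportion.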
[0070]
The ICA processing unit 133, to which the signals from the near-field microphone NFM and the
far-field microphone FFM are input, uses the appropriately optimized separation matrix W to
separate and extract the sound X1, which is estimated to be the sound S1 from the proximity
sound source, and the sound X2, which is estimated to be the sound S2 from the distant sound
source. That is, according to the sound separation device 15 of the present embodiment, sound
from the proximity sound source that is conventionally regarded as unnecessary noise, such as
mechanical sound generated from the main body 10 of the camera unit 1, operation sound of the
operator, and the voice of the operator, can be properly removed to obtain only the ambient
sound away from the camera.
[0071]
Conventional sound source separation techniques are mainly used to separate two or more sound
sources present in different directions with respect to the microphones, and have difficulty
separating sound sources present at different distances in the same direction. This is because
the sounds from such sound sources enter the two microphones in the same phase. Therefore, in
order to separate two or more sound sources, the two microphones used for sound collection must
be spaced apart, for example by 10 cm or more, and a large space is required for the
arrangement of the microphones.
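The difficulty with sources in the same direction can be illustrated geometrically. The sketch below assumes two microphones placed 10 cm apart on the x axis, point sources in a plane, and a speed of sound of 343 m/s. These values and the function name are illustrative assumptions. It computes the arrival-time difference between the two microphones, which is what direction-based separation relies on.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature (assumed)


def arrival_time_difference(distance_m, angle_deg, spacing_m=0.10):
    """Arrival-time difference (s) between two microphones for a point source.

    The microphones sit at x = -spacing/2 and x = +spacing/2; the source is at
    `distance_m` from the origin in direction `angle_deg` from the x axis.
    """
    theta = math.radians(angle_deg)
    sx, sy = distance_m * math.cos(theta), distance_m * math.sin(theta)
    d_left = math.hypot(sx + spacing_m / 2, sy)
    d_right = math.hypot(sx - spacing_m / 2, sy)
    return (d_left - d_right) / SPEED_OF_SOUND


if __name__ == "__main__":
    # Same direction (60 degrees), different distances: nearly identical delay.
    print(arrival_time_difference(1.0, 60), arrival_time_difference(10.0, 60))
    # Different directions: clearly different delay.
    print(arrival_time_difference(1.0, 0), arrival_time_difference(1.0, 90))
```

Under these assumptions, two sources at 1 m and 10 m in the same direction produce almost the same inter-microphone delay, so the delay alone cannot tell them apart, whereas sources in different directions are easily distinguished.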
[0072]
In contrast, by using two microphones having different distance attenuation characteristics as
in the configuration of the present embodiment, a large amplitude difference can be secured
between sounds from sound sources at different distances in the same direction, so that the
sound sources can be separated. Conventionally, sound sources were separated using the
difference in spatial direction. By using two microphones having different distance attenuation
characteristics, sound sources can instead be separated using the difference in distance from
the microphones. Further, in the configuration of the present invention, the sound sources can
be separated even if the two microphones are arranged at the same position, so there is an
advantage that the microphones can be installed as long as there is a space about the size of
the microphones.
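The amplitude difference produced by different distance attenuation characteristics can be sketched with a simple spherical-wave model. The following assumes that sound pressure falls off as 1/r, that the source lies on the axis of the two sound holes, and that the holes are 5 mm apart. Phase and frequency effects are ignored. The spacing and the function names are illustrative assumptions.

```python
def omni_amplitude(r_m):
    """Pressure amplitude seen by an omnidirectional microphone (arbitrary units)."""
    return 1.0 / r_m


def differential_amplitude(r_m, port_spacing_m=0.005):
    """Pressure difference between two sound holes lying on the source axis."""
    return 1.0 / r_m - 1.0 / (r_m + port_spacing_m)


def nfm_to_ffm_ratio(r_m, port_spacing_m=0.005):
    """Output of the differential microphone relative to the omnidirectional one."""
    return differential_amplitude(r_m, port_spacing_m) / omni_amplitude(r_m)


if __name__ == "__main__":
    for r in (0.05, 0.5, 2.0):
        print(r, nfm_to_ffm_ratio(r))
```

In this model the ratio simplifies to spacing / (r + spacing), so it falls steadily with distance. A source 5 cm away and a source 2 m away in the same direction therefore appear in the two microphone signals in clearly different proportions, which is the property the separation relies on.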
[0073]
The embodiments described above are merely illustrative of the present invention. That is, the
present invention is not limited to the embodiments described above, and various modifications
can be made without departing from the object of the present invention.
[0074]
For example, in the embodiment described above, the near-field microphone NFM and the far-field
microphone FFM are configured in separate packages. However, it is preferable to arrange
the near-field microphone and the far-field microphone as close as possible so that phase shift of
the input sound wave does not occur. For this reason, it is preferable to adopt a configuration in
which two microphones are formed in one package.
[0075]
FIG. 9 is a view for explaining a modification of this embodiment, and is a schematic
cross-sectional view showing a configuration in which the near-field microphone and the
far-field microphone are formed in one package. The configuration of the microphone of this
modification is merely an example, and it goes without saying that various modifications are
possible. In short, any configuration may be used as long as a single package provides both the
function of the near-field microphone and the function of the far-field microphone.
[0076]
The configuration of the microphone 400 of the modified example shown in FIG. 9 is
substantially the same as the configuration of the near-field microphone NFM of the present
embodiment. The difference is that a MEMS chip 401 (having the same configuration as the MEMS
chip 221) is newly added. In FIG. 9, the same parts as those in FIG. 3 are denoted by the same
reference numerals.
[0077]
When sound is generated outside the microphone 400, the sound wave input from the first
sound hole 212 reaches the upper surface of the diaphragm 401a of the second MEMS chip 401
via the first sound path P1, and the diaphragm 401a vibrates. The diaphragm 401a of the
second MEMS chip 401 vibrates only by the sound wave applied to its upper surface, so the
signal output from the second MEMS chip 401 provides the same function as the far-field
microphone FFM of this embodiment.
[0078]
Also, when sound is generated outside the microphone 400, the sound wave input from the first
sound hole 212 reaches the upper surface of the diaphragm 221a of the first MEMS chip 221 via
the first sound path P1, and the sound wave input from the second sound hole 213 reaches the
lower surface of the diaphragm 221a of the first MEMS chip 221 via the second sound path P2.
For this reason, the diaphragm 221a of the first MEMS chip 221 vibrates according to the
difference between the sound pressure applied to the upper surface and the sound pressure
applied to the lower surface. Therefore, the signal output from the first MEMS chip 221
provides the same function as the near-field microphone NFM of this embodiment.
[0079]
In the embodiment described above, the sound signal processing unit (ICA processing unit) 13 of
the sound separation device 15 is configured to optimize the separation matrix W regardless of
whether the lens drive unit 112 is driven or not. However, when the separation matrix W is
always optimized, the optimization is performed even while the lens drive unit, which is a main
noise source, is not operating, and the separation matrix W may then converge to an
inappropriate value or diverge. To prevent this, it is preferable to optimize the separation
matrix W when the lens drive unit 112 is driven (when mechanical noise is generated) and not to
optimize the separation matrix W when the lens drive unit 112 is not driven (when mechanical
noise is not generated).
[0080]
FIG. 10 is a diagram for explaining a modification of the present embodiment, and is a block
diagram of a sound separation device configured to switch whether or not to optimize the
separation matrix depending on whether or not the lens drive unit is driven. As shown in FIG.
10, the sound separation device 17 of the modification has a configuration in which an
optimization on/off unit 134 is added to the ICA processing unit 133 of the sound separation
device 15 of the present embodiment.
[0081]
The optimization on/off unit 134 is electrically connected to the control unit 18 of the camera
unit 1. The control unit 18 also controls the lens drive unit 112, and therefore knows whether
or not the lens drive unit 112 is being driven. When information that the lens drive unit 112
is driven is input from the control unit 18 to the optimization on/off unit 134, the ICA
processing unit 133 optimizes the separation matrix W and separates and extracts the sound
signals, as in the present embodiment. On the other hand, when information that the lens drive
unit 112 is not driven is input from the control unit 18 to the optimization on/off unit 134,
the ICA processing unit 133 does not optimize the separation matrix W and holds its current
value. This allows the ICA processing to operate stably.
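The switching behavior of the optimization on/off unit can be sketched as a small wrapper around the separation step. The class name, method names, and the optimize callback below are hypothetical. The callback stands in for whatever update the ICA processing unit applies, and the boolean flag stands in for the drive information supplied by the control unit.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]
Frame = Sequence[Tuple[float, float]]  # block of (Y1, Y2) samples


class GatedSeparator:
    """Separates signals with W, updating W only while the lens is being driven."""

    def __init__(self, W: Matrix, optimize: Callable[[Matrix, Frame], Matrix]):
        self.W = W
        self.optimize = optimize  # hypothetical update rule for the separation matrix

    def process(self, frame: Frame, lens_driving: bool) -> List[Tuple[float, float]]:
        if lens_driving:
            # Mechanical noise is present: refine the separation matrix.
            self.W = self.optimize(self.W, frame)
        # Otherwise W is held at its last value; separation is applied either way.
        W = self.W
        return [(W[0][0] * y1 + W[0][1] * y2,
                 W[1][0] * y1 + W[1][1] * y2) for y1, y2 in frame]
```

Holding W rather than resetting it means that the most recently learned separation of the mechanical noise stays in effect between lens movements.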
[0082]
In such a sound separation device 17, among the sounds from the proximity sound source, the
mechanical sound generated from the camera unit 1 is effectively separated out, while the voice
of the operator is not separated out and is obtained as the target sound together with the
sound from the distant sound source. In moving image shooting with the camera unit 1, there is
also a demand not to remove the voice of the operator, and this modification is a configuration
suited to such a demand.
[0083]
In the embodiment described above, the microphones NFM and FFM included in the camera unit
1 are MEMS microphones formed using semiconductor manufacturing technology. However, the
present invention is not limited to this configuration. For example, the microphones may be
electret condenser microphones (ECM) using an electret film or the like. The microphones NFM
and FFM included in the camera unit 1 are also not limited to so-called condenser microphones,
and may be, for example, electrodynamic (dynamic), electromagnetic (magnetic), or piezoelectric
microphones.
[0084]
Further, in the embodiment described above, the near-field microphone NFM is configured as a
differential microphone having only one diaphragm 221a. However, the present invention is not
limited to this configuration. That is, the near-field microphone may be, for example, a
differential microphone of a type having two diaphragms and outputting a difference between
signals output based on the respective diaphragms as a sound signal.
[0085]
In the embodiment described above, the near-field microphone NFM is configured as a first-order
gradient differential microphone. However, the present invention is not limited to this
configuration. That is, the near-field microphone may be, for example, a differential microphone
having secondary or tertiary gradient characteristics.
[0086]
In the embodiment described above, the far-field microphone FFM is a nondirectional
microphone. However, the present invention is not limited to this configuration. The far-field
microphone may be a directional microphone such as, for example, a unidirectional microphone.
For example, when the direction of the sound to be collected is limited to a specific direction at
the time of moving image shooting by the camera unit 1, such a configuration is also effective.
[0087]
In the above, the case where the sound separation device of the present invention is applied
to a camera unit has been described as an example. However, the sound separation device
according to the present invention can be widely applied wherever it is desired to separate
sound from a proximity sound source and sound from a distant sound source. It can also be
applied to electronic devices other than camera units, for example to mobile phones for
background noise separation. When applied to a mobile phone, the near-field microphone NFM is
installed to capture the voice of the speaker and the far-field microphone FFM is installed to
capture sound including the background noise, which makes it possible to separate the
speaker's voice from the background noise.
[0088]
The present invention is suitable for a camera unit capable of moving image shooting.
[0089]
Reference Signs List
1 camera unit
11 imaging unit
14 storage unit
13 sound signal processing unit
15 sound separation device
111 lens unit
112 lens drive unit
221a diaphragm
NFM near-field microphone (second microphone)
FFM far-field microphone (first microphone)