JP2008288910
Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2008288910
An object of the present invention is to specify a noise generation period by using directivity at the time of sound collection in a sound collection device, and to reduce noise. A directivity generation unit generates directivity signals having directivity with respect to specific directions of the surroundings, based on audio signals from an audio input unit supplied through an amplifier. When the noise detection unit 360 detects noise from the directivity signal supplied from the directivity generation unit 330, the noise detection unit 360 generates a signal indicating a noise removal period according to the noise generation period. The noise reduction processing unit 370 removes noise contained in the directivity signal supplied from the directivity generation unit 330 during the noise removal period indicated by the noise detection unit 360, and does not remove noise outside that period. [Selected figure] Figure 1
Sound pickup device
[0001]
The present invention relates to a sound collection device, and more particularly to a sound
collection device that reduces noise of an audio signal.
[0002]
An imaging device such as a video camera often incorporates a microphone in order to pick up a
sound source around the imaging device in accordance with imaging of a subject.
Although the purpose of this microphone is mainly to collect sound that matches the angle of view of the lens, in recent years three-dimensional surround sound collection covering the entire surroundings has also been performed.
[0003]
In such an imaging apparatus, in addition to the sound collection mechanism, a disc drive mechanism for a recording medium such as a DVD (Digital Versatile Disc) or an HDD (Hard Disc Drive) and a lens drive mechanism for auto focus, power zoom, optical camera shake correction, and the like are also mounted. In addition, an opening/closing mechanism of a liquid crystal monitor operated by the user, various operation switches, and the like are also included. The operation sound of these mechanisms is incident on the above-mentioned sound collection mechanism as noise and becomes a factor that lowers the S/N ratio (Signal to Noise ratio) of the target sound collection. Although these noises should normally be suppressed on the generation side, doing so has become more and more difficult with the miniaturization and higher functionality of the imaging device.
[0004]
In addition, these noises are transmitted simultaneously as vibration through the cabinet and as acoustic noise through the air, which makes the noise transmission paths to the microphones complicated. Therefore, structures have conventionally been adopted in which the microphone is floated from the cabinet by an insulator such as a rubber damper, or suspended in mid-air by rubber wires or the like, so that the vibration transmitted from the cabinet is absorbed and the noise is not transmitted. Even so, a sufficient noise reduction effect has not been obtained with such passive methods alone.
[0005]
Furthermore, such noise is generally generated in a short period, for example an instant of several milliseconds to several hundreds of milliseconds, as typified by a click sound, and there were many cases in which noise reduction methods such as an adaptive filter could not perform the reduction processing in time.
[0006]
On the other hand, there has been proposed a technique for reducing noise by utilizing a
masking effect in human hearing.
For example, a device has been proposed that reduces noise by detecting the noise generation timing from the outside and switching the audio signal accordingly (for example, see Patent Document 1). JP 2005-303681 A (FIG. 1)
[0007]
The above-mentioned prior art removes shock noise, touch noise, click noise and the like so that they are not recognized by human hearing, and is effective when the noise generation period can be specified.
[0008]
However, in the prior art, when noise other than the expected noise is mixed in the input signal, or when the noise timing cannot be obtained from the drive device, the noise generation period cannot be identified and the noise cannot be removed.
[0009]
The present invention has been made in view of such a situation, and it is an object of the present
invention to specify a noise generation period by using directivity at the time of sound collection
and reduce noise.
[0010]
The present invention has been made to solve the above problems. A first aspect of the present invention is a sound pickup device comprising: audio input means for inputting a plurality of surrounding audio signals; directivity generating means for generating, based on the plurality of audio signals, a first directivity signal having directivity in a first direction and a second directivity signal having directivity in a second direction; noise removing means for removing a noise band from the first directivity signal; noise recognizing means for recognizing noise contained in the second directivity signal; noise removal period generating means for generating a signal indicating a noise removal period according to the recognized noise generation period; and selecting means for selecting the output of the noise removing means when the noise removal period is indicated and selecting the first directivity signal when the noise removal period is not indicated.
This brings about the effect of selecting the presence or absence of noise removal from the first directivity signal according to the generation period of the noise contained in the second directivity signal.
[0011]
In the first aspect, the audio input means may comprise a plurality of bi-directional microphones and a nondirectional microphone.
Alternatively, the audio input means may comprise a plurality of nondirectional microphones.
Also, the audio input means may comprise a plurality of unidirectional microphones and one bi-directional microphone. Based on these, the directivity generation means generates the first and second directivity signals.
[0012]
In the first aspect, the device may further comprise rotation coefficient generation means for generating a rotation coefficient indicating a predetermined direction, and the directivity generation means may generate the first directivity signal if the direction indicated by the rotation coefficient is the first direction, and generate the second directivity signal if the direction indicated by the rotation coefficient is the second direction. Thereby, the first and second directivity signals are generated on the basis of the rotation coefficient.
[0013]
Further, in the first aspect, the noise recognition means may perform the above noise recognition by evaluating the output of a convolution operation between the second directivity signal and a wavelet signal whose average value over a predetermined period is zero and whose waveform is approximated to the noise. This brings about the effect of selecting the presence or absence of noise removal according to the noise recognition result in the time domain.
[0014]
In the first aspect, the noise recognition means may perform the noise recognition using, as an evaluation value, the correlation between a pattern signal approximated to the frequency spectrum of the noise and the Fourier-transformed second directivity signal. This brings about the effect of selecting the presence or absence of noise removal according to the noise recognition result in the frequency domain.
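As a rough, non-authoritative sketch (not part of the patent text), the frequency-domain recognition described above might look as follows in Python; the frame length, window, threshold value and the stored noise spectrum pattern are assumptions introduced only for illustration.

import numpy as np

def recognize_noise_freq(frame, noise_pattern_spectrum, threshold=0.8):
    """Correlate the magnitude spectrum of a framed second directivity signal
    with a stored noise spectrum pattern and report whether noise is recognized.
    noise_pattern_spectrum is assumed to have length len(frame)//2 + 1."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    # Normalized correlation used as the evaluation value
    denom = np.linalg.norm(spectrum) * np.linalg.norm(noise_pattern_spectrum) + 1e-12
    corr = np.dot(spectrum, noise_pattern_spectrum) / denom
    return corr >= threshold, corr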
[0015]
In the first aspect, the noise removal means can be realized by a filter that removes a noise band.
In this case, the noise removal means may adaptively change the removal band and the pass
band of the filter based on the frequency of the noise recognized by the noise recognition means.
[0016]
In the first aspect, the selection means may be realized by a crossfade switch. This brings about
the effect of crossfading with a predetermined time constant when switching the presence or
absence of noise removal.
[0017]
Further, a second aspect of the present invention is a sound pickup device comprising: audio input means for inputting a plurality of surrounding audio signals; directivity generation means for generating, based on the plurality of audio signals, a first directivity signal having directivity in a first direction and a second directivity signal having directivity in a second direction; noise removal means for removing a noise band from the first directivity signal; signal interpolation means for performing interpolation on the signal from which the noise band has been removed; noise recognition means for recognizing noise contained in the second directivity signal; noise removal period generation means for generating a signal indicating a noise removal period according to the recognized noise generation period; and selection means for selecting the output of the signal interpolation means when the noise removal period is indicated and selecting the first directivity signal when the noise removal period is not indicated. Thereby, the presence or absence of noise removal from the first directivity signal is selected according to the generation period of the noise contained in the second directivity signal, and the first directivity signal from which the noise has been removed is interpolated, which brings about the effect of improving the masking effect on hearing.
[0018]
Further, in the second aspect, the signal interpolation means may further comprise: interpolation source signal generation means for generating an interpolation source signal for the interpolation; non-interpolation-band removal means for removing bands other than the noise band from the interpolation source signal; level envelope generation means for generating a level envelope of the first directivity signal; level coefficient generation means for generating a level coefficient for the interpolation based on the level envelope; level modulation means for modulating the output of the non-interpolation-band removal means based on the level coefficient; and combining means for combining the output of the noise removal means with the output of the level modulation means and outputting the result to the selection means. In this case, the level modulation means may further modulate the output of the non-interpolation-band removal means based on the level that is masked in human hearing. Further, the interpolation source signal generation means may generate, as the interpolation source signal, a periodic signal consisting of one or more components with a predetermined waveform and a predetermined period, a white noise signal having a uniform level over the audio band, or a signal in which the periodic signal and the white noise signal are mixed according to a predetermined mixing ratio.
[0019]
Further, in the second aspect, the signal interpolation means may further comprise: interpolation source signal generation means for generating an interpolation source signal for the interpolation; non-interpolation-band removal means for removing bands other than the noise band from the interpolation source signal; spectrum envelope generation means for generating a frequency spectrum envelope of the output of the noise removal means; spectrum coefficient generation means for generating spectrum coefficients for the interpolation based on the spectrum envelope; spectrum modulation means for modulating the output of the non-interpolation-band removal means based on the spectrum coefficients; level envelope generation means for generating a level envelope of the first directivity signal; level coefficient generation means for generating a level coefficient for the interpolation based on the level envelope; level modulation means for modulating the output of the spectrum modulation means based on the level coefficient; and combining means for combining the output of the noise removal means with the output of the level modulation means and outputting the result to the selection means. In this case, the noise removal means and the non-interpolation-band removal means may be realized by filters that adaptively change the removal band and the pass band based on the frequency of the noise recognized by the noise recognition means.
[0020]
According to the present invention, it is possible to obtain an excellent effect that the noise
generation period can be specified by utilizing directivity at the time of sound collection, and
noise can be reduced.
[0021]
Next, embodiments of the present invention will be described in detail with reference to the
drawings.
[0022]
FIG. 1 is a diagram showing a configuration example of the sound collection device 300
according to the embodiment of the present invention.
The sound collection device 300 includes an audio input unit 310, an amplifier 320, a directivity generation unit 330, a timing generation unit 340, a rotation coefficient generation unit 350, a noise detection unit 360, a noise reduction processing unit 370, an encoding processing unit 380, and a recording and reproducing unit 390.
[0023]
The audio input unit 310 acquires and inputs surrounding audio signals, and is realized by, for example, a plurality of microphones. The amplifier 320 amplifies the audio signals from the audio input unit 310 and supplies the amplified signals to the directivity generation unit 330.
[0024]
The directivity generation unit 330 generates a directivity signal having directivity with respect
to a specific direction of the surroundings based on the audio signal from the audio input unit
310 supplied via the amplifier 320. The direction having directivity is in accordance with the
rotation coefficient supplied from the rotation coefficient generation unit 350. As a result, a
directivity signal having directivity in each of a plurality of directions can be obtained. Among
them, the directivity signal used for noise recognition is supplied to the noise detection unit 360
through the signal line 338, and the directivity signals used as the original audio signals other than the above are supplied to the noise reduction processing unit 370 via the signal line 339.
[0025]
The timing generation unit 340 generates operation timings in the directivity generation unit
330, the rotation coefficient generation unit 350, the noise detection unit 360, and the noise
reduction processing unit 370. The timing generation unit 340 divides one audio sampling
period “1 / Fs” into m and generates a timing signal for each sampling period “1 / (m · Fs)”
for up-sampling processing to be described later. That is, the sampling frequency is “m · Fs”.
The timing signal generated by the timing generation unit 340 is supplied to each unit via the
signal line 349.
[0026]
The rotation coefficient generation unit 350 generates a rotation coefficient indicating the
direction of the directivity of the directivity signal generated by the directivity generation unit
330. The rotation coefficient generated by the rotation coefficient generation unit 350 is
supplied to the directivity generation unit 330 via the signal line 359.
[0027]
The noise detection unit 360 detects noise from the directivity signal supplied from the
directivity generation unit 330 via the signal line 338. When noise is detected, the noise
detection unit 360 generates a signal indicating a noise removal period according to the noise
generation period. The noise removal period generated by the noise detection unit 360 is
supplied to the noise reduction processing unit 370 via the signal line 369.
[0028]
The noise reduction processing unit 370 removes noise contained in the directivity signal
supplied from the directivity generation unit 330 in accordance with the noise removal period
supplied from the noise detection unit 360. The directivity signal subjected to the noise removal
processing by the noise reduction processing unit 370 is supplied to the encoding processing
unit 380 via the signal lines 371 to 376. The signal lines 371 to 376 correspond to respective
5.1 channel surround signals described later.
[0029]
The encoding processing unit 380 performs encoding processing on each signal supplied from
the noise reduction processing unit 370. The recording stream signal encoded by the encoding
processing unit 380 is supplied to the recording and reproducing unit 390 through the signal
line 389.
[0030]
The recording and reproducing unit 390 records or reproduces the recording stream signal
supplied from the encoding processing unit 380 on a recording medium. The recording /
reproducing unit 390 may record the video signal together with the audio signal input from the
audio input unit 310, but the description will be omitted in the embodiment of the present
invention.
[0031]
FIG. 2 is a diagram showing the arrangement and directivity of a 5.1-channel surround sound source. The 5.1 channels of this surround sound source consist of five directivity patterns with reference to the sound collection device 300, namely a directivity pattern 591 in the front direction (FRT: Front), a directivity pattern 592 in the front left direction (FL: Front Left), a directivity pattern 593 in the front right direction (FR: Front Right), a directivity pattern 594 in the rear left direction (RL: Rear Left) and a directivity pattern 595 in the rear right direction (RR: Rear Right), plus a 0.1 channel in the low frequency band (LF: Low Frequency) with an omnidirectional pattern 596. The 0.1 channel in the low frequency band is for obtaining a sense of weight in the bass of about 100 Hz or less.
[0032]
A surround sound field can be obtained by collecting and recording such a surround sound source and reproducing it with an existing surround-compatible system. Although the sound collection and sound source creation for such a surround sound field are left to the intention and know-how of the producer, they are often done with the ITU (International Telecommunication Union)-R standard for 5.1-channel surround sound field reproduction in mind. In this standard, the recommended reproduction speaker arrangement places the front direction (FRT) at 0 degrees, the front left direction (FL) at 30 degrees, the front right direction (FR) at 30 degrees, the rear left direction (RL) at 100 to 120 degrees, and the rear right direction (RR) at 100 to 120 degrees.
[0033]
FIG. 3 is a diagram showing an example of vector quantity extraction according to the
embodiment of the present invention. In the embodiment of the present invention, a vector as a
sound source and a vector for noise detection are set in each direction centered on the sound
collection device.
[0034]
The FRT vector 631 is the vector for the front direction, the FL vector 632 is the vector for the front left direction, the FR vector 633 is the vector for the front right direction, the RL vector 634 is the vector for the rear left direction, and the RR vector is the vector for the rear right direction. In addition, since the 0.1 channel in the low frequency band has a long wavelength and is considered to have only a magnitude with little directivity, it is treated as a scalar quantity.
[0035]
The noise vectors A and F are vectors assuming noise generated from the lens drive mechanism, the noise vector B is a vector assuming noise generated from the grip portion (gripping portion), the noise vector C is a vector assuming noise generated from the disk mechanism, the noise vector D is a vector assuming noise generated from the LCD (Liquid Crystal Display) monitor, and the noise vector E is a vector assuming noise generated from the various operation switches.
[0036]
By performing vector quantity extraction in which the direction and the magnitude (sound collection level) are matched to the noise incident from each direction as described above, it becomes possible to recognize only the noise easily.
When the sound source direction coincides with the noise direction, the vector quantity of that sound source direction is also used for noise recognition. The collected sound image at this time is indicated by the solid line 620 surrounding each vector in FIG. 3.
[0037]
FIG. 4 is a view showing an example of polar patterns by the sound collection device according to
the embodiment of the present invention. The polar pattern is a polar coordinate representation
of the sensitivity level from the entire circumferential direction of each microphone in the sound
collection device. In this figure, the front direction is 0 degrees, and the sensitivity level in the
radial direction is relative, and the center is a sensitivity zero point.
[0038]
FIG. 4(a) shows an omnidirectional (nondirectional) polar pattern, which has the same sensitivity level in all directions. FIG. 4(b) shows the polar pattern of the first directivity (unidirectional), which is used when directivity is given in one direction; in this example, the directivity is in the 0-degree direction. FIG. 4(c) shows the polar pattern of the second directivity, which has stronger direction selectivity than the first directivity. FIGS. 4(d) and 4(e) are referred to as bi-directional; they have maximum sensitivity in a certain direction and in the opposite direction, and zero sensitivity at 90 degrees to them. The patterns of FIGS. 4(d) and 4(e) are orthogonal to each other. Further, the positive (+) lobe and the negative (-) lobe have opposite characteristics, the signal phases of the two being shifted by 180 degrees. These directional characteristics can be generated by a single microphone or by a combining operation on a small number of microphones.
[0039]
In the embodiment of the present invention, such microphones are mounted in the sound collection device as the audio input unit 310, either built in or externally attached. These microphones pick up voice and noise from a plurality of directions simultaneously.
[0040]
FIG. 5 is a diagram showing a first arrangement example of the microphones according to the
embodiment of the present invention. In this first arrangement example, a nondirectional
microphone 411 and bidirectional microphones 412 and 413 are arranged.
[0041]
The nondirectional microphone 411 is a microphone having no directivity. The bi-directional microphone 412 is a microphone having directivity in both the left and right directions as shown in FIG. 4(d), and is disposed on the front side relative to the nondirectional microphone 411. The bi-directional microphone 413 is a microphone having directivity in both the front and rear directions as shown in FIG. 4(e), and is disposed on the rear side relative to the nondirectional microphone 411. The positional relationship between the microphones is an example and is not limited to this; for example, the microphones may be arranged three-dimensionally.
[0042]
FIG. 6 is a diagram showing an example of synthesizing sound sources of microphones according
to the first arrangement example according to the embodiment of the present invention. The
sound source combining mechanism is included in the directivity generating unit 330, and
includes level changing units 422 and 423, and an addition combining unit 426.
[0043]
The level variable unit 422 multiplies the lateral bi-directional signal supplied from the bi-directional microphone 412 by Ks. The level variable unit 423 multiplies the longitudinal bi-directional signal supplied from the bi-directional microphone 413 by Kc. Here, Ks and Kc are rotation coefficients determined by the pointing direction; the rotation coefficients will be described later.
[0044]
The addition synthesis unit 426 combines, by averaging, three signals: the nondirectional signal supplied from the nondirectional microphone 411, the signal supplied from the level variable unit 422, and the signal supplied from the level variable unit 423. The sound source synthesized by the addition synthesis unit 426 is a sound source having an arbitrary directivity.
[0045]
Here, if the lateral bi-directional signal (FIG. 4(d)) is expressed as a sine function sin(t) of time t and the longitudinal bi-directional signal (FIG. 4(e)) as a cosine function cos(t) of time t, the sound source X synthesized by the addition synthesis unit 426 can be expressed by the following equation, in which "1" corresponds to the nondirectional signal (FIG. 4(a)).
X = (1 + Ks·sin(t) + Kc·cos(t)) / 2
[0046]
FIG. 7 is a diagram showing a rotation coefficient in the embodiment of the present invention.
[0047]
The rotation coefficient Ks 611 draws a sine curve according to the directivity rotation angle φ,
and the rotation coefficient Kc 612 draws a cosine curve according to the directivity rotation
angle φ.
That is, the rotation coefficients Ks 611 and Kc 612 are real numbers in the range from −1 to 1 according to the directivity rotation angle φ.
[0048]
When the rotation angle φ is 0 degrees, Ks = 0 and Kc = 1, so only the bi-directional signal from the bi-directional microphone 413 is input to the addition synthesis unit 426. When the rotation angle φ is 45 degrees, Ks and Kc are both the reciprocal of the square root of 2 (≈ 0.7), so the bi-directional signals from the bi-directional microphones 412 and 413 have the same level; they are averaged in the addition synthesis unit 426, and the result is further averaged with the nondirectional signal. FIG. 8 shows this situation.
[0049]
That is, when the rotation angle φ is 45 degrees, the two bi-directional signals are added and averaged at the same level, so that the anti-phase portions (broken lines) cancel and the in-phase portions (solid lines) remain, and a signal with the directivity 511 shown in FIG. 8 is obtained. The signal of directivity 511 and the nondirectional signal 512 are then added and averaged, so that the anti-phase portions (broken lines) cancel, the in-phase portions (solid lines) remain, and a signal with the 45-degree directivity 513 of FIG. 8 at the rotation angle φ is obtained.
[0050]
Similarly, when the rotation angle φ is 90 degrees, only the bi-directional signal from the bi-directional microphone 412 is input to the addition synthesis unit 426. When the rotation angle φ is 90 to 180 degrees, the bi-directional signal from the bi-directional microphone 413 is inverted in polarity and synthesized, because Kc becomes a negative coefficient. When the rotation angle φ is 180 to 270 degrees, the bi-directional signals from the bi-directional microphones 412 and 413 are inverted in polarity and synthesized, because Ks and Kc become negative coefficients. When the rotation angle φ is 270 to 0 degrees, the bi-directional signal from the bi-directional microphone 412 is inverted in polarity and synthesized, because Ks becomes a negative coefficient.
[0051]
Thus, by setting the rotation coefficients Ks and Kc, it is possible to generate a signal having directivity at any rotation angle φ. A surround sound source can also be generated by using signals generated in this manner. Then, by assigning these sound sources to the vectors shown in FIG. 3, the noise detection signals and the original audio signals can be distinguished.
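As an informal illustration only (assuming idealized microphone responses and the rotation coefficients Ks = sin φ and Kc = cos φ of FIG. 7), the averaging synthesis X = (1 + Ks·sin(t) + Kc·cos(t))/2 of the addition synthesis unit 426 could be sketched in Python as follows; the function and variable names are hypothetical, not the patent's.

import numpy as np

def rotation_coefficients(phi_deg):
    """Ks and Kc follow a sine/cosine curve of the directivity rotation angle (FIG. 7)."""
    phi = np.deg2rad(phi_deg)
    return np.sin(phi), np.cos(phi)

def synthesize_directivity(omni, bidir_lr, bidir_fb, phi_deg):
    """Averaging synthesis of the addition synthesis unit 426:
    omni     - nondirectional signal (microphone 411)
    bidir_lr - lateral bi-directional signal (microphone 412, FIG. 4(d))
    bidir_fb - longitudinal bi-directional signal (microphone 413, FIG. 4(e))"""
    ks, kc = rotation_coefficients(phi_deg)
    return (omni + ks * bidir_lr + kc * bidir_fb) / 2.0

# Idealized check: for a plane wave arriving from angle theta the three signals are
# 1, sin(theta) and cos(theta); steering to phi = 45 degrees then yields a cardioid
# response (1 + cos(theta - 45 deg)) / 2, corresponding to the directivity 513 of FIG. 8.
theta = np.deg2rad(np.arange(0, 360, 30))
response = synthesize_directivity(1.0, np.sin(theta), np.cos(theta), 45.0)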
[0052]
FIG. 9 is a diagram showing a second arrangement example of the microphones according to the embodiment of the present invention. In this second arrangement example, four nondirectional microphones 431 to 434 are arranged. The distance between the nondirectional microphones 431 to 434 is, for example, about 10 to 15 millimeters. The straight line connecting the nondirectional microphones 431 and 433 and the straight line connecting the nondirectional microphones 434 and 432 may be orthogonal to each other, but the mutual positional relationship is not limited to this.
[0053]
None of these nondirectional microphones 431 to 434 has directivity in a specific direction, but by combining them it is possible to generate a signal having directivity in any direction.
[0054]
FIG. 10 is a diagram showing an example of generation of directivity characteristics of the
microphone in the second arrangement example according to the embodiment of the present
invention.
[0055]
When the sound source of the nondirectional microphone 433 is subtracted from the sound source of the nondirectional microphone 431 and the frequency characteristics are adjusted, a bi-directional signal 506 as shown in FIG. 10A is generated.
Similarly, when the sound source of the nondirectional microphone 432 is subtracted from the sound source of the nondirectional microphone 434 and the frequency characteristics are adjusted, a bi-directional signal 507 as shown in FIG. 10B is generated.
Furthermore, a nondirectional signal is generated by adding and combining the sound sources of the nondirectional microphones 431 to 434 in an arbitrary combination.
[0056]
FIG. 11 is a diagram showing an example of synthesizing the sound sources of the microphones
in the second arrangement example according to the embodiment of the present invention. The
sound source combining mechanism is included in the directivity generating unit 330, and
includes an adding unit 441, subtracting units 442 and 443, level changing units 444 and 445,
and an addition combining unit 446.
[0057]
The adding unit 441 generates a nondirectional signal by averaging all the sound sources of the nondirectional microphones 431 to 434. The subtracting unit 442 subtracts the sound source of the nondirectional microphone 432 from the sound source of the nondirectional microphone 434 to generate the lateral bi-directional signal 507 of FIG. 10B. The subtracting unit 443 subtracts the sound source of the nondirectional microphone 433 from the sound source of the nondirectional microphone 431 to generate the vertical bi-directional signal 506 of FIG. 10A.
[0058]
The level variable unit 444 multiplies the lateral bi-directional signal 507 of FIG. 10B by Ks. The level variable unit 445 multiplies the vertical bi-directional signal 506 of FIG. 10A by Kc. The rotation coefficients Kc and Ks are the same as those described with reference to FIG. 7.
[0059]
The addition synthesis unit 446 combines, by addition averaging, three signals: the nondirectional signal supplied from the adding unit 441, the signal supplied from the level variable unit 444, and the signal supplied from the level variable unit 445. The sound source synthesized by the addition synthesis unit 446 is a sound source having an arbitrary directivity.
[0060]
In this synthesis example for the second arrangement, the output of the adding unit 441 is equivalent to the output of the nondirectional microphone 411 in the first arrangement example, the output of the subtracting unit 442 is equivalent to the output of the bi-directional microphone 412 in the first arrangement example, and the output of the subtracting unit 443 is equivalent to the output of the bi-directional microphone 413 in the first arrangement example. Therefore, the output of the addition synthesis unit 446 is equivalent to the output of the addition synthesis unit 426.
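A minimal sketch, under the same idealized assumptions as above, of how the adder 441 and the subtractors 442 and 443 derive the three component signals from the four omnidirectional microphones; the frequency-characteristic equalization of the difference signals mentioned in [0055] is omitted, and the function name is hypothetical.

import numpy as np

def derive_component_signals(m431, m432, m433, m434):
    """m431..m434 are sample arrays from the nondirectional microphones 431 to 434."""
    omni = (m431 + m432 + m433 + m434) / 4.0   # adder 441: average of all sources
    bidir_lateral = m434 - m432                # subtractor 442: lateral figure-eight (507)
    bidir_vertical = m431 - m433               # subtractor 443: vertical figure-eight (506)
    # These three signals can then be combined exactly as in the first arrangement example.
    return omni, bidir_lateral, bidir_vertical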
[0061]
FIG. 12 is a diagram showing a third arrangement example of the microphones according to the embodiment of the present invention. In this third arrangement example, three microphones are arranged: the unidirectional microphones 452 and 453 and the bi-directional microphone 451. The positional relationship between the microphones is an example and is not limited to this; for example, the microphones may be arranged three-dimensionally.
[0062]
The bi-directional microphone 451 is a microphone having directivity in both the left and right directions as shown in FIG. 4(d). The unidirectional microphone 452 is a microphone having directivity in the front direction as shown in FIG. 4(b). The unidirectional microphone 453 is a microphone having directivity in the rear direction, opposite to FIG. 4(b). The distance between these microphones is, for example, about 10 to 15 millimeters.
[0063]
FIG. 13 is a diagram showing an example of synthesizing the sound sources of the microphones
in the third arrangement example according to the embodiment of the present invention. The
sound source combining mechanism is included in the directivity generating unit 330, and
includes level variable units 461 to 463 and an addition combining unit 466.
[0064]
The level variable unit 461 multiplies the sound source of the bidirectional microphone 451 by
Ks. The level variable unit 462 multiplies the sound source of the unidirectional microphone 452
by (1 + Kc). The level variable unit 463 multiplies the sound source of the unidirectional
microphone 453 by (1 − Kc). The rotation coefficients Kc and Ks are the same as those described with reference to FIG. 7.
[0065]
The addition synthesis unit 466 combines the three signals supplied from the level variable units 461 to 463 by addition averaging. The sound source synthesized by the addition synthesis unit 466 is a sound source having an arbitrary directivity.
[0066]
Here, if the longitudinal bi-directional signal is represented as a cosine function cos(t) of time t, the sound source of the unidirectional microphone 452 becomes (1 + cos(t)) and the sound source of the unidirectional microphone 453 becomes (1 − cos(t)). Then, if the lateral bi-directional signal is represented as a sine function sin(t) of time t, the sound source Y synthesized by the addition synthesis unit 466 can be expressed by the following equation.
Y = ((1 + Kc)·(1 + cos(t))/2 + (1 − Kc)·(1 − cos(t))/2 + Ks·sin(t)) / 2
[0067]
FIG. 14 is a diagram showing an example of generation of directivity characteristics of
microphones in the third arrangement example according to the embodiment of the present
invention.
[0068]
FIG. 14A shows the directivity 521 of the sound source of the unidirectional microphone 452, the directivity 522 of the sound source of the unidirectional microphone 453, and the directivity 523 of the sound source of the bi-directional microphone 451.
[0069]
In the equation for the sound source Y synthesized by the addition synthesis unit 466, when the rotation angle φ is set to 0 degrees, Ks = 0 and Kc = 1, so the sound source of the unidirectional microphone 452 is output.
When the directivity rotation angle φ is set to 45 degrees, Ks and Kc both become the reciprocal of the square root of 2 (≈ 0.7), so the addition averaging in the addition synthesis unit 466 produces a unidirectional signal in the 45-degree direction, as shown by the directivity 524 in FIG. 14(b).
[0070]
Further, when the directivity rotation angle φ is set to 90 degrees, Ks = 1 and Kc = 0, so the level variable units 462 and 463 output nondirectional signals, and averaging these with the bi-directional signal of the bi-directional microphone 451 in the addition synthesis unit 466 produces a unidirectional signal in the 90-degree direction.
[0071]
Similarly, Kc becomes a negative coefficient in the range of directivity rotation angle φ from 90 to 180 degrees, Ks and Kc both become negative coefficients in the range from 180 to 270 degrees, and Ks becomes a negative coefficient in the range from 270 to 0 degrees.
For example, when the directivity rotation angle φ is set to 315 degrees, Kc is the reciprocal of the square root of 2 (≈ 0.7) and Ks is the negative of that value (≈ −0.7).
As a result, addition averaging is performed in the addition synthesis unit 466, and a unidirectional signal is generated in the 315-degree direction, as shown by the directivity 525 in FIG. 14(b).
[0072]
In the first to third arrangement examples described above, the method of obtaining a unidirectional signal as shown in FIG. 4(b) has been described, but it is also possible to generate a signal with the second directivity as shown in FIG. 4(c). In this case, the sound source Z to be synthesized can be expressed by the following equation, in which "1" corresponds to the nondirectional signal (FIG. 4(a)), sin(t) corresponds to the lateral bi-directional signal (FIG. 4(d)), and cos(t) corresponds to the longitudinal bi-directional signal (FIG. 4(e)).
Z = ((1 + Ks·sin(t) + Kc·cos(t)) · (Ks·sin(t) + Kc·cos(t))) / 2
[0073]
Since the directivity can be narrowed further with this second directivity signal, the selectivity of each directivity signal used for the noise detection described later can be improved.
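For illustration only, the following sketch compares the first directivity pattern X with the second directivity pattern Z for an idealized plane wave from angle theta; it merely evaluates the two equations above and is not part of the patent.

import numpy as np

def pattern_x(theta, phi_deg):
    ks, kc = np.sin(np.deg2rad(phi_deg)), np.cos(np.deg2rad(phi_deg))
    b = ks * np.sin(theta) + kc * np.cos(theta)      # bi-directional component
    return (1 + b) / 2.0                             # X for an omni level of 1

def pattern_z(theta, phi_deg):
    ks, kc = np.sin(np.deg2rad(phi_deg)), np.cos(np.deg2rad(phi_deg))
    b = ks * np.sin(theta) + kc * np.cos(theta)
    return (1 + b) * b / 2.0                         # Z = ((1 + b) * b) / 2

theta = np.deg2rad(np.arange(0, 360, 5))
# The Z pattern falls off faster away from the steering direction (here 0 degrees),
# which is why it improves the selectivity of the noise-detection directivity signals.
x_resp, z_resp = pattern_x(theta, 0.0), pattern_z(theta, 0.0)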
[0074]
Note that the first to third arrangement examples described above are examples for explanation, and the microphone arrangement can be changed within the scope of the present invention as long as the microphones are relatively close to each other.
For example, they need not be arranged on a straight line or at equal intervals, and the arrangement example of FIG. 9 can be similarly configured with three microphones.
[0075]
FIG. 15 is a diagram showing an example of the directional rotation angle φ in the embodiment
of the present invention. The directional rotation angle φ (651) represents an angle formed by
the directivity 650 rotating clockwise with the front direction as 0 degree.
[0076]
According to the sound source combining mechanisms described above, it is possible to synthesize a plurality of directivity signals at arbitrary rotation angles φ over the entire circumference. However, if these directivity signals are handled individually, the number of channels to be handled increases and the processing may become large-scale or complicated. Therefore, in the embodiment of the present invention, the directivity signals are treated as a directional stream signal of a single channel or a small number of channels.
[0077]
FIG. 16 is a diagram showing an example of the contents of a directional stream signal in the embodiment of the present invention. In this figure, the horizontal axis represents direction channels obtained by, for example, dividing the entire circumference into 30-degree steps. In this example, 12 channels are shown, from the D_1 channel with a rotation angle φ of 0 degrees, the D_2 channel with a rotation angle φ of 30 degrees and the D_3 channel with a rotation angle φ of 90 degrees, up to the D_c channel with a rotation angle φ of 330 degrees.
[0078]
The vertical axis represents the audio sampling period. Assuming that the sampling frequency is Fs, the audio sampling period is "1/Fs". In the audio sampling period Ts_0, directivity signals sampled in order from the D_1 channel are arranged sequentially as S(01), S(02), S(03), and so on. Likewise, in the audio sampling period Ts_1, directivity signals sampled in order from the D_1 channel are arranged sequentially as S(11), S(12), S(13), and so on.
[0079]
The signals sampled in this way are scanned sequentially to generate one directional stream signal as shown in FIG. 17. The directional stream signal contains the levels of the vector components along both the time axis and the direction. That is, the directivity pattern described in the above arrangement examples can be regarded as a collection of vector quantities whose magnitude is strongest in the directivity direction, and by moving the principal axis direction along the rotation angle φ, a vector quantity corresponding to the sound collection level is obtained for each principal axis direction in every audio sampling period.
[0080]
As shown in FIG. 17, when m (m is an integer) direction channels are sampled in one audio sampling period "1/Fs", the required sampling period of the directional stream signal is "1/(m·Fs)". For example, in the example of FIG. 16, m = 12, so the sampling period is "1/(12·Fs)".
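The interleaving can be pictured with the following sketch (an illustrative assumption, not the patent's implementation): each row holds the m direction-channel samples of one audio sampling period, and scanning the rows produces a single stream with period 1/(m·Fs).

import numpy as np

def build_directional_stream(direction_samples):
    """direction_samples: array of shape (num_audio_periods, m), one row per audio
    sampling period 1/Fs holding the samples of channels D_1 ... D_m in order.
    The flattened result is the directional stream with sampling period 1/(m*Fs)."""
    return np.asarray(direction_samples).reshape(-1)

def extract_direction_channel(stream, m, channel_index):
    """Recover one direction channel (0-based index) from the interleaved stream."""
    return stream[channel_index::m]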
[0081]
FIG. 18 is a diagram showing a configuration example of the directivity generation unit 330
according to the embodiment of the present invention. The directivity generation unit 330
includes an up-sampling unit 331, an interpolation filter 332, and a directivity generation unit
333.
[0082]
The up-sampling unit 331 is configured to up-sample the audio signal acquired from the audio
input unit 310 via the amplifier 320. That is, the audio signal sampled at the sampling frequency
Fs is resampled at the sampling frequency “m · Fs” in the upsampling unit 331.
[0083]
The interpolation filter 332 is for removing unnecessary wide band components (false signals)
generated by resampling in the upsampling unit 331. The interpolation filter 332 is realized by,
for example, a low pass filter (LPF).
[0084]
The directivity generation unit 333 generates directivity signals based on the audio signal with sampling period "1/(m·Fs)" supplied from the interpolation filter 332. The directivity generation unit 333 generates a directivity signal having directivity corresponding to the rotation coefficient supplied from the rotation coefficient generation unit 350 via the signal line 359. Here, the directivity signals for noise detection are supplied through the signal line 338 to the noise detection unit 360, and the other directivity signals, used as the original audio signals, are supplied through the signal line 339 to the noise reduction processing unit 370.
[0085]
FIG. 19 is a diagram showing one configuration example of the downsampling mechanism in the
embodiment of the present invention. The downsampling mechanism is provided inside the noise
reduction processing unit 370 and the noise detection unit 360, and includes a directivity
direction extraction unit 371, a decimation filter 372, and a downsampling unit 373.
[0086]
The directivity direction extraction unit 371 extracts each directivity signal at timing
synchronized with the sampling frequency “m · Fs” supplied by the signal line 349.
[0087]
The decimation filter 372 removes unnecessary aliasing components in the directivity signal
extracted by the directivity direction extraction unit 371, and is realized by, for example, a low
pass filter (LPF).
[0088]
The downsampling unit 373 restores the original sampling frequency Fs by multiplying the sampling rate of the directivity signal supplied from the decimation filter 372 by "1/m".
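As a non-authoritative sketch of the rate conversion, using scipy's polyphase resampler as a stand-in for the up-sampling unit 331 with the interpolation filter 332 and for the decimation filter 372 with the downsampling unit 373; m = 12 and Fs = 48 kHz are assumed example values.

import numpy as np
from scipy.signal import resample_poly

m = 12          # number of direction channels per audio sampling period (example)
fs = 48000      # base audio sampling frequency Fs (assumed value)

def upsample_to_directional_rate(audio, m):
    # Zero-stuffing plus low-pass interpolation filtering in one step
    return resample_poly(audio, up=m, down=1)

def downsample_to_audio_rate(directional, m):
    # Low-pass anti-aliasing followed by rate reduction by 1/m
    return resample_poly(directional, up=1, down=m)

audio = np.random.randn(fs)                        # one second of a stand-in microphone signal
hi_rate = upsample_to_directional_rate(audio, m)   # sampling frequency m*Fs
lo_rate = downsample_to_audio_rate(hi_rate, m)     # back to the original Fs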
[0089]
Accordingly, the noise reduction processing unit 370 can generate directivity signals corresponding to, for example, the 5.1-channel surround sound sources in the front, front left, front right, rear left and rear right directions and the low frequency band described with reference to FIG. 2.
Further, the noise detection unit 360 can generate, for example, directivity signals corresponding to the noise vectors described with reference to FIG. 3.
[0090]
FIG. 20 is a diagram showing a first configuration example of the noise reduction mechanism in
the embodiment of the present invention.
In this noise reduction mechanism, a directivity signal for noise detection is input through the signal line 118, and a directivity signal used as the original audio signal is input through the signal line 119; the noise reduction processing is performed on the latter directivity signal.
[0091]
The noise reduction mechanism includes an interpolation source signal generation unit 130, a noise removal filter 141, an inverse filter 142, a level envelope generation unit 171, a level coefficient generation unit 172, a level modulation unit 173, a synthesis unit 180, a selection switch 190, a noise recognition unit 210, and a noise removal period generation unit 220.
The noise recognition unit 210 and the noise removal period generation unit 220 are assumed to be included in the noise detection unit 360 and the other units in the noise reduction processing unit 370, but the present invention is not limited thereto.
[0092]
The noise removal filter 141 is a filter that removes a noise band from the directivity signal from
the voice input unit 310. The noise removal filter 141 is realized by, for example, a BEF (Band
Elimination Filter) or the like which removes one or more frequency bands. The output of the
noise removal filter 141 is supplied to one input of the synthesis unit 180 through a signal line
149.
[0093]
The interpolation source signal generation unit 130 generates an interpolation source signal for
interpolation. In the embodiment of the present invention, by combining the interpolation signal
with the directional signal whose noise band has been removed by the noise removal filter 141,
the masking effect on human hearing is improved. The interpolation source signal generation
unit 130 outputs an appropriate mixture of a tone signal and a random signal as an interpolation
source signal that is a source of the interpolation signal. The configuration of this interpolation
source signal generation unit 130 will be described later.
[0094]
The inverse filter 142 is a filter that removes everything other than the noise band from the interpolation source signal generated by the interpolation source signal generation unit 130. The inverse filter 142 has the inverse characteristic of the noise removal filter 141: the stop band of the noise removal filter 141 is the pass band of the inverse filter 142, and the pass band of the noise removal filter 141 is the stop band of the inverse filter 142. The output of the inverse filter 142 is supplied to the level modulation unit 173 via a signal line 148.
[0095]
The level envelope generation unit 171 continuously detects the level envelope of the directivity signal from the audio input unit 310. The output of the level envelope generation unit 171 is supplied to the level coefficient generation unit 172 via the signal line 177.
[0096]
The level coefficient generation unit 172 generates a level coefficient based on the level envelope
supplied from the level envelope generation unit 171. The output of the level coefficient
generation unit 172 is supplied to the level modulation unit 173 via the signal line 178.
[0097]
The level modulation unit 173 performs level modulation on the interpolation source signal
supplied from the inverse filter 142 in accordance with the level coefficient supplied from the
level coefficient generation unit 172, and outputs the result as an interpolation signal. The output
of the level modulation unit 173 is supplied to one input of the synthesis unit 180 through a
signal line 179.
[0098]
The combining unit 180 combines the directivity signal supplied from the noise removal filter
141 via the signal line 149 with the interpolation signal supplied from the level modulation unit
173 via the signal line 179. The combining unit 180 is realized by, for example, an adder. The
output of the combining unit 180 is supplied to the ON-side input terminal of the selection switch 190 via the signal line 189.
[0099]
The noise recognition unit 210 recognizes noise contained in the directivity signal from the audio input unit 310. The output of the noise recognition unit 210 is supplied to the noise removal period generation unit 220 via the signal line 219. When noise is recognized by the noise recognition unit 210, the noise removal period generation unit 220 generates a signal indicating a noise removal period according to the noise generation period. The output of the noise removal period generation unit 220 is supplied to the control terminal of the selection switch 190 via the signal line 369.
[0100]
According to the signal supplied from the noise removal period generation unit 220 through the signal line 369, the selection switch 190 selects the directivity signal supplied from the combining unit 180 through the signal line 189 during the noise removal period, and otherwise selects the directivity signal supplied from the audio input unit 310 via the signal line 119. The output of the selection switch 190 is supplied via the signal line 199 to the subsequent processing.
[0101]
Here, although an example of the noise reduction mechanism for one channel of the directivity
signal is shown, in practice, a number of noise reduction mechanisms corresponding to the
required number of channels are provided.
[0102]
FIG. 21 is a view for explaining the masking phenomenon used in the embodiment of the present invention.
Human hearing does not notice the existence of a small sound hidden behind a relatively loud sound, just as a human voice is difficult to hear in loud noise. Such a phenomenon is called a masking phenomenon and is known to depend on conditions such as frequency components, sound pressure levels, and durations. This auditory masking phenomenon is roughly divided into frequency masking and temporal masking, and temporal masking is further divided into simultaneous masking and non-simultaneous masking (successive masking). The masking phenomenon is applied, for example, to high-efficiency coding that compresses an audio signal to about 1/5 to 1/10 of the CD (Compact Disc) rate.
[0103]
In FIG. 21, the horizontal direction shows the passage of time and the vertical direction shows the absolute value of the signal level. As shown in FIG. 21A, when the signal A is input at a predetermined level and the signal B is then input at a predetermined level with a no-signal gap period in between, the level perceived by human hearing is shown schematically in FIG. 21B. That is, in human hearing, even after the signal A has ceased, the pattern of the signal A remains for a while with decreasing sensitivity, as in the region 91. Such a phenomenon is called forward masking, and even if another sound is present during this period, it cannot be heard. Also, just before the signal B is input, a similar decrease in sensitivity occurs, as in the region 92. This is called backward masking, and even if another sound is present during this period, it likewise cannot be heard.
[0104]
Usually, the amount of forward masking is larger than the amount of backward masking and, although it depends on the conditions, lasts up to several hundred milliseconds. Under certain conditions, a gap period of several milliseconds to several tens of milliseconds in FIG. 21(a) is not perceived audibly, and a phenomenon occurs in which the signal A and the signal B are heard as a continuous sound. Such phenomena are described in R. S. Plomp's research paper on gap detection (1963), in Miura's research paper (JAS Journal 94, November), and in an overview of auditory psychology (B. C. J. Moore, Kengo Ohori, Seishin Shobo, Chapter 4, "Temporal resolution of the auditory system"), and the following characteristics are known:
[0105]
(First characteristic): If the frequency bands of the signal A and the signal B are correlated, the gap length becomes large. Also, if continuity in frequency between the signal A and the signal B is maintained, the gap length becomes large.
(Second characteristic): The gap length for a band signal is larger than that for a single sine wave signal.
(Third characteristic): When the levels of the signal A and the signal B are the same, the gap length increases as the signal level decreases, and the gap length does not change once the signal level exceeds a certain level.
(Fourth characteristic): The gap length increases as the level of the signal B becomes smaller than that of the signal A.
(Fifth characteristic): The gap length increases as the center frequency contained in the signal decreases, and decreases as the center frequency increases.
[0106]
In the embodiment of the present invention, the level coefficient generation unit 172 generates the level coefficients for interpolation in consideration of these five characteristics. For example, the level coefficient generation unit 172 lengthens the gap period when the audio level is low (third characteristic), and further adjusts the gap period according to the temporal tendency of the audio level (fourth characteristic).
[0107]
FIG. 22 is a diagram showing a configuration example of the interpolation source signal
generation unit 130 in the embodiment of the present invention. The interpolation source signal
generation unit 130 includes a tone signal generation unit 131, a white noise signal generation
unit 132, and a mixing unit 133.
[0108]
The tone signal generation unit 131 generates a tone signal composed of a single or a plurality of sine waves or pulse waves with a predetermined period. This tone signal has one or more peaks at predetermined frequencies in its frequency characteristic.
[0109]
The white noise signal generation unit 132 generates a white noise signal (random signal) whose
level is uniform over the entire voice band. The white noise signal generation unit 132 is realized,
for example, by an M-sequence random number generator.
[0110]
The mixing unit 133 outputs, as the interpolation source signal, a mixed signal obtained by mixing the tone signal generated by the tone signal generation unit 131 and the white noise signal generated by the white noise signal generation unit 132 according to a predetermined mixing ratio. The output of the mixing unit 133 is supplied to the inverse filter 142 through the signal line 139.
[0111]
The predetermined mixing ratio is set appropriately according to the noise removal band characteristic of the noise removal filter 141. Alternatively, either the tone signal or the white noise signal may be set to zero, so that only the other is output as the interpolation source signal.
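A minimal sketch of such an interpolation source signal, assuming a sum of sine waves for the tone signal and a uniform random signal as a stand-in for the M-sequence white noise generator; the frequencies, mixing ratio and names are illustrative only.

import numpy as np

def interpolation_source_signal(n_samples, fs=48000, tone_freqs=(1000.0, 2000.0),
                                mix_ratio=0.5, seed=0):
    """Mix a tone signal (generation unit 131) and a white noise signal (unit 132)
    according to a mixing ratio (mixing unit 133).
    mix_ratio = 1.0 outputs the tone only, 0.0 outputs the white noise only."""
    t = np.arange(n_samples) / fs
    tone = sum(np.sin(2 * np.pi * f * t) for f in tone_freqs) / len(tone_freqs)
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-1.0, 1.0, n_samples)   # flat-level random signal
    return mix_ratio * tone + (1.0 - mix_ratio) * noise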
[0112]
FIG. 23 is a diagram showing an example of frequency characteristics of the noise removal filter
141 and the inverse filter 142 in the embodiment of the present invention. Here, the frequency is
shown in the horizontal direction, and the passing signal level of the filter is shown in the vertical
direction.
[0113]
FIG. 23A shows an example of the frequency characteristic of the noise removal filter 141; the filter has three removal bands whose center frequencies are fa, fb, and fc. FIG. 23(b) shows an example of the frequency characteristic of the inverse filter 142; contrary to the noise removal filter 141, this filter has pass bands centered at fa, fb, and fc.
[0114]
That is, in this example, the bands centered at the frequencies fa, fb, and fc are treated as noise bands: the noise removal filter 141 treats the noise bands as removal bands, and the inverse filter 142 treats the noise bands as pass bands.
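To make the complementary relationship concrete, here is a hedged sketch in which the noise removal filter is approximated by a cascade of notch sections and the inverse filter by a sum of peak sections at the same center frequencies; the frequencies standing in for fa, fb, fc, the Q value and the sampling frequency are assumed example values, and the peak-sum is only an approximation of a true inverse characteristic.

import numpy as np
from scipy.signal import iirnotch, iirpeak, lfilter

fs = 48000
noise_bands = [800.0, 2400.0, 6000.0]   # stand-ins for the center frequencies fa, fb, fc
Q = 8.0                                  # assumed sharpness of each band

def noise_removal_filter(x):
    """Band-elimination behaviour of filter 141: notch out each noise band."""
    y = np.asarray(x, dtype=float)
    for f0 in noise_bands:
        b, a = iirnotch(f0, Q, fs=fs)
        y = lfilter(b, a, y)
    return y

def inverse_filter(x):
    """Complementary behaviour of filter 142: pass only the noise bands."""
    x = np.asarray(x, dtype=float)
    return sum(lfilter(*iirpeak(f0, Q, fs=fs), x) for f0 in noise_bands)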
[0115]
FIG. 24 is a diagram showing a configuration example of the level envelope generation unit 171
in the embodiment of the present invention.
The level envelope generation unit 171 includes an absolute value generation unit 174 and a
smoothing unit 175.
[0116]
The absolute value generation unit 174 generates the absolute value of the directivity signal supplied via the signal line 119. The smoothing unit 175 extracts and smoothes the low frequency component of the directivity signal converted to an absolute value by the absolute value generation unit 174, and is realized by, for example, a low pass filter (LPF). This smoothing removes the influence of sudden level changes such as instantaneous noise.
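A minimal sketch of this envelope detection (absolute value followed by low-pass smoothing), assuming a Butterworth low-pass filter with a cutoff frequency chosen here only for illustration.

import numpy as np
from scipy.signal import butter, lfilter

def level_envelope(x, fs=48000, cutoff_hz=20.0):
    """Absolute value generation (unit 174) followed by low-pass smoothing (unit 175)."""
    rectified = np.abs(np.asarray(x, dtype=float))
    b, a = butter(2, cutoff_hz, btype='low', fs=fs)
    return lfilter(b, a, rectified)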
[0117]
FIG. 25 is a diagram showing an example of a process performed by the level envelope
generation unit 171 in the embodiment of the present invention. FIG. 25A is a waveform example
of the directivity signal (audio signal) supplied to the level envelope generation unit 171 via the
signal line 119. The directivity signal is converted into an absolute value by the absolute value
generation unit 174, so that it has a waveform as shown in FIG. 25 (b).
[0118]
Then, the absolute valued directivity signal having the waveform of FIG. 25 (b) is smoothed by
the smoothing unit 175 to form an envelope like a thick line shown in FIG. 25 (c).
[0119]
The level coefficient generation unit 172 generates a level coefficient based on the level envelope
generated as described above, and the level modulation unit 173 is controlled by the level
coefficient to generate an interpolation signal.
[0120]
FIG. 26 is a diagram showing an example of the interpolation signal in the embodiment of the
present invention.
In this example, the correction signal 21 is generated, based on the level envelope generated by the level envelope generation unit 171, so as to maintain the frequency continuity between the signal A and the signal B.
Thereby, the allowable gap length can be increased in accordance with the first characteristic described above.
[0121]
FIG. 27 is a diagram showing another example of the interpolation signal in the embodiment of the present invention. In this example, a correction signal 22 is generated to compensate for the shortfall ΔS between the forward and backward masking shown in FIG. 21(b) and the signal B. This prevents the gap from being perceived aurally. That is, in the example of FIG. 27, the continuity between the signal A and the signal B is not ensured as in the example of FIG. 26, but the level is interpolated so that the gap period is masked to the sense of hearing.
[0122]
FIG. 28 is a diagram showing configuration examples of the noise recognition unit 210 according to the embodiment of the present invention. FIG. 28(a) shows a configuration that recognizes noise in the time domain, and FIG. 28(b) shows a configuration that recognizes noise in the frequency domain.
[0123]
In the configuration example of FIG. 28A, the noise recognition unit 210 includes a frame
generation unit 211, a noise pattern matching unit 212, and a noise pattern holding unit 213.
[0124]
The frame generation unit 211 frames the directivity signal supplied via the signal line 119 at
predetermined time intervals.
Here, a frame is a data string composed of a plurality of audio sampling signals. The framed N (N is an integer) audio sampling signals S(n) are supplied to the noise pattern matching unit 212, where n is an integer from 1 to N.
[0125]
The noise pattern holding unit 213 is a memory that holds the noise pattern W(n). This noise pattern (also referred to as a wavelet) is read out from the noise pattern holding unit 213 as a function W((n−b)/a) of a and b. Here, a is a scale parameter (a > 0); when this value is small, the pattern is compressed in time and corresponds to recognition of high-frequency noise components, whereas a large scale parameter corresponds to recognition of low-frequency noise components. Further, b is a shift parameter and represents the shift position (time) used in pattern matching against the noise pattern. The wavelet is a function whose mean value is 0 and which is localized around time 0; in the embodiment of the present invention, a function approximating an actual noise waveform is selected in advance and held in the noise pattern holding unit 213.
[0126]
The noise pattern matching unit 212 evaluates the noise present in the directivity signal by performing a convolution operation between the directivity signal S(n) framed by the frame generation unit 211 and the noise pattern W((n−b)/a) held in the noise pattern holding unit 213 while varying a and b. The evaluation value Et in this case is calculated by the following equation.
[0127]
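One plausible form of this evaluation value, written here as an assumption consistent with the convolution of the framed signal with the scaled and shifted noise pattern, is

E_t(a, b) = \left| \frac{1}{\sqrt{a}} \sum_{n=1}^{N} S(n)\, W\!\left(\frac{n-b}{a}\right) \right|

where a larger magnitude of the sum indicates a stronger match between the frame and the noise pattern.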
That is, the evaluation value Et is an index indicating how strongly the noise pattern W(n) is contained in the audio signal S(n): the evaluation value Et becomes large when noise is present in the directivity signal S(n) of a frame, and approaches zero when the correlation with the noise is small.
[0128]
In the configuration example of FIG. 28B, the noise recognition unit 210 includes a frame
generation unit 214, a Fourier transform unit 215, a noise pattern matching unit 216, and a
noise pattern holding unit 217.
[0129]
The frame generation unit 214, like the frame generation unit 211, frames the directivity signal
supplied via the signal line 119 at predetermined time intervals.
The Fourier transform unit 215 transforms the time signal into a frequency signal F (n) by
performing Fourier transform on the directional signal framed by the frame generation unit 214
using FFT (Fast Fourier Transform).
[0130]
The noise pattern holding unit 217 is a memory that holds the noise pattern P (n).
The noise pattern P (n) held by the noise pattern holding unit 217 is a model of the frequency
distribution at the time of noise generation.
[0131]
The noise pattern matching unit 216 evaluates the noise present in the directivity signal by determining the degree of correlation between the frequency signal F(n) produced by the Fourier transform unit 215 and the noise pattern P(n) held in the noise pattern holding unit 217. The evaluation value Ef in this case is calculated by the following equation.
[0132]
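One plausible form of this evaluation value, written here as an assumption consistent with a correlation normalized so that it approaches 1 for similar patterns, is

E_f = \frac{\sum_{n=1}^{N} F(n)\, P(n)}{\sqrt{\sum_{n=1}^{N} F(n)^{2}}\, \sqrt{\sum_{n=1}^{N} P(n)^{2}}}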
Here, N is the number of FFT points in one frame. That is, with n ranging from 1 to N, the evaluation value Ef approaches 1 when the similarity between the noise pattern and the directivity signal is high; therefore, if Ef is equal to or greater than a predetermined threshold value, the two patterns can be recognized as matching.
[0133]
When noise is recognized in this manner, the noise removal period generation unit 220
generates a period determined by the start point and the end point of the noise generation as the
noise removal period. Here, although the method of recognizing noise in each of the time domain
and the frequency domain has been described, the recognition rate can be further improved by
combining these.
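The following Python sketch shows how per-frame evaluation values from the two domains could be combined into noise removal periods; the thresholds, the argument names and the simple OR-combination are illustrative assumptions rather than the patented method.

def noise_removal_periods(eval_time, eval_freq, frame_len, fs, th_time=0.5, th_freq=0.8):
    """Combine per-frame evaluation values Et and Ef into noise removal
    periods expressed as (start_time, end_time) in seconds."""
    flagged = [(et >= th_time) or (ef >= th_freq)
               for et, ef in zip(eval_time, eval_freq)]
    periods, start = [], None
    for i, is_noise in enumerate(flagged):
        if is_noise and start is None:              # start point of noise generation
            start = i
        elif not is_noise and start is not None:    # end point of noise generation
            periods.append((start * frame_len / fs, i * frame_len / fs))
            start = None
    if start is not None:                           # noise continues to the last frame
        periods.append((start * frame_len / fs, len(flagged) * frame_len / fs))
    return periods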
[0134]
When a plurality of types of noise are assumed, the above-described noise pattern holding unit 213 and noise pattern holding unit 217 hold noise patterns corresponding to each of the plurality of types of noise, and each type of noise is recognized accordingly.
[0135]
In the example of FIG. 20, the selection switch 190 has been described assuming a simple
changeover switch, but this may be realized by, for example, the following cross fade switch.
[0136]
FIG. 29 is a diagram showing a configuration example of the cross fade switch 191 as an
example of the selection switch 190 according to the embodiment of the present invention.
The crossfade switch 191 includes attenuators 192 and 193, a control coefficient generation unit
194, a coefficient inversion unit 195, and a combining unit 196.
[0137]
The attenuators 192 and 193 are attenuators that attenuate the input signal according to the
control coefficient.
The control coefficient of the attenuator 192 is supplied from the control coefficient generator
194, and the control coefficient of the attenuator 193 is supplied from the coefficient inverter
195.
[0138]
The control coefficient generation unit 194 generates a control coefficient of the attenuator 192
based on the noise removal period supplied via the signal line 229. The coefficient inverting unit
195 inverts the output of the control coefficient generating unit 194. That is, the control
coefficients of the attenuators 192 and 193 are mutually inverted.
[0139]
The combining unit 196 combines the outputs of the attenuators 192 and 193, and is realized by
an adder, for example.
[0140]
FIG. 30 is a diagram showing an example of waveform signals of the cross fade switch 191
according to the embodiment of the present invention.
When a signal 31 as shown in FIG. 30A is input to the signal line 229, the output signal of the
control coefficient generation unit 194 cross-fades with a predetermined time constant as the
signal 32. On the other hand, the output signal of the coefficient inverting unit 195 is the
inverted signal 33 of the signal 32, and similarly crossfades with a predetermined time constant.
Therefore, overshoot and ringing can be prevented. In addition, the waveform discontinuities that occur when switching between the outputs of the attenuators 192 and 193 are made unnoticeable to the ear, which also works in favor of the masking effect.
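A minimal Python sketch of such a crossfade switch, assuming a first-order ramp for the control coefficient; the time constant and argument names are illustrative assumptions.

import numpy as np

def crossfade_switch(original, interpolated, removal_flag, fs, fade_ms=10.0):
    """Crossfade switch sketch: a control coefficient follows the noise removal
    period with a time constant (cf. unit 194), its complement drives the other
    attenuator (cf. unit 195), and the two attenuated signals are added (cf. unit 196)."""
    tau = fade_ms * 1e-3 * fs                  # time constant in samples
    k = 1.0 - np.exp(-1.0 / tau)
    coeff = np.empty(len(removal_flag))
    c = 0.0
    for i, flag in enumerate(removal_flag):    # flag is 1 during the noise removal period
        target = 1.0 if flag else 0.0
        c += (target - c) * k                  # smooth transition: no overshoot or ringing
        coeff[i] = c
    # attenuator outputs combined by addition
    return coeff * np.asarray(interpolated) + (1.0 - coeff) * np.asarray(original)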
[0141]
FIG. 31 is a diagram showing an example of the interpolation signal when the cross fade switch 191 according to the embodiment of the present invention is used. Assuming that an interpolation signal as shown in FIG. 26 is output by the level modulation unit 173, when the crossfade switch 191 is used, crossfading is performed in the transitions between the signals A and B and the interpolation signal as shown in FIG. 31, so that smooth switching can be realized.
[0142]
FIG. 32 is a diagram showing a second configuration example of the noise reduction mechanism in the embodiment of the present invention. As in the first configuration example, a directivity signal for noise detection is input to this noise reduction mechanism via the signal line 118, the directivity signal used as the original audio signal is input via the signal line 119, and noise reduction processing is performed on this directivity signal.
[0143]
In the second configuration example, in addition to the first configuration example, the noise
removal filter 143, a spectrum envelope generation unit 161, a spectrum coefficient generation
unit 162, and a variable filter 163 are further provided. Although it is assumed that these are
included in the noise reduction processing unit 370, the present invention is not limited to this.
[0144]
Similar to the noise removal filter 141, the noise removal filter 143 is a filter that removes a
noise band from the directivity signal from the voice input unit 310. The output of the noise
removal filter 143 is supplied to a spectrum envelope generation unit 161. Note that the noise
removal filter 143 can be shared with the noise removal filter 141, and in this case, the output of
the noise removal filter 141 is supplied to the spectrum envelope generation unit 161.
[0145]
The spectral envelope generation unit 161 continuously detects the envelope (spectral envelope)
of the frequency spectrum of the directional signal from the voice input unit 310. The spectrum
envelope generation unit 161 detects the frequency spectrum by detecting the level for each
frequency of the directional signal (audio signal) by FFT or a plurality of band divisions. The
output of the spectrum envelope generation unit 161 is supplied to a spectrum coefficient
generation unit 162.
[0146]
The spectral coefficient generation unit 162 generates spectral coefficients based on the spectral
envelope supplied from the spectral envelope generation unit 161. The spectrum coefficient
generation unit 162 generates spectrum coefficients so as to reproduce the frequency spectrum
detected by the spectrum envelope generation unit 161. The output of the spectrum coefficient
generator 162 is supplied to the variable filter 163 via the signal line 168.
[0147]
The variable filter 163 performs frequency-domain modulation on the interpolation source signal supplied from the inverse filter 142 in accordance with the spectral coefficients supplied from the spectral coefficient generation unit 162. Thus, not only the level (through the level modulation unit 173) but also the frequency components are interpolated continuously, so the allowable gap length can be further increased in accordance with the first characteristic.
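A rough Python sketch of this spectral shaping, assuming SciPy: the spectral envelope of a noise-removed reference segment is estimated by FFT and imposed on the interpolation source signal through an FIR filter designed from that envelope. The FFT size, tap count and function names are illustrative assumptions, and the reference segment must contain at least n_fft samples.

import numpy as np
from scipy import signal

def spectrally_shape(interp_source, reference, fs, n_fft=1024, numtaps=129):
    """Variable filter sketch: impose the spectral envelope of the reference
    signal on the interpolation source signal."""
    # Spectral envelope generation (cf. unit 161): windowed magnitude spectrum.
    seg = np.asarray(reference[:n_fft], dtype=float) * np.hanning(n_fft)
    mag = np.abs(np.fft.rfft(seg))
    mag /= (mag.max() + 1e-12)                    # normalize gains to at most 1
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)    # 0 ... fs/2
    # Spectral coefficient generation (cf. unit 162): FIR taps from the envelope.
    taps = signal.firwin2(numtaps, freqs, mag, fs=fs)
    # Variable filter (cf. unit 163): shape the interpolation source signal.
    return signal.lfilter(taps, [1.0], interp_source)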
[0148]
The second configuration example is similar to the first configuration example in that the
selection switch 190 can be replaced with the cross fade switch 191.
[0149]
FIG. 33 is a diagram showing a third configuration example of the noise reduction mechanism in
the embodiment of the present invention.
Similar to the first and second configuration examples, a directivity signal for noise detection is input to this noise reduction mechanism via the signal line 118, the directivity signal used as the original audio signal is input via the signal line 119, and noise reduction processing is
performed on this directivity signal.
[0150]
In this third configuration example, in addition to the second configuration example, a delay unit 120 is provided, and the output delayed for a predetermined time by this delay unit 120 is supplied to the noise removal filters 141 and 143 and to the level envelope generation unit 171. Also, the signal line 157 from the noise recognition unit 210 is connected to the variable filter block 140. The variable filter block 140 is a block including the noise removal filter 141, the inverse filter 142, and the noise removal filter 143.
[0151]
The noise recognition unit 210 in the third configuration example detects the frequency of the recognized noise and feeds it back to the variable filter block 140. As a method of detecting the noise frequency, in the case of noise recognition in the time domain of FIG. 28(a), the noise frequency can be estimated from the scale parameter a used in the pattern matching. In the case of noise recognition in the frequency domain of FIG. 28(b), the noise frequency can be calculated by detecting the noise peak frequency from the output of the Fourier transform unit 215.
[0152]
The noise frequency fed back from the noise recognition unit 210 is used to adjust the pass band or stop band of each filter in the variable filter block 140. Thereby, for example, by adaptively changing the center frequencies fa, fb and fc in FIG. 23 in accordance with the detected noise frequency, it is possible to cope effectively with fluctuations of the noise frequency and with continuous noise from a plurality of noise sources.
[0153]
In this third configuration example, the directivity signal is supplied to the blocks other than the noise recognition unit 210 through the delay unit 120, so the pass band or stop band can be adjusted in real time according to the result of the noise recognition.
[0154]
The third configuration example is similar to the first and second configuration examples in that
the selection switch 190 can be replaced with the cross fade switch 191.
[0155]
Next, the operation of the imaging apparatus according to the embodiment of the present
invention will be described with reference to the drawings.
[0156]
FIG. 34 is a diagram showing an example of a basic processing procedure of the noise reduction
method in the sound collection device 300 according to the embodiment of the present
invention.
This processing procedure example is common to the first to third configuration examples
described above.
[0157]
First, noise recognition processing is performed in the noise recognition unit 210 (step S910).
Thereby, the noise removal period generation unit 220 generates a noise removal period.
Then, if it corresponds to the noise removal period (step S920), the directivity signal supplied
from the noise removal filter 141 via the signal line 149 is selected by the selection switch 190
(step S930). On the other hand, when it does not correspond to the noise removal period (step
S920), the directivity signal supplied from the voice input unit 310 via the signal line 119 is
selected (step S940). The above processing is repeated.
[0158]
As described above, according to the embodiment of the present invention, the noise detection unit 360 performs noise detection based on the directivity signal generated by the directivity generation unit 330, and noise removal can be performed on the directivity signal by the noise reduction processing unit 370 according to the result. The directivity signal used for noise detection in the noise detection unit 360 corresponds to a noise vector, so noise can be detected efficiently. Further, the noise reduction processing unit 370 specifies a noise removal period from the noise recognized by the noise recognition unit 210 and controls the selection switch 190 so as to select the signal from which noise has been removed by the noise removal filter 141 during the noise removal period and to select the directivity signal not subjected to noise removal during other periods, which makes it possible to realize noise reduction processing that takes human hearing into consideration. Further, according to the embodiment of the present invention, long-lasting noise can be reduced by synthesizing the interpolation signal within the noise removal period.
[0159]
In the embodiment of the present invention, an example has been described in which the directivity signal is subjected to scanning processing to generate a recording stream signal and in which the sampling frequency is changed by upsampling processing and downsampling processing; however, each directivity signal may instead be handled individually without performing these processes.
[0160]
In the embodiment of the present invention, an example assuming a 5.1-channel surround signal has been described; however, the present invention is not limited to this, and the same implementation is possible without departing from the object of the invention.
[0161]
In addition, the embodiment of the present invention shows an example for embodying the present invention and, as shown below, has correspondences with the invention-specifying matters in the claims; however, the present invention is not limited thereto, and various modifications can be made without departing from the scope of the present invention.
[0162]
That is, in claim 1, the voice input means corresponds to the voice input unit 310, for example.
Further, the directivity generation unit corresponds to, for example, the directivity generation
unit 330.
Also, the noise removal means corresponds to the noise removal filter 141, for example.
Also, noise recognition means corresponds to the noise recognition unit 210, for example.
Further, the noise removal period generation means corresponds to, for example, the noise
removal period generation unit 220. The selection means corresponds to, for example, the
selection switch 190.
[0163]
In claim 2, the bi-directional microphones correspond to, for example, bi-directional microphones
412 and 413. Also, the nondirectional microphone corresponds to the nondirectional
microphone 411, for example.
[0164]
Further, in claim 3, the nondirectional microphones correspond to, for example, the
nondirectional microphones 431 to 434.
[0165]
Also, in claim 4, the unidirectional microphones correspond to, for example, the unidirectional
microphones 452 and 453.
Also, the bi-directional microphone corresponds to the bi-directional microphone 451, for
example.
[0166]
Further, in claim 5, the rotation coefficient generation means corresponds to, for example, the
rotation coefficient generation unit 350.
[0167]
Further, in claim 11, the voice input means corresponds to the voice input unit 310, for example.
Further, the directivity generation unit corresponds to, for example, the directivity generation
unit 330. Also, the noise removal means corresponds to the noise removal filter 141, for
example. The signal interpolation means corresponds to, for example, a combination of at least some of the interpolation source signal generation unit 130, the inverse filter 142, the noise removal filter 143, the spectrum envelope generation unit 161, the spectrum coefficient generation unit 162, the variable filter 163, the level envelope generation unit 171, the level coefficient generation unit 172, the level modulation unit 173 and the combining unit 180. Also,
noise recognition means corresponds to the noise recognition unit 210, for example. Further, the
noise removal period generation means corresponds to, for example, the noise removal period
generation unit 220. The selection means corresponds to, for example, the selection switch 190.
[0168]
Further, in claim 12, the interpolation source signal generation means corresponds to, for
example, the interpolation source signal generation unit 130. Further, the non-interpolation
removal means corresponds to the inverse filter 142, for example. Further, the level envelope
generation unit corresponds to, for example, the level envelope generation unit 171. The level
coefficient generation unit corresponds to, for example, the level coefficient generation unit 172.
The level modulation means corresponds to, for example, the level modulation unit 173. Also, the
combining means corresponds to the combining unit 180, for example.
[0169]
Further, in claim 15, the interpolation source signal generation means corresponds to the
interpolation source signal generation unit 130, for example. Further, the non-interpolation
removal means corresponds to the inverse filter 142, for example. The spectral envelope
generation means corresponds to, for example, the spectral envelope generation unit 161. In
addition, a spectral coefficient generation unit corresponds to, for example, the spectral
coefficient generation unit 162. Further, the spectrum modulation means corresponds to, for
example, the variable filter 163. Further, the level envelope generation unit corresponds to, for
example, the level envelope generation unit 171. The level coefficient generation unit
corresponds to, for example, the level coefficient generation unit 172. The level modulation
means corresponds to, for example, the level modulation unit 173. Also, the combining means
corresponds to the combining unit 180, for example.
[0170]
The processing procedure described in the embodiment of the present invention may be regarded as a method having this series of procedures, as a program for causing a computer to execute the series of procedures, or as a recording medium storing such a program.
[0171]
A diagram showing a configuration example of the sound collection device 300 in the embodiment of the present invention. A diagram showing the arrangement and directivity characteristics of 5.1-channel surround sound sources. A diagram showing an example of vector quantity extraction in the embodiment of the present invention. A diagram showing examples of polar patterns obtained by the sound collection device in the embodiment of the present invention. A diagram showing a first arrangement example of the microphones in the embodiment of the present invention. A diagram showing an example of sound source synthesis by the microphones in the first arrangement example according to the embodiment of the present invention. A diagram showing the rotation coefficients in the embodiment of the present invention. A diagram showing an example of the directional characteristics of the microphones in the first arrangement example according to the embodiment of the present invention. A diagram showing a second arrangement example of the microphones in the embodiment of the present invention. A diagram showing an example of generation of the directional characteristics of the microphones in the second arrangement example according to the embodiment of the present invention. A diagram showing an example of sound source synthesis by the microphones in the second arrangement example according to the embodiment of the present invention. A diagram showing a third arrangement example of the microphones in the embodiment of the present invention. A diagram showing an example of sound source synthesis by the microphones in the third arrangement example according to the embodiment of the present invention. A diagram showing an example of generation of the directional characteristics of the microphones in the third arrangement example according to the embodiment of the present invention. A diagram showing an example of the rotation angle φ of the directivity in the embodiment of the present invention. A diagram showing an example of the content of the directivity stream signal in the embodiment of the present invention. A diagram showing the relationship between the directivity stream signal and the sampling period in the embodiment of the present invention. A diagram showing a configuration example of the directivity generation unit 330 in the embodiment of the present invention. A diagram showing a configuration example of the downsampling mechanism in the embodiment of the present invention. A diagram showing a first configuration example of the noise reduction mechanism in the embodiment of the present invention. A diagram for explaining the masking phenomenon utilized in the embodiment of the present invention. A diagram showing a configuration example of the interpolation source signal generation unit 130 in the embodiment of the present invention. A diagram showing an example of the frequency characteristics of the noise removal filter 141 and the inverse filter 142 in the embodiment of the present invention. A diagram showing a configuration example of the level envelope generation unit 171 in the embodiment of the present invention. A diagram showing an example of processing in the level envelope generation unit 171 in the embodiment of the present invention. A diagram showing an example of the interpolation signal in the embodiment of the present invention. A diagram showing another example of the interpolation signal in the embodiment of the present invention. A diagram showing a configuration example of the noise recognition unit 210 in the embodiment of the present invention. A diagram showing a configuration example of the cross fade switch 191 as an example of the selection switch 190 in the embodiment of the present invention. A diagram showing an example of waveform signals of the cross fade switch 191 in the embodiment of the present invention. A diagram showing an example of the interpolation signal when the cross fade switch 191 in the embodiment of the present invention is used. A diagram showing a second configuration example of the noise reduction mechanism in the embodiment of the present invention. A diagram showing a third configuration example of the noise reduction mechanism in the embodiment of the present invention. A diagram showing an example of the basic processing procedure of the noise reduction method in the sound collection device 300 according to the embodiment of the present invention.
Explanation of Reference Numerals
[0172]
300: sound collection device; 310: voice input unit; 320: amplifier; 330: directivity generation unit; 331: upsampling unit; 332: interpolation filter; 333: directivity generation unit; 340: timing generation unit; 350: rotation coefficient generation unit; 360: noise detection unit; 370: noise reduction processing unit; 371: directivity direction extraction unit; 372: decimation filter; 373: downsampling unit; 380: coding processing unit; 390: recording/playback unit; 411: nondirectional microphone; 412, 413: bidirectional microphones; 422, 423: level variable units; 426: addition synthesis unit; 431-434: nondirectional microphones; 441: addition unit; 442, 443: subtractors; 444, 445: level variable units; 446: addition synthesis unit; 451: bidirectional microphone; 452, 453: unidirectional microphones; 461-463: level variable units; 66: adding and combining unit; 506, 507: bidirectional signals