JP2007174011

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2007174011
PROBLEM TO BE SOLVED: To pick up sound at a high S/N ratio without degrading the frequency
characteristics, even when the sound source moves.
SOLUTION: An output signal combining unit 30 combines and outputs a final digital audio signal SS
from the digital audio signals S-k (k = 1 to m) obtained by microphones 11-k (k = 1 to m).
Extraction units 20-k (k = 1 to m) output audio intensity signals Es-k (k = 1 to m) from the
digital audio signals S-k (k = 1 to m). A switching control unit 40 controls the output signal
combining unit 30 based on the audio intensity signals Es-k (k = 1 to m) so that a signal whose
audio component has a high intensity is output. [Selected figure] Figure 1
Sound pickup device
[0001]
The present invention relates to a sound pickup device that picks up external sounds and outputs
an electrical signal.
[0002]
To pick up a speaker's voice in a noisy environment such as a voice conference, a sound pickup
device must pick up the voice emitted from the speaker's mouth, which is the sound source, at a
high S/N ratio.
As a technique for meeting this requirement, Patent Documents 1 and 2 propose a technique of
04-05-2019
1
combining unidirectional microphones, superimposing the output signals of the respective
microphones, and outputting the result.
Patent Document 1: Japanese Patent Application Laid-Open No. 10-126876
Patent Document 2: Japanese Patent Application Laid-Open No. 2000-188795
[0003]
According to the techniques disclosed in Patent Documents 1 and 2, the output signals of the
microphones that picked up the voice are superimposed in phase and output only when the voice
originates from a specific position or direction. It is therefore possible to narrow the target
to the voice emitted from the speaker's mouth, which is the sound source, and to pick up this
voice at a high S/N ratio. However, it is difficult to superimpose the output signals of a
plurality of microphones in phase over a wide frequency band. For this reason, sound collection
by the techniques disclosed in Patent Documents 1 and 2 impairs the frequency characteristic of
the collected sound. Further, with these techniques, when the position of the sound source
moves, for example when the speaker's mouth moves, it is difficult to collect the sound from the
sound source at a high S/N ratio.
[0004]
The present invention has been made in view of the above circumstances, and its purpose is to
provide a sound collection device capable of collecting sound at a high S/N ratio without
impairing the frequency characteristics even when the sound source moves.
[0005]
The present invention provides a sound collection device comprising: a plurality of microphones
that pick up external sound and output electric signals; output signal combining means for
combining, from the output signals of the plurality of microphones, an audio signal to be output
and outputting it; extracting means for extracting at least an audio component from each output
signal of the plurality of microphones and outputting a signal indicating the intensity or S/N
ratio of the audio component of each signal; and switching control means for executing switching
control processing that controls the output signal combining means so that, among the output
signals of the microphones, a signal whose audio component has a high intensity or S/N ratio is
output by the output signal combining means.
According to this invention, among the output signals of the plurality of microphones, a signal
whose audio component has a high intensity or S/N ratio is output by the output signal combining
means. Therefore, even when the sound source moves, the sound generated from the sound source
can be collected at a high S/N ratio without degrading the frequency characteristic.
[0006]
Hereinafter, embodiments of the present invention will be described with reference to the
drawings.
<Configuration of Embodiment>
FIG. 1 is a block diagram showing a configuration of a
sound collection device according to an embodiment of the present invention. As shown in FIG. 1,
the sound collection device in the present embodiment has m microphones 11-k (k = 1 to m).
FIGS. 2 and 3 each show an implementation example of the microphone 11-k (k = 1 to m) in the
sound collecting device. In these figures, an example in which the number m of microphones is
three is shown.
[0007]
The sound collection device in the present embodiment may be configured as an independent
device or may be incorporated into another device. FIG. 2 shows an implementation example of a
microphone in the sound collection device as an example of the former. In this sound collection
device, three microphones 11-1 to 11-3 are fixed to a horizontal bar 502 fixed to the upper part
of the stand 501. As an example of the latter, FIG. 3 shows a mounting example of a microphone
in a notebook personal computer in which the sound collection device according to the present
embodiment is incorporated. In this example, three microphones 11-1 to 11-3 are fixed to the
upper part of the display 503 of the notebook computer.
[0008]
The microphones 11-k (k = 1 to m) used in the present embodiment are unidirectional microphones
whose sound reception sensitivity depends on the direction of arrival of sound. Calling the axis
along which a microphone's sound reception sensitivity is greatest its maximum sensitivity axis,
in the examples shown in FIGS. 2 and 3 the maximum sensitivity axes of the microphones 11-1 to
11-3 are directed diagonally right, straight ahead, and diagonally left of the sound collection
device or notebook computer, respectively. In this way, the m microphones 11-k (k = 1 to m) in
the present embodiment are fixed to the sound collection device so that their maximum
sensitivity axes radiate outward.
[0009]
The speaker speaks facing these microphones 11-k (k = 1 to m), but when the speaker moves, the
microphone best suited to picking up the speaker's voice changes according to the speaker's
position. For example, in the examples shown in FIGS. 2 and 3, when the speaker's mouth is on
the left side of the sound collection device or notebook computer, the output signal level of
the microphone 11-1, whose maximum sensitivity axis is directed toward the speaker's mouth, is
the highest, and this output signal is the appropriate one to represent the speaker's voice.
However, when the speaker changes posture and the speaker's mouth moves directly in front of
the microphone 11-2, the output signal level of the microphone 11-2 becomes the highest, and
this output signal should be adopted as the speaker's voice.
[0010]
Therefore, the sound collection device in the present embodiment monitors the level of the audio
component of each output signal of the microphones 11-k (k = 1 to m) and, in principle, selects
the signal of the maximum level and outputs it as the final digital audio signal SS, so that the
directivity of the sound collection device as a whole follows the direction of the sound source
(in this example, the speaker's mouth). In addition, for the convenience of a subsequent-stage
device (not shown) that receives and processes the final digital audio signal SS, the sound
collection device generates and outputs a signal indicating the S/N ratio of the digital audio
signal SS (hereinafter, the S/N ratio signal). The circuit configuration of the sound collection
device for obtaining the digital audio signal SS and the S/N ratio signal is described below.
[0011]
In FIG. 1, the A/D converters 12-k (k = 1 to m) sample the analog audio signals output from the
microphones 11-k (k = 1 to m) at a constant sampling period and convert them into digital audio
signals S-k (k = 1 to m) indicating the sample values. The digital audio signals S-k (k = 1 to
m) are input to the extraction units 20-k (k = 1 to m) and also to the output signal combining
unit 30.
[0012]
The extraction units 20-k (k = 1 to m) are circuits that extract, from each of the digital audio
signals S-k (k = 1 to m), an audio intensity signal Es-k (k = 1 to m) indicating the intensity
of the audio component and a noise intensity signal En-k (k = 1 to m) indicating the intensity
of the noise component. In this embodiment, which of the digital audio signals S-k (k = 1 to m)
is output as the final digital audio signal SS is determined by comparing the levels of the
audio intensity signals Es-k (k = 1 to m). Further, in the present embodiment, the S/N ratio
signal is calculated from the audio intensity signals Es-k (k = 1 to m) and the noise intensity
signals En-k (k = 1 to m).
[0013]
FIG. 4 is a block diagram showing the configuration of each of the extraction units 20-k (k = 1
to m). In FIG. 4, a BPF (band-pass filter) 21 has a pass band of, for example, 300 to 3000 Hz
and passes the audio frequency components included in the digital audio signal S-k. The output
signal of the BPF 21 indicates the intensity of the audio component in the digital audio signal
S-k, but its value changes rapidly and frequently. If the output signal of the BPF 21 were
output as the audio intensity signal Es-k as it is, the digital audio signal S-k selected as
the digital audio signal SS would be switched frequently, and the operation would become
unstable. Therefore, an envelope generation unit 22 is provided at the stage following the BPF
21. The envelope generation unit 22 outputs an audio intensity signal Es-k indicating an
envelope obtained by smoothing the rapid changes in the output signal of the BPF 21.
Specifically, the envelope generation unit 22 includes an effective value calculation circuit
and an LPF (low-pass filter). The effective value calculation circuit divides the output signal
of the BPF 21 into frames consisting of a predetermined number of samples and calculates, for
each frame, an effective value, which is the root mean square of the samples in that frame. The
LPF removes abrupt changes in the effective values obtained for each frame and outputs an audio
intensity signal Es-k indicating the envelope of the effective value.
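As a rough illustration of the envelope generation described above, the following Python sketch
computes a per-frame effective value (root mean square) and smooths it with a low-pass filter.
The frame length, the one-pole filter form, and the smoothing constant `alpha` are assumptions
made for this example; the description does not specify them.

```python
import math

def frame_rms(samples, frame_len):
    """Effective value (root mean square) of each full frame of `samples`."""
    n_frames = len(samples) // frame_len
    out = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        out.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return out

def smooth(values, alpha=0.2):
    """One-pole low-pass filter (assumed form): y[i] = y[i-1] + alpha * (x[i] - y[i-1])."""
    out = []
    y = 0.0
    for x in values:
        y += alpha * (x - y)
        out.append(y)
    return out

def envelope(bpf_output, frame_len=160, alpha=0.2):
    """Intensity envelope of a filtered signal: frame RMS followed by smoothing."""
    return smooth(frame_rms(bpf_output, frame_len), alpha)
```

Applied to the output of the BPF 21, `envelope` would play the role of the envelope generation
unit 22; a smaller `alpha` gives a slower, more stable envelope at the cost of a slower response.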
[0014]
The BEF (band elimination filter) 23 has a stop band of, for example, 300 to 3000 Hz, and passes
the components of the digital audio signal S-k that lie outside the stop band. The output signal
of the BEF 23 indicates the intensity of the noise component in the digital audio signal S-k,
but its value changes rapidly and frequently. If the output signal of the BEF 23 were output as
the noise intensity signal En-k as it is, the S/N ratio signal calculated from the audio
intensity signals Es-k (k = 1 to m) and the noise intensity signals En-k (k = 1 to m) would
become unstable. Therefore, an envelope generation unit 24 similar to the envelope generation
unit 22 is provided downstream of the BEF 23. The envelope generation unit 24 outputs a noise
intensity signal En-k indicating an envelope in which the abrupt changes in the output signal
of the BEF 23 are smoothed.
[0015]
FIG. 5 is a block diagram showing another configuration example of the extraction unit 20-k (k =
1 to m). In this example, the BEF 23 in FIG. 4 is replaced by a subtractor 25. The subtractor 25
subtracts the output signal of the BPF 21 from the digital audio signal S-k and supplies the result
to the envelope generation unit 24. Also in this configuration, the speech intensity signal Es-k
and the noise intensity signal En-k similar to those shown in FIG. 4 are output from the envelope
generation units 22 and 24, respectively.
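The subtractor arrangement of FIG. 5 can be sketched as follows. The band-pass filter here is a
second-order biquad used purely as a stand-in: the description gives only the example pass band
(300 to 3000 Hz), so the filter order, the coefficient formulas, and the function names are
assumptions of this example.

```python
import math

def bandpass_biquad(x, fs, f_lo=300.0, f_hi=3000.0):
    """Second-order band-pass (unity gain at the centre frequency); a crude stand-in for BPF 21."""
    f0 = math.sqrt(f_lo * f_hi)          # geometric centre of the pass band
    q = f0 / (f_hi - f_lo)               # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0     # b1 is zero for this filter type
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b0 * xn + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def split_voice_and_noise(s, fs):
    """Return (BPF output, S minus BPF output), the latter as in subtractor 25."""
    voice = bandpass_biquad(s, fs)
    noise = [sn - vn for sn, vn in zip(s, voice)]
    return voice, noise
```

Each of the two returned signals would then be passed through its own envelope generation stage
to obtain Es-k and En-k. Note that subtraction only isolates the out-of-band components well
when the band-pass filter introduces little phase shift inside its pass band.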
[0016]
In FIG. 1, the output signal combining unit 30 is a circuit that either selects one of the
digital audio signals S-k (k = 1 to m) and outputs it as the digital audio signal SS, or
cross-fades two of the digital audio signals S-k (k = 1 to m) and outputs the result as the
digital audio signal SS. The output signal combining unit 30 includes multipliers 31-k (k = 1 to
m) that multiply the digital audio signals S-k (k = 1 to m) by coefficients a-k (k = 1 to m) and
output the products, an adder 32 that adds the output signals of the multipliers 31-k (k = 1 to
m) and outputs the sum as the digital audio signal SS, and a combining control unit 33 that
controls the coefficients a-k (k = 1 to m).
[0017]
The switching control unit 40 is a circuit that monitors the audio intensity signals Es-k (k = 1
to m) and, based on the monitoring result, outputs the selection signals Mnew and Mold and the
crossfade signal CF. The selection signal Mnew indicates the index k of the digital audio signal
S-k (k = 1 to m) that is most suitable for use as the final digital audio signal SS. The
selection signal Mold indicates the value that the selection signal Mnew held immediately before
it changed to its current value. In principle, the switching control unit 40 executes switching
control processing, which verifies and, where necessary, updates the selection signals Mnew and
Mold, each time a periodic verification pulse Pc is applied. In this switching control
processing, except during the period in which the crossfade signal CF is "1", the levels of the
audio intensity signals Es-k (k = 1 to m) are compared and, roughly speaking, the selection
signal Mnew is updated to indicate the index k of the audio intensity signal Es-k of the maximum
level. Further, when the content of the selection signal Mnew is changed, the selection signal
Mold is updated with the content that the selection signal Mnew held before the change. Various
modes of the switching control processing are conceivable; to avoid repetition, their details
are given in the description of the operation of the present embodiment.
[0018]
The combining control unit 33 in the output signal combining unit 30 monitors the selection
signal Mnew updated in this way, and controls the values of the coefficients a-k (k = 1 to m) so
that the digital audio signal S-k having the index k designated by the selection signal Mnew is
output as the final digital audio signal SS. Specifically, the combining control unit 33 sets
the coefficient a-k having the index k designated by the selection signal Mnew to "1" and sets
the other coefficients to "0".
[0019]
Here, since the m microphones 11-k (k = 1 to m) in the present embodiment have maximum
sensitivity axes pointing in different directions, there is generally a level difference among
the digital audio signals S-k (k = 1 to m). For this reason, if the digital audio signal S-k
serving as the digital audio signal SS were switched immediately whenever the content of the
selection signal Mnew changed, an unnatural discontinuity would occur in the digital audio
signal SS. Therefore, in the present embodiment, when changing the contents of the selection
signals Mnew and Mold, the switching control unit 40 causes the output signal combining unit 30
to perform a crossfade over a predetermined period.
[0020]
Specifically, when changing the contents of the selection signals Mnew and Mold, the switching
control unit 40 raises the crossfade signal CF from "0" to "1" at that time, holds it at "1" for
a predetermined period, and then returns it to "0". During the period in which the crossfade
signal CF is "1", the combining control unit 33 in the output signal combining unit 30
continuously changes the coefficient whose index is specified by the selection signal Mnew (for
example, a-newk) from "0" to "1", and continuously changes the coefficient whose index is
specified by the selection signal Mold (for example, a-oldk) from "1" to "0". Because the old
and new digital audio signals S-k are cross-faded in this way, no unnatural discontinuity occurs
in the digital audio signal SS.
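The coefficient control and the multiplier/adder structure can be sketched in Python as
follows. The linear ramp and the `progress` parameter are assumptions of this example (the
description only says the coefficients change continuously), and indices here are 0-based,
whereas the text numbers the signals from 1.

```python
def crossfade_coefficients(m, m_new, m_old, progress):
    """Coefficients a-k while CF is "1"; `progress` runs from 0.0 to 1.0 (linear ramp assumed)."""
    a = [0.0] * m
    a[m_old] = 1.0 - progress   # old signal fades out
    a[m_new] += progress        # new signal fades in (sums to 1.0 if m_new == m_old)
    return a

def mix(samples, coeffs):
    """Multipliers 31-k and adder 32: SS = sum over k of a-k * S-k at one sample instant."""
    return sum(a * s for a, s in zip(coeffs, samples))

def crossfade(s_old, s_new, fade_len):
    """Cross-fade from s_old to s_new over `fade_len` samples."""
    out = []
    for n in range(fade_len):
        a = crossfade_coefficients(2, 1, 0, (n + 1) / fade_len)
        out.append(mix([s_old[n], s_new[n]], a))
    return out
```

Outside the crossfade period, the same `mix` function covers plain selection: the coefficient
of the selected signal is 1 and all others are 0.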
[0021]
The S/N ratio signal generation unit 50 is a circuit that selects, as the S component, the audio
intensity signal having the index k designated by the selection signal Mnew from among the audio
intensity signals Es-k (k = 1 to m), selects, as the N component, the noise intensity signal
with the highest intensity among the noise intensity signals En-k (k = 1 to m), and outputs the
result of dividing the signal level of the S component by the signal level of the N component
as the S/N ratio signal. The output unit 60 is a circuit that outputs the final digital audio
signal SS obtained from the output signal combining unit 30 and the S/N ratio signal obtained
from the S/N ratio signal generation unit 50. The above is the configuration of the present
embodiment.
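The S/N ratio computation described above amounts to one division per update. A minimal Python
sketch follows; the `floor` guard against division by zero is an assumption added for this
example, and indices are 0-based.

```python
def snr_signal(es, en, m_new, floor=1e-12):
    """S/N ratio signal: Es of the selected index divided by the largest En."""
    s_component = es[m_new]          # audio intensity of the selected signal
    n_component = max(en)            # highest noise intensity among all microphones
    return s_component / max(n_component, floor)
```

Using the largest noise intensity rather than the noise of the selected microphone makes the
reported ratio a conservative figure.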
[0022]
<Operation of Embodiment>
(1) Overall Operation
Next, the operation of the present embodiment will be described. FIG. 6 is a time chart showing
an operation example of the present embodiment, for a sound collection device having three
microphones 11-k (k = 1 to 3) as illustrated in FIG. 2 or FIG. 3. As in this operation example,
in the present embodiment the switching control unit 40 executes the switching control
processing each time the periodic verification pulse Pc is generated, and the levels of the
audio intensity signals Es-k (k = 1 to 3) are compared.
[0023]
In this operation example, the speaker's mouth, which is the sound source, moves from the front
of the sound collection device to the right. While the sound source is in front of the sound
collection device, the level of the audio intensity signal Es-2 is the largest among the audio
intensity signals Es-k (k = 1 to 3). For this reason, in the repeatedly executed switching
control processing, the selection signal Mnew remains "2", the index specifying the digital
audio signal S-2 obtained from the central microphone 11-2.
[0024]
However, as the sound source moves from the center of the sound collection device to the right,
the level of the audio intensity signal Es-2 gradually decreases and the level of the audio
intensity signal Es-3 gradually increases. In the operation example, when the switching control
processing is executed at time t1, the magnitude relation between the levels of the audio
intensity signals Es-2 and Es-3 has reversed, so the selection signal Mnew is set to "3" and the
selection signal Mold is set to "2". From this point, the crossfade signal CF is held at "1" for
a predetermined period. While the crossfade signal CF is "1", the switching control processing
is not executed even if the verification pulse Pc is generated.
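The overall behaviour at each verification pulse can be sketched as a small state machine in
Python. Measuring the crossfade period in verification pulses (`fade_pulses`) is an assumption
of this example; the description only speaks of a predetermined period. Indices are 0-based.

```python
class SwitchingControl:
    """Basic switching control: pick the strongest Es-k, lock out while CF is 1."""

    def __init__(self, initial=0, fade_pulses=3):
        self.m_new = initial
        self.m_old = initial
        self.cf = 0
        self._fade_pulses = fade_pulses   # assumed: crossfade length in pulses
        self._remaining = 0

    def on_pulse(self, es):
        """Call once per verification pulse Pc with the current Es-k levels."""
        if self.cf == 1:
            # Crossfade in progress: no level comparison is performed.
            self._remaining -= 1
            if self._remaining == 0:
                self.cf = 0
            return
        best = max(range(len(es)), key=lambda k: es[k])
        if best != self.m_new:
            self.m_old, self.m_new = self.m_new, best
            self.cf = 1
            self._remaining = self._fade_pulses
```

In the operation example, a controller holding index 1 (signal S-2) would switch to index 2
(signal S-3) on the pulse at time t1 and then ignore further pulses until CF returns to 0.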
[0025]
In the output signal combining unit 30, over the period in which the crossfade signal CF is "1",
the coefficient a-2 multiplied by the digital audio signal S-2 is lowered from "1" to "0", and
the coefficient a-3 multiplied by the digital audio signal S-3 is raised from "0" to "1". As a
result, the finally output digital audio signal SS shifts naturally from the digital audio
signal S-2 to the digital audio signal S-3.
[0026]
In the S/N ratio signal generation unit 50, as described above, the S/N ratio signal is
calculated from the audio intensity signals Es-k (k = 1 to 3) and the noise intensity signals
En-k (k = 1 to 3). In this operation example, during the period in which the selection signal
Mnew is "2", the S/N ratio signal is calculated from the audio intensity signal Es-2
corresponding to the index "2" and the noise intensity signal of the highest level among En-k
(k = 1 to 3). During the period in which the selection signal Mnew is "3", the S/N ratio signal
is calculated from the audio intensity signal Es-3 corresponding to the index "3" and the noise
intensity signal of the highest level among En-k (k = 1 to 3). The output unit 60 outputs the
digital audio signal SS obtained in this manner and the S/N ratio signal to the subsequent-stage
device.
[0027]
(2) Various Aspects of Switching Control Processing
The switching control processing executed by the switching control unit 40 in the present
embodiment need only be able to follow the movement of the speaker's mouth. If the switching
control processing responds too sensitively to changes in the audio intensity signals Es-k
(k = 1 to m), the digital audio signal S-k serving as the final digital audio signal SS is
switched frequently, and the final digital audio signal SS sounds unnatural. Various aspects of
the switching control processing for preventing this inconvenience are described below, taking
the case of m = 3 as an example.
[0028]
a. First Aspect
In this aspect, a threshold th, which is the boundary between the level of speech and the level
of background noise, is used, and only the audio intensity signals Es-k (k = 1 to 3) whose level
is equal to or higher than the threshold th are used as material for judgment in selecting the
digital audio signal S-k. FIGS. 7A and 7B show examples of execution of the switching control
processing in this aspect. In each example shown in FIGS. 7A and 7B, the verification pulse Pc
is generated at time t11 and time t12, and the switching control processing is performed. To
keep these drawings simple, the audio intensity signals Es-k (k = 1 to 3) generated at times t11
and t12 are drawn side by side in the horizontal direction.
[0029]
In the example shown in FIG. 7A, in the switching control processing at time t11, the level of
the audio intensity signal Es-2 is the maximum and is equal to or higher than the threshold th,
so the selection signal Mnew is set to "2" and the digital audio signal S-2 is selected as the
digital audio signal SS. In the switching control processing at time t12, the level of the audio
intensity signal Es-1 is the maximum and is equal to or higher than the threshold th, so the
selection signal Mnew is set to "1" and the digital audio signal S-1 is selected as the digital
audio signal SS.
[0030]
However, in the example shown in FIG. 7B, in the switching control processing at time t12, none
of the audio intensity signals Es-k (k = 1 to 3) reaches the threshold th, so there is no audio
intensity signal Es-k to serve as material for judgment in selecting the digital audio signal
S-k. Therefore, in the switching control processing at time t12, the selection signal Mnew = "2"
obtained in the switching control processing at time t11 is maintained.
[0031]
According to this aspect, even if the magnitude relation among the levels of the audio intensity
signals Es-k (k = 1 to 3) changes within the background noise level range, the change is ignored
and the current selection signal Mnew is maintained. This prevents the digital audio signal S-k
serving as the digital audio signal SS from being switched frequently when the level of the
collected sound is low.
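The first aspect reduces to a thresholded maximum search. A minimal Python sketch, with a
hypothetical function name and 0-based indices:

```python
def select_first_aspect(es, th, m_new):
    """Return the new Mnew: strongest Es-k at or above th, else keep the current Mnew."""
    candidates = [k for k, level in enumerate(es) if level >= th]
    if not candidates:
        return m_new                      # FIG. 7B case: nothing reaches the threshold
    return max(candidates, key=lambda k: es[k])
```

With `th = 0.3`, the levels `[0.1, 0.2, 0.05]` leave the selection unchanged, which corresponds
to the situation at time t12 in FIG. 7B.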
[0032]
b. Second Aspect
Also in this aspect, as in the first aspect, only the audio intensity signals Es-k (k = 1 to 3)
at a level equal to or higher than the threshold th are used as material for judgment in the
switching control processing. Further, in this aspect, for a digital audio signal S-k to be
selected as the digital audio signal SS in the switching control processing, it is not
sufficient that the level of the corresponding audio intensity signal Es-k is the largest among
the audio intensity signals Es-k (k = 1 to 3). The level of the corresponding audio intensity
signal Es-k must also exceed the level of the audio intensity signal that was the maximum in the
previous switching control processing.
[0033]
FIG. 8 shows an example of execution of the switching control processing in this aspect. In this
example, in the switching control processing at time t22, the level of the audio intensity
signal Es-2 is the maximum and is equal to or higher than the threshold th. Further, the level
of the audio intensity signal Es-2 is larger, by a positive value iVGC, than the level of the
audio intensity signal Es-1, which was the maximum in the previous switching control processing
(the switching control processing at time t21). Therefore, in the switching control processing
at time t22, the selection signal Mnew is set to "2", and the digital audio signal S-2 is
selected as the digital audio signal SS.
[0034]
Although not illustrated, if the level of the audio intensity signal Es-2, which is the maximum
in the switching control processing at time t22, does not exceed the level of the audio
intensity signal Es-1 at the time of the switching control processing at time t21, the digital
audio signal S-2 is not selected as the digital audio signal SS.
[0035]
According to this aspect, the selection signal Mnew is switched only when a clear change occurs
in the magnitude relationship among the audio intensity signals Es-k (k = 1 to 3), so frequent
switching of the digital audio signal S-k serving as the digital audio signal SS can be
prevented.
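A sketch of the second aspect in Python follows. The function name is hypothetical, indices are
0-based, and returning the current maximum level so the caller can pass it in as `prev_max` at
the next pulse (whether or not it reached the threshold) is an assumption of this example.

```python
def select_second_aspect(es, th, m_new, prev_max):
    """Return (new Mnew, current maximum level).

    The strongest Es-k is adopted only if it is at or above th and exceeds
    the level that was the maximum in the previous switching control processing.
    """
    k_max = max(range(len(es)), key=lambda k: es[k])
    level = es[k_max]
    if level >= th and level > prev_max:
        m_new = k_max
    return m_new, level
```

The difference `level - prev_max` corresponds to the value iVGC in FIG. 8; the switch is made
only when it is positive.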
[0036]
c. Third Aspect
This aspect further enhances the stability of the selection signal Mnew relative to the second
aspect. Also in this aspect, as in the first and second aspects, only the audio intensity
signals Es-k (k = 1 to 3) having a level equal to or higher than the threshold th are used as
material for judgment in the switching control processing. Further, in this aspect, for the
digital audio signal S-k corresponding to a certain audio intensity signal Es-k to be selected
as the digital audio signal SS in the switching control processing, the following conditions
must be satisfied.
Condition 1: The level of the audio intensity signal Es-k is the largest among the audio
intensity signals Es-k (k = 1 to 3).
Condition 2: Let iVGC be the increment of the level of the audio intensity signal Es-k over the
audio intensity signal of the highest level in the previous switching control processing, and
let iVGCR be its increment over the audio intensity signal of the highest level in the switching
control processing before that. Then iVGCR > iVGC.
[0037]
FIG. 9 shows an example of execution of the switching control processing in this aspect. In this
example, in the switching control processing at time t33, the level of the audio intensity
signal Es-2 is the maximum and is equal to or higher than the threshold th. Further, the level
of the audio intensity signal Es-2 at time t33 is larger, by a positive value iVGC, than the
level of the audio intensity signal Es-2, which was the maximum in the previous switching
control processing (the switching control processing at time t32). It is also larger, by a
positive value iVGCR, than the level of the audio intensity signal Es-1, which was the maximum
in the switching control processing before that (the switching control processing at time t31).
Moreover, iVGCR > iVGC.
Therefore, in the switching control processing at time t33, the selection signal Mnew is set to
"2", and the digital audio signal S-2 is selected as the digital audio signal SS.
[0038]
Although not illustrated, even if the level of the audio intensity signal Es-2 is the maximum in
the switching control processing at time t33, the digital audio signal S-2 is not selected as
the digital audio signal SS unless the condition iVGCR > iVGC is satisfied.
[0039]
According to this aspect, a temporary change in the magnitude relationship among the audio
intensity signals Es-k (k = 1 to 3) is ignored, and the corresponding digital audio signal S-k
is selected as the final digital audio signal SS only when a certain audio intensity signal Es-k
is at the maximum level and is clearly recognized to be on an increasing trend.
Therefore, frequent switching of the digital audio signal S-k serving as the digital audio
signal SS can be prevented.
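One possible reading of the third aspect is sketched below in Python. The function name, the
0-based indices, and the requirement that iVGC itself be positive (taken from the FIG. 9
example rather than from Condition 2) are assumptions of this example.

```python
def select_third_aspect(es, th, m_new, prev_max, prev2_max):
    """Return (new Mnew, current maximum level).

    prev_max:  maximum Es level at the previous switching control processing
    prev2_max: maximum Es level at the processing before that
    """
    k_max = max(range(len(es)), key=lambda k: es[k])
    level = es[k_max]
    ivgc = level - prev_max      # increment over the previous maximum
    ivgcr = level - prev2_max    # increment over the maximum two steps back
    if level >= th and ivgc > 0 and ivgcr > ivgc:
        m_new = k_max
    return m_new, level
```

Since `ivgcr > ivgc` holds exactly when `prev_max > prev2_max`, the condition effectively
requires the maximum level to have risen over two consecutive intervals. The caller is expected
to shift the history after each call (`prev2_max = prev_max`, then `prev_max = level`).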
[0040]
(3) Aspects of Output of Digital Audio Signal SS and S/N Ratio Signal
There are various aspects regarding the output of the digital audio signal SS and the S/N ratio
signal at the output unit 60.
[0041]
In one aspect, the output unit 60 outputs a set of an S/N ratio signal and a digital audio
signal SS for each sample, as illustrated in FIG. 10.
In this case, the samples of the S/N ratio signal and of the digital audio signal may be
separate words; alternatively, for example, the output unit 60 may be configured to sequentially
output words having the S/N ratio signal as an upper bit string and the digital audio signal SS
as a lower bit string. According to this aspect, the downstream device that receives the output
signal of the sound collection device can obtain the digital audio signal and the corresponding
S/N ratio signal at any timing.
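The single-word format can be illustrated with a short Python sketch. The bit widths (a 16-bit
signed sample and an 8-bit unsigned S/N code) and the function names are assumptions of this
example; the description does not give a word layout.

```python
def pack_word(snr_code, sample, sample_bits=16, snr_bits=8):
    """Pack an unsigned S/N code (upper bits) and a signed sample (lower bits) into one word."""
    if not 0 <= snr_code < (1 << snr_bits):
        raise ValueError("snr_code out of range")
    lo, hi = -(1 << (sample_bits - 1)), (1 << (sample_bits - 1)) - 1
    if not lo <= sample <= hi:
        raise ValueError("sample out of range")
    # Two's-complement encoding of the sample in the lower bit string.
    return (snr_code << sample_bits) | (sample & ((1 << sample_bits) - 1))

def unpack_word(word, sample_bits=16, snr_bits=8):
    """Inverse of pack_word: return (snr_code, sample)."""
    raw = word & ((1 << sample_bits) - 1)
    sample = raw - (1 << sample_bits) if raw >= (1 << (sample_bits - 1)) else raw
    snr_code = (word >> sample_bits) & ((1 << snr_bits) - 1)
    return snr_code, sample
```

A downstream device can then recover both values from any single word without keeping track of
frame boundaries.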
[0042]
In another aspect, as illustrated in FIG. 11, the output unit 60 divides the digital audio
signal SS into frames consisting of a predetermined number of samples and, for each frame,
outputs a representative S/N ratio signal (for example, an average value) together with the
predetermined number of samples of the digital audio signal SS belonging to the frame. According
to this aspect, the overall amount of data can be reduced.
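The frame-based format can be sketched in Python as follows. The function name is
hypothetical, the average is used as the representative value as in the example given in the
text, and dropping a trailing partial frame is an assumption of this example.

```python
def frame_output(ss, snr, frame_len):
    """Group SS into frames, each paired with the average S/N ratio over that frame."""
    frames = []
    for start in range(0, len(ss) - frame_len + 1, frame_len):
        end = start + frame_len
        representative = sum(snr[start:end]) / frame_len
        frames.append((representative, ss[start:end]))
    return frames
```

Compared with one S/N value per sample, this sends one S/N value per `frame_len` samples.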
[0043]
<Effects of the Embodiment>
As described above, in the present embodiment, even when the position of the sound source
changes, the digital audio signal S-k whose audio component has the maximum intensity is
selected and output as the final digital audio signal SS. Therefore, the digital audio signal
can always be acquired with the maximum sound reception sensitivity regardless of changes in
the position of the sound source. Further, in the present embodiment, when the digital audio
signal to be output as the final digital audio signal SS is switched, the old and new digital
audio signals are cross-faded over a certain time, so no unnatural discontinuity arises in the
output digital audio signal SS.
[0044]
<Other Embodiments>
While one embodiment of the present invention has been described above, other embodiments are
also conceivable. For example, in the above embodiment, the digital audio signal S-k to serve as
the final digital audio signal SS is selected based on the audio intensity signals Es-k (k = 1
to m). Instead, each of the audio intensity signals Es-k (k = 1 to m) may be divided by the
corresponding noise intensity signal En-k (k = 1 to m) to generate S/N ratio signals S/N-k
(k = 1 to m), and the digital audio signal S-k corresponding to the highest S/N ratio signal
S/N-k may be selected as the final digital audio signal SS. According to this aspect, for
example, when noise occurs in a specific direction, the level of the noise intensity signal
generated from the output signal of the microphone that picked up the noise increases, so the
digital audio signal obtained from that microphone can be kept from being selected as the final
digital audio signal SS. Therefore, even when local noise suddenly occurs, sound can be
collected at a high S/N ratio.
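The S/N-based selection of this alternative embodiment can be sketched in Python as follows.
The function name and the `floor` guard against division by zero are assumptions of this
example; indices are 0-based.

```python
def select_by_snr(es, en, floor=1e-12):
    """Return the index k whose per-microphone ratio Es-k / En-k is highest."""
    ratios = [s / max(n, floor) for s, n in zip(es, en)]
    return max(range(len(ratios)), key=lambda k: ratios[k])
```

For example, with `es = [0.5, 0.75, 0.5]` and `en = [0.125, 0.5, 0.25]`, the second microphone
has the strongest audio component but also the most noise, so the first microphone is selected.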
[0045]
FIG. 1 is a block diagram showing the configuration of a sound collection device according to
one embodiment of the present invention.
FIG. 2 is a diagram showing an implementation example of the microphones in the embodiment.
FIG. 3 is a diagram showing another implementation example of the microphones in the embodiment.
FIG. 4 is a block diagram showing a configuration example of the extraction unit in the
embodiment.
FIG. 5 is a block diagram showing another configuration example of the extraction unit in the
embodiment.
FIG. 6 is a time chart showing the operation of the embodiment.
FIG. 7 is a time chart showing the first aspect of the switching control processing in the
embodiment.
FIG. 8 is a time chart showing the second aspect of the switching control processing in the
embodiment.
FIG. 9 is a time chart showing the third aspect of the switching control processing in the
embodiment.
FIG. 10 is a diagram showing one aspect of the output of the S/N ratio signal and the digital
audio signal by the output unit in the embodiment.
FIG. 11 is a diagram showing another aspect of the output of the S/N ratio signal and the
digital audio signal by the output unit in the embodiment.
Explanation of Reference Signs
[0046]
11-k (k = 1 to m) ... microphone, 12-k (k = 1 to m) ... A/D converter, 20-k (k = 1 to m) ...
extraction unit, 30 ... output signal combining unit, 31-k (k = 1 to m) ... multiplier, 32 ...
adder, 33 ... combining control unit, 40 ... switching control unit, 50 ... S/N ratio signal
generation unit, 60 ... output unit.