JPH1118191

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JPH1118191
[0001]
The present invention relates to a sound collection method and apparatus that collect sound by
processing the output signals of a microphone array composed of a plurality of microphones. In
particular, it relates to a sound collection method and apparatus which, when applied to a
communication conference such as a video conference, eliminate the influence of the received
speech radiated from a reception speaker, so that the directivity of the microphone array is
correctly directed toward the target speech and the sound can be collected.
[0002]
2. Description of the Related Art In recent years, with the advancement of multimedia technology,
communication conferences such as video conferences held in the form of hands-free, loudspeaker
conversation using microphones and speakers have become possible. In such cases there is a need
for a sound collection device that allows natural conversation without the participants being
aware of the microphones, without placing one microphone per speaker on the conference desk, and
that collects only the target sound, such as speech.
[0003]
As an example of such a sound collection device, there is a sound collection device that installs a
03-05-2019
1
plurality of microphones (a microphone array) and processes the outputs of the microphones to
extract a target sound. Many signal processing methods are known for suppressing noise and
extracting a target sound with such a microphone array, such as the delay-and-sum method and
AMNOR (for example, Oga, Yamazaki, and Kanada, "Acoustic Systems and Digital Processing", The
Institute of Electronics, Information and Communication Engineers, 1995, pp. 173-197). In the
delay-and-sum method, for example, the target sound is extracted as follows.
[0004]
FIG. 6 is a diagram for explaining the principle of target sound extraction by the delay-and-sum
method. In FIG. 6, 1 is a sound pickup unit (microphone array); 21, 22, ..., 2M are microphones (M
is the number of microphones); 31, 32, ..., 3M are delay devices; 4 is an adder; 5 is an output
signal; 6 is a noise suppression unit; d is the microphone interval; s(t) is a sound wave (t
represents time) arriving at the sound pickup unit 1; θ is the angle at which the sound wave s(t)
arrives at the sound pickup unit 1; and τi is the time difference (delay time) with which the
sound wave reaches each microphone.
[0005]
It is assumed that the microphones 21, 22, ..., 2M in FIG. 6 are arranged on a straight line at
equal intervals d, and that the sound wave s(t) arrives from a distant source at an angle θ to the
linearly arranged microphones. The distance that the sound wave travels after reaching the
microphone 21 until it reaches the microphone 22 is then d sin θ, from the microphone interval d
and the arrival angle θ. Similarly, the distance travelled to the i-th microphone 2i (i = 2, ...,
M) is (i-1) d sin θ. Therefore the delay time τi until the wave reaches the microphone 2i (i = 2,
..., M), taking the microphone 21 as the reference, is obtained by dividing this propagation
distance by the speed of sound c, as in the following equation (1).
$$\tau_i = \frac{(i-1)\, d \sin\theta}{c} \qquad (1)$$
[0006]
Here, the output signal Xi(t) of each microphone 2i (i = 1, ..., M) is the sound wave s(t)
delayed by τi, as in the following equation (2).
$$X_i(t) = s(t - \tau_i) \qquad (2)$$
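As a small illustration of the delay described above (the propagation distance (i-1) d sin θ divided by the speed of sound c), the following Python sketch computes τi for each microphone. The function name, the 343 m/s speed of sound, and the example geometry are assumptions made for illustration and do not come from the patent.

```python
import math

def arrival_delays(num_mics, spacing_m, theta_rad, speed_of_sound=343.0):
    """Delay tau_i (seconds) at microphones 1..M relative to microphone 1.

    Implements tau_i = (i - 1) * d * sin(theta) / c; list index 0 is microphone 1.
    """
    return [k * spacing_m * math.sin(theta_rad) / speed_of_sound
            for k in range(num_mics)]

# Hypothetical values: 4 microphones 5 cm apart, plane wave arriving at 30 degrees.
taus = arrival_delays(4, 0.05, math.radians(30.0))
for number, tau in enumerate(taus, start=1):
    print(f"microphone {number}: {tau * 1e6:.1f} microseconds")
```

The delay grows linearly with the microphone index, which is what allows a single angle θ to determine every τi.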
[0007]
Here, it is shown below that, if the delay amount Di of each delay device 3i (i = 1, 2, ..., M) is
set appropriately, only the sound wave arriving from the θ direction is emphasized in the output
signal 5.
[0008]
The delay amount Di of the delay device 3i (i = 1, 2, ..., M) is set as in the following equation (3).
$$D_i = D_0 - \tau_i \qquad (3)$$
[0009]
D0 is a fixed delay amount that is added to prevent a loss of accuracy when the delay
characteristic is realized with a digital filter and the value of τi is too small.
[0010]
At this time, the output of each delay device 3i (i = 1, 2, ..., M) is the signal of equation (2)
delayed by the amount Di of equation (3), which gives the following equation (4).
$$X_i(t - D_i) = s(t - D_i - \tau_i) = s(t - D_0) \qquad (4)$$
[0011]
That is, regardless of the microphone number i, every delayed output is the same signal: s(t) delayed by D0.
[0012]
When the signals are added by the adder 4 after their phases have been aligned in this way, the
sound wave arriving from the θ direction is emphasized by the number of signals added.
On the other hand, sound waves arriving from a direction θN different from the θ direction are
received with delay times τN different from τi, so the delay amounts of equation (3) do not align
their phases, and they are not emphasized even when the adder 4 adds the signals.
[0013]
Thus, in the delay-sum method, the sound wave coming from the target direction θ is
emphasized, and the noise coming from the other direction θN is relatively suppressed.
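The emphasis and suppression described above can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: it assumes whole-sample delays, a sinusoidal test signal, and steering delays Di = D0 - τi, which is the setting that makes every channel equal s(t - D0). All names and numeric values are hypothetical.

```python
import math

def delayed(signal, delay):
    """Delay a sequence by `delay` whole samples, zero-padding the start."""
    return [0.0] * delay + signal[:len(signal) - delay]

def delay_and_sum(mic_signals, steering_delays):
    """Apply delay D_i to microphone i, then add the channels (the adder)."""
    aligned = [delayed(x, d) for x, d in zip(mic_signals, steering_delays)]
    return [sum(samples) for samples in zip(*aligned)]

def power(signal):
    """Mean squared value of a sequence."""
    return sum(v * v for v in signal) / len(signal)

M, N = 4, 400
source = [math.sin(2 * math.pi * 0.05 * n) for n in range(N)]  # period: 20 samples

def mic_outputs(step):
    """Microphone i receives the source delayed by i * step samples."""
    return [delayed(source, i * step) for i in range(M)]

tau_target = [i * 2 for i in range(M)]    # tau_i in samples for the target direction
D0 = max(tau_target)                      # fixed delay keeps every D_i non-negative
steering = [D0 - t for t in tau_target]   # D_i = D0 - tau_i

on_target = delay_and_sum(mic_outputs(2), steering)    # wave from the target direction
off_target = delay_and_sum(mic_outputs(5), steering)   # wave from another direction
print(power(on_target) > power(off_target))
```

With the steering delays matched to the target direction, all four channels line up and add coherently; for the other direction the channels stay out of phase and the summed power is lower.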
[0014]
At this time, if the target direction θ is scanned and the output signal of the microphone array 1
is monitored, the output signal becomes larger when θ is directed to the target speaker, so that
the direction of the target speaker can be found.
Then, by aligning the phases and adding the signals according to equation (4) so as to emphasize
the sound wave from the direction θ of the target speaker, that is, by aiming the directivity of
the microphone array 1 in the direction θ, the sound can be picked up with a high SN ratio.
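The direction search described in this paragraph can be sketched as follows. The sketch assumes a narrowband (single-frequency) plane-wave model, so each channel reduces to a complex phasor; that model, the array geometry, the 2 kHz test frequency, and the function names are all assumptions made for illustration.

```python
import cmath
import math

def steered_power(theta_scan, theta_src, num_mics=8, spacing_m=0.05,
                  freq_hz=2000.0, speed_of_sound=343.0):
    """Normalized output power of a delay-and-sum array steered to theta_scan
    when a single-frequency plane wave arrives from theta_src (radians)."""
    omega = 2.0 * math.pi * freq_hz
    total = 0j
    for i in range(num_mics):
        tau_src = i * spacing_m * math.sin(theta_src) / speed_of_sound
        tau_scan = i * spacing_m * math.sin(theta_scan) / speed_of_sound
        # Residual phase left on channel i after the steering delay is applied.
        total += cmath.exp(-1j * omega * (tau_src - tau_scan))
    return abs(total) ** 2 / num_mics ** 2

def find_direction(theta_src, scan_degrees):
    """Scan the candidate angles and return the one with the largest output power."""
    return max(scan_degrees,
               key=lambda deg: steered_power(math.radians(deg), theta_src))

scan_grid = range(-90, 91, 5)            # hypothetical 5-degree scan step
estimate = find_direction(math.radians(20.0), scan_grid)
print(estimate)
```

The microphone spacing in this example is below half a wavelength at the test frequency, so the scan has a single maximum; wider spacing would introduce additional lobes that a practical search has to handle.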
[0015]
Here, for convenience of explanation, the microphones have been described as arranged on a
straight line at equal intervals d, but they may be arranged at irregular intervals, and the
arrangement may also be two-dimensional or three-dimensional.
[0016]
Also, as shown in FIG. 7, when the point sound source S is located relatively close to the
microphone array 1, gains 71, 72, ..., 7M are provided in the stage following the delay devices
31, 32, ..., 3M, and applying appropriate weights to these gains is important for improving the
sound collection SN ratio.
One way of assigning the weights is expressed by the following formulas (5), (6) and (7) (Nomura,
Kanada, Kojima, "Near Field Microphone Array", Journal of the Acoustical Society of Japan, Vol.
53, No. 2 (1997), pp. 110-116).
[0017]
Here, r1, r2, ..., rM are the distances from the sound source S to the respective microphones 21,
22, ..., 2M, and rC is the critical distance of the room, that is, the distance at which the
direct sound power and the reverberant sound power of the sound source become equal. It is given
by rC = √(0.0032 V / T) for a room volume V [m3] and a room reverberation time T [seconds] (H.
Kuttruff, “Room Acoustics (Third Edition)”, Elsevier Applied Science, pp. 100-132 (1991)).
At this time, the microphone array 1 is most sensitive to the “point” at the position of the
sound source S; so to speak, a “focus” of sensitivity is formed.
If the delays D0 - ri/c (c: speed of sound) of the delay devices 31, 32, ..., 3M, which depend on
the distances ri (i = 1, 2, ..., M) to the respective microphones, and the gains are changed so
as to scan the sensitivity focus while the array output is monitored, the array output becomes
larger when the focus is directed to the point where the target speaker is, so the position of
the target speaker can be found.
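Two quantities from this paragraph lend themselves to a short worked sketch in Python: the critical distance rC = √(0.0032 V / T) and the focusing delays D0 - ri/c. The gain weights of formulas (5) to (7) are not reproduced here. The choice of D0 as the largest travel time, the two-dimensional geometry, and all names and numbers are assumptions made for illustration.

```python
import math

def critical_distance(volume_m3, reverb_time_s):
    """Critical distance r_C = sqrt(0.0032 * V / T) in metres, as quoted in the text."""
    return math.sqrt(0.0032 * volume_m3 / reverb_time_s)

def focus_delays(mic_positions, focus_point, speed_of_sound=343.0):
    """Return (delays, travel_times), where delays[i] = D_0 - r_i / c in seconds.

    D_0 is chosen here as the largest travel time so that no delay is negative;
    that choice is an assumption of this sketch.
    """
    travel_times = [
        math.hypot(mx - focus_point[0], my - focus_point[1]) / speed_of_sound
        for mx, my in mic_positions
    ]
    fixed_delay = max(travel_times)
    return [fixed_delay - t for t in travel_times], travel_times

# Hypothetical geometry: 4 microphones on the x axis, focus 1 m in front of the array.
mics = [(-0.15, 0.0), (-0.05, 0.0), (0.05, 0.0), (0.15, 0.0)]
delays, travel = focus_delays(mics, (0.0, 1.0))
r_c = critical_distance(100.0, 0.5)      # hypothetical room: 100 m^3, T = 0.5 s
print(round(r_c, 3), [round(d * 1e6, 1) for d in delays])
```

Travel time plus delay is the same for every microphone, so a wave from the focus point adds in phase. Scanning the focus amounts to recomputing these delays for each candidate point.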
[0018]
Thus, the target sound can be picked up with a high sound collection SN ratio by finding the
existence area of the target speaker as the direction or position and directing the directivity of
the array to the existence area.
[0019]
An attempt is made to apply this microphone array 1 to a communication conference such as a video
conference.
Using the microphone array 1 as the sound pickup unit of a communication conference has several
advantages: because the microphone array 1 can pick up sound with a high SN ratio from a position
away from the speaker, there is no need to place multiple microphones on the desk, the
participants need not be aware of the microphones, and natural communication becomes possible.
[0020]
An example of a communication conference apparatus in which the microphone array 1 is used as the
sound pickup unit is shown in FIG. 8.
In this figure, 10A and 10B represent communication conference rooms, 11A and 11B represent
microphone arrays, 12A and 12B represent microphone array main devices, 13 represents a
communication line, and 14A and 14B represent reception speakers.
The target voice uttered in the conference room 10A is picked up by the microphone array 11A and,
after the microphone array main device 12A has processed it to emphasize the target voice, is
transmitted through the communication line 13 to the communication conference room 10B, which is
the communication destination, and emitted from the speaker 14B as a received voice.
The flow of signals for the target voice uttered in the communication conference room 10B is the
same as described above. As described above, the microphone array main devices 12A and 12B scan
the directivity of the microphone arrays 11A and 11B to find the presence area of the target
speaker, and direct the directivity of the microphone arrays 11A and 11B to that area so as to
pick up the target voice at a high SN ratio.
[0021]
As described above, the microphone arrays 11A and 11B have been used to detect the presence area
of the target speaker, direct their directivity to that area, and collect the target sound at a
high SN ratio. However, it has been found that when the received voice from the communication
destination is radiated from the reception speaker 14A or 14B, the reception speaker 14A or 14B is
often erroneously detected as the target speaker, and the directivity of the microphone array 11A
or 11B is turned toward the reception speaker 14A or 14B. It has also been found that, in this
case, the sound radiated from the reception speaker 14A or 14B is picked up by the microphone
array 11A or 11B and returned to the communication conference room 10A or 10B where the talker is
present, where it is perceived as an echo or causes howling, so that call quality may deteriorate.
[0022]
SUMMARY OF THE INVENTION In order to solve the above-mentioned problems, the present invention
has the following constitution. First, a directivity control unit is provided to prevent the
directivity of the microphone array from being directed to the position of the reception speaker.
However, this alone is insufficient for practical use, because first reflections of the sound
emitted from the reception speaker are generated by the floor and walls in the vicinity of the
reception speaker. Since a first reflection generally has high energy, the microphone array may
make an erroneous detection under the influence of the reflected sound. To prevent this, the
directivity control unit prevents the directivity of the microphone array from being directed not
only to the position of the reception speaker but to the area near the reception speaker,
including the reception speaker itself. The area near the reception speaker is an area with a
radius of about 0.5 to 2 m centered on the reception speaker; the actual radius is decided
depending on conditions such as the sound collection application, the degree of reflection of the
room used, and noise. It is desirable, however, that the area be as large as possible without
overlapping the target speaker's presence area.
[0023]
The most basic method or means of directivity control is to exclude the region Fsp in which the
reception speaker is present, or the region Fn near the reception speaker, from the directivity
scanning range used for target speaker detection. When the sound pressure of a specific area in
the room other than the area near the reception speaker is raised by room reflections of the
received sound radiated from the reception speaker, by air conditioning, or by noise coming
through a window or wall of the room, that specific area is set so as to be excluded from the
directivity scanning range together with the reception speaker or the area near the reception
speaker.
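The exclusion of regions from the scanning range can be pictured with a small geometric sketch in Python. It assumes that scan positions are points on a two-dimensional grid, that the area near the reception speaker is a circle around it, and that additional specific areas are listed explicitly; the room size, grid step, radius, and function names are hypothetical.

```python
import math

def in_exclusion_zone(point, speaker_pos, radius_m):
    """True if the point lies within radius_m of the reception speaker."""
    return math.hypot(point[0] - speaker_pos[0],
                      point[1] - speaker_pos[1]) <= radius_m

def allowed_scan_points(grid, speaker_pos, radius_m, extra_excluded=()):
    """Scan points left after removing the near-speaker area and any
    additionally designated specific points."""
    blocked = set(extra_excluded)
    return [p for p in grid
            if p not in blocked
            and not in_exclusion_zone(p, speaker_pos, radius_m)]

# Hypothetical 4 m x 3 m room sampled every 0.5 m, reception speaker in one corner.
grid = [(x * 0.5, y * 0.5) for x in range(9) for y in range(7)]
speaker = (0.0, 0.0)
scan_points = allowed_scan_points(grid, speaker, radius_m=1.0,
                                  extra_excluded=[(4.0, 3.0)])  # e.g. a noisy window
print(len(grid), len(scan_points))
```

The radius of 1.0 m is one value inside the 0.5 to 2 m range mentioned in the text; in practice it would be tuned to the room so that it does not overlap the area where talkers sit.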
[0024]
Also, the following means can be applied for the same purpose. When the presence region of the
target speaker is detected by directing the directivity to the region where the average power of
the microphone array output is high, a region-specific power calculation unit, which calculates
the power for each region as the directivity is scanned, is provided so as to calculate the power
for each region excluding the reception speaker, the area near the reception speaker, or the
designated specific area, and a high-power region can be detected as the sound source region from
the power calculated by the region-specific power calculation unit. In this way, it is possible
to avoid the problem that the reception speaker, the area near the reception speaker, or the
designated specific area is erroneously detected as the target speaker's presence area.
[0025]
Furthermore, the following means can also be applied for the same purpose. That is, from the
output of the region-specific power calculation unit, which calculates the power for each region
when the directivity is scanned, a region of high power may be detected as the sound source
region from among the per-region powers excluding those of the reception speaker, the area near
the reception speaker, or the designated specific area.
[0026]
As described above, the method of preventing the directivity of the microphone array from being
directed to the reception speaker, the area near the reception speaker, or the set specific area
can also be applied to in-room sound reinforcement, in which a talker's speech is collected and
amplified through a loudspeaker in the same room as the talker. For example, when a listener asks
the lecturer a question at a lecture in a relatively large venue, the question is amplified into
the venue by a loudspeaker or the like so that its content is easy to hear. The above-mentioned
microphone array can be used to direct directivity toward the audience when collecting their
questions, but if the energy of the sound wave emitted from the sound-reinforcement loudspeaker is
large, the microphone array may direct its directivity toward the loudspeaker as well as toward
the listener asking the question. In this case as well, the directivity can be controlled by
excluding the loudspeaker, the region near the loudspeaker, or the set specific regions.
[0027]
BEST MODE FOR CARRYING OUT THE INVENTION The sound pickup method and apparatus according to the
present invention prevent the directivity of the microphone array from being directed to the
reception speaker, a region near the reception speaker, or a set specific region. They are
thereby configured to prevent the influence of the received voice radiated from the reception
speaker and to prevent the directivity of the microphone array from being erroneously directed to
the reception speaker, the region near the reception speaker, or the designated specific region,
where the target voice is not present.
[0028]
Embodiments of the present invention will be described hereinbelow with reference to the
drawings.
[0029]
FIG. 1 is a block diagram showing the configuration of the first embodiment of the present
invention.
In this figure, reference numeral 20 denotes a communication conference apparatus, which consists
of a microphone array 21, a microphone array main device 22, a transmission line 23-2 serving as
transmission means, a reception line 23-1 serving as reception means, a reception speaker 24, and
a directivity control unit 30.
[0030]
To explain the operation, the signal picked up by the microphone array 21 is processed by the
microphone array main device 22, the directivity of the microphone array 21 is directed to the
presence area of the target speaker so that the target voice is picked up at a high SN ratio, and
the target voice is transmitted to the communication destination through the transmission line
23-2.
A reception signal received from the communication destination through the reception line 23-1
is emitted as a received voice by the reception speaker 24. At this time, the directivity control
unit 30 controls the signal processing of the microphone array main device 22, and thus the
directivity of the microphone array 21, so that the directivity is not erroneously directed to
the reception speaker 24, where no target speaker is speaking, to the vicinity of the reception
speaker 24, or to the designated specific region.
[0031]
FIG. 2 shows a second embodiment of the present invention. In the figure, 31 is a directivity
scanning unit, 32 is a sound source presence area detection unit, and 33 is a sound source
presence area detection limiting unit; together these form the directivity control unit 30. The
same reference numerals as in FIG. 1 denote the same components.
[0032]
Next, the operation will be described. The sound source presence area detection unit 32 detects
the target speaker's area from the output signal of the directivity scanning unit 31, which scans
the directivity of the microphone array 21. The sound source presence area detection limiting
unit 33 excludes the reception speaker 24, the area near the reception speaker, or the designated
specific area from the detection of the sound source presence area.
[0033]
FIG. 3 shows a third embodiment of the present invention. In this embodiment, the sound source
presence area detection unit 32 of the second embodiment of FIG. 2 is composed of a
region-specific power calculation unit 321 and a region-specific maximum power region detection
unit 322; the rest is the same as FIG. 2.
[0034]
Next, the operation will be described. The power of the output signal of the directivity scanning
unit 31 is calculated for each region by the region-specific power calculation unit 321, and the
region where this calculated power is maximum is detected by the region-specific maximum power
region detection unit 322. In this way the sound source region is detected.
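The flow of this embodiment (power per region, then detection of the maximum with loudspeaker-related regions excluded) can be sketched in Python as follows. The mapping of code steps to units 321, 33 and 322 is an interpretation for illustration, and the region names and sample values are hypothetical.

```python
def mean_power(samples):
    """Average power of one scanned-region output signal."""
    return sum(v * v for v in samples) / len(samples)

def detect_source_region(outputs_by_region, excluded_regions):
    """Pick the highest-power region that the limiting step still allows."""
    # Role of the region-specific power calculation (321): power for every region.
    powers = {region: mean_power(sig) for region, sig in outputs_by_region.items()}
    # Role of the detection limiting unit (33): drop loudspeaker-related regions.
    candidates = {r: p for r, p in powers.items() if r not in excluded_regions}
    # Role of the maximum power region detection (322).
    return max(candidates, key=candidates.get)

# Hypothetical scan outputs; the region names are illustrative only.
outputs = {
    "loudspeaker": [0.9, -0.8, 0.9, -0.9],
    "seat_A": [0.5, -0.4, 0.5, -0.5],
    "seat_B": [0.1, -0.1, 0.1, 0.0],
}
detected = detect_source_region(outputs, excluded_regions={"loudspeaker"})
unprotected = detect_source_region(outputs, excluded_regions=set())
print(detected, unprotected)
```

Without the exclusion, the loudest region is the loudspeaker itself, which is the misdetection the patent sets out to avoid.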
[0035]
FIG. 4 shows a fourth embodiment of the present invention. In this embodiment, the sound source
presence area detection unit 32 is composed of a directivity scan limited output power
calculation unit 323, which calculates the power of the output signal of the directivity scanning
unit 31 for each area except the reception speaker 24, the area near the reception speaker, or
the set specific area, and a directivity scan limited output power maximum area detection unit
324, which detects the area where the power calculated by the directivity scan limited output
power calculation unit 323 is maximum. The other components are the same as in the figures
already described.
[0036]
FIG. 5 shows a fifth embodiment of the present invention. In this embodiment the directivity
control unit 30 is composed of the directivity scanning unit 31, which scans the directivity of
the microphone array 21; a scan limiting unit 34, which prohibits the scanning of directivity
toward the reception speaker 24, the region near the reception speaker, or a designated region; a
directivity scan limited output power calculation unit 35, which calculates the power of the
output signal of the directivity scanning unit 31; and a directivity scan limited output power
maximum region detection unit 36, which detects the region where the power calculated by the
directivity scan limited output power calculation unit 35 is maximum.
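The distinguishing point of this embodiment is that prohibited regions are never scanned at all, so neither beamforming nor power calculation is spent on them. The Python sketch below illustrates that ordering. The `beamform` callback, the canned outputs, and the region names are hypothetical stand-ins, and the mapping of steps to units 31, 34, 35 and 36 is an interpretation.

```python
def limited_scan(regions, prohibited, beamform):
    """Scan only permitted regions and return the one with maximum output power."""
    best_region, best_power = None, float("-inf")
    for region in regions:
        if region in prohibited:      # scan limiting (unit 34): never steer here
            continue
        output = beamform(region)     # directivity scanning (unit 31)
        region_power = sum(v * v for v in output) / len(output)  # power (unit 35)
        if region_power > best_power:  # maximum region detection (unit 36)
            best_region, best_power = region, region_power
    return best_region

# Hypothetical stand-in for a real beamformer: canned outputs plus a call log.
canned = {"loudspeaker": [0.9, -0.9], "near_loudspeaker": [0.7, -0.7],
          "seat_A": [0.2, -0.3], "seat_B": [0.5, -0.4]}
steered = []

def fake_beamform(region):
    steered.append(region)
    return canned[region]

target = limited_scan(list(canned), {"loudspeaker", "near_loudspeaker"}, fake_beamform)
print(target, steered)
```

The call log shows that the beamformer ran only for the two permitted regions, which is where the reduction in computation comes from.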
[0037]
As described above, in the sound collection method of the present invention, sound is collected
using a microphone array consisting of a plurality of microphones and a microphone array main
device that processes the output signals of the microphone array. In this method, the reception
signal from the communication destination is received and emitted as a received sound wave from
the reception speaker, and the directivity of the microphone array is controlled so that it is
prohibited from turning toward the reception speaker or the region near the reception speaker.
The directivity of the microphone array is therefore not directed to the reception speaker, its
vicinity, or a designated specific area.
[0038]
Further, according to the present invention, the sound collection apparatus is provided with a
directivity control unit that prevents the directivity of the microphone array from being
directed to the reception speaker, a region near the reception speaker, or a designated specific
region. This prevents the influence of the received voice radiated from the reception speaker and
prevents the directivity of the microphone array from being erroneously directed to the reception
speaker, the region near it, or the designated specific region, where there is no target voice.
[0039]
Further, since the directivity control unit is composed of a directivity scanning unit that scans
the directivity of the microphone array, a sound source presence area detection unit that detects
the presence area of the target sound source from the output signal of the directivity scanning
unit, and a sound source presence area detection limiting unit that prohibits directing the
directivity of the microphone array to the reception speaker, the vicinity of the reception
speaker, or a designated specific area, the directivity of the microphone array can be
effectively prevented from being erroneously directed to the reception speaker, its vicinity, or
the designated specific area.
[0040]
In addition, since the sound source presence area detection unit is composed of a directivity
scan limited output power calculation unit that calculates the power of the output signal of the
directivity scanning unit and a directivity scan limited output power maximum area detection unit
that detects the directivity scan area where the calculated power is maximum, the sound source
area can be detected from the power, and reliable operation is ensured.
[0041]
In addition, the directivity control unit is composed of a directivity scanning unit that scans
the directivity of the microphone array, a directivity scan limited output power calculation unit
that calculates the power of the output signal of the directivity scanning unit except for the
reception speaker, the region near the reception speaker, or a designated specific region, and a
directivity scan limited output power maximum region detection unit that detects the region where
the power calculated by the directivity scan limited output power calculation unit is maximum.
Since the designated regions are excluded at the power calculation stage, the amount of
calculation can be reduced.
[0042]
In addition, the directivity control unit is composed of a directivity scanning unit that scans
the directivity of the microphone array, a scan limiting unit that prohibits scanning toward the
reception speaker, the region near the reception speaker, or a designated specific region, a
directivity scan limited output power calculation unit that calculates the power of the output
signal of the directivity scanning unit, and a directivity scan limited output power maximum
region detection unit that detects the region where the power calculated by the directivity scan
limited output power calculation unit is maximum. Since the designated regions are already
excluded at the directivity scanning stage, the amount of calculation can be further reduced.
[0043]
As described above, according to the present invention, the sound radiated from the reception
speaker can be prevented from being collected by the microphone array. This prevents the voice
uttered by a talker from passing through the line and returning as an echo to the room where the
talker is, and prevents howling, with the excellent effect of preventing the deterioration in
speech quality caused by such echoes and howling.
[0044]
Brief description of the drawings
[0045]
FIG. 1 is a block diagram showing the configuration of a first embodiment of the sound collection
device of the present invention.
[0046]
FIG. 2 is a block diagram showing the configuration of a second embodiment of the sound collection
device of the present invention.
[0047]
FIG. 3 is a block diagram showing the configuration of a third embodiment of the sound collection
device of the present invention.
[0048]
FIG. 4 is a diagram showing the configuration of a fourth embodiment of the sound collection
device of the present invention.
[0049]
FIG. 5 is a diagram showing the configuration of a fifth embodiment of the sound collection
device of the present invention.
[0050]
FIG. 6 is a diagram for explaining the principle of noise suppression and sound collection by the
conventional delay-and-sum method.
[0051]
FIG. 7 is a view for explaining that the load of the gain at the rear stage of the delay unit is
appropriately set to improve the sound collection SN ratio when the sound source is located at a
position close to the microphone array.
[0052]
FIG. 8 is a block diagram for explaining a communication conference using a conventional
microphone array.
[0053]
Explanation of reference signs
[0054]
Reference Signs List
20 communication conference apparatus
21 microphone array
22 microphone array main apparatus
23-1 reception line
23-2 transmission line
24 reception speaker
30 directivity control unit
31 directivity scanning unit
32 sound source area detection unit
321 area-specific power calculation unit
322 area-specific maximum power area detection unit
323 directivity scan limited output power calculation unit
324 directivity scan limited output power maximum area detection unit
33 sound source existing area detection limiting unit
34 scan limiting unit
35 directivity scan limited output power calculation unit
36 directivity scan limited output power maximum area detection unit