Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2009302984
[PROBLEMS] To make it possible, in voice communication with echo cancellation, to send the voice
of a speaker at any position with the same sound quality as that of a speaker talking into a
microphone. According to the present invention, there is provided a voice communication apparatus
having a signal processing unit 13 that performs processing for inputting voice captured by a
microphone and sending it to the other party, and processing for inputting voice sent from the
other party and outputting it from a speaker. The signal processing unit 13 has an adaptive
processing unit 131 that removes the component of the voice sent from the other party from the
component of the voice captured by the microphone, and an echo cancellation output signal
frequency characteristic changing unit 132 that changes the frequency characteristic of the voice
processed by the adaptive processing unit 131. [Selected figure] Figure 2
Voice communication apparatus and voice communication method
[0001]
The present invention relates to a voice communication apparatus and a voice communication
method for transmitting voice captured by a microphone to the other party and outputting voice
sent from the other party via a speaker.
[0002]
In recent years, a video conference system via a network has been widely used (see, for example,
Patent Documents 1 and 2).
08-05-2019
1
In a loudspeaker (hands-free) communication system used for such a video conference system, the
sound collected by the microphone of the far-end device is sent to the near-end device and
emitted from the speaker of the near-end device. The near-end device is likewise equipped with a
microphone and configured to send the voice of the near-end speaker to the far-end device. For
this reason, at both the far end and the near end, the sound emitted from the speaker is picked
up by the microphone.
[0003]
In such a loudspeaker communication system, if no processing is applied to the sound emitted from
the speaker and the sound acquired by the microphone, the sound emitted from the speaker is
picked up by the microphone and sent back to the other party. This causes a phenomenon called
"echo", in which the talker's own voice is heard from the speaker with a slight delay, as if it
were echoing. When the echo becomes large, it is picked up by the microphone again and circulates
through the system, causing howling.
[0004]
An echo canceller is known as a device for preventing such echo and howling. In general, an
adaptive filter is used to learn the impulse response between the speaker and the microphone, and
a pseudo echo simulating the speaker sound that reaches the microphone is generated by convolving
this impulse response with the reference signal emitted from the speaker. This pseudo echo is
then subtracted from the voice captured by the microphone to remove the echo.
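The learn, convolve and subtract cycle described above can be sketched as follows. This is a minimal illustration, not the method of the cited documents: it assumes a single channel and an NLMS coefficient update, and the names `nlms_echo_cancel`, `taps`, `mu` and `eps` are hypothetical.

```python
import random


def nlms_echo_cancel(reference, mic, taps=8, mu=0.5, eps=1e-8):
    """Return the microphone signal with the estimated (pseudo) echo subtracted.

    reference: samples emitted from the speaker
    mic:       samples captured by the microphone (same length)
    """
    weights = [0.0] * taps   # estimated speaker-to-microphone impulse response
    history = [0.0] * taps   # most recent reference samples, newest first
    output = []
    for x, d in zip(reference, mic):
        history = [x] + history[:-1]
        # Pseudo echo: convolution of the estimated response with the reference.
        pseudo_echo = sum(w * h for w, h in zip(weights, history))
        error = d - pseudo_echo          # echo-cancelled output sample
        # NLMS update (an assumed rule; the source does not name an algorithm).
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        output.append(error)
    return output


if __name__ == "__main__":
    rng = random.Random(0)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
    # Hypothetical two-tap echo path; there is no near-end talker in this demo.
    echo = [0.6 * far_end[n] + (0.3 * far_end[n - 1] if n else 0.0)
            for n in range(len(far_end))]
    residual = nlms_echo_cancel(far_end, echo)
    print("largest residual in last 100 samples:",
          max(abs(e) for e in residual[-100:]))
```

In the demo the microphone hears only the echo, so the residual shrinks as the filter converges. With a near-end talker present, the residual would be that talker's voice.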
[0005]
JP 2004-208051 A
JP 2007-281668 A
[0006]
In a system with loudspeaker voice communication, such as a video conference system, the voice
collected by the microphone is usually echo-cancelled and then sent to the other party either as
it is or after fixed equalization.
For this reason, the high-frequency range of the sound often drops, for example for the voice of
a speaker who is not facing the microphone or a voice uttered at a position away from the
microphone.
[0007]
An object of the present invention is to make it possible, in voice communication with echo
cancellation, to send the voice of a speaker at any position with the same sound quality as that
of a speaker talking into the microphone.
[0008]
The present invention is a voice communication apparatus having a signal processing unit that
performs processing for inputting voice captured by a microphone and sending it to the other
party, and processing for inputting voice sent from the other party and outputting it from a
speaker, wherein the signal processing unit has an adaptive processing unit that removes the
component of the voice sent from the other party from the component of the voice captured by the
microphone, and a transmission voice characteristic changing unit that changes the frequency
characteristic of the voice processed by the adaptive processing unit.
[0009]
In the present invention as described above, the frequency characteristic is changed for the
voice that has already undergone the adaptive processing between the component of the voice
captured by the microphone and the component of the voice sent from the other party. It therefore
becomes possible to send voice with the optimum frequency characteristic to the other party
without affecting the adaptive processing.
[0010]
Further, the present invention is a voice communication apparatus having a signal processing
unit that performs processing for inputting voice captured by a microphone and sending it to the
other party, and processing for inputting voice sent from the other party and outputting it from
a speaker, wherein the signal processing unit has an adaptive processing unit that removes the
component of the voice sent from the other party from the component of the voice captured by the
microphone, and a received voice characteristic changing unit that changes the frequency
characteristic of the voice sent from the other party before it is input to the adaptive
processing unit.
[0011]
In the present invention as described above, when performing adaptive processing between the
component of the voice captured by the microphone and the component of the voice sent from
the other party, the frequency characteristic of the voice sent from the other party is changed
before performing the adaptive processing. As a result, it becomes possible to send the voice
after the same characteristic change to the adaptive processing unit and the speaker.
[0012]
The present invention is also a voice communication method including the steps of: performing,
when transmitting the voice captured by the microphone to the other party, adaptive processing
that subtracts the component of the voice sent from the other party from the component of the
voice to be transmitted; and changing the frequency characteristic of the voice after the
adaptive processing has been performed.
[0013]
In the present invention as described above, the frequency characteristic is changed for the
voice that has already undergone the adaptive processing between the component of the voice
captured by the microphone and the component of the voice sent from the other party. It therefore
becomes possible to send voice with the optimum frequency characteristic to the other party
without affecting the adaptive processing.
[0014]
The present invention is also a voice communication method including the steps of: changing the
frequency characteristic of the voice sent from the other party; and performing adaptive
processing that subtracts the voice component after the change of the frequency characteristic
from the component of the voice captured by the microphone.
[0015]
In the present invention as described above, when performing adaptive processing between the
component of the voice captured by the microphone and the component of the voice sent from
the other party, the frequency characteristic of the voice sent from the other party is changed
before performing the adaptive processing. As a result, it becomes possible to output from the
speaker the sound after the same characteristic change as that used for the adaptive processing.
[0016]
According to the present invention, the reproduced sound quality can be improved even when the
characteristic of the microphone input is poor, for example when the speaker is away from the
microphone or not facing it, or when the characteristic of the received signal from the other
party is poor.
[0017]
Hereinafter, an embodiment of the present invention will be described based on the drawings.
[0018]
<System Configuration> FIG. 1 is a block diagram for explaining the configuration of a video
conference system that implements the voice communication device according to the present
embodiment.
In the video conference system shown in FIG. 1, a block configuration is shown focusing on the
main parts related to the description of the present embodiment.
Further, the voice communication device of the present embodiment is realized as the near end
device 10 which is a terminal on one side in the teleconference system or the far end device 20
which is a terminal on the other side.
The near-end device 10 and the far-end device 20 are the same device, and the internal block
diagram of the far-end device 20 is omitted.
[0019]
The near end apparatus 10 includes an A / D converter 11, a D / A converter 12, a signal
processing unit 13, an audio codec unit 14, and a communication unit 15.
[0020]
The A / D converter 11 performs a process of converting the sound captured by the microphone
110 into a digital signal.
The D / A converter 12 performs processing of converting the voice sent from the other party
into an analog signal for output by the speaker 120.
[0021]
The signal processing unit 13 performs processing for inputting the voice captured by the
microphone 110 and sending it to the other party, and processing for inputting voice sent from
the other party and outputting it from the speaker 120.
[0022]
The voice codec unit 14 performs processing of encoding a digital signal of voice to be sent to
the other party and processing of decoding a digital signal of voice sent from the other party.
The communication unit 15 is a unit that performs signal input / output with the far-end device
20 via the communication line N, and transmits / receives a digital signal of encoded voice.
[0023]
Here, the details of each part are as follows.
The speaker 120 connected to the near-end device 10 emits the sound that was picked up by the
microphone 210 connected to the far-end device 20, signal-processed and encoded there,
transmitted as voice data through the communication line N, and then processed in the near-end
device 10.
[0024]
The microphone 110 connected to the near-end device 10 picks up the speech of the near-end
teleconference attendee and also picks up, superimposed on it, the sound emitted from the
speaker 120 that reaches it through the room.
[0025]
The A / D converter 11 amplifies the sound collected by the microphone 110 with an amplifier
(not shown) and converts analog sound data into a digital signal.
[0026]
The D / A converter 12 converts digital audio data sent from the signal processing unit 13 into
an analog signal.
The resulting analog audio signal is amplified by an amplifier (not shown) and emitted from
the speaker 120.
[0027]
The signal processing unit 13 mainly includes a digital signal processor (DSP), and performs
processing of converting input and output audio data into desired data.
Details of this signal processing unit will be described later.
[0028]
The audio codec unit 14 performs processing for converting (encoding) audio data based on the
microphone input sent from the signal processing unit 13 into a code defined by the standard
used for communication in the teleconference system.
Further, the audio codec unit 14 decodes the encoded audio data sent from the communication
unit 15 from the far-end device 20 and sends the decoded audio data to the signal processing
unit 13.
[0029]
The communication unit 15 performs signal input / output with the far-end device 20 via the
communication line N.
The signal to be handled is mainly digital data of encoded speech.
The communication line N is a general digital communication line such as the Internet or a LAN
(Local Area Network).
[0030]
Next, details of the signal processing unit 13 will be described.
FIG. 2 is a block diagram for explaining the internal structure of the signal processing unit.
The signal processing unit 13 includes an adaptive processing unit 131 including a stereo
adaptive filter 131F, an echo cancellation output signal frequency characteristic changing unit
132, and a reference signal frequency characteristic changing unit 133.
[0031]
The stereo adaptive filter 131F used in the adaptive processing unit 131 is a filter for removing
the component of the sound sent from the other party from the component of the sound
captured by the microphone.
The echo cancellation output signal frequency characteristic changing unit 132 is a part that
changes the frequency characteristic of the sound processed by the stereo adaptive filter 131F.
The reference signal frequency characteristic changing unit 133 performs processing to change
the frequency characteristic of the voice sent from the other party before it is input to the
stereo adaptive filter 131F of the adaptive processing unit 131.
[0032]
In the present embodiment, although a configuration including both the echo cancellation output
signal frequency characteristic changing unit 132 and the reference signal frequency
characteristic changing unit 133 is taken as an example, a configuration including either one
may be used.
[0033]
In the signal processing unit 13 having such a block configuration, the reference signal
frequency characteristic changing unit 133 changes the characteristic of the audio signal from
the far-end device received from the audio codec unit, and the changed signal is output to the
speaker via the D / A converter.
[0034]
Also, the stereo adaptive filter 131F is trained on the voice whose characteristic has been
changed by the reference signal frequency characteristic changing unit 133.
That is, an audio signal whose characteristic has been changed by the reference signal frequency
characteristic changing unit 133 is sent to the stereo adaptive filter 131F.
[0035]
On the other hand, the audio signal captured by the microphone and converted into a digital
signal by the A / D converter is subjected to a subtraction process using an estimated echo
generated by the adaptive filter 131F.
That is, processing of subtracting the estimated echo component from the sound component
captured by the microphone is performed. As a result, an echo cancellation output is generated,
the characteristic is changed by the echo cancellation output signal frequency characteristic
changing unit 132, and the signal is sent to the voice codec.
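The signal path through the signal processing unit 13 can be sketched per sample as follows. This is an illustrative model only: it is single-channel rather than stereo, it assumes an NLMS update using the output taken before unit 132, and the class and parameter names (`SignalProcessingUnit`, `change_reference`, `change_output`) are hypothetical.

```python
class SignalProcessingUnit:
    """Per-sample model of units 131, 132 and 133 (illustrative names only)."""

    def __init__(self, change_reference, change_output, taps=8, mu=0.5):
        self.change_reference = change_reference  # stands in for unit 133
        self.change_output = change_output        # stands in for unit 132
        self.weights = [0.0] * taps               # coefficients of filter 131F
        self.history = [0.0] * taps               # recent reference samples
        self.mu = mu

    def process(self, received, mic):
        """Return (sample for the D/A converter, sample for the voice codec)."""
        # Unit 133: change the received voice before it reaches speaker and filter.
        reference = self.change_reference(received)
        self.history = [reference] + self.history[:-1]
        # Unit 131: subtract the estimated echo from the microphone sample.
        estimated_echo = sum(w * h for w, h in zip(self.weights, self.history))
        cancelled = mic - estimated_echo
        # Assumed NLMS update, driven by the output taken before unit 132.
        norm = sum(h * h for h in self.history) + 1e-8
        self.weights = [w + self.mu * cancelled * h / norm
                        for w, h in zip(self.weights, self.history)]
        # Unit 132: change the echo-cancelled voice only after adaptation.
        return reference, self.change_output(cancelled)


if __name__ == "__main__":
    unit = SignalProcessingUnit(lambda s: 2.0 * s, lambda s: 0.5 * s)
    print(unit.process(1.0, 0.0))
```

Because the speaker and the filter receive the same changed reference, the filter models the echo of what was actually played. The change applied by `change_output` happens after the subtraction, so in this sketch it never enters the coefficient update.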
[0036]
Here, the update amount of the coefficients of the stereo adaptive filter 131F is determined
based on the voice signal after the characteristic change by the reference signal frequency
characteristic changing unit 133, the voice signal from the A / D converter, and the echo
cancellation output voice signal sent to the voice codec unit.
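The source does not name the update rule. As one common possibility, a normalized LMS (NLMS) update uses exactly these three signals. In the sketch below all symbols are assumptions introduced for illustration: x(n) is the vector of recent reference samples after unit 133, d(n) is the sample from the A / D converter, e(n) is the echo cancellation output, w(n) is the coefficient vector, mu is a step size and epsilon is a small regularization constant.

```latex
\begin{aligned}
\hat{y}(n) &= \mathbf{w}^{\mathsf{T}}(n)\,\mathbf{x}(n) \\
e(n) &= d(n) - \hat{y}(n) \\
\mathbf{w}(n+1) &= \mathbf{w}(n)
  + \mu\,\frac{e(n)\,\mathbf{x}(n)}{\mathbf{x}^{\mathsf{T}}(n)\,\mathbf{x}(n) + \varepsilon}
\end{aligned}
```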
[0037]
The reference signal frequency characteristic changing unit 133 analyzes the audio signal from
the audio codec unit, and outputs the reference signal whose frequency characteristic has been
changed to the D / A converter and the stereo adaptive filter 131F.
On the other hand, the echo cancellation output signal frequency characteristic changing unit
132 analyzes the output signal after the echo cancellation by the stereo adaptive filter 131F, and
sends the voice whose frequency characteristic has been changed to the voice codec unit.
[0038]
The change of the frequency characteristic performed by the reference signal frequency
characteristic changing unit 133 or the echo cancellation output signal frequency characteristic
changing unit 132 is, for example, high-frequency emphasis, which raises the level at and above
a predetermined frequency. This makes it possible to obtain the same sound quality as that of a
speaker talking into the microphone, regardless of where the speaker is positioned.
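One simple way to realize such high-frequency emphasis is to add a high-passed copy of the signal back to the original. The sketch below assumes a one-pole filter; the function name `emphasize_high` and the values of `cutoff_hz` and `gain` are illustrative and not taken from the source.

```python
import math


def emphasize_high(samples, sample_rate=32000, cutoff_hz=4000.0, gain=1.0):
    """Raise the level above roughly cutoff_hz by adding a high-passed copy.

    cutoff_hz and gain are illustrative values, not taken from the source.
    """
    # One-pole low-pass coefficient; the high-pass part is input minus low-pass.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    low = 0.0
    out = []
    for x in samples:
        low += alpha * (x - low)          # running low-pass of the input
        out.append(x + gain * (x - low))  # original plus scaled high-pass part
    return out


if __name__ == "__main__":
    steady = emphasize_high([1.0] * 200)
    fastest = emphasize_high([(-1.0) ** n for n in range(200)])
    print("constant input settles at:", steady[-1])
    print("alternating input amplitude:", abs(fastest[-1]))
```

A constant input passes through unchanged once the filter has settled, while a rapidly alternating input comes out with a larger amplitude, which is the intended shelf-like behaviour.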
[0039]
When the frequency characteristic is changed in this way, for the voice to be transmitted the
change is applied to the voice signal after echo cancellation in the adaptive processing unit
131. Voice with the optimum frequency characteristic can therefore be sent to the other party
without affecting the adaptive processing.
[0040]
Also, for the voice sent from the other party, the frequency characteristic is changed before
the voice is input to the stereo adaptive filter 131F of the adaptive processing unit 131, so
voice with the same changed characteristic can be sent to both the stereo adaptive filter 131F
and the speaker.
[0041]
In the present embodiment, the frequency characteristic is changed so as to raise the level
above a predetermined frequency, but various other characteristic changes are possible, and the
change may be either a fixed-amount change or a dynamic one.
For example, in the echo cancellation output signal frequency characteristic changing unit 132,
the pattern of characteristic change may be switched according to the average value in the
predetermined period in the frequency characteristic of the sound subjected to the echo
cancellation.
Further, in the reference signal frequency characteristic changing unit 133, the pattern of
characteristic change may be switched in accordance with the average value in the
predetermined period in the frequency characteristic of the voice sent from the other party. Note
that this statistical process is not limited to a simple average value, and a weighted average value
or other statistical calculation may be used.
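The switching described above could be sketched as follows. The sketch assumes that each frame has already been reduced to a single number describing its high-band content; the class name `PatternSwitcher`, the `thresholds` table and the gain values are hypothetical.

```python
from collections import deque


class PatternSwitcher:
    """Pick an emphasis gain from the average high-band ratio of recent frames."""

    def __init__(self, period=50, thresholds=((0.05, 2.0), (0.15, 1.0))):
        # Each (limit, gain) pair: use `gain` when the average is below `limit`.
        self.ratios = deque(maxlen=period)  # keeps only the last `period` frames
        self.thresholds = thresholds

    def update(self, high_band_ratio):
        """Record one frame's measurement and return the gain to apply."""
        self.ratios.append(high_band_ratio)
        average = sum(self.ratios) / len(self.ratios)
        for limit, gain in self.thresholds:
            if average < limit:
                return gain
        return 0.0  # enough high-frequency content: no emphasis


if __name__ == "__main__":
    switcher = PatternSwitcher(period=3)
    for ratio in (0.01, 0.3, 0.1, 0.3):
        print(ratio, "->", switcher.update(ratio))
```

A weighted average, as the text allows, would only change the line that computes `average`.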
[0042]
<Flow of Processing of Voice Communication Method> FIG. 3 is a flowchart illustrating the flow
of the voice communication method according to the present embodiment. In the following
description, reference signs not shown in FIG. 3 refer to FIG. 1 and FIG. 2.
[0043]
First, in step (S-1), processing of voice communication is started. Next, in step (S-2), the
communication unit 15 of the near-end device 10 receives the voice data sent from the far-end
device 20 via the communication line N. Voice data sent from the far-end device 20 is encoded.
[0044]
In step (S-3), the received audio data is decoded by the audio codec unit 14 to obtain digital
audio data, for example 16-bit linear PCM sampled at 32 kHz. Then, the decoded audio data is
sent as a reference signal to the signal processing unit 13 configured by the DSP.
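As a small illustration of the 16-bit PCM data exchanged between the codec and the signal processing, the sketch below converts raw sample bytes to floating-point values and back. Little-endian byte order and the helper names `pcm16_to_float` and `float_to_pcm16` are assumptions; the source specifies only the sample width and the 32 kHz rate.

```python
import struct

SAMPLE_RATE_HZ = 32000  # sampling rate given as an example in the text


def pcm16_to_float(data):
    """Convert little-endian 16-bit linear PCM bytes to floats in [-1.0, 1.0)."""
    count = len(data) // 2
    samples = struct.unpack(f"<{count}h", data[:count * 2])
    return [s / 32768.0 for s in samples]


def float_to_pcm16(samples):
    """Convert floats to little-endian 16-bit linear PCM, clipping out-of-range values."""
    ints = [max(-32768, min(32767, int(round(s * 32768.0)))) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


if __name__ == "__main__":
    frame = float_to_pcm16([0.0, 0.5, -0.5])
    print(len(frame), "bytes ->", pcm16_to_float(frame))
```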
[0045]
Inside the signal processing unit 13, the reference signal sent from the voice codec unit 14 is
analyzed, and in step (S-4), the frequency characteristic of the reference signal is changed. The
reference signal after the frequency characteristic change is sent to the D / A converter 12 and
the stereo adaptive filter 131F of the signal processing unit 13.
[0046]
At step (S-5), learning of the stereo adaptive filter 131F is performed using the reference
signal whose frequency characteristic has been changed. Then, by subtracting the estimated echo
generated by the stereo adaptive filter 131F from the signal (audio component of the
microphone input) output from the A / D converter 11, an echo cancellation output is obtained.
In step (S-6), the echo cancellation output signal is analyzed, its frequency characteristic is
changed, and the resulting voice signal is output to the voice codec unit 14.
[0047]
At step (S-7), the PCM digital audio data from the signal processing unit 13 is encoded and sent
to the communication unit 15. In step (S-8), the encoded voice data output from the voice codec
unit 14 is sent to the far-end device 20 via the communication line N. Then, the process ends in
step (S-9).
[0048]
On the other hand, in step (S-20), the voice is picked up by the microphone 110, and in step
(S-21) it is converted into digital voice data by the A / D converter 11. Then, the converted
digital audio data is sent to the signal processing unit 13 as a microphone input signal. It is
assumed that the reference signal and the microphone input signal are always supplied to the
signal processing unit 13 at the same timing, one sample at a time.
[0049]
Furthermore, in step (S-10), the digital audio data sent to the D / A converter 12 in step (S-4)
is converted into an analog signal, and in step (S-11), this analog signal is emitted as sound
from the speaker 120 via an amplifier.
[0050]
Next, the concept of reference signal frequency characteristic change processing and echo
cancellation output signal frequency characteristic change processing will be described.
FIG. 4 is a schematic view for explaining the concept of frequency characteristic change used in
the present embodiment.
[0051]
First, the frequency characteristic (4-2) of the reference signal, which is the audio signal
from the other party received from the audio codec unit, is analyzed, and the characteristic is
corrected as needed (4-1). In the analysis of the frequency characteristic (4-2) of the
reference signal, for example, how muffled the sound is, as indicated by the characteristic of
the high-frequency band, is analyzed, and the characteristic is changed using an exciter or the
like. Then, the reference signal after the characteristic correction is sent via the D / A
converter 12 to the speaker 120 and is also sent to the adaptive processing unit 131.
[0052]
On the other hand, the input signal collected by the microphone 110 is echo-cancelled by the
adaptive processing unit 131 and becomes an echo cancellation output signal. The frequency
characteristic (4-4) of the echo cancellation output signal output from the adaptive processing
unit 131 is analyzed, and the characteristic correction is performed as needed (4-5).
[0053]
In the analysis of the frequency characteristic (4-4) of the echo cancellation output signal,
how muffled the sound is, as indicated by the characteristic of the high-frequency band, is
analyzed, and the characteristic is changed using an exciter or the like. Then, the echo
cancellation output signal after the characteristic correction is sent to the voice codec unit.
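The analysis and correction steps can be sketched as follows, assuming the analysis amounts to measuring how little of a frame's energy lies in the high band. The exciter here is deliberately crude: it soft-clips a high-passed copy to generate harmonics and mixes them back in. The function names and the `threshold`, `amount` and `drive` values are hypothetical, and the source does not describe how its exciter works.

```python
import math


def high_band_ratio(frame):
    """Rough share of the frame's energy in the high band (first-difference high-pass)."""
    total = sum(x * x for x in frame)
    if total == 0.0:
        return 0.0
    high = sum((b - a) ** 2 for a, b in zip(frame, frame[1:])) / 4.0
    return high / total


def excite(frame, threshold=0.1, amount=0.5, drive=4.0):
    """Mix in generated harmonics when the frame has little high-band energy."""
    if not frame or high_band_ratio(frame) >= threshold:
        return list(frame)  # bright enough: leave the frame unchanged
    out = [frame[0]]
    for a, b in zip(frame, frame[1:]):
        high = (b - a) / 2.0  # crude high-pass of the input
        # Soft clipping of the high-passed part creates additional harmonics.
        out.append(b + amount * math.tanh(drive * high))
    return out


if __name__ == "__main__":
    dull = [math.sin(2 * math.pi * 200 * n / 32000) for n in range(320)]
    print("high-band ratio before:", high_band_ratio(dull))
    print("high-band ratio after: ", high_band_ratio(excite(dull)))
```

The same two functions could serve either position in the signal path, since both the reference signal and the echo cancellation output are described as being analyzed and corrected in the same way.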
[0054]
<Effects of the Embodiment> The following effects can be obtained by applying the voice
communication apparatus or the voice communication method according to the present
embodiment. That is, in an echo canceller that prevents echo and howling in a speech
communication system having a speaker and a microphone, the microphone input signal and the
received input signal from the other party are analyzed, and the muffled quality of each input
signal can be corrected in real time. Consequently, in a speech communication system including a
microphone and a speaker, such as a hands-free telephone, the sound quality can be improved when
the sound quality of the microphone input signal or of the received signal from the other party
is poor.
[0055]
FIG. 1 is a block diagram explaining the configuration of the teleconference system that
realizes the voice communication device according to this embodiment. FIG. 2 is a block diagram
explaining the internal structure of the signal processing unit. FIG. 3 is a flowchart
explaining the flow of the voice communication method according to this embodiment. FIG. 4 is a
schematic diagram explaining the concept of the frequency characteristic change used in this
embodiment.
Explanation of reference signs
[0056]
DESCRIPTION OF SYMBOLS 10 ... near-end device, 11 ... A / D converter, 12 ... D / A
converter, 13 ... signal processing unit, 14 ... audio codec unit, 15 ... communication unit,
20 ... far-end device, 110 ... microphone, 120 ... speaker, 131 ... adaptive processing unit,
131F ... stereo adaptive filter, 132 ... echo cancellation output signal frequency
characteristic changing unit, 133 ... reference signal frequency characteristic changing unit,
210 ... microphone, 220 ... speaker, N ... communication line