JP2008131474

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2008131474
The present invention provides a close-talking type voice input device which can be miniaturized
and can realize a highly accurate noise removal function, a method of manufacturing the same,
and an information processing system. A voice input device is a close-talking type voice input
device, and includes a first microphone having a first vibrating membrane, and a second
microphone having a second vibrating membrane. The differential signal generation unit 30
generates a differential signal indicating the difference between the first voltage signal acquired
by the first microphone 10 and the second voltage signal acquired by the second microphone 20.
The first and second diaphragms 12 and 22 are arranged such that a noise intensity ratio, indicating
the ratio of the intensity of the noise component included in the difference signal to the intensity
of the noise component included in the first or second voltage signal, is smaller than an input
voice intensity ratio, indicating the ratio of the intensity of the input voice component included
in the difference signal to the intensity of the input voice component included in the first or
second voltage signal. [Selected figure] Figure 1
Voice input device, method of manufacturing the same, and information processing system
[0001]
The present invention relates to a voice input device, a method of manufacturing the same, and
an information processing system.
[0002]
It is preferable to pick up only the target voice (user's voice) at the time of a telephone call or the
04-05-2019
1
like, voice recognition, voice recording and the like.
However, in the use environment of the voice input device, sounds other than the intended voice
such as background noise may be present. Therefore, development of a voice input device having
a function of removing noise is in progress.
[0003]
Known techniques for removing noise in a noisy use environment include giving the microphone
sharp directivity, and identifying the direction of arrival of a sound wave from differences in
its arrival time and then eliminating the noise by signal processing.
[0004]
Further, in recent years, the miniaturization of electronic devices has progressed, and techniques
for miniaturizing voice input devices have become important.
JP-A-7-312638 JP-A-9-331377 JP-A-2001-186241
[0005]
In order to make the microphones have a sharp directivity, it is necessary to arrange a large
number of vibrating membranes, and miniaturization is difficult.
[0006]
In addition, in order to detect the arrival direction of the sound wave with high accuracy by
utilizing the difference in arrival time of the sound wave, it is necessary to install a plurality
of diaphragms at intervals of about a few wavelengths of the audible sound wave, so
miniaturization is difficult.
[0007]
An object of the present invention is to provide a close talk type voice input device having a
function of removing noise components, a method of manufacturing the same, and an
information processing system.
[0008]
(1) A voice input device according to the present invention is a close-talking voice input device
comprising: a first microphone having a first diaphragm; a second microphone having a second
diaphragm; and a difference signal generation unit configured to generate a difference signal
indicating a difference between a first voltage signal acquired by the first microphone and a
second voltage signal acquired by the second microphone. The first and second diaphragms are
arranged such that a noise intensity ratio, indicating the ratio of the intensity of the noise
component included in the difference signal to the intensity of the noise component included in
the first or second voltage signal, is smaller than an input voice intensity ratio, indicating the
ratio of the intensity of the input voice component included in the difference signal to the
intensity of the input voice component included in the first or second voltage signal.
[0009]
According to this voice input device, the first and second microphones (first and second
diaphragms) are arranged to satisfy the predetermined condition.
According to this, the differential signal indicating the difference between the first and second
voltage signals acquired by the first and second microphones can be regarded as a signal
indicating the input voice from which the noise component is removed.
Therefore, according to the present invention, it is possible to provide a voice input device
capable of realizing the noise removal function with a simple configuration that only generates a
differential signal.
[0010]
In this voice input device, the difference signal generation unit generates a difference signal
without performing analysis processing (such as Fourier analysis processing) on the first and
second voltage signals.
Therefore, it becomes possible to reduce the signal processing load of the differential signal
generation unit or to realize the differential signal generation unit by a very simple circuit.
[0011]
From this, according to the present invention, it is possible to provide a voice input device which
can be miniaturized and can realize a high precision noise removal function.
[0012]
In this voice input device, the first and second diaphragms may be arranged so that the intensity
ratio based on the phase difference component of the noise component is smaller than the
intensity ratio based on the amplitude of the input voice component.
[0013]
(2) This voice input device may further include a base having a recess formed in its main surface;
the first vibrating membrane may be installed on the bottom of the recess, and the second
vibrating membrane may be installed on the main surface.
[0014]
(3) In this voice input device, the base may be installed such that the opening communicating with
the recess is closer to the model sound source of the input voice than the formation region of the
second diaphragm on the main surface.
[0015]
According to this voice input device, it is possible to reduce the phase shift of the input voice
incident on the first and second diaphragms.
Therefore, it becomes possible to generate a differential signal with less noise, and it is possible
to provide a voice input device having a highly accurate noise removal function.
[0016]
(4) In this voice input device, the recess may be shallower than a distance between the opening
and a formation region of the second diaphragm.
[0017]
(5) This voice input device may further include a base having a first recess and a second recess,
shallower than the first recess, formed in its main surface; the first diaphragm may be disposed
on the bottom surface of the first recess, and the second vibrating membrane may be disposed on
the bottom surface of the second recess.
[0018]
(6) In this voice input device, the base may be installed such that a first opening communicating
with the first recess is closer to the model sound source of the input voice than a second opening
communicating with the second recess.
[0019]
According to this voice input device, it is possible to reduce the phase shift of the input voice
incident on the first and second diaphragms.
Therefore, it becomes possible to generate a differential signal with less noise, and it is possible
to provide a voice input device having a highly accurate noise removal function.
[0020]
(7) In this voice input device, the difference in depth between the first and second recesses may
be smaller than the distance between the first and second openings.
[0021]
(8) In this voice input device, the base may be installed such that the input voice arrives at the
first and second diaphragms simultaneously.
[0022]
According to this, it is possible to generate a differential signal that does not include the phase
shift of the input voice, so it is possible to provide a voice input device having a highly accurate
noise removal function.
[0023]
(9) In this voice input device, the first and second diaphragms may be arranged such that their
normals are parallel.
[0024]
(10) In this voice input device, the first and second vibrating membranes may be arranged such
that the normals do not become the same straight line.
[0025]
(11) In this voice input device, the first and second microphones may be configured as
semiconductor devices.
[0026]
For example, the first and second microphones may be silicon microphones (Si microphones).
The first and second microphones may be configured as one semiconductor substrate.
At this time, the first and second microphones and the difference signal generation unit may be
configured as one semiconductor substrate.
The first and second microphones and the difference signal generation unit may be configured as
so-called MEMS (Micro Electro Mechanical Systems).
[0027]
(12) In this voice input device, the center-to-center distance between the first and second
diaphragms may be 5.2 mm or less.
[0028]
The first and second vibrating membranes may be arranged such that the normals are parallel
and the interval between the normals is 5.2 mm or less.
[0029]
(13) An information processing system according to the present invention includes: a close-talking
voice input device including a first microphone having a first diaphragm, a second microphone
having a second diaphragm, and a difference signal generation unit that generates a difference
signal indicating a difference between a first voltage signal acquired by the first microphone and
a second voltage signal acquired by the second microphone; and an analysis processing unit that
analyzes, based on the difference signal, the voice information input to the voice input device.
In the voice input device, the first and second diaphragms are arranged such that a noise
intensity ratio, indicating the ratio of the intensity of the noise component included in the
difference signal to the intensity of the noise component included in the first or second voltage
signal, is smaller than an input voice intensity ratio, indicating the ratio of the intensity of
the input voice component included in the difference signal to the intensity of the input voice
component included in the first or second voltage signal.
[0030]
According to this information processing system, analysis processing of voice information is
performed based on the difference signal acquired by the voice input device arranged such that
the first and second diaphragms satisfy the predetermined condition.
According to this voice input device, the difference signal is a signal indicating the voice
component from which the noise component has been removed. Therefore, various information
processing based on the input voice can be performed by analyzing the difference signal.
[0031]
The information processing system according to the present invention may be a system that
performs voice recognition processing, voice authentication processing, command generation
processing based on voice, and the like.
[0032]
(14) An information processing system according to the present invention includes: a close-talking
voice input device including a first microphone having a first diaphragm, a second microphone
having a second diaphragm, a difference signal generation unit that generates a difference signal
indicating a difference between a first voltage signal acquired by the first microphone and a
second voltage signal acquired by the second microphone, and a communication processing unit; and
a host computer that analyzes, based on the difference signal, the voice information input to the
voice input device. In the voice input device, the first and second diaphragms are arranged such
that a noise intensity ratio, indicating the ratio of the intensity of the noise component
included in the difference signal to the intensity of the noise component included in the first or
second voltage signal, is smaller than an input voice intensity ratio, indicating the ratio of the
intensity of the input voice component included in the difference signal to the intensity of the
input voice component included in the first or second voltage signal. The communication processing
unit performs communication processing with the host computer via a network.
[0033]
According to this information processing system, analysis processing of voice information is
performed based on the difference signal acquired by the voice input device arranged such that
the first and second diaphragms satisfy the predetermined condition.
According to this voice input device, since the difference signal is a signal indicating the voice
component from which the noise component has been removed, various information processing
based on the input voice can be performed by analyzing the difference signal.
[0034]
The information processing system according to the present invention may be a system that
performs voice recognition processing, voice authentication processing, command generation
processing based on voice, and the like.
[0035]
(15) A method of manufacturing a voice input device according to the present invention is a method
of manufacturing a close-talking voice input device having a function of removing noise
components, the device including a first microphone having a first diaphragm, a second microphone
having a second diaphragm, and a difference signal generation unit that generates a difference
signal indicating a difference between a first voltage signal acquired by the first microphone and
a second voltage signal acquired by the second microphone. The method includes the steps of:
preparing data indicating a correspondence between the value of Δr/λ, which is the ratio of the
center-to-center distance Δr between the first and second diaphragms to the wavelength λ of the
noise, and a noise intensity ratio, which is the ratio of the intensity of the noise component
included in the difference signal to the intensity of the noise component included in the first or
second voltage signal; setting the value of Δr/λ based on the data; and setting the
center-to-center distance based on the set value of Δr/λ and the wavelength of the noise.
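The setting procedure in (15) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the correspondence data referred to in the text is not reproduced here, so the hypothetical function `model_noise_ratio` stands in for it, using the amplitude of the difference of two equal-amplitude sinusoids offset by Δr/λ of a period. The speed of sound (340 m/s), the 1 kHz noise frequency, the candidate grid, and the fixed target ratio of 0.1 are likewise assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, assumed value for air

def model_noise_ratio(dr_over_lambda):
    """Assumed stand-in for the correspondence data: amplitude of
    sin(x) - sin(x - 2*pi*dr/lambda), relative to a unit-amplitude input."""
    return 2.0 * math.sin(math.pi * dr_over_lambda)

def choose_dr_over_lambda(candidates, target_ratio):
    """Pick the largest candidate whose modelled noise ratio stays below the target."""
    acceptable = [x for x in candidates if model_noise_ratio(x) < target_ratio]
    if not acceptable:
        raise ValueError("no candidate satisfies the target ratio")
    return max(acceptable)

def center_distance(dr_over_lambda, noise_frequency_hz):
    """Convert the chosen ratio into a center-to-center distance in meters."""
    wavelength = SPEED_OF_SOUND / noise_frequency_hz
    return dr_over_lambda * wavelength

# Hypothetical inputs: candidate ratios 0.001 .. 0.100, target ratio 0.1, 1 kHz noise.
candidates = [i / 1000 for i in range(1, 101)]
ratio = choose_dr_over_lambda(candidates, target_ratio=0.1)
spacing = center_distance(ratio, noise_frequency_hz=1000.0)
print(f"dr/lambda = {ratio:.3f}, center distance = {spacing * 1000:.2f} mm")
```

Treating the input voice intensity ratio as a fixed target is a simplification; in practice it also depends on the diaphragm spacing, so the two quantities would have to be evaluated together.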
[0036]
According to the present invention, it is possible to provide a method of manufacturing a voice
input device that can be miniaturized and has high precision noise removal function.
[0037]
(16) In this method of manufacturing a voice input device, in the step of setting the value of
Δr/λ, the value of Δr/λ may be set, based on the data, such that the noise intensity ratio is
smaller than an input voice intensity ratio indicating the ratio of the intensity of the input
voice component included in the difference signal to the intensity of the input voice component
included in the first or second voltage signal.
[0038]
(17) In this method of manufacturing a voice input device, the input voice intensity ratio may be
an intensity ratio based on an amplitude component of the input voice.
[0039]
(18) In this method of manufacturing a voice input device, the noise intensity ratio may be an
intensity ratio based on a phase difference of the noise component.
[0040]
Hereinafter, embodiments to which the present invention is applied will be described with
reference to the drawings.
However, the present invention is not limited to the following embodiments.
Further, the present invention includes any combination of the following contents.
[0041]
1. Configuration of Voice Input Device
First, the configuration of the voice input device 1 according to the embodiment to which the
present invention is applied will be described with reference to FIGS. 1 to 3.
The voice input device 1 described below is a close-talking voice input device. It can be applied
to, for example, voice communication devices such as mobile phones and transceivers, information
processing systems that use technology for analyzing input voice (voice authentication systems,
voice recognition systems, command generation systems, electronic dictionaries, translators, voice
input remote controllers, etc.), recording devices, amplifier systems (loudspeakers), microphone
systems, and the like.
[0042]
The voice input device according to the present embodiment includes a first microphone 10
having a first diaphragm 12 and a second microphone 20 having a second diaphragm 22.
Here, the microphone is an electroacoustic transducer that converts an acoustic signal into an
electrical signal.
The first and second microphones 10 and 20 may be converters that output the vibrations of the
first and second diaphragms 12 and 22 (diaphragm), respectively, as voltage signals.
[0043]
In the voice input device according to the present embodiment, the first microphone 10
generates a first voltage signal.
Also, the second microphone 20 generates a second voltage signal.
That is, the voltage signals generated by the first and second microphones 10 and 20 may be
referred to as first and second voltage signals, respectively.
[0044]
The mechanism of the first and second microphones 10 and 20 is not particularly limited.
FIG. 2 shows a structure of a condenser microphone 100 as an example of a microphone
applicable to the first and second microphones 10 and 20.
The condenser microphone 100 has a vibrating membrane 102.
The vibrating film 102 is a film (thin film) that vibrates upon receiving an acoustic wave, has
conductivity, and forms one end of an electrode.
The condenser microphone 100 also has an electrode 104.
The electrode 104 is disposed to face the vibrating membrane 102.
Thus, the diaphragm 102 and the electrode 104 form a capacitance.
When a sound wave is incident on the condenser microphone 100, the vibrating membrane 102
vibrates, the distance between the vibrating membrane 102 and the electrode 104 changes, and
the electrostatic capacitance between the vibrating membrane 102 and the electrode 104
changes.
By outputting this change in capacitance as, for example, a change in voltage, the sound wave
incident on the condenser microphone 100 can be converted into an electrical signal.
In the condenser microphone 100, the electrode 104 may have a structure which is not affected
by the sound wave.
For example, the electrode 104 may have a mesh structure.
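The conversion from membrane displacement to voltage described above can be illustrated with a minimal Python sketch. It assumes an ideal parallel-plate capacitor with an air gap and constant stored charge, which is a simplification; the membrane area, rest gap, and bias voltage are hypothetical values and are not taken from the patent.

```python
EPSILON_0 = 8.854e-12  # vacuum permittivity in F/m

def plate_capacitance(area_m2, gap_m):
    """Capacitance of an ideal parallel-plate capacitor (air gap, no fringing)."""
    return EPSILON_0 * area_m2 / gap_m

def output_voltage(charge_c, area_m2, gap_m):
    """Voltage across the plates when the stored charge is held constant."""
    return charge_c / plate_capacitance(area_m2, gap_m)

# Hypothetical geometry: 1 mm^2 membrane, 4 um rest gap, 10 V across the plates at rest.
area = 1e-6
rest_gap = 4e-6
charge = 10.0 * plate_capacitance(area, rest_gap)

# A sound wave moves the membrane, changing the gap and therefore the voltage.
for displacement in (-0.1e-6, 0.0, 0.1e-6):
    voltage = output_voltage(charge, area, rest_gap + displacement)
    print(f"gap change {displacement * 1e6:+.1f} um -> {voltage:.3f} V")
```

Under these assumptions the voltage is proportional to the gap, so a small membrane displacement appears directly as a small voltage change.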
[0045]
However, the microphone applicable to the present invention is not limited to the condenser type
microphone, and any microphone already known can be applied. For example, as the first and
second microphones 10 and 20, microphones of an electrodynamic (dynamic) type, an
electromagnetic (magnetic) type, a piezoelectric (crystal) type, or the like may be applied.
[0046]
The first and second microphones 10 and 20 may be silicon microphones (Si microphones) in
which the first and second diaphragms 12 and 22 are made of silicon. By using a silicon
microphone, downsizing and high performance of the first and second microphones 10 and 20
can be realized. At this time, the first and second microphones 10 and 20 may be configured as
one integrated circuit device. That is, the first and second microphones 10 and 20 may be
configured on one semiconductor substrate. At this time, the difference signal generation unit 30
described later may also be formed on the same semiconductor substrate. That is, the first and
second microphones 10 and 20 may be configured as so-called MEMS (Micro Electro Mechanical
Systems). However, the first microphone 10 and the second microphone 20 may be configured
as separate silicon microphones.
[0047]
The voice input device according to the present embodiment realizes a function of removing a
noise component by using a differential signal indicating the difference between the first and
second voltage signals, as described later. In order to realize this function, the first and second
microphones (first and second vibrating membranes 12 and 22) are arranged to satisfy certain
constraints. The details of the constraints that the first and second vibrating membranes 12 and
22 should satisfy will be described later, but in the present embodiment, the first and second
vibrating membranes 12 and 22 (the first and second microphones 10, 20) are arranged such
that the noise intensity ratio is smaller than the input speech intensity ratio. This makes it
possible to regard the differential signal as a signal representing an audio component from which
noise components have been removed. The first and second vibrating membranes 12 and 22 may
be arranged, for example, to have a center-to-center distance of 5.2 mm or less.
[0048]
In the voice input device according to the present embodiment, the directions of the first and
second diaphragms 12 and 22 are not particularly limited. The first and second vibrating
membranes 12 and 22 may be arranged such that the normals are parallel. At this time, the first
and second vibrating membranes 12 and 22 may be arranged such that the normals do not
become the same straight line. For example, the first and second vibrating membranes 12 and 22
may be spaced apart on the surface of a base (for example, a circuit board) not shown.
Alternatively, the first and second vibrating membranes 12 and 22 may be arranged offset in the
normal direction. However, the first and second vibrating membranes 12 and 22 may be
arranged such that the normals do not become parallel. The first and second vibrating
membranes 12 and 22 may be arranged such that the normals are orthogonal to each other.
[0049]
The voice input device according to the present embodiment has a difference signal generation
unit 30. The difference signal generation unit 30 generates a difference signal indicating a
difference (voltage difference) between the first voltage signal acquired by the first microphone
10 and the second voltage signal acquired by the second microphone 20. The difference
signal generation unit 30 performs processing of generating a difference signal indicating the
difference between the first and second voltage signals without performing analysis processing
such as Fourier analysis, for example. The function of the difference signal generation unit 30
may be realized by a dedicated hardware circuit (difference signal generation circuit) or may be
realized by signal processing by a CPU or the like.
[0050]
The voice input device according to the present embodiment may further include a signal
amplification unit that amplifies the differential signal. The differential signal generation unit 30
and the signal amplification unit may be realized by one control circuit. However, the voice input
device according to the present embodiment may have a configuration without the signal
amplification unit inside.
[0051]
FIG. 3 shows an example of a circuit that can realize the differential signal generation unit 30 and
the signal amplification unit. According to the circuit shown in FIG. 3, the first and second voltage
signals are received, and a signal obtained by amplifying the difference signal indicating the
difference by 10 times is output. However, the circuit configuration for realizing the differential
signal generation unit 30 and the signal amplification unit is not limited to this.
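When the difference signal generation unit 30 and the signal amplification unit are realized in software rather than by the circuit of FIG. 3, the processing might look like the following Python sketch. The function name `difference_signal` and the sample values are hypothetical; only the sample-by-sample subtraction without any spectral analysis and the tenfold gain come from the description above.

```python
def difference_signal(first, second, gain=1.0):
    """Subtract the second voltage signal from the first, sample by sample,
    and scale the result. No Fourier analysis or other analysis is involved."""
    if len(first) != len(second):
        raise ValueError("both voltage signals must have the same number of samples")
    return [gain * (a - b) for a, b in zip(first, second)]

# Hypothetical sampled voltages from the first and second microphones.
first_voltage = [0.50, 0.20, -0.30, 0.10]
second_voltage = [0.30, 0.20, -0.10, 0.05]

# Tenfold amplification, matching the gain stated for the circuit of FIG. 3.
amplified = difference_signal(first_voltage, second_voltage, gain=10.0)
print(amplified)
```

Because each output sample depends only on the two current input samples, the same operation maps naturally onto a simple differential amplifier.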
[0052]
The voice input device according to the present embodiment may include a housing 40. At this
time, the outer shape of the voice input device may be configured by the housing 40. A basic
posture may be set for the housing 40, which makes it possible to regulate the traveling path of the
input voice. The first and second vibrating membranes 12 and 22 may be formed on the surface
of the housing 40. Alternatively, the first and second diaphragms 12 and 22 may be disposed
inside the housing 40 so as to face the opening (sound inlet) formed in the housing 40. Then, the
first and second diaphragms 12 and 22 may be arranged to be different in distance from the
sound source (model sound source of the incident sound). For example, as shown in FIG. 1, the
housing 40 may have its basic posture set so that the traveling path of the input voice is along
the surface of the housing 40. The first and second diaphragms 12 and 22 may be disposed
along the traveling path of the input voice. Then, it is possible to use the first diaphragm 12 as
the diaphragm disposed on the upstream side of the traveling path of the input voice and the
second diaphragm 22 as the diaphragm disposed on the downstream side.
[0053]
The voice input device according to the present embodiment may further include an arithmetic
processing unit 50. The arithmetic processing unit 50 performs various arithmetic processing
based on the difference signal generated by the difference signal generation unit 30. The
arithmetic processing unit 50 may perform analysis processing on the difference signal. The
arithmetic processing unit 50 may perform processing (so-called voice authentication
processing) of identifying a person who has issued an input voice by analyzing the difference
signal. Alternatively, the arithmetic processing unit 50 may perform processing (so-called voice
recognition processing) for specifying the content of the input voice by analyzing the difference
signal. The arithmetic processing unit 50 may perform processing of creating various commands
based on the input voice. The arithmetic processing unit 50 may perform a process of amplifying
the difference signal. Further, the arithmetic processing unit 50 may control the operation of the
communication processing unit 60 described later. The arithmetic processing unit 50 may realize
each of the above functions by signal processing by a CPU or a memory.
[0054]
The arithmetic processing unit 50 may be disposed inside the housing 40, but may be disposed
outside the housing 40. When the arithmetic processing unit 50 is disposed outside the housing
40, the arithmetic processing unit 50 may acquire the difference signal via the communication
processing unit 60 described later.
[0055]
The voice input device according to the present embodiment may further include the
communication processing unit 60. The communication processing unit 60 controls
communication between the voice input device and another terminal (such as a mobile phone
terminal or a host computer). The communication processing unit 60 may have a function of
transmitting a signal (difference signal) to another terminal via the network. The communication
processing unit 60 may also have a function of receiving signals from other terminals via the
network. For example, the host computer may then analyze the difference signal acquired via the
communication processing unit 60 and perform various information processing such as voice
recognition processing, voice authentication processing, command generation processing, and data
storage processing. That is, the voice input device may configure the information
processing system in cooperation with another terminal. In other words, the voice input device
may be regarded as an information input terminal that constructs an information processing
system. However, the voice input device may be configured without the communication
processing unit 60.
[0056]
The voice input device according to the present embodiment may further include a display device
such as a display panel and a voice output device such as a speaker. Further, the voice input
device according to the present embodiment may further include an operation key for inputting
operation information.
[0057]
The voice input device according to the present embodiment may have the above configuration.
According to this voice input device, a signal (voltage signal) indicating the voice component
from which the noise component has been removed is generated by a simple process of only
outputting the difference between the first and second voltage signals. Therefore, according to
the present invention, it is possible to provide a voice input device that can be miniaturized and
has an excellent noise removal function. The principle will be described in detail later.
[0058]
2. Noise Removal Function
Hereinafter, the noise removal principle employed by the voice input device according to the
present embodiment, and the conditions for realizing it, will be described.
[0059]
(1) Noise Removal Principle
First, the noise removal principle of the voice input device according to the present embodiment
will be described.
[0060]
The sound wave attenuates as it travels through the medium, and the sound pressure (intensity /
amplitude of the sound wave) decreases.
Since the sound pressure is inversely proportional to the distance from the sound source, the
sound pressure P can be expressed in terms of the distance r from the sound source as
[0061]
$$P = \frac{k}{r} \tag{1}$$
In equation (1), k is a proportionality constant. FIG. 4 shows a graph representing equation (1).
As can be understood from this figure, the sound pressure (amplitude of the sound wave) attenuates
rapidly at positions close to the sound source (left side of the graph) and attenuates more gently
as the distance from the sound source increases. The voice input device according to the present
embodiment removes noise components using this attenuation characteristic.
[0062]
That is, in the close-talking voice input device, the user emits voice from a position closer to
the first and second microphones 10 and 20 (the first and second diaphragms 12 and 22) than the
noise source. Therefore, the voice of the user is greatly attenuated between the first and second
diaphragms 12 and 22, and a difference appears in the strength of the user voice included in the
first and second voltage signals. On the other hand, the noise component hardly attenuates
between the first and second diaphragms 12 and 22 because the sound source is farther than the
user's voice. Therefore, it can be considered that no difference appears in the intensity of the
noise contained in the first and second voltage signals. From this, when the difference between
the first and second voltage signals is detected, the noise is eliminated, so that a voltage signal
(difference signal) indicating only the voice component of the user, without the noise component,
can be obtained. That is, the differential signal can be regarded as a
signal indicating the voice of the user from which the noise component has been removed.
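The effect of the attenuation characteristic can be checked numerically with a short Python sketch. It assumes the inverse-distance law stated above and ignores phase entirely, which the text goes on to treat separately; the 5 mm spacing, the 25 mm mouth distance, and the 1 m noise distance are hypothetical values chosen for illustration.

```python
def sound_pressure(distance_m, k=1.0):
    """Sound pressure under the inverse-distance law; k is an arbitrary constant."""
    return k / distance_m

def relative_difference(distance_to_first_m, spacing_m):
    """Fraction of the first diaphragm's pressure that survives subtraction,
    assuming the second diaphragm is spacing_m farther from the source."""
    p_first = sound_pressure(distance_to_first_m)
    p_second = sound_pressure(distance_to_first_m + spacing_m)
    return (p_first - p_second) / p_first

spacing = 0.005  # hypothetical center-to-center distance of 5 mm

voice = relative_difference(0.025, spacing)  # user's mouth 25 mm away (assumed)
noise = relative_difference(1.0, spacing)    # noise source 1 m away (assumed)

print(f"voice remaining after subtraction: {voice:.4f}")
print(f"noise remaining after subtraction: {noise:.4f}")
```

With these assumed distances, a far larger fraction of the nearby voice than of the distant noise survives the subtraction, which is the behavior the paragraph above relies on.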
[0063]
However, the sound wave has a phase component. Therefore, in order to realize a highly reliable
noise removal function, it is necessary to consider the phase difference between the voice
component and the noise component included in the first and second voltage signals.
[0064]
Hereinafter, specific conditions to be satisfied by the voice input device in order to realize the
noise removal function by generating the difference signal will be described.
[0065]
(2) Specific Conditions to be Satisfied by the Voice Input Device
As described above, the voice input device according to the present embodiment regards the
difference signal indicating the difference between the first and second voltage signals as an
input voice signal that does not include noise.
According to this voice input device, if the noise component included in the difference signal is
smaller than the noise component included in the first or second voltage signal, it can be
evaluated that the noise removal function has been realized. Specifically, the noise removal
function can be evaluated as realized if the noise intensity ratio, indicating the ratio of the
intensity of the noise component contained in the difference signal to the intensity of the noise
component contained in the first or second voltage signal, is smaller than the voice intensity
ratio, indicating the ratio of the intensity of the voice component contained in the difference
signal to the intensity of the voice component included in the first or second voltage signal.
[0066]
Hereinafter, specific conditions to be satisfied by the voice input device (the first and second
diaphragms 12 and 22) in order to realize the noise removal function will be described.
[0067]
First, the sound pressure of the sound incident on the first and second microphones 10 and 20
(the first and second diaphragms 12 and 22) will be considered.
Assuming that the distance from the sound source of the input voice (user's voice) to the first
diaphragm 12 is R and the phase difference is ignored, the sound pressures (intensities) P(S1) and
P(S2) of the input voice acquired by the first and second microphones 10 and 20
[0068]
can be expressed as
[0069]
Therefore, when the phase difference of the input audio is ignored, the audio intensity ratio ρ(P),
which indicates the ratio of the intensity of the input audio component included in the difference
signal to the intensity of the input audio component acquired by the first microphone 10, is
represented as
[0070]
Here, the voice input device according to the present embodiment is a close-talking voice input
device, and Δr can be regarded as sufficiently smaller than R.
[0071]
Therefore, the above equation (4) can be transformed into
[0072]
That is, it can be seen that the speech intensity ratio when the phase difference of the input
speech is ignored is expressed as Expression (A).
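The equations referred to in paragraphs [0067] to [0072] are not reproduced in this text. A reconstruction consistent with the surrounding definitions is sketched below. It assumes that sound pressure is inversely proportional to the distance from the sound source with a proportionality constant K (a symbol introduced here), and that the second diaphragm lies a further Δr behind the first as seen from the sound source. The numbers (2) and (3) are inferred; only (4) and (A) are named in the text.

```latex
\begin{align}
P(S1) &= \frac{K}{R} \tag{2} \\
P(S2) &= \frac{K}{R + \Delta r} \tag{3} \\
\rho(P) &= \frac{P(S1) - P(S2)}{P(S1)} = \frac{\Delta r}{R + \Delta r} \tag{4} \\
\rho(P) &\approx \frac{\Delta r}{R} \qquad (\Delta r \ll R) \tag{A}
\end{align}
```

The last line agrees with the later statement that Δr/R is the intensity ratio of the amplitude component of the input voice, as shown in expression (A).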
[0073]
When the phase difference of the input voice is taken into consideration, the sound pressures
Q(S1) and Q(S2) of the user's voice can be expressed as follows.
In the equations, α is the phase difference.
In this case, the speech intensity ratio ρ(S) is expressed as
Considering equation (7), the magnitude of the speech intensity ratio ρ(S) can be expressed as
[0074]
In equation (8), the term sin ωt − sin(ωt − α) indicates the intensity ratio of the phase
component, and the term (Δr/R) sin ωt indicates the intensity ratio of the amplitude component.
Even for the input voice component, the phase difference component acts as noise with respect to
the amplitude component. Therefore, in order to extract the input voice (user's voice) with high
accuracy, the intensity ratio of the phase component must be sufficiently smaller than the
intensity ratio of the amplitude component.
That is, sin ωt − sin(ωt − α) and (Δr/R) sin ωt
[0075]
must satisfy the relationship
[0076]
Here, since it can be represented as, the above-mentioned formula (B) can be represented as
[0077]
In consideration of the amplitude component of equation (10), it is understood that the voice
input device according to the present embodiment needs to satisfy
[0078]
As described above, since Δr can be regarded as sufficiently smaller than R, sin (α / 2) can be
regarded as sufficiently small, and can be approximated as follows.
[0079]
Therefore, Formula (C) can be transformed into
[0080]
Further, if the relationship between the phase difference α and Δr is expressed as, the equation
(D) can be transformed into
[0081]
That is, in the present embodiment, in order to extract the input voice (user's voice) with high
accuracy, it is necessary to manufacture the voice input device so as to satisfy the relationship
shown in equation (E).
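The expressions (B) to (E) referred to in paragraphs [0074] to [0081] are not reproduced in this text. The chain below is a reconstruction sketched from the surrounding description, not a verbatim copy of the original equations. It assumes that the comparison is made between peak values over time (written here as a subscript max), and that the phase difference is that of a wave travelling along the line joining the two diaphragms, so that α = 2πΔr/λ.

```latex
\begin{align}
\left| \frac{\Delta r}{R} \sin\omega t \right|_{\max}
  &> \left| \sin\omega t - \sin(\omega t - \alpha) \right|_{\max} \tag{B} \\
\sin\omega t - \sin(\omega t - \alpha)
  &= 2 \sin\frac{\alpha}{2} \cos\!\left(\omega t - \frac{\alpha}{2}\right) \tag{9} \\
\left| \frac{\Delta r}{R} \sin\omega t \right|_{\max}
  &> \left| 2 \sin\frac{\alpha}{2} \cos\!\left(\omega t - \frac{\alpha}{2}\right) \right|_{\max} \tag{10} \\
\frac{\Delta r}{R} &> 2 \sin\frac{\alpha}{2} \tag{C} \\
\sin\frac{\alpha}{2} &\approx \frac{\alpha}{2} \tag{11} \\
\frac{\Delta r}{R} &> \alpha \tag{D} \\
\alpha &= \frac{2\pi \Delta r}{\lambda} \tag{12} \\
\frac{\Delta r}{R} &> \frac{2\pi \Delta r}{\lambda} \tag{E}
\end{align}
```

Equation (9) is the standard sum-to-product identity. The step from (10) to (C) uses the fact that the peak values of |sin ωt| and |cos(ωt − α/2)| are both 1.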
[0082]
Next, the sound pressure of noise incident on the first and second microphones 10 and 20 (the
first and second diaphragms 12 and 22) will be examined.
[0083]
Assuming that the amplitudes of the noise components acquired by the first and second
microphones are A and A′, the sound pressures Q(N1) and Q(N2) of the noise, taking the phase
difference component into account, can be expressed as
The noise intensity ratio ρ(N), which indicates the ratio of the intensity of the noise component
included in the difference signal to the intensity of the noise component acquired by the first
microphone 10, can be expressed as
[0084]
As described above, the amplitudes (intensity) of the noise components acquired by the first and
second microphones are almost the same, and can be treated as A = A ′.
Therefore, the above equation (15) can be transformed into
[0085]
And, the magnitude of the noise intensity ratio can be expressed as
[0086]
Here, considering the above equation (9), equation (17) can be transformed into
[0087]
Then, considering equation (11), equation (18) can be transformed into
[0088]
Here, referring to equation (D), the noise intensity ratio can be expressed as
Here, Δr / R is an intensity ratio of amplitude components of input speech (user speech), as
shown in equation (A).
From this equation (F), it can be seen that in this voice input device, the noise intensity ratio is
smaller than the intensity ratio Δr / R of the input voice.
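The noise-side result can be checked numerically. The sketch below assumes, as the description of equations (13) to (19) suggests, that the two noise signals are sinusoids of equal amplitude that differ only by a phase difference α. It compares the sampled peak of their difference with the closed form 2 sin(α/2) and with the small-angle value α. The function name and the sample value of α are illustrative assumptions.

```python
import math


def noise_intensity_ratio(alpha: float, samples: int = 100_000) -> float:
    """Peak of |sin(wt) - sin(wt - alpha)| over one period.

    The peak of |sin(wt)| is 1, so this peak is already the ratio.
    """
    peak = 0.0
    for k in range(samples):
        wt = 2.0 * math.pi * k / samples
        peak = max(peak, abs(math.sin(wt) - math.sin(wt - alpha)))
    return peak


alpha = 0.1                               # hypothetical phase difference in radians
numeric = noise_intensity_ratio(alpha)
closed_form = 2.0 * math.sin(alpha / 2.0)

print(numeric, closed_form, alpha)
```

For small α the three values nearly coincide, which is why the noise intensity ratio can be treated as approximately α and compared directly with Δr/R.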
[0089]
From the above, according to the voice input device designed such that the intensity ratio of the
phase component of the input voice is smaller than the intensity ratio of the amplitude
component (see expression (B)), the noise intensity ratio becomes smaller than the input voice
intensity ratio (see expression (F)).
Conversely, according to the voice input device designed so that the noise intensity ratio is
smaller than the input speech intensity ratio, a highly accurate noise removal function can be
realized.
[0090]
That is, according to the voice input device of the present embodiment, in which the first and
second diaphragms 12 and 22 (the first and second microphones 10 and 20) are arranged such that
the noise intensity ratio is smaller than the input voice intensity ratio, a highly accurate noise
removal function can be realized.
[0091]
3. Method of Manufacturing Voice Input Device
Hereinafter, a method of manufacturing the voice input device according to the present
embodiment will be described.
In the present embodiment, the voice input device is manufactured using data indicating the
correspondence between the value of Δr/λ, which is the ratio of the distance Δr between the
centers of the first and second diaphragms 12 and 22 to the wavelength λ of the noise, and the
noise intensity ratio (the intensity ratio based on the phase component of the noise).
[0092]
The intensity ratio based on the phase component of noise is expressed by the above-mentioned
equation (18).
Therefore, the decibel value of the intensity ratio based on the phase component of noise can be
expressed as
[0093]
Then, by substituting various values for α in equation (20), the correspondence between the phase
difference α and the intensity ratio based on the phase component of noise can be clarified.
FIG. 5 shows an example of data representing the correspondence between the phase difference
and the intensity ratio, where the horizontal axis is α/2π and the vertical axis is the intensity
ratio (decibel value) based on the phase component of noise.
[0094]
The phase difference α can be expressed as a function of Δr/λ, which is the ratio of the
distance Δr to the wavelength λ, as shown in equation (12), and therefore the horizontal axis in
FIG. 5 can be regarded as Δr/λ.
That is, FIG. 5 can be said to be data indicating the correspondence between the intensity
ratio based on the phase component of noise and Δr/λ.
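Data of the kind plotted in FIG. 5 can be tabulated with a few lines of code. The sketch below assumes that equation (20), which is not reproduced in this text, has the form 20 log10(2 sin(α/2)) and that equation (12) is α = 2πΔr/λ. Both forms are inferred from the surrounding description, and the function name and sample points are illustrative.

```python
import math


def phase_noise_ratio_db(spacing_over_wavelength: float) -> float:
    """Assumed form: 20*log10(2*sin(alpha/2)) with alpha = 2*pi*(delta_r / lambda)."""
    alpha = 2.0 * math.pi * spacing_over_wavelength
    return 20.0 * math.log10(2.0 * math.sin(alpha / 2.0))


# Tabulate the noise intensity ratio (dB) against delta_r / lambda.
for x in (0.005, 0.015, 0.05, 0.16, 1.0 / 6.0):
    print(f"delta_r/lambda = {x:.4f} -> {phase_noise_ratio_db(x):6.2f} dB")
```

Under this assumed form the ratio crosses 0 dB at Δr/λ = 1/6 and falls by roughly 6 dB each time Δr/λ is halved in the small-ratio range.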
[0095]
In this embodiment, this data is used to manufacture a voice input device.
FIG. 6 is a flow chart for explaining the procedure of manufacturing a voice input device using
this data.
[0096]
First, data (see FIG. 5) indicating the correspondence between the noise intensity ratio (intensity
ratio based on the noise phase component) and Δr / λ is prepared (step S10).
[0097]
Next, the noise intensity ratio is set according to the application (step S12).
In the present embodiment, it is necessary to set the noise intensity ratio so that the noise
intensity decreases.
Therefore, in this step, the noise intensity ratio is set to 0 dB or less.
[0098]
Next, based on the data, a value of Δr / λ corresponding to the noise intensity ratio is derived
(step S14).
[0099]
Then, the condition of Δr to be satisfied is derived by substituting the wavelength of the main
noise for λ (step S16).
[0100]
As a specific example, consider the case of manufacturing a voice input device in which the noise
intensity decreases by 20 dB in an environment where the main noise is 1 kHz and its
wavelength is 0.347 m.
[0101]
First, as a necessary condition, a condition for the noise intensity ratio to be 0 dB or less is
examined.
Referring to FIG. 5, it can be understood that the value of Δr / λ may be set to 0.16 or less in
order to set the noise intensity ratio to 0 dB or less.
That is, it turns out that the value of Δr should be 55.46 mm or less, which is a necessary
condition of this voice input device.
[0102]
Next, conditions for reducing the intensity of 1 kHz noise by 20 dB will be considered.
Referring to FIG. 5, it can be seen that the value of Δr / λ may be set to 0.015 in order to
reduce the noise intensity by 20 dB.
Then, assuming that λ = 0.347 m, it is understood that this condition is satisfied when the value
of Δr is 5.199 mm or less.
That is, if Δr is set to about 5.2 mm or less, it becomes possible to manufacture a close talk type
voice input device having a noise removal function.
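Steps S12 to S16 and the worked example can be sketched in code. The text reads the values of Δr/λ off FIG. 5; the sketch below instead inverts an assumed closed form, 20 log10(2 sin(πΔr/λ)), which is inferred from the description of equations (12) and (20) and is not confirmed by the source. The function name is hypothetical.

```python
import math


def max_spacing_m(target_db: float, wavelength_m: float) -> float:
    """Largest delta_r with 20*log10(2*sin(pi*delta_r/lambda)) <= target_db.

    Valid for target_db <= 0 under the assumed closed form.
    """
    ratio = 10.0 ** (target_db / 20.0)        # step S12: target noise intensity ratio
    x = math.asin(ratio / 2.0) / math.pi      # step S14: corresponding delta_r / lambda
    return x * wavelength_m                   # step S16: resulting condition on delta_r


wavelength = 0.347                            # wavelength of the 1 kHz main noise, from the text
print(f"0 dB limit:     {max_spacing_m(0.0, wavelength) * 1000:.1f} mm")
print(f"-20 dB limit:   {max_spacing_m(-20.0, wavelength) * 1000:.1f} mm")
print(f"0.015 * lambda: {0.015 * wavelength * 1000:.2f} mm")
```

Under this assumption the limits come out slightly larger than the values read from FIG. 5 in the text (0.16 and 0.015), so the figures of about 55 mm and about 5.2 mm given above sit on the conservative side of the closed form.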
[0103]
The voice input device according to the present embodiment is a close-talking voice input device,
and the distance between the sound source of the user's voice and the first or second diaphragm
12 or 22 is usually 5 cm or less. Further, this distance can be controlled by the design of the
housing 40. Therefore, the value of Δr/R, which is the intensity ratio of the input voice (user's
voice), is larger than 0.1, the noise intensity ratio corresponding to a 20 dB reduction (for
example, Δr = 5.2 mm and R = 5 cm give Δr/R = 0.104), and it can be seen that the noise removal
function is realized.
[0104]
Generally, noise is not limited to a single frequency. However, since the noise whose frequency is
lower than the noise assumed as the main noise has a longer wavelength than the main noise, the
value of Δr / λ becomes smaller and is removed by this voice input device. Also, the higher the
frequency of sound waves, the faster the energy decays. Therefore, noise higher in frequency
than noise assumed as main noise attenuates earlier than the main noise, so the influence on the
voice input device can be ignored. From this, the voice input device according to the present
embodiment can exhibit an excellent noise removal function even in the presence of noise of a
frequency different from the noise assumed as the main noise.
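The low-frequency half of this argument can be illustrated with a short calculation. The sketch below fixes the spacing at 5.2 mm, takes the speed of sound as 347 m/s (implied by the text's pairing of 1 kHz with a wavelength of 0.347 m), and evaluates the assumed closed form 20 log10(2 sin(πΔr/λ)) at several frequencies at or below the main noise. The closed form is inferred from the description of equations (12) and (20), and the names are illustrative. The separate claim about faster decay of high-frequency sound is not modelled here.

```python
import math

SPEED_OF_SOUND = 347.0   # m/s, implied by the text's 1 kHz <-> 0.347 m example
SPACING = 0.0052         # m, the roughly 5.2 mm spacing derived in the text


def noise_ratio_db(frequency_hz: float) -> float:
    """Assumed phase-based noise intensity ratio in dB at the given frequency."""
    wavelength = SPEED_OF_SOUND / frequency_hz
    alpha = 2.0 * math.pi * SPACING / wavelength
    return 20.0 * math.log10(2.0 * math.sin(alpha / 2.0))


for f in (125.0, 250.0, 500.0, 1000.0):
    print(f"{f:6.0f} Hz -> {noise_ratio_db(f):6.1f} dB")
```

Each halving of the frequency doubles the wavelength and lowers the ratio by about 6 dB, so noise below the assumed main noise frequency is suppressed more strongly.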
[0105]
Further, in the present embodiment, as can be understood from equation (12), noise incident
along the straight line connecting the first and second diaphragms 12 and 22 is assumed. For this
noise, the apparent distance between the first and second diaphragms 12 and 22 is the largest,
so it is the noise with the largest phase difference in an actual use environment. That is,
the voice input device according to the present embodiment is configured to be capable of
removing the noise with the largest phase difference. Therefore, according to the voice input
device according to the present embodiment, noise incident from all directions is removed.
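The dependence on the incident direction can be sketched as follows. The model assumes a plane wave from a distant noise source, so that the apparent distance between the diaphragms is Δr·|cos θ|, where θ is the angle between the incident direction and the line joining the diaphragms. This plane-wave model, the function name, and the sample values are assumptions made for illustration.

```python
import math


def phase_difference(spacing_m: float, wavelength_m: float, angle_deg: float) -> float:
    """Phase difference for a plane wave arriving at angle_deg from the diaphragm axis."""
    apparent_spacing = spacing_m * abs(math.cos(math.radians(angle_deg)))
    return 2.0 * math.pi * apparent_spacing / wavelength_m


# 5.2 mm spacing and 0.347 m wavelength are the values used in the text's example.
for angle in (0, 30, 60, 90):
    alpha = phase_difference(0.0052, 0.347, angle)
    print(f"{angle:3d} deg -> alpha = {alpha:.4f} rad")
```

The phase difference is largest at 0 degrees (incidence along the axis) and shrinks toward broadside incidence, so a spacing chosen for the on-axis case also covers the other directions under this model.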
[0106]
4. Effects
Hereinafter, effects of the voice input device according to the present embodiment will be
described.
[0107]
As described above, according to the voice input device according to the present embodiment,
a voice component from which the noise component has been removed can be obtained simply by
generating the difference signal indicating the difference between the voltage signals acquired
by the first and second microphones 10 and 20. That is, in this voice input device, the noise
removal function can be realized without performing complicated analysis processing. Therefore,
according to the present embodiment, it is possible to provide a voice input device capable of
realizing a highly accurate noise removal function with a simple configuration.
[0108]
In addition, this voice input device realizes the noise removal function by making the noise
intensity ratio based on the phase difference smaller than the intensity ratio of the input voice.
The noise intensity ratio based on the phase difference changes depending on the arrangement
direction of the first and second diaphragms 12 and 22 and the incident direction of the noise.
That is, as the distance (apparent distance) between the first and second diaphragms 12 and 22
as seen by the noise increases, the phase difference of the noise increases, and the noise
intensity ratio based on the phase difference increases. In the present embodiment, as understood
from equation (12), the voice input device is configured to be able to remove the noise for which
the apparent distance between the first and second diaphragms 12 and 22 is the widest. In other
words, the first and second diaphragms 12 and 22 are arranged such that noise incident from the
direction that maximizes the noise intensity ratio based on the phase difference can be removed.
Therefore, according to this voice input device, noise incident from all directions is removed.
That is, according to the present invention, it is possible to provide a voice input device capable
of removing noise incident from all directions.
[0109]
According to this voice input device, it is also possible to remove the user voice component
incident on the voice input device after being reflected by a wall or the like. Specifically, the
sound source of the user's voice reflected by the wall or the like can be regarded as being farther
away than the sound source of the normal user's voice, and its energy has already been largely
lost through reflection, so its sound pressure is not significantly attenuated between the first and
second diaphragms 12 and 22. Therefore, according to this voice input device, the user voice
component incident on the voice input device after being reflected by a wall or the like is also
removed as a kind of noise.
[0110]
Then, by using this voice input device, it is possible to obtain a signal indicating the input voice,
which does not include noise. Therefore, by using this voice input device, highly accurate voice
recognition, voice authentication, and command generation processing can be realized.
[0111]
Moreover, if this voice input device is applied to a microphone system, the user's voice output
from the speaker is also removed as noise. Therefore, it is possible to provide a microphone
system in which howling does not easily occur.
[0112]
5. Voice Input Device According to Second Embodiment
Next, a voice input device according to a second embodiment to which the present invention is
applied will be described with reference to FIG.
[0113]
The voice input device according to the present embodiment includes a base 70. A recess 74 is
formed in the main surface 72 of the base 70. In the voice input device according to the present
embodiment, the first vibrating membrane 12 (the first microphone 10) is disposed on the bottom
surface 75 of the recess 74, and the second vibrating membrane 22 (the second microphone 20) is
disposed on the main surface 72. The recess 74 may extend perpendicularly to the main surface 72,
and the bottom surface 75 of the recess 74 may be a surface parallel to the main surface 72. The
bottom surface 75 may be a surface orthogonal to the recess 74. In addition, the recess 74 may
have the same outer shape as the first vibrating membrane 12.
[0114]
In the present embodiment, the recess 74 may be shallower than the distance between the region
76 and the opening 78. That is, assuming that the depth of the recess 74 is d and the distance
between the region 76 and the opening 78 is ΔG, the base 70 may satisfy d ≦ ΔG. The base 70
may satisfy 2d = ΔG. Note that ΔG may be 5.2 mm or less. Alternatively, the base 70 may be
configured such that the linear distance connecting the centers of the first and second vibrating
membranes 12 and 22 is 5.2 mm or less.
[0115]
The base 70 is disposed such that the opening 78 communicating with the recess 74 is disposed
at a position closer to the sound source of the input sound than the region 76 in the main surface
72 where the second diaphragm 22 is disposed. The base 70 may be installed so that the input
voice arrives at the first and second vibrating membranes 12 and 22 simultaneously. For
example, the base 70 may be installed so that the distance between the sound source of the input
voice (model sound source) and the first diaphragm 12 is the same as the distance between the
model sound source and the second diaphragm 22. The base 70 may be installed in a housing in
which the basic posture is set so as to satisfy the above conditions.
[0116]
According to the voice input device according to the present embodiment, it is possible to reduce
the deviation of the incident time of the input voice (user's voice) incident on the first and second
diaphragms 12 and 22. That is, since the difference signal can be generated so that the phase
difference component of the input speech is not included, the amplitude component of the input
speech can be extracted with high accuracy.
[0117]
Note that since the sound wave does not diffuse in the recess 74, the amplitude of the sound
wave is hardly attenuated. Therefore, in this voice input device, the strength (amplitude) of the
input voice for vibrating the first diaphragm 12 can be regarded as the same as the strength of
the input voice at the opening 78. From this, even when the voice input device is configured such
that the input voice reaches the first and second vibrating membranes 12 and 22 simultaneously,
a difference appears in the strength of the input voice that vibrates the first and second
vibrating membranes 12 and 22. Therefore, the input voice can be extracted by acquiring a
difference signal indicating the difference between the first and second voltage signals.
[0118]
In summary, according to this voice input device, the amplitude component (difference signal) of
the input voice can be acquired so as not to include noise based on the phase difference
component of the input voice. Therefore, it is possible to realize a highly accurate noise removal
function.
[0119]
In addition, since the resonant frequency of the recess 74 can be set high by setting the depth of
the recess 74 to ΔG or less (5.2 mm or less), generation of resonance noise in the recess 74 can
be prevented.
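The relationship between recess depth and resonant frequency can be illustrated with a textbook approximation. The source does not state which resonance model applies to the recess 74; the sketch below assumes an idealized tube closed at one end (quarter-wavelength resonance, no end correction) and a speed of sound of 347 m/s, implied by the text's pairing of 1 kHz with a wavelength of 0.347 m. The function name is hypothetical.

```python
SPEED_OF_SOUND = 347.0  # m/s, implied by the text's 1 kHz <-> 0.347 m example


def quarter_wave_resonance_hz(depth_m: float) -> float:
    """Lowest resonance of an idealized tube closed at one end: f = c / (4 * depth)."""
    return SPEED_OF_SOUND / (4.0 * depth_m)


for depth_mm in (5.2, 10.0, 20.0):
    f = quarter_wave_resonance_hz(depth_mm / 1000.0)
    print(f"depth {depth_mm:4.1f} mm -> about {f / 1000.0:.1f} kHz")
```

Under this idealization a depth of 5.2 mm places the lowest resonance well above the main voice band, and a shallower recess pushes it higher still.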
[0120]
FIG. 8 shows a modification of the voice input device according to the present embodiment.
[0121]
The voice input device according to the present embodiment includes a base 80.
In the main surface 82 of the base 80, a first recess 84 and a second recess 86 shallower than
the first recess 84 are formed.
The difference Δd between the depths of the first and second recesses 84 and 86 may be smaller
than the interval ΔG between the first opening 85 communicating with the first recess 84 and the
second opening 87 communicating with the second recess 86. The first vibrating membrane 12 is
disposed on the bottom of the first recess 84, and the second vibrating membrane 22 is disposed
on the bottom of the second recess 86.
[0122]
Even with this voice input device, it is possible to realize a high precision noise removal function
since the same effect as described above can be obtained.
[0123]
Finally, FIG. 9 to FIG. 11 respectively show a mobile phone 300, a microphone (microphone
system) 400, and a remote controller 500 as an example of the voice input device according to
the embodiment of the present invention.
Further, FIG. 12 shows a schematic view of an information processing system 600 including a
voice input device 602 as an information input terminal and a host computer 604.
[0124]
The present invention is not limited to the above-described embodiment, and various
modifications can be made. The present invention includes configurations substantially the same
as the configurations described in the embodiments (for example, configurations having the same
function, method and result, or configurations having the same purpose and effect). Further, the
present invention includes a configuration in which a nonessential part of the configuration
described in the embodiment is replaced. The present invention also includes configurations that
can achieve the same effects as the configurations described in the embodiments or that can
achieve the same purpose. Further, the present invention includes a configuration in which a
known technology is added to the configuration described in the embodiment.
[0125]
FIG. 1 is a diagram for explaining the voice input device.
FIG. 2 is a diagram for explaining the voice input device.
FIG. 3 is a diagram for explaining the voice input device.
FIG. 4 is a diagram for explaining the voice input device.
FIG. 5 is a diagram for explaining the method of manufacturing the voice input device.
FIG. 6 is a diagram for explaining the method of manufacturing the voice input device.
FIG. 7 is a diagram for explaining the voice input device.
FIG. 8 is a diagram for explaining the voice input device.
FIG. 9 is a diagram showing a mobile phone as an example of the voice input device.
FIG. 10 is a diagram showing a microphone as an example of the voice input device.
FIG. 11 is a diagram showing a remote controller as an example of the voice input device.
FIG. 12 is a schematic view of an information processing system.
Explanation of Symbols
[0126]
1: voice input device, 10: first microphone, 12: first diaphragm, 20: second microphone,
22: second diaphragm, 30: difference signal generation unit, 40: housing, 50: processing unit,
60: communication processing unit, 70: base, 72: main surface, 74: recess, 75: bottom surface,
76: region, 78: opening, 80: base, 82: main surface, 84: first recess, 85: first opening,
86: second recess, 87: second opening, 100: capacitor-type microphone, 102: vibrating membrane,
104: electrode, 300: mobile phone, 400: microphone, 500: remote controller,
600: information processing system, 602: information input terminal, 604: host computer