Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2006050303
An object of the present invention is to provide a sound input device provided with highly
adaptive noise removal means applicable to special environments such as a vehicle interior.
SOLUTION: The sound input device comprises one or more microphone means (110-1 to 110-n) for
detecting a sound; sound input means (120) for inputting the sound; filter means (130) for
removing noise components from the sound input by the sound input means (120); first storage
means (150) for storing information on noise; filter calculation means (160) for updating the
contents of the filter means (130) and storing, in the first storage means (150), information on
the noise of the sound signal used for the update; and judgment means (140) for judging that the
filter calculation means (160) is to operate when the type of noise component in a sound signal
that is acquired from the sound input means (120) and does not contain the target signal input
by the user differs from the type of noise component stored in the first storage means (150).
[Selected figure] Figure 1
Sound input device
[0001]
The present invention relates to a sound input device.
[0002]
In recent years, voice input systems in vehicle compartments have been widely used for operating in-vehicle devices by voice recognition, for hands-free telephony, and the like.
08-05-2019
1
One of the factors that hinders the realization of these technologies is the presence of
environmental noise in the vehicle compartment. In general, a voice recognition device or
hands-free telephone used in a vehicle interior employs a noise removal algorithm that
suppresses traveling noise, whose energy is concentrated in the low-frequency region, using a
fixed-coefficient filter such as a high-pass filter.
[0003]
In addition, as an example incorporating a simple adaptation technique, it has been proposed to
change the cutoff frequency of the high pass filter according to the input signal (see Patent
Document 1 below). In this method, although adaptation is performed for each input, the amount
of calculation is suppressed because the adaptation range of the filter is limited. These
techniques aim at the removal of the steady noise in a vehicle interior.
[0004]
On the other hand, a directional noise removal technique has been proposed which is effective
for noise coming from a specific direction, regardless of whether the noise is stationary or
non-stationary. However, directional noise removal techniques are difficult to use in a vehicle
interior environment because of the large amount of calculation they require.
[0005]
Patent Document 1: Japanese Patent Application Publication No. 2001-296887
Patent Document 2: Japanese Patent Application Publication No. 2003-271166
Non-Patent Document 1: "Acoustic system and digital processing", pp. 136-142, edited by the
Institute of Electronics, Information and Communication Engineers, published on March 25, 1995.
[0006]
Among noises considered to be steady in the passenger compartment, there are noises that are
not always steady.
For example, road noise and engine noise during driving are generally considered to be highly
stable, but in practice they change over time with vehicle speed, road surface conditions, and
deterioration of the tires, engine, exhaust system, and so on. It is difficult to remove all of
these noises with a high-pass filter that has fixed coefficients. Removing interior noise over a
long observation period therefore requires a highly adaptive noise removal method.
[0007]
There are also many non-stationary noises in the passenger compartment, for example a
passenger's speech, changes in the surrounding environment, rain noise, and wind noise. Such
noise cannot be removed by a method that assumes stationary noise. Here, noise energy can be
significantly reduced by removing noise whose direction of arrival is clear. In either case, the
amount of calculation is large, so reducing it becomes an issue.
[0008]
The object of the present invention is to solve the above-mentioned problems and to provide a
sound input device provided with highly adaptive noise removal means which is applicable to
special environments such as passenger compartments.
[0009]
In the present invention, in order to solve the above problems, the filter for removing the
noise component from the input sound is kept in an effective state not by updating it
sequentially at all times, but by updating it only when the characteristics of the sound input
system change significantly. The sound input device is thus provided with noise removal means
using a new adaptive learning method that reduces the amount of calculation while maintaining a
certain level of S/N ratio.
[0010]
By implementing the present invention and constructing the sound input device using a new
adaptive learning method that updates the filter only when the characteristics of the sound
input system change significantly, thereby reducing the amount of calculation, it is possible to
provide a sound input device with highly adaptive noise removal means applicable to various
environments.
[0011]
In the sound input device according to the present invention, it is possible to use an adaptive
method that requires a large amount of calculation, such as a directional noise removal
algorithm, as a vehicle interior noise removal algorithm.
[0012]
The reason why the sound input device according to the present invention is applied particularly
effectively for the vehicle interior is as follows.
[0013]
In general adaptive filtering, many methods aim to update the filter sequentially using the
input sound signal so that an optimal filter is always calculated.
In these methods, performance is expected to improve as the algorithm becomes more complex, but
the amount of calculation also increases.
For example, when the method we proposed in the past (see Patent Document 2 above) is run on a
general personal computer, adapting the filter takes 30 seconds to several tens of minutes, and
the subsequent filtering takes about 0.5 to 6 seconds to process about 4 seconds of data.
[0014]
As described above, it is difficult to use the present method for a sound input system in a vehicle
cabin because the adaptation of the filter requires a very large computational cost.
However, in the voice input system in the passenger compartment, the speaker position and the
noise position do not greatly fluctuate. For a private car, the number of speakers can be limited
to several people. Noise generated in the passenger compartment can be predicted. The noise
generated in the passenger compartment is characterized by age-related changes.
This indicates that, in many cases, the filter coefficients hardly change in practice even when
filter adaptation processing is performed continuously.
That is, a constant S/N ratio can be maintained by performing the filter adaptation processing
only when the characteristics of the sound input system in the vehicle compartment change
significantly, instead of performing the filter update constantly.
[0015]
Hereinafter, the configuration of the present invention will be described by taking the case where
the sound input device according to the present invention is placed in the environment inside a
vehicle as an example, but the present invention is not limited to this.
[0016]
FIGS. 1 and 2 are block diagrams showing a configuration example of a sound input device
according to the present invention.
[0017]
The microphone means 110-1 to 110-n shown in FIGS. 1 and 2 (where n is an integer of 1 or more
and is the number of microphone means) collect the voice uttered by the user together with
environmental noise and convert them into electrical signals.
This can be realized by using the microphones shown by 210-1 to 210-n in FIG. In FIGS. 1 and 2,
only the microphone means 110-1 and 110-n are shown.
[0018]
The sound input means 120 for inputting sound shown in FIGS. 1 and 2 AD-converts the electric
signals inputted from the microphone means 110-1 to 110-n and converts them into easy-to-use
sound signals. This can be realized by the filter 220 shown in FIG. 3 or an AD converter 230
which is a real time signal discretization device. Here, the electrical signal is converted into a
discrete sound signal through an AD conversion process.
[0019]
The filter means 130 shown in FIG. 1 and FIG. 2 removes noise components from the sound signal
input by the sound input means 120 and sends out a target signal S, that is, the signal that the
user desires to input to the peripheral device by voice. This can be realized by the arithmetic
unit 240 and the storage unit 250 shown in FIG. The arithmetic device 240 may be, for example, a
general personal computer, a microcomputer, or a signal processing device such as a CPU, an MPU,
or a DSP constituting a system having an arithmetic function, or a combination of one or more of
these, and it desirably has enough computing capability for real-time processing. The storage
device 250 may be a cache memory, a main memory, a disk memory, a flash memory, a ROM, or any
other device having an information storage capability used in a general information processing
device.
[0020]
The filter calculation means 160 shown in FIG. 1 and FIG. 2 calculates a new filter from the
information on the input noise component and the target signal, updates the contents of the
filter means 130, and, as necessary, stores the information on noise or the information on the
target signal used for the update in the storage means 150, 151 or 152.
[0021]
The storage means 150 of FIG. 1 and the storage means 151, 152 of FIG. 2 store information on
the input noise component and the target signal.
This can be achieved by the storage unit 250 of FIG. The storage means 150, 151, and 152 each
serve as at least one of a first storage means for storing information on noise, a second
storage means for storing information on a target signal, and a third storage means for storing
information on both a target signal and a noise component.
[0022]
The judging means 140 shown in FIG. 1 and FIG. 2 judges whether or not to operate the filter
calculating means 160 of FIG. 1 and FIG. 2. This can be realized by the arithmetic unit 240 and
the storage unit 250 of FIG. The judging means 140 judges that the filter calculating means 160
is to be operated when, for example, the sound signal acquired from the sound input means 120
includes a target signal input by the user whose type differs from the type of target signal
stored in the second storage means 150, 151 or 152, or when the type of noise component in the
sound signal acquired from the sound input means 120 differs from the type of noise component
stored in the first storage means 150, 151 or 152. In addition to the above judgment, the
judging means 140 may, as necessary, analyze the sound signal input from the sound input means
120 to determine whether it is a target signal and whether the noise signal is negligible with
respect to the target signal. When the analysis finds a target signal with negligible noise, the
sound signal is stored as the target signal in the second storage means 150, 151 or 152. When
the analysis finds that no target signal is present, the sound signal is stored as the noise
component in the first storage means 150, 151 or 152.
[0023]
Here, the information on the noise signal includes information that can be obtained from the
noise signal, such as the noise signal itself, the direction of the noise signal, a signal
obtained by analyzing the noise signal by orthogonal transformation, the power of the noise
signal, the cepstrum of the noise signal, and the temporal difference signal of the sound signal.
[0024]
Likewise, the information on the target signal includes information that can be obtained from
the voice signal, such as the voice signal itself, a signal obtained by analyzing the voice
signal by orthogonal transformation, the power of the voice signal, the direction of the voice
signal, the cepstrum of the voice signal, the mel cepstrum of the voice signal, and the temporal
difference signal of the sound signal.
[0025]
According to the above configuration, the filter is updated only when a noise component
different from the noise component stored in advance is detected, so the calculation cost can be
reduced compared with the conventional method, in which the filter is updated on every input.
[0026]
Also, the filter can be updated when it is determined that the type of target signal has changed.
With this configuration, if the filter is updated only when a target signal having a feature
different from that of the target signal stored in advance is input, the calculation cost can be
reduced compared to the conventional method.
[0027]
Additionally, the filter can be updated when it is determined that either the target signal or
the noise has changed.
With this configuration, if the filter is updated only when a target signal or noise having a
feature different from that of the target signal or noise stored in advance is input, the
computational cost can be reduced compared to the conventional method.
[0028]
If two or more microphone means 110-1 to 110-n (where n is 2 or more) are used in the
configuration shown in FIG. 1 and FIG. 2, the arrival direction of the target signal and the
arrival direction of the noise component can be detected. If the filter is then updated only
when a target signal arrives from a direction different from the arrival direction of the target
signal stored in advance, or when a noise component arrives from a direction different from the
arrival direction of the noise component stored in advance, the calculation cost can be reduced
as compared with the conventional method.
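As an illustration of how two microphone means could yield an arrival direction, the following
minimal sketch estimates the delay between two microphone signals by cross-correlation and
converts it to a far-field angle. The description does not specify the estimation method; the
function names, the far-field model, and the 343 m/s speed of sound are assumptions made only
for this example.

```python
import math

def best_lag(sig_a, sig_b, max_lag):
    """Return the lag (in samples) by which sig_b trails sig_a.

    The lag is chosen by maximizing the cross-correlation over
    -max_lag..max_lag (hypothetical helper, not from the description).
    """
    def corr(lag):
        return sum(
            sig_a[i] * sig_b[i + lag]
            for i in range(len(sig_a))
            if 0 <= i + lag < len(sig_b)
        )
    return max(range(-max_lag, max_lag + 1), key=corr)

def arrival_angle_deg(lag, sample_rate_hz, mic_spacing_m, speed_of_sound=343.0):
    """Far-field arrival angle, measured from broadside, for one microphone pair."""
    # sin(angle) = (delay * c) / spacing; clamp to guard against rounding.
    s = lag / sample_rate_hz * speed_of_sound / mic_spacing_m
    s = max(-1.0, min(1.0, s))
    return math.degrees(math.asin(s))
```

The resulting angle can then be compared with the stored arrival direction to decide whether a
filter update is needed.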
[0029]
Next, information other than the sound signal input to the determination means 140 will be
described using FIG. 4, FIG. 5 and FIG.
[0030]
The switch means 170 shown in FIG. 4 sends out ON / OFF information of the switch used in the
environment where the sound input device according to the present invention is installed.
This can be realized by using the switch device of the arithmetic unit 240 and the information
collecting unit 260 shown in FIG.
Specifically, a toggle switch having an ON / OFF function, a jog dial, a joystick, a mouse, a track
ball, a force feedback switch, or the like is used singly or in combination.
[0031]
If the switch means 170 is used to notify the system of the presence or absence of an input,
the judgment means 140 can judge that the target signal is being input when the switch means
170 is turned ON.
[0032]
An information means 180 for collecting information other than sound shown in FIG. 5 sends out
information on the state of equipment operating in the environment where the sound input
device according to the present invention is installed.
From the information sent from the information means 180, the judging means 140 judges that
the filter calculating means 160 is to be operated when it is considered that the noise has
changed.
[0033]
The information about the equipment status refers to information from which the noise generated
by the equipment can be predicted, directly or indirectly, according to the operation status of
the equipment. Examples are information on the vehicle speed, on the operation of the air
conditioner, on the opening and closing of the windows, on the seat position, on the
passengers, and on the vehicle body, information obtained from sensors and cameras installed
inside and outside the vehicle, information on the tires, and information on the operation
target devices installed in the vehicle interior.
Specific examples include information on the air flow level of the air conditioner and information
on the speed of the vehicle.
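The judgment based on equipment status could be sketched as follows. This is only an
illustration under assumptions: the `EquipmentState` fields, the function name, and the
20 km/h speed step are hypothetical and do not come from the description.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EquipmentState:
    """Hypothetical snapshot of equipment status relevant to cabin noise."""
    vehicle_speed_kmh: float
    fan_level: int        # air flow level of the air conditioner
    window_open: bool

def noise_probably_changed(stored, current, speed_step_kmh=20.0):
    """Judge from equipment status alone whether the noise has likely changed."""
    if current.fan_level != stored.fan_level:
        return True
    if current.window_open != stored.window_open:
        return True
    # Treat a large change in vehicle speed as a change in traveling noise.
    return abs(current.vehicle_speed_kmh - stored.vehicle_speed_kmh) >= speed_step_kmh
```

A judgment of this kind needs no signal analysis at all, which fits the aim of keeping the
amount of calculation small.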
[0034]
The collection of the information and the judgment based on the information can be realized by
using the arithmetic unit 240 and the information collecting unit 260 shown in FIG. Sources
from which the information is acquired include the air-speed controller of the air conditioner,
a vehicle speed pulse generator, a camera, a sensor and the like.
[0035]
The flow of processing in an embodiment of the present invention will be described using FIGS. 7
and 8.
[0036]
FIG. 7 is a flow diagram illustrating a processing system that adapts when the characteristics of
the noise component change.
[0037]
When the system starts operation, initialization processing is first performed in step S100.
In this initial state, the first filter (N = 1) is read and loaded into memory.
[0038]
In step S110, the presence or absence of sound input is determined.
In the sound input step, there are cases where the user intentionally inputs a target signal and
cases where the system is always in the input state. In the former, it is determined that a sound
has been input when the user has made an input, and in the latter, it is determined that an input
has always been made. In any case, if a sound input is detected, the process proceeds to step
S120. If no sound input is detected, step S110 is repeated.
[0039]
In step S120, if the input sound signal contains the target signal, the process proceeds to
step S131. When there is no target signal and there is a signal section in which only a noise
component is present, the process proceeds to step S140.
[0040]
In step S131, the input sound signal is filtered, and the processed sound signal is sent to another
system.
[0041]
In step S140, the information on the noise component 1 input in step S120 is stored.
Here, the information on the noise signal includes information that can be acquired from the
noise signal, such as the noise signal, the direction of the noise signal, a signal obtained by
analyzing the noise signal by orthogonal transformation, the power of the noise signal, and the
signal related to the cepstrum of the noise signal. For example, in the case of a vehicle, it
also includes information obtained from means capable of indirectly obtaining information on
noise, such as the operation level of the air conditioner, how far the windows are open, the
engine speed, the vehicle speed, the presence or absence of passenger speech or movement, the
presence or absence of blinker operation, the wiper operation level, the operating condition of
the audio system, driving information from the car navigation system, information on noise
obtained from sensors installed in the vehicle compartment, and information from cameras
installed outside the vehicle compartment. If information on another noise component is already
stored on the storage device 250, it may be replaced, or the information on the acquired noise
component may be added to the information on the existing noise component 1.
[0042]
Steps S110, S120, S131, and S140 are performed only once after factory shipment. From the
second time onward, it is natural to proceed directly from step S100 to step S150.
[0043]
In step S150, the presence or absence of sound input is determined. In the sound input step,
there are cases where the user intentionally inputs a target signal and cases where the system is
always in the input state. In the former, it is determined that a sound has been input when the
user has made an input, and in the latter, it is determined that an input has always been made. In
any case, if a sound input is detected, the process proceeds to step S160. If no sound input is
detected, step S150 is repeated.
[0044]
In step S160, the information on the noise component 2 contained in the input sound signal is
compared with the stored information on the noise component 1, and if a "difference" exceeding
the set threshold is detected, the process proceeds to step S170. If no "difference" is
detected, the process proceeds to step S190. For example, when the information regarding noise
component 1 is “air conditioner switch is OFF” and the information regarding noise component 2
is “air conditioner switch is ON”, a “difference” is detected in step S160. As an example using
sound signal information, the spectrum of the noise component 2 contained immediately before
the input sound signal is analyzed and compared with the stored spectrum envelope of the noise
component 1, and a "difference" is detected when the spectral distortion exceeds a certain
threshold.
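The spectral comparison in step S160 might be sketched as follows. This is a minimal
illustration, not the procedure of the invention: it compares raw magnitude spectra rather than
spectrum envelopes, uses the log-spectral distance as the distortion measure, and the function
names, the 6 dB threshold, and the magnitude floor are all assumptions.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the non-negative frequency bins of one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def log_spectral_distance_db(spec_a, spec_b, floor=1e-6):
    """RMS difference between two log-magnitude spectra, in dB."""
    diffs = [
        20.0 * math.log10(max(a, floor)) - 20.0 * math.log10(max(b, floor))
        for a, b in zip(spec_a, spec_b)
    ]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def noise_has_changed(stored_spec, current_spec, threshold_db=6.0):
    """Detect a "difference" when the spectral distortion exceeds the threshold."""
    return log_spectral_distance_db(stored_spec, current_spec) > threshold_db
```

A practical system would replace the naive DFT with an FFT and smooth the spectra before
comparing them; the structure of the threshold test stays the same.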
[0045]
In step S170, a filter adapted to the current noise component 2 is calculated and the Nth
filter is updated with it. The input used for the filter calculation may be either the sound
signal input in step S150 or the information on noise component 1 stored in advance.
[0046]
In step S180, information on noise component 2 is stored as information on noise component 1.
At this time, the old noise component 1 may be deleted, or the noise component 2 may be stored
as additional information of the noise component 1.
[0047]
In step S190, it is checked whether the target sound signal is included in the input sound signal.
If the target signal is included, the process proceeds to step S132. When there is no target signal
and there is a signal section in which only a noise component is present, the process returns to
step S150.
[0048]
In step S132, the input sound signal is filtered. The content of the filter at this time is the Nth
updated filter. The processed sound signal is sent to another system, and the processing system
returns to step S150.
[0049]
The process from step S150 to step S180 can be performed even when the user does not intend to
input the target signal. That is, by running the process from step S150 to step S180 while no
voice input operation is taking place, the processing delay that this process would otherwise
cause when the user inputs the target signal can be eliminated.
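The loop of steps S150 to S190 and S132 can be sketched as below. This is a hedged outline
only: the class name `GatedNoiseFilter` and its injected callables (`extract_noise_info`,
`differs`, `compute_filter`) are hypothetical placeholders for the analysis, comparison, and
filter calculation, which the description leaves open. The sketch also assumes that the caller
already knows whether a frame contains the target signal.

```python
class GatedNoiseFilter:
    """Update the filter only when the stored noise information is out of date."""

    def __init__(self, extract_noise_info, differs, compute_filter,
                 initial_filter, initial_info):
        self.extract_noise_info = extract_noise_info
        self.differs = differs
        self.compute_filter = compute_filter
        self.filter = initial_filter      # the Nth filter
        self.noise_info = initial_info    # stored information on noise component 1
        self.updates = 0                  # number of filter calculations performed

    def on_frame(self, frame, has_target):
        info = self.extract_noise_info(frame)         # information on noise component 2
        if self.differs(self.noise_info, info):       # S160: "difference" detected?
            self.filter = self.compute_filter(frame)  # S170: calculate and update filter
            self.noise_info = info                    # S180: store as noise component 1
            self.updates += 1
        if has_target:                                # S190: target signal present?
            return self.filter(frame)                 # S132: filter and send out
        return None                                   # back to S150
```

Because the costly `compute_filter` call sits behind the `differs` test, frames whose noise
matches the stored information pass straight to filtering.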
[0050]
FIG. 8 is a flow diagram illustrating a processing system that adapts when the characteristics of
the target signal change.
[0051]
When the system starts operation, first, in step S200, initialization processing is performed.
In this initial state, the first filter (N = 1) is read and loaded into memory.
[0052]
In step S210, it is determined whether there is a sound input. In the sound input step, there are
cases where the user intentionally inputs a target signal and cases where the system is always in
the input state. In the former, it is determined that a sound has been input when the user has
made an input, and in the latter, it is determined that an input has always been made. In any case,
if a sound input is detected, the process proceeds to step S220. If no sound input is detected, step
S210 is repeated.
[0053]
In step S220, if the level of the noise component is small and the target signal is included in the
input sound signal, the process proceeds to step S231. If there is no target signal and there is a
signal section in which only a noise component is present, the process proceeds to step S240.
[0054]
In step S231, the input sound signal is filtered, and the processed sound signal is sent to another
system.
[0055]
In step S240, information on the target signal 1 input in step S220 is stored.
Here, the target signal refers to a signal that the user desires to input to another system,
for example an audio signal. Information on the target signal includes information that can be
acquired from the target signal, such as the target signal itself, the direction of the target
signal, a signal obtained by analyzing the target signal by orthogonal transformation, the
power of the target signal, and a signal related to the cepstrum of the target signal. Further,
for example, in the case of a vehicle, it includes information obtained from means capable of
indirectly acquiring information on the target signal, such as information on the seat position
of the user, information on the target signal obtained from sensors installed in the vehicle
compartment, and information on the user's utterance position obtained from a camera installed
in the vehicle cabin. When information on another target signal is already stored on the
storage device 250, it may be replaced, or the information on the acquired target signal may be
added to the information on the existing target signal 1.
[0056]
Steps S210, S220, S231, and S240 are performed only once after shipping from the factory.
From the second time onward, it is natural to proceed directly from step S200 to step S250.
[0057]
In step S250, it is determined whether there is a sound input. In the sound input step, there are
cases where the user intentionally inputs a target signal and cases where the system is always in
the input state. In the former, it is determined that a sound has been input when the user has
made an input, and in the latter, it is determined that an input has always been made. In any case,
if a sound input is detected, the process proceeds to step S260. If no sound input is
detected, step S250 is repeated.
[0058]
In step S260, if the level of the noise component is small in the input sound signal and the target
signal is included, the process proceeds to step S270. If the target signal does not exist or the
target signal exists but the level of the noise component is large, the process returns to step
S250.
[0059]
In step S270, the information on the target signal 2 contained in the input sound signal is
compared with the stored information on the target signal 1, and if a "difference" exceeding
the set threshold is detected, the process proceeds to step S280. If no "difference" is
detected, the process proceeds to step S232. One example procedure compares the direction of
arrival of the target signal 2 contained immediately before the input sound signal with the
stored direction of arrival of the target signal 1, and detects a “difference” if the azimuth
difference exceeds a certain threshold.
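The azimuth comparison can be illustrated with a short sketch. The function names and the
20 degree threshold are assumptions chosen for the example; the only point it makes is that the
difference must be taken on the circle so that bearings such as 350 degrees and 10 degrees
count as 20 degrees apart.

```python
def azimuth_difference_deg(a_deg, b_deg):
    """Smallest absolute angle between two bearings, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff

def target_direction_changed(stored_deg, current_deg, threshold_deg=20.0):
    """Detect a "difference" when the azimuth difference exceeds the threshold."""
    return azimuth_difference_deg(stored_deg, current_deg) > threshold_deg
```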
[0060]
In step S280, a filter adapted to the current target signal 2 is calculated and the Nth filter
is updated with it. The input used for the filter calculation may be either the sound signal
input in step S250 or the information on the target signal 1 stored in advance.
[0061]
In step S290, information on the target signal 2 is stored as information on the target signal 1. At
this time, the old target signal 1 may be deleted, or the target signal 2 may be stored as
additional information of the target signal 1.
[0062]
In step S232, the input sound signal is filtered. The content of the filter at this time is the Nth
updated filter. The processed sound signal is sent to another system, and the processing system
returns to step S250.
[0063]
The following describes an example in which an adaptive algorithm for removing noise that is
nearly stationary over a short observation time is applied to the filter means and the filter
calculation means in the sound input device according to the present invention. For example,
an algorithm such as LMS can be used as the adaptive algorithm.
[0064]
The flow of processing in an embodiment of the present invention will be described below with
reference to FIG.
[0065]
In the sound input device according to the present invention, the operation related to filter
processing and the operation related to filter learning are independent.
The storage device 250 (corresponding to the third storage means or the combination of the
first storage means and the second storage means) stores a target signal S1 input at a time
when the noise component level is small (time t0) and information I1 related to the noise
component input at time t1. S1 is a target signal extracted from the input sound signal. I1,
and I2 described below, consist of either or both of information on noise components extracted
from the input sound signal and information on the operating state of the device extracted
from the input environment information.
[0066]
When the sound signal S2 is input at time t2, judgment under each condition is made as follows.
[0067]
Condition 1. When the sound signal is a noise component and there is no "difference" between I1
and the information I2 related to the input noise component, the process returns to the sound
signal input waiting state.
Condition 2. If the sound signal is a noise component and there is a "difference" between I1
and the information I2 related to the input noise component, "filter calculation" processing is
performed to update the content of the "filter".
Condition 3. When the sound signal is a composite signal of the target signal and the noise
component, S2 is subjected to "filter" processing and output as an output signal S3.
[0068]
An example using the LMS algorithm for the processing procedure of “filter calculation” in FIG.
9 is shown in FIG.
[0069]
In FIG. 10, the target signal S1 input at time t0 is used as the input of port 1, and the noise
component N1 input at time t2 is used as the input of port 2.
S1 and N1 are added to create a pseudo observation signal SN1, which simulates the target
signal and the noise component being input simultaneously at time t2. The filter H100 is
applied to SN1 to obtain Sn1. Furthermore, the error signal E1 between Sn1 and S1 is
calculated, and based on E1, the filter H100 is updated so that the noise removal rate becomes
high, that is, so that the error signal E1 becomes small (see, for example, Non-Patent
Document 1 above).
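The update just described can be sketched with a plain LMS loop. This is a minimal
illustration under assumptions, not the implementation of FIG. 10: the FIR structure, the tap
count, the step size, the number of passes, and the function names are all chosen for the
example, and a real system would tune them.

```python
def lms_train(target, noise, num_taps=8, step=0.01, passes=20):
    """Adapt FIR weights so that filtering (target + noise) approximates target."""
    observed = [s + n for s, n in zip(target, noise)]      # pseudo observation SN1
    weights = [0.0] * num_taps                             # filter H100
    for _ in range(passes):
        for t in range(len(observed)):
            # Most recent num_taps samples, zero-padded before the start.
            window = [observed[t - k] if t - k >= 0 else 0.0 for k in range(num_taps)]
            estimate = sum(w * x for w, x in zip(weights, window))   # Sn1
            error = target[t] - estimate                             # E1
            # LMS step: move the weights to reduce the squared error.
            weights = [w + step * error * x for w, x in zip(weights, window)]
    return weights

def apply_fir(weights, signal):
    """Apply the trained FIR filter to a signal (zero-padded before the start)."""
    return [
        sum(w * signal[t - k] for k, w in enumerate(weights) if t - k >= 0)
        for t in range(len(signal))
    ]
```

Training uses only the stored S1 and N1, so it can run independently of the filtering of live
input, in line with the separation of filter processing and filter learning described above.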
[0070]
When all S1 and N1 are input, the filter H100 is replaced with the "filter" of FIG.
[0071]
In the storage device 250 of FIG. 9, the target signal (for example, an utterance during idling) S1
stored at time t0 is used when creating a pseudo observation signal as described above.
At this time, the following case can be considered for the target signal S1 used for the Nth filter
calculation process.
[0072]
Case 1. The user's utterance stored at the (N−1)th update process (learning always uses the
previous utterance).
Case 2. A signal obtained by adding the user's utterances input from the 1st to the (N−1)th
times (all input utterances are summed so that the filter adapts to all potential users; for
family use).
Case 3. A signal obtained by adding the utterances of the user A input from the (N−x)th to the
(N−1)th times (adapted to a specific speaker; for individual use).
Case 4. The utterance of the user A input from the (N−x)th to the (N−1)th times (adapted to a
specific speaker; for individual use).
[0073]
Here, an example in which an adaptive algorithm for removing diffusive and directional noise is
applied to the filter means of the present invention and the filter calculation means will be
described. For example, the method disclosed in Patent Document 2 can be used as the adaptive
algorithm.
[0074]
Hereinafter, an outline of the operation according to an embodiment will be described with
reference to FIG.
[0075]
The storage device 250 of FIG. 11 (corresponding to the third storage means, or to the
combination of the first storage means and the second storage means) stores information D0
related to the target signal input at time t0 and information I1 related to the noise component
input at time t1.
D0 contains one or both of the following: the target signal extracted from the input sound
signal, and information on the sound quality and arrival direction of the target signal estimated
from the input sound signal. I1, and I2 described below, contain one or both of the following:
information on the sound quality and arrival direction of the noise component estimated from the
input sound signal, and information on the operating state of the device extracted from the
input environment information.
[0076]
When the sound signal S2 is input at time t2, the determination under each condition is made
and processing is performed as follows.
[0077]
Condition 4. When the sound signal is a noise component and there is no "difference" between I1
and the information I2 related to the input noise component, the process returns to the sound
signal input waiting state.
Condition 5. When the sound signal is a noise component and there is a "difference" between I1
and I2 regarding the arrival direction of the noise component, "direction estimation" processing
is performed on the noise component, "filter calculation" processing is performed on the target
signal and the noise component, and the contents of the "separation filter" are updated (update
only).
Condition 6. When the sound signal is a noise component and the only "difference" between I1 and
I2 concerns information other than the arrival direction of the noise component, "direction
estimation" processing on the noise component is skipped, "filter calculation" processing is
performed on the target signal and the noise component, and the contents of the "separation
filter" are updated (update only).
Condition 7. When the sound signal is a composite signal of the target signal and the noise
component, there is no "difference" between I1 and I2, and there is no "difference" between D0
and the information D2 related to the input target signal, S2 is processed by the "separation
filter" and output (path 2).
Condition 8. When the sound signal is a composite signal of the target signal and the noise
component and there is a "difference" between I1 and I2 regarding the arrival direction of the
noise component, "direction estimation" processing and "filter calculation" processing on the
target signal and the noise component are performed to update the contents of the "separation
filter". After the "separation filter" is updated, S2 is processed by the "separation filter"
and output (path 4).
Condition 9. When the sound signal is a composite signal of the target signal and the noise
component and the only "difference" between I1 and I2 concerns information other than the
arrival direction of the noise component, "direction estimation" processing on the noise
component is skipped, "filter calculation" processing is performed on the target signal and the
noise component, and the contents of the "separation filter" are updated. After the "separation
filter" is updated, S2 is processed by the "separation filter" and output (path 3 or path 4).
Condition 10. When the sound signal is a composite signal of the target signal and the noise
component and there is a "difference" between D0 and D2 regarding the arrival direction of the
target signal, "direction estimation" processing on the target signal and "filter calculation"
processing on the target signal and the noise component are performed to update the contents of
the "separation filter". After the "separation filter" is updated, S2 is processed by the
"separation filter" and output (path 4).
Condition 11. When the sound signal is a composite signal of the target signal and the noise
component and the only "difference" between D0 and D2 concerns information other than the
arrival direction of the target signal, "direction estimation" processing on the target signal
and "filter calculation" processing on the target signal and the noise component are performed
to update the contents of the "separation filter". After the "separation filter" is updated, S2
is processed by the "separation filter" and output (path 3 or path 4).
Condition 12. When the sound signal is the target signal, S2 is output without passing through
the "separation filter" (path 1).
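The branching in Conditions 4 to 12 can be summarized as a small decision function. The sketch below is illustrative only: the names Action and decide and the boolean flags are hypothetical, and the order in which differences are checked (noise before target, direction before other information) is an assumption, since the text does not say what happens when several differences occur at once. Condition 11 follows the text as written and performs direction estimation on the target signal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    condition: int                      # condition number from the text
    estimate_direction: Optional[str]   # "noise", "target", or None (skipped)
    update_filter: bool                 # run "filter calculation" and update
    apply_filter: bool                  # pass S2 through the "separation filter"
    path: Optional[str]                 # output path label, None if nothing is output


def decide(has_target, has_noise,
           noise_dir_diff=False, noise_other_diff=False,
           target_dir_diff=False, target_other_diff=False):
    """Map the classification of S2 and the I1/I2, D0/D2 differences to an action."""
    if has_noise and not has_target:
        if noise_dir_diff:
            return Action(5, "noise", True, False, None)
        if noise_other_diff:
            return Action(6, None, True, False, None)
        return Action(4, None, False, False, None)   # back to input waiting
    if has_target and has_noise:
        if noise_dir_diff:
            return Action(8, "noise", True, True, "4")
        if noise_other_diff:
            return Action(9, None, True, True, "3 or 4")
        if target_dir_diff:
            return Action(10, "target", True, True, "4")
        if target_other_diff:
            # As written in the text, direction estimation is performed here.
            return Action(11, "target", True, True, "3 or 4")
        return Action(7, None, False, True, "2")
    if has_target:
        return Action(12, None, False, False, "1")   # output without filtering
    raise ValueError("no sound signal present")
```

For example, decide(True, True, noise_other_diff=True) yields the Condition 9 action: direction estimation skipped, filter updated, then S2 filtered and output.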
[0078]
In the method disclosed in Patent Document 2 described above, two (or more) microphone means are
used as the sound input means, and the 2 ch (ch represents a channel) input signals are
separated to transmit the target signal. When the method disclosed in Patent Document 2 is
applied to the present invention, there are two methods for providing an input signal.
[0079]
A method of providing an input signal when the method disclosed in Patent Document 2 is
applied to the present invention will be described with reference to FIGS. 12 and 13.
[0080]
FIG. 12 shows how to provide an input signal when any one of the above conditions 7 to 11 is
satisfied.
[0081]
The target signal D1 (t1) input from the user at time t1 and the noise component N1 (t1) arriving
from the noise source at time t1 are input to the microphone means M1.
The target signal D2 (t1) input from the user at time t1 and the noise component N2 (t1) arriving
from the noise source at time t1 are input to the microphone means M2.
At this time, the signal S1 (t1) input from the microphone means M1 to the "filter calculation"
processing unit at time t1 can be expressed as S1 (t1) = D1 (t1) + N1 (t1).
[0082]
Similarly, the signal S2 (t1) input from the microphone means M2 to the "filter calculation"
processing unit at time t1 can be represented by S2 (t1) = D2 (t1) + N2 (t1). Thus, it can be seen
that S1 (t1) and S2 (t1) are observation signals obtained by respective microphone means for
sound signals that can be observed in real time.
[0083]
Using the input observation signal, the “filter calculation” processing unit calculates a
“separation filter” for separating the observation signal into the target signal and the noise
component.
[0084]
FIG. 13 shows how to give an input signal when either of the above conditions 5 and 6 is
satisfied.
[0085]
The noise component N1 (t2) arriving from the noise source at time t2 is input to the
microphone means M1, and the noise component N2 (t2) arriving from the noise source at time t2
is input to the microphone means M2.
The target signal D1 (t0), stored in the storage device 250 from the microphone means M1 at time
t0 before time t2, is added to the noise component N1 (t2). Likewise, the target signal D2 (t0),
stored in the storage device 250 from the microphone means M2 at time t0 before time t2, is
added to the noise component N2 (t2).
[0086]
The signal input to the "filter calculation" processing unit would originally be S1 (t2) = D1
(t2) + N1 (t2), but in the configuration of FIG. 13, D1 (t0) is used as a substitute for the
target signal D1 (t2), so that Sp1 (t2) = D1 (t0) + N1 (t2).
Similarly, Sp2 (t2) = D2 (t0) + N2 (t2). Using the pseudo observation signals Sp1 (t2) and Sp2
(t2), the "filter calculation" processing unit calculates a "separation filter" that separates
the pseudo observation signal into the target signal and the noise component. At this time, the
target signal D1 (t0) and the target signal D2 (t0) are treated as target signals input from a
virtual user at time t2.
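The construction of the pseudo observation signals, Sp1 (t2) = D1 (t0) + N1 (t2) and Sp2 (t2) = D2 (t0) + N2 (t2), can be sketched per channel as below. This is a rough illustration: the function names are hypothetical, and repeating or truncating the stored target signal to match the length of the noise recording is an assumption, since the text does not say how the two lengths are aligned.

```python
def pseudo_observation(stored_target, live_noise):
    """Return Sp(t2) = D(t0) + N(t2) for one microphone channel.

    stored_target: target signal D(t0) read back from storage
    live_noise:    noise component N(t2) currently arriving at the microphone
    The stored target is repeated or truncated to the noise length (assumption).
    """
    if not stored_target:
        raise ValueError("no stored target signal available")
    n = len(live_noise)
    reps = -(-n // len(stored_target))          # ceiling division
    target = (list(stored_target) * reps)[:n]   # align lengths
    return [d + v for d, v in zip(target, live_noise)]


def pseudo_observations(stored_targets, live_noises):
    """Build Sp1(t2), Sp2(t2), ... channel by channel (M1, M2, ...)."""
    if len(stored_targets) != len(live_noises):
        raise ValueError("need one stored target per microphone channel")
    return [pseudo_observation(d, v)
            for d, v in zip(stored_targets, live_noises)]
```

The returned list holds one pseudo observation signal per microphone means and would take the place of the real observation signals as input to the "filter calculation" processing unit.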
[0087]
According to the above configuration, a pseudo observation signal is created using a target
signal stored in the past, and by using it the contents of the filter means 130 can be updated
so as to improve the noise removal rate even when no target signal is being input. This is a
major feature of the present invention.
[0088]
The determination means 140 analyzes the sound signal acquired from the sound input means 120.
When it determines that the noise component is negligible relative to the target signal, the
sound signal is stored in the storage means 150, 151 as the target signal. When it determines
that the sound signal does not contain the target signal, the sound signal is stored in the
storage means 150, 151 or 152 as the noise component. From the target signal stored at time t0
and the noise component stored at time t1, an observation signal containing both the target
signal and the noise component at time t1 can then be created in a pseudo manner, and the filter
can be updated using an adaptive algorithm such as LMS. Even if there are, the present
invention can be practiced.
[0089]
Also, if the filter calculation means 160 performs the filter update while the user is not
inputting the target signal, the noise component can be obtained accurately, and as a result a
highly adaptive filter update becomes possible.
[0090]
The above-described embodiment is merely an example of the embodiment of the present
invention, and does not limit the scope of the present invention.
[0091]
It is a block diagram showing the configuration of a sound input device according to the present
invention.
It is a block diagram showing the configuration of a sound input device according to the present
invention.
It is a figure showing the configuration of a sound input device according to the present
invention.
It is a block diagram showing the configuration of a sound input device according to the present
invention.
It is a block diagram showing the configuration of a sound input device according to the present
invention.
It is a figure showing the configuration of a sound input device according to the present
invention.
It is a flowchart of signal processing in the sound input device according to the present
invention.
It is a flowchart of signal processing in the sound input device according to the present
invention.
It is a figure showing the signal processing steps in the sound input device according to the
present invention.
It is a figure showing the filter calculation process in the sound input device according to the
present invention.
It is a figure showing the signal processing steps in the sound input device according to the
present invention.
It is a figure showing filter calculation in the sound input device according to the present
invention.
It is a figure showing filter calculation in the sound input device according to the present
invention.
Explanation of reference signs
[0092]
110-1, 110-n: microphone means, 120: sound input means, 130: filter means, 140: judgment means,
150, 151, 152: storage means, 160: filter calculation means, 170: switch means, 180: information
means, 210-1, 210-n: microphone, 220: filter, 230: AD converter, 240: arithmetic unit, 250:
storage unit, 260: information collecting unit.