Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JPWO2017086030
Abstract: To achieve both a reduction in the number of sound collection units and an improvement
in resolution in estimating the direction of a sound source. An information processing apparatus
includes: an acquisition unit that acquires sound collection results of sound from each of one or
more sound sources by a sound collection unit whose position information, indicating at least one
of position and orientation, changes; and an estimation unit that estimates the direction of each
of the one or more sound sources based on a change in frequency of the sound collected by the
sound collection unit in accordance with a change in the position information of the sound
collection unit. [Selected figure] Figure 1
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND
PROGRAM
[0001]
The present disclosure relates to an information processing device, an information processing
method, and a program.
[0002]
In recent years, with the development of so-called speech recognition technology and acoustic
analysis technology, various information processing apparatuses have been proposed that
recognize a state or situation by using, as input information, speech uttered by a user or sound
from sound sources present in the surroundings, and that execute processing according to the
recognition result.
03-05-2019
1
Among such information processing apparatuses, some have been proposed that estimate the
arrival direction of sound (that is, the direction of the sound source) based on the sound
collection result of voice and sound, and feed the estimation result back to various processes,
such as suppressing noise or improving recognition of a target sound. For example, Patent
Document 1 discloses an example of a technique for estimating the direction of arrival of sound
based on the sound collection result of each of a plurality of microphones (hereinafter also
referred to as “sound collection units”).
[0003]
Patent Document 1: JP 2011-61422 A
[0004]
As an example of a mechanism for estimating the direction of arrival of sound, there is a
technology using sound collection results by each of a plurality of sound collection units as in the
technology disclosed in Patent Document 1.
In such techniques, the resolution for estimating the direction of arrival of sound and the width
of the main lobe of beamforming depend on the spacing and number of sound collection units. To
obtain higher resolution over a wider frequency band, a large number of sound collection units
may need to be installed at high density.
[0005]
On the other hand, an increase in the number of sound collecting units may increase various
costs such as the cost of the sound collecting unit itself, the wiring cost, the maintenance cost,
and the countermeasure against the characteristic variation among the sound collecting units. In
addition, the weight of the device itself may increase as the number of sound collection units
increases.
[0006]
Thus, the present disclosure proposes an information processing device, an information
processing method, and a program that can achieve both the reduction in the number of sound
collection units and the improvement in resolution in estimating the direction of a sound source.
[0007]
According to the present disclosure, there is provided an information processing apparatus
including: an acquisition unit that acquires a sound collection result of sound from each of one
or more sound sources by a sound collection unit whose position information, indicating at least
one of position and orientation, changes; and an estimation unit that estimates the direction of
each of the one or more sound sources based on a change in frequency of the sound collected by
the sound collection unit in accordance with a change in the position information.
[0008]
Further, according to the present disclosure, there is provided an information processing method
including: acquiring a sound collection result of sound from each of one or more sound sources by
a sound collection unit whose position information, indicating at least one of position and
orientation, changes; and estimating, by a processor, the direction of each of the one or more
sound sources based on a change in frequency of the sound collected by the sound collection unit
in accordance with a change in the position information of the sound collection unit.
[0009]
Further, according to the present disclosure, there is provided a program that causes a computer
to execute: acquiring a sound collection result of sound from each of one or more sound sources
by a sound collection unit whose position information, indicating at least one of position and
orientation, changes; and estimating the direction of each of the one or more sound sources based
on a change in frequency of the sound collected by the sound collection unit in accordance with a
change in the position information of the sound collection unit.
[0010]
As described above, the present disclosure provides an information processing apparatus, an
information processing method, and a program capable of achieving both a reduction in the
number of sound collection units and an improvement in resolution in estimating the direction of
a sound source.
[0011]
Note that the above effects are not necessarily limiting. Together with or in place of the above
effects, any of the effects described in the present specification, or other effects that can be
understood from the present specification, may be achieved.
[0012]
The drawings are described below in order.
A diagram showing an example of a schematic system configuration of an information processing
system according to an embodiment of the present disclosure.
A block diagram showing an example of a functional configuration of the information processing
system according to the embodiment.
A diagram schematically showing an example of the spatial positional relationship between the
sound collection unit and sound when the sound collection unit moves in a circle.
A diagram showing an example of observation results of sound arriving from each of a plurality
of sound sources located at mutually different positions.
A diagram showing an example of the spectrum of sound arriving from each sound source when
two sound sources are located in mutually different directions.
An example of a graph representing, as a histogram, the estimation result of the arrival direction
of sound based on the spectrum shown in FIG. 5.
A diagram schematically showing an example of the spatial positional relationship between the
sound collection unit and a sound source when the sound source is located near the sound
collection unit.
A diagram showing an example of the observation result of sound arriving from a nearby sound
source.
An explanatory diagram for describing an example of a method of calculating the phase difference
during modulation by the Doppler effect.
An explanatory diagram for describing an example of a method of calculating the phase difference
during modulation by the Doppler effect.
An explanatory diagram for describing an overview of an information processing system according
to Modification 1.
A diagram showing an example of the result of observation of sound by a plurality of sound
collection units.
A diagram showing an example of an amplitude spectrum calculated based on the sound collection
result of each of a plurality of sound collection units.
A diagram showing another example of an amplitude spectrum calculated based on the sound
collection result of each of a plurality of sound collection units.
An explanatory diagram for describing an overview of an information processing system according
to Modification 3.
A diagram showing an example of the detection results of the speed and acceleration of a moving
body on which the sound collection unit is installed.
An explanatory diagram for describing an overview of an information processing system according
to Modification 4.
A diagram showing an example of the hardware configuration of the information processing
apparatus according to the embodiment.
[0013]
Hereinafter, preferred embodiments of the present disclosure will be described in detail with
reference to the accompanying drawings. In the present specification and the drawings,
components having substantially the same functional configuration will be assigned the same
reference numerals and redundant description will be omitted.
[0014]
The description will be made in the following order.
1. Configuration
1.1. System configuration
1.2. Functional configuration
2. Technical features
2.1. Basic principle
2.2. When the sound collection unit performs circular motion and the sound arriving from the
sound source can be regarded as a plane wave
2.3. Generalizing the sound from the sound source and the trajectory of the sound collection unit
2.4. When the sound source is close to the observation point
2.5. Sound source separation and application to beamforming
3. Modifications
3.1. Modification 1: Example of using multiple sound collection units
3.2. Modification 2: Combination with other direction estimation techniques
3.3. Modification 3: Example of moving the observation point
3.4. Modification 4: Example of indoor application
4. Hardware configuration
5. Conclusion
[0015]
<<1. Configuration >> <1.1. System Configuration> First, an overview of an information
processing system according to an embodiment of the present disclosure will be described. For
example, FIG. 1 shows an example of a schematic system configuration of the information
processing system according to the present embodiment. In the example shown in FIG. 1, the
directions orthogonal to each other on the horizontal plane will be described as the x direction
and the y direction, and the vertical direction will be described as the z direction.
[0016]
As shown in FIG. 1, the information processing system 1 according to the present embodiment
includes an information processing device 10 and a sound collection unit 30. Further, the sound
collection unit 30 includes a sound collection unit 301, a support unit 303, and a drive unit 305.
Part of the support unit 303 is connected to the drive unit 305, and the support unit 303 rotates
along the trajectory L1 (a circular trajectory) when the drive unit 305 is driven. Further, the
sound collection unit 301 is supported by the support unit 303. With this configuration, when the
support unit 303 is rotated by the driving of the drive unit 305, the sound collection unit 301
moves along the trajectory L1 (that is, the position and orientation of the sound collection unit
301 change along the trajectory L1).
[0017]
The sound collection unit 301 is configured by a sound collection device such as a so-called
microphone. Also, the sound collection unit 301 may include a plurality of sound collection
devices, for example, a microphone array. The sound collection unit 301 collects sound coming
from the surroundings, and outputs an acoustic signal based on the sound collection result to the
information processing apparatus 10. For example, in the case of the example illustrated in FIG.
1, the voices uttered by the users U11 and U12 are collected by the sound collection unit 301,
and an acoustic signal based on the collection result of the voices is output to the information
processing apparatus 10.
[0018]
The information processing apparatus 10 acquires, from the sound collection unit 301, an acoustic
signal based on the sound collection result of voice and sound (hereinafter sometimes collectively
referred to as “sound”), and estimates the direction of the sound source relative to the sound
collection unit 30 (that is, the direction of arrival of the sound) based on a change in the
acquired acoustic signal. More specifically, the information processing apparatus 10 estimates the
direction of the sound source relative to the sound collection unit 30 by exploiting the property
that, when the sound collection unit 301 is moved along a predetermined trajectory (for example,
a two-dimensional or three-dimensional trajectory), the frequency of the acoustic signal based on
the sound collection result changes due to the Doppler effect.
[0019]
As a specific example, in the example illustrated in FIG. 1, the sound collection unit 301 moves
along a two-dimensional trajectory L1 (that is, a circular trajectory) on a horizontal plane (that
is, the xy plane). Focusing on the positional relationship between the sound collection unit 301
and the user U11, as the sound collection unit 301 moves along the trajectory L1, the relative
positional relationship between the two changes, and so does the distance between them. As a
result, for example, the frequency of the acoustic signal based on the sound collection result of
the voice uttered by the user U11 changes due to the Doppler effect. At this time, the information
processing apparatus 10 recognizes, directly or indirectly, the change in the position of the
sound collection unit 301, and estimates the direction of the sound source (that is, the user U11)
with respect to the sound collection unit 30 based on that change in position and on the change in
the acoustic signal based on the sound collection result by the sound collection unit 301. The
same applies to the user U12.
[0020]
Also, the information processing apparatus 10 may control the operation of the sound collection
unit 30. Specifically, the information processing apparatus 10 may move the sound collection
unit 301 at a desired speed along a predetermined trajectory (for example, the trajectory L1) by
controlling the operation of the drive unit 305. As a result, the information processing apparatus
10 can recognize changes in the position and orientation of the sound collection unit 301 as the
drive unit 305 is driven.
[0021]
The entity that controls the sound collection unit 30 does not necessarily have to be the
information processing apparatus 10. In that case, the information processing apparatus 10 may
recognize changes in the position and orientation of the sound collection unit 301 accompanying
the driving of the drive unit 305 by, for example, acquiring information indicating the driving
state of the drive unit 305 from the sound collection unit 30. In the following description, it is
assumed that the information processing apparatus 10 controls the operation of the sound
collection unit 30 (in particular, the drive unit 305).
[0022]
The outline of the information processing system according to the present embodiment has been
described above with reference to FIG. 1.
[0023]
<1.2. Functional Configuration> Next, an example of a functional configuration of the information
processing system 1 according to the present embodiment will be described with reference to
FIG. 2, with particular focus on the functional configuration of the information processing
apparatus 10. FIG. 2 is a block diagram showing an example of a functional configuration of
the information processing system 1 according to the present embodiment.
[0024]
As shown in FIG. 2, the information processing apparatus 10 includes an analysis unit 101, a
drive control unit 103, a process execution unit 105, and a storage unit 107.
[0025]
The drive control unit 103 controls the operation of the drive unit 305.
The drive control unit 103 also outputs information indicating the control result of the drive unit
305 to the analysis unit 101 described later. This allows the analysis unit 101 to recognize the
control result of the drive unit 305 and, in turn, the movement of the sound collection unit 301
(that is, the change in its position and orientation) accompanying the driving of the drive unit
305.
[0026]
The analysis unit 101 acquires, from the sound collection unit 301, an acoustic signal based on
the sound collection result of the sound. Further, the analysis unit 101 acquires, from the drive
control unit 103, information indicating the control result of the drive unit 305. Based on the
information indicating the control result of the drive unit 305, the analysis unit 101 recognizes
the movement of the sound collection unit 301 (that is, the change in its position and
orientation), such as its movement direction, change in orientation, and movement speed. Then,
based on the recognized movement of the sound collection unit 301 and the change in the acoustic
signal acquired from the sound collection unit 301 (that is, the change in frequency due to the
Doppler effect), the analysis unit 101 estimates the direction of the sound source (that is, the
direction of arrival of the sound). The analysis unit 101 then outputs information indicating the
estimation result of the direction of the sound source to the process execution unit 105 described
later. The details of the processing by which the analysis unit 101 estimates the direction of the
sound source will be described separately later in “2. Technical features”. The analysis unit
101 corresponds to an example of the “estimation unit”.
[0027]
The storage unit 107 is a storage area for temporarily or permanently storing various data that
the information processing apparatus 10 uses to execute various functions. For example, the
storage unit 107 may store data (for example, a library) for executing various functions (for
example, applications) described later, and control information (for example, setting
information) for executing those functions.
[0028]
The process execution unit 105 is configured to execute various functions (for example,
applications) provided by the information processing apparatus 10. The process execution
unit 105 may acquire information indicating the estimation result of the direction of the sound
source from the analysis unit 101, and may execute various functions based on the estimation
result of the direction of the sound source.
[0029]
As a specific example, the process execution unit 105 may emphasize sound arriving from the
direction of the sound source (that is, sound from the sound source) based on the estimation
result of the direction of the sound source. As another example, the process execution
unit 105 may suppress sound (that is, noise) arriving from other directions based on the
estimation result of the direction of the sound source. In this case, for example, the process
execution unit 105 may enhance or suppress the sound arriving from a desired direction by
multiplying the acquired acoustic signal by a gain value that depends on the direction. The process
execution unit 105 corresponds to an example of the “sound control unit”.
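The direction-dependent gain described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the acquired signal has already been split into components that each carry an estimated arrival direction. The function names, the pass and reject gain values, and the angular window are hypothetical and do not come from the embodiment.

```python
import math


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def direction_gain(direction, target, width, pass_gain=1.0, reject_gain=0.1):
    """Gain for a component: pass_gain inside the window around target, else reject_gain."""
    if angular_distance(direction, target) <= width:
        return pass_gain
    return reject_gain


def apply_direction_gains(components, target, width):
    """Scale each (amplitude, estimated_direction) pair by its direction-dependent gain."""
    return [amp * direction_gain(d, target, width) for amp, d in components]


# Two components of equal amplitude: one near the target direction, one far from it.
components = [(1.0, 0.1), (1.0, 3.0)]
scaled = apply_direction_gains(components, target=0.0, width=0.5)
```

In this sketch the component near the target direction keeps its amplitude while the other is attenuated. Swapping the two gain values would instead suppress the sound from the target direction.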
[0030]
In addition, the process execution unit 105 may control the operation of another
component based on the estimation result of the direction of the sound source. As a specific
example, the process execution unit 105 may control the directivity of a device whose
directivity is controllable (a so-called directional device), such as a speaker or a microphone,
based on the estimation result of the direction of the sound source. As a more specific example,
when the user's voice is collected, the process execution unit 105 may control the directivity of
a directional speaker so that it points toward the user, based on the estimation result of the
direction of the sound source (that is, the user).
[0031]
The functional configuration of the information processing system 1 described with reference to
FIG. 2 is merely an example, and is not necessarily limited to the same configuration. As a
specific example, the sound collection unit 30 and the information processing apparatus 10 may
be integrally configured. Further, among the configurations of the information processing
apparatus 10, a part of the configuration may be provided in an external device (for example, a
server or the like) different from the information processing apparatus 10. In addition, other
configurations different from the various configurations illustrated in FIG. 2 may be provided
according to the function provided by the information processing apparatus 10. As a specific
example, a directional device (for example, a speaker, a microphone, or the like) to be controlled
by the process execution unit 105 may be separately provided.
[0032]
An example of the functional configuration of the information processing system 1 according to
the present embodiment has been described above with reference to FIG. 2, with particular focus
on the functional configuration of the information processing apparatus 10.
[0033]
<<2. Technical Features >> Next, as technical features of the present embodiment, the details of
the processing by which the information processing apparatus 10 (in particular, the analysis
unit 101) estimates the direction of the sound source will be described.
[0034]
<2.1. Basic Principle> As described above, the information processing system 1
according to the present embodiment estimates the direction of the sound source (the direction of
arrival of sound) by exploiting the property that, when the position and orientation of the sound
collection unit are changed along a predetermined trajectory (for example, a two-dimensional or
three-dimensional trajectory), the acoustic signal based on the sound collection result changes
due to the Doppler effect. In estimating the direction of the sound source, the information
processing system 1 according to the present embodiment assumes the following points.
(1) The moving speed of the sound collection unit is known or observable.
(2) The sound arriving from the sound source whose direction is to be estimated includes a section
in which stationarity and tonality can be assumed.
(3) The moving speed of the sound source as viewed from the sound collection unit is at least
sufficiently small compared to the speed at which the sound collection unit moves along the
predetermined trajectory.
(4) The change in speed of the sound source as viewed from the sound collection unit is at least
sufficiently gradual compared to the speed at which the sound collection unit moves along the
predetermined trajectory.
[0035]
Assumption (1) can be satisfied, for example, by having the information processing apparatus 10
control the sound collection unit 301 so that it moves along a predetermined trajectory. As
another example, it can also be satisfied by having the information processing apparatus 10
calculate the moving speed of the sound collection unit 301 based on the detection results of
various sensors and the like.
[0036]
Assumption (2) means that the target is sound whose properties do not change abruptly over time
(stationarity) and that has a so-called harmonic structure (tonality), at least within the
observation section of the spectrogram. Assumption (2) is widely applicable to sound with tonal
characteristics, such as voice, music, animal calls, and sirens.
[0037]
The degree to which assumptions (3) and (4) hold depends on the moving speed of the sound
collection unit 301, but they are applicable when the sound source is sufficiently far from the
sound collection unit 301 (in other words, when the sound arriving from the sound source can be
regarded as a plane wave). In addition, even when the sound source is close to the sound
collection unit 301, they are applicable when the moving speed of the sound source is
sufficiently slow relative to the moving speed of the sound collection unit 301.
[0038]
Further, as described above, the information processing apparatus 10 according to the present
embodiment uses the Doppler effect generated by the movement of the sound collection unit 301
to estimate the direction of the sound source. Specifically, when the sound collection unit 301
approaches the sound source, the sound from the sound source is observed at a higher pitch than
the original sound (that is, the wavelength becomes shorter). Conversely, when the sound
collection unit 301 moves away from the sound source, the sound from the sound source is
observed at a lower pitch than the original sound (that is, the wavelength becomes longer).
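As a small numerical illustration of this pitch change, the following sketch uses the textbook Doppler relation for a moving observer and a stationary source. The speed of sound, the function name, and the example values are assumptions made for illustration only.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def observed_frequency(f0, approach_speed, c=SPEED_OF_SOUND):
    """Frequency observed by a moving microphone for a stationary source.

    approach_speed > 0: the microphone moves toward the source (pitch rises).
    approach_speed < 0: the microphone moves away from the source (pitch falls).
    """
    return f0 * (1.0 + approach_speed / c)


# A 1000 Hz tone observed while moving at 5 m/s toward and away from the source.
f_toward = observed_frequency(1000.0, 5.0)
f_away = observed_frequency(1000.0, -5.0)
```

With these assumed values the shift is roughly 15 Hz in each direction, which shows how small the modulation is at modest microphone speeds.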
[0039]
According to assumption (2) described above, the sound arriving from the sound source has a
section that can be regarded as stationary, and under assumptions (3) and (4), the change in
pitch within that section depends on the change in the moving speed of the sound collection unit
301 and on the arrival direction of the sound. Since the change in the moving speed of the sound
collection unit 301 is known from assumption (1), the information processing apparatus 10 can
estimate the arrival direction of the sound (that is, the direction of the sound source) based on
the change in pitch of the acoustic signal based on the sound collection result. More detailed
explanations are given below with specific examples.
[0040]
<2.2. When the sound collection unit performs circular motion and the sound arriving
from the sound source can be regarded as a plane wave> First, with reference to FIGS. 3 and 4, an
example of a method of estimating the direction of the sound source will be described, focusing
on the case where the sound source is sufficiently far from the sound collection unit 301, the
sound arriving from the sound source can be regarded as a plane wave, and the sound collection
unit 301 moves at constant speed along a circular trajectory.
[0041]
For example, FIG. 3 is a diagram schematically showing an example of the spatial positional
relationship between the sound collection unit 301 and the sound when the sound collection unit
301 performs a circular motion. In the present description, as shown in FIG. 3, an example of a
method of estimating the direction of the sound source will be described for the case where the
sound collection unit 301 moves on a circle of radius r at a predetermined angular velocity φ.
The plane wave is a sine wave whose traveling direction is θ and whose frequency is f0.
Here, assuming that the velocity of the sound collection unit 301 at time t is v = (vx, vy), the
velocity v is expressed by the following equation (Equation 1).
[0042]
... (Equation 1)
[0043]
In the above (Equation 1), φ0 represents the angle at t = 0.
Here, assuming that the unit vector pointing in the traveling direction of the plane wave is
ek = (cos θ, sin θ), the frequency f of the signal that is modulated by the Doppler effect and
observed by the sound collection unit 301 (hereinafter also referred to as the “observation
signal”) is expressed by the following equation (Equation 2). Note that, as shown in
(Equation 2), v⊥ is given by the inner product of ek and v.
[0044]
... (Equation 2)
[0045]
In (Equation 2) above, the frequency f0 of the plane wave is unknown, but the other values are
known, so the direction of the sound source (that is, the direction of arrival of the sound) can
be derived from the modulation phase of the frequency f of the observation signal.
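The idea of reading the direction from the modulation phase can be sketched numerically. The bodies of (Equation 1) and (Equation 2) are not reproduced in this text, so the sketch below assumes the standard forms v = (−rφ sin(φt + φ0), rφ cos(φt + φ0)) and f = f0 (1 − v⊥ / c), where c is the speed of sound. All function names and parameter values are hypothetical. The estimator also assumes that the samples uniformly cover a whole number of revolutions, so that correlating with the sine and cosine of the microphone angle isolates the modulation term.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def mic_velocity(t, r, phi_rate, phi0):
    """Velocity (vx, vy) of a microphone moving on a circle of radius r."""
    a = phi_rate * t + phi0
    return (-r * phi_rate * math.sin(a), r * phi_rate * math.cos(a))


def observed_frequency(t, f0, theta, r, phi_rate, phi0, c=SPEED_OF_SOUND):
    """Assumed Doppler model: f = f0 * (1 - (e_k . v) / c)."""
    vx, vy = mic_velocity(t, r, phi_rate, phi0)
    v_along = vx * math.cos(theta) + vy * math.sin(theta)  # inner product of e_k and v
    return f0 * (1.0 - v_along / c)


def estimate_travel_direction(times, freqs, phi_rate, phi0):
    """Recover the travel direction theta from the phase of the frequency modulation.

    Under the assumed model, f - f0 = K * (sin(a) * cos(theta) - cos(a) * sin(theta))
    with K > 0 and a = phi_rate * t + phi0, so the two correlations below are
    proportional to cos(theta) and -sin(theta).
    """
    mean_f = sum(freqs) / len(freqs)  # equals f0 over whole revolutions
    sin_sum = 0.0
    cos_sum = 0.0
    for t, f in zip(times, freqs):
        a = phi_rate * t + phi0
        sin_sum += (f - mean_f) * math.sin(a)
        cos_sum += (f - mean_f) * math.cos(a)
    return math.atan2(-cos_sum, sin_sum)


# Simulate five revolutions per second for one second and recover theta.
phi_rate = 2.0 * math.pi * 5.0
phi0 = 0.3
true_theta = math.radians(40.0)
times = [i / 2000.0 for i in range(2000)]
freqs = [observed_frequency(t, 1000.0, true_theta, 0.1, phi_rate, phi0) for t in times]
theta_hat = estimate_travel_direction(times, freqs, phi_rate, phi0)
```

Because θ is the traveling direction of the plane wave in this sketch, the sound source lies in the opposite direction, θ + π. The unknown f0 drops out because only the phase of the modulation is used, not its depth.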
[0046]
In addition, when multiple sound sources exist, the sound arriving from each sound source
is modulated with a phase that depends on the position of that sound source.
For example, FIG. 4 shows an example of observation results of sounds arriving from each of a
plurality of sound sources present at different positions.
In FIG. 4, the horizontal axis indicates time t, and the vertical axis indicates the frequency f of
the observation signal. The multiple curves shown in FIG. 4 each show an example of the
observation signal based on sound arriving from a different sound source. As can be seen from
FIG. 4, the information processing system 1 according to the present embodiment can estimate
the direction of each of a plurality of sound sources with a single sound collection unit 301.
Further, by extracting only the signal of a specific phase, the information processing system 1
according to the present embodiment can extract and separate the sound arriving from a sound
source located in a desired direction, even in a situation where, for example, the frequencies of
the signals overlap.
[0047]
An example of the method of estimating the direction of the sound source has been described
above with reference to FIGS. 3 and 4, focusing on the case where the sound source is
sufficiently far from the sound collection unit 301, the sound arriving from the sound source can
be regarded as a plane wave, and the sound collection unit 301 moves at constant speed along a
circular trajectory.
[0048]
<2.3. Generalizing the Sound from the Sound Source and the Trajectory of the Sound
Collection Unit> Next, with reference to FIGS. 5 and 6, an example of a method of estimating the
direction of the sound source will be described for the generalized case, that is, the case where
the sound arriving from the sound source is not limited to a sine wave and the trajectory of the
sound collection unit 301 is not limited to a circular trajectory. In the present description as
well, it is assumed that the sound source is sufficiently far from the sound collection unit 301
and that the sound arriving from the sound source can be regarded as a plane wave.
[0049]
First, let A (ω, t) be the spectrum at time t of the sound arriving from the sound source,
v = (vx, vy) the velocity of the sound collection unit 301 at time t, ek = (cos θ, sin θ) the unit
vector pointing in the traveling direction of the plane wave, and v⊥ the component of the velocity
v in the traveling direction of the plane wave. As in the example described above, v⊥ is given by
the inner product of ek and v. The angular frequency is denoted by ω. In this case, the spectrum
A <~> (ω, t) of the signal observed by the sound collection unit 301 (that is, the observation
signal) is expressed by (Equation 3) below. Here, “A <~>” denotes the character “A” with a tilde
above it.
[0050]
... (Equation 3)
[0051]
In the above (Equation 3), d represents the distance from the sound source to the sound
collection unit 301.
Also, ω0 indicates the angular frequency of the sound arriving from the sound source, and ω
indicates the (instantaneous) angular frequency modulated by the Doppler effect. Here, according
to assumption (2) described above, the spectrum can be regarded as stationary over a sufficiently
short section, so the relational expression shown as (Equation 4) below holds.
[0052]
... (Equation 4)
[0053]
Partially differentiating the spectrum A <~> (ω, t) of the observation signal given by
(Equation 3) with respect to time t, based on the relational expression shown as (Equation 4),
yields the equation shown as (Equation 5) below.
[0054]
... (Equation 5)
[0055]
On the other hand, the partial derivative of the spectrum A <~> (ω, t) of the observation signal
in the frequency direction is expressed by the equation shown as (Equation 6) below.
[0056]
... (Equation 6)
[0057]
Here, let γ be the ratio of the partial derivative in the time direction of the spectrum A <~> (ω, t)
of the observed signal to the partial derivative in the frequency direction of the spectrum A <~>
(ω, t). Then γ is expressed by the following equation (Equation 7).
[0058]
... (Equation 7)
[0059]
Here, since γ and v are observable, it is possible to estimate the direction of arrival of sound ek
(that is, the direction of the sound source) based on (Equation 7) shown above.
In addition, since the influence of observation errors and noise is also assumed in practice, γ
may be obtained at a plurality of (ω, t) to improve the estimation accuracy of the arrival
direction ek of sound.
[0060]
In the case where there are a plurality of sound sources, if the sound coming from a certain
sound source at a certain frequency is dominant, the value of ek to be estimated indicates the
direction of the sound source.
Therefore, when there is a band having no overlap in frequency between the sound sources, it is
possible to estimate the direction of each sound source by using the information of the band.
For example, FIG. 5 shows an example of the spectrum of the sound coming from each sound
source when the two sound sources are located in different directions.
In FIG. 5, the horizontal axis indicates time t, and the vertical axis indicates frequency f of the
observation signal.
In the example shown in FIG. 5, an example of the spectrum of the sound coming from each of
the sound sources located in the directions ek1 and ek2 different from each other is shown.
[0061]
Here, based on the spectrum shown in FIG. 5, the arrival direction ek of sound is calculated for
each time and each frequency (that is, for a plurality of (ω, t)), and the calculation results of ek
are counted for each arrival direction to generate a histogram.
FIG. 6 is an example of a graph representing the estimation result of the direction of arrival of
sound based on the spectrum shown in FIG. 5 as a histogram.
In FIG. 6, the horizontal axis represents the sound traveling direction θ (in other words, the
direction of arrival of sound), and the vertical axis represents the count value N of the direction
of arrival of sound ek calculated for a plurality of (ω, t). That is, the example shown in FIG. 6
indicates that there is a high possibility that sound sources are present in the directions of θ1
and θ2.
[0062]
Note that distortion may occur in the estimation result of the direction of arrival of sound ek due
to the overlap of spectra and the influence of non-stationary parts included in the sound coming
from a sound source. However, in the case where the conditions shown in the above-mentioned
assumptions (1) to (4) are satisfied, the direction of arrival ek can be estimated correctly in many
cases. Therefore, for example, it is possible to generate a histogram as shown in FIG. 6 and
estimate the direction of arrival of sound from each sound source (that is, the direction of each
sound source) from the peak value of the histogram.
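The counting and peak-picking step described above can be sketched as follows. This is a minimal
illustration, not the implementation of the embodiment: it assumes that a direction estimate θ (in
radians) is already available for each (ω, t), and the function names, the number of bins, and the
local-maximum rule are all hypothetical choices made for the example.

```python
import math


def direction_histogram(thetas, num_bins=36):
    """Count direction estimates (radians) into equal-width bins over [0, 2*pi)."""
    counts = [0] * num_bins
    width = 2 * math.pi / num_bins
    for theta in thetas:
        # Wrap the angle into [0, 2*pi) and guard against an index equal to num_bins.
        index = int((theta % (2 * math.pi)) / width) % num_bins
        counts[index] += 1
    return counts


def histogram_peaks(counts, min_count=1):
    """Return indices of bins that are circular local maxima with at least min_count votes."""
    n = len(counts)
    peaks = []
    for i, c in enumerate(counts):
        if c >= min_count and c > counts[(i - 1) % n] and c >= counts[(i + 1) % n]:
            peaks.append(i)
    return peaks
```

Each returned bin index corresponds to a candidate source direction; the bin width limits the
angular resolution, so in practice the number of bins would be chosen according to the expected
estimation error.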
[0063]
In the above, an example of the method of estimating the direction of the sound source in the case
where the sound from the sound source and the trajectory of the sound collection unit are
generalized has been described with reference to FIGS. 5 and 6. As understood from the contents
described above, in the information processing system 1 according to the present embodiment, as
long as the frequency of the acoustic signal based on the sound collection result of the sound
collection unit 301 changes due to the Doppler effect, the manner in which at least one of the
position and the orientation of the sound collection unit 301 is changed (for example, the
trajectory along which the sound collection unit 301 moves) is not particularly limited.
[0064]
<2.4. When the sound source is close to the observation point> Next, referring to FIGS. 7 to
10, an example of a method of estimating the direction of a sound source will be described for the
case where the position of the sound source is close to the sound collecting portion, that is, the
case where the assumption that the sound coming from the sound source is a plane wave does not hold.
[0065]
For example, FIG. 7 is a diagram schematically showing an example of the spatial positional
relationship between the sound collecting unit 301 and the sound source when the position of
the sound source is close to the sound collecting unit 301. In the present description, in order to
make the method of estimating the direction of the sound source easier to understand, it is assumed
that the sound coming from the sound source is a single sine wave of frequency f0 and that the sound
collection unit 301 moves at a predetermined angular velocity φ on a circular orbit L1 of radius r. In
FIG. 7, reference symbol S indicates the position of the sound source. Further, reference symbol l
indicates the distance between the sound source S and the rotation center of the trajectory L1 on
which the sound collection unit 301 moves. At this time, the instantaneous frequency f of the
signal (that is, the observation signal) observed in the sound collection unit 301 by the Doppler
effect is expressed by the following equation (Equation 8).
[0066]
... (Equation 8)
[0067]
In the above (Equation 8), φ0 represents an angle at t = 0.
For example, FIG. 8 shows an example of the observation result of the sound coming from the
proximity sound source. In FIG. 8, the horizontal axis indicates time t, and the vertical axis
indicates frequency f of the observation signal. In the example shown in FIG. 8, in addition to the
observation result of the sound from the proximity sound source, an example of the observation
result in the case where the sound can be regarded as a plane wave is presented as a reference.
As can be seen by referring to (Equation 8) shown above and FIG. 8, in the case of a proximity
sound source, distortion occurs in the signal modulated by the Doppler effect, but the period and
phase are still preserved. Therefore, it is possible to estimate the direction θ of the
sound source from the phase of the signal modulated by the Doppler effect.
[0068]
Specifically, assuming that the direction of the sound source is θ, the steady frequency is
modulated in the form of sin (φt + φ0 + θ) (see, for example, the above-mentioned (Equation
2)), so the direction of the sound source can be estimated as the phase difference θ with respect
to f = sin (φt + φ0). As a more specific example, a cross-correlation function may be calculated,
and the phase difference may be determined as θ = φΔT from the time ΔT at which the correlation
value becomes maximum. In this case, the distance l between the observation point and the
proximity sound source may be unknown.
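The cross-correlation approach can be sketched as follows. This is a minimal illustration under
stated assumptions: the observed instantaneous frequency is available as uniformly sampled values,
its modulation has the form sin (φt + φ0 + θ), and ΔT is defined as the time advance applied to the
reference sin (φt + φ0) that maximizes the correlation. The function name and the brute-force search
over one rotation period are hypothetical choices for the example.

```python
import math


def estimate_phase_by_xcorr(observed, phi, phi0, dt):
    """Estimate theta = phi * delta_t, in [0, 2*pi), from sampled instantaneous frequencies.

    observed: instantaneous frequency samples taken at t = 0, dt, 2*dt, ...
    phi:      angular velocity of the sound collection unit (rad/s)
    phi0:     angle of the sound collection unit at t = 0 (rad)
    """
    n = len(observed)
    mean = sum(observed) / n
    centered = [x - mean for x in observed]  # remove the steady frequency component

    period = 2 * math.pi / phi
    num_lags = int(round(period / dt))  # search one full rotation period

    best_lag, best_score = 0, -math.inf
    for k in range(num_lags):
        delta_t = k * dt
        # Correlate the observation with the reference advanced by delta_t.
        score = sum(
            c * math.sin(phi * (i * dt + delta_t) + phi0)
            for i, c in enumerate(centered)
        )
        if score > best_score:
            best_lag, best_score = k, score
    return (phi * best_lag * dt) % (2 * math.pi)
```

The resolution of the estimate is φ·dt radians, and the distance l does not appear anywhere in the
calculation, which is consistent with the remark that l may be unknown. The sign convention of ΔT
is an assumption of this sketch.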
[0069]
Specifically, the cross correlation R between the observed signal and Aref (f, t), which is
obtained by setting (f0, θ, l) = (f0 ′, θ ′, l ′) in (Equation 8) shown above, is calculated, and
by finding the (f0 ′, θ ′, l ′) at which the correlation value becomes maximum, it is possible to
estimate the phase difference θ = θ ′. Aref (f, t) and R in this case are as shown as
(Equation 8a) below. With this method, not only the direction but also the distance l = l ′ to the
sound source can be estimated; however, obtaining them accurately requires solving a maximization
problem over three variables, so the amount of calculation may be larger.
[0070]
... (Equation 8a)
[0071]
In addition, another example of the method of deriving the phase difference θ will be described
below with reference to FIGS. 9 and 10.
FIG. 9 and FIG. 10 are explanatory diagrams for explaining an example of a method of calculating
the phase difference at the time of modulation by the Doppler effect. 9 and 10, the horizontal
axis represents time t, and the vertical axis represents frequency f.
[0072]
First, as shown in FIG. 9, the frequency f0 is derived so that the interval between the
intersections of the straight line f = f0 and the observation signal becomes constant. Next, as
shown in FIG. 10, a zero-point time t1 = (nπ−φ0) / φ of f = sin (φt + φ0) and a time t2 at which
the straight line f = f0 intersects the observation signal are derived. Then, the phase difference
θ = φ (t2−t1) may
be calculated based on the derived times t1 and t2. Also in this case, the distance l between the
observation point and the nearby sound source may be unknown.
[0073]
Needless to say, if it is possible to specify the distance l between the observation point and the
nearby sound source by some method, the arrival direction of the sound (that is, the direction of
the sound source) can be estimated based on (Equation 8) described above.
[0074]
As described above, with reference to FIGS. 7 to 10, an example of a method of estimating the
direction of a sound source has been described for the case where the position of the sound source
is close to the sound collecting unit, that is, the case where the assumption that the sound
coming from the sound source is a plane wave does not hold.
[0075]
<2.5. Application to Sound Source Separation and Beam Forming> As described above, according to
the information processing system 1 according to the present embodiment, it is possible to estimate
the arrival direction of sound (that is, the direction of the sound source) for each frequency bin.
Therefore, it is possible to enhance or suppress the sound coming from a desired direction by, for
example, multiplying the acquired observation signal by a gain value according to the desired
direction.
[0076]
Although the sound obtained in the information processing system 1 according to the present
embodiment is modulated by the Doppler effect and thus becomes a distorted sound, by performing,
for example, the inverse correction of the modulation due to the Doppler effect based on
(Equation 3) described above, it is possible to acquire a sound with less distortion, as in the
case where the sound collection unit 301 is at rest.
[0077]
In addition, it may be assumed that the filter gain changes rapidly and musical noise occurs
because the estimation result of the direction of the sound source changes rapidly due to
movement of the sound source, calculation error, and the like.
In such a case, for example, in order to avoid the occurrence of musical noise, processing such as
smoothing in the time direction may be added to the estimation result of the direction of the
sound source or the filter gain value.
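The direction-dependent gain and its smoothing in the time direction can be sketched as follows.
This is only an illustration under assumptions not stated in the text: the gain is a simple binary
mask (1.0 inside a beam around the desired direction, a small floor value outside), and the
smoothing is a first-order recursive average. The function names and the parameters beam_width,
floor, and alpha are hypothetical.

```python
import math


def direction_gain(theta_est, theta_target, beam_width, floor=0.1):
    """Return 1.0 if the estimated direction lies within beam_width of the target, else floor."""
    # Smallest absolute angular difference, taking wrap-around at 2*pi into account.
    diff = abs((theta_est - theta_target + math.pi) % (2 * math.pi) - math.pi)
    return 1.0 if diff <= beam_width else floor


def smooth_gains(gains, alpha=0.3):
    """Smooth a per-frame gain sequence for one frequency bin with a recursive average."""
    smoothed = []
    prev = None
    for g in gains:
        prev = g if prev is None else alpha * g + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed
```

A smaller alpha suppresses abrupt gain changes (and hence musical noise) more strongly, at the cost
of reacting more slowly when the sound source actually moves.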
[0078]
In the above, as the technical feature of the present embodiment, in particular, the details of the
process relating to the estimation of the direction of the sound source by the information
processing device 10 have been described.
[0079]
<<3. Modified Example>> Next, a modified example of the information processing system 1 according
to the present embodiment will be described.
[0080]
<3.1. Modification Example 1: Example of Using a Plurality of Sound Collection Units> First,
as Modification Example 1, an example of a mechanism that can further improve the performance (for
example, resolution) related to estimation of the direction of a sound source by using a plurality
of sound collection units 301 will be described.
[0081]
As described above, in the information processing system 1 according to the present
embodiment, the single sound collecting unit 301 can estimate the direction of each of a plurality
of sound sources. On the other hand, the resolution for estimating the direction of the sound
source may depend on the moving speed of the sound collection unit 301, the degree of
steadiness of sound from the sound source, and the like. For example, when the moving speed of
the sound collection unit 301 is excessively slow, it may be difficult to observe the influence of
the Doppler effect, and in particular, direction estimation may be difficult when there are a
plurality of sound sources. On the other hand, when the moving speed of the sound collection
unit 301 is excessively fast, the change of the instantaneous frequency may be severe, and the
peak of the spectrum may be blurred, making it difficult to estimate the direction of the sound
source with high accuracy. In addition, the moving speed of the sound collection unit 301 is
restricted by hardware such as the drive unit 305 for moving the sound collection unit 301, and
in particular, it may be difficult to move the sound collection unit 301 at a higher speed.
Therefore, in the information processing system according to the first modification, the
performance (for example, resolution) related to the estimation of the direction of the sound
source can be further improved by using a plurality of sound collection units 301 even under the
above constraints.
[0082]
For example, FIG. 11 is an explanatory diagram for describing an overview of the information
processing system 1 according to the first modification, and schematically shows an example of the
spatial positional relationship between the sound collection units 301 and the sound source when a
plurality of sound collection units 301 are used. In the present description, as shown in FIG.
11, it is assumed that each of the plurality of sound collecting units 301 moves on the same
circular trajectory L1, and that the sound coming from the sound source can be regarded as a
plane wave. Further, FIG. 12 shows an example of the observation result of the sound by the
plurality of sound collecting units 301. In FIG. 12, the horizontal axis indicates time t, and the
vertical axis indicates frequency f of the observation signal.
[0083]
As a specific example, the information processing apparatus 10 estimates the arrival direction of
the sound based on the acoustic signal collected by each of the plurality of sound collection
units 301. Note that, as can be seen with reference to FIG. 12, the observation signals acquired
by the respective sound collection units 301 are out of phase with each other owing to the
difference in relative positional relationship between the sound collection units 301. Therefore,
the information processing apparatus 10 shifts the histogram corresponding to the estimation result
of the direction of arrival obtained for each sound collection unit 301 by the phase due to the
difference in relative positional relationship among the plurality of sound collection units 301,
and then adds the histograms together. With such processing, the information processing apparatus
10 can obtain a sharper histogram as the estimation result of the direction of the sound source,
and can estimate the direction of the sound source more accurately based on the peak value of the
histogram.
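The shift-and-add operation can be sketched as follows. This is a minimal illustration that assumes
each sound collection unit already has a direction histogram over the same equal-width bins, and
that the phase offset of each unit has been converted to an integer number of bins. The function
name and the direction of the shift are hypothetical and depend on how the angles are defined.

```python
def shift_and_add(histograms, offsets_bins):
    """Circularly shift each histogram by its offset (in bins) and sum the results."""
    n = len(histograms[0])
    total = [0] * n
    for hist, shift in zip(histograms, offsets_bins):
        for i, count in enumerate(hist):
            # Undo the unit-specific phase offset before accumulating.
            total[(i - shift) % n] += count
    return total
```

When the offsets are correct, the peaks of the individual histograms line up and reinforce each
other, whereas counts caused by noise remain scattered.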
[0084]
As another example, a method may be used in which an amplitude spectrum is calculated from the
frequency of the acoustic signal (namely, the observation signal) observed by each sound
collection unit 301, and the direction of the sound source is estimated based on the amplitude
spectrum. In this description, in order to make the method of estimating the direction of the
sound source easier to understand, the description focuses on the situation where a plane wave
consisting of a single sine wave of frequency f0 arrives from the θ direction.
[0085]
Specifically, when it is assumed that the angles at time t = 0 of the N sound collection units 301
located on the same circular trajectory L1 are φ0 ... φN, the frequency fi of the acoustic signal
(that is, the observation signal) observed by the i-th sound collection unit 301 is expressed by
the following equation (Equation 9).
[0086]
... (Equation 9)
[0087]
Here, assuming that the amplitude spectrum of the acoustic signal observed by the i-th sound
collection unit 301 is A <i> (fi) and the unknown direction of arrival is θ ′, the sum A <-> (f)
of the amplitude spectra in which the influence of the Doppler effect is corrected is expressed by
the following equation (Equation 10).
Here, "A <->" denotes the character "A" with a bar attached above it.
[0088]
... (Equation 10)
[0089]
For example, FIG. 13 shows an example of the amplitude spectrum calculated based on the sound
collection result of each of the plurality of sound collection units 301, namely an example of the
amplitude spectrum in the case of θ = θ ′ in (Equation 10) above.
In FIG. 13, the horizontal axis indicates the frequency f, and the vertical axis indicates the
amplitude | A |.
As can be seen with reference to FIG. 13, at θ = θ ′, the frequencies obtained by correcting the
influence of the Doppler effect on the observed signals substantially match among the plurality of
sound collection units 301, so that the peak of the spectrum becomes sharper and takes its maximum
value.
[0090]
Based on such characteristics, the direction of arrival of the sound (that is, the direction of the
sound source) can be estimated by finding θ ′ such that the sum A <-> (f) of the amplitude
spectra takes the sharpest maximum value. In this case, since A <-> (f) emphasizes the sound in
the θ direction, it can be used for beam forming, sound source separation, and the like.
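The search over θ ′ can be illustrated with the following sketch. It does not reproduce (Equation
9) or (Equation 10); instead it assumes a Doppler model of the form fi (t) = f0 (1 + (rφ / c) sin
(φt + φi + θ)), with c the speed of sound, and it replaces the "sharpest peak" criterion with a
simpler stand-in: the θ ′ for which the Doppler-corrected frequencies of all sound collection units
have the smallest variance. All names, the model, and the value of c are assumptions made for the
example.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, assumed value


def doppler_factor(t, phi, phi_i, theta, r, c=SPEED_OF_SOUND):
    """Assumed modulation factor for a unit at angle phi*t + phi_i on a circle of radius r."""
    return 1.0 + (r * phi / c) * math.sin(phi * t + phi_i + theta)


def estimate_direction(observations, phi, r, num_candidates=360):
    """Grid-search theta' so that the Doppler-corrected frequencies agree best.

    observations: list of (t, phi_i, f_observed) tuples gathered from all units.
    Returns the candidate theta' (radians) with the smallest variance of corrected frequency.
    """
    best_theta, best_spread = 0.0, math.inf
    for k in range(num_candidates):
        theta = 2 * math.pi * k / num_candidates
        corrected = [
            f / doppler_factor(t, phi, phi_i, theta, r)
            for t, phi_i, f in observations
        ]
        mean = sum(corrected) / len(corrected)
        spread = sum((x - mean) ** 2 for x in corrected) / len(corrected)
        if spread < best_spread:
            best_theta, best_spread = theta, spread
    return best_theta
```

With the correct θ ′, every corrected frequency collapses onto f0, which corresponds to the
spectra of the individual units overlapping into a single sharp peak; with an incorrect θ ′ the
corrected frequencies stay spread out.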
[0091]
Moreover, since the relational expression shown above as (Equation 10) holds for an arbitrary
frequency f, the sound coming from a sound source need not be a single sine wave and may have an
arbitrary spectrum. For example, FIG. 14 shows another example of the amplitude spectrum calculated
based on the sound collection result of each of the plurality of sound collection units 301, namely
an example of the spectrum obtained when the sum A <-> (f) of the amplitude spectra is calculated
assuming an arrival direction different from the original sound arrival direction θ (that is,
θ ≠ θ ′). In this case, the corrected amplitude spectra A <i> (fi) corresponding to the
respective sound collecting units 301 do not overlap, and therefore, as shown in FIG. 14, the
spectrum has a smaller peak value and a broader shape than in the example for θ = θ ′ described
above.
[0092]
As described above, as Modification 1, an example of a mechanism that can further improve the
performance (for example, resolution) related to the estimation of the direction of the sound
source by using a plurality of sound collecting units 301 has been described with reference to
FIGS. 11 to 14.
[0093]
<3.2. Modified Example 2: Combination with Other Direction Estimation Technology> Next, as a
modified example 2, an example of a technique for further improving the accuracy of estimating the
direction of the sound source will be described, in which the process related to the estimation of
the direction of the sound source by the information processing system 1 according to the present
embodiment is combined with a process related to another direction estimation technology.
[0094]
Specifically, in a situation where a large amount of noise is mixed in from various directions, it
may be difficult, depending on the direction of arrival of the noise, to calculate the modulated
observation signal used for estimating the direction of arrival of sound from a target sound
source. In such a case, for example, the information processing apparatus 10 may estimate
candidates for the position of the sound source by analyzing an image of the surroundings of the
observation point, and may estimate the direction of the sound source by combining that estimation
result with the estimation result of the direction of arrival of the sound based on the
observation signal of the sound collection unit 301 described above.
[0095]
Specifically, the information processing apparatus 10 acquires a video of the surroundings of the
observation point captured by an imaging device or the like, extracts sound source candidates (for
example, people) by performing various analysis processing such as image analysis on the acquired
video, and estimates candidates of the direction of the sound source based on the extraction
result. Then, when analyzing the sound signal (observation signal) based on the sound collection
result of the sound collection unit 301, the information processing apparatus 10 may apply a
filter such that the sound coming from the direction of the sound source estimated based on the
video is further emphasized. As a more specific example, it is also
possible to emphasize the sound coming from a desired direction by applying a filter that
smoothes the spectrum along the frequency modulation according to the direction of arrival of
the sound.
[0096]
Note that the example described above is merely an example. The other direction estimation
technique combined with the process related to the estimation of the direction of the sound
source by the information processing system 1 according to the present embodiment is not
necessarily limited to techniques based on video analysis, as long as it can estimate candidates
of the direction of the sound source.
[0097]
As described above, as the second modification, an example of a technique for further improving
the accuracy of estimating the direction of the sound source has been described, in which the
processing relating to estimation of the direction of the sound source by the information
processing system 1 according to the present embodiment is combined with processing relating to
other direction estimation techniques.
[0098]
<3.3. Modified Example 3: An Example of Moving the Observation Point> Next, as a modified example
3, an example of applying the information processing system 1 according to the present
embodiment to a mobile body such as a car (vehicle) will be described.
For example, FIG. 15 is an explanatory diagram for describing an overview of the information
processing system 1 according to the third modification. In the example shown in FIG. 15, the
sound collection unit 30 is placed on a mobile unit 50 (for example, a car, a train, a bicycle, etc.),
and the sound collection unit 30 itself moves as the mobile unit 50 moves. In the present
description, the sound collection unit 301 is described as moving along a circular trajectory.
Moreover, in the example shown in FIG. 15, the moving body 50 moves on the xy plane.
[0099]
In this case, for example, the information processing apparatus 10 recognizes the moving speed
of the moving body 50. As a specific example, the information processing apparatus 10 may acquire,
from the moving body 50, information indicating the moving speed (for example, the value of the
speedometer, information indicating the content of the steering wheel operation, etc.), and may
recognize the moving speed of the moving body 50 based on that information. In addition, the
information processing
apparatus 10 may recognize the moving speed of the moving body 50 based on the detection
results of various sensors such as an acceleration sensor. Further, as another example, the
information processing apparatus 10 may calculate the moving speed of the moving body 50
based on the positioning result of the position of the moving body 50 by GPS (Global Positioning
System) or the like. Of course, the information processing apparatus 10 may recognize the
moving speed of the moving object 50 by combining two or more of the various methods
described above.
[0100]
For example, FIG. 16 shows an example of the detection result of the velocity and acceleration of
the mobile unit 50 in which the sound collection unit 30 is installed. In FIG. 16, the horizontal
axis of each graph indicates time. Further, in FIG. 16, | v | indicates the absolute value of the
velocity of the mobile unit 50, and can be acquired as, for example, a value of a speedometer.
Further, ax represents an acceleration applied to the moving body 50 in the x direction, and ay
represents an acceleration applied to the moving body 50 in the y direction. The accelerations ax
and ay can be acquired, for example, as detection results of the acceleration sensor. Also, vx
indicates the x-direction component of the moving speed of the moving body 50, and vy indicates
the y-direction component of the moving speed of the moving body 50. The velocity vx in the x
direction of the moving body 50 can be calculated as an integral value in the time direction of the
acceleration ax applied in the x direction. Similarly, the velocity vy in the y direction of the
moving body 50 can be calculated as an integral value in the time direction of the acceleration ay
applied in the y direction.
[0101]
Then, the information processing apparatus 10 calculates the moving speed of the sound collection
unit 301 accompanying the movement of the moving body 50 by adding the speed of the moving body 50
as a bias to the speed of the sound collection unit 301 moving along the circular track relative
to the moving body 50. Here, assuming that the speed of the moving body 50 is vcar, and the speed
of the sound collecting unit 301 moving along the circular track relative to the moving body 50 is
vmic, the moving speed vtotal of the sound collecting unit 301 accompanying the movement of the
moving body 50 is expressed by the following equation (Equation 11).
[0102]
... (Equation 11)
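The velocity calculation described in the text can be sketched as follows. The sketch assumes that
(Equation 11) is the vector sum of vcar and vmic, as stated in the preceding paragraph, that the
acceleration sensor delivers uniformly sampled values, and that the sound collection unit rotates
at a constant angular velocity on a circle of radius r. The function names, the rectangle-rule
integration, and the initial velocity v0 are hypothetical choices for the example.

```python
import math


def integrate(acc, dt, v0=0.0):
    """Integrate acceleration samples over time (rectangle rule) to obtain velocity samples."""
    v, out = v0, []
    for a in acc:
        v += a * dt
        out.append(v)
    return out


def mic_velocity(t, r, phi, phi0=0.0):
    """Velocity of the sound collection unit on its circular track, relative to the vehicle."""
    angle = phi * t + phi0
    return (-r * phi * math.sin(angle), r * phi * math.cos(angle))


def total_velocity(v_car, v_mic):
    """Add the vehicle velocity as a bias to the relative velocity of the sound collection unit."""
    return (v_car[0] + v_mic[0], v_car[1] + v_mic[1])
```

Because integrating acceleration accumulates sensor error over time, the integrated velocity would
in practice be corrected with other sources such as the speedometer value or GPS, as the text
suggests.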
[0103]
The subsequent process is the same as that of the above-described embodiment.
As a specific example, the information processing apparatus 10 may estimate the direction of the
sound source with respect to the sound collection unit 30 (and consequently, the moving object 50)
based on the processing described above with reference to (Equation 3) to (Equation 7).
With such a configuration, for example, by applying the information processing system 1 to a
car, it is possible to estimate the arrival direction of a siren or of the traveling sound of
another car located in a blind spot, which can be applied to grasping the surrounding situation
and detecting danger.
[0104]
The type of the moving body 50 is not necessarily limited, and the movement of the moving body
50 is not limited to the planar movement described above. As a specific example, the mobile unit
50 may be configured as a small unmanned aerial vehicle such as a so-called drone. In such a case,
the information processing apparatus 10 installed in the mobile unit 50 configured as a small
unmanned aerial vehicle preferably estimates the direction of the sound source (that is, the
direction of arrival of sound) three-dimensionally by analyzing the speed of the sound collection
unit 301 and the traveling direction of the sound three-dimensionally.
[0105]
In addition, in the case of applying a moving body 50 that moves three-dimensionally at a
relatively high speed, such as a small unmanned aerial vehicle, there are cases where the
direction of the sound source can be detected by monitoring the movement of the moving body 50,
even when a rotation mechanism for rotating the sound collection unit 301 is not provided.
Specifically, the velocity of the moving body 50 may be estimated by an acceleration sensor, an
ultrasonic sensor, an atmospheric pressure sensor, GPS, or the like, and the velocity may be
regarded as the moving velocity of the sound collector 301 to estimate the direction of the sound
source. In the case of such a configuration, for example, it is possible to estimate the position
of the sound source by estimating the arrival direction of the sound while the moving body 50
moves around, totaling the estimation results, and using the totaled result.
[0106]
Further, in the case where the moving object 50 itself makes a sound like a so-called drone, for
example, it is possible to mutually grasp the positions of the plurality of moving objects 50. In
this case, for example, the one mobile unit 50 may consider the other mobile unit 50 as a sound
source and estimate the position or direction of the other mobile unit 50.
[0107]
Heretofore, as the third modification, an example in which the information processing system 1
according to the present embodiment is applied to a mobile body such as a car has been
described with reference to FIGS. 15 and 16.
[0108]
<3.4. Modified Example 4: Indoor Application Example> Next, as a modified example 4, an example in
which the information processing system 1 according to the present embodiment is applied to an
apparatus installed indoors will be described. For example, FIG. 17 is an explanatory diagram for
describing an overview of the information processing system 1 according to the fourth
modification, and shows an example in the case where the information processing system 1 is
applied to a ceiling fan installed indoors.
[0109]
Specifically, in the example shown in FIG. 17, the ceiling fan 30' installed on the ceiling is used
as the sound collection unit 30 (see, for example, FIG. 1) described above, and the rotary wings
303' of the ceiling fan are used as the support portion 303 for supporting the sound collection
portion 301. With such a configuration, when the rotary wings 303' of the ceiling fan rotate, the
sound collection unit 301 moves along a circular path. At this time, for example, when the voices
emitted from the user U21 and the user U22 are collected by the sound collection unit 301, they
are modulated by the influence of the Doppler effect. That is, in the example shown in FIG. 17,
the directions of the user U21 and the user U22 with respect to the ceiling fan 30 ′ (that is, the
sound collection unit 30) can be estimated based on the sound collection results of the voices
from the user U21 and the user U22.
[0110]
With such a configuration, for example, when voice input is performed, it is possible to estimate
the direction of the user (that is, the speaker) who utters the voice, and to provide a service to the
user. At this time, for example, the system may be configured such that a user individual can be
specified by combining with an image recognition technology or the like, and a service according
to the specification result of the individual can be provided.
[0111]
Further, as in the example shown in FIG. 17, by using the ceiling fan 30 ′ as the sound
collection unit 30, the position of the sound collection unit 301 can be fixed while occupying
less space than in the case where the sound collection unit is installed on the floor or a table.
Also, as shown in FIG. 17, when the ceiling fan 30 ′ is used as the sound collection unit 30,
there is a high possibility that the sound collection unit 30 (that is, the ceiling fan 30 ′) can
be installed near the center of the room, and an obstacle is unlikely to be present between the
sound collection unit 30 and the sound source (for example, the user). Therefore, as shown in FIG.
17, by using the ceiling fan 30' as the sound collection unit 30, it is possible to estimate the
direction of the sound source (for example, the user) more accurately.
[0112]
Heretofore, with reference to FIG. 17, an example in which the information processing system 1
according to the present embodiment is applied to an apparatus installed indoors has been
described as the fourth modification. Although the fourth modification has described the case
where a ceiling fan is used as the sound collection unit 30, it goes without saying that the
apparatus that can be used as the sound collection unit 30 is not necessarily limited to a ceiling
fan. More specifically, any device at least part of which has a mechanism that moves at a
sufficiently high speed relative to the assumed movement speed of the sound source can be used as
the sound collection unit 30.
[0113]
<<4. Hardware Configuration>> Next, an example of a hardware configuration of the
information processing device 10 (that is, the above-described signal processing devices 11 to
14) according to each embodiment of the present disclosure will be described. FIG. 18 is a diagram
showing an example of a hardware configuration of the information processing apparatus 10
according to each embodiment of the present disclosure.
[0114]
As shown in FIG. 18, the information processing apparatus 10 according to the present
embodiment includes a processor 901, a memory 903, a storage 905, an operation device 907, a
notification device 909, a sound collection device 913, and a bus 917. In addition, the
information processing apparatus 10 may include at least one of an acoustic device 911 and a
communication device 915.
[0115]
The processor 901 may be, for example, a central processing unit (CPU), a graphics processing
unit (GPU), a digital signal processor (DSP), or a system on chip (SoC), and executes various
processes of the information processing apparatus 10. The processor 901 can be configured by,
for example, an electronic circuit for executing various arithmetic processing. The analysis unit
101, the drive control unit 103, and the process execution unit 105 described above can be
realized by the processor 901.
[0116]
The memory 903 includes a random access memory (RAM) and a read only memory (ROM), and
stores programs and data to be executed by the processor 901. The storage 905 may include a
storage medium such as a semiconductor memory or a hard disk. For example, the storage unit
107 described above can be realized by at least one of the memory 903 and the storage 905, or
a combination of both.
[0117]
The operation device 907 has a function of generating an input signal for the user to perform a
desired operation. The operation device 907 can be configured, for example, as a touch panel.
As another example, the operation device 907 may include an input unit with which the user
inputs information, such as a button, a switch, or a keyboard, and an input control circuit that
generates an input signal based on the input by the user and supplies the input signal to the
processor 901.
[0118]
The notification device 909 is an example of an output device and may be, for example, a device
such as a liquid crystal display (LCD) device or an organic light emitting diode (OLED) display. In
this case, the notification device 909 can notify the user of predetermined information by
displaying a screen.
[0119]
Note that the notification device 909 described above is merely an example, and the form of the
notification device 909 is not particularly limited as long as it can notify the user of
predetermined information. As a specific example, the notification device 909 may be a device
that notifies the user of predetermined information by a lighting or blinking pattern, such as a
light emitting diode (LED). Further, the notification device 909 may be a device that notifies
the user of predetermined information by vibrating, such as a so-called vibrator.
[0120]
The acoustic device 911 is a device such as a speaker or the like that notifies a user of
predetermined information by outputting a predetermined acoustic signal.
[0121]
The sound collection device 913 is a device, such as a microphone, that collects the voice
uttered by the user and the sound of the surrounding environment and acquires them as acoustic
information (an acoustic signal). The sound collection device 913 may acquire, as the acoustic
information, data representing an analog acoustic signal of the collected voice or sound, or may
convert the analog acoustic signal into a digital acoustic signal and acquire data representing
the converted digital acoustic signal as the acoustic information. The sound collection unit 301
described above can be realized by the sound collection device 913.
[0122]
The communication device 915 is a communication unit included in the information processing
device 10, and communicates with an external device via a network. The communication device
915 is a wired or wireless communication interface. When the communication device 915 is
configured as a wireless communication interface, the communication device 915 may include a
communication antenna, an RF (Radio Frequency) circuit, a baseband processor, and the like.
[0123]
The communication device 915 has a function of performing various types of signal processing
on a signal received from an external device, and can supply the processor 901 with a digital
signal generated from the received analog signal.
[0124]
The bus 917 mutually connects the processor 901, the memory 903, the storage 905, the
operation device 907, the notification device 909, the acoustic device 911, the sound collection
device 913, and the communication device 915.
The bus 917 may include multiple types of buses.
[0125]
In addition, a program for causing hardware such as a processor, a memory, and a storage built
in a computer to exhibit the same function as the configuration of the information processing
apparatus 10 described above can be created. In addition, a computer readable storage medium
in which the program is recorded may be provided.
[0126]
<<5. Conclusion>>
As described above, in the information processing system 1 according to the present embodiment,
the sound collection unit 301 collects sound from one or more sound sources while at least one
of the position and the orientation of the sound collection unit 301 is changed in a
predetermined pattern (for example, while it is moved along a predetermined trajectory). The
information processing apparatus 10 then estimates the direction of the sound source of the
collected sound by using the characteristic that the frequency of the acoustic signal based on
the sound collected by the sound collection unit 301 changes due to the Doppler effect
accompanying the change in the position and the orientation of the sound collection unit 301.
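The principle summarized above can be sketched numerically. The following is a simplified
illustration under strong assumptions, not the estimation method of this specification: a single
far-field source emits a pure tone of known frequency f0 in the plane of a circular microphone
trajectory, and the instantaneous received frequency is assumed to have been extracted already
(for example, from a spectrogram). Under those assumptions the received frequency is
f0 * (1 + (r * omega / c) * sin(theta - omega * t)), so the source direction theta appears as the
phase of a sinusoidal frequency modulation. All names and numeric values below are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed


def observed_frequency(f0, theta, radius, omega, t):
    """Frequency received by a microphone on a circular path from a far-field tone.

    The microphone is at angle omega * t on a circle of the given radius, and the
    source lies in direction theta (radians) in the plane of that circle.
    """
    # Component of the microphone velocity pointing toward the source.
    radial_speed = radius * omega * np.sin(theta - omega * t)
    return f0 * (1.0 + radial_speed / SPEED_OF_SOUND)


def estimate_direction(freq, f0, omega, t):
    """Recover theta from the phase of the sinusoidal frequency modulation.

    Assumes t covers a whole number of revolutions with uniform sampling, so the
    sine and cosine at the rotation rate are orthogonal over the samples.
    """
    d = freq / f0 - 1.0                 # proportional to sin(theta - omega * t)
    a = np.sum(d * np.cos(omega * t))   # proportional to  sin(theta)
    b = np.sum(d * np.sin(omega * t))   # proportional to -cos(theta)
    return np.arctan2(a, -b)


# Hypothetical scenario: 3 revolutions per second, radius 0.5 m, 1 kHz tone at 40 degrees.
f0 = 1000.0
radius = 0.5
omega = 2.0 * np.pi * 3.0
t = np.arange(1000) / 1000.0            # 1 s of samples = exactly 3 revolutions
theta_true = np.deg2rad(40.0)

rng = np.random.default_rng(0)
freq_track = observed_frequency(f0, theta_true, radius, omega, t)
freq_track = freq_track + rng.normal(scale=0.5, size=t.size)  # measurement noise in Hz

theta_est = estimate_direction(freq_track, f0, omega, t)
print(f"true {np.rad2deg(theta_true):.2f} deg, estimated {np.rad2deg(theta_est):.2f} deg")
```

The projection onto sine and cosine is used here only because it is the shortest way to read off
the modulation phase. With several sources or broadband sound, the frequency tracks would first
have to be separated, which this sketch does not attempt.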
[0127]
With such a configuration, the information processing system 1 according to the present
embodiment can estimate the direction of each of a plurality of sound sources with higher
resolution by providing at least one sound collection unit 301, even in a situation where a
plurality of sound sources exist in the surroundings. That is, according to the information
processing system 1 according to the present embodiment, it is possible to achieve both a
reduction in the number of sound collection units 301 and an improvement in the resolution of
the estimation of the direction of the sound source. Further, since the number of sound
collection units 301 can be reduced in the information processing system 1 according to the
present embodiment, various costs and the weight can also be reduced.
[0128]
The preferred embodiments of the present disclosure have been described in detail with
reference to the accompanying drawings, but the technical scope of the present disclosure is not
limited to such examples. It is obvious that a person having ordinary skill in the technical
field of the present disclosure can conceive of various modifications or alterations within the
scope of the technical idea described in the claims, and it is understood that these also
naturally belong to the technical scope of the present disclosure.
[0129]
In addition, the effects described in the present specification are merely illustrative or exemplary,
and not limiting. That is, the technology according to the present disclosure can exhibit other
effects apparent to those skilled in the art from the description of the present specification, in
addition to or instead of the effects described above.
[0130]
The following configurations are also within the technical scope of the present disclosure.
(1) An information processing apparatus including: an acquisition unit configured to acquire a
sound collection result of sound from each of one or more sound sources, collected by a sound
collection unit whose position information, indicating at least one of position and orientation,
changes in a predetermined pattern; and an estimation unit configured to estimate the direction
of each of the one or more sound sources based on a change in frequency of the sound collected
by the sound collection unit in accordance with the change in the position information of the
sound collection unit.
(2) The information processing apparatus according to (1), further including a drive control
unit configured to control an operation of a drive unit that changes the position information of
the sound collection unit in the predetermined pattern.
(3) The information processing apparatus according to (2), wherein the drive control unit
controls the operation of the drive unit such that the position information of the sound
collection unit changes along a substantially circular predetermined trajectory.
(4) The information processing apparatus according to any one of (1) to (3), wherein the
estimation unit estimates the direction of each of the one or more sound sources based on the
ratio of a change in the time direction of a spectrum of the sound collected by the sound
collection unit to a change in the frequency direction of the spectrum, and on the change in the
position information of the sound collection unit.
(5) The information processing apparatus according to any one of (1) to (4), further including
an acoustic control unit configured to control the amplitude of the collected sound arriving
from at least one direction, based on the estimation result of the direction of each of the one
or more sound sources.
(6) The information processing apparatus according to any one of (1) to (5), wherein the
acquisition unit acquires a sound collection result of sound from each of a plurality of sound
collection units, and the estimation unit estimates the direction of each of the one or more
sound sources based on the change in frequency of the sound collected by each of the plurality
of sound collection units as each of the plurality of sound collection units moves.
(7) The information processing apparatus according to any one of (1) to (6), wherein the sound
collection unit is supported by a predetermined moving body such that the position information
changes in the predetermined pattern relative to the moving body, and the estimation unit
estimates the direction of each of the one or more sound sources based on the change in
frequency of the sound accompanying a change in at least one of the position and the orientation
of the predetermined moving body and the change in the position information of the sound
collection unit relative to the predetermined moving body.
(8) The information processing apparatus according to (7), wherein the predetermined moving
body is a vehicle.
(9) The information processing apparatus according to any one of (1) to (6), wherein the sound
collection unit is held by a predetermined driving body, and the position information changes in
the predetermined pattern as the driving body is driven.
(10) The information processing apparatus as described above, wherein the driving body is a
rotating body, and the position information of the sound collection unit changes along a
substantially circular predetermined trajectory as the rotating body rotates.
(11) The information processing apparatus according to (10), wherein the rotating body is a
rotor of a fan installed on a predetermined ceiling surface or wall surface.
(12) The information processing apparatus according to any one of (1) to (6), wherein the sound
source is another moving body that emits sound, and the estimation unit estimates the direction
of the other moving body based on the change in frequency of the sound collected by the sound
collection unit.
(13) The information processing apparatus according to (12), wherein the estimation unit
corrects an acquired estimation result of the position of the other moving body based on the
estimation result of the direction of the other moving body obtained from the change in
frequency of the sound collected by the sound collection unit.
(14) The information processing apparatus according to any one of (1) to (13), wherein the
acquisition unit acquires an image captured by an imaging unit, and the estimation unit
estimates the direction of each of the one or more sound sources based on an analysis result of
the acquired image and the change in frequency of the sound collected by the sound collection
unit.
(15) An information processing method including: acquiring a sound collection result of sound
from each of one or more sound sources, collected by a sound collection unit whose position
information, indicating at least one of position and orientation, changes in a predetermined
pattern; and estimating the direction of each of the one or more sound sources based on a change
in frequency of the sound collected by the sound collection unit in accordance with the change
in the position information of the sound collection unit.
(16) A program that causes a computer to execute: acquiring a sound collection result of sound
from each of one or more sound sources, collected by a sound collection unit whose position
information, indicating at least one of position and orientation, changes in a predetermined
pattern; and estimating the direction of each of the one or more sound sources based on a change
in frequency of the sound collected by the sound collection unit in accordance with the change
in the position information of the sound collection unit.
[0131]
DESCRIPTION OF SYMBOLS
1 information processing system
10 information processing apparatus
101 analysis part
103 drive control part
105 process execution part
107 storage part
30 sound collection unit
301 sound collection part
303 support part
305 drive part