JPWO2017056781
Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JPWO2017056781
Abstract: A signal processing device, a signal processing method, and a program are provided. The
signal processing device includes: a first arithmetic processing unit that performs a first
suppression process of suppressing a first audio signal based on a first microphone, on the basis
of a second audio signal based on a second microphone; and a second arithmetic processing unit
that performs a second suppression process of suppressing the second audio signal on the basis of
the first audio signal. [Selected figure] Figure 2
Signal processing apparatus, signal processing method, and program
[0001]
The present disclosure relates to a signal processing device, a signal processing method, and a
program.
[0002]
Stereo recording is performed using a stereo microphone, in which two microphones are arranged
on the left and right.
Recording with a stereo microphone provides, for example, a sense of localization. However, in a
small device such as an IC recorder, the distance between the microphones is small, so a
sufficient sense of localization may not be obtained.
10-05-2019
1
[0003]
Therefore, the sense of localization is enhanced by using microphones having directivity. For
example, Patent Document 1 below discloses a technique for adjusting the sense of localization
by adjusting the angles of two directional microphones.
[0004]
Patent Document 1: JP 2008-311802 A
[0005]
However, since the use of directional microphones may increase the cost, it is desirable to obtain
a more localized output even when using non-directional microphones that are relatively cheaper
than directional microphones.
[0006]
Therefore, the present disclosure proposes a new and improved signal processing device, signal
processing method, and program capable of obtaining an output signal with a higher sense of
localization even if the input signal is an audio signal obtained from a nondirectional
microphone.
[0007]
According to the present disclosure, there is provided a signal processing device including: a
first arithmetic processing unit that performs a first suppression process of suppressing a first
audio signal based on a first microphone, on the basis of a second audio signal based on a second
microphone; and a second arithmetic processing unit that performs a second suppression process of
suppressing the second audio signal on the basis of the first audio signal.
[0008]
Further, according to the present disclosure, there is provided a signal processing method
executed by a signal processing device, the method including: performing a first suppression
process of suppressing a first audio signal based on a first microphone, on the basis of a second
audio signal based on a second microphone; and performing a second suppression process of
suppressing the second audio signal on the basis of the first audio signal.
[0009]
Further, according to the present disclosure, there is provided a program for causing a computer
to realize: a first arithmetic processing function of performing a first suppression process of
suppressing a first audio signal based on a first microphone, on the basis of a second audio
signal based on a second microphone; and a second arithmetic processing function of performing a
second suppression process of suppressing the second audio signal on the basis of the first audio
signal.
[0010]
As described above, according to the present disclosure, it is possible to obtain an output signal
with a higher sense of localization even if the input signal is an audio signal obtained based on a
nondirectional microphone.
[0011]
Note that the effects described above are not necessarily limiting. Along with or in place of
them, any of the effects shown in the present specification, or other effects that can be
understood from the present specification, may be achieved.
[0012]
It is an explanatory view showing the appearance of the recording and reproducing apparatus
according to a first embodiment of the present disclosure.
It is a block diagram showing a configuration example of the recording and reproducing apparatus
1 according to the embodiment.
It is a block diagram showing a configuration example of the delay filter 142 according to the
embodiment.
It is a flowchart for explaining an operation example of the recording and reproducing apparatus
1 according to the embodiment.
It is an explanatory view showing a configuration example of a recording and reproduction system
according to a second embodiment of the present disclosure.
It is an explanatory view showing an example of the file format of a data file stored in the
storage unit 233 according to the embodiment.
It is an explanatory view showing an implementation example of the UI unit 245 according to the
embodiment.
It is an explanatory view showing an outline of a broadcast system according to a third
embodiment of the present disclosure.
It is an explanatory view showing a configuration example of the transmission system 32 according
to the embodiment.
It is an explanatory view showing a configuration example of the acquisition unit 329 according
to the embodiment.
It is an explanatory view showing a configuration example of the corresponding receiving device
34 according to the embodiment.
FIG. 6 is an explanatory view showing a configuration example of the non-corresponding receiving
device 36.
It is an explanatory view for explaining an outline of a fourth embodiment of the present
disclosure.
It is an explanatory view showing a configuration example of the smartphone 44 according to the
embodiment.
It is an explanatory view for explaining a modification according to the present disclosure.
It is an explanatory view for explaining a modification according to the present disclosure.
It is a block diagram showing an example of the hardware configuration of a signal processing
device according to the present disclosure.
[0013]
Hereinafter, preferred embodiments of the present disclosure will be described in detail with
reference to the accompanying drawings. In the present specification and the drawings,
components having substantially the same functional configuration will be assigned the same
reference numerals and redundant description will be omitted.
[0014]
In the present specification and the drawings, elements having substantially the same functional
configuration may be distinguished by appending different letters to the same reference numeral.
However, when it is not necessary to distinguish among a plurality of elements having
substantially the same functional configuration, only the common reference numeral is given.
[0015]
The description will be made in the following order.
<<1. First Embodiment>>
<1-1. Outline of First Embodiment>
<1-2. Configuration of First Embodiment>
<1-3. Operation of First Embodiment>
<1-4. Effects of First Embodiment>
<<2. Second Embodiment>>
<2-1. Outline of Second Embodiment>
<2-2. Configuration of Second Embodiment>
<2-3. Effects of Second Embodiment>
<2-4. Supplement of Second Embodiment>
<<3. Third Embodiment>>
<3-1. Outline of Third Embodiment>
<3-2. Configuration of Third Embodiment>
<3-3. Effects of Third Embodiment>
<<4. Fourth Embodiment>>
<4-1. Outline of Fourth Embodiment>
<4-2. Configuration of Fourth Embodiment>
<4-3. Effects of Fourth Embodiment>
<<5. Modifications>>
<<6. Hardware Configuration Example>>
<<7. Conclusion>>
[0016]
<<1. First Embodiment>>
<1-1. Outline of First Embodiment>
First, an overview of a signal processing apparatus according to a first embodiment of the
present disclosure, and the background leading to the creation of the recording and reproducing
apparatus according to the present embodiment, will be described with reference to FIG. 1. FIG. 1
is an explanatory view showing the appearance of the recording and reproducing apparatus
according to the first embodiment of the present disclosure.
[0017]
The recording and reproducing apparatus 1 according to the first embodiment shown in FIG. 1 is
a signal processing apparatus, such as an IC recorder, that performs both recording and
reproduction in a single apparatus. As shown in FIG. 1, the recording and reproducing apparatus 1
includes two microphones, a left microphone 110L and a right microphone 110R, and can perform
stereo recording.
[0018]
In a small device such as an IC recorder, it is difficult to increase the distance between the
two microphones (for example, the distance d between the left microphone 110L and the right
microphone 110R shown in FIG. 1). For example, when the microphones are only a few centimeters
apart, a sufficient sound pressure difference between left and right cannot be obtained, so a
sufficient sense of localization may not be obtained at the time of reproduction.
[0019]
When the left and right microphones respectively have directivity in the left direction and the
right direction, it is possible to enhance the sense of localization. Therefore, in order to obtain a
sufficient sense of localization even when the distance between the microphones is small, for
example, a configuration provided with two directional microphones can be considered, but
directional microphones are often more expensive than non-directional microphones. Further, in
the case of the configuration using the directional microphone, in order to adjust the localization
feeling, an angle adjustment mechanism for physically adjusting the angle of the directional
microphone is required, and the structure may be complicated.
[0020]
The present embodiment has been made in view of the above circumstances. According to the
present embodiment, the directivity of the audio signals is emphasized by suppressing each of the
left and right audio signals on the basis of the other audio signal, so that an output signal
with a higher sense of localization can be obtained even if the input signals are audio signals
obtained by nondirectional microphones. Moreover, according to the present embodiment, the sense
of localization can be adjusted by changing a parameter, without requiring a physical angle
adjustment mechanism for the microphones. Hereinafter, the configuration and operation of the
recording and reproducing apparatus according to the present embodiment having such effects will
be described in detail in order.
[0021]
<1-2. Configuration of First Embodiment>
The background that led to the creation of the recording and reproducing apparatus according to
the present embodiment has been described above. Next, the configuration of the recording and
reproducing apparatus according to the present embodiment will be described with reference to
FIGS. 2 and 3. FIG. 2 is a block diagram showing a configuration example of the recording and
reproducing apparatus 1 according to the first embodiment. As shown in FIG. 2, the recording and
reproducing apparatus 1 according to the present embodiment is a signal processing apparatus
including a left microphone 110L, a right microphone 110R, A/D conversion units 120L and 120R,
gain correction units 130L and 130R, a first arithmetic processing unit 140L, a second arithmetic
processing unit 140R, an encoding unit 150, a storage unit 160, a decoding unit 170, D/A
conversion units 180L and 180R, and speakers 190L and 190R.
[0022]
The left microphone 110L (first microphone) and the right microphone 110R (second
microphone) are, for example, nondirectional microphones. The left microphone 110L and the
right microphone 110R convert ambient sound into analog audio signals (electric signals) and
supply them to the A/D conversion unit 120L and the A/D conversion unit 120R, respectively.
[0023]
The A/D conversion unit 120L and the A/D conversion unit 120R convert the analog audio signals
supplied from the left microphone 110L and the right microphone 110R, respectively, into digital
audio signals (hereinafter sometimes simply referred to as audio signals).
[0024]
The gain correction unit 130L and the gain correction unit 130R perform gain correction
processing for correcting the gain difference (sensitivity difference) between the left
microphone 110L and the right microphone 110R.
The gain correction unit 130L and the gain correction unit 130R according to the present
embodiment correct this difference in the audio signals output from the A/D conversion unit 120L
and the A/D conversion unit 120R, respectively.
[0025]
For example, the gain difference between the left microphone 110L and the right microphone 110R
may be measured in advance, and the gain correction unit 130L and the gain correction unit 130R
may multiply the audio signals by predetermined values that cancel the gain difference. With such
a configuration, the influence of the gain difference between the left microphone 110L and the
right microphone 110R is suppressed, and directivity can be emphasized with higher accuracy by
the processing described later.
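The gain correction described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the gain difference is measured, so the use of RMS levels from a calibration recording, the choice of the mean level as the common target, and the function names `rms`, `gain_correction_factors`, and `apply_gain` are all hypothetical.

```python
import math

def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def gain_correction_factors(cal_left, cal_right):
    """Return (g_left, g_right) that bring both channels to a common level.

    Assumption: cal_left and cal_right were recorded from the same
    calibration sound, so any level difference is a microphone gain difference.
    """
    level_l, level_r = rms(cal_left), rms(cal_right)
    target = 0.5 * (level_l + level_r)   # hypothetical choice of common level
    return target / level_l, target / level_r

def apply_gain(x, g):
    """Multiply every sample by the predetermined correction value g."""
    return [g * v for v in x]

# Example: the right microphone is half as sensitive as the left one.
cal_left = [1.0, -1.0, 1.0, -1.0]
cal_right = [0.5, -0.5, 0.5, -0.5]
g_left, g_right = gain_correction_factors(cal_left, cal_right)
left_corrected = apply_gain(cal_left, g_left)
right_corrected = apply_gain(cal_right, g_right)
```

Because the factors are fixed after calibration, they can be computed once and stored, which matches the "predetermined value" wording of the text.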
[0026]
Although an example in which the gain correction processing is performed on the digital audio
signals after A/D conversion has been described above, the gain correction processing may
instead be performed on the analog audio signals before A/D conversion.
[0027]
Hereinafter, the audio signal output from the gain correction unit 130L may be referred to as
the left input signal or the first audio signal, and the audio signal output from the gain
correction unit 130R may be referred to as the right input signal or the second audio signal.
[0028]
The first arithmetic processing unit 140L and the second arithmetic processing unit 140R
perform arithmetic processing based on the left input signal and the right input signal.
For example, the first arithmetic processing unit 140L performs a first suppression process to
suppress the left input signal based on the right input signal.
The second arithmetic processing unit 140R performs a second suppression process of
suppressing the right input signal based on the left input signal.
[0029]
The functions of the first arithmetic processing unit 140L and the second arithmetic processing
unit 140R may be realized by, for example, different processors. Alternatively, one processor
may have the functions of both the first arithmetic processing unit 140L and the second
arithmetic processing unit 140R. In the following, an example in which the functions of the first
arithmetic processing unit 140L and the second arithmetic processing unit 140R are realized by a
DSP (Digital Signal Processor) will be described.
[0030]
As shown in FIG. 2, the first arithmetic processing unit 140L includes a delay filter 142L, a
directivity correction unit 144L, a suppression unit 146L, and an equivalent filter 148L.
Similarly, as shown in FIG. 2, the second arithmetic processing unit 140R includes a delay filter
142R, a directivity correction unit 144R, a suppression unit 146R, and an equivalent filter 148R.
[0031]
The delay filters 142L and 142R are filters that perform delay processing on input signals. As
shown in FIG. 2, the delay filter 142L performs a first delay process for delaying the right input
signal. Further, as shown in FIG. 2, the delay filter 142R performs a second delay process for
delaying the left input signal.
[0032]
The first delay processing and the second delay processing are performed based on the distance
between the left microphone 110L and the right microphone 110R (inter-microphone distance).
Since the timing at which the sound is transmitted to each microphone depends on the distance
between the microphones, according to such a configuration, for example, in combination with
the suppression processing described later, a directivity emphasizing effect based on the distance
between the microphones can be obtained.
[0033]
For example, the first delay processing and the second delay processing by the delay filters
142L and 142R may delay the signal by the number of samples corresponding to the time sound
takes to propagate over the inter-microphone distance. Assuming that the distance between the
microphones is d [cm], the sampling frequency is f [Hz], and the speed of sound is c [m/s], the
number of delay samples D applied by the delay filters 142L and 142R is calculated, for example,
by the following equation.
[0034]
$$D = \frac{d \cdot f}{100 \cdot c} \quad (1)$$
[0035]
Here, the number of delay samples D calculated by equation (1) is generally not an integer.
When D is not an integer, the delay filters 142L and 142R are non-integer delay filters.
Strictly speaking, realizing a non-integer delay filter requires a filter with an infinite tap
length. In practice, however, a filter truncated to a finite tap length, or a filter approximated
by linear interpolation or the like, may be used as the delay filters 142L and 142R. Hereinafter,
a configuration example of the delay filter 142 (delay filters 142L and 142R) realized as a
filter approximated by linear interpolation or the like will be described with reference to
FIG. 3.
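The delay-sample calculation described above can be sketched as follows. The formula is written out from the stated units (d in centimeters, f in hertz, c in meters per second), so the distance is divided by 100 before dividing by the speed of sound. The function name `delay_samples`, the default of 340 m/s, and the example values are assumptions for illustration only.

```python
def delay_samples(d_cm, fs_hz, c_m_per_s=340.0):
    """Number of samples sound needs to travel the inter-microphone distance.

    d_cm      : inter-microphone distance d in centimeters
    fs_hz     : sampling frequency f in hertz
    c_m_per_s : speed of sound c in meters per second (340 is an assumed value)
    """
    travel_time_s = (d_cm / 100.0) / c_m_per_s   # distance in meters / speed
    return travel_time_s * fs_hz

# Hypothetical device: microphones 2 cm apart, recording at 48 kHz.
D = delay_samples(d_cm=2.0, fs_hz=48000.0)
print(D)
```

For these example values D falls between 2 and 3 samples, which illustrates why a non-integer delay filter is needed.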
[0036]
Assuming that the integer part and the fractional part of the number of delay samples D are M
and η, respectively, an approximation of the signal obtained by delaying the signal y(n) input to
the delay filter 142 by D samples is given by the following equation.
[0037]
$$y(n - D) \approx (1 - \eta)\, y(n - M) + \eta\, y(n - M - 1) \quad (2)$$
[0038]
Expressing equation (2) as a block diagram yields FIG. 3. FIG. 3 is a block diagram showing a
configuration example of the delay filter 142.
As shown in FIG. 3, the delay filter 142 includes a delay filter 1421, a delay filter 1423, a
linear filter 1425, a linear filter 1427, and an adder 1429.
[0039]
The delay filter 1421 is an integer delay filter that delays by M samples. The delay filter 1423
is an integer delay filter that delays by one sample. The linear filter 1425 and the linear
filter 1427 multiply their input signals by 1 − η and η, respectively, and output the results.
The adder 1429 adds its input signals and outputs the result.
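A minimal sketch of the delay filter structure described above: an integer delay of M samples, a further one-sample delay, two multipliers with weights 1 − η and η (M and η being the integer and fractional parts of D), and an adder. The function name `fractional_delay` and the zero-padding of samples before the start of the signal are assumptions, not details from the source.

```python
import math

def fractional_delay(y, D):
    """Approximate y(n - D) by linear interpolation between two integer delays."""
    M = math.floor(D)      # integer part of the delay (delay filter 1421)
    eta = D - M            # fractional part of the delay

    def at(i):
        # Samples before the start of the signal are taken as zero (assumption).
        return y[i] if 0 <= i < len(y) else 0.0

    # (1 - eta) * y(n - M)  +  eta * y(n - M - 1), summed by the adder.
    return [(1.0 - eta) * at(n - M) + eta * at(n - M - 1) for n in range(len(y))]

# A ramp makes the result easy to check: delaying y(n) = n by D gives n - D.
ramp = [float(n) for n in range(8)]
delayed = fractional_delay(ramp, 2.25)
print(delayed[5])
```

Linear interpolation is exact for a ramp; for general signals it is only an approximation that attenuates high frequencies somewhat, which is the trade-off for using two taps instead of a long filter.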
[0040]
The first delay processing and the second delay processing by the delay filter 142L and the
delay filter 142R described above are performed on the basis of predetermined filter
coefficients. The filter coefficients may be determined from the inter-microphone distance so as
to realize the delay filter described above. In the present embodiment, since the left microphone
110L and the right microphone 110R are fixed to the recording and reproducing apparatus 1, the
filter coefficients may, for example, be determined in advance by the method of realizing the
delay filter 142 described above.
[0041]
Returning to FIG. 2, the directivity correction unit 144L and the directivity correction unit
144R are linear filters that multiply the signal obtained by the first delay processing and the
signal obtained by the second delay processing, respectively, by a predetermined value α and
output the results. α is a parameter for adjusting the directivity: the directivity is
strengthened as α approaches 1 and weakened as α approaches 0. Since the sense of localization
can be adjusted by adjusting the directivity, this configuration makes it possible to adjust the
directivity and the sense of localization by changing the parameter α, without requiring a
physical angle adjustment mechanism for the microphones.
[0042]
The suppression unit 146L performs the first suppression processing by subtracting a signal
based on the first delay processing from the left input signal. Likewise, the suppression unit
146R performs the second suppression processing by subtracting a signal based on the second
delay processing from the right input signal. With such a configuration, the output signal of
the suppression unit 146L obtains directivity in the left direction because the signal from the
right direction is suppressed. Similarly, the output signal of the suppression unit 146R obtains
directivity in the right direction because the signal from the left direction is suppressed.
[0043]
For example, as shown in FIG. 2, the suppressing unit 146L performs the first suppressing
process by subtracting the output signal of the directivity correcting unit 144L based on the first
delay process from the left input signal. Further, the suppression unit 146R performs a second
suppression process by subtracting the output signal of the directivity correction unit 144R
based on the second delay process from the right input signal.
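The delay, directivity correction, and subtraction steps described above can be sketched as follows. To keep the example short it makes several simplifying assumptions that are not in the source: the delay is a whole number of samples M, both microphones have equal gain, and the test sound is a single click arriving from the right. The names `delay_int` and `suppress` are hypothetical.

```python
def delay_int(x, M):
    """Delay a sequence by M whole samples, zero-padding the start."""
    return [0.0] * M + list(x[:len(x) - M])

def suppress(target, other, M, alpha=1.0):
    """Subtract the delayed, alpha-scaled opposite channel from the target channel."""
    delayed = delay_int(other, M)                  # delay filter
    return [t - alpha * d for t, d in zip(target, delayed)]  # correction + subtraction

# A click from the right reaches the right microphone first and the
# left microphone M samples later.
M = 3
right_in = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
left_in = delay_int(right_in, M)

left_out = suppress(left_in, right_in, M)    # the right-side click is cancelled
right_out = suppress(right_in, left_in, M)   # the click is kept, with an inverted
                                             # copy appearing 2*M samples later
print(left_out)
print(right_out)
```

With α below 1 the cancellation in the left channel becomes partial, which is how the parameter weakens the directivity.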
[0044]
The equivalent filter 148L is a filter that corrects the frequency characteristic of the signal
obtained by the first suppression processing by the suppression unit 146L. Likewise, the
equivalent filter 148R is a filter that corrects the frequency characteristic of the signal
obtained by the second suppression processing by the suppression unit 146R. The equivalent
filter 148L and the equivalent filter 148R may perform correction so as to compensate for the
attenuation in frequency bands that the suppression processing described above suppresses
regardless of direction. For example, for low-band signals with long wavelengths, the phase
difference between the delayed signal and the non-delayed signal is small, so the low band is
attenuated by the suppression processing; the equivalent filter 148L and the equivalent filter
148R may therefore correct the frequency characteristic so as to emphasize the low band. With
this configuration, the change in frequency characteristic caused by the suppression processing
can be reduced. The filter coefficients for this correction may be determined on the basis of
the inter-microphone distance.
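The low-band attenuation mentioned above can be illustrated numerically. The sketch below assumes a sound arriving from directly in front, so that both microphones receive the same signal and the suppression output is the signal minus a copy of itself delayed by τ; the magnitude of that difference at frequency f is 2|sin(π f τ)|, a standard property of delay-and-subtract. The spacing of 2 cm, the speed of sound of 340 m/s, and the function name `delay_subtract_gain` are assumptions for illustration.

```python
import math

def delay_subtract_gain(freq_hz, tau_s):
    """Magnitude of 1 - exp(-j*2*pi*f*tau): a signal minus its tau-delayed copy."""
    return 2.0 * abs(math.sin(math.pi * freq_hz * tau_s))

# Assumed example: microphones 2 cm apart, speed of sound 340 m/s.
tau = 0.02 / 340.0

for f in (100.0, 1000.0, 4000.0):
    gain = delay_subtract_gain(f, tau)
    print(f, "Hz:", round(20.0 * math.log10(gain), 1), "dB")
```

The gain rises with frequency over this range, so a correction filter that boosts the low band counteracts it; and because τ depends on the microphone spacing, so do the correction coefficients.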
[0045]
Here, assuming that the left input signal is xl(n) and the right input signal is xr(n), the
output signal yl(n) of the first arithmetic processing unit 140L and the output signal yr(n) of
the second arithmetic processing unit 140R are expressed by the following equations. In the
following, the parameter α of the directivity correction units 144L and 144R is assumed to be 1.
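[0046]
$$y_l(n) = \{x_l(n) - x_r(n) * p(n)\} * q(n) \quad (3)$$
$$y_r(n) = \{x_r(n) - x_l(n) * p(n)\} * q(n) \quad (4)$$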
[0047]
In equations (3) and (4), * represents a convolution operation, p (n) represents delay filters 142L
and 142R, and q (n) represents equivalent filters 148L and 148R.
[0048]
When the calculations of equations (3) and (4) are realized in fixed-point arithmetic and, for
example, the result inside { } is rounded to a short word length, the subsequent convolution with
the equivalent filter q(n) amplifies the low band of that rounded result, so the low-band S/N
ratio (signal-to-noise ratio) may be degraded.
[0049]
It is also conceivable to store the result inside { } in equations (3) and (4) with a long word
length and to perform the convolution with the equivalent filter q(n) in double precision.
However, the buffer memory needed to store the result becomes large, and the computational cost
of the double-precision operation is also high.
[0050]
Here, using the synthesis filter u(n) = p(n) * q(n) obtained from the delay filter p(n) and the
equivalent filter q(n), the output signal yl(n) of the first arithmetic processing unit 140L and
the output signal yr(n) of the second arithmetic processing unit 140R are expressed by the
following equations.
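[0051]
$$y_l(n) = x_l(n) * q(n) - x_r(n) * u(n) \quad (5)$$
$$y_r(n) = x_r(n) * q(n) - x_l(n) * u(n) \quad (6)$$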
[0052]
For example, when equations (5) and (6) are computed by a DSP capable of fixed-point arithmetic,
the total number of multiply-accumulate operations increases compared with equations (3) and
(4), but the convolution operations no longer need to be chained.
The results of equations (5) and (6) are obtained by subtracting two convolution results held at
long word length in the DSP's accumulator.
Therefore, by computing with equations (5) and (6), the S/N ratio is not degraded, and neither
double-precision storage of intermediate results nor double-precision convolution is needed.
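The equivalence of the two forms discussed above can be checked numerically. The sketch below compares the cascaded form, in which the bracketed difference is convolved with q(n), against the combined form using u(n) = p(n) * q(n), for the left channel with α = 1. It uses floating-point arithmetic, so it only demonstrates that the two forms give the same result, not the fixed-point word-length effects. The filter taps and input values are arbitrary placeholders, and the helpers `conv` and `sub` are hypothetical names.

```python
def conv(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

def sub(a, b):
    """Element-wise a - b, zero-padding the shorter sequence."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

xl = [1.0, 2.0, -1.0, 0.5]       # left input signal (placeholder values)
xr = [0.5, -1.0, 2.0, 1.0]       # right input signal (placeholder values)
p = [0.0, 0.0, 0.75, 0.25]       # linear-interpolation delay filter for D = 2.25
q = [1.0, 0.5, 0.25]             # placeholder correction filter taps

# Cascaded form: {xl - xr * p} * q
cascaded = conv(sub(xl, conv(xr, p)), q)

# Combined form: xl * q - xr * u, with u = p * q computed once in advance
u = conv(p, q)
combined = sub(conv(xl, q), conv(xr, u))

print(max(abs(a - b) for a, b in zip(cascaded, combined)))
```

In the combined form each output sample is a single difference of two multiply-accumulate sums, so no intermediate signal has to be written back to memory between the two filters.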
[0053]
Although the parameter α related to the directivity correction units 144L and 144R has been
described above as 1, it is possible to perform similar arithmetic processing even when the
parameter α is not 1.
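One way to see why the same approach carries over is that multiplying the delayed signal by α is a linear operation, so α can be kept as a scalar on the synthesis-filter term. The following is a sketch under that reading, using the symbols already defined in the text (the inputs x_l and x_r, the filter q, and u = p * q); it is not quoted from the source.

```latex
\begin{align*}
y_l(n) &= \{x_l(n) - \alpha\, x_r(n) * p(n)\} * q(n) = x_l(n) * q(n) - \alpha\, x_r(n) * u(n) \\
y_r(n) &= \{x_r(n) - \alpha\, x_l(n) * p(n)\} * q(n) = x_r(n) * q(n) - \alpha\, x_l(n) * u(n)
\end{align*}
```

Setting α = 1 recovers the case discussed above, and α can equally be folded into the coefficients of u(n) in advance.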
[0054]
The output signal of the first arithmetic processing unit 140L obtained as described above is
the left-channel audio signal of a stereo audio signal, and the output signal of the second
arithmetic processing unit 140R is the right-channel audio signal of the stereo audio signal.
That is, the above processing yields a stereo audio signal combining a left-channel audio signal
having directivity in the left direction and a right-channel audio signal having directivity in
the right direction.
With such a configuration, the stereo audio signal has a higher sense of localization than, for
example, a stereo audio signal formed by simply combining the left input signal and the right
input signal.
[0055]
The encoding unit 150 encodes the combination of the left-channel audio signal and the
right-channel audio signal.
The encoding scheme used by the encoding unit 150 is not limited and may be, for example, an
uncompressed scheme, a lossless compression scheme, or a lossy compression scheme.
[0056]
The storage unit 160 stores data obtained by the encoding of the encoding unit 150.
The storage unit 160 may be realized by, for example, a flash memory, a magnetic disk, an
optical disk, a magneto-optical disk, or the like.
[0057]
The decoding unit 170 decodes the data stored in the storage unit 160.
The decoding by the decoding unit 170 may be performed according to the encoding scheme of
the encoding unit 150.
[0058]
The D/A conversion unit 180L and the D/A conversion unit 180R convert the left-channel audio
signal and the right-channel audio signal output from the decoding unit 170 into a left-channel
analog audio signal and a right-channel analog audio signal, respectively.
[0059]
The speaker 190L and the speaker 190R reproduce (output as sound) the left-channel analog audio
signal and the right-channel analog audio signal output from the D/A conversion unit 180L and
the D/A conversion unit 180R, respectively.
The left-channel analog audio signal and the right-channel analog audio signal output from the
D/A conversion unit 180L and the D/A conversion unit 180R may also be output to external
speakers, earphones, headphones, or the like.
[0060]
<1-3. Operation of First Embodiment>
The configuration example of the recording and
reproduction device 1 according to the first embodiment of the present disclosure has been
described above. Subsequently, with reference to FIG. 4, an operation example of the recording
and reproducing apparatus 1 according to the present embodiment will be described focusing on
operations of the first arithmetic processing unit 140L and the second arithmetic processing unit
140R. FIG. 4 is a flowchart for explaining an operation example of the recording and reproducing
apparatus 1 according to the present embodiment.
[0061]
As shown in FIG. 4, first, pre-processing for generating the left input signal and the right
input signal input to the first arithmetic processing unit 140L and the second arithmetic
processing unit 140R is performed (S102). The pre-processing includes, for example, conversion
from analog audio signals to digital audio signals by the A/D conversion unit 120L and the A/D
conversion unit 120R, and gain correction processing by the gain correction unit 130L and the
gain correction unit 130R.
[0062]
Subsequently, delay processing (first delay processing) of the right input signal is performed
by the delay filter 142L, and delay processing (second delay processing) of the left input
signal is performed by the delay filter 142R (S104). The signals obtained by the above delay
processing are corrected by the directivity correction unit 144L and the directivity correction
unit 144R in order to adjust the directivity (S106).
[0063]
Subsequently, the left input signal is suppressed by the suppression unit 146L (first suppression
processing), and the right input signal is suppressed by the suppression unit 146R (second
suppression processing). The signal obtained by the suppression has its frequency characteristic
corrected by the equivalent filter 148L and the equivalent filter 148R (S110).
[0064]
<1-4. Effects of First Embodiment>
The first embodiment has been described above. According to the present embodiment, the
directivity of the audio signals is emphasized by suppressing each of the left and right audio
signals on the basis of the other audio signal, so that an output signal with a higher sense of
localization can be obtained even if the input signals are audio signals obtained by
nondirectional microphones. Further, according to the present embodiment, the sense of
localization can be adjusted by changing the directivity adjustment parameter α, without
requiring a physical angle adjustment mechanism for the microphones.
[0065]
<<2. Second Embodiment>>
<2-1. Outline of Second Embodiment>
In the first embodiment
described above, an example in which recording and reproduction are performed by the same
device has been described. However, the device for performing recording and the device for
performing reproduction are not necessarily the same. For example, the recording device that
performs recording and the reproduction device that performs reproduction may each be an IC
recorder, for example.
[0066]
For example, when content recorded by one IC recorder (recording device) is to be reproduced by
another IC recorder (reproduction device), a file of the content may be copied to the other IC
recorder (reproduction device) via a network and played back.
[0067]
In such a case, for example, when the reproduction device performs the suppression process
based on the distance between the microphones of the recording device, it is possible to
emphasize the directivity of the audio signal and obtain an output signal with a higher sense of
localization.
Therefore, in the following, as a second embodiment, an example in which a recording device
that performs recording and a reproduction device that performs reproduction are different will
be described.
[0068]
<2-2. Configuration of Second Embodiment>
A recording and reproduction system according to a second embodiment of the present disclosure
will be described with reference to FIG. 5. FIG. 5 is an explanatory view showing a
configuration example of the recording and reproduction system according to the second
embodiment of the present disclosure. As shown in FIG. 5, the recording and reproduction system
2 according to the present embodiment has a recording device 22 and a reproduction device 24.
The recording device 22 and the reproduction device 24 according to the present embodiment have
configurations partly in common with the recording and reproducing apparatus 1 described with
reference to FIG. 2.
[0069]
(Recording Device) The recording device 22 has at least a recording function. As shown in FIG. 5,
the recording device 22 includes a left microphone 221L, a right microphone 221R, A/D
conversion units 223L and 223R, gain correction units 225L and 225R, an encoding unit 227, a
metadata storage unit 229, a multiplexer 231, and a storage unit 233. The configurations of the
left microphone 221L, the right microphone 221R, the A/D conversion units 223L and 223R, the
gain correction units 225L and 225R, the encoding unit 227, and the storage unit 233 are the
same as those of the left microphone 110L, the right microphone 110R, the A/D conversion units
120L and 120R, the gain correction units 130L and 130R, the encoding unit 150, and the storage
unit 160 described with reference to FIG. 2, and thus their description is omitted.
[0070]
Note that the recording device 22 according to the present embodiment performs a process
corresponding to step S102 described with reference to FIG. 4 as a process related to directivity
emphasis.
[0071]
The metadata storage unit 229 stores metadata used when the reproduction device 24 described
later performs suppression processing (directivity enhancement processing).
The metadata stored in the metadata storage unit 229 may include, for example, distance
information related to the distance between the left microphone 221L and the right microphone
221R, or information on a filter coefficient calculated based on the distance between the
microphones. The metadata stored in the metadata storage unit 229 may also include a model
code or the like for identifying the model of the recording device 22. Furthermore, the metadata
stored in the metadata storage unit 229 may include information on the gain difference between
the left microphone 221L and the right microphone 221R.
[0072]
The format of the metadata stored in the metadata storage unit 229 may be a chunk format such
as that used in the Waveform Audio Format, or a structured format such as XML (Extensible
Markup Language).
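As a rough illustration of the XML option, the following Python sketch serializes the kinds of
metadata listed above (inter-microphone distance, filter coefficients, model code, gain
difference) and parses them back. The element names, units, and example values are hypothetical;
the present disclosure does not define a schema.

```python
import xml.etree.ElementTree as ET


def build_metadata_xml(distance_mm, filter_coefficients, model_code, gain_diff_db):
    """Serialize recording metadata to an XML string (hypothetical schema)."""
    root = ET.Element("recording_metadata")
    ET.SubElement(root, "mic_distance_mm").text = repr(float(distance_mm))
    ET.SubElement(root, "model_code").text = model_code
    ET.SubElement(root, "gain_difference_db").text = repr(float(gain_diff_db))
    coeffs = ET.SubElement(root, "filter_coefficients")
    for value in filter_coefficients:
        ET.SubElement(coeffs, "c").text = repr(float(value))
    return ET.tostring(root, encoding="unicode")


def parse_metadata_xml(xml_text):
    """Parse the XML produced by build_metadata_xml() into a dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "mic_distance_mm": float(root.findtext("mic_distance_mm")),
        "model_code": root.findtext("model_code"),
        "gain_difference_db": float(root.findtext("gain_difference_db")),
        "filter_coefficients": [float(c.text) for c in root.find("filter_coefficients")],
    }
```

A reproduction device could call parse_metadata_xml() on the metadata portion of a received file
and use whichever fields are present.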
[0073]
In the following, an example in which the metadata stored in the metadata storage unit 229
includes at least information of filter coefficients used when performing suppression processing
will be described, and other examples will be described later as a supplement.
[0074]
The multiplexer 231 outputs a plurality of input signals as one output signal.
The multiplexer 231 according to the present embodiment outputs the audio signal encoded by
the encoding unit 227 and the metadata stored in the metadata storage unit 229 as one output
signal.
[0075]
The output signal output from the multiplexer 231 is stored in the storage unit 233 as a data file
including audio data and metadata.
FIG. 6 is an explanatory view showing an example of the file format of the data file stored in the
storage unit 233. As shown in FIG. 6, the data file stored in the storage unit 233 includes a
header portion F12 including information such as the file type, a recorded content portion F14
including the recorded audio data, and a metadata portion F16 including the metadata.
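The three-part layout can be sketched in Python as follows. This is only an illustrative
container, not the format of FIG. 6: the four-byte tag, the two 32-bit length fields, and the
function names mux and demux are assumptions made for the example. The header plays the role of
the header portion F12, followed by the audio bytes (F14) and the metadata bytes (F16).

```python
import struct

MAGIC = b"SPDF"  # hypothetical file-type tag placed in the header portion


def mux(audio_bytes, metadata_bytes):
    """Combine encoded audio and metadata into a single byte string."""
    header = MAGIC + struct.pack("<II", len(audio_bytes), len(metadata_bytes))
    return header + audio_bytes + metadata_bytes


def demux(data):
    """Split a byte string produced by mux() back into (audio, metadata)."""
    if data[:4] != MAGIC:
        raise ValueError("unknown file type")
    audio_len, meta_len = struct.unpack_from("<II", data, 4)
    start = 4 + 8  # tag plus two 32-bit length fields
    audio = data[start:start + audio_len]
    metadata = data[start + audio_len:start + audio_len + meta_len]
    if len(audio) != audio_len or len(metadata) != meta_len:
        raise ValueError("truncated file")
    return audio, metadata
```

Storing the lengths in the header lets a reader skip the metadata portion entirely, which is one
way a device that does not use the metadata could still play the audio.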
[0076]
(Reproduction Device) As shown in FIG. 5, the reproduction device 24 is a signal processing
device including a demultiplexer 241, a decoding unit 243, a UI unit 245, switch units 247A to
247D, a first arithmetic processing unit 249L, a second arithmetic processing unit 249R, D/A
conversion units 251L and 251R, and speakers 253L and 253R. The configurations of the decoding
unit 243, the D/A conversion units 251L and 251R, and the speakers 253L and 253R are the same
as those of the decoding unit 170, the D/A conversion units 180L and 180R, and the speakers
190L and 190R described with reference to FIG. 2, and thus the description thereof is omitted.
[0077]
Note that the reproduction device 24 according to the present embodiment performs the
processing corresponding to steps S104 to S110 described with reference to FIG. 4 as the
processing relating to directivity emphasis.
[0078]
The demultiplexer 241 receives from the recording device 22 a signal in which the audio signal
and the metadata stored in the storage unit 233 of the recording device 22 are combined into
one, separates the signal into the audio signal and the metadata, and outputs them.
The demultiplexer 241 provides the audio signal to the decoding unit 243, and the metadata to
the first arithmetic processing unit 249L and the second arithmetic processing unit 249R. As
described above, in the example illustrated in FIG. 5, the metadata includes at least information
of the filter coefficient used when performing the suppression process, and the demultiplexer 241
functions as a filter coefficient acquisition unit that acquires the information of the filter
coefficient.
[0079]
In the example shown in FIG. 5, the recording device 22 and the reproduction device 24 are
directly connected, and a signal is provided from the storage unit 233 of the recording device 22
to the demultiplexer 241 of the reproduction device 24, but the present embodiment is not limited
to such an example. For example, the reproduction device 24 may have a storage unit, and the
demultiplexer 241 may receive a signal from that storage unit after the data has been copied to
it. In addition, the information stored in the storage unit 233 of the recording device 22 may be
provided to the reproduction device 24 via a storage device other than the recording device 22
and the reproduction device 24, or via a network.
[0080]
The UI unit 245 receives an input by the user for selecting whether or not to perform directivity
enhancement processing by the first arithmetic processing unit 249L and the second arithmetic
processing unit 249R. Although the directivity enhancement processing spatially separates the
sounds in the output and thereby makes them easier to hear, some users may prefer the content
as recorded. Therefore, the reproduction device 24 may include the UI unit 245.
[0081]
The UI unit 245 may be realized by various input means. FIG. 7 is an explanatory view showing
an implementation example of the UI unit 245. As shown on the left of FIG. 7, the reproduction
device 24A may include a UI unit 245A, which is a physical switch. In such an example, when the
UI unit 245A detects that the reproduction device 24A has acquired metadata necessary for
directivity enhancement processing, such as a filter coefficient, the UI unit 245A may light up to
prompt the user to make a selection input.
[0082]
Further, as shown on the right of FIG. 7, the reproduction device 24B may include a UI unit 245B
capable of both display and input, such as a touch panel. In such an example, when the UI unit
245B detects that the reproduction device 24B has acquired metadata necessary for directivity
enhancement processing, such as a filter coefficient, the UI unit 245B may, as shown in FIG. 7,
notify the user that directivity enhancement processing is possible and display a prompt for a
selection input.
[0083]
Needless to say, the user may operate the physical switch or the touch panel to make a selection
input even without the explicit notification described above that prompts the user to make a
selection input.
[0084]
Referring back to FIG. 5, the switch units 247A to 247D switch the directivity enhancement
processing by the first arithmetic processing unit 249L and the second arithmetic processing
unit 249R on and off according to the user's input to the UI unit 245.
In the state shown in FIG. 5, the directivity enhancement processing by the first arithmetic
processing unit 249L and the second arithmetic processing unit 249R is on.
[0085]
As shown in FIG. 5, the first arithmetic processing unit 249L includes a delay filter 2491L, a
directivity correction unit 2493L, a suppression unit 2495L, and an equivalent filter 2497L.
Similarly, as shown in FIG. 5, the second arithmetic processing unit 249R includes a delay filter
2491R, a directivity correction unit 2493R, a suppression unit 2495R, and an equivalent filter
2497R. The configurations of the directivity correction units 2493L and 2493R and the
suppression units 2495L and 2495R are the same as those of the directivity correction units 144L
and 144R and the suppression units 146L and 146R described with reference to FIG. 2, and thus
the description thereof is omitted.
[0086]
The delay filters 2491L and 2491R are filters that perform delay processing on input signals,
similarly to the delay filters 142L and 142R described with reference to FIG. 2. In the present
embodiment, since the device for performing recording and the device for performing
reproduction are not the same, the distance between the microphones at the time of recording of
the data reproduced by the reproducing device 24 is not necessarily constant. Similar to the
delay filters 142L and 142R described with reference to FIG. 2, the appropriate filter coefficients
(or the number of delay samples) of the delay filters 2491L and 2491R differ depending on the
distance between the microphones. Therefore, the delay filters 2491L and 2491R according to
the present embodiment receive filter coefficients corresponding to the recording device 22 from
the demultiplexer 241, and perform delay processing based on the filter coefficients.
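To make the dependence on microphone distance concrete, the following Python sketch derives a
delay in samples from an inter-microphone distance and applies a simple delay-and-subtract
suppression. It rests on assumptions that are not stated in this section: the delay is taken as
the acoustic travel time between the microphones (distance divided by an assumed speed of sound
of 343 m/s), and the suppression is a plain subtraction of the delayed opposite channel. The
directivity correction units and the equivalent filters are omitted, and all function names are
hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 degrees C


def delay_samples(mic_distance_m, sample_rate_hz):
    """Number of samples sound needs to travel the inter-microphone distance."""
    return round(mic_distance_m / SPEED_OF_SOUND_M_S * sample_rate_hz)


def delay(signal, n):
    """Delay a sequence by n samples, padding the start with zeros."""
    if n <= 0:
        return list(signal)
    return ([0.0] * n + list(signal))[:len(signal)]


def suppress(target, other, n):
    """Subtract the other channel, delayed by n samples, from the target channel."""
    return [t - o for t, o in zip(target, delay(other, n))]
```

Under these assumptions, a 20 mm spacing at 48 kHz corresponds to about 3 samples. A sound that
reaches the right microphone first and the left microphone 3 samples later is cancelled in the
left output by suppress(left, right, 3), which is the sense in which the appropriate delay
follows from the recording device's microphone spacing.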
[0087]
Similar to the equivalent filters 148L and 148R described with reference to FIG. 2, the equivalent
filters 2497L and 2497R are filters that correct the frequency characteristics of the signal
obtained by the suppression process. Similar to the equivalent filters 148L and 148R described
with reference to FIG. 2, the appropriate filter coefficients of the equivalent filters 2497L and
2497R differ depending on the distance between the microphones. Therefore, the equivalent
filters 2497L and 2497R according to the present embodiment receive filter coefficients
corresponding to the recording device 22 from the demultiplexer 241, and perform correction
processing based on the filter coefficients.
[0088]
<2−3. Effects of Second Embodiment> The second embodiment has been described above.
According to the present embodiment, the metadata based on the distance between the
microphones at the time of recording is provided to the device that performs reproduction, so
that an output signal with a higher sense of localization can be obtained even when the device
that performs recording and the device that performs reproduction are different devices.
[0089]
<2−4. Supplement of Second Embodiment> In the above, an example was described in
which the metadata stored in the metadata storage unit 229 of the recording device 22 includes
at least information of the filter coefficients used when performing suppression processing, but
the present embodiment is not limited to such an example.
[0090]
For example, the metadata may be a model code for identifying the model of the recording device
22. In such a case, for example, the reproduction device 24 may use the model code to determine
whether the recording device 22 and the reproduction device 24 are the same model, and may
perform directivity enhancement processing only when they are the same model.
[0091]
Also, the metadata may be distance information related to the distance between the microphones.
In such a case, the demultiplexer 241 of the reproduction device 24 functions as a distance
information acquisition unit that acquires the distance information. In such a case, for example,
the reproduction device 24 may further include a storage unit that stores a plurality of filter
coefficients, and a filter coefficient selection unit that selects, from among the plurality of
filter coefficients stored in the storage unit, the filter coefficient corresponding to the distance
information acquired by the demultiplexer 241. Alternatively, in such a case, the reproduction
device 24 may further include a filter coefficient identification unit that identifies the filter
coefficient based on the distance information acquired by the demultiplexer 241, so that the
filter is generated dynamically at the time of reproduction.
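A filter coefficient selection unit of this kind could be sketched in Python as a lookup in a
stored table. The table contents, the key names, and the nearest-distance matching rule below
are assumptions made for illustration; the present disclosure only states that the coefficient
corresponding to the distance information is selected.

```python
# Hypothetical stored table: inter-microphone distance in millimetres mapped to
# a filter setting. The values are placeholders, not data from the disclosure.
FILTER_TABLE = {
    10.0: {"delay_samples": 1, "eq_coefficients": [1.0, -0.9]},
    20.0: {"delay_samples": 3, "eq_coefficients": [1.0, -0.8]},
    40.0: {"delay_samples": 6, "eq_coefficients": [1.0, -0.6]},
}


def select_filter(distance_mm, table=FILTER_TABLE):
    """Return the stored setting whose distance is closest to distance_mm."""
    nearest = min(table, key=lambda stored: abs(stored - distance_mm))
    return table[nearest]
```

The dynamic alternative mentioned above would replace the table lookup with a function that
computes the setting directly from the distance.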
[0092]
The metadata may also include information on the gain difference between the left microphone
221L and the right microphone 221R. In such a case, for example, instead of the recording
device 22 including the gain correction units 225L and 225R, the reproduction device 24 may
include a gain correction unit, and the gain correction unit of the reproduction device 24 may
perform gain correction based on the information on the gain difference.
[0093]
<<3. Third Embodiment >> In the first embodiment and the second embodiment described
above, an example in which the sound acquired by the microphone is reproduced after being
stored in the storage unit has been described. On the other hand, in the following, as a third
embodiment, an example in which sound acquired by a microphone is reproduced in real time
will be described.
[0094]
<3−1. Overview of Third Embodiment> An overview of a third embodiment of the present
disclosure will be described with reference to FIG. 8. FIG. 8 is an explanatory view showing an
overview of a broadcast system according to the third embodiment of the present disclosure. As
shown in FIG. 8, the broadcast system 3 according to the present embodiment includes a
transmission system 32 (broadcast station), compatible receiving devices 34A and 34B, and
non-compatible receiving devices 36A and 36B.
[0095]
The transmission system 32 is a system capable of simultaneously transmitting audio and other
data, as in teletext broadcasting. For example, the transmission system 32 acquires the first
audio signal and the second audio signal with a stereo microphone, and transmits (broadcasts)
information including the first audio signal, the second audio signal, and metadata to the
compatible receiving devices 34A and 34B and the non-compatible receiving devices 36A and 36B.
The metadata according to the present embodiment may include the same information as the
metadata described through several examples in the second embodiment, and may further include
metadata related to broadcasting (text information and the like).
[0096]
The compatible receiving devices 34A and 34B are signal processing devices that support
suppression processing (directivity enhancement processing) using metadata, and can perform
the suppression processing when metadata for directivity enhancement processing is received.
The non-compatible receiving devices 36A and 36B are devices that do not support the
suppression processing using metadata; they ignore the metadata for directivity enhancement
processing and process only the audio signal.
[0097]
With such a configuration, even when the sound acquired by the microphones is reproduced in
real time, an output signal with a higher sense of localization can be obtained as long as the
receiving device is compatible with the directivity enhancement processing.
[0098]
<3−2. Configuration of Third Embodiment> The outline of the broadcast system 3
according to the present embodiment has been described above. Subsequently, configuration
examples of the transmission system 32, the compatible receiving device 34, and the
non-compatible receiving device 36 included in the broadcast system 3 according to the present
embodiment will be sequentially described in detail with reference to FIGS. 9 to 12.
[0099]
(Transmission System) FIG. 9 is an explanatory view showing a configuration example of the
transmission system 32 according to the present embodiment. As shown in FIG. 9, the
transmission system 32 includes a left microphone 321L, a right microphone 321R, A/D
conversion units 323L and 323R, gain correction units 325L and 325R, an encoding unit 327, an
acquisition unit 329, and a transmission unit 331. The configurations of the left microphone
321L, the right microphone 321R, the A/D conversion units 323L and 323R, the gain correction
units 325L and 325R, and the encoding unit 327 are the same as those of the left microphone
110L, the right microphone 110R, the A/D conversion units 120L and 120R, the gain correction
units 130L and 130R, and the encoding unit 150 described with reference to FIG. 2, and thus the
description thereof is omitted.
[0100]
Note that the transmission system 32 according to the present embodiment performs the process
corresponding to step S102 described with reference to FIG. 4 as the process related to
directivity enhancement.
[0101]
The acquisition unit 329 acquires metadata such as the distance between the left microphone
321L and the right microphone 321R, or a filter coefficient based on the distance between the
microphones.
The acquisition unit 329 can acquire metadata in various ways.
[0102]
FIG. 10 is an explanatory view showing a configuration example of the acquisition unit 329. As
shown in FIG. 10, the acquisition unit 329 is a jig that connects the left microphone 321L and
the right microphone 321R and fixes the distance between the microphones. Further, as shown in
FIG. 10, the acquisition unit 329 may identify the inter-microphone distance and output it as
metadata. Note that the acquisition unit 329 shown in FIG. 10 may hold the inter-microphone
distance constant and output the constant inter-microphone distance stored in the acquisition
unit 329, or may be extendable (so that the inter-microphone distance can be changed) and
output the current inter-microphone distance.
[0103]
The acquisition unit 329 may be a sensor attached to the left microphone 321L and the right
microphone 321R to measure and output the distance between the microphones.
[0104]
For example, in audio recording for live television broadcasting or the like, if a stereo
microphone is mounted on each camera, the distance between the microphones is not uniform
owing to camera size and other factors, and the distance between the microphones may change
each time the camera is switched.
It is also conceivable that the distance between the microphones of the same stereo microphone
is changed in real time. With the configuration of the acquisition unit 329 described above,
metadata such as the distance between the microphones acquired in real time can be transmitted
even when switching to a stereo microphone with a different distance between the microphones
or when the distance between the microphones is changed in real time.
[0105]
Note that the process of the acquisition unit 329 described above may be included in the process
of step S102 described with reference to FIG. 4. Needless to say, the distance between the
microphones may instead be specified manually: each time the distance between the microphones
is changed, the user performing the recording checks the distance and enters the information on
the distance between the microphones.
[0106]
The transmission unit 331 illustrated in FIG. 9 collectively transmits (for example, by
multiplexing) the audio signal provided from the encoding unit 327 and the metadata provided
from the acquisition unit 329.
[0107]
(Compatible Receiving Device) FIG. 11 is an explanatory view showing a configuration example
of the compatible receiving device 34.
As shown in FIG. 11, the compatible receiving device 34 is a signal processing device including
a receiving unit 341, a decoding unit 343, a metadata parser 345, switch units 347A to 347D, a
first arithmetic processing unit 349L, a second arithmetic processing unit 349R, and D/A
conversion units 351L and 351R. The configuration of the D/A conversion units 351L and 351R is
the same as that of the D/A conversion units 180L and 180R described with reference to FIG. 2.
The configuration of the switch units 347A to 347D is the same as that of the switch units 247A
to 247D described with reference to FIG. 5.
[0108]
Note that the compatible receiving device 34 according to the present embodiment performs the
processing corresponding to steps S104 to S110 described with reference to FIG. 4 as the
processing relating to directivity emphasis.
[0109]
The receiving unit 341 receives, from the transmission system 32, information including the
first audio signal based on the left microphone 321L of the transmission system 32, the second
audio signal based on the right microphone 321R of the transmission system 32, and metadata.
[0110]
The decoding unit 343 decodes the first audio signal and the second audio signal from the
information received by the receiving unit 341.
Also, the decoding unit 343 extracts metadata from the information received by the receiving
unit 341, and provides the metadata to the metadata parser 345.
[0111]
The metadata parser 345 analyzes the metadata received from the decoding unit 343 and
switches the switch units 347A to 347D according to the metadata.
For example, when the metadata parser 345 determines that the metadata includes distance
information related to the inter-microphone distance or information of a filter coefficient, the
metadata parser 345 may switch the switch units 347A to 347D so that directivity enhancement
processing including the first suppression processing and the second suppression processing is
performed.
[0112]
According to such a configuration, directivity enhancement processing is performed
automatically whenever it is possible, and an output with a higher sense of localization can be
obtained.
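The switching decision can be sketched in Python as follows. The dictionary keys and the
enhance callback are hypothetical stand-ins for the parsed metadata and for the first and second
arithmetic processing units; the sketch only shows the control flow of enabling the processing
when usable metadata is present and passing the audio through otherwise.

```python
def directivity_enhancement_enabled(metadata):
    """Return True when the metadata carries distance or filter information."""
    return (
        metadata.get("mic_distance_mm") is not None
        or metadata.get("filter_coefficients") is not None
    )


def process_received(left, right, metadata, enhance):
    """Apply enhance() when the metadata allows it, else pass the audio through."""
    if directivity_enhancement_enabled(metadata):
        return enhance(left, right, metadata)
    return left, right
```

A receiver that never inspects the metadata behaves like the pass-through branch, which matches
the behavior described for devices that do not support the suppression processing.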
[0113]
Further, when the metadata includes distance information related to the distance between the
microphones or information of a filter coefficient, the metadata parser 345 provides that
information to the first arithmetic processing unit 349L and the second arithmetic processing
unit 349R.
[0114]
As shown in FIG. 11, the first arithmetic processing unit 349L includes a delay filter 3491L, a
directivity correction unit 3493L, a suppression unit 3495L, and an equivalent filter 3497L.
Similarly, as shown in FIG. 11, the second arithmetic processing unit 349R includes a delay filter
3491R, a directivity correction unit 3493R, a suppression unit 3495R, and an equivalent filter
3497R.
The configurations of the first arithmetic processing unit 349L and the second arithmetic
processing unit 349R are the same as those of the first arithmetic processing unit 249L and the
second arithmetic processing unit 249R described with reference to FIG. 5, and thus the
description thereof is omitted.
[0115]
The stereo audio signals (left output and right output) output from the D/A conversion units
351L and 351R may be reproduced by an external speaker, headphones, or the like.
[0116]
(Non-compatible Receiving Device) FIG. 12 is an explanatory view showing a configuration
example of the non-compatible receiving device 36.
As shown in FIG. 12, the non-compatible receiving device 36 is a signal processing device
including a receiving unit 361, a decoding unit 363, and D/A conversion units 365L and 365R.
The configurations of the receiving unit 361 and the D/A conversion units 365L and 365R are the
same as those of the receiving unit 341 and the D/A conversion units 351L and 351R described
with reference to FIG. 11.