Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2016082444
Abstract: PROBLEM TO BE SOLVED: To avoid degradation of the quality of the output voice caused by the way a microphone is held. SOLUTION: A voice processing apparatus 10 acquires, from a microphone 20, voice information, which is information indicating the voice of a user, and holding method information, which is information indicating how the user holds the microphone 20. The holding manner determination unit 14 of the voice processing device 10 determines from the holding method information whether the user holds the microphone in an appropriate or an inappropriate manner. The correction unit 16 of the voice processing device 10 corrects the voice information in accordance with a holding manner determination result indicating that the microphone is held improperly. Then, the voice processing device 10 mixes the corrected voice information with information indicating the karaoke accompaniment and outputs the result. [Selected figure] Figure 1
Speech processing device and microphone
[0001]
The present invention relates to a voice processing device and a microphone connected to the
voice processing device.
[0002]
A karaoke apparatus is one example of a voice processing apparatus that processes and outputs information indicating voice acquired from a microphone (hereinafter referred to as voice information).
In general, a karaoke apparatus acquires voice information of a user's (singer's) singing from a
microphone, mixes the voice information with information indicating accompaniment of a
karaoke song (hereinafter referred to as accompaniment information), and outputs the mixed
information.
[0003]
Japanese Unexamined Patent Publication No. 5-181407
[0004]
The user of this karaoke apparatus usually sings toward the microphone with the microphone
connected to the karaoke apparatus.
If the user holds the microphone in an appropriate manner and sings, the microphone can
properly collect the user's voice.
[0005]
However, if the user holds the microphone in an inappropriate manner and sings, the
microphone may not be able to properly pick up the user's voice. As a result, there is a possibility
that the quality of the user's voice outputted from the karaoke apparatus may be lowered due to
the improper holding of the microphone.
[0006]
The present invention has been made in view of the above circumstances, and an object of the
present invention is to provide a technical means for preventing the quality of the output voice
from being lowered due to the way the microphone is held.
[0007]
The present invention provides a voice processing apparatus comprising: acquiring means for acquiring holding method information indicating how a microphone is held; and correcting means for correcting, based on the holding method information acquired by the acquiring means, voice information indicating sound acquired from the microphone.
[0008]
According to the present invention, voice corrected based on the holding method information indicating how the microphone is held is output. For this reason, degradation of the quality of the output voice caused by the way the microphone is held can be avoided.
[0009]
Patent Document 1 discloses an automatic performance apparatus with a singing-ability scoring function that scores a singer's singing ability by comparing the distance between the microphone and the singer's mouth with the loudness value of melody data. However, the technology of Patent Document 1 has no concept of correcting the voice information acquired from the microphone. The present voice processing apparatus is therefore completely different from the apparatus disclosed in Patent Document 1.
[0010]
FIG. 1 is a block diagram showing the configuration of a voice processing system 1 including a voice processing apparatus 10 according to a first embodiment of the present invention. FIG. 2 is a diagram illustrating how a user holds the microphone 20. FIG. 3 is a diagram showing the directivity dependency of the frequency characteristics of the microphone 20. FIG. 4 is a block diagram showing the configuration of a voice processing system 1A including a voice processing apparatus 10A according to a second embodiment of the present invention. FIG. 5 is a block diagram showing the configuration of a voice processing system 1B including a voice processing apparatus 10B according to a third embodiment of the present invention. FIG. 6 is a block diagram showing the configuration of a voice processing system 1C including a voice processing apparatus 10C according to a fourth embodiment of the present invention.
[0011]
Hereinafter, embodiments of the present invention will be described with reference to the
drawings. First Embodiment FIG. 1 is a block diagram showing the configuration of a voice
processing system 1 including a voice processing apparatus 10 according to a first embodiment
of the present invention. As shown in FIG. 1, the audio processing system 1 includes an audio
processing device 10, a microphone 20 and a karaoke source 30. The voice processing device 10
according to the present embodiment is specifically a karaoke device. The user (singer) of the
voice processing apparatus 10 selects a karaoke song, holds the microphone 20 by hand, and
sings the selected song toward the microphone 20. The voice processing device 10 acquires from
the microphone 20 information (voice information) indicating the voice of the user and
information indicating how to hold the microphone 20 (hereinafter referred to as holding
information). The voice processing device 10 performs a process of correcting the obtained voice
information based on the obtained holding method information. Specifically, the voice processing device 10 corrects the voice information acquired while the user sings holding the microphone 20 in an inappropriate manner into the voice information that would have been acquired had the user sung while holding the microphone 20 in an appropriate manner. Then, the voice processing device 10 mixes the corrected voice information with the accompaniment information of the selected song acquired from the karaoke source 30 and outputs the result. As a result, the user's voice, corrected to the quality it would have had if the microphone 20 had been held in the proper manner, is emitted together with the karaoke accompaniment. The above is the outline of the voice processing system 1 including the voice processing device 10. The configuration of each element will be
described in detail below.
[0012]
The microphone 20 is a handheld vocal microphone held by the user. The microphone 20 includes a substantially cylindrical main body portion 28 and a grille portion 27 formed of a net-like member shaped into a substantially spherical shell. The grille portion 27 is connected to one end of the main body portion 28.
[0013]
The microphone 20 has a voice detection unit 22, a holding manner detection unit 24 and an
output unit 26. The voice detection unit 22 is a device that detects the voice of the user. The
voice detection unit 22 is accommodated in the grille 27. The voice detection unit 22 delivers the
voice information that is the detection result to the output unit 26. Also, the microphone 20 has
unidirectionality: it picks up, with the highest sensitivity, sound arriving from the direction in which the main body 28 extends toward the grille portion 27. In this specification, the line along the direction of highest sound collection sensitivity is called the sound collection axis.
[0014]
The holding manner detection unit 24 is a device that detects how the user holds the microphone 20. Specifically, the holding manner detection unit 24 detects whether the user is holding the microphone 20 properly or improperly. FIG. 2A is a diagram illustrating a proper way for the user to hold the microphone 20. As shown in FIG. 2A, holding the main body 28 of the microphone 20 is the proper way of holding. FIG. 2B is a diagram illustrating an improper way for the user to hold the microphone 20. As shown in FIG. 2B, holding the grille portion 27 of the microphone 20 is an improper way of holding. In order to detect that the user is holding the microphone 20 in an inappropriate manner, the holding manner detection unit 24 of the present embodiment is provided in the grille portion 27 of the microphone 20. Specifically, the holding manner detection unit 24 includes a plurality of piezoelectric elements, and the piezoelectric elements are distributed over the surface of the grille portion 27. When the user grips the grille portion 27 of the microphone 20, the piezoelectric elements of the holding manner detection unit 24 detect the user's grip strength and convert it into a voltage. For example, as shown in FIG. 2B, when the user grips the portion of the grille portion 27 extending from its connection with the main body portion 28 to the center of the grille portion 27 (hereinafter referred to as the lower half of the grille portion 27), the voltage of each piezoelectric element arranged in the lower half of the grille portion 27 increases. The holding manner detection unit 24 then delivers the voltage of each piezoelectric element to the output unit 26 as holding method information.
[0015]
The output unit 26 of the microphone 20 is a device that outputs the voice information delivered
from the voice detection unit 22 and the holding method information delivered from the holding
detection unit 24 to the voice processing device 10. The output unit 26 is built in the main body
unit 28. The output unit 26 is connected to the audio processing device 10 by a microphone
cable or the like.
[0016]
The karaoke source 30 is a device including storage means such as a laser disk (registered
trademark) or a hard disk. The karaoke source 30 stores accompaniment information of a
plurality of karaoke songs. The karaoke source 30 is connected to the voice processing device
10. The karaoke source 30 outputs the accompaniment information of the karaoke song selected
by the user to the voice processing device 10. The karaoke source 30 may be connected to the
voice processing device 10 via a network or the like.
[0017]
Next, the voice processing device 10 will be described. The audio processing device 10 includes
an acquisition unit 12, a karaoke information acquisition unit 13, a holding manner
determination unit 14, a correction unit 16, a mixdown unit 17, and an audio output unit 18.
[0018]
The acquisition unit 12 is a device connected to the output unit 26 of the microphone 20 via a
microphone cable or the like. The acquisition unit 12 is means for acquiring voice information
and holding information from the microphone 20. The acquisition unit 12 delivers the acquired
voice information to the correction unit 16 and delivers the acquired holding information to the
holding manner determination unit 14.
[0019]
The holding manner determination unit 14 is a device that determines how the user holds the microphone 20 based on the delivered holding method information. Specifically, when the voltage of any one of the piezoelectric elements of the holding manner detection unit 24 becomes high, the holding manner determination unit 14 determines that the user is holding the grille portion 27, that is, holding the microphone 20 in an inappropriate manner. On the other hand, when the voltage of none of the piezoelectric elements of the holding manner detection unit 24 increases, the holding manner determination unit 14 determines that the user is not holding the grille portion 27, that is, holding the microphone 20 properly. The holding manner determination unit 14 passes to the correction unit 16 the holding manner determination result indicating whether the microphone 20 is held in an appropriate or an inappropriate manner.
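As a concrete illustration of this determination, the following is a minimal sketch in Python, assuming the holding method information arrives as one voltage reading per piezoelectric element on the grille portion; the threshold value is a hypothetical parameter and is not specified in the source.

```python
from typing import Sequence

# Hypothetical threshold (volts) above which a piezoelectric element is
# considered to be gripped; the source does not specify a concrete value.
GRIP_VOLTAGE_THRESHOLD = 0.5

def is_held_improperly(piezo_voltages: Sequence[float],
                       threshold: float = GRIP_VOLTAGE_THRESHOLD) -> bool:
    """Return True if any element on the grille reports a high voltage,
    i.e. the user is gripping the grille portion (improper holding)."""
    return any(v > threshold for v in piezo_voltages)

# Example: the two elements on the lower half of the grille are gripped.
print(is_held_improperly([0.8, 0.7, 0.1, 0.0]))  # True -> improper holding
print(is_held_improperly([0.0, 0.1, 0.0, 0.0]))  # False -> proper holding
```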
[0020]
The voice information is delivered to the correction unit 16 from the acquisition unit 12, and the holding manner determination result is delivered from the holding manner determination unit 14. The correction unit 16 is a device that corrects the voice information according to the holding manner determination result. Specifically, when the holding manner determination unit 14 determines that the holding manner is appropriate (in other words, that the grille portion 27 is not held), the correction unit 16 passes the voice information delivered from the acquisition unit 12 to the mixdown unit 17 as it is. On the other hand, when the holding manner determination unit 14 determines that the holding manner is inappropriate (in other words, that the grille portion 27 is held), the correction unit 16 corrects the voice information delivered from the acquisition unit 12 with a predetermined filter described below and then delivers it to the mixdown unit 17.
[0021]
The correction in the correction unit 16 will now be described in detail. The correction in the correction unit 16 is performed based on the directivity dependency of the frequency characteristics of the microphone 20. FIG. 3 is a diagram showing the directivity dependency of the frequency characteristics of the microphone 20. The angle in the circumferential direction in FIG. 3 indicates the angle from the sound collection axis of the microphone 20 (the sound collection angle), and the radius in FIG. 3 indicates the sound collection sensitivity. FIG. 3 shows the sound collection sensitivity when the holding manner is appropriate. In FIG. 3, curve A1 shows the sound collection sensitivity for 125 Hz sound, A2 for 500 Hz, A3 for 1000 Hz, A4 for 2000 Hz, A5 for 4000 Hz, and A6 for 8000 Hz. As shown in FIG. 3, since the microphone 20 is unidirectional, the sound collection sensitivity differs depending on the sound collection angle. Furthermore, the sound collection sensitivity of the microphone 20 at each sound collection angle differs depending on the frequency of the sound being collected.
[0022]
When the user holds the grille portion 27 of the microphone 20, the voice detection unit 22 is covered by the hand, so the sound collection sensitivity of the microphone 20 decreases. In other words, the shape of the frequency characteristics shown in FIG. 3 is deformed and its area shrinks. The degree to which the sound collection sensitivity decreases depends on the frequency. Specifically, the decrease in sound collection sensitivity at high frequencies is greater than the decrease at low frequencies. For this reason, the sound collected by the microphone 20 while the grille portion 27 is held becomes a sound in which the sound pressure level in the high range is suppressed, that is, a so-called "muffled" sound.
[0023]
The filter of the correction unit 16 is a filter having the inverse characteristic of the frequency characteristics of the microphone 20 in the state where the microphone 20 is held improperly. Specifically, the filter of the correction unit 16 has the inverse characteristic of the frequency characteristics of the microphone 20 whose sound collection sensitivity is lowered because the grille portion 27 is held. More specifically, the filter of the correction unit 16 is, for example, a filter that passes components of the sound signal above a cutoff frequency and reduces the sound pressure level of components below the cutoff frequency. The cutoff frequency of this filter is set to the frequency at which the sound collection sensitivity in the directivity dependency of the frequency characteristics of the microphone 20 begins to change significantly. Furthermore, the degree by which this filter reduces the sound pressure level in the low range is set to roughly the same degree by which the sound collection sensitivity in the high range decreases in the frequency characteristics of the microphone 20 in the improperly held state. The frequency characteristics of the microphone 20 in the improperly held state may be determined in advance by simulation, experiment, or the like.
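One way such a filter could be realized is as a low-shelf biquad that attenuates the band below the cutoff frequency, mirroring the high-band sensitivity loss. The sketch below is a minimal example using the standard Audio EQ Cookbook low-shelf coefficients; the sampling rate, cutoff frequency, and attenuation values are chosen only for illustration and are not taken from the source.

```python
import numpy as np
from scipy.signal import lfilter

def low_shelf_coeffs(fs, cutoff_hz, gain_db):
    """Biquad low-shelf coefficients (Audio EQ Cookbook, shelf slope S = 1).
    A negative gain_db attenuates the band below cutoff_hz, which mirrors
    the high-range loss caused by gripping the grille."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * cutoff_hz / fs
    alpha = np.sin(w0) / 2.0 * np.sqrt(2.0)          # S = 1
    cosw = np.cos(w0)
    b0 = A * ((A + 1) - (A - 1) * cosw + 2 * np.sqrt(A) * alpha)
    b1 = 2 * A * ((A - 1) - (A + 1) * cosw)
    b2 = A * ((A + 1) - (A - 1) * cosw - 2 * np.sqrt(A) * alpha)
    a0 = (A + 1) + (A - 1) * cosw + 2 * np.sqrt(A) * alpha
    a1 = -2 * ((A - 1) + (A + 1) * cosw)
    a2 = (A + 1) + (A - 1) * cosw - 2 * np.sqrt(A) * alpha
    return np.array([b0, b1, b2]) / a0, np.array([1.0, a1 / a0, a2 / a0])

# Hypothetical settings: 2 kHz cutoff, 6 dB low-band reduction.
fs = 48000
b, a = low_shelf_coeffs(fs, cutoff_hz=2000.0, gain_db=-6.0)

# Apply to a block of voice samples (white noise stands in for singing).
voice = np.random.randn(fs)            # 1 second of audio
corrected = lfilter(b, a, voice)
```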
[0024]
The correction unit 16 amplifies the voice information after it has passed through the filter and outputs the result. Specifically, the correction unit 16 amplifies the sound pressure level of the filtered voice information by an amount corresponding to the decrease in sound collection sensitivity caused by the user holding the grille portion 27. In this manner, the correction unit 16 generates the voice information that would have been acquired by the microphone 20 with the frequency characteristics it has when the user holds the microphone 20 in an appropriate manner (that is, when the user is not holding the grille portion 27). The correction unit 16 also corrects the voice information in real time. For example, when the correction unit 16 receives, in the middle of a song, a holding manner determination result indicating that the holding manner is improper, it passes the voice information through the filter from the moment it receives that result; when it receives a result indicating that the holding manner is proper, it outputs the voice information to the mixdown unit 17 as it is from the moment it receives that result. Hereinafter, the voice information output from the correction unit 16 will be referred to as corrected voice information.
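A sketch of this per-block behavior, reusing the filter coefficients from the previous sketch; the makeup gain value is a hypothetical stand-in for the measured sensitivity loss, and filter state handling across blocks is omitted for brevity.

```python
import numpy as np
from scipy.signal import lfilter

MAKEUP_GAIN_DB = 4.0  # hypothetical compensation for the sensitivity drop

def correct_block(voice_block: np.ndarray, held_improperly: bool,
                  b: np.ndarray, a: np.ndarray) -> np.ndarray:
    """Filter and amplify the block only while the holding manner is judged
    improper; otherwise forward the block to the mixdown stage unchanged."""
    if not held_improperly:
        return voice_block
    filtered = lfilter(b, a, voice_block)      # inverse-characteristic filter
    return filtered * (10.0 ** (MAKEUP_GAIN_DB / 20.0))
```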
[0025]
The karaoke information acquisition unit 13 of the voice processing device 10 shown in FIG. 1 is
connected to the karaoke source 30. The karaoke information acquisition unit 13 is a device for
acquiring from the karaoke source 30 accompaniment information of the karaoke song selected
by the user. The karaoke information acquisition unit 13 delivers the acquired accompaniment
information to the mixdown unit 17.
[0026]
The accompaniment information and the corrected audio information are delivered to the
mixdown unit 17. The mix down unit 17 is a device that mixes (mixes down) accompaniment
information with corrected audio information. The mix-down unit 17 delivers information
indicating the mixed sound (hereinafter referred to as mixed sound information) to the audio
output unit 18.
[0027]
The audio output unit 18 is a device that outputs mixed sound information delivered from the
mix down unit 17. For example, a speaker is connected to the audio output unit 18 (not shown).
The audio output unit 18 amplifies the mixed sound information and causes the speaker to emit
a sound according to the mixed sound information. The above is the configuration of the audio
processing system 1 including the audio processing device 10.
[0028]
Next, the operation and usage of the speech processing device 10 will be described. First, the
user operates a remote controller (not shown) of the voice processing apparatus 10 or the like to
select a karaoke song. When the karaoke song is selected, the karaoke information acquisition
unit 13 of the voice processing device 10 acquires the accompaniment information of the
selected karaoke song from the karaoke source 30. Then, the karaoke information acquisition
unit 13 delivers the accompaniment information to the mixdown unit 17 sequentially from the
beginning of the song. When the voice information and the holding method information are not
obtained, the mixdown unit 17 outputs the accompaniment information to the voice output unit
18 as it is. Thereby, the accompaniment of the karaoke song is emitted in order from the
beginning of the song.
[0029]
Next, the user holds the microphone 20 and sings along with the accompaniment output from
the voice processing device 10. At this time, the voice detection unit 22 of the microphone 20
detects the user's voice (song sound), and the holding manner detection unit 24 of the
microphone 20 detects whether the user is holding the grille 27 or not. The voice information
and the holding information, which are the detection results, are sequentially delivered to the
voice processing apparatus 10 through the output unit 26. The acquisition unit 12 of the speech
processing apparatus 10 sequentially acquires speech information and holding information sent
from the microphone 20. The holding method determination unit 14 of the voice processing
device 10 determines whether the holding method of the microphone 20 is appropriate or not
from the acquired holding method information, and sends the holding method determination
result to the correction unit 16. The correction unit 16 corrects the voice information acquired
from the microphone 20 according to the holding method determination result, and delivers the
corrected voice information to the mix-down unit 17. The mix-down unit 17 mixes the
accompaniment information and the corrected audio information one by one each time the
acquisition unit 12 acquires the audio information and the holding information. Then, through
the voice output unit 18, a sound including the accompaniment and the corrected singing sound
is emitted from the speaker.
[0030]
As described above, in the voice processing apparatus 10 according to the present embodiment, the information indicating the user's voice acquired from the microphone 20 is corrected based on the information indicating how the user holds the microphone 20. Specifically, in the voice processing apparatus 10, the voice information acquired when the user sings while holding the grille portion 27 of the microphone 20 is corrected into the voice information that would be acquired when the user sings without holding the grille portion 27. In other words, the voice information acquired when the user sings while holding the microphone 20 in an inappropriate manner is corrected into the voice information that would be acquired when the user sings while holding the microphone 20 in an appropriate manner. Therefore, the voice processing apparatus 10 according to the present embodiment can avoid the degradation of the quality of the output voice (singing sound) caused by the microphone 20 not being held properly.
[0031]
Second Embodiment FIG. 4 is a block diagram showing the configuration of a voice processing system 1A including a voice processing apparatus 10A according to a second embodiment of the present invention. As shown in FIG. 4, the voice processing system 1A differs from the voice processing system 1 according to the first embodiment in that the voice processing device 10 is replaced with a voice processing device 10A and the microphone 20 is replaced with a microphone 20A. In the voice processing system 1 according to the first embodiment, the microphone 20 directly detects how it is held. In contrast, the voice processing system 1A according to the second embodiment indirectly detects how the microphone 20A is held.
[0032]
The microphone 20A differs from the microphone 20 in that it does not have a holding method
detection unit. Since the microphone 20A does not have the holding method detection unit, the
output unit 26 of the microphone 20A outputs only the audio information that is the detection
result of the audio detection unit 22 to the audio processing device 10A.
[0033]
The voice processing apparatus 10A differs from the voice processing apparatus 10 in that it includes a voice information acquisition unit 12A and a holding method information acquisition unit 11A in place of the acquisition unit 12, and a holding manner determination unit 14A in place of the holding manner determination unit 14. The voice information acquisition unit 12A is a device that acquires voice information from the microphone 20A. The voice information acquisition unit 12A delivers the acquired voice information to the correction unit 16.
[0034]
The holding method information acquisition unit 11A is a device that acquires holding method information by image processing. This will be described in more detail. One or more cameras (not shown) are connected to the holding method information acquisition unit 11A. The camera is directed, for example, toward the area in front of a display on which the lyrics and the like are shown. The camera captures an image of the microphone 20A and the user's hand. The holding method information acquisition unit 11A then processes the image captured by the camera and acquires, as holding method information, information indicating the contour of the microphone 20A and information indicating the contour of the hand holding the microphone 20A. The holding method information acquisition unit 11A delivers the acquired holding method information to the holding manner determination unit 14A.
[0035]
The holding manner determination unit 14A is a device that determines how the user holds the microphone 20A based on the holding method information delivered from the holding method information acquisition unit 11A. Specifically, the holding manner determination unit 14A estimates the grille portion 27 within the image of the microphone 20A from the contour of the microphone 20A, and determines whether or not the contour of the user's hand overlaps the grille portion 27. When the contour of the hand overlaps the grille portion 27 in the image, the holding manner determination unit 14A determines that the user is holding the grille portion 27 (that is, holding the microphone 20A in an improper manner). When the contour of the hand does not overlap the grille portion 27, it determines that the user is not holding the grille portion 27 (that is, holding the microphone 20A in a proper manner). The holding manner determination unit 14A delivers the holding manner determination result to the correction unit 16.
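A sketch of the overlap test, assuming an earlier image-processing stage (not shown) has already produced the estimated grille region and the hand contour as arrays of (x, y) points; the axis-aligned bounding-box approximation is an assumption made for brevity.

```python
import numpy as np

def bounding_box(points: np.ndarray):
    """Axis-aligned bounding box (x_min, y_min, x_max, y_max) of an Nx2
    array of contour points."""
    x_min, y_min = points.min(axis=0)
    x_max, y_max = points.max(axis=0)
    return x_min, y_min, x_max, y_max

def boxes_overlap(a, b) -> bool:
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1

def held_improperly(grille_contour: np.ndarray, hand_contour: np.ndarray) -> bool:
    """True when the hand contour overlaps the estimated grille region,
    i.e. the user is holding the grille (improper holding)."""
    return boxes_overlap(bounding_box(grille_contour), bounding_box(hand_contour))
```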
[0036]
The correction unit 16 corrects the voice information delivered from the voice information acquisition unit 12A based on the holding manner determination result delivered from the holding manner determination unit 14A, as in the first embodiment. For this reason, the same effects as in the first embodiment can also be obtained in the present embodiment. Furthermore, since the present embodiment detects how the microphone 20A is held indirectly, the microphone 20A does not need to be provided with a holding manner detection unit, and an ordinary commercially available microphone can be used as the microphone 20A.
[0037]
Third Embodiment FIG. 5 is a block diagram showing the configuration of a voice processing system 1B including a voice processing apparatus 10B according to a third embodiment of the present invention. As shown in FIG. 5, the voice processing system 1B differs from the voice processing system 1A according to the second embodiment in that a voice processing device 10B is provided in place of the voice processing device 10A. The voice processing device 10B according to the present embodiment differs from the voice processing device 10A in the specific manner in which it determines how the microphone 20A is held. The voice processing device 10B differs from the voice processing device 10A in that it includes a holding method information acquisition unit 11B in place of the holding method information acquisition unit 11A, a holding manner determination unit 14B in place of the holding manner determination unit 14A, and a correction unit 16B in place of the correction unit 16.
[0038]
The holding information acquisition unit 11B is similar to the holding information acquisition
unit 11A in that holding information is acquired by image processing, but the specific content of
the image processing is different from the holding information acquisition unit 11A. The camera
connected to the holding method information acquisition unit 11B captures an image of the
microphone 20A and the user's mouth. The holding information acquisition unit 11B processes
the image captured by the camera, and acquires information indicating the outline of the
microphone 20A and information indicating the outline of the user's mouth as holding
information. The holding method information acquisition unit 11B delivers the acquired holding
method information to the holding method determination unit 14B.
[0039]
The holding manner determination unit 14B is a device that determines how the user holds the microphone 20A based on the holding method information delivered from the holding method information acquisition unit 11B. Specifically, from the contour of the microphone 20A and the contour of the user's mouth, the holding manner determination unit 14B calculates the sound collection angle formed between the sound collection axis of the microphone 20A and the arrival direction of the voice from the user's mouth. The holding manner determination unit 14B delivers the calculated sound collection angle to the correction unit 16B as the holding manner determination result.
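A sketch of the angle calculation, assuming the image processing has already yielded a microphone axis direction vector and the positions of the grille and the mouth in image coordinates; these intermediate quantities are assumptions introduced for illustration.

```python
import numpy as np

def sound_collection_angle(mic_axis: np.ndarray,
                           grille_pos: np.ndarray,
                           mouth_pos: np.ndarray) -> float:
    """Angle in degrees between the microphone's sound collection axis and
    the direction from the grille to the user's mouth."""
    to_mouth = mouth_pos - grille_pos
    cos_theta = np.dot(mic_axis, to_mouth) / (
        np.linalg.norm(mic_axis) * np.linalg.norm(to_mouth))
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

# Example: axis pointing "up" in image coordinates, mouth offset to the side.
print(sound_collection_angle(np.array([0.0, 1.0]),
                             np.array([100.0, 200.0]),
                             np.array([160.0, 300.0])))  # about 31 degrees
```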
[0040]
The correction unit 16B according to the present embodiment corrects the voice information delivered from the voice information acquisition unit 12A with a filter having the inverse characteristic of the frequency characteristics of the microphone 20A at the sound collection angle output by the holding manner determination unit 14B. The correction unit 16B will now be described in detail. The correction unit 16B has a filter for each sound collection angle. For example, the correction unit 16B has a plurality of filters corresponding to sound collection angles of 30 degrees, 60 degrees, 90 degrees, 120 degrees, 150 degrees, and 180 degrees. Each filter of the correction unit 16B has the inverse characteristic of the frequency characteristics of the microphone 20A at the corresponding sound collection angle. For example, the filter corresponding to the sound collection angle of 30 degrees has the inverse characteristic of the frequency characteristics of the microphone 20A at a sound collection angle of 30 degrees, and the filter corresponding to the sound collection angle of 60 degrees has the inverse characteristic of the frequency characteristics of the microphone 20A at a sound collection angle of 60 degrees. The reason why a filter with different characteristics is provided for each sound collection angle is that the sound collection sensitivity with respect to frequency differs depending on the sound collection angle.
[0041]
When the correction unit 16B receives the holding manner determination result, it selects the filter corresponding to the sound collection angle indicated by that result. For example, when the correction unit 16B receives a holding manner determination result indicating a sound collection angle of 60 degrees, it selects the filter corresponding to the sound collection angle of 60 degrees. The correction unit 16B may also select the filter corresponding to the sound collection angle closest to the one indicated by the holding manner determination result. For example, when the correction unit 16B receives a holding manner determination result indicating a sound collection angle of 55 degrees, it selects the filter corresponding to the sound collection angle of 60 degrees. The correction unit 16B then passes the voice information through the selected filter. Thereafter, the correction unit 16B amplifies the filtered voice information and outputs it. The degree of amplification of the voice information differs for each filter through which the voice information passes (in other words, for each sound collection angle). This is because the degree to which the sound collection sensitivity of the microphone 20A decreases differs for each sound collection angle. In this manner, the correction unit 16B corrects voice information that arrived from a direction shifted away from the sound collection axis into the voice information that would have been acquired had the voice arrived from a direction along the sound collection axis. In addition, when the sound collection angle indicated by the holding manner determination result is closer to the sound collection axis (for example, 10 degrees) than the smallest sound collection angle for which a filter is provided (for example, 30 degrees), the correction unit 16B outputs the acquired voice information to the mixdown unit 17 as it is.
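A sketch of this filter selection rule, assuming the per-angle filters are stored as biquad coefficient pairs keyed by their nominal sound collection angle; the coefficient values below are placeholders, since the real ones would be derived from the microphone's measured inverse characteristics.

```python
import numpy as np

# Hypothetical table: nominal sound collection angle (degrees) -> (b, a)
# biquad coefficients with the inverse characteristic at that angle.
FILTERS = {
    30: (np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])),  # placeholder
    60: (np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])),  # placeholder
    90: (np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])),  # placeholder
}

def select_filter(angle_deg: float):
    """Return (nominal_angle, (b, a)) for the tabulated angle closest to the
    measured sound collection angle, or None when the angle is closer to the
    axis than the smallest tabulated angle (pass-through case)."""
    if angle_deg < min(FILTERS):
        return None
    nearest = min(FILTERS, key=lambda a: abs(a - angle_deg))
    return nearest, FILTERS[nearest]

print(select_filter(55)[0])  # 60 -> 55 degrees maps to the 60-degree filter
print(select_filter(10))     # None -> output the voice information as it is
```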
[0042]
As described above, in the voice processing device 10B according to the present embodiment, the
voice information is corrected based on the sound collection angle formed by the arrival
direction of the voice from the mouth with respect to the sound collection axis of the microphone
20A. Since this sound collection angle is also related to how to hold the microphone 20A, the
same effects as in the first and second embodiments can be obtained also in this embodiment.
[0043]
Fourth Embodiment A voice processing apparatus 10C according to a fourth embodiment of the present invention adds a function for scoring the user's singing ability to the voice processing apparatus 10 according to the first embodiment. FIG. 6 is a block diagram showing the configuration of a voice processing system 1C including the voice processing device 10C according to the present embodiment. The voice processing system 1C differs from the voice processing system 1 according to the first embodiment in that the voice processing device 10 is replaced with the voice processing device 10C and the karaoke source 30 is replaced with a karaoke source 30C.
[0044]
The karaoke source 30C is different from the karaoke source 30 in that information indicating a
model vocal corresponding to the accompaniment indicated by the accompaniment information
(hereinafter referred to as model vocal information) is stored in addition to the accompaniment
information. The karaoke source 30C outputs the accompaniment information of the karaoke
song selected by the user and the model vocal information of the song to the voice processing
device 10C.
[0045]
The voice processing device 10C differs from the voice processing device 10 in that it has a karaoke information acquisition unit 13C in place of the karaoke information acquisition unit 13, an acquisition unit 12C in place of the acquisition unit 12, and a holding manner determination unit 14C in place of the holding manner determination unit 14, and in that it further includes a scoring unit 15C and a display output unit 19C.
[0046]
The karaoke information acquisition unit 13C differs from the karaoke information acquisition
unit 13 in that it acquires model vocal information from the karaoke source 30C in addition to
accompaniment information.
The karaoke information acquisition unit 13C delivers the accompaniment information to the
mixdown unit 17, and delivers the model vocal information to the scoring unit 15C. The
acquisition unit 12C differs from the acquisition unit 12 in that it delivers the voice information not only to the correction unit 16 but also to the scoring unit 15C. The holding manner determination unit 14C differs from the holding manner determination unit 14 in that it delivers the holding manner determination result not only to the correction unit 16 but also to the scoring unit 15C.
[0047]
The model vocal information, the voice information, and the holding manner determination result are delivered to the scoring unit 15C. The scoring unit 15C is a device that compares the model vocal information with the voice information, calculates their degree of coincidence, and scores the result. Specifically, the scoring unit 15C compares the degree of coincidence of elements such as pitch and sound generation timing. After scoring the degree of coincidence, the scoring unit 15C increases or decreases the score according to the holding manner determination result. For example, points are added when the holding manner determination result indicates that the microphone 20 is held appropriately, and points are deducted when it indicates that the microphone 20 is held improperly. The scoring unit 15C delivers to the display output unit 19C information indicating the score representing the singing ability with the holding manner determination result reflected in it (that is, the scoring result). The display output unit 19C is a device that causes a connected display (not shown) to show the scoring result produced by the scoring unit 15C.
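A minimal sketch of this scoring rule, assuming the pitch and timing comparison has already been reduced to a coincidence score between 0 and 100; the bonus and penalty amounts are illustrative assumptions.

```python
def score_singing(coincidence_score: float, held_properly: bool,
                  bonus: float = 5.0, penalty: float = 5.0) -> float:
    """Start from the pitch/timing coincidence score and add points when
    the microphone was held properly, or deduct points when it was not."""
    score = coincidence_score + (bonus if held_properly else -penalty)
    return max(0.0, min(100.0, score))

print(score_singing(82.0, held_properly=True))   # 87.0
print(score_singing(82.0, held_properly=False))  # 77.0
```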
[0048]
In this configuration, when the user selects a karaoke song, the voice processing device 10C
acquires, from the karaoke source 30C, accompaniment information of the selected karaoke song
and its model vocal information. Then, when the user sings with the microphone 20, the voice
processing device 10C acquires voice information and holding information from the microphone
20. As in the first embodiment, the voice processing device 10C corrects the voice information
based on the holding information, mixes the corrected voice information and the accompaniment
information, and outputs the result. At the same time, the scoring unit 15C of the voice processing device 10C sequentially compares the model vocal information with the voice information as the song progresses, and scores the comparison. Furthermore, the scoring unit 15C successively increases or decreases
the score according to the holding manner determination result as the music progresses. Then,
when the music is finished, the voice processing device 10C displays the final scoring result on
the display.
[0049]
Thus, in the voice processing device 10C according to the present embodiment, the way the microphone 20 is held is reflected in the score indicating the singing ability. The voice processing device 10C can therefore perform more versatile scoring than a conventional karaoke device having a singing-ability scoring function. For example, with the voice processing apparatus 10C, even a user whose actual singing is excellent may not obtain as good a scoring result as expected if the microphone is held inappropriately. Conversely, even a user whose actual singing is not so good may obtain a better scoring result than expected by holding the microphone properly.
[0050]
Various specific modes for reflecting the manner of holding the microphone 20 in the score indicating the singing ability can be considered. For example, points may be deducted whenever the grille portion 27 is touched, or the score may be increased or decreased depending on the length of time during which the grille portion 27 is touched. In addition, the score may be successively increased or decreased as the song progresses, or the score may be increased or decreased by reflecting the holding manner determination result at predetermined timings in the song. Further, the present invention is not limited to modes in which the score is both increased and decreased according to the holding manner determination result; only addition or only deduction may be performed. The specific manner of reflecting how the microphone 20 is held in the score indicating the singing ability is not restricted to these examples.
[0051]
<Other Embodiments> Although the first to fourth embodiments of the present invention have
been described above, other embodiments can be considered in the present invention. For
example:
[0052]
(1) In the above embodiments, the microphones 20 and 20A are connected to the voice
processing devices 10 to 10C. However, a plurality of microphones may be connected to the
voice processing device. Usually, the directivity dependency of the frequency characteristics of
the microphones differs from one microphone to another. For this reason, in this aspect, it is
preferable that the voice processing device acquire voice information and holding information in
association with each microphone. In addition, the filter of the correction unit preferably has an
appropriate characteristic for each microphone in accordance with the directivity dependency of
the frequency characteristic of each microphone. For example, the cutoff frequency of the filter
and the reduction degree of the sound pressure level may be set for each microphone.
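A sketch of how per-microphone correction settings might be organized, assuming each connected microphone is identified by a key and carries its own cutoff frequency and low-band reduction; the identifiers and values are placeholders, not values given in the source.

```python
from dataclasses import dataclass

@dataclass
class CorrectionSettings:
    cutoff_hz: float               # where the sensitivity starts to change significantly
    low_band_reduction_db: float   # matches the high-band sensitivity loss

# Hypothetical per-microphone table; real values would come from each
# microphone's measured directivity-dependent frequency characteristics.
MIC_SETTINGS = {
    "mic_1": CorrectionSettings(cutoff_hz=2000.0, low_band_reduction_db=6.0),
    "mic_2": CorrectionSettings(cutoff_hz=1500.0, low_band_reduction_db=4.0),
}

def settings_for(mic_id: str) -> CorrectionSettings:
    return MIC_SETTINGS[mic_id]
```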
[0053]
(2) The voice processing device 10C according to the fourth embodiment is the voice processing
device 10 according to the first embodiment to which a singing scoring function is added.
Similarly to the fourth embodiment, a song scoring function may be added to the voice
processing devices 10A and 10B according to the second and third embodiments.
[0054]
(3) The voice processing apparatus 10C according to the fourth embodiment corrects the voice information based on the holding manner determination result and also reflects the holding manner determination result for the microphone 20 in the score indicating the singing ability. However, the voice processing apparatus may reflect the holding manner determination result for the microphone in the score indicating the singing ability without correcting the voice information. Such an apparatus is the same as the fourth embodiment in that versatile scoring can be performed.
[0055]
(4) The holding manner determination unit 14 of the voice processing apparatus 10 according to the first embodiment determines whether or not the user is holding the grille portion 27 based on whether the voltage of a piezoelectric element disposed in the grille portion 27 has become high. However, the holding manner determination of the voice processing apparatus may instead determine which of the piezoelectric elements disposed in the grille portion has a higher voltage and thereby determine which part of the grille portion the user is holding, or it may determine to what extent the grille portion is covered. Further, the correction unit may have a plurality of types of filters with different characteristics and select, from among them, the filter through which the voice information should be passed according to the holding manner determination result. For example, if the holding manner determination result indicates that the center of the grille portion is held, the correction unit selects a filter having the inverse characteristic of the frequency characteristics of the microphone covered in the vicinity of the center of the grille portion, and passes the voice information through that filter. Further, the degree of amplification of the voice information after passing through the filter in the correction unit may be made variable. For example, the correction unit increases the degree of amplification of the filtered voice information when the range of the grille portion covered by the hand is wide.
[0056]
(5) In the first embodiment, the holding manner detection unit 24 is provided in the grille portion 27 of the microphone 20. However, the position at which the holding manner detection unit is provided is not limited to the grille portion. For example, a holding manner detection unit (specifically, a piezoelectric element or the like) may be provided in the antenna portion of a wireless microphone. The holding manner detection unit of this aspect detects that the user has touched the antenna portion of the microphone. When the user touches the antenna portion of the microphone, noise may be superimposed on the voice information sent from the microphone to the voice processing device, so holding the antenna portion is inappropriate. The correction unit in this aspect passes the voice information through a filter that removes the noise superimposed on the voice information.
[0057]
(6) At the time of correction of the voice information, the result of a calibration performed in advance may be used. For example, the user first vocalizes toward the microphone with the mouth positioned on the sound collection axis. Next, the user vocalizes toward the microphone in a state where the sound collection angle formed between the sound collection axis and the arrival direction of the voice from the mouth is 45 degrees. The voice processing device acquires the frequency characteristics in these two states and calculates the difference between them. Next, the voice processing device calculates the frequency characteristics for sound collection angles other than 45 degrees by interpolation, extrapolation, or the like. In this way, the directivity dependency of the frequency characteristics of the microphone is calibrated in advance. Then, when a holding manner determination result indicates that the holding manner is improper, the voice processing device passes the voice information through a filter having the inverse characteristic of the calibrated frequency characteristics.
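A sketch of the interpolation step for a single frequency band, assuming the calibration produced the sensitivity difference (in dB relative to on-axis) at 0 and 45 degrees, and that intermediate and larger angles are estimated by linear interpolation and extrapolation; the measured value is illustrative.

```python
import numpy as np

# Calibrated sensitivity difference (dB) relative to on-axis, at 0 and 45 degrees.
cal_angles = np.array([0.0, 45.0])
cal_loss_db = np.array([0.0, -3.5])   # hypothetical measurement

def estimated_loss_db(angle_deg: float) -> float:
    """Linearly interpolate (and extrapolate) the sensitivity loss at an
    arbitrary sound collection angle from the two calibration points."""
    slope = (cal_loss_db[1] - cal_loss_db[0]) / (cal_angles[1] - cal_angles[0])
    return float(cal_loss_db[0] + slope * (angle_deg - cal_angles[0]))

print(estimated_loss_db(30.0))  # about -2.3 dB
print(estimated_loss_db(60.0))  # about -4.7 dB (extrapolated)
```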
[0058]
(7) The voice processing apparatus may give a warning indicating that the microphone is
improperly held. Specifically, the voice processing device may display characters such as “the
holding of the microphone is not appropriate” on the display, or may emit an alarm sound. Also,
the voice processing device may both deduct points from the score indicating the singing ability and issue a warning.
[0059]
(8) The voice processing device may acquire the distance between the user's mouth and the
microphone as how to hold the microphone. When the mouth is close to the microphone, the
sound pressure level in the low range of the sound picked up by the microphone is increased by
the proximity effect, as compared with the state where the mouth is not close to the microphone.
For this reason, the distance between the mouth and the microphone can be detected by
calculating the degree of the sound pressure level which is increased by the proximity effect.
Then, in the voice processing device of this aspect, the voice information is corrected according
to the distance between the mouth and the microphone. For example, when the distance between
the mouth and the microphone is too small, the voice processing device determines that the holding manner is inappropriate and, for example, performs a correction that reduces the increase in the low-range sound pressure level caused by the proximity effect.
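A sketch of one way the low-band boost from the proximity effect could be estimated, comparing the spectral energy below a low-band boundary with the block's total energy; the boundary frequency and the ratio threshold are illustrative assumptions.

```python
import numpy as np

def low_band_ratio(block: np.ndarray, fs: int, boundary_hz: float = 200.0) -> float:
    """Fraction of the block's spectral energy that lies below boundary_hz."""
    spectrum = np.abs(np.fft.rfft(block)) ** 2
    freqs = np.fft.rfftfreq(len(block), d=1.0 / fs)
    total = spectrum.sum()
    return float(spectrum[freqs < boundary_hz].sum() / total) if total > 0 else 0.0

def mouth_too_close(block: np.ndarray, fs: int, threshold: float = 0.6) -> bool:
    """Treat an unusually large low-band share as a sign of the proximity
    effect, i.e. the mouth being too close to the microphone."""
    return low_band_ratio(block, fs) > threshold
```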
[0060]
(9) The voice processing apparatus may acquire, as the holding method information, the orientation (posture) of the microphone with respect to a sound source other than the user's mouth (such as a loudspeaker). For example, the voice processing device acquires, as the holding method information, the sound collection angle formed between the sound collection axis of the microphone and the arrival direction of the sound from the loudspeaker. In this aspect, when the sound collection angle is smaller than a predetermined angle, the voice processing device determines that the holding manner is improper, because howling is likely to occur between the microphone and the loudspeaker in that case. This sound collection angle may also be used to increase or decrease the score indicating the singing ability. Specifically, points may be deducted when the sound collection angle is smaller than the predetermined angle, or added when it is larger. Further, the voice processing device may warn that howling may occur when the sound collection angle is smaller than the predetermined angle.
[0061]
(10) The voice processing devices 10A and 10B according to the second and third embodiments
indirectly detect how to hold the microphone 20A by image processing. However, the aspect of
indirectly detecting how to hold the microphone 20A is not limited to image processing. For
example, the voice processing apparatus may detect how to hold the microphone by ultrasonic
waves. This aspect can be realized as follows. First, a plurality of speakers are arranged around
the microphone. Each of these speakers emits ultrasonic waves of different bands for each
speaker. For example, the first speaker emits 18 kHz ultrasound, and the second speaker emits
20 kHz ultrasound. The user's microphone picks up ultrasonic waves from these speakers. The
voice processing device detects the direction and attitude of the microphone from the volume
balance of the sound from each speaker picked up by the microphone.
[0062]
(11) The correction unit 16B of the voice processing device 10B according to the third embodiment includes a plurality of filters, one for each sound collection angle. However, the correction unit may instead have a single common filter and correct the voice information by switching the coefficients of that common filter according to the holding manner determination result (the sound collection angle) from the holding manner determination unit.
[0063]
(12) The voice processing device 10B according to the third embodiment does not acquire
holding method information from the microphone 20A at the time of calculating the sound
collection angle indicating the holding manner. However, holding information may be acquired
from the microphone when calculating the sound collection angle indicating the holding method.
This aspect can be realized, for example, by providing a microphone with a holding method
detection unit including an infrared sensor and the like.
[0064]
(13) The voice processing device may increase or decrease the score indicating the singing ability
according to how the breath is blown to the microphone. For example, the voice processing
device deducts the score when the user blows a breath over the microphone. When the user
blows a breath into the microphone, the voice processing device obtains voice information from
the microphone including sudden low frequency components. The voice processing device
determines this frequency component as pop noise and deducts the score. Also in this aspect,
various scoring can be performed. Of course, both the way of holding the microphone and the
way of blowing breath may be reflected in the scoring.
[0065]
(14) The voice processing apparatus may increase the score indicating the singing ability when the way the user holds the microphone is substantially the same as the way the original artist of the karaoke song holds the microphone (hereinafter referred to as the "model holding"). For example, if the artist's model holding is to extend the little finger, the score indicating the singing ability is increased when the user likewise sings with the little finger extended. Specifically, this can be realized by having the voice processing apparatus acquire information indicating the model holding from the karaoke source and having the scoring unit compare the degree of agreement between the user's holding method information and the information indicating the model holding. The information indicating the model holding may also be stored in the voice processing device.
[0066]
(15) The voice processing devices 10 to 10C according to the above-described embodiments are
karaoke devices including the mixdown unit 17 that mixes accompaniment information with the
voice information after correction. However, the audio processing device may not have this
mixdown unit 17. The audio processing device may be at least a device including means for
acquiring microphone holding information and correction means for correcting the audio
information acquired from the microphone based on the microphone holding information. For
this reason, the voice processing apparatus is not limited to the karaoke application, and can be
used for various applications using a microphone. For example, the voice processing device can
be used as an audio device when a user holds a microphone and makes a speech.
[0067]
(16) Further, the mixdown unit of the voice processing apparatus is not limited to the mode in which the accompaniment information and the corrected voice information are mixed. The mixdown unit may operate in any mode as long as the corrected voice information is mixed with other sound information. For example, the mixdown unit may mix the corrected voice information indicating a vocal with the sound output of an electric guitar. The other sound information referred to here may also be other corrected voice information different from the corrected voice information in question; for example, the mixdown unit may mix a plurality of items of corrected voice information corresponding to a plurality of microphones. In addition, the mixdown unit may mix multiple types of other sound information with the corrected voice information. Taking these into consideration, the voice processing device can also be used as an acoustic device (a so-called PA) for forming a sound field at a live performance or concert.
[0068]
(17) The technical features of the above embodiments may be combined.
[0069]
(18) In the voice processing devices 10 to 10C according to the above embodiments, the
elements of the voice processing devices 10 to 10C are realized by hardware.
However, each element of the speech processing apparatus may be realized by causing a
computer to execute a program. In this case, the program may be traded as installed in the voice
processing device, may be traded separately as stored in a computer readable storage medium, or
may be traded by being downloaded via a network.
[0070]
1, 1A, 1B, 1C: voice processing system; 10, 10A, 10B, 10C: voice processing device; 11A, 11B: holding method information acquisition unit; 12: acquisition unit; 12A: voice information acquisition unit; 13, 13C: karaoke information acquisition unit; 14, 14A, 14B, 14C: holding manner determination unit; 15C: scoring unit; 16, 16B: correction unit; 17: mixdown unit; 18: audio output unit; 19C: display output unit; 20, 20A: microphone; 22: voice detection unit; 24: holding manner detection unit; 26: output unit; 27: grille portion; 28: main body portion; 30, 30C: karaoke source.