Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2006211156
PROBLEM TO BE SOLVED: To provide sound to which an appropriate acoustic effect is applied according to each listener. SOLUTION: When the microphone array 11 detects a voice, it specifies the position of the sound source (i.e., the speaker A). The information identifying speaker A and the voice signal received by the microphone array 11 are supplied to the control unit 10. The control unit 10 instructs the direction detection unit 13 to determine the direction of A's face. The direction detection unit 13 captures an image of A, performs image analysis processing, and determines the direction of A's face. From that direction, the control unit 10 determines the party A is talking to (i.e., the listener D). The control unit 10 then decides on an acoustic effect that causes A's voice, amplified by a predetermined amplification factor, to reach only D, while allowing all participants, including D, to recognize that the sound source is the speaker A. The control unit 10 calculates the parameters for realizing this acoustic effect and provides them to the speaker array 12, which generates the corresponding acoustic wave. [Selected figure] Figure 3
Acoustic apparatus
[0001]
The present invention relates to a technique for controlling sound provided to a listener.
[0002]
In a setting where many participants gather in one place to converse, such as a conference, it becomes difficult for a listener to hear the speaker's voice when the distance between the parties to the conversation (speaker and listener) is large.
Usually, in such a case, the speaker's voice is amplified and emitted using a loudspeaker system consisting of a microphone and a loudspeaker. However, when such a loudspeaker system is used, a problem arises in that a person near the speaker or near the loudspeaker hears the voice at an unnaturally loud volume. That is, conventional loudspeaker systems cannot provide an appropriate sound according to the position of each listener; in other words, they cannot generate an acoustic wave that satisfies all the listeners present.
[0003]
As a related art, Patent Document 1 discloses a method of determining the direction of a speaker's face using a pair of microphones. However, merely acquiring information on the direction of the speaker's face does not identify the listener to whom the speech should be provided; therefore, it is not possible to provide sound according to the position of each listener.
[Patent Document 1] JP 10-243494 A
[0004]
The present invention has been made in view of the background described above, and it is an object of the present invention to provide a method and apparatus for supplying each listener with sound to which an appropriate acoustic effect has been applied.
[0005]
In order to solve the above problems, the present invention provides an acoustic apparatus comprising: speaker position specifying means for detecting a voice and specifying the position of the speaker; speaker direction specifying means for specifying the direction in which the face of the speaker specified by the speaker position specifying means is facing; listening position specifying means for specifying the position of a listener who is the conversation partner of the speaker, based on the speaker position specified by the speaker position specifying means and the speaker direction specified by the speaker direction specifying means; sound determining means for determining an acoustic effect to be applied to the voice of the speaker, based on the speaker position and on the listening position specified by the listening position specifying means; and acoustic wave generating means for providing the voice of the speaker to the listener by generating an acoustic wave corresponding to the acoustic effect determined by the sound determining means.
According to this acoustic apparatus, the listener who is the conversation partner of the speaker can be identified, and that listener can be provided with sound to which an acoustic effect corresponding to his or her position is applied, thereby assisting the listener's hearing.
[0006]
In a preferred embodiment, the acoustic apparatus according to the present invention comprises: listening position specifying means for specifying a listening position; listening direction specifying means for specifying the direction in which the face of the listener at the listening position specified by the listening position specifying means is facing; speaker position specifying means for specifying the position of a speaker involved in a conversation that the listener desires to hear, based on the listening position and the direction specified by the listening direction specifying means; sound determining means for determining an acoustic effect to be applied to the voice of the speaker, based on the speaker position specified by the speaker position specifying means and the listening position specified by the listening position specifying means; and acoustic wave generating means for providing the voice of the speaker to the listener by generating an acoustic wave corresponding to the acoustic effect determined by the sound determining means. According to this aspect, it is possible to assist the listener in hearing the speech of the conversation that he or she desires to hear.
[0007]
In another preferred embodiment, the acoustic apparatus according to the present invention comprises: listening position specifying means for specifying a listening position; an operator for allowing a listener at the listening position to designate a desired listening direction; speaker position specifying means for specifying the position of a speaker involved in a conversation that the listener desires to hear, based on the listening position specified by the listening position specifying means and the listening direction designated using the operator; sound determining means for determining an acoustic effect to be applied to the voice of the speaker, based on the speaker position specified by the speaker position specifying means and the listening position specified by the listening position specifying means; and acoustic wave generating means for providing the voice of the speaker to the listener by generating an acoustic wave corresponding to the acoustic effect determined by the sound determining means.
[0008]
In yet another preferred embodiment, the acoustic apparatus according to the present invention comprises: speaker position specifying means for detecting a voice and specifying the position of the speaker; speaker direction specifying means for specifying the direction in which the face of the speaker at the position specified by the speaker position specifying means is facing; listening position specifying means for specifying the position of a listener who is the conversation partner of the speaker, based on the speaker position specified by the speaker position specifying means and the direction specified by the speaker direction specifying means; listening direction specifying means for specifying the direction in which the face of the listener at the position specified by the listening position specifying means is facing; sound determining means for determining an acoustic effect to be applied to the voice of the speaker, based on the speaker position specified by the speaker position specifying means, the listening position specified by the listening position specifying means, and the listening direction specified by the listening direction specifying means; and acoustic wave generating means for providing the voice of the speaker to the listener by generating an acoustic wave corresponding to the acoustic effect determined by the sound determining means.
[0009]
Hereinafter, an operation example of the present invention will be described with reference to the drawings.
In the following embodiment, it is assumed that a plurality of participants exchange speech while sitting around a round table.
FIG. 1 shows the arrangement of the participants in this case. As shown in the figure, a total of twelve participants A to L (that is, persons who can be either speakers or listeners) gather in a meeting room, sit in chairs provided at predetermined positions around a round table T, and converse with one another. It is further assumed that no participant stands up from his or her chair or changes seats.
[0010]
<Structure> FIG. 2 is a block diagram showing the functional configuration of the listener support apparatus 1 of the present invention. As shown in the figure, the listener support apparatus 1 includes a control unit 10, a microphone array 11, a speaker array 12, a direction detection unit 13, a storage unit 14, an input unit 15, and a bus 16 connecting these units.
[0011]
The microphone array 11 is an audio input device including a plurality of microphone units such as condenser microphones (not shown), an A/D conversion circuit, and a processor such as a DSP. When voice is input to the microphone units, the microphone array 11 specifies the position of the sound source of that voice (that is, the speaker, if the received sound is a human voice) based on the differences in sound pressure level and arrival time among the units. Information on the identified sound source position is supplied to the control unit 10 together with the audio signal.
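The patent does not specify the localization algorithm, but one common way to combine the arrival-time differences mentioned above is to test candidate source positions against the measured delays. The following Python sketch is a minimal illustration of that idea, assuming a 2-D room, known microphone coordinates, and a fixed speed of sound; all names and values are hypothetical.

# Minimal sketch of sound-source localization from arrival-time differences.
# Assumptions (not from the patent): 2-D coordinates in metres, known
# microphone positions, TDOAs measured relative to microphone 0, c = 343 m/s.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def locate_source(mic_positions, tdoas, candidates):
    """Return the candidate position whose geometry best explains the TDOAs.

    mic_positions : (M, 2) array of microphone coordinates
    tdoas         : (M,) arrival times relative to microphone 0 (tdoas[0] == 0)
    candidates    : (N, 2) array of candidate source positions (e.g. the seats)
    """
    best_pos, best_err = None, np.inf
    for cand in candidates:
        dists = np.linalg.norm(mic_positions - cand, axis=1)
        predicted = (dists - dists[0]) / SPEED_OF_SOUND  # predicted TDOAs
        err = np.sum((predicted - tdoas) ** 2)
        if err < best_err:
            best_pos, best_err = cand, err
    return best_pos

# Example: four microphones at the table corners, candidates = seat positions.
mics = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]])
seats = np.array([[1.0, 0.2], [1.8, 1.0], [1.0, 1.8], [0.2, 1.0]])
measured = np.array([0.0, -0.0021, -0.0005, 0.0016])   # toy values
print(locate_source(mics, measured, seats))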
[0012]
The direction detection unit 13 includes one or more video cameras and an image processor (not shown). Under the instruction of the control unit 10, it images each participant, performs predetermined image analysis processing on the captured images, and determines the orientation of the face of each of the participants A to L. As an example, in the seating arrangement shown in FIG. 1, a total of twelve video cameras are installed so that the areas Ra to Rl where the participants are present are each imaged. The direction detection unit 13 may instead be composed of two units: a unit for specifying the direction of the speaker's face and a unit for specifying the direction of the listener's face. In this case, the former unit may be configured from a plurality of microphones and an audio signal processor, and the orientation of the speaker's face may be determined by detecting the direction of the voice emitted by the speaker based on the differences in sound pressure level and delay among the signals input to the microphones. Information on the identified orientation of each participant's face is supplied to the control unit 10.
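The image-analysis method is left open in the patent; one widely used way to estimate the direction a face is pointing is to fit a generic 3-D facial model to detected 2-D landmarks with a perspective-n-point solver. The sketch below assumes OpenCV and that an external face-landmark detector has already supplied the pixel coordinates of six standard landmarks; the model coordinates and camera intrinsics are illustrative only, not values from the patent.

# Sketch: estimate the horizontal facing angle from 2-D facial landmarks
# using OpenCV's solvePnP. `image_points` is a (6, 2) float64 array from an
# external landmark detector; model points and intrinsics are toy values.
import cv2
import numpy as np

# Generic 3-D face model (millimetres), in the same landmark order as below.
MODEL_POINTS = np.array([
    [0.0, 0.0, 0.0],        # nose tip
    [0.0, -63.6, -12.5],    # chin
    [-43.3, 32.7, -26.0],   # left eye outer corner
    [43.3, 32.7, -26.0],    # right eye outer corner
    [-28.9, -28.9, -24.1],  # left mouth corner
    [28.9, -28.9, -24.1],   # right mouth corner
], dtype=np.float64)

def face_yaw_degrees(image_points, frame_width, frame_height):
    """Rough left/right facing angle of the face in the camera frame."""
    focal = frame_width  # rough approximation of the focal length in pixels
    camera_matrix = np.array([[focal, 0, frame_width / 2],
                              [0, focal, frame_height / 2],
                              [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros((4, 1))  # assume no lens distortion
    ok, rvec, _tvec = cv2.solvePnP(MODEL_POINTS, image_points,
                                   camera_matrix, dist_coeffs)
    if not ok:
        return None
    rot, _ = cv2.Rodrigues(rvec)
    forward = rot @ np.array([0.0, 0.0, 1.0])  # face "forward" axis in camera frame
    # Horizontal angle between the facing direction and the camera's optical axis.
    return float(np.degrees(np.arctan2(forward[0], forward[2])))

In a deployment like the one described here, this camera-frame angle would still have to be converted into a room-frame direction using the known position and orientation of the camera assigned to each area Ra to Rl.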
[0013]
The storage unit 14 is a storage device such as a RAM, a ROM, or a hard disk; it is used as a work area of the control unit 10 and stores the positions of the seats at the round table T. The input unit 15 is an input device such as a keyboard or a mouse, and is used to input information on the seat positions, as well as information on whether a participant needs the listener support apparatus 1 to assist his or her hearing when that participant is a non-party to a conversation.
[0014]
When the control unit 10 receives the information on the position of the speaker and on the direction of the speaker's face detected by the direction detection unit 13, it refers to the storage unit 14 and identifies the party the speaker is talking to (hereinafter, the "designated listener"). As an example, as shown in FIG. 3, suppose the position of the sound source, that is, the speaker, is A, and the direction of A's face is identified as the direction of angle θ shown in the figure. The control unit 10 then refers to the storage unit 14 and determines that the participant positioned in the direction of A's face (i.e., the designated listener) is D. When the control unit 10 receives the information on the position and face direction of the speaker, and the information on the position or face direction of the listener, from the microphone array 11 and the direction detection unit 13, it calculates the sound field to be formed in the conference room so as to provide each participant with the speaker's voice to which an acoustic effect has been applied. As an example of how the sound field is formed, an acoustic effect is applied such that all the participants B to L other than the speaker hear the voice as coming from A (that is, each of them can recognize from the sound reaching his own ears that the speaker is A), while only the sound heard by D is amplified by a predetermined ratio relative to the normal volume. Here, the predetermined ratio is determined according to, for example, the distance between A and D. As an example of the formed sound field, an acoustic wave at twice the normal sound level reaches the area Rd, which is the listening area of D, while acoustic waves at the normal level reach the other areas Ra to Rc and Re to Rl. As a result, the voice produced by A reaches the participants other than D at the normal volume, while it is doubled in volume when delivered to D.
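One way to make the selection described above concrete is to take as the designated listener the seat whose bearing from the speaker is closest to the measured face angle θ, and then derive the amplification ratio from the speaker-listener distance. The Python sketch below only illustrates that geometric reasoning; the seat table, angles, and gain rule are illustrative assumptions, not values from the patent.

# Sketch: pick the designated listener as the seat lying closest to the
# direction of the speaker's face, then set a gain from the distance.
# Seat coordinates and the gain rule below are illustrative assumptions.
import math

SEATS = {  # participant -> (x, y) position in metres (hypothetical layout)
    "A": (0.0, 1.5), "D": (1.5, 0.0), "G": (0.0, -1.5), "J": (-1.5, 0.0),
}

def designated_listener(speaker, face_angle_rad):
    """Return the participant whose bearing from the speaker best matches
    the measured face direction, together with the angular mismatch."""
    sx, sy = SEATS[speaker]
    best, best_diff = None, math.inf
    for name, (x, y) in SEATS.items():
        if name == speaker:
            continue
        bearing = math.atan2(y - sy, x - sx)
        diff = abs(math.atan2(math.sin(bearing - face_angle_rad),
                              math.cos(bearing - face_angle_rad)))
        if diff < best_diff:
            best, best_diff = name, diff
    return best, best_diff

def gain_for(speaker, listener, gain_per_metre=0.5):
    """Toy rule: amplify more the farther apart speaker and listener sit."""
    sx, sy = SEATS[speaker]
    lx, ly = SEATS[listener]
    distance = math.hypot(lx - sx, ly - sy)
    return 1.0 + gain_per_metre * distance

listener, _ = designated_listener("A", math.radians(-45.0))  # A faces D's seat
print(listener, gain_for("A", listener))   # -> roughly a doubled volume for D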
[0015]
The method of forming the sound field is not limited to this. For example, a beam-shaped acoustic wave with very high directivity may be generated from the speaker array 12 in the direction of D. In this case, the sound emitted from the speaker array 12 that reaches participants other than D can be made very small or almost zero; in other words, participants other than D will hear only the direct voice of A. As described above, the acoustic effect provided to each participant can be freely controlled in accordance with the positional relationship of the participants, the size and shape of the room, and so on. The case where there are a plurality of speakers or listeners, that is, the case where a plurality of conversations proceed simultaneously, will be described later. When the sound field has been determined, the control unit 10 calculates the parameters required to realize it, such as the delay amount and gain of the audio signal to be supplied to each speaker unit, and outputs them to the speaker array 12.
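The patent specifies only that per-unit delays and gains are passed to the speaker array; a standard way to obtain such parameters for steering sound toward a target point is delay-and-sum focusing, where each unit is delayed by its relative propagation time to the target. The following sketch computes such parameters under assumed unit positions and sample rate; it is an illustration, not the patent's actual algorithm.

# Sketch: delay-and-sum focusing parameters for a line of speaker units.
# Each unit is delayed so that all wavefronts arrive at the target together;
# gains are left uniform and scaled by the requested amplification.
# Unit spacing, target point, and sample rate are illustrative assumptions.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def focusing_parameters(unit_positions, target, amplification=2.0,
                        sample_rate=48000):
    """Return (delays_in_samples, gains) for each speaker unit.

    unit_positions : (N, 2) coordinates of the speaker units in metres
    target         : (2,) coordinates of the listening area (e.g. region Rd)
    """
    dists = np.linalg.norm(unit_positions - np.asarray(target), axis=1)
    travel = dists / SPEED_OF_SOUND                # propagation time per unit
    # Delay the closer units so every contribution arrives simultaneously.
    delays = (travel.max() - travel) * sample_rate
    gains = np.full(len(unit_positions), amplification / len(unit_positions))
    return np.round(delays).astype(int), gains

# Example: eight units spaced 10 cm apart along the x axis, target 2 m away.
units = np.stack([np.arange(8) * 0.1, np.zeros(8)], axis=1)
delays, gains = focusing_parameters(units, target=(0.35, 2.0))
print(delays, gains)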
[0016]
The speaker array 12 is an audio output device including a plurality of speaker units (not shown) and a control circuit for controlling them. By appropriately controlling the delay amount and gain of the audio signal supplied to each speaker unit, it generates a directional acoustic wave and thereby controls the sound provided to each of the areas Ra to Rl shown in FIG. 3. As a result, the direction and position of the sound source perceived by the listener at each position, as well as the volume level, can be freely controlled. Specifically, when parameters such as the delay amount and gain are received from the control unit 10, sound is emitted from each speaker unit according to those parameters.
[0017]
The installation locations of the components of the listener support apparatus 1, such as the microphone array 11 and the speaker array 12, are arbitrary. For example, they may be installed on the round table T, or on the ceiling or a wall of the room in which the round table T is placed. The detailed configuration of the microphone array 11 and the speaker array 12, such as the number of microphone units and speaker units, and their installation locations, can be suitably selected according to the environment, such as the size and shape of the room and the number of participants.
[0018]
(Operation Example 1) First, as the simplest example, consider the case shown in FIG. 3 in which a conversation takes place between one speaker and one listener. Specifically, as shown in the figure, it is assumed that A is talking to D. FIG. 4 is a flowchart showing an example of the operation of the listener support apparatus 1. First, when voice is detected by the microphone array 11 (step S10), the microphone array 11 specifies the position of the sound source, that is, the speaker (step S12). It is thereby specified that the speaker is A. The information that the speaker is A and the voice signal received by the microphone array 11 are then supplied to the control unit 10. Subsequently, the control unit 10 instructs the direction detection unit 13 to determine the direction of A's face. The direction detection unit 13 captures an image of A, performs image analysis processing to determine the direction of A's face (step S14), and supplies this information to the control unit 10. From the position of A and the orientation of A's face, the control unit 10 determines the party A is talking to, that is, the designated listener (step S16). In this example, the designated listener is determined to be D. When the speaker and the designated listener have been determined in this way, the control unit 10 calculates a sound field that causes A's voice, amplified by a predetermined amplification factor, to reach only the designated listener D, while causing all the participants, including D, to recognize that the sound source is at A (step S18). Subsequently, the control unit 10 calculates the parameters for realizing the calculated sound field and provides them to the speaker array 12 (step S20). As a result, an acoustic wave realizing the sound field is generated from the speaker array 12 (step S22), and the amplified voice of the speaker A reaches only the region Rd, which is the seat position of D.
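For reference, the control flow of steps S10 to S22 above can be summarised as a single processing pass. The helpers below are hypothetical stand-ins for the microphone array, camera unit, and speaker array described in this embodiment, implemented as stubs returning toy data so the outline is runnable; it is a sketch of the sequence, not an interface to real hardware.

# Outline of steps S10-S22 as one processing pass, using stub functions in
# place of the real microphone array, camera unit, and speaker array.
# All values returned by the stubs are toy data for illustration only.

def detect_voice():                      # S10/S12: microphone array 11
    return {"speaker": "A", "position": (0.0, 1.5), "signal": [0.0] * 480}

def detect_face_direction(speaker):      # S14: direction detection unit 13
    return -45.0                         # degrees, toy value

def find_designated_listener(position, face_angle):   # S16: control unit 10
    return "D", (1.5, 0.0)

def compute_sound_field(speaker_pos, listener_pos):   # S18
    return {"amplify_region": "Rd", "factor": 2.0}

def field_to_parameters(sound_field):    # S20: per-unit delays and gains
    return {"delays": [0, 3, 6, 9], "gains": [0.5, 0.5, 0.5, 0.5]}

def emit(parameters, signal):            # S22: speaker array 12
    print("emitting with", parameters)

def run_once():
    event = detect_voice()
    angle = detect_face_direction(event["speaker"])
    listener, listener_pos = find_designated_listener(event["position"], angle)
    field = compute_sound_field(event["position"], listener_pos)
    params = field_to_parameters(field)
    emit(params, event["signal"])

run_once()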
[0019]
Thus, according to the present operation example, the speaker and the listener can be identified, and the amplified voice of the speaker can be delivered to the designated listener, which helps the designated listener to follow the speech. On the other hand, the speaker's voice reaches the participants other than the designated listener, who are presumed not to be directly involved in the conversation between the speaker and the designated listener, at the normal volume. This makes it possible to provide an appropriate sound to each participant.
[0020]
(Operation Example 2) Next, consider a case where conversations are taking place between a plurality of speakers and listeners, with reference to FIG. 5. As shown in the figure, it is assumed that conversations are in progress between A and D and between B and G (conversation X and conversation Y, respectively). In this situation, consider the participant J. Since J is not a party to either conversation, the speech of conversations X and Y reaches J at the normal volume, as described above. Suppose that J wants to listen in on one of the two conversations; in this state, however, the voices of the two conversations are mixed when they reach J's ears, so it is difficult for J to follow either of them.
[0021]
FIG. 6 is a flowchart showing this operation example. Steps S10 to S18 are the same as in Operation Example 1. That is, with the speakers being A and G and the listeners being D and B, the processes of steps S10 to S16 are performed twice, once for each of conversations X and Y, and a sound field that assists the hearing of only the listeners D and B is calculated (step S18). In this operation example, the sound field is then corrected further. Specifically, the direction detection unit 13 detects the direction of J's face (step S19A). As shown in the figure, since J wants to hear conversation Y rather than conversation X, J turns his face in the direction of G, the speaker of conversation Y. Therefore, from the position of J and the detected direction of J's face, the control unit 10 determines that the participant in that direction is G, and recalculates the sound field so that G's voice is amplified by a predetermined amplification factor and delivered to the area Rj, which is the listening area of J (step S19B). The subsequent processing is the same as in Operation Example 1, so its description is omitted.
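A compact way to express the correction of steps S19A and S19B is: for each non-party, determine which active speaker lies in the direction of that participant's face (for example, by the bearing-matching shown in the earlier sketch) and raise that speaker's gain in the participant's listening area. The per-area gain table and boost factor below are illustrative assumptions, not structures from the patent text.

# Sketch of the correction in steps S19A/S19B: given which active speaker each
# non-party is facing, raise that speaker's gain in the non-party's area.
def correct_sound_field(area_gains, faced_speaker_by_area, boost=2.0):
    """area_gains: dict area -> {speaker: gain}; returns the corrected table."""
    for area, faced in faced_speaker_by_area.items():
        if faced is None:          # empty area or no clear facing direction
            continue               # leave the normal (1.0) gains untouched
        area_gains[area][faced] = boost
    return area_gains

# Region Rj: J faces G, so G's voice is doubled in Rj while A's stays normal.
field = {"Rj": {"A": 1.0, "G": 1.0}, "Rk": {"A": 1.0, "G": 1.0}}
print(correct_sound_field(field, {"Rj": "G", "Rk": None}))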
[0022]
In this example, attention was focused on the participant J for convenience of explanation, but it goes without saying that the sound field may be corrected in consideration of the face directions of all the participants who are non-parties to the conversations. When sound field correction is performed only for specific non-party areas, the method of designating those non-parties is arbitrary. For example, if an image captured by the direction detection unit 13 shows no participant, it is known that no participant exists at the position corresponding to that imaging region, and the process of correcting the sound field for that region can be omitted. In addition, when assistance is given only to specific non-parties, that is, when particular listeners are designated, the information on whether each participant needs hearing assistance when he or she is a non-party to a conversation may, as an example, be input in advance to the listener support apparatus 1 via the input unit 15 before the conference starts; based on this information, the control unit 10 identifies the non-parties requiring hearing assistance and performs the above-described sound field correction only for those non-parties. As described above, according to the present operation example, the sound that reaches a participant who is not a party to a conversation is adjusted according to the direction in which that participant's face is turned, so the participant can be assisted in hearing the conversation he or she desires.
[0023]
(Operation Example 3) Next, consider a case where a plurality of speakers talk to one participant (that is, one designated listener). This is shown in FIG. 7. As shown in the figure, there are three speakers A, D, and F, all of whom turn their faces toward J and talk to J at the same time. The voices of A, D, and F reach J mixed together, so it is very difficult for J to follow any one of the conversations. In the following, only the sound reaching J, that is, the sound field formed in the region Rj, is considered. In this operation example, it is assumed that only J has been designated in advance, via the input unit 15, as a participant whose hearing is to be assisted.
[0024]
This operation example will be described with reference to FIG. 8. As in Operation Example 2, the processing of steps S10 to S16 is performed three times, for the pairs A-J, D-J, and F-J, to calculate the sound field (step S18), and the direction of J's face is then detected and the sound field corrected (steps S19A and S19B) in the same way. In this operation example, the sound field is additionally corrected in step S19C so that the volume level of the voice emitted by each speaker whom J is not facing is attenuated by a predetermined attenuation rate before reaching J. According to this operation example, the voice of the speaker in the direction the listener J is facing is amplified before reaching J, while the voices of the speakers in directions J is not facing are attenuated before reaching J. That is, regardless of the orientations of the faces of the speakers A, D, and F, if the listener J selects the speaker he wants to hear and turns his face in that speaker's direction, he can listen to that speaker's voice selectively. This provides the listener with a suitable acoustic environment.
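Operation Example 3 adds step S19C, which attenuates the voices of the speakers the listener is not facing; combined with the amplification of the faced speaker, the per-speaker gains in region Rj could be assigned as in the sketch below. The amplification and attenuation factors are illustrative values, not figures from the patent.

# Sketch of step S19C: in the designated listener's area, amplify the speaker
# the listener faces and attenuate all other speakers talking to him.
def gains_in_listener_area(speakers, faced_speaker, amplify=2.0, attenuate=0.3):
    return {s: (amplify if s == faced_speaker else attenuate) for s in speakers}

# A, D and F all talk to J; J turns toward D, so only D's voice is boosted.
print(gains_in_listener_area(["A", "D", "F"], faced_speaker="D"))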
[0025]
(Modifications) In Operation Examples 2 and 3, the speaker that a listener wants to hear is determined by imaging the listener, performing image analysis, and detecting the direction of the listener's face. However, the method of determining the speaker desired by the listener is not limited to this. For example, a pointing device with a built-in geomagnetic sensor, gravity sensor, or the like may be provided to every participant and used by each participant to designate a desired speaker. Specifically, when a participant points the pointing device at a desired speaker, the direction in which the device is pointed is detected by the built-in sensor, and information on that direction is transmitted from the pointing device to the direction detection unit 13 by wireless communication. By using the pointing device in this way, the listener can designate a speaker regardless of the direction of his or her face, so it is possible, for example, to take notes while facing downward and still listen to the speech.
[0026]
In the above embodiment, the positions of the participants are assumed to be approximately fixed, but the present invention is also applicable when the participants move. In this case, as an example, a sensor device with a wireless communication function is attached to the clothes of every participant, and the listener support apparatus 1 is likewise provided with a wireless communication function. The sensor devices detect the current positions of all the participants in real time and transmit them sequentially to the listener support apparatus 1 by radio. Upon receiving the information on each participant's position, the listener support apparatus 1 updates that participant's listening area. With such a configuration, the listener support apparatus 1 can grasp the positions of all the participants in real time, so the position of the designated listener can be determined from the direction of the speaker's face as in the above-described embodiment, and the listening area of the listener who is to be provided with sound to which a predetermined acoustic effect, such as an increase or decrease of the volume level of the speaker's voice, is applied can be specified.
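When the participants can move, the correspondence between each participant and his or her listening area has to be refreshed as the wireless position reports arrive. A minimal sketch of such an update step follows; the report format and the fixed area radius are assumed for illustration only.

# Sketch: keep each participant's listening area centred on the most recent
# wirelessly reported position. The report format and the fixed area radius
# are illustrative assumptions, not part of the patent.
listening_areas = {}   # participant -> {"centre": (x, y), "radius": metres}

def on_position_report(report, radius=0.6):
    """report: e.g. {"participant": "J", "x": 1.2, "y": -0.8}"""
    listening_areas[report["participant"]] = {
        "centre": (report["x"], report["y"]),
        "radius": radius,
    }

on_position_report({"participant": "J", "x": 1.2, "y": -0.8})
on_position_report({"participant": "J", "x": 1.4, "y": -0.6})  # J moved
print(listening_areas["J"])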
[0027]
In Operation Example 1, the direction of the listener's face is not detected, on the assumption that the party D, to whom the speaker A is talking, is in conversation with A. In practice, however, it is possible, for example, that D is talking to J even though A is talking to D. When the intention of the speaker and the intention of the designated listener do not agree in this way, that is, for a person who is a party to two or more conversations and is both a listener and a speaker, various ways of providing the acoustic effect are conceivable. For example, if priority is given to the speaker's intention as in Operation Example 1, the voice of A, who is talking to D, may be amplified and provided to D in the region Rd regardless of the direction of D's face. Conversely, if priority is to be given to the intention of the listener D, the voice of the speaker A may be provided to D at the normal volume without amplification. Alternatively, A's voice may be amplified and provided to D at an amplification factor lower than the predetermined one.
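The alternatives in this paragraph can be viewed as a small policy choice applied when the speaker's and the designated listener's intentions conflict. The sketch below encodes the three options as one function; the factor values are illustrative assumptions.

# Sketch: choosing the gain for D's area when A talks to D but D faces J.
# The three policies mirror the alternatives above; factors are illustrative.
def gain_for_conflicted_listener(policy, normal=1.0, full=2.0, reduced=1.3):
    if policy == "speaker_priority":    # amplify regardless of D's face
        return full
    if policy == "listener_priority":   # leave A's voice at normal volume
        return normal
    if policy == "compromise":          # amplify, but by a smaller factor
        return reduced
    raise ValueError(policy)

print(gain_for_conflicted_listener("compromise"))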
[0028]
In the above embodiment, a predetermined amplification factor is used when the volume is corrected. However, there are individual differences in the loudness of speakers' voices; for example, a speaker with a loud voice may not need amplification, whereas for a speaker with a very quiet voice it may be better to set the amplification factor higher than usual. In such cases, a predetermined upper limit may be set for the volume reaching the listener, and if the amplified volume exceeds this threshold, the upper limit is used as the delivered volume. Similarly, if the amplified voice falls below a predetermined lower limit, the lower limit can be used as the delivered volume. By setting a predetermined upper limit and lower limit in this way, it is possible to provide the listener with sound at an appropriately corrected volume regardless of the loudness of the speaker.
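The clamping described here amounts to applying the amplification factor and then limiting the result to a fixed range. A one-function sketch, with hypothetical limit values expressed as linear levels:

# Sketch of the volume correction with upper and lower limits ([0028]).
# Levels are linear amplitudes here; the limit values are hypothetical.
def corrected_level(speaker_level, amplification, lower=0.2, upper=1.0):
    amplified = speaker_level * amplification
    return min(max(amplified, lower), upper)   # clamp into [lower, upper]

print(corrected_level(0.8, 2.0))   # loud speaker: clamped down to 1.0
print(corrected_level(0.05, 2.0))  # quiet speaker: raised to the 0.2 floor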
[0029]
Further, in the above embodiment, the microphone array and the speaker array are used to identify the speaker and to provide the desired voice to the desired listener, but it is not always necessary to arrange the microphones and speakers in an array. As for the microphones, the installation position of each microphone unit is arbitrary as long as the speaker (sound source) can be identified. As for the speakers, the installation positions of the speaker units are arbitrary as long as sounds with different acoustic effects can be provided to the positions of the respective participants (listeners). Further, instead of generating a directional acoustic wave from a fixed speaker array, each speaker unit may be provided with a drive mechanism that changes its orientation as appropriate. The point is that the sound field around the listener can be controlled based on information on the positions and orientations of the speaker and the listener.
[0030]
In the above embodiment, the listener's hearing is assisted by amplifying or attenuating the volume level of the sound reaching the target listener, but the method of assisting the listener, that is, the acoustic effect provided to the listener, is not limited to increasing or decreasing the volume level. For example, the frequency characteristics of the audio signal may be changed as long as this makes it easier for the listener to hear.
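As one concrete example of such a change in frequency characteristics, the band most important to speech intelligibility (roughly 1-4 kHz) could be emphasised before the signal is sent to the speaker array. The SciPy-based sketch below only illustrates that idea; the sample rate, band edges, and boost amount are assumptions, not parameters given in the patent.

# Sketch: emphasise the 1-4 kHz band of the speaker's voice to aid a listener.
# Sample rate, band edges and boost amount are illustrative assumptions.
import numpy as np
from scipy.signal import butter, sosfilt

def emphasise_speech_band(signal, sample_rate=16000, boost=0.5):
    """Add a band-passed copy of the signal back onto itself."""
    sos = butter(2, [1000, 4000], btype="bandpass", fs=sample_rate, output="sos")
    return signal + boost * sosfilt(sos, signal)

# Example: one second of toy audio.
voice = np.random.default_rng(0).standard_normal(16000) * 0.1
print(emphasise_speech_band(voice).shape)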
[0031]
FIG. 1 is a diagram showing the arrangement of the speakers and listeners.
FIG. 2 is a diagram showing the functional configuration of the listener support apparatus 1 of the present invention.
FIG. 3 is a diagram showing the positional relationship between a speaker and a listener.
FIG. 4 is a flowchart showing an operation example of the present invention.
FIG. 5 is a diagram showing the positional relationship between speakers and listeners.
FIG. 6 is a flowchart showing an operation example of the present invention.
FIG. 7 is a diagram showing the positional relationship between speakers and listeners.
FIG. 8 is a flowchart showing an operation example of the present invention.
Explanation of Reference Signs
[0032]
DESCRIPTION OF SYMBOLS: 1... listener support apparatus, 10... control unit, 11... microphone array, 12... speaker array, 13... direction detection unit, 14... storage unit, 15... input unit, 16... bus.