Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2008113118
The present invention provides a sound reproduction system in which the sense of localization of a virtually generated sound image is not impaired even when the positional relationship of the listener's head changes or the head moves. SOLUTION: Two speakers 11SW1 and 11SW2 are held at predetermined positions near both ears of the listener 4 without contacting the ears. An input audio signal is subjected to virtual sound source processing using head-related transfer functions and supplied to the two speakers. Detection means are provided for detecting the positional relationship of the listener's head with respect to the two speakers and/or the movement of the listener's head, and a head-related transfer function storage unit stores a plurality of head-related transfer functions corresponding to such positional relationships and/or movements. The head-related transfer function corresponding to the positional relationship and/or movement detected by the detection means is read out from the head-related transfer function storage unit and used for the virtual sound source processing. [Selected figure]
Figure 1
Sound reproduction system and sound reproduction method
[0001]
The present invention relates to an acoustic reproduction system and an acoustic reproduction
method for acoustically reproducing an audio signal subjected to virtual sound source
processing.
[0002]
An audiovisual reproduction system called a home theater system is in widespread use.
In such a video and audio reproduction system, video reproduced from, for example, a DVD (Digital Versatile Disc) is displayed on a relatively large screen display, while the sound is reproduced by a multi-channel surround sound system; recently a 5.1-channel system has been widely adopted to enable powerful video and audio reproduction.
[0003]
A 5.1-channel sound reproduction system requires speakers in front of the listener on the left and right (hereinafter, the front speakers), in front at the center (hereinafter, the center speaker), and behind the listener (hereinafter, the rear speakers), plus a subwoofer. The subwoofer is a speaker dedicated to the low frequency band and is originally assigned a monaural band of 100 Hz or less, while the other speakers are responsible for 100 Hz to 20 kHz.
[0004]
Conventionally, the speaker arrangement in a 5.1-channel sound reproduction system is as
shown in FIG. 22. That is, as shown in FIG. 22, the front left channel speaker 10FL is disposed on the
left side in front of the listener, the front right channel speaker 10FR on the right side, and the
center channel speaker 10C at the front center.
[0005]
Further, the rear left channel speaker 10RL is disposed on the left side behind the listener 4, and
the rear right channel speaker 10RR is disposed on the right side. Furthermore, a subwoofer
speaker 10SW for low frequency effect (LFE) channel (low frequency dedicated) is disposed at an
appropriate position.
[0006]
These six speakers 10FL, 10FR, 10C, 10RL, 10RR and 10SW are each mounted in a speaker box and arranged at the respective positions. Usually, the six speakers are arranged at a distance ds from the listener 4 of, for example, about 2 meters.
[0007]
In recent sound reproduction systems, the front left and right channel speakers, which conventionally used speaker boxes of about 15 liters, have been replaced by small boxes of about 1 liter, also called satellite speakers. Such small boxes naturally cannot reproduce low frequencies, so a low-frequency dedicated speaker called a subwoofer is added to compensate. When the speakers other than the subwoofer are small boxes, the crossover frequency of the audio signal supplied to the subwoofer 10SW is often 150 Hz, slightly higher than the 100 Hz mentioned above, but the subwoofer still handles a considerably low frequency range.
[0008]
When the 5.1-channel audio signal from a DVD is reproduced by the speaker system arranged as described above, sufficient bass is naturally reproduced. Moreover, since the reproduction side has a channel dedicated exclusively to the low-frequency range, a powerful sense of realism can be obtained, with deep bass of a kind rarely heard in an ordinary room, such as that of a movie theater, sounding in the room.
[0009]
However, in Japan, where houses are small, there is the problem that the space for arranging the six speakers described above for multi-channel surround reproduction cannot be secured, and there is also the problem of noise caused by sound leaking to the outside.
[0010]
That is, in a normal 5.1 channel speaker configuration, a sound volume of about 90 dB or more is
required to reproduce powerful sound in DVD audiovisual appreciation.
Therefore, when the listener tries to obtain the full effect of multi-channel surround, the problem of noise leaking to the outside must be considered.
[0011]
In general, high-frequency sounds are easy to isolate, and considerable attenuation can be obtained with a wall or a door. However, low-frequency sounds of about 100 Hz or less cannot be insulated so easily, and in small Japanese houses the rooms are not large enough to block them. In particular, the 50 Hz and 40 Hz bass from the subwoofer is transmitted over a considerable range.
[0012]
For this reason, when sound is reproduced from the subwoofer, it may reach not only the rooms above and below but also the room next door and cause trouble. Soundproofing is particularly difficult for low-frequency sounds, so subwoofers are a major problem under Japanese housing conditions, and at present the 5.1-channel sound reproduction system cannot be used to its full potential.
[0013]
In order to solve this problem, Patent Document 1 (Japanese Patent Laid-Open No. 5-95591) proposes a sound reproduction system in which medium- and high-pitched sounds are reproduced by small speakers (of the type in which a speaker unit is housed in a speaker box), while low-pitched sound is reproduced near the listener's ears by headphones or bone conduction.
[0014]
According to the technique of Patent Document 1, since the bass is reproduced in the vicinity of the listener's ears by headphones or bone conduction, the listener can hear it at a large volume while preventing it from being transmitted to the neighboring house.
[0015]
The above-mentioned patent document is as follows.
Japanese Patent Laid-Open No. 5-95591
[0016]
However, in the invention of Patent Document 1 described above, the low-range sound reproduced in the vicinity of the ear comes not from a speaker but from a vibrating body, namely headphones or a bone conduction device.
Although there are individual differences, it is generally not accepted that a vibrating body other than a speaker can give a sense of bass equivalent to that of a speaker. In addition, the listener has to wear headphones or a bone conduction headset, which is bothersome.
[0017]
Furthermore, although the invention of Patent Document 1 alleviates the noise problem relating to low-range sounds, the problem of having to place a large number of speakers in a narrow space is not solved by it.
[0018]
In view of this point, the applicant previously proposed, in Japanese Patent Application No. 2006-24302 (filed on February 1, 2006), a sound reproduction system in which two speakers are arranged near the listener's left and right ears with the listener's head between them, and an input audio signal is subjected to virtual sound source processing and supplied to the two speakers so that, when the sound is reproduced by these speakers, the listener hears it as if it were emitted from other speaker devices.
[0019]
In the previously proposed invention, the two speakers are held in the vicinity of the listener's ears, so that the listener can hear the sound at a large volume even when the speaker output is not large.
For this reason, the sound transmitted to the neighboring house is reduced.
[0020]
Furthermore, in the previously proposed invention, for example the front-channel and rear-channel sounds of the multi-channel surround audio are subjected to virtual sound source processing and supplied to these two speakers, which then reproduce them as front-channel and rear-channel sound.
This makes it possible to eliminate the need for dedicated front-channel and rear-channel speakers.
[0021]
In the previously proposed invention, however, the sound image created by reproducing the audio signal subjected to virtual sound source processing through the two speakers is generated virtually on the precondition that the listener sits midway between the two speakers and does not move. This is based on the assumption that in most cases the listener is watching the picture and therefore hardly changes the position of the head or the direction of the face (the direction of the head).
[0022]
In reality, however, the listener often changes the direction of the face (head), temporarily shifts the head toward either the left or the right speaker, or moves the head back and forth. When such a change in the position or movement of the listener's head with respect to the two speakers occurs, there is the problem that the sense of localization of the virtually created sound image is lost.
[0023]
In view of this problem, it is an object of the present invention to provide a sound reproduction system in which the sense of localization of the virtually generated sound image is not impaired even when the positional relationship of the listener's head with respect to the speakers changes or the head moves.
[0024]
In order to solve the above problems, the present invention provides a sound reproduction system comprising: two speakers; holding means for holding the two speakers at predetermined positions near both ears of the listener without contacting the ears; detection means for detecting the positional relationship of the listener's head with respect to the two speakers and/or the movement of the listener's head; audio signal output means for performing virtual sound source processing on an input audio signal using head-related transfer functions, so that when the sound is reproduced by the two speakers the listener hears it as if it were emitted from other speaker devices, and for supplying the processed audio signal to the two speakers; a head-related transfer function storage unit that stores a plurality of head-related transfer functions according to the positional relationship of the listener's head with respect to the two speakers and/or the movement of the listener's head; and control means for reading out from the head-related transfer function storage unit the head-related transfer function according to the positional relationship and/or movement detected by the detection means and supplying it to the audio signal output means.
[0025]
In the present invention, the head-related transfer function storage unit stores a plurality of head-related transfer functions according to the positional relationship of the head of the listener with respect to the two speakers and/or the movement of the head of the listener.
[0026]
The detection means detects the positional relationship of the listener's head with respect to the two speakers and/or the movement of the listener's head, and passes the detection result to the control means.
The control means reads out from the head-related transfer function storage unit the head-related transfer function according to the detected positional relationship and/or movement and supplies it to the audio signal output means.
[0027]
The audio signal output means performs virtual sound source processing on the audio signal using the head-related transfer function passed from the control means and supplies the processed signal to the two speakers.
[0028]
Therefore, even when the listener changes the position or movement of the head with respect to the two speakers, the audio signal output means generates the virtual-sound-source-processed audio signal using the head-related transfer function that corresponds to the position and movement state after the change, and supplies the generated signal to the two speakers.
As a result, the reproduced sound can always be enjoyed without the virtual sound image losing its sense of localization.
[0029]
According to the present invention, a sound reproduction system can be provided in which the sense of localization of the virtual sound image produced by the audio signal subjected to virtual sound source processing using head-related transfer functions is not impaired even when the listener changes the head position with respect to the two speakers or moves the head.
[0030]
Hereinafter, several embodiments of the sound reproduction system according to the present
invention will be described with reference to the drawings, taking the case of sound reproduction
of the 5.1 channel multichannel surround sound described above as an example.
[0031]
[First Embodiment] The first embodiment is an example in which video viewing and 5.1-channel surround sound listening are performed using a video signal and an audio signal reproduced by a DVD player, and using a digital broadcast signal received by a television receiver.
[0032]
FIG. 1 is a diagram showing an outline of a sound reproduction system according to the first
embodiment.
[0033]
As shown in FIG. 1, the sound reproduction system according to the first embodiment comprises: a television receiver 1 serving as a video monitor apparatus and having two speakers 11FL and 11FR; a DVD player 2; an audio signal output device unit 3; two speakers 11SW1 and 11SW2 provided in the vicinity of both ears of the listener 4; an imaging unit 5, in this example mounted on the television receiver 1, for imaging the listener 4 and the two speakers 11SW1 and 11SW2; marker means 6L and 6R for making the positions of the speakers 11SW1 and 11SW2 easy to recognize in the captured image; and an audio signal receiving unit 7.
[0034]
In the first embodiment, two speakers 11FL and 11FR of the television receiver 1 are used for
reproducing the front left and right two channels of 5.1 channel surround sound.
The two speakers 11FL and 11FR may be incorporated in the television receiver 1, or may be
provided separately from the television receiver 1.
[0035]
Further, in the first embodiment, the two speakers 11SW1 and 11SW2 provided near both ears of the listener 4 are used for reproducing the low-frequency audio (LFE channel) of the 5.1-channel surround sound, that is, they serve as subwoofers.
In addition to the low-frequency audio signal LFE, the audio signals of the rear left and right two channels of the 5.1-channel surround sound are subjected to virtual sound source processing in the audio signal output device unit 3 and supplied to the two speakers 11SW1 and 11SW2 serving as the subwoofers.
[0036]
The television receiver 1 has a function of receiving a digital broadcast signal, reproduces the video signal and audio signal of a digital broadcast program from the received signal, displays the reproduced video of the program on the display screen 1D of the television receiver 1, and acoustically reproduces the sound of the program from the speakers 11FL and 11FR.
[0037]
In this case, when the sound of the digital broadcast program is multi-channel surround sound, the reproduced sound emitted from the speakers 11FL and 11FR includes the front left and right two channels, the center channel, the rear left and right two channels, and so on.
[0038]
Then, in this embodiment, the audio signal Au1 received and reproduced by the television receiver 1 is supplied to the audio signal output device unit 3.
[0039]
The DVD player 2 reproduces and outputs the video signal and the audio signal recorded on the
DVD.
In this example, the video signal Vi reproduced by the DVD player 2 is supplied to the television
receiver 1, and a reproduced video by the reproduced video signal Vi is displayed on the display
screen 1D.
Further, the audio signal Au2 reproduced by the DVD player 2 is supplied to the audio signal
output unit 3 in this example.
[0040]
In this embodiment, the audio signal output device unit 3 has a decoding function corresponding to the 5.1-channel multi-channel surround audio system. When the audio of the digital broadcast program received by the television receiver 1 is reproduced as 5.1-channel surround sound, the unit 3 generates the audio signals to be supplied to the first and second speakers 11SW1 and 11SW2 provided near the ears of the listener 4.
The audio signal output device unit 3 then multiplexes the generated audio signals for the first and second speakers 11SW1 and 11SW2 and, in this example, transmits them wirelessly by radio waves to the audio signal receiving unit 7.
[0041]
The audio signal receiving unit 7 receives the radio wave from the audio signal output device unit 3, extracts the multiplexed audio signal from the received radio wave, demultiplexes it into the audio signal for the first speaker 11SW1 and the audio signal for the second speaker 11SW2, and supplies these to the first speaker 11SW1 and the second speaker 11SW2, respectively.
[0042]
Note that wireless transmission from the audio signal output device unit 3 is not limited to radio
waves, and ultrasonic waves or light may be used.
[0043]
Further, when reproducing the video and audio reproduced by the DVD player 2, the audio signal output device unit 3 generates not only the audio signals supplied to the first and second speakers 11SW1 and 11SW2 provided near the ears of the listener 4, but also the audio signals to be supplied to the two speakers 11FL and 11FR for the left and right front channels of the television receiver 1, and supplies each signal to the corresponding speaker.
[0044]
In the first embodiment, the audio signal output device unit 3 supplies, to the two speakers 11FL and 11FR for the left and right front channels of the television receiver 1, the sum of the front left channel audio signal L and the center channel audio signal C, and the sum of the front right channel audio signal R and the center channel audio signal C, respectively.
[0045]
Further, the audio signal output device unit 3 supplies, to the two speakers 11SW1 and 11SW2 near both ears of the listener 4, the sum of the rear left channel audio signal RL* subjected to so-called virtual sound source processing, described later, and the low-frequency audio signal LFE, and the sum of the rear right channel audio signal RR* subjected to virtual sound source processing and the low-frequency audio signal LFE, respectively.
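As an illustrative sketch only (not part of the patent disclosure), the signal routing described in the two preceding paragraphs can be summarized as four simple sums; all function and variable names below are assumptions, and the toy test signals merely show the shapes line up.

```python
import numpy as np

def mix_outputs(L, R, C, RLv, RRv, LFE):
    """Form the four speaker feeds described above from the 5.1 decoder outputs.
    RLv/RRv stand for the rear channels already subjected to virtual sound
    source processing (RL*, RR*)."""
    front_left  = L   + C    # to speaker 11FL of the television receiver
    front_right = R   + C    # to speaker 11FR
    near_left   = RLv + LFE  # to speaker 11SW1 at the listener's left ear
    near_right  = RRv + LFE  # to speaker 11SW2 at the listener's right ear
    return front_left, front_right, near_left, near_right

# toy one-second test tones at 48 kHz, only to exercise the function
fs, n = 48000, 48000
tone = lambda f: np.sin(2 * np.pi * f * np.arange(n) / fs)
outs = mix_outputs(tone(440), tone(660), tone(550), tone(220), tone(330), tone(60))
```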
[0046]
The marker unit 6L is installed near the speaker 11SW1 so that the position of the speaker
11SW1 on the left ear side of the listener 4 can be specified, as described later.
Similarly, the marker unit 6R is installed near the speaker 11SW2 so that the position of the
speaker 11SW2 on the right ear side of the listener 4 can be specified as described later.
[0047]
The configuration of the imaging unit 5 will be described in detail later; in this example it comprises a CCD (Charge Coupled Device) image sensor or a CMOS (Complementary Metal Oxide Semiconductor) image sensor as the imaging device, and an LED (Light Emitting Diode) as a light source for irradiating the listener 4 with near-infrared light.
[0048]
The listener 4 is irradiated with near-infrared light so that the reflection from the listener's pupils can be recognized easily, as described later.
[0049]
The imaging unit 5 images an area including the listener 4 and the marker units 6L and 6R, and in this example sends the captured image data CAM to the audio signal output device unit 3 through a connection cable.
Needless to say, signal transmission between the imaging unit 5 and the audio signal output device unit 3 may also be performed wirelessly, as with the signal transmission between the audio signal output device unit 3 and the audio signal receiving unit 7.
[0050]
The audio signal output device unit 3 analyzes the captured image data CAM received from the
imaging unit 5 and detects the position of the listener 4 with respect to the speakers 11SW1 and
11SW2 and the direction of the face (head) as described later. The detection result is reflected in
virtual sound source processing.
[0051]
That is, the virtual sound source processing uses head-related transfer functions, which are obtained in advance and stored in the audio signal output device unit 3.
Conventionally, it is assumed that the listener 4 does not move, so only the head-related transfer functions for the state in which the listener 4 is fixed at the center position between the speakers 11SW1 and 11SW2, facing the screen of the television receiver 1 in front, are stored.
[0052]
In this embodiment, on the other hand, head-related transfer functions are determined more finely, for various positions of the listener 4 with respect to the speakers 11SW1 and 11SW2 and for various directions of the listener's face, and are held in a storage unit.
The audio signal output device unit 3 then detects the position of the listener 4 with respect to the speakers 11SW1 and 11SW2 and the direction of the face from the analysis result of the captured image data CAM received from the imaging unit 5, reads out the head-related transfer function corresponding to the detection result from the storage unit, and performs virtual sound source processing using the read-out head-related transfer function.
[0053]
Thus, in this embodiment, even when the listener 4 moves the head, the reproduced sound can be heard without losing the sense of localization.
The above process will be described in more detail later.
[0054]
[Speaker Arrangement Example According to First Embodiment] Next, FIG. 2 illustrates a speaker
arrangement example in the sound reproduction system according to this embodiment described
above.
[0055]
As shown in FIG. 2, in this embodiment the front left channel speaker 11FL is disposed in front of the listener 4 on the left side, and the front right channel speaker 11FR on the right side.
[0056]
Since the speakers 11FL and 11FR are built into the television receiver 1 in this example, the speaker units 13FL and 13FR are attached to the front sides of the small speaker boxes 12FL and 12FR (for example, the front panel of the television receiver), which serve as baffle plates.
The speakers 11FL and 11FR are hereinafter referred to as the front speakers when it is not necessary to distinguish the channels.
[0057]
In this embodiment, the two speakers 11SW1 and 11SW2 are arranged in the vicinity of the left and right ears of the listener 4, across the head of the listener 4, so that their diaphragms face the respective ears.
These two speakers 11SW1 and 11SW2 are neither housed in speaker boxes nor attached to baffle plates, so that the sound radiated from the front and the back of the diaphragm of each speaker unit can mix.
[0058]
In this embodiment, the low-frequency audio signal of the LFE channel is supplied in common to the two speakers 11SW1 and 11SW2 near both ears of the listener 4, so that the low-range sound of the LFE channel is emitted in phase from the speakers 11SW1 and 11SW2.
Therefore, in this embodiment the speakers 11SW1 and 11SW2 act as subwoofers, and hereinafter they are referred to as subwoofers.
[0059]
With this configuration, the low-frequency sound of the LFE channel is emitted in the vicinity of both ears of the listener 4, so the listener 4 hears it at a large volume; at positions away from the listener 4, however, the sounds radiated from the front and the back of the diaphragms of the speaker units of the subwoofers 11SW1 and 11SW2 are 180 degrees out of phase and cancel each other, so the sound is hardly heard.
As a result, the situation in which the low-range sound propagates to the neighboring house and causes trouble, as in the prior art, can be prevented.
[0060]
In order to confirm the attenuation of the low-range sound, the sound from a subwoofer speaker unit 11SW with a diameter of, for example, 17 centimeters was picked up in an anechoic chamber, as shown in FIG. 3, by a microphone 14 placed at a distance d from the speaker unit 11SW, and the frequency characteristic of the sound pressure level was measured. The results are shown in FIG. 4.
In this case, the speaker unit 11SW was neither housed in a box nor attached to a baffle plate.
[0061]
The four frequency characteristic curves in FIG. 4 correspond to distances d between the speaker unit 11SW and the microphone 14 of d = 10 centimeters, d = 20 centimeters, d = 40 centimeters and d = 80 centimeters, respectively.
[0062]
FIG. 4 shows that when the speaker unit is not housed in a box, sound of 1 kHz or less is considerably attenuated, and the amount of attenuation is particularly large at lower frequencies.
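As a rough physical illustration of this behaviour (not the measurement itself), an unbaffled cone can be modelled as two opposite-phase point sources whose radiation cancels at low frequencies. The effective front-to-back path length s below is an assumed value, and the model is only a sketch of the trend seen in FIG. 4.

```python
import numpy as np

c = 343.0      # speed of sound [m/s]
s = 0.12       # assumed effective path from the front to the back of the cone [m]
freqs = np.array([20, 40, 100, 300, 1000, 3000])   # Hz

def dipole_level_db(r, f):
    """Level of two opposite-phase point sources observed at distance r,
    relative to a single (baffled) source at the same distance."""
    k = 2 * np.pi * f / c
    p_pair = np.exp(-1j * k * r) / r - np.exp(-1j * k * (r + s)) / (r + s)
    p_mono = np.exp(-1j * k * r) / r
    return 20 * np.log10(np.abs(p_pair) / np.abs(p_mono))

for d in (0.10, 0.20, 0.40, 0.80):   # the microphone distances of FIG. 4
    print(d, np.round(dipole_level_db(d, freqs), 1))
```

The printed values fall off steeply toward low frequencies, consistent with the observation that the unboxed unit attenuates the low range most strongly.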
[0063]
In this embodiment, the respective distances dsw between the two subwoofers 11SW1 and 11SW2 and the left and right ears of the listener 4 are therefore set to a distance at which the low-range sound still reaches the ears of the listener 4 without excessive attenuation; in this example, dsw is about 20 cm.
[0064]
For example, whereas the distance between a subwoofer and the listener's ears is generally about 2 meters, in this embodiment the distance between each of the subwoofers 11SW1 and 11SW2 and the ears of the listener 4 is about 20 cm, that is, 1/10 of the conventional distance.
[0065]
For this reason, in this embodiment, the energy required for the listener 4 to feel the same sound pressure may be 1/100 of that in the general case described above.
That is, if an amplifier of 100 W (watts) were required in the general example above, in this embodiment the same sound pressure is perceived even with a 1 W amplifier.
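The 1/100 figure follows from simple distance scaling: sound pressure from a compact source falls roughly as 1/r, so the power needed for the same pressure at the ear scales with the square of the distance ratio. A minimal worked check:

```python
# pressure ~ 1/r  ->  required power ~ (r_new / r_old)**2
r_conventional = 2.0   # m, typical subwoofer-to-ear distance
r_embodiment   = 0.2   # m, speaker held near the ear
power_ratio = (r_embodiment / r_conventional) ** 2
print(power_ratio)            # 0.01, i.e. 1/100
print(100 * power_ratio)      # a nominal 100 W amplifier -> about 1 W
```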
[0066]
In this embodiment, moreover, quite apart from the reduction in the audio signal output supplied to the speakers, the spread of the sound itself is small: at low frequencies of, for example, 20 Hz, 30 Hz and 40 Hz, the front and rear radiation cancel in phase, so that almost no sound can be heard except in the immediate vicinity of the subwoofer speaker units.
Since the powerful sound effects contained in DVD software are recorded with a large amount of energy in this bass band, the soundproofing effect becomes all the greater.
[0067]
With the above configuration, a sufficient effect is obtained if one focuses only on the low-range sound and only the low-range sound is to be attenuated. Needless to say, the same soundproofing effect as described above is also obtained when the speakers 11SW1 and 11SW2 reproduce and emit sound other than the low-frequency sound.
[0068]
In the case of 5.1 channel surround sound, there are also center channel audio and rear left and
right two channel audio.
Conventionally, as shown by the dotted line in front of the listener 4 in FIG. 2, the center channel speaker 11C, in which the speaker unit 13C is attached to the front side of the speaker box 12C serving as a baffle plate, is arranged on the front side of the listener 4.
[0069]
Similarly, as shown by the dotted lines behind the listener 4 in FIG. 2, the rear left and right two-channel speakers 11RL and 11RR, in which the rear speaker units 13RL and 13RR are attached to the front faces of the small speaker boxes 12RL and 12RR serving as baffle plates, are conventionally arranged behind the listener.
[0070]
In this embodiment, however, no dedicated speakers 11C, 11RL and 11RR are provided for the center channel sound and the rear left and right two-channel sound; as described above, these sounds are reproduced by the two speakers 11FL and 11FR of the television receiver and the two speakers 11SW1 and 11SW2 near both ears of the listener 4.
[0071]
That is, the center channel audio signal C is added to the front left and right two-channel audio signals L and R supplied to the speakers 11FL and 11FR, and the speakers 11FL and 11FR reproduce the resulting sound. The rear left channel audio signal RL is converted into an audio signal RL* by virtual sound source processing and supplied to the speaker 11SW1 facing the left ear of the listener 4 for reproduction.
Likewise, the rear right channel audio signal RR is converted into an audio signal RR* by virtual sound source processing and supplied to the speaker 11SW2 facing the right ear of the listener 4 for reproduction.
[0072]
As described above, since the speakers 11SW1 and 11SW2 are disposed close to the ears of the listener 4, the radiated energy of the rear left and right two-channel audio signals RL and RR is also reduced, which contributes to soundproofing.
[0073]
The sound of the rear left and right two channels, processed as virtual sound sources, is reproduced by the subwoofers 11SW1 and 11SW2 arranged near the ears of the listener 4. Since the rear left and right channels mainly carry reverberation arriving from behind the listener 4, their localization is not so critical, so good surround sound can be obtained while reducing both the number of speakers and the noise to the surroundings.
[0074]
In addition, as described above, because the distance dsw between the subwoofers 11SW1 and 11SW2 and the ears of the listener 4 is about 20 cm, compared with the 2 meters of the general example, the required sound pressure is low; the same applies to the rear left and right two-channel audio signals RL and RR, so energy saving is achieved.
[0075]
As an example of a speaker arrangement that takes the above into consideration, the speakers may be installed on a chair with a structure like that of a massage chair, for example.
[0076]
FIG. 5 shows an example of such an arrangement, in which the two speakers 11SW1 and 11SW2 to be disposed near both ears of the listener 4, described above, are mounted on a chair.
[0077]
That is, in this example the chair 20 has a structure like an airplane business-class seat, and a speaker holder 22, to which the subwoofers 11SW1 and 11SW2 are attached and held, is mounted on the top 21a of the backrest 21 of the chair 20.
[0078]
And, in this embodiment, marker portions 6L and 6R are attached to the speaker holder 22
above the attachment positions of the subwoofers 11SW1 and 11SW2.
[0079]
FIGS. 6A and 6B illustrate an example of the speaker holder 22.
The speaker holder 22 is made of, for example, a pipe 221 of metal such as aluminum. As shown in FIG. 6B, the pipe 221 is formed into a flat ring shape, and the subwoofers 11SW1 and 11SW2 and the auxiliary subwoofers 11SW3 and 11SW4 are fixed and held in the space formed by the ring.
The marker portions 6L and 6R are attached above the portions of the speaker holder 22 that hold the subwoofers 11SW1 and 11SW2.
[0080]
The auxiliary subwoofers 11SW3 and 11SW4 compensate for a possible lack of power, because the low-range sound may be perceived as insufficient with only the subwoofers 11SW1 and 11SW2 arranged next to the listener's ears. These auxiliary subwoofers 11SW3 and 11SW4 are not essential.
[0081]
In this embodiment, only the low-range audio signal (LFE signal) is supplied to these auxiliary
subwoofers 11SW3 and 11SW4.
In the same manner as the subwoofers 11SW1 and 11SW2, these auxiliary subwoofers 11SW3
and 11SW4 may also be supplied with audio signals subjected to virtual sound source
processing.
[0082]
The pipe 221 is formed into a flat ring and, as shown in FIG. 6A, the ring-shaped portion is arranged in a substantially U-shape so as to surround the sides of the head (the left and right ear sides) and the back of the head, leaving open the direction in which the face of the listener 4 points.
[0083]
Further, attachment legs 222a and 222b for attaching the ring-shaped pipe 221 to the backrest 21 of the chair 20 are connected to the pipe, and the attachment legs 222a and 222b allow the holder to be attached to the backrest 21 of the chair 20, for example removably. That is, the top 21a of the backrest 21 of the chair 20 is provided with elongated holes (not shown) into which the attachment legs 222a and 222b are inserted and fitted, and the holder is mounted and fixed by inserting and fitting the attachment legs 222a and 222b into these holes.
[0084]
The subwoofers 11SW1 and 11SW2 are fixed to and held by the pipe 221 at positions facing the left and right ears of the listener 4 when the listener 4 sits on the chair with the U-shaped ring-shaped pipe 221 in place.
The auxiliary subwoofers 11SW3 and 11SW4 are fixed to and held by the pipe 221 at ring positions behind the head of the listener 4.
As described above, the marker portions 6L and 6R are attached above the portions of the speaker holder 22 that hold the subwoofers 11SW1 and 11SW2.
[0085]
In this example, when the listener 4 sits on the chair 20, the distance between each of the subwoofers 11SW1 to 11SW4 and the head (especially the ears) of the listener 4 is, for example, about 20 cm.
[0086]
The audio signals of the corresponding channels to the speakers 11SW1 to 11SW4 are
configured to be supplied from the audio signal output device unit 3 through the signal lines
(speaker cables) in this example.
Also in this example, the light emission control signal is supplied from the audio signal output
device unit 3 through the cables to the LEDs constituting the marker units 6L and 6R,
respectively.
[0087]
[Configuration Example of Audio Signal Output Device Unit 3 According to First Embodiment]
FIG. 7 is a block diagram showing a configuration example of the audio signal output device unit
3 according to the first embodiment.
The audio signal output device unit 3 in the first embodiment includes an audio signal processing
unit 300 and a control unit 100 formed of a microcomputer.
[0088]
The control unit 100 is built around a central processing unit (CPU) 101, to which a ROM (Read Only Memory) 103 storing a software program and the like, a RAM (Random Access Memory) 104 serving as a work area, a plurality of input/output ports 105 to 108, a user operation interface unit 110, a head-related transfer function storage unit 111, a captured image analysis unit 112, a marker unit drive unit 113 and the like are connected through a system bus 102.
The user operation interface unit 110 includes, in addition to a key operation unit provided directly on the audio signal output device unit 3, a remote commander and a remote control receiving unit.
[0089]
As described above, in this embodiment, the audio signal output device unit 3 can receive the
audio signal Au1 from the television receiver 1 and the audio signal Au2 from the DVD player 2.
The received audio signals Au1 and Au2 are supplied to the input selection switch circuit 301.
[0090]
The input selection switch circuit 301 is switched by a switching signal supplied through the input/output port 105 of the control unit 100 in response to the user's selection operation through the user operation interface unit 110.
That is, when the user selects the audio from the television receiver 1, the switch circuit 301 is switched to select the audio signal Au1; when the audio from the DVD player 2 is selected, the switch circuit 301 is switched to select the audio signal Au2.
[0091]
The audio signal selected by the switch circuit 301 is supplied to the 5.1-channel decoding unit 302.
The 5.1-channel decoding unit 302 receives the audio signal Au1 or Au2 from the switch circuit 301, performs channel decoding, and outputs the front left and right channel audio signals L and R, the center channel audio signal C, the rear left and right channel audio signals RL and RR, and the low-frequency audio signal LFE.
[0092]
The front left channel audio signal L and the center channel audio signal C from the 5.1-channel decoding unit 302 are supplied to the synthesizing unit 303 and combined, and the combined output audio signal (L + C) is led through the amplifier 305 to the audio output terminal 307.
The audio signal obtained at the output terminal 307 is supplied to one speaker 11FL of the television receiver 1.
[0093]
Similarly, the front right channel audio signal R and the center channel audio signal C from the 5.1-channel decoding unit 302 are supplied to the synthesizing unit 304 and combined, and the combined output audio signal (R + C) is led through the amplifier 306 to the audio output terminal 308.
The audio signal obtained at the output terminal 308 is supplied to the other speaker 11FR of the television receiver 1.
[0094]
The amplifiers 305 and 306 have a muting function to shut off the audio signal output, and are
configured to be muted by the muting signal through the input / output port 107 of the control
unit 100.
[0095]
In this embodiment, when the audio signal Au1 from the television receiver 1 is received, the audio reproduced by the television receiver 1 is emitted from its own speakers 11FL and 11FR, so the amplifiers 305 and 306 are muted and cut off so that the audio signal from the audio signal output device unit 3 is not supplied to the speakers 11FL and 11FR of the television receiver 1.
[0096]
On the other hand, when the audio signal Au2 from the DVD player 2 is received, the amplifiers 305 and 306 are not muted, and the audio signal from the audio signal output device unit 3 is supplied to the speakers 11FL and 11FR of the television receiver 1.
[0097]
It should be noted that, instead of the muting control of the amplifiers 305 and 306, the 5.1-channel decoding unit 302 may be configured not to decode and output the front left and right channel audio signals L and R and the center channel audio signal C when decoding the audio signal from the television receiver 1.
In that case, a control signal for that purpose may be supplied through the input/output port 106.
[0098]
Next, the rear left and right two-channel audio signals RL and RR decoded by the 5.1-channel decoding unit 302 are supplied to the rear transfer function convolution circuit 310, which serves as the virtual sound source processing unit.
[0099]
The rear transfer function convolution circuit 310 convolves, using a digital filter for example, the head-related transfer functions prepared in advance in the head-related transfer function storage unit 111 into the rear left and right two-channel audio signals RL and RR from the 5.1-channel decoding unit 302.
[0100]
Therefore, in the rear transfer function convolution circuit 310, when the input audio signal is not a digital signal it is converted into a digital signal, and after the head-related transfer functions have been convolved it is converted back into an analog signal and output.
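The patent does not give an implementation of circuit 310; as a rough software sketch only, a block of each rear channel could be convolved with measured impulse responses of the source-to-ear paths mentioned later (paragraph [0106]) and summed per ear. The function and parameter names are assumptions, and in the embodiment the processed feeds correspond to RL* for speaker 11SW1 and RR* for speaker 11SW2 (a simpler variant convolves each rear channel only with its own-ear response).

```python
import numpy as np
from scipy.signal import fftconvolve

def render_rear_channels(rl, rr, h_rl_to_l, h_rl_to_r, h_rr_to_l, h_rr_to_r):
    """Convolve the rear-channel signals with impulse responses of the
    rear-left/rear-right source to left/right ear paths and sum per ear."""
    left_ear  = fftconvolve(rl, h_rl_to_l)[:len(rl)] + fftconvolve(rr, h_rr_to_l)[:len(rr)]
    right_ear = fftconvolve(rl, h_rl_to_r)[:len(rl)] + fftconvolve(rr, h_rr_to_r)[:len(rr)]
    return left_ear, right_ear   # candidate RL*/RR*-style feeds for 11SW1 / 11SW2
```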
[0101]
In this example, the head-related transfer functions are measured and determined as follows and stored in the head-related transfer function storage unit 111.
FIG. 8 illustrates a method of measuring the head-related transfer functions.
[0102]
That is, as shown in FIG. 8A, the left channel measurement microphone 41 and the right channel measurement microphone 42 are installed in the vicinity of the left and right ears of the listener 4.
Next, the rear left channel speaker 11RL is placed at the position behind the listener 4 where a rear left channel speaker is usually arranged.
Then, for example, an impulse is acoustically reproduced by the rear left channel speaker 11RL, the emitted sound is picked up by the microphones 41 and 42, and from the picked-up signals the transfer functions from the rear speaker 11RL to the left and right ears (the head-related transfer functions for the rear left channel) are measured.
[0103]
Similarly, for the rear right channel speaker 11RR, an impulse is acoustically reproduced, the emitted sound is picked up by the microphones 41 and 42, and from the picked-up signals the transfer functions from the rear speaker 11RR to the left and right ears (the head-related transfer functions for the rear right channel) are measured.
[0104]
The head-related transfer functions are preferably measured and obtained as the transfer functions from each speaker to the ears when the rear speakers 11RL and 11RR are placed, for example, at a distance of 2 m and at 30 degrees to the left and right of the rear center of the listener 4.
[0105]
A further supplement concerning the transfer functions: for example, let the transfer function from the rear left position to the left ear in FIG. 8A be a transfer function A.
Next, the transfer function from the speaker 11SW1 near the left ear to the microphone 41 is measured, and the obtained transfer function is taken as a transfer function B.
Then a transfer function X is determined such that B multiplied by X equals A, and this transfer function X is convolved into the signal sent to the nearby speaker 11SW1; the sound emitted from the speaker 11SW1 is then perceived as if it came from the rear left at 2 m.
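To make the relation B · X = A concrete, one way (a sketch under my own assumptions, not the patent's implementation) is to compute X in the frequency domain as a regularized division of the two measured impulse responses; the small eps term below is an assumption added to avoid dividing by near-zero frequency bins.

```python
import numpy as np

def design_compensation_filter(a, b, n_fft=4096, eps=1e-3):
    """Given impulse responses A (rear position -> ear) and B (near speaker ->
    ear), return a filter x with B * x ~= A, i.e. X = A·conj(B) / (|B|^2 + eps)."""
    A = np.fft.rfft(a, n_fft)
    B = np.fft.rfft(b, n_fft)
    X = A * np.conj(B) / (np.abs(B) ** 2 + eps)
    return np.fft.irfft(X, n_fft)

# usage idea: convolve the returned filter with the rear-left channel signal
# before sending it to speaker 11SW1 near the left ear.
```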
[0106]
However, the transfer function X does not necessarily have to be determined; in some cases only the transfer function A may be used.
Although the above description has taken the transfer function from the rear left of the listener 4 to the left ear as a representative example, there are of course, as shown in FIG. 8A, a plurality of transfer functions, such as the transfer function from the rear left to the right ear of the listener 4, the transfer function from the rear right to the right ear, and the transfer function from the rear right to the left ear.
[0107]
When the audio signals RL* and RR* from the rear transfer function convolution circuit 310 are supplied to the speakers 11SW1 and 11SW2 arranged near both ears and reproduced, the listener 4 hears the reproduced audio as if it were emitted from rear left and right speakers 11RL and 11RR.
[0108]
The levels of the sound signals RL * and RR * of the rear left and right channels subjected to
virtual sound source processing at this time may be lower than the signal levels when supplied to
the speakers 11RL and 11RR.
This is because the speakers 11SW1 and 11SW2 are in the vicinity of the ear of the listener 4.
[0109]
In this specification, the above processing is called virtual sound source processing because
sound is heard so as to be emitted from a virtual speaker position by the above-described HRTF
convolution.
[0110]
As described above, the audio signals RL* and RR* subjected to virtual sound source processing in the rear transfer function convolution circuit 310 are supplied to the synthesis units 311 and 312.
The low band audio signal LFE from the 5.1 channel decoding unit 302 is supplied to the
synthesis units 311 and 312.
[0111]
The output audio signals of the synthesizing units 311 and 312 are the signals to be supplied to the speakers 11SW1 and 11SW2, respectively.
These output audio signals are supplied to the multiplexing unit 313, multiplexed, and wirelessly transmitted from the wireless transmission unit 314 to the audio signal receiving unit 7.
[0112]
The audio signal receiving unit 7 in this embodiment is configured as shown in FIG. 10.
The signal wirelessly transmitted from the audio signal output device unit 3 is received by the wireless reception unit 71 and supplied to the demultiplexing unit 72. The demultiplexing unit 72 demultiplexes the audio signal multiplexed in the received signal, separating it into the audio signal for the first speaker 11SW1 and the audio signal for the second speaker 11SW2. The separated audio signals are then supplied to the first speaker 11SW1 and the second speaker 11SW2 through the amplifiers 73 and 74 and the output terminals 75 and 76, respectively.
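The patent does not specify the multiplexing format used by units 313 and 72; as an illustration only, one simple scheme would be sample interleaving of the two speaker feeds, which the receiver can split back apart:

```python
import numpy as np

def multiplex(ch1, ch2):
    """Interleave two equal-length sample blocks into one stream (cf. unit 313)."""
    frame = np.empty(2 * len(ch1), dtype=ch1.dtype)
    frame[0::2], frame[1::2] = ch1, ch2
    return frame

def demultiplex(frame):
    """Split the interleaved stream back into the 11SW1 / 11SW2 feeds (cf. unit 72)."""
    return frame[0::2], frame[1::2]

a = np.arange(4, dtype=float)
b = np.arange(10, 14, dtype=float)
assert all(np.array_equal(x, y) for x, y in zip(demultiplex(multiplex(a, b)), (a, b)))
```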
[0113]
Therefore, the speakers 11SW1 and 11SW2 acoustically reproduce the low-frequency audio
signal LFE as a subwoofer and acoustically reproduce the rear left and right channel audio
signals RL * and RR * subjected to virtual sound source processing.
[0114]
In this case, when the audio signal Au1 from the television receiver 1 is decoded by the 5.1-channel decoding unit 302 and reproduced as described above, it must be taken into account that the sound emitted from the speakers 11FL and 11FR of the television receiver 1 also contains the sound of the rear left and right channels.
[0115]
That is, there is a concern that the rear left and right channel components contained in the sound emitted from the speakers 11FL and 11FR of the television receiver 1 could shift the sound image localization of the rear left and right channel sound emitted from the speakers 11SW1 and 11SW2 arranged near both ears of the listener 4.
[0116]
In this embodiment, however, the speakers 11SW1 and 11SW2 are provided near the ears of the listener 4 and are therefore much closer to the listener 4 than the speakers 11FL and 11FR of the television receiver 1, so the sound emitted from the speakers 11SW1 and 11SW2 reaches the listener 4 much earlier than the sound emitted from the speakers 11FL and 11FR.
[0117]
Therefore, owing to the Haas effect, the listener 4 effectively hears only the sound emitted from the speakers 11SW1 and 11SW2 as the rear sound, and it is unnecessary to remove the rear left and right channel components in advance from the audio signals supplied to the speakers 11FL and 11FR of the television receiver 1.
[0118]
Of course, the audio signal output device unit 3 may also be configured to supply the corresponding audio signals to the first and second speakers 11SW1 and 11SW2 through speaker cables, without providing the audio signal receiving unit 7.
[0119]
The audio signal system for the auxiliary subwoofers 11SW3 and 11SW4 is omitted from FIG. 7; as described above, only the low-frequency audio signal LFE is supplied to the auxiliary subwoofers 11SW3 and 11SW4, but the rear left and right channel audio signals subjected to virtual sound source processing may also be added to the low-frequency audio signal LFE and supplied to them.
[0120]
As described above, according to the sound reproduction system of the first embodiment, in which the speakers are attached to the chair 20 shown in FIG. 5, the listener 4 sitting on the chair 20 can enjoy loud, realistic multi-channel sound with fewer speakers than the number of channels, while the leakage of sound to the surroundings is significantly reduced.
[0121]
In this embodiment, in particular, the subwoofer speakers 11SW1 and 11SW2 are disposed near the ears of the listener 4 without being housed in boxes, so that the leakage of deep bass into adjacent rooms is reduced significantly.
Furthermore, as described above, since the rear left and right channel sound, in addition to the subwoofer channel, is also reproduced by the speakers 11SW1 and 11SW2 after virtual sound source processing, its signal level can be lowered, and the leakage of sound to the surroundings can be reduced even further.
For this reason, even when watching a DVD at midnight, for example, it can be enjoyed at a sufficient volume without concern for others.
[0122]
Further, since the speakers 11SW1 and 11SW2 are disposed near the listener's ears, the audio signal output power can in the extreme case be about 1/100 of that of the conventional arrangement, so energy is saved and the cost of the hardware (output amplifiers) can be reduced significantly.
Furthermore, since the audio output power can be small, thin, light and inexpensive speakers that do not require a large stroke can be used.
In addition, the reduced audio output power means less heat generation and a smaller power supply, so the device can be battery-driven and can be built into the design of a chair or the like.
[0123]
Therefore, the energy consumption of the sound reproduction system as a whole can be reduced, and a sound reproduction system can be provided that reduces the noise to the surroundings without reducing the satisfaction of the viewer.
[0124]
Even an ordinary soundproof window that can attenuate 45 dB at 5 kHz attenuates only about 36 dB at 1 kHz and 20 dB at 100 Hz, and at 50 Hz or less the attenuation decreases further. The soundproofing effect of the subwoofer arrangement according to this embodiment is therefore remarkable, and considering the cost of soundproofing a room for the enjoyment of audiovisual reproduction, the cost savings are very large.
[0125]
Normally, the listener 4 sits on the chair, faces the front roughly midway between the speakers 11SW1 and 11SW2, and listens to the audio while watching the DVD video. However, in order to realize more realistic virtual sound source localization, it has been reported that the accuracy of virtual sound image localization is improved by detecting how far the face of the listener 4 has turned from the front and applying a corresponding correction (see Acoustical Society of Japan, March 2003, virtual sound imaging using headphones, Ryuji Yokote).
[0126]
Therefore, in this embodiment, to cope with the listener 4 changing the position of the head or the direction of the face, each of the plurality of head-related transfer functions described above is obtained for various positions of the listener 4 with respect to the two speakers 11SW1 and 11SW2 and for various directions of the face (head) of the listener 4.
[0127]
In this example, the various positions of the listener 4 with respect to the two speakers 11SW1 and 11SW2 are positions at which the distances between the listener 4, located between the two speakers 11SW1 and 11SW2, and the speakers 11SW1 and 11SW2 are varied.
[0128]
That is, when the head of the listener 4 is located between the two speakers 11SW1 and 11SW2, let d1 be the distance between the center line position of the head of the listener 4 and the speaker 11SW1 (for example, the diaphragm position of the speaker 11SW1), and let d2 be the distance between the center line position of the head of the listener 4 and the speaker 11SW2 (for example, the diaphragm position of the speaker 11SW2). The position of the head of the listener 4 between the two speakers 11SW1 and 11SW2 is then represented by the difference Δ = d1 − d2.
The absolute value of the difference Δ is, of course, smaller than d1 + d2.
[0129]
That is, when the difference Δ = 0, the head of the listener 4 is at the center position between the two speakers 11SW1 and 11SW2, as shown in FIG. 8A.
When the difference Δ is positive, the head of the listener 4 is at a position closer to the speaker 11SW2, as shown in FIG. 8B, the position depending on the value of Δ.
When the difference Δ is negative, the head of the listener 4 is at a position closer to the speaker 11SW1, as shown in FIG. 8C, again at a position depending on the value of Δ.
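As a small illustrative sketch of this convention (names and the quantization step are assumptions, not the patent's implementation), the detected distances can be turned into the signed offset Δ and snapped to the measurement grid described in the next paragraph:

```python
def head_offset(d1, d2):
    """Return delta = d1 - d2 from the detected distances (metres) between the
    head centre line and speaker 11SW1 (d1) and speaker 11SW2 (d2).
    delta == 0: centred; delta > 0: closer to 11SW2; delta < 0: closer to 11SW1."""
    delta = d1 - d2
    assert abs(delta) < d1 + d2       # geometric constraint noted in the text
    return delta

def quantize_offset(delta, step=0.04):
    """Snap delta to the 4 cm grid assumed for the stored transfer functions."""
    return round(delta / step) * step

print(quantize_offset(head_offset(0.26, 0.17)))   # 0.08 -> head shifted toward 11SW2
```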
[0130]
In this embodiment, the head-related transfer functions are measured at the positions corresponding to predetermined positive and negative values of the difference Δ, for example at every integer multiple of 4 cm in Δ (that is, every 2 cm of head position).
In that case, as shown in FIG. 8D, the head-related transfer functions are also measured at each position for every predetermined angle of the face direction of the listener 4, for example every 2 degrees.
[0131]
The head-related transfer functions measured as described above are stored in the head-related transfer function storage unit 111, supplied to the rear transfer function convolution circuit 310 through the input/output port 108, and convolved by the rear transfer function convolution circuit 310.
[0132]
FIG. 9 shows an example of the information stored in the head-related transfer function storage unit 111 for one head-related transfer function, according to the various positions of the head of the listener 4 and orientations of the face.
That is, FIG. 9 shows the head-related transfer functions at face directions θi (i = 1, 2, ..., n), differing by a predetermined angle, at the position indicated by one value Δj (j = 0, ±1, ±2, ...) of the difference Δ = d1 − d2.
[0133]
Here, the head-related transfer function HLji(+) means the head-related transfer function in the case where the head is at the position of value +Δj and the face is turned to the left by θi, and the head-related transfer function HRji(+) means the head-related transfer function in the case where the head is at the position of value +Δj and the face is turned to the right by θi.
Likewise, the head-related transfer function HLji(−) means the head-related transfer function in the case where the head is at the position of value −Δj and the face is turned to the left by θi, and the head-related transfer function HRji(−) means the head-related transfer function in the case where the head is at the position of value −Δj and the face is turned to the right by θi.
[0134]
Here, the head related transfer functions Hj0(+) and Hj0(−) (for Δ0, that is, j = 0, the designation (+) or (−) is not used) mean the head related transfer functions in the case where the head is at the position of value Δj and the face is directed to the front.
[0135]
FIG. 9 shows the stored values for one head related transfer function; since, as described above, there are a plurality of head related transfer functions, values like those shown in FIG. 9 are stored in the head rear transfer function storage unit 111 for each of the plurality of head related transfer functions.
[0136]
In the above description, it is assumed that the head of the listener 4 changes position by moving laterally along the line connecting the speakers 11SW1 and 11SW2. If the head of the listener 4 is also assumed to change position by moving vertically, the above-described head rear transfer function may likewise be measured at each position in the vertical direction, for example, every 2 cm, and stored in the head rear transfer function storage unit 111.
[0137]
In addition, assuming that the head of the listener 4 also moves forward and backward (movement to the rear is limited in the case of a chair), the position of the head of the listener 4 may be changed in the front-rear direction at each value ±Δj, and the head related transfer function may be measured at each of those positions and stored in the head rear transfer function storage unit 111.
[0138]
In this case, although the number of head transfer functions stored in the head rear transfer
function storage unit 111 is large, the number of head transfer functions to be stored may be
reduced by optimizing as appropriate.
[0139]
In this embodiment, the control unit 100 selects, from among the head rear transfer functions corresponding to the various positions and face orientations stored in the head rear transfer function storage unit 111, an appropriate head rear transfer function according to the actual head position and face direction of the listener 4 who is listening to the audio while watching the DVD video, reads it out, and supplies it to the rear transfer function convolution circuit 310.
[0140]
In the case of this embodiment, in order to select an appropriate head related transfer function, the imaging unit 5 images the head of the listener 4, who is listening to the audio while watching the DVD video, together with the marker portions 6L and 6R, and the captured image data CAM is supplied to the audio signal output device unit 3.
[0141]
In the audio signal output device unit 3, the captured image data is received by the captured
image analysis unit 112.
The captured image analysis unit 112 analyzes the captured image represented by the received captured image data CAM, detects the position of the head of the listener 4 with respect to the speakers 11SW1 and 11SW2 and, as described in detail later, the direction in which the face (head) of the listener 4 points, and sends the detected output to the system bus.
[0142]
The CPU 101 receives the head position detection result and the face direction detection result of the listener 4 from the captured image analysis unit 112, and, using them as arguments for reading, searches the stored contents of the head rear transfer function storage unit 111 and reads out from it the head rear transfer function corresponding to the position detection result and the face direction detection result of the listener 4.
In this case, the head rear transfer function measured at the position and face direction closest to the detected position and face direction is read out as the head rear transfer function corresponding to the position detection result and face direction detection result of the listener 4.
[0143]
The control unit 100 of the audio signal output device unit 3 continuously receives the captured image data CAM from the imaging unit 5 while the listener 4 is watching the DVD video and audio, detects the moment-to-moment position of the head and direction of the face of the listener 4, reads out the head related transfer function corresponding to the detection result from the head rear transfer function storage unit 111, and supplies it to the audio signal processing unit 300.
For this reason, the audio signal processing unit 300 performs virtual sound source processing that follows the change in position and movement of the head of the listener 4 in real time.
[0144]
Therefore, even if the listener 4 moves the position of the head or changes the direction of the face, the listener 4 can always enjoy multi-channel surround sound without losing the sense of localization of the virtual sound image.
[0145]
It is also conceivable to attach a gyro sensor to the head of the listener without using the imaging
unit 5 to detect a change in the position of the head and a change in the direction of the face.
[0146]
However, in the sound reproduction system according to this embodiment, sound is reproduced not from headphones but from speakers disposed at positions away from the listener 4, and the listener 4, sitting in a chair, gains the comfort of having no device touching the body.
Therefore, attaching a sensor such as a gyro sensor to the listener 4 in order to detect a change in head position and a change in face direction would destroy this comfort, and it is preferable to use the imaging unit 5, which avoids the discomfort of wearing a device.
[0147]
[Configuration Example of Imaging Unit 5] FIG. 11 shows a configuration example of the imaging
unit 5 in this embodiment.
[0148]
In the imaging unit 5 of this embodiment, the light from the subject transmitted through the
imaging lens 51 is received by the image sensor 52, and the subject is imaged.
In the case of this embodiment, as the image sensor 52, one capable of high-speed readout of a
captured image signal is used.
That is, although the frame rate of the captured image signal read from a normal image sensor is 60 frames per second, the frame rate of the captured image signal from the image sensor 52 used in this embodiment is, for example, 240 frames per second.
[0149]
A captured image signal of such a high frame rate is used because the virtual sound source processing that follows the position change of the head of the listener 4 and the change of the face direction must be carried out within 20 to 30 milliseconds from the point in time at which the change in head position or face direction is detected; it is reported that the virtual sense of localization of the sound image is lost if this delay reaches 60 milliseconds or more.
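As a rough numerical check of this timing requirement, the frame period of a 240-frames-per-second sensor can be compared with the 20 to 30 millisecond budget. The processing times in the sketch below are assumed figures for illustration only, not measurements from the embodiment.

    # Illustrative latency budget (assumed figures, not measurements).
    frame_period_ms = 1000.0 / 240          # about 4.17 ms per captured frame
    analysis_ms = 5.0                        # assumed image analysis time
    hrtf_lookup_ms = 1.0                     # assumed transfer-function read-out time
    convolution_ms = 5.0                     # assumed convolution/processing time

    total_ms = frame_period_ms + analysis_ms + hrtf_lookup_ms + convolution_ms
    print(f"worst-case delay ~{total_ms:.1f} ms (budget: 20-30 ms, limit: 60 ms)")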
[0150]
The captured image signal from the image sensor 52 is supplied to a captured image signal processing unit 53 and subjected to predetermined processing, and from the captured image signal processing unit 53, captured image data CAM composed of luminance information and color information of the captured image is obtained.
[0151]
Then, the captured image data CAM from the captured image processing unit 53 is sent to the
audio signal output device unit 3 through a cable connected to the output terminal 54.
The audio signal output device unit 3 receives the captured image data CAM, reads out a head related transfer function according to the position of the head and the direction of the face of the listener 4 detected on the basis of the image data CAM, and performs high-speed processing so that the time from the point of detection of the change in head position and face direction of the listener 4 to the end of the virtual sound source processing using the read-out head related transfer function is within the above 20 to 30 milliseconds.
[0152]
The imaging unit 5 of this embodiment also includes a near-infrared light source 55 made of, for
example, an LED that generates near-infrared light.
The near-infrared light source 55 is driven to emit light by the near-infrared light source drive
unit 56 when the power is turned on.
[0153]
It is known that the human pupil has the property of reflecting incident light back almost in the incident direction, so that under coaxial illumination its brightness becomes remarkably large compared with other parts of the face.
It is therefore known that when a person's face is irradiated with weak near-infrared light, the pupils are imaged particularly brightly in the captured image.
[0154]
Therefore, the captured image 60 captured by the image sensor 52 is as shown in FIG. 12.
In FIG. 12, reference numeral 61 denotes an image of the head of the listener 4, and the left and
right pupil images 61EL and 61ER are particularly bright.
[0155]
Reference numerals 62L and 62R denote images of the left and right marker units 6L and 6R, in
this example, an image of LED light.
That is, the marker units 6L and 6R are configured by LEDs in this example, and are configured
to be driven and lighted by the marker unit drive unit 113 shown in FIG.
[0156]
[Description of Detection of Head Position and Face (Head) Orientation of Listener 4] As described above, in this embodiment, captured image data of the captured image 60 shown in FIG. 12, captured by the image sensor 52, is wirelessly transmitted from the imaging unit 5 to the audio signal output device unit 3.
[0157]
In the audio signal output device unit 3, the captured image data is received by the wireless
reception unit 112, and the captured image analysis unit 112 analyzes the image.
[0158]
In this image analysis, first, as shown in FIG. 12, the center line 61C of the captured image 61 of the head of the listener 4 is detected, and the distances d1 and d2 between the center line 61C and the images 62L and 62R of the left and right marker portions 6L and 6R are obtained.
Then, as described above, the position of the head of the listener 4 between the speakers 11SW1 and 11SW2 is detected as the difference Δj = d1 − d2.
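In terms of the captured image, this detection amounts to measuring pixel distances from the head center line to the two marker images and converting them to the Δ value, as in the following sketch. The pixel-to-centimetre scale, the assignment of marker 6L to the d1 side, and the function name are assumptions for illustration only.

    def delta_from_image(head_center_x, marker_left_x, marker_right_x,
                         cm_per_pixel=0.1):
        # d1: pixel distance from the head center line 61C to the image 62L
        #     (assumed here to correspond to the speaker 11SW1 side),
        # d2: pixel distance from the center line to the image 62R.
        d1_px = abs(head_center_x - marker_left_x)
        d2_px = abs(head_center_x - marker_right_x)
        return (d1_px - d2_px) * cm_per_pixel   # delta = d1 - d2 in centimetres

    # Example with made-up pixel coordinates.
    print(delta_from_image(head_center_x=330, marker_left_x=250, marker_right_x=390))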
[0159]
Further, the position of the head of the listener 4 in the front-rear direction (the direction from the listener 4 toward the television receiver 1) is, in this example, detected by comparing the size of the image of the head of the listener 4 with the size of the head image of the listener 4 in a reference image.
[0160]
That is, in this embodiment, the audio signal output device unit 3 includes a storage unit, not shown, for storing the reference image.
The storage unit may be built into the captured image analysis unit 112 or may be separately connected to the system bus.
[0161]
The reference image is an image captured with the listener 4 facing the screen of the television receiver 1, the head of the listener 4 at the central position between the speakers 11SW1 and 11SW2, and at the reference position in the front-rear direction (for example, with the head leaned against the headrest of the chair 20 and at a position where the ears of the listener 4 face the diaphragms of the speakers 11SW1 and 11SW2).
[0162]
In this case, the audio signal output device unit 3 has a reference image registration mode, and when the sound reproduction system of this embodiment is installed, the user executes the reference image registration mode, and the reference image of the user is registered and stored in the storage unit.
[0163]
Alternatively, before using the sound reproduction system according to this embodiment, the reference image registration mode may be executed to register the reference image of the user who is about to start use and to store it in the storage unit.
[0164]
In addition, when the sound reproduction system is used by a plurality of specific users, the reference images of the respective users may be registered in correspondence with the identifiers of the respective users and stored in the storage unit.
In that case, when using the sound reproduction system, the user inputs his or her identifier so that the sound reproduction system recognizes the reference image of the listener 4.
[0165]
When movement of the head of the listener 4 in the vertical direction is also considered, the positional change in the vertical direction of the head image 61 of the listener 4 with respect to the images 62L and 62R of the marker units 6L and 6R may be detected.
Also in this case, the positional change in the vertical direction can be detected with reference to the reference image.
At this time, it is easier to detect the positional change in the vertical direction of the pupil images 62EL and 62ER of the listener 4 than to detect the positional change in the vertical direction of the head image 61 of the listener 4 as a whole, since, as described above, the pupil images 62EL and 62ER of the listener 4 are imaged brightly under the irradiation of the near-infrared light.
[0166]
Next, a method of detecting the face direction of the listener 4 will be described.
In this embodiment, since the listener 4 is watching a DVD image on the screen of the television receiver 1, it is assumed that the direction of the face is changed by rotating the head in the horizontal direction. The imaging unit 5 then detects the direction of the face by utilizing the fact that the apparent distance between the two eyes of the listener 4, when the head of the listener 4 is imaged, changes according to the direction of the face. In the case of this embodiment, the distance between the two eyes of the listener 4 is detected as the distance between the pupil images 62EL and 62ER of the listener 4, which are imaged brightly by the irradiation of the near-infrared light, and the direction of the face is detected from this distance.
[0167]
In order to simplify the description, a method of detecting the face orientation of the listener 4
will be described assuming that the head of the listener 4 is at the reference image position and
the listener 4 changes the face orientation at that position.
[0168]
FIG. 13A shows a head image 61 of the listener 4 in the reference image, which is in a state of
facing the front as apparent from the drawing.
FIG. 13B is a reference view when the head of the listener 4 in the reference image is viewed
from overhead. As described above, the distance between the two eyes of the listener 4 in the
captured image in the state where the face of the listener 4 is directed to the front becomes the
largest. The distance between the two eyes at this time can be determined from the reference
image, and this value is a. Note that, in FIG. 13B, since the main object is to indicate the distance
between the two eyes of the listener 4, the distance between the two eyes is indicated by the
straight line 63.
[0169]
FIG. 14A shows a head image 61 of the listener 4 in the captured image when the listener 4
horizontally rotates the face to the right at the reference image position. Let b be the eye spacing
value. FIG. 14 (B) is a reference view of the head of the listener 4 in the state of FIG. 14 (A) as
viewed from overhead.
[0170]
As can be seen from FIG. 14, when the value a of the distance between both eyes of the listener 4 is compared with the value b, a > b, and, assuming that the rotation angle (direction) of the face with respect to the front direction is θ, the relation cos θ = b / a holds.
[0171]
For example, when a person having a distance between both eyes of 16 cm (corresponding to the
above value a) rotates the face 60 degrees in the horizontal direction, the above value b becomes
b = a · cos 60 ° = 8 cm.
[0172]
Here, since the value a is a known value obtained from the reference image, the face rotation
angle θ can be obtained by obtaining the value b from the captured image.
For a person with a distance between both eyes of 16 cm, an example of the correspondence between the distance b between the two eyes obtained from the image captured from the front direction when the face is rotated and the corresponding rotation angle (direction) θ of the face is as follows.
[0173]
That is, θ = 29° when b = 14 cm, θ = 41.4° when b = 12 cm, θ = 51.3° when b = 10 cm, θ = 60° when b = 8 cm, θ = 68° when b = 6 cm, θ = 75.5° when b = 4 cm, and θ = 82.8° when b = 2 cm.
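This correspondence follows directly from cos θ = b / a, as the following minimal sketch reproduces (the function name is an illustrative assumption):

    import math

    def face_angle_deg(eye_distance_b_cm, reference_distance_a_cm=16.0):
        # cos(theta) = b / a, with a taken from the reference image.
        ratio = max(-1.0, min(1.0, eye_distance_b_cm / reference_distance_a_cm))
        return math.degrees(math.acos(ratio))

    for b in (14, 12, 10, 8, 6, 4, 2):
        print(b, "cm ->", round(face_angle_deg(b), 1), "degrees")
    # Reproduces the values above: 29.0, 41.4, 51.3, 60.0, 68.0, 75.5, 82.8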
[0174]
In the case where the head of the listener 4 moves in the direction of the television receiver 1, the values a and b change to values according to the moved position, and the orientation of the face can be detected in the same manner as described above.
[0175]
As described above, the rotation angle θ of the face direction of the listener can be detected by
detecting the distance b between the two eyes of the listener 4 in the captured image.
However, in this case, it cannot be determined whether the listener 4 has rotated the face in the right direction or the left direction with respect to the front direction only by obtaining the distance b.
[0176]
Therefore, in this embodiment, the direction in which the listener 4 has rotated the face is detected by using the light (especially the near-infrared light) emitted from the marker units 6L and 6R, which are provided corresponding to the positions of the two speakers 11SW1 and 11SW2, as reflected by the two eyes.
[0177]
Therefore, in this embodiment, as shown in FIG. 15, the marker units 6L and 6R are attached so
as to be in front of the face when the listener 4 is at the reference position.
[0178]
Thus, as shown in FIG. 16A, when the listener 4 turns the face to the right with respect to the front direction, the light from the marker unit 6R incident on the right eye increases while the light from the marker unit 6L incident on the left eye decreases, so that the luminance of the right eye becomes larger than that of the left eye.
[0179]
Also, as shown in FIG. 16B, when the face of the listener 4 is directed to the left with respect to the front direction, the light from the marker unit 6R incident on the right eye decreases while the light from the marker unit 6L incident on the left eye increases, so that the luminance of the left eye becomes larger than that of the right eye.
[0180]
Therefore, it is possible to determine whether the listener 4 has rotated the face in the right
direction or the left direction with respect to the front direction based on the difference in the
luminance of both eyes.
[0181]
In this case, the marker unit drive unit 114 may change the luminance of one marker unit at high
speed, instead of lighting both the marker units 6L and 6R with the same luminance.
[0182]
When the image recognition processing is fast enough, whether the listener 4 has rotated the face to the right or to the left with respect to the front direction may instead be determined from which side the positions of the images 62EL and 62ER of both eyes are biased toward within the head image 61.
[0183]
As described above, in this embodiment, the position and face orientation of the listener 4 with respect to the two speakers 11SW1 and 11SW2 are detected, and the head rear transfer function measured at the position and face orientation closest to the detection result is read out from the head rear transfer function storage unit 111 and supplied to the rear transfer function convolution circuit 310.
[0184]
The rear transfer function convolution circuit 310 applies this moment-to-moment changing head rear transfer function to the audio signals RL and RR of the rear left and right channels, thereby performing virtual sound source processing that follows the head movement of the listener 4, such as positional change of the head and rotation of the face, and generates the virtual-sound-source-processed audio signals RL* and RR*, which are supplied to the speakers 11SW1 and 11SW2 arranged in the vicinity of both ears.
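Conceptually, such processing corresponds to filtering each rear-channel signal with head rear impulse responses and mixing the results for the two near-ear speakers, as in the following sketch. The exact filter topology of the circuit 310 is not detailed here, so this binaural-style mix, the placeholder impulse responses, and the variable names are assumptions for illustration only.

    import numpy as np

    def virtual_rear_processing(rl, rr, h_rl_to_sw1, h_rl_to_sw2,
                                h_rr_to_sw1, h_rr_to_sw2):
        # Each rear channel is filtered with impulse responses describing the
        # path from its virtual position toward each near-ear speaker, then mixed.
        sw1 = np.convolve(rl, h_rl_to_sw1) + np.convolve(rr, h_rr_to_sw1)
        sw2 = np.convolve(rl, h_rl_to_sw2) + np.convolve(rr, h_rr_to_sw2)
        return sw1, sw2

    # Placeholder signals and impulse responses (not measured data).
    rl = np.random.randn(1000)
    rr = np.random.randn(1000)
    h = np.array([1.0, 0.3, 0.1])
    rl_star, rr_star = virtual_rear_processing(rl, rr, h, h, h, h)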
[0185]
In the above description of the first embodiment, the audio signal is supplied to the speakers 11FL and 11FR of the television receiver 1 through the audio signal output device unit 3 when the DVD player 2 is reproducing. Instead, the audio signal Au2 from the DVD player 2 may be supplied directly to the television receiver 1, and the audio mixed from the 5.1-channel audio may be emitted from the speakers 11FL and 11FR just as when a digital broadcast program is received.
In that case, the audio signal path from the audio signal output device unit 3 to the speakers 11FL and 11FR of the television receiver 1 becomes unnecessary.
[0186]
The audio signal output device unit 3 can be provided at a predetermined position such as under
the seating surface of the chair 20.
In that case, the audio signal output device unit 3 can be configured to receive the multichannel
audio signal supply source and the audio signal Au2 from the television receiver 1 or the DVD
player 2 through the signal cable, but It is necessary to connect the television receiver 1 or the
DVD player 2 to the chair with a signal cable.
Therefore, a DVD player or the like is provided with means for wirelessly transmitting multichannel audio signals using radio waves and light, and the audio signal output device section 3
receives the multi-channel audio signals transmitted wirelessly. By providing the receiving unit,
the signal cable between the DVD player 2 or the like and the chair 20 can be eliminated.
[0187]
As described above, when the audio signal output from a multi-channel audio signal supply source such as a DVD player is transmitted by radio waves or light, the connection between the DVD player or the like and the sound reproduction system becomes cordless, and, for example, the chair 20 equipped with the sound reproduction system has the advantage of being able to move freely.
[0188]
In the above description, the speaker of the television receiver is used as the front speaker, but it
is of course possible to separately provide a dedicated front speaker.
In that case, a speaker for the center channel may be provided.
[0189]
[Second Embodiment] In the second embodiment, all of the 5.1-channel surround sound is emitted by the two speakers 11SW1 and 11SW2 provided in the vicinity of the ears of the listener 4, which makes the most of the effects of noise reduction and energy saving.
[0190]
FIG. 17 is a diagram showing an outline of the sound reproduction system according to the
second embodiment.
As shown in FIG. 17, in the second embodiment, the video output signal Vi from the DVD player 2 is supplied to a display monitor device 15 without speakers, and the reproduced video is displayed on the display screen 15D.
[0191]
As the speakers, only the speakers 11SW1 and 11SW2 as subwoofers provided near the listener
4 are provided.
[0192]
Then, only the audio signal Au2 from the DVD player 2 is supplied to the audio signal output device unit 3.
The audio signal output device unit 3 generates, from the audio signal Au2, a low-frequency audio signal LFE to be supplied to the subwoofers 11SW1 and 11SW2 (which may include the auxiliary subwoofers 11SW3 and 11SW4), and also generates the front left and right two-channel audio signals FL and FR and the rear left and right two-channel audio signals RL and RR, subjects them to virtual sound source processing, and adds the virtual-sound-source-processed audio signals to the audio signals supplied to the subwoofers 11SW1 and 11SW2.
In this example, the center channel audio signal C is synthesized with each of the front left and right two-channel audio signals L and R that are subjected to virtual sound source processing.
[0193]
[Speaker Arrangement Example of Second Embodiment] FIG. 18 is a view for explaining a
speaker arrangement example according to the second embodiment.
That is, in the second embodiment, only the two speakers 11SW1 and 11SW2 provided in the vicinity of both ears of the listener 4 are provided as real speakers, shown by solid lines in FIG. 18.
[0194]
The speaker 11C for the center channel, the speakers 11FL and 11FR for the front left and right two channels, and the speakers 11RL and 11RR for the rear left and right two channels, shown by broken lines in FIG. 18, are not provided as actual speakers.
Instead, the audio signals that would be supplied to these speakers are subjected to virtual sound source processing and supplied to the speakers 11SW1 and 11SW2, so that the sound is reproduced in such a way that the listener 4 hears it as if the speakers were at the positions indicated by the broken lines.
[0195]
Note that the audio signal of the center channel is added to the audio signals of the front left and right two channels, and the audio signals of the front left and right channels combined with the audio signal of the center channel are then subjected to virtual sound source processing, so that the listener 4 hears the center channel sound as if it were emitted from the position of the speaker 11C indicated by the broken line in FIG. 18.
[0196]
Also in this second embodiment, the speakers 11SW1 and 11SW2 (including the auxiliary subwoofers 11SW3 and 11SW4 as necessary) are configured to be attached to the chair 20, for example, as shown in FIGS. 5 and 6 described above.
[0197]
[Configuration Example of Audio Signal Output Device Unit 3 According to Second Embodiment]
FIG. 19 is a block diagram showing a configuration example of the audio signal output device
unit 3 according to the second embodiment.
The audio signal output device unit 3 in the second embodiment also includes an audio signal
processing unit 300 and a control unit 100 formed of a microcomputer, as in the audio signal
output device unit 3 in the above-described embodiment.
[0198]
Compared with the control unit 100 according to the first embodiment, the control unit 100 according to the second embodiment is provided with a head front transfer function storage unit 114 in addition to the head rear transfer function storage unit 111, and an input / output port 109 is added.
The other configuration of the control unit 100 is substantially the same as that of the first embodiment.
[0199]
Further, the audio signal processing unit 300 in the second embodiment does not include the input selection switch circuit 301 of the first embodiment.
The audio signal processing unit 300 according to the second embodiment includes the 5.1 channel decoding unit 302 and, in addition to the rear transfer function convolution circuit 310 of the first embodiment, a front transfer function convolution circuit 320.
[0200]
The 5.1 channel decoding unit 302 receives the audio signal Au2 from the DVD player 2, performs channel decoding processing, and outputs the audio signals L and R of the front left and right channels, the audio signal C of the center channel, the audio signals RL and RR of the rear left and right channels, and the low-frequency audio signal LFE.
[0201]
The audio signal L of the front left channel from the 5.1 channel decoding unit 302 and the audio signal C of the center channel are synthesized by the synthesizing unit 303, and the synthesized output audio signal (L + C) is supplied to the front transfer function convolution circuit 320, which constitutes a virtual sound source processing unit.
Further, the audio signal R of the front right channel from the 5.1 channel decoding unit 302 and the audio signal C of the center channel are synthesized by the synthesizing unit 304, and the synthesized output audio signal (R + C) is likewise supplied to the front transfer function convolution circuit 320.
[0202]
The front transfer function convolution circuit 320 has the same configuration as the rear transfer function convolution circuit 310: using, for example, digital filters, the head front transfer functions prepared in advance in the head front transfer function storage unit 115 are convoluted with the audio signals from the synthesis units 303 and 304.
[0203]
Therefore, in the front transfer function convolution circuit 320, when the input audio signal is
not a digital signal, it is converted into a digital signal, and the head front transfer function is
convoluted and then converted back into an analog signal and output.
[0204]
In this example, the head front transfer function is measured and determined as follows, and
stored in the head front transfer function storage unit 114.
FIG. 20 is a diagram for explaining a method of measuring the head front transfer function.
[0205]
That is, as shown in FIG. 20, the left channel measurement microphone 41 and the right channel
measurement microphone 42 are installed in the vicinity of the left and right ears of the listener
4.
Next, the front left channel speaker 11FL is arranged in front of the listener 4 where the front
left channel speaker is usually arranged.
Then, using this front left channel speaker 11FL, the sound emitted when, for example, an impulse is acoustically reproduced is picked up by the respective microphones 41 and 42, and from the picked-up sound signals the transfer functions from the front speaker 11FL to the left and right ears (the head front transfer functions for the front left channel) are measured.
[0206]
Similarly, for the front right channel speaker 11FR, the sound emitted when, for example, an impulse is acoustically reproduced is picked up by the respective microphones 41 and 42, and from the picked-up sound signals the transfer functions from the front speaker 11FR to the left and right ears (the head front transfer functions for the front right channel) are measured.
[0207]
The head front transfer function is preferably measured and obtained as the transfer function from each speaker to the ears with the front speakers FL and FR placed, for example, at a distance of 2 m from the listener 4 and at 30 degrees to the left and right of the front center of the listener 4.
[0208]
A further supplement regarding the transfer functions: for example, let the transfer function from the front left speaker to the left ear in FIG. 20 be a transfer function A.
Next, let the transfer function measured from the speaker 11SW1 in the vicinity of the ear to the microphone 41 be a transfer function B.
Furthermore, if a transfer function X is obtained such that multiplying the transfer function B by X yields the transfer function A, and the obtained transfer function X is convoluted with the signal sound sent to the nearby speaker 11SW1, the sound emitted from the speaker 11SW1 is perceived as if it came from 2 m to the front left.
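One way to picture the relationship B · X = A is as a frequency-domain division with a small regularization term, as sketched below. This is only an illustrative calculation with placeholder impulse responses and assumed parameter values; it is not the method prescribed by the embodiment.

    import numpy as np

    def derive_correction_filter(a_ir, b_ir, n_fft=256, eps=1e-3):
        # X(f) = A(f) / B(f): applying X to the near-ear speaker signal makes the
        # overall path approximate the far (2 m front-left) transfer function A.
        A = np.fft.rfft(a_ir, n_fft)
        B = np.fft.rfft(b_ir, n_fft)
        X = A * np.conj(B) / (np.abs(B) ** 2 + eps)   # regularized division
        return np.fft.irfft(X, n_fft)

    # Placeholder impulse responses standing in for the measured A and B.
    a_ir = np.array([0.0, 0.8, 0.3, 0.1])
    b_ir = np.array([1.0, 0.2, 0.05])
    x_ir = derive_correction_filter(a_ir, b_ir)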
[0209]
However, the transfer function X does not necessarily have to be determined, and in some cases only the transfer function A may be used.
Although the above description uses one transfer function as a representative, it goes without saying that there are actually a plurality of transfer functions, as also shown in FIG.
[0210]
Also in this case, in the same manner as for the measurement of the head rear transfer function described above, in order to cope with the listener 4 changing the position of the head or the direction of the face, the head front transfer functions are measured and obtained at various positions of the listener 4 with respect to the two speakers 11SW1 and 11SW2 and for various directions of the face (head) of the listener 4.
[0211]
That is, as in the case of the measurement of the head rear transfer function described above, when the head of the listener 4 is at a position between the two speakers 11SW1 and 11SW2, the distance between the center line position of the head of the listener 4 and the speaker 11SW1 (for example, the diaphragm position of the speaker 11SW1) is denoted d1, the distance between the center line position of the head of the listener 4 and the speaker 11SW2 (for example, the diaphragm position of the speaker 11SW2) is denoted d2, and the position of the head of the listener 4 between the two speakers 11SW1 and 11SW2 is represented by the difference Δ = d1 − d2.
However, the absolute value of the difference Δ is smaller than d1 + d2.
[0212]
That is, when the difference Δ = 0, as shown in FIG. 20A, the head of the listener 4 is at the central position between the two speakers 11SW1 and 11SW2.
When the difference Δ is a negative value, the head of the listener 4 is at a position closer to the speaker 11SW1, as shown in FIG. 20B, the position depending on the value of the difference Δ. Further, when the difference Δ is a positive value, the head of the listener 4 is at a position closer to the speaker 11SW2, as shown in FIG. 20C, again at a position depending on the value of the difference Δ.
[0213]
In this embodiment, the head front transfer function is measured at positions corresponding to each predetermined value of the positive and negative difference Δ, for example, at positions for each integer multiple of 4 cm in the difference Δ (that is, every 2 cm in the head position). In that case, as shown in FIG. 20(D), the head front transfer function is also measured at each position for each predetermined angle, for example, every 2 degrees, of the direction of the face of the listener 4.
[0214]
In the above description, it is assumed that the head of the listener 4 changes position by moving laterally along the line connecting the speakers 11SW1 and 11SW2. If the head of the listener 4 is also assumed to change position by moving vertically, the above-described head related transfer function may likewise be measured at each position in the vertical direction, for example, every 2 cm, and stored in the head rear transfer function storage unit 111.
[0215]
In addition, assuming that the head of the listener 4 also moves forward and backward (movement to the rear is limited in the case of a chair), the position of the head of the listener 4 may be changed in the front-rear direction at each value ±Δj, and the head related transfer function may be measured at each of those positions and stored in the head rear transfer function storage unit 111.
[0216]
In this case, although the number of head transfer functions stored in the head rear transfer
function storage unit 111 is large, the number of head transfer functions to be stored may be
reduced by optimizing as appropriate.
[0217]
The head front transfer functions obtained as described above, in a form similar to that shown in FIG. 9, are stored in the head front transfer function storage unit 114, supplied to the front transfer function convolution circuit 320 through the input / output port 109, and convoluted in the front transfer function convolution circuit 320.
[0218]
Then, in this embodiment, the control unit 100 receives the head position detection result and the face direction detection result of the listener 4 from the captured image analysis unit 112, and, using them as arguments for reading, searches the stored contents of the head front transfer function storage unit 114 and reads out from it the head front transfer function corresponding to the position detection result and the face direction detection result of the listener 4.
That is, the control unit 100 selects, from among the head front transfer functions corresponding to the various positions and face orientations stored in the head front transfer function storage unit 114, an appropriate head front transfer function according to the actual head position and face direction of the listener 4 who is listening to the audio while watching the DVD video, reads it out, and supplies it to the front transfer function convolution circuit 320.
[0219]
Thus, from the front transfer function convolution circuit 320, the virtual-sound-source-processed audio signal (FL* + C) of the front left channel combined with the center channel audio signal C and the virtual-sound-source-processed audio signal (FR* + C) of the front right channel combined with the center channel audio signal C are obtained.
[0220]
When the audio signals (FL* + C) and (FR* + C) from the front transfer function convolution circuit 320 are supplied to the speakers 11SW1 and 11SW2 arranged in the vicinity of both ears and acoustically reproduced, the listener 4 hears the reproduced sound as if it were emitted from the front left and right speakers 11FL and 11FR, and hears the center channel sound as if it were emitted from a speaker located at the center in front.
[0221]
The levels of the audio signals (FL * + C) and (FR * + C) at this time may be lower than the signal
levels supplied to the speakers 11RL and 11RR.
This is because the speakers 11SW1 and 11SW2 are in the vicinity of the ear of the listener 4.
[0222]
As described above, the audio signals (FL * + C) and (FR * + C) from the front transfer function
convolution circuit 320 subjected to virtual sound source processing are supplied to the
synthesis units 321 and 322.
The low-pass audio signal LFE from the 5.1 channel decoding unit 302 is supplied to the
combining units 321 and 322.
Then, the output voice signals of the synthesis units 321 and 322 are supplied to the synthesis
unit 331 and the synthesis unit 332 through the amplifiers 323 and 324.
[0223]
In addition, the audio signals RL and RR of the rear left and right two channels from the 5.1 channel decoding unit 302 are supplied to the rear transfer function convolution circuit 310, which constitutes a virtual sound source processing unit, as in the first embodiment described above.
[0224]
The head rear transfer function storage unit 111 stores the head rear transfer functions measured as described with reference to FIGS. 8 and 9 in the first embodiment.
From the head rear transfer function storage unit 111, as described above, the head rear transfer function measured in the state closest to the position and face direction of the listener 4 detected on the basis of the head position detection result and the face direction detection result from the captured image analysis unit 112 is read out and supplied to the rear transfer function convolution circuit 310 through the input / output port 108, and in this rear transfer function convolution circuit 310 it is convoluted with the audio signals of the rear left and right two channels from the 5.1 channel decoding unit 302.
[0225]
Then, the audio signal of the front left channel with the center channel audio signal synthesized, which has been subjected to virtual sound source processing in the front transfer function convolution circuit 320, is synthesized in the synthesis unit 321 with the low-frequency audio signal LFE from the 5.1 channel decoding unit 302, and is then supplied to the synthesis circuit 331 and synthesized with the audio signal of the rear left channel subjected to virtual sound source processing in the rear transfer function convolution circuit 310.
[0226]
Similarly, the audio signal of the front right channel with the center channel audio signal synthesized, which has been subjected to virtual sound source processing in the front transfer function convolution circuit 320, is synthesized in the synthesis unit 322 with the low-frequency audio signal LFE from the 5.1 channel decoding unit 302, and is then supplied to the synthesis circuit 332 and synthesized with the audio signal of the rear right channel subjected to virtual sound source processing in the rear transfer function convolution circuit 310.
[0227]
The synthesized audio signals from the synthesis circuits 331 and 332 are supplied to the multiplexing unit 313 and multiplexed, and are wirelessly transmitted from the wireless transmission unit 314 to the audio signal receiving unit 7.
[0228]
The audio signal receiving unit 7 receives the radio wave from the audio signal output device unit 3, extracts the multiplexed audio signal from the received radio wave, demultiplexes it to separate the audio signal to be supplied to the first speaker 11SW1 and the audio signal to be supplied to the second speaker 11SW2, and supplies them to the first speaker 11SW1 and the second speaker 11SW2, respectively.
[0229]
Therefore, the speakers 11SW1 and 11SW2 acoustically reproduce the low-frequency audio signal LFE as subwoofers, and also acoustically reproduce the virtual-sound-source-processed front audio signals (FL* + C) and (FR* + C) and the virtual-sound-source-processed rear audio signals RL* and RR*.
[0230]
The audio signal system supplied to the auxiliary subwoofers 11SW3 and 11SW4 is omitted in FIG. 19; only the low-frequency audio signal LFE may be supplied to these auxiliary subwoofers 11SW3 and 11SW4, or the virtual-sound-source-processed audio signals (FL* + C) and (FR* + C) and the virtual-sound-source-processed rear audio signals RL* and RR* may be added to the low-frequency audio signal LFE and supplied.
[0231]
As described above, in the second embodiment, multi-channel sound with a sense of presence can be enjoyed at a large volume using only the speakers 11SW1 and 11SW2 in the vicinity of both ears of the listener 4, the leakage of sound to the surroundings can be significantly reduced, and energy saving of the sound reproduction system can be realized.
[0232]
Further, even if the listener 4 changes the position of the head or the direction of the face, in this embodiment virtual sound source processing corresponding to the change in position and movement of the head is performed in real time, so that the listener can always enjoy multi-channel surround sound without losing the sense of localization.
[0233]
[Other Embodiments or Modifications] In the above embodiments, the speakers 11SW1 and 11SW2 disposed in the vicinity of both ears of the listener 4 are provided at positions facing the ears of the listener 4, so that the low-range sound reaches the listener 4 with high efficiency.
However, the arrangement position of the speakers is not limited to such a position. For example, as shown in FIG. 21, any position may be used as long as it is on a spherical surface centered on the head of the listener 4 with a radius of (dsw + the radius of the head of the listener 4).
However, as the arrangement position of the speakers, the space on the front side of the face of the listener 4 is not preferable, and it is desirable that the speakers be located in the space on the rear side of the face of the listener 4, as shown in FIG. 21.
[0234]
Further, in the above embodiments, the speakers 11SW1 and 11SW2 arranged in the vicinity of both ears of the listener 4 are subwoofers, but they need not always be provided as subwoofers.
[0235]
In addition, the method of attaching the speaker units, such as the speakers 11SW1 and 11SW2, so that the sounds emitted from the front and back of the diaphragm can be added is not limited to the structure of attaching them to a pipe as in the above-described embodiments; a speaker unit for low-range reproduction may instead be attached to a plate having a relatively large number of holes so that the sounds emitted from the front and back of the diaphragm can be added through the plurality of holes.
[0236]
Moreover, in the above-described embodiments, the speakers 11SW1 and 11SW2 are fixedly attached to the chair, but the manner of holding the speakers 11SW1 and 11SW2 is not restricted to this.
For example, each of the speakers 11SW1 and 11SW2 may be held by a floor stand, or may be suspended from the ceiling.
[0237]
Further, in the first embodiment, the target of the virtual sound source processing is the audio signals of the rear left and right two channels, and in the second embodiment it is the audio signals of the rear left and right two channels, the front left and right two channels, and the center channel. However, the rear left and right two channels may instead be acoustically reproduced by real speakers, and only the front left and right two channels and the center channel may be subjected to virtual sound source processing.
In that case, it is preferable that the real rear left and right two-channel speakers be attached together to the holders of the speakers 11SW1 and 11SW2 so that they are located near the listener 4 and the reproduction volume can be reduced.
[0238]
Also, although the above embodiments have been described for a system that reproduces 5.1-channel multi-channel audio signals, the present invention is not limited to 5.1 channels and is applicable to any sound reproduction system that reproduces audio signals of a plurality of channels.
[0239]
Moreover, in the above-described embodiments, the marker units 6L and 6R are configured to emit light by LEDs, but they may instead be configured by reflectors that reflect light from the imaging unit 5 or ambient light so that they are imaged with higher intensity than other parts at the time of imaging.
[0240]
Further, in the above-described embodiments, both the position of the listener 4 with respect to the speakers 11SW1 and 11SW2 and the movement of the head (for example, the orientation of the face) are detected, and the head related transfer function corresponding to both is read out. However, in the present invention, only the position of the listener 4 with respect to the speakers 11SW1 and 11SW2, or only the movement of the head of the listener (face orientation or the like), may be detected, and a head related transfer function measured and stored in advance corresponding to only the position of the listener 4 with respect to the speakers 11SW1 and 11SW2, or only the head movement (face orientation or the like) of the listener, may be read out from the storage unit.
[0241]
In the above embodiments, the position and face orientation of the listener 4 with respect to the speakers 11SW1 and 11SW2 are detected from the image captured by the imaging unit using the image sensor, but the method for detecting the position and the orientation of the face is not limited to this.
[0242]
For example, the detection may be performed by emitting ultrasonic waves toward the head and
the marker unit and detecting the reflected wave.
[0243]
The above description concerns the case where the present invention is applied to a sound reproduction system in which so-called bare speaker units are disposed near the listener's ears, but the present invention is also applicable to virtual sound source processing in the case where speakers housed in speaker boxes are disposed near both ears of the listener.
[0244]
FIG. 1 is a diagram for explaining an outline configuration example of a first embodiment of the sound reproduction system according to the present invention.
FIG. 2 is a diagram for explaining a speaker arrangement example in the sound reproduction system of the first embodiment.
FIG. 3 is a diagram for explaining a speaker arrangement example in the sound reproduction system of the first embodiment.
FIG. 4 is a diagram used for explaining the operation of an embodiment of the sound reproduction system according to the present invention.
FIG. 5 is a diagram for explaining a speaker arrangement example in the sound reproduction system of the first embodiment.
FIG. 6 is a diagram for explaining a speaker arrangement example in the sound reproduction system of the first embodiment.
FIG. 7 is a block diagram showing a configuration example of the audio signal output device unit in the sound reproduction system of the first embodiment.
FIG. 8 is a diagram for explaining the head rear transfer characteristic used for the virtual sound source processing.
FIG. 9 is a diagram for explaining an example of the head rear transfer characteristics stored in the storage unit.
FIG. 10 is a diagram showing a configuration example of the audio signal receiving unit.
FIG. 11 is a diagram showing a configuration example of the imaging unit.
FIG. 12 is a diagram for explaining an example of an image captured by the imaging unit.
FIG. 13 is a diagram used for explaining a method of detecting the orientation of the face of the listener in an embodiment of the present invention.
FIG. 14 is a diagram used for explaining a method of detecting the orientation of the face of the listener in an embodiment of the present invention.
FIG. 15 is a diagram used for explaining a method of detecting the orientation of the face of the listener in an embodiment of the present invention.
FIG. 16 is a diagram used for explaining a method of detecting the orientation of the face of the listener in an embodiment of the present invention.
FIG. 17 is a diagram for explaining an outline configuration example of a second embodiment of the sound reproduction system according to the present invention.
FIG. 18 is a diagram for explaining a speaker arrangement example in the sound reproduction system of the second embodiment.
FIG. 19 is a block diagram showing a configuration example of the audio signal output device unit in the sound reproduction system of the second embodiment.
FIG. 20 is a diagram for explaining the head front transfer characteristic used for the virtual sound source processing.
FIG. 21 is a diagram for explaining another speaker arrangement in an embodiment of the sound reproduction system according to the present invention.
FIG. 22 is a diagram for explaining a general speaker arrangement example in a conventional sound reproduction system.
Explanation of Reference Numerals
[0245]
3: audio signal output device unit, 4: listener, 5: imaging unit, 6L and 6R: marker units, 11FL: front left channel speaker, 11FR: front right channel speaker, 11C: center channel speaker, 11RL: rear left channel speaker, 11RR: rear right channel speaker, 11SW1 to 11SW4: speakers arranged in the vicinity of the listener's ears, 20: chair, 22: speaker holder, 111: head rear transfer function storage unit, 113: captured image analysis unit, 115: head front transfer function storage unit, 310: rear transfer function convolution circuit, 320: front transfer function convolution circuit