JP2007274061
Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2007274061
An apparatus reproduces the sound field of a video, in particular audio synchronized with
an object in the video, by outputting to a speaker array, and accurately expresses the sound image
localization of the video content. An input unit (4), a parameter calculation unit (5), and a
signal processing unit (6) are provided. The input unit 4 receives, for each object, an audio signal
41 attached to one or more objects appearing in the video, together with position information 42
giving the position of each object in the video, synchronized with the video. The signal processing
unit 6 processes the input audio signal and outputs a digital audio signal to each channel of a
plurality of speakers. Based on the position information of the object, a virtual sound source is
set behind the display on which the video is shown, and various parameters 51, 52, 53 for forming
the sound field of the virtual sound source are calculated for each channel. The signal processing
unit 6 outputs a digital audio signal to each channel of each speaker based on these parameters.
[Selected figure] Figure 2
Sound image localization apparatus and AV system
[0001]
The present invention relates to an apparatus for reproducing, using a loudspeaker array, audio
synchronized with an object in a video.
[0002]
2. Description of the Related Art Conventionally, audio systems such as 2-channel stereo and
5.1-channel surround have been put to practical use in order to reproduce the audio accompanying
a video with a spatial spread matching the video.
10-05-2019
1
[0003]
On the other hand, for recent personal computer games and video game machines, a technology has
been proposed for more accurately localizing the sound image of the sound emitted by an object
(a character or thing) at the position where that object appears in the video (for example,
Patent Document 1).
In the apparatus of Patent Document 1, a sound image can be localized in the vertical direction by
using stereo headphones having speakers respectively disposed above and below each of the left and
right ears.
[Patent Document 1] Japanese Unexamined Patent Publication No. 6-165877
[0004]
However, although the device of Patent Document 1 can localize the sound image in the vertical
direction, unlike conventional stereo headphones, it cannot give the sound image a sense of
perspective or depth. Further, in the device of Patent Document 1, the use of headphones is
essential, and a sound image cannot be localized using a speaker set installed in a room.
[0005]
SUMMARY OF THE INVENTION It is an object of the present invention to provide an apparatus
for three-dimensionally localizing, using a speaker array, sound synchronized with the movement
of an object in a video.
[0006]
In the present invention, means for solving the above-mentioned problems are configured as
follows.
[0007]
(1) The present invention provides a sound image localization apparatus comprising: an input unit
for inputting an audio signal associated with an object included in a video and position
information of the object in the video; a control unit for controlling a speaker array having a
plurality of speakers arranged in an array; and a localization control unit for controlling the
timing at which the audio signal is input to each speaker so as to form an audio beam focused on
the position indicated by the position information.
[0008]
In the present invention, the input unit receives the position information of the object in the
video, and the audio is output from the speakers arranged in an array.
Based on the position indicated by the position information and the arrangement positions of the
array speakers, the timing at which the audio signal is input to each speaker is controlled so as
to form an audio beam focused on the position indicated by the position information.
Thus, when the audio is output from the speaker array, it reproduces sound that has spread out
from the virtual sound source and reached the speaker array, and the sound image can be localized
at the position indicated by the position information of the object in the video.
[0009]
Further, according to the present invention, this sound image localization can be set at any
position without requiring mechanical elements such as a ceiling or a wall.
Furthermore, according to this apparatus, the Doppler effect can be reproduced accurately as a
result of sound image localization, instead of being simulated as a sound effect.
[0010]
The control of the timing at which the audio signal is input can be realized, for example, by
calculating the position of the virtual sound source from the position information of the object
in the video, and delaying the audio signal output to each speaker by the delay amount obtained by
dividing the distance from the position of the virtual sound source to each speaker of the array
by the speed of sound. Further, in the present invention, as long as the input of the audio signal
and the input of the position information reach the signal processing unit through the "input
unit", these inputs may be data stored in an external storage unit (hard disk, magneto-optical
disk, etc.) or streaming data provided through a communication line.
Further, this "input unit" can be, for example, a temporary storage device (memory).
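The delay rule described above, distance from the virtual sound source to each speaker divided by the speed of sound, can be sketched as follows. This is only an illustration: the speaker layout, the coordinates, and the 343 m/s figure are assumptions, not values from the patent.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, an assumed room-temperature value

def delays_for_virtual_source(source_xy, speaker_xys):
    # Delay for each speaker = distance from the virtual sound source
    # to that speaker, divided by the speed of sound.
    return [math.dist(source_xy, sp) / SPEED_OF_SOUND for sp in speaker_xys]

# Hypothetical layout: eight speakers in a row, 0.2 m apart,
# with a virtual sound source 2 m behind the array plane.
speakers = [(0.2 * i, 0.0) for i in range(8)]
delays = delays_for_virtual_source((0.7, 2.0), speakers)
```

Speakers nearer the virtual source receive smaller delays, so their wavefronts depart first and the combined output curves as if radiated from the virtual source.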
[0011]
(2) The present invention is characterized in that the localization control unit further controls a
level at which the audio signal is input to each speaker.
[0012]
The present invention controls the level at which the audio signal is input to each speaker, so
that the sound image can be localized more accurately.
[0013]
(3) The present invention is characterized in that the position information includes distance
information in a depth direction of a display screen on which the video is displayed.
[0014]
According to the present invention, since the position information includes distance information
in the depth direction, sound image localization with a sense of depth, which could not be
obtained conventionally, can be performed.
In addition, since the frequency modulation due to the Doppler effect described above is stronger
when the object moves in the depth direction, the sound image localization becomes more
three-dimensional.
[0015]
(4) In the present invention, the input unit inputs audio signals respectively associated with a
plurality of objects included in a video, and position information of each object in the video;
and the localization control unit generates an audio beam for each object based on the position
indicated by its position information and synthesizes these beams.
[0016]
Since the present invention generates audio beams based on the position information of a plurality
of objects in the video, when the combined output of these audio beams is emitted from the array
speaker, sound localized at the positions of the virtual sound sources of the plurality of objects
can be emitted simultaneously.
Since this creates a relative difference between the positions of the virtual sound sources at
which the sounds associated with these objects are localized, the sound image localization becomes
more three-dimensional.
[0017]
(5) The present invention provides a content reproduction apparatus comprising: a video output
unit that outputs a video signal including an object and outputs position information of the
object in the video signal; an audio output unit that outputs, in synchronization with the video
signal, an audio signal associated with the object; a display unit that receives the video signal
and displays the video; and a speaker array that is installed side by side with the display unit
and has a plurality of speakers arranged in an array, connected to the sound image localization
apparatus according to any one of (1) to (4).
[0018]
In the content reproduction apparatus according to the present invention, in synchronization with
the video signal output from the video output unit, the audio output unit outputs to the speaker
array the audio signal associated with the object, focused at the position indicated by the
position information according to the position on the video, so that the sound image can be
localized according to the position on the video.
[0019]
According to the present invention, accurate sound image localization can be performed based on
the position of an object in a video.
Also, this sound image localization can be set at any position without the need for mechanical
elements such as a ceiling or a wall.
Moreover, according to this apparatus, the Doppler effect can be reproduced accurately, not by
simulating a sound effect, but as a result of accurate sound image localization based on the
parameters for setting the delay, the parameters for setting the attenuation of the volume, and
so on.
[0020]
<Overview of AV System of First Embodiment> An overview of an AV system 1 including the
sound image localization apparatus 10 of the present embodiment will be described with
reference to FIG. 1.
FIG. 1 is a view showing the appearance and configuration of an AV system 1 according to the
present embodiment.
FIG. 1A is a front view of the AV system 1, and FIG. 1B is a plan view of the AV system 1. The AV
system 1 includes a content reproduction device 2, a sound image localization device 10, a
speaker array 11 including a plurality of speakers SPi (i = 1 to N), and a display 12. The viewer
100 is viewing from the front of the speaker array 11 and the display 12.
[0021]
The content reproduction device 2 outputs a video signal to the display 12 and also outputs
position information to the sound image localization device 10. Examples of the content
reproduction device 2, which outputs video and audio simultaneously, include a PC or TV that
outputs content containing video and audio, such as game programs and movies, and a
teleconferencing device. The sound image localization apparatus 10 outputs to the speaker array 11
an audio beam whose focal point is the virtual sound source position calculated from the positions
of the objects 105A and 105B displayed on the display 12. The speaker array 11 reproduces sound as
if it were emitted from these virtual sound sources and reached the speaker array 11, thereby
localizing the sound images at these virtual sound sources. The display 12 displays an image 104
including one or more objects 105A, 105B.
[0022]
<Description of Configuration of AV System of First Embodiment> The AV system of the first
embodiment will be described with reference to FIGS. 1 and 2. FIG. 1 is a diagram showing the
appearance and configuration of this AV system as described above. FIG. 2 is an internal block
diagram of the sound image localization apparatus 10 included in the AV system 1 shown in FIG.
[0023]
First, the appearance of the AV system 1 will be described with reference to FIG. 1. As shown in FIG.
1A, the AV system 1 includes a content reproduction device 2, a sound image localization device
10, a speaker array 11, and a display 12. The speaker array 11 includes a plurality of speakers
SPi (i = 1 to N), and the sound image localization apparatus 10 has an audio output system
corresponding to each of the speakers SPi (i = 1 to N). The display 12 is configured by, for
example, a plasma display or a liquid crystal display, and displays the video output from the
content reproduction apparatus, in particular a moving image. In FIG. 1, the sound image
localization apparatus 10 is configured as a housing separate from the speaker array 11 and the
display 12, but it may instead be built into the same housing as the speaker array 11 or the
display 12.
[0024]
Next, an outline of the control method for the audio output system of each speaker SPi (i = 1
to N) of the speaker array 11 will be described using the plan view of FIG. 1(B). The objects
105A and 105B shown in FIG. 1(A) are included in the image 104 output from the content
reproduction apparatus 2. The content reproduction apparatus 2 outputs information on the
positions of these objects (hereinafter referred to as "position information") to the sound image
localization apparatus 10. The sound image localization apparatus 10 sets virtual sound sources
101A and 101B based on the position information of the objects 105A and 105B. Then, the sound
image localization apparatus 10 localizes the sound image so that the audio the viewer 100 hears
from the speaker array 11 (shown by the actual wavefront 103) sounds as if it were output from
virtual sound sources 101A and 101B located behind the display 12, as shown in FIG. 1(B) (shown
by the virtual wavefront 102). In order to realize this sound image localization, the sound image
localization apparatus 10 controls the timing of the audio signal input to each speaker SPi
(i = 1 to N) of the speaker array 11. This control of the input timing adjusts the time delay
(phase delay) characteristic (hereinafter referred to as "delay"), the volume, and the frequency
characteristic, and outputs the audio 103 from these speakers SPi. Thereby, the sound image can
be localized at the positions of the virtual sound sources 101A and 101B.
[0025]
Next, the internal configuration of the sound image localization apparatus 10 will be described
using FIG. 2. The sound image localization apparatus 10 includes an input unit 4, a parameter
calculation unit 5, a signal processing unit 6, D/A converters DACi (i = 1 to N), amplifiers
AMPi (i = 1 to N), and a control unit 7 that integrally controls the operation of these units.
The content reproduction apparatus 2 provided outside the sound image localization apparatus 10
includes an external program 111, an external storage unit 31, and a CPU (not shown) for running
the external program 111.
[0026]
The input unit 4 has an interface for inputting each signal and a buffer for storing the input
signal. The Internet, a content reproduction apparatus, etc. are connected to the input unit 4.
From these, the input unit 4 receives the audio signal 41 and the position information 42, and
temporarily stores these data in the buffer.
[0027]
The audio signal 41 is an audio signal input through the external storage unit 31, the external
program 111, or the Internet, and is digital audio signal data output in synchronization with the
image 104 shown in FIG. 1(A). The audio signal 41 is input for each of the objects 105A and 105B
in the image 104 shown in FIG. 1(A). The position information 42 is the position information
output by the content reproduction apparatus 2 shown in FIG. 1(A).
[0028]
The parameter calculation unit 5 of FIG. 2 is configured by a calculation unit. The parameter
calculation unit 5 sets the positions of the virtual sound sources 101A and 101B shown in
FIG. 1(A) based on the position information 42 input to the input unit 4, calculates the delay
parameter 51, the volume adjustment parameter 52, and the high-frequency attenuation parameter 53,
and stores them in the input unit 4 or outputs them to the signal processing unit 6.
[0029]
The delay parameter 51 calculated in the parameter calculation unit 5 of FIG. 2 is set for each
object and for each speaker SPi. For the object 105A shown in FIG. 1(B), the distances from the
virtual sound source 101A to each of the speakers SPi shown in FIG. 1(A) are calculated, and from
these distances the delay in sound propagation is calculated and set for each speaker SPi. In the
example of FIG. 1(A), the distance from the virtual sound source 101B of FIG. 1(B) to each of the
speakers SPi is likewise calculated for the object 105B appearing in the image 104.
The volume adjustment parameter 52 gives the amount of volume attenuation for each speaker SPi,
taking into account the decrease in volume with the sound propagation distance; this attenuation
is calculated for every object. The high-frequency attenuation parameter 53 is a parameter for
attenuating high frequencies based on the distance between the virtual sound source 101A and the
speaker SPi, or on the geometry of the speaker array 11 and the virtual sound source 101A, and is
set for each object and each speaker SPi.
[0030]
Since the distance between the virtual sound source and each speaker SPi (i = 1 to N) changes as
the objects move, the parameter calculation unit 5 updates the values of the delay parameter 51,
the volume adjustment parameter 52, and the high-frequency attenuation parameter 53 accordingly.
[0031]
The signal processing unit 6 processes the digital audio signal data of the audio signal 41 and
calculates the data to be output for each speaker SPi.
The audio signal is input to the signal processing unit 6 in synchronization with the video signal.
Based on the delay parameter 51, the volume adjustment parameter 52, and the high-frequency
attenuation parameter 53 from the parameter calculation unit 5 described above, the signal
processing unit 6 generates, from the audio signal 41 input for each of the objects 105A and 105B,
the digital audio signal data to be sent to each output system. The signal processing unit 6
comprises digital filters DFi (i = 1 to N) that apply the delay based on the delay parameter 51,
and gain adjustments Gi (i = 1 to N) based on the volume adjustment parameter 52 and the
high-frequency attenuation parameter 53 (see FIG. 3 described later). In addition, when there are
a plurality of objects, the data generated in this manner is output after adding the digital audio
signal data for each of the objects 105A and 105B, as described later with reference to FIG. 4.
[0032]
The D/A converters DACi (i = 1 to N) can be constituted by D/A conversion IC chips; they convert
the digital audio signal data generated by the signal processing unit 6 into analog audio signals
and output them to the amplifiers AMPi (i = 1 to N). The amplifiers AMPi (i = 1 to N) can be
configured by, for example, amplification stages such as FETs, or may be external AV amplifiers.
The analog audio signal output from each D/A converter DACi is amplified and sent to the speaker
SPi. The speakers SPi (i = 1 to N) form a speaker array, a unit in which three or more speakers
are arranged, and each receives an independent audio signal. Each speaker converts the analog
audio signal amplified by its amplifier AMPi (i = 1 to N) into sound. The control unit 7 includes,
for example, a CPU, and controls each part of the internal configuration of the sound image
localization apparatus 10.
[0033]
The external program 111 runs on the content reproduction device 2. When the external program 111
is executed, the audio signal 41 is read from the external storage unit 31 and output to the input
unit 4 simultaneously with the reproduction of the content by the external program 111 or another
program, and the calculated position information 42 is likewise output to the input unit 4.
[0034]
The external storage unit 31 is, for example, a hard disk or an optical disk device, and supplies
the audio signal 41 to the input unit 4.
[0035]
Here, among the configurations shown in FIG. 2, the external program 111, the audio signal 41,
and the position information 42 will be specifically described by citing two examples according
to applications of the AV system 1.
<< When Running a Game Program on a PC >> The content reproduction device 2 can be configured,
for example, by a sound board connected to a PC, and the game program runs on this PC. The game
program corresponds to the external program 111 shown in FIG. 2; it outputs the position
information 42 of each object (including characters and things) in the video, calculated
internally, to the input unit 4, which temporarily stores the position information 42. Further,
the game program outputs audio signals 41A and 41B as the sound effects corresponding to the
objects 105A and 105B in the image 104 shown in FIG. 1. The audio signal 41 is stored in the input
unit 4 in response to an instruction to emit a sound effect when the game program is executed.
[0036]
<< When Reproducing Digital Content by a Content Reproduction Program on a PC >> The content
reproduction device 2 can be configured, for example, by a sound board connected to a PC (personal
computer) as described above, and the video and audio may be played by a content reproduction
program on the PC. When digital content is reproduced by the content reproduction program, the
speech of a person or the sound of an object is input as the audio signal 41. The content
reproduction program corresponds to the external program 111. Further, from the content
reproduction program, position information 42 synchronized in time series with the movement and
stopping of the object is input to the input unit 4, corresponding to the video that corresponds
to the audio signal 41.
[0037]
<Description of Operation of Signal Processing Unit> Next, the operation of the signal processing
unit 6 will be described in more detail with reference to FIG. 3. FIG. 3 is a block diagram of the
parameter calculation unit 5 and the signal processing unit 6 in the sound image localization
apparatus of the present embodiment, and shows the virtual sound source 101A corresponding to the
object 105A of FIG. 1(B). The D/A converters DACi (i = 1 to N) and the amplifiers AMPi (i = 1 to N)
shown in FIG. 2 are omitted, and the number N of speaker output systems shown in FIGS. 1 and 2 is
eight in FIG. 3. However, in the AV system 1 of the present embodiment, the number N of speaker
output systems is not limited to eight. As shown in FIG. 3, the signal processing unit 6 is
provided with a plurality of digital filters DFi (i = 1 to 8) and gain adjustments Gi (i = 1 to 8).
[0038]
Each digital filter DFi (i = 1 to 8) sets a delay time corresponding to the propagation time over
the distance Li ([m], i = 1 to 8) from the virtual sound source 101A, also illustrated in FIG. 3,
to the speaker SPi (i = 1 to 8), and delays the audio signal input to it before output. The delay
is calculated by the parameter calculation unit 5 by dividing Li by the speed of sound. As
described above, since the video signal and the audio signal input to the signal processing unit 6
are synchronized, delaying the audio signal before output produces differences in arrival time
corresponding to the depth of the object appearing in the video, which makes it possible to give a
sense of depth to the sound.
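The per-channel delay just described can be sketched in terms of whole samples. The sample rate, the distances, and the rounding to integer samples are illustrative assumptions; fractional delays are the subject of the ring-buffer discussion later in the document.

```python
FS = 48_000   # sampling rate in Hz (assumed)
C = 343.0     # speed of sound in m/s (assumed)

def delay_in_samples(Li):
    # Delay for one speaker channel: Li / c, expressed in whole samples.
    return round(Li / C * FS)

def apply_delay(signal, n):
    # Prepend n samples of silence so this channel starts n samples later.
    return [0.0] * n + list(signal)

# Hypothetical distances L1..L8 from the virtual source to the 8 speakers.
Li = [2.10, 2.04, 2.00, 1.98, 1.98, 2.00, 2.04, 2.10]
channels = [apply_delay([1.0, 0.5], delay_in_samples(d)) for d in Li]
```

The central speakers (smallest Li) fire first, so the array's combined wavefront bulges outward as if it had left the virtual source behind the display.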
[0039]
When the object 105A moves in the image, the parameter calculation unit 5 of FIG. 2 sequentially
determines the position of the virtual sound source 101A from the position of the object 105A; as
a result, the position of the virtual sound source 101A varies, which causes the distance Li to
vary in time. When the delay thus changes with time, the density of the sound waves changes, so
the Doppler effect is accurately reproduced as a result of sound image localization. For the gain
adjustments Gi (i = 1 to 8), the signal processing unit 6 applies the volume adjustment parameter
52 calculated by the parameter calculation unit 5. Since the sound volume is inversely
proportional to the square of the distance Li (i = 1 to 8), a reference volume is determined and
the gain is calculated as: reference volume × 1 / Li² (with Li in meters). Further, a frequency
characteristic filter that attenuates the high range is convolved according to the high-frequency
attenuation parameter 53 of FIG. 2. As described above, the high-frequency attenuation parameter
53 calculated by the parameter calculation unit 5 attenuates the high range based on the distance
Li between the virtual sound source 101A and the speaker SPi, or on the angle formed by the
speaker array 11 and the virtual sound source 101A. The amount of high-range attenuation is
calculated, for example, from the angle αi (i = 1 to 8) between the plane SF on which the speakers
SPi are arranged and the direction of the virtual sound source 101A viewed from each speaker SPi,
as (cos(αi) × 1 / Li²) times the flat part of the frequency response (with Li in meters).
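The two gain formulas above can be transcribed directly. This is only a sketch of the stated expressions; the function names, the `reference` argument, and passing the angle in radians are assumptions.

```python
import math

def volume_gain(Li, reference=1.0):
    # Volume adjustment: inverse-square law,
    # reference volume x 1 / Li^2 (Li in meters).
    return reference * 1.0 / (Li ** 2)

def high_freq_attenuation(Li, alpha):
    # High-frequency attenuation factor: cos(alpha_i) x 1 / Li^2,
    # where alpha_i is the angle between the speaker plane SF and the
    # direction of the virtual source as seen from speaker SPi.
    return math.cos(alpha) * 1.0 / (Li ** 2)
```

For a speaker 2 m from the virtual source and facing it head-on (αi = 0), both factors come to 0.25 of the reference level.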
[0040]
The speaker array 11 and the display 12 do not necessarily have to be integrated as shown in
FIG. 1.
[0041]
Further, as an application of the AV system 1 including the sound image localization apparatus 10
of the present embodiment, when the speaker array 11 is installed at a distance in front of the
display 12, that distance can also be taken into account. The input unit 4 can be configured to
receive, as an input, the distance between the speakers and the display in the viewer's listening
direction.
For example, the distance between the speakers and the display 12 in the listening direction is
input to the sound image localization apparatus 10 from an operation unit (not shown) that
instructs the control unit 7, and is set to be added to the distances Li (i = 1 to N). In this
case, the parameter calculation unit 5 calculates the delay-setting parameters by adding the delay
corresponding to this additional distance to the delay corresponding to the distance between the
position of the virtual sound source and each speaker. Since the delay thus takes into account the
distance between the display 12 and the speakers SPi, the degree of freedom in installing the
display 12 and the speakers SPi is increased. For example, if the object 105A shown on the display
12 is large, the position of its virtual sound source is near, and if it is small, the position of
its virtual sound source is far; this perspective does not always coincide with the scale of the
distance between the speakers and the display in the viewer's listening direction. In such a case,
by adding a delay obtained by dividing the distance between the speakers and the display in the
viewer's listening direction by the speed of sound, a sound image in the depth direction can be
reproduced relative to the display.
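The added delay described here is a one-line adjustment; this sketch uses hypothetical names and an assumed speed of sound.

```python
C = 343.0  # speed of sound in m/s (assumed)

def total_delay(Li, display_offset):
    # The display-to-speaker distance in the listening direction is
    # divided by the speed of sound and added to the virtual-source delay.
    return (Li + display_offset) / C
```

Adding 1.43 m of display offset to a 2.0 m source distance yields the same delay as a 3.43 m source distance with no offset, which is exactly the intended effect of shifting the depth reference to the display.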
[0042]
<Description of operations of the parameter calculation unit and signal processing unit when
multiple objects appear> Next, with reference to FIG. 4, the operations of the parameter
calculation unit 5 and the signal processing unit 6 when multiple objects appear in the image will
be described. FIG. 4 shows the configuration of the parameter calculation unit 5 and the signal
processing unit 6 inside the sound image localization apparatus of the present embodiment when
both objects 105A and 105B shown in FIG. 1 appear. As shown on the right side of FIG. 4, as in
FIG. 3, the digital filters DFAi (i = 1 to 8) and the gain adjustments GAi (i = 1 to 8) needed to
process the audio signal 41A and localize the virtual sound source 101A for the object 105A are
provided.
[0043]
Further, as shown in FIG. 4, the configuration of FIG. 3 is extended with the digital filters DFBi
(i = 1 to 8) and the gain adjustments GBi (i = 1 to 8) needed to process the audio signal 41B and
localize the virtual sound source 101B for the object 105B. The suffixes "A" and "B" in FIG. 4
correspond to the objects 105A and 105B: the digital filter DFi becomes DFAi and DFBi, and the
gain adjustment Gi becomes GAi and GBi. In the configuration shown in FIG. 4, the digital audio
output necessary for localizing the virtual sound sources 101A and 101B is calculated for each
speaker SPi using DFAi, DFBi, GAi, and GBi in the same manner as described for FIG. 3.
Furthermore, the adders ADDi (i = 1 to 8) add the calculated digital audio outputs together. Thus,
even when the objects 105A and 105B move independently, their sound images can be localized
independently.
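The adders ADDi amount to a per-speaker sum of the channel signals generated independently for each object. A minimal sketch, with the list-of-lists signal representation as an assumption:

```python
def mix_channels(channels_a, channels_b):
    # Adders ADDi: sum, per speaker, the outputs computed independently
    # for objects A and B, padding the shorter signal with silence.
    mixed = []
    for a, b in zip(channels_a, channels_b):
        n = max(len(a), len(b))
        a = list(a) + [0.0] * (n - len(a))
        b = list(b) + [0.0] * (n - len(b))
        mixed.append([x + y for x, y in zip(a, b)])
    return mixed
```

Because the sum is linear, each object's beam still focuses at its own virtual source even after mixing.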
[0044]
Note that when three or more objects (including 105A and 105B) appear, three or more of the
configurations shown in FIG. 3 are provided, and their output signals are added by the adders
ADDi.
[0045]
<Operation Flow of Parameter Calculation Unit and Signal Processing Unit when an Object Moves>
Next, the operation flow of the parameter calculation unit 5 and the signal processing unit 6 when
an object moves will be described using FIG. 5, which shows this flow. In S1 of FIG. 5, it is
determined whether the positions of the virtual sound sources 101A and 101B shown in FIGS. 1(A),
3, and 4 have changed. If a position has changed (Y), the flow proceeds to S2; if not (N), the
current parameters 51, 52, 53, such as the delay, are maintained. In the determination in S1, the
change in the coordinates of the virtual sound sources 101A and 101B is detected by the parameter
calculation unit 5 shown in FIG. 2, but this change originates in a change in the position
information 42.
[0046]
In S2 of FIG. 5, the distances Li (i = 1 to N) between the virtual sound source 101A and the
speakers SPi (i = 1 to N) are recalculated as shown in FIGS. 3 and 4 (and likewise for the virtual
sound source 101B). In S3, the parameter calculation unit 5 calculates the delay parameter 51, the
volume adjustment parameter 52, the high-frequency attenuation parameter 53, and so on from the
recalculated distances Li, as described for FIG. 3. In S4, the parameters obtained in S3 are set
anew.
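The S1 to S4 flow can be sketched as a single update step; the `recompute` callback standing in for S2 to S4 is a hypothetical abstraction, not part of the patent.

```python
def update_step(prev_positions, new_positions, recompute):
    # S1: has any virtual-source position changed?
    if new_positions != prev_positions:
        # S2-S4: recalculate distances, derive the delay / volume /
        # high-frequency parameters, and set them anew.
        return recompute(new_positions)
    # N branch: keep the current parameters unchanged.
    return None
```

Calling this once per video frame keeps the parameters tracking the objects without recomputing anything while they are at rest.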
[0047]
As described above, according to the operation flow shown in FIG. 5, even when the objects 105A
and 105B move, the respective sound images can be localized based on the movement.
[0048]
<Specific Configuration of Digital Filter> Next, a specific configuration of the digital filter
DFi (i = 1 to N) will be described with reference to FIG. 6.
FIG. 6 is a specific block diagram of the digital filter of the sound image localization apparatus
of the present embodiment. The digital filters DFi shown in FIGS. 3 and 4 are actually realized by
writing to and reading from the ring buffer 62.
[0049]
As shown in FIG. 6, the ring buffer 62 is a buffer set up in memory and divided into a plurality
of data sections 63. This buffer needs to operate at high speed to keep up with the input and
output of digital audio signals. One sample of data is input to and output from each data
section 63.
[0050]
When writing sample data to the ring buffer 62 of FIG. 6, the audio signal input to the input unit
4 of FIG. 2 is written into the data sections 63 starting from the writing position 64 and
advancing along the data writing path 65, wrapping around the ring so that the writing of sample
data repeats.
[0051]
Sample data are read from the ring buffer 62 of FIG. 6 by providing reading positions TAPi (i = 1
to N) and reading the sample data that were written to the ring buffer 62.
The read path 66 of each reading position TAPi runs in the same direction as the data writing path 65.
The sample data read at the reading position TAPi are sent to the gain adjustments Gi (i = 1 to N) as
shown in FIG. 3.
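The writing along path 65 and the delayed reading at the positions TAPi can be sketched as follows. This is a minimal illustration with integer tap delays; the buffer size and tap bookkeeping are assumptions, not the patent's actual implementation:

```python
class DelayLine:
    """Sketch of the ring buffer 62: a write position 64 that advances
    and wraps (data writing path 65), plus per-speaker read positions
    TAPi that trail the write position by their delay. A tap with delay
    d therefore yields the sample written d steps earlier.
    """
    def __init__(self, size, tap_delays):
        self.data = [0.0] * size          # the data sections 63
        self.size = size
        self.write_pos = 0                # writing position 64
        self.tap_delays = tap_delays      # one integer delay per speaker

    def process(self, sample):
        self.data[self.write_pos] = sample
        # read positions TAPi trail the write position by their delays
        outs = [self.data[(self.write_pos - d) % self.size]
                for d in self.tap_delays]
        self.write_pos = (self.write_pos + 1) % self.size
        return outs

dl = DelayLine(16, tap_delays=[0, 3])          # two taps: 0 and 3 samples
history = [dl.process(float(n)) for n in range(6)]
```

The second tap's output reproduces the input three samples late, which is exactly the per-channel delay that the delay parameter 51 encodes.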
[0052]
Reading at the reading position TAPi in FIG. 6 is performed with the reading position TAPi
delayed relative to the writing position 64 by a predetermined delay amount. This delay amount
is the value of the delay parameter 51 calculated by the parameter calculation unit 5, as
described with reference to FIGS. 2 and 3 above. The delay amount fluctuates with the position of
the object 105A. That is, when the position of the object 105A changes as described in FIGS. 1
and 2, the parameter calculation unit 5 moves the position of the virtual sound source 101A
accordingly, as described in S1 of FIG. 5. Since this movement is not instantaneous but proceeds
at some finite speed, the delay parameter 51 set by the parameter calculation unit 5 changes
from moment to moment. The fluctuation of the delay parameter 51 causes the reading position
TAPi to move faster or slower than the normal sampling speed that applies when the object 105A
is at rest. This fluctuation in the moving speed of the reading position TAPi produces frequency
modulation, that is, a Doppler effect. This Doppler effect is not an after-the-fact sound effect
added as in conventional systems; it arises naturally from accurate sound image localization, the
frequency shifting with the change in velocity.
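The link between a changing tap delay and the resulting pitch shift can be sketched numerically. The formulas below are a first-order illustration (delay τ = Li/c, output frequency scaled by 1 − dτ/dt); the speed of sound and the velocities are assumed values, not figures from the patent:

```python
SOUND_SPEED = 343.0  # m/s, assumed

def read_speed(delay_change_per_sample):
    """If the tap's delay grows by d samples every sample, the tap
    effectively advances at (1 - d) times the nominal sampling speed,
    and the reproduced frequency is scaled by the same factor."""
    return 1.0 - delay_change_per_sample

def doppler_factor(radial_velocity):
    """Frequency scaling for a source moving at radial_velocity m/s
    toward the listener (positive = approaching). With delay Li/c,
    the delay shrinks by v/c seconds per second, so the read speed
    becomes 1 + v/c — the first-order Doppler shift that the moving
    read position TAPi produces implicitly."""
    return 1.0 + radial_velocity / SOUND_SPEED

approaching = doppler_factor(34.3)    # source closing at 34.3 m/s
receding = doppler_factor(-34.3)      # source receding at 34.3 m/s
```

An approaching object raises the reproduced frequency and a receding one lowers it, with no separate "Doppler effect unit" in the signal path.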
[0053]
As described above, when the moving speed of the reading position TAPi of FIG. 6 changes, the
reading position takes on fractional values rather than integral multiples of the sampling period
of the D/A converter DACi. In this case, the digital filter can interpolate so as to correspond to a
finer sampling rate. It is also possible simply to round the fractional position and output the
nearest sample.
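A minimal sketch of the two options follows. Linear interpolation between adjacent data sections is shown here only as one possibility — the patent does not specify the interpolation method:

```python
def read_fractional(buffer, pos):
    """Read a non-integer position from a ring buffer by linear
    interpolation between the two adjacent data sections, approximating
    a finer sampling rate. (Assumes pos >= 0.)"""
    n = len(buffer)
    i = int(pos) % n
    frac = pos - int(pos)
    return (1.0 - frac) * buffer[i] + frac * buffer[(i + 1) % n]

buf = [0.0, 10.0, 20.0, 30.0]
mid = read_fractional(buf, 1.5)        # halfway between 10.0 and 20.0
rounded = buf[round(1.5) % len(buf)]   # the simple rounding alternative
```

Interpolation trades a little computation for smoother output when the tap speed fluctuates; rounding is cheaper but introduces small waveform steps.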
[0054]
When three or more objects appear, not just the objects 105A and 105B, three or more of the
ring buffers shown in FIG. 6 are provided, and the output signal systems are summed for each
speaker by the adder ADDi.
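The per-speaker summation by the adder ADDi can be sketched as a bare sum of the per-object channel signals; any limiting or clipping protection before the D/A converter is omitted here:

```python
def mix_for_speakers(per_source_outputs):
    """Adder ADDi: for each speaker channel i, sum the delayed and
    gain-adjusted contribution of every object's ring buffer. Each
    element of per_source_outputs is one object's list of per-speaker
    samples for the current sampling instant."""
    num_speakers = len(per_source_outputs[0])
    return [sum(src[i] for src in per_source_outputs)
            for i in range(num_speakers)]

# two objects (105A, 105B), three speakers, one sampling instant
object_a = [0.2, 0.5, 0.1]
object_b = [0.3, 0.1, 0.4]
mixed = mix_for_speakers([object_a, object_b])
```

Adding a third object means adding a third ring-buffer chain and a third row to the sum — the speaker count and adders are unchanged.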
[0055]
<Description of AV System According to an Application of the Present Embodiment> Next, as an
application of the AV system 1 of the present embodiment, digital content such as a movie in
which a sound effect with spatial spread is output for each object, and a sound image localization
apparatus used with such content, will be described with reference to FIG. 7.
FIG. 7 is a diagram showing a sound image localization apparatus 10A according to an
application of the sound image localization apparatus 10 of the present embodiment. Parts
identical to those shown in FIG. 2 are given the same reference signs, and duplicate description is
omitted. As shown in FIG. 7, in the sound image localization apparatus 10A, for digital content
recorded on a DVD or the like, the delay parameter 51, the volume adjustment parameter 52, and
the high-frequency attenuation parameter 53 — which in FIG. 2 are calculated one by one by the
parameter calculation unit 5 — have already been calculated and recorded in the external
storage unit 31. The external program 111 supplies the recorded data as the sound image
localization parameter 45, and the sound image localization apparatus 10A shown in FIG. 7 reads
these parameters and outputs audio to the speakers SPi (i = 1 to N). As a result, as in the AV
system 1 shown in FIG. 2, the audio signal output from the content reproduction system can be
sound-image-localized at the position of the virtual sound source corresponding to the object
shown in the video.
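How the precomputed parameters are serialized is not specified in this passage; as one hypothetical sketch, a JSON record per channel could stand in for the data held in the external storage unit 31 (all field names here are invented for illustration):

```python
import json

# Hypothetical serialized parameters, as if read from external storage 31
# for a five-speaker array.
stored = json.dumps({
    "delay_samples": [12, 7, 0, 7, 12],
    "gains":         [0.5, 0.8, 1.0, 0.8, 0.5],
    "hf_cutoffs_hz": [9000, 14000, 20000, 14000, 9000],
})

# the external program 111 supplies this as localization parameter 45
loaded = json.loads(stored)

def channel_settings(p, i):
    """Settings for speaker SPi read directly from precomputed data,
    bypassing runtime calculation by the parameter calculation unit 5."""
    return (p["delay_samples"][i], p["gains"][i], p["hf_cutoffs_hz"][i])

center = channel_settings(loaded, 2)
```

Precomputing shifts the per-frame parameter work from playback time to authoring time, which is the point of the FIG. 7 variant.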
[0056]
<Other Inventions> The following inventions are also conceivable. (A) The present invention is
characterized in that the localization control unit calculates a parameter for setting highfrequency attenuation based on the distance or angle between the position of the virtual sound
source and each of the speakers.
[0057]
With this structure, since high-frequency attenuation is taken into account in the sound image of
each object, the sense of perspective between objects can be expressed more accurately by the
audio.
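As a hypothetical realization (the patent does not name a filter type), the high-frequency attenuation of (A) could be a one-pole low-pass whose cutoff falls as the virtual source recedes:

```python
import math

SAMPLE_RATE = 48000.0  # Hz, assumed

def one_pole_lowpass(samples, cutoff_hz):
    """Simple one-pole low-pass standing in for the high-frequency
    attenuation parameter 53: a farther virtual source is given a
    lower cutoff, so its highs are damped more. Illustrative only."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / SAMPLE_RATE)
    y, out = 0.0, []
    for x in samples:
        y = (1.0 - a) * x + a * y   # y[n] = (1-a) x[n] + a y[n-1]
        out.append(y)
    return out

impulse = [1.0] + [0.0] * 7
near = one_pole_lowpass(impulse, 18000.0)   # near source: mild damping
far = one_pole_lowpass(impulse, 2000.0)     # far source: strong damping
```

Comparing the impulse responses, the "far" setting spreads the energy over more samples, which is what dulls the highs and deepens the sense of distance.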
[0058]
(B) The present invention is a sound image localization program that causes a computer to
execute, using an audio signal associated with an object included in a video stored in an input
unit and position information of the object in the video, a localization control step of controlling,
for a speaker array having a plurality of speakers arranged in an array, the timing at which the
audio signal is input to each speaker so as to form an audio beam focused on the position
indicated by the position information.
[0059]
When the program of the present invention is executed with this configuration, the same effect
as in (1) above can be obtained.
[0060]
(C) According to the present invention, in the sound image localization program described in (B),
the localization control step further calculates a parameter for delay setting by using input data,
stored in a memory, of the distance between the speakers and the display in the listening
direction of the viewer, adding a delay corresponding to that distance to the delay corresponding
to the distance between the position of the virtual sound source and each of the speakers; the
sound image localization program then controls, based on this parameter, the timing at which the
audio signal is input to each speaker.
[0061]
The present invention uses the input data, stored in the memory, of the distance between the
speakers and the display in the listening direction of the viewer, adds a delay corresponding to
that distance to the delay corresponding to the distance between the position of the virtual
sound source and each of the speakers to obtain the parameter for delay setting, and controls,
based on this parameter, the timing at which the audio signal is input to each speaker. Therefore,
even when there is a difference between the positions of the display and the speakers, the effect
of (B) can still be obtained.
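The delay composition of (C) can be sketched as a sum of two path lengths converted to samples. The speed of sound and the sampling rate are assumed values:

```python
SOUND_SPEED = 343.0   # m/s, assumed
SAMPLE_RATE = 48000   # Hz, assumed

def total_delay_samples(source_to_speaker_m, speaker_to_display_m):
    """Delay of (C): the per-speaker delay for the virtual-source
    distance plus a delay compensating the speaker-to-display offset
    along the viewer's listening direction. Illustrative formula."""
    total_m = source_to_speaker_m + speaker_to_display_m
    return total_m / SOUND_SPEED * SAMPLE_RATE

with_offset = total_delay_samples(2.0, 0.5)      # speakers 0.5 m behind display
without_offset = total_delay_samples(2.0, 0.0)   # speakers at the display plane
```

Adding the common offset keeps the wavefronts timed as if they had originated at the display plane, even when the speaker array sits in front of or behind it.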
[0062]
[FIG. 1] Conceptual diagram of the AV system according to the present embodiment.
[FIG. 2] Internal configuration diagram of the sound image localization apparatus according to the present embodiment.
[FIG. 3] Configuration diagram of the parameter calculation unit and the signal processing unit inside the sound image localization apparatus of the present embodiment.
[FIG. 4] Block diagram of the parameter calculation unit and the signal processing unit inside the sound image localization apparatus of the present embodiment.
[FIG. 5] Operation flow diagram of the parameter calculation unit and signal flow of the signal processing unit of the sound image localization apparatus of the present embodiment when an object moves.
[FIG. 6] Specific block diagram of the digital filter of the sound image localization apparatus of the present embodiment.
[FIG. 7] Diagram showing a sound image localization apparatus according to an application of the sound image localization apparatus of the present embodiment.
Explanation of Reference Signs
[0063]
1: AV system; 10, 10A: sound image localization apparatus; 11: speaker array; 12: display; 2:
content reproduction device; 31: external storage unit; 4: input unit; 41, 41A, 41B: audio signal;
42: position information input; 45: sound image localization parameter; 5: parameter calculation
unit; 51: delay parameter; 52: volume adjustment parameter; 53: high-frequency attenuation
parameter; 6: signal processing unit; 62: ring buffer; 63: data section; 64: writing position; 65:
data writing path; 66: reading path; 7: control unit; 100: viewer; 101A, 101B: virtual sound
source; 102: virtual wavefront; 103: actual wavefront; 104: image; 105A, 105B: object; 111:
external program; SPi (i = 1 to N): speaker; Gi, GAi, GBi (i = 1 to N): gain adjustment; AMPi (i = 1
to N): amplifier; DFi, DFAi, DFBi (i = 1 to N): digital filter; DACi (i = 1 to N): D/A converter; TAPi
(i = 1 to N): reading position; ADDi (i = 1 to N): adder