Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2011155580
PROBLEM TO BE SOLVED: To associate an acoustic signal with one still image obtained from a plurality of still images.
SOLUTION: After a first target input image is acquired by photographing the subject in an imaging range 331, the imaging direction is turned to the right, and the subject in an imaging range 332 is photographed to acquire a second target input image. A target composite image is generated by stitching the two target input images together by image mosaicing. Meanwhile, by performing directivity control on the output acoustic signals of a plurality of microphones, an acoustic signal having a directivity axis on the right side (subject 321 side) of the imaging device is generated for the time when the first target input image is photographed, and an acoustic signal having a directivity axis on the left side (subject 321 side) of the imaging device is generated for the time when the second target input image is photographed. A link acoustic signal obtained by combining the generated acoustic signals in time series is then associated with the target composite image and recorded. [Selected figure] Figure 6
Imaging device
[0001]
The present invention relates to an imaging device such as a digital camera.
[0002]
In recent years, a method of generating a still image with sound (in other words, a still image
with an acoustic signal) has been proposed.
[0003]
For example, Patent Document 1 below discloses a method of recording an acoustic signal for a
predetermined time at the time of shooting a still image.
For example, Patent Document 2 below discloses a method of extracting a still image with sound
from a recorded audio signal and a moving image.
For example, Patent Document 3 below discloses a method in which a plurality of still images are associated with one acoustic signal and, when the acoustic signal is reproduced, the still images are displayed in accordance with the content of the reproduced sound.
[0004]
For example, Patent Document 4 below proposes a method in which a cutout image containing a specific subject is extracted from a captured image while directivity processing is applied to the output acoustic signals of a plurality of microphones to generate an acoustic signal in which the component of the sound emitted from the specific subject is emphasized, and the generated acoustic signal is recorded on a recording medium together with the cutout image.
[0005]
On the other hand, a method of generating one still image from a plurality of still images has also
been proposed.
For example, a method has been proposed in which a panoramic image having a wide angle of view is generated by performing still image shooting a plurality of times while changing the shooting direction and stitching the obtained plurality of still images together by image mosaicing.
[0006]
JP-A-10-294919, JP-A-2006-295575, JP-A-2001-69453, JP-A-2009-147727
[0007]
The above-described conventional methods relating to acoustic signals assign an acoustic signal to a photographed still image itself or to a partial image (cutout image) of a still image; their application to a panoramic image or the like generated from a plurality of still images is not assumed.
Therefore, there is a need to develop a technique for associating an appropriate acoustic signal
with a panoramic image or the like generated from a plurality of still images.
[0008]
Further, the method of Patent Document 4 has high added value in that it can generate and
record an acoustic signal adapted to the angle of view of the cutout image to be recorded.
However, this method is a method used in the context of clipping processing, and is not a method
applied to the frame image itself extracted from the moving image. Therefore, there is a need for
the development of a technique that can associate an appropriate acoustic signal with a frame
image extracted from a moving image.
[0009]
Therefore, an object of the present invention is to provide an imaging apparatus capable of
associating an appropriate acoustic signal with a still image obtained from a plurality of still
images or a still image extracted from a moving image.
[0010]
An imaging apparatus according to the present invention includes: an imaging unit that captures m input images having mutually different shooting directions (m is an integer of 2 or more); a microphone unit composed of a plurality of microphones; an image combining unit that generates an output image by combining the m input images, or m extracted images extracted from the m input images; a directivity control unit that generates, from the output acoustic signals of the plurality of microphones during the imaging period of the m input images, a link acoustic signal having directivity; and an associating unit that associates the output image with the link acoustic signal.
[0011]
Specifically, for example, the output image may include an image signal of a specific subject, and the directivity control unit may generate the link acoustic signal such that the component of the sound coming from the specific subject is emphasized.
[0012]
This makes it possible to associate the output image with an acoustic signal suitable for the
output image, such as focusing on the sound from a specific subject.
[0013]
Further, for example, the imaging unit may capture n input images including the m input images (n is an integer, n > m), and the imaging apparatus may further include a motion detection unit that detects the amount of movement between different input images among the n input images, and an image selection unit that selects, based on the detection result of the motion detection unit, the m input images serving as the sources of the output image from among the n input images.
[0014]
Alternatively, for example, the imaging unit may capture n input images including the m input images (n is an integer, n > m), and the imaging apparatus may further include a sound source direction estimation unit that estimates, from the output acoustic signals of the plurality of microphones, the direction in which a specific subject serving as a sound source is located, and an image selection unit that selects, based on the estimation result of the sound source direction estimation unit, the m input images serving as the sources of the output image from among the n input images.
[0015]
Another imaging apparatus according to the present invention includes: an imaging unit that captures a moving image formed of a plurality of frame images; a microphone unit composed of a plurality of microphones; a target still image extraction unit that extracts a specific frame image from the moving image as a target still image; a directivity control unit that generates, from the output acoustic signals of the plurality of microphones in a specific period based on the photographing time point of the specific frame image, a link acoustic signal having directivity according to the angle of view of the target still image; and an associating unit that associates the target still image with the link acoustic signal.
[0016]
Specifically, for example, the directivity control unit may generate the link acoustic signal such that the component of the sound from a subject located within the angle of view of the target still image is emphasized.
[0017]
As a result, it becomes possible to associate an acoustic signal suitable for the target still image
with the target still image, such as focusing on the subject on the target still image.
[0018]
According to the present invention, it is possible to provide an imaging device capable of
associating an appropriate acoustic signal with a still image obtained from a plurality of still
images or a still image extracted from a moving image.
[0019]
The significance and effects of the present invention will become more apparent from the
description of the embodiments given below.
However, the following embodiments are merely examples of embodiments of the present invention, and the meanings of the terms of the present invention and of its constituent features are not limited to those described in the following embodiments.
[0020]
FIG. 1 is an overall block diagram of an imaging device according to an embodiment of the present invention.
FIG. 2 is an internal block diagram of the imaging unit shown in FIG. 1.
FIG. 3 is a schematic internal block diagram of the microphone unit and the acoustic signal processing unit shown in FIG. 1.
FIG. 4 is an external perspective view of the imaging device shown in FIG. 1.
FIGS. 5(a) and 5(b) are diagrams showing example polar patterns of the R signal and the L signal generated by the channel acoustic signal generation unit of FIG. 3, and FIG. 5(c) is a diagram for explaining the angle of a sound source.
FIG. 6(a) is a diagram showing a landscape spreading in front of the imaging device according to the first embodiment of the present invention, FIG. 6(b) is a diagram showing the shooting ranges at the first and second shutter operations, and FIG. 6(c) is a diagram showing the relationship between the landscape and the imaging device at the first and second shutter operations.
FIG. 7 is a flowchart showing the operation procedure of the imaging device according to the first embodiment of the present invention.
FIG. 8 is a diagram showing two target input images according to the first embodiment of the present invention.
FIG. 9 is a diagram showing a target composite image that can be generated from the two target input images shown in FIG. 8.
FIG. 10 is a diagram for describing a method of setting a target period based on the photographing time of each target input image according to the first embodiment of the present invention.
FIG. 11 is a diagram showing another target composite image that can be generated from the two target input images shown in FIG. 8.
FIG. 12 is a diagram for explaining the cutout frames set at the time of generation of the target composite image of FIG. 11.
FIG. 13 is a diagram illustrating the configuration of a moving image based on a target composite image according to the first embodiment of the present invention.
FIG. 14(a) is a diagram showing the relationship between a landscape and the imaging device at the first to third shutter operations according to the second embodiment of the present invention, and FIG. 14(b) is a diagram showing the shooting ranges at the first to third shutter operations.
FIG. 15 is a diagram showing three target input images according to the second embodiment of the present invention.
FIG. 16 is a diagram showing a target composite image that can be generated from the three target input images shown in FIG. 15.
FIG. 17 is a diagram showing the relationship between a landscape and the imaging device at the first and second shutter operations according to the third embodiment of the present invention.
FIG. 18 is a block diagram of the portion particularly involved in the operation of the fourth embodiment of the present invention.
FIG. 19 is a diagram showing an input image sequence according to the fourth embodiment of the present invention.
FIG. 20 is a block diagram of the portion particularly involved in the operation of the fifth embodiment of the present invention.
FIG. 21 is a diagram for explaining the emphasis target area according to the sixth embodiment of the present invention.
FIG. 22 is a diagram showing the landscape spreading in front of the imaging device in the sixth embodiment of the present invention.
FIG. 23 is a diagram showing a moving image according to the sixth embodiment of the present invention.
FIG. 24(a) is a diagram for explaining the meaning of an extraction period according to the sixth embodiment of the present invention, and FIG. 24(b) is a diagram showing how an emphasis acoustic signal is generated from an acoustic signal in the extraction period.
FIG. 25 is a diagram showing a target composite image generated from two target input images and the emphasis target image region set in the target composite image.
[0021]
An embodiment of the present invention will be specifically described below with reference to
the drawings. In the respective drawings to be referred to, the same parts are denoted by the
same reference numerals, and redundant description of the same parts will be omitted in
principle.
[0022]
FIG. 1 is an overall block diagram of an imaging device 1 according to an embodiment of the
present invention. The imaging device 1 has the respective portions referenced by reference
numerals 11 to 28. The imaging device 1 is a digital video camera, and can capture moving
images and still images, and can also capture still images simultaneously during moving image
capturing. Each part in the imaging device 1 exchanges signals (data) between the parts via the
bus 24 or 25. It is also possible to interpret that the display unit 27 and/or the speaker 28 are
provided in an external device (not shown) of the imaging device 1.
[0023]
The imaging unit 11 captures an object using an imaging element. FIG. 2 is an internal
configuration diagram of the imaging unit 11. The imaging unit 11 includes an optical system 35, a diaphragm 32, an image sensor (solid-state imaging device) 33 such as a charge coupled device (CCD) or complementary metal oxide semiconductor (CMOS) image sensor, and a driver 34 that drive-controls the optical system 35 and the diaphragm 32. The optical system 35 is
formed of a plurality of lenses including a zoom lens 30 for adjusting the angle of view of the
imaging unit 11 and a focus lens 31 for focusing. The zoom lens 30 and the focus lens 31 are
movable in the optical axis direction. The positions of the zoom lens 30 and the focus lens 31 in
the optical system 35 and the opening degree of the diaphragm 32 are controlled based on the
photographing control signal from the CPU 23.
[0024]
The image sensor 33 is formed by arranging a plurality of light receiving pixels in the horizontal
and vertical directions. Each light receiving pixel of the image sensor 33 photoelectrically
converts an optical image of an object incident through the optical system 35 and the diaphragm
32, and outputs an electrical signal obtained by the photoelectric conversion to an AFE 12
(Analog Front End).
[0025]
The AFE 12 amplifies an analog signal output from the image sensor 33 (each light receiving
pixel), converts the amplified analog signal into a digital signal, and outputs the digital signal to
the image signal processing unit 13. The amplification degree of signal amplification in the AFE
12 is controlled by a CPU (Central Processing Unit) 23. The image signal processing unit 13
performs necessary image processing on the image represented by the output signal of the AFE
12 to generate an image signal representing the image after image processing. The image signal
includes, for example, a luminance signal and a color difference signal. The microphone unit 14
converts the ambient sound of the imaging device 1 into an analog sound signal, and the sound
signal processing unit 15 converts the analog sound signal into a digital sound signal.
[0026]
The compression processing unit 16 compresses the image signal from the image signal
processing unit 13 and the acoustic signal from the acoustic signal processing unit 15 using a
predetermined compression method. The internal memory 17 is composed of a DRAM (Dynamic
Random Access Memory) or the like, and temporarily stores various data. The external memory
18 as a recording medium is a non-volatile memory such as a semiconductor memory or a
magnetic disk, and can record various signals such as an image signal and an acoustic signal
after being compressed by the compression processing unit 16.
[0027]
The decompression processing unit 19 decompresses the compressed image signal and acoustic
signal read from the external memory 18. The image signal after expansion by the expansion
processing unit 19 or the image signal from the image signal processing unit 13 is sent via the
display processing unit 20 to the display unit 27 composed of a liquid crystal display or the like
and displayed as an image. Further, the sound signal after being expanded by the expansion
processing unit 19 is sent to the speaker 28 through the sound signal output circuit 21 and
output as a sound.
[0028]
The TG (timing generator) 22 generates a timing control signal for controlling the timing of each
operation in the entire imaging device 1, and applies the generated timing control signal to each
unit in the imaging device 1. The timing control signal includes a vertical synchronization signal
Vsync and a horizontal synchronization signal Hsync. The CPU 23 centrally controls the
operation of each part in the imaging device 1. The operation unit 26 includes a recording button
26a for instructing start / end of shooting and recording of a moving image, a shutter button 26b
for instructing shooting and recording of a still image, a zoom button 26c for specifying a zoom magnification, and the like, and accepts various operations by the user. The content of the operation
on the operation unit 26 is transmitted to the CPU 23.
[0029]
The operation modes of the imaging device 1 include a shooting mode in which an image (still image or moving image) can be shot and recorded, and a playback mode in which an image (still image or moving image) recorded in the external memory 18 is reproduced and displayed on the display unit 27. The transition between the modes is performed according to the operation on the operation unit 26. In the shooting mode, the imaging device 1 can periodically capture an object
at a predetermined frame cycle to sequentially acquire a captured image of the object.
[0030]
In the present specification, an image signal of an image may be simply referred to as an image.
Also, since compression and decompression of image and audio signals are not related to the
essence of the present invention, the following description ignores the presence of compression
and decompression of image and audio signals unless otherwise required. Thus, recording a
compressed image signal for an image may be referred to simply as recording the image signal
or recording the image (as well as the acoustic signal). Also, the size of an image or the size of an
image area is also referred to as an image size. The image size of the image of interest or the
image area of interest can be expressed by the number of pixels forming the image of interest or
the number of pixels belonging to the image area of interest.
[0031]
FIG. 3 is a schematic internal block diagram of the microphone unit 14 and the acoustic signal
processing unit 15. FIG. 4 is an external perspective view of the imaging device 1. The
microphone unit 14 includes microphones 14L and 14R disposed at different positions on the housing of the imaging device 1. The microphone 14L is disposed on the left side of the housing of the imaging device 1, and the microphone 14R is disposed on the right side. The acoustic signal processing unit 15 includes A/D converters 51L and 51R, a channel acoustic signal generation unit 52, and a link acoustic signal generation unit 53.
[0032]
Each of the microphones 14L and 14R converts the sound picked up by itself into an analog
sound signal and outputs it. The A / D converters 51L and 51R respectively convert analog
acoustic signals output from the microphones 14L and 14R into digital acoustic signals at a
predetermined sampling frequency (for example, 48 kHz) and output the digital acoustic signals. The
output signal of the A / D converter 51L is particularly called a left original signal, and the output
signal of the A / D converter 51R is particularly called a right original signal. The channel
acoustic signal generation unit 52 generates an L signal and an R signal having directivity axes in
different directions based on the left original signal and the right original signal.
[0033]
As shown in FIG. 4, the direction in which the subject that can be photographed by the imaging
unit 11 is present is defined as the front, and the opposite direction is defined as the rear.
Forward and backward are directions along the optical axis of the imaging unit 11. Moreover,
right and left shall mean the right and left when seeing the front side from the back side.
[0034]
As the microphones 14L and 14R, non-directional microphones having no directivity can be
adopted. When the microphones 14L and 14R are nondirectional microphones, the left original
signal and the right original signal are nondirectional acoustic signals (acoustic signals having no
directivity). The channel acoustic signal generation unit 52 can generate the L signal and the R
signal by performing known directivity control (stereo processing) on the nondirectional left
original signal and the right original signal.
[0035]
The directivity control applied to the nondirectional left and right original signals includes delay processing that delays the left original signal or the right original signal, attenuation processing that attenuates the left original signal or the right original signal by a predetermined ratio, and subtraction processing that subtracts the signal subjected to the delay processing and/or the attenuation processing from the other of the left and right original signals. Specifically, for example, in the directivity control, a signal obtained by delaying the left original signal by a time based on the distance between the microphones 14L and 14R and attenuating it by a predetermined ratio is subtracted from the right original signal, whereby an R signal having a blind spot in the direction 45° diagonally backward to the left can be generated. The polar pattern of this R signal is as shown by curve 310 in FIG. 5(a), and the R signal has a directivity axis in the direction 45° diagonally forward to the right. That is, the R signal corresponding to FIG. 5(a) is an acoustic signal having the highest directivity for the component of the sound arriving at the imaging device 1 from a sound source located 45° to the right front of the imaging device 1. Similarly, a signal obtained by delaying the right original signal by a time based on the distance between the microphones 14L and 14R and attenuating it by a predetermined ratio is subtracted from the left original signal, whereby an L signal having a blind spot in the direction 45° diagonally backward to the right can be generated. The polar pattern of this L signal is as shown by curve 311 in FIG. 5(b), and the L signal has a directivity axis in the direction 45° diagonally forward to the left. That is, the L signal corresponding to FIG. 5(b) is an acoustic signal having the highest directivity for the component of the sound arriving from a sound source located 45° to the left front of the imaging device 1.
[0036]
As the microphones 14L and 14R, directional microphones (for example, unidirectional microphones) can also be adopted. In this case, the directivity axis of the microphone 14L as a directional microphone is directed 45° diagonally forward to the left, and the directivity axis of the microphone 14R as a directional microphone is directed 45° diagonally forward to the right. As a result, the left original signal itself can be used as the L signal, and the right original signal itself as the R signal. Needless to say, the specific directivity of the L signal and the R signal described above is only an example, and it can be variously modified.
[0037]
Further, as shown in FIG. 5C, an XY coordinate plane (XY coordinate system) is defined with the
X axis and the Y axis as coordinate axes. The X axis is an axis passing through the center of the
microphone 14L and the center of the microphone 14R, and the origin O is located in the middle
of these centers. The Y-axis is orthogonal to the X-axis at the origin O. The direction along the Y
axis coincides with the direction of the optical axis of the imaging unit 11 (the optical axis for the
imaging device 33). It is assumed that the X axis and the Y axis are parallel to the horizontal
plane. It is assumed that the direction from the origin O toward the microphone 14R (that is, the
right direction of the imaging device 1) is the positive direction of the X axis, and the direction
from the origin O toward the front of the imaging device 1 is the positive direction of the Y axis.
A line segment 313 is a line connecting the origin O and a sound source SS which is an arbitrary
sound source. The angle between the X axis and the line segment 313 is represented by θ.
However, the angle θ is an angle between the X axis and the line segment 313 when the line
segment 313 is viewed counterclockwise from the line segment connecting the origin O and the
center of the microphone 14R. The counterclockwise direction refers to a direction in which a
line segment extending from the origin O to the center of the microphone 14R is rotated to the
front side of the imaging device 1. The angle θ of the sound source SS represents the direction
in which the sound source SS is located (ie, the sound source direction for the sound source SS).
[0038]
The link acoustic signal generation unit 53 of FIG. 3 generates a link acoustic signal from the left
original signal and the right original signal or from the L signal and the R signal (details will be
described later). At the time of shooting of a normal moving image accompanied by the pressing
operation of the recording button 26a, the L signal and the R signal are recorded in the external
memory 18 together with the image signal of the moving image. However, the imaging apparatus
1 is provided with a special function for generating a still image with sound (in other words, a
still image with sound signal), and when this special function is enabled, a link sound signal is
generated. The link acoustic signal is associated with a specific still image and recorded in the
external memory 18, and in the reproduction mode, the specific still image is reproduced
together with the link acoustic signal using the display unit 27 and the speaker 28. The operation
mode of the imaging device 1 in which the special function works, which is a kind of
photographing mode, is called a special photographing mode.
[0039]
The operation and configuration example of the imaging device 1 in the special imaging mode
will be described below as first to sixth examples. Unless contradictory, it is also possible to apply
the matters described for one embodiment to the other embodiments. The following description
is the description of the operation of the imaging device 1 in the special imaging mode unless
otherwise stated.
[0040]
<<First Example>> The first example will now be described. FIG. 6(a) shows the landscape that spreads in front of the imaging device 1: the picture within the solid frame indicated by reference numeral 320 is that landscape. A subject 321, which is a person, is present near the center of the landscape 320, and a mountain is present in the background of the subject 321. The subject 321 also functions as a sound source, generating sound by speaking.
[0041]
Although it is not possible to simultaneously fit the entire landscape 320 within the shooting
angle of view, it is assumed that the user desires to acquire an image in which the entire
landscape 320 is captured. The shooting angle of view refers to the angle of view of the imaging
unit 11. FIG. 7 is a flowchart showing the procedure of generating the desired image. In this case,
the user performs the first shutter operation with only the left portion of the landscape 320 in
the shooting range (step S11). The range within the dashed-lined rectangular frame denoted by
reference numeral 331 in FIG. 6B is the shooting range at the time when the first shutter
operation is performed.
[0042]
The shutter operation is an operation to press the shutter button 26b. However, the shutter
operation may be an operation other than the operation of pressing the shutter button 26b (for
example, a predetermined touch panel operation). The shooting range refers to a region (a region
in real space) that falls within a shooting angle of view, and an image of each subject located in
the region is formed on the image sensor 33. When the shutter operation is performed, each
subject in the imaging range at the time when the shutter operation is performed is
photographed. That is, at the time when the shutter operation is performed, a photographed
image representing an image of each subject positioned within the photographing range is
obtained as a still image, and in the first embodiment, the photographed image is treated as a
target input image. A captured image captured by the imaging unit 11 regardless of the shutter
operation is simply referred to as an input image (a target input image is also a type of input
image).
[0043]
After performing the first shutter operation, the user changes the shooting direction of the
imaging device 1 (that is, the direction of the optical axis of the imaging unit 11) (step S12), and then performs the second shutter operation in a state where only the right portion of the landscape 320 falls within the shooting range (step S13). The range within the dashed-line rectangular
frame indicated by reference numeral 332 in FIG. 6B is the shooting range at the time when the
second shutter operation is performed. The user adjusts the shooting direction so that the
shooting ranges 331 and 332 have mutually overlapping ranges (in other words, the shooting
ranges 331 and 332 include a common subject). Before the second shutter operation, display
(guideline display or the like) for supporting such adjustment may be performed on the display
unit 27. In the example illustrated in FIG. 6B, the subject 321 is located near the center of the
landscape 320, and the subject 321 is included in both of the shooting ranges 331 and 332. FIG.
6C is a diagram showing the relationship between the landscape 320 and the imaging device 1 at
the first and second shutter operations. Two solid square frames shown in the lower part of FIG.
6C represent the casing of the imaging device 1 at the time of the first and second shutter
operations, and the broken lines extending therefrom indicate the outer edge of the field of view
of the imaging unit 11 (the same applies to FIG. 14(a) and FIG. 17 described later).
[0044]
Reference numeral 341 in FIG. 8(a) denotes the first target input image obtained by the first shutter operation, and reference numeral 342 in FIG. 8(b) denotes the second target input image obtained by the second shutter operation. When the required number of target input images have been captured, the
image signal processing unit 13 executes an image combining process for generating a target
composite image which is one still image from a plurality of target input images (step S14). In the
first embodiment, one target composite image 343 shown in FIG. 9(a) is generated from the two
target input images 341 and 342.
[0045]
The image signal processing unit 13 generates the target composite image 343 by combining the
target input images 341 and 342 so that the common subject between the target input images
341 and 342 overlaps. Such synthesis is generally called image mosaicing, and a known image
mosaicing method can be used for the image synthesis processing of the image signal processing
unit 13.
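As a concrete illustration of this step, the stitching can be reproduced with an off-the-shelf mosaicing routine; the sketch below uses OpenCV's high-level Stitcher class, which is my tooling choice rather than anything the patent prescribes (any known image mosaicing method would serve).

```python
import cv2

# Load the two target input images (file names are illustrative).
img_341 = cv2.imread("target_input_341.jpg")
img_342 = cv2.imread("target_input_342.jpg")

# The stitcher performs feature matching, alignment, and blending so that
# the common subject in both inputs overlaps, i.e. image mosaicing.
stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
status, composite_343 = stitcher.stitch([img_341, img_342])

if status == cv2.Stitcher_OK:
    cv2.imwrite("target_composite_343.jpg", composite_343)
else:
    print(f"stitching failed with status {status}")
```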
[0046]
While the image signal processing unit 13 generates the target composite image 343, the
acoustic signal processing unit 15 generates link acoustic signals based on the output signals of
the microphones 14L and 14R around the photographing time of each target input image (step
S15). ). FIG. 9 (b) is an image diagram of the generated link acoustic signal. A method of
generating a link acoustic signal will be described.
[0047]
In the special shooting mode, the left original signal and the right original signal, or the L signal and the R signal, are continuously generated by the microphone unit 14 and the acoustic signal processing unit 15, and the generated signals are temporarily stored in the internal memory 17 as
needed. When the shutter operation for obtaining the target input image is performed, the
acoustic signal processing unit 15 sets, for each target input image, a target period based on the
time when the shutter operation is performed. In the first embodiment, a first target period
corresponding to the target input image 341 and a second target period corresponding to the
target input image 342 are set.
[0048]
Now, the times at which the shutter operations for obtaining the target input images 341 and
342 are performed are represented by T1 and T2, respectively. The imaging times of the target
input images 341 and 342 may be considered to be T1 and T2, respectively. More specifically,
the photographing time of an image indicates, for example, the start time, the intermediate time,
or the end time of the exposure performed by the image sensor 33 to acquire an image signal
of the image. As shown in FIG. 10(a), the start time and the end time of the first target period are set to the times (T1 − ΔTA) and (T1 + ΔTB), respectively, and the start time and the end time of the second target period are set to the times (T2 − ΔTA) and (T2 + ΔTB), respectively. Here, ΔTA and ΔTB are predetermined positive values with the dimension of time, and these values may be designated by the user. Time (T1 − ΔTA) refers to the time earlier than time T1 by ΔTA, and time (T1 + ΔTB) refers to the time later than time T1 by ΔTB (the same applies to times (T2 − ΔTA), (T2 + ΔTB), and the like).
[0049]
As shown in FIG. 10(a), when time (T1 + ΔTB) comes before time (T2 − ΔTA), the end time of the first target period may be shifted later, or the start time of the second target period may be shifted earlier, so that the end time of the first target period coincides with the start time of the second target period as shown in FIG. 10(b). When time (T1 + ΔTB) is later than time (T2 − ΔTA), the lengths of the first and second target periods may be reduced, as shown in FIG. 10(c), so that the midpoint between times T1 and T2 serves as both the end time of the first target period and the start time of the second target period. In addition, a correction that removes silent periods from the first and second target periods determined as described above may be applied, and the corrected first and second target periods may be used as the final first and second target periods. A silent period refers to a period in which the signal levels of the left and right original signals, or the signal levels of the L and R signals, are below a certain level.
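The period bookkeeping described above reduces to a few lines of arithmetic. The following is a minimal sketch, assuming two shutter times t1 < t2 and user-specified ΔTA and ΔTB; the midpoint rule follows FIG. 10(c), and only one of the two adjustment options of FIG. 10(b) (extending the first period) is shown.

```python
def target_periods(t1, t2, dta, dtb):
    """Return ((start1, end1), (start2, end2)) for shutter times t1 < t2."""
    start1, end1 = t1 - dta, t1 + dtb
    start2, end2 = t2 - dta, t2 + dtb
    if end1 < start2:
        # FIG. 10(b): extend the first period so that it ends exactly where
        # the second period begins (shifting start2 earlier would be the
        # alternative adjustment).
        end1 = start2
    elif end1 > start2:
        # FIG. 10(c): shorten both periods so that the midpoint between t1
        # and t2 is both the end of period 1 and the start of period 2.
        mid = (t1 + t2) / 2.0
        end1, start2 = mid, mid
    return (start1, end1), (start2, end2)

# Shutter operations at 10.0 s and 12.0 s with dta = dtb = 0.5 s:
print(target_periods(10.0, 12.0, 0.5, 0.5))  # ((9.5, 11.5), (11.5, 12.5))
```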
[0050]
After setting each target period, the acoustic signal processing unit 15 reads from the internal memory 17 the left and right original signals, or the L and R signals, generated in each target period, and generates from the read signals a link acoustic signal having directivity in a specific target direction (in other words, having a directivity axis in a specific target direction).
[0051]
The target direction is a direction in real space connecting the target subject present in the target
composite image and the imaging device 1 (however, it is also possible to set a direction other
than that as the target direction).
The target subject also has a function as a sound source. The link acoustic signal generation unit
53 in FIG. 3 can also determine the target direction based on the position of the target subject in
the image space. Now, consider the case where the target subject is the subject 321. In this case,
if the angle θ of the subject 321 as the sound source SS is determined, the target direction is
identified (see FIG. 5(c)), and a link acoustic signal having a directivity axis in the arrival direction of the voice of the subject 321 is generated. Several methods of setting the target direction are illustrated below, together with a description of how the link acoustic signal is generated.
[0052]
--- First Setting Method of Target Direction --- The first setting method of the target direction will
be described. In the first setting method, the imaging device 1 sets a target subject and a target
direction using image processing. Specifically, for example, the image signal processing unit 13 is configured to be able to perform face detection processing, which detects a human face from a target input image based on its image signal. The face detection processing is performed on the target input images 341 and 342 to detect the position of the subject 321 on each of them (for example, the center position of the face region or torso region of the person serving as the subject 321).
[0053]
The link acoustic signal generation unit 53 obtains the angle θ of the subject 321 at time T1 (hereinafter referred to as θ1) from the position of the subject 321 on the target input image 341 and the focal length at time T1, and obtains the angle θ of the subject 321 at time T2 (hereinafter referred to as θ2) from the position of the subject 321 on the target input image 342 and the focal length at time T2. The angle θ1 at time T1 is treated as the angle θ of the subject 321 in the first target period, and the angle θ2 at time T2 is treated as the angle θ of the subject 321 in the second target period. In the present specification, the focal length refers to the focal length of the imaging unit 11 unless otherwise specified.
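The geometry behind this step is the ordinary pinhole model: the horizontal offset of the detected face from the image center, together with the focal length, gives the bearing from the optical axis, which is then converted to the angle θ of FIG. 5(c). A minimal sketch, assuming the focal length is available in pixel units (all names are illustrative):

```python
import math

def subject_angle_theta(subject_x_px: float, image_width_px: int,
                        focal_length_px: float) -> float:
    """Angle theta (degrees) of a subject in the XY system of FIG. 5(c)."""
    # Horizontal offset of the subject from the image center;
    # positive means the subject appears right of center.
    dx = subject_x_px - image_width_px / 2.0
    # Bearing from the optical axis (Y axis), positive toward the right.
    phi = math.degrees(math.atan2(dx, focal_length_px))
    # Theta is measured from the X axis, so straight ahead (phi = 0)
    # corresponds to theta = 90 degrees.
    return 90.0 - phi

# A face detected right of center lands at theta < 90 degrees:
print(subject_angle_theta(1500, 1920, 1000.0))  # about 61.6 degrees
```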
[0054]
The link acoustic signal generation unit 53 can generate the link acoustic signal of the first target period from the L signal and the R signal of the first target period based on the angle θ1. Specifically, the L signal and the R signal of the first target period can be mixed at a mixing ratio according to the angle θ1 such that, in the link acoustic signal of the first target period, the component of the sound arriving at the imaging device 1 from the sound source corresponding to the angle θ1 (that is, the subject 321) is emphasized, and the monaural signal obtained by this mixing can be used as the link acoustic signal of the first target period. For example, when θ1 = 45°, the link acoustic signal of the first target period is generated by adding a signal obtained by multiplying the L signal by a coefficient kL45 and a signal obtained by multiplying the R signal by a coefficient kR45; when θ1 = 135°, the link acoustic signal of the first target period is generated by adding a signal obtained by multiplying the L signal by a coefficient kL135 and a signal obtained by multiplying the R signal by a coefficient kR135. Here, 0 ≤ kL45 < kR45 and 0 ≤ kR135 < kL135; most simply, for example, kL45 = 0, kR45 = 1, kL135 = 1, and kR135 = 0. Under the assumptions of the first embodiment, the angle θ1 is less than 90° and close to 45°, and the angle θ2 is 90° or more and close to 135° (see also FIGS. 5(c) and 6(c)).
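A minimal rendering of this mixing rule follows; the linear interpolation between the two coefficient pairs for intermediate angles is my own assumption, since the text fixes the coefficients only for the 45° and 135° cases.

```python
import numpy as np

def mix_link_signal(l_sig: np.ndarray, r_sig: np.ndarray,
                    theta_deg: float) -> np.ndarray:
    """Mix the L and R signals into a monaural link acoustic signal.

    Uses kL45 = 0, kR45 = 1 and kL135 = 1, kR135 = 0 as in the text,
    interpolating linearly for angles between 45 and 135 degrees.
    """
    # Map theta in [45, 135] degrees onto a weight w in [0, 1].
    w = float(np.clip((theta_deg - 45.0) / 90.0, 0.0, 1.0))
    k_l, k_r = w, 1.0 - w   # theta = 45 -> (0, 1); theta = 135 -> (1, 0)
    return k_l * l_sig + k_r * r_sig

# theta1 close to 45 degrees: the link signal of the first target period is
# dominated by the R signal, whose directivity axis points right forward:
# link_first = mix_link_signal(l_first, r_first, theta_deg=theta_1)
```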
[0055]
Similarly, the link acoustic signal generation unit 53 can generate the link acoustic signal of the second target period from the L signal and the R signal of the second target period based on the angle θ2. That is, the L signal and the R signal of the second target period can be mixed at a mixing ratio according to the angle θ2 such that, in the link acoustic signal of the second target period, the component of the sound arriving at the imaging device 1 from the sound source corresponding to the angle θ2 (that is, the subject 321) is emphasized, and the monaural signal obtained by this mixing can be used as the link acoustic signal of the second target period. A specific example of the mixing method is similar to that of the first target period.
[0056]
When the microphones 14L and 14R are nondirectional microphones, the link acoustic signal may also be generated directly by directivity control on the nondirectional left and right original signals. That is, directivity control of the same kind as that used to generate the directional L and R signals from the nondirectional left and right original signals may be applied to the nondirectional left and right original signals of the first target period to generate the link acoustic signal of the first target period, and likewise applied to the nondirectional left and right original signals of the second target period to generate the link acoustic signal of the second target period. This generation is executed, based on the angle θi of each target period, such that the acoustic signal in which the component of the sound arriving at the imaging device 1 from the sound source corresponding to the angle θi (that is, the subject 321) has the highest directivity becomes the link acoustic signal of the i-th target period (i is 1 or 2). By this method as well, the component of the sound arriving at the imaging device 1 from the sound source corresponding to the angle θi (that is, the subject 321) is emphasized in the link acoustic signal of the i-th target period.
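This direct route can reuse the delay-and-subtract sketch shown earlier, with the delay chosen from the angle θi; the far-field delay formula d·cos θ / c and the choice of which capsule to keep are my assumptions, not details given in the patent.

```python
import math
import numpy as np

def steered_link_signal(left_orig: np.ndarray, right_orig: np.ndarray,
                        theta_deg: float, mic_distance_m: float,
                        fs: int = 48000, c: float = 343.0) -> np.ndarray:
    """Form a link signal whose highest directivity points toward theta."""
    # Far-field arrival-time difference between the capsules for a source
    # at angle theta in the XY system of FIG. 5(c).
    tau = mic_distance_m * abs(math.cos(math.radians(theta_deg))) / c
    n = int(round(tau * fs))
    if theta_deg < 90.0:
        near, far = right_orig, left_orig   # source right of center
    else:
        near, far = left_orig, right_orig   # source left of center
    delayed_far = np.concatenate([np.zeros(n), far[:len(far) - n]])
    # The null lands away from the source, so the direction theta keeps
    # the highest sensitivity; 0.9 is an illustrative attenuation ratio.
    return near - 0.9 * delayed_far

# link_i = steered_link_signal(left_i, right_i, theta_i, mic_distance_m=0.02)
```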
[0057]
--- Second Setting Method of Target Direction --- The second setting method of the target
direction is described. In the second setting method, the target direction is determined from the
positional relationship between the plurality of target input images. The details will be described
below.
[0058]
When generating the target composite image 343 from the target input images 341 and 342, the image signal processing unit 13 associates feature points between the target input images 341 and 342. That is, it detects which point on the target input image 342 each feature point on the target input image 341 corresponds to (the target input images 341 and 342 are combined so that the corresponding points overlap on the target composite image 343). A vector in the image space directed from the position of a feature point on a first input image to the position of the corresponding point on a second input image is called a displacement vector. Here, the second input image is an input image photographed after the first input image; for the target input images 341 and 342, the target input image 341 corresponds to the first input image and the target input image 342 to the second input image. A plurality of optical flows between the first and second input images may be calculated from their image signals using a representative point matching method, a block matching method, a gradient method, or the like, and the average vector of the motion vectors forming those optical flows may be used as the displacement vector between the first and second input images.
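One plausible realization of the displacement vector, sketched with OpenCV's pyramidal Lucas-Kanade tracker (a gradient-method variant; the patent equally allows representative point matching or block matching):

```python
import cv2
import numpy as np

def displacement_vector(first_img: np.ndarray,
                        second_img: np.ndarray) -> np.ndarray:
    """Average motion vector of tracked feature points, first -> second."""
    g1 = cv2.cvtColor(first_img, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(second_img, cv2.COLOR_BGR2GRAY)
    # Feature points on the first input image.
    pts1 = cv2.goodFeaturesToTrack(g1, maxCorners=200,
                                   qualityLevel=0.01, minDistance=10)
    # Corresponding points on the second input image via optical flow.
    pts2, status, _ = cv2.calcOpticalFlowPyrLK(g1, g2, pts1, None)
    ok = status.ravel() == 1
    # Average of the per-point motion vectors.
    return np.mean(pts2[ok] - pts1[ok], axis=0).ravel()

# vec = displacement_vector(img_341, img_342)
# Note: whether a rightward pan yields a positive or negative x component
# depends on the coordinate convention (image vs. mosaic coordinates);
# the sign should be matched to the usage in the text's FIG. 6 example.
```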
[0059]
The link acoustic signal generation unit 53 determines the target direction based on the displacement vector between the target input images 341 and 342. In this example, the direction of the displacement vector between the target input images 341 and 342 is rightward. It is therefore estimated that the target subject of the user's attention is located toward the right side of the target input image 341 and toward the left side of the target input image 342. Accordingly, the angles θ1 and θ2 are determined so as to satisfy "0° < θ1 < 90° < θ2 < 180°", based on the fact that the direction of the displacement vector between the target input images 341 and 342 is rightward. At this time, the specific values of θ1 and θ2 may be determined using the focal lengths at times T1 and T2 and the magnitude of the displacement vector, or, based only on the fact that the direction of the displacement vector is rightward, θ1 and θ2 may be set to predetermined angles within the range satisfying "0° < θ1 < 90° < θ2 < 180°" (for example, θ1 = 45° and θ2 = 135°). The method of generating the link acoustic signal after the derivation of the angle θi is the same as described above.
[0060]
Although this does not correspond to the situation shown in FIGS. 8(a) and 8(b), if the direction of the displacement vector between the target input images 341 and 342 is leftward, the angles θ1 and θ2 may be determined so as to satisfy "0° < θ2 < 90° < θ1 < 180°".
[0061]
--- Third Setting Method of Target Direction --- The third setting method of the target direction
will be described.
In the third setting method, the imaging device 1 sets a target direction by sound source
direction estimation processing based on an acoustic signal. In the sound source direction
estimation process, the angle θ of the sound source SS is detected based on the phase difference
between the nondirectional left original signal and the right original signal. In the third setting
method, the link acoustic signal generation unit 53 executes sound source direction estimation
processing for each target period; that is, for each target period, the angle θi (i is 1 or 2) is detected from the phase difference between the nondirectional left original signal and the right original signal in the i-th target period.
angle θ) in which the sound source is located from the phase difference of the output signals of
the plurality of nondirectional microphones is known, and thus the detailed description of the
method is omitted. The third setting method is based on the premise that the subject 321 as the
sound source SS is emitting a voice in all or part of each target period. The method of generating
the link acoustic signal performed after the derivation of the angle θi is similar to that described
above.
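A common concrete form of this known estimation is to measure the time difference of arrival between the two capsules by cross-correlation and convert it to an angle; the specific method below is my assumption, since the patent only notes that such estimation is known and omits the details.

```python
import math
import numpy as np

def estimate_theta(left: np.ndarray, right: np.ndarray,
                   mic_distance_m: float, fs: int = 48000,
                   c: float = 343.0) -> float:
    """Estimate the sound source angle theta (degrees) of FIG. 5(c)."""
    max_lag = int(math.ceil(mic_distance_m / c * fs))

    def corr(lag: int) -> float:
        # Cross-correlation of the two signals at the given integer lag.
        if lag >= 0:
            return float(np.dot(left[lag:], right[:len(right) - lag]))
        return float(np.dot(right[-lag:], left[:len(left) + lag]))

    # Search only over physically possible lags.
    best = max(range(-max_lag, max_lag + 1), key=corr)
    # Positive lag here means the sound reached the right capsule first,
    # i.e. theta < 90 degrees; tau = d * cos(theta) / c.
    cos_theta = np.clip(best * c / (mic_distance_m * fs), -1.0, 1.0)
    return math.degrees(math.acos(cos_theta))
```

With a small capsule spacing only a few integer lags are possible, so in practice interpolation of the correlation peak (or a longer baseline) would be needed for finer angular resolution.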
[0062]
--- Fourth Setting Method of Target Direction --- The fourth setting method of the target
direction will be described. In the fourth setting method, the user manually designates the target
subject and / or the target direction. The manual operation is an operation on the operation unit
26 by the user or a touch panel operation. Based on this manual operation, the position of the
target subject on the target input images 341 and 342 is designated, or the angles θ1 and θ2
indicating the target directions in the target input images 341 and 342 are designated. When the
user manually designates the subject 321 as the target subject, the positions of the subject 321
on the target input images 341 and 342 are determined, so the angles θ1 and θ2 are
determined using those positions according to the method described above. The method of
generating the link acoustic signal performed after the derivation of the angle θi is similar to
that described above.
[0063]
[Modification of Generation Method of Target Composite Image] The target composite image 343 is obtained by combining the entire target input image 341 and the entire target input image 342 by image mosaicing, but a target composite image may also be generated by combining, by image mosaicing, the whole of the target input image 341 with part of the target input image 342, part of the target input image 341 with the whole of the target input image 342, or part of the target input image 341 with part of the target input image 342. As an example, FIG. 11 shows a target composite image 343a obtained by combining part of the target input image 341 and part of the target input image 342 by image mosaicing.
[0064]
The target composite image 343a is an image obtained by combining, by image mosaicing, the image within the cutout frame 341a set in the target input image 341 and the image within the cutout frame 342a set in the target input image 342 (see FIGS. 12(a) and (b)). The image size of the image within the cutout frame 341a is smaller than the image size of the target input image 341, and similarly, the image size of the image within the cutout frame 342a is smaller than the image size of the target input image 342.
[0065]
The image signal processing unit 13 sets the cutout frames 341a and 342a, using any one of the above-described first to fourth setting methods, so that the common subject (the subject 321 as the target subject) is included in both of the cutout frames 341a and 342a; it then extracts the images within the cutout frames 341a and 342a from the target input images 341 and 342 and generates the target composite image 343a from the two extracted images. The method of generating the target composite image 343a from the image within the cutout frame 341a and the image within the cutout frame 342a is the same as the method of generating the target composite image 343 from the target input images 341 and 342. The target composite image 343a may be generated such that its image size and aspect ratio are the same as those of a target input image (341 or 342).
[0066]
[Association of Link Acoustic Signal and Target Composite Image] The link acoustic signals of the respective target periods obtained as described above are combined in time series. The link acoustic signal obtained by this combination is conveniently referred to as the link acoustic signal α. In the first embodiment, since there are two target periods, the link acoustic signals of the first and second target periods combined in time series form the link acoustic signal α.
[0067]
The CPU 23 in FIG. 1 associates the image signal of the target composite image with the link
acoustic signal α, and records them in the external memory 18 in a state in which they are
associated (step S16 in FIG. 7). As a specific method of this recording, various methods including
the following first to fourth recording methods can be adopted. In the first recording method, the
image signal of the target composite image and the link acoustic signal α are recorded in the
external memory 18 in association with each other. In the second recording method, the image
signal of the target composite image, the link acoustic signal α, and the L signal and R signal in
each target period are recorded in the external memory 18 in a state of being associated with
each other. In the third recording method, an image signal of a target composite image, image
signals of a plurality of target input images (target input images 341 and 342 in the first
embodiment) from which the target composite image is derived, and a link acoustic signal α. The
L signal and the R signal in each target period are recorded in the external memory 18 in a state
of being associated with each other. In the fourth recording method, an image signal of a moving
image based on a target composite image and a link acoustic signal α converted to a moving
image acoustic signal are recorded in the external memory 18 in a state of being associated with
each other.
[0068]
A moving image based on a target composite image is a moving image in which the single target composite image is simply arranged repeatedly in time series, as shown in FIG. 13, and the link acoustic signal α converted to a moving image acoustic signal refers to the link acoustic signal α converted into an audio signal for a moving image in accordance with a standard such as MP4 (defined by the Moving Picture Experts Group). In general, a still image is not assumed to be reproduced together with an acoustic signal, so special reproduction control is required to reproduce a still image together with an acoustic signal; if the fourth recording method is used, however, the target composite image can easily be reproduced together with the link acoustic signal α.
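Outside the camera, the same packaging can be reproduced with ffmpeg (my tooling choice; the patent does not prescribe it): the still image is looped as repeated frames and muxed with the link acoustic signal into an MP4 file.

```python
import subprocess

def still_with_sound_to_mp4(still_path: str, audio_path: str,
                            out_path: str) -> None:
    """Package a still image plus its link acoustic signal as an MP4 movie."""
    subprocess.run([
        "ffmpeg", "-y",
        "-loop", "1", "-i", still_path,   # repeat the single frame (FIG. 13)
        "-i", audio_path,                 # link acoustic signal
        "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",                      # stop when the audio ends
        out_path,
    ], check=True)

# still_with_sound_to_mp4("target_composite_343.jpg", "link_alpha.wav",
#                         "composite_with_sound.mp4")
```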
[0069]
Note that it is also possible to combine and implement a plurality of the first to fourth recording
methods. When recording the link acoustic signal α, the acoustic signal may or may not be
compressed (the same applies to other embodiments described later).
[0070]
The association can be performed by creating a common file in the external memory 18 and
storing the image signal and the audio signal to be associated with each other in the common file.
Alternatively, the image signal and the sound signal can be associated by storing the image signal
and the sound signal to be associated with each other in separate files and associating the files
with each other.
[0071]
After the recording of the target composite image and the link acoustic signal α, when reproduction of the target composite image is instructed in the reproduction mode, the target composite image is displayed on the display unit 27 and, at the same time, the link acoustic signal α associated with the target composite image is reproduced as sound through the speaker 28 (the same applies to other embodiments described later).
[0072]
According to the first embodiment, when a target composite image is generated as a still image with sound from a plurality of still images, an acoustic signal suitable for the target composite image (an acoustic signal that focuses on a specific sound source on the target composite image) can be attached to the still image with sound, improving the merit for the user.
[0073]
<<Second Example>> The second example will now be described.
In the first embodiment, one target composite image is generated from two target input images,
but one target composite image may be generated from three or more target input images.
Based on the user's instruction or the like, the number of target input images to be the origin of
the target composite image can be determined in advance. In addition, the acquisition of the
target input image is continued until the user performs a predetermined operation indicating that
the acquisition of all the target input images is completed, and the acquisition of the target input
image may be ended when the predetermined operation is performed (in this case, the number of target input images to serve as the basis of the target composite image is not determined until the predetermined operation is performed).
[0074]
As an example, a method of generating one target composite image from three target input
images will be described. The same applies to a method of generating one target composite
image from four or more target input images.
[0075]
FIG. 14(a) is a diagram showing the relationship between the landscape 320 and the imaging device 1 at the first to third shutter operations. In the example shown in FIG. 14(a), as in so-called panning, the landscape 320 is photographed in three separate shots while the imaging device 1 is swung in the right direction. That is, the user performs the first shutter operation with only the left portion of the landscape 320 in the shooting range, then changes the shooting direction of the imaging device 1 and performs the second shutter operation with only the central portion of the landscape 320 in the shooting range, and then changes the shooting direction again and performs the third shutter operation with only the right portion of the landscape 320 in the shooting range. In FIG. 14(b), the ranges within the dashed rectangular frames denoted by reference numerals 401 to 403 are the shooting ranges at the times when the first to third shutter operations are performed, respectively. The user adjusts the shooting direction so that the shooting ranges 401 and 402 have an overlapping range (in other words, the shooting ranges 401 and 402 include a common subject) and the shooting ranges 402 and 403 have an overlapping range (in other words, the shooting ranges 402 and 403 include a common subject). Before the second and third shutter operations, a display (guideline display or the like) for supporting such adjustment may be shown on the display unit 27.
[0076]
Reference numerals 411 to 413 in FIGS. 15A to 15C denote the target input images obtained by the first to third shutter operations, respectively. In the second embodiment, the image signal processing unit 13 generates the target composite image 420 of FIG. 16 by applying the image combining process described in the first embodiment to the three target input images 411 to 413. That is, using image mosaicing, the target input images 411 to 413 are aligned and combined so that the common subjects between the target input images 411 and 412 overlap and the common subjects between the target input images 412 and 413 overlap, whereby the target composite image 420 is generated. In the example shown in FIGS. 15A to 15C (see also FIG. 14A), no common subject is included between the target input images 411 and 413; if a subject common to the target input images 411 and 413 were included, the composition would be performed so that this common subject also overlaps in the generation of the target composite image 420. As described in the first embodiment, a cutout frame may be set for each target input image, and the target composite image may be generated using only the images within the cutout frames.
[0077]
While the image signal processing unit 13 generates the target composite image 420, the
acoustic signal processing unit 15 generates a link acoustic signal based on the output signals of
the microphones 14L and 14R around the shooting time of each target input image. As in the first embodiment, a first target period including time T1, a second target period including time T2, and a third target period including time T3 are set, and a link acoustic signal having directivity in a specific target direction (in other words, having a directional axis in the specific target direction) is generated from the left original signal and the right original signal, or the L signal and the R signal, in each target period. In the second embodiment, times T1 to T3 are the times at which the first to third shutter operations are performed (or the shooting times of the target input images 411 to 413).
[0078]
The target direction can be set using any of the first to fourth setting methods described in the
first embodiment, and link acoustic signals for each target period can be generated based on the
setting result. However, in the example shown in FIGS. 15A to 15C (see also FIG. 14A), some of the target input images do not include the subject 321 serving as the target subject, so it is inappropriate to use the first setting method alone.
[0079]
As an example, the processing when the second setting method is used will be described. It is assumed that the target subject located in the target direction is the subject 321. The image signal processing unit 13 determines the target direction based on the positional deviation vector VECA between the target input images 411 and 412 and the positional deviation vector VECB between the target input images 412 and 413. In this example, both displacement vectors point to the right. It is therefore estimated that the target subject of the user's attention is located toward the right side of the target input image 411, near the center of the target input image 412, and toward the left side of the target input image 413. Accordingly, based on the fact that the positional deviation vectors VECA and VECB point to the right, the angles θ1 to θ3 are determined so as to satisfy "0° < θ1 < θ2 < θ3 < 180°". In the second embodiment, the angles θ1 to θ3 respectively represent the angles (or estimated values of the angles) of the target subject 321, regarded as the sound source SS, in the first to third target periods.
[0080]
Specific numerical values of θ1 to θ3 may be determined using the focal lengths at times T1 to T3 and the magnitudes of the positional deviation vectors VECA and VECB. Alternatively, based only on the fact that the positional deviation vectors VECA and VECB point to the right, the angles θ1 to θ3 may be set to predetermined angles within a range satisfying "0° < θ1 < θ2 < θ3 < 180°" (for example, θ1 = 45°, θ2 = 90°, and θ3 = 135°).
[0081]
Alternatively, the position of the target subject on a target input image that includes the target subject may be detected using the image signal of that target input image, the angle θ corresponding to that target input image may be determined from the detected position, and the angles θ corresponding to the other target input images may then be determined.
That is, in this example, since the target input image 412 includes the target subject (the target input image 412 has an image signal of the target subject), the position of the target subject 321 on the target input image 412 is detected using the image signal of the target input image 412, and the angle θ2 at time T2 is determined from the detected position and the focal length at time T2. If the target subject 321 is located at the center of the target input image 412, then θ2 = 90°. Thereafter, specific numerical values of θ1 and θ3 may be determined within a range satisfying "0° < θ1 < θ2 < θ3 < 180°" using the determined angle θ2, the focal lengths at times T1 to T3, and the directions and magnitudes of the positional deviation vectors VECA and VECB.
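The position-to-angle conversion for the image containing the target subject could be sketched as follows, again under an assumed pinhole model (all names are hypothetical):

import math

def angle_from_position(x_subject_px, image_width_px, focal_length_px):
    # Horizontal offset of the detected subject from the image center;
    # zero offset corresponds to theta = 90 degrees (straight ahead),
    # and a subject right of center gives theta < 90 degrees.
    offset = x_subject_px - image_width_px / 2.0
    return 90.0 - math.degrees(math.atan2(offset, focal_length_px))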
[0082]
With either method, a link acoustic signal is generated such that the sound enhancement direction moves from the right to the left of the imaging device 1 as time passes. The method of generating the link acoustic signal after the derivation of the angle θi is the same as that described in the first embodiment. Thus, as in the first embodiment, an acoustic signal whose directivity is highest for the sound arriving at the imaging device 1 from the sound source corresponding to the angle θi (that is, the subject 321) is generated as the link acoustic signal in the i-th target period. In other words, the component of the sound arriving at the imaging device 1 from the sound source corresponding to the angle θi (that is, the subject 321) is emphasized in the link acoustic signal of the i-th target period (i is 1, 2, or 3).
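One common way to realize such directivity for a two-microphone pair is delay-and-sum beamforming; the sketch below steers the pair toward θi under the angle convention used here (90° = straight ahead). This is only one plausible realization, not the specific method of this description.

import numpy as np

def steer_to_angle(left, right, theta_deg, fs, mic_spacing_m, c=343.0):
    # Extra travel time of the wavefront to the left microphone for a
    # source at theta_deg (theta < 90: source on the right, left lags).
    delay_s = mic_spacing_m * np.cos(np.radians(theta_deg)) / c
    delay_n = int(round(delay_s * fs))
    if delay_n >= 0:
        # Delay the right channel so both channels align for this source.
        right_aligned = np.concatenate([np.zeros(delay_n), right])[:len(right)]
    else:
        # Advance the right channel when the source is on the left.
        right_aligned = np.concatenate([right[-delay_n:], np.zeros(-delay_n)])
    return 0.5 * (left + right_aligned)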
[0083]
The link acoustic signals in the first to third target periods are combined in time series, and the
link acoustic signal obtained by this combination becomes the link acoustic signal α in the
second embodiment. The method of associating and recording the link acoustic signal α with the
target composite image is the same as that described in the first embodiment.
[0084]
<< Third Embodiment >> A third embodiment will be described. In the first and second embodiments, the shooting direction of the imaging device 1 is changed in the left-right direction (horizontal direction) between shots of the target input images, but the shooting direction may instead be changed in the up-down direction (vertical direction) between shots of the target input images.
[0085]
In this case, if a target composite image is generated from two target input images, after the first target input image is shot, the shooting direction of the imaging device 1 is changed upward or downward, and then the second target input image is shot. By changing the shooting direction of the imaging device 1 upward or downward, the optical axis of the imaging unit 11 rotates along a vertical plane. FIG. 17 is a conceptual diagram of the change in shooting direction when obtaining the two target input images, and of the relationship between the landscape and the imaging device 1 at the first and second shutter operations.
[0086]
Even when the shooting direction changes in the vertical direction between shots of the target input images, the method of generating the target composite image and the link acoustic signal is the same as in the first or second embodiment; it suffices to replace "left and right" in those embodiments with "up and down". However, when nondirectional microphones are used as the microphones 14L and 14R, the microphones 14L and 14R need to be offset from each other in the vertical direction in order to enable directivity control in the vertical direction. That is, the X axis in FIG. 5C needs to be inclined with respect to the horizontal plane (most simply, for example, made orthogonal to the horizontal plane).
[0087]
Furthermore, the first or second embodiment and the third embodiment may be implemented in
combination. That is, a plurality of target input images having positional deviations in the
horizontal and vertical directions may be photographed, and one target composite image may be
generated from the plurality of target input images. In this case as well, a link acoustic signal having directivity in the target direction is generated.
[0088]
In the fourth and fifth embodiments described later, it is also assumed that the shooting direction of the imaging device 1 changes in the left-right direction between shots of the target input images, but the matters described in the third embodiment can likewise be applied to the fourth and fifth embodiments.
[0089]
<< Fourth Embodiment >> A fourth embodiment will be described.
In the first to third embodiments described above, a target input image is acquired in response to a shutter operation, but in the fourth and fifth embodiments described below, the imaging unit 11 is caused to capture input images periodically, and m target input images from which the target composite image is generated are extracted from the n obtained input images by image selection processing. n and m are integers of 2 or more. All n input images may be selected as target input images as a result of the image selection processing, in which case n = m. However, in the following description, to clarify the significance of the image selection processing, it is assumed that n > m (the same applies to the fifth embodiment). Furthermore, it is assumed that m = 3 (likewise for the fifth embodiment).
[0090]
FIG. 18 is a block diagram of the portion particularly involved in the operation of the fourth embodiment. The frame memory 71 and the target input image memory 74 can be provided in the internal memory 17, and the motion detection unit 72, the image selection unit 73, and the image combining unit 75 can be provided in the image signal processing unit 13. The target input image memory 74, which stores the image signals of the target input images, and the image combining unit 75, which generates a target composite image from m target input images, can also be used in the first to third embodiments described above.
[0091]
When the operation of the special imaging mode according to the fourth embodiment is started, the imaging unit 11 captures input images periodically at a predetermined frame period. The i-th captured input image is represented by the symbol INi (i is an integer). Now, it is assumed that an input image sequence including the input images IN1 to IN6 shown in FIG. 19 is acquired. The input image sequence is a moving image obtained by photographing the subject 450 as the target subject while swinging the imaging device 1 rightward at a constant speed. Therefore, at the time of shooting of the input images IN1 to IN6, the subject 450 as the target subject is included in the shooting range. The subject 450 is a person who also functions as a sound source.
[0092]
The image signal of the latest input image is supplied to the frame memory 71, the motion detection unit 72, and the image selection unit 73 via the AFE 12. The frame memory 71 stores the image signal of the latest acquired input image. The motion detection unit 72 calculates the amount of movement between two temporally adjacent input images based on the image signal stored in the frame memory 71 and the image signal of the latest input image given via the AFE 12. The motion amount calculated here is the above-mentioned positional deviation vector. The amount of movement between the input images INi and INi+1, that is, the displacement vector between the input images INi and INi+1, is represented by VEC[i, i+1].
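As one plausible implementation of the motion detection unit 72 (this description does not fix an algorithm), the global displacement between two frames can be estimated by phase correlation; the following sketch is illustrative only.

import numpy as np

def displacement_vector(prev_gray, curr_gray):
    # Normalized cross-power spectrum; its inverse FFT peaks at the shift.
    f0 = np.fft.fft2(prev_gray)
    f1 = np.fft.fft2(curr_gray)
    cross = f0 * np.conj(f1)
    cross /= np.abs(cross) + 1e-9
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap peaks in the far half-plane to negative shifts.
    h, w = prev_gray.shape
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return dx, dy   # the displacement vector VEC[i, i+1], in pixels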
[0093]
The image selection unit 73 determines, based on the displacement vectors calculated by the motion detection unit 72, whether each input image should be selected as a target input image, and stores the image signal of each input image selected as a target input image in the target input image memory 74. A method by which the image selection unit 73 selects the target input images will be described.
[0094]
The image selection unit 73 first selects the first acquired input image IN1 as a first target input
image.
[0095]
Next, in order to select the second target input image, the image selection unit 73 uses the calculation results of the motion detection unit 72 to derive the displacement vector VEC[1, j] with the first target input image IN1 as a reference.
Here, j is an integer of 2 or more. When j is an integer of 3 or more, the displacement vector VEC[1, j] is the composite vector of the displacement vectors VEC[1, 2], VEC[2, 3], ..., VEC[j-1, j].
[0096]
The image selection unit 73 compares the magnitude of the displacement vector VEC[1, j] with a predetermined reference value VECTH (VECTH > 0) while increasing the value of the variable j by 1 starting from 2. If the former is less than the latter, it judges that there is no movement of the target subject and no change in the shooting direction between the input images IN1 and INj, and excludes the input image INj from the target input images; if the former is greater than or equal to the latter, it judges that there is a movement of the target subject or a change in the shooting direction between the input images IN1 and INj, and selects the input image INj as the second target input image. Now, suppose that the magnitudes of the displacement vectors VEC[1, 2] and VEC[1, 3] are less than the reference value VECTH, and the magnitude of the displacement vector VEC[1, 4] is greater than or equal to the reference value VECTH. Then, the input image IN4 is selected as the second target input image (see FIG. 19).
[0097]
After selecting the second target input image, in order to select the third target input image, the image selection unit 73 uses the calculation results of the motion detection unit 72 to derive the displacement vector VEC[4, j] with the second target input image IN4 as a reference. In the derivation of the displacement vector VEC[4, j], j is an integer of 5 or more; when j is an integer of 6 or more, the displacement vector VEC[4, j] is the composite vector of the displacement vectors VEC[4, 5], VEC[5, 6], ..., VEC[j-1, j].
[0098]
As in the selection of the second target input image, the image selection unit 73 compares the magnitude of the displacement vector VEC[4, j] with the reference value VECTH while increasing the value of the variable j by 1 starting from 5. If the former is less than the latter, it judges that there is no movement of the target subject and no change in the shooting direction between the input images IN4 and INj, and excludes the input image INj from the target input images; if the former is greater than or equal to the latter, it judges that there is a movement of the target subject or a change in the shooting direction between the input images IN4 and INj, and selects the input image INj as the third target input image. Now, suppose that the magnitude of the displacement vector VEC[4, 5] is less than the reference value VECTH, and the magnitude of the displacement vector VEC[4, 6] is greater than or equal to the reference value VECTH. Then, the input image IN6 is selected as the third target input image (see FIG. 19).
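The selection loop just described can be summarized by the following sketch; the names are illustrative, and vec_between(a, b) is assumed to return the magnitude of the composite displacement vector VEC[a, b].

def select_target_images(n, m, vec_between, vec_th):
    # IN1 (index 0) is always taken as the first target input image.
    selected = [0]
    ref = 0
    for j in range(1, n):
        if len(selected) == m:
            break
        if vec_between(ref, j) >= vec_th:   # enough motion since reference
            selected.append(j)
            ref = j    # subsequent displacements are measured from INj
    return selected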
[0099]
In the present embodiment, m = 3 is assumed, so the selection operation for the target input images is completed when the third target input image is selected; if m > 3, the selection operation continues for the fourth and subsequent target input images.
[0100]
The image signal of each target input image selected by the image selection unit 73 is temporarily stored in the target input image memory 74, and the image combining unit 75 uses the stored contents of the target input image memory 74 to generate a target composite image from the m target input images (the input images IN1, IN4, and IN6 in the above-mentioned specific example).
The method of generating a target composite image from m target input images is the same as that described above.
[0101]
The left original signal and the right original signal, or the L signal and the R signal, during the shooting period of the input image sequence including the input images IN1 to IN6 are stored in the internal memory 17. Once the m target input images have been selected, the acoustic signal processing unit 15 sets, for each target input image, a target period based on the shooting time of that target input image. For example, the shooting time of the i-th target input image may be regarded as the above-mentioned time Ti, and the i-th target period including the time Ti may be set according to the method described in the first or second embodiment.
[0102]
Then, in the same way as in the first or second embodiment, a link acoustic signal having directivity in a specific target direction (in other words, having a directional axis in the specific target direction) is generated for each target period from the left original signal and the right original signal, or the L signal and the R signal, in that target period, and the link acoustic signal α obtained by combining the link acoustic signals of the first to third target periods in time series may be associated with the image signal of the target composite image and recorded in the external memory 18.
[0103]
Note that the input image sequence including the input images IN1 to IN6 may be shot in response to a pressing operation of the recording button 26a; in this case, the image signal of the moving image serving as the input image sequence is recorded in the external memory 18 together with the acoustic signals (the left original signal and the right original signal, or the L signal and the R signal) during the shooting period of the moving image (the same applies to the fifth embodiment described later).
Therefore, in this case, the link acoustic signal can also be generated from the acoustic signals recorded in the external memory 18 (the same applies to the fifth embodiment described later).
[0104]
According to this embodiment, the same operation and effect as those of the first and second
embodiments can be obtained without the need for the user's shutter operation.
[0105]
<< Fifth Embodiment >> A fifth embodiment will be described.
FIG. 20 is a block diagram of the portion particularly involved in the operation of the fifth embodiment. The target input image memory 74 and the image combining unit 75 are the same as those in FIG. 18. The image selection unit 73a can be provided in the image signal processing unit 13, and the sound source direction estimation unit 81 can be provided in the acoustic signal processing unit 15. As in the fourth embodiment, it is assumed that an input image sequence including the input images IN1 to IN6 of FIG. 19 is acquired after the start of operation in the special imaging mode.
[0106]
In the fifth embodiment, nondirectional microphones are used as the microphones 14L and 14R.
Furthermore, it is assumed that the subject 450 present in each input image is the target subject and the sound source SS, and that the subject 450 emits sound during the shooting period of the input image sequence.
[0107]
Based on the phase difference between the nondirectional left original signal and right original signal during the shooting period of the input image sequence, the sound source direction estimation unit 81 detects the angle θ of the sound source SS (that is, the angle θ when the subject 450 is regarded as the sound source SS) at each time during the shooting period. With i being a natural number, the sound source direction estimation unit 81 sets an i-th evaluation period based on the shooting time of the input image INi, and detects the angle θ[i] of the sound source SS in the i-th evaluation period from the phase difference between the left original signal and the right original signal during that evaluation period. The shooting time of the input image INi is represented by T[i]. The i-th evaluation period is, for example, the period from time (T[i] − ΔTC) to time (T[i] + ΔTC), where ΔTC has a predetermined positive value in units of time.
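One plausible realization of the sound source direction estimation unit 81 estimates the inter-channel time difference by cross-correlation and converts it to an angle; this description does not specify the algorithm, and all names below are illustrative.

import numpy as np

def source_angle_deg(left, right, fs, mic_spacing_m, c=343.0):
    # Lag (in samples) at which the channels best align; positive when
    # the left channel lags, i.e. the source is toward the right
    # (theta < 90 degrees under the convention used here).
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)
    tau = lag / fs                                   # delay in seconds
    cos_theta = np.clip(c * tau / mic_spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))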
[0108]
The sound source direction estimation unit 81 sets an evaluation period for each input image, detects the angle θ of the sound source SS, and gives the detection results (θ[1], θ[2], θ[3], ...) to the image selection unit 73a. The image selection unit 73a regards changes in the angle θ of the sound source SS as movement of the target subject 450 serving as the sound source SS, and performs image selection for the same purpose as the image selection unit 73 of the fourth embodiment.
[0109]
Specifically, the image selection unit 73a first selects the first acquired input image IN1 as the first target input image. Subsequently, in order to select the second target input image, the image selection unit 73a obtains the amount of change in the sound source direction (the direction in which the sound source SS is located) with respect to the first target input image, that is, the amount of change in the sound source direction between the shooting of the input images IN1 and INj. The amount of change in the sound source direction between the shooting of the input images IN1 and INj is expressed by |θ[j] − θ[1]|. As described above, j is an integer of 2 or more.
[0110]
The image selection unit 73a compares the amount of change in the sound source direction |θ[j] − θ[1]| with a predetermined reference value θTH (θTH > 0) while increasing the value of the variable j by 1 starting from 2. If the former is less than the latter, it determines that there is no movement of the target subject, no change in the shooting direction, and no change in the sound source direction between the input images IN1 and INj, and excludes the input image INj from the target input images; if the former is greater than or equal to the latter, it determines that there is a movement of the target subject, a change in the shooting direction, or a change in the sound source direction between the input images IN1 and INj, and selects the input image INj as the second target input image. Now, suppose that the amounts of change |θ[2] − θ[1]| and |θ[3] − θ[1]| are less than the reference value θTH, and the amount of change |θ[4] − θ[1]| is greater than or equal to the reference value θTH. Then, the input image IN4 is selected as the second target input image (see FIG. 19).
[0111]
After selecting the second target input image, in order to select the third target input image, the image selection unit 73a obtains the amount of change in the sound source direction with respect to the second target input image, that is, the amount of change in the sound source direction |θ[j] − θ[4]| between the shooting of the input images IN4 and INj. In the derivation of the amount of change |θ[j] − θ[4]|, j is an integer of 5 or more.
[0112]
As in the selection of the second target input image, the image selection unit 73a compares the amount of change in the sound source direction |θ[j] − θ[4]| with the reference value θTH while increasing the value of the variable j by 1 starting from 5. If the former is less than the latter, it determines that there is no movement of the target subject, no change in the shooting direction, and no change in the sound source direction between the input images IN4 and INj, and excludes the input image INj from the target input images; if the former is greater than or equal to the latter, it determines that there is a movement of the target subject, a change in the shooting direction, or a change in the sound source direction between the input images IN4 and INj, and selects the input image INj as the third target input image. Now, suppose that the amount of change |θ[5] − θ[4]| is less than the reference value θTH, and the amount of change |θ[6] − θ[4]| is greater than or equal to the reference value θTH. Then, the input image IN6 is selected as the third target input image (see FIG. 19).
[0113]
In the present embodiment, m = 3 is assumed, so the selection operation for the target input images is completed when the third target input image is selected; if m > 3, the selection operation continues for the fourth and subsequent target input images.
[0114]
The image signal of each target input image selected by the image selection unit 73a is temporarily stored in the target input image memory 74, and the image combining unit 75 uses the stored contents of the target input image memory 74 to generate a target composite image from the m target input images (the input images IN1, IN4, and IN6 in the above-mentioned specific example).
The method of generating a target composite image from m target input images is the same as that described above. The acoustic signal processing unit 15 generates the link acoustic signal α to be associated with the target composite image by the method described in the fourth embodiment. The target composite image and the link acoustic signal α are associated with each other and recorded in the external memory 18.
[0115]
Also in this embodiment, the same operation and effect as those of the first and second
embodiments can be obtained without the need for the user's shutter operation.
[0116]
<< Sixth Embodiment >> A sixth embodiment will be described.
In the sixth embodiment, an optimum acoustic signal is generated, in accordance with the optical zoom magnification and the like, for a frame image extracted as an input image from a moving image, and the optimum acoustic signal is associated with the extracted still image. The details are described below.
[0117]
First, the zoom function involved in the operation of the sixth embodiment will be described. The driver 34 in FIG. 2 controls the position of the zoom lens 30 so that the shooting angle of view, which is the angle of view of the imaging unit 11 (in other words, the angle of view of the input image), becomes the angle of view corresponding to the optical zoom magnification set by the CPU 23. That is, by controlling the position of the zoom lens 30 according to the optical zoom magnification, the angle of view of the input image, which is the image formed on the effective pixel area of the imaging element 33, is determined. If the optical zoom magnification is multiplied by k1 starting from a certain magnification, the angle of view of the input image becomes 1/k1 times in both the horizontal and vertical directions of the image (k1 is a positive number, for example, 2). The optical zoom magnification is represented by the symbol ZOPT. In the following description, the optical zoom magnification ZOPT may be expressed as the magnification ZOPT or simply as ZOPT.
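Numerically, this relation reads as follows; the 60-degree baseline angle of view in the example is an assumed value used only for illustration.

def angle_of_view_deg(base_aov_deg, z_opt):
    # Per the 1/k1 relation above: multiplying ZOPT by k1 divides the
    # angle of view by k1 (the approximation the description adopts).
    return base_aov_deg / z_opt

print(angle_of_view_deg(60.0, 2.0))   # hypothetical 60-deg base -> 30.0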
[0118]
In the sixth embodiment, nondirectional microphones are used as the microphones 14L and 14R. The acoustic signal processing unit 15 can generate an emphasized acoustic signal, in which the component of the sound coming from a sound source in an emphasis target area is emphasized, by performing appropriate directivity control on the nondirectional left original signal and right original signal.
[0119]
The emphasis target area AR will be described with reference to FIG. 21. In FIG. 21, the area filled with diagonal lines is the emphasis target area AR. The emphasis target area AR is an area defined on the XY coordinate plane and located between the line segments 501 and 502 indicated by broken lines. Each of the line segments 501 and 502 extends from the origin O into the first or second quadrant of the XY coordinate plane, and the angles formed by the line segments 501 and 502 with the X axis are represented by φA and φB, respectively. Here, the angle φA is the angle between the X axis and the line segment 501 when the line segment 501 is viewed counterclockwise from the line segment connecting the origin O and the center of the microphone 14R, and the angle φB is the angle between the X axis and the line segment 502 when the line segment 502 is viewed counterclockwise from the line segment connecting the origin O and the center of the microphone 14R. Furthermore, it is assumed that 0° < φA < φB < 180°. In the example shown in FIG. 21, 0° < φA < 90° < φB < 180°, but it may also be that 0° < φA < φB ≤ 90°, or that 90° ≤ φA < φB < 180°.
[0120]
The acoustic signal processing unit 15 can generate an enhanced acoustic signal ESS in which the component of the sound coming from a sound source (a subject serving as a sound source) located in the emphasis target area AR is emphasized. The signal component of the sound coming from a sound source located in the emphasis target area AR can be extracted as a necessary component from the left original signal and the right original signal by directivity control, and the extracted necessary component itself can be used as the enhanced acoustic signal ESS. Alternatively, while the necessary component is extracted from the left original signal and the right original signal, signal components other than the necessary component may be extracted as an unnecessary component, and the enhanced acoustic signal ESS may then be generated by weighted addition of the necessary component and the unnecessary component such that the mixing ratio of the necessary component is relatively large. That is, coefficients kA and kB satisfying 0 < kB < kA may be set, and a signal obtained by adding the necessary component multiplied by the coefficient kA and the unnecessary component multiplied by the coefficient kB may be generated as the enhanced acoustic signal ESS. The process of generating an emphasized acoustic signal such as the enhanced acoustic signal ESS can be performed by the link acoustic signal generation unit 53 of FIG. 3.
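The weighted addition can be written directly; the coefficient values below are illustrative, this description requiring only 0 < kB < kA.

import numpy as np

def enhanced_signal_ess(necessary, unnecessary, k_a=0.8, k_b=0.2):
    # necessary: component extracted by directivity control for sound
    # from the emphasis target area AR; unnecessary: the remainder.
    assert 0.0 < k_b < k_a
    return k_a * np.asarray(necessary) + k_b * np.asarray(unnecessary)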
[0121]
The operation of the imaging device 1 according to the sixth embodiment will be described with reference to FIGS. 22 and 23. It is assumed that, prior to the execution of the special operation according to the sixth embodiment, the moving image 550 is shot in response to an operation of the recording button 26a or the like, and the image signal of the moving image 550 is recorded in the external memory 18 together with the nondirectional left original signal and right original signal during the shooting period of the moving image 550. Normally, the L signal and the R signal during the shooting of the moving image 550 are also recorded in the external memory 18. FIG. 22 shows the landscape 530 that spreads in front of the imaging device 1 at the time of shooting the moving image 550, and FIG. 23 shows the configuration of the moving image 550. In the landscape 530, there are two trees and a subject 560, who is a person.
[0122]
The moving image 550 is composed of a plurality of frame images arranged in time series, and the plurality of frame images include the frame images 551 to 556. The shooting times of the frame images 551 to 556 are represented by TA[1] to TA[6], respectively, and the shooting is performed in the order of the frame images 551 to 556; therefore, time TA[i+1] is later than time TA[i]. The subject 560 is present in all of the frame images 551 to 556. The frame image 551 corresponds to the input image obtained by shooting at time TA[1] (the same applies to the other frame images).
[0123]
Now, to simplify the description, it is assumed that the casing of the imaging device 1 and all subjects, including the subject 560, are stationary during the shooting period of the moving image 550. However, it is assumed that the magnification ZOPT is gradually increased from time TA[1] to time TA[4]. The magnification ZOPT at time TA[i] is represented by ZOPT[i], and ZOPT[1] < ZOPT[2] < ZOPT[3] < ZOPT[4]. In FIG. 22, the ranges within the dashed rectangles denoted by reference numerals 531 and 534 are the shooting ranges at the times of shooting the frame images 551 and 554, respectively. The magnification ZOPT at the time of shooting of each frame image (that is, the value of ZOPT[i]) is assumed to be recorded in the external memory 18 together with the image signal of the moving image 550.
[0124]
After shooting the moving image 550, the user changes the operation mode of the imaging device 1 to the reproduction mode. The following description of the sixth embodiment concerns the operation of the imaging device 1 in the reproduction mode. In the reproduction mode, the user can designate a desired frame image from the moving image 550, and when this designation is made, the imaging device 1 can generate a still image with sound based on the designated frame image. As an example, a method of generating a still image with sound when the frame image 554 is designated will be described.
[0125]
In this case, first, the acoustic signal processing unit 15 (for example, the link acoustic signal generation unit 53) sets an extraction period 570 including the shooting time TA[4] of the frame image 554, as shown in FIG. 24A. The extraction period 570 is, for example, the period from time (TA[4] − ΔTD) to time (TA[4] + ΔTD). ΔTD may have a predetermined positive value in units of time, or its value may be designated by the user. The link acoustic signal generation unit 53 reads the left original signal and the right original signal in the extraction period 570 from the external memory 18 and performs directivity control on them to generate an enhanced acoustic signal ESS554 (see FIG. 24B) in which the component of the sound from sound sources positioned within the angle of view of the frame image 554 (in other words, from subjects serving as sound sources positioned in the shooting range 534) is emphasized.
[0126]
That is, an emphasis target area AR554 is set according to the angle of view of the frame image 554, and the enhanced acoustic signal ESS554 corresponding to the emphasis target area AR554 is generated from the left original signal and the right original signal in the extraction period 570 by the same method as that for generating the enhanced acoustic signal ESS corresponding to the emphasis target area AR. The angles φ554A and φ554B, which correspond to the angles φA and φB for the emphasis target area AR554, are determined from the angle of view of the frame image 554, which is the shooting angle of view at time TA[4]; this angle of view is calculated from the recorded value of the magnification ZOPT[4]. Specifically, φ554A and φ554B are determined so that the angle (φ554B − φ554A) matches the angle of view of the frame image 554 and "(180° − φ554B) = φ554A" holds. As a result, the enhanced acoustic signal ESS554 has directivity in accordance with the angle of view of the frame image 554 (for example, it contains only the signal components of the sound coming from sound sources located in the emphasis target area AR554).
[0127]
If the angle of view of a frame image captured before the frame image 554 is already known, the angle of view of the frame image 554 can also be estimated based on that known angle of view and the optical flow between the frame images. In this example, the difference in angle of view between the frame images 551 and 554 can be estimated from the optical flow between the frame images 551 and 552, the optical flow between the frame images 552 and 553, and the optical flow between the frame images 553 and 554; if the angle of view of the frame image 551 is known, the angle of view of the frame image 554 can then be determined from these optical flows and the angle of view of the frame image 551. This method is useful when the magnification ZOPT[4] is not recorded in the external memory 18.
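As a rough sketch of how the zoom ratio between two frames might be estimated from optical flow under these conditions (a stationary scene and pure zoom are assumed; this description does not detail the computation):

import numpy as np

def zoom_ratio_from_flow(pts_prev, pts_curr, center):
    # With pure zoom about the image center, corresponding feature points
    # move radially; the median ratio of their radial distances estimates
    # the scale change, and the new angle of view is roughly the known
    # angle of view divided by this ratio.
    r_prev = np.linalg.norm(pts_prev - center, axis=1)
    r_curr = np.linalg.norm(pts_curr - center, axis=1)
    return float(np.median(r_curr / (r_prev + 1e-9)))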
[0128]
The enhanced acoustic signal ESS554 functions as a link acoustic signal for the frame image 554. The CPU 23 of FIG. 1 can associate the link acoustic signal, namely the enhanced acoustic signal ESS554, with the image signal of the frame image 554 in accordance with the method described in the first embodiment, and record them in the external memory 18 in an associated state. Thereafter, when reproduction of the frame image 554 is instructed, the frame image 554 is displayed on the display unit 27 and, at the same time, the link acoustic signal associated with the frame image 554 is reproduced as sound through the speaker 28.
[0129]
In this embodiment, instead of merely associating the acoustic signal recorded during the extraction period with the designated frame image, an enhanced acoustic signal having directivity according to the angle of view of the designated frame image is generated and associated with the designated frame image to produce a still image with sound. It therefore becomes possible to reproduce, together with the designated frame image, an acoustic signal adapted to the angle of view of the designated frame image (an acoustic signal that focuses on sound sources located within that angle of view).
[0130]
In the above description, the zoom magnification determining the angle of view of the frame image is assumed to be the optical zoom magnification by optical zoom, but the zoom magnification determining the angle of view of the frame image may instead be an electronic zoom magnification by electronic zoom, or a combination of an optical zoom magnification and an electronic zoom magnification. As with the optical zoom magnification, if the electronic zoom magnification is multiplied by k1 starting from a certain magnification, the angle of view of the input image serving as a frame image becomes 1/k1 times in both the horizontal and vertical directions of the image (k1 is a positive number, for example, 2).
[0131]
<< Modifications Etc. >> The specific numerical values given in the above description are merely examples and, needless to say, can be changed to various other values. Notes 1 to 6 below are annotations applicable to the above-described embodiments. The contents of the notes can be combined arbitrarily as long as no contradiction arises.
[0132]
[Note 1] In the above description, the detection of the amount of movement (positional deviation vector) between images is performed based on the image signals, but a motion sensor (not shown) for detecting movement of the imaging device 1 may be provided in the imaging device 1, and the amount of movement may be detected based on the detection result of the motion sensor. The motion sensor is, for example, an angular velocity sensor that detects the angular velocity of the casing of the imaging device 1 and the imaging element 33, or an acceleration sensor that detects the acceleration of the casing of the imaging device 1 and the imaging element 33.
[0133]
[Note 2] In each of the above-described embodiments, the link acoustic signal can be generated at any stage. That is, while in some of the examples described above (for example, the example shown in FIG. 7) the link acoustic signal is generated in substantially real time from the left original signal and the right original signal, or the L signal and the R signal, in the shooting mode, the left original signal and the right original signal (or the L signal and the R signal) from which the link acoustic signal originates may instead be temporarily stored in the internal memory 17 or the external memory 18, and the link acoustic signal may be generated at any timing after storage (for example, in the reproduction mode) by reading those signals from the memory.
[0134]
[Note 3] In the first embodiment, the two target input images 341 and 342 (FIGS. 8A and 8B) from which the target composite image is generated are obtained by shooting twice using the common imaging element 33. However, the imaging unit 11 may be provided with two imaging elements (not shown), namely first and second imaging elements, and the target input images 341 and 342 may be obtained by the two imaging elements. Each of the first and second imaging elements is equivalent to the imaging element 33, but the shooting direction of shooting using the first imaging element (the direction of the optical axis for the first imaging element) and the shooting direction of shooting using the second imaging element (the direction of the optical axis for the second imaging element) differ from each other.
[0135]
By forming the optical system of the imaging unit 11 such that an image of each subject in the shooting range 331 is formed on the first imaging element and, at the same time, an image of each subject in the shooting range 332 is formed on the second imaging element (see FIG. 6B), the image signals of the target input images 341 and 342 can be acquired simultaneously by the first and second imaging elements, and the image signal of the target composite image 600 of FIG. 25 can be generated from the acquired image signals of the target input images 341 and 342. The target composite image 600 is equivalent to the target composite image 343 of FIG. 9A or the target composite image 343a of FIG.
[0136]
At this time, a period including the shooting time of the target input images 341 and 342 obtained using the first and second imaging elements may be set as the target period, and the left original signal and the right original signal in the target period may be recorded in the external memory 18 in association with the image signal of the target composite image 600. Alternatively, an emphasis target image area 601 may be set in the target composite image 600, an emphasized acoustic signal ESS601 focusing on the emphasis target image area 601 may be generated from the nondirectional left original signal and right original signal during the target period, and the emphasized acoustic signal ESS601 may be associated with the image signal of the target composite image 600 as a link acoustic signal and recorded in the external memory 18 together with the image signal of the target composite image 600. The emphasis target image area 601 is a part of the entire image area of the target composite image 600, located near the center of the target composite image 600, and includes the image area in which the image signal of the subject common to the target input images 341 and 342 is present. Therefore, the image signal of the subject 321 serving as the target subject is present in the emphasis target image area 601.
[0137]
The emphasized acoustic signal ESS601 is an acoustic signal in which the component of the sound from sound sources (subjects) located within the angle of view of the emphasis target image area 601 is emphasized. Regarding the sound sources located within the angle of view of the emphasis target image area 601 as sound sources located within the emphasis target area AR of FIG. 21, the emphasized acoustic signal ESS601 can be generated by applying the directivity control described in the sixth embodiment to the nondirectional left original signal and right original signal in the target period.
[0138]
[Note 4] Although the number of microphones provided in the imaging device 1 is two in the above description, three or more microphones may be provided in the imaging device 1, and the link acoustic signal may be generated from the output signals of the three or more microphones.
[0139]
[Note 5] The imaging device 1 of FIG. 1 can be configured by hardware or a combination of
hardware and software.
When the imaging device 1 is configured using software, a block diagram of a portion realized by
the software represents a functional block diagram of the portion. A function implemented using
software may be described as a program, and the function may be implemented by executing the
program on a program execution device (for example, a computer).
[0140]
[Note 6] For example, the following configuration can be considered. The imaging device 1 is provided with a directivity control unit that generates a link acoustic signal having directivity from the output acoustic signals of a plurality of microphones. The directivity control unit may include the link acoustic signal generation unit 53 of FIG. 3, and may further include the channel acoustic signal generation unit 52 as a component. The imaging device 1 according to the sixth embodiment is provided with a target still image extraction unit that extracts a designated frame image (the frame image 554 in the above specific example) from the moving image 550 as a target still image; the target still image extraction unit is realized by the CPU 23 and/or the image signal processing unit 13 of FIG. 1. Further, the imaging device 1 is provided with an associating unit that associates the output image (the target composite image or the above-described target still image) with the link acoustic signal; the associating unit is realized by the CPU 23 of FIG. 1.
[0141]
Reference Signs List: 1 imaging device, 11 imaging unit, 13 image signal processing unit, 14 microphone unit, 15 acoustic signal processing unit, 33 imaging element, 52 channel acoustic signal generation unit, 53 link acoustic signal generation unit, 14L, 14R microphones, SS sound source