JP2005142639

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2005142639
PROBLEM TO BE SOLVED: To make voices from multiple sources easy to recognize. SOLUTION: The
apparatus comprises speech input means for inputting plural kinds of speech signals obtained from
different sources; voice processing means for processing the speech signals inputted from the
speech input means and outputting them to n audio output devices (n is an integer of 3 or more);
selection means for selecting, among the plural kinds of audio signals, the audio signals to be
output simultaneously; and control means for determining, based on the combination of the
selected audio signals, which of the n audio output devices each selected audio signal should be
output to. [Selected figure] Figure 2
Signal processor
[0001]
The present invention relates to a signal processing apparatus, and more particularly to the
processing of audio signals.
[0002]
Conventionally, in an apparatus for displaying images of a plurality of channels (sources) on one
screen as in Patent Document 1, one of audio signals of each source is selected and output.
In addition, a device that synthesizes and outputs the audio of each source has also been considered.
Patent Document 1: JP-A-7298162
09-05-2019
1
[0003]
However, if the audio of only one of a plurality of sources is selected and output, the audio of the
other sources cannot be heard. When the audio of all sources is synthesized and output, every
sound comes from the same direction, so it is difficult to tell the sounds apart.
[0004]
An object of the present invention is to solve such a problem and to allow voices from multiple
sources to be easily recognized.
[0005]
According to the present invention, there are provided speech input means for inputting plural
kinds of speech signals obtained from different sources; audio processing means for processing
the speech signals inputted from the speech input means and outputting them to n audio output
devices (n is an integer of 3 or more); selection means for selecting, among the plural kinds of
audio signals, the audio signals to be output simultaneously; and control means for determining,
based on the combination of the selected audio signals, which of the n audio output devices each
selected audio signal should be output to.
[0006]
According to the present invention, a plurality of sounds belonging to the sources of images
displayed simultaneously become easier to recognize, and a plurality of processes can be provided
simultaneously without sacrificing any of their sounds.
[0007]
First Embodiment FIG. 1 is a block diagram of an embodiment of a signal processing apparatus
according to the present invention. Reference numeral 100 denotes a video camera signal
receiving unit, which receives a video signal from a video camera and outputs it to the signal
selection unit 105.
A microphone signal reception unit 101 receives an audio signal from a recording device and
outputs the audio signal to the signal selection unit 105.
A DVD signal receiving unit 102 receives a DVD encoded signal from a DVD device and outputs
the signal to the signal selecting unit 105.
Reference numeral 103 denotes a videophone signal receiving unit, which receives a videophone
signal from the videophone signal receiving device and outputs the videophone signal to the
signal selecting unit 105.
Reference numeral 104 denotes a BS broadcast signal receiving unit, which receives a BS
broadcast signal from a BS broadcast receiving device based on broadcast channel information
from the system control unit 111 and outputs the signal to the signal selection unit 105.
A signal selection unit 105 outputs the signals of the selected processes based on the signal
selection information from the system control unit 111. Specifically, the signals from the video
camera signal reception unit 100 and the microphone signal reception unit 101 are output to the
videophone signal coding unit / multiplexing unit 106, the signal from the DVD signal reception
unit 102 is output to the DVD signal separation unit / decoding unit 107, the signal from the
videophone signal reception unit 103 is output to the videophone signal separation unit /
decoding unit 108, and the signal from the BS broadcast signal reception unit 104 is output to
the BS broadcast signal separation unit / decoding unit 109 and the BS data broadcast signal
separation unit / decoding unit 110.
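The routing performed by the signal selection unit 105 can be pictured as a small lookup table. The following is only an illustrative sketch: the unit numbers come from the description, while the names ROUTES and route_signals are hypothetical and do not appear in the patent.

```python
# Hypothetical routing table: receiving unit number -> downstream unit numbers.
# The unit numbers follow the description; the data structure is an assumption.
ROUTES = {
    100: [106],       # video camera -> videophone coding / multiplexing
    101: [106],       # microphone   -> videophone coding / multiplexing
    102: [107],       # DVD          -> DVD separation / decoding
    103: [108],       # videophone   -> videophone separation / decoding
    104: [109, 110],  # BS broadcast -> BS and BS data separation / decoding
}


def route_signals(selected_units):
    """Return {destination unit: [source units]} for the selected receiving units."""
    destinations = {}
    for unit in selected_units:
        for dest in ROUTES[unit]:
            destinations.setdefault(dest, []).append(unit)
    return destinations


if __name__ == "__main__":
    # Videophone capture (camera + microphone) together with BS broadcast reception.
    print(route_signals([100, 101, 104]))
```

A table keeps the per-source routing in one place, which mirrors how the description enumerates one destination (or pair of destinations) per receiving unit.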
[0008]
Reference numeral 106 denotes a videophone signal coding unit / multiplexing unit, which
encodes and multiplexes the videophone signal from the signal selection unit 105 and outputs it
to the system control unit 111. A DVD signal separation unit / decoding unit 107 separates the
DVD signal from the signal selection unit 105 into a video signal, an audio signal, and a control
signal, decodes each of them, and outputs the decoded signals to the system control unit 111. A
videophone signal separation unit / decoding unit 108 separates the videophone signal from the
signal selection unit 105 into a video signal, an audio signal, and a control signal, decodes each
of them, and outputs the decoded signals to the system control unit 111. A BS broadcast signal
separation unit / decoding unit 109 separates the BS broadcast signal from the signal selection
unit 105 into a video signal, an audio signal, and a control signal, decodes each of them, and
outputs the decoded signals to the system control unit 111. Reference numeral 110 denotes a BS
data broadcast signal separation unit / decoding unit, which separates the BS data broadcast
signal from the signal selection unit 105 into a video signal, an audio signal, and a control
signal, decodes each of them, and outputs the decoded signals to the system control unit 111.
Reference numeral 111 denotes a system control unit, which performs the desired processing and
outputs the following: the videophone signal from the videophone signal coding unit /
multiplexing unit 106 to the videophone signal transmission unit 121; the audio signals from the
DVD signal separation unit / decoding unit 107, the videophone signal separation unit / decoding
unit 108, the BS broadcast signal separation unit / decoding unit 109, and the BS data broadcast
signal separation unit / decoding unit 110 to the audio frequency conversion unit 114, their
control signals to the control signal selection unit 117, and their video signals to the video
frequency conversion unit 118; the process selection information from the remote control
reception unit 112 to the signal selection unit 105, the audio output device determination unit
113, and the video synthesis unit 119; the output device setting information for each audio to
the audio output device determination unit 113; the broadcast channel information to the BS
broadcast signal reception unit 104; and the operation sound produced by remote control
operation to the voice synthesis unit 115. A remote control reception unit 112 outputs various
operation information from the user to the system control unit 111.
Reference numeral 113 denotes an audio output device determination unit, which determines the
output device for each of the selected audios, either from the output device setting information
for each audio supplied by the system control unit 111 or from the prescribed output device
setting information for the combination of selected audios identified by the process selection
information from the system control unit 111, and outputs the result to the voice synthesis unit
115. An audio frequency conversion unit 114 selects the audio signal having the highest
frequency among the plurality of audio signals from the system control unit 111, performs
frequency conversion processing so that every audio signal has the selected frequency, outputs
the converted signals to the voice synthesis unit 115, and outputs the selection information to
the control signal selection unit 117.
Reference numeral 115 denotes a voice synthesis unit, which downmixes the audio signals from
the system control unit 111 and the audio frequency conversion unit 114 to match the devices
determined by the audio output device determination unit 113, synthesizes them, and outputs
the result to the audio output control unit 116. An audio output control unit 116 is controlled by
the control signal selected by the control signal selection unit 117 and outputs the audio signal
from the voice synthesis unit 115 through the audio output device. Reference numeral 117
denotes a control signal selection unit, which selects among the plurality of control signals from
the system control unit 111: for audio, it outputs the control signal of the audio determined by
the audio frequency conversion unit 114 to the audio output control unit 116; for video, it
outputs the control signal of the video determined by the video frequency conversion unit 118 to
the video output control unit 120. Reference numeral 118 denotes a video frequency conversion
unit, which selects the video signal having the highest frequency among the plurality of video
signals from the system control unit 111, performs frequency conversion processing so that
every video signal has the selected frequency, outputs the converted signals to the video
synthesis unit 119, and outputs the selection information to the control signal selection unit
117. A video synthesis unit 119 converts each video signal from the video frequency conversion
unit 118 to the resolution prescribed for the process selection information from the system
control unit 111, synthesizes the video, and outputs it to the video output control unit 120. A
video output control unit 120 is controlled by the control signal selected by the control signal
selection unit 117 and outputs the video signal from the video synthesis unit 119 through the
display device. Reference numeral 121 denotes a videophone signal transmission unit, which
outputs the videophone signal from the system control unit 111 through the transmission device.
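The behaviour described for the audio frequency conversion unit 114 (pick the highest frequency, then convert every signal to it) might be sketched as follows. This assumes that "frequency" means the sampling rate, which the text does not state, and it uses plain linear interpolation as a stand-in for whatever converter the apparatus actually uses. The names resample_linear and unify_rates are hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (a simple stand-in for a real converter)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate           # position in the source signal
        lo = min(int(pos), len(samples) - 1)    # sample at or before pos
        hi = min(lo + 1, len(samples) - 1)      # next sample, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def unify_rates(signals):
    """signals: {name: (rate, samples)} -> (selected rate, {name: converted samples})."""
    target = max(rate for rate, _ in signals.values())  # the highest rate is selected
    converted = {
        name: resample_linear(samples, rate, target)
        for name, (rate, samples) in signals.items()
    }
    return target, converted


if __name__ == "__main__":
    rate, out = unify_rates({
        "bs_broadcast": (48000, [0.0, 1.0, 2.0, 3.0]),
        "videophone": (24000, [0.0, 2.0]),
    })
    print(rate, out)
```

Converting up to the highest rate, rather than down to the lowest, avoids discarding content from the best-quality source before the signals are mixed.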
[0009]
Next, the method by which the audio output device determination unit 113 determines the output
device for each audio will be described.
[0010]
If there is output setting of the user through the remote control receiver unit 112 and the system
control unit 111, an output device is determined according to the setting, and if not specified, an
output device is determined according to the table of FIG.
[0011]
For example, when BS broadcast reception processing, BS data broadcast reception processing,
and videophone processing are selected as sources, and there is no output setting from the user,
then according to the table of FIG. the sound of the BS broadcast is assigned to the front left and
right speakers, the center speaker, and the woofer, the sound of the BS data broadcast is
assigned to the rear left speaker, and the sound of the videophone is assigned to the rear right
speaker, and this assignment is output to the voice synthesis unit 115.
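The decision rule of paragraphs [0010] and [0011] (a user setting wins, otherwise a prescribed table keyed by the combination of selected sources) can be sketched as below. Only the one combination worked through in the text is included; the remaining rows of the table are not reproduced here, and the names DEFAULT_TABLE and determine_output_devices are hypothetical.

```python
# Only this combination is described in the text; other rows of the table
# are not available here, so the dictionary is deliberately incomplete.
DEFAULT_TABLE = {
    frozenset({"bs_broadcast", "bs_data", "videophone"}): {
        "bs_broadcast": ["front_left", "front_right", "center", "woofer"],
        "bs_data": ["rear_left"],
        "videophone": ["rear_right"],
    },
}


def determine_output_devices(selected, user_setting=None):
    """Return {source: [speakers]}; a user setting overrides the prescribed table."""
    if user_setting:
        return user_setting
    # The key is order-independent: only the combination of sources matters.
    return DEFAULT_TABLE[frozenset(selected)]


if __name__ == "__main__":
    print(determine_output_devices(["bs_broadcast", "bs_data", "videophone"]))
```

Keying the table on a frozenset reflects that the claim speaks of the combination of selected signals, not the order in which they were chosen.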
[0012]
Next, the synthesis method of the speech synthesis unit 115 will be described.
[0013]
Only when an audio signal has more channels than the number of devices specified for it,
downmixing using a prescribed downmix coefficient is performed. The audio is then synthesized
so that it is output from the devices determined by the audio output device determination unit
113, and is output to the audio output control unit 116.
[0014]
In the above example, when the selected BS broadcast audio signal has 6 channels and the BS
data broadcast and videophone audio signals each have 2 channels, the 6-channel BS broadcast
audio signal is converted into four signals, for the front left and right speakers, the center
speaker, and the woofer, mixed at a specified mixing ratio. The BS data broadcast audio signal
and the videophone audio signal are each converted into one signal at a specified mixing ratio.
The synthesized result is output via the audio output control unit 116 and the audio output
device.
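The downmix in this example might look like the sketch below. The text gives neither the mixing ratios nor the channel order, so both are assumptions here: the six channels are taken as (front left, front right, center, woofer, surround left, surround right), the surround channels are folded into the fronts with a gain of 0.7071, and the two-channel signals are averaged. All function names are hypothetical.

```python
SURROUND_GAIN = 0.7071  # assumed coefficient; the source only says "specified mixing ratio"


def downmix_6_to_4(frame):
    """(FL, FR, C, LFE, SL, SR) -> (FL, FR, C, LFE), folding surrounds into the fronts."""
    fl, fr, c, lfe, sl, sr = frame
    return (fl + SURROUND_GAIN * sl, fr + SURROUND_GAIN * sr, c, lfe)


def downmix_2_to_1(frame):
    """(left, right) -> single signal, using an assumed equal mix."""
    left, right = frame
    return 0.5 * (left + right)


def synthesize(bs_frame, data_frame, phone_frame):
    """Build one output sample per speaker for the example in the text."""
    fl, fr, c, lfe = downmix_6_to_4(bs_frame)
    return {
        "front_left": fl,
        "front_right": fr,
        "center": c,
        "woofer": lfe,
        "rear_left": downmix_2_to_1(data_frame),    # BS data broadcast
        "rear_right": downmix_2_to_1(phone_frame),  # videophone
    }


if __name__ == "__main__":
    print(synthesize((1.0, 0.0, 0.5, 0.25, 0.0, 0.0), (0.2, 0.4), (1.0, 1.0)))
```

Each source ends up on its own set of speakers, which is what lets the listener separate the sources by direction.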
[0015]
As described above, in this embodiment the device (speaker) that outputs each voice is selected
according to the type and number of voices to be output simultaneously, so the user can easily
recognize the voice belonging to the source of each image displayed on the screen.
[0016]
<< Embodiment 2 >> FIG. 4 is a block diagram of an embodiment of a signal processing
apparatus according to the present invention. Reference numeral 300 denotes a microphone
signal receiving unit, which receives an audio signal from a recording device and outputs it to the
signal selecting unit 306.
Reference numeral 301 denotes a telephone signal receiving unit, which receives a telephone
voice signal from a telephone signal receiving device and outputs the signal to the signal
selecting unit 306.
An AM signal receiving unit 302 receives an AM audio signal from an AM signal receiving device
based on the broadcast channel information from the system control unit 310 and outputs the
signal to the A / D conversion unit 304.
An FM signal receiving unit 303 receives an FM audio signal from an FM signal receiving device
based on the broadcast channel information from the system control unit 310 and outputs the
signal to the A / D conversion unit 304.
An A / D conversion unit 304 performs A / D conversion on the audio signals from the AM signal
reception unit 302 and the FM signal reception unit 303, and outputs the audio signals to the
signal selection unit 306.
An MP3 signal receiving unit 305 receives an MP3 encoded signal from an external medium
based on the MP3 control signal from the system control unit 310, and outputs the signal to the
signal selection unit 306.
Reference numeral 306 denotes a signal selection unit, which, based on signal selection
information from the system control unit 310, selects among the audio signals from the
microphone signal reception unit 300, the telephone signal reception unit 301, the A / D
conversion unit 304, and the MP3 signal reception unit 305. It outputs the microphone signal to
the telephone signal coding unit 307, the telephone signal to the telephone signal decoding unit
308, the AM signal and the FM signal to the system control unit 310, and the MP3 encoded signal
to the MP3 signal decoding unit 309.
A telephone signal coding unit 307 encodes the microphone signal from the signal selection unit
306 and outputs the encoded signal to the system control unit 310. Reference numeral 308
denotes a telephone signal decoding unit, which decodes the telephone signal from the signal
selection unit 306 and outputs the decoded signal to the system control unit 310. An MP3 signal
decoding unit 309 decodes the MP3 encoded signal from the signal selection unit 306 and
outputs the decoded signal to the system control unit 310.
Reference numeral 310 denotes a system control unit, which performs the desired processing and
outputs the following: the encoded telephone signal from the telephone signal coding unit 307
to the telephone signal transmission unit 317; the telephone signal from the telephone signal
decoding unit 308, the AM and FM signals from the signal selection unit 306, and the audio
signal from the MP3 signal decoding unit 309 to the audio frequency conversion unit 313; the
process selection information from the remote control reception unit 311 to the signal selection
unit 306 and the audio output coordinate determination unit 312; the output coordinate setting
information for each audio to the audio output coordinate determination unit 312; the broadcast
channel information to the AM signal reception unit 302 and the FM signal reception unit 303;
the MP3 control signal to the MP3 signal reception unit 305; and the operation sound produced
by remote control operation to the audio 3D emulation unit 314. A remote control reception unit
311 outputs various operation information from the user to the system control unit 310.
An audio output coordinate determination unit 312 determines the output coordinates for each
of the selected audios, either from the output coordinate setting information for each audio
supplied by the system control unit 310 or from the prescribed output coordinate setting
information for the combination of selected audios identified by the process selection
information from the system control unit 310, and outputs them to the audio 3D emulation unit
314. An audio frequency conversion unit 313 selects the audio signal having the highest
frequency among the plurality of audio signals from the system control unit 310, performs
frequency conversion processing so that every audio signal has the selected frequency, and
outputs the converted signals to the audio 3D emulation unit 314. An audio 3D emulation unit
314 processes the audio signals from the system control unit 310 so that they are positioned at
the coordinates determined by the audio output coordinate determination unit 312, and outputs
the processed signals to the voice synthesis unit 315. Reference numeral 315 denotes a voice
synthesis unit, which synthesizes the audio signals from the system control unit 310 and the
audio frequency conversion unit 313 and outputs the result to the audio output control unit 316.
An audio output control unit 316 outputs the audio signal from the voice synthesis unit 315
through the audio output device. A telephone signal transmission unit 317 transmits the
telephone signal from the system control unit 310 through the telephone signal transmission
device.
[0017]
Next, the process of the audio output coordinate determination unit 312 will be described.
[0018]
If there is an output setting of the user through the remote control reception unit 311 and the
system control unit 310, the output coordinates are determined according to the setting. If there
is no setting, the output coordinates are determined according to the following rules.
[0019]
1. Each process is assigned two output coordinates, separated from each other by 0.7 meters.
[0020]
2. The two output coordinates for each process are determined so that the selected voices are
equally spaced on the circumference of a circle of 1 meter around the user, lying in a plane.
[0021]
3. The processes are assigned counterclockwise around the user in the order in which they were
selected.
[0022]
For example, when external media speech decoding processing, telephone reception processing,
and FM reception processing are selected in that order and there is no output setting from the
user, then as shown in FIGS. 5 and 6 the external media speech is assigned coordinates A1 and
A2, the telephone speech coordinates B1 and B2, and the FM speech coordinates C1 and C2, and
these are output to the speech synthesis unit 315.
The voice synthesis unit 315 synthesizes and outputs the voice signals to the plurality of
connected voice output devices (speakers) so that the voice of each source is heard from the
position given by the voice coordinates determined in this way.
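The three default rules can be turned into coordinates with a little trigonometry. The sketch below makes several assumptions that the text leaves open: the "1 meter" is read as the radius of the circle, the first process is centered straight ahead of the user (the positive y direction), and each pair of points sits symmetrically about its center angle. A chord of 0.7 meters on a circle of radius 1 meter subtends a half-angle of asin(0.35). The name output_coordinates is hypothetical.

```python
import math


def output_coordinates(processes, radius=1.0, separation=0.7, start_angle=math.pi / 2):
    """Return {process: ((x1, y1), (x2, y2))} with the user at the origin.

    radius, separation and start_angle are assumptions layered on rules 1 to 3.
    """
    half = math.asin(separation / (2 * radius))  # half the angle spanned by one pair
    step = 2 * math.pi / len(processes)          # equal spacing between processes
    coords = {}
    for k, name in enumerate(processes):
        centre = start_angle + k * step          # increasing angle = counterclockwise
        coords[name] = tuple(
            (radius * math.cos(a), radius * math.sin(a))
            for a in (centre - half, centre + half)
        )
    return coords


if __name__ == "__main__":
    # Selection order from the example: external media, telephone, FM.
    for name, pair in output_coordinates(["media", "telephone", "fm"]).items():
        print(name, pair)
```

With these assumed values, neighbouring pairs stay clear of each other for up to eight simultaneous processes; beyond that the 0.7 meter pairs would begin to overlap on a 1 meter circle.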
[0023]
It is a block diagram of a signal processing device in an embodiment of the present invention.
It is a figure showing a table of the kinds of audio signals to be output and the combination of
speakers used for each.
It is a figure showing the arrangement of the speakers in the embodiment.
It is a block diagram of a signal processing device in an embodiment of the present invention.
It is a figure showing the audio output coordinates in the embodiment.
It is a figure showing the audio output coordinates in the embodiment.