Patent Translate
Powered by EPO and Google

Notice: This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate, complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or financial decisions, should not be based on machine-translation output.

DESCRIPTION JP2011053062

The object is to estimate the distance between a sound source and a single microphone array. A sound source distance measuring apparatus according to the present invention includes a single microphone array consisting of a plurality of microphones, a plurality of frequency domain conversion units, a direct-to-indirect ratio estimation unit, a distance-ratio database that records the relationship between the direct-to-indirect ratio and the distance, and a distance determination unit. The plurality of frequency domain conversion units respectively receive the sound signals picked up by the plurality of microphones and convert them into signals in the frequency domain. The direct-to-indirect ratio estimation unit takes the frequency-domain signals output by the plurality of frequency domain conversion units as input and estimates the direct-to-indirect ratio of the received sound signal. The distance determination unit takes the direct-to-indirect ratio as input, refers to the distance-ratio database, and outputs the sound source distance estimate corresponding to that ratio.
[Selected figure] Figure 4

Direct-to-indirect ratio estimation device, sound source distance measurement device, noise removal device, method of each device, and device program

[0001] The present invention relates to a sound source distance measuring device that estimates the distance from a microphone array to a sound source using a single microphone array and that is applicable, for example, to hands-free operation of a device by voice input; to a direct-to-indirect ratio estimation device used in the sound source distance measuring device; to a noise removal device using the sound source distance measuring device; and to the method and program of each device.

[0002] The conventional approach to measuring the distance between a microphone and a sound source, disclosed in Non-Patent Document 1, will be briefly described with reference to FIG. 11. The idea is to estimate the distance between the microphone and the sound source by the principle of triangulation. Two microphone arrays 1 and 2 are disposed, at an interval of a distance D12, on a straight line orthogonal to the traveling direction of the sound wave from the sound source.

[0003] From the known distance D12 and the angles θ1 and θ2 estimated by the microphone arrays, the distance D between the sound source and the straight line formed by the microphone arrays 1 and 2 is estimated.

[0004] Non-Patent Document 1: M. Omologo and P. Svaizer, "Use of the Crosspower-Spectrum Phase in Acoustic Event Location," IEEE Transactions on Speech and Audio Processing, Vol. 5, No. 3, May 1997.

[0005] Heretofore, the only method of measuring the distance between the microphone and the sound source has been based on the above-described triangulation principle.
In order to measure the distance D accurately by triangulation, the viewing angles θ1 and θ2 from the microphone arrays 1 and 2 need to be sufficiently different, so a fairly large installation space is required. In addition, the distance between the microphone arrays needs to be known, and multiple microphone arrays are essential. Setting up the environment for measuring the sound source distance is therefore complicated.

[0006] The present invention has been made in view of these problems. In order to make it possible to estimate the distance between the microphone and the sound source even with a single microphone array, it proposes a new method using the direct-to-indirect ratio (described later). The object is to provide a sound source distance measuring device using the direct-to-indirect ratio, a direct-to-indirect ratio estimating device constituting the sound source distance measuring device, a noise removing device using the sound source distance measuring device, and the method and program of each device.

[0007] The sound source distance measuring device according to the present invention includes a single microphone array composed of a plurality of microphones, a plurality of frequency domain conversion units, a direct-to-indirect ratio estimation unit, a distance-ratio database recording the relationship between the direct-to-indirect ratio and the distance, and a distance determination unit. The plurality of frequency domain conversion units respectively receive the sound signals picked up by the plurality of microphones and convert them into signals in the frequency domain. The direct-to-indirect ratio estimation unit takes the frequency-domain signals output by the plurality of frequency domain conversion units as input and estimates the direct-to-indirect ratio of the received sound signal.
The distance determination unit takes the direct-to-indirect ratio as input, refers to the distance-ratio database, and estimates the distance corresponding to that ratio.

[0008] The direct-to-indirect ratio estimation device of the present invention includes, like the sound source distance measuring device of the present invention, a single microphone array consisting of a plurality of microphones, a plurality of frequency domain conversion units, and a direct-to-indirect ratio estimation unit.

[0009] The noise removal device of the present invention includes, in addition to the configuration of the sound source distance measurement device of the present invention, a processing target signal generation unit, a target signal adjustment unit, and an inverse frequency domain conversion unit. The processing target signal generation unit combines the frequency-domain signals output from the plurality of frequency domain conversion units to generate a processing target signal. The target signal adjustment unit receives the processing target signal and the direct-to-indirect ratio, and generates a processed signal in which the amplitude of the processing target signal is adjusted according to the value of the ratio. The inverse frequency domain conversion unit converts the processed signal into a time domain signal.

[0010] According to the sound source distance estimation device of the present invention, the distance between the sound source and the microphone array can be estimated with one small-scale microphone array, so space can be saved. In addition, there is no need for information that must be measured in advance at each installation, such as the distance between microphone arrays.
[0011] Further, the direct-to-indirect ratio estimating device of the present invention can provide the direct-to-indirect ratio, which is an index for sound source distance estimation. The noise removal device according to the present invention estimates the direct-to-indirect ratio and filters the received signal according to its value. The direct-to-indirect ratio is the ratio of direct sound to indirect sound (reverberant sound) included in the received sound, and is a value that changes monotonically with the distance between the microphone and the sound source. By filtering the received sound according to this value, it is possible to pick up sound while emphasizing or suppressing only the components of a sound source determined to be within a certain distance range. As a result, only the sound of a sound source located at a specific distance can be picked up (noise can be removed) with one microphone array.

[0012] Brief description of the drawings:
FIG. 1 shows an example of a scene in which the sound source distance estimation apparatus 400 of the present invention is used.
FIG. 2 shows the propagation paths of sound indoors.
FIG. 3 shows the relationship between the direct-to-indirect ratio and the distance between the microphone and the sound source.
FIG. 4 shows an example of the functional configuration of the sound source distance estimation apparatus 400 of the present invention.
FIG. 5 shows the operation flow of the sound source distance estimation apparatus 400.
FIG. 6 shows an example of the functional configuration of the direct-to-indirect ratio estimation unit 43.
FIG. 7 shows an example of the functional configuration of the noise removal apparatus 700 of the present invention.
FIG. 8 shows the operation flow of the noise removal apparatus 700.
FIG. 9 shows the conditions of the effect confirmation experiment.
FIG. 10 shows an example of the direct-to-indirect ratio.
FIG. 11 shows the conventional approach to measuring the distance between a microphone and a sound source disclosed in Non-Patent Document 1.

[0013] Hereinafter, embodiments of the present invention will be described with reference to the drawings. The same reference numerals are given to the same components in the drawings, and their description will not be repeated. In the following description, symbols such as “¯” and “^” should properly be written directly above the preceding character, but because of the limitations of text notation they are written immediately after that character. In the formulas, these symbols are written at their proper positions.

[0014] Before describing the embodiments, the idea of the present invention will be described.

[0015] The present invention estimates the distance between a microphone array and a sound source using a single microphone array. FIG. 1 illustrates a scene in which the sound source distance estimation apparatus 400 of the present invention is used. One microphone array 11 and a speaker 12 are present in a room 10 that has reverberation characteristics. The microphone array 11 and the speaker 12 are placed some distance apart.

[0016] In this situation, the goal is to estimate the distance D between the speaker 12 and the microphone array 11. To do so, the present invention estimates the sound source distance using the direct-to-indirect ratio.

[0017] The direct-to-indirect ratio is the ratio of direct sound to indirect sound (reverberant sound) included in the received sound. FIG. 2 shows the propagation paths of sound from the sound source 21 to the microphone 22 when the microphone is placed indoors and sound is recorded. The direct sound is the sound wave, indicated by a thick solid line, that reaches the microphone 22 directly from the sound source 21.
The reverberant sound is the sound wave, indicated by broken lines, that reaches the microphone 22 after the sound emitted from the sound source 21 has been reflected by a wall, the floor, the ceiling or the like.

[0018] FIG. 3 shows the relationship between the direct-to-indirect ratio and the distance between the microphone and the sound source. The horizontal axis in FIG. 3 is the distance from the microphone to the sound source, and the vertical axis is the direct-to-indirect ratio. In general, the indirect sound has a constant magnitude that does not depend on the distance from the microphone. The direct sound, in contrast, decreases monotonically as the distance from the microphone increases. The direct-to-indirect ratio, obtained by dividing the direct sound by the indirect sound, therefore decreases monotonically with increasing distance, just as the direct sound does.

[0019] By using this direct-to-indirect ratio, the sound source distance estimation apparatus of the present invention makes it possible to estimate the distance between the microphone array and the sound source from the sound received by one microphone array. The direct-to-indirect ratio estimation device of the present invention outputs the direct-to-indirect ratio. The noise removal apparatus of the present invention removes noise from the received sound signal according to the direct-to-indirect ratio output by the direct-to-indirect ratio estimation apparatus.

[0020] FIG. 4 shows an example of the functional configuration of a sound source distance estimation apparatus 400 according to the present invention. The operation flow is shown in FIG. 5. The sound source distance estimation apparatus 400 includes one microphone array 41, a plurality of frequency domain conversion units 421 to 42M, a direct-to-indirect ratio estimation unit 43, a distance-ratio database (hereinafter referred to as the distance-ratio DB) 44, and a distance determination unit 45.
Each functional component except the microphone array 41 is realized by, for example, a predetermined program being read into a computer including a ROM, a RAM, a CPU, and the like, and the CPU executing the program.

[0021] The microphone array 41 comprises a plurality of microphones m1, ..., mM. The plurality of frequency domain conversion units 421, ..., 42M respectively receive the sound signals xm(n) picked up by the plurality of microphones m1, ..., mM and convert each of them into a signal in the frequency domain (step S42). The frequency domain conversion units 421, ..., 42M take the received sound signal xm(n), sampled at a sampling frequency of, for example, 16 kHz and converted into a digital signal, group for example 256 samples into one frame, perform a discrete Fourier transform on each frame, and output the frequency component Xm(ω, l). Here ω is the frequency and l is the frame number. The A/D converter that converts the received sound signal xm(n) into a digital signal is omitted from the figure.

[0022] The direct-to-indirect ratio estimation unit 43 takes the frequency-domain signals Xm(ω, l) output by the plurality of frequency domain conversion units 421, ..., 42M as input and estimates the direct-to-indirect ratio E of the received sound signal (step S43).

[0023] The distance-ratio DB 44 records the relationship between the direct-to-indirect ratio E and the distance between the microphone array and the sound source. The distance determination unit 45 takes the direct-to-indirect ratio as input, refers to the distance-ratio DB 44, and estimates the distance corresponding to that ratio (step S45). The operations from step S42 to step S45 are repeated until all of the received sound signals xm(n) have been processed.
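The framing and per-frame discrete Fourier transform of step S42 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text gives only the 16 kHz sampling rate and the 256-sample frame as examples, so the non-overlapping frames and the absence of a window function are assumptions, and the function names `dft` and `to_frequency_domain` are hypothetical. A naive transform written with `cmath` is used to keep the sketch free of dependencies; an FFT routine would normally be used instead.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one frame (O(N^2), for clarity only)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def to_frequency_domain(x, frame_len=256):
    """Split x into non-overlapping frames (assumption) and return X[l][k]
    for frame number l and frequency bin k."""
    n_frames = len(x) // frame_len
    return [dft(x[l * frame_len:(l + 1) * frame_len]) for l in range(n_frames)]


# Example: a 1 kHz tone sampled at 16 kHz, two frames long.
fs = 16000
x = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(2 * 256)]
X = to_frequency_domain(x)

# Bin spacing is fs / 256 = 62.5 Hz, so the tone falls on bin 1000 / 62.5 = 16.
peak_bin = max(range(128), key=lambda k: abs(X[0][k]))
```

In the apparatus, one such transform would run per microphone, giving Xm(ω, l) for m = 1, ..., M.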
[0024] With the above operation it becomes possible, for example, to enhance only sound within a specific distance range with one microphone array and to suppress sound outside that range as noise. Hereinafter, the present invention will be described in more detail by showing a more specific functional configuration example of each part.

[0025] [Direct-to-indirect ratio estimation unit] FIG. 6 shows an example of the functional configuration of the direct-to-indirect ratio estimation unit 43. The direct-to-indirect ratio estimation unit 43 includes a spatial correlation matrix calculation unit 431, a signal power estimation unit 432, and a direct-to-indirect ratio calculation unit 433. The spatial correlation matrix calculation unit 431 receives the frequency-domain signals X1(ω, l), ..., XM(ω, l) output by the plurality of frequency domain conversion units 421, ..., 42M, arranges X1(ω, l), ..., XM(ω, l) into a vector, and calculates the spatial correlation matrix R(ω) shown in equation (1) from this input signal.

[0026] (equation not reproduced in this text)

[0027] Here, T represents transposition of a matrix, H represents conjugate transposition, and L represents the number of frames to be averaged. The spatial correlation matrix R(ω) is input to the signal power estimation unit 432.

[0028] The signal power estimation unit 432 uses each component Rij(ω) of the spatial correlation matrix R(ω) output by the spatial correlation matrix calculation unit 431, together with each component dij(ω) of the matrix Rd(ω) (equation (3)) and each component rij(ω) of the matrix Rr(ω) (equation (4)), which are determined by the microphone arrangement of the microphone array given in advance and by the direction of the sound source, to form the matrix A(ω) shown in equation (5) and B(ω) shown in equation (6).

[0029] (equation not reproduced in this text)

[0030] Here, Dmn is the distance between the m-th microphone and the n-th microphone, and θ is the direction of the sound source as viewed from the front of the microphone array.
Here, the microphone array is assumed to have a linear arrangement, and the front of the microphone array means the direction normal to the line along which the microphones are arranged.

[0031] (equation not reproduced in this text)

[0032] Then, by setting up and solving the simultaneous equations shown in equation (7), the vector P(ω) composed of the power Pd(ω) of the direct sound and the power Pr(ω) of the reverberant sound (equation (8)) is obtained, and the direct sound power Pd(ω) and the reverberant sound power Pr(ω) are each output.

[0033] (equation not reproduced in this text)

[0034] When the microphones are arranged other than in a straight line, the matrix Rd(ω) can be expressed in the more general form shown in equation (9).

[0035] (equation not reproduced in this text)

[0036] Here, Dmn(θ) represents the difference in distance between the m-th microphone and the n-th microphone as viewed from the direction of the angle θ. The simultaneous equations of equation (7) are solved, for example, as shown in equation (11), by multiplying from the left by the pseudo-inverse matrix A<+>(ω) of A(ω) (equation (10)).

[0037] (equation not reproduced in this text)

[0038] The direct-to-indirect ratio calculation unit 433 calculates the direct-to-indirect ratio E by equation (12) from the direct sound power Pd(ω) and the reverberant sound power Pr(ω), and outputs it.

[0039] (equation not reproduced in this text)

[0040] The direct-to-indirect ratio estimation unit 43 described above, together with one microphone array 41 and the plurality of frequency domain conversion units 421 to 42M, can constitute a direct-to-indirect ratio estimation device 71 that outputs the direct-to-indirect ratio E. Alternatively, the direct-to-indirect ratio may be obtained from the eigenvalues given by an eigenvalue decomposition of the spatial correlation matrix R(ω).

[0041] Information on the relationship between the distance and the direct-to-indirect ratio is recorded in advance in the distance-ratio DB 44.
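The power estimation of paragraphs [0025] to [0038] can be illustrated with the sketch below. Because equations (1) to (11) are not reproduced in this text, several forms are assumptions rather than the patent's own formulas: the spatial correlation matrix is taken as the frame average of x x^H, the direct-sound entries as the plane-wave phase dij = exp(jω Dij sin θ / c), the reverberant entries as the diffuse-field coherence rij = sin(ω Dij / c) / (ω Dij / c), and the model as Rij = Pd dij + Pr rij. The speed of sound `C` and all function names are hypothetical. The two unknowns are found by least squares through the normal equations, which for a full-column-rank A gives the same result as multiplying by the pseudo-inverse.

```python
import cmath
import math

C = 340.0  # assumed speed of sound in m/s (not stated in the text)


def spatial_correlation(frames):
    """Frame average of x x^H, flattened row-major (assumed form of eq. (1))."""
    L, M = len(frames), len(frames[0])
    return [
        sum(x[i] * x[j].conjugate() for x in frames) / L
        for i in range(M)
        for j in range(M)
    ]


def model_entries(positions, omega, theta):
    """Flattened entries d_ij and r_ij of the assumed Rd and Rr (linear array)."""
    d, r = [], []
    for p_i in positions:
        for p_j in positions:
            arg = omega * (p_i - p_j) / C
            d.append(cmath.exp(1j * arg * math.sin(theta)))     # plane-wave phase
            r.append(1.0 if arg == 0 else math.sin(arg) / arg)  # diffuse coherence
    return d, r


def estimate_powers(R_flat, d, r):
    """Least-squares solution of R_ij = Pd * d_ij + Pr * r_ij for (Pd, Pr)."""
    # Normal equations (A^H A) P = A^H B with A = [d r] and B = R_flat.
    a11 = sum(abs(v) ** 2 for v in d)
    a12 = sum(dv.conjugate() * rv for dv, rv in zip(d, r))
    a22 = sum(rv * rv for rv in r)
    b1 = sum(dv.conjugate() * Rv for dv, Rv in zip(d, R_flat))
    b2 = sum(rv * Rv for rv, Rv in zip(r, R_flat))
    det = a11 * a22 - a12 * a12.conjugate()
    p_direct = (a22 * b1 - a12 * b2) / det
    p_reverb = (a11 * b2 - a12.conjugate() * b1) / det
    return p_direct.real, p_reverb.real


# Synthetic check: three microphones 4 cm apart, source at 10 degrees, 1 kHz.
positions = [-0.04, 0.0, 0.04]
omega = 2 * math.pi * 1000.0
theta = math.radians(10.0)
d, r = model_entries(positions, omega, theta)
R = [2.0 * dv + 0.5 * rv for dv, rv in zip(d, r)]  # built with Pd = 2.0, Pr = 0.5
p_direct, p_reverb = estimate_powers(R, d, r)
```

With real recordings, `R` would come from `spatial_correlation` applied to the per-microphone spectra at one frequency, and the ratio E would then be formed from the recovered powers summed over frequency.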
This information is either a set of pairs (d1, E1), (d2, E2), ... of distance and direct-to-indirect ratio, between which values are obtained by linear interpolation, or a functional expression d = f(E) indicating the relationship between the distance and the direct-to-indirect ratio, such as an approximation function obtained from the set of pairs (d1, E1), (d2, E2), .... The function f(E) is described, for example, in the reference M. Tohyama et al., "The Nature and Technology of Acoustic Space," Academic Press, 1995.

[0042] The distance determination unit 45 takes the direct-to-indirect ratio E input from the direct-to-indirect ratio estimation unit 43, refers to the relationship between distance and direct-to-indirect ratio recorded in the distance-ratio DB 44, and outputs the sound source distance estimate d^ corresponding to E.

[0043] When pairs (d1, E1), (d2, E2), ... of distance and direct-to-indirect ratio are stored in the distance-ratio DB 44, the sound source distance estimate d^ is found and output by the following steps.

[0044] First step: Among E1, E2, ... stored in the distance-ratio DB 44, the two ratios Em and En adjacent to the direct-to-indirect ratio E obtained by the direct-to-indirect ratio estimation unit 43 are determined.

[0045] Second step: The distances dm and dn corresponding to the ratios Em and En, respectively, are obtained from the distance-ratio DB 44.

[0046] Third step: From the distances dm and dn, the sound source distance estimate d^ is obtained by linear interpolation as shown in equation (13).

[0047] (equation not reproduced in this text)

[0048] When the functional expression d = f(E) is stored in the distance-ratio DB 44, the distance determination unit 45 calculates the sound source distance estimate d^ from the direct-to-indirect ratio E input from the direct-to-indirect ratio estimation unit 43 and outputs it.
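The three lookup steps of paragraphs [0044] to [0046] can be sketched as below. Equation (13) is not reproduced in this text, so the usual two-point linear interpolation is assumed. The table values are invented purely for illustration, the function name `estimate_distance` is hypothetical, and clamping to the end points when E lies outside the stored range is an added assumption that the text does not address. The stored ratios are assumed to be strictly monotonic in distance.

```python
def estimate_distance(E, table):
    """Estimate the source distance for ratio E from (distance, ratio) pairs.

    table: list of (d_k, E_k) pairs with strictly monotonic ratios.
    """
    pairs = sorted(table, key=lambda p: p[1])  # ascending ratio

    # Outside the stored range: return the nearest end point (assumption).
    if E <= pairs[0][1]:
        return pairs[0][0]
    if E >= pairs[-1][1]:
        return pairs[-1][0]

    # Step 1: find the adjacent ratios E_m <= E <= E_n.
    # Step 2: take the distances d_m, d_n stored with them.
    # Step 3: interpolate linearly between the two points.
    for (d_m, E_m), (d_n, E_n) in zip(pairs, pairs[1:]):
        if E_m <= E <= E_n:
            return d_m + (d_n - d_m) * (E - E_m) / (E_n - E_m)


# Illustrative table only: distance in metres, ratio in dB.
table = [(0.3, 6.0), (0.5, 2.0), (1.0, -4.0)]
d_hat = estimate_distance(4.0, table)  # halfway between 0.5 m and 0.3 m
```

When the database holds a functional expression d = f(E) instead of pairs, the lookup reduces to evaluating that function.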
[0049] As shown in equation (12), the direct-to-indirect ratio calculation unit 433 divides the accumulated value Σω Pd(ω) of the direct sound power over all frequencies ω by the accumulated value Σω Pr(ω) of the indirect sound power over all frequencies ω, and takes the result as the direct-to-indirect ratio E. Some received sound signals, however, have their components concentrated in a specific frequency band. When the direct-to-indirect ratio E of such a signal is calculated over all frequencies by the direct-to-indirect ratio calculation unit 433, the estimation accuracy of E is degraded.

[0050] Therefore, the estimation accuracy of the direct-to-indirect ratio can be improved by using a direct-to-indirect ratio calculation unit 433' (FIG. 6) that calculates the direct-to-indirect ratio E over a specific frequency region Ω, as shown in equation (14).

[0051] (equation not reproduced in this text)

[0052] Here, the frequency region Ω is determined, for example, by selecting the frequency band in which the signal components are concentrated. For example, from the output Xm(ω, l) of the frequency domain conversion unit 42m connected to an arbitrary m-th microphone, Ω is determined either by selecting the frequencies ω at which the absolute value of Xm(ω, l) exceeds a preset threshold Pth, as shown in equation (15), or by selecting the K frequencies ω with the largest absolute values of Xm(ω, l).

[0053] (equation not reproduced in this text)

[0054] Here, Pth is, for example, the average of |Xm(ω, l)| over all frequencies.

[0055] FIG. 7 shows an example of the functional configuration of the noise removal apparatus 700 of the present invention. The operation flow is shown in FIG. 8. The noise removal apparatus 700 includes the direct-to-indirect ratio estimation device 71 described in the first embodiment, a processing target signal generation unit 72, a target signal adjustment unit 73, and an inverse frequency domain conversion unit 74.
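The band-limited ratio of paragraphs [0050] to [0054] can be sketched as follows, assuming that equation (15) selects the bins whose magnitude exceeds the threshold Pth (here the mean magnitude, as the text suggests) and that equation (14) is the ratio of the two power sums restricted to the selected bins. Neither equation is reproduced in this text, and the function names and sample values are hypothetical.

```python
def select_bins(magnitudes, top_k=None):
    """Return the indices of the frequency bins forming the region Omega.

    With top_k set, keep the top_k largest-magnitude bins; otherwise keep
    the bins whose magnitude exceeds the mean magnitude (threshold Pth).
    """
    if top_k is not None:
        order = sorted(range(len(magnitudes)),
                       key=lambda k: magnitudes[k], reverse=True)
        return sorted(order[:top_k])
    p_th = sum(magnitudes) / len(magnitudes)
    return [k for k, m in enumerate(magnitudes) if m > p_th]


def band_limited_ratio(p_direct, p_reverb, bins):
    """Ratio of summed direct power to summed reverberant power over bins."""
    return sum(p_direct[k] for k in bins) / sum(p_reverb[k] for k in bins)


# Illustrative values: the signal energy sits in bins 1 and 2.
magnitudes = [0.1, 4.0, 3.0, 0.2, 0.1]   # |Xm(w, l)| for one frame
p_direct = [0.0, 8.0, 6.0, 0.1, 0.0]     # Pd(w)
p_reverb = [1.0, 2.0, 1.5, 1.0, 1.0]     # Pr(w)

omega_bins = select_bins(magnitudes)
E_band = band_limited_ratio(p_direct, p_reverb, omega_bins)
E_full = band_limited_ratio(p_direct, p_reverb, range(len(magnitudes)))
```

In this toy case the full-band ratio is pulled down by bins that carry reverberant power but almost no direct sound, which is the degradation that restricting the sums to Ω is meant to avoid.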
[0056] The processing target signal generation unit 72 takes as input the frequency-domain signals Xm(ω, l) output by the plurality of frequency domain conversion units 421 to 42M in the direct-to-indirect ratio estimation device 71, and outputs the processing target signal X(ω, l) (step S72). The processing target signal X(ω, l) is obtained by combining the frequency-domain signals Xm(ω, l), for example by an addition means (not shown). Prior to the addition, each frequency-domain signal Xm(ω, l) may be multiplied by a weight.

[0057] The target signal adjustment unit 73 receives the processing target signal X(ω, l) output from the processing target signal generation unit 72 and the direct-to-indirect ratio E output from the direct-to-indirect ratio estimation device 71, and generates the processed signal Y(ω, l) by adjusting the amplitude of X(ω, l) according to the value of the ratio (step S73). The inverse frequency domain conversion unit 74 converts the processed signal Y(ω, l) into a time domain signal y(n) (step S74).

[0058] The target signal adjustment unit 73 includes, for example, a distance calculation means 721, a filter formation means 722, and a multiplication means 723. The distance calculation means 721 holds a functional expression d = f(E) indicating the relationship between the direct-to-indirect ratio E and the distance between the microphone array 41 and the sound source, and calculates the sound source distance estimate d^ corresponding to the input ratio E (distance calculation step S721).

[0059] The filter formation means 722 forms a filter that, as shown in equation (16), emphasizes the time-frequency components for which the sound source distance estimate d^ takes a value between two different thresholds df and dn, so that only sound sources in the band-like region between the two distances are emphasized.
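The filter formation and multiplication steps can be sketched as below. Equation (16) is not reproduced in this text, so a simple two-level gate is assumed: one gain when the distance estimate lies between the two thresholds and another gain otherwise. The names `distance_gate`, `apply_gate`, `d_lo` and `d_hi` are hypothetical, and treating the thresholds as inclusive bounds is also an assumption.

```python
def distance_gate(d_hat, d_lo, d_hi, pass_gain=1.0, stop_gain=0.0):
    """Gain for one block: pass_gain if d_lo <= d_hat <= d_hi, else stop_gain."""
    return pass_gain if d_lo <= d_hat <= d_hi else stop_gain


def apply_gate(X_frame, gain):
    """Multiply every frequency component of the target signal by the gain."""
    return [gain * v for v in X_frame]


# Illustrative values: keep sources estimated between 0.3 m and 0.6 m.
X_frame = [1 + 1j, 2.0, -0.5j]
near = apply_gate(X_frame, distance_gate(0.4, 0.3, 0.6))  # inside the range
far = apply_gate(X_frame, distance_gate(1.2, 0.3, 0.6))   # outside the range
```

Swapping the two gains turns the same gate into a suppressor for sources inside the distance range.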
[0060] (equation not reproduced in this text)

[0061] Here, regarding l and ω of G(ω, l), the same G(ω, l) is applied to all of the L frames averaged by equation (1) in the spatial correlation matrix calculation unit 431 and to all of the frequencies included in the frequency region Ω (equation (14)) accumulated by the direct-to-indirect ratio calculation unit 433' in the processing of the direct-to-indirect ratio estimation unit 43. Further, in equation (16), the values of G(ω, l) do not necessarily have to be 1 and 0, and may be any values of sufficiently different magnitude, such as 0.9 and 0.1.

[0062] The multiplication means 723 multiplies the processing target signal X(ω, l) by the filter G(ω, l) to generate the processed signal Y(ω, l). The processed signal Y(ω, l) is therefore one in which the sound of a sound source located between the two distances, that is, in a specific distance range from the microphone array 41, is emphasized or suppressed. The processed signal Y(ω, l) is converted by the inverse frequency domain conversion unit 74 into the time domain signal y(n).

[0063] [Experimental results] To confirm the effect of the present invention, a computer simulation was performed in which two sound sources were disposed at different positions in the same direction as viewed from the microphone array and the sound of the source farther from the microphone array was suppressed.

[0064] The simulation conditions are shown in FIG. 9. A room with a floor plan of 4 m x 6 m and a height of 2.5 m was assumed. The microphone array consisted of three microphones arranged linearly at intervals of 4 cm, giving an array size of 8 cm. The central microphone of the array was placed at a height of 1.5 m and 1 m from the 4 m wall.
A sound source emitting white noise following a normal distribution was then placed in the direction 10° from the central axis of the central microphone, the distance from the microphone array was varied, and the direct-to-indirect ratio was estimated at each distance.

[0065] In FIG. 10, the horizontal axis represents the distance [cm] between the microphone array and the sound source, and the vertical axis represents the direct-to-indirect ratio [dB]. The direct-to-indirect ratio estimated by the method of the present invention is plotted with ○. The actual direct-to-indirect ratio determined from the impulse response is plotted with □. The estimate deviates from the actual value at 20 cm or less, but follows the same trend as the actual value at distances of 30 cm or more.

[0066] FIG. 10 also makes clear that the distance can be determined from the value of the direct-to-indirect ratio.

[0067] In this way, the distance between the microphone and the sound source can be estimated even with a single compact microphone array. The idea of the present invention can be applied to a direct-to-indirect ratio estimation device, to a sound source distance estimation device using the direct-to-indirect ratio estimation device, and to a noise removal device.

[0068] The processes described for the above method and apparatus need not be performed in chronological order according to the order of description; they may also be performed in parallel or individually, depending on the processing capability of the apparatus that executes them or as needed.

[0069] When the processing means in the above devices are realized by a computer, the processing content of the functions that each device should have is described by a program. By executing this program on a computer, the processing means of each device are realized on the computer.

[0070] The program describing the processing content can be recorded on a computer readable recording medium.
The computer readable recording medium may be any medium, such as a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory. Specifically, for example, a hard disk device, a flexible disk, a magnetic tape or the like can be used as the magnetic recording device; a DVD (Digital Versatile Disc), a DVD-RAM (Random Access Memory), a CD-ROM (Compact Disc Read Only Memory), a CD-R (Recordable) / RW (Rewritable) or the like as the optical disc; an MO (Magneto-Optical disc) or the like as the magneto-optical recording medium; and an EEP-ROM (Electronically Erasable and Programmable Read Only Memory) or the like as the semiconductor memory.

[0071] The program is distributed, for example, by selling, transferring, or lending a portable recording medium, such as a DVD or a CD-ROM, on which the program is recorded. The program may also be stored in a storage device of a server computer and distributed by transferring it from the server computer to another computer via a network.

[0072] Each means may be configured by executing a predetermined program on a computer, or at least part of the processing content may be realized in hardware.
