DESCRIPTION JP2009044588

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
An object of the present invention is to realize an apparatus that emphasizes and collects sound
from a desired sound source with a higher SNR than the prior art without enlarging the microphone
array. A specific direction sound pickup apparatus according to the present invention includes a
plurality of beam former units, a plurality of frequency domain conversion units, a specific
direction selection unit, a signal amount estimation unit, a gain coefficient calculation unit,
and a multiplication unit. The signal amount estimation unit includes a region aggregation unit, an
inverse matrix operation unit, and a multiplication unit. The region aggregation unit obtains an
aggregated power vector composed of the signal amount of the specific direction frequency domain
signal and the signal amount of the angle region signals in the other directions. The inverse
matrix operation unit calculates the inverse of the aggregation gain matrix obtained from the
directivity characteristics of the beam former units. The multiplication unit multiplies the
aggregated power vector by the inverse matrix to obtain an estimated value of the sum of the
frequency domain signals. [Selected figure] Figure 5
Specific direction sound collection device, specific direction sound collection method, specific
direction sound collection program, recording medium
[0001]
The present invention relates to a sound collection device that acquires sound hands-free for
applications such as voice communication and device operation. In particular, it relates to a
specific direction sound pickup device, a specific direction sound pickup method, a specific
direction sound pickup program, and a recording medium on which the program is recorded, all of
which are suited to cases where only the sound from a sound source located in a specific direction
from the sound collection device is to be emphasized and collected.
10-04-2019
1
[0002]
In the prior art, as shown in FIG. 1, M microphones mic.1 to mic.M placed at M different positions
(p1, q1) to (pM, qM) on the xy plane are used. Sound generated by a sound source in the direction
of an arbitrary angle θS is treated as the signal, and sound generated in other directions is
treated as noise, so that only the signal is emphasized and collected at a high SNR
(signal-to-noise ratio). FIG. 2 is a block diagram showing the configuration of the conventional
enhanced sound collection method. As in equation (1), a signal ym(n) is obtained by applying a
delay Dm to the signal xm(n) (m = 1, ..., M) received by the microphone m placed at the position
(pm, qm).
[0003]
ym(n) = xm(n − Dm) (1)
Here, the delay amount Dm can be derived by equation (2) from the direction θS of the desired
sound source, which is given in advance.
[0004]
Dm = (dm / c) sin θS (2)
where c is the speed of sound, and dm is the distance between the microphone m and the reference
point as seen by the sound wave arriving from the θS direction in the figure. It is given by
equation (3).
[0005]
dm = pm sin θS + qm cos θS (3)
Next, the obtained signals ym(n) are added as shown in equation (4) to obtain a signal z(n) in
which the sound emitted from the desired position is emphasized.
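The delay-and-sum step of equations (1) and (4) can be sketched as follows. This is only an illustration under stated assumptions: the delays are taken as already rounded to whole, non-negative sample counts, and equation (4), whose body does not appear in this text, is assumed to be a plain sum over the M microphones. The function name `delay_and_sum` is hypothetical.

```python
def delay_and_sum(signals, delays):
    """Delay each microphone signal and add the results.

    signals: list of M equal-length sample lists x_m(n)
    delays:  list of M non-negative integer delays D_m (in samples)

    Assumed form: y_m(n) = x_m(n - D_m) and z(n) = sum over m of y_m(n).
    Samples before the start of a signal are treated as zero.
    """
    length = len(signals[0])
    z = [0.0] * length
    for x, d in zip(signals, delays):
        for n in range(d, length):
            z[n] += x[n - d]  # y_m(n) = x_m(n - D_m)
    return z


# Two microphones hear the same impulse one sample apart;
# delaying the earlier one aligns both before the sum.
aligned = delay_and_sum([[1, 0, 0, 0], [0, 1, 0, 0]], [1, 0])
```

In this example the impulses add coherently at n = 1, which is the emphasis effect the prior art relies on for sound arriving from the steered direction.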
[0006]
The above is the conventional emphasized sound collection method (Non-Patent Document 1).
As shown in FIG. 3, the directivity characteristic formed by this prior art has regions called
side lobes SB, with relatively large gain, in the vicinity of the main beam BM, so noise cannot be
sufficiently suppressed.
To keep the gain of the side lobes SB as small as possible, it is necessary to increase the
number of microphones and to enlarge the microphone array (Non-Patent Document 1).
Oga Juro, Yamazaki Yoshio and Kanada Toyo co-authors, Acoustics system and digital
processing, Institute of Electronics, Information and Communication Engineers, pp. 181-186,
section 7.1.2.
[0007]
When the directivity of a sound collection device is steered toward a specific direction using the
prior art, the sound emitted from that direction is emphasized and the sound emitted from other
directions is suppressed. However, the directivity formed by the prior art has side lobes.
Therefore, there is a problem that sound arriving from directions that should be suppressed is
picked up without being sufficiently suppressed. For this reason, when a noise source emitting a
very loud sound lies in a direction other than that of the sound source to be emphasized, the
prior art sound collection device cannot obtain a sufficient emphasizing effect on the desired
sound source. Further, in the prior art, reducing the side lobes requires more microphones and a
larger microphone array, which makes installation and transportation difficult in practical use.
Furthermore, since the directivity of the conventional sound pickup apparatus changes with
frequency, there is a problem that a sufficient enhancement effect cannot be obtained depending on
the frequency structure of the desired sound and the noise.
[0008]
The present invention has been made to solve the above problems. Its object is to realize an
apparatus that enhances and collects sound from a desired sound source with a higher SNR than the
prior art without enlarging the microphone array.
[0009]
The specific direction sound pickup device of the present invention includes a plurality of beam
former units, a plurality of frequency domain conversion units, a specific direction selection
unit, a signal amount estimation unit, a gain coefficient calculation unit, and a multiplication
unit.
Each beam former unit uses the output signals of a microphone array made up of a plurality of
microphones to emphasize and pick up sound coming from an angular region in a different direction.
Each frequency domain conversion unit converts the angle region signal collected by the
corresponding beam former unit into a frequency domain signal divided into a plurality of band
components. The specific direction selection unit selects, from the frequency domain signals
output by the frequency domain conversion units, the specific direction frequency domain signal
belonging to the angular region of the desired direction. The signal amount estimation unit
includes a region aggregation unit, an inverse matrix operation unit, and a multiplication unit.
The region aggregation unit obtains an aggregated power vector composed of the signal amount of
the specific direction frequency domain signal and the signal amount of the angle region signals
in the other directions. The inverse matrix operation unit calculates the inverse of the
aggregation gain matrix obtained from the directivity characteristics of the beam former units.
The multiplication unit of the signal amount estimation unit multiplies the aggregated power
vector by the inverse matrix to obtain an estimated value of the sum of the frequency domain
signals. The gain coefficient calculation unit calculates a gain coefficient for each frequency
band on the basis of the ratio between the signal amount of the specific direction frequency
domain signal and the total amount of the frequency domain signals. The multiplication unit
multiplies the signal amount of each corresponding frequency band of the specific direction
frequency domain signal by the gain coefficient calculated by the gain coefficient calculation
unit.
[0010]
According to the specific direction sound collection device of the present invention, in order to
improve the enhancement effect when emphasizing and picking up the sound emitted by a sound source
in the desired direction, the power of the sound signal emitted from each sound source is
estimated from the results of processing the signals received by the microphone array with a
plurality of beam former units, and the desired sound signal is enhanced using a non-linear filter
coefficient that emphasizes the signal in the sound collection region. Therefore, it is not
necessary to increase the number of microphones or the size of the microphone array. In addition,
the emphasis effect can be improved with a small-scale system that is easy to install and
transport in practical use.
[0011]
Further, in the processing of the specific direction sound pickup device of the present invention,
the processing in the inverse matrix calculation means requires the most calculation time. Since
the signal amount estimation unit of the specific direction sound collection device of the present
invention uses a two-dimensional aggregated power vector, the aggregation gain matrix obtained
from the directivity characteristic of the beam former is also 2 rows and 2 columns. Therefore,
the amount of calculation of the entire process can be greatly reduced.
[0012]
Principle
FIG. 4 shows an example of the arrangement of the microphone array of the specific direction sound
pickup device of the present invention. In the present invention, as shown in FIG. 4, the sound
pickup area is divided into a plurality of direction areas, the directivity of the microphone
array is steered toward each direction area in turn, and the signals received in this way are
used. The power (also referred to as the "signal amount") of the signal processed by the
microphone array changes, compared with the power before processing, according to the direction in
which the sound source is present. In the present invention, this amount of change in power is
used to estimate the power of the signal coming from each direction area. Then, from the
estimated power, a non-linear filter for enhancing the signal coming from a previously given
direction area is constructed (a gain coefficient is determined), and the filtered signal is
obtained as the final output signal. Also, in order to reduce the amount of calculation, the
direction areas are aggregated in the power estimation described above.
[0013]
Specific embodiments will be described below. Components having the same function and steps
performing the same processing are assigned the same reference numerals, and redundant description
is omitted.
First Embodiment
First, the general outline of the present invention will be described. FIG. 5 shows an example of
the overall configuration of the specific direction sound collecting device of the present
invention. FIG. 6 shows an example of the processing flow of the specific direction sound pickup
device of the present invention. The signals xm(n) (m = 1, 2, ..., M) received by the microphone
array 11, which is composed of M (M ≥ 2) microphones, are input to the Q beam former units 12-1 to
12-Q. Here, n represents the sample number of the discrete time signal.
[0014]
In each of the beam former units 12-1 to 12-Q, a directional beam BM such as that shown in FIG. 7
is directed to one of the Q direction regions Θ1 to ΘQ given in advance, the sound emitted from
that direction region is emphasized and collected, and the result is output (S12-1 to S12-Q). The
output signals y1(n), y2(n), ..., yQ(n) of the beam former units 12-1 to 12-Q are input to the
frequency domain conversion units 13-1 to 13-Q, respectively. The frequency domain conversion
units 13-1 to 13-Q decompose the input signal into frames of short duration (for example, about
256 samples at a sampling frequency of 16000 Hz) and perform a discrete Fourier transform on each
frame. The resulting Ω frequency components are output as the output signals Y1(ω, l),
Y2(ω, l), ..., YQ(ω, l) (S13-1 to S13-Q). The frequency domain signals are input to both the
signal amount estimation unit 14 and the specific direction selection unit 15.
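The framing and discrete Fourier transform step can be sketched as follows. The text gives only the frame length (about 256 samples at 16000 Hz), so this sketch assumes non-overlapping frames and no analysis window; both are simplifications, and the function names are hypothetical. The direct DFT is written out for clarity rather than speed.

```python
import cmath


def split_frames(signal, frame_len):
    """Cut a signal into consecutive, non-overlapping frames (assumption)."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def dft(frame):
    """Direct discrete Fourier transform of one frame."""
    size = len(frame)
    return [
        sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / size) for n in range(size))
        for k in range(size)
    ]


def to_frequency_domain(signal, frame_len=256):
    """Return Y[l][omega]: one list of frequency components per frame l."""
    return [dft(frame) for frame in split_frames(signal, frame_len)]


# A constant frame has all of its energy in the zero-frequency component.
spectra = to_frequency_domain([1, 1, 1, 1, 0, 0, 0, 0], frame_len=4)
```

A practical implementation would normally use an FFT routine and overlapping windowed frames; those choices are outside what the description specifies.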
[0015]
The signal amount estimation unit 14 obtains, from the output signal power of the beam former
units 12-1 to 12-Q, the power component of the sum of the sound signals emitted from the sound
sources in the direction regions Θ1 to ΘQ, and outputs the result grouped into a single signal
power vector Xest(ω, l) (S14).
[0016]
The specific direction selection unit 15 selects the output of the beam former unit whose
directional beam is directed to the direction area to be emphasized, and outputs it as YS(ω, l)
(S15).
[0017]
The gain coefficient calculation unit 16 calculates a gain coefficient R(ω, l) from the input
signal power vector Xest(ω, l) and outputs it (S16).
The gain coefficient R(ω, l) is input to the multiplication unit 17.
The multiplication unit 17 multiplies the input gain coefficient R(ω, l) by the output YS(ω, l) of
the specific direction selection unit 15 for each component of the same frequency and outputs the
result (S17). The output signal YSR(ω, l) of the multiplication unit 17 is input to the inverse
frequency domain conversion unit 18, which performs an inverse discrete Fourier transform and
outputs the signal y(n) restored to a time signal (S18). This signal y(n) is the signal picked up
by the device of the present invention with the desired sound emphasized.
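Steps S17 and S18 for a single frame can be sketched as follows: the gain is applied per frequency component and the frame is returned to the time domain. The function names are hypothetical, and the sketch assumes a real-valued time signal and one gain value per frequency component. How consecutive frames are joined is not described in the text and is not shown.

```python
import cmath


def apply_gain(ys_frame, gain_frame):
    """YSR(omega, l) = R(omega, l) * YS(omega, l), component by component."""
    return [r * y for r, y in zip(gain_frame, ys_frame)]


def inverse_dft(spectrum):
    """Direct inverse DFT; the real part is kept (real signal assumed)."""
    size = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * n / size)
            for k in range(size)).real / size
        for n in range(size)
    ]


# A frame whose only component is at zero frequency, attenuated by half.
restored = inverse_dft(apply_gain([4, 0, 0, 0], [0.5, 0.5, 0.5, 0.5]))
```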
[0018]
The details of the beamformer units 12-1 to 12-Q, the signal amount estimation unit 14, the
specific direction selection unit 15, and the gain coefficient calculation unit 16 will be
sequentially described below using another figure.
[0019]
(Beam Former Unit)
FIG. 8 shows the configuration of one of the beam former units 12-1 to 12-Q. The same processing
is performed in all beam former units. The input signals xm(n) (m = 1, 2, ..., M) are input to the
filter processing units FC1 to FCM. The filter processing units FC1 to FCM output the signal
x'qm(n) obtained by applying the filter coefficient Wqm(n), which is given in advance (the way it
is determined is described later), in the convolution operation shown in equation (5).
[0020]
The output signals of the filter processing units FC1 to FCM are input to the addition unit ADD.
The addition unit ADD adds the input signals as shown in equation (6) to obtain the output signal
yq(n) (q = 1, ..., Q) of the beam former unit.
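The filter-and-sum structure of one beam former unit can be sketched as follows. The bodies of equations (5) and (6) do not appear in this text, so the sketch assumes the usual forms: a finite convolution x'qm(n) = Σk Wqm(k) xm(n − k) followed by a sum over the M microphones. The function names and the zero padding before the start of the signal are assumptions.

```python
def fir_filter(x, w):
    """Convolve one microphone signal with its filter coefficients.

    Assumed form of equation (5): x'(n) = sum over k of w(k) * x(n - k),
    with samples before n = 0 treated as zero.
    """
    return [
        sum(w[k] * x[n - k] for k in range(len(w)) if n - k >= 0)
        for n in range(len(x))
    ]


def filter_and_sum(mic_signals, filters):
    """Assumed form of equation (6): y_q(n) = sum over m of x'_qm(n)."""
    filtered = [fir_filter(x, w) for x, w in zip(mic_signals, filters)]
    return [sum(samples) for samples in zip(*filtered)]


# Microphone 1 passes through unchanged; microphone 2 is delayed by one sample.
y_q = filter_and_sum([[1, 2, 3], [1, 1, 1]], [[1, 0], [0, 1]])
```

Running Q such units, each with its own filter set, yields the Q outputs y1(n) to yQ(n) that feed the frequency domain conversion units.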
[0021]
Here, the filter coefficients Wqm(n) are designed so that the directivity characteristic
Dq(ω, θ) of each of the beam former units 12-1 to 12-Q, shown in the figure, emphasizes the sound
emitted from the q-th direction region Θq given in advance and suppresses the sound emitted from
other directions.
[0022]
(Signal Amount Estimating Unit)
FIG. 9 shows the configuration of the signal amount estimating unit 14.
The frequency components Y1(ω, l), Y2(ω, l), ..., YQ(ω, l) input to the signal amount estimation
unit 14 are input to the power calculation units PW-1 to PW-Q, respectively. The signal power
values |Y1(ω, l)|^2, |Y2(ω, l)|^2, ..., |YQ(ω, l)|^2 are output and input to the area aggregation
unit 14A (SPA in FIG. 6). The area aggregation unit 14A obtains the average power of the signals
emitted from the set S of areas to be picked up and the average power of the signals emitted from
the set N of areas to be suppressed, and outputs them as the vector Y(ω, l) (S14A in FIG. 6).
[0023]
Here, NS indicates the number of areas included in the set S, and NN indicates the number of
areas included in the set N. Also, all direction areas (1 to Q) are determined in advance to belong
to the set S or the set N. For example, when Q = 4, the set S and the set N may be determined as S
= {1, 2} and N = {3, 4}.
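The aggregation performed by the area aggregation unit 14A can be sketched for one frequency component of one frame as follows. The defining equation is not present in this text, so the sketch follows the verbal description: the first element is the average of |Yq|^2 over the set S and the second is the average over the set N. The function name is hypothetical, and direction areas are numbered from 1 as in the example S = {1, 2}, N = {3, 4}.

```python
def aggregate_power(y_components, pickup_set, suppress_set):
    """Return the two-element power vector [mean over S, mean over N].

    y_components: the Q complex values Y_1 .. Y_Q for one (omega, l)
    pickup_set:   1-based indices of areas to be picked up (set S)
    suppress_set: 1-based indices of areas to be suppressed (set N)
    """
    power = [abs(y) ** 2 for y in y_components]
    mean_s = sum(power[q - 1] for q in pickup_set) / len(pickup_set)
    mean_n = sum(power[q - 1] for q in suppress_set) / len(suppress_set)
    return [mean_s, mean_n]


# Q = 4 with S = {1, 2} and N = {3, 4}, as in the example above.
y_vec = aggregate_power([1 + 0j, 3 + 0j, 2j, 0j], {1, 2}, {3, 4})
```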
[0024]
The beam former output power vector Y(ω, l) is input to the multiplication unit 14B. The power
estimation matrix T^-1(ω), which is the other input of the multiplication unit 14B, is the output
signal of the inverse matrix operation unit 14C. The aggregation gain matrix T(ω) defined by
equation (8) is input to the inverse matrix operation unit 14C, and the inverse matrix T^-1(ω) is
output (S14C in FIG. 6).
[0025]
Each element of the aggregation gain matrix T is a parameter obtained from the average value of
the directivity of each beam former unit in each direction area, as shown in FIG. 10. For example,
the average values of the directivity given by equation (9) are used.
[0026]
Here, αpq is the average value of the directivity characteristic of the beam former unit 12-p over
the q-th direction area.
The directivity characteristic can be obtained from the filter coefficients Wqm(n) using, for
example, the technique described in Non-Patent Document 1.
[0027]
The multiplication unit 14B multiplies the input beam former output power vector Y(ω, l) by the
power estimation matrix T^-1(ω) for each frequency component, as shown in equation (10), and
outputs the estimated signal power vector Xest(ω, l) (S14B in FIG. 6).
[0028]
Xest(ω, l) = T^-1(ω) Y(ω, l) (10)
In this way, the signal amount estimation unit 14 estimates the power (signal amount) of the
signal with the direction areas aggregated, as described in the principle of the present
invention.
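Equation (10) with a 2 × 2 aggregation gain matrix can be sketched as follows, using the closed-form inverse of a 2 × 2 matrix. The function names are hypothetical, and the determinant check is an added safeguard that the text does not mention.

```python
def invert_2x2(t):
    """Closed-form inverse of a 2 x 2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = t
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("aggregation gain matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def estimate_power(t, y):
    """Xest = T^-1 Y for one frequency component."""
    inv = invert_2x2(t)
    return [
        inv[0][0] * y[0] + inv[0][1] * y[1],
        inv[1][0] * y[0] + inv[1][1] * y[1],
    ]


# If the true powers are [4, 2] and T leaks half of each region into the
# other beam, the observed vector is Y = T X = [5, 4]; inversion recovers X.
x_est = estimate_power([[1.0, 0.5], [0.5, 1.0]], [5.0, 4.0])
```

Since T(ω) depends only on the beam former design and not on the frame l, its inverse can be computed once per frequency and reused for every frame.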
[0029]
(Specific Direction Selection Unit)
FIG. 11 shows the configuration of the specific direction selection unit 15.
From among the frequency components Y1(ω, l) to YQ(ω, l) input from the frequency domain
conversion units 13-1 to 13-Q, the specific direction selection unit 15 selects the one
corresponding to the q-th direction region to be emphasized (q is one value chosen from 1, ..., Q)
and outputs it as YS(ω, l).
[0030]
YS(ω, l) = Yq(ω, l) (11)
(Gain Coefficient Calculation Unit)
FIG. 12 shows the configuration of the gain coefficient calculation unit 16.
The estimated signal power vector Xest(ω, l) input from the signal amount estimation unit 14 is
input to the vector element extraction unit 16A. As shown in equation (12), the estimated signal
power vector Xest(ω, l) has the sound collection region signal estimated power |S(ω, l)|^2 as its
first component and the suppression region signal estimated power |N(ω, l)|^2 as its second
component.
[0031]
Xest(ω, l) = [|S(ω, l)|^2, |N(ω, l)|^2]^T (12)
The vector element extraction unit 16A outputs the sound collection region signal estimated power
|S(ω, l)|^2 and the suppression region signal estimated power |N(ω, l)|^2, which are input to the
SN ratio estimation unit 16B. The SN ratio estimation unit 16B uses equation (13) to calculate and
output a gain coefficient R(ω, l) that emphasizes the signal in the desired direction region.
[0032]
Here, α is a parameter for adjusting the emphasis of the signal in the desired direction area by
the gain coefficient R (ω, l), and may be, for example, α = 1/2.
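The body of equation (13) does not appear in this text, so the following sketch rests on an assumed form: R = (|S|^2 / (|S|^2 + |N|^2))^α, that is, the ratio of the sound collection region power to the total power raised to the power α. This form is consistent with the description of a gain based on the ratio of the specific direction signal amount to the total, but it is an assumption and not necessarily the exact formula of the patent. Clamping negative power estimates to zero and the small floor in the denominator are also added safeguards, and the function name is hypothetical.

```python
def gain_coefficient(s_power, n_power, alpha=0.5, floor=1e-12):
    """Assumed gain: (S / (S + N)) ** alpha for one (omega, l).

    s_power: estimated sound collection region power |S|^2
    n_power: estimated suppression region power |N|^2
    alpha:   emphasis parameter (the text suggests alpha = 1/2 as an example)
    """
    s = max(s_power, 0.0)  # estimates from T^-1 Y can come out negative
    n = max(n_power, 0.0)
    return (s / max(s + n, floor)) ** alpha


# Desired power three times the noise power: gain = sqrt(0.75).
r = gain_coefficient(3.0, 1.0, alpha=0.5)
```

Under this assumed form the gain stays between 0 and 1, approaching 1 where the desired region dominates and 0 where the suppressed region dominates.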
[0033]
As described above, according to the specific direction sound collection device of the present
embodiment, in order to improve the enhancement effect when the sound emitted from the sound
source in the desired direction is emphasized and collected, the power of the sound signal emitted
by each sound source is estimated from the results of processing the signals received by the
microphone array 11 with the plurality of beam former units 12-1 to 12-Q, and the desired sound
signal is enhanced using a gain coefficient (non-linear filter coefficient) that emphasizes the
signal in the sound collection region.
Therefore, it is not necessary to increase the number of microphones or the size of the
microphone array. In addition, the emphasis effect can be improved with a small-scale system
that is easy to install and transport in practical use.
[0034]
In addition, since the signal amount estimation unit 14 of the specific direction sound collection
device of the present embodiment uses a two-dimensional aggregated power vector, the aggregation
gain matrix obtained from the directivity characteristics of the beam former units 12-1 to 12-Q is
also 2 rows by 2 columns. Therefore, the amount of calculation of the entire process can be
greatly reduced.
Second Embodiment
The specific direction sound collection device according to the second embodiment changes the
processing procedure of the signal amount estimation unit 14, the gain coefficient calculation
unit 16, and the multiplication unit 17 of the specific direction sound collection device
according to the first embodiment. FIG. 13 shows a configuration example of the specific direction
sound pickup device of the second embodiment. It differs from the first embodiment in three
points: band division units 19-1 to 19-Q are provided after the frequency domain conversion units
13-1 to 13-Q; the processing of the signal amount estimation unit 14, the gain coefficient
calculation unit 16, and the multiplication unit 17 is performed for each of the Ω frequency
bands; and a band combining unit 21 is provided after the multiplication units 17 of the
respective frequency bands to combine their outputs. FIG. 14 shows the configuration of the band
dividing unit, and FIG. 15 shows the configuration of the band combining unit.
[0035]
The aggregated gain matrix Tx (ω) of the signal amount estimating unit 14 of the same band
component collecting unit 20-x (where x is 1,..., Ω) according to the present embodiment may be
determined as in equation (14).
[0036]
Where Nx is the number of frequency bins included in the aggregated x-th band.
The other parts are the same as in the first embodiment.
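The body of equation (14) does not appear in this text. Since Nx is described as the number of frequency bins in the x-th band, one plausible reading is that the band matrix is the average of the per-bin aggregation gain matrices over the bins of that band. The sketch below assumes exactly that; the function name, the argument layout, and the averaging itself are assumptions.

```python
def band_average_matrix(t_per_bin, band_bins):
    """Average 2 x 2 matrices T(omega) over the bins of one band (assumed).

    t_per_bin: list indexed by frequency bin, each entry a 2 x 2 matrix
    band_bins: indices of the N_x bins that make up band x
    """
    n_x = len(band_bins)
    return [
        [sum(t_per_bin[w][i][j] for w in band_bins) / n_x for j in range(2)]
        for i in range(2)
    ]


# Two bins grouped into a single band.
t_band = band_average_matrix(
    [[[1.0, 0.0], [0.0, 1.0]], [[3.0, 2.0], [2.0, 3.0]]],
    [0, 1],
)
```

With one matrix per band instead of one per bin, the number of 2 × 2 inversions drops from the number of bins to the number of bands.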
[0037]
With this configuration, the specific direction sound collection device of the second embodiment
obtains the same effects as the specific direction sound collection device of the first
embodiment. Furthermore, the specific direction sound collection device according to the second
embodiment performs its calculations for each of the Ω frequency bands, so the amount of
calculation can be reduced.
[Experimental Example]
FIG. 16 shows an experimental result of the specific direction sound pickup device according to
the present invention. It shows the noise suppression amount in an experiment in which the
position of the desired sound source (female voice) was fixed at 0 degrees and the position of the
noise source (male voice) was moved through the directions shown in the figure in steps of 15
degrees. In FIG. 16, the noise suppression amount is larger toward the inside of the polar
coordinate system. In this experiment, the area to be picked up was set to 0 degrees to 90 degrees
and 270 degrees to 360 degrees, so the remaining directions (90 degrees to 270 degrees) form the
area to be suppressed. The microphone array used in this experiment consisted of four
unidirectional microphones placed at the vertices of a square with 24 cm sides.
[0038]
In the prior art, the noise suppression amount increases gradually with distance from the desired
sound source. In the method according to the present invention, by contrast, the noise suppression
amount is uniformly low within the region to be picked up and increases sharply once the boundary
with the region to be suppressed is crossed. Further, while the noise suppression amount of the
prior art is about 7 dB in its best direction, the method according to the present invention
achieves a noise suppression amount of 10 dB or more in most directions of the region to be
suppressed. This confirms that the method according to the present invention picks up sound
uniformly across the region to be picked up and, compared with the prior art, has high noise
suppression performance over the entire region to be suppressed.
[0039]
FIG. 17 shows an example of the functional configuration of a computer. The sound pickup
apparatus of the present invention can be realized by loading, into the recording unit 2020 of the
computer 2000, a program that causes the computer 2000 to operate as each component of the present
invention, and by operating the processing unit 2010, the input unit 2030, the output unit 2040,
and the like. The program can be loaded into the computer either by recording it on a
computer-readable recording medium and reading it from that medium, or by reading a program stored
on a server or the like into the computer through a telecommunication line or the like.
[0040]
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram for explaining the sound collection method of a microphone array according to
the prior art.
FIG. 2 is a block diagram for explaining the conventional specific direction sound collection
apparatus.
FIG. 3 is a diagram for explaining the directivity characteristic of the conventional specific
direction sound collection apparatus.
FIG. 4 is a diagram showing an example of the arrangement of the microphone array of the specific
direction sound collection apparatus of the present invention.
FIG. 5 is a diagram showing an example of the overall configuration of the specific direction
sound collection apparatus of the first embodiment.
FIG. 6 is a diagram showing an example of the processing flow of the specific direction sound
collection apparatus of the first embodiment.
FIG. 7 is a diagram for explaining the directional characteristic of the beam former unit used in
the present invention.
FIG. 8 is a diagram showing the configuration of a beam former unit.
FIG. 9 is a diagram showing the configuration of the signal amount estimation unit.
FIG. 10 is a diagram for explaining an example of the directional characteristic of the beam
former unit used in the present invention.
FIG. 11 is a diagram showing the configuration of the specific direction selection unit.
FIG. 12 is a diagram showing the configuration of the gain coefficient calculation unit.
FIG. 13 is a diagram showing a configuration example of the specific direction sound collection
apparatus of the second embodiment.
FIG. 14 is a diagram showing the configuration of a band division unit.
FIG. 15 is a diagram showing the configuration of the band combining unit.
FIG. 16 is a diagram showing the experimental result of the specific direction sound collection
apparatus of the present invention.
FIG. 17 is a diagram showing an example of the functional configuration of a computer.
Explanation of sign
[0041]
11 Microphone array
12-1 to 12-Q Beam former unit
13-1 to 13-Q Frequency domain conversion unit
14 Signal amount estimation unit
14A Region aggregation unit
14B Multiplication unit
14C Inverse matrix operation unit
15 Specific direction selection unit
16 Gain coefficient calculation unit
17 Multiplication unit
18 Inverse frequency domain conversion unit
19-1 to 19-Q Band division unit
20-1 to 20-Ω Same band component collection unit
21 Band combining unit