JPH11304906

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JPH11304906
[0001]
BACKGROUND OF THE INVENTION The present invention relates to a method of estimating the position of a sound source from the signals received by a plurality of microphones (hereinafter sometimes referred to simply as "microphones") in an automatic speaker-tracking camera, an automatic speaker-tracking directional sound collector, and the like, and to a recording medium therefor.
[0002]
2. Description of the Related Art This section first describes the principle of sound source position estimation, and then describes the delay-sum method and the cross-correlation method, which are conventional sound source position estimation methods.
[0003]
§1. Preliminary Description
First, the conditions necessary for estimating the sound source position are described. Estimating the position of a sound source from the signals received by a plurality of microphones is essentially equivalent to determining, by the congruence conditions of triangles, three or more triangles in space that share the sound source as a common vertex. Each triangle has one side on a straight line passing through two microphones, and the sound source is at
10-05-2019
1
the vertex opposite that side. There are three congruence conditions for triangles: three sides (SSS), two sides and the included angle (SAS), and two angles and the included side (ASA). All of these conditions require the length of at least one side; in sound source position estimation the positional relationship of the microphones is known, so this side is taken on a straight line passing through two or more microphones. For simplicity, the relationship between these congruence conditions and position estimation is considered below for the case where the sound source lies in a plane.
[0004]
Three sides (SSS). This condition means that a triangle is uniquely determined if the lengths of its three sides are known. In this case, the vertices of the triangle are the sound source and two microphones, and the distances between them are the lengths of the three sides.
[0005]
Two angles and the included side (ASA). This condition means that a triangle is uniquely determined if two angles and the length of the side between them are known. When this condition is used, three microphones are arranged in a straight line and the included side is taken on this line; its two end points are the midpoints between the central microphone and the microphones to its left and right. The remaining vertex is the sound source. The two angles in the condition are the angles at which the sound source is seen from the right and center microphone pair and from the left and center microphone pair.
[0006]
Two sides and the included angle (SAS). This condition means that a triangle is uniquely determined if two sides and the angle between them are known. One of the two sides is taken on the straight line connecting two microphones, with the midpoint of the two microphones and one of the microphones as its end points. The other side has the sound source and the microphone as its end points.
[0007]
Next, the symbols needed to describe the conventional methods are defined. FIG. 4 illustrates how sound waves are received by a plurality of microphones: 11 is a sound source, 12-1 to 12-M are microphones, 13 is an A/D converter, s(k) is the sound source signal at time k, M is the total number of microphones, d(m) (m = 1, 2, ..., M) is the distance between the sound source and the m-th microphone, and x(m, k) is the signal received by the m-th microphone at time k. In the present specification, time is treated as discrete and is represented by the integer k. Usually, in addition to the sound that reaches a microphone directly from the sound source, there is reflected sound that reaches the microphone after being reflected by walls, the floor, and so on, but reflected sound is not considered in the explanation of FIG. 4. Further, the positions of the microphones 12-1 to 12-M are assumed to be known.
[0008]
The sound velocity divided by the sampling frequency is called the normalized sound velocity and is denoted by c. The signal x(m, k) received by the m-th microphone at time k is equal to the sound source signal at a time d(m)/c earlier, so the following equation (1) holds.
x(m, k) = s(k − d(m)/c) ... (1)
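As a numerical illustration of equation (1), the following minimal Python sketch computes the normalized sound velocity and the propagation delay in samples. The function names and the 3.4 m distance are assumptions made for illustration; the 340 m/s sound velocity and 16 kHz sampling frequency are the values used in the example of §3.

```python
def normalized_sound_velocity(sound_velocity_m_s, sampling_frequency_hz):
    """Distance in metres that sound travels during one sampling period (the quantity c)."""
    return sound_velocity_m_s / sampling_frequency_hz


def propagation_delay_samples(distance_m, c):
    """Delay d(m)/c of equation (1), in samples, for a source-to-microphone distance d(m)."""
    return distance_m / c


c = normalized_sound_velocity(340.0, 16000.0)   # 0.02125 m per sample
delay = propagation_delay_samples(3.4, c)       # about 160 samples for a 3.4 m path
```

With these values, a microphone 3.4 m from the source receives the source signal roughly 160 samples late.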
[0009]
Equation (1) shows that the distance between the sound source and a microphone is converted into the time difference between the sound source signal and the signal received at that microphone. That is, if the time difference is known, the distance to the sound source is known, and the position of the sound source follows from the distances. In addition, if a reference microphone is chosen (in this specification, the microphone 12-1) and its received signal is regarded as a delayed sound source signal, then
x(m, k) = s(k − d(m)/c)
= s(k − d(m)/c + d(1)/c − d(1)/c)
= s(k − d(1)/c − (d(m) − d(1))/c)
= x(1, k − (d(m) − d(1))/c) ... (2)
In equation (1) the unknowns are the time differences between the sound source signal and the signal received by each microphone, whereas in equation (2) the unknowns are the time difference between the microphone 1 and the sound source and the time differences between the microphone 1 and the other microphones.
[0010]
FIG. 5 shows a plane wave incident on two microphones and explains the geometric meaning of the time difference between the signals of the two microphones. In FIG. 5, the broken line 21 represents an equiphase surface of the sound wave, and the figure depicts the incident sound wave reaching the microphone 12-1 first and then the microphone 12-2. From FIG. 5, the arrival time difference of the sound wave is the product of the microphone spacing and the cosine of the incident angle θ, divided by the normalized sound velocity, as in the following equation (3).
arrival time difference = microphone spacing × cos(θ) / c ... (3)
Rearranging equation (3) gives
θ = arccos(c × arrival time difference / microphone spacing) ... (4)
Therefore, the incident angle θ can be calculated if the arrival time difference and the microphone spacing are known.
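Equation (4) can be sketched in Python as follows. The function name and the clamping of the arccos argument are assumptions of this sketch; the clamp only guards against a measured time difference that slightly exceeds microphone spacing / c because of rounding or noise.

```python
import math


def incident_angle_deg(arrival_time_difference, mic_spacing_m, c):
    """Incident angle theta of equation (4), in degrees.

    arrival_time_difference is in samples and c is the normalized sound
    velocity in metres per sample.
    """
    ratio = c * arrival_time_difference / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # keep the arccos argument in its valid range
    return math.degrees(math.acos(ratio))
```

For example, with a 0.5 m spacing a time difference of zero corresponds to a wave arriving broadside (θ = 90 degrees), and the largest possible time difference, 0.5 / c samples, corresponds to θ = 0.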
[0011]
§2. Conventional methods
There are various conventional sound source position estimation methods; the present specification describes two of the simplest, the delay-sum method and the cross-correlation method.
[0012]
(Conventional Method 1) Delay-Sum Method In terms of the above classification, the delay-sum method uses the three-sides condition or the like. A plurality of positions where a sound source is likely to be present are assumed in advance, and the one that best satisfies a criterion is selected as the sound source position. The criterion is calculated not from the distances between the microphones and the sound source themselves, but from the differences in sound arrival time between the reference microphone and the other microphones. According to equation (2), the signals received at all the microphones are time-shifted copies of the signal received at the reference microphone. Consider the signal obtained by delaying or advancing these signals and adding them: its power is maximized when the signal of each microphone m (m = 1, 2, ..., M) is advanced by (d(m) − d(1))/c (or delayed by (d(1) − d(m))/c), so that all the microphone signals are in phase with x(1, k). In practice a signal cannot be advanced, so x(1, k) is delayed by Dsup (> d(m) − d(1)) and the signals of all the microphones are delayed so as to be in phase with x(1, k − Dsup/c). Thus, in the delay-sum method, the power of the signal obtained by delaying and adding the received signals is used as the criterion: prepared sets of delays are applied to the received signals, and the position corresponding to the set of delays that gives the maximum sum-signal power is taken as the sound source position. The specific position estimation procedure is described below.
[0013]
FIG. 6 illustrates the signal flow of the delay-sum method, where 31 is a delay unit and 32 is an adder. D(i, m) (i = 1, 2, ..., I) is the delay applied to the m-th microphone so that the signals of all the microphones are in phase when the sound source is at the i-th assumed sound source position, as defined by the following equation (5); I is the number of assumed sound source positions; and y(i, k) is the output signal of the adder for the delays D(i, m), referred to as the delay sum.
D(i, m) = Dsup − d(i, m)/c ... (5)
where d(i, m) is the distance between the i-th assumed sound source position and the microphone m.
[0014]
The signal x(m, k) received by each microphone 12 is delayed by D(i, m) in the delay unit 31 and then summed by the adder 32 to give the output signal y(i, k), which is calculated by the following equation (6).
y(i, k) = Σ x(m, k − D(i, m)) ... (6)
where Σ is taken over the microphone number m. The estimated sound source position is the position corresponding to the i that maximizes the delay-sum power E[|y(i, k)|^2] ... (7)
where |a|^2 denotes the square of a and E[a] denotes the average of a. These calculation procedures are shown in FIG.
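The delay-sum search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the specification: delays are non-negative integers in samples, the average E[.] is replaced by a mean over the available samples, and the white-noise test signal, the arrival delays, and all function names are hypothetical.

```python
import random


def delay_sum_power(signals, delays):
    """Average power E[|y(k)|^2] of the delay sum y(k) = sum over m of x(m, k - D(m))."""
    start = max(delays)                      # first k for which every k - D(m) is valid
    stop = min(len(x) for x in signals)
    total = 0.0
    for k in range(start, stop):
        y = sum(x[k - d] for x, d in zip(signals, delays))
        total += y * y
    return total / (stop - start)


def delay_sum_estimate(signals, candidate_delays):
    """Index i of the candidate delay set D(i, .) that gives the largest delay-sum power."""
    powers = [delay_sum_power(signals, delays) for delays in candidate_delays]
    return max(range(len(powers)), key=lambda i: powers[i])


# Synthetic data (assumption): a white-noise source arriving 3, 5 and 8 samples late.
random.seed(0)
source = [random.gauss(0.0, 1.0) for _ in range(400)]
arrival = [3, 5, 8]
signals = [[source[k - a] if k >= a else 0.0 for k in range(len(source))]
           for a in arrival]

# Three candidate delay sets; the second equals Dsup - arrival with Dsup = 8.
candidates = [[0, 0, 0], [5, 3, 0], [0, 3, 5]]
best = delay_sum_estimate(signals, candidates)   # index of the in-phase candidate
```

In a real application each candidate delay set would be derived from an assumed source position through equation (5), so the number of calls to `delay_sum_power` equals the number of assumed positions I.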
[0015]
(Conventional Method 2) Cross-Correlation Method In terms of the above classification, the cross-correlation method uses the two-angles-and-included-side condition or the like. In the cross-correlation method, the time difference between the signals of the reference microphone and each of the other microphones is taken to be the time difference that gives the maximum value of the cross-correlation function, and the angle of incidence is obtained from this time difference and the microphone spacing. The cross-correlation function r(τ, 1, m) of the signals of the reference microphone 12-1 and the microphone 12-m is defined by the following equation (8).
r(τ, 1, m) = E[x(1, k) x(m, k + τ)] ... (8)
Since no time difference can exceed the microphone spacing divided by the normalized sound velocity, the cross-correlation function is calculated for time differences τ in the range from −(microphone spacing / normalized sound velocity) to +(microphone spacing / normalized sound velocity). If there is no noise, the cross-correlation function takes its maximum value at τ(m) = (d(m) − d(1))/c, so the time difference that gives the maximum value of the cross-correlation function can be regarded as the arrival time difference between the microphones. When the cross-correlation method is used, a special microphone arrangement is chosen so that the sound source position can be calculated easily. As an example, the method generally called triangulation is described with reference to FIG. 8. In triangulation, as shown in FIG. 8, three microphones are arranged in a straight line. With the microphone 12-1 as the reference microphone, the delay times between the signal of the reference microphone and those of the microphones 12-2 and 12-3 are determined from the cross-correlation function. Then, from these two delay times and equation (4), the incident angle θ2 of the sound wave from the sound source with respect to the microphones 12-1 and 12-2 and the incident angle θ3 with respect to the microphones 12-1 and 12-3 are calculated. The triangle formed by the sound source, the midpoint of the microphones 12-1 and 12-2, and the midpoint of the microphones 12-1 and 12-3 is then uniquely determined by the two angles and the included side, and so the sound source position is determined. The above procedure is shown in FIG.
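The two building blocks of the cross-correlation method, the peak search of equation (8) and the triangulation from two angles and the included side, can be sketched in Python as follows. This is an illustration under assumptions: the average E[.] is a mean over the overlapping samples, the microphones lie on the x-axis, the angles are measured from the positive x-axis at the two midpoints, and the function names and test signal are hypothetical.

```python
import math
import random


def cross_correlation(x_ref, x_m, tau):
    """Sample estimate of r(tau) = E[x_ref(k) * x_m(k + tau)] over the valid overlap."""
    n = min(len(x_ref), len(x_m))
    start, stop = max(0, -tau), min(n, n - tau)
    return sum(x_ref[k] * x_m[k + tau] for k in range(start, stop)) / (stop - start)


def estimate_tdoa(x_ref, x_m, max_lag):
    """Lag in [-max_lag, max_lag] at which the cross-correlation is largest."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda tau: cross_correlation(x_ref, x_m, tau))


def triangulate(p_a, theta_a, p_b, theta_b):
    """Intersect two rays leaving (p_a, 0) and (p_b, 0) at angles theta_a, theta_b (radians)."""
    cot_a = math.cos(theta_a) / math.sin(theta_a)
    cot_b = math.cos(theta_b) / math.sin(theta_b)
    y = (p_b - p_a) / (cot_a - cot_b)
    return p_a + y * cot_a, y


# Synthetic data (assumption): the source reaches microphone 1 after 4 samples
# and microphone 2 after 9 samples, so the arrival time difference is 5 samples.
random.seed(1)
source = [random.gauss(0.0, 1.0) for _ in range(500)]
x1 = [source[k - 4] if k >= 4 else 0.0 for k in range(len(source))]
x2 = [source[k - 9] if k >= 9 else 0.0 for k in range(len(source))]
tdoa = estimate_tdoa(x1, x2, max_lag=10)
```

The lag range `max_lag` plays the role of microphone spacing / normalized sound velocity in the text. The angles passed to `triangulate` would come from equation (4), which assumes a plane wave, so the resulting position is approximate for nearby sources.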
[0016]
§3. Comparison between conventional methods
The amount of computation and the noise resistance of the two conventional methods, the delay-sum method and the cross-correlation method, are compared. First, the amount of computation excluding the averaging operation is compared. The main calculation of the delay-sum method is the calculation of the delay sum y(i, k) (i = 1, 2, ..., I) of equation (6), so its amount of computation is estimated to be the product of the number of microphones M and the number of assumed sound source positions I. In the cross-correlation method, on the other hand, the main calculation is equation (8), and its amount of computation is estimated to be about twice the product of (the number of microphones − 1) and the average microphone spacing divided by the normalized sound velocity. Comparing the two methods, both are roughly proportional to the number of microphones; they differ in that the delay-sum method is proportional to the number of assumed sound source positions I, whereas the cross-correlation method is proportional to the microphone spacing divided by the normalized sound velocity.
[0017]
The number of sound source positions to be assumed depends on the application, but applications that point a camera at the speaker's position or selectively collect the speaker's voice require high position resolution, and the number reaches several tens of thousands. For example, for a speaker-tracking camera, if the speaker's position is searched with the camera at the center over a horizontal angle range of 120 degrees and an elevation angle range of 30 degrees, each at 1 degree resolution, and over a distance of 5 m at 50 cm resolution, then approximately I = 120 × 30 × (5 / 0.5) = 36000.
[0018]
On the other hand, the microphone spacing divided by the normalized sound velocity in the cross-correlation method is at most about 100. For example, when the microphone spacing is 50 cm, the sampling frequency is 16 kHz, and the sound velocity is 340 m/s, 2 × 0.5 / (340 / 16000) = 47. From the above, the amount of computation of the delay-sum method is 100 to 1000 times that of the cross-correlation method.
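The operation-count estimates of this section can be reproduced with a short Python sketch. The function names are hypothetical, and the choice of M = 4 microphones for the ratio is an assumption borrowed from the later example in the specification.

```python
def delay_sum_ops(num_mics, num_positions):
    """Delay-sum estimate: number of microphones M times assumed positions I."""
    return num_mics * num_positions


def cross_corr_ops(num_mics, mic_spacing_m, sound_velocity_m_s, sampling_frequency_hz):
    """Cross-correlation estimate: (M - 1) times twice (spacing / normalized sound velocity)."""
    c = sound_velocity_m_s / sampling_frequency_hz
    return (num_mics - 1) * 2.0 * mic_spacing_m / c


positions = 120 * 30 * round(5 / 0.5)           # I = 36000 assumed positions
lags_per_pair = 2 * 0.5 / (340.0 / 16000.0)     # about 47 lags per microphone pair
ratio = delay_sum_ops(4, positions) / cross_corr_ops(4, 0.5, 340.0, 16000.0)
```

For M = 4 the ratio comes out at roughly 1000, the upper end of the 100 to 1000 range stated above.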
[0019]
Next, the noise resistance is compared. Although not considered in FIG. 4, when a sound source position is actually estimated, room reverberation and background noise are present and degrade the position estimation performance. If everything other than the direct sound from the sound source is collectively called noise and denoted n(m, k), the signal received by the microphone m at time k is
x(m, k) = s(k − d(m)/c) + n(m, k) ... (9)
[0020]
Noise resistance depends on the microphone arrangement and the frequency band of the sound source, so it is difficult to generalize; here, the ratio of the power of the sound source signal s to the absolute value of the other components is discussed for y of equation (6) and r of equation (8) when the microphone delay times are set correctly. In the delay-sum method, setting D(i, m) = Dsup − d(m)/c in equation (6) puts the sound source signal s in phase, as in the following equation (10).
[0021]
At this time, the power of the delay sum is given by equation (11), which involves the power of the sound source signal s. The Σ in the fourth term of the second row is taken over the microphone numbers m, m′ (m ≠ m′), and the other Σ are taken over m.
[0022]
Let us examine the right side of equation (11) in detail. The first term is a multiple of the power of the sound source signal; if there is no noise, only this term appears. The second term represents the correlation between the sound source signal and the noise. The noise contains components correlated with the sound source, such as reverberation, and the average value E[s(k − Dsup) n(m, k − D(i, m))] of this correlation is denoted η sn. The third term is the sum of the noise power of each microphone. Since the noise power of each microphone is constant with respect to the assumed sound source position i = 1, 2, ..., I, the third term has no effect on the position estimate. Finally, the fourth term represents the average of the cross-correlation of the noise, which is denoted η nn.
[0023]
Based on the above discussion, the meaningful part of the right side of equation (11) can be rewritten as follows. The ratio of the power of the sound source signal s to the absolute value of the rest is then as follows, where abs(x) represents the absolute value of x. Equation (13) shows that the power ratio of the sound source signal s increases in proportion to the number of microphones M.
[0024]
On the other hand, in the cross-correlation method, when there is no noise, s is in phase and r(τ, 1, m) is largest at τ = (d(m) − d(1))/c. In the presence of noise, r at τ = (d(m) − d(1))/c is as follows. As in the delay-sum method, the ratio of the power of the sound source signal s in r to the absolute value of the rest is as follows. This value does not depend on the number of microphones M and equals the value for the delay-sum method with M = 2. Therefore, comparing the delay-sum method and the cross-correlation method, the delay-sum method has a ratio of the sound source signal s that is larger by a factor of M/2, and is superior in noise resistance.
[0025]
As described above, among the conventional methods, the delay-sum method has the problem that its amount of computation is large, and the cross-correlation method has the problem that its noise resistance is inferior to that of the delay-sum method. The present invention solves the above problem of the delay-sum method by pre-estimating the sound source position with the cross-correlation method and thereby narrowing the sound source position search range, and makes it possible to obtain the same sound source position estimation performance as the delay-sum method while maintaining its noise resistance.
[0026]
According to the present invention, a method of processing signals received by a plurality of microphones and estimating the position of a sound source is characterized by having: a first step of calculating the cross-correlation functions of the received signals for all the microphones; a second step of obtaining, from the cross-correlation functions, the time difference that gives the maximum value of the cross-correlation function between the reference microphone and each of the other microphones, and setting it as the preliminary estimated time difference; a third step of searching, in the vicinity of the preliminary estimated time difference, for the time difference that maximizes the power of the delay sum for the microphones, and setting it as the estimated time difference; and a fourth step of calculating the sound source position based on the estimated time difference.
[0027]
DETAILED DESCRIPTION OF THE INVENTION The present invention will be described in detail below.
As a starting point, the calculation formula of the delay-sum method is shown again as equation (18). This formula can be expanded as equation (19), where the Σ in the second term on the right side is taken over the microphone numbers m, m′ (m ≠ m′), and the other Σ is taken over m. In equation (19), the first term on the right side is the power of the signal of the m-th microphone, which is the same for every sound source position i and does not affect the result of equation (18), which finds the maximum value.
[0028]
The second term on the right side is the sum of M(M − 1)/2 cross-correlation functions. If each cross-correlation function E[x(m, k − D(i, m)) x(m′, k − D(i, m′))] is calculated over the range that D(i, m) − D(i, m′) can take as the sound source position i is varied, that is, at most from −(distance between microphones m and m′ / normalized sound velocity) to +(distance between microphones m and m′ / normalized sound velocity), the required value is always contained in it. Thus, the delay-sum power can be evaluated by a sum of cross-correlation functions.
[0029]
The amount of computation of the delay-sum method is large because the number of sound source positions that must be assumed grows in order to obtain sufficient performance. To solve this problem, the present invention first estimates the time differences between the microphones roughly (that is, roughly estimates the sound source position) using the cross-correlation function, and then searches in the vicinity of those time differences for the time differences that maximize the delay-sum power, as in the delay-sum method. Finally, the position is calculated back from the time differences thus determined.
[0030]
Specifically:
Step 1: Calculate the cross-correlation functions.
Step 2: Estimate the arrival time difference of the sound wave between the microphone 12-1 and the microphone 12-m by the cross-correlation method. This time difference is denoted τ(m).
Step 3: In the vicinity of the delay times obtained in Step 2,
τ(m) − δτ(m) ≦ t(m) ≦ τ(m) + δτ(m) (where δτ(m) > 0) ... (20)
search for the t(m) that maximizes the following equation (21). Typical values of δτ are between 1 and 5.
Σ E[x(m, k − t(m)) x(m′, k − t(m′))] ... (21)
Step 4: Determine the sound source position from the arrival time differences t(m) obtained in Step 3. The conversion from time differences to position is carried out by the triangulation described for the cross-correlation method or by solving simultaneous equations for the position.
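Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation of the specification: the search in Step 3 is exhaustive over integer delays with the reference microphone fixed at t(1) = 0, the average E[.] is a mean over a fixed sample window, and the signals are indexed as x(m, k + t(m)) so that t(m) is directly the arrival time difference relative to microphone 1 (a sign convention chosen for this sketch). All names and the synthetic test signal are hypothetical. Step 4 is omitted.

```python
import random
from itertools import product


def corr(xa, xb, ta, tb, lo, hi):
    """Mean of xa[k + ta] * xb[k + tb] over k in [lo, hi)."""
    return sum(xa[k + ta] * xb[k + tb] for k in range(lo, hi)) / (hi - lo)


def coarse_tdoa(signals, max_lag, lo, hi):
    """Steps 1-2: cross-correlation peak lag of each microphone against microphone 1."""
    ref = signals[0]
    lags = range(-max_lag, max_lag + 1)
    return [0] + [max(lags, key=lambda t: corr(ref, x, 0, t, lo, hi))
                  for x in signals[1:]]


def refine_tdoa(signals, tau, delta, lo, hi):
    """Step 3: search t(m) in tau(m) +/- delta for the largest pairwise correlation sum."""
    m_count = len(signals)
    pairs = [(a, b) for a in range(m_count) for b in range(a + 1, m_count)]
    ranges = [range(t - delta, t + delta + 1) for t in tau[1:]]
    best, best_score = None, float("-inf")
    for combo in product(*ranges):
        t = (0,) + combo
        score = sum(corr(signals[a], signals[b], t[a], t[b], lo, hi) for a, b in pairs)
        if score > best_score:
            best, best_score = list(t), score
    return best


# Synthetic data (assumption): white-noise source, four microphones, mild sensor noise.
random.seed(2)
n = 600
source = [random.gauss(0.0, 1.0) for _ in range(n)]
arrival = [10, 13, 8, 15]                       # arrival delays in samples
signals = [[(source[k - a] if k >= a else 0.0) + random.gauss(0.0, 0.1)
            for k in range(n)] for a in arrival]

max_lag, delta = 10, 2
lo, hi = max_lag + delta, n - (max_lag + delta)   # keeps every index k + t in range
tau = coarse_tdoa(signals, max_lag, lo, hi)       # preliminary estimated time differences
t = refine_tdoa(signals, tau, delta, lo, hi)      # estimated time differences
```

The refinement evaluates only (2δτ + 1)^(M − 1) delay combinations, here 125, instead of one delay sum per assumed source position.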
[0031]
The above procedure is shown in FIG. 1. As an example, the means for carrying out the present invention is configured as the sound source position estimation unit 710 (see FIGS. 2 and 3) described later. Specifically, the sound source position estimation unit 710 is a computer device including a CPU (central processing unit) and its peripheral circuits. The procedure shown in FIG. 1 is stored as a control program in a semiconductor memory (ROM, RAM, etc.) or another recording medium (magnetic disk, etc.) in the sound source position estimation unit 710, and the CPU of the sound source position estimation unit 710 executes the sound source position estimation method of the present invention according to this control program.
[0032]
Equation (21) is obtained from the second term on the right side of equation (19) by omitting the coefficient 2 and replacing D(i, m) and D(i, m′) with t(m) and t(m′), respectively. In equation (19) the variable is the assumed sound source position i, whereas in equation (21) the variable is the delay time t(m). As described in the comparison of the conventional methods, the number of assumed sound source positions is several thousand to several tens of thousands. In equation (21), on the other hand, the number of delay times is at most several thousand for M = 4 or so.
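One way to read this count is sketched below in Python. It assumes an exhaustive search over integer delays in which the reference microphone is fixed and each of the other M − 1 microphones takes 2δτ + 1 candidate delays; the specification does not state this formula, so the function is an illustration only.

```python
def num_delay_candidates(num_mics, delta_tau):
    """Number of delay combinations when M - 1 microphones each take 2*delta_tau + 1 values."""
    return (2 * delta_tau + 1) ** (num_mics - 1)


# Candidate counts for M = 4 over the typical range of delta_tau (1 to 5).
counts = [num_delay_candidates(4, d) for d in range(1, 6)]
```

Under this assumption, M = 4 gives between 27 and 1331 combinations for δτ from 1 to 5, which is consistent in order of magnitude with the figure quoted above.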
[0033]
The amount of computation and the noise resistance of the present invention are now described. To estimate the amount of computation, first, the calculation of the cross-correlation functions in Step 1 requires a number of operations equal to the product of M(M − 1) and the average microphone spacing divided by the normalized sound velocity. In addition, Steps 2 to 4 are executed for the sound source position estimation, which requires further operations. Let us compare this amount of computation with that of the delay-sum method. For example, with M = 4, δτ(m) = 2, and the other conditions the same as in the example of "§3. Comparison between conventional methods", the delay-sum method requires 36000 × M = 144000 operations, whereas the present invention requires about one hundredth of that. As for noise resistance, since the present invention evaluates the power of the delay sum in the same manner as the delay-sum method, it has the same noise resistance as the delay-sum method and is superior to the cross-correlation method.
[0034]
Supplementary matters for implementing the present invention are described below. The time difference has so far been described, for the cross-correlation function and elsewhere, as an integer time whose unit is the reciprocal of the sampling frequency, but integer time may not give sufficient resolution. In such a case, the cross-correlation function is interpolated and Steps 2 and 3 are performed again. Since interpolation also requires the cross-correlation values around the value to be interpolated, the range over which the cross-correlation is calculated must be extended by the amount used in interpolation.
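The specification does not say which interpolation is used. As one common choice, the following Python sketch fits a parabola through the cross-correlation peak and its two neighbours to obtain a fractional-sample peak position; the function name and the choice of parabolic interpolation are assumptions.

```python
def parabolic_peak_offset(r_prev, r_peak, r_next):
    """Fractional offset of the vertex of the parabola through lags -1, 0 and +1.

    r_peak is the cross-correlation at the integer peak lag; r_prev and r_next
    are the values one lag before and after it.
    """
    denom = r_prev - 2.0 * r_peak + r_next
    if denom == 0.0:
        return 0.0          # flat neighbourhood: keep the integer lag
    return 0.5 * (r_prev - r_next) / denom


# Samples of -(x - 0.3)^2 at x = -1, 0, 1; the true peak is at x = 0.3.
offset = parabolic_peak_offset(-1.69, -0.09, -0.49)
```

The refined time difference is the integer peak lag plus this offset. Because the neighbours at lag ± 1 are needed even at the edge of the search range, the range must be widened by one lag on each side, as noted above.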
[0035]
The size of δτ(m) contributes greatly to the amount of computation of the present invention. If a more accurate τ(m) is obtained in Step 2, a smaller δτ(m) can be chosen and the amount of computation can be reduced. To obtain a more accurate τ(m) in Step 2, the microphones other than the reference microphone can be divided into two or more groups, and the delay-sum method or Steps 2 and 3 of the present invention can be applied to each group together with the reference microphone.
[0036]
A sound wave with a frequency below about 300 to 500 Hz has a longer wavelength than a higher-frequency sound wave, so its amplitude changes less for the same arrival time difference, and it therefore carries less information useful for direction estimation. Nevertheless, the sound source (voice) has large power in this band, and the band overlaps with the low-order resonance frequencies of the room, which have long decay times; as a result, reflected sound waves arriving from directions other than that of the sound source, such as from walls and the ceiling, increase and cause errors in the estimated sound source direction. For these reasons, the frequency band of the signals used to calculate the cross-correlation function is limited to about 300 to 500 Hz and above.
[0037]
The embodiment of the present invention has been described in detail with reference to the drawings, but the specific configuration is not limited to this embodiment, and design changes and the like that do not depart from the scope of the present invention are also included in the present invention.
[0038]
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS The first embodiment is an automatic speaker-tracking directional sound collector.
FIG. 2 shows the functional configuration of an automatic speaker-tracking directional sound collector to which the method of the present invention is applied. The speech signals are supplied as the input signals x(k, m) to the sound source position estimation unit 710 and to the delay units 31; the outputs of the delay units 31 are scaled in amplitude by the weights 720 and then summed by the adder 32 to give the output y(k) of the automatic speaker-tracking directional sound collector. The sound source position estimation unit 710 estimates the position of the sound source and sends the estimated sound source position to the delay/weight calculation unit 730, which determines the delay times and the weights so as to maximize the signal-to-noise ratio of the output y(k).
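The delay, weight, and add path of FIG. 2 can be sketched in Python as follows. The function name, the integer delays in samples, and the equal weights in the example are assumptions; the specification leaves the choice of delays and weights to the delay/weight calculation unit 730 and does not give a formula for them.

```python
def weighted_delay_sum(signals, delays, weights):
    """Output y(k) = sum over m of w(m) * x(m, k - D(m)) for every valid k."""
    start = max(delays)
    stop = min(len(x) for x in signals)
    return [sum(w * x[k - d] for x, d, w in zip(signals, delays, weights))
            for k in range(start, stop)]


# Two microphones (assumption): the second receives the same ramp 2 samples later,
# so the first channel is delayed by 2 samples to bring both into phase.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
y = weighted_delay_sum([x1, x2], delays=[2, 0], weights=[0.5, 0.5])
```

With the delays set from the estimated source position, the speech components add in phase while uncorrelated noise does not, which is why an inaccurate position estimate degrades the output.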
[0039]
The automatic speaker-tracking directional sound collector is a device for selectively collecting only the speaker's voice in a high speed voice communication system such as a video conference system. Conventional desktop microphones have the problem that unpleasant sounds, such as paper or a pen striking the desk, are easily picked up, so sound collection by microphones placed around walls, the ceiling, a display, and so on is required. In that case, the farther the microphones are from the speaker, the lower the signal-to-noise ratio per microphone. To compensate for this, the sound must be received by a plurality of microphones and the signals must be appropriately delayed, weighted, and added, and the sound source position must be estimated in order to obtain the appropriate delay times and weights.
[0040]
If the estimated sound source position is incorrect, the power of the high-frequency band of the output voice drops and the sound becomes muffled, so more accurate sound source position estimation is required. Compared with the delay-sum method, which is a conventional method, the present invention requires far less computation, so the resolution can be raised further on the same processing device and the sound source position can be estimated more accurately. Compared with the cross-correlation method, which is also a conventional method, the present invention has superior noise resistance, so the sound source position can be estimated more accurately. As a result, the method of the present invention makes it possible to collect higher-quality voice than the conventional methods.
[0041]
The second embodiment is an automatic speaker-tracking video camera. FIG. 3 shows the functional configuration of an automatic speaker-tracking video camera to which the method of the present invention is applied. The audio signals are supplied as the input signals x(k, m) to the sound source position estimation unit 710, which estimates the position of the sound source and sends the estimated sound source position to the camera control unit 810. The camera control unit 810 issues a control signal to the video camera 820, and the video camera 820 changes its horizontal angle, elevation angle, and zoom according to the control signal.
[0042]
The automatic speaker-tracking video camera is a device for automatically bringing the speaker into the field of view of the video camera in a video conference system or the like. In a conference with several participants, a conventional fixed video camera has the problem that it may not be clear who the speaker is. When a person operates the video camera, there is the problem that it takes time and effort. For these reasons, a video camera that automatically follows the speaker is needed. To frame a person properly in the field of view of a video camera, the position of the sound source must be known with high accuracy. If the estimated sound source position is incorrect, the speaker may fall outside the frame, or the image of the speaker may be too small to serve the purpose.
[0043]
Compared with the delay-sum method, which is a conventional method, the present invention requires far less computation, so the resolution can be raised further on the same processing device and the sound source position can be estimated more accurately. Compared with the cross-correlation method, which is also a conventional method, the present invention has superior noise resistance, so the sound source position can be estimated more accurately. As a result, the method of the present invention makes it possible to frame the speaker in the field of view of the video camera more appropriately than the conventional methods.
[0044]
The third embodiment is an automatic abnormal-sound tracking surveillance camera. This embodiment is the second embodiment with the speaker replaced by an abnormal sound source, so a detailed description is omitted.
[0045]
According to the present invention, the amount of computation can be reduced compared with the delay-sum method, which is a conventional method, while maintaining the sound source estimation accuracy. In other words, given the same arithmetic device, the present invention makes it possible to estimate the position more accurately by raising the resolution and to expand the sound source search range.