Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JPH1118194
[0001]
BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a
microphone array apparatus which arranges a plurality of microphones and performs sound
source position detection, target sound emphasis, noise suppression and the like by signal
processing. Such a microphone array device can, for example, arrange a plurality of
omnidirectional microphones and obtain directivity equivalently through target sound emphasis,
noise suppression, and the like. Also, by detecting the sound source position from the phase
relationship of the output signals of the plurality of microphones, a video camera in a video
conference system, for example, can automatically be turned toward the speaker, so that the
video is transmitted together with the speaker's voice. In this case, the speech of the speaker can
be clarified by suppressing the surrounding noise, and can be emphasized by matching the
phases of the microphone signals and adding them. There is therefore a demand to stabilize the
operation of such a microphone array device.
[0002]
2. Description of the Related Art In a prior-art microphone array apparatus intended for noise
suppression, a filter is connected to each microphone, and the filter coefficients are set
adaptively or fixedly so as to minimize the noise components (see, for example, JP-A-5-111090).
Also, for the purpose of detecting the position of a sound source, a configuration is known in
which the phase relationship of the output signals of the microphones is determined in order to
measure the sound source direction and the distance to the sound source (see, for example,
JP-A-63-177087 or JP-A-4-236385).
10-05-2019
1
[0003]
Also, an echo canceller is known as a noise suppression technique. For example, as shown in FIG.
20, the transmission / reception interface unit 202 of a telephone set is connected to the
network 203, an echo canceller 201 is connected between the microphone 204 and the speaker
205, the voice of the talker is input to the microphone 204, and the other party's voice is
reproduced from the speaker 205, so that the two parties can talk to each other.
[0004]
At that time, the sound that travels from the speaker 205 to the microphone 204 along the
dotted-arrow path becomes an echo (noise) at the other party's telephone. Therefore, an echo
canceller 201 including a subtractor 206, an echo component generation unit 207, and a
coefficient calculation unit 208 is provided. The echo component generation unit 207 generally
uses a filter configuration that generates an echo component based on the signal that drives the
speaker 205. The subtractor 206 subtracts this echo component, and the coefficient calculation
unit 208 performs update control of the filter coefficients of the echo component generation
unit 207 so that the residual is minimized.
[0005]
The update of the filter coefficients c1, c2, ... cr of the echo component generation unit 207 in
this filter configuration can be obtained by applying the well-known steepest descent method.
For example, based on the output signal e of the subtractor 206 (the residual of the echo
component), the evaluation function is taken as J = e² ... (1), and update control of the filter
coefficients c1, c2, ... cr is performed so as to minimize J, giving an update of the form
ci ← ci + α * e * f(i) / fnorm (i = 1, ..., r) ... (2). Here, * represents multiplication, r represents
the filter order, f(1), ... f(r) represent the filter memory values (the output signals of the delay
units that delay in sample units), and the norm fnorm is fnorm = [f(1)² + f(2)² + ... + f(r)²]^1/2
... (3). Also, α is a constant that determines the speed and accuracy of convergence of the filter
coefficients to their optimum values.
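The update of equations (1)-(3) can be sketched as a single normalized steepest-descent step. This is an illustrative reconstruction, not the patent's literal equation: the function name `steepest_descent_update`, the step-size name `mu` (the constant α), and the choice of normalizing by fnorm are assumptions.

```python
def steepest_descent_update(c, f, e, mu=0.5, eps=1e-12):
    """One steepest-descent step on J = e**2 for the filter coefficients.

    c  : filter coefficients c1..cr
    f  : filter memory values f(1)..f(r) (delay-unit outputs)
    e  : residual signal of the subtractor 206
    mu : the constant alpha, controlling convergence speed and accuracy
    """
    fnorm = sum(v * v for v in f) ** 0.5   # equation (3)
    if fnorm < eps:                        # no input energy: leave c unchanged
        return list(c)
    return [ci + mu * e * fi / fnorm for ci, fi in zip(c, f)]
```

Each call nudges the coefficients in the direction that reduces the residual e, with the normalization keeping the step size independent of the input signal level.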
[0006]
In such an echo canceller 201, the order of the filter reaches several hundreds. An echo
canceller using the microphone array shown in FIG. 21 is therefore known. In the figure, 211 is
an echo canceller, 212 is a transmission / reception interface unit, 214-1 to 214-n are
microphones constituting a microphone array, 215 is a speaker, 216 is a subtractor, 217-1 to
217-n are filters, and 218 is a filter coefficient calculation unit.
[0007]
In this case, the sound from the speaker 215 enters the microphones 214-1 to 214-n along the
dotted-arrow path and becomes an echo, so the speaker 215 acts as a noise source. Therefore,
while the talker is not speaking, update control of the filter coefficients c11, c12, ... c1r, ... cn1,
cn2, ... cnr of the filters 217-1 to 217-n is performed using an evaluation function of the same
form as equation (1), giving equations (4) and (5).
[0008]
In this case, equation (4) gives the filter coefficients c11, c12, ... c1r of the filter 217-1, which
receives the output signal of the reference microphone, with the microphone 214-1 among the
plurality of microphones 214-1 to 214-n taken as the reference microphone, and equation (5)
gives the filter coefficients c21, c22, ... c2r, ... cn1, cn2, ... cnr of the filters 217-2 to 217-n,
which respectively receive the output signals of the microphones 214-2 to 214-n other than the
reference microphone. The subtractor 216 subtracts the output signals of the filters 217-2 to
217-n, corresponding to the other microphones, from the output signal of the filter 217-1,
corresponding to the reference microphone.
[0009]
FIG. 22 is an explanatory view of sound source position detection and target sound emphasis
processing according to the conventional example, in which 221 is a target sound emphasizing
unit, 222 is a sound source position detecting unit, 223 and 224 are delay units, 225 is a delay
sample number calculation unit, 226 is an adder, 227 is a correlation coefficient value
calculation unit, 228 is a position detection processing unit, and 229-1 and 229-2 are
microphones.
[0010]
The target sound emphasizing unit 221 includes delay units 223 and 224 (Z^-da and Z^-db), a
delay sample number calculation unit 225, and an adder 226, and the sound source position
detection unit 222 includes a correlation coefficient value calculation unit 227 and a position
detection processing unit 228.
The sound source position detection unit 222 obtains the correlation coefficient value r(i) of the
output signals a(j) and b(j) of the microphones 229-1 and 229-2 in the correlation coefficient
value calculation unit 227; the position detection processing unit 228 obtains the sound source
position from the value imax of i at which the correlation coefficient value r(i) becomes
maximum, and controls the delay sample number calculation unit 225.
[0011]
The correlation coefficient value r(i) is represented by r(i) = Σn j=1 a(j) * b(j+i) ... (6), where
Σn j=1 indicates summation from j = 1 to j = n, i satisfies -m ≤ i ≤ m, and m is a value
determined by the distance between the microphones 229-1 and 229-2 and the sampling
frequency: m = (sampling frequency) * (distance between microphones) / (sound velocity) ... (7).
Also, n is the number of samples subjected to the convolution operation, and is generally several
hundreds.
[0012]
Also, the numbers of delay samples da and db of the Z^-da delay unit 223 and the Z^-db delay
unit 224 are determined from the value of i at which the correlation coefficient value r(i) is
maximum: if i ≥ 0, then da = i and db = 0; if i < 0, then da = 0 and db = -i. Thereby, the phases
of the target sound from the sound source are matched and added by the adder 226, and the
target sound is emphasized and output.
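The two-microphone procedure of paragraphs [0010]-[0012] can be sketched as follows. The function names are illustrative, and equation (6) is evaluated only where both samples exist:

```python
def best_lag(a, b, m):
    """Return the lag i in [-m, m] maximizing r(i) = sum_j a(j) * b(j + i)."""
    def r(i):
        return sum(a[j] * b[j + i]
                   for j in range(len(a)) if 0 <= j + i < len(b))
    return max(range(-m, m + 1), key=r)

def delay_samples(i):
    """Assign the delay sample numbers (da, db) from the maximizing lag i."""
    return (i, 0) if i >= 0 else (0, -i)
```

The delayed signals are then summed (the adder 226), so the target-sound components align in phase and reinforce each other.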
[0013]
SUMMARY OF THE INVENTION In the prior art for noise suppression, in the case where a noise
source such as a speaker is provided together with a microphone array, an echo canceller can
cancel the echo component that the reproduced sound from the speaker injects into the
microphone array while the talker at the target sound source is not speaking. However, when the
voice of the talker and the reproduced sound from the speaker are input to the microphone
array simultaneously, the update of the filter coefficients for cancelling the echo component
(noise) does not converge. That is, the residual signal e in equations (4) and (5) is the sum of the
echo component (noise) that the subtractor 216 cannot suppress and the speech of the talker. If
the filter coefficients are updated so as to minimize e, the voice of the talker, which is the target
sound, is suppressed together with the echo component (noise), so there is a problem that the
noise alone cannot be suppressed.
[0014]
Also, in the conventional example for sound source position detection and target sound
emphasis, the output signals a(j) and b(j) of the microphones 229-1 and 229-2 in FIG. 22, for
example, have autocorrelation. When the sound source is white noise or pulse noise the
autocorrelation is small, and in the case of speech and the like it is large. For a signal with large
autocorrelation, the correlation coefficient value r(i) of the above equation (6) changes less with
i than for a signal with small autocorrelation. Therefore, it is not easy to obtain an accurate
maximum value, and it is difficult to detect the sound source position accurately and quickly.
[0015]
In the conventional example in which synchronous addition is performed to enhance the target
sound, the degree of enhancement depends on the number of microphones constituting the
microphone array: if the correlation between the target sound and the noise is small, using N
microphones makes an N-fold enhancement of the power ratio possible, but if the correlation
between the target sound and the noise is large, the gain in power ratio is small. Accordingly, in
order to enhance the target sound when the correlation between the target sound and the noise
is large, the number of microphones must be increased, and there is a problem that the
microphone array becomes large. Also, when the sound source position of the target sound is
detected from the correlation coefficient value of the above equation (6), detection often
becomes difficult in an environment where noise and the like are large. An object of the present
invention is to enable stable and reliable noise suppression, target sound emphasis, and sound
source position detection using a microphone array.
[0016]
A microphone array device according to the present invention comprises: (1) a microphone
array in which a plurality of microphones 1-1 to 1-n are arranged; filters 2-1 to 2-n which
receive the output signals of the microphones 1-1 to 1-n; and a filter coefficient calculation unit
4 which receives the output signals of the microphones 1-1 to 1-n, the noise source signal, and
the residual signal obtained by subtracting the filtered output signals of the other microphones
1-2 to 1-n from the filtered output signal of the reference microphone 1-1, and which obtains
the coefficients of the filters 2-1 to 2-n according to an evaluation function based on this
residual signal.
[0017]
(2) A delay unit connected to the front stage of each filter, and a delay calculation unit which
obtains the cross-correlation function values between the noise source signal and the output
signals of the plural microphones and determines the delay amount of each delay unit from the
condition that the cross-correlation function value becomes maximum, can be provided.
Then signals whose phases are aligned by the delay units are input to the filter coefficient
calculation unit 4, and update control of the filter coefficients becomes easy.
[0018]
(3) The noise source signal can be the signal that drives the speaker. That is, in a system having
a microphone array and a speaker, the reproduced sound from the speaker gets into the
microphone array and becomes noise. When this speaker is the noise source, the signal driving
the speaker serves as the noise source signal, and by feeding it to the filter coefficient
calculation unit 4 the processing in the filter coefficient calculation unit 4 becomes easy.
[0019]
(4) An auxiliary microphone which outputs the noise source signal can be provided together
with the microphone array consisting of a plurality of microphones. In this case, in a system
having only a microphone array, the filter coefficient calculation unit 4 performs update control
of the filter coefficients using the output signal of the auxiliary microphone as the noise source
signal.
[0020]
(5) In addition, a recursive low-pass filter can be provided in the filter coefficient update unit of
the filter coefficient calculation unit to reduce the weight given to past filter memory values in
the convolution operation.
[0021]
(6) A linear prediction filter which receives the output signal of a microphone, a linear
prediction analysis unit which receives the output signal of the microphone and updates the
filter coefficients of the linear prediction filter according to linear prediction analysis, and a
sound source position detection unit which obtains a correlation coefficient value based on the
linear prediction error signal output by the linear prediction filter and outputs sound source
position information based on the value at which the correlation coefficient value is maximum,
can be provided.
[0022]
(7) When the target sound source is a talker, the signal that drives the speaker may be input to a
linear prediction analysis unit which controls the updating of the filter coefficients of the linear
prediction filters for the plurality of microphones.
This linear prediction analysis unit can be shared by the linear prediction filters corresponding
to the microphones.
[0023]
(8) A signal estimation unit which, based on the output signals of the plurality of microphones
and the propagation velocity of the sound wave, estimates the output signals of estimation
microphones assumed to be arranged at the arrangement interval of the real microphones and
outputs them together with the output signals of the microphones constituting the microphone
array, and a synchronous addition unit which aligns the phases of the output signals of the
microphones constituting the microphone array and of the estimation microphones of the signal
estimation unit, can be provided.
[0024]
(9) A reference microphone arranged according to the arrangement spacing of the microphones
can be provided on the arrangement line of the plurality of microphones constituting the
microphone array, and the signal estimation unit can correct the estimated arrangement
positions and output signals of the estimation microphones based on the output signals of the
plurality of microphones constituting the microphone array.
The target sound can therefore be enhanced with a reduced error in the calculation processing
for the estimation microphones.
[0025]
(10) An estimation coefficient determination unit may be provided which applies weighting
according to the auditory characteristics to the error signal, i.e., the difference between the
output signal of the reference microphone and the output signal estimated by the signal
estimation unit for the arrangement position of the reference microphone, so as to increase the
estimation accuracy in bands where the auditory sensitivity is high.
[0026]
(11) A signal estimation unit which divides the direction of the sound source with respect to the
microphone array into predetermined angles and outputs, for each divided direction, the output
signals of the microphones constituting the microphone array together with the output signals
of the estimation microphones estimated from them, a synchronous addition unit which adds
the output signals of the signal estimation unit in phase with each other, and a sound source
position detection unit which outputs sound source position information based on the maximum
value of the output signals of the synchronous addition unit, can be provided.
[0027]
(12) A sound source position detection unit which detects the sound source position based on
the output signals of the plurality of microphones, a camera which picks up an image of the
sound source, a detection unit which detects the sound source position based on the image
pickup signal of the camera, and an integrated determination processing unit which outputs
sound source position information indicating the position of the sound source according to the
position information from the sound source position detection unit and the position information
from the detection unit, can be provided.
[0028]
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS FIG. 1 is an explanatory view of a
first embodiment of the present invention, in which 1-1 to 1-n are n microphones constituting a
microphone array, 2-1 to 2-n are filters, 3 is an adder, 4 is a filter coefficient calculation unit, 5
is a speaker (target sound source), and 6 is a speaker (noise source).
The voice from the speaker 5 is input to the microphones 1-1 to 1-n and converted into electric
signals, which pass through the filters 2-1 to 2-n and the adder 3 to become the output signal,
which is transmitted to the other party via the network or the like.
Further, the speaker 6 is driven by the audio signal from the other party as its input signal and
produces reproduced audio.
Thus, the speaker 5 can talk with the other party.
In this case, since the reproduced voice from the speaker 6 is also input to the microphones 1-1
to 1-n, it becomes noise with respect to the voice from the speaker 5. Therefore, the speaker 6 is
a noise source for the target sound source.
[0029]
Therefore, in the present invention, the filter coefficient calculation unit 4 receives the output
signals of the microphones 1-1 to 1-n, the noise source signal (the input signal driving the
speaker 6, the noise source), and the output signal (residual signal) of the adder 3, and performs
the coefficient update of the filters 2-1 to 2-n. In this case, the microphone 1-1 is used as the
reference microphone, and the output signals of the other filters 2-2 to 2-n are subtracted in the
adder 3 from the output signal of the filter 2-1.
[0030]
The filters 2-1 to 2-n can be configured as shown in FIG. 2, for example. In the figure, 11-1 to
11-r-1 are Z^-1 delay units, 12-1 to 12-r are coefficient units for multiplying by the filter
coefficients cp1, cp2, ... cpr, 14 is an adder, and r is the order of the filter.
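The direct-form structure of FIG. 2 (a Z^-1 delay chain, coefficient multipliers, and an adder) can be sketched as follows; the function name `fir_step` and the list-based delay line are illustrative:

```python
def fir_step(memory, coeffs, x):
    """Shift sample x into the delay line and compute the filter output.

    memory : filter memory values f(1)..f(r), newest first
    coeffs : filter coefficients c1..cr
    Returns (output, updated memory).
    """
    memory = [x] + memory[:-1]            # advance the Z^-1 delay chain
    out = sum(c * f for c, f in zip(coeffs, memory))
    return out, memory
```

Calling it once per input sample reproduces the multiply-accumulate behavior of the coefficient units 12-1 to 12-r feeding the adder.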
[0031]
Assuming that the signal from the noise source (speaker 6) is xp(i) and the signal from the
target sound source (speaker 5) is yp(i) (where i is a sample number and p = 1, 2, ... n), the
values of the memories of the filters 2-1 to 2-n (the input signals to the filters and the output
signals of the delay units 11-1 to 11-r-1) fp(i) are fp(i) = xp(i) + yp(i) ... (8).
[0032]
In the echo canceller using the microphone array of the conventional example, the output signal
e of the adder 3 in FIG. 1 is e = Σr j=1 [f1(j) * c1j] - Σn i=2 Σr j=1 [fi(j) * cij] ... (9). Here the
adder 3 subtracts the output signals of the filters 2-2 to 2-n from the output signal of the filter
2-1, and f1(1), f1(2), ... f1(r), ... fi(1), fi(2), ... fi(r) are the values of the memories of the filters.
[0033]
On the other hand, in the present invention, when the convolution is performed after matching
the phases of the signal xp(i) from the noise source, the output signal e' of the adder 3 is
obtained as in equation (10). Note that x(1)(p), ..., x(q)(p) denote the noise source signal with
the phases at the microphones 1-1 to 1-n matched, and q is the number of samples subjected to
the convolution operation.
[0034]
When the signal xp(i) from the noise source and the signal yp(i) from the target sound source
are input simultaneously, that is, when the speech of the speaker 5 and the reproduced speech
from the speaker 6 occur at the same time, the correlation between the two is small because
they are the voices of different persons.
[0035]
As understood from equation (12), the influence of the signal yp(i) from the target sound source
on [fp(1)', ..., fp(r)'] is reduced.
The signal e' of equation (10) is determined using equation (12), the evaluation function
J = (e')² is formed from it, and update control of the filter coefficients of the filters 2-1 to 2-n is
performed based on this evaluation function. That is, even when voices are input to the
microphones 1-1 to 1-n simultaneously from the speaker (target sound source) 5 and the
speaker (noise source) 6, the noise source signal contained in the output signals of the
microphones 1-1 to 1-n has a high correlation with the input signal driving the speaker 6 that is
fed to the filter coefficient calculation unit 4, and a low correlation with the target sound source
signal, so that update control of the filter coefficients according to the evaluation function
J = (e')² remains possible. Therefore, the output signal of the adder 3 is the voice signal of the
speaker 5 with the noise suppressed.
[0036]
FIG. 3 is an explanatory view of a second embodiment of the present invention, in which the
same reference numerals as in FIG. 1 denote the same parts, 8-1 to 8-n are delay units (Z^-d1 to
Z^-dn), and 9 is a delay calculation unit. In this embodiment, the delay calculation unit 9
calculates the numbers of delay samples of the delay units 8-1 to 8-n so that the phases of the
signals from the microphones 1-1 to 1-n are matched, and the filter coefficient calculation unit 4
calculates the filter coefficients of the filters 2-1 to 2-n and performs update control. Therefore,
the delay calculation unit 9 receives the output signals of the microphones 1-1 to 1-n and the
input signal (noise source signal) driving the speaker 6, and the filter coefficient calculation unit
4 receives the output signals of the delay units 8-1 to 8-n, the output signal of the adder 3, and
the input signal (noise source signal) driving the speaker 6.
[0037]
Let the output signals of the microphones 1-1 to 1-n be gp(j) (where p = 1, 2, ... n and j is a
sample number). The cross-correlation function value Rp(i) with the signal x(j) from the noise
source is determined as Rp(i) = Σs j=1 gp(j+i) * x(j) ... (13), where Σs j=1 represents summation
from j = 1 to j = s, and s is the number of samples subjected to the convolution operation. This
sample number s can usually be several tens to several hundreds of samples. Further, assuming
that the maximum number of delay samples corresponding to the distance from the noise
source to a microphone is D, i in equation (13) takes the values i = 0, 1, 2, ... D.
[0038]
For example, assuming that the maximum distance between the noise source and a microphone
is 50 cm and the sampling frequency is 8 kHz, the sound velocity is approximately 340 m/s, so
the maximum delay sample number D is D = (sampling frequency) * (maximum distance
between the noise source and the microphone) / (sound velocity) = 8000 * (50 / 34000) =
11.76... ≈ 12. Therefore, i in this case lies in the range up to D = 12. Also, if the maximum
distance between the noise source and a microphone is 1 m, the maximum delay sample
number D is 24.
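The arithmetic for D can be checked directly. The function name is illustrative, and D is rounded up to the next whole sample, matching 11.76 → 12:

```python
import math

def max_delay_samples(fs_hz, max_distance_m, sound_velocity_m_s=340.0):
    """D = (sampling frequency) * (max distance) / (sound velocity)."""
    return math.ceil(fs_hz * max_distance_m / sound_velocity_m_s)
```

With the values from the text, max_delay_samples(8000, 0.5) gives 12 and max_delay_samples(8000, 1.0) gives 24.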
[0039]
Further, the value ip (p = 1, 2,..., N) at which the absolute value of the cross-correlation function
value Rp (i) obtained by the equation (13) is maximized is obtained, and the maximum value imax
of ip is further obtained. Ask for This process follows the steps shown in (A1) to (A11) of FIG.
That is, imax = initial value (for example, 0) and p = 1 (A1), then Rpmax = initial value (for
example 0.0), ip = initial value (for example 0), and i It is assumed that = 0 (A2), and the cross
correlation function value Rp (i) according to the above-mentioned equation (13) is obtained
(A3).
[0040]
Then, it is judged whether the cross-correlation function value Rp(i) is larger than Rpmax (A4).
If it is larger, Rp(i) at that time is set as Rpmax and i as ip (A5); in either case i = i + 1 is set
(A6). Then, it is determined whether i ≤ D (A7); while i does not exceed the maximum delay
sample number D, the process returns to step (A3), and when i exceeds D, the process proceeds
to step (A8). In step (A8), it is judged whether ip is larger than imax. If it is larger, the ip at that
time is set as imax (A9). Then p = p + 1 is set (A10), and it is determined whether p ≤ n (A11);
when p ≤ n, the process returns to step (A2). When the condition is no longer satisfied, the
search of the cross-correlation function values Rp(i) is complete, and the maximum value imax
of the ip over the range i ≤ D has been obtained.
[0041]
The number dp of delay samples of each delay unit is determined by the following equation
using the ip and imax obtained by the above maximum value detection: dp = imax - ip ... (14).
Accordingly, the delay calculation unit 9 sets the delay sample numbers d1 to dn of the delay
units 8-1 to 8-n.
[0042]
Also, as described above, the filters 2-1 to 2-n can use the configuration shown in FIG. 2. If the
output signals of the filters 2-1 to 2-n are outp (p = 1, 2, ... n), then outp = Σr i=1 cpi * fp(i)
... (15), where Σr i=1 represents summation from i = 1 to i = r, cpi is a filter coefficient, and
fp(i) is the value of the memory of the filter, which in this case is also the input signal of the
filter.
[0043]
The filter coefficient calculation unit 4 calculates the cross-correlation function values between
the present and past input signals of the filters 2-1 to 2-n and the signal from the noise source,
and updates the filter coefficients. The cross-correlation function value fp(i)' is
fp(i)' = Σq j=1 x(j) * fp(i+j-1) ... (16), where Σq j=1 represents summation from j = 1 to j = q,
and q is the number of samples subjected to the convolution operation when calculating the
cross-correlation function value.
[0044]
The output signal e' of the adder 3 is obtained using these cross-correlation function values
fp(i)'. That is, e' = Σr j=1 [f1(j)' * c1j] - Σn i=2 Σr j=1 [fi(j)' * cij] ... (17), which is a convolution
operation and can be calculated by a digital signal processor (DSP). In this case, the adder 3
subtracts the output signals of the other microphones 1-2 to 1-n passed through the filters 2-2
to 2-n from the output signal of the reference microphone 1-1 passed through the filter 2-1, and
outputs the output signal e'.
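Equation (17) is a pair of product-sums over the cross-correlation values; a sketch with the illustrative name `adder_output`:

```python
def adder_output(f_primes, coeffs):
    """e' = sum_j f1(j)'*c1j - sum_{i=2..n} sum_j fi(j)'*cij  (equation (17)).

    f_primes : per-microphone lists [fp(1)', ..., fp(r)']
    coeffs   : per-microphone coefficient lists [cp1, ..., cpr]
    """
    dot = lambda f, c: sum(a * b for a, b in zip(f, c))
    others = sum(dot(f, c) for f, c in zip(f_primes[1:], coeffs[1:]))
    return dot(f_primes[0], coeffs[0]) - others
```

The reference channel (index 0 here) plays the role of the filter 2-1, and the remaining channels are subtracted from it, as the adder 3 does.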
[0045]
Taking the output signal e' of the adder 3 described above as an error signal and the evaluation
function as J = (e')², the filter coefficients are obtained based on this evaluation function. For
example, as described above, they can be determined by the steepest descent method, giving
the filter coefficients c11, c12, ... c1r, ... cn1, cn2, ... cnr by equations (18) and (19). The norm
fpnorm corresponds to equation (3): fpnorm = [(fp(1)')² + (fp(2)')² + ... + (fp(r)')²]^1/2 ... (20).
Also, as described above, α in equations (18) and (19) is a constant that determines the speed
and accuracy of convergence of the filter coefficients to their optimum values.
[0046]
Thus the output signal e' of the adder 3 becomes e' = out1 - Σn i=2 outi ... (21), and since the
phases of the input signals to the filters 2-1 to 2-n can be aligned by the delay units 8-1 to 8-n,
the filter coefficients can easily be updated by the filter coefficient calculation unit 4, and
update control of the filter coefficients is possible even while the speaker 5 and the speaker 6
produce sound simultaneously. As a result, the noise entering the microphones 1-1 to 1-n from
the speaker 6, the noise source, can be reliably suppressed.
[0047]
FIG. 5 is an explanatory view of a third embodiment of the present invention, in which the same
reference numerals as in FIG. 1 indicate the same parts, 16 is a noise source, and 21 is an
auxiliary microphone.
The auxiliary microphone 21 can be a microphone having the same configuration as the
microphones 1-1 to 1-n constituting the microphone array.
[0048]
This embodiment is substantially the same as the embodiment shown in FIG. 1, but the output
signal of the auxiliary microphone 21 is input to the filter coefficient calculation unit 4 as the
noise source signal. Therefore, even when the noise source 16 is an arbitrary noise source other
than a loudspeaker, such as an air-conditioning sound, and the target sound source is the
speaker 5 or any other source, noise can be suppressed based on the evaluation function
J = (e')² used for updating the filter coefficients as described with reference to FIG. 1.
[0049]
FIG. 6 is an explanatory view of a fourth embodiment of the present invention, in which the
same reference numerals as in FIG. 3 and FIG. 5 denote the same parts. This embodiment is
substantially the same as the embodiment shown in FIG. 3, but the output signal of the auxiliary
microphone 21 is input to the delay calculation unit 9 and the filter coefficient calculation unit 4
as the noise source signal. Therefore, as in the embodiment shown in FIG. 3, the delay
calculation unit 9 controls the numbers of delay samples of the delay units 8-1 to 8-n, and the
filter coefficient calculation unit 4 performs update control of the filter coefficients of the filters
2-1 to 2-n, whereby noise suppression can be performed.
[0050]
FIG. 7 is an explanatory view of a low-pass filter used in the filter coefficient updating process
according to the embodiment of the present invention, in which 22 and 23 are coefficient units,
24 is an adder, and 25 is a delay unit. The cross-correlation function value fp(i)' described above
is calculated using the low-pass filter shown in FIG. 7, where the coefficient of the coefficient
unit 23 is β and the coefficient of the coefficient unit 22 is 1-β:
fp(i)' = β * fp(i)'old + (1-β) * [x(1) * fp(i)] ... (22). Note that the coefficient β is a constant, and
fp(i)'old is the value of the memory (delay unit 25) of the low-pass filter.
[0051]
By using this recursive low-pass filter, the weighting of past signals is reduced, the output value
of the convolution operation is prevented from becoming excessive, and the cross-correlation
function value fp(i)' can be obtained stably.
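Equation (22) is a one-pole recursive smoother; a minimal sketch, assuming 0 < β < 1 (the function name is illustrative):

```python
def lpf_step(prev, sample, beta=0.9):
    """fp(i)' = beta * fp(i)'_old + (1 - beta) * sample   (equation (22)).

    prev   : fp(i)'_old, held in the delay unit 25
    sample : the new product x(1) * fp(i)
    beta   : weight on the past; larger beta smooths more
    """
    return beta * prev + (1.0 - beta) * sample
```

Repeated calls make old contributions decay geometrically, which is what keeps the convolution output from growing without bound.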
[0052]
FIG. 8 is an explanatory view of an embodiment of the present invention using a DSP (digital
signal processor), in which 1-1 to 1-n are microphones constituting a microphone array, 30 is a
digital signal processor (DSP), 31-1 to 31-n are low-pass filters (LPF), 32-1 to 32-n are AD
converters (A/D), 33 is a DA converter (D/A), 34 is a low-pass filter (LPF), 35 is an amplifier, and
36 is a speaker.
[0053]
The filters 2-1 to 2-n and the filter coefficient calculation unit 4 in the embodiment shown in
FIG. 1, and additionally the delay units 8-1 to 8-n and the delay calculation unit 9 in the
embodiment shown in FIG. 3, can be realized by a combination of repetitive processing,
product-sum operations, and conditional branches, so such processing is implemented by the
arithmetic functions of the digital signal processor 30.
[0054]
The low pass filters 31-1 to 31-n and 34 remove, for example, signal components outside the voice band, and the AD converters 32-1 to 32-n convert the output signals of the microphones 1-1 to 1-n, passed through the low pass filters 31-1 to 31-n, into digital signals by sampling at 8 kHz, for example, with a word length corresponding to the number of bits processed in the digital signal processor 30, such as 8 bits or 14 bits.
[0055]
Further, an input signal through a network or the like is converted into an analog signal by the
DA converter 33, input to the amplifier 35 through the low pass filter 34, and amplified to drive
the speaker 36.
The reproduced sound from the speaker 36 in this case becomes noise for the microphones 1-1
to 1-n.
However, as described above, this noise can be suppressed by the filter coefficient updating performed by the digital signal processor 30.
[0056]
FIG. 9 is an explanatory diagram of the processing function of the DSP (digital signal processor) according to the embodiment of the present invention, in which the same reference numerals as in FIG. 3 and FIG. 8 indicate the same parts; illustration of the low pass filters 31-1 to 31-n and 34, the AD converters 32-1 to 32-n, the DA converter 33 and the amplifier 35 is omitted. The filter coefficient calculation unit 4 includes a cross-correlation calculation unit 41 and a filter coefficient update unit 42, and the delay calculation unit 9 includes a cross-correlation calculation unit 43, a maximum value detection unit 44, and a delay sample number calculation unit 45.
[0057]
The cross-correlation calculation unit 43 of the delay calculation unit 9 receives the output signals gp(j) of the microphones 1-1 to 1-n and the drive signal of the speaker 36 as a noise source, and calculates the cross-correlation function value Rp(i).
Further, the maximum value detection unit 44 detects the maximum value of the cross-correlation function value Rp(i) according to the flowchart shown in FIG. 4, and the delay sample number calculation unit 45 obtains the delay sample number dp of the delay units 8-1 to 8-n from ip and imax obtained by the maximum value detection, according to equation (14), and sets the delay sample numbers of the delay units 8-1 to 8-n.
[0058]
In addition, the cross-correlation calculation unit 41 of the filter coefficient calculation unit 4 receives the signals whose noise-source components have been phase-matched by the delay units 8-1 to 8-n, the drive signal of the speaker 36 as the noise source, and the output of the adder 3, and calculates the cross-correlation function value fp(i)' according to the above-mentioned equation (16). The process of calculating the cross-correlation function value fp(i)' can include the low pass filter processing shown in FIG. 7. Further, the filter coefficient update unit 42 calculates the filter coefficients cpr according to equations (17), (18) and (19), and updates the filter coefficients of the filters 2-1 to 2-n having, for example, the configuration shown in FIG. 2.
[0059]
FIG. 10 is an explanatory diagram of the delay unit, wherein 46 is a memory, 47 is a write control unit, 48 is a read control unit, and 9 is the delay calculation unit. The case where the delay unit is realized using the internal memory of the digital signal processor is shown: the memory 46 has an area corresponding to the maximum delay sample number D, writing is performed under the control of the write control unit 47, and data is read out under the control of the read control unit 48. The write pointer WP and the read pointer RP are set at an interval of the delay sample number dp calculated by the delay calculation unit 9, and are shifted in the direction of the dotted arrow at every write/read timing. Therefore, a signal written to the address designated by the write pointer WP is read out when that address is designated by the read pointer RP after the set delay sample number dp.
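A minimal Python sketch of the ring-buffer delay of FIG. 10 follows; the class name, buffer size and test signal are illustrative assumptions, not from the specification:

```python
class DelayUnit:
    """Ring-buffer delay line modelled on FIG. 10: a memory (46) of up to
    max_delay samples, with the read pointer RP kept delay samples behind
    the write pointer WP."""
    def __init__(self, max_delay, delay):
        assert 0 < delay <= max_delay
        self.memory = [0.0] * max_delay    # memory 46, area of size D
        self.wp = 0                        # write pointer WP
        self.delay = delay                 # delay sample number dp
    def process(self, sample):
        rp = (self.wp - self.delay) % len(self.memory)  # read pointer RP
        out = self.memory[rp]              # read under read control unit 48
        self.memory[self.wp] = sample      # write under write control unit 47
        self.wp = (self.wp + 1) % len(self.memory)
        return out

d = DelayUnit(max_delay=8, delay=3)
outputs = [d.process(x) for x in [1.0, 2.0, 3.0, 4.0, 5.0]]
# outputs: three zeros, then the input delayed by 3 samples
```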
[0060]
FIG. 11 is an explanatory diagram of the fifth embodiment of the present invention, in which 51-1 and 51-2 are microphones constituting a microphone array, 52-1 and 52-2 are linear prediction filters, 53-1 and 53-2 are linear prediction analysis units, 54 is a sound source position detection unit, and 55 is a sound source such as a speaker. A larger number of microphones constituting the microphone array may be provided, but for convenience of explanation the case where two microphones 51-1 and 51-2 are provided will be described.
[0061]
The output signals a(j) and b(j) of the microphones 51-1 and 51-2 are input to the linear prediction analysis units 53-1 and 53-2 and to the linear prediction filters 52-1 and 52-2, respectively. The linear prediction analysis units 53-1 and 53-2 obtain autocorrelation function values to calculate linear prediction coefficients, and update the filter coefficients of the linear prediction filters 52-1 and 52-2 using those linear prediction coefficients. The sound source position detection unit 54 detects the position of the sound source 55 based on the linear prediction error signals output from the linear prediction filters 52-1 and 52-2, and outputs sound source position information.
[0062]
FIG. 12 shows the functions of the parts shown in FIG. 11 in more detail; the same reference numerals as in FIG. 11 indicate the same parts, 56-1 and 56-2 are autocorrelation function value calculation units, 57-1 and 57-2 are linear prediction coefficient calculation units, 58 is a correlation coefficient value calculation unit, and 59 is a position detection processing unit. The linear prediction analysis units 53-1 and 53-2 include the autocorrelation function value calculation units 56-1 and 56-2 and the linear prediction coefficient calculation units 57-1 and 57-2, respectively, and the output signals a(j) and b(j) of the microphones 51-1 and 51-2 are input to the autocorrelation function value calculation units 56-1 and 56-2.
[0063]
The autocorrelation function value calculation unit 56-1 of the linear prediction analysis unit 53-1 calculates the autocorrelation function value Ra(i) from the output signal a(j) of the microphone 51-1 according to the following equation: Ra(i) = Σj=1 to n a(j) * a(j+i) (23) where Σj=1 to n represents addition from j = 1 to j = n, and n is the number of samples of the convolution operation, generally a value of several hundred. If q is the order of the linear prediction filter, then 0 ≤ i ≤ q.
[0064]
Further, the linear prediction coefficient calculation unit 57-1 calculates linear prediction coefficients αa1, αa2, ..., αaq based on the autocorrelation function values Ra(i). The linear prediction coefficients can be obtained by various known methods such as the correlation method, the partial autocorrelation method, and the covariance method, and the calculation can therefore also be realized by the arithmetic function of the above-mentioned digital signal processor (DSP).
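The correlation method referred to above is commonly realized by the Levinson-Durbin recursion. The following sketch is one conventional realization, not the patent's own implementation; names are illustrative:

```python
def levinson_durbin(R, q):
    """Linear prediction coefficients alpha_1 .. alpha_q from the
    autocorrelation values R[0..q] by the Levinson-Durbin recursion."""
    a = [0.0] * (q + 1)
    err = R[0]                              # prediction error power
    for i in range(1, q + 1):
        acc = R[i] - sum(a[j] * R[i - j] for j in range(1, i))
        k = acc / err                       # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)                # error power never grows
    return a[1:]

# For R(i) = 0.5**i, as produced by a first-order autoregressive
# signal a(j) = 0.5 * a(j-1) + noise, the first coefficient is 0.5.
coeffs = levinson_durbin([1.0, 0.5, 0.25], 2)
```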
[0065]
Similarly, in the linear prediction analysis unit 53-2 corresponding to the microphone 51-2, the autocorrelation function value calculation unit 56-2 calculates the autocorrelation function value Rb(i) from the output signal b(j) of the microphone 51-2 in the same manner as equation (23), and the linear prediction coefficient calculation unit 57-2 calculates the linear prediction coefficients αb1, αb2, ..., αbq.
[0066]
The linear prediction filters 52-1 and 52-2 can be configured as q-th order FIR filters, and their filter coefficients c1, c2, ..., cq are updated with the linear prediction coefficients αa1, αa2, ..., αaq and αb1, αb2, ..., αbq, respectively. The filter order q of the linear prediction filters 52-1 and 52-2 is determined by q = (sampling frequency) * (distance between microphones) / (sound velocity) (24) whose right side is similar to the above-mentioned equation (7).
[0067]
The sound source position detection unit 54 includes a correlation coefficient value calculation unit 58 and a position detection processing unit 59. The correlation coefficient value calculation unit 58 calculates the correlation coefficient value r'(i) using the output signals of the linear prediction filters 52-1 and 52-2, that is, the linear prediction error signals a'(j) and b'(j) of the output signals a(j) and b(j) of the microphones 51-1 and 51-2. The range of i in this case is -q ≤ i ≤ q.
[0068]
The position detection processing unit 59 obtains the value imax of i that maximizes the correlation coefficient value r'(i), and outputs sound source position information indicating the position of the sound source 55 based on imax. The relationship between the sound source position and imax in this case is as shown in FIG. 13. That is, in the case of imax = 0, the sound source 55 is present in front of or behind the microphones 51-1 and 51-2, at an equal distance from both. In the case of imax = q, it is present on the microphone 51-1 side of the line on which the microphones 51-1 and 51-2 are arranged, and in the case of imax = -q, it is present on the microphone 51-2 side. If the number of microphones is three or more, the position of the sound source can be detected including the distance to the sound source.
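The search for imax can be sketched as below. The lag convention of r'(i) and the test pulses are illustrative assumptions; the mapping of imax to a direction follows the description above:

```python
def cross_corr_range(a_err, b_err, q):
    # r'(i) for -q <= i <= q from the two linear prediction error signals
    n = len(a_err)
    return {i: sum(a_err[j] * b_err[j + i]
                   for j in range(n) if 0 <= j + i < n)
            for i in range(-q, q + 1)}

def detect_imax(a_err, b_err, q):
    r = cross_corr_range(a_err, b_err, q)
    return max(r, key=r.get)   # lag i that maximizes r'(i)

# A pulse reaching the second microphone two samples later than the
# first yields imax = 2; identical signals yield imax = 0.
pulse = [0.0] * 16
pulse[5] = 1.0
delayed = [0.0] * 16
delayed[7] = 1.0
imax = detect_imax(pulse, delayed, 4)
```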
[0069]
A speech signal generally has large autocorrelation function values, so in the conventional example, which obtains the correlation coefficient value r(i) directly from the output signals a(j) and b(j) of the microphones 51-1 and 51-2, the change of r(i) with respect to i is small and detection of the sound source position is not easy. According to the embodiment of the present invention described above, however, even when the autocorrelation function values are large, the correlation coefficient value r'(i) is obtained using the linear prediction error signals, which is equivalent to reducing the autocorrelation, so that detection of the sound source position becomes easy.
[0070]
FIG. 14 is an explanatory diagram of the sixth embodiment of the present invention, in which the same reference numerals as in FIG. 11 denote the same parts, 53A is a linear prediction analysis unit, and 55A is a speaker as a sound source. By inputting the drive signal of the speaker 55A as the sound source to the linear prediction analysis unit 53A, the sound source signal is subjected to linear prediction analysis to obtain linear prediction coefficients, with which the filter coefficients of the linear prediction filters 52-1 and 52-2 are updated. With the linear prediction analysis unit 53A shared in common, the linear prediction error signals of the output signals a(j) and b(j) of the microphones 51-1 and 51-2 are obtained, and the sound source position detection unit 54 can obtain the correlation coefficient value r'(i) using these linear prediction error signals to detect the position of the sound source.
[0071]
FIG. 15 is an explanatory diagram of the seventh embodiment of the present invention, in which 61-1 and 61-2 are microphones constituting a microphone array, 62 is a signal estimation unit, 63 is a synchronous addition unit, and 65 is a sound source. The configuration performs target sound emphasis by assuming that estimation microphones 64-1, 64-2, ..., shown by dotted lines at estimated positions, are present on the arrangement line of the two microphones 61-1 and 61-2, and by performing synchronous addition of their estimated output signals together with the output signals of the microphones 61-1 and 61-2.
[0072]
FIG. 16 is a functional block diagram of the seventh embodiment of the present invention, in which the same reference numerals as in FIG. 15 indicate the same parts, 66 is a particle velocity calculation unit, 67 is an estimation processing unit, 68-1, 68-2, ... are delay units, and 69 is an adder. The case is shown where the sound source 65 is positioned in the direction of angle θ with respect to the arrangement line of the two microphones 61-1 and 61-2 constituting the microphone array, and it is presumed that estimation microphones 64-1, 64-2, ..., shown by dotted lines, are arranged along the arrangement line of the microphones 61-1 and 61-2.
[0073]
The signal estimation unit 62 includes the particle velocity calculation unit 66 and the estimation processing unit 67. The sound wave from the sound source 65 can be represented by the wave equation. In this case, with sound pressure P, particle velocity V, bulk elastic modulus K of the medium, and density ρ of the medium, it is known that a sound wave propagating in the medium is expressed by the relations -∂V/∂x = (1/K)(∂P/∂t), -∂P/∂x = ρ(∂V/∂t) (25)
[0074]
The particle velocity calculation unit 66 takes the amplitude of the output signal a(j) of the microphone 61-1 as the sound pressure P(j, 0) and the amplitude of the output signal b(j) of the microphone 61-2 as the sound pressure P(j, 1), and obtains the particle velocity V from the sound pressure difference. That is, the particle velocity V(j+1, 0) at the microphone 61-1 can be expressed as V(j+1, 0) = V(j, 0) + [P(j, 1) - P(j, 0)] (26) where j is a sample number.
[0075]
With the estimated position denoted by x, the estimation processing unit 67 can obtain the output signals of the estimation microphones 64-1, 64-2, ... at the estimated positions by P(j, x+1) = P(j, x) + β(x)[V(j+1, x) - V(j, x)] V(j+1, x) = V(j+1, x-1) + [P(j, x-1) - P(j, x)] (27) where β(x) is an estimation coefficient.
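One possible reading of equations (26) and (27) is sketched below; the zero initial conditions and the split into two helper functions are assumptions made for illustration:

```python
def velocity_at_first_mic(P0, P1):
    # Equation (26): V(j+1, 0) = V(j, 0) + [P(j, 1) - P(j, 0)],
    # with V(0, 0) taken as zero.
    V = [0.0] * len(P0)
    for j in range(len(P0) - 1):
        V[j + 1] = V[j] + (P1[j] - P0[j])
    return V

def step_out(P_prev, P_curr, V_prev, beta):
    # Equation (27): advance one estimated-microphone position x.
    # V(j+1, x) = V(j+1, x-1) + [P(j, x-1) - P(j, x)]
    # P(j, x+1) = P(j, x) + beta(x) * [V(j+1, x) - V(j, x)]
    n = len(P_curr)
    V_curr = [0.0] * n
    for j in range(n - 1):
        V_curr[j + 1] = V_prev[j + 1] + (P_prev[j] - P_curr[j])
    P_next = [P_curr[j] + beta * (V_curr[j + 1] - V_curr[j])
              for j in range(n - 1)] + [P_curr[-1]]
    return V_curr, P_next

# Sanity check: in a uniform pressure field the particle velocity is
# zero and the extrapolated pressure equals the measured pressure.
V0 = velocity_at_first_mic([1.0] * 4, [1.0] * 4)
V1, P2 = step_out([1.0] * 4, [1.0] * 4, V0, beta=0.5)
```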
[0076]
Therefore, taking the arrangement position of the microphone 61-1 as x = 0 and the arrangement position of the microphone 61-2 as x = 1, the estimation microphone 64-1 is at estimated position x = 2, the estimation microphone 64-2 at x = 3, and so on. The signal estimation unit 62 thus uses the two microphones 61-1 and 61-2 to estimate the output signals the microphones 64-1, 64-2, ... would produce if they were actually arranged, and the respective output signals are input to the synchronous addition unit 63. Therefore, with a microphone array including only the two microphones 61-1 and 61-2, target sound emphasis can be performed by synchronous addition as in a microphone array in which a large number of microphones are arranged.
[0077]
The synchronous addition unit 63 includes the delay units 68-1, 68-2, ... and the adder 69. With the delay sample number d, the delay units 68-1, 68-2, ... provide transfer functions Z^-d, Z^-2d, Z^-3d, ..., and, based on the angle θ of the sound source direction relative to the arrangement line of the microphones 61-1 and 61-2 determined according to the above-described embodiment, the delay sample number d is determined by d = (sampling frequency) * (distance between microphones) * cos θ / (sound velocity) (28)
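Equation (28) and the synchronous addition of FIG. 16 can be sketched as follows; the 17 cm spacing, the test pulses and the channel ordering are illustrative assumptions:

```python
import math

def delay_samples(fs_hz, mic_distance_m, theta_rad, sound_velocity_ms=340.0):
    # d = (sampling frequency) * (distance between microphones)
    #     * cos(theta) / (sound velocity)                  ... (28)
    return int(round(fs_hz * mic_distance_m * math.cos(theta_rad)
                     / sound_velocity_ms))

def delay_and_sum(signals, d):
    # The p-th channel passes through a delay of p*d samples (the
    # Z^-d, Z^-2d, ... chain of delay units 68-1, 68-2, ...) before
    # the adder 69 sums all channels.
    n = len(signals[0])
    out = [0.0] * n
    for p, s in enumerate(signals):
        for j in range(n):
            if j - p * d >= 0:
                out[j] += s[j - p * d]
    return out

# A pulse arriving d samples earlier at each successive channel is
# realigned by the delay chain and adds coherently.
d = 2
channels = []
for p in range(3):
    s = [0.0] * 10
    s[6 - p * d] = 1.0
    channels.append(s)
out = delay_and_sum(channels, d)
```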
[0078]
Thereby, the output signals of the microphones 61-1 and 61-2 and of the estimation microphones 64-1, 64-2, ... at the estimated positions are aligned in phase by the delay units 68-1, 68-2, ..., and target sound emphasis processing by synchronous addition can be performed through the addition by the adder 69. Therefore, using a small number of microphones, the target sound can be emphasized with a power corresponding to the total number of microphones including the estimation microphones.
[0079]
FIG. 17 is an explanatory diagram of the eighth embodiment of the present invention, in which the same reference numerals as in FIG. 15 denote the same parts, 71 is a reference microphone, 72 is a subtractor, 73 is a weighting filter, and 74 is an estimation coefficient determination unit. In this embodiment, the reference microphone 71 is disposed at position x = 2, at the same interval as the distance between the microphone 61-1 at position x = 0 and the microphone 61-2 at position x = 1. The estimation error obtained by the subtractor 72 is weighted according to the auditory characteristic by the weighting filter 73, and the estimation coefficient determination unit 74 determines the estimation coefficient β(x).
[0080]
That is, the estimation error e(j), the difference between the estimated signal P(j, 2) of the estimation microphone 64-1 at position x = 2 (the estimation microphone at the position of the reference microphone 71) and the output signal ref(j) of the reference microphone 71, is obtained by the subtractor 72: e(j) = P(j, 2) - ref(j) = P(j, 1) + β(2)[V(j+1, 1) - V(j, 1)] - ref(j) (29)
[0081]
The estimation coefficient β(2) can be determined in the estimation coefficient determination unit 74 so that the average power of the estimation error e(j) is minimized. That is, by using the estimation coefficients β(x) that minimize the average power of this estimation error e(j) for x = 2, 3, 4, ..., the signal estimation unit 62 (see FIG. 15 or FIG. 16) can estimate and output the output signals of the estimation microphones 64-1, 64-2, ....
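The determination of β(2) can be sketched as a simple search minimizing the average power of e(j) in equation (29); the grid search and the synthetic signals are illustrative assumptions, as the specification does not fix a particular minimization method:

```python
def mean_power(e):
    return sum(v * v for v in e) / len(e)

def fit_beta(P1, V1, ref, candidates):
    # Choose beta(2) minimizing the average power of
    # e(j) = P(j, 1) + beta * [V(j+1, 1) - V(j, 1)] - ref(j)   ... (29)
    def err(beta):
        return mean_power([P1[j] + beta * (V1[j + 1] - V1[j]) - ref[j]
                           for j in range(len(V1) - 1)])
    return min(candidates, key=err)

# Synthetic check: if ref(j) was generated with beta = 0.5, the
# search recovers that value from the candidate grid.
P1 = [0.1, 0.2, 0.3, 0.4]
V1 = [0.0, 1.0, 0.5, 2.0, 1.0]
ref = [P1[j] + 0.5 * (V1[j + 1] - V1[j]) for j in range(4)]
beta_hat = fit_beta(P1, V1, ref, [i / 10 for i in range(11)])
```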
[0082]
Further, in FIG. 17, the estimation error e(j) is weighted according to the auditory characteristic by the weighting filter 73; as known from the equal loudness curves, the auditory characteristic shows high sensitivity near 4 kHz. Therefore, the weighting is increased for the band near 4 kHz, where the sensitivity to the estimation error e(j) is high. Accordingly, even in the processing of the output signals of the estimation microphones beyond position x = 2, the target sound can be emphasized by synchronous addition while reducing the estimation error in the band where the sensitivity of hearing is high.
[0083]
FIG. 18 is an explanatory diagram of the ninth embodiment of the present invention, in which 61-1 and 61-2 are microphones constituting a microphone array, 62-1, 62-2, ..., 62-s are signal estimation units, 63-1, 63-2, ..., 63-s are synchronous addition units, 64-1, 64-2, ... are estimation microphones, 65 is a sound source, and 80 is a sound source position detection unit.
[0084]
The direction relative to the microphone array composed of the microphones 61-1 and 61-2 is divided into angles θ0, θ1, ..., θs, and the signal estimation units 62-1 to 62-s and the synchronous addition units 63-1 to 63-s are provided corresponding to the divided angles θ0, θ1, ..., θs. Each of the signal estimation units 62-1 to 62-s obtains its estimation coefficient β(x, θ) in advance; for example, as shown in FIG. 17, a reference microphone is provided to determine and set the estimation coefficient β(x, θ).
[0085]
The synchronous addition units 63-1 to 63-s match and add the phases of the output signals of the signal estimation units 62-1 to 62-s, and can obtain output signals respectively corresponding to the directions of the angles θ0 to θs. The sound source position detection unit 80 therefore compares the powers of the output signals of the synchronous addition units 63-1 to 63-s, and outputs, as sound source position information, the angle corresponding to the output signal of maximum power as the direction of the sound source 65. The output signal of maximum power can also be output as the target sound emphasis signal.
[0086]
FIG. 19 is an explanatory diagram of the tenth embodiment of the present invention, in which 90 is a camera such as a television camera, 91-1 and 91-2 are microphones constituting a microphone array, 92 is a sound source position detection unit, 93 is a face position detection unit as a detection unit that detects the position of the sound source from the video, 94 is an integrated determination processing unit, and 95 is a sound source.
[0087]
The microphones 91-1 and 91-2 and the sound source position detection unit 92 are configured by applying any of the above-described embodiments, and the position information of the sound source 95 is input from the sound source position detection unit 92 to the integrated determination processing unit 94. Further, the speaker is imaged by the camera 90, such as a television camera or a digital camera, and the face position detection unit 93 detects the position of the speaker's face. For example, a method of detecting the position of a face by template matching using a face template, or a method of detecting the position of a face by extracting a skin color area based on a color video signal, can be applied. The integrated determination processing unit 94 determines the position of the sound source 95 based on the position information from the sound source position detection unit 92 and the position detection information from the face position detection unit 93, and outputs sound source position information.
[0088]
For example, the direction of the speaker (sound source) is divided into a plurality of angles θ0 to θs with respect to the arrangement line of the microphones 91-1 and 91-2 and the imaging direction of the camera 90. By sound source position detection that calculates the correlation coefficient value using the linear prediction errors of the output signals of the microphones 91-1 and 91-2, or by sound source position detection that uses the output signals of the microphones 91-1 and 91-2 and of estimation microphones on their arrangement line, position information inf-A(θ) indicating the probability of each sound source direction is obtained. Also, position information inf-V(θ) indicating the probability of each direction of the face of the speaker (sound source) is obtained using the video signal from the camera 90. Then, the integrated determination processing unit 94 calculates the product res(θ) of the respective position information inf-A(θ) and inf-V(θ), and outputs the angle θ at which the product res(θ) becomes maximum as the sound source position information. Therefore, the direction of the sound source 95 can be detected more accurately. Further, by detecting the direction of the sound source 95 and automatically controlling the zooming of the camera 90 and the like, an enlarged image of the sound source 95 can also be taken.
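The integrated determination can be sketched as follows; the candidate angles and likelihood values are hypothetical illustrations, not from the specification:

```python
def fuse_direction(angles, inf_a, inf_v):
    # res(theta) = inf-A(theta) * inf-V(theta); the angle maximizing
    # the product is output as the sound source direction.
    res = {t: inf_a[t] * inf_v[t] for t in angles}
    return max(res, key=res.get)

angles = [0, 30, 60, 90, 120]                            # degrees
inf_a = {0: 0.1, 30: 0.6, 60: 0.8, 90: 0.3, 120: 0.1}    # microphone array
inf_v = {0: 0.2, 30: 0.9, 60: 0.4, 90: 0.2, 120: 0.1}    # face detection
best = fuse_direction(angles, inf_a, inf_v)
# audio alone prefers 60, but the product selects 30, where both agree
```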
[0089]
The present invention is not limited to the above-described embodiments, and various additions and modifications can be made. The above-described embodiments can be applied to noise suppression, target sound emphasis, sound source position detection and the like, and can be combined. The target sound emphasis and the sound source position detection can be applied not only to the voice of a speaker or the like but also to detection of a sound source emitting other sound waves.
[0090]
As described above, according to the present invention, the output signals of the microphones 1-1 to 1-n constituting the microphone array, a noise source signal such as the drive signal of the speaker 6 or the output signal of an auxiliary microphone, and the residual signal output from the adder 3 are input to the filter coefficient calculation unit 4, which performs update control of the filter coefficients of the filters 2-1 to 2-n to which the output signals of the microphones 1-1 to 1-n are input. Even if the speech of a speaker as the target sound and the noise are simultaneously input to the microphones 1-1 to 1-n, the cross-correlation function value between the two effectively becomes small, so the influence of the speaker's voice as the target sound can be reduced, and the filter coefficient update control can be continued to perform noise suppression.
[0091]
Also, by connecting delay units to the front stages of the filters 2-1 to 2-n and adjusting the phase of the noise signal, update control of the filter coefficients of the filters 2-1 to 2-n becomes easy, so noise suppression becomes easy even when the speech of a speaker as the target sound and the noise are simultaneously input to the microphones 1-1 to 1-n.
[0092]
Also, the output signal of a microphone of the microphone array or the signal of the target sound source is input to linear prediction analysis, which updates the filter coefficients of the linear prediction filter to which the output signal of the microphone is input, and the sound source position is detected based on the output signals of the linear prediction filters. Since linear prediction analysis reduces the autocorrelation function value between neighboring samples of the speech signal, the position of the target sound source can be detected reliably even if the voice of the speaker of the target sound source and the voice from a noise source are simultaneously input to the microphones. Therefore, it is possible to emphasize the voice from the target sound source, or to suppress sounds other than the voice of the target sound source as noise.
[0093]
In addition, by performing synchronous addition including the output signals of estimation microphones set at intervals corresponding to the arrangement interval of the microphones constituting the microphone array, there is the advantage that target sound emphasis and target sound source position detection similar to those of a microphone array using a large number of microphones can be performed with a small number of microphones.
[0094]
Further, by integrally determining the detection of the sound source position by the microphone
array and the position detection by the imaging signal of the target sound source, the position of
the target sound source can be detected quickly and accurately.
[0095]
Brief description of the drawings
[0096]
1 is an explanatory view of a first embodiment of the present invention.
[0097]
2 is an explanatory view of the filter.
[0098]
3 is an explanatory view of the second embodiment of the present invention.
[0099]
4 is a processing flowchart of the delay calculation unit in the second embodiment of the present
invention.
[0100]
5 is an explanatory view of a third embodiment of the present invention.
[0101]
6 is an explanatory view of a fourth embodiment of the present invention.
[0102]
7 is an explanatory diagram of a low pass filter in the filter coefficient updating process of the
embodiment of the present invention.
[0103]
8 is an explanatory view of the embodiment of the present invention using a DSP.
[0104]
9 is an explanatory view of the processing function of the DSP of the embodiment of the present
invention.
[0105]
10 is an explanatory diagram of a delay unit.
[0106]
11 is an explanatory view of a fifth embodiment of the present invention.
[0107]
12 is a functional block diagram of the fifth embodiment of the present invention.
[0108]
13 is a diagram for explaining the relationship between the sound source position and imax.
[0109]
14 is an explanatory view of the sixth embodiment of the present invention.
[0110]
15 is an explanatory view of the seventh embodiment of the present invention.
[0111]
16 is a functional block diagram of a seventh embodiment of the present invention.
[0112]
17 is an explanatory view of the eighth embodiment of the present invention.
[0113]
18 is an explanatory view of a ninth embodiment of the present invention.
[0114]
19 is an explanatory view of a tenth embodiment of the present invention.
[0115]
20 is an explanatory diagram of an echo canceller of the conventional example.
[0116]
21 is an explanatory view of an echo canceller using the microphone array of the conventional
example.
[0117]
22 is an explanatory diagram of sound source position detection and target sound emphasis processing of the conventional example.
[0118]
Explanation of sign
[0119]
1-1 to 1-n: microphones
2-1 to 2-n: filters
3: adder
4: filter coefficient calculation unit
5: speaker (target sound source)
6: speaker (noise source)