DESCRIPTION JP2010028653

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
An object of the present invention is to achieve a desired amount of echo cancellation at low
calculation cost even when adaptive beamforming processing changes the acoustic directivity.
SOLUTION: An echo signal contained in the collected signals picked up by N sound collection
means is canceled, a noise signal contained in the collected signals is suppressed, and the result
is output as an output signal. For each sound collection means, the echo signal component in
the frequency band below a predetermined reference value of the collected signal is canceled
and a residual echo signal is output. The noise signal is suppressed from the N residual echo
signals and an adaptive beamforming signal is output. The echo signal is then canceled from the
adaptive beamforming signal to produce the output signal. [Selected figure] Figure 3
Echo cancellation apparatus, echo cancellation method, program therefor, recording medium
[0001]
The present invention relates to an echo cancellation apparatus that cancels an echo signal
included in collected signals picked up by a plurality of sound collection means, suppresses a
noise signal included in the collected signals, and outputs the result as an output signal, and to
an echo cancellation method, a program therefor, and a recording medium.
[0002]
In addition to conventional telephone conferences and video conferences, in recent years,
opportunities for using hands-free voice communication such as videophone calls and desktop
10-04-2019
1
conferences using a PC (personal computer) have increased.
In hands-free voice communication, the received voice of the other party is emitted from the
reproduction means (for example, a loudspeaker), while the voice of the near-end talker is picked
up by the sound collection means (for example, a microphone). This has the great advantage that
the talker's hands stay free to take minutes or operate the PC during the call.
[0003]
However, the other party's voice emitted from the loudspeaker not only reaches the listener but
also wraps around to the microphone and is picked up there. This acoustic echo degrades speech
quality and causes howling. Echo cancellation technology has therefore been introduced into
hands-free voice communication to cancel such echo signal components and prevent howling.
[0004]
There is also adaptive beamforming technology, which selectively receives and tracks the user's
voice among various environmental noises, suppresses the noise, and picks up the speaker's
voice with high accuracy. This technology is described in Patent Document 1, a sound collection
method and sound collection apparatus ("echo suppression by microphone array, directional AGC
microphone array"). It suppresses the noise signal from a noise source by controlling the
acoustic directivity with a plurality of microphones, and picks up the voice of the speaker with
high accuracy. In recent years, adaptive beamforming has also been incorporated into echo
cancellation apparatuses (see Non-Patent Document 1).
[0005]
FIG. 1 shows an example of the functional configuration of a conventional echo cancellation
apparatus 100 equipped with adaptive beamforming. The echo cancellation apparatus 100 is
connected to N (N is an integer of 1 or more) sound collection means 4n (for example,
microphones) and has N echo cancellation units 102n (n = 1, ..., N) and an adaptive beamforming
unit 104. In the following description, the reproduction signal emitted from the reproduction
means 2 is x(t), and the signal produced when the reproduction signal x(t) wraps around to the
sound collection means 4n is the echo signal a(t). Let b(t) be the noise signal from the noise
source 6 and c(t) be the speaker signal from the speaker 8. A signal consisting of the echo signal
a(t), the noise signal b(t), and the speaker signal c(t) is called a collected signal. The signal
collected by the sound collection means 4n is denoted yn(t), and its echo, noise, and speaker
components are denoted an(t), bn(t), and cn(t). t denotes discrete time, obtained by sampling the
continuous-time signal at a constant sample interval T. For example, the sampling frequency is
fs = 48 kHz (48,000 samples per second), where fs = 1 / T. Strictly speaking, the echo, noise, and
speaker signals are continuous-time signals until they reach the sound collection means, but
they are all written in discrete time t to simplify the explanation. The echo cancellation
apparatus 100, and the echo cancellation apparatuses 200 and 300 described below, handle
digital signals expressed in discrete time t. The input analog signals must therefore be converted
to digital signals by an AD converter, and the output digital signals must be converted to analog
signals by a DA converter. This is a matter of course, so the AD and DA converters are not shown
in FIGS. 1 to 3. When the noise source 6 and the speaker 8 are absent, the collected signal does
not include the noise signal or the speaker signal.
[0006]
Returning to FIG. 1, the collected signal yn(t) is input to the echo cancellation unit 102n. The
echo cancellation unit 102n cancels the echo signal an(t) from the collected signal yn(t). The
resulting residual echo signal en(t), from which the echo signal has been canceled, is input to
the adaptive beamforming unit 104.
[0007]
The adaptive beamforming unit 104 suppresses the noise signal from the N residual echo signals
en(t) (n = 1, ..., N) and outputs the result as the output signal z(t). The detailed processing of
the echo cancellation unit 102n and the adaptive beamforming unit 104 is described in the
following “Best Mode for Carrying Out the Invention”.
[0008]
In the configuration of the echo cancellation apparatus 100, the adaptive beamforming unit 104
is placed after the N echo cancellation units 102n. The echo path model of each echo
cancellation unit 102n therefore does not need to include the variation in acoustic directivity
caused by the adaptive beamforming of the adaptive beamforming unit 104.
[0009]
Next, FIG. 2 shows a functional configuration example of a conventional echo cancellation
apparatus 200 provided with adaptive beamforming. In the echo cancellation apparatus 200, the
adaptive beamforming unit 202 is placed in the front stage and the echo cancellation unit 204 in
the rear stage. With this configuration, when the adaptive beamforming unit 202 produces a
single output signal, the echo cancellation unit 204 needs to perform only one echo cancellation
process (adaptive filter process) on that output signal, so the configuration requires little
computation and little memory.
[Patent Document 1] Patent No. 4104626
[Non-Patent Document 1] Kellermann, W., "Strategies for combining acoustic echo cancellation and
adaptive beamforming microphone arrays," IEEE International Conference on Acoustics, Speech,
and Signal Processing, vol. 1, pp. 219-222, 21-24 Apr. 1997.
[0010]
With the configuration of the echo cancellation apparatus 100, as many echo cancellation units
102n, each with a high calculation cost, are needed as there are sound collection means, so the
total calculation cost is enormous. With the configuration of the echo cancellation apparatus
200, time-varying beamforming must be taken into account in the echo path model of the echo
cancellation unit 204. Whenever the adaptive beamforming unit 202 changes the acoustic
directivity, the echo cancellation unit 204 has to re-estimate the echo path model (adaptive
filter), and the desired echo cancellation amount may not be reached.
[0011]
An object of the present invention is to provide an echo cancellation apparatus that (1) reduces
the calculation cost and (2) shortens the time the echo cancellation unit needs to reach the
desired echo cancellation amount, so that the desired echo cancellation amount is reached even
when adaptive beamforming processing changes the acoustic directivity.
[0012]
The echo cancellation apparatus according to the present invention cancels an echo signal
included in the collected signals picked up by N sound collection means, suppresses a noise
signal included in the collected signals, and outputs the result as an output signal.
The apparatus has N low-band echo cancellation units, an adaptive beamforming unit, and an echo
cancellation unit. Each low-band echo cancellation unit cancels the echo signal component in the
frequency band below a predetermined reference value of the collected signal and outputs a
residual echo signal. The adaptive beamforming unit suppresses the noise signal from the N
residual echo signals and outputs an adaptive beamforming signal. The echo cancellation unit
cancels the echo signal from the adaptive beamforming signal and outputs the output signal.
[0013]
The present invention focuses on the fact that, because the power of a speaker's voice is
concentrated in the low frequency band (hereinafter simply "low band"), the desired echo
cancellation amount in the low band is larger than the desired echo cancellation amount in the
high frequency band (hereinafter simply "high band"). Since the N low-band echo cancellation
units cancel echoes only in the low band below a predetermined reference value of the input
collected signal, their calculation cost is lower than that of the N echo cancellation units 102n in
the echo cancellation apparatus 100. The problem of the echo cancellation apparatus 100 is
therefore solved.
[0014]
Further, the echo cancellation apparatus of the present invention is configured, from the front
stage, in the order of the N low-band echo cancellation units, the adaptive beamforming unit,
and the echo cancellation unit. Therefore, the echo path models of the low-band echo
cancellation units do not need to be re-estimated when the adaptive beamforming unit changes
the acoustic directivity. The echo cancellation unit cancels echo signal components from a signal
(the adaptive beamforming signal) from which the low-band echo cancellation units have already
removed most of the echo signal components. The power of the residual echo signal handled in
the echo cancellation unit is therefore small from the beginning, and as a result the time
required to reach the desired echo cancellation amount of the echo cancellation unit can be
shortened. Consequently, even if the adaptive beamforming unit changes the acoustic
directivity, the echo cancellation unit can stably achieve the desired echo cancellation amount.
Note that Non-Patent Document 1 does not describe the processing procedures of the echo
cancellation technology and the adaptive beamforming technology.
[0015]
The best mode for carrying out the invention is described below. Components that have the same
function and steps that perform the same processing are given the same reference numerals, and
duplicate descriptions are omitted.
[0016]
FIG. 3 shows an example of the functional configuration of the echo cancellation apparatus 300,
and FIG. 4 shows the processing flow. The echo cancellation apparatus 300 includes N low-band
echo cancellation units 302n, an adaptive beamforming unit 104, and an echo cancellation unit
304. FIG. 5 shows an example of the functional configuration of the low-band echo cancellation
unit 302n, FIG. 6 shows an example of the functional configuration of the adaptive beamforming
unit 104, and FIG. 7 shows an example of the functional configuration of the echo cancellation
unit 304. The constituent means of all the low-band echo cancellation units 302n (n = 1, ..., N)
are the same, so the subscript “n” is omitted from the reference numerals of the constituent
means shown in FIG. 5. The echo cancellation unit 304 is the same as the echo cancellation unit
204 shown in FIG. 2.
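The three-stage order of FIG. 3 can be sketched as a simple function composition. This is only an illustration of the data flow, not the patented implementation: the function name `echo_cancel_pipeline` and the toy stand-ins `toy_ec` and `toy_average` are hypothetical, and the real stages are adaptive filters and an adaptive beamformer.

```python
from typing import Callable, List, Sequence

Signal = List[float]


def echo_cancel_pipeline(
    mic_signals: Sequence[Signal],
    playback: Signal,
    low_band_ec: Callable[[Signal, Signal], Signal],
    beamformer: Callable[[Sequence[Signal]], Signal],
    full_band_ec: Callable[[Signal, Signal], Signal],
) -> Signal:
    """Data flow of FIG. 3: per-microphone low-band EC, beamforming, one final EC."""
    # Step S102: one low-band echo cancellation unit per collected signal y_n(t).
    residuals = [low_band_ec(y_n, playback) for y_n in mic_signals]
    # Step S104: the beamformer merges the N residual echo signals e_n(t) into s(t).
    s = beamformer(residuals)
    # Step S106: a single echo cancellation unit removes what is left in s(t).
    return full_band_ec(s, playback)


# Hypothetical stand-ins, used only to exercise the data flow.
def toy_ec(y: Signal, x: Signal) -> Signal:
    """Pretend the echo is exactly 0.5 * x(t) and subtract it."""
    return [yi - 0.5 * xi for yi, xi in zip(y, x)]


def toy_average(signals: Sequence[Signal]) -> Signal:
    """Fixed (non-adaptive) averaging in place of adaptive beamforming."""
    n = len(signals)
    return [sum(column) / n for column in zip(*signals)]


z = echo_cancel_pipeline([[1.0, 2.0], [3.0, 4.0]], [2.0, 2.0],
                         toy_ec, toy_average, toy_ec)
```

The point of the ordering is that only the last stage sees the time-varying beamformer, while the N front-end cancellers each see a fixed microphone path.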
[0017]
As shown in FIG. 5, the low-band echo cancellation unit 302 of this embodiment includes low-pass
filter means 3014, downsampling means 3016, adaptive filter means 3018, band dividing means
3012, subtraction means 3020, and band combining means 3022. The band dividing means 3012
consists of low-pass filter means 30122, high-pass filter means 30124, and two downsampling
means 30126 and 30128. As shown in FIG. 6, the adaptive beamforming unit 104 of this
embodiment includes frequency domain conversion means 1042, adaptive beamforming means
1044, time domain conversion means 1046, and impulse response storage means 1050 (or
steering vector storage means 1048). As shown in FIG. 7, the echo cancellation unit 304 of this
embodiment consists of adaptive filter means 3048 and subtraction means 3051.
[0018]
[Processing of Low-Band Echo Cancellation Unit 302n (Step S102)] First, the collected signal
yn(t) is input to the low-band echo cancellation unit 302n. The low-band echo cancellation unit
302n cancels the component of the echo signal an(t) in the frequency band below the
predetermined reference value α of the collected signal yn(t) and outputs the residual echo
signal en(t). The reference value α is described first. The present invention focuses on the fact
that the desired echo cancellation amount in the low band is larger than the desired echo
cancellation amount in the high band, because the power of a speaker's voice is concentrated in
the low band. For example, when the entire frequency band is 24 kHz, the power of the speaker's
voice, and hence of the reproduction signal, is often concentrated below 12 kHz. Canceling the
echo signal components in the band below 12 kHz therefore removes most of the echo signal
components. The reference value α is the upper frequency limit of the band in which the signal
that is the source of the echo signal (for example, the speaker's voice) is concentrated. α is
determined in advance; in the above example, α = 12 kHz. As described above, the low-band echo
cancellation unit 302n performs echo cancellation only on the low band of the collected signal
yn(t) in order to reduce the calculation cost. The processing of the low-band echo cancellation
unit 302 is described in detail below with reference to FIG. 5. For simplicity, the collected signal
yn(t) input to the low-band echo cancellation unit 302 is written “y(t)”.
[0019]
The collected signal y(t) picked up by the sound collection means is input to the low-pass filter
means 30122 and the high-pass filter means 30124 in the band dividing means 3012. The low-pass
filter means 30122 convolves the collected signal y(t) with the coefficients of a low-pass filter
(in the above example, one that passes only signals below 12 kHz) and outputs the low-band
collected signal yL(t).
[0020]
On the other hand, the high-pass filter means 30124 convolves the collected signal y(t) with the
coefficients of a high-pass filter (in the above example, one that passes only signals of 12 kHz
or more) and outputs the high-band collected signal yH(t). The low-pass filter means 30122 and
the low-pass filter means 3014 preferably use the same reference value α for the passband.
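The band division can be sketched as two FIR convolutions. The patent does not specify a filter design, so everything about the design below is an assumption made for illustration: a 31-tap Hamming-windowed sinc low-pass filter, a complementary high-pass filter obtained by spectral inversion, and the function names `lowpass_taps`, `highpass_taps`, `convolve`, and `band_split`. Only the cutoff α = 12 kHz and fs = 48 kHz come from the example in the text.

```python
import math


def lowpass_taps(cutoff_hz, fs_hz, num_taps=31):
    """Hamming-windowed sinc low-pass FIR taps (num_taps must be odd)."""
    fc = cutoff_hz / fs_hz  # normalised cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for k in range(num_taps):
        m = k - mid
        ideal = 2 * fc if m == 0 else math.sin(2 * math.pi * fc * m) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [tap / total for tap in taps]  # normalise to unity gain at DC


def highpass_taps(lp):
    """Complementary high-pass filter: a centred unit impulse minus the low-pass."""
    hp = [-tap for tap in lp]
    hp[(len(lp) - 1) // 2] += 1.0
    return hp


def convolve(x, h):
    """Causal FIR filtering; the output has the same length as x."""
    return [sum(h[s] * x[t - s] for s in range(len(h)) if t - s >= 0)
            for t in range(len(x))]


def band_split(y, cutoff_hz=12_000.0, fs_hz=48_000.0):
    """Return (yL, yH): the low-band and high-band parts of y."""
    lp = lowpass_taps(cutoff_hz, fs_hz)
    return convolve(y, lp), convolve(y, highpass_taps(lp))
```

With this complementary pair, yL(t) + yH(t) equals y(t) delayed by 15 samples, which makes the later band synthesis straightforward to reason about.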
[0021]
The low-band collected signal yL(t) and the high-band collected signal yH(t) are input to the
downsampling means 30126 and 30128, respectively. The downsampling means 30126 downsamples
the low-band collected signal yL(t) by, for example, discarding M−1 out of every M samples. The
decimation factor M is a positive real number of 1 or more, for example 2. The downsampled
signal is written yL(Mt). Similarly, the downsampling means 30128 downsamples the high-band
collected signal yH(t) by, for example, discarding M−1 out of every M samples, and the
downsampled signal is written yH(Mt). The low-band collected signal yL(Mt) is input to the
subtraction means 3020, and the high-band collected signal yH(Mt) is input to the band
combining means 3022.
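For an integer decimation factor, the thinning described above amounts to keeping every M-th sample. The sketch below assumes an integer M (the rational-ratio case is not covered), and the function name `downsample` is hypothetical.

```python
def downsample(signal, m):
    """Keep every m-th sample, i.e. discard m-1 out of every m samples."""
    if not isinstance(m, int) or m < 1:
        raise ValueError("this sketch only handles integer factors m >= 1")
    return signal[::m]


y_low = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
y_low_decimated = downsample(y_low, 2)  # plays the role of yL(Mt) with M = 2
```

With M = 2 and fs = 48 kHz, each band runs at 24 kHz after decimation, which halves the number of adaptive filter updates per second.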
[0022]
Meanwhile, the reproduction signal x(t) is input to the reproduction means 2 and to the low-pass
filter means 3014. The low-pass filter means 3014 convolves the reproduction signal x(t) with the
coefficients of a low-pass filter (in the above example, one that passes only signals below
12 kHz) and outputs the low-band component of the reproduction signal x(t) (hereinafter
"low-band reproduction signal xL(t)"). The low-band reproduction signal xL(t) is input to the
downsampling means 3016, which discards M−1 out of every M samples of xL(t). The downsampled
signal xL(Mt) is input to the adaptive filter means 3018.
[0023]
Here, the downsampling performed by the downsampling means 3016, 30126, and 30128 need only
lower the sampling frequency below the original one. Downsampling by a rational ratio is
described in detail in, for example, the up-down sampling apparatus and its program of Japanese
Patent Application No. 2007-262986. The downsampling means 3016, 30126, and 30128 may also be
omitted.
[0024]
The adaptive filter means 3018 convolves the low-band reproduction signal xL(Mt) with the
adaptive filter coefficients hL(Mt) and outputs the low-band pseudo echo signal dL(Mt). The
convolution is given by the following equation, where S is the number of taps.
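The equation itself did not survive extraction. A reconstruction consistent with the vectors XL(Mt) and HL(Mt) defined in paragraph [0025] would be the following; it is an inference from those definitions, not the patent's verbatim formula.

```latex
d_L(Mt) = \sum_{s=0}^{S-1} h_{L,s}(Mt)\, x_L\bigl(M(t-s)\bigr)
        = H_L^{T}(Mt)\, X_L(Mt)
```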
[0025]
The subtraction means 3020 takes the difference between the low-band collected signal yL(Mt)
and the low-band pseudo echo signal dL(Mt) and outputs the low-band residual echo signal
eL(Mt). In the example of FIG. 5, dL(Mt) is subtracted from yL(Mt). The low-band residual echo
signal eL(Mt) is input to the band combining means 3022 and the adaptive filter means 3018. The
coefficients of the adaptive filter in the adaptive filter means 3018 are updated using the
low-band reproduction signal xL(Mt) and the past low-band residual echo signal eL(Mt). Various
algorithms are known for this update, such as the learning identification algorithm (NLMS:
Normalized Least-Mean-Squares) and the exponential weighting algorithm; the learning
identification algorithm is briefly described here. With the learning identification algorithm,
the adaptive filter coefficient vector HL(Mt) is updated according to the following equation
using the vector XL(Mt) of the low-band reproduction signal xL(Mt), where
XL(Mt) = [xL(Mt), xL(M(t−1)), ..., xL(M(t−S+1))],
HL(Mt) = [hL,0(Mt), hL,1(Mt), ..., hL,S−1(Mt)],
β is a scalar step size in the range 0 < β < 2, and δ is a small constant that prevents the
denominator from becoming zero. <End of the description of the learning identification
algorithm>
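The update equation is missing from the extracted text. The standard NLMS update, which matches the symbols defined above (step size β, regulariser δ, tap vector HL, input vector XL), adds β · eL · XL / (XLᵀXL + δ) to the coefficient vector at each step. The sketch below implements that textbook form; the function names `nlms_step` and `run_nlms`, the default β = 0.5, and the three-tap echo path in the demo are illustrative assumptions rather than values from the patent.

```python
import random


def nlms_step(h, x_vec, y, beta=0.5, delta=1e-8):
    """One NLMS iteration. Returns (updated coefficients, residual e)."""
    d = sum(hs * xs for hs, xs in zip(h, x_vec))     # pseudo echo d(t)
    e = y - d                                        # residual echo e(t)
    norm = sum(xs * xs for xs in x_vec) + delta      # X^T X + delta
    h_new = [hs + beta * e * xs / norm for hs, xs in zip(h, x_vec)]
    return h_new, e


def run_nlms(x, y, num_taps, beta=0.5, delta=1e-8):
    """Run NLMS over whole signals, starting from all-zero coefficients."""
    h = [0.0] * num_taps
    residuals = []
    for t in range(len(x)):
        # X(t) = [x(t), x(t-1), ..., x(t-S+1)], zero-padded before the start.
        x_vec = [x[t - s] if t - s >= 0 else 0.0 for s in range(num_taps)]
        h, e = nlms_step(h, x_vec, y[t], beta, delta)
        residuals.append(e)
    return h, residuals


# Demo with a made-up echo path and no noise or near-end speech.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_path = [0.6, -0.3, 0.1]
y = [sum(true_path[s] * x[t - s] for s in range(3) if t - s >= 0)
     for t in range(len(x))]
h_est, res = run_nlms(x, y, num_taps=3)
```

In this noise-free demo the estimated taps converge to the made-up echo path; with near-end speech or noise present, a smaller β trades convergence speed for a lower steady-state error.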
[0026]
The band combining means 3022 performs band synthesis on the low-band residual echo signal
eL(Mt) and the high-band collected signal yH(Mt) and outputs the residual echo signal e(t). As a
specific example, M−1 zeros are inserted between the samples of eL(Mt) and of yH(Mt), each
zero-inserted signal is convolved with a filter that removes the imaging caused by upsampling
(the spectrum folded about the sampling frequency before upsampling), and the two filter
outputs are added to give the residual echo signal e(t). The above processing is performed in all
the low-band echo cancellation units 302n (n = 1, ..., N), and each low-band echo cancellation
unit 302n outputs its residual echo signal en(t).
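The band synthesis steps (zero insertion, anti-imaging filtering, addition) can be sketched as follows. The function names `upsample`, `fir`, and `band_synthesize` are hypothetical, the anti-imaging taps are passed in by the caller because the patent does not specify them, and scaling the taps by M to compensate for the inserted zeros is a common convention assumed here.

```python
def upsample(signal, m):
    """Insert m-1 zeros after every sample."""
    out = []
    for value in signal:
        out.append(value)
        out.extend([0.0] * (m - 1))
    return out


def fir(x, h):
    """Causal FIR filtering; the output has the same length as x."""
    return [sum(h[s] * x[t - s] for s in range(len(h)) if t - s >= 0)
            for t in range(len(x))]


def band_synthesize(e_low, y_high, m, lp_taps, hp_taps):
    """Upsample both bands, remove imaging in each, and add the results."""
    # The gain m restores the amplitude lost by inserting zeros.
    low = fir(upsample(e_low, m), [m * tap for tap in lp_taps])
    high = fir(upsample(y_high, m), [m * tap for tap in hp_taps])
    return [a + b for a, b in zip(low, high)]


# Toy usage: a 2-tap averaging filter stands in for the low-band anti-imaging filter.
e = band_synthesize([1.0, 2.0], [0.0, 0.0], 2, [0.5, 0.5], [1.0])
```

In a real design the low-band branch would use a low-pass anti-imaging filter and the high-band branch a high-pass one, each selecting the spectral image that corresponds to its original band.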
[0027]
[Processing of Adaptive Beamforming Unit 104 (Step S104)] All N residual echo signals en(t)
(n = 1, ..., N) are input to the adaptive beamforming unit 104. The adaptive beamforming unit 104
suppresses the noise signal from the N residual echo signals en(t) and outputs an adaptive
beamforming signal s(t). A specific example is described below with reference to FIG. 6. Details
of the adaptive beamforming unit 104 described below are given in, for example, Japanese
Unexamined Patent Publication No. 2008-60635.
[0028]
The N residual echo signals en(t) are input to the frequency domain conversion means 1042. The
frequency domain conversion means 1042 converts each residual echo signal en(t), which is a
time-domain signal, into a frequency-domain signal En(f, τ). Here, f denotes frequency and τ
denotes an arbitrary time. A known technique such as the Fourier transform may be used for the
frequency domain conversion. All En(f, τ) (n = 1, ..., N) are input to the adaptive beamforming
means 1044.
[0029]
The adaptive beamforming means 1044 is realized by estimating filters
$W(f) = [W_1(f), \ldots, W_N(f)]^{T}$ that enhance the speaker signal c(t), which is the target
signal, and suppress the noise signal b(t), which is an unwanted signal, as much as possible.
When designing the adaptive beamforming means 1044, it is assumed that the impulse response
vector g(f) from the speaker to each sound collection means, or the steering vector
$a(f) = [\exp(-i 2\pi f \tau_1), \ldots, \exp(-i 2\pi f \tau_N)]^{T}$ that approximates g(f), is
known. Here, $\tau_n$ is the difference between the time at which the speaker signal (target
signal) from the speaker 8 reaches the sound collection means 4n and the time at which it reaches
the origin 0. As shown in FIG. 8, the sound collection means 4n are often arranged in a straight
line. When the direction of the speaker 8 is θ and $d_n$ is the coordinate of the sound
collection means 4n with the sound collection means 41 as the reference (origin), $\tau_n$ is
given by $\tau_n = d_n \cos\theta / c$, where c is the propagation speed of the signal.
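The steering vector for a linear array follows directly from τn = dn cos θ / c. The sketch below evaluates it for given microphone coordinates; the function name `steering_vector`, the example coordinates, and the default c = 340 m/s (an approximate speed of sound in air) are assumptions for illustration.

```python
import cmath
import math


def steering_vector(freq_hz, positions_m, theta_rad, c=340.0):
    """a(f) = [exp(-i 2 pi f tau_1), ..., exp(-i 2 pi f tau_N)] for a linear array."""
    # tau_n = d_n * cos(theta) / c, with microphone 1 at the origin (d_1 = 0).
    delays = [d * math.cos(theta_rad) / c for d in positions_m]
    return [cmath.exp(-2j * math.pi * freq_hz * tau) for tau in delays]


# Speaker along the array axis (theta = 0), two microphones 0.17 m apart, f = 1 kHz:
# the delay is 0.5 ms, i.e. half a period, so the second element is close to -1.
a_endfire = steering_vector(1000.0, [0.0, 0.17], 0.0)
```

For a speaker at broadside (θ = π/2) all delays vanish and every element of a(f) is 1, which is why a plain sum of the microphone signals already favours that direction.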
[0030]
Returning to FIG. 6, the adaptive beamforming means 1044 suppresses unwanted signals by
finding the adaptive filter group (vector) $W(f) = [W_1(f), \ldots, W_N(f)]^{T}$ that minimizes
the output power A(W(f)) given by the following equation.
$$A(W(f)) = V\{|S(f,\tau)|^{2}\} = V\{S(f,\tau)\, S^{*}(f,\tau)\} = V\{W^{T}(f)\, E(f,\tau)\, E^{T}(f,\tau)\, W(f)\} = W^{T}(f)\, R_E(f)\, W(f) \qquad (1)$$
[0031]
Here, $V\{\cdot\}$ is the averaging operation over time τ, $A^{*}$ is the complex conjugate of
A, and $R_E(f) = V\{E(f,\tau)\, E^{T}(f,\tau)\}$ is the correlation matrix of the input signals.
$S(f,\tau)$ is the output of the adaptive beamforming means 1044 and is expressed by the
following equation.
$$S(f,\tau) = W^{T}(f)\, E(f,\tau) \qquad (2)$$
To avoid the meaningless solution $W(f) = 0 = [0, \ldots, 0]^{T}$, the following constraint, which
requires that the target signal be obtained without distortion, is imposed.
$$W^{T}(f)\, g(f) = 1 \qquad (3)$$
[0032]
Thus, the problem of finding the W(f) that satisfies equation (3) and minimizes A(W(f)) in
equation (1) can be expressed, using the Lagrange undetermined multiplier p, by the following
equation (4).
$$A'(W(f)) = A(W(f)) + p\,\bigl(W^{T}(f)\, g(f) - 1\bigr) \qquad (4)$$
Solving equation (4) gives the adaptive filter group (vector) W(f) as in the following
equation (5).
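Equation (5) is missing from the extracted text. Under the transpose notation of equations (1) to (4), and assuming $R_E(f)$ is symmetric and invertible, setting the gradient of (4) to zero and applying constraint (3) gives the reconstruction below. It is derived here from the surrounding equations rather than copied from the patent; in conventions that use conjugate transposes, the same result appears with $g^{H}$ in place of $g^{T}$.

```latex
\frac{\partial A'}{\partial W} = 2\,R_E(f)\,W(f) + p\,g(f) = 0
\;\;\Rightarrow\;\;
W(f) = -\frac{p}{2}\,R_E^{-1}(f)\,g(f)

W^{T}(f)\,g(f) = 1
\;\;\Rightarrow\;\;
-\frac{p}{2} = \frac{1}{g^{T}(f)\,R_E^{-1}(f)\,g(f)}

W(f) = \frac{R_E^{-1}(f)\,g(f)}{g^{T}(f)\,R_E^{-1}(f)\,g(f)} \qquad (5)
```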
[0033]
The frequency-domain adaptive beamforming signal S(f, τ) output from the adaptive beamforming
means 1044 is input to the time domain conversion means 1046. The time domain conversion means
1046 converts S(f, τ) into the time-domain adaptive beamforming signal s(t) using, for example,
the inverse Fourier transform, which is a known technique, and outputs it.
[0034]
[Processing of Echo Cancellation Unit 304 (Step S106)] The adaptive beamforming signal s(t) is
input to the echo cancellation unit 304. The echo cancellation unit 304 cancels the echo signal
from the adaptive beamforming signal s(t) and outputs the result as the output signal z(t).
Although the processing of the echo cancellation unit 304 is known, it is described here with
reference to FIG. 7.
[0035]
The reproduction signal x(t) is input to the adaptive filter means 3048. The adaptive filter means
3048 convolves the reproduction signal x(t) with the adaptive filter coefficients h(t) and outputs
the pseudo echo signal d(t), as in the following equation.
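The equation is missing from the extracted text. By analogy with the low-band adaptive filter means 3018, and writing S for the number of taps and $h_s(t)$ for the s-th coefficient (both notational assumptions here), a consistent reconstruction is:

```latex
d(t) = \sum_{s=0}^{S-1} h_{s}(t)\, x(t-s)
```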
[0036]
Then, the subtraction means 3051 subtracts the pseudo echo signal d(t) from the adaptive
beamforming signal s(t) and outputs the residual echo signal e(t). The coefficient vector of the
adaptive filter in the adaptive filter means 3048 is updated from the past residual echo signal
and the reproduction signal using the learning identification algorithm described above or a
similar method.
[0037]
Here, the echo signal components in the low band of the collected signals have already been
canceled by the N low-band echo cancellation units 302n. That is, most of the echo signal
components in the adaptive beamforming signal have been canceled, so the power of the residual
echo signal e(t) is much smaller. Therefore, the time needed to reach the desired echo
cancellation amount can be shortened, and the desired echo cancellation amount can be reached
even if the acoustic directivity of the adaptive beamforming unit 104 fluctuates.
[0038]
The residual echo signal e(t) is then output as the output signal z(t). When the speaker signal
c(t) is present, the output signal z(t) consists of the residual echo signal e(t) and the speaker
signal c(t).
[0039]
The echo cancellation unit 304 of the first embodiment cancels the echo signal component over
the entire frequency band of the input signal. However, since the low-band echo signal
components have already been canceled by the N low-band echo cancellation units 302n, the echo
cancellation unit may instead cancel the echo signal only in the frequency band of the adaptive
beamforming signal above the reference value α (α = 12 kHz in the above example). This
significantly reduces the calculation cost of the echo cancellation unit. In this case the echo
cancellation unit is given reference numeral 306, and a functional configuration example of the
echo cancellation unit 306 is shown in the drawings. The echo cancellation unit 306 has a
configuration in which the high-pass filter means and the low-pass filter means of the low-band
echo cancellation unit 302 (see FIG. 5) are interchanged. The processing of the echo
cancellation unit 306 is briefly described below.
[0040]
First, the adaptive beamforming signal s(t) is input to the high-pass filter means 30622 and the
low-pass filter means 30624 in the band dividing means 3062. The high-pass filter means 30622
convolves s(t) with the coefficients of a high-pass filter (in the above example, one that passes
only signals of 12 kHz (= reference value α) or more) and outputs the high-band adaptive
beamforming signal sH(t). The low-pass filter means 30624 convolves s(t) with the coefficients of
a low-pass filter (in the above example, one that passes only signals below 12 kHz) and outputs
the low-band adaptive beamforming signal sL(t).
[0041]
The high-band adaptive beamforming signal sH(t) and the low-band adaptive beamforming signal
sL(t) are input to the downsampling means 30626 and 30628, respectively, which downsample
them. The downsampled high-band adaptive beamforming signal sH(Mt) is input to the subtraction
means 3070, and the downsampled low-band adaptive beamforming signal sL(Mt) is input to the
band combining means 3072.
[0042]
Meanwhile, the high-pass filter means 3064 convolves the reproduction signal x(t) with the
coefficients of a high-pass filter (in the above example, one that passes only signals of 12 kHz
(= reference value α) or more) and outputs the high-band reproduction signal xH(t). The
high-band reproduction signal xH(t) is input to the downsampling means 3066, which downsamples
it. The downsampled signal xH(Mt) is input to the adaptive filter means 3068. The downsampling
means 3066, 30626, and 30628 may be omitted.
[0043]
The adaptive filter means 3068 convolves the high-band reproduction signal xH(Mt) with the
adaptive filter and outputs the high-band pseudo echo signal dH(Mt). The subtraction means 3070
subtracts the high-band pseudo echo signal dH(Mt) from the high-band adaptive beamforming signal
sH(Mt) and outputs the high-band residual echo signal eH(Mt). The high-band residual echo signal
eH(Mt) is input to the band combining means 3072 and the adaptive filter means 3068. The
adaptive filter coefficients of the adaptive filter means 3068 are updated from the high-band
reproduction signal xH(Mt) and the past high-band residual echo signal eH(Mt) using the learning
identification algorithm described above or a similar method.
[0044]
The band combining means 3072 performs upsampling and band synthesis on the high-band
residual echo signal eH(Mt) and the low-band adaptive beamforming signal sL(Mt) and outputs the
residual echo signal e(t). The details are the same as for the band combining means 3022 and are
therefore omitted. The residual echo signal e(t) is output from the band combining means 3072
as the output signal z(t). <End of the description of the processing of the echo cancellation
unit 306>
[0045]
Although the echo cancellation apparatus 300 has been described for the case of a single
reproduction means 2, the present invention can also be applied when there are two or more
reproduction means. Also, an example was described in which the low-band echo cancellation
units 302n and the echo cancellation unit 304 operate in the time domain and the adaptive
beamforming unit 104 operates in the frequency domain. However, the processing of the low-band
echo cancellation units 302n and the echo cancellation unit 304 may instead be performed in the
frequency domain, and the processing of the adaptive beamforming unit 104 may be performed in
the time domain.
[0046]
The present invention focuses on the fact that, because the power of a speaker's voice is
concentrated in the low band, the desired echo cancellation amount in the low band is larger
than that in the high band. The echo cancellation apparatus according to the present embodiment
therefore performs processing in the order “low-band echo cancellation by the low-pass echo
cancellation unit 302 n” → “adaptive beamforming by the adaptive beamforming unit 104” →
“echo cancellation by the echo cancellation unit 304”. First, the low-pass echo cancellation
unit 302 n cancels only the echo signal component contained in the low band of the collected
sound signal. The calculation cost can therefore be made lower than that of the echo
cancellation apparatus 100. In addition, the low-band echo cancellation is performed before the
adaptive beamforming. The echo path model of the low-pass echo cancellation unit therefore does
not need to be re-estimated when the adaptive beamforming unit 104 changes the acoustic
directivity. Moreover, most of the echo signal component contained in the collected sound signal
is removed by the low-band echo cancellation. As a result, the power of the residual echo signal
e (t) from the subtraction means 3051 in the echo cancellation unit 304 is small, which shortens
the time needed for the echo cancellation unit to reach the desired echo cancellation amount.
Consequently, even when the adaptive beamforming unit 104 changes the acoustic directivity, the
desired echo cancellation amount can be reached.
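The processing order described in this paragraph can be sketched as a simple data flow. The following Python sketch is only an illustration under stated assumptions: the function name process_block and the three callables passed to it are hypothetical placeholders for the low-pass echo cancellation units 302 n, the adaptive beamforming unit 104, and the echo cancellation unit 304. Their internal processing (band splitting, adaptive filtering, beamforming) is not implemented here.

```python
from typing import Callable, List, Sequence

Signal = List[float]


def process_block(
    mic_signals: Sequence[Signal],
    reference: Signal,
    low_band_cancellers: Sequence[Callable[[Signal, Signal], Signal]],
    beamformer: Callable[[Sequence[Signal]], Signal],
    post_canceller: Callable[[Signal, Signal], Signal],
) -> Signal:
    """Apply the three stages in the order given in the text (hypothetical interface)."""
    if len(mic_signals) != len(low_band_cancellers):
        raise ValueError("one low-band canceller is required per microphone")
    # Stage 1: low-band echo cancellation, one canceller per microphone (N residual signals).
    residuals = [cancel(mic, reference)
                 for cancel, mic in zip(low_band_cancellers, mic_signals)]
    # Stage 2: adaptive beamforming combines the N residual signals into one signal.
    beam = beamformer(residuals)
    # Stage 3: a single echo canceller removes the echo remaining after beamforming.
    return post_canceller(beam, reference)
```

Because stage 1 runs on each microphone signal before any beamforming, its echo path models see a fixed acoustic path. Only the single canceller in stage 3 is exposed to changes in the beamformer's directivity.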
[0047]
Next, experimental results showing that the echo cancellation apparatus 300 of the first
embodiment is superior to the conventional echo cancellation apparatus 100 are described. With
N = 4 microphones, an overall frequency band of 24 kHz, and a frequency band reference value of
α = 12 kHz (that is, a low band below 12 kHz and a high band of 12 to 24 kHz), the calculation
cost of the echo cancellation processing of the echo cancellation apparatus 300 is about 40%
lower than that of the echo cancellation apparatus 100.
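To give a feel for why restricting the per-microphone cancellers to the low band saves computation, the following Python sketch evaluates a rough cost model. Everything in it is an assumption for illustration and is not taken from the specification. It assumes that the per-second cost of an NLMS-type adaptive filter grows with the square of the processed bandwidth, since both the sample rate and the number of taps needed for a fixed echo-path duration scale with bandwidth. It also assumes that the baseline uses one full-band canceller per microphone, and it ignores the cost of band splitting, band synthesis, and beamforming. It is therefore not expected to reproduce the reported figure of about 40%.

```python
def relative_filter_cost(num_filters, band_fraction):
    """Cost of num_filters adaptive filters, each processing band_fraction of the full band,
    relative to one full-band filter (assumed model: cost grows with bandwidth squared)."""
    return num_filters * band_fraction ** 2


n_mics = 4                     # N = 4 microphones, as in the text
low_band_fraction = 12 / 24    # reference value alpha = 12 kHz out of a 24 kHz band

# Assumed baseline: one full-band echo canceller per microphone.
baseline_cost = relative_filter_cost(n_mics, 1.0)

# Sketch of the described structure: N low-band cancellers plus one full-band canceller.
proposed_cost = (relative_filter_cost(n_mics, low_band_fraction)
                 + relative_filter_cost(1, 1.0))

saving = 1 - proposed_cost / baseline_cost
print(f"modelled saving in adaptive-filter cost: {saving:.0%}")
```

Under these assumptions the adaptive-filter cost alone is halved. The processing that the model leaves out adds cost to the described structure, so a measured saving would be smaller than this idealized value.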
[0048]
<Hardware Configuration> The present invention is not limited to the above-described
embodiment. The various processes described above need not be executed chronologically in the
order described; they may also be executed in parallel or individually, depending on the
processing capability of the apparatus executing them or as needed. It goes without saying that
other modifications can be made as appropriate without departing from the spirit of the present
invention. When the above configuration is implemented by a computer, the processing content of
the functions that the echo cancellation apparatus 300 should have is described by a program.
The processing functions are then realized on the computer by executing this program on the
computer.
[0049]
The program describing the processing content can be recorded on a computer-readable recording
medium. The computer-readable recording medium may be any medium, such as a magnetic recording
device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.
Specifically, for example, a hard disk device, a flexible disk, a magnetic tape, or the like can
be used as the magnetic recording device; a DVD (Digital Versatile Disc), a DVD-RAM (Random
Access Memory), a CD-ROM (Compact Disc Read Only Memory), a CD-R (Recordable) / RW (ReWritable),
or the like as the optical disc; an MO (Magneto-Optical disc) or the like as the magneto-optical
recording medium; and an EEP-ROM (Electronically Erasable and Programmable-Read Only Memory) or
the like as the semiconductor memory.
[0050]
Further, this program is distributed, for example, by selling, transferring, lending, etc. a portable
recording medium such as a DVD, a CD-ROM or the like in which the program is recorded.
Furthermore, this program may be stored in a storage device of a server computer, and the
program may be distributed by transferring the program from the server computer to another
computer via a network.
[0051]
For example, a computer that executes such a program first temporarily stores, in its own
storage device, the program recorded on the portable recording medium or the program transferred
from the server computer. Then, when executing the process, the computer reads the program
stored in its own recording medium and executes the process according to the read program. As
another execution form of this program, the computer may read the program directly from the
portable recording medium and execute processing according to the program. Furthermore, each
time the program is transferred from the server computer to this computer, the computer may
sequentially execute processing according to the received program. The above-described
processing may also be executed by a so-called ASP (Application Service Provider) type service,
which realizes the processing functions only by issuing execution instructions and acquiring
results, without transferring the program from the server computer to the computer. Note that
the program in the present embodiment includes information that is provided for processing by a
computer and is equivalent to a program (such as data that is not a direct command to the
computer but has a property that defines the processing of the computer).
[0052]
Further, in this embodiment, although the present apparatus is configured by executing a
predetermined program on a computer, at least a part of the processing contents may be realized
as hardware. The echo cancellation apparatus 300 described in the present embodiment includes a
central processing unit (CPU), an input unit, an output unit, an auxiliary storage device, a
random access memory (RAM), a read only memory (ROM), and a bus (none of which are shown). The
CPU executes various arithmetic processing in accordance with the various programs that are
read. The auxiliary storage device is, for example, a hard disk, a magneto-optical disc (MO), a
semiconductor memory, or the like, and the RAM is a static random access memory (SRAM) or a
dynamic random access memory (DRAM). The bus communicably connects the CPU, the input unit, the
output unit, the auxiliary storage device, the RAM, and the ROM.
[0053]
<Collaboration between Hardware and Software> The echo cancellation apparatus of the present
embodiment is constructed by reading a predetermined program into the hardware described above
and having the CPU execute it. The functional configuration of each device constructed in this
way is described below. The low-pass echo cancellation unit 302 n, the adaptive beamforming unit
104, and the echo cancellation unit 304 of the echo cancellation apparatus 300 are arithmetic
units constructed by the CPU reading and executing a predetermined program. A storage unit (not
shown) of the echo cancellation apparatus 300 functions as the auxiliary storage device.
[0054]
FIG. 1 is a diagram showing an example of a functional configuration of a conventional echo
cancellation apparatus.
FIG. 2 is a diagram showing an example of a functional configuration of a conventional echo
cancellation apparatus.
FIG. 3 is a diagram showing an example of a functional configuration of an echo cancellation
apparatus according to a first embodiment.
FIG. 4 is a diagram showing a process flow of the echo cancellation apparatus of the first
embodiment.
FIG. 5 is a diagram showing an example of a functional configuration of a low-pass echo
cancellation unit of the first embodiment.
FIG. 6 is a diagram showing an example of a functional configuration of an adaptive beamforming
unit according to the first embodiment.
FIG. 7 is a diagram showing an example of a functional configuration of an echo cancellation
unit of the first embodiment.
FIG. 8 is a diagram for explaining τ used in the adaptive beamforming processing.
FIG. 9 is a diagram showing an example of a functional configuration of an echo cancellation
unit according to a second embodiment.