DESCRIPTION JPH06303689

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
[0001]
BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a
noise removing device, which outputs a noise removed acoustic signal obtained by removing a
noise component from an acoustic signal containing a noise component.
[0002]
2. Description of the Related Art In recent years, devices using digital signal processing have
been improved in various electronic technology fields.
[0003]
For example, voice input devices, voice recognition devices, and hands-free telephones have been
actively developed in recent years.
In these devices, noise removal is regarded as an essential pre-processing step before speech is
processed. For this reason, various noise removal devices have been proposed.
[0004]
10-04-2019
1
For example, in a hands-free telephone set mounted in a car, when background noise is large,
noise is superimposed on the voice signal, which makes communication difficult. Specifically,
background noise such as engine noise, running noise (wind noise, road noise, etc.), and fan
noise enters the microphone together with the voice signal. The signal-to-noise ratio (S / N) of
the signal sent to the other party is then lowered, which makes listening difficult and may
prevent a call from being made.
[0005]
Therefore, there is a need for a noise removal device that can remove background noise as
described above, and various such devices have been proposed. One example is shown in Japanese
Patent Laid-Open No. 4-245300. The noise removal apparatus shown in this document will be
described with reference to FIG. 2.
[0006]
In FIG. 2, an audio signal or the like captured by the microphone 101 is given to the feature
extraction unit 31. On the other hand, a noise signal or the like captured by the microphone 201
is given to the feature extraction unit 41. Each of the feature extraction units 31 and 41 obtains
a time-series feature vector that expresses the acoustic features of its input signal in time
series. For this purpose, each unit is composed of a converter based on, for example, a discrete
Fourier transform, a fast Fourier transform, or a band-pass filter bank.
[0007]
Then, the time-series feature vector of the voice signal or the like output from the feature
extraction unit 31 is supplied to the stationary noise removal unit 21 and the noise section
estimation unit 20. Further, the time-series feature vector of the noise signal or the like output
from the feature extraction unit 41 is provided to the stationary noise removal unit 22. The noise
section estimation unit 20 estimates a noise section that does not include speech, based on the
time-series feature vector from the feature extraction unit 31. The stationary noise removal unit
21 estimates stationary noise from the time-series feature vector of the input speech in the
section output by the noise section estimation unit 20. The estimated stationary noise is then
removed from the entire time-series feature vector of the input speech, and the result is supplied
to the non-stationary noise removing unit 23.
[0008]
Also, the stationary noise removing unit 22 estimates stationary noise from the time-series
feature vector of ambient noise of the input in the section output by the noise section estimation
unit 20. Then, the stationary noise is removed from the entire time-series feature vector of the
ambient noise of the input, and is applied to the non-stationary noise removing unit 23.
[0009]
Then, the non-stationary noise removing unit 23 calculates a correction coefficient between its
two inputs from their time-series feature vectors in the noise section output by the noise section
estimation unit 20. It then estimates the non-stationary noise contained in the time-series feature
vector of the speech after stationary noise removal, which is supplied from the stationary noise
removing unit 21, and removes the estimated non-stationary noise from the entire time-series
feature vector of that speech.
[0010]
SUMMARY OF THE INVENTION The noise removal device of the above-mentioned document is
considered to remove noise well when there is no voice signal and only a noise signal is present.
However, in an actual environment, the speech signal, stationary noise (ideal stationary noise is
rare; in practice it varies with time), and non-stationary noise are mixed and captured by the two
microphones 101 and 201.
[0011]
For this reason, it is difficult to distinguish clearly between the noise section and the speech
signal section, and therefore difficult to estimate the noise component. Similarly, it is very
difficult to distinguish between speech signals and non-stationary noise.
[0012]
Therefore, since it is difficult to estimate noise clearly, a component that should originally
belong to the audio signal may be erroneously judged to be noise and removed, in which case the
signal waveform may be distorted. In addition, estimating noise with high accuracy increases the
amount of processing and computation, so the noise removal response may become slow.
[0013]
For these reasons, a very difficult technique has been required to remove the noise component
fully while keeping the output sound quality as high as possible.
[0014]
The present invention has been made in view of the above problems. Its object is to provide a
noise removal device that can remove stationary noise, non-stationary noise and the like with a
simple configuration and at a response speed sufficient for practical use.
[0015]
In order to achieve the above object, the present invention provides a noise removal apparatus
that comprises two or more acoustic capture means (for example, microphone units), each of which
captures sound and outputs an acoustic signal, and that outputs a noise-removed acoustic signal
obtained by removing noise from the acoustic signals of the acoustic capture means. The apparatus
is realized with the following characteristic configuration.
[0016]
That is, the apparatus comprises noise spectrum estimation means for estimating the spectrum of
noise from the acoustic signals of the two or more acoustic capture means and outputting an
estimated noise spectrum; acoustic spectrum conversion means for obtaining the acoustic spectrum
of the acoustic signals of the two or more acoustic capture means; and processing means for
processing the estimated noise spectrum and the acoustic spectrum to output the noise-removed
acoustic signal.
[0017]
As the above-mentioned spectrum, for example, processing may be performed focusing on any of
a frequency spectrum, an amplitude spectrum, a phase spectrum, and a power spectrum.
[0018]
According to the present invention, the (for example, frequency) spectrum of the noise
component contained in the acoustic signal is estimated from the acoustic signals captured by
the two or more acoustic capturing means.
Furthermore, the above-mentioned acoustic signal is also converted into (for example, frequency)
spectrum.
The conversion to this (for example, frequency) spectrum can also be performed by, for example,
fast Fourier transform (FFT).
[0019]
Then, the acoustic (for example, frequency) spectrum and the estimated noise spectrum are
processed together, focusing on the (for example, frequency) spectrum, to estimate or correct the
noise spectrum accurately.
For example, correlation processing or the like is performed between the acoustic (for example,
frequency) spectrum and the estimated noise spectrum, the fine features of the noise spectrum are
corrected, and a corrected noise spectrum is obtained.
[0020]
Then, for example, by removing (for example, subtracting) the corrected noise spectrum from the
acoustic (for example, frequency) spectrum, it is possible to obtain an acoustic (for example,
frequency) spectrum from which noise is removed with high accuracy.
[0021]
Then, the acoustic frequency spectrum from which noise has been accurately removed in this
manner can be converted into a noise-removed acoustic signal in the time domain, for example, by
performing an inverse fast Fourier transform (IFFT) or the like.
[0022]
Since the above operation is possible, it is not necessary to perform complicated processing.
In addition, the estimated noise spectrum can be corrected with high accuracy in consideration of
the (for example, frequency) spectrum of the main acoustic signal, and even if there is noise
fluctuation, it can be estimated with high accuracy.
[0023]
It should be noted that even if attention is focused on, for example, an amplitude spectrum, a
phase spectrum, or a power spectrum as another spectrum, an action equivalent to the above can
be obtained.
[0024]
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT A preferred embodiment of the
noise removal apparatus of the present invention will now be described with reference to the
drawings.
[0025]
FIG. 1 is a functional block diagram of the noise removal apparatus of this embodiment.
In FIG. 1, the noise removing device captures sound, noise and the like by the main microphone 2
and the reference microphone 1.
Then, the noise removing device removes noise and outputs only a necessary acoustic signal.
[0026]
For this reason, noise is first estimated by the adaptive noise predictor 5.
Then, the frequency domain processing unit 20 performs fast Fourier transform on the estimated
noise signal Sn * and the main acoustic signal Sa.
Then, the noise component is predicted again from the noise spectrum F (Sn *) and the main
acoustic spectrum F (Sa) in the frequency domain, and this noise component is removed, so that a
signal from which stationary noise and non-stationary noise have been accurately removed is
output.
[0027]
In FIG. 1, the reference sound capture signal captured by the reference microphone 1 is supplied
to an A / D (analog / digital) converter 3.
Then, the A / D (analog / digital) converter 3 converts the reference sound capture signal into a
predetermined bit number unit by a conversion method of a predetermined method, and outputs
a digital reference sound signal Sn. The digital reference sound signal Sn is supplied to an
adaptive noise predictor 5 composed of an adaptive digital filter.
[0028]
On the other hand, the main acoustic capture signal captured by the main microphone 2 of FIG. 1
is supplied to an A / D (analog / digital) converter 4. This A / D (analog / digital) converter 4
has the same function as the A / D (analog / digital) converter 3: it converts the main acoustic
capture signal into units of a predetermined number of bits by a predetermined conversion method,
and outputs a digital main acoustic signal Sa. The digital main acoustic signal Sa is
supplied to the subtractor 6, the fast Fourier transformer 8, and the voice detector 7.
[0029]
The above-mentioned A / D conversion method is, for example, linear conversion. The converted
word length (the number of bits) of one sample is, for example, 12 bits or more.
[0030]
The voice detector 7 shown in FIG. 1 monitors the change of the power of the main acoustic
signal Sa, and determines the presence or absence of voice by judging according to a
predetermined rule. Then, for example, in a state where no audio signal is detected, the
coefficient update control signal Ck is output, for example, as logic 1, and this coefficient update
control signal Ck is supplied to the adaptive noise predictor 5 as logic 1. Then, in order to
estimate noise, the adaptive noise predictor 5 performs adaptive control of filter coefficients (tap
coefficients) according to a predetermined method, and outputs an estimated noise signal Sn *.
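As a rough illustration of the rule described above, the following Python sketch derives the
coefficient update control signal Ck from the frame power of the main acoustic signal Sa. The
description only states that a "predetermined rule" is applied to the change in power, so the
frame length, the power ratio threshold, the smoothing factor, and the function name
`coefficient_update_flags` are all assumptions made for this example.

```python
def coefficient_update_flags(samples, frame_len=160, ratio=4.0, smooth=0.9):
    """Return one flag per frame: True (logic 1) when no voice is detected.

    All parameters are illustrative assumptions, not values from the description.
    """
    flags = []
    floor = None  # running estimate of the background (noise-only) power
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        if floor is None:
            floor = power  # assumption: the first frame contains noise only
        voiced = power > ratio * floor
        if not voiced:
            # Track slow changes of the background power in noise-only frames.
            floor = smooth * floor + (1.0 - smooth) * power
        flags.append(not voiced)  # Ck = 1 allows coefficient adaptation
    return flags
```

A flag of True corresponds to Ck at logic 1, that is, the adaptive noise predictor 5 may update
its tap coefficients during that frame.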
[0031]
That is, since the adaptive noise predictor 5 of FIG. 1 is configured by the adaptive digital filter
as described above, it performs adaptive filtering using the supplied reference sound signal Sn
and the output signal Se of the subtractor 6 (the estimated error signal, or residual signal), and
outputs an estimated noise signal Sn *. Then, the estimated noise signal Sn * is supplied to the
subtraction input (-) of the subtractor 6.
[0032]
That is, the adaptive noise predictor 5 obtains the estimated noise signal Sn * by, for example,
convolving the reference sound signal Sn with an impulse response that represents the
characteristic of the propagation path in the car. Then, the subtractor 6 obtains the difference
between the main acoustic signal Sa and the estimated noise signal Sn * to obtain an estimated
error signal Se (residual signal). The estimated error signal Se is again applied to the adaptive
noise predictor 5 for coefficient value updating. Furthermore, it is also provided to the voice
detector 7.
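The loop formed by the adaptive noise predictor 5 and the subtractor 6 can be sketched in Python
as follows. This is a minimal sketch that assumes the learning identification method, which is
generally known as the normalized LMS (NLMS) algorithm; the tap count, the step size `mu`, and
the function name `nlms_cancel` are hypothetical choices, not values from the description.

```python
import random

def nlms_cancel(reference, primary, taps=8, mu=0.5, eps=1e-8, update=None):
    """Adaptive noise canceller sketch using normalized LMS.

    reference: samples Sn from the reference microphone
    primary:   samples Sa from the main microphone
    update:    optional per-sample flags (Ck); taps adapt only where True
    Returns (estimated noise Sn*, residual Se).
    """
    w = [0.0] * taps    # filter (tap) coefficients
    buf = [0.0] * taps  # most recent reference samples, newest first
    est, res = [], []
    for n, (x, d) in enumerate(zip(reference, primary)):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # Sn*[n], a convolution
        e = d - y                                   # Se[n] = Sa[n] - Sn*[n]
        if update is None or update[n]:
            norm = eps + sum(b * b for b in buf)
            w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        est.append(y)
        res.append(e)
    return est, res

# Usage: noise reaches the main microphone through a made-up 3-tap path.
random.seed(0)
ref = [random.gauss(0.0, 1.0) for _ in range(2000)]
path = [0.6, -0.3, 0.1]  # hypothetical propagation path
prim = [sum(path[k] * ref[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(len(ref))]
est, res = nlms_cancel(ref, prim)
```

In this noise-only example the residual Se shrinks as the taps converge toward the path. With
speech present, the `update` flags would freeze adaptation so that speech is not treated as
noise.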
[0033]
Then, the fast Fourier transformer 8 of FIG. 1 performs fast Fourier transformation (FFT) on the
main acoustic signal Sa to convert it into a power spectrum in the frequency domain, and outputs
the acoustic frequency spectrum signal F (Sa) to a subtractor 11 and a noise component predictor
10.
[0034]
Then, the fast Fourier transformer 9 performs fast Fourier transform (FFT) on the estimated
noise signal Sn * to convert it into a power spectrum in the frequency domain, and outputs the
noise frequency spectrum signal F (Sn *) to the noise component predictor 10.
[0035]
Then, the noise component predictor 10 predicts an accurate noise spectrum again from the
acoustic frequency spectrum signal F (Sa) and the noise frequency spectrum signal F (Sn *).
The reason is that the estimated noise signal Sn * is estimated by the adaptive noise predictor 5
using an estimation algorithm such as a learning identification method, and this estimation may
take a relatively long time.
For example, when the acoustic signal changes rapidly, the noise estimate may lag the acoustic
signal at the actual time by several milliseconds to several hundred milliseconds.
[0036]
On the other hand, when the noise is stable and stationary, estimation of the noise is quick
because the change of the power and the like is small, and the noise can be estimated accurately
with a small error.
[0037]
Because of this, the main acoustic signal Sa and the estimated noise signal Sn * may not be
synchronized, particularly when the noise is non-stationary, and the noise component predictor 10
corrects for this.
That is, the noise component predictor 10 corrects fine features of the estimated noise frequency
spectrum F (Sn *) by, for example, correlation processing between the main acoustic frequency
spectrum F (Sa) and the estimated noise frequency spectrum F (Sn *). In other words, prediction
processing is also used to bring the estimated noise frequency spectrum close to the noise
frequency spectrum actually contained in the main acoustic frequency spectrum F (Sa). The noise
spectrum F (Sn #) obtained by this correction is applied to the subtraction input (-) of the
subtractor 11.
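The description does not specify the correlation processing, so the following Python sketch is
only one possible reading and should not be taken as the method of the noise component predictor
10. It assumes a single least-squares gain, computed from the zero-lag correlation between the
two power spectra, and an upper bound per bin; the function name `correct_noise_spectrum` and
both design choices are hypothetical.

```python
def correct_noise_spectrum(main_power, noise_power, eps=1e-12):
    """Scale the estimated noise power spectrum toward the main spectrum.

    main_power:  power spectrum F(Sa), one value per frequency bin
    noise_power: estimated noise power spectrum F(Sn*), same length
    Returns a corrected spectrum standing in for F(Sn#).
    """
    # Zero-lag correlation of the two spectra across frequency bins,
    # normalised by the energy of the noise estimate (least-squares gain).
    cross = sum(a * n for a, n in zip(main_power, noise_power))
    energy = sum(n * n for n in noise_power)
    gain = max(cross / (energy + eps), 0.0)
    # Assumption: never predict more noise power in a bin than F(Sa) contains.
    return [min(gain * n, a) for a, n in zip(main_power, noise_power)]
```

A single gain of this kind is biased upward when speech dominates the frame, so a practical
design would likely restrict or smooth the correction; that refinement is left out here.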
[0038]
Then, the subtractor 11 subtracts the corrected noise spectrum F (Sn #) from the main acoustic
frequency spectrum F (Sa), thereby removing the noise component from it. The denoised acoustic
frequency spectrum F (S) obtained by this removal is applied to the fast inverse Fourier
transformer 12.
[0039]
Then, the fast inverse Fourier transformer 12 performs fast inverse Fourier transform on the
denoised acoustic frequency spectrum F (S) to obtain a denoised acoustic signal in digital form in
the time domain, and gives it to the D / A (digital / analog) converter 13. Then, the D / A (digital
/ analog) converter 13 converts the digital denoised acoustic signal into an analog denoised
acoustic signal S and outputs it.
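The chain of transform, subtraction, and inverse transform for one frame can be sketched in
Python as follows. Several points are assumptions made for this example: a plain DFT stands in
for the FFT to keep the code self-contained, negative power differences are floored at zero, and
the phase of F (Sa) is reused for the inverse transform, since the description works with power
spectra and does not say how the phase is restored. The function names are hypothetical.

```python
import cmath
import math

def dft(x):
    """Plain O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each time sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def spectral_subtract(frame_sa, frame_noise):
    """Subtract the noise power spectrum from the main power spectrum.

    frame_sa:    time-domain frame of the main acoustic signal Sa
    frame_noise: time-domain frame of the (corrected) noise estimate
    Returns the denoised time-domain frame.
    """
    spec_sa = dft(frame_sa)
    spec_noise = dft(frame_noise)
    out = []
    for a, nz in zip(spec_sa, spec_noise):
        power = max(abs(a) ** 2 - abs(nz) ** 2, 0.0)  # floor at zero
        # Assumption: keep the phase of the main signal for reconstruction.
        out.append(cmath.rect(math.sqrt(power), cmath.phase(a)))
    return idft(out)
```

Because only power is subtracted, the result does not depend on the phase of the noise
estimate, which is one reason a frequency-domain stage can tolerate the small timing errors of
the time-domain estimate.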
[0040]
According to the noise removal apparatus of the embodiment described above, the estimated
noise signal Sn * can be obtained relatively easily by the adaptive noise predictor 5. However,
since noise cannot be removed sufficiently by using the estimated noise signal Sn * as it is, the
signal is further converted into a frequency spectrum in order to obtain an accurate acoustic
signal. Then, since the noise spectrum is finely corrected again from the main acoustic frequency
spectrum F (Sa) and the estimated noise frequency spectrum F (Sn *) by the noise component
predictor 10, a highly accurate noise spectrum F (Sn #) can be obtained.
[0041]
That is, the apparatus exploits the fact that noise components and acoustic components are easy
to discriminate in the frequency domain. Then, the noise spectrum F (Sn #) is removed from the
main acoustic frequency spectrum F (Sa) to obtain a denoised acoustic frequency spectrum F (S).
[0042]
From this, it is possible to remove only stationary noise and non-stationary noise from the input
acoustic signal with less processing amount and less delay.
[0043]
Furthermore, the above solves both the problem of frequency-domain processing in conventional
noise removing devices, in which the processing amount and the delay become large, and the
problems of time-domain processing, in which the voice disappears or the sensitivity is too high.
[0044]
Other Embodiments In the above embodiment, the adaptive noise predictor 5 is realized by an
adaptive digital filter, but various estimation algorithms can be applied to noise estimation.
For example, not only the above-described learning identification method but also the LMS (Least
Mean Square) method, the RLS (Recursive Least Squares) method, or the FTF (Fast Transversal
Filter) method, which is a fast RLS method, can be applied.
[0045]
Moreover, although the above embodiment uses two microphones as the acoustic capture means,
the invention is not limited to such a structure.
For example, three or more microphones may be used to estimate noise from the audio signals
captured by these microphones.
[0046]
Furthermore, in the above-described embodiment, the frequency spectra of the main acoustic
signal Sa and the estimated noise signal Sn * are obtained, and the noise component is
re-estimated in the frequency spectrum domain. The fast Fourier transformers 8 and 9 are used
for this conversion, but the invention is not limited thereto. The signals may be converted into
frequency spectrum signals by another configuration.
[0047]
Moreover, although the above embodiment has been described with attention to the frequency
spectrum, the processing may instead focus on an amplitude spectrum, a phase spectrum, a power
spectrum, or the like.
[0048]
Furthermore, although the above-described voice detector 7 monitors the power of the main
acoustic signal Sa and the estimation error signal Se and performs voice detection from the
change in power, the present invention is not limited thereto.
For example, sound detection may be performed by monitoring the power of any one of the main
sound signal Sa, the reference sound signal Sn, and the estimation error signal Se.
[0049]
Further, although the speech detector 7 is intended for speech detection, it is not limited to this.
This can generally be applied as an acoustic detector to detect acoustic signals.
[0050]
Furthermore, in the above-described embodiment, the A / D (analog / digital) converters 3 and 4
are not limited to linear conversion and may use a companding law (for example, μ-law or
A-law) or the like. Further, the number of conversion bits is not limited to 12 as described
above, and it is desirable to set a larger number of conversion bits depending on the purpose.
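For reference, the μ-law companding curve mentioned above can be sketched in Python as follows.
The description does not give a value for μ; the sketch assumes μ = 255, the value used in
common 8-bit telephony coding, and samples normalized to the range -1 to 1. The function names
are hypothetical.

```python
import math

MU = 255.0  # assumption: value used by common 8-bit mu-law telephony coding

def mu_law_compress(x, mu=MU):
    """Map a sample in [-1, 1] to the companded range [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)
```

Compression expands small amplitudes before quantization, so quiet passages keep more
resolution than they would with linear conversion at the same number of bits.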
[0051]
Furthermore, the denoised acoustic frequency spectrum F (S), which is the output of the
subtractor 11 in FIG. 1, may be output to a voice recognition device connected externally.
[0052]
In addition to being applied to a hands free telephone in a car, the noise removal device
according to the above-described embodiment is effective when applied to a device that takes in
various acoustic signals.
For example, it is effective to use for an acoustic device used in a factory under a noise
environment or in an aircraft.
[0053]
As described above, according to the noise removal apparatus of the present invention, since the
noise spectrum estimation means, the acoustic spectrum conversion means, and the processing
means are provided, stationary noise, non-stationary noise and the like can be removed accurately
with a simple configuration and at a sufficiently practical response speed.