DESCRIPTION JP2012129652

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
PROBLEM TO BE SOLVED: To provide a high quality voice by suppressing reverberation
generated by an acoustic resistor while reducing wind noise by using an acoustic resistor.
SOLUTION: The voice processing device has first and second microphones, and an acoustic
resistor is provided so as to cover the second microphone. A high pass
filter obtains high frequency components of the output signal of the first microphone, and a low
pass filter obtains low frequency components of the output signal of the second microphone. The
output signal of the high pass filter and the output signal of the low pass filter are added and
output. Here, an adaptive filter is provided between the second microphone and the low pass
filter, and the filter coefficients are estimated and learned so that the difference between the
output signal of the first microphone and the output signal of the second microphone is
minimized. By doing this, the reverberation component generated in the closed space between
the acoustic resistor and the second microphone in the output signal of the second microphone is
suppressed. [Selected figure] Figure 1
Voice processing apparatus and method, and imaging apparatus
[0001]
The present invention relates to voice processing technology for reducing wind noise mixed in
recording.
[0002]
Voice processing devices are desired to faithfully record voice under various circumstances.
18-04-2019
1
In outdoor shooting, noise caused by wind (hereinafter referred to as "wind noise") is
pronounced. Many mechanical devices and electrical processes have been proposed to suppress
wind noise. For example, Patent Document 1 discloses a method of suppressing wind noise by
attaching a wind noise reducer (hereinafter referred to as an "acoustic resistor") with
adhesive tape to the sound collection unit of the casing of an imaging device.
[0003]
Patent Document 1: Japanese Unexamined Patent Application Publication No. 2006-211302
[0004]
However, in the prior art disclosed in the above-mentioned patent documents, depending on the
material of the acoustic resistor, it is conceivable that reverberation occurs inside the sound
collection unit and the quality of the voice is degraded.
Therefore, the present invention aims to provide a high quality voice by suppressing
reverberation generated by an acoustic resistor while reducing wind noise using the acoustic
resistor.
[0005]
According to one aspect of the present invention, there is provided an audio processing
apparatus comprising: first and second microphones; an acoustic resistor provided to cover the
second microphone so as to block the movement of air from outside the apparatus to the second
microphone; a high pass filter that passes only the high frequency component of the output
signal of the first microphone; a low pass filter that passes only the low frequency component
of the output signal of the second microphone; an adder that adds and outputs the output signal
of the high pass filter and the output signal of the low pass filter; and an adaptive filter,
provided between the second microphone and the low pass filter, that estimates and learns its
filter coefficients so that the difference between the output signal of the first microphone and
the output signal of the second microphone is minimized, thereby suppressing the reverberation
component of the output signal of the second microphone that occurs in the closed space between
the acoustic resistor and the second microphone.
[0006]
According to the present invention, it is possible to provide a recording device in which wind
noise is reduced by an acoustic resistor and reverberation is suppressed.
[0007]
A diagram showing the structure of the recording device in the embodiment.
FIGS. 2A and 2B are a perspective view and a cross-sectional view of an imaging device.
A diagram showing an example of the frequency characteristics of a microphone.
A diagram explaining the attachment structure of a microphone.
A diagram showing the structure of a reverberation suppressor.
A diagram showing the operation of the wind detector according to wind noise.
FIG. 7 is a diagram showing the configuration and operation of a synthesizer.
A diagram showing an example to which the prior art is applied.
A diagram showing the operation sequence of a switch, a variable filter, and a variable gain.
A diagram explaining the wind noise processing when there is no HPF.
A diagram explaining the wind noise processing when there is an HPF.
A diagram showing an example of another audio processing apparatus.
A perspective view of the imaging device in a second embodiment.
A diagram showing the structure of the audio processing unit in the second embodiment.
A diagram showing the structure of the audio processing unit in a third embodiment.
A diagram showing the structure of the audio processing unit in a fourth embodiment.
FIG. 18 is a view for explaining the positional relationship between the subject sound and a
microphone in the fourth embodiment.
[0008]
Hereinafter, embodiments of the present invention will be described in detail based on the
attached drawings. The same reference numerals are assigned to the same components
throughout the drawings.
[0009]
Embodiment 1 Hereinafter, with reference to FIG. 1 to FIG. 11, a recording apparatus and an
imaging apparatus provided with the recording apparatus according to a first embodiment of the
present invention will be described.
[0010]
FIG. 1 is a block diagram showing the configuration of a recording apparatus in the present
embodiment.
FIGS. 2A and 2B are respectively a perspective view and a cross-sectional view of an imaging
device (camera) provided with the recording device of FIG. 1. Reference numeral 1 denotes the
imaging device, 2 a lens mounted on the imaging device 1, 3 a housing of the imaging device 1, 4
an optical axis of the lens, 5 a photographing optical system, and 6 an image sensor. Further,
30 is a release button, and 31 is an operation button. The imaging device 1 is provided with a
first microphone 7a and a second microphone 7b. Reference numerals 32a and 32b denote
openings provided in the housing 3 for the microphones 7a and 7b, respectively. An acoustic
resistor 41 is attached to the opening 32b. As will be described later, the acoustic resistor 41
can also be formed by giving the housing 3 a non-uniform thickness, or as a separate
component. The imaging device 1 can record audio simultaneously with acquisition of
an image using the microphones 7a and 7b.
[0011]
The moving image shooting operation of the imaging device 1 will be described. When a
live view button (not shown) is pressed prior to shooting a moving image, the image from the
image sensor 6 is displayed in real time on a display device provided in the imaging device 1.
In synchronization with operation of the moving image shooting button, the imaging device 1
obtains subject information from the image sensor 6 at the set frame rate, obtains audio
information from the microphones 7a and 7b, and records both in synchronization to a memory
(not shown). Shooting ends in synchronization with a further operation of the moving image
shooting button.
[0012]
The configuration of the voice processing device 51 will be described with reference to FIG. 1.
Reference numeral 52 denotes a variable high pass filter (HPF). Reference numeral 53 denotes a
reverberation suppressor, which uses, for example, a dereverberation adaptive filter. Reference
numerals 54a and 54b denote first A/D converters (ADCs) for digitizing the output signals of the
microphones, 55 denotes a first delay unit (DL), and 56a and 56b denote HPFs for cutting DC
components. Reference numeral 61 denotes an automatic level correction unit (ALC). In the ALC
61, 62a and 62b are variable gains for level adjustment, and 63 is a level adjuster. Reference
numeral 71 denotes a synthesizer for synthesizing the signal of the first microphone 7a and the
signal of the second microphone 7b. In the synthesizer 71, 72 is a low pass filter (LPF), 73 is a
variable HPF, 74 is a variable gain, and 75 is an adder. Reference numeral 81 denotes a wind
detector. In the wind detector 81, 82a and 82b are band pass filters (BPFs), 83 is a difference
unit, 84 is a second A/D converter (ADC), 85 is a second delay unit, and 86 is a level detector.
Reference numeral 87 denotes a switch that controls the reverberation suppressor 53, 88 a switch
that controls the synthesizer 71, and 89 a mode switching operation unit.
[0013]
As shown in FIGS. 1 and 2, the housing 3 is provided with openings 32a and 32b for the
microphones. Here, an acoustic resistor 41 covering the second microphone 7b is provided at the
opening 32b so as to block the movement of air from the outside of the apparatus to the second
microphone 7b. On the other hand, no such acoustic resistor is provided at the opening 32a, so
that the first microphone 7a can faithfully acquire the subject sound. The acoustic resistor 41
is provided in close contact with the housing 3. The movement of air referred to here is the
movement of air caused by wind. For example, the acoustic resistor may also be made of a
material such as porous PTFE, which lets air pass only on a time scale slower than wind-driven
air movement and therefore does not let wind through.
[0014]
The audio processing device 51 processes the signal from the first microphone 7a with the HPF
52 and then performs analog/digital conversion (A/D conversion) with the ADC 54a. The output
of the ADC 54a is then delayed by an appropriate amount by the first delay unit 55. On the
other hand, the audio processing device 51 A/D-converts the signal from the second microphone
7b with the ADC 54b and then suppresses its reverberation with the reverberation suppressor 53.
The operation of the reverberation suppressor 53 and how the delay of the first delay unit 55
is chosen will be described later.
[0015]
The outputs of the first delay 55 and the ADC 54b are respectively processed by the HPFs 56a
and 56b for DC component cutting. Since the HPFs 56a and 56b aim to remove the offset of the
analog part, it is preferable that components below the audible range from DC can be removed.
Therefore, the cutoff frequency of the HPFs 56a and 56b is set to, for example, about 10 Hz.
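The DC-cut stage can be illustrated with a short sketch. The patent does not specify the
filter structure of the HPFs 56a and 56b, so the one-pole DC blocker below, the 48 kHz sample
rate, and the function name dc_block are assumptions chosen only for illustration.

```python
import math

def dc_block(samples, fs=48000.0, fc=10.0):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1].

    fs (sample rate) and the filter structure are illustrative assumptions;
    only the cutoff of about 10 Hz comes from the text.
    """
    r = math.exp(-2.0 * math.pi * fc / fs)  # pole radius giving a cutoff near fc
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out

# A constant analog offset decays toward zero at the output ...
offset_out = dc_block([0.5] * 48000)
# ... while a 1 kHz tone in the audible range passes almost unchanged.
tone = [math.sin(2.0 * math.pi * 1000.0 * n / 48000.0) for n in range(48000)]
tone_out = dc_block(tone)
tone_peak = max(abs(v) for v in tone_out[-480:])
```

With a 10 Hz cutoff the offset is removed within a fraction of a second, which is why such a
filter can sit in the signal path without audibly affecting the recording.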
[0016]
The outputs of the HPFs 56a and 56b are input to the ALC 61, and gain-adjusted by the variable
gains 62a and 62b, respectively. At this time, the gains of the variable gains 62a and 62b are
controlled in conjunction so that the two signal levels become the same. The level adjuster 63
obtains the outputs of the variable gains 62a and 62b, and appropriately adjusts the level so that
saturation does not occur and the dynamic range can be effectively used. At this time, the level
adjuster 63 adjusts the level so that the larger one of the outputs of the variable gains 62a and
62b is not saturated.
[0017]
The outputs of the variable gains 62a and 62b are input to the synthesizer 71. The output of the
variable gain 62a is sent to the adder 75 after passing through the HPF 73. On the other hand,
the output of the variable gain 62b is sent to the adder 75 through the LPF 72 and the variable
gain 74. The signal synthesized by the adder 75 is output as the wind-noise-processed audio.
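The signal path through the synthesizer 71 can be sketched as a simple crossover. This is a
minimal illustration, not the patented implementation: the one-pole filters, the 48 kHz sample
rate, the 1 kHz crossover, and all function names are assumptions, and the test signal
idealizes the second microphone as picking up the subject sound without wind.

```python
import math

def one_pole_lowpass(samples, fs, fc):
    """First-order low-pass (stand-in for the LPF 72)."""
    a = 1.0 - math.exp(-2.0 * math.pi * fc / fs)
    y = 0.0
    out = []
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out

def one_pole_highpass(samples, fs, fc):
    """Complementary high-pass (stand-in for the HPF 73): input minus low-pass."""
    low = one_pole_lowpass(samples, fs, fc)
    return [x - l for x, l in zip(samples, low)]

def synthesize(mic1, mic2, fs=48000.0, fc=1000.0, gain=1.0):
    """High band from microphone 1 plus gain-scaled low band from microphone 2."""
    high = one_pole_highpass(mic1, fs, fc)
    low = one_pole_lowpass(mic2, fs, fc)
    return [h + gain * l for h, l in zip(high, low)]

fs = 48000.0
tone = [0.3 * math.sin(2.0 * math.pi * 3000.0 * n / fs) for n in range(4800)]
wind = [math.sin(2.0 * math.pi * 20.0 * n / fs) for n in range(4800)]
mic1 = [t + w for t, w in zip(tone, wind)]  # open microphone: subject sound plus wind
mic2 = tone                                 # covered microphone: idealized, no wind
mixed = synthesize(mic1, mic2, fs=fs, fc=1000.0)
residual = max(abs(m - t) for m, t in zip(mixed, tone))
same = synthesize(tone, tone, fs=fs, fc=1000.0)
identity_error = max(abs(s - t) for s, t in zip(same, tone))
```

Because the two filters are complementary, feeding the same signal to both inputs returns it
unchanged, while low-frequency wind present only on the first microphone is strongly attenuated.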
[0018]
The output of the first microphone 7 a and the output of the reverberation suppressor 53 are
respectively input to the BPFs 82 a and 82 b of the wind detector 81. The BPFs 82a and 82b are
intended to pass a range where the subject sound can be faithfully acquired by the second
microphone 7b. Therefore, the passband is set to, for example, about 30 Hz to 1 kHz. However,
for the upper limit frequency, the set value can be changed depending on the structure of the
acoustic resistor 41 or the like. Details will be described later together with the frequency
characteristics of the second microphone 7b.
[0019]
The output of the BPF 82 a is A / D converted by the second ADC 84 and then sent to the second
delay unit 85. How to give a delay in the second delay unit 85 and the like will be described later
together with the operation of the reverberation suppressor 53.
[0020]
The difference between the output of the second delay unit 85 and the output of the BPF 82b is
calculated by the difference unit 83, and the result is sent to the level detector 86. The
operation of the level detector 86 will be described later. The level of the wind is determined
by the level detector 86, and the switch 87 is controlled to switch the feedback to the
reverberation suppressor 53. The detection result of the level detector 86 is also used to
control the switch 88 that controls the synthesizer 71. When the mode switching operation unit
89 is set to OFF by the user, the switch 88 always selects the processing for the case where
there is no wind, described later. On the other hand, when the mode switching operation unit 89
is set to Auto by the user, the switch 88 operates to change the cutoff frequencies of the HPF
52 and the HPF 73 and the gain of the variable gain 74 according to the wind intensity
determined by the level detector 86. Details of this process will be described later.
[0021]
The effects of the acoustic resistor 41, its desirable characteristics, and the reduction of
wind noise will be described with reference to FIGS. 1, 3 and 4. FIG. 3 is a diagram
schematically showing frequency characteristics of the microphones, in which the horizontal
axis represents frequency and the vertical axis represents gain. In FIG. 3, (a) shows the
subject sound acquisition characteristic of the first microphone 7a, and (b) shows the subject
sound acquisition characteristic of the second microphone 7b. (c) shows the wind noise
acquisition characteristic of the first microphone 7a, and (d) shows the wind noise acquisition
characteristic of the second microphone 7b. (e) shows the subject sound acquisition
characteristic of the output of the synthesizer 71, and (f) shows the wind noise acquisition
characteristic of the output of the synthesizer 71. To clarify the difference between the
characteristics of the first microphone 7a and the second microphone 7b, the characteristics of
the first microphone 7a are shown by broken lines in (b) and (d). In FIG. 3, f0 indicates the
structural cutoff frequency of the acoustic resistor 41, and f1 indicates the cutoff frequency
of the LPF 72 and the HPF 73 in the synthesizer 71 shown in FIG. 1.
[0022]
As shown in FIG. 3A, it is desirable that the subject sound acquisition characteristic of the
first microphone 7a be flat in the audible range. This makes it possible to faithfully acquire
the subject sound. As shown in FIG. 3B, the second microphone 7b has a different characteristic
because the acoustic resistor 41 blocks the movement of air from the subject side. At
frequencies lower than the cutoff frequency of the acoustic resistor 41, the audio signal is
passed relatively faithfully. This is because the sound, which is a compressional wave of air,
excites the acoustic resistor 41, and the acoustic resistor 41 in turn excites the air inside
the apparatus. On the other hand, at frequencies higher than the cutoff frequency of the
acoustic resistor 41, the audio signal is cut off. Here the sound tries to vibrate the acoustic
resistor 41, but the compression and rarefaction of the air reverse before the acoustic
resistor 41 can respond, so it cannot move. Thus, the acoustic resistor 41 acts as a structural
LPF. The frequency f0 at which this structural cut begins is called the cutoff frequency of the
acoustic resistor 41.
[0023]
It is known that the power of wind noise is concentrated in the low band. For example, as shown
in FIG. 3C, the power of wind noise in the first microphone 7a often has such a characteristic that
it is lifted from about 1 kHz toward low frequencies. Even when the shape is not as shown in FIG.
3C, the wind noise is dominated by low frequency (500 Hz or less) components. As shown in FIG.
3D, the second microphone 7b has less lifting of low frequency components due to wind noise. In
the vicinity of the first microphone 7a, a large atmospheric pressure difference is likely to be
generated due to the occurrence of turbulent flow and the like. On the other hand, since the
second microphone 7b is provided with the acoustic resistor 41 so as to block the movement of
air from the subject, a large atmospheric pressure difference due to the turbulent flow or the like
does not occur. This is the reason why the output of the second microphone 7b has less lifting of
low frequency components due to wind noise.
[0024]
Consider how these signals are processed by the synthesizer 71. As described with reference to
FIG. 1, the signal of the first microphone 7a is processed by the HPF 73. This corresponds to
cutting out the portion indicated by 91 in FIG. 3(a) and the portion indicated by 93 in FIG.
3(c). The signal of the second microphone 7b is processed by the LPF 72. This corresponds to
cutting out the portion indicated by 92 in FIG. 3(b) and the portion indicated by 94 in FIG.
3(d). After passing through the adder 75, the subject sound characteristic is as shown in FIG.
3(e), and the wind noise characteristic is as shown in FIG. 3(f). The portions indicated by 91a,
92a, 93a and 94a in FIGS. 3(e) and 3(f) are where the portions indicated by 91, 92, 93 and 94,
respectively, are dominant. The word "dominant" is used because the LPF 72 and the HPF 73 do not
necessarily reduce the other signal to zero. As is clear from FIGS. 3(e) and 3(f), the subject
sound characteristic of the output of the synthesizer 71 is flat in the audible range, while
its wind noise characteristic is that of the microphone provided with the acoustic resistor 41.
[0025]
FIG. 4 shows an example of the mounting structure of the microphone. In FIG. 4, reference
numerals 33a and 33b denote holding elastic bodies of the first microphone 7a and the second
microphone 7b, respectively. A sleeve 34 holds the second microphone 7 b and the acoustic
resistor 41.
[0026]
FIG. 4A shows an example in which the acoustic resistor 41 is attached to the outside of the
housing 3. In the example of FIG. 4A, since the acoustic resistor 41 may be attached after the
assembly of the device, the assemblability can be improved.
[0027]
FIG. 4B shows an example in which the acoustic resistor 41 is attached to the inside of the
housing 3. In the example of FIG. 4B, since the acoustic resistor 41 is not exposed to the outside
of the housing 3, it is excellent in an aesthetic point.
[0028]
FIG. 4C shows an example in which a part of the housing 3 doubles as the acoustic resistor 41.
In the example of FIG. 4C, the part of the housing 3 that serves as the acoustic resistor 41 is
made thin so that it is vibrated by sound waves. In this example the number of parts is
reduced, and since no separate acoustic resistor 41 needs to be attached to the housing 3, the
appearance is also good. However, because the housing 3 and the acoustic resistor 41 are
integrated, the degree of freedom in design generally decreases. (The strength of the housing 3
may be limited by the thickness of the portion forming the acoustic resistor 41, and it is
difficult to satisfy both requirements.)
[0029]
FIG. 4D is an example in which the second microphone 7b and the acoustic resistor 41 are held
by a sleeve 34 having sufficiently high rigidity. It is desirable for the sleeve 34 to have its
primary resonance frequency sufficiently above the frequency band to be acquired by the second
microphone 7b (that is, the resonance frequency of the sleeve 34 should be higher than f0 in
FIG. 3). In the example of FIG. 4(d), the acoustic resistor 41 is attached to the highly rigid
sleeve 34, so the desired audio signal can be obtained in the pass band (at frequencies lower
than f0 in FIG. 3) without being affected by unwanted resonance of the attachment structure.
[0030]
Next, the reverberation suppressor 53 will be described using FIGS. 1 and 5. Since the second
microphone 7b has a structure covered by the acoustic resistor 41, reverberation may occur in
the closed space. In the present embodiment, a reverberation suppressor 53 is provided to
suppress such reverberation.
[0031]
The specific configuration of the reverberation suppressor 53 is shown in FIG. 5. The
reverberation suppressor 53 is configured by an adaptive filter. As will be specifically
described below, this adaptive filter estimates and learns its filter coefficients so as to
minimize the output of the difference unit 83, which represents the magnitude of wind noise,
that is, the difference between the output signal of the first microphone 7a and the output
signal of the second microphone 7b. Thereby, the reverberation component of the output signal
of the second microphone 7b that is generated in the closed space between the acoustic resistor
41 and the second microphone 7b is suppressed. Using such an adaptive filter makes it possible
to respond appropriately even when the reverberation conditions change because of a change in
how the user holds the camera or a change in temperature.
[0032]
The principle of dereverberation will be briefly described. Let s be the subject sound, g1 the
subject sound acquisition characteristic of the first microphone 7a, g2 the subject sound
acquisition characteristic of the second microphone 7b, and r the influence of the
reverberation. g1 and g2 are equal to the inverse Fourier transforms of the frequency-domain
characteristics shown in FIG. 3. The signal x1 of the first microphone 7a and the signal x2 of
the second microphone 7b, obtained in an environment where reverberation occurs at the second
microphone 7b, are given by equation (1).
[0033]
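The image of equation (1) did not survive extraction. From the definitions in paragraph [0032],
and assuming that the reverberation r acts on the second microphone signal by convolution
(which is consistent with the later condition | h * r | = 1), it presumably has the following
form:

```latex
x_1 = g_1 * s, \qquad x_2 = g_2 * s * r \tag{1}
```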
[0034]
Here, in equation (1), * is an operator indicating convolution.
As described with reference to FIG. 3, at frequencies lower than f0, similar subject sounds can
be acquired by the first microphone 7a and the second microphone 7b. Further, as shown in FIG.
1, only the components of the appropriate band are extracted by the BPFs 82a and 82b. That is,
the band passed by the BPFs is the part of the audible range below f0 in FIG. 3. Owing to human
auditory characteristics, sensitivity is extremely low in the band of 50 Hz or less; for
details, refer to the A-weighting curve or the like. Therefore, the BPFs 82a and 82b may be
designed to pass, for example, 30 Hz to 1 kHz. Denoting the characteristic of the BPFs 82a and
82b by BPF and the signals after passing through them by x1_BPF and x2_BPF, equation (2) holds.
[0035]
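Equation (2) is likewise missing from the extracted text. Writing BPF for the impulse response
of the BPFs 82a and 82b and again assuming a convolutional reverberation term r, a form
consistent with the surrounding paragraphs would be:

```latex
x_{1\_BPF} = BPF * g_1 * s, \qquad
x_{2\_BPF} = BPF * g_2 * s * r = BPF * g_1 * s * r \tag{2}
```

The last equality uses g1 * BPF = g2 * BPF, which is stated in paragraph [0036].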
[0036]
Although g1 ≠ g2, g1 * BPF = g2 * BPF holds; this expresses the fact that similar subject
sound can be acquired by the first microphone 7a and the second microphone 7b at frequencies
lower than f0. As is clear from equation (2), the two inputs of the difference unit 83 in FIG.
1 are equal if there is no reverberation effect r. The effect of reverberation can therefore be
reduced by operating the adaptive filter with x1_BPF as the desired response d and x2_BPF as
the input u.
[0037]
When the filter of the reverberation suppressor 53 is represented by h, the adaptive filter output
y is given by the following equation.
[0038]
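The image of equation (3) is missing. Assuming the usual FIR form of an adaptive filter of
order M (taps indexed 0 to M, an assumption), it would read:

```latex
y(n) = \sum_{i=0}^{M} h_n(i)\, u(n-i) \tag{3}
```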
[0039]
Here, in equation (3), n denotes the n-th sample, M is the filter order of the reverberation
suppressor 53, and the subscript of h indicates that it is the value of the filter h at the
n-th sample.
The input u is x2_BPF.
[0040]
Furthermore, since x1_BPF is used as the desired response d, the error signal e is expressed
as follows.
[0041]
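The image of equation (4) is missing. With d as the desired response and y as the adaptive
filter output, the error signal is presumably the standard difference:

```latex
e(n) = d(n) - y(n) \tag{4}
```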
[0042]
Various adaptation algorithms have been proposed; here, as an example, the update equation
of h in the LMS algorithm is shown below.
[0043]
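The image of equation (5) is missing. The standard LMS coefficient update, written with the
symbols used in the surrounding paragraphs, is:

```latex
h_{n+1}(i) = h_n(i) + \mu\, e(n)\, u(n-i), \qquad i = 0, \dots, M \tag{5}
```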
[0044]
Here, in equation (5), μ is a step size parameter.
Given an appropriate initial h, updating h using equation (5) brings the filter output y
closer to d.
That is, the influence of r is reduced so that x1_BPF and the filtered x2_BPF become nearly
equal.
In this case, | h * r | = 1 holds in the passband of the BPF.
However, in an environment where wind noise is dominant, the update of equation (5) is not
performed correctly, so the switch 87 stops the estimation learning of the adaptive filter.
The control sequence of the switch 87 will be described later together with the operation of
the wind detector 81.
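The adaptation described above can be sketched as follows. This is a minimal LMS illustration
under stated assumptions, not the patented implementation: the function name lms_filter, the
filter order, the step size, the synthetic one-tap "reverberation", and the adapt flag that
stands in for the switch 87 are all hypothetical.

```python
import random

def lms_filter(u, d, order=8, mu=0.05, adapt=None):
    """LMS adaptive filter: input u (reverberant), desired response d (clean).

    adapt is an optional list of booleans; False freezes the coefficients for
    that sample, mimicking how learning is stopped when wind is dominant.
    Returns the final coefficients and the error signal.
    """
    taps = order + 1
    h = [0.0] * taps
    buf = [0.0] * taps          # buf[i] holds u(n - i)
    errors = []
    for n in range(len(u)):
        buf = [u[n]] + buf[:-1]
        y = sum(hi * ui for hi, ui in zip(h, buf))   # filter output
        e = d[n] - y                                 # error signal
        errors.append(e)
        if adapt is None or adapt[n]:
            h = [hi + mu * e * ui for hi, ui in zip(h, buf)]  # coefficient update
    return h, errors

rng = random.Random(0)
clean = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
# Synthetic reverberation: the clean signal plus one delayed echo (an assumption).
reverberant = [clean[n] + (0.5 * clean[n - 1] if n > 0 else 0.0)
               for n in range(len(clean))]
coeffs, errors = lms_filter(reverberant, clean)
tail_mse = sum(e * e for e in errors[-500:]) / 500.0
frozen, _ = lms_filter(reverberant, clean, adapt=[False] * len(clean))
```

In this toy case the filter learns an approximate inverse of the echo, so the leading
coefficients approach 1 and -0.5 and the error power falls toward zero. With adaptation
disabled the coefficients stay at their initial values.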
[0045]
As described above, the reverberation suppressor 53 suppresses the reverberation. On the other
hand, as is apparent from FIG. 5, the signal is delayed in the reverberation suppressor 53
according to the order of the adaptive filter. To compensate for this delay, the first delay
unit 55 and the second delay unit 85 are provided in FIG. 1. Typically, a delay of half the
filter order of the reverberation suppressor 53 (= M / 2) may be given (if M is an odd number,
a nearby value may be used). In this case, for example, h(M / 2) = 1 and all other
coefficients of h are initialized to 0, so that the adaptive algorithm starts from a state
corresponding to no reverberation. When an appropriate initial value for dereverberation is
stored in the memory, h may be initialized to that value before operation starts. For example,
the initial value may be set as follows. The filter coefficients can be estimated to some
extent from design values such as the dimensions around the microphones 7a and 7b and the
materials of the structural members. Therefore, filter coefficients obtained from design values
may be set as initial values. Alternatively, the filter coefficients at the time the recording
device is powered off may be stored in the memory and set as initial values at the next
activation of the recording device. Alternatively, in the production process of the recording
device, the filter coefficients may be calculated by recording a predetermined reference sound
and stored in the memory, and then set as initial values when the recording device is
activated.
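The relationship between the M / 2 delay and the initial coefficients can be checked with a
small sketch. The helper names initial_taps, delay, and fir are hypothetical, and the order of
4 is chosen only to keep the example short.

```python
def initial_taps(order):
    """Coefficients with h(order // 2) = 1 and all others 0: a pure delay."""
    h = [0.0] * (order + 1)
    h[order // 2] = 1.0
    return h

def delay(samples, count):
    """Delay a signal by count samples, padding the start with zeros."""
    return [0.0] * count + list(samples[:len(samples) - count])

def fir(samples, h):
    """Direct-form FIR filter: y(n) = sum over i of h(i) * x(n - i)."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for i, hi in enumerate(h):
            if n - i >= 0:
                acc += hi * samples[n - i]
        out.append(acc)
    return out

order = 4
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
filtered = fir(x, initial_taps(order))   # path through the adaptive filter
delayed = delay(x, order // 2)           # path through the delay unit
```

Before any learning takes place, the adaptive filter path and the delayed reference path carry
identical signals, so the initial error is zero when there is no reverberation.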
[0046]
Next, the operation of the ALC 61 will be described. ALC is provided to effectively utilize the
dynamic range while suppressing the saturation of the audio signal. It is necessary to adjust the
level of the audio signal appropriately because the power fluctuation with respect to the time axis
is large. A level adjuster 63 provided in the ALC 61 monitors the outputs from the variable gains
62a and 62b.
[0047]
First, the attack operation will be described. When it is determined that the higher level signal
exceeds a predetermined level, the gain is lowered by a predetermined step. This operation is
repeated at a predetermined cycle. This operation is called an attack operation. The attack
operation makes it possible to prevent saturation.
[0048]
Next, the recovery operation will be described. When the signal of the larger level does not
exceed the predetermined level for a predetermined time, the gain is increased by a
predetermined step. This operation is repeated at a predetermined cycle. This operation is called
recovery operation. The recovery operation makes it possible to obtain sounds in a quiet
environment.
[0049]
The variable gains 62a and 62b in the ALC 61 operate in conjunction with each other. That is,
when an attack operation lowers the gain of the variable gain 62a, the gain of the variable
gain 62b is lowered by the same amount. This eliminates any level difference between the
signal channels and reduces the sense of incongruity when the channels are later mixed in the
synthesizer 71.
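The linked attack and recovery behavior can be sketched as follows. The threshold, step ratio,
hold time, and per-sample update cycle are illustrative assumptions, since the text only says
"predetermined", and the function name linked_alc is hypothetical.

```python
def linked_alc(ch_a, ch_b, threshold=0.9, step=0.9, hold=100, max_gain=1.0):
    """Apply one shared gain to both channels with attack and recovery.

    Attack: if the larger output exceeds threshold, multiply the gain by step.
    Recovery: after hold consecutive samples below threshold, divide by step.
    Returns both outputs and the gain that was applied to each sample.
    """
    gain = max_gain
    quiet_count = 0
    out_a, out_b, gains = [], [], []
    for a, b in zip(ch_a, ch_b):
        ya, yb = gain * a, gain * b
        out_a.append(ya)
        out_b.append(yb)
        gains.append(gain)
        if max(abs(ya), abs(yb)) > threshold:      # attack on the larger channel
            gain *= step
            quiet_count = 0
        else:
            quiet_count += 1
            if quiet_count >= hold and gain < max_gain:
                gain = min(max_gain, gain / step)  # recovery
                quiet_count = 0
    return out_a, out_b, gains

loud_then_quiet = [2.0] * 50 + [0.1] * 2000
other_channel = [0.05] * 2050
out_a, out_b, gains = linked_alc(loud_then_quiet, other_channel)
```

The loud passage on one channel pulls the shared gain down for both channels, and the gain
returns to its maximum once the signal has stayed quiet for long enough.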
[0050]
Next, the wind detector 81 will be described. The wind noise collected by the first microphone 7a
is w1, and the wind noise collected by the second microphone 7b is w2. As described in FIG. 3,
since the power of the wind noise is concentrated in the low band, it is not blocked by the BPFs
82a and 82b. For this reason, w1-w2 is obtained as the output of the difference unit 83. In
addition, it is assumed that the influence of the reverberation mentioned above can be
disregarded. Even in a real environment, the influence of reverberation is sufficiently small and
negligible compared to wind noise.
[0051]
In the level detector 86, the absolute value of the output of the difference unit 83 is taken
and then passed through an LPF as appropriate. The cutoff frequency of the LPF is determined by
the trade-off between the stability of the wind detector and the detection speed, and may be
about 0.5 Hz. The LPF integrates stop band signals and passes pass band signals as they are.
As a result, the same effect as the integration operation + HPF is obtained. Therefore, when
the absolute value stays at a high level for a certain period of time (which depends on the
above-mentioned cutoff frequency), a large output is obtained. That is, it is equivalent to
monitoring Σ | w1 - w2 | over a suitable time.
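The level detection can be sketched as a rectifier followed by a slow low-pass. The one-pole
filter structure, the 1 kHz sample rate used in the demonstration, and the function name
wind_level are assumptions; only the cutoff of about 0.5 Hz comes from the text.

```python
import math

def wind_level(x1_bpf, x2_bpf, fs=48000.0, fc=0.5):
    """Smooth |x1 - x2| with a one-pole low-pass to estimate wind strength."""
    a = 1.0 - math.exp(-2.0 * math.pi * fc / fs)
    level = 0.0
    out = []
    for s1, s2 in zip(x1_bpf, x2_bpf):
        level += a * (abs(s1 - s2) - level)
        out.append(level)
    return out

fs = 1000.0  # low sample rate keeps the demonstration short
windy = [0.5 * (-1.0) ** n for n in range(5000)]  # difference of magnitude 0.5
calm = [0.0] * 5000
level_windy = wind_level(windy, calm, fs=fs)
level_calm = wind_level(calm, calm, fs=fs)
```

The output rises slowly toward the average magnitude of the difference signal, which also
illustrates why the detector responds with a delay when the wind starts.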
[0052]
FIG. 6 shows an example of the output signal of the wind detector 81 for different wind
strengths. FIGS. 6A, 6B, and 6C are diagrams showing signals obtained by the first microphone
7a and the second microphone 7b, where the horizontal axis represents time and the vertical
axis represents signal level. In FIGS. 6A, 6B, and 6C, a signal level of +1 indicates the level
at which the signal saturates in the positive direction. FIG. 6(a) shows the signals without
wind, FIG. 6(b) the signals with weak wind, and FIG. 6(c) the signals with strong wind. It can
be seen that the signal level of the first microphone 7a increases with the strength of the
wind, that is, wind noise is generated. On the other hand, the signal level of the second
microphone 7b rises much less than that of the first microphone 7a. This shows that wind noise
is reduced by the effect of the acoustic resistor 41.
[0053]
The result of applying the processing of the wind detector 81 described above to these signals
is shown in FIG. 6(d). The horizontal axis of FIG. 6(d) indicates the same time as that of
FIGS. 6(a), 6(b) and 6(c), and the vertical axis indicates the output of the wind detector. The
BPFs 82a and 82b have a pass band of 30 Hz to 1 kHz, and the cutoff frequency of the LPF in the
level detector 86 is 0.5 Hz. It can be seen that the output of the wind detector 81 stays near
zero when there is no wind and increases with the wind intensity. In FIG. 6D, the signal near
0 seconds is small because the rise is delayed by the LPF in the level detector 86. A delay
like the one seen at the rising edge of the signal in FIG. 6(d) therefore occurs before wind is
detected. Because shortening this delay would make the detector output fluctuate with
momentary changes in the wind, this embodiment detects wind with the delay shown in FIG. 6.
[0054]
The output of the wind detector 81 is used not only for the switch 87 of the reverberation
suppressor 53 described above, but also for switching of the HPF 52, described later, and for
switching of the combining processing in the synthesizer 71.
[0055]
Next, the operation of the synthesizer 71 will be described using FIGS. 1 and 7.
It was explained with reference to FIG. 1 that the cutoff frequency of the HPF 73 and the gain
of the variable gain 74 are changed based on the output of the wind detector 81; the specific
method of changing them will now be described using FIG. 7.
[0056]
FIGS. 7A and 7C each show an example of the configuration of the synthesizer 71.
FIGS. 7 (b) and 7 (d) show how the variable parts of FIGS. 7 (a) and 7 (c), respectively, are changed.
[0057]
First, the configuration of FIG. 7A will be described. The synthesizer 71 shown in FIG. 7A has the
same configuration as that shown in FIG. In FIG. 7A, the cutoff frequency of the LPF 72 is fixed
at, for example, 1 kHz. In FIG. 7B, the upper part schematically shows the gain of the variable
gain 74, and the lower part schematically shows the cutoff frequency of the HPF 73. The horizontal
axis in FIG. 7B is common to the two graphs, and Wn1, Wn2, and Wn3 are values indicating the
magnitude of wind noise, with wind noise becoming stronger in this order.
[0058]
As shown in FIG. 7B, when the wind noise is smaller than the predetermined value Wn1, wind
processing is assumed to be unnecessary: the gain of the variable gain 74 is set to 0, and the
cutoff frequency of the HPF 73 is set to 50 Hz. As a result, in the circuit shown in FIG. 7A, the
signal from the second microphone 7b is completely cut off, and the sound in the audible range is
obtained only from the first microphone 7a. (The term audible range is used here because
frequencies higher than 50 Hz, the cutoff frequency of the HPF 73, are the dominant components.)
Because the signal of the second microphone 7b provided with the acoustic resistor 41 need not be
used, the voice of the subject is considered to be obtained faithfully.
[0059]
Next, the case where the wind noise exceeds the level of Wn1 and is between Wn1 and Wn2 will be
described. In this case, the value of the variable gain 74 is gradually increased, and the cutoff
frequency of the HPF 73 is gradually raised. With this control, the proportion of the signal from
the second microphone 7b provided with the acoustic resistor 41 gradually increases in the
low-frequency sound signal. Although wind noise strongly affects the signal from the first
microphone 7a, the wind noise is reduced by raising the cutoff frequency of the HPF 73.
[0060]
Next, the case where the wind noise exceeds the level of Wn2 and is between Wn2 and Wn3 will be
described. In this case, the value of the variable gain 74 is fixed to 1 and the cutoff frequency
of the HPF 73 is gradually raised. With this control, the sound lying between the cutoff frequency
of the LPF 72 and the cutoff frequency of the HPF 73 is lost, but the wind noise can be further
reduced. If the cutoff frequency of the HPF 73 is raised excessively, the deterioration of the
subject sound becomes too large, so the cutoff frequency is not raised beyond an appropriate value.
In the example of FIG. 7B, when the magnitude of the wind noise exceeds Wn3, the cutoff frequency
of the HPF 73 is fixed at 2 kHz and is not raised any further.
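The control of FIG. 7B described above can be sketched as a pair of piecewise functions of the wind
noise level. This is only an illustration under stated assumptions: the ramps are taken to be
linear, the HPF 73 cutoff is assumed to reach 1 kHz at Wn2, and the numeric thresholds and the
function names `lerp` and `control_fig_7b` are hypothetical. The 50 Hz, gain 0 to 1, and 2 kHz
values come from the text.

```python
def lerp(x, x0, x1, y0, y1):
    """Linear interpolation between (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) / (x1 - x0) * (y1 - y0)

def control_fig_7b(wind, wn1=0.1, wn2=0.4, wn3=0.8):
    """Return (gain of variable gain 74, HPF 73 cutoff in Hz) for a wind noise level."""
    if wind < wn1:                       # no wind processing
        return 0.0, 50.0
    if wind < wn2:                       # gain and cutoff both rise gradually
        return lerp(wind, wn1, wn2, 0.0, 1.0), lerp(wind, wn1, wn2, 50.0, 1000.0)
    if wind < wn3:                       # gain fixed at 1, cutoff keeps rising
        return 1.0, lerp(wind, wn2, wn3, 1000.0, 2000.0)
    return 1.0, 2000.0                   # cutoff is not raised beyond 2 kHz

for level in (0.0, 0.25, 0.6, 1.0):
    print(level, control_fig_7b(level))
```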
[0061]
Next, the configuration of FIG. 7C, which is another example, will be described. The synthesizer 71
shown in FIG. 7C is provided with a variable LPF 76 in place of the fixed LPF 72 and the variable
gain 74. In FIG. 7D, the upper part schematically shows the cutoff frequency of the variable LPF
76, and the lower part schematically shows the cutoff frequency of the HPF 73. The horizontal
axis in FIG. 7D is common to the two graphs, and Wn1, Wn2, and Wn3 are values indicating the
magnitude of wind noise, with wind noise becoming stronger in this order.
[0062]
As shown in FIG. 7D, when the wind noise is smaller than the predetermined value Wn1, wind
processing is assumed to be unnecessary, and the cutoff frequencies of the variable LPF 76 and the
HPF 73 are both set to 50 Hz. As a result, in the circuit shown in FIG. 7C, the signal from the
second microphone 7b is almost completely cut off, and the sound in the audible range is obtained
only from the first microphone 7a. (The term audible range is used here because frequencies higher
than 50 Hz, the cutoff frequency of the HPF 73, are the dominant components.) Because the signal of
the second microphone 7b provided with the acoustic resistor 41 need not be used, the voice of the
subject is considered to be obtained faithfully.
[0063]
Next, the case where the wind noise exceeds the level of Wn1 and is between Wn1 and Wn2 will be
described. In this case, the cutoff frequencies of the variable LPF 76 and the HPF 73 are gradually
raised while being kept equal to each other. With this control, the low-frequency sound signal
gradually comes to use the signal from the second microphone 7b provided with the acoustic
resistor 41. Although wind noise strongly affects the signal from the first microphone 7a, the wind
noise is reduced by raising the cutoff frequency of the HPF 73.
[0064]
Next, the case where the wind noise exceeds the level of Wn2 and is between Wn2 and Wn3 will be
described. In this case, the cutoff frequency of the variable LPF 76 is fixed to 1 kHz, and the
cutoff frequency of the HPF 73 is raised further. With this control, the sound lying between the
cutoff frequency of the variable LPF 76 and the cutoff frequency of the HPF 73 is lost, but the
wind noise can be further reduced. If the cutoff frequency of the HPF 73 is raised excessively, the
deterioration of the subject sound becomes too large, so the cutoff frequency is not raised beyond
an appropriate value. In the example of FIG. 7D, when the magnitude of the wind noise exceeds Wn3,
the cutoff frequency of the HPF 73 is fixed at 2 kHz and is not raised any further.
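The control of FIG. 7D can be sketched in the same spirit, this time returning the two cutoff
frequencies so that the band lost between them is visible. Linear ramps, the numeric thresholds,
and the function name `control_fig_7d` are assumptions made for illustration; the 50 Hz, 1 kHz and
2 kHz values come from the text.

```python
def control_fig_7d(wind, wn1=0.1, wn2=0.4, wn3=0.8):
    """Return (variable LPF 76 cutoff, HPF 73 cutoff) in Hz for a wind noise level."""
    if wind < wn1:                                   # both cutoffs at the lower limit
        return 50.0, 50.0
    if wind < wn2:                                   # cutoffs rise together
        matched = 50.0 + (wind - wn1) / (wn2 - wn1) * (1000.0 - 50.0)
        return matched, matched
    if wind < wn3:                                   # LPF fixed, HPF keeps rising
        return 1000.0, 1000.0 + (wind - wn2) / (wn3 - wn2) * 1000.0
    return 1000.0, 2000.0

lpf_hz, hpf_hz = control_fig_7d(0.6)
lost_band_hz = hpf_hz - lpf_hz   # width of the band that neither path passes
```

Between Wn2 and Wn3 the gap `lost_band_hz` grows with the wind level, which is the trade-off
between wind noise reduction and fidelity described above.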
[0065]
In the above description, the example was given in which the HPF 73 is varied over a wider range of
wind noise levels than the variable gain 74 and the variable LPF 76. Naturally, by setting Wn2 =
Wn3, the HPF 73 can be made to vary only in the same range as the variable gain 74 and the variable
LPF 76. Restricting the variation in this way lessens the wind noise reduction effect, but allows
the subject sound to be acquired faithfully. The magnitude of wind noise generated in the first
microphone 7a when the wind blows also differs greatly depending on the mounting structure of the
microphone and the like. The settings of Wn1, Wn2 and Wn3 may be adjusted by weighing the need to
reduce wind noise against the need to acquire the subject sound faithfully.
[0066]
In the above description of the synthesizer 71 shown in FIG. 7, specific ranges were given for
changing the cutoff frequencies of the variable HPF and LPF. The preferred variable range and the
configuration of the filters will now be briefly described.
[0067]
The synthesizer 71 shown in the present embodiment combines the sound acquired by the plurality of
microphones 7a and 7b. As described above, when signals are separated into bands and then combined,
it is desirable that the phases of the respective paths coincide, particularly in the frequency
band where the signals of the microphones overlap. This is because, if the processing in the
different paths shifts the phases, the waveforms may not add correctly and may cancel each other.
To satisfy this sufficiently, it is convenient to configure the HPF 73 and the LPF 72 as FIR
filters of the same order. Using FIR filters makes it possible to obtain a constant group delay
characteristic, so that the signals can be combined consistently even when they are processed band
by band.
When the cutoff frequency of an FIR filter is very low (that is, when its ratio to the sampling
frequency is very small), a very high-order filter is required to obtain sufficient filter
performance. This follows from the fact that a large number of samples is needed to cover one wave
of the frequency to be blocked or passed. Since the order of the filter cannot be increased without
limit, this determines the lower limit of the variable range of the cutoff frequency. In the
configuration of FIG. 7C, both the LPF and the HPF are variable, so if the cutoff frequency is very
low, the orders of both the variable LPF 76 and the HPF 73 become very high. For this reason, in
the example of FIG. 7, 50 Hz is given as a lower limit that does not significantly affect the
signal in the audible range. As described above, the limit need not be 50 Hz and may be set
appropriately according to the available computational resources. In the example of FIG. 7A, since
only the HPF is variable, only one high-order filter is required. In terms of reducing the amount
of calculation, this configuration is superior to that of FIG. 7 (c).
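The phase-matching argument can be illustrated with a small sketch. It builds a Hamming-windowed
sinc low-pass filter and derives a high-pass filter of the same order by spectral inversion; this
particular design method, the 201-tap length, and the function names are assumptions for
illustration, not the filters of the embodiment. Because both filters are symmetric and of the same
length, they share the same group delay, and for equal cutoff frequencies their sum is a pure delay.

```python
import math

def lowpass_fir(num_taps, cutoff_hz, fs):
    """Hamming-windowed sinc low-pass filter with an odd number of taps."""
    assert num_taps % 2 == 1
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / fs                      # cutoff normalized to the sampling frequency
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]         # normalize to unity gain at DC

def highpass_from_lowpass(lp):
    """Spectral inversion: a delayed unit impulse minus the low-pass response."""
    mid = (len(lp) - 1) // 2
    return [(1.0 if n == mid else 0.0) - t for n, t in enumerate(lp)]

fs = 48000
lp = lowpass_fir(201, 1000.0, fs)
hp = highpass_from_lowpass(lp)
group_delay = (len(lp) - 1) / 2              # in samples, identical for both filters
recombined = [a + b for a, b in zip(lp, hp)] # a unit impulse delayed by group_delay
samples_per_cycle_50hz = fs / 50             # samples needed to span one 50 Hz period
```

At 48 kHz one period of 50 Hz spans 960 samples, which gives a feel for why a very low cutoff
frequency calls for a very high filter order.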
[0068]
On the other hand, the upper limit of the variable range is limited by the second microphone 7b
provided with the acoustic resistor 41. As schematically shown in FIG. 3B, the band of the subject
sound that can be acquired by the second microphone 7b is limited to f0 or below by the influence
of the acoustic resistor 41. Since the subject sound is not obtained above this frequency, the
cutoff frequencies of the variable LPF 76 and the HPF 73 in the example of FIG. 7 should be set
lower than f0. This cutoff frequency is f1 in FIG. 3, and clearly f1 < f0.
[0069]
The effects, variable operation, and the like of the HPF 52 will be described with reference to
FIGS. 1, 3, 6, and 8 to 11. As described with reference to FIGS. 3 and 6, wind noise is
concentrated at low frequencies, and the first microphone 7a and the second microphone 7b are
affected by it to significantly different degrees. That is, even if the wind is weak, a large wind
noise is generated in the first microphone 7a. Problems associated with this include saturation of
the ADC 54a and inappropriate operation of the ALC 61. Since the saturation of the ADC 54a is easy
to understand, its explanation is omitted, and the problem with the operation of the ALC 61 when
wind noise is generated will be described.
[0070]
In the absence of the HPF 52, large wind noise is generated in the first microphone 7a as shown
in FIG. Even when wind noise and subject sound are superimposed, it is assumed that wind noise
becomes dominant. Under such circumstances, the ALC 61 performs level adjustment with
reference to the wind noise level of the first microphone 7a. Thereafter, when the wind noise is
processed by the HPF 73 in the synthesizer 71, the level of the audio signal is greatly reduced. As
a result, there is a problem that the output from the adder 75 becomes very small. That is, the
signal level is in an inappropriate state.
[0071]
In order to solve the above-mentioned problem of saturation and signal level of the ADC, for
example, it is conceivable to apply the invention shown in Patent Document 1. An example of the
voice processing device 51 at this time is shown in FIG. The components in FIG. 8 having the
same functions as those in FIG. In FIG. 8, variable gains 62a and 62b are provided in front of the
ADCs 54a and 54b to avoid saturation of the ADC. Furthermore, another ALC 61b is provided
after the wind noise processing by the synthesizer 71, and the variable gain 62c and the level
adjuster 63b prevent the signal level after the wind processing from becoming inappropriate.
[0072]
However, the circuit of FIG. 8 also has two problems. One is the increase in circuit scale caused
by performing the ALC operation at two places. The other is the increase in quantization error
caused by the gain being raised by the ALC 61b disposed after the synthesizer 71. That is, the
level adjuster 63a performs level adjustment on a signal that includes wind noise, whereas the
level adjuster 63b performs level adjustment on a signal from which wind noise has been removed.
When the wind noise reduction effect is large, the level adjuster 63b must raise the gain
considerably. Since the signal has already been digitized at this point, the quantization error
increases with the level adjustment.
[0073]
The quantization error mentioned here will be briefly described. For example, to raise the gain by
12 dB in the level adjuster 63b, the digital signal may be shifted to the left by 2 bits. At this
time, however, there is no information corresponding to the lower 2 bits, so they must be filled
with an appropriate value (for example, 0). In this case, since the lower 2 bits are always 0, the
next value after 0 that can be expressed is 4 in decimal. Thus, the signal can be expressed only in
coarse discrete steps, and a quantization error occurs with respect to the natural (continuous)
signal.
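The effect of the 2-bit shift can be checked with a few integer samples. The sample values below
are arbitrary illustrations.

```python
import math

samples = [0, 1, 2, 3, 5, -3]              # arbitrary digitized sample values
amplified = [s << 2 for s in samples]      # left shift by 2 bits, i.e. multiply by 4
gain_db = 20.0 * math.log10(4)             # a factor of 4 corresponds to about 12 dB

# Every amplified value is a multiple of 4: the lower 2 bits carry no information.
lower_bits = [a & 0b11 for a in amplified]
print(amplified, lower_bits, round(gain_db, 2))
```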
[0074]
Consider now the HPF 52 shown in FIG. By setting the cutoff frequency of the HPF 52
appropriately, the main component of wind noise can be removed. As a result, saturation of the
ADC 54a can be prevented, and appropriate gain adjustment can be performed in the ALC 61. (At the
ALC 61, the subject sound is no longer buried in the wind noise, so the ALC operation can be
matched to the level of the subject sound.)
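A rough numerical illustration of why the position of the HPF 52 matters follows. It assumes a
deliberately simple ALC model that scales the signal so its peak level reaches a target; the model,
the amplitudes, the attenuation factor and the name `alc_gain` are all hypothetical.

```python
def alc_gain(peak_level, target=0.5):
    """Hypothetical ALC model: gain that brings the observed peak level to the target."""
    return target / peak_level

wind, voice = 1.0, 0.05                      # hypothetical amplitudes at the first microphone

# Without HPF 52: the ALC sees wind plus voice and sets a low gain.
gain_without = alc_gain(wind + voice)
voice_out_without = voice * gain_without     # level of the voice left after wind removal

# With HPF 52 ahead of the ALC: wind is attenuated first (factor 0.05 assumed).
wind_after_hpf = wind * 0.05
gain_with = alc_gain(wind_after_hpf + voice)
voice_out_with = voice * gain_with

print(round(voice_out_without, 4), round(voice_out_with, 4))
```

In this toy model the voice ends up roughly ten times larger when the wind noise is removed before
the ALC rather than after it.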
[0075]
An example of the control sequence of the cutoff frequency in the HPF 52 will be described with
reference to FIG. 9. FIG. 9 (a) shows the operation sequence of the switch 87, FIG. 9 (b) shows the
operation sequence of the HPF 52, FIG. 9 (c) shows the operation sequence of the variable gain 74,
and FIG. 9 (d) shows the operation sequence of the HPF 73. In FIGS. 9A to 9D, the horizontal axis
is common and indicates the magnitude of wind noise. Wn1, Wn2 and Wn3 are values indicating the
magnitude of wind noise, with wind noise becoming stronger in this order. The operations of FIGS.
9C and 9D are the same as those of FIG. 7B, and their description is omitted.
[0076]
If the wind noise is smaller than the predetermined value Wn1, it is determined that wind
processing is not necessary, and the switch 87 is turned on so that the adaptive operation of the
reverberation suppressor 53 described above is performed. The cutoff frequency of the HPF 52 is set
to 0 Hz (that is, the signal passes through without HPF operation). Because the signal of the
second microphone 7b provided with the acoustic resistor 41 need not be used, the voice of the
subject is considered to be obtained faithfully.
[0077]
If the wind noise exceeds the level of Wn1, it is determined that the wind noise is generated, and
the switch 87 is turned off to stop the adaptive operation of the adaptive filter in the
reverberation suppressor 53 described above. By performing such control, it is possible to
suppress inappropriate adaptive operation.
[0078]
Next, the case where the wind noise is between Wn1 and Wn2 will be described. In this case, the
cutoff frequency of the HPF 52 is raised stepwise within a range not exceeding the cutoff frequency
of the HPF 73. This control reduces the wind noise picked up by the first microphone 7a. Moreover,
because the cutoff frequency of the HPF 52 is kept from exceeding that of the HPF 73, the HPF 52
does not greatly influence the output of the HPF 73.
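The stepwise control can be sketched as follows. The step values, the thresholds, the linear HPF 73
schedule, the behavior above Wn2, and all function names are hypothetical; the sketch only
illustrates the two rules stated in the text, namely that the switch 87 is on only below Wn1 and
that the HPF 52 cutoff never exceeds the HPF 73 cutoff.

```python
WN1, WN2, WN3 = 0.1, 0.4, 0.8                       # hypothetical wind noise thresholds
HPF52_STEPS_HZ = (0.0, 100.0, 200.0, 400.0, 800.0)  # hypothetical selectable analog cutoffs

def hpf73_cutoff(wind):
    """Assumed piecewise-linear HPF 73 schedule: 50 Hz, rising to 1 kHz, then to 2 kHz."""
    if wind < WN1:
        return 50.0
    if wind < WN2:
        return 50.0 + (wind - WN1) / (WN2 - WN1) * 950.0
    if wind < WN3:
        return 1000.0 + (wind - WN2) / (WN3 - WN2) * 1000.0
    return 2000.0

def hpf52_cutoff(wind):
    """Pick the highest available step that does not exceed the HPF 73 cutoff."""
    if wind < WN1:
        return 0.0                                  # through, no HPF operation
    limit = hpf73_cutoff(wind)
    return max(step for step in HPF52_STEPS_HZ if step <= limit)

def adaptation_switch_on(wind):
    """Switch 87: adaptation runs only while no wind noise is detected."""
    return wind < WN1
```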
[0079]
The effects of this will be described. Since the HPF 52 is provided in the analog section
(preceding the ADC) of the audio processing device 51, it is generally configured as an IIR filter
(an HPF formed by an RC circuit). Such an HPF 52 cannot provide a constant group delay
characteristic. On the other hand, even in an IIR filter the phase delay is small in the pass band,
so the phase delay has little effect there even though the group delay characteristic is not
satisfied. By controlling the cutoff frequencies of the HPF 52 and the HPF 73 as described above,
the influence of the phase delay due to the IIR filter can be reduced. As described above, when
signals are separated into bands and combined, it is preferable that the phases in the respective
paths coincide, particularly in the frequency band in which the signals of the microphones overlap.
The above shows that the adverse effect can be reduced even in a situation where this condition
cannot be strictly met. Further, as described above, the HPF 52 is provided in the analog section
of the audio processing device 51. If the cutoff frequency were changed continuously in an analog
circuit, the circuit scale would become large. The circuit can be realized with a simple
configuration by using a circuit suitable for the control sequence as described in FIG.
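The claim that the phase shift is small in the pass band can be checked for a first-order RC
high-pass filter, whose response is H(f) = j(f/fc) / (1 + j(f/fc)). Modeling the HPF 52 as exactly
first order, and the 100 Hz cutoff used below, are assumptions for illustration.

```python
import cmath
import math

def rc_highpass_response(freq_hz, cutoff_hz):
    """Complex frequency response of a first-order RC high-pass filter."""
    ratio = 1j * freq_hz / cutoff_hz
    return ratio / (1.0 + ratio)

def phase_shift_degrees(freq_hz, cutoff_hz):
    """Phase shift introduced by the filter at a given frequency, in degrees."""
    return math.degrees(cmath.phase(rc_highpass_response(freq_hz, cutoff_hz)))

cutoff = 100.0                                      # hypothetical HPF 52 cutoff in Hz
at_cutoff = phase_shift_degrees(100.0, cutoff)      # 45 degrees at the cutoff itself
in_passband = phase_shift_degrees(1000.0, cutoff)   # a few degrees one decade above
```

One decade above the cutoff the phase shift is under 6 degrees, which is consistent with keeping
the HPF 52 cutoff below the HPF 73 cutoff so that the band actually used lies in the pass band.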
[0080]
FIGS. 10 and 11 show examples of signals processed by the circuit described above. FIG. 10 shows
the case where the HPF 52 is not provided, and FIG. 11 shows the case where the HPF 52 is
provided. The signal of FIG. 10 is a signal processed with the HPF 52 removed from the state of
FIG. As shown in the drawings, the graphs show, in order from the top, the output of the gain 62a,
the output of the gain 62b, the output of the HPF 73, the output of the LPF 72, and the output of
the adder 75. The horizontal axis shows time and is common to all graphs. In the examples of FIGS.
10 and 11, the subject starts speaking at around 2.5 seconds (the human voice is the sound to be
picked up). The signals shown in FIGS. 10 and 11 were processed assuming that the wind noise level
was at the level of Wn2 in FIG.
[0081]
The part before 2.5 seconds is a state of wind noise only, as shown in FIG. Focusing only on this
portion, the output of the gain 62a in FIGS. 10 and 11 seems to be larger in FIG. In fact, this is
because the gain has been increased by the ALC 61. This becomes clear when looking at the part
after 2.5 seconds, where the subject sound is superimposed.
[0082]
Focusing on the gain 62b output after 2.5 seconds, it can be seen that the signal of FIG. 10 is
clearly lower in level than the signal of FIG. This is because the ALC 61 adjusts the level to the
wind noise generated in the first microphone 7a, so the gain becomes small and, as a result, the
subject sound is acquired at a very low level. In contrast, in the signal of FIG. 11 the wind noise
generated in the first microphone 7a is reduced by the effect of the HPF 52, so the gain of the ALC
61 is kept high compared to the state of FIG.
[0083]
Focusing on the HPF 73 output in FIG. 10, it can be seen that the wind noise is considerably
reduced by appropriately processing the cutoff frequency of the HPF 73. However, since the
signal level of the HPF 73 is greatly reduced compared to the signal level of the gain 62a output,
it can be seen that the signal level of the final output of the adder 75 is very small.
[0084]
On the other hand, also in FIG. 11, it can be seen that the wind noise is considerably reduced by
appropriately processing the cutoff frequency of the HPF 73. Further, it can be seen that, since
the output of the LPF 72 is kept large, the signal level of the final output of the adder 75 is also
kept at a sufficient level.
[0085]
As such, by disposing the HPF 52 closer to the microphone than the ADC and ALC, it is possible
to obtain high quality voice.
[0086]
An example of another circuit configuration of this embodiment is shown in FIG.
FIG. 12 (a) is an example in which ALC is disposed in the analog unit, and FIG. 12 (b) is an
example in which ALC 61 is disposed behind the synthesizer 71. Even with such a configuration,
it is possible to obtain the effects shown in the present embodiment.
[0087]
As described above, according to the present invention, it is possible to obtain high-quality voice
in which reverberation is suppressed while reducing wind noise by an acoustic resistor.
[0088]
Second Embodiment
Hereinafter, with reference to FIGS. 13 and 14, a recording apparatus and an imaging apparatus
provided with the recording apparatus according to a second embodiment of the present invention
will be described.
In the second embodiment, components that operate in the same way as in the first embodiment are
denoted by the same reference numerals.
[0089]
FIG. 13 is a perspective view of the imaging device. FIG. 13 is similar to FIG. 2 but with the
addition of an opening 32c for the microphone. A microphone 7 c (not shown) is provided at the
back of the opening 32 c.
[0090]
FIG. 14 is a view for explaining the main part of the speech processing device 51 corresponding
to the device shown in FIG. FIG. 14 illustrates an extension to stereo based on the circuit of the
first embodiment that performs ALC in the analog section, as shown in FIG. 12A. The reverberation
suppressor 53 and the level detector 86 are drawn in simplified, modified notation. In contrast to
the first embodiment, the number of first microphones is increased to two. Here, the microphones 7a
and 7c constitute the left and right channels of stereo, and are designed to have equal
characteristics. The second microphone 7b is provided with the acoustic resistor 41 and has the
same characteristics as in the first embodiment.
[0091]
The HPF 52b, the gain 62c, the ADC 54c, the HPF 56c for cutting the DC component, and the HPF 73b
added in FIG. 14 operate in the same way as the HPF 52, the gain 62a, the ADC 54a, the HPF 56a for
cutting the DC component, and the HPF 73 of the first embodiment, respectively. Here, the delay
units 55a and 55b, whose operation changes, and the newly added phase comparator 57, adder 58, and
gain 59 will be described.
[0092]
In a stereo recording apparatus, the stereo impression is produced by the phase difference between
the audio signals. In the arrangement shown in FIG. 13, the second microphone 7b is arranged
between the first microphones 7a and 7c. In such a configuration, the phase of the signal of the
second microphone 7b lies between the phases of the signals of the microphones 7a and 7c. For
example, when the second microphone 7b is disposed exactly midway, equidistant from the microphone
7a and the microphone 7c, its phase is also exactly in the middle. Therefore, in the circuit of
FIG. 14, the phase difference between the microphones 7a and 7c is calculated, and a corresponding
delay is applied by the delay units 55a and 55b.
[0093]
For example, consider the case where the signal of the microphone 7c is delayed relative to the
signal of the microphone 7a. In this case, as described later, the reverberation suppressor 53 is
adapted to match the intermediate signal. When that signal is mixed with the signal of the
microphone 7a, its phase should be advanced, and when it is mixed with the signal of the
microphone 7c, its phase should be delayed. In the first embodiment, a delay of half the filter
order of the reverberation suppressor 53 (= M / 2) was sufficient; here, the delay unit 55a gives a
delay smaller than this, and the delay unit 55b gives a delay larger than this. The absolute
amounts depend on the arrangement of the microphones. For example, when the second microphone 7b is
positioned midway between the first microphones 7a and 7c as described above, each delay can be
shifted by half of the phase difference calculated by the phase comparator 57. With the above
processing, an audio signal can be obtained without losing the stereo impression.
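For the midway arrangement, the delay assignment can be written down directly. The sketch assumes
that the phase difference reported by the phase comparator 57 is expressed as a delay in samples;
the function name and the example numbers are hypothetical.

```python
def stereo_delays(filter_order_m, phase_difference_samples):
    """Return the delays for delay units 55a and 55b, in samples.

    The base delay M / 2 of the first embodiment is shifted by half of the measured
    inter-microphone difference: smaller for 55a, larger for 55b.
    """
    base = filter_order_m / 2
    half = phase_difference_samples / 2
    return base - half, base + half

delay_55a, delay_55b = stereo_delays(filter_order_m=64, phase_difference_samples=4)
```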
[0094]
The adder 58 and the gain 59 will be described. The adder 58 adds the signals of the microphone 7a
and the microphone 7c, and the gain 59 halves the output of the adder 58. As a result, the output
of the gain 59 is the average of the signals of the microphones 7a and 7c, and its phase is
intermediate between the phases of the microphone 7a and microphone 7c signals. The BPF 82a passes
only a band of about 30 Hz to 1 kHz, as in the first embodiment. Furthermore, the voice processing
device 51 is configured to be able to acquire sound at frequencies higher than the passband of the
BPF 82a. Within the acquirable audio band, the phase is not reversed between the microphone 7a and
microphone 7c signals. Consequently, within the band passed by the BPF 82a alone, the phase
difference between the microphone 7a and microphone 7c signals is small, and the levels of the
signals in the passband of the BPF 82a can be regarded as simply adding. Therefore, by halving the
output with the gain 59, a signal is obtained whose level is almost the same as that of the
microphones 7a and 7c and whose phase is in the middle. In this embodiment, the reverberation
suppressor 53 is operated so as to match the output of the gain 59 described above.
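The statement that the average has an intermediate phase and almost the same level follows from
the identity (sin(wt) + sin(wt - p)) / 2 = cos(p / 2) sin(wt - p / 2). The sketch below checks it
numerically for a 1 kHz tone; the 50 microsecond inter-microphone delay and the function names are
hypothetical.

```python
import math

def average_of_pair(freq_hz, delay_s, t):
    """Average of a tone at microphone 7a and the same tone delayed at microphone 7c."""
    a = math.sin(2.0 * math.pi * freq_hz * t)
    c = math.sin(2.0 * math.pi * freq_hz * (t - delay_s))
    return 0.5 * (a + c)

def midway_tone(freq_hz, delay_s, t):
    """Tone with half the phase shift, scaled by cos(p / 2)."""
    p = 2.0 * math.pi * freq_hz * delay_s
    return math.cos(p / 2.0) * math.sin(2.0 * math.pi * freq_hz * t - p / 2.0)

freq, delay = 1000.0, 50e-6                 # top of the BPF 82a band, assumed delay
level_factor = math.cos(math.pi * freq * delay)
max_error = max(abs(average_of_pair(freq, delay, i * 1e-4) - midway_tone(freq, delay, i * 1e-4))
                for i in range(20))
```

Even at 1 kHz the level factor stays close to 1 for this delay, so halving the sum gives a signal
of nearly the original level.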
[0095]
According to the above configuration, the present invention can be easily applied to an apparatus
for recording in stereo without impairing the stereo feeling.
[0096]
Although this embodiment describes the stereo case (two first microphones for obtaining high
frequencies), the configuration can easily be extended to a recording apparatus having more
microphones.
[0097]
Third Embodiment
Hereinafter, with reference to FIG. 15, a recording apparatus and an imaging apparatus provided
with the recording apparatus according to a third embodiment of the present invention will be
described.
In the third embodiment, components that operate in the same way as in the first embodiment are
denoted by the same reference numerals.
[0098]
A perspective view of an imaging apparatus provided with a recording apparatus according to
the third embodiment is omitted because it is the same as FIG. 2 of the first embodiment.
FIG. 15 is a view for explaining the main part of the speech processing unit 51 in the third
embodiment. In FIG. 15, an up sampler 96 that changes the sampling frequency of the audio
signal is disposed before the LPF 72. Also, unlike the first embodiment, different values are set
for sampling frequencies in the ADCs 54a and 54b. The sampling frequency of the ADC 54b is
set to a lower value than the sampling frequency of the ADC 54a. The sampling frequency of the
ADC 84 is set to the same value as that of the ADC 54 b.
[0099]
The ADC 54b, the ADC 84, the reverberation suppressor 53, and the newly formed up sampler
96 will be described.
[0100]
The output of the first microphone 7a is branched and sent to the wind detector 81, passes
through the BPF 82a, and is A / D converted by the ADC 84 at a sampling frequency lower than that
of the ADC 54a.
This sampling frequency is chosen within a range in which the band passed by the BPF 82a can be
reproduced, and is preferably the sampling frequency of the ADC 54a divided by an integer.
For example, when the pass band of the BPF 82a is 30 Hz to 1 kHz and the sampling frequency of the
ADC 54a is 48 kHz, the sampling frequency is set to 3 kHz, which is 1/16 of 48 kHz.
The output of the ADC 84 is delayed by the delay unit 85 and sent to the difference unit 83.
[0101]
On the other hand, the signal of the second microphone 7b is A / D converted by the ADC 54b at the
same sampling frequency as that of the ADC 84. Then, after the reverberation is suppressed by the
reverberation suppressor 53, the signal is branched and sent to the wind detector 81, passes
through the BPF 82b, and is sent to the difference unit 83. Because the ADC 54b lowers the sampling
frequency to 1/16, the filter order M of the reverberation suppressor 53 needed to obtain the same
effect as before is also reduced to 1/16, which leads to a reduction in circuit scale and amount of
calculation. As the filter order M of the reverberation suppressor 53 decreases, the delay amount
of the delay unit 85 also decreases. The operations of the difference unit 83 and the subsequent
stages are the same as those of the first embodiment, and their description is omitted.
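The saving can be put into numbers with a short calculation. The full-rate filter order of 512 is
a hypothetical value, and the cost model (one multiply-accumulate per tap per sample, ignoring the
coefficient update) is a simplifying assumption; the 48 kHz and 3 kHz rates and the 1 kHz band
edge come from the text.

```python
FULL_RATE_HZ = 48000      # sampling frequency of ADC 54a
LOW_RATE_HZ = 3000        # sampling frequency of ADC 54b and ADC 84
ORDER_AT_FULL_RATE = 512  # hypothetical filter order M at 48 kHz

def scaled_filter_order(order, full_rate_hz, low_rate_hz):
    """Order needed to span the same time window at the lower sampling frequency."""
    return order * low_rate_hz // full_rate_hz

def macs_per_second(order, rate_hz):
    """Multiply-accumulates per second, counting one per tap per sample."""
    return order * rate_hz

order_low = scaled_filter_order(ORDER_AT_FULL_RATE, FULL_RATE_HZ, LOW_RATE_HZ)
saving = macs_per_second(ORDER_AT_FULL_RATE, FULL_RATE_HZ) / macs_per_second(order_low, LOW_RATE_HZ)
nyquist_ok = LOW_RATE_HZ / 2 > 1000   # the 1 kHz upper edge of the BPF band is representable
```

Under this cost model the order drops by a factor of 16 and the filtering work per second by a
factor of 256, because both the number of taps and the number of samples per second shrink.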
[0102]
One of the branched outputs of the reverberation suppressor 53 passes through the HPF 56b, is
gain-adjusted by the ALC 61, and is sent to the up sampler 96. The up sampler 96 converts the
output of the variable gain 62b to the same sampling frequency as that of the ADC 54a and sends it
to the LPF 72. Although upsampling may produce aliasing (image) components, the LPF 72 reduces the
high frequency components and removes them.
[0103]
The operations of the HPF 52 and the subsequent stages after the first microphone 7a and the
LPF 72 and the subsequent steps are the same as those in the first embodiment, and thus the
description thereof is omitted.
[0104]
With the above configuration, the circuit scale and the amount of calculation can be reduced by
downsampling the low frequency component and performing the dereverberation processing.
Furthermore, by performing up-sampling after the dereverberation processing, high-quality
speech can be obtained.
[0105]
Fourth Embodiment
Hereinafter, with reference to FIGS. 16 and 17, a recording apparatus and an imaging apparatus
provided with the recording apparatus according to a fourth embodiment of the present invention
will be described. In the fourth embodiment, components that operate in the same way as in the
first embodiment are denoted by the same reference numerals.
[0106]
A perspective view of an imaging apparatus provided with a recording apparatus according to
the fourth embodiment is omitted because it is the same as FIG. 2 of the first embodiment. FIG.
16 is a view for explaining the main part of the speech processing unit 51 in the fourth
embodiment. Reference numeral 97 in FIG. 16 denotes a cross-correlation calculator that receives
the branched outputs of the BPF 82b and the delay unit 85, calculates the cross-correlation value
of the two signals, and determines whether sound arrives from a plurality of sound source
directions. The operation of the cross-correlation calculator 97 will be described later. FIG. 17
schematically shows the positional relationship between the sound sources of the subject sound and
the microphones 7a and 7b, together with the propagation of sound. FIG. 17 (a) shows the case where
the subject sound propagates from one direction, and FIG. 17 (b) shows the case where the subject
sound propagates from two directions.
[0107]
Problems that arise when the subject sound propagates from two directions will be described with
reference to FIG. The subject sound emitted from a subject O1 is denoted s1, and the subject sound
emitted from a subject O2, located in a different direction from the subject O1, is denoted s2. The
transfer function of the sound propagating from the subject O1 to the microphone 7a is T1a, and the
transfer function of the sound propagating to the microphone 7b is T1b. Similarly, the transfer
functions of the sound propagating from the subject O2 to the microphones 7a and 7b are T2a and
T2b, respectively. When the sound source of the subject sound is in a single direction as shown in
FIG. 17A, the audio signals x1 and x2 obtained by the microphones 7a and 7b are expressed by the
following equations.
[0108]
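A form consistent with the definitions of s1, T1a and T1b given above, writing the action of each
transfer function as a product (an assumption about notation), is:

```latex
x_1 = T_{1a}\, s_1, \qquad x_2 = T_{1b}\, s_1
```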
[0109]
A delay arises between the signal x1 of the microphone 7a and the signal x2 of the microphone 7b
because the two microphones are at different distances from the source of the subject sound, but
the correlation between the two signals is very high.
On the other hand, when the subject sound propagates from two directions as shown in FIG. 17B,
the audio signals x1 and x2 obtained by the microphones 7a and 7b are expressed by the
following equations.
[0110]
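A form consistent with the definitions of s1, s2 and the four transfer functions T1a, T1b, T2a and
T2b, again writing the action of each transfer function as a product (an assumption about
notation), is:

```latex
x_1 = T_{1a}\, s_1 + T_{2a}\, s_2, \qquad x_2 = T_{1b}\, s_1 + T_{2b}\, s_2
```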
[0111]
A delay arises between the signal x1 of the microphone 7a and the signal x2 of the microphone 7b
according to the distances between the two microphones 7a and 7b and the two subjects O1 and O2.
As the two subjects O1 and O2 move farther apart, the delay between T1a and T1b and the delay
between T2a and T2b diverge, so that the correlation between the two signals decreases. As a
result, the problem arises that the reverberation suppressor 53 is not updated correctly.
[0112]
Therefore, in the imaging apparatus provided with the recording apparatus according to the
fourth embodiment, the cross-correlation calculator 97 is provided, and the above problem is solved
by stopping the learning of the reverberation suppressor 53 when the cross-correlation value of the
two signals is lower than a specified value.
[0113]
The operation of the cross correlation calculator 97 will be described.
The cross correlation calculator 97 receives the branched outputs of the BPF 82b and the delay
unit 85. These are audio signals in the frequency band of 30 Hz to 1 kHz, originating from the
microphone 7a and the microphone 7b and band-limited by the BPF 82a and the BPF 82b,
respectively. Let these signals be x1_BPF and x2_BPF, respectively. The cross correlation
calculator 97 calculates the cross-correlation value of the two signals as follows. The
cross-correlation value R (n) of the two signals at the nth sample when the data length is N can
be obtained by the following equation.
[0114]
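The equation for this paragraph is not reproduced in this text. Assuming a zero-lag cross-correlation taken over the most recent N samples (the window indexing is an assumption made here for illustration), a plausible reconstruction is:

```latex
R(n) = \sum_{k=0}^{N-1} x_{1,\mathrm{BPF}}(n-k)\, x_{2,\mathrm{BPF}}(n-k)
```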
[0115]
Further, when this is normalized by x1_BPF, it is expressed as the following equation.
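The normalized equation is not reproduced in this text. Assuming that "normalized by x1_BPF" means division by the energy of x1_BPF over the same N-sample window (an assumption; the window indexing is likewise illustrative), a plausible reconstruction is:

```latex
R_{\mathrm{norm}}(n) =
\frac{\displaystyle\sum_{k=0}^{N-1} x_{1,\mathrm{BPF}}(n-k)\, x_{2,\mathrm{BPF}}(n-k)}
     {\displaystyle\sum_{k=0}^{N-1} x_{1,\mathrm{BPF}}(n-k)^{2}}
```

With this form, Rnorm (n) equals 1 when the two band-limited signals coincide over the window.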
[0116]
Ideally, Rnorm (n) has a maximum value of 1 when the direction of the subject sound is
unidirectional.
However, when the sound source direction in which the subject sound is generated is in two or
more directions, the cross correlation between the two signals is low, so Rnorm (n) is lower than
1.
Therefore, when the calculated normalized cross-correlation value Rnorm (n) is lower than a
predetermined value Rn1, it is determined that the subject sound arrives from two or more
sound source directions, and the switch 87 is turned OFF, so that the adaptation operation of
the reverberation suppressor 53 is stopped.
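The determination described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: it assumes a zero-lag correlation over one frame of N samples, normalized by the energy of x1_BPF, and the function names and the threshold value used for Rn1 are hypothetical.

```python
import math


def normalized_cross_correlation(x1_bpf, x2_bpf):
    """Zero-lag cross-correlation of two equal-length frames,
    normalized by the energy of x1_bpf (assumed normalization)."""
    if len(x1_bpf) != len(x2_bpf):
        raise ValueError("frames must have the same length N")
    energy = sum(a * a for a in x1_bpf)
    if energy == 0.0:
        # Silent frame: treat as uncorrelated so that adaptation is not enabled.
        return 0.0
    return sum(a * b for a, b in zip(x1_bpf, x2_bpf)) / energy


def adaptation_enabled(x1_bpf, x2_bpf, rn1=0.8):
    """Return False (switch 87 OFF) when Rnorm is lower than Rn1.
    The default rn1 is a hypothetical value, not taken from the source."""
    return normalized_cross_correlation(x1_bpf, x2_bpf) >= rn1


# Example frames: a 200 Hz tone sampled at 48 kHz (two full periods),
# and the same tone shifted by a quarter period.
N = 480
tone = [math.sin(2 * math.pi * 200 * k / 48000) for k in range(N)]
shifted = [math.cos(2 * math.pi * 200 * k / 48000) for k in range(N)]
```

Calling adaptation_enabled(tone, tone) keeps the adaptation running, whereas adaptation_enabled(tone, shifted) stops it, because the quarter-period shift drives the zero-lag correlation toward zero.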
[0117]
Also in the imaging apparatus according to the third embodiment, the switch 87 is switched
based on the detection result of the level detector 86 as in the first embodiment. That is, when
the cross correlation calculator 97 detects that the cross correlation value is lower than Rn1, or
when the level detector 86 detects that the wind noise exceeds the level Wn1, the switch 87 is
turned OFF. Thus, the adaptive operation of the adaptive filter in the reverberation suppressor
53 is stopped.
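The combined control of the switch 87 can be sketched as follows. This is an illustration under stated assumptions: the source does not specify the adaptation algorithm in this passage, so a normalized LMS (NLMS) update is swapped in here purely as an example, and all function names, parameter names and numeric values are hypothetical.

```python
def switch_87_on(r_norm, wind_level, rn1, wn1):
    """Switch 87 is OFF when the correlation is lower than Rn1
    or when the wind noise level exceeds Wn1; otherwise it is ON."""
    low_correlation = r_norm < rn1
    strong_wind = wind_level > wn1
    return not (low_correlation or strong_wind)


def gated_nlms_step(weights, x_frame, error, adapt, mu=0.5, eps=1e-8):
    """One coefficient update of an assumed NLMS adaptive filter.
    When adapt is False, the coefficients are held unchanged."""
    if not adapt:
        return list(weights)
    norm = sum(x * x for x in x_frame) + eps
    return [w + mu * error * x / norm for w, x in zip(weights, x_frame)]


# Hypothetical thresholds and detector readings, for illustration only.
RN1, WN1 = 0.8, 0.3
adapt = switch_87_on(r_norm=0.95, wind_level=0.1, rn1=RN1, wn1=WN1)
updated = gated_nlms_step([0.0, 0.0], [1.0, 0.0], error=1.0, adapt=adapt)
held = gated_nlms_step([0.2, -0.1], [1.0, 0.0], error=1.0, adapt=False)
```

Holding the coefficients rather than resetting them is a design choice in this sketch: the filter keeps its last learned state and resumes from it once both conditions clear.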
[0118]
By performing such control, appropriate adaptation operation can be performed even when
subject sound propagates from two or more directions, and high-quality sound can be obtained.
[0119]
Other Embodiments
The present invention is also realized by executing the following processing. That is, software
(a program) for realizing the functions of the above-described embodiments is supplied to a
system or apparatus via a network or various storage media, and a computer (or a CPU, MPU or
the like) of the system or apparatus reads and executes the program. In this case, the program
and the storage medium storing the program constitute the present invention.