DESCRIPTION JP2014126854

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
Abstract: To realize good automatic level control (ALC) in which distortion does not occur even
when attack sounds are continuously input in a short cycle. A first ALC function unit 11 performs a
limit operation to lower a gain when the amplitude level of an input audio signal exceeds the
upper limit value of a predetermined range, and performs a recovery operation to increase the
gain when the amplitude level is smaller than the lower limit value of the predetermined range.
A second ALC function unit 12 performs the limit operation or the recovery operation again on
the audio signal whose amplitude level has been adjusted by the first ALC function unit 11. Here,
the time constant of the gain increase in the recovery operation of the first ALC function unit 11
is made larger than the time constant of the gain increase in the recovery operation of the second
ALC function unit 12. [Selected figure] Figure 1
Voice processing apparatus and control method thereof
[0001]
The present invention relates to an audio processing device and a control method thereof.
[0002]
BACKGROUND Conventionally, there has been known a voice processing apparatus having an
automatic level control (ALC) function that controls the magnitude of input voice to an
appropriate level (see, for example, Patent Document 1).
11-04-2019
1
ALC generally performs control to suppress the level if the input sound is excessive (limit
operation) and amplify the level if the input sound is too low (recovery operation). Here, there is
a problem in coping with the case where a sudden sound is input, that is, a sound that rises
sharply and then falls rapidly. Such sound is generally called "attack sound". Specifically, when
the rising portion of the attack sound is input, the level is suppressed by the limit operation.
Thereafter, when the falling edge of the attack sound is input, the level is increased by the
recovery operation. However, because the attack sound falls rapidly, the recovery cannot keep
up, and there is a problem that the voice level immediately after the falling portion becomes
small and difficult to hear.
[0003]
Therefore, in the recovery operation at the time of the attack sound detection, it is conceivable to
accelerate the recovery reaction by raising the level amplification factor more than usual.
[0004]
JP 2008-129107 A
[0005]
However, when attack sounds are continuously input in a short cycle, raising the amplification
factor of the level more than usual in the recovery operation after one attack sound causes the
sound to be clipped and distorted at the rising portion of the next attack sound.
[0006]
The present invention has been made to solve such problems.
That is, the present invention realizes good automatic level control without distortion even when
attack sound is continuously inputted in a short cycle.
[0007]
According to one aspect of the present invention, there is provided an audio processing device
that adjusts the amplitude level of an input audio signal so that the amplitude level falls within a
predetermined range, the device comprising: first level control means for performing a limit
operation to lower a gain when the amplitude level of the input audio signal exceeds the upper
limit value of the predetermined range, and a recovery operation to increase the gain when the
amplitude level is lower than the lower limit value of the predetermined range; and second level
control means for performing the limit operation or the recovery operation again on the audio
signal whose amplitude level has been adjusted by the first level control means, wherein the time
constant of the gain increase during the recovery operation in the first level control means is
larger than the time constant of the gain increase during the recovery operation in the second
level control means.
[0008]
According to the present invention, it is possible to realize good automatic level control without
distortion even when attack sound is continuously input in a short cycle.
[0009]
FIG. 1 is a diagram showing the configuration of the ALC unit of the voice processing device
according to the first embodiment.
FIG. 2 is a flowchart showing the operation of the ALC unit according to the first embodiment.
FIG. 3 is a flowchart showing the operation of the zero cross detection unit.
FIG. 4 is a flowchart showing the operation of the amplitude level determination unit.
FIG. 5 is a flowchart showing the operation of the first and second amplitude adjustment units
according to the first embodiment.
FIG. 6 is a flowchart showing the operation of the first amplitude gain determination unit
according to the first embodiment.
FIG. 7 is a diagram showing the ALC operation when a plurality of attack sounds are
continuously input in the first embodiment.
FIG. 8 is a diagram showing the configuration of the ALC unit of the voice processing device
according to the second embodiment.
FIG. 9 is a flowchart showing the operation of the ALC unit according to the second embodiment.
FIG. 10 is a diagram showing the configuration of the ALC unit of the voice processing device
according to the third embodiment.
FIG. 11 is a flowchart showing the operation of the ALC unit according to the third embodiment.
FIG. 12 is a flowchart showing the operation of the amplitude gain adjustment unit according to
the third embodiment.
FIG. 13 is a flowchart showing the operation of the second amplitude gain determination unit
according to the third embodiment.
FIG. 14 is a diagram showing a configuration example of the automatic level control unit of a
voice processing device.
FIG. 15 is a flowchart showing the operation of the ALC unit of FIG. 14.
FIG. 16 is a flowchart showing the operation of the attack sound determination unit.
FIG. 17 is a flowchart showing the operation of the amplitude gain determination unit.
FIG. 18 shows (a) the ALC operation for an attack sound and (b) the ALC operation for a sound
that is not an attack sound.
FIG. 19 is a diagram showing the ALC operation when a plurality of attack sounds are
continuously input.
[0010]
Hereinafter, embodiments of the present invention will be described in detail with reference to
the attached drawings. In addition, the structure shown in the following embodiment is only an
example, and this invention is not limited to the illustrated structure.
[0011]
The following embodiments will be described for the speech processing apparatus, but any
apparatus capable of processing speech may be used. The voice processing device may be, for
example, an imaging device, a mobile phone, a smartphone, a personal computer, an IC recorder,
a car navigation system, a car having a voice recognition function, and the like. These audio
processing devices include a block that controls an audio signal collected by a sound collection
unit such as a microphone.
[0012]
<Description of Recovery Operation at Detection of Attack Sound> The present invention relates
to a voice processing apparatus having a function of automatic level control (ALC) for adjusting
the amplitude level so that the amplitude level of the input voice signal falls within a
predetermined range. Before describing the embodiment of the present invention in detail, the
recovery operation upon detection of an attack sound in automatic level control of the voice
processing device will be described.
[0013]
FIG. 14 shows an example of the configuration of the ALC unit of the speech processing
apparatus. In FIG. 14, an audio input unit 1501 receives an audio signal from a sound source
such as a microphone or an audio reproduction device. The audio signal input to the audio input
unit 1501 has had its DC component removed. Therefore, voice signals of positive and negative
values centered on 0 are input. In the present specification, the "voice signal"
includes various sounds as well as human voice. The amplitude adjustment unit 1502 adjusts the
amplitude of the input audio signal by the gain 1507 and outputs the result to the audio
output unit 1503. The amplitude level determination unit 1509 determines the amplitude level of
the audio signal of the audio output unit 1503. The zero cross detection unit 1504 detects the
zero cross of the value of the audio signal of the audio input unit 1501. The attack sound
determination unit 1510 measures the period from when the amplitude level 1508 output from the
amplitude level determination unit 1509 suddenly increases until it decreases again, and determines
whether or not it is an attack sound. The amplitude gain determination unit 1506 performs
control so that the amplitude level 1508 falls between the lower limit value TH_MIN and the
upper limit value TH_MAX according to the zero cross detection result 1505, the amplitude level
1508, and the attack sound determination result 1511. By this control, the amplitude gain
determination unit 1506 determines the gain of the amplitude adjustment unit 1502, and
outputs the gain 1507.
[0014]
The operation of the ALC unit in FIG. 14 will be described below. Here, although the case where
speech is converted to a digital signal at the sampling frequency Fs will be described, the same
applies to an analog signal. FIG. 15 is a flowchart showing the operation of the ALC unit of FIG. 14.
First, it is determined whether or not the current time is a sampling timing (S1601). If it is a
sampling timing, input of an audio signal from the audio input unit 1501 and output of an audio
signal to the audio output unit 1503 are performed (S1602). Next, the amplitude level
determination unit 1509 determines the amplitude level of the audio signal (S1603), and the
zero cross detection unit 1504 performs zero cross detection of the audio signal (S1604). Next,
the attack sound determination unit 1510 determines an attack sound (S1605), and the
amplitude gain determination unit 1506 determines an amplitude gain (S1606). Thereafter, the
amplitude adjustment unit 1502 adjusts the amplitude of the audio signal using the gain 1507
which is the output of the amplitude gain determination unit 1506 (S1607), and waits until the
next sample timing comes.
[0015]
FIG. 3 is a flowchart showing the operation of the zero cross detection unit 1504. First, the
sample value of the audio signal input via the audio input unit 1501 is set to DIN (S301). If the
sign differs from that of DIN_D, which is the input at the previous sampling timing, that is, if
DIN > 0 and DIN_D < 0 (S302 YES), a value 1 indicating zero cross detection is output as the zero
cross detection result (S305). Likewise, if DIN < 0 and DIN_D > 0 (S303 YES), the value 1 indicating
zero cross detection is output (S305). When DIN is 0, a value 1 indicating zero cross
detection is also output as the zero cross detection result (S305). In other cases, a value 0
indicating zero cross non-detection is output as the zero cross detection result (S305). The zero cross
detection result 1505 thus obtained is transmitted to the amplitude gain determination unit
1506. Then, for the next processing, the current DIN is substituted into DIN_D (S307), and the
processing is ended.
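
The zero cross detection flow above can be sketched in Python as follows. This is only an illustrative sketch, not the implementation of the apparatus: the class name ZeroCrossDetector and the initial value 0 for DIN_D are assumptions that do not appear in the description.

```python
class ZeroCrossDetector:
    """Per-sample zero cross detector following the flow described above."""

    def __init__(self) -> None:
        # DIN_D: input at the previous sampling timing (initial value 0 is assumed)
        self.din_d = 0.0

    def step(self, din: float) -> int:
        """Return 1 when a zero cross is detected for this sample, otherwise 0."""
        if din > 0 and self.din_d < 0:      # negative -> positive (S302 YES)
            z_det = 1
        elif din < 0 and self.din_d > 0:    # positive -> negative (S303 YES)
            z_det = 1
        elif din == 0:                      # sample lies exactly on the 0 level
            z_det = 1
        else:                               # same sign as before: no zero cross
            z_det = 0
        self.din_d = din                    # keep DIN for the next call (S307)
        return z_det


detector = ZeroCrossDetector()
flags = [detector.step(x) for x in [0.5, 0.2, -0.1, -0.4, 0.0, 0.3, 0.6]]
```

Note that, as in the description, a transition from exactly 0 to a non-zero value is not counted as a second zero cross, because neither sign condition holds when DIN_D is 0.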
[0016]
FIG. 4 is a flowchart showing the operation of the amplitude level determination unit 1509. First,
the absolute value of the audio sample output from the amplitude adjustment unit 1502 to the
audio output unit 1503 is set to DIN (S401). It is determined whether or not DIN is equal to or
higher than the amplitude level DLEVEL, which is the previous determination result (S402). If DIN
is equal to or higher than DLEVEL, DIN is substituted into DLEVEL (S404). If DIN is lower than
DLEVEL in S402, it is determined whether or not DIN is equal to or less than a value obtained by
subtracting K1 from the amplitude level DLEVEL which is the previous determination result
(S403). If the determination is NO, DIN is substituted for DLEVEL (S404). If the determination is
YES, a value obtained by subtracting K1 from DLEVEL is substituted for DLEVEL (S405). At this
time, DLEVEL is limited not to be smaller than DIN. Then, the DLEVEL thus obtained is output as
the current amplitude level 1508 (S406). Since the envelope value of the audio signal can be
obtained by doing as described above, this is used as the amplitude level. In S401, the input voice
sample may be processed as it is, but if the absolute value is taken, a large level can be reflected
even if the positive and negative are asymmetrical, so the performance of ALC improves.
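
As an illustration of the envelope determination above, the following Python sketch follows steps S401 to S406. The class name, the initial DLEVEL of 0, and the integer sample values are assumptions chosen for the example only.

```python
class AmplitudeLevelDetector:
    """Envelope follower: rises immediately, falls by at most K1 per sample."""

    def __init__(self, k1: float) -> None:
        self.k1 = k1        # K1: maximum decrease of DLEVEL per sample
        self.dlevel = 0     # DLEVEL: previous determination result (assumed to start at 0)

    def step(self, sample: float) -> float:
        din = abs(sample)                       # S401: absolute value of the sample
        if din >= self.dlevel:                  # S402 YES
            self.dlevel = din                   # S404: follow a rising input at once
        elif din <= self.dlevel - self.k1:      # S403 YES
            self.dlevel -= self.k1              # S405: decay by K1, never below DIN
        else:                                   # S403 NO
            self.dlevel = din                   # S404: input is within K1 of DLEVEL
        return self.dlevel                      # S406: current amplitude level


level = AmplitudeLevelDetector(k1=100)
envelope = [level.step(s) for s in [0, 1000, -200, 100, 850, 0, 700]]
```

In the S403 YES branch DIN is at most DLEVEL minus K1, so the decayed DLEVEL cannot drop below DIN and no extra clamp is needed in this sketch.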
[0017]
FIG. 16 is a flowchart showing the operation of the attack sound determination unit 1510. A
short, loud sound (attack sound) is identified so that, when a loud sound changes to a quiet one,
the rate at which the amplitude adjustment unit 1502 changes the gain can be varied according to
how long the loud sound lasted. First, the amplitude level
1508 determined by the amplitude level determination unit 1509 is set as DLEVEL (S1701). If
DLEVEL is larger than the threshold TH_MAX (S1702 YES), the fixed value K2 is added to
ATT_CNT (S1703). If DLEVEL is equal to or smaller than TH_MAX (S1702 NO), the fixed value
K3 is subtracted from ATT_CNT (S1705) unless ATT_CNT has already reached 0 (S1704 YES). When
the sound suddenly becomes loud, ATT_CNT keeps increasing by K2 until the ALC lowers the
amplitude level to TH_MAX or below (S1703). Thereafter, while the amplitude level is at or below
TH_MAX, ATT_CNT decreases by K3 (S1705). Finally, it is determined whether ATT_CNT is 0
(S1706). If ATT_CNT is 0, 0 is output to indicate that no attack sound is detected (S1707), and
if ATT_CNT is not 0, 1 is output to indicate that an attack sound is detected (S1708).
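
The counter-based attack sound determination can be sketched as below. The class name and the example values of TH_MAX, K2 and K3 are hypothetical, and clamping ATT_CNT at 0 when K3 does not divide it evenly is an assumption; the description only states that K3 is subtracted until ATT_CNT becomes 0.

```python
class AttackSoundDetector:
    """Counts up by K2 while the level exceeds TH_MAX, then down by K3 toward 0."""

    def __init__(self, th_max: float, k2: int, k3: int) -> None:
        self.th_max = th_max
        self.k2 = k2
        self.k3 = k3
        self.att_cnt = 0    # ATT_CNT

    def step(self, dlevel: float) -> int:
        if dlevel > self.th_max:                            # S1702 YES
            self.att_cnt += self.k2                         # S1703
        elif self.att_cnt > 0:                              # S1704: not yet 0
            # S1705; the clamp at 0 is an assumption of this sketch
            self.att_cnt = max(0, self.att_cnt - self.k3)
        return 0 if self.att_cnt == 0 else 1                # S1706 to S1708


att = AttackSoundDetector(th_max=800, k2=3, k3=2)
att_flags = [att.step(level) for level in [1000, 900, 700, 600, 500, 400]]
```

Because ATT_CNT grows for as long as the level stays above TH_MAX, a longer loud period keeps the attack flag set for longer after the level falls, and the ratio of K2 to K3 sets how long that is.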
[0018]
FIG. 17 is a flowchart showing the operation of the amplitude gain determination unit 1506. The
amplitude gain determination unit 1506 performs control so that the amplitude level 1508 is
between TH_MIN and TH_MAX (where TH_MIN <TH_MAX). The operation of increasing the gain
1507 when the amplitude level 1508 is smaller than TH_MIN is called recovery operation, and
the operation of decreasing the gain 1507 when the amplitude level 1508 is larger than
TH_MAX is called limit operation. In the flowchart of FIG. 17, the variable GAIN represents the
gain 1507 to be output. Further, S_CNT is a variable representing a counter of sample frequency
timing. It is zero at the start of the M_LIMIT mode or M_RECOV mode, and counts up every
sample frequency timing.
[0019]
The amplitude gain determination unit 1506 first inputs the zero cross detection result 1505, the
amplitude level 1508, and the attack sound determination result 1511 to the variables Z_DET,
DLEVEL, and ATT_DET, respectively (S1801). Thereafter, the mode (MODE) is determined and
processing is performed according to the determination. MODE has three modes: M_IDLE,
M_LIMIT, and M_RECOV. In recovery operation, MODE =
M_RECOV, and in limit operation, MODE = M_LIMIT. Further, when the amplitude level is in the
range of TH_MIN to TH_MAX, the gain is maintained as MODE = M_IDLE. M_LIMIT and
M_RECOV perform processing over a period of one sample to multiple samples.
[0020]
In the case of MODE = M_IDLE, if the current voice amplitude level DLEVEL > TH_MAX (S1803
YES), MODE is changed to M_LIMIT (S1804), and the process returns to S1802 again. On the
other hand, if DLEVEL < TH_MIN (S1805 YES), MODE is changed to M_RECOV, and the process
returns to S1802. When DLEVEL is in the range of TH_MIN to TH_MAX (S1803 NO and S1805
NO), the value of GAIN is output as gain 1507 as it is (S1807), and the processing is ended.
[0021]
When MODE = M_RECOV, recovery operation is performed, but when DLEVEL exceeds TH_MAX
(S1808 YES), MODE is changed to M_LIMIT and limit operation is performed (S1809). This is
because the audio signal may become too large and distorted if the limit operation is not
performed at all until the recovery operation is completed. When MODE is changed from
M_RECOV to M_LIMIT, S_CNT is reset to 0 (S1809).
[0022]
When MODE = M_LIMIT, C_MIN, C_MAX, and ADD_GAIN are set to L_C_MIN, L_C_MAX, and
L_ADD_GAIN, respectively (S1810).
[0023]
In the case of MODE = M_RECOV, if DLEVEL does not exceed TH_MAX (S1808 NO), the detection
result of the attack sound is determined (S1811).
Here, if the attack sound is not detected (ATT_DET = 0) (S1811 NO), C_MIN, C_MAX and
ADD_GAIN are respectively set to R_C_MIN, R_C_MAX and R_ADD_GAIN (S1812). On the other
hand, in the case of attack sound detection (ATT_DET = 1) (S1811 YES), C_MIN, C_MAX and
ADD_GAIN are set to ATT_C_MIN, ATT_C_MAX and ATT_ADD_GAIN, respectively (S1813). The
recovery operation at the time of the attack sound detection is called "fast recovery operation".
C_MIN is a parameter for setting the minimum sample period for changing the gain, and is
usually set under the condition of C_MIN <C_MAX, but C_MIN may have any value when the zero
cross detection result is not used.
[0024]
When S_CNT> C_MAX (S1814 YES), the value of GAIN is updated to a value obtained by adding
ADD_GAIN to the current value of GAIN (S1815). Similarly, when S_CNT> C_MIN and Z_DET = 1
(zero cross detection) (S1816 YES), the value of GAIN is updated to a value obtained by adding
ADD_GAIN to the current value of GAIN (S1815). Thereafter, S_CNT is reset to 0, MODE is set to
M_IDLE (S1817), GAIN is output (S1807), and the processing is ended. Otherwise, S_CNT is
incremented by 1 (S1818), GAIN is output while maintaining the MODE (S1807), and the
processing is ended.
[0025]
In the above processing, C_MIN corresponds to the time constant of GAIN change. As C_MIN
increases, it takes time for DLEVEL to fall within the range of TH_MIN to TH_MAX. This
corresponds to an increase in time constant of GAIN change. C_MAX functions as a limiter to
prevent the time constant of GAIN change from becoming too large in the case of low frequency
sound. The change of GAIN is performed by adding ADD_GAIN to GAIN (S1815). Accordingly,
when M_LIMIT (limit operation), ADD_GAIN is a negative value, and when M_RECOV (recovery
operation), ADD_GAIN is a positive value.
[0026]
The smaller each change in gain, the less the influence on the sound quality. Therefore, the
following setting is used here.
[0027]
R_ADD_GAIN = −L_ADD_GAIN = ATT_ADD_GAIN
Here, R_ADD_GAIN is a positive value. At the time of the limit operation, the level of the audio
signal is large enough to cause distortion, so it is
better to reduce the gain as fast as possible. On the other hand, at the time of recovery operation,
it is better to increase the gain as slowly as possible so as to make level fluctuation
inconspicuous. Therefore, R_C_MIN > L_C_MIN is set. Furthermore, in the case of
an attack sound, it is desirable to reduce the time constant of the recovery operation to make the
sound level immediately after the attack sound as appropriate as possible. Therefore,
R_C_MIN > ATT_C_MIN is set.
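
The gain determination flow of FIG. 17, together with the parameter relations above, might be sketched in Python as follows. This is a hedged illustration only: the class and helper names, the use of dB for GAIN, and every numeric value (thresholds, C_MIN, C_MAX, ADD_GAIN) are assumptions picked so that R_ADD_GAIN = −L_ADD_GAIN = ATT_ADD_GAIN, R_C_MIN > L_C_MIN and R_C_MIN > ATT_C_MIN hold.

```python
from dataclasses import dataclass

M_IDLE, M_LIMIT, M_RECOV = "M_IDLE", "M_LIMIT", "M_RECOV"


@dataclass(frozen=True)
class StepParams:
    c_min: int       # C_MIN: minimum sample count before a zero-cross-aligned step
    c_max: int       # C_MAX: sample count after which the gain steps regardless
    add_gain: float  # ADD_GAIN: signed gain step (assumed here to be in dB)


class AmplitudeGainDeterminer:
    def __init__(self, th_min, th_max, limit, recov, attack, gain=0.0):
        self.th_min, self.th_max = th_min, th_max
        self.limit, self.recov, self.attack = limit, recov, attack
        self.gain = gain        # GAIN
        self.mode = M_IDLE      # MODE
        self.s_cnt = 0          # S_CNT

    def step(self, z_det, dlevel, att_det):
        if self.mode == M_IDLE:
            if dlevel > self.th_max:                          # S1803 YES
                self.mode = M_LIMIT                           # S1804
            elif dlevel < self.th_min:                        # S1805 YES
                self.mode = M_RECOV
            else:
                return self.gain                              # S1807: in range
        if self.mode == M_RECOV and dlevel > self.th_max:     # S1808 YES
            self.mode, self.s_cnt = M_LIMIT, 0                # S1809
        if self.mode == M_LIMIT:
            p = self.limit                                    # S1810
        elif att_det:
            p = self.attack                                   # S1813: fast recovery
        else:
            p = self.recov                                    # S1812: normal recovery
        if self.s_cnt > p.c_max or (self.s_cnt > p.c_min and z_det == 1):
            self.gain += p.add_gain                           # S1815
            self.s_cnt, self.mode = 0, M_IDLE                 # S1817
        else:
            self.s_cnt += 1                                   # S1818
        return self.gain


def calls_until_first_step(att_det, dlevel=100):
    """Count calls until GAIN first changes, with a zero cross on every sample."""
    agc = AmplitudeGainDeterminer(
        th_min=200, th_max=800,
        limit=StepParams(c_min=1, c_max=16, add_gain=-1.0),
        recov=StepParams(c_min=4, c_max=16, add_gain=+1.0),
        attack=StepParams(c_min=2, c_max=16, add_gain=+1.0),
    )
    for n in range(1, 100):
        if agc.step(z_det=1, dlevel=dlevel, att_det=att_det) != 0.0:
            return n
    return None


normal_calls = calls_until_first_step(att_det=0)
fast_calls = calls_until_first_step(att_det=1)
limit_calls = calls_until_first_step(att_det=0, dlevel=1000)
```

With these made-up values the first gain step arrives after 3 calls in limit operation, 4 calls in fast recovery and 6 calls in normal recovery, which mirrors the ordering of the time constants described above.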
[0028]
In FIG. 18, (a) shows an ALC operation at the time of an attack sound, and (b) shows an ALC
operation when the sound is not an attack sound. The "input sound envelope" is an envelope
waveform of the audio signal input to the audio input unit 1501. (a) shows an attack sound in
which the amplitude level suddenly increases and then quickly decreases. On the other hand, (b)
shows a sound that is not an attack sound, in which the amplitude level suddenly increases but
decreases only after a while. The "output sound envelope" is an envelope waveform of the audio
signal of the audio output unit 1503 and is the output after ALC execution. "Gain" indicates the
change of the gain 1507 determined by the amplitude gain determination unit 1506.
"ATT_CNT" indicates the change in ATT_CNT calculated by the attack sound determination unit
1510 according to the flow of FIG. 16. As described above, the recovery operation when the attack
sound is detected is referred to as "fast recovery operation". Since an attack sound is detected when
ATT_CNT > 0 (S1708), in FIG. 18(a), the fast recovery operation is performed in the period of T3a
to T4a, and the normal recovery operation is performed in the period of T4a to T5a. In (b), since
ATT_CNT = 0 in the period from T3b to T4b, the normal recovery operation is performed and the
fast recovery operation is not performed.
[0029]
FIG. 19 is a diagram showing an ALC operation when a plurality of attack sounds are
continuously input. Since the fast recovery operation is performed in the fast recovery period
shown in FIG. 19, the time constant of the gain change when the input becomes small is small
and the recovery is quick. Therefore, since the recovery operation is completed before the next
attack sound comes, when the attack sound continues, the responses of the plurality of attack
sounds become the same as shown in the figure.
[0030]
However, when attack sounds are continuously input in a short cycle, raising the amplification
factor of the level more than usual in the recovery operation after one attack sound causes the
sound to be clipped and distorted at the rising portion of the next attack sound.
[0031]
So, below, the embodiment for solving such a problem is described.
[0032]
First Embodiment FIG. 1 is a diagram showing a configuration of an ALC unit of a voice
processing apparatus according to the present embodiment.
In FIG. 1, an audio input unit 101 inputs an audio signal from a microphone or an audio
reproduction device or the like.
The audio input unit 101 receives an audio signal from which a DC component has been
removed. Therefore, voice signals of positive and negative values are input around 0. As
illustrated, the ALC unit in the present embodiment includes a first ALC function unit 11 and a
second ALC function unit 12 provided in the latter stage. The audio output unit 105 outputs an
audio signal whose amplitude level is adjusted to be between TH_MIN and TH_MAX (where
TH_MIN <TH_MAX).
[0033]
The first ALC function unit 11 as a first level control unit includes a first amplitude adjustment
unit 102, a first amplitude gain determination unit 108, and a first amplitude level determination
unit 110. The second ALC function unit 12 as a second level control unit includes a second
amplitude adjustment unit 104, a second amplitude gain determination unit 112, and a second
amplitude level determination unit 114. The ALC unit in the present embodiment further
includes a zero cross detection unit 106. The provision of the zero cross detection unit 106 is
advantageous for improving the sound quality, but is not essential.
[0034]
First, the first ALC function unit 11 will be described. The first amplitude adjustment unit 102
amplifies or attenuates the audio signal from the audio input unit 101 according to the gain 111
determined by the first amplitude gain determination unit 108. The first amplitude level
determination unit 110 determines the amplitude level of the output signal 103 of the first
amplitude adjustment unit 102. The first amplitude gain determination unit 108 determines the
gain 111 to be provided to the first amplitude adjustment unit 102 according to the amplitude
level 109 determined by the first amplitude level determination unit 110 and the zero cross
detection result 107 from the zero cross detection unit 106.
[0035]
When the first amplitude gain determination unit 108 changes the gain 111, if it is changed at
timing when the absolute value of the amplitude level of the audio signal from the audio input
unit 101 is large, a step is formed in the audio waveform, and the sound quality is degraded.
Therefore, in the present embodiment, the zero cross detection unit 106 detects the point at
which the amplitude level of the audio signal from the audio input unit 101 crosses the 0 level
(hereinafter referred to as "zero cross"), and the first amplitude gain determination
unit 108 changes the gain 111 at that timing. This can reduce the deterioration of the sound
quality. This utilizes the fact that the absolute value of the amplitude level of the audio signal
tends to be smaller at the timing of the zero crossing. The zero cross detection result 107 is
provided to the first amplitude gain determination unit 108. The first amplitude gain
determination unit 108 changes the gain 111 based on the zero cross detection result 107. In
addition, the first amplitude gain determination unit 108 controls the gain 111 such that the
amplitude level 109 received from the first amplitude level determination unit 110 falls between
TH_MIN and TH_MAX (where TH_MIN <TH_MAX).
[0036]
Next, the second ALC function unit 12 will be described. The second amplitude adjustment unit
104 amplifies or attenuates the output signal 103 of the first amplitude adjustment unit 102
according to the gain 115 determined by the second amplitude gain determination unit 112. The
second amplitude level determination unit 114 determines the amplitude level of the output
signal of the second amplitude adjustment unit 104. The second amplitude gain determination
unit 112 determines the gain 115 to be provided to the second amplitude adjustment unit 104
based on the amplitude level 113 determined by the second amplitude level determination unit
114 and the zero cross detection result 107.
[0037]
Similar to the first ALC function unit 11, the second amplitude gain determination unit 112, for
example, changes the gain 115 at the timing of the zero crossing detected by the zero crossing
detection unit 106. Since the output signal 103 of the first amplitude adjustment unit 102 is
merely the audio signal input to the audio input unit 101 with its amplitude adjusted, the zero
cross timing of the output signal 103 is the same as that of the audio signal input to the audio
input unit 101. Therefore, the zero cross detection result 107 used in the first
ALC function unit 11 is utilized. That is, the zero cross detection result 107 is also transmitted to
the second amplitude gain determination unit 112. The second amplitude gain determination
unit 112 changes the gain 115 based on the zero cross detection result 107. In addition, the
second amplitude gain determination unit 112 controls the gain 115 such that the amplitude
level 113 received from the second amplitude level determination unit 114 is between TH_MIN
and TH_MAX (where TH_MIN <TH_MAX).
[0038]
The operation of each part will be described below using a flowchart. Although the processing of
the ALC unit in this embodiment can be realized by either digital signal processing or analog
signal processing, here, a case where an analog audio signal is converted into a digital signal at a
sampling frequency Fs will be described. Therefore, what is input to the audio input unit 101 is a
digitized audio signal, and the audio output unit 105 outputs a digital audio signal.
[0039]
FIG. 2 is a flowchart showing the operation of the ALC unit of FIG. 1. First, it is determined whether
the current time is a sampling timing (S201). If it is a sampling timing, the input of the audio
signal from the audio input unit 101 and the output of the audio signal to the audio output unit
105 are performed (S202). Next, the amplitude level is determined by the first
amplitude level determination unit 110 and the second amplitude level determination unit 114
(S203), and the zero cross detection is performed by the zero cross detection unit 106 (S204).
Next, the amplitude gain is determined by the first amplitude gain determination unit 108 and
the second amplitude gain determination unit 112 (S205). Thereafter, the first amplitude
adjustment unit 102 performs amplitude adjustment using the determined gain 111, and the
second amplitude adjustment unit 104 performs amplitude adjustment using the determined
gain 115 (S206). The process then waits until the next sampling timing comes.
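
The two-stage structure and the per-sample flow above can be illustrated with the following simplified Python sketch. It is not the flow of the embodiment itself: the class SimpleAlcStage merges level determination and gain determination, omits MODE and C_MAX, and assumes that a gain decided for one sample takes effect from the next sample. All names and numeric values are hypothetical. The only property taken from the description is that both stages share one zero cross result and that the first stage recovers more slowly (larger recovery C_MIN) than the second.

```python
def db_to_ratio(gain_db):
    """Convert a gain in dB to a linear amplitude ratio."""
    return 10.0 ** (gain_db / 20.0)


class SimpleAlcStage:
    """Simplified ALC stage: envelope follower plus zero-cross-aligned gain steps."""

    def __init__(self, recov_c_min, limit_c_min, th_min=200.0, th_max=800.0,
                 add_gain_db=1.0, k1=10.0):
        self.recov_c_min, self.limit_c_min = recov_c_min, limit_c_min
        self.th_min, self.th_max = th_min, th_max
        self.add_gain_db, self.k1 = add_gain_db, k1
        self.gain_db = 0.0   # current gain of this stage
        self.level = 0.0     # envelope of this stage's output
        self.cnt = 0         # samples spent outside the target range

    def apply(self, x):
        return x * db_to_ratio(self.gain_db)

    def update(self, y, z_det):
        # Envelope: rise at once, fall by at most k1 per sample.
        self.level = max(abs(y), self.level - self.k1)
        if self.level > self.th_max:          # limit operation
            c_min, step = self.limit_c_min, -self.add_gain_db
        elif self.level < self.th_min:        # recovery operation
            c_min, step = self.recov_c_min, +self.add_gain_db
        else:                                 # in range: hold the gain
            self.cnt = 0
            return
        if self.cnt > c_min and z_det:        # step only at a zero cross
            self.gain_db += step
            self.cnt = 0
        else:
            self.cnt += 1


def run_two_stage_alc(samples, stage1, stage2):
    out, prev = [], 0.0
    for x in samples:
        y1 = stage1.apply(x)                  # first amplitude adjustment
        y2 = stage2.apply(y1)                 # second amplitude adjustment
        out.append(y2)
        z_det = int(x == 0 or x * prev < 0)   # zero cross of the input, shared by both stages
        prev = x
        stage1.update(y1, z_det)
        stage2.update(y2, z_det)
    return out


stage1 = SimpleAlcStage(recov_c_min=8, limit_c_min=1)   # slow recovery
stage2 = SimpleAlcStage(recov_c_min=2, limit_c_min=1)   # fast recovery
quiet_out = run_two_stage_alc([50.0, -50.0] * 3, stage1, stage2)
```

For this short quiet input only the second stage has raised its gain by the end, which is the intended division of labor: the second stage reacts quickly, while the first stage changes its gain slowly.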
[0040]
The operation of the zero cross detection unit 106 is the same as the operation of the zero cross
detection unit 1504, and follows the flowchart of FIG. 3. The operations of the first
and second amplitude level determination units 110 and 114 are the same as the operation of
the amplitude level determination unit 1509, and follow the flowchart of FIG. 4.
[0041]
FIG. 5 is a flowchart showing the operation of the first and second amplitude adjustment units
102 and 104. Here, the operation of the first amplitude adjustment unit 102 will be described.
The operation of the second amplitude adjustment unit 104 is the same. First, the sample value
of the input audio signal is set as a variable DIN, and the gain 111 is input to the variable GAIN
(S501). Next, DIN * GAIN is calculated, and the result is output (S502). If GAIN is expressed in
decibels (logarithmic scale), it is first converted to a linear ratio (10 ^ (GAIN / 20)). Various
conversion methods exist, such as a combination of a table lookup and shift operations, and any
of them may be used.
[0042]
FIG. 6 is a flowchart showing the operation of the first amplitude gain determination unit 108.
The first amplitude gain determination unit 108 performs control so that the amplitude level 109
is between TH_MIN and TH_MAX (where TH_MIN <TH_MAX). As described above, the operation
of increasing the gain 111 when the amplitude level 109 is smaller than TH_MIN is called
recovery operation, and the operation of decreasing the gain 111 when the amplitude level 109
is larger than TH_MAX is called limit operation. In the flowchart of FIG. 6, the variable GAIN
represents the gain 111 to be output. Further, S_CNT is a variable representing a counter of
sample frequency timing. It is zero at the start of the M_LIMIT mode or M_RECOV mode, and
counts up every sample frequency timing.
[0043]
First, the first amplitude gain determination unit 108 inputs the zero cross detection result 107
and the amplitude level 109 into variables Z_DET and DLEVEL, respectively (S601). Thereafter,
the mode (MODE) is determined and processing is performed according to the determination.
MODE has three modes: M_IDLE, M_LIMIT, and M_RECOV. In
recovery operation, MODE = M_RECOV, and in limit operation, MODE = M_LIMIT. Further, when
the amplitude level is in the range of TH_MIN to TH_MAX, the gain is maintained as MODE =
M_IDLE. M_LIMIT and M_RECOV perform processing over a period of one sample to multiple
samples.
[0044]
In the case of MODE = M_IDLE, if the current amplitude level DLEVEL > TH_MAX (S603 YES),
MODE is changed to M_LIMIT (S604), and the process returns to S602 again. On the other hand,
if DLEVEL <TH_MIN (S605 YES), MODE is changed to M_RECOV, and the process returns to
S602 again. If DLEVEL is in the range of TH_MIN to TH_MAX (S603 NO and S605 NO), the value
of GAIN is output as it is as the gain 111 (S607), and the processing is ended.
[0045]
When MODE = M_RECOV, recovery operation is performed, but when DLEVEL exceeds TH_MAX
(S608 YES), MODE is changed to M_LIMIT and limit operation is performed (S609). This is
because the audio signal may become too large and distorted if the limit operation is not
performed at all until the recovery operation is completed. When MODE is changed from
M_RECOV to M_LIMIT, S_CNT is reset to 0 (S609).
[0046]
When MODE = M_LIMIT, C_MIN, C_MAX, and ADD_GAIN are set to L_C_MIN, L_C_MAX, and
L_ADD_GAIN, respectively (S610).
[0047]
If DLEVEL does not exceed TH_MAX in the case of MODE = M_RECOV (S608 NO), C_MIN,
C_MAX, and ADD_GAIN are set to R_C_MIN, R_C_MAX, and R_ADD_GAIN, respectively (S613).
[0048]
C_MIN is a parameter for setting the minimum sample period for changing the gain, and is
usually set under the condition of C_MIN <C_MAX, but C_MIN may have any value when the zero
cross detection result is not used.
[0049]
When S_CNT> C_MAX (S614 YES), the value of GAIN is updated to a value obtained by adding
ADD_GAIN to the current value of GAIN (S615).
Similarly, when S_CNT> C_MIN and Z_DET = 1 (zero cross detection) (YES in S616), the value of
GAIN is updated to a value obtained by adding ADD_GAIN to the current value of GAIN (S615).
Thereafter, S_CNT is reset to 0, and MODE is set to M_IDLE (S617), and the process is ended.
Otherwise, S_CNT is incremented by 1 (S618), GAIN is output while maintaining the MODE
(S607), and the process is ended.
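The flow of S601 to S618 described above can be sketched as a small per-sample state machine. This is only an illustrative sketch, not the claimed implementation: the class and field names (GainParams, GainDeterminer, step) are hypothetical, the gain is assumed to be an additive quantity, and GAIN is assumed to be output after S617 as well as after S618.

```python
from dataclasses import dataclass

M_IDLE, M_LIMIT, M_RECOV = "M_IDLE", "M_LIMIT", "M_RECOV"


@dataclass
class GainParams:
    """One parameter set (hypothetical container for C_MIN, C_MAX, ADD_GAIN)."""
    c_min: int
    c_max: int
    add_gain: float


class GainDeterminer:
    """Sketch of one amplitude gain determination unit (flow S601 to S618)."""

    def __init__(self, th_min, th_max, limit, recov, gain=0.0):
        self.th_min, self.th_max = th_min, th_max
        self.limit, self.recov = limit, recov  # L_* and R_* parameter sets
        self.gain = gain                       # GAIN
        self.mode = M_IDLE                     # MODE
        self.s_cnt = 0                         # S_CNT

    def step(self, z_det, dlevel):
        """Process one sample (Z_DET, DLEVEL) and return the current GAIN."""
        if self.mode == M_IDLE:
            if dlevel > self.th_max:           # S603 YES -> S604
                self.mode = M_LIMIT
            elif dlevel < self.th_min:         # S605 YES
                self.mode = M_RECOV
            else:                              # in range: hold the gain (S607)
                return self.gain
        if self.mode == M_RECOV and dlevel > self.th_max:
            self.mode = M_LIMIT                # S608 YES -> S609
            self.s_cnt = 0
        p = self.limit if self.mode == M_LIMIT else self.recov  # S610 / S613
        if self.s_cnt > p.c_max or (self.s_cnt > p.c_min and z_det == 1):
            self.gain += p.add_gain            # S614 / S616 -> S615
            self.s_cnt = 0                     # S617
            self.mode = M_IDLE
        else:
            self.s_cnt += 1                    # S618
        return self.gain
```

With a short limit period and a longer recovery period, the gain drops after a few over-level samples and recovers more slowly, which matches the roles of C_MIN and C_MAX explained in the following paragraph.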
[0050]
In the above processing, C_MIN corresponds to the time constant of GAIN change. As C_MIN
increases, it takes time for DLEVEL to fall within the range of TH_MIN to TH_MAX. This
corresponds to an increase in time constant of GAIN change. C_MAX functions as a limiter to
prevent the time constant of GAIN change from becoming too large in the case of low frequency
sound. The change of GAIN is performed by adding ADD_GAIN to GAIN (S615). Accordingly,
when M_LIMIT (limit operation), ADD_GAIN is a negative value, and when M_RECOV (recovery
operation), ADD_GAIN is a positive value.
[0051]
The above is the operation flow of the first amplitude gain determination unit 108. The operation
flow of the second amplitude gain determination unit 112 is also similar to that of the first
amplitude gain determination unit 108.
[0052]
However, R_C_MIN, R_C_MAX, R_ADD_GAIN, L_C_MIN, L_C_MAX, and L_ADD_GAIN are set to
different values between the first ALC function unit 11 and the second ALC function unit 12.
Here, values on the first ALC function unit 11 side are R_C_MIN1, R_C_MAX1, R_ADD_GAIN1,
L_C_MIN1, L_C_MAX1, and L_ADD_GAIN1. On the other hand, values on the second ALC
function unit 12 side are R_C_MIN2, R_C_MAX2, R_ADD_GAIN2, L_C_MIN2, L_C_MAX2, and
L_ADD_GAIN2. In this case, by setting, for example, the following settings, a good ALC operation
can be realized even when the attack sound is continuous.
[0053]
The smaller the change in gain, the smaller the influence on the sound quality. Therefore, in the
present embodiment, R_ADD_GAIN1 = −L_ADD_GAIN1 = R_ADD_GAIN2 = −L_ADD_GAIN2.
R_ADD_GAIN1 is a positive value.
[0054]
At the time of limit operation, it is preferable to reduce the gain as fast as possible since the level
of the audio signal becomes large and distorted. On the other hand, during recovery operation, it
is better to increase the gain as slowly as possible so that level fluctuations are not noticeable.
Therefore, R_C_MIN1 > L_C_MIN1 and R_C_MIN2 > L_C_MIN2.
[0055]
In addition, by making the time constant of gain increase during recovery operation of the first
ALC function unit 11 larger than the time constant of gain increase of the second ALC function
unit 12, good characteristics can be obtained when attack sound continues.
Therefore, the following relationship is set.
[0056]
R_C_MIN1 > R_C_MIN2
The change in gain at the time of limit operation varies depending on the
set value, but there is no problem if it has the following relationship
(L_C_MIN1 < L_C_MIN2 is also acceptable).
L_C_MIN1 ≥ L_C_MIN2
If zero cross detection is not performed, set R_C_MIN1 = R_C_MAX1, L_C_MIN1 = L_C_MAX1,
R_C_MIN2 = R_C_MAX2, and L_C_MIN2 = L_C_MAX2. With these settings, the unit operates
regardless of Z_DET.
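The relationships required of the two parameter sets in paragraphs [0053] to [0056] can be checked mechanically. The following is a minimal sketch under stated assumptions: the AlcParams container, the check_settings function, and all numeric values in the usage note are hypothetical, and the relationship between L_C_MIN1 and L_C_MIN2 is deliberately not checked because the description allows either ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlcParams:
    """Hypothetical container for one ALC function unit's settings."""
    r_c_min: int
    r_c_max: int
    r_add_gain: float
    l_c_min: int
    l_c_max: int
    l_add_gain: float


def check_settings(p1, p2):
    """Return a list of violated relationships (empty if all hold).

    p1 is the first ALC function unit, p2 the second.
    """
    problems = []
    # [0053]: equal step size everywhere, positive for recovery.
    same_step = (p1.r_add_gain == -p1.l_add_gain
                 == p2.r_add_gain == -p2.l_add_gain)
    if not (same_step and p1.r_add_gain > 0):
        problems.append("R_ADD_GAIN1 = -L_ADD_GAIN1 = R_ADD_GAIN2 = -L_ADD_GAIN2 > 0")
    # [0054]: recovery is slower than limiting within each unit.
    if not p1.r_c_min > p1.l_c_min:
        problems.append("R_C_MIN1 > L_C_MIN1")
    if not p2.r_c_min > p2.l_c_min:
        problems.append("R_C_MIN2 > L_C_MIN2")
    # [0055], [0056]: the first unit recovers more slowly than the second.
    if not p1.r_c_min > p2.r_c_min:
        problems.append("R_C_MIN1 > R_C_MIN2")
    return problems
```

For example, with illustrative values only, check_settings(AlcParams(400, 800, 0.5, 4, 8, -0.5), AlcParams(40, 80, 0.5, 2, 4, -0.5)) returns an empty list, while swapping the two arguments reports that R_C_MIN1 > R_C_MIN2 is violated.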
[0057]
FIG. 7 shows the operation of the present embodiment when L_C_MIN1 > L_C_MIN2 and a
plurality of attack sounds are continuously input. In FIG. 7, “input sound
envelope” is an envelope waveform of the sound signal input to the sound input unit 101, and a
plurality of attack sounds are continuously input. The “output sound envelope” is an envelope
waveform of the audio signal output to the audio output unit 105, and is an envelope waveform
of the audio signal after ALC execution of this embodiment. “Gain 1” indicates the change of
the gain 111 determined by the first amplitude gain determination unit 108, and “Gain 2”
indicates the change of the gain 115 determined by the second amplitude gain determination
unit 112. The “total gain” is the sum of gain 1 and gain 2 and corresponds to the gain of the
entire ALC unit.
[0058]
Since R_C_MIN1 > R_C_MIN2, gain 1 decreases when attack sounds continue, and the rate of
change of gain 2 on the second ALC function unit 12 side, which responds quickly, becomes
smaller in the portions where the amplitude of the input sound envelope increases rapidly.
For this reason, when the attack sound continues, distortion of the output sound at the onset
of each attack sound can be suppressed. In addition, when an attack sound occurs only once,
recovery is performed with the fast time constant of the second ALC function unit 12, so
characteristics equivalent to those of the conventional ALC can be obtained.
[0059]
Second Embodiment
FIG. 8 is a diagram showing a configuration of an ALC unit of a voice
processing apparatus according to a second embodiment. In FIG. 8, an audio input unit 801
inputs an audio signal from a microphone, an audio reproduction device, or the like. The audio
input unit 801 receives an audio signal from which a DC component has been removed.
Therefore, voice signals of positive and negative values are input around 0. The audio output unit
803 outputs an audio signal whose amplitude level is adjusted to be between TH_MIN and
TH_MAX (where TH_MIN < TH_MAX).
[0060]
The present embodiment provides an operation equivalent to that of the first embodiment. The
ALC unit in the present embodiment includes an amplitude level determination unit 804, an
amplitude adjustment unit 802, a zero cross detection unit 810, and first and second amplitude
gain determination units 812 and 814. The ALC unit in the present embodiment further includes
a first amplitude level prediction unit 806 as a first prediction unit, a second amplitude level
prediction unit 808 as a second prediction unit, and an amplitude gain calculation unit 816.
[0061]
FIG. 9 is a flowchart showing the operation of the ALC unit of FIG. 8. First, it is determined whether
the current time is a sampling timing (S901). If it is a sampling timing, the input of the audio
signal from the audio input unit 801 and the output of the audio signal to the audio output unit
803 are performed (S902). Next, the amplitude level determination unit 804 determines
the amplitude level (S903), and the zero cross detection unit 810 performs zero crossing
detection (S904). Next, after the amplitude level prediction is performed by the first amplitude
level prediction unit 806, the amplitude level prediction is further performed by the second
amplitude level prediction unit 808 (S905). Next, the amplitude gain is determined by the first
amplitude gain determination unit 812 as the first gain control unit and the second amplitude
gain determination unit 814 as the second gain control unit (S906). Thereafter, the amplitude
gain computing unit 816 adds the first gain 813 determined by the first amplitude gain
determining unit 812 and the second gain 815 determined by the second amplitude gain
determining unit 814 (S907). Then, the amplitude adjustment unit 802 performs the amplitude
adjustment using the total gain 817 which is the addition result (S908), and waits until the next
sample timing comes.
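The addition of the two gains in S907 can be related to the cascaded adjustment of the first embodiment with a short numerical sketch. It assumes that the gains are expressed in decibels, which the description does not state explicitly; the function names are hypothetical. Under that assumption, applying two gains one after the other gives the same output as applying their sum once.

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor (assumed unit)."""
    return 10.0 ** (gain_db / 20.0)


def adjust_cascaded(sample, gain1_db, gain2_db):
    """Two amplitude adjustments in series, as in the first embodiment."""
    after_first = sample * db_to_linear(gain1_db)
    return after_first * db_to_linear(gain2_db)


def adjust_summed(sample, gain1_db, gain2_db):
    """One amplitude adjustment with the total gain (S907, S908)."""
    total_gain_db = gain1_db + gain2_db       # S907: add the two gains
    return sample * db_to_linear(total_gain_db)  # S908: single adjustment
```

For any sample value, adjust_cascaded and adjust_summed agree up to floating-point rounding, which is why a single amplitude adjustment unit 802 can stand in for two cascaded adjustment units.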
[0062]
The operation of the zero cross detection unit 810 is the same as the operation of the zero cross
detection unit 1504, and operates according to the flowchart of FIG. Further, the operation of the
amplitude level determination unit 804 is the same as the operation of the amplitude level
determination unit 1509, and operates according to the flowchart of FIG. The operation of the
amplitude adjustment unit 802 is the same as the operation of the first and second amplitude
adjustment units 102 and 104 in the first embodiment, and operates according to the flowchart
of FIG. 5. The operations of the first and second amplitude gain determination units 812 and 814
are similar to the operations of the first and second amplitude gain determination units 108 and
112, and operate according to the flowchart of FIG. 6.
[0063]
R_C_MIN, R_C_MAX, R_ADD_GAIN, L_C_MIN, L_C_MAX, and L_ADD_GAIN are set to different
values between the first amplitude gain determination unit 812 and the second amplitude gain
determination unit 814. Here, values on the first amplitude gain determination unit 812 side are
R_C_MIN1, R_C_MAX1, R_ADD_GAIN1, L_C_MIN1, L_C_MAX1, and L_ADD_GAIN1. On the
other hand, values on the second amplitude gain determination unit 814 side are R_C_MIN2,
R_C_MAX2, R_ADD_GAIN2, L_C_MIN2, L_C_MAX2, and L_ADD_GAIN2. By making these values
the same as in the first embodiment, a good ALC operation can be realized even in the case
where the attack sound continues as in the first embodiment.
[0064]
Third Embodiment
In the second embodiment described above, it is desirable that the total gain
817 be constant in the periods 1 to 4 of FIG. 7. However, since gain 1 and gain 2 operate
independently, the total gain may fluctuate in the minimum unit of the gain variable width. The
present embodiment takes measures against it.
[0065]
FIG. 10 is a diagram showing the configuration of the ALC unit of the speech processing
apparatus in the third embodiment. The configuration of FIG. 10 is configured such that an
amplitude gain adjustment unit 850 that adjusts the gain determined by the first amplitude gain
determination unit 812 is added to the configuration of FIG. 8. The same components as those in
FIG. 8 are denoted by the same reference numerals, and the description thereof will be omitted.
However, the second amplitude gain determination unit 814 is configured to determine the gain
based on the gain adjusted by the amplitude gain adjustment unit 850.
[0066]
FIG. 11 is a flowchart showing the operation of the ALC unit of FIG. 10. The same steps as the steps
in the flowchart of FIG. 9 have the same reference characters, and the description thereof will be
omitted. FIG. 11 differs from FIG. 9 in that S1101 is executed instead of S906. In
S1101, first, the first amplitude gain determination unit 812 determines an amplitude gain. Next,
the amplitude gain adjusting unit 850 adjusts the determined amplitude gain by the operation
according to the flow of FIG. 12 described later. Thereafter, the second amplitude gain
determination unit 814 determines the amplitude gain using the adjustment result.
[0067]
FIG. 12 is a flowchart showing the operation of the amplitude gain adjustment unit 850. First, the
amplitude level predicted by the first amplitude level prediction unit 806 is input to the variable
DIN, and the gain 813 determined by the first amplitude gain determination unit 812 is input to
the variable GAIN (S1201). If the value of DIN is between TH_MIN and TH_MAX (where TH_MIN
< TH_MAX) (S1202 and S1203 are both NO), GAIN_D - GAIN is substituted for the variable
ADJ_GAIN (S1204). If not, 0 is substituted into the variable ADJ_GAIN (S1205). GAIN_D is a gain
813 input at the previous sample timing. GAIN is substituted into GAIN_D for processing of the
next sampling timing (S1206). Then, ADJ_GAIN is output to the second amplitude gain
determination unit 814 (S1207).
[0068]
FIG. 13 is a flowchart showing the operation of the second amplitude gain determination unit
814. The same steps as the steps in the flowchart of FIG. 6 have the same reference characters,
and the description thereof will be omitted. FIG. 13 differs from FIG. 6 in that S1301 is executed
instead of S601 and S1302 is executed instead of S607. In S1301, the zero cross detection result
in the zero cross detection unit 810 is input to the variable Z_DET, and the amplitude level
determined by the amplitude level determination unit 804 is input to the variable DLEVEL.
Further, the amplitude gain (adjustment gain) adjusted by the amplitude gain adjustment unit
850 is input to the variable ADJ_GAIN. In S1302, the value obtained by adding ADJ_GAIN to the
current GAIN is updated to a new GAIN, and the updated GAIN is output.
[0069]
According to the third embodiment, when the amplitude level predicted by the second amplitude
level prediction unit 808 is in the range of TH_MIN to TH_MAX and the gain determined by
the first amplitude gain determination unit 812 changes, the amount of change is adjusted by the
amplitude gain adjustment unit 850. As a result of this adjustment, the sum of gain 1 and gain 2
does not change. Thus, it is possible to suppress the fluctuation in the minimum unit of the gain
variable width in the periods 1 to 4 of FIG. 7.
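The compensation performed in S1201 to S1207 and S1302 can be illustrated with a short sketch. The GainAdjuster class is a hypothetical name, the boundary values TH_MIN and TH_MAX are assumed to count as in range, and the second gain determination unit is simplified to the S1302 update only (its own limit and recovery steps are left out), so this shows the intent rather than the full flow.

```python
class GainAdjuster:
    """Sketch of the amplitude gain adjustment unit 850 (S1201 to S1207)."""

    def __init__(self, th_min, th_max, initial_gain=0.0):
        self.th_min = th_min
        self.th_max = th_max
        self.gain_d = initial_gain  # GAIN_D: gain 813 at the previous sample

    def step(self, din, gain):
        """Return ADJ_GAIN for the predicted level DIN and the current gain 813."""
        if self.th_min <= din <= self.th_max:
            adj_gain = self.gain_d - gain  # S1204: cancel the change of gain 1
        else:
            adj_gain = 0.0                 # S1205: no compensation out of range
        self.gain_d = gain                 # S1206: remember for the next sample
        return adj_gain                    # S1207


# Gain 1 steps down while the predicted level stays in range.
adjuster = GainAdjuster(th_min=10.0, th_max=20.0)
gain2 = 0.0
totals = []
for din, gain1 in [(15.0, 0.0), (15.0, -1.0), (15.0, -2.0)]:
    gain2 += adjuster.step(din, gain1)     # S1302: GAIN <- GAIN + ADJ_GAIN
    totals.append(gain1 + gain2)           # total gain seen by unit 802
```

In this run every step of gain 1 is offset by an equal and opposite step of gain 2, so each entry of totals is the same value.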
[0070]
Other Embodiments
The present invention is also realized by executing the following processing.
That is, software (a program) for realizing the functions of the above-described embodiments is
supplied to a system or apparatus via a network or various storage media, and a computer (or
CPU, MPU, or the like) of the system or apparatus reads and executes the program. In
this case, the program and the storage medium storing the program constitute the present
invention.