JP2004343590

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2004343590
A stereo audio signal processing method, apparatus, program, and storage medium that divide into
frequency bands a two-channel stereo signal in which a plurality of acoustic signals are mixed, the
signals being generated from sounds arriving from a plurality of sound sources such as voices,
musical tones, and environmental sounds, and that emphasize a sound source signal localized at the
center using the power ratio of the band-divided left and right channel signals. A two-channel
audio signal processing method for stereo sound collection or stereo reproduction comprises:
dividing a stereo signal into a plurality of frequency components for each of the left and right
channels; calculating the power ratio of the left and right channels for each frequency
component; adding the left and right components for each frequency component according to the
power ratio and multiplying the addition result by a level adjustment coefficient; recombining
the frequency component signals for each of the left and right channels; and outputting the
recombined signal. [Selected figure] None
Stereo sound signal processing method, device, program and storage medium
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a stereo audio signal processing method, apparatus, program, and
storage medium. In particular, it relates to a stereo audio signal processing method, apparatus,
program, and storage medium that divide into frequency bands a two-channel stereo signal in which
a plurality of acoustic signals are mixed, the signals being generated from sounds arriving from a
plurality of sound sources such as voices, musical tones, and environmental sounds, and that
emphasize a sound source signal localized at the center using the power ratio of the band-divided
left and right channel signals.
[0002] Various prior art techniques have been developed that pick up sound with two microphones
and emphasize the acoustic signal of a sound source located at the center between the
microphones. The first prior art
10-05-2019
1
example simply adds the acoustic signals of the left and right channels. The sound of a source
at the center reaches the left and right microphones in substantially the same phase, so when the
collected left and right channel signals are added, the amplitude of that signal is doubled.
Other sounds, in particular reflections from walls and background noise, reach the left and right
microphones randomly, so the signals they produce in the two microphones are nearly uncorrelated.
These add on a power basis, and their amplitude becomes only about √2 times. As a result, the
signal of the source at the center is enhanced relative to the other sounds by about √2 times,
i.e., 3 dB. Because this first conventional example is linear processing, it has the advantage
that no distortion occurs in the acoustic signal, but with two microphones the enhancement is in
principle limited to 3 dB. In addition, there is the problem that the originally stereo signal
becomes monaural.
A second prior art example (see Patent Documents 1 and 2) can emphasize the center source more
strongly than the first prior art example while keeping the signal in stereo. The left and right
acoustic signals are divided into a plurality of frequency bands, and the level difference and
phase difference between the left and right signals are determined for each frequency band. Only
components with large level differences and phase differences are then multiplied by attenuation
coefficients, and the signal is resynthesized. As a result, source signals other than those
localized at the center, where the level difference and phase difference are small, are
suppressed according to the magnitude of the attenuation coefficient, and the source signal heard
at the center is relatively emphasized. Compared with the first conventional example, the second
conventional example has the advantages that the signal remains in stereo and that source signals
other than the one localized at the center can be suppressed more strongly. However, since the
processing is non-linear, there is the problem that the phase is disturbed, the sense of
localization is impaired, or distortion occurs.
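The 3 dB limit of the first conventional example can be checked numerically. The following Python
(NumPy) sketch is only an illustration under stated assumptions: the center source is modeled as
the same noise sequence in both channels, the reflections and background noise are modeled as
independent noise in each channel, and the helper `gain_db` is a hypothetical name introduced
here, not something taken from the patent.

```python
import numpy as np

def gain_db(before, after):
    """Power gain in dB going from signal `before` to signal `after`."""
    return 10.0 * np.log10(np.mean(after ** 2) / np.mean(before ** 2))

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 200_000

# Center source: identical (same phase, same level) in both channels.
center = rng.standard_normal(n)
# Reflections and background noise: assumed independent in each channel.
noise_left = rng.standard_normal(n)
noise_right = rng.standard_normal(n)

# Coherent sum: amplitude doubles, i.e. about 6 dB.
center_gain = gain_db(center, center + center)
# Power sum of uncorrelated signals: amplitude grows by about sqrt(2), i.e. about 3 dB.
noise_gain = gain_db(noise_left, noise_left + noise_right)
# Net emphasis of the center source over the uncorrelated sound.
enhancement = center_gain - noise_gain

print(f"center {center_gain:.2f} dB, noise {noise_gain:.2f} dB, net {enhancement:.2f} dB")
```

Under these assumptions the net emphasis comes out close to 3 dB, which is the ceiling the text
attributes to plain left-plus-right addition.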
Patent Document 1: Japanese Patent Application Laid-Open No. 2002-78100
Patent Document 2: Japanese Patent Application Laid-Open No. 2002-247699
SUMMARY OF THE INVENTION
The present invention provides a stereo audio signal processing method, apparatus, program, and
storage medium that solve the above problems by dividing into frequency bands a two-channel
stereo signal in which a plurality of acoustic signals are mixed, and emphasizing the sound
source signal localized at the center, while keeping the output a stereo signal, using the power
ratio of the band-divided left and right channel signals.
Means for Solving the Problems
Claim 1: A two-channel audio signal processing method for stereo sound collection or stereo
reproduction includes the steps of dividing a stereo signal into a plurality of frequency
components for each of the left and right channels; calculating the power ratio of the left and
right channels for each frequency component; adding the left and right components for each
frequency component according to the power ratio; and multiplying the addition result by a level
adjustment coefficient. A
stereo audio signal processing method is thus configured, which further comprises the steps of
recombining the frequency component signals and outputting the recombined signal. According to a
second aspect, in the step of adding the left and right signals for each frequency component in
the stereo audio signal processing method of the first aspect, addition is performed only for
frequency components whose power ratio lies within a predetermined range near 1, and is not
performed for other frequency components. According to a third aspect, the stereo audio signal
processing method of the first or second aspect further includes a step of calculating the phase
difference between the left and right channels for each frequency component, a step of
obtaining, for each frequency component, an attenuation coefficient for suppressing frequency
components having a large phase difference, and a step of multiplying each frequency component
by the attenuation coefficient.
According to a fourth aspect, a stereo audio signal processing apparatus comprises: a stereo
signal input unit 102; a left channel frequency band division unit 103 that divides the left
channel stereo signal into a plurality of frequency components; a right channel frequency band
division unit 104 that divides the right channel stereo signal into a plurality of frequency
components; a power ratio calculation unit 105 that calculates a power ratio q(k) for each
frequency band from the band-divided signal components of the left and right channels; a mixing
determination unit 106 that determines, for each frequency band, whether the power ratio q(k) is
a value near 1; an LR mixing unit 107 that, based on the determination result of the mixing
determination unit 106, adds for each frequency band the left and right signal components fL(k)
and fR(k) whose power ratio q(k) was determined to be near 1, and multiplies the addition result
by the level adjustment coefficient; and a left channel signal combining unit 108 and a right
channel signal combining unit 109 that receive the output signals of the LR mixing unit 107 and
recombine the left and right channel signals.
According to a fifth aspect, in the stereo audio signal processing apparatus of the fourth
aspect, a power ratio calculation unit 205 that calculates the phase difference between the left
and right channels for each frequency component in addition to the power ratio is provided
instead of the power ratio calculation unit 105, and the apparatus further comprises an
attenuation coefficient calculation unit 206 that calculates, for each frequency component, an
attenuation coefficient for suppressing frequency components with a large phase difference, and a
left channel attenuation coefficient multiplier 207 and a right channel attenuation coefficient
multiplier 208 that multiply each frequency component by the attenuation coefficient.
Claim 6: The stereo signal is divided into a plurality of frequency components for each of the
left and right channels, the power ratio of the left and right channels is calculated for each
frequency component, and the left and right
components are added for each frequency component according to the power ratio. The addition
result is then multiplied by the level adjustment coefficient, the frequency component signals
are recombined for each of the left and right channels, and the recombined signal is output; a
stereo audio signal processing program that instructs a computer to execute these steps is
configured. According to a seventh aspect, the stereo signal is divided into a plurality of
frequency components for each of the left and right channels, the power ratio of the left and
right channels is calculated for each frequency component, and the phase difference between the
left and right channels is calculated for each frequency component. The left and right
components are then added for each frequency component according to the power ratio, the
addition result is multiplied by the level adjustment coefficient, an attenuation coefficient for
suppressing frequency components with a large phase difference is obtained for each frequency
component and multiplied by each frequency component, the frequency component signals are
recombined for each of the left and right channels, and the recombined signal is output; a stereo
audio signal processing program that instructs a computer to execute these steps is configured.
Claim 8: A storage medium storing the stereo audio signal processing program according to claim 6
is configured. According to a ninth aspect, a storage medium storing the stereo audio signal
processing program according to the seventh aspect is configured.
BEST MODE FOR CARRYING OUT THE INVENTION
A stereo audio signal processing apparatus according to the present invention comprises: a stereo
signal input unit; frequency band division units that divide the two-channel signal into a
plurality of frequency components for each channel; a power ratio calculation unit that
calculates the power ratio between channels for each frequency component; a mixing determination
unit that checks the inter-channel power ratio against a predetermined range; an LR mixing unit
that adds the left and right components, frequency by frequency, only for components whose power
ratio is determined to be close to 1, and that, when allocating the added frequency component
signal to the left and right signals, multiplies it by a weighting factor that keeps the summed
power of the left and right signals constant; left and right channel signal combining units that
recombine the frequency component signals; and a stereo signal output unit that outputs the
recombined signal.
First, the input stereo signal is divided into a plurality of frequency components for each
channel. Then the power ratio of the signal components between channels is calculated for each
frequency component. To detect components whose left and right powers are close, a threshold with
a fixed ratio width is set around a power ratio of 1, and it is determined whether the calculated
power ratio falls within that width. The left and right signal components are added only when the
power ratio falls within the threshold. After the addition, when the result is allocated to the
left and right channels, each is multiplied by the weighting factor 1/√2, consistent with the
level adjustment coefficient β = 1/√2 used in the embodiments. Each frequency component is then
resynthesized for each channel, returned to a time waveform, and output.
A first embodiment of the present invention will be described with reference to FIG. 1. The
sound signal input to the stereo signal input unit 102 is a stereo signal collected so that the
target sound source signal to be emphasized is perceived as localized at the center. In other
words, the present invention is effective when the target sound source signal is picked up with
substantially the same phase and level in the left and right channels. The stereo signal input
to the stereo signal input unit 102 is processed for each of the left and right channels as
follows. The left channel signal sL is divided into frequency bands by the left channel frequency
band dividing unit 103 and converted into a frequency domain signal. Similarly, the right channel
signal sR is divided into frequency bands by the right channel frequency band dividing unit 104
and converted into a frequency domain signal. Here, the number of band divisions is N.
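The band division step can be sketched as follows. The text does not say how the bands are
obtained, so this example assumes a single windowed frame transformed with a real FFT, which is
only one possible realization (a filter bank would be another). The function name `split_bands`,
the frame length, the sample rate, and the test tone are all hypothetical choices made for
illustration.

```python
import numpy as np

def split_bands(frame):
    """Return complex components f(0)..f(N-1) of one frame in ascending frequency order.

    Assumption: band division is done with a Hann-windowed real FFT.
    """
    window = np.hanning(len(frame))
    return np.fft.rfft(frame * window)

fs = 8000                      # sample rate in Hz (illustrative)
t = np.arange(512) / fs        # one 512-sample frame

# A 1 kHz tone that is twice as strong in the left channel as in the right.
s_left = np.sin(2 * np.pi * 1000 * t)
s_right = 0.5 * np.sin(2 * np.pi * 1000 * t)

f_left = split_bands(s_left)    # plays the role of fL(k)
f_right = split_bands(s_right)  # plays the role of fR(k)

n_bands = len(f_left)                    # N = 512 // 2 + 1 = 257 for this frame length
peak = int(np.argmax(np.abs(f_left)))    # band index holding the 1 kHz tone
print(n_bands, peak, peak * fs / 512)
```

With this framing, the band index k maps to the frequency k * fs / 512, so the components are
already in ascending frequency order as the description requires.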
In the left channel, the band-divided signal components are denoted, in ascending order of
frequency, fL(0), fL(1), fL(2), ..., fL(k), ..., fL(N-1). In the right channel, the band-divided
signal components are likewise denoted, in ascending order of frequency, fR(0), fR(1), fR(2),
..., fR(k), ..., fR(N-1). These left channel signal components fL(0), ..., fL(N-1) and right
channel signal components fR(0), ..., fR(N-1) are supplied to the LR mixing unit 107 described
later. In the power ratio calculation unit 105, the power ratios q(0), q(1), q(2), ..., q(k),
..., q(N-1) are calculated from fL(k) and fR(k) as follows.

$$
q(k) =
\begin{cases}
|f_R(k)|^2 / |f_L(k)|^2, & |f_R(k)| \le |f_L(k)| \\
|f_L(k)|^2 / |f_R(k)|^2, & |f_R(k)| > |f_L(k)|
\end{cases}
\qquad (1)
$$

where $A^2$ represents the square of $A$.
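Equation (1) can be written compactly as the smaller of the two band powers divided by the
larger. The sketch below assumes the band components are held in NumPy complex arrays; the
function name `power_ratio`, the `eps` guard, and the choice to report a ratio of 1 for bands
that are silent in both channels are assumptions added here, since the text does not cover that
case.

```python
import numpy as np

def power_ratio(f_left, f_right, eps=1e-12):
    """Power ratio q(k) of equation (1): smaller band power over larger band power."""
    p_left = np.abs(f_left) ** 2
    p_right = np.abs(f_right) ** 2
    smaller = np.minimum(p_left, p_right)
    larger = np.maximum(p_left, p_right)
    # Assumption: a band that is silent in both channels gets q = 1 (no division by zero).
    return np.where(larger > eps, smaller / np.maximum(larger, eps), 1.0)

# Four illustrative bands: equal power, left stronger, both silent, right stronger.
f_left = np.array([1 + 0j, 2j, 0, 3])
f_right = np.array([1, 1, 0, 6], dtype=complex)
q = power_ratio(f_left, f_right)
print(q)
```

Because the larger power is always in the denominator, q(k) lies between 0 and 1 regardless of
which channel is stronger.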
Also, |A| denotes the magnitude of the complex number A. The power ratio is defined here with
squares for convenience, but the effect of the present invention is the same if it is defined as
a ratio of absolute values without squaring. By this definition, q(k) never exceeds 1.
Next, the operations of the mixing determination unit 106 and the LR mixing unit 107 will be
described. A sound source signal perceived as localized near the center has a small power
difference between left and right, so its power ratio is close to 1. The mixing determination
unit 106 therefore uses a threshold width determination variable α (a real number of 1 or more)
that sets the width of the threshold. If q(k) is in the range from 1/α to 1, the component is
determined to be a central localization sound source component. If q(k) is smaller than 1/α, the
component is determined not to be a central localization sound source component. This mixing
determination result is supplied to the LR mixing unit 107. The LR mixing unit 107 adds the left
and right components fL(k) and fR(k) that the mixing determination unit 106 determined to be a
central localization sound source component, as shown in equation (2), multiplies the sum by β,
and outputs the result. The left and right components fL(k) and fR(k) determined to be other than
a central localization sound source component are output from the LR mixing unit 107 as they
are, without being added, according to equation (3). Here, fL'(k) and fR'(k) denote the output
signals of the LR mixing unit 107 corresponding to fL(k) and fR(k), respectively.

When $1/\alpha \le q(k) \le 1$:
$$ f_L'(k) = f_R'(k) = \beta \, \bigl(f_L(k) + f_R(k)\bigr) \qquad (2) $$
When $1/\alpha > q(k)$:
$$ f_L'(k) = f_L(k), \quad f_R'(k) = f_R(k) \qquad (3) $$

fL'(k) and fR'(k) are recombined by the left channel signal combining unit 108 and the right
channel signal combining unit 109, respectively, and output from the stereo signal output unit
110. Here β is a level
adjustment coefficient that adjusts the signal level before and after addition. In general, even
if the left and right levels are equal, a component is not perceived as localized at the center
if its left and right phases differ greatly. When the condition of equation (2) is satisfied and
addition is performed, a component with a small left-right phase difference, that is, a component
of the sound source signal perceived as localized at the center, has its amplitude approximately
doubled by the synchronous addition effect. When the phase difference is large, the addition is a
power addition and the power is approximately doubled, so the amplitude becomes about √2 times.
As a result, only the sound source signal component perceived as localized at the center is
emphasized by √2 times (3 dB). By setting the level adjustment coefficient β to 1/√2, the power
of a large-phase-difference component to which equation (2) is applied can be made to
substantially match the power of a low-correlation component to which equation (3) is applied.
Thus only the sound source signal component perceived as localized at the center is emphasized
by about 3 dB while the stereo localization is maintained, which solves the problem of the first
conventional example that the signal becomes monaural.
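The mixing determination and the LR mixing of equations (2) and (3) can be sketched together for
one frame of band components. This is a minimal illustration, not the patented implementation:
the function name `lr_mix`, the default α = 2, and the treatment of bands that are silent in both
channels (ratio taken as 1) are assumptions introduced here. The default β = 1/√2 follows the
value discussed in the text.

```python
import numpy as np

def lr_mix(f_left, f_right, alpha=2.0, beta=1.0 / np.sqrt(2.0)):
    """Apply equations (2) and (3) band by band.

    Returns (fL', fR', is_center), where is_center marks bands with 1/alpha at most q(k).
    """
    p_left = np.abs(f_left) ** 2
    p_right = np.abs(f_right) ** 2
    larger = np.maximum(p_left, p_right)
    smaller = np.minimum(p_left, p_right)
    # Power ratio q(k) of equation (1); silent bands are given q = 1 (assumption).
    q = np.where(larger > 0.0, smaller / np.where(larger > 0.0, larger, 1.0), 1.0)

    is_center = q >= 1.0 / alpha            # mixing determination
    mixed = beta * (f_left + f_right)       # equation (2)
    out_left = np.where(is_center, mixed, f_left)    # equation (3) elsewhere
    out_right = np.where(is_center, mixed, f_right)
    return out_left, out_right, is_center

# Band 0: in phase, equal level. Band 1: strongly left. Band 2: equal level, 90 degrees apart.
f_left = np.array([1, 1, 1j], dtype=complex)
f_right = np.array([1, 0.1, 1], dtype=complex)
out_left, out_right, is_center = lr_mix(f_left, f_right)
print(is_center, np.abs(out_left), np.abs(out_right))
```

In this example the in-phase band comes out with amplitude √2 (about +3 dB), the equal-level band
with a 90 degree phase difference keeps amplitude 1, and the strongly left band passes through
unchanged, which matches the behavior the paragraph above describes for β = 1/√2.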
A second embodiment will be described with reference to FIG. 2. The similarity calculation unit
205 calculates the left-right phase difference in addition to the power ratio calculated in FIG.
1. The attenuation coefficient calculation unit 206 obtains attenuation coefficients g(0), g(1),
g(2), ..., g(N-1), which are multiplied by each frequency component. In the conventional example,
the left channel sound source signal synthesis unit 108 and the right channel sound source signal
synthesis unit 109 recombine the result as it is. In the second embodiment, the LR mixing unit
107 and the mixing determination unit 106 are inserted in the stage before them, at the position
shown in the figure. The left-right phase differences calculated by the similarity calculation
unit 205 are supplied to the attenuation coefficient calculation unit 206, which obtains the
attenuation coefficients g(0), g(1), g(2), ..., g(N-1). The attenuation coefficients calculated
in the attenuation coefficient calculation unit 206 are multiplied by each frequency component in
the left channel attenuation coefficient multiplier 207 and the right channel attenuation
coefficient multiplier 208, and the multiplication results are supplied to the LR mixing unit
107. As in the basic configuration of FIG. 1, the power ratio q(k) is taken from the similarity
calculation unit 205, and the mixing determination unit 106 evaluates the conditions of equations
(2) and (3). The mixing determination result of the mixing determination unit 106, indicating
whether or not each component is a central localization sound source component, is supplied to
the LR mixing unit 107. The left and right components fL(k) and fR(k) of the multiplication
results from the left channel attenuation coefficient multiplier 207 and the right channel
attenuation coefficient multiplier 208 that are determined to be central localization sound
source components are added and output in the LR mixing unit 107 as in equation (2). The left and
right components fL(k) and fR(k) determined to be other than a central localization sound source
component are output from the LR mixing unit 107 as they are, without being added, according to
equation (3). The outputs fL'(k) and fR'(k) after the mixing
processing in the LR mixing unit 107 are recombined by the left channel signal combining unit
108 and the right channel signal combining unit 109, and output from the stereo signal output
unit 110. With the conventional example alone, although the suppression of sources other than the
central localization sound source is large, the phase is disturbed by the non-linear processing,
and the localization of the central localization sound source is perceived as blurred.
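The signal flow of the second embodiment (phase-based attenuation followed by LR mixing) can be
sketched as below. The text does not give the rule that maps a phase difference to an
attenuation coefficient g(k), so the step function in `attenuation`, its `max_diff` and `floor`
parameters, and all function names are purely hypothetical placeholders chosen for illustration;
only the ordering of the stages follows the description.

```python
import numpy as np

def phase_difference(f_left, f_right):
    """Absolute left-right phase difference per band, in radians (0..pi)."""
    return np.abs(np.angle(f_left * np.conj(f_right)))

def attenuation(phase_diff, max_diff=np.pi / 4, floor=0.25):
    """Hypothetical g(k): pass bands with a small phase difference, attenuate the rest."""
    return np.where(max_diff >= phase_diff, 1.0, floor)

def second_embodiment(f_left, f_right, alpha=2.0, beta=1.0 / np.sqrt(2.0)):
    # Units 205/206: phase difference and attenuation coefficient per band.
    g = attenuation(phase_difference(f_left, f_right))
    # Multipliers 207/208: apply g(k) to both channels.
    a_left, a_right = g * f_left, g * f_right
    # Unit 106: mixing determination from the power ratio q(k).
    p_left, p_right = np.abs(f_left) ** 2, np.abs(f_right) ** 2
    larger = np.maximum(p_left, p_right)
    q = np.where(larger > 0.0,
                 np.minimum(p_left, p_right) / np.where(larger > 0.0, larger, 1.0), 1.0)
    is_center = q >= 1.0 / alpha
    # Unit 107: equation (2) for center bands, equation (3) otherwise.
    mixed = beta * (a_left + a_right)
    return np.where(is_center, mixed, a_left), np.where(is_center, mixed, a_right)

# Band 0: in phase, equal level. Band 1: equal level, 90 degrees apart. Band 2: strongly left.
f_left = np.array([1, 1j, 1], dtype=complex)
f_right = np.array([1, 1, 0.1], dtype=complex)
out_left, out_right = second_embodiment(f_left, f_right)
print(np.abs(out_left), np.abs(out_right))
```

Because the same g(k) is applied to both channels, the power ratio q(k) is unaffected by the
attenuation, so it can be computed from the unattenuated components as done here.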
However, in the second embodiment, the left-right difference of a sound source near the center
becomes smaller in both level difference and phase difference, so the source is perceived closer
to the center, and the effect of synchronous addition is also obtained. When the threshold width
determination variable α in equations (2) and (3) is increased, the localization narrows toward
the center and at the same time the synchronous addition effect approaches its theoretical
maximum of 3 dB. However, even a sound source localized to the left or right is then shifted
toward the center, and in the limit where α is made infinite, the same effect as in the first
conventional example is obtained and the signal becomes monaural. To emphasize the central
localization sound source while maintaining the stereo effect, the threshold width determination
variable α is set to an appropriate value according to the sound collection situation.
As described above, according to the present invention, when the emphasis of the sound source
signal localized at the center of a stereo sound signal is applied, for example, to playing back
a stereo music source according to the listener's preference or to hearing a target voice
emphasized under environmental noise, the problems of the conventional examples are mitigated
and better sound quality and sense of localization are maintained. As noted earlier, the
component of the sound source signal perceived as localized at the center is approximately
doubled in amplitude by the synchronous addition effect. When the phase difference is large, the
addition is a power addition and the power is approximately doubled. As a result, only the sound
source signal component perceived as localized at the center is enhanced by √2 times (3 dB).
Here, if the level adjustment coefficient β is set to 1/√2, the power of a component to which
equation (2) is applied and the power of a low-correlation component to which equation (3) is
applied can be made to almost coincide. Thus only the sound source signal component perceived as
localized at the center is emphasized by about 3 dB while the stereo localization is maintained,
which solves the problem of the first conventional example that the signal becomes monaural. In
the second embodiment shown in FIG. 2, an LR mixing unit 107 and a mixing determination unit 106
are additionally inserted in front of the left channel sound source signal synthesis unit 108
and the right channel sound source signal synthesis unit 109. In other words, the part excluding
the mixing determination unit 106 and the LR mixing unit 107 corresponds to the second
conventional example. In this second conventional example, although the suppression of sources
other than the central localization sound source is large, there is the problem that the phase
is disturbed by the non-linear processing and the localization of the central localization sound
source is perceived as blurred. However, in the second embodiment in which
the mixing determination unit 106 and the LR mixing unit 107 are added, the left-right
difference of a sound source near the center becomes smaller in both level difference and phase
difference, so the source is perceived closer to the center. Furthermore, the effect of
synchronous addition is also obtained.
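The trade-off governed by the threshold width determination variable α can be illustrated by
counting how many bands equations (2) and (3) mix as α grows. The sketch below uses random
complex band components as stand-in data, which is an assumption made only to have something to
measure; the function name `lr_mix` and the chosen α values are likewise illustrative.

```python
import numpy as np

def lr_mix(f_left, f_right, alpha, beta=1.0 / np.sqrt(2.0)):
    """Equations (2) and (3); returns (fL', fR', fraction of bands that were mixed)."""
    p_left, p_right = np.abs(f_left) ** 2, np.abs(f_right) ** 2
    larger = np.maximum(p_left, p_right)
    q = np.where(larger > 0.0,
                 np.minimum(p_left, p_right) / np.where(larger > 0.0, larger, 1.0), 1.0)
    is_center = q >= 1.0 / alpha        # 1/alpha becomes 0 when alpha is infinite
    mixed = beta * (f_left + f_right)
    out_left = np.where(is_center, mixed, f_left)
    out_right = np.where(is_center, mixed, f_right)
    return out_left, out_right, float(np.mean(is_center))

rng = np.random.default_rng(1)          # stand-in data, not a real recording
f_left = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
f_right = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)

alphas = (1.0, 2.0, 10.0, np.inf)
results = [lr_mix(f_left, f_right, a) for a in alphas]
fractions = [r[2] for r in results]
mono_at_infinity = np.array_equal(results[-1][0], results[-1][1])

for a, frac in zip(alphas, fractions):
    print(f"alpha={a}: {100 * frac:.1f}% of bands mixed")
print("left equals right when alpha is infinite:", mono_at_infinity)
```

The mixed fraction can only grow with α, and with α infinite every band satisfies the condition
of equation (2), so the left and right outputs become identical, which is the monaural limit the
text describes.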
Further, as the threshold width determination variable α in equations (2) and (3) is increased,
the localization narrows toward the center and at the same time the synchronous addition effect
approaches 3 dB, its theoretical maximum. However, even a sound source localized to the left or
right is then drawn toward the center, and in the limit where α is made infinite, the same
effect as in the first conventional example is obtained and the signal becomes monaural. By
setting the threshold width determination variable α to an appropriate value according to the
sound collection situation, it is possible to configure a stereo audio signal processing device
that emphasizes the central localization sound source while maintaining the stereo effect.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram showing a first embodiment of the present invention.
FIG. 2 is a block diagram showing a second embodiment of the present invention.
[Description of code]
102 Stereo signal input unit
103 Left channel frequency band division unit
104 Right channel frequency band division unit
105 Power ratio calculation unit
106 Mixing determination unit
107 LR mixing unit
108 Left channel sound source signal synthesis unit
109 Right channel sound source signal synthesis unit
110 Stereo signal output unit
111 Stereo loudspeaker
112 Stereo headphone
205 Similarity calculation unit
206 Attenuation coefficient calculation unit
207 Left channel attenuation coefficient multiplier
208 Right channel attenuation coefficient multiplier