JP2007318274

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2007318274
An object of the present invention is to suppress wraparound speech more effectively. A first
filter (203) corrects the frequency characteristic of an input sound collection beam signal so that
the wraparound speech has a uniform signal level in all frequency bands, and supplies the corrected
signal to an echo cancellation circuit (200). The echo cancellation circuit 200 generates a
pseudo-regression sound signal from the input sound signal, which is the sound emission signal, and
subtracts it from the sound collection beam signal corrected by the first filter 203. Since the
first filter 203 makes the frequency characteristic of the wraparound speech substantially uniform
over the entire band, the wraparound speech signal included in the corrected sound collection beam
signal and the pseudo-regression sound signal have substantially the same frequency spectrum.
Therefore, the echo cancellation circuit 200 effectively removes the wraparound speech. The second
filter 204 restores the frequency characteristic of the talker's utterance included in the sound
collection beam signal, which was changed together with that of the wraparound speech, to its
original state, and outputs the result as an output sound signal. [Selected figure] Figure 3
Sound emission and collection device
[0001]
The present invention relates to a sound emission and collection device used for an audio
conference or the like held between a plurality of points via a network or the like, and more
particularly to a sound emission and collection device having an echo cancellation function.
[0002]
04-05-2019
1
2. Description of the Related Art Conventionally, a widely used method of conducting an audio
conference between remote places is to install a sound emission and collection device at each
point where the conference is held and to connect these devices over a network to exchange audio
signals.
In many such sound emission and collection devices, a speaker for emitting the sound from the
other party's device and a microphone for collecting the sound at the own device are installed
together in one case.
[0003]
For example, the audio conference apparatus (sound emission and collection device) disclosed in
Patent Document 1 emits an audio signal input through a network from a speaker disposed on its
top surface, picks up sound with a plurality of microphones disposed on its side surfaces, each
facing a different direction, and transmits the picked-up signals to the outside through the
network.
Patent Document 1: JP-A-8-298696
[0004]
However, in the device of Patent Document 1, the microphones and the speaker are close to each
other, so the signals collected by the microphones often include wraparound sound from the
speaker. As a result, the S/N ratio of the collected signal with respect to the utterance of the
talker at the own device is reduced, and the talker's utterance cannot be picked up and output
clearly.
[0005]
Conventionally, echo cancellation processing exists as a method of removing such wraparound
speech. Echo cancellation processing uses an adaptive filter and a post processor, which is a
subtractor. The adaptive filter generates, from the input voice signal (that is, the sound
emission signal), a pseudo-regression sound signal that estimates the wraparound speech, and the
post processor subtracts this pseudo-regression sound signal from the collected signal, thereby
removing the wraparound speech.
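The adaptive filter and subtractor arrangement described above can be sketched in Python. This is a minimal illustration and not the circuit of the cited devices: the NLMS update rule, the step size `mu`, the tap count, and the four-tap wraparound path used in `simulate` are all assumptions chosen for the example.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of far_end from mic.

    Returns the residual (echo-cancelled) signal, one value per input sample.
    """
    w = [0.0] * taps              # adaptive filter coefficients
    x = [0.0] * taps              # most recent far-end samples, newest first
    residual = []
    for s, d in zip(far_end, mic):
        x = [s] + x[:-1]
        pseudo = sum(wi * xi for wi, xi in zip(w, x))   # pseudo-regression sound
        e = d - pseudo                                  # post processor (subtractor)
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]  # NLMS update (assumed)
        residual.append(e)
    return residual

def simulate(n=2000, seed=0):
    """Return (echo_power, residual_power) over the second half of a test run."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    path = [0.6, 0.3, -0.2, 0.1]  # hypothetical wraparound path, illustration only
    echo = [sum(path[k] * far_end[i - k] for k in range(len(path)) if i - k >= 0)
            for i in range(n)]
    residual = nlms_echo_cancel(far_end, echo)
    tail = slice(n // 2, n)
    return (sum(v * v for v in echo[tail]),
            sum(v * v for v in residual[tail]))
```

Calling `simulate()` compares the power of the wraparound sound with the power left after subtraction once the filter has had time to adapt.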
[0006]
However, the wraparound speech signal usually has frequency characteristics as shown in FIG. 6.
FIG. 6 is a diagram showing the frequency characteristics of a typical wraparound speech signal.
Although this frequency characteristic differs depending on the specifications of the apparatus,
a sound emission and collection device configured as in the embodiment described later has
approximately the frequency characteristic shown in FIG. 6. As shown in FIG. 6, the signal level
of the wraparound speech signal in the high band above 1000 Hz is low compared with the signal
level in the low band around 300 Hz. When echo cancellation processing is applied to a wraparound
speech signal with such characteristics using a number of operation bits chosen for the
low-frequency component, the high-frequency component, whose signal level is low, is represented
by fewer bits in actual operation. As the number of bits decreases, the calculation accuracy for
the high band becomes significantly lower than for the low band. For this reason, even if echo
cancellation of the low-frequency component can be performed with high accuracy, echo
cancellation of the high-frequency component cannot.
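The loss of accuracy for a low-level component in a fixed word length can be illustrated numerically. The sketch below is an assumption-laden example, not a measurement of any device: it rounds a sine wave to a signed 16-bit grid and compares the signal-to-quantization-noise ratio of a full-scale component with that of a component 40 dB lower (amplitude 0.01). The test frequency and sample count are arbitrary.

```python
import math

def quantized_snr_db(amplitude, bits=16, n=4096, cycles_per_sample=0.0123):
    """SNR in dB of a sine of the given amplitude (1.0 = full scale)
    after rounding to a signed fixed-point word of `bits` bits."""
    scale = 2 ** (bits - 1) - 1
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = amplitude * math.sin(2.0 * math.pi * cycles_per_sample * i)
        q = round(x * scale) / scale          # value actually stored in the word
        signal_power += x * x
        noise_power += (x - q) ** 2
    return 10.0 * math.log10(signal_power / noise_power)

low_band_snr = quantized_snr_db(1.0)     # strong (low-band-like) component
high_band_snr = quantized_snr_db(0.01)   # component 40 dB lower (high-band-like)
```

The weaker component occupies fewer of the 16 bits, so its ratio of signal to rounding noise is correspondingly lower.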
[0007]
On the other hand, if the number of operation bits is increased to improve the operation accuracy
for the high-frequency component, the processing load of the echo cancellation processing, whose
operation amount is already larger than that of other processing, increases further. A
higher-performance arithmetic processor (DSP) is then required, and the cost of the sound
emission and collection device itself increases.
[0008]
Therefore, an object of the present invention is to provide a sound emission and collection
device that eliminates the influence of wraparound speech not only in the low frequency band but
also in the high frequency band, without requiring a high-performance arithmetic processing unit
or a complicated configuration, and that can collect and output sound with a high S/N ratio.
[0009]
A sound emission and collection device according to the present invention comprises: sound
emission means for emitting an input sound signal; sound collection means for collecting sound
and generating a sound collection signal; a wraparound characteristic correction filter that
performs, on the sound collection signal, a first filtering process that makes the frequency
characteristic of the wraparound speech from the sound emission means to the sound collection
means substantially uniform; echo cancellation means that performs an echo cancellation process
on the first-filtered sound collection signal; and a characteristic readjustment filter that
performs, on the sound collection signal subjected to the echo cancellation process, a second
filtering process having the inverse characteristic of the first filtering process.
[0010]
In this configuration, a part of the sound emitted from the sound emitting means wraps around
and is collected by the sound collecting means.
As described above, the wraparound sound collected by the sound collection means has a high
signal level in the low range and a low signal level in the high range.
[0011]
The wraparound characteristic correction filter performs a first filtering process that raises
the signal level of the high-band component of the wraparound speech to the signal level of the
low-band component.
That is, the signal level of the high-frequency component is raised relative to the low-frequency
component, which serves as the reference, so that the frequency characteristic of the signal
level becomes substantially uniform. At this time, the frequency characteristic of the collected
signal that includes the wraparound speech changes in accordance with the first filtering process.
[0012]
The echo cancellation means performs echo cancellation processing on the sound collection signal
in which the wraparound speech has been corrected to a uniform frequency characteristic. Because
the correction has raised the high-frequency component to a level comparable to that of the
low-frequency component, the high-frequency component is expressed with a larger number of
effective bits than before the correction (first filtering process). Here, the number of
effective bits is the minimum number of bits necessary to represent the signal level at the
corresponding frequency: the lower the signal level, the smaller the number of effective bits,
and the higher the signal level, the larger the number of effective bits. As the number of
effective bits increases, the resolution of the signal level increases for the high-band
component as well as for the low-band component. Therefore, the echo cancellation processing is
performed with high accuracy in both the low band and the high band.
[0013]
The characteristic readjustment filter performs a second filtering process that has the inverse
characteristic of the first filtering process. That is, the sound collection signal, which was
corrected by the first filtering process so that the wraparound speech has a substantially
uniform frequency characteristic, is corrected in the reverse direction so that, as in the
frequency characteristic of the original wraparound speech, the low-band signal level is high and
the high-band signal level is low. As a result, the sound collection signal from which the
wraparound speech was removed in the corrected state is restored to its original frequency
characteristic, that is, the frequency characteristic at the time of sound collection.
[0014]
As a result, a sound collection signal is generated from which the wraparound speech has been
removed and whose frequency characteristic is unchanged from that at the time of collection.
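The pairing of a first filtering process with a second filtering process of inverse characteristic can be illustrated with a short Python sketch. It does not reproduce the filters of this device: as a stand-in it assumes a first-order pre-emphasis filter, which raises the high band relative to the low band, and its exact recursive inverse. The function names and the coefficient `a` are hypothetical.

```python
import math

def first_filter(x, a=0.9):
    """Stand-in first filtering process: y[n] = x[n] - a*x[n-1]."""
    y, prev = [], 0.0
    for v in x:
        y.append(v - a * prev)
        prev = v
    return y

def second_filter(y, a=0.9):
    """Stand-in second filtering process: z[n] = y[n] + a*z[n-1], the inverse."""
    z, prev = [], 0.0
    for v in y:
        prev = v + a * prev
        z.append(prev)
    return z

def first_filter_gain(freq, a=0.9):
    """Magnitude response of first_filter at freq (cycles per sample, 0 to 0.5)."""
    w = 2.0 * math.pi * freq
    return math.sqrt(1.0 + a * a - 2.0 * a * math.cos(w))
```

Applying `second_filter` to the output of `first_filter` returns the original samples, which mirrors how the talker's voice regains its original frequency characteristic after echo cancellation has been performed in the corrected domain.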
[0015]
Further, the sound collection means of the sound emission and collection device according to the
present invention comprises a microphone array formed by installing a plurality of microphones in
a predetermined array, and sound collection beam signal generation means that uses the audio
signals collected by the plurality of microphones to generate a plurality of sound collection
beam signals having strong directivity in mutually different directions and outputs a sound
collection beam signal as the sound collection signal.
Furthermore, the wraparound characteristic correction filter and the characteristic readjustment
filter of the sound emission and collection device according to the present invention select,
from a plurality of filtering characteristics set in advance for each sound collection beam
signal, the filtering characteristic corresponding to the selected sound collection directivity,
and perform the filtering processes with it.
[0016]
In this configuration, a plurality of sound collection beam signals are generated from the sound
collected by each microphone of the microphone array of the sound collection means. Because each
sound collection beam signal (sound collection signal) has a different sound collection
directivity, the frequency characteristic of the wraparound speech also differs between them. The
wraparound characteristic correction filter and the characteristic readjustment filter select the
filtering characteristic for each sound collection directivity, and thereby perform filtering
adapted to the selected sound collection directivity. As a result, the wraparound speech is
removed with higher accuracy in accordance with the sound collection directivity.
[0017]
The sound emission means of the sound emission and collection device of the present invention
comprises a speaker array formed by installing a plurality of speakers in a predetermined array,
and sound emission control means that generates sound emission signals for the plurality of
speakers such that a plurality of different sound emission characteristics are realized by the
sound emitted by the plurality of speakers. Further, the wraparound characteristic correction
filter and the characteristic readjustment filter of the sound emission and collection device of
the present invention select, from a plurality of filtering characteristics set in advance for
each sound emission characteristic, the filtering characteristic corresponding to the selected
sound emission characteristic, and perform the filtering processes with it.
[0018]
In this configuration, a plurality of sound emission characteristics are realized using the sounds
emitted from the speakers of the speaker array of the sound emission means. In each sound
emission characteristic, the frequency characteristic of the wraparound speech is also different.
The wraparound characteristic correction filter and the characteristic readjustment filter select
the filtering characteristic for each sound emission characteristic and perform the filtering
process, thereby performing the filtering process adapted to the selected sound emission
characteristic. As a result, in accordance with the sound emission characteristic, the wraparound
speech is removed with higher accuracy.
[0019]
In the sound emission and collection device of the present invention, the sound collection means
comprises a microphone array formed by installing a plurality of microphones in a predetermined
array, and sound collection beam signal generation means that uses the voice signals collected by
the plurality of microphones to generate a plurality of sound collection beam signals having
strong directivity in mutually different directions and outputs a sound collection beam signal as
the sound collection signal. The sound emission means comprises a speaker array formed by
installing a plurality of speakers in a predetermined array, and sound emission control means
that generates sound emission signals for the plurality of speakers such that a plurality of
different sound emission characteristics are realized by the sound emitted by the speakers.
Furthermore, the wraparound characteristic correction filter and the characteristic readjustment
filter of the sound emission and collection device select, from a plurality of filtering
characteristics set in advance for each combination of sound collection beam signal and sound
emission characteristic, the filtering characteristic corresponding to the selected combination,
and perform the filtering processes with it.
[0020]
In this configuration, a filtering characteristic is selected for each combination of sound
collection directivity and sound emission characteristic, so that filtering adapted to the
selected combination is performed. As a result, the wraparound speech is removed with higher
accuracy in accordance with the combination of the sound collection directivity and the sound
emission characteristic.
[0021]
Further, the wraparound characteristic correction filter and the characteristic readjustment
filter of the sound emission and collection device of the present invention are characterized in
that they perform the filtering processes using bit operations with double the precision of the
echo cancellation means.
[0022]
In this configuration, by performing bit operation with double precision at the time of filtering
processing, the signal level of the high frequency band is corrected and restored with higher
precision.
Then, while increasing the bit operation amount of the filtering process, the wraparound speech
is removed with higher accuracy without increasing the bit operation amount of the echo
cancellation process. At this time, since the echo cancellation processing has a higher load than
the filtering processing, highly accurate removal of the wraparound speech is performed without
significantly increasing the processing load.
[0023]
According to the present invention, regardless of the frequency characteristics of the wraparound
speech, the wraparound speech is reliably and effectively removed, and the voice from the
direction of a sound source such as a talker can be collected and output with a high S/N ratio.
[0024]
A sound emission and collection device according to an embodiment of the present invention will
be described with reference to the drawings.
FIG. 1A is a plan view showing the arrangement of the microphones and speakers of the sound
emission and collection device 1 according to this embodiment, and FIG. 1B is a diagram showing
the sound collection beam areas formed by the sound emission and collection device 1 shown in
FIG. 1A.
[0025]
FIG. 2 is a functional block diagram of the sound emission and collection device 1 of the present
embodiment.
[0026]
The sound emission and collection device 1 of this embodiment includes, in a housing 101, a
plurality of speakers SP1 to SP3, a plurality of microphones MIC11 to MIC17 and MIC21 to MIC27,
and the functional units shown in FIG. 2.
[0027]
The housing 101 has a substantially rectangular shape elongated in one direction. Legs (not
shown) are installed at both ends of the long sides (faces) of the housing 101 so that the lower
surface of the housing 101 is separated from the installation surface by a predetermined
distance.
In the following description, among the four side surfaces of the housing 101, the longer
surfaces are referred to as long surfaces, and the shorter surfaces are referred to as short
surfaces.
[0028]
On the lower surface of the housing 101, non-directional single speakers SP1 to SP3 having the
same shape are installed.
The single speakers SP1 to SP3 are installed in a straight line at regular intervals along the
longitudinal direction. The straight line connecting the centers of the single speakers SP1 to
SP3 runs along the long surfaces of the housing 101, and its horizontal position coincides with a
central axis 100 connecting the centers of the short surfaces. That is, the straight line
connecting the centers of the speakers SP1 to SP3 lies on a vertical reference plane that
includes the central axis 100. The speaker array SPA10 is configured by arranging the single
speakers SP1 to SP3 in this way. In this state, when sound is emitted from each single speaker
SP1 to SP3 of the speaker array SPA10, the emitted sound propagates equally to the two long
surfaces. At this time, the emitted sound propagating to the two opposing long surfaces travels
in mutually symmetrical directions orthogonal to the reference plane.
[0029]
Microphones MIC11 to MIC17 having the same specifications are installed on one long surface
of the housing 101. The microphones MIC11 to MIC17 are linearly arranged at regular intervals
along the longitudinal direction, and thereby the microphone array MA10 is configured. Also, on
the other long surface of the housing 101, microphones MIC21 to MIC27 of the same
specifications are installed. The microphones MIC21 to MIC27 are also linearly arranged at
regular intervals along the longitudinal direction, and the microphone array MA20 is thus
configured. The microphone array MA10 and the microphone array MA20 are arranged such that the
vertical positions of their arrangement axes coincide with each other. Furthermore, the
microphones MIC11 to MIC17 of the microphone array MA10 and the microphones MIC21 to MIC27 of the
microphone array MA20 are disposed at mutually symmetrical positions with respect to the
reference plane. Specifically, for example, the microphone MIC11 and the microphone MIC21 are in
a symmetrical relationship with respect to the reference plane, and similarly, the microphone
MIC17 and the microphone MIC27 are in a symmetrical relationship.
[0030]
In the present embodiment, the number of speakers in the speaker array SPA10 is three and the
number of microphones in each of the microphone arrays MA10 and MA20 is seven. However, the
number of speakers and the number of microphones are not limited to these and may be set as
appropriate. In addition, the speaker intervals of the speaker array and the microphone intervals
of the microphone arrays need not be constant. For example, the elements may be arranged densely
at the central portion along the longitudinal direction and more sparsely toward both ends.
[0031]
Next, as shown in FIG. 2, the sound emission and collection device 1 of this embodiment
functionally includes a control unit 10, an input/output connector 11, an input/output I/F 12, a
sound emission directivity control unit 13, D/A converters 14, sound emission amplifiers 15, the
above-mentioned speaker array SPA10 (speakers SP1 to SP3), the above-mentioned microphone arrays
MA10 and MA20 (microphones MIC11 to MIC17 and MIC21 to MIC27), sound pickup amplifiers 16, A/D
converters 17, sound collection beam generation units 181 and 182, a sound collection beam
selection unit 19, and an echo cancellation unit 20.
[0032]
The control unit 10 performs operation / process control of the entire apparatus including power
supply control and the like, and gives control instructions such as arithmetic processing to each
part of the apparatus according to an operation input command from an operation unit (not
shown).
[0033]
The input/output I/F 12 converts an input audio signal from another sound emission and
collection device, received through the input/output connector 11, from the data format
(protocol) corresponding to the network, and supplies it to the sound emission directivity
control unit 13 via the echo cancellation unit 20.
The input/output I/F 12 also converts the output voice signal generated by the echo cancellation
unit 20 into the data format (protocol) corresponding to the network and transmits it to the
network through the input/output connector 11.
[0034]
If no sound emission directivity is set, the sound emission directivity control unit 13
simultaneously supplies a sound emission signal based on the input sound signal to the speakers
SP1 to SP3 of the speaker array SPA10.
When a sound emission directivity, such as the setting of a virtual point sound source, is
designated by the control unit 10, the sound emission directivity control unit 13 generates
individual sound emission signals by applying to the input voice signal delay processing,
amplitude processing, and the like specific to each of the speakers SP1 to SP3 of the speaker
array SPA10, based on the designated sound emission directivity. The sound emission directivity
control unit 13 outputs these individual sound emission signals to the D/A converters 14
installed for each of the speakers SP1 to SP3.
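The per-speaker delay and amplitude processing for a virtual point sound source can be sketched as follows. This is an illustrative model only: the speaker coordinates, the speed of sound, the sample rate, and the 1/r amplitude rule are assumptions, and `virtual_source_settings` and `apply_settings` are hypothetical helper names.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def virtual_source_settings(speaker_x, source_xy, sample_rate=16000):
    """Per-speaker (delay_samples, gain) emulating a point source at source_xy.

    speaker_x: x positions in metres of speakers lying on the line y = 0.
    """
    sx, sy = source_xy
    dist = [math.hypot(x - sx, sy) for x in speaker_x]
    nearest = min(dist)
    settings = []
    for d in dist:
        # Extra travel time relative to the nearest speaker, in whole samples.
        delay = round((d - nearest) / SPEED_OF_SOUND * sample_rate)
        gain = nearest / d            # 1/r amplitude decay, normalized (assumed)
        settings.append((delay, gain))
    return settings

def apply_settings(signal, settings):
    """Return one delayed, scaled copy of signal per speaker."""
    return [[0.0] * delay + [gain * v for v in signal]
            for delay, gain in settings]
```

For three speakers at -0.5 m, 0 m, and 0.5 m and a virtual source 0.2 m behind the centre speaker, the centre speaker plays first at full level and the outer speakers play later and more quietly.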
[0035]
Each D/A converter 14 converts an individual sound emission signal into an analog format and
outputs it to the corresponding sound emission amplifier 15, and each sound emission amplifier 15
amplifies the individual sound emission signal and gives it to the speakers SP1 to SP3.
[0036]
The speakers SP1 to SP3 convert the supplied sound emission signals or individual sound emission
signals into sound and emit it to the outside.
Since the speakers SP1 to SP3 are installed on the lower surface of the housing 101, the emitted
sound is reflected by the installation surface of the desk on which the sound emission and
collection device 1 is placed, and propagates obliquely upward from the sides of the device
toward the conference participants. In addition, part of the emitted sound wraps around from the
bottom of the sound emission and collection device 1 to the side surfaces where the microphone
arrays MA10 and MA20 are installed.
[0037]
The microphones MIC11 to MIC17 and MIC21 to MIC27 of the microphone arrays MA10 and MA20 may be
omnidirectional or directional, but are preferably directional. They pick up sound from outside
the sound emission and collection device 1, convert it into electrical signals, and output the
picked-up sound signals to the respective sound pickup amplifiers 16.
[0038]
Also, the microphones MIC11 to MIC17 and MIC21 to MIC27 of the microphone arrays MA10
and MA20 pick up the wraparound sound of the emitted sound of the speakers SP1 to SP3.
[0039]
Each sound pickup amplifier 16 amplifies the picked-up sound signal and supplies it to the A/D
converter 17, and the A/D converter 17 converts the picked-up sound signal into a digital signal
and outputs it to the sound collection beam generation units 181 and 182.
The picked-up sound signals from the microphones MIC11 to MIC17 of the microphone array MA10
installed on one long surface are input to the sound collection beam generation unit 181, and
those from the microphones MIC21 to MIC27 of the microphone array MA20 installed on the other
long surface are input to the sound collection beam generation unit 182.
[0040]
The sound collection beam generation unit 181 performs predetermined delay processing and the
like on the picked-up sound signals of the microphones MIC11 to MIC17 to generate sound
collection beam signals MB11 to MB14.
As shown in FIG. 1B, the sound collection beam areas of the sound collection beam signals MB11 to
MB14 are set to mutually different areas of predetermined width arranged along the long surface,
on the side of the long surface on which the microphones MIC11 to MIC17 are installed.
[0041]
The sound collection beam generation unit 182 performs predetermined delay processing and the
like on the picked-up sound signals of the microphones MIC21 to MIC27 to generate sound
collection beam signals MB21 to MB24. As shown in FIG. 1B, the sound collection beam areas of the
sound collection beam signals MB21 to MB24 are set to mutually different areas of predetermined
width arranged along the long surface, on the side of the long surface on which the microphones
MIC21 to MIC27 are installed.
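Generating a sound collection beam by delay processing can be illustrated with a delay-and-sum sketch in Python. The text only states that "predetermined delay processing and the like" is applied, so the plain delay-and-sum form, the integer sample delays, and the function name `delay_and_sum` are assumptions for the example.

```python
def delay_and_sum(channels, delays):
    """Average the microphone channels after delaying channel m by delays[m] samples.

    channels: list of equal-length sample lists, one per microphone.
    delays:   non-negative integer steering delay per microphone.
    """
    n = len(channels[0])
    out = [0.0] * n
    for ch, d in zip(channels, delays):
        for i in range(d, n):
            out[i] += ch[i - d]       # shift the channel later by d samples
    return [v / len(channels) for v in out]
```

When the steering delays match the arrival-time differences of a sound from the target area, the channels add coherently; for other directions the contributions stay misaligned and are attenuated by the averaging.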
[0042]
The sound collection beam selection unit 19 performs full-wave rectification, band-pass filtering
(BPF) over the frequency band of a talker's voice, and peak detection on the sound collection
beam signals MB11 to MB14 and MB21 to MB24, selects the sound collection beam signal that mainly
contains the talker's voice, and outputs it to the echo cancellation unit 20 as the sound
collection beam signal MB (corresponding to the "sound collection signal" of the present
invention).
[0043]
Also, the sound collection beam selection unit 19 provides the sound collection directivity
information corresponding to the selected sound collection beam signal MB to the control unit
10.
[0044]
The control unit 10 combines the sound emission directivity information and the sound
collection directivity information so as to be synchronized according to the delay of the
wraparound sound, and supplies the echo cancellation unit 20 with the information.
[0045]
FIG. 3 is a block diagram showing the main configuration of the echo cancellation unit 20 of the
present embodiment.
The echo cancellation unit 20 includes, in order from the input side of the sound collection beam
signal, the first filter 203, the echo cancellation circuit 200, the second filter 204, and the filter
characteristic storage unit 205.
The echo cancellation circuit 200 includes an adaptive filter 201 and a post processor 202. The
adaptive filter 201, the first filter 203, and the second filter 204 are configured by digital filters
such as FIR filters, and each filter characteristic is set by each filter coefficient of the digital filter.
[0046]
The filter characteristic storage unit 205 stores, for the first filter 203 and the second filter
204, filter characteristics individually set for each combination of sound emission directivity
information and sound collection directivity information. Because these filter characteristics
depend on the structure of the speaker array SPA10 and the microphone arrays MA10 and MA20, and
on the positional relationship and installation conditions of the speakers SP and the microphones
MIC, they may be set in advance accordingly. The settings can be obtained, for example, by
emitting sound with one sound emission directivity, collecting it, and analyzing it for each
sound collection directivity, and by repeating this process for all sound emission directivities.
[0047]
FIG. 4 is a set of graphs showing how the frequency characteristic of the wraparound speech
differs depending on the combination of sound emission directivity and sound collection
directivity. Graphs (A), (B), and (C) show the characteristics for the individual sound emission
directivities Dv1 to Dv3, and the characteristic curves in each graph correspond to the different
sound collection directivities Ds11 to Ds14 (sound collection beam signals MB11 to MB14).
[0048]
As shown in FIG. 4, for every combination of sound emission directivity and sound collection
directivity, the signal level differs in strength between frequency components.
The filter characteristic of the first filter 203 suppresses this difference in signal level
between frequency components and is set so that the signal levels become substantially the same
in all frequency bands. That is, since the low-band signal level is high and the high-band signal
level is low, the high-band signal level is raised to the low-band signal level. More
specifically, the entire frequency band is divided into partial frequency regions, each having a
predetermined frequency width, and the highest signal level in the low band is taken as the
reference. Then, the level shift amount for each partial frequency region is set so that its
signal level substantially matches the reference signal level. A number of bits larger than the
number of operation bits used in the echo cancellation circuit 200 is used to set and store the
level shift amounts.
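The setting of level shift amounts per partial frequency region can be sketched in Python. The band levels below are made-up illustration values rather than data from FIG. 4, and the parameter `low_band_count`, which marks how many of the leading regions count as the low band, is an assumption.

```python
def level_shifts_db(band_levels_db, low_band_count):
    """Gain in dB for each partial frequency region that lifts it to the reference.

    The reference is the highest level among the first low_band_count regions.
    """
    reference = max(band_levels_db[:low_band_count])
    return [reference - level for level in band_levels_db]

def db_to_linear(gain_db):
    """Convert a level shift in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)

# Hypothetical wraparound levels for four regions, low band first.
example_levels = [-12.0, -10.0, -25.0, -40.0]
example_shifts = level_shifts_db(example_levels, low_band_count=2)
```

With these example levels the regions are shifted by 2 dB, 0 dB, 15 dB, and 30 dB, so every region ends up at the -10 dB reference. The second filter would apply the negated shifts.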
[0049]
Then, as shown in FIGS. 4A to 4C, since the frequency characteristic differs for each
combination of sound emission directivity and sound collection directivity, the filter
characteristic based on such level shift amounts is set for each combination of sound emission
directivity and sound collection directivity.
[0050]
On the other hand, the filter characteristic of the second filter 204 is set to cancel the filter
characteristic of the first filter 203.
That is, the filter characteristic of the second filter 204 is set so that the frequency characteristic
of the signal level corrected by the first filter 203 is returned to the original frequency
characteristic.
[0051]
FIG. 5 is a conceptual diagram showing the storage content of the filter characteristic storage
unit 205. Although this description shows a case with three types of sound emission directivity
and eight types of sound collection directivity, the numbers of sound emission directivities and
sound collection directivities are not limited to these and may be set as appropriate. The filter
characteristic storage unit 205 stores filter characteristics for each combination of sound
emission directivity and sound collection directivity.
[0052]
As shown in FIG. 5, in the filter characteristic storage unit 205, filter characteristics are stored
for each combination of sound emission directivity Dv1, Dv2, Dv3 and sound collection
directivity Ds11 to Ds14, Ds21 to Ds24. For example, if it is a combination of sound emission
directivity Dv1 and sound collection directivity Ds11, a first filter characteristic Fc111 and a
second filter characteristic Fr111 are stored. Similarly, in the case of the combination of the
sound emission directivity Dv3 and the sound collection directivity Ds24, the first filter
characteristic Fc324 and the second filter characteristic Fr324 are stored. That is, in the example
shown in FIG. 5, 3 × 8 = 24 first filter characteristics Fc and second filter characteristics Fr are
stored.
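The storage layout described for FIG. 5 can be modelled as a lookup table keyed by the combination of directivities. In this Python sketch the labels Dv1 to Dv3, Ds11 to Ds24, Fc111, and Fr324 follow the text, while the dictionary structure, the use of plain name strings in place of real filter coefficients, and the function names are assumptions.

```python
EMISSION = ["Dv1", "Dv2", "Dv3"]
COLLECTION = ["Ds11", "Ds12", "Ds13", "Ds14", "Ds21", "Ds22", "Ds23", "Ds24"]

def build_filter_table():
    """Map each (emission, collection) pair to its (first, second) filter names."""
    table = {}
    for dv in EMISSION:
        for ds in COLLECTION:
            suffix = dv[2:] + ds[2:]          # "Dv1" with "Ds11" gives "111"
            table[(dv, ds)] = ("Fc" + suffix, "Fr" + suffix)
    return table

def select_filters(table, emission, collection):
    """Look up the filter characteristics for one combination."""
    return table[(emission, collection)]
```

In a real device each entry would hold the coefficient sets for the first filter 203 and the second filter 204 rather than a name.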
[0053]
When the echo cancellation unit 20 receives the combination of sound emission directivity
information and sound collection directivity information from the control unit 10, it reads out
the corresponding first filter characteristic Fc and second filter characteristic Fr from the
filter characteristic storage unit 205. The echo cancellation unit 20 sets each filter
coefficient of the first filter 203 based on the read first filter characteristic Fc, and sets
each filter coefficient of the second filter 204 based on the read second filter characteristic
Fr.
[0054]
The first filter 203 performs filter processing based on the first filter characteristic Fc on the
input sound collection beam signal MB, and outputs the result to the post processor 202 of the
echo cancellation circuit 200. That is, the signal level of the high frequency range of the sound
collection beam signal MB is raised so that the frequency characteristic due to the wraparound
becomes substantially uniform from the low range to the high range. As a result, the change in
the frequency spectrum caused by the wraparound is canceled, and, although the signal level
differs, the result is equivalent to collecting wraparound speech with the same frequency
spectrum as the input speech signal input to the sound emission directivity control unit 13. In
addition, since the low and high ranges are equalized at a high signal level, the effective number
of bits representing the signal level at each frequency is large, and the resolution is high over
the entire frequency band.
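The correction performed by the first filter can be illustrated with per-band levels. The following is a minimal sketch, assuming the wraparound level is known for four bands in dB and that every band is raised to the level of the strongest band. The band values and function names are made up for illustration and do not come from the embodiment.

```python
def first_filter_gains_db(wraparound_levels_db):
    """Gain per band (dB) that lifts every band to the highest band level."""
    target = max(wraparound_levels_db)
    return [target - level for level in wraparound_levels_db]


def apply_gains_db(levels_db, gains_db):
    """Apply per-band gains to per-band levels (both in dB)."""
    return [level + gain for level, gain in zip(levels_db, gains_db)]


# Assumed wraparound levels, low band first: the level falls toward high bands.
levels = [-10.0, -14.0, -22.0, -30.0]
gains = first_filter_gains_db(levels)
corrected = apply_gains_db(levels, gains)  # every band ends at the same level
```

In this sketch the low band needs no gain and the highest band needs the most, which corresponds to raising the high range until the wraparound characteristic is flat.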
[0055]
This level shift operation uses a larger number of bits than the number of operation bits used in
the echo cancellation circuit 200. For example, if the echo cancellation circuit 200 operates with
16 bits, the level shift operation is performed with 32 bits or the like. Floating point
arithmetic may also be used for the level shift. As a result, more bits can be allocated to the
high frequency band, which originally has a low signal level, so that the signal level of the
corrected sound collection beam signal MB can be calculated with higher accuracy. That is, the
quantization error at the time of signal correction can be suppressed.
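The effect of the wider word length can be seen with a single weak sample. The following is a minimal sketch, assuming a gain held in a fixed-point format with 8 fractional bits. The sample value, the gain, and the function names are illustrative assumptions, not values from the embodiment.

```python
def level_shift_16bit(sample, gain_q8):
    """Boost, then drop the fractional bits and clamp to the 16-bit range."""
    shifted = (sample * gain_q8) >> 8
    return max(-32768, min(32767, shifted))


def level_shift_wide(sample, gain_q8):
    """Boost and keep the 8 fractional bits in a wider word (for example 32 bits).

    The returned integer is the boosted value scaled by 256.
    """
    return sample * gain_q8


sample = 5                      # assumed weak high-band sample
gain = 1.3                      # assumed correction gain
gain_q8 = round(gain * 256)     # gain in fixed point with 8 fractional bits

exact = sample * gain
err_16 = abs(level_shift_16bit(sample, gain_q8) - exact)
err_wide = abs(level_shift_wide(sample, gain_q8) / 256 - exact)
```

The narrow path loses the fractional part of the boosted value, while the wide path keeps it, so err_wide is smaller than err_16 for this input.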
[0056]
The adaptive filter 201 of the echo cancellation unit 20 generates a pseudo-regression sound
signal from the input sound signal, based on the sound collection directivity of the selected
sound collection beam signal MB. As its initial condition, the adaptive filter 201 starts
generating the pseudo-regression sound signal on the assumption that the speech wraps around
equally over the entire frequency band from the low range to the high range, regardless of the
sound emission directivity and the sound collection directivity. As a result, the generation start
condition of the pseudo-regression sound signal is simplified (unified), and an increase in
calculation load can be prevented. The post processor 202 functions as a subtractor: it subtracts
the pseudo-regression sound signal from the sound collection beam signal MB corrected by the
first filter 203, and outputs the result to the second filter 204. Since the sound collection beam
signal input to the post processor 202 contains wraparound speech whose signal level has been
corrected to be substantially uniform in all frequency bands as described above, the frequency
spectra of the input sound signal and the corrected wraparound speech signal substantially
match. Therefore, the frequency spectra of the pseudo-regression sound signal, which is based on
the input sound signal, and the wraparound speech signal included in the sound collection beam
signal MB also substantially match. In addition, the number of effective bits is large and the
resolution is high over the entire frequency band. As a result, the wraparound speech can be
removed from the sound collection beam signal MB reliably and with high accuracy.
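The embodiment does not state which adaptation algorithm the adaptive filter 201 uses. The following is a minimal sketch of the adaptive filter and subtractor, assuming a normalized LMS (NLMS) update, a simulated four-tap echo path, no near-end speech, and a flat (single-tap) starting response to mirror the uniform initial condition. All names and numeric values are hypothetical.

```python
import random


def simulate_echo(far_end, echo_path):
    """Wraparound signal: the emitted signal convolved with an assumed echo path."""
    out = []
    for n in range(len(far_end)):
        acc = 0.0
        for k, h in enumerate(echo_path):
            if n - k >= 0:
                acc += h * far_end[n - k]
        out.append(acc)
    return out


def nlms_echo_cancel(far_end, mic, taps=4, mu=0.5, eps=1e-8, init_gain=0.5):
    """Return (residual, weights) after NLMS adaptation.

    far_end is the input sound signal, mic is the corrected beam signal.
    """
    weights = [init_gain] + [0.0] * (taps - 1)  # flat response as starting point
    history = [0.0] * taps                      # most recent far-end samples
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        pseudo = sum(w * s for w, s in zip(weights, history))  # pseudo echo
        err = d - pseudo                                       # subtractor output
        norm = sum(s * s for s in history) + eps
        weights = [w + mu * err * s / norm for w, s in zip(weights, history)]
        residual.append(err)
    return residual, weights


random.seed(0)
ECHO_PATH = [0.6, 0.3, -0.1, 0.05]               # assumed echo path
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = simulate_echo(far_end, ECHO_PATH)
residual, weights = nlms_echo_cancel(far_end, mic)
```

With a noise-free simulated echo such as this one, the weights approach ECHO_PATH and the residual shrinks toward zero. A real device would also contain the speaker's voice in mic, which the subtraction leaves in place.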
[0057]
The second filter 204 performs filter processing based on the second filter characteristic Fr on
the sound collection beam signal from which the wraparound speech has been removed, and outputs
the result to the input/output I/F 12 as an output sound signal. The second filter characteristic
Fr lowers the high frequency signal level raised by the first filter characteristic Fc so that it
conforms to the original frequency characteristic of the wraparound speech. As a result, the
frequency characteristic of the sound other than the wraparound speech, that is, the speaker's
voice, whose high range was raised together with the wraparound speech by the first filter 203,
can be readjusted to its state before correction by the first filter 203. That is, an output sound
signal that conforms to the frequency spectrum of the sound collection beam signal as originally
collected can be obtained.
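The relation between the two filters can be illustrated with a pair of simple inverse filters. The following is a minimal sketch, assuming a first-order pre-emphasis as a stand-in for the first filter and the matching de-emphasis as the second filter. The embodiment does not specify the filter structure, and the coefficient 0.9 and the sample values are arbitrary.

```python
def first_filter(x, a=0.9):
    """Assumed pre-emphasis y[n] = x[n] - a*x[n-1], which raises high frequencies."""
    y = []
    prev = 0.0
    for s in x:
        y.append(s - a * prev)
        prev = s
    return y


def second_filter(y, a=0.9):
    """Inverse de-emphasis z[n] = y[n] + a*z[n-1], which lowers them again."""
    z = []
    prev = 0.0
    for s in y:
        prev = s + a * prev
        z.append(prev)
    return z


# Assumed samples of the speaker's voice.
speech = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
restored = second_filter(first_filter(speech))
```

The echo subtraction that takes place between the two filters in the device is omitted here, so the sketch shows only that the second filter returns the speaker's voice to its uncorrected form.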
[0058]
This readjustment operation also uses a larger number of bits than the number of operation bits
used in the echo cancellation circuit 200. For example, if the echo cancellation circuit 200
operates with 16 bits, the readjustment operation is performed with 32 bits or the like. Floating
point arithmetic may also be used for the readjustment. As a result, the signal level of the
readjusted sound collection beam signal MB can be calculated with high accuracy. That is, the
quantization error at the time of signal readjustment can be suppressed.
[0059]
Furthermore, double the precision of the echo cancellation processing, whose processing load is
high, is used only for the first and second filter operations, whose processing load is relatively
low. A sound collection beam signal (output sound signal) from which the wraparound speech has
been removed with high accuracy can therefore be obtained without further increasing the
calculation load of the echo cancellation processing, which is already complex.
[0060]
As described above, by using the configuration and processing of the present embodiment, it is
possible to realize a sound emission and collection device that removes the wraparound speech
with high accuracy and obtains and outputs only the necessary sound, such as the speaker's
utterance, at a high S/N ratio.
[0061]
In the above description, an example has been described in which both the first filter
characteristic Fc and the second filter characteristic Fr are stored in the filter characteristic
storage unit 205.
However, since the second filter characteristic Fr is the inverse correction characteristic of the
first filter characteristic Fc, only the first filter characteristic Fc may be stored, and the
second filter characteristic Fr may be calculated from the selected first filter characteristic Fc
and set in the second filter 204.
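Deriving the second characteristic on demand can be sketched as follows, assuming each characteristic is held as a list of linear per-band amplitude gains. This representation, the gain values, and the function name are assumptions for illustration only.

```python
def derive_second_characteristic(fc_gains):
    """Return per-band gains that cancel fc_gains (linear amplitude gains)."""
    return [1.0 / gain for gain in fc_gains]


# Assumed first filter gains, low band first.
fc = [1.0, 1.6, 4.0, 10.0]
fr = derive_second_characteristic(fc)

# Applying both characteristics in sequence leaves every band unchanged.
products = [c * r for c, r in zip(fc, fr)]
```

Under this assumption the storage unit needs only half as many entries, at the cost of one reciprocal per band whenever the directivity combination changes.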
[0062]
In the above description, a sound emission and collection device having both a microphone array
with sound collection directivity and a speaker array with sound emission directivity has been
described. However, the above configuration can also be applied to a combination of a microphone
or microphone array without sound collection directivity and a speaker array with sound emission
directivity, or to a combination of a microphone array with sound collection directivity and a
speaker or speaker array without sound emission directivity.
[0063]
FIG. 1 is a plan view showing the microphone and speaker arrangement of the sound emission and
collection device according to this embodiment, and a diagram showing the sound collection beam
areas formed by the sound emission and collection device.
FIG. 2 is a functional block diagram of the sound emission and collection device of this
embodiment.
FIG. 3 is a block diagram showing the main configuration of the echo cancellation unit 20 of this
embodiment.
FIG. 4 is a graph showing the difference in the frequency characteristic of the wraparound speech
depending on the combination of sound emission directivity and sound collection directivity.
FIG. 5 is a conceptual diagram showing the storage content of the filter characteristic storage
unit 205.
FIG. 6 is a diagram showing the general frequency characteristic of wraparound speech.
Explanation of reference signs
[0064]
1-sound emission and collection device, 101-case, 11-input/output connector, 12-input/output
I/F, 13-sound emission directivity control unit, 14-D/A converter, 15-sound emission amplifier,
16-sound pickup amplifier, 17-A/D converter, 181, 182-sound collection beam generation unit,
19-sound collection beam selection unit, 20-echo cancellation unit, 201-adaptive filter, 202-post
processor, 203, 204-filter, SP1 to SP3-speaker, SPA10-speaker array, MIC11 to MIC17,
MIC21 to MIC27-microphone, MA10, MA20-microphone array