JP2007027939

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2007027939
PROBLEM TO BE SOLVED: To provide an acoustic signal processing apparatus capable of
widening the effective frequency band when beamforming received sound waves with a
relatively small number of microphones and of improving the estimation accuracy of the target
direction. SOLUTION: A plurality of microphones are disposed along at least one of
the upper end or the lower end of a portable information terminal, and a plurality of
microphones are disposed along at least one of the right side or the left side of the portable
information terminal. A function of beamforming the sound waves received by the plurality of
microphones is provided. [Selected figure] Figure 1
Acoustic signal processor
[0001]
The present invention relates to an acoustic signal processing apparatus provided with a plurality
of microphones.
[0002]
In recent years, Automatic Speech Recognition (ASR) has been applied to anthropomorphic
agents and car navigation systems.
In the real environment, since the recognition rate is greatly reduced due to the effects of noise
and reverberation, research has been conducted to aim at an ASR system that is robust against
04-05-2019
noise and reverberation (see reference [1]). By using the microphone array, it is possible to
improve the recognition performance of the remote speech by using the spatial phase difference
between the target sound source and the noise source and suppressing the noise and the
reverberation.
[0003]
Reference [1]: NAKAMURA Tetsu, "Aiming for Robust Speech Recognition in Real Acoustic
Environments," Technical Report of SP 2002-12, pp. 31-36, 2002.
[0004]
Known methods for obtaining the target signal from the signals received by the microphones of a
microphone array include the delay-and-sum array, which adds a delay to the signal received by
each microphone so that the target signal is in phase and then sums the delayed signals, and the
adaptive microphone array, which learns the noise environment of the sound field to form an
optimum directivity characteristic.
[0005]
Also, known arrangements of the microphones in a microphone array include a plurality of
microphones arranged in a straight line, a plurality of microphones arranged in a lattice, and a
single microphone arranged at the center of a circle together with a plurality of microphones
arranged around it.
[0006]
The larger the difference between the minimum value and the maximum value of the
arrangement spacing of the microphones, the wider the effective frequency band when
beamforming the received sound wave.
Also, the more types of microphone arrangement intervals, the easier it is to emphasize the target
sound.
Further, the estimation accuracy of the target direction is higher when the microphones are
arranged two-dimensionally than when they are arranged one-dimensionally.
[0007]
JP 2004-127701 A
NAKAMURA Tetsu, "Aiming for Robust Speech Recognition in Real Acoustic Environments,"
Technical Report of SP 2002-12, pp. 31-36, 2002.
[0008]
The object of the present invention is to provide an acoustic signal processing apparatus capable
of widening the effective frequency band when beamforming received sound waves with a
relatively small number of microphones and of improving the estimation accuracy of the target
direction.
[0009]
The invention according to claim 1 is characterized in that a plurality of microphones are
arranged along at least one of the upper end or the lower end of a portable information
terminal, a plurality of microphones are arranged along at least one of the right side or the left
side of the portable information terminal, and a function of beamforming the sound waves
received by the plurality of microphones is provided.
[0010]
According to a second aspect of the present invention, in the first aspect, the arrangement interval
of the plurality of microphones disposed along at least one of the upper end or the lower end of
the portable information terminal differs from the arrangement interval of the plurality of
microphones disposed along at least one of the right side or the left side of the portable
information terminal.
[0011]
The invention according to claim 3 is characterized in that a microphone array unit to which a
portable information terminal is attached is provided; that, in a state where the portable
information terminal is attached to the microphone array unit, a plurality of microphones are
disposed along at least one of the upper end or the lower end of the portable information
terminal and a plurality of microphones are disposed along at least one of the right side or the
left side of the portable information terminal; and that a function of beamforming the sound
waves received by the plurality of microphones is provided.
[0012]
According to a fourth aspect of the present invention, in the third aspect, the arrangement interval
of the plurality of microphones disposed along at least one of the upper end or the lower end of
the portable information terminal differs from the arrangement interval of the plurality of
microphones disposed along at least one of the right side or the left side of the portable
information terminal.
[0013]
According to the present invention, it is possible to widen the effective frequency band when
beamforming received sound waves with a relatively small number of microphones, and to
improve the estimation accuracy of the target direction.
[0014]
Hereinafter, embodiments of the present invention will be described with reference to the
drawings.
[0015]
FIG. 1 shows a portable information terminal to which a microphone array unit is attached.
[0016]
In FIG. 1, 1 is a microphone array unit, and 2 is a portable information terminal (small PC).
Reference numeral 21 denotes a display unit provided on the front of the portable information
terminal 2.
In this specification, the direction from the back side to the front side of the portable information
terminal 2 is referred to as the front.
[0017]
The microphone array unit 1 includes a rectangular plate-shaped base 100, a horizontally long
first microphone holding unit 101 that projects forward from the upper end of the base 100, and
a vertically long second microphone holding unit 102 that projects forward from the right end
of the base 100.
[0018]
The microphone array unit 1 is attached to the portable information terminal 2 by mounting the
portable information terminal 2 on the base 100 of the microphone array unit 1.
In this attached state, the first microphone holding unit 101 is disposed along the upper end
surface of the portable information terminal 2, and the second microphone holding unit 102 is
disposed along the right side surface of the portable information terminal 2.
The microphone array unit 1 and the portable information terminal 2 are connected by USB.
[0019]
FIG. 2 shows an arrangement of microphones in the microphone array unit 1.
[0020]
In this example, the microphone array unit 1 is provided with eight microphones M1 to M8.
The microphone M5 is provided at a connection portion of the first microphone holding unit 101
and the second microphone holding unit 102.
The microphones M1 to M4 are provided in the first microphone holding unit 101, and the
microphones M6 to M8 are provided in the second microphone holding unit 102.
[0021]
The microphones M1 to M5 face forward and are arranged side by side in the horizontal direction
at equal intervals D1.
The microphones M5 to M8 face forward and are arranged in the vertical direction at equal
intervals D2.
The spacing D1 is set to 2 cm in this example, and the spacing D2 is set to 4 cm in this example.
Nondirectional condenser microphones are used as the microphones M1 to M8.
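
The layout described in this paragraph can be written down as coordinates. The following is a minimal sketch, not part of the original specification. It assumes a coordinate system with M5 at the origin, x increasing to the right and y increasing upward when facing the display, and it assumes M1 is the leftmost microphone; the function name microphone_positions is hypothetical.

```python
def microphone_positions(d1=0.02, d2=0.04):
    """Return {name: (x, y)} in metres for the L-shaped array, with M5 at the origin.

    Assumptions (not stated in the source): x grows to the right and y grows
    upward when facing the display, and M1 is the leftmost microphone.
    """
    positions = {}
    # M1..M5 run left to right along the top edge and end at the corner microphone M5.
    for i in range(1, 6):
        positions[f"M{i}"] = ((i - 5) * d1, 0.0)
    # M6..M8 continue downward from the corner M5 along the right edge.
    for i in range(6, 9):
        positions[f"M{i}"] = (0.0, -(i - 5) * d2)
    return positions


if __name__ == "__main__":
    for name, (x, y) in microphone_positions().items():
        print(f"{name}: x = {x * 100:.0f} cm, y = {y * 100:.0f} cm")
```

With the default spacings, the horizontal row spans 8 cm and the vertical row spans 12 cm.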
[0022]
FIG. 3 shows the configuration of the speech recognition system.
[0023]
The speech recognition system comprises a portable information terminal 2 on which the
microphone array unit 1 is mounted, and a speech recognition apparatus 3 connected to the
portable information terminal 2 by a wireless LAN.
[0024]
The microphone array unit 1 includes a multi-channel A / D converter 11 for converting an audio
signal received by each of the microphones M1 to M8 into a digital signal.
The multi-channel audio signal obtained by the multi-channel A / D converter 11 in the
microphone array unit 1 is transmitted to the speech recognition apparatus 3 by the wireless
LAN via the portable information terminal 2.
[0025]
The speech recognition apparatus 3 includes a direction estimation / beam forming unit 31, a
noise suppression unit 32, and a speech recognition unit 33.
[0026]
The direction estimation / beamforming unit 31 estimates the target direction based on the
multi-channel audio signal sent from the portable information terminal 2 and obtains the target
audio signal using this estimation result.
As this direction estimation process, for example, the method disclosed in Japanese Patent
Application Laid-Open No. 2004-112701 can be used.
In JP-A-2004-112701, direction estimation is performed based on the signals of three microphones
arranged at the vertices of an equilateral triangle.
For example, M3, M5, and M6 signals can be used as the signals of these three microphones.
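
The method of JP-A-2004-112701 is not reproduced in this description. As a stand-in illustration only, the sketch below shows a generic way to estimate an arrival angle from one microphone pair: find the time difference of arrival from the cross-correlation peak and convert it to an angle under a far-field assumption. The function names, the 343 m/s speed of sound, and the sampling rate in the example are assumptions, and this is not the cited three-microphone method.

```python
import numpy as np


def estimate_lag_samples(reference, delayed):
    """Lag (in samples) at which `delayed` best matches `reference`.

    A positive result means `delayed` arrives later than `reference`.
    """
    corr = np.correlate(delayed, reference, mode="full")
    # Index len(reference) - 1 of the full correlation corresponds to zero lag.
    return int(np.argmax(corr)) - (len(reference) - 1)


def lag_to_angle_deg(lag_samples, fs, spacing_m, c=343.0):
    """Far-field arrival angle, measured from broadside, for one microphone pair.

    c is an assumed speed of sound in m/s.
    """
    sin_theta = np.clip(c * lag_samples / fs / spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(1024)
    y = np.concatenate([np.zeros(3), x[:-3]])  # copy of x delayed by 3 samples
    lag = estimate_lag_samples(x, y)
    # 0.08 m is the M1 to M5 distance in this example; 48 kHz is an assumed rate.
    print(lag, lag_to_angle_deg(lag, fs=48000, spacing_m=0.08))
```

With spacings of a few centimetres the delay between microphones is only a few samples even at 48 kHz, so a practical system would likely need sub-sample delay estimation.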
[0027]
For example, a delay-and-sum array method is used as a method of obtaining a target speech
signal using the estimated target direction. That is, a delay is added to each signal received by
each microphone to make the target signal in phase, and then these are added. The amount of
delay to be added to each microphone is determined based on the estimated target direction.
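
The delay-and-sum step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the apparatus: it assumes a far-field plane wave, a speed of sound of 343 m/s, and delays applied in the frequency domain, which makes them circular over the processed block. The function names steering_delays and delay_and_sum are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def steering_delays(positions, direction, c=SPEED_OF_SOUND):
    """Delays (s) that bring a far-field plane wave into phase on every channel.

    positions: (n_mics, k) microphone coordinates in metres.
    direction: length-k vector pointing from the array toward the source.
    """
    positions = np.asarray(positions, dtype=float)
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    # A microphone further along `direction` hears the wave earlier by (p . u) / c.
    arrival_lead = positions @ direction / c
    # Delay each channel by its lead, shifted so that every delay is non-negative.
    return arrival_lead - arrival_lead.min()


def delay_and_sum(signals, delays, fs):
    """Delay each channel, then average. signals has shape (n_mics, n_samples)."""
    signals = np.asarray(signals, dtype=float)
    n = signals.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    # A delay of tau seconds multiplies the spectrum by exp(-j * 2 * pi * f * tau).
    phase = np.exp(-2j * np.pi * freqs[None, :] * np.asarray(delays)[:, None])
    aligned = np.fft.irfft(spectra * phase, n=n, axis=1)
    return aligned.mean(axis=0)
```

Signals from the estimated target direction add coherently after alignment, while sound from other directions adds with mismatched phase and is attenuated.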
[0028]
The target speech signal obtained by the direction estimation / beamforming unit 31 is sent to
the speech recognition unit 33 via the noise suppression unit 32, and speech recognition is
performed.
[0029]
In the microphone array unit 1, the five microphones M1 to M5 are arranged in the lateral
direction at equal intervals D1 (2 cm in this example), and the four microphones M5 to M8 are
arranged in the vertical direction at equal intervals D2 (4 cm in this example).
[0030]
The reception frequency band of a single microphone is determined by the performance of the
microphone itself, for example 50 Hz to 12 kHz, but the effective frequency band when
beamforming with a plurality of microphones depends on the microphone spacing.
For example, when microphones with the above performance are used, the effective frequency
band for beamforming is about 200 Hz to 8 kHz when the spacing is 2 cm and about 100 Hz to
4 kHz when the spacing is 4 cm.
That is, the effective frequency range of beamforming differs depending on the microphone
spacing, and since human voice ranges from about 100 Hz to 6 kHz, setting the microphone
spacings to 2 cm and 4 cm is meaningful.
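
The description does not say how these band edges were derived. The upper edges are, however, consistent with the common half-wavelength rule for avoiding spatial aliasing, f_max = c / (2 d). The sketch below computes that limit, assuming a speed of sound of 343 m/s; the function name is hypothetical, and the lower band edges quoted above are not modelled.

```python
def spatial_aliasing_limit_hz(spacing_m, c=343.0):
    """Highest frequency whose half wavelength is still at least spacing_m.

    c is an assumed speed of sound in m/s.
    """
    return c / (2.0 * spacing_m)


if __name__ == "__main__":
    for spacing_cm in (2.0, 4.0):
        limit = spatial_aliasing_limit_hz(spacing_cm / 100.0)
        print(f"{spacing_cm:.0f} cm spacing: about {limit / 1000.0:.1f} kHz")
```

Under this assumption the limits are roughly 8.6 kHz for 2 cm and 4.3 kHz for 4 cm, in line with the approximate figures of 8 kHz and 4 kHz given in the text.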
[0031]
In the above embodiment, since the plurality of microphones are two-dimensionally arranged, the
target direction can be estimated with high accuracy. In addition, since the plurality of
microphones are arranged in an L shape and the arrangement interval in the horizontal direction
differs from that in the vertical direction, the difference between the minimum arrangement
interval (2 cm) and the maximum (the distance between M1 and M8) can be made large, so the
effective frequency band when beamforming the received sound waves becomes wide.
Furthermore, because the arrangement interval in the lateral direction differs from that in the
vertical direction, the number of distinct arrangement intervals increases, and the target sound
becomes easy to emphasize even with a relatively small number of microphones.
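
To make the spacing argument concrete, the sketch below lists the distinct distances between all microphone pairs of the L-shaped array in the embodiment. It is an illustration only: the coordinates assume M5 at the corner with M1 to M4 to its left and M6 to M8 below it, and the function name pairwise_spacings_cm is hypothetical.

```python
from itertools import combinations
from math import hypot


def pairwise_spacings_cm(d1=2.0, d2=4.0):
    """Sorted distinct distances (cm) between all microphone pairs of the L-shaped array."""
    top = [((i - 5) * d1, 0.0) for i in range(1, 6)]    # M1..M5 along the top edge
    side = [(0.0, -(i - 5) * d2) for i in range(6, 9)]  # M6..M8 down the right edge
    mics = top + side
    distances = {
        round(hypot(ax - bx, ay - by), 6)
        for (ax, ay), (bx, by) in combinations(mics, 2)
    }
    return sorted(distances)


if __name__ == "__main__":
    spacings = pairwise_spacings_cm()
    print(f"minimum {spacings[0]} cm, maximum {spacings[-1]} cm")
    print(spacings)
```

Under these assumptions the minimum spacing is 2 cm and the maximum, between M1 and M8, is the diagonal of an 8 cm by 12 cm rectangle, about 14.4 cm.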
[0032]
In the above embodiment, the plurality of microphones are arranged in the lateral direction along
the upper end surface of the portable information terminal 2, but they may instead be arranged
in the lateral direction along the lower end surface of the portable information terminal 2.
Alternatively, a plurality of microphones may be arranged side by side along each of the upper
end surface and the lower end surface of the portable information terminal 2.
[0033]
Also, although a plurality of microphones are arranged in the longitudinal direction along the
right side surface of the portable information terminal 2, they may instead be arranged in the
longitudinal direction along the left side surface of the portable information terminal 2.
Alternatively, a plurality of microphones may be arranged in the longitudinal direction along
each of the left side surface and the right side surface of the portable information terminal 2.
[0034]
That is, a plurality of microphones may be arranged in the lateral direction at a first
predetermined interval along at least one of the upper end surface or the lower end surface of
the portable information terminal 2, and a plurality of microphones may be arranged in the
vertical direction, at a second predetermined interval different from the first predetermined
interval, along at least one of the right side surface or the left side surface of the portable
information terminal 2.
[0035]
In the above embodiment, the microphone array unit 1 is attached to the portable information
terminal 2, but as shown in FIG. 4 or 5, a plurality of microphones constituting the microphone
array may be embedded in the portable information terminal 2 itself.
[0036]
In the example of FIG. 4, three microphones M1 to M3 are arranged side by side in the lateral
direction along the upper end of the front face of the portable information terminal 2 at equal
intervals D1 (for example, 2 cm), and four microphones M3 to M6 are arranged in the vertical
direction along the right side of the front face of the portable information terminal 2 at equal
intervals D2 (for example, 4 cm).
[0037]
In the example of FIG. 5, four microphones M1 to M4 are arranged in the longitudinal direction
along the left side of the front surface of the portable information terminal 2 at equal intervals
D1 (for example, 2 cm). Similarly, four microphones M5 to M8 are arranged in the longitudinal
direction along the right side of the front surface of the portable information terminal 2 at
equal intervals D1 (for example, 2 cm). The microphones M4 and M5, M3 and M6, M2 and M7,
and M1 and M8, which are disposed at corresponding positions on both sides, form lateral
microphone pairs with a spacing of D2 (for example, 4 cm).
The microphones M4 and M5 are arranged in the lateral direction along the upper end of the
front surface of the portable information terminal 2 at the interval D2 (for example, 4 cm).
[0038]
In the above embodiment, signals received by the plurality of microphones are transmitted to the
speech recognition apparatus 3 by the wireless LAN via the portable information terminal 2, and
beamforming, noise suppression processing, and speech recognition processing are performed by
the speech recognition apparatus 3. However, beamforming alone may be performed on the
portable information terminal 2 side; beamforming and noise suppression processing may be
performed on the portable information terminal 2 side; or beamforming, noise suppression
processing, and speech recognition processing may all be performed on the portable information
terminal 2 side.
[0039]
In the above embodiment, the signals received by the plurality of microphones are beamformed
and then used as input signals for speech recognition, but the beamformed signals can also be
used for other purposes.
Further, the present invention can be applied to picking up and transmitting not only human
voice but also bird calls or machine operating sounds.
[0040]
FIG. 1 is a perspective view showing the state in which the microphone array unit is attached
to the portable information terminal.
FIG. 2 is a plan view showing the arrangement of the microphones in the microphone array unit 1.
FIG. 3 is a block diagram showing the configuration of the speech recognition system.
FIG. 4 is a perspective view showing an example in which the plurality of microphones
constituting the microphone array are embedded in the portable information terminal 2 itself.
FIG. 5 is a perspective view showing another example in which the plurality of microphones
constituting the microphone array are embedded in the portable information terminal 2 itself.
Explanation of reference numerals
[0041]
1 Microphone array unit
2 Portable information terminal
M1 to M8 Microphones