JP2005333270

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2005333270
PROBLEM TO BE SOLVED: To generate, in a speech apparatus using a plurality of microphones, a synthetic microphone signal that exhibits directivity and is less affected by noise. SOLUTION: A first DSP 25 uses a selected microphone as a main microphone and a microphone in a predetermined angular relationship with it as a sub microphone, multiplies the voice detection signals of both by coefficients, and adds or subtracts them. A synthetic microphone signal is thus generated that exhibits directivity different from that of a single microphone. [Selected figure] Figure 24
Microphone signal generation method and communication device
[0001]
The present invention relates to a speech apparatus and a microphone signal generation method
suitable for use when, for example, a plurality of conference participants in two conference
rooms use a microphone to conduct an audio conference. In particular, the present invention
relates to a microphone signal generation method and a speech apparatus that combine audio
signals of a plurality of microphones to generate an audio signal that exhibits directivity different
from that of a single microphone.
[0002]
A teleconferencing system is used in order for conference participants in two conference rooms
located at remote locations to conduct a conference. The teleconferencing system captures the
image of the conference participants in each conference room with the imaging means, picks up
the voice with the microphone, and communicates the image picked up with the imaging means
and the voice picked up with the microphone over the communication path. The transmitted image is displayed on the display unit of the television receiver in the other party's conference room, and the transmitted voice is output from its speaker.
[0003]
In such a video conference system, there is the problem in each conference room that it is difficult to pick up the voice of a speaker located at a distance from the imaging means and the microphone, even if a microphone is provided for each participant. In addition, there is also the problem that it is difficult for a conference participant located at a position away from the speaker unit to hear the sound output from the speaker of the television receiver.
[0004]
Japanese Patent Application Laid-Open Nos. 2003-87887 (Patent Document 1) and 2003-87890 (Patent Document 2) disclose, as a complement to a teleconferencing system that provides video and audio between mutually separated conference rooms, an integrally configured audio input / output device in which the voices of the meeting attendees in the other party's conference room can be heard clearly from the speaker, and in which the microphone and the speaker are less susceptible to noise from the other party's conference room and place less burden on the echo canceller.
[0005]
For example, the voice input / output device disclosed in Japanese Patent Application Laid-Open No. 2003-87887 is described with reference to FIGS. 5 to 8, 9 and 23 of that publication. From bottom to top are arranged a speaker box 5 with a built-in speaker 6, a conical reflector 4, open upward, for diffusing sound radially, a sound shielding plate 3, and a support 8. A plurality of unidirectional microphones (four in FIG. 6 and six in FIG. 23) are arranged radially at equal angles in the horizontal plane. The sound shielding plate 3 shields the plurality of microphones so that the sound from the lower speaker does not
enter the plurality of microphones. [Patent Document 1] JP 2003-87887 A [Patent Document 2] JP 2003-87890 A
[0006]
The audio input / output devices disclosed in Japanese Patent Application Laid-Open Nos. 2003-87887 and 2003-87890 are utilized as means for complementing a video conference system that provides video and audio. However, as a teleconferencing system it is often sufficient to use only voice, without a complex device such as a video conference system. For
example, when a plurality of conference participants hold meetings between the head office of a company and one of its remote sales offices, the participants are familiar with each other and recognize each other's voices, so they can hold a sufficient meeting without the video provided by a video conference system. Although a video conference system is convenient, it has the disadvantages of the investment required to introduce the system itself, the complexity of operation, and the large communication load for transmitting the captured images.
[0007]
Assuming that the present invention is applied to such an audio-only conference, the audio input / output devices disclosed in Japanese Patent Application Laid-Open Nos. 2003-87887 and 2003-87890 leave room for improvement in terms of performance, price, and dimensions, as well as in compatibility with the use environment, usability, and the like.
[0008]
An object of the present invention is to provide a speech apparatus, used for two-way audio communication only, that is further improved in terms of performance, price, size, adaptability to the use environment, usability, and the like.
In particular, the present invention aims to provide a microphone signal generation method and a speech apparatus that combine the signals of a plurality of microphones to generate an audio signal having directivity different from that of a single microphone, preferably an audio signal with noise eliminated or reduced.
[0009]
According to a first aspect of the present invention, there is provided a microphone signal generation method in a speech apparatus in which a plurality of microphones having a predetermined directivity are radially disposed at predetermined angular intervals and one of the audio signals detected by the microphones is selected. A sub microphone is set in advance for each selectable microphone; when a microphone is selected, the audio signal of the corresponding sub microphone is multiplied by a sub coefficient based on the angular position of that sub microphone and subtracted from the audio signal of the selected microphone to calculate a synthetic microphone audio signal.
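Paragraph [0009] can be read as a weighted subtraction of sub-microphone signals from the selected main microphone. The following is a minimal sketch of that combining step; the function name, the use of NumPy, and the coefficient values are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def synthesize(main: np.ndarray, subs: list, sub_coeffs: list) -> np.ndarray:
    """Combine a main microphone signal with sub-microphone signals.

    Each sub signal is scaled by its sub coefficient and subtracted from
    the main signal, as described in paragraph [0009].  Coefficient
    values here are illustrative only.
    """
    out = main.astype(float).copy()
    for sub, k in zip(subs, sub_coeffs):
        out -= k * np.asarray(sub, dtype=float)
    return out

# Noise arriving equally at both microphones cancels when k = 1.0.
noise = np.ones(4)
speech = np.array([0.0, 1.0, 0.0, -1.0])
main = speech + noise          # main microphone: speech plus noise
sub = noise                    # sub microphone: mostly noise
print(synthesize(main, [sub], [1.0]))   # the speech component remains
```

The same subtraction, applied per sample in real time, is what the first DSP 25 performs in hardware.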
[0010]
Preferably, the setting of the sub microphone and the determination of the sub coefficient are performed such that a desired directivity, different from that of a single microphone, is obtained when the audio signal of the selected microphone is combined with the result of multiplying the audio signal of at least one sub microphone by the sub coefficient.
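One way to see how the sub coefficient shapes the combined directivity is to model each unidirectional microphone as a first-order cardioid (an assumption for illustration; the patent does not specify a polar equation) and evaluate the combined pattern:

```python
import numpy as np

def cardioid(theta, axis):
    """First-order cardioid response of a microphone aimed at angle `axis` (rad)."""
    return 0.5 * (1.0 + np.cos(theta - axis))

def combined(theta, k, sub_axis):
    """Main cardioid aimed at 0 rad minus k times a sub cardioid at `sub_axis`."""
    return cardioid(theta, 0.0) - k * cardioid(theta, sub_axis)

# With the sub microphone opposed at 180 degrees (the pairing of an
# opposed microphone pair) and k = 1, the combination reduces to
# cos(theta): a figure-eight pattern with nulls at +/-90 degrees,
# a directivity neither cardioid exhibits on its own.
theta = np.linspace(0.0, 2.0 * np.pi, 7)
print(combined(theta, 1.0, np.pi))
```

Varying k and the sub microphone's angular position moves the nulls of the combined pattern, which is how a noise source in a known direction can be suppressed.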
[0011]
Further preferably, the audio signal of the synthetic microphone is output from one speaker.
[0012]
According to a second aspect of the present invention, there is provided a calling device having: a plurality of microphones that have a predetermined directivity and are arranged radially at predetermined angular intervals; microphone selection means for selecting one of the plurality of microphones; sub microphone setting means in which a sub microphone and a sub coefficient are set in advance for each selectable microphone; and synthetic microphone signal generation means which, when one microphone is selected by the microphone selection means, multiplies the audio signal of the corresponding sub microphone by the sub coefficient based on that sub microphone's angular position and subtracts it from the audio signal of the selected microphone to calculate the audio signal of the synthetic microphone.
[0013]
Preferably, it further comprises a speaker for outputting the audio signal generated by the
synthetic microphone signal generation means.
[0014]
Preferably, the setting of the sub microphone and the determination of the sub coefficient set in the sub microphone setting means are performed such that a desired directivity, different from the directivity of a single microphone, is obtained by combining the audio signal of the selected microphone with the result of multiplying the audio signal of at least one sub microphone by the sub coefficient.
[0015]
According to the microphone signal generation method and the communication device of the present invention, an integrated microphone / speaker communication device is provided in which the influence of noise is reduced.
Further, with the microphone signal generation method and the communication device according to the present invention, even if a noise source such as a projector is located in the vicinity of the communication device used as a microphone unit for collecting speech in a conference room, the communication device remains usable and is less susceptible to the noise.
[0016]
Furthermore, when the microphone signal generation method and the speech apparatus of the present invention are used, for example, as the microphone unit of a television conference system, the sensitivity in the direction of the television receiver's speaker is reduced, thereby reducing the echo sensitivity of the speech apparatus and improving its stability.
[0017]
First Embodiment First, an application example of the communication device of the first
embodiment of the present invention will be described.
FIGS. 1A to 1C are configuration diagrams showing an example to which the communication device according to the first embodiment of the present invention is applied.
As illustrated in FIG. 1A, communication devices 1A and 1B are installed in two remotely located conference rooms 901 and 902, respectively, and are connected by a telephone line 920.
As illustrated in FIG. 1B, in the two conference rooms 901 and 902, the communication devices
1A and 1B are placed on the tables 911 and 912, respectively.
However, in FIG. 1B, only the communication device 1A in the conference room 901 is illustrated
for simplification of the illustration.
The same applies to the communication device 1B in the meeting room 902.
An external perspective view of the communication device 1 (hereinafter, the reference numeral 1 represents both 1A and 1B) is shown in FIG. 2. As illustrated in FIG. 1C, a plurality of conference participants A1 to A6 (six in the present embodiment) are located around the communication devices 1A and 1B. However, in FIG. 1C, only the conference participants around the
communication device 1A in the conference room 901 are illustrated in order to simplify the
illustration. The arrangement of the conference participants located around the communication
device 1B in the other conference room 902 is similar.
[0018]
The communication device of the first embodiment of the present invention allows voice conversation between the two conference rooms 901 and 902 via, for example, the telephone line 920.
Usually, conversation via the telephone line 920 is one-to-one, between one speaker and another, but with the communication device of the first embodiment of the present invention a plurality of conference participants A1 to A6 can talk with each other over one telephone line 920. However, as will be described in detail later, in order to avoid voice congestion, the number of simultaneous speakers (in the same time zone) on each side is limited to one. Since the communication device of the present invention is intended for voice calls, it transmits only voice via the telephone line 920. In other words, it does not transmit a large amount of image data as a video conference system does. Furthermore, since the communication device of the present invention compresses the conference participants' speech before transmission, the transmission load on the telephone line 920 is light.
[0019]
Configuration of Communication Device The configuration of the communication device
according to the first embodiment of the present invention will be described with reference to
FIGS. FIG. 2 is a perspective view of the communication device according to the first embodiment
of the present invention. FIG. 3 is a cross-sectional view of the communication device illustrated
in FIG. FIG. 4 is a plan view of the microphone / electronic circuit housing portion of the
communication device illustrated in FIG. 1, and is a plan view taken along the line XX-Y in FIG.
[0020]
As illustrated in FIG. 2, the communication device 1 has an upper cover 11, a sound reflection plate (sound direction plate or sound guide plate) 12, a connecting member 13, a speaker housing portion 14, and an operation portion 15. As illustrated in FIG. 3, the speaker housing
portion 14 has a sound reflecting surface (a sound directing surface or a sound guiding surface)
14 a, a bottom surface 14 b, and an upper sound output opening 14 c. The receiving and
reproducing speaker 16 is accommodated in a lumen 14 d which is a space surrounded by the
sound reflecting surface 14 a and the bottom surface 14 b. The sound reflection plate 12 is
located above the speaker housing portion 14, and the speaker housing portion 14 and the
sound reflection plate 12 are connected by the connection member 13.
[0021]
A restraining member 17 passes through the connecting member 13 and is fastened between the restraining member lower fixing portion 14e on the bottom surface 14b of the speaker housing portion 14 and the restraining member fixing portion 12b of the sound reflecting plate 12. However, the restraining member 17 merely passes through the restraining member penetration portion 14f of the speaker housing portion 14 and is not fastened there. Although the speaker housing portion 14 vibrates due to the operation of the speaker 16, this arrangement prevents that vibration from being constrained around the upper sound output opening portion 14c.
[0022]
Speaker The voice of the speaker in the other party's conference room is emitted by the reception / playback speaker 16 through the upper sound output opening 14c and spreads in all directions, 360 degrees around the axis C-C, along the space defined by the sound reflection surface 12a of the sound reflection plate 12 and the sound reflection surface 14a of the speaker housing 14.
space. The cross section of the sound reflection surface 12a of the sound reflection plate 12
draws a gentle trumpet-shaped arc as illustrated. That is, in the sound reflection plate 12, a conical projection at the center continues smoothly into a gently inclined surface toward the periphery. The cross section of the sound reflection surface 12a
has an illustrated cross-sectional shape over 360 degrees (all directions) around the axis C-C.
Similarly, the cross section of the sound reflecting surface 14a of the speaker housing portion 14
also draws a gentle convex surface as illustrated. The cross section of the sound reflection
surface 14a also has the illustrated cross-sectional shape over 360 degrees (entire orientation)
about the axis C-C.
[0023]
The sound S emitted from the reception / playback speaker 16 passes through the upper sound output opening 14c, travels through the sound output space of trumpet-shaped cross section defined by the sound reflection surface (sound directional surface or sound guide surface) 12a and the sound reflection surface (sound directional surface or sound guide surface) 14a, and diffuses in all directions, 360 degrees around the axis C-C, along the surface of the table 911 on which the communication apparatus 1 is mounted, so that it is heard at an equal volume by all conference participants A1 to A6. In the present embodiment, the surface of the table 911 is thus also used as part of the sound propagation means. The spreading of the sound S output from the reception / playback speaker 16 is illustrated by arrows.
[0024]
The sound reflection plate 12 supports the printed circuit board 21. As illustrated in the plan view of FIG. 4, mounted on the printed circuit board 21 are the microphones MC1 to MC6 of the microphone / electronic circuit housing portion 2, the light emitting diodes LED1 to LED6, the microprocessor 23, the codec (CODEC) 24, and various electronic circuits such as a first digital signal processor (DSP1) 25, a second digital signal processor (DSP2) 26, an A / D converter block 27, a D / A converter block 28, and an amplifier block 29. The sound reflection plate 12 also functions as a member for supporting the microphone / electronic circuit housing portion 2.
[0025]
A damper 18 for absorbing the vibration from the reception / playback speaker 16 is attached to the printed circuit board 21, so that this vibration is not transmitted through the sound reflection plate 12 into the microphones MC1 to MC6 and the like as noise. The damper 18 is composed of a screw and a buffer material, such as anti-vibration rubber, inserted between the screw and the printed circuit board 21; the buffer material is fixed to the printed circuit board 21 with the screw. The vibration transmitted from the reception / playback speaker 16 to the printed circuit board 21 is thus absorbed by the buffer material, and the microphones MC1 to MC6 are not affected by the sound of the speaker 16.
[0026]
Arrangement of Microphones As illustrated in FIG. 4, six microphones MC1 to MC6 are arranged radially about the central axis C of the printed circuit board 21 at equal angular intervals (60 degrees in this embodiment). Each microphone is unidirectional; its characteristics will be described later. Each of the microphones MC1 to MC6 is swingably supported by a first microphone support member 22a and a second microphone support member 22b, both having flexibility or elasticity (to simplify the illustration, only the support members 22a and 22b of the microphone MC1 are shown). In addition to the countermeasure by the damper 18 using the above-described buffer material, the flexible or elastic first and second microphone support members 22a and 22b absorb the vibration of the printed circuit board 21 caused by the reception / playback speaker 16, so that the microphones are not affected by noise from the vibration of the speaker 16.
[0027]
As illustrated in FIG. 3, the reception / playback speaker 16 is directed along the central axis C-C, perpendicular to the plane in which the microphones MC1 to MC6 are located (in the present embodiment, it is oriented upward). This arrangement of the reception / playback speaker 16 and the six microphones MC1 to MC6 makes the distances between the speaker and each of the microphones MC1 to MC6 equal, so that the sound reaches the microphones MC1 to MC6 with almost the same volume and the same phase.
However, due to the configuration of the sound reflection surface 12a of the sound reflection
plate 12 and the sound reflection surface 14a of the speaker housing 14, the sound of the
reception / playback speaker 16 is not directly input to the microphones MC1 to MC6. In
addition, as described above, by using the damper 18 using a shock absorbing material, and the
first microphone support member 22a and the second microphone support member 22b having
flexibility or elasticity, the reception reproduction speaker 16 can be obtained. The influence of
vibration is reduced. As illustrated in FIG. 1C, the conference participants A1 to A6 are generally located at substantially equal intervals around the communication device 1, each in the vicinity of one of the microphones MC1 to MC6 disposed at 60-degree intervals over the full 360 degrees.
[0028]
Light Emitting Diodes Light emitting diodes LED1 to LED6 are disposed in the vicinity of the microphones MC1 to MC6 as a microphone selection result display device 30 for indicating which speaker has been selected, as described later. The light emitting diodes LED1 to LED6 are provided so as to be visible to all the conference participants A1 to A6 even when the upper cover 11 is attached. For this purpose, the upper cover 11 is provided with a transparent window through which the light emission state of the light emitting diodes LED1 to LED6 can be visually recognized. Of course, openings may instead be provided in the upper cover 11 at the positions of the light emitting diodes LED1 to LED6, but a light transmitting window is preferable from the viewpoint of protecting the microphone / electronic circuit housing portion 2 from dust.
[0029]
The first digital signal processor (DSP1) 25, the second digital signal processor (DSP2) 26, and the various electronic circuits 27 to 29, which perform the various kinds of signal processing described later, are arranged on the printed circuit board 21 in the space other than the portion where the microphones MC1 to MC6 are located. In the present embodiment, the DSP 25 is used, together with the various electronic circuits 27 to 29, as signal processing means for performing processing such as filter processing and microphone selection processing, and the DSP 26 is used as an echo canceller.
[0030]
FIG. 5 is a schematic configuration diagram of the microprocessor 23, the codec 24, the DSP 25,
the DSP 26, the A / D converter block 27, the D / A converter block 28, the amplifier block 29,
and various other electronic circuits. The microprocessor 23 performs overall control processing
of the microphone / electronic circuit housing unit 2. The codec 24 compresses and encodes
voice to be transmitted to the conference room of the other party. The DSP 25 performs various
types of signal processing described below, such as filtering and microphone selection. The DSP
26 functions as an echo canceller. In FIG. 5, four A / D converters 271 to 274 are illustrated as an example of the A / D converter block 27, two D / A converters 281 to 282 as an example of the D / A converter block 28, and two amplifiers 291 to 292 as an example of the amplifier block 29. In addition, various circuits such as a power supply circuit are mounted on the printed circuit board 21 as part of the microphone / electronic circuit housing unit 2.
[0031]
In FIG. 4, the pairs of microphones MC1-MC4, MC2-MC5, and MC3-MC6, each disposed on a straight line at symmetrical (opposed) positions with respect to the central axis C of the printed circuit board 21, are input to the two-channel A / D converters 271 to 273, which convert analog signals into digital signals. In the present embodiment, one A / D converter converts two channels of analog input signals into digital signals; therefore, the detection signals of one pair of microphones located on a straight line across the central axis C, for example the microphones MC1 and MC4, are input to one A / D converter and converted into digital signals. Further, in the present embodiment, in order to identify the speaker whose voice is to be sent to the other party's conference room, the difference and the magnitude of the voices of the two microphones positioned on a straight line are referred to. When the signals of two microphones on a line are input to the same A / D converter, their conversion timings are almost the same, so the timing error is small when the difference between the audio outputs of the two microphones is taken, which is an advantage. The A / D converters 271 to 274 can also be configured as A / D converters with a variable gain amplification function. The collected sound signals of the microphones MC1 to MC6 converted by the A / D converters 271 to 274 are input to the DSP 25, where the various signal processing described later is performed. The result of selecting one of the microphones MC1 to MC6, as one of the processing results of the DSP 25, is output to the light emitting diodes LED1 to LED6, which are an example of the microphone selection result display means 30.
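The selection step described above (levels of the radially arranged microphones compared to find the current talker) might be sketched as follows. The frame-based RMS comparison, the function name, and the sample values are illustrative assumptions; the actual DSP 25 processing, including the opposed-pair difference and its smoothing, is more elaborate.

```python
import numpy as np

def pick_microphone(frames: dict) -> str:
    """Select the microphone whose current frame has the highest RMS level.

    `frames` maps a microphone name (MC1..MC6) to one block of samples.
    A fuller implementation would also compare the opposed pairs
    (MC1-MC4, MC2-MC5, MC3-MC6) and smooth the decision over time;
    this sketch keeps only the level comparison.
    """
    def rms(x):
        return float(np.sqrt(np.mean(np.asarray(x, dtype=float) ** 2)))
    return max(frames, key=lambda name: rms(frames[name]))

frames = {
    "MC1": np.array([0.9, -0.8, 0.7]),   # loud: the speaker faces MC1
    "MC4": np.array([0.1, -0.1, 0.1]),   # opposed microphone, quiet
}
print(pick_microphone(frames))
```

Because each opposed pair shares one A / D converter, the two frames of a pair are sampled at effectively the same instants, which keeps such level and difference comparisons free of timing skew.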
[0032]
The processing result of the DSP 25 is output to the DSP 26, where echo cancellation processing is performed. The DSP 26 has, for example, an echo cancellation transmission processing unit and an echo cancellation reception processing unit. The processing results of the DSP 26 are converted to analog signals by the D / A converters 281 to 282. The output from the D / A converter 281 is optionally encoded by the codec 24 and output through the amplifier 291 to the line-out of the telephone line 920 (FIG. 1A); it is then output as sound through the reception / playback speaker 16 of the communication apparatus 1 installed in the other party's conference room. The voice from the communication apparatus 1 installed in the other party's conference room is input through the line-in of the telephone line 920 (FIG. 1A), converted into a digital signal by the A / D converter 274, and input to the DSP 26, where it is used for echo cancellation processing. The sound from the other party's apparatus is also applied to the speaker 16 through a path not shown and output as sound. The output from the D / A converter 282 is output as sound from the reception / playback speaker 16 of the communication device 1 through the amplifier 292. That is, from the above-described reception / playback speaker 16, the conference participants A1 to A6 can hear, in addition to the voice of the speaker selected in the other party's conference room, the voice of the speaker selected in their own conference room.
[0033]
Microphones MC1 to MC6 FIG. 6 is a graph showing the characteristics of the microphones MC1
to MC6. The frequency and level characteristics of each unidirectional microphone change as shown in FIG. 6 according to the arrival angle of the voice from the speaker to
the microphone. A plurality of curves show directivity when the frequency of the collected signal
is 100 Hz, 150 Hz, 200 Hz, 300 Hz, 400 Hz, 500 Hz, 700 Hz, 1000 Hz, 1500 Hz, 2000 Hz, 3000
Hz, 4000 Hz, 5000 Hz, 7000 Hz. However, to simplify the illustration, FIG. 6 typically illustrates
the directivity for 150 Hz, 500 Hz, 1500 Hz, 3000 Hz, and 7000 Hz.
[0034]
FIGS. 7A to 7D are graphs showing the analysis results of sound source position versus microphone sound collection level: a sound source is placed at a predetermined distance, for example 1.5 meters, from the communication device 1, and the voice picked up by each microphone is fast Fourier transformed (FFT) at fixed time intervals. The X axis represents frequency, the Y axis represents signal level, and the Z axis represents time. When the directional microphone shown in FIG. 6 is used, strong directivity is exhibited in front of the microphone. In the present embodiment, the DSP 25 performs microphone selection processing
utilizing such characteristics.
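The fixed-interval FFT analysis behind FIG. 7 can be reproduced in outline as follows; the block length, window choice, and sampling rate are illustrative assumptions, not values from the patent.

```python
import numpy as np

def block_spectra(signal: np.ndarray, block: int = 256) -> np.ndarray:
    """FFT magnitude of successive fixed-length blocks (cf. paragraph [0034]).

    Returns an array of shape (num_blocks, block // 2 + 1): frequency on
    one axis, block index (time) on the other, level as the values --
    the X, Z, and Y axes of FIG. 7 respectively.
    """
    n = len(signal) // block
    frames = signal[:n * block].reshape(n, block)
    frames = frames * np.hanning(block)        # window to reduce spectral leakage
    return np.abs(np.fft.rfft(frames, axis=1))

fs = 8000.0
t = np.arange(2048) / fs
tone = np.sin(2 * np.pi * 1000.0 * t)          # 1 kHz test tone
spec = block_spectra(tone)
peak_bin = int(np.argmax(spec[0]))
print(peak_bin * fs / 256)                     # peak frequency, near 1000 Hz
```

Comparing such per-microphone spectra over time is one way the strong frontal directivity of FIG. 6 can be exploited for microphone selection.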
[0035]
If a nondirectional microphone were used instead of a directional microphone as in the present invention, all sound around the microphone would be picked up, so the S / N ratio between the speaker's voice and the ambient noise would deteriorate and good sound could not be collected. To avoid this, in the first embodiment of the present invention the S / N ratio with respect to the surrounding noise is improved by collecting sound with one directional microphone. A microphone array using multiple omnidirectional microphones could also be used to obtain directivity, but such a method requires complicated processing to align the time axes (phases) of the multiple signals; it therefore takes time, is not responsive, and complicates the apparatus configuration, including the signal processing performed by the DSP. The first embodiment of the present invention avoids these problems by using the directional microphones illustrated in FIG. 6. In addition, when microphone array signals are combined for use as a directional sound collecting microphone, the outer shape is constrained by the pass frequency characteristics and becomes large; the present invention also avoids this problem.
[0036]
Effects of Device Configuration of Communication Device The communication device having the
above-described configuration exhibits the following advantages. (1) The positional relationship
between the even number microphones MC1 to MC6 arranged at equal angles radially and at
equal intervals and the reception / reproduction speaker 16 is constant, and further the distance
is very short. The level of the sound coming back directly from the level coming back to the
microphones MC1 to MC6 through the conference room (room) environment is overwhelmingly
dominant. Therefore, the characteristics (signal level (intensity), frequency characteristics (f
characteristic), and phase) of the sound reaching the microphones MC1 to MC6 from the speaker
16 are always the same. That is, in the communication device 1 in the embodiment of the present
invention, there is an advantage that the transfer function is always the same. (2) Therefore,
there is no change in transfer function when the output of the microphone to be sent to the other
party's conference room is switched when the speakers are different, and it is not necessary to
adjust the gain of the microphone system each time the microphone is switched. Have an
advantage. In other words, there is an advantage that once adjustment is made at the time of
manufacture of the telephone set according to the first embodiment of the present invention, it is
not necessary to perform adjustment again. (3) Even if the microphones are switched when the
04-05-2019
13
speakers are different for the same reason as described above, only one echo canceller (DSP 26)
may be used. The DSP is expensive, there is no need to arrange a plurality of DSPs on the printed
circuit board 21 with various members mounted thereon and a small amount of space, and the
space for arranging the DSPs on the printed circuit board 21 may be small. As a result, the
printed circuit board 21 and hence the communication device of the embodiment of the present
invention can be miniaturized. (4) As described above, since the transfer function between the
reception and reproduction speaker 16 and the microphones MC1 to MC6 is constant, for
example, there is an advantage that the sensitivity difference adjustment of the microphone itself
having ▒ 3 dB can be performed by the microphone unit of the communication device alone
There is. Details of sensitivity difference adjustment will be described later. (5) The table on
which the communication device 1 is placed is usually a round table or a polygonal table, and the single reception/playback speaker 16 of the communication device 1 makes possible a speaker system that distributes voice of uniform quality evenly over all 360 degrees around the axis C. (6) The sound from the reception/playback speaker 16 travels along the table surface of the round table (boundary effect), so that good-quality sound reaches the conference participants effectively and evenly, while in the ceiling direction of the conference room the opposing sounds cancel in phase and become small; there is thus the advantage that little sound is reflected from the ceiling toward the conference participants, and as a result a clear sound is delivered to them.
(7) The sound from the reception/playback speaker 16 reaches the microphones MC1 to MC6, which are arranged radially at equal angles and equal intervals, at the same volume at the same time, which makes it easier to judge whether a sound is a speaker's voice or the received voice. As a result, erroneous determinations in the microphone selection process are reduced. The details will be described later. (8) By arranging an even number of microphones, for example six, at equal angles and equal intervals so that each pair of opposing microphones lies on a straight line, the level comparison used for direction detection is facilitated. (9) The
damper 18, the microphone support members 22A and 22B, and related parts reduce the influence of vibration caused by the sound of the reception/playback speaker 16 on the sound collection of the microphones MC1 to MC6. (10) As illustrated in FIG. 3, the structure is such that the sound of the reception/playback speaker 16 does not propagate directly to the microphones MC1 to MC6. Therefore, in the communication device 1, the influence of noise from the reception and reproduction speaker 16 is small.
[0037]
Modifications Although the telephone set 1 described with reference to FIGS. 2 to 3 arranges the receiving and reproducing speaker 16 in the lower part and the microphones MC1 to MC6 (and related electronic circuits) in the upper part, the positions of the receiving and reproducing speaker 16 and the microphones MC1 to MC6 (and the associated electronics) can also be turned upside down, as illustrated in FIG. Even in such a case, the above-described effects are obtained.
[0038]
The number of microphones is not limited to six; an arbitrary even number of microphones, such as four or eight, may be arranged radially at equal angles and equal intervals so that a plurality of pairs lie on straight lines through the axis C, like the microphones MC1 and MC4. The reason for arranging two microphones such as MC1 and MC4 to face each other on a straight line is to facilitate microphone selection and speaker identification.
[0039]
Content of Signal Processing The content of the processing performed by the first digital signal processor (DSP) 25 will mainly be described below. FIG. 9 illustrates an outline of the processing performed by the DSP 25; that outline is described below.
[0040]
(1) Measurement of ambient noise As an initial operation, the ambient noise of the environment in which the communication device 1 is installed is preferably measured. The communication device 1 can be used in various environments (conference rooms). In order to ensure the accuracy of microphone selection and to improve the performance of the communication device 1, in the present invention the noise of the surrounding environment in which the communication device 1 is installed is measured at an early stage, so that the effect of the noise can be excluded from the signals collected by the microphones. Of course, when the calling device 1 is used repeatedly in the same conference room, noise measurement has already been performed, and this processing can be omitted when the noise conditions do not change. Note that noise measurement can also be performed during normal operation.
[0041]
(2) Selection of Chairperson For example, when the telephone apparatus 1 is used for a two-way conference, it is useful for each conference room to have a chairperson who manages the proceedings. Therefore, as an aspect of the present invention, a chairperson is set from the operation unit 15 of the speech device 1 at the initial stage of its use. As a setting method, for example, the first microphone MC1 located in the vicinity of the operation unit 15 is used as the chairperson's microphone. Of course, the chairperson's microphone can also be chosen freely. This process can be omitted when the chairperson who repeatedly uses the telephone apparatus 1 is the same. Alternatively, the microphone at which the chairperson sits may be determined in advance, in which case the chairperson selection operation is not necessary each time. Of course, the selection of the chairperson is not limited to the initial state and can be performed at any timing. Details of the chairperson selection will be described later.
[0042]
(3) Adjustment of microphone sensitivity differences As an initial operation, the gain of the amplification units that amplify the signals of the microphones MC1 to MC6, or the attenuation value of the attenuation units, is preferably adjusted automatically so that the acoustic coupling between the reception/reproduction speaker 16 and each of the microphones MC1 to MC6 becomes equal. The sensitivity difference adjustment will be described later.
[0043]
As the normal processing, the various types of processing exemplified below are performed. (4) Microphone Selection and Switching Processing When a plurality of conference participants talk at the same time in one conference room, the voices are mixed and are difficult for the conference participants A1 to A6 in the other conference room to hear. Therefore, in the first embodiment of the present invention, in principle one person talks in a given time period, and the DSP 25 performs microphone selection and switching processing. As a result, only the speech from the selected microphone is transmitted to the communication apparatus 1 of the other party's conference room via the telephone line 920 and output from its speaker. Of course, as described with reference to FIG. 5, the LED in the vicinity of the microphone of the selected speaker is lit; furthermore, since the voice of the selected speaker is also heard from the speaker of the communication device 1 in the same room, the participants can identify who the authorized speaker is. By this processing, the signal of the unidirectional microphone facing the speaker is selected and a signal with a good S/N ratio is sent to the other party as the transmission signal. (5) Display of Selected
Microphone So that all the conference participants A1 to A6 can easily recognize which microphone has been selected, that is, which participant is permitted to speak, the corresponding element of the selection result display means 30, for example one of the light emitting diodes LED1 to LED6, is lit. (6) As background to the above-mentioned microphone selection processing, or in order to execute it correctly, the various kinds of signal processing exemplified below are performed. (A) Band separation and level conversion of the microphone sound collection signals. (B) Determination of the start and end of speech, used as a trigger to start the selection of the microphone signal facing the direction of the speaker. (C) Detection of the microphone in the direction of the speaker, by analyzing the collected sound signals of each microphone to determine the microphone used by the speaker. (D) Determination of the switching timing for the speaker-direction microphone, and selection switching of the microphone signal facing the detected speaker: a switching instruction is given for the microphone selected from the processing results described above. (E) Measurement of floor noise during normal operation: the average volume level of the selected speaker at the microphones MC1 to MC6 and the noise level after the end of speech is detected are measured, and the threshold levels for speech start and end determination are reset at fixed time intervals.
[0044]
Generation of Various Frequency Component Signals by Filter Processing FIG. 10 is a configuration diagram showing the filtering performed by the DSP 25 as pre-processing of the sound signal collected by a microphone. FIG. 10 shows the processing for one microphone (one channel, that is, one sound pickup signal). The collected signal of each microphone is first processed by an analog low-cut filter 101 having a cutoff frequency of, for example, 100 Hz, and the filtered audio signal, from which frequencies of 100 Hz and below have been removed, is output to the A/D converter 102. The sound signal converted into digital form by the A/D converter 102 then has its high-frequency components removed by digital high-cut filters 103a to 103e (collectively, 103) having cutoff frequencies of 7.5 KHz, 4 KHz, 1.5 KHz, 600 Hz, and 250 Hz, respectively (high-cut processing). The outputs of adjacent digital high-cut filters 103a to 103e are then subtracted from each other in subtractors 104a to 104d (collectively, 104). In the embodiment of the present invention, the digital high-cut filters 103a to 103e and the subtractors 104a to 104d are actually implemented as processing in the DSP 25. The A/D converter 102 can be implemented as one of the A/D converter blocks 27.
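The adjacent-subtraction scheme of FIG. 10 can be sketched as follows. This is an illustrative Python model, not the DSP 25 implementation: it substitutes simple one-pole low-pass filters for the second-order IIR high-cut filters, and all function names are hypothetical.

```python
import math

def one_pole_highcut(x, cutoff_hz, fs):
    """One-pole low-pass ("high-cut") filter, a simplified stand-in for
    the second-order IIR high-cut filters 103a to 103e."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, out = 0.0, []
    for s in x:
        y = (1.0 - a) * s + a * y
        out.append(y)
    return out

def band_split(x, cutoffs, fs):
    """Split one channel into bands by subtracting the outputs of
    adjacent high-cut filters (subtractors 104a to 104d); the output of
    the lowest-cutoff filter is kept as the bottom band."""
    lows = [one_pole_highcut(x, fc, fs) for fc in cutoffs]  # widest band first
    bands = [[h - l for h, l in zip(hi, lo)]
             for hi, lo in zip(lows, lows[1:])]
    bands.append(lows[-1])
    return bands

fs = 16000                                   # sampling frequency, Hz
cutoffs = [7500, 4000, 1500, 600, 250]       # cutoff frequencies from the text, Hz
x = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(1024)]  # 1 kHz test tone
bands = band_split(x, cutoffs, fs)
energies = [sum(s * s for s in b) for b in bands]
# A 1 kHz tone falls mainly into the [600 Hz-1.5 KHz] band (index 2).
```

With the five cutoffs above, the subtraction yields four difference bands plus the bottom low-pass band, so five band signals are produced per channel, as in the description.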
[0045]
FIG. 11 is a frequency characteristic diagram showing the result of the filter processing described with reference to FIG. 10. In this manner, a plurality of signals having various frequency components are generated from the signal collected by a microphone having a single directivity.
[0046]
Band-pass filter processing and microphone signal level conversion processing One of the triggers for starting the microphone selection processing is the determination of the start and end of speech. The signal used for that purpose is obtained by the band-pass filter processing and level conversion processing illustrated in FIG. 12. FIG. 12 shows only one channel (CH) of the six channels of input signal processing for the signals collected by the microphones MC1 to MC6. The band-pass filter and level conversion processing unit in the DSP 25 comprises band-pass filters 201a to 201f (collectively, band-pass filter block 201) with pass bands of, for example, 100 to 600 Hz, 100 to 250 Hz, 250 to 600 Hz, 600 to 1500 Hz, 1500 to 4000 Hz, and 4000 to 7500 Hz for the collected signal of the microphone of each channel, and level conversion units 202a to 202g (collectively, level conversion block 202) that convert the levels of the original microphone sound pickup signal and the band-passed sound pickup signals.
[0047]
Each of the level conversion units 202a to 202g has a signal absolute value processing unit 203 and a peak hold processing unit 204. As illustrated in the waveform diagram in FIG. 12, when a negative signal (indicated by the broken line) is input, the signal absolute value processing unit 203 inverts its sign and converts it into a positive signal. The peak hold processing unit 204 holds the maximum value of the output signal of the signal absolute value processing unit 203. In the present embodiment, however, the held maximum value decreases somewhat with the passage of time. Of course, the peak hold processing unit 204 can be improved to reduce the amount of decrease and hold the maximum value for a long time.
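The level conversion of each unit 202a to 202g can be sketched as follows. This is an illustrative Python model; the decay factor is an arbitrary assumed value, since the embodiment only states that the held maximum decreases somewhat over time.

```python
def level_convert(samples, decay=0.999):
    """Absolute-value processing (unit 203) followed by peak hold with a
    slow decay (unit 204): negative inputs are inverted to positive, and
    the running maximum is held but allowed to decay slightly per sample."""
    held, out = 0.0, []
    for s in samples:
        rectified = abs(s)                    # sign inversion for negative input
        held = max(rectified, held * decay)   # peak hold with slow decay
        out.append(held)
    return out

levels = level_convert([0.1, -0.8, 0.2, -0.1, 0.05])
# The -0.8 sample sets the peak; later, smaller samples the held value decays slowly.
```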
[0048]
The band-pass filters are described next. The band-pass filters used in the communication device 1 are, for example, composed only of second-order IIR high-cut filters together with the low-cut filter of the microphone signal input stage. The present embodiment exploits the fact that when a high-cut-filtered signal is subtracted from a signal with a flat frequency characteristic, the remainder is almost equal to a low-cut-filtered signal. To match the frequency-level characteristics, one extra band-pass filter is required, but the required band passes are obtained with far fewer filter stages than the number of band-pass filters would otherwise demand. The band frequencies of the band-pass filters required here are the following six band-pass filters per channel (CH) of the microphone signal.
[0049]
BP characteristic band-pass filters:
BPF1 = [100 Hz-250 Hz] ... 201b
BPF2 = [250 Hz-600 Hz] ... 201c
BPF3 = [600 Hz-1.5 KHz] ... 201d
BPF4 = [1.5 KHz-4 KHz] ... 201e
BPF5 = [4 KHz-7.5 KHz] ... 201f
BPF6 = [100 Hz-600 Hz] ... 201a
[0050]
With this method, the IIR filter computation program in the DSP 25 requires only 6 CH (channels) × 5 (IIR filters) = 30 filter circuits. Contrast this with the conventional band-pass filter configuration. Assuming the band-pass filters are built from second-order IIR filters, preparing six-band band-pass filters for each of the six microphone signals as in the present invention would, in the conventional method, require 6 × 6 × 2 = 72 IIR filter circuits. Such processing requires considerable program capacity even on the latest high-performance DSPs and affects other processing. In the embodiment of the present invention, the 100 Hz low-cut filtering is handled by the analog filter of the input stage, so the second-order IIR high-cut filters to be prepared have the five cutoff frequencies of 250 Hz, 600 Hz, 1.5 KHz, 4 KHz, and 7.5 KHz. Of these, the high-cut filter with a cutoff frequency of 7.5 KHz would not strictly be necessary, since the sampling frequency is 16 KHz; however, in the subtraction processing the output level of the band-pass filter decreases under the influence of the phase response around the IIR filter, and to reduce this phenomenon the phase is intentionally rotated.
[0051]
The filter processing in the DSP 25 performs high-cut filter processing as the first stage, and subtraction processing on the first-stage high-cut filter results as the second stage. FIG. 11 is an image frequency characteristic diagram of the signal processing results. The bracketed numbers [x] below indicate the corresponding processing cases in FIG. 11.
[0052]
First stage [1] For the full-band band-pass filter, the input signal is passed through the 7.5 KHz high-cut filter. Combined with the analog low-cut filter of the input, this filter output becomes a [100 Hz-7.5 KHz] band-pass filter output.
[0053]
[2] The input signal is passed through the 4 KHz high-cut filter. Combined with the input analog low-cut filter, this filter output becomes a [100 Hz-4 KHz] band-pass filter output.
[0054]
[3] The input signal is passed through the 1.5 KHz high-cut filter. Combined with the input analog low-cut filter, this filter output becomes a [100 Hz-1.5 KHz] band-pass filter output.
[0055]
[4] The input signal is passed through the 600 Hz high-cut filter. Combined with the input analog low-cut filter, this filter output becomes a [100 Hz-600 Hz] band-pass filter output.
[0056]
[5] The input signal is passed through the 250 Hz high-cut filter. Combined with the input analog low-cut filter, this filter output becomes a [100 Hz-250 Hz] band-pass filter output.
[0057]
Second stage [1] The band-pass filter BPF5 = [4 KHz-7.5 KHz] executes the subtraction of filter outputs [1]-[2] ([100 Hz-7.5 KHz] - [100 Hz-4 KHz]) to obtain the signal output [4 KHz-7.5 KHz]. [2] The band-pass filter BPF4 = [1.5 KHz-4 KHz] executes the subtraction [2]-[3] ([100 Hz-4 KHz] - [100 Hz-1.5 KHz]) to obtain the signal output [1.5 KHz-4 KHz]. [3] The band-pass filter BPF3 = [600 Hz-1.5 KHz] executes the subtraction [3]-[4] ([100 Hz-1.5 KHz] - [100 Hz-600 Hz]) to obtain the signal output [600 Hz-1.5 KHz]. [4] The band-pass filter BPF2 = [250 Hz-600 Hz] executes the subtraction [4]-[5] ([100 Hz-600 Hz] - [100 Hz-250 Hz]) to obtain the signal output [250 Hz-600 Hz]. [5] The band-pass filter BPF1 = [100 Hz-250 Hz] uses the first-stage output [5] as its output signal as it is. [6] The band-pass filter BPF6 = [100 Hz-600 Hz] uses the first-stage output [4] as its output signal as it is. The band-pass filter outputs required by the DSP 25 are thus obtained by the above processing.
[0058]
For the input sound pickup signals MIC1 to MIC6 of the microphones, the sound pressure level of the entire band and the sound pressure levels of the six bands passed through the band-pass filters are constantly updated in the DSP 25, as shown in Table 1.
[0059]
[0060]
In Table 1, for example, L1-1 indicates the peak level when the sound pickup signal of the microphone MC1 passes the first band-pass filter 201a. The speech start/end determination uses the microphone sound collection signal whose sound pressure level has been converted by the level conversion unit 202b after passing through the 100 Hz to 600 Hz band-pass filter 201a illustrated in FIG. 12.
[0061]
Since the conventional band-pass filter configuration combines a high-pass filter and a low-pass filter for each band-pass filter stage, building the 36 band-pass filters of the specification used in this embodiment in that way would require 72 filter circuits. By contrast, the filter configuration of the embodiment of the present invention is simplified as described above.
[0062]
Speech start/end determination processing Based on the values output from the sound pressure level detection units, the first digital signal processor (DSP1) 25 determines, as illustrated in FIG. 13, that speech has started when the microphone sound collection signal level rises from the floor noise and exceeds the speech start level threshold; it treats the interval during which the level remains above the start threshold as speech; and when the level falls below the speech end threshold and remains at floor noise for the speech end determination time, for example 0.5 seconds, it determines that the speech has ended. Specifically, it is determined that speech has started when the sound pressure level data (microphone signal level (1)), which has passed through the 100 Hz to 600 Hz band-pass filter and been level-converted by the conversion processing unit 202b illustrated in FIG. 12, reaches or exceeds the threshold level illustrated in FIG. 13. To avoid malfunction caused by frequent microphone switching, the DSP 25 is configured not to detect the next speech start for the speech end determination time, for example 0.5 seconds, after detecting a speech start.
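The determination logic of FIG. 13 can be sketched as follows. This is an illustrative Python model; the 10 ms frame period and the level representation are assumptions, and the post-start lockout is omitted for brevity.

```python
def detect_speech(levels, start_thresh, end_thresh, frame_ms=10, hold_ms=500):
    """Speech start/end detection: start when the level crosses the start
    threshold; end when it stays below the end threshold for the speech
    end determination time (0.5 s by default)."""
    hold_frames = hold_ms // frame_ms
    in_speech, below, events = False, 0, []
    for i, lv in enumerate(levels):
        if not in_speech:
            if lv >= start_thresh:
                in_speech, below = True, 0
                events.append(("start", i))
        elif lv < end_thresh:
            below += 1                       # count frames below the end threshold
            if below >= hold_frames:
                in_speech = False
                events.append(("end", i))
        else:
            below = 0                        # level recovered; reset the end timer
    return events

# Quiet, then 200 ms of speech, then quiet: one start and one end event.
events = detect_speech([0] * 5 + [10] * 20 + [0] * 60,
                       start_thresh=9, end_thresh=6)
```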
[0063]
Microphone Selection The DSP 25 performs speaker direction detection in the two-way communication system and automatic selection of the microphone signal facing the speaker based on a so-called "star catalog method", in which selection is made in order from the highest speech level. The details of the "star catalog method" will be described later. FIG. 14 is a graph illustrating the operation of the communication device 1, and FIG. 15 is a flowchart showing the normal processing of the communication device 1.
[0064]
The communication device 1 monitors the audio signals collected by the microphones MC1 to MC6, performs speech start/end determination, determines the speaker direction, selects a microphone, and displays the result on the microphone selection result display means 30, for example the light emitting diodes LED1 to LED6. The operation will be described below mainly for the DSP 25 in the communication device 1, with reference to the flowchart of FIG. 15. Overall control of the microphone/electronic circuit housing unit 2 is performed by the microprocessor 23, but the processing of the DSP 25 will be mainly described below.
[0065]
Step 1: Monitoring of Level-Converted Signals The signals collected by the microphones MC1 to MC6 are each converted into seven types of level data in the band-pass filter block 201 and the level conversion block 202 described with reference to FIGS. 10 and 12, so the DSP 25 constantly monitors seven types of signals for each microphone pickup signal. Based on the monitoring results, the DSP 25 shifts to the speaker direction detection process 1, the speaker direction detection process 2, or the speech start/end determination process.
[0066]
Step 2: Speech start/end determination processing The DSP 25 determines the start and end of speech according to the method described in detail with reference to FIG. 13. When the DSP 25 detects a speech start, it notifies the speaker direction determination process of step 4 that a speech start has been detected. In addition, when the speech level falls below the speech end level, the speech start/end determination process of step 2 starts a timer for the speech end determination time (for example, 0.5 seconds), and when the level remains below the end level throughout the speech end determination time, it determines that the speech has ended. If the level rises above the speech end level within the speech end determination time, the process waits until it falls below the speech end level again.
[0067]
Step 3: Speaker Direction Detection Process The speaker direction detection process in the DSP 25 continuously searches for the speaker direction. The resulting data are supplied to the speaker direction determination process of step 4.
[0068]
Step 4: Speaker Direction Microphone Switching Processing When the timing determination processing for switching the speaker-direction microphone in the DSP 25 finds, based on the results of the processing in step 2 and step 3, that the speaker direction detected at that time differs from the speaker direction selected so far, it instructs the microphone signal switching process of step 5 to select the microphone of the new speaker direction. However, if the chairperson's microphone has been set from the operation unit 15 and the chairperson and other conference participants speak simultaneously, priority is given to the chairperson's speech. At this time, the selected microphone information is displayed on the microphone selection result display means 30, for example the light emitting diodes LED1 to LED6.
[0069]
Step 5: Transmission of Microphone Pickup Signal In the microphone signal switching process, only the microphone signal selected by the processing of step 4 out of the six microphone signals is used as the transmission signal, and the communication apparatus 1 sends it to the communication apparatus of the other party via the line unit of the telephone line 920 illustrated in FIG.
[0070]
Setting of speech start threshold and speech end threshold Process 1: Immediately after power-on, the floor noise of each microphone is measured for a predetermined time, for example one second. The DSP 25 reads out the peak-held level value of the sound pressure level detection unit at constant time intervals, in this embodiment every 10 mSec, and takes the average of the values over a predetermined time, for example one minute, as the floor noise. Based on the measured floor noise level, the DSP 25 determines the speech start detection level threshold (floor noise +9 dB) and the speech end detection level threshold (floor noise +6 dB). The DSP 25 subsequently continues to read out the peak-held level value of the sound pressure level detector at constant time intervals. When it determines that speech has ended, the DSP 25 again measures the floor noise and updates the thresholds of the speech start and speech end detection levels.
[0071]
According to this method, since the threshold setting differs according to the floor noise level at each microphone position, a threshold can be set for each microphone, and erroneous microphone selection caused by a noise source can be prevented.
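The per-microphone threshold derivation can be sketched as follows. This is an illustrative Python sketch; linear amplitude levels are assumed, and the sample values are hypothetical.

```python
def thresholds_from_floor_noise(level_samples):
    """Derive thresholds from measured floor noise, as in process 1: the
    speech start detection level is floor noise +9 dB and the speech end
    detection level is floor noise +6 dB. The floor noise is the average
    of the peak-held level readings over the measurement period."""
    floor = sum(level_samples) / len(level_samples)
    start = floor * 10 ** (9 / 20)   # +9 dB on a linear amplitude scale
    end = floor * 10 ** (6 / 20)     # +6 dB on a linear amplitude scale
    return floor, start, end

# Hypothetical peak-held readings sampled every 10 mSec:
floor, start, end = thresholds_from_floor_noise([0.01, 0.012, 0.008, 0.01])
```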
[0072]
Process 2: Handling rooms with large ambient noise (large floor noise) When the floor noise is large and the automatic threshold updating of process 1 makes it difficult to detect the start and end of speech, the following measure is taken instead. The DSP 25 determines the thresholds of the speech start detection level and the speech end detection level based on a predicted floor noise level, setting the speech start threshold higher than the speech end threshold (for example, by 3 dB or more). The DSP 25 then reads out the peak-held level value of the sound pressure level detector at constant time intervals.
[0073]
According to this method, since the threshold setting has the same value for all the microphones, a speech start can be recognized at the same voice level both for a person with a noise source behind them and for a person without one.
[0074]
Speech start determination Process 1: The output levels of the sound pressure level detectors corresponding to the six microphones are compared with the speech start level threshold, and it is determined that speech has started when the threshold is exceeded. However, when the output levels of the sound pressure level detectors corresponding to all the microphones exceed the speech start threshold, the DSP 25 determines that the signal is from the reception and reproduction speaker 16 and does not determine that speech has started. This is because the distance between the reception and reproduction speaker 16 and all the microphones MC1 to MC6 is the same, so the sound from the reception and reproduction speaker 16 reaches all the microphones MC1 to MC6 almost equally.
[0075]
Process 2: The three pairs of unidirectional microphones whose directivity axes are shifted by 180 degrees in opposite directions in the equiangular (60 degree), radial, equally spaced arrangement of the six microphones illustrated in FIG. (microphones MC1 and MC4, microphones MC2 and MC5, and microphones MC3 and MC6) are used to exploit the level differences of the microphone signals. That is, the following operations are performed.
[0076]
Absolute value of (signal level of microphone 1 - signal level of microphone 4) ... [1]
Absolute value of (signal level of microphone 2 - signal level of microphone 5) ... [2]
Absolute value of (signal level of microphone 3 - signal level of microphone 6) ... [3]
[0077]
The DSP 25 compares the absolute values [1], [2], and [3] with the speech start level threshold, and determines that speech has started when the threshold is exceeded. With this processing, unlike process 1, the absolute values never all become larger than the speech start threshold for sound from the reception and reproduction speaker 16 (since that sound reaches all the microphones equally), so it is not necessary to determine separately whether the sound comes from the reception and reproduction speaker 16 or from a speaker.
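Expressions [1] to [3] and the start decision can be sketched as follows (illustrative Python; all names are hypothetical):

```python
def pair_differences(levels):
    """Absolute level differences of the three opposing microphone pairs,
    expressions [1]-[3]: (MC1, MC4), (MC2, MC5), (MC3, MC6).
    `levels` lists the six microphone signal levels in order MC1..MC6."""
    return [abs(levels[0] - levels[3]),
            abs(levels[1] - levels[4]),
            abs(levels[2] - levels[5])]

def speech_started(levels, start_threshold):
    """Speech start if any opposing-pair difference exceeds the threshold.
    Sound from the reception/playback speaker reaches all microphones
    almost equally, so its pair differences stay near zero and never
    trigger a start."""
    return any(d > start_threshold for d in pair_differences(levels))
```

For example, a talker near MC1 raises only that microphone's level, producing a large pair difference, while the received sound played from the speaker produces nearly equal levels at all six microphones.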
[0078]
Speaker Direction Detection Process The characteristic of the unidirectional microphone illustrated in FIG. 6 is used to detect the speaker direction. In a unidirectional microphone, the frequency characteristic and the level characteristic change as illustrated in FIG. 6 according to the arrival angle of the voice from the speaker at the microphone. The results are illustrated in FIGS. 7(A) to 7(C), which show the results of fast Fourier transform (FFT), at constant time intervals, of the voice of a speaker located at a predetermined distance from the communication device 1, for example 1.5 meters, as collected by each microphone. The X axis represents frequency, the Y axis represents signal level, and the Z axis represents time. The horizontal lines represent the cutoff frequencies of the band-pass filters, and the levels of the frequency bands sandwiched between these lines are the data converted to sound pressure levels through the five-band band-pass filters by the microphone signal level conversion processing described with reference to FIGS. 10 and 12.
[0079]
A determination method applied as the actual process for detecting the direction of the speaker in the speech apparatus 1 as the first embodiment of the present invention will now be described. An appropriate weighting is applied to the output level of each band-pass filter in 1 dB full-scale (1 dBFs) steps (for example, 0 points at 0 dBFs and 3 points at -3 dBFs, or vice versa). The weighting step determines the resolution of the process. This weighting is performed every sample clock, the weighted scores of each microphone are added, the average value is calculated over a certain number of samples, and the microphone whose total score is smallest (or largest) is determined to be the microphone facing the speaker. An image of the result is shown in Table 2 below.
[0080]
[0081]
In the example illustrated in Table 2, since the first microphone MC1 has the smallest total score, the DSP 25 determines that there is a sound source (a speaker) in the direction of the first microphone MC1. The DSP 25 holds the result in the form of a sound source direction microphone number. As described above, the DSP 25 weights the output level of the band-pass filter of each frequency band for each microphone, ranks the microphone signals in order of small (or large) score of each band-pass filter output, and determines that the microphone signal ranked first in three or more bands is the microphone facing the speaker. Then, assuming that there is a sound source (a speaker) in the direction of the first microphone MC1, the DSP 25 creates a score sheet as shown in Table 3 below. This is called a star list.
[0082]
[0083]
In practice, because of sound reflections and standing waves due to the characteristics of the room, the first microphone MC1 does not necessarily rank first at every band-pass filter output; however, if it ranks first in a majority of the five bands, it can be determined that there is a sound source (a speaker) in the direction of the first microphone MC1. The DSP 25 holds the result in the form of a sound source direction microphone number.
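The weighting and majority determination can be sketched as follows. This is an illustrative Python model of the "star catalog method"; the level values and the exact weighting step are assumptions.

```python
def star_catalog_select(band_levels_db):
    """band_levels_db[m][b] is the band-pass output level of microphone m
    in band b, in dBFs. Each level is weighted (0 points at 0 dBFs,
    3 points at -3 dBFs, i.e. more points for a weaker signal); the
    microphone with the smallest total is taken as facing the speaker.
    A per-band first-place count gives the majority check."""
    scores = [sum(-db for db in mic) for mic in band_levels_db]
    best_total = scores.index(min(scores))
    firsts = [0] * len(band_levels_db)
    for b in range(len(band_levels_db[0])):
        winner = max(range(len(band_levels_db)),
                     key=lambda m: band_levels_db[m][b])  # loudest in band b
        firsts[winner] += 1
    majority = firsts.index(max(firsts))
    return best_total, majority, scores, firsts

levels = [
    [-3, -4, -2, -3, -5],      # MC1: loudest overall (speaker direction)
    [-10, -9, -12, -11, -8],   # MC2
    [-15, -14, -16, -13, -12], # MC3
    [-20, -18, -19, -21, -17], # MC4
    [-16, -15, -14, -18, -19], # MC5
    [-9, -11, -10, -12, -13],  # MC6
]
best, majority, scores, firsts = star_catalog_select(levels)
```

Here MC1 both has the smallest weighted total and ranks first in all five bands, so both criteria agree on the sound source direction microphone number.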
[0084]
Alternatively, the DSP 25 sums the output level data of the band-pass filters of each microphone in the following form, determines the microphone signal with the highest level to be the microphone facing the speaker, and holds the result in the form of a sound source direction microphone number.
[0085]
MIC1 Level = L1-1 + L1-2 + L1-3 + L1-4 + L1-5
MIC2 Level = L2-1 + L2-2 + L2-3 + L2-4 + L2-5
MIC3 Level = L3-1 + L3-2 + L3-3 + L3-4 + L3-5
MIC4 Level = L4-1 + L4-2 + L4-3 + L4-4 + L4-5
MIC5 Level = L5-1 + L5-2 + L5-3 + L5-4 + L5-5
MIC6 Level = L6-1 + L6-2 + L6-3 + L6-4 + L6-5
[0086]
Speaker Direction Microphone Switching Timing Determination Process This process is activated by the speech start determination result of step 2 of FIG. 15. When the microphone of a new speaker is detected from the result of the speaker direction detection process of step 3 and the past selection information, the DSP 25 issues a microphone signal switching command to the microphone signal selection switching process of step 5, and notifies the microphone selection result display means 30 (light emitting diodes LED1 to LED6) that the speaker's microphone has been switched, informing the talker that the calling device 1 has responded to his or her speech.
[0087]
In a room with strong echo, in order to remove the influence of reflected sound and standing waves, the DSP 25 suppresses the effect of a new microphone selection command until the speech end determination time (for example, 0.5 seconds) has elapsed after switching the microphone. Based on the microphone signal level conversion processing result of step 1 of FIG. 15 and the speaker direction detection processing result of step 3, two microphone selection switching timings are provided in this embodiment.
[0088]
First method: when the start of speech can be clearly determined. This applies when the speech from the currently selected microphone direction has ended and new speech arrives from another direction.
In this case, the DSP 25 judges that speech has ended when the speech end determination time (for example, 0.5 seconds) has elapsed after all microphone signal levels (1) and microphone signal levels (2) have fallen below the speech end threshold level. It judges that speech has started when any microphone signal level (1) becomes equal to or higher than the speech start threshold level, determines from the information of the sound source direction microphone number the microphone that faces the speaker direction and is appropriate for sound pickup, and starts the microphone signal selection switching process of step 5.
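The start/end logic of the first method can be sketched as the small state machine below. It tracks a single level sequence for brevity (the device checks all microphone signal levels (1) and (2)); the 10 ms frame period behind the 50-frame hold count is an assumed value.

```python
class SpeechGate:
    """Detects speech start and end from a stream of level samples."""

    def __init__(self, start_threshold, end_threshold, end_hold_frames=50):
        # 50 frames x 10 ms = the 0.5 s speech end determination time
        self.start_threshold = start_threshold
        self.end_threshold = end_threshold
        self.end_hold_frames = end_hold_frames
        self.active = False       # speech currently in progress?
        self.below_count = 0      # consecutive frames below end threshold

    def update(self, level):
        """Feed one level sample; return 'start', 'end', or None."""
        if not self.active:
            if level >= self.start_threshold:
                self.active = True
                self.below_count = 0
                return 'start'
            return None
        if level < self.end_threshold:
            self.below_count += 1
            if self.below_count >= self.end_hold_frames:
                self.active = False
                return 'end'
        else:
            self.below_count = 0
        return None
```

For example, with `SpeechGate(start_threshold=10, end_threshold=5)`, a loud frame yields 'start', and 'end' is reported only after the level has stayed below the end threshold for the full hold time.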
[0089]
Second method: when a louder voice newly arrives from another direction while speech continues. In this case, the DSP 25 starts the judgment process after the speech end determination time (for example, 0.5 seconds) or more has elapsed since the start of speech (the moment the microphone signal level (1) exceeded the threshold level).
If the sound source direction microphone number obtained from the processing of step 3 changes and is judged to be stable before speech end is detected, the DSP 25 determines that a speaker louder than the one currently selected is present, determines the microphone of the new sound source direction to be the valid sound pickup microphone, and activates the microphone signal selection switching process of step 5.
[0090]
Selection Switching Processing of the Microphone Signal Facing the Detected Speaker The DSP 25 is activated by a command selected and determined by the switching timing determination processing of the speaker direction microphone in step 4 of FIG. The microphone signal selection switching process of the DSP 25 is composed of a 6-circuit multiplier and a 6-input adder, as illustrated in FIG. To select a microphone signal, the DSP 25 sets the channel gain (CH Gain) of the multiplier to which the microphone signal to be selected is connected to "1" and the CH Gain of the other multipliers to "0". The adder then sums the results of (selected microphone signal × 1) and (unselected microphone signal × 0), yielding the desired microphone selection signal.
[0091]
As described above, if the channel gain is switched abruptly to "1" or "0", a click may be generated by the level difference between the switched microphone signals. Therefore, in the communication device 1, as illustrated in FIG. 17, the channel gains are changed from "1" to "0" and from "0" to "1" continuously over a switching transition time of, for example, 10 ms, so that the two signals cross-fade; this avoids click noise caused by an abrupt level change of the microphone signal.
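The cross-fade can be sketched as follows: the gain of the old channel ramps from 1 to 0 while the new channel ramps from 0 to 1, and the 6-input adder sums the multiplier outputs. The 10-step ramp (1 ms per step for the 10 ms transition) is an assumed discretization.

```python
def crossfade_gains(old_ch, new_ch, num_mics=6, steps=10):
    """Yield one channel-gain vector per step; with 1 ms steps the
    10 steps span the 10 ms switching transition time."""
    for k in range(1, steps + 1):
        a = k / steps
        gains = [0.0] * num_mics
        gains[old_ch] = 1.0 - a   # fade the old channel out
        gains[new_ch] = a         # fade the new channel in
        yield gains

def mix(mic_samples, gains):
    """The 6-circuit multiplier and 6-input adder: sum of gain x signal."""
    return sum(g * s for g, s in zip(gains, mic_samples))
```

At every step the two active gains sum to 1, so the mixed output level cannot jump, which is what suppresses the click.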
[0092]
The echo cancellation processing operation in the subsequent-stage DSP can also be adjusted by setting the maximum channel gain to an arbitrary value between 0 and 1 other than "1", for example "0.5".
[0093]
As described above, the communication device of the first embodiment of the present invention is resistant to noise and can be applied effectively to communication uses such as conferences.
Of course, the communication device of the present invention is not limited to conferences and can be applied to various other uses. The communication device of the first embodiment is also suitable for measuring the voltage level in each pass band when the group delay characteristics of the pass bands need not be emphasized. Therefore, it can be applied, for example, to a simple spectrum analyzer, a level meter that performs fast Fourier transform (FFT) processing, a level detection processing apparatus for checking the results of equalizer processing such as that of a graphic equalizer, and level meters in car stereos, radio cassette players, and the like.
[0094]
The communication device of the first embodiment of the present invention has the following advantages in terms of structure. (1) The positional relationship between the plurality of unidirectional microphones and the reception and reproduction speaker is fixed, and the distance from the reception and reproduction speaker is very short, so the sound arriving at the microphones directly from the reception and reproduction speaker is overwhelmingly dominant over the sound returning through the conference room (room) environment. Therefore, the characteristics with which the sound reaches the microphones from the reception and reproduction speaker (signal level (intensity), frequency characteristics (f characteristic), and phase) are always the same. That is, the communication device has the advantage that the transfer function is always the same.
[0095]
(2) Therefore, the transfer function does not change when switching microphones, and there is the advantage that the gain of the microphone system need not be readjusted each time a microphone is switched. In other words, the adjustment made once at the time of manufacturing the communication device need not be redone.
[0096]
(3) For the same reason as above, even if the microphones are switched, only one echo canceller needs to be configured with a digital signal processor (DSP). This matters because DSPs are expensive, and the space for a DSP on a small printed circuit board already mounted with various components is limited.
[0097]
(4) Since the transfer function between the reception and reproduction speaker and the plurality of microphones is constant, there is the advantage that the ±3 dB sensitivity difference of the microphones themselves can be adjusted by the unit alone.
[0098]
(4) The table on which the talking device is placed is usually a round table, but with the single reception and reproduction speaker in the talking device, a speaker system became possible that distributes voice of equal quality evenly in all directions.
[0099]
(5) The sound from the reception and reproduction speaker travels along the table surface (boundary effect) and reaches the conference participants effectively and evenly, while the sound radiated toward the ceiling of the conference room is attenuated by phase cancellation with the sound from the opposite side; the reflected sound returning from the ceiling to the conference participants is therefore small, and as a result clear sound is distributed to the participants.
[0100]
(6) Since the sound emitted from the reception and reproduction speaker reaches all of the plurality of microphones simultaneously and at the same volume, it is easy to determine whether a sound is the speaker's voice or the received voice.
As a result, erroneous determinations in the microphone selection process are reduced.
[0101]
(7) By arranging an even number of microphones at equal intervals, the level comparison for detecting the direction of the speaker can be performed easily.
[0102]
(8) Vibration and shock caused by the sound of the reception and reproduction speaker, which can be transmitted through the printed circuit board on which the microphones are mounted, can be reduced by dampers using buffer material, flexible or resilient microphone support members, and the like.
[0103]
(9) The sound of the reception and reproduction speaker does not directly enter the microphone.
Therefore, in this two-way communication device, the influence of noise from the reception and
reproduction speaker is small.
[0104]
The speech apparatus according to the first embodiment of the present invention has the
following advantages in terms of signal processing.
(A) A plurality of unidirectional microphones are arranged radially at equal intervals to enable detection of the sound source direction, and microphone signals are switched so that clear sound with good S/N is picked up and sent to the other party.
(B) Voices from surrounding speakers can be picked up with high S/N, and the detection signal of the microphone facing the speaker is selected automatically.
(C) As the microphone selection processing method, signal analysis is simplified by dividing the passing audio frequency band and comparing the levels of the divided frequency bands. (D) The microphone signal switching process is realized as DSP signal processing; a plurality of signals are cross-faded so that no abrupt change occurs when switching microphone signals, and no click sound is produced. (E) The microphone selection result can be output to the microphone selection result display means, such as light emitting diodes, or to the outside. Therefore, it can also be used, for example, as speaker position information for a television camera.
[0105]
Second Embodiment As a second embodiment of the speech apparatus according to the present invention, a technique for automatically adjusting the sensitivity difference of the microphones will be described.
[0106]
As a method of adjusting the gain of the microphone amplifier, adjusting the gain of an analog microphone amplifier to absorb the sensitivity difference between microphones is generally assumed, but such a method tends to be affected by reflection and absorption of sound by the adjuster and other factors.
That is, the adjusted level tends to differ between when the adjuster is close to the microphone during adjustment and when the adjuster is away from it. In addition, such a method requires troublesome work such as connecting and disconnecting the output signal of the microphone amplifier to and from a measuring device. In the second embodiment of the present invention, in order to overcome these problems, the sensitivity difference of the microphones is adjusted automatically by the method described below.
[0107]
Adjustment of the sensitivity difference of the microphones in the second embodiment of the present invention is based on the following concept. (1) In the call device 1 according to the embodiment of the present invention, for example as shown in FIG., a reference signal supplied as line-in can be input to the DSP 26 and the DSP 25 via the A/D converter 274, so the sensitivity difference of the microphones can be adjusted without providing a special measuring device. (2) The error range of the sensitivity difference can be set freely by the program of the DSP 25. (3) By performing automatic adjustment, a nonstandard microphone can be identified and a connection failure can be detected. Similarly, a defect or the like in the amplification unit that amplifies a microphone signal is also detected.
[0108]
As a precondition, in the second embodiment, as illustrated in FIG. 4, an even number of microphones, for example six, are arranged radially at equal angles and at equal distances from the reception and reproduction speaker 16. As for the positional relationship between the microphones MC1 to MC6 and the reception and reproduction speaker 16, the reception and reproduction speaker 16 may be disposed below the microphones MC1 to MC6 as illustrated in FIG. 3, or, as illustrated in FIG., above the microphones MC1 to MC6.
[0109]
Apparatus Configuration The apparatus configuration of the second embodiment is basically as illustrated in FIG. 5; the details are the configurations illustrated in FIGS. In FIG. 18, variable gain type amplifiers 301 to 306 for performing gain adjustment are actually disposed between the microphones MC1 to MC6 and the A/D converters 271 to 273 of FIG. 5. Alternatively, the A/D converters 271 to 274 of FIG. 5 may be A/D converters 271 to 274 with built-in variable gain type amplifiers 301 to 306. The DSP 25 performs the various processes described above; as the portions that adjust the sensitivity difference via the amplifiers 301 to 306, it has first to sixth variable attenuators (ATTs) 2511 to 2516, first to sixth level detectors 2521 to 2526, a level determination / gain control unit 253, and a test signal generation unit 254. The DSP 26 has an echo cancellation transmission processing unit 261 and an echo cancellation reception unit 262.
[0110]
The variable gain type amplifiers 301 to 306 are amplifiers whose gain can be changed, and the gain adjustment is performed by the level determination / gain control unit 253. However, when the variable gain type amplifiers 301 to 306 are built into the A/D converters 271 to 273, gain adjustment cannot always be performed freely; besides cases where gain adjustment is freely possible, there are also restrictions such as the control width of the variable gain amplifiers 301 to 306. In this embodiment, processing is performed according to the situation of the variable gain amplifiers 301 to 306.
[0111]
The variable attenuators 2511 to 2516 are attenuators whose attenuation amount can be changed; the level determination / gain control unit 253 controls the attenuation amount by outputting attenuation coefficients of 0.0 to 1.0. Note that since the variable attenuators 2511 to 2516 are realized as processing within the DSP 25, in reality the level determination / gain control unit 253 in the same DSP 25 controls (adjusts) the attenuation values of the variable attenuator portions 2511 to 2516.
[0112]
Each of the level detection units 2521 to 2526 includes a band pass filter 252a, an absolute value calculation unit 252b, and a peak level detection and holding unit 252c, and is basically the same as the configuration illustrated in FIG.
[0113]
FIG. 19 redraws the apparatus configuration of FIG. 18 according to the operation mode of the present embodiment and illustrates the signal attenuation amounts.
When a test sound is emitted from the sound level meter or the reception/playback speaker 16 in a room of a certain size (conference room), an approximately equal signal arrives at each of the microphones MC1 to MC6, which are arranged at an equal distance d from the sound level meter or the reception/playback speaker 16, unless a reflector or sound absorber is present. The test sound from the sound level meter or the reception/playback speaker 16 picked up by the microphones MC1 to MC6 is amplified by the variable gain type amplifiers 301 to 306, converted to digital signals by the A/D converters 271 to 273, and attenuated in the variable attenuators 2511 to 2516. In the level detection units 2521 to 2526, the band pass filter 252a passes frequency components in a predetermined band, the absolute value calculation unit 252b performs the calculation shown in Table 6, and the peak level detection and holding unit 252c detects and holds the maximum value. The level determination / gain control unit 253 adjusts the attenuation amounts (attenuation coefficients) of the variable attenuators 2511 to 2516 to adjust the sensitivity differences of the microphones MC1 to MC6.
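The adjustment loop can be sketched as follows: the peak level held for each channel while the test noise plays is used to compute an attenuation coefficient (0.0 to 1.0) that brings every channel down to the lowest one. Aligning to the minimum channel is an assumption here; it keeps every coefficient within the attenuator's 0.0-1.0 range.

```python
import math

def adjust_attenuation(peak_levels):
    """peak_levels: linear peak amplitudes held by the peak level
    detection and holding units, one per microphone channel.
    Returns attenuation coefficients that equalize the channels."""
    target = min(peak_levels)
    return [target / p for p in peak_levels]

def spread_db(peak_levels, coeffs):
    """Residual level spread in dB after applying the coefficients,
    for comparison against the 0.5 dB design value."""
    adjusted = [c * p for c, p in zip(coeffs, peak_levels)]
    return 20 * math.log10(max(adjusted) / min(adjusted))
```

For nominal ±3 dB microphones the peak levels differ by up to a factor of about two; the computed coefficients collapse that spread to essentially 0 dB in this idealized sketch, whereas the real adjustment leaves roughly 0.5 to 1.0 dB, as noted in the text.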
[0114]
Design Value of the Sensitivity Difference Adjustment Error In the second embodiment, a nominal microphone sensitivity error of, for example, ±3 dB is assumed. A sensitivity difference adjustment error of, for example, 0.5 dB or less is targeted as the design value. The actual sensitivity difference adjustment error is, for example, about 0.5 to 1.0 dB, because it changes depending on the environment in which the communication device is installed.
[0115]
The test signal generation unit 254 outputs pink noise of a reference input level, for example 20 dB pink noise (a sound pressure sufficiently large relative to ambient noise), to the line input terminal. Alternatively, as indicated by a broken line in FIG. 18, the test signal output from the test signal generation unit 254 can be re-input to the DSP 25 via the echo cancellation transmission processing unit 261.
[0116]
According to the second embodiment, for the plurality of microphones disposed at equal distances from the reception and reproduction speaker 16, the sensitivity differences of the microphones, each fixedly connected to its microphone amplifier, are adjusted automatically. The gain differences of the transmitting microphones can be adjusted automatically so that the acoustic coupling between the reception and reproduction speaker 16 and each of the sound collecting microphones MC1 to MC6 becomes equal.
[0117]
In implementing the second embodiment, only the communication device itself is needed; no special equipment is required.
Therefore, the sensitivity adjustment can be performed with the communication device installed in place.
[0118]
Third Embodiment Referring to FIGS. 20 to 22, as a third embodiment of the communication device according to the present invention, the method of identifying the speaker when a plurality of unidirectional microphones are used as opposed pairs (sets) is described in more detail. The basic idea of the speaker identification method was described in the first embodiment; the third embodiment describes a preferable manner of the method in further detail, in association with the first embodiment.
[0119]
Device Configuration As illustrated in FIG. 4, the microphones are arranged radially at equal angles and at equal distances from the speaker 16; in particular, pairs of microphones facing each other across the central axis C, such as the first microphone MC1 and the fourth microphone MC4, lie on a straight line. Since there are six microphones MC1 to MC6 in FIG. 4, they are disposed radially at 60-degree intervals, and the conference participants sit facing them.
[0120]
In the present embodiment, each of the microphones MC1 to MC6 has the directivity illustrated in FIGS. 6 and 7A to 7D. Assuming that the frequency of the signal sound from the sound source is, for example, 500 Hz, and that a sound source (a speaker's voice) lies in the direction of the first microphone MC1, the sound pressure levels collected by the unidirectional microphones MC1 to MC6, arranged radially at 60-degree intervals, take the values shown in Table 4 below when the front-direction level in FIG. 7A is normalized to 0 dB.
[0121]
[0122]
Table 4 shows the sound source direction and the normalized sound pressure levels collected by the six microphones.
Now assume that the sound pressure levels collected by the microphones MC1 to MC6 when a sound source lies in the direction of the first microphone MC1 are, for example, as follows.
[0123]
Microphone detection sound pressure:
[1] The level of microphone 1 is the highest [0 dB].
[2] The levels of microphones 2 and 6 are second [-4 dB].
[3] The levels of microphones 3 and 5 are third [-14.7 dB].
[4] The level of microphone 4 is the lowest [-15.3 dB].
[0124]
The differences between the sound pressures detected by each pair of microphones that face each other across the central axis C and lie on a straight line are, for example, as follows.
[0125]
Microphone A - Microphone B sound pressure difference:
(1) MC1 - MC4: 0 - (-15.3) = 15.3 dB [5]
(2) MC2 - MC5: -4 - (-14.7) = 10.7 dB [6]
(3) MC3 - MC6: -14.7 - (-4) = -10.7 dB [7]
[0126]
If such level states are taken to indicate that a sound source (a speaker) lies in the direction of one microphone of the pair, Table 5, for example, is obtained.
[0127]
[0128]
In the present embodiment, the direction in which the signal level pattern condition and the microphone signal level condition shown in the example of Table 4 match is determined to be the sound source direction.
This determination process is performed by the first digital signal processor (DSP1) 25. The processing content is shown in the flowchart of FIG.
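The pattern matching on the opposed-pair differences can be sketched as follows. The sign patterns for MC1, MC2, and MC3 follow the text; those for MC4 to MC6 are filled in here by rotational symmetry, which is an assumption.

```python
# Expected signs of (MC1-MC4, MC2-MC5, MC3-MC6) for a sound source in
# front of each microphone.
PAIR_SIGN_PATTERNS = {
    1: ('+', '+', '-'),
    2: ('+', '+', '+'),
    3: ('-', '+', '+'),
    4: ('-', '-', '+'),   # assumed by symmetry
    5: ('-', '-', '-'),   # assumed by symmetry
    6: ('+', '-', '-'),   # assumed by symmetry
}

def pair_signs(levels_db):
    """levels_db[i] is the detected level of microphone i+1 in dB."""
    diffs = (levels_db[0] - levels_db[3],
             levels_db[1] - levels_db[4],
             levels_db[2] - levels_db[5])
    return tuple('+' if d >= 0 else '-' for d in diffs)

# The Table 4 example: a source in front of MC1
print(pair_signs([0, -4, -14.7, -15.3, -14.7, -4]))  # -> ('+', '+', '-')
```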
[0129]
In this processing, for example in the sound pressure level detection unit illustrated in FIG. 12, for the low frequency component signal that has passed through the 100 Hz to 600 Hz band pass filter 201a, the level conversion processing unit 202b computes the difference between the level detection values of each opposed (in-line) pair of microphones, takes the absolute value of the difference, and uses the peak hold result.
Note that the detection signals of each pair of microphones disposed facing each other on a straight line are input to the A/D converters 271 to 273, and the sound pressure level detection unit performs the above processing, such as level difference calculation and absolute value calculation, on the detection signals of each such pair.
The 100 Hz to 600 Hz pass band of the band pass filter 201a is used so that the filter can be shared with other sound source direction determination processing; it is not a condition specific to identifying the sound source direction.
Therefore, the above processing can be performed using the output of a band pass filter with an arbitrary pass band characteristic.
[0130]
Preferably, prior to the sound pressure level detection, in order to increase the reliability of judging whether the sound pressure detected by a given microphone is valid, it is desirable for the DSP 25 to confirm this as illustrated in FIG.
[0131]
FIG. 21 summarizes the device configuration described above.
The configuration illustrated in FIG. 21 is of course based on that of FIG. 5; the DSP 25 portion is drawn with only the parts related to the third embodiment extracted, and the sound source direction identification processing means 255, which performs the processing illustrated in FIG., is shown explicitly.
The determination result of the sound source direction identification processing means 255 is displayed on the LEDs serving as the microphone selection result display means 30.
[0132]
The relationship between the microphones MC1 to MC6 and the A/D converters 271 to 273 is the same as in the second embodiment described above. As described there, the A/D converters 271 to 273 may incorporate the variable gain amplifiers 301 to 306, or independent variable gain amplifiers 301 to 306 may be provided between the microphones MC1 to MC6 and the A/D converters 271 to 273. Therefore, the third embodiment can be applied under the optimum condition in which the sensitivity differences described in the second embodiment have been adjusted automatically and the acoustic coupling between the microphones MC1 to MC6 and the reception and reproduction speaker 16 has been equalized.
[0133]
The sound source direction identification processing means 255 performs the following processing.

[0134]

Step S311: The sound source direction identification processing means 255 detects, according to Tables 3 and 4, the microphone that detected the maximum-level sound pressure (the first-ranked microphone), and stores its number in the "MAX" portion of the memory in the DSP 25.

[0135]

Step S312: The sound source direction identification processing means 255 then detects, according to Tables 3 and 4, the microphone that detected the second-highest-level sound pressure, and stores the number of that second microphone in the "second" portion of the memory in the DSP 25.

[0136]

Step S313: The sound source direction identification processing means 255 or the absolute value conversion processing unit 203 obtains the difference in sound pressure level detected by each pair of microphones.
That is, it obtains (MC1 - MC4), (MC2 - MC5), and (MC3 - MC6), holds the respective peak values, and stores them in "sub1", "sub2", and "sub3" of the memory.

[0137]

Steps S314 to S320: The sound source direction identification processing means 255 branches to one of steps S315 to S320 according to the content of "MAX" in the memory, that is, according to which microphone detected the maximum-level sound pressure.
[0138]
Step S315: Processing when microphone 1 is at the maximum level. The details of this processing are illustrated in FIG.
Step S331: Confirmation of the microphones adjacent to the maximum-level microphone. The sound source direction identification processing means 255 confirms that the content of "second" in the memory is the second microphone MC2 or the sixth microphone MC6.
The reason is as follows: (a) if the microphone that detected the second-highest sound pressure is the second microphone MC2, which is adjacent to the first microphone MC1, it is appropriate to determine that a sound source lies between the first microphone MC1 and the second microphone MC2; and (b) if that microphone is the sixth microphone MC6, it is appropriate to determine that a sound source lies between the first microphone MC1 and the adjacent sixth microphone MC6.
That is, in the present embodiment, the reliability of identifying the microphone positioned in the sound source direction is enhanced by also referring to the level detection state of the microphones adjacent to the maximum-level microphone. Only the single microphone that detected the second-highest level is examined because the resolution of the sound source direction is limited to the microphone front directions (every 60 degrees); directions between adjacent microphones are ignored.
[0139]
Step S332: Undeterminable processing. In any case other than the above, the sound source direction identification processing means 255 stores an undeterminable state in the "RESLT" portion of the memory.

[0140]

Step S333: Pattern confirmation of the level differences of the microphone pairs. Next, the sound source direction identification processing means 255 confirms that the contents of "sub1", "sub2", and "sub3" in the memory are "+", "+", and "-", as shown in Table 13.
Step S334: Confirmation of the match between the sound source direction and the microphone MC. If this state matches, the sound source direction identification processing means 255 determines that a sound source lies in the direction of the first microphone MC1 and stores the number of the first microphone MC1 in the "RESLT" portion of the memory. If the states of Table 4 do not match, the sound source direction identification processing means 255 jumps to step S332 and stores undeterminable information in the "RESLT" portion of the memory.

[0141]

Step S321: Display of the identification result. If the sound source direction identification processing means 255 determines in the above processing that the sound source lies in the direction of the first microphone MC1, the LED of the microphone selection result display means 30 adjacent to the first microphone MC1 is turned on, as shown in FIG., to indicate clearly that the first microphone MC1 has been identified (selected).
[0142]
Step S316: Processing when microphone 2 is at the maximum level. The sound source direction identification processing means 255 performs this processing in the same manner as for the first microphone MC1.

[0143]

Confirmation of the adjacent microphones: the sound source direction identification processing means 255 checks whether the content of the "second" portion of the memory is the third microphone MC3 or the first microphone MC1, both adjacent to the second microphone MC2.

[0144]

Undeterminable processing: in any other case, the sound source direction identification processing means 255 stores an undeterminable state in the "RESLT" portion of the memory.

[0145]

Pattern confirmation of the level differences of the microphone pairs: when the sound source direction identification processing means 255 confirms that the contents of "sub1", "sub2", and "sub3" in the memory, shown in Table 4, are "+", "+", and "+", it determines that a sound source lies in the direction of the second microphone MC2 and stores the number of the second microphone MC2 in "RESLT" of the memory.

[0146]

Undeterminable processing: if the states of Table 4 do not match, undeterminable information is stored in the "RESLT" portion of the memory.

[0147]

Display of the identification result: when the sound source direction identification processing means 255 determines in the above processing that the sound source lies in the direction of the second microphone MC2, the LED adjacent to the microphone MC2 is turned on to indicate clearly that the microphone MC2 has been identified (selected).
[0148]
Step S317: Processing when microphone 3 is at the maximum level. The sound source direction identification processing means 255 performs the same processing as for the first and second microphones.
That is, it checks the content of the "second" portion of the memory and confirms that the contents of "sub1", "sub2", and "sub3" in the memory are "-", "+", and "+" as in Table 13, in which case it determines that a sound source lies in the direction of the third microphone MC3.
In any other case, it determines that identification has failed.
When the microphone corresponding to the sound source direction can be determined (identified), the sound source direction identification processing means 255 turns on the LED corresponding to that microphone.
[0149]
Step S318: Processing when microphone 4 is at the maximum level. The sound source direction identification processing means 255 performs this processing in the same manner as for the first, second, and third microphones MC.

[0150]

Step S319: Processing when microphone 5 is at the maximum level. The sound source direction identification processing means 255 performs this processing in the same manner as for the first to fourth microphones MC.

[0151]

Step S320: Processing when microphone 6 is at the maximum level. The sound source direction identification processing means 255 performs this processing in the same manner as for the first to fifth microphones MC.
[0152]
As described above, in the third embodiment of the present invention, the sound source direction is detected by the above method, exploiting the sound pressure level differences that follow from the directivity characteristic of the unidirectional microphones.
That is, the sound source direction is detected using the rank ordering of the levels collected by the microphones, the check of the rank of the microphones adjacent to the maximum-level microphone, and the differences between the detection levels of the opposed microphone pairs.
As a result, according to the third embodiment, the sound source direction can be identified with high reliability in the communication device.
[0153]
Modification of Third Embodiment: In the above embodiment, the determination process is realized using, as the detection level of the signal, the peak level of the signal passing through the 100 Hz to 600 Hz pass band of the band-pass filter 201a, which is shared with the other sound source direction determination processing. However, as illustrated in FIG. 12, it is also possible to use the results of the level conversion processing, performed by the level conversion processing units 202b to 202g, of the pass-band signals of the plurality of band-pass filters 201a to 201f.
Of course, as described above, the DSP 25 performs such signal processing.
In that case, the processes illustrated in FIGS. 20 and 22 are performed on each of the level conversion results of the level conversion processing units 202b to 202g to obtain a first (provisional) determination per band; among the plurality of first determination results, the one obtained by the largest number of bands is taken as the final determination result by majority decision, and that result is selectively output in step S321.
According to such a method, the reliability (accuracy) of the determination result of the sound source direction is further improved.
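The per-band majority decision described here can be sketched as follows (a simplified illustration of the idea, not the DSP 25 firmware; the function name and example band results are assumptions):

```python
from collections import Counter

def majority_direction(band_decisions):
    """Combine the first (provisional) determinations of the individual
    band-pass filter bands -- one microphone number per band, or None
    when that band's determination failed -- by majority decision."""
    votes = Counter(d for d in band_decisions if d is not None)
    if not votes:
        return None  # no band produced a valid first determination
    # The microphone number chosen by the largest number of bands wins.
    return votes.most_common(1)[0][0]

# Example: six bands, four of which point at microphone 3.
final = majority_direction([3, 3, None, 3, 2, 3])
```

Here `final` becomes 3, the direction supported by most bands, which would then be output in step S321.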
[0154]
Fourth Embodiment As a fourth embodiment of the present invention, a microphone signal
generation method and a telephone set to which the microphone signal generation method is
applied will be described.
As described in the first to third embodiments, in a microphone/speaker integrated speech device in which a speaker capable of diffusing sound in all directions is placed at the center and six highly directional microphones are arranged around it horizontally at equal intervals of, for example, 60 degrees, an arbitrary directivity can be obtained from a single microphone or by combining the signals of a plurality of microphones.
Ideally, a microphone that picks up only the sound in front of it would suffice; in practice, however, the available choices are (1) a directional microphone whose sensitivity in the back direction has dropped but whose sensitivity in the left and right directions has not dropped much, or (2) a directional microphone whose sensitivity in the left and right directions has dropped but whose sensitivity in the back direction has not. If either is used in a speech apparatus, sounds other than those from the front of the microphone are also picked up. When such microphones are simply arranged horizontally, the microphone of (1) picks up considerable noise when there is a noise source beside the speaker (the sound source to be picked up), and the microphone of (2) picks up considerable noise when there is a noise source on the opposite side of the speaker. The communication device of the fourth embodiment solves this problem.
[0155]
FIG. 23 is a view illustrating the arrangement of microphones in the communication device of the fourth embodiment: six microphones are radially arranged at equal angles. The microphones of the fourth embodiment are numbered 1 to 6 clockwise, the reverse of the counterclockwise order of microphones 1 to 6 illustrated in FIG. 4. FIG. 24 is a graph showing the directivity of the microphone in the fourth embodiment, using the directivity at a frequency of 1 kHz as an example. The six microphones illustrated in FIG. 23 all have the same directivity. The directivity illustrated in FIG. 24 is similar to that of the microphones illustrated in FIGS. 6 and 7 used in the above-described embodiments, but is somewhat lower.
[0156]
The inventors of the present application variously synthesized the detection signals of a plurality of the six microphones having the directivity shown in FIG. 24. As an example, the following operation was performed on the detection signal MIC1 of the first microphone and the detection signal MIC5 of a microphone placed at a position rotated ±120 degrees from the front of the first microphone, here the fifth microphone at the -120 degree (240 degree) position.
[0157]
1.0 × MIC1 - 2/3 × MIC5 … (1)

The above calculation is easily realized in the first DSP 25, as illustrated in FIG. 16, by setting the channel gain by which the detection signal MIC1 of the first microphone is multiplied to 1.0, setting the channel gain by which the detection signal MIC5 of the fifth microphone is multiplied to -0.667 (-2/3), and setting the channel gains of the other microphone detection signals to 0.0. The signal MIC1 of the first microphone is referred to as the main microphone signal, and the signal MIC5 of the fifth microphone is referred to as the sub microphone signal. The coefficient by which the main microphone signal is multiplied is called the main coefficient, and the coefficient by which the sub microphone signal is multiplied is called the sub coefficient.
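Equation (1) is simply a per-channel gain followed by a sum, as in the FIG. 16 configuration. A minimal sketch (the function name and sample values are illustrative assumptions, not the DSP 25 implementation):

```python
def synthesize(mic_signals, gains):
    """Multiply each microphone's samples by its channel gain and sum
    across channels, frame by frame."""
    return [sum(g * s for g, s in zip(gains, frame))
            for frame in zip(*mic_signals)]

# Equation (1): MIC1 gain 1.0 (main coefficient), MIC5 gain -2/3
# (sub coefficient), all other channel gains 0.0.
gains = [1.0, 0.0, 0.0, 0.0, -2.0 / 3.0, 0.0]
mics = [[0.9, 0.6],   # MIC1 samples (hypothetical)
        [0.1, 0.1],   # MIC2
        [0.0, 0.0],   # MIC3
        [0.0, 0.0],   # MIC4
        [0.3, 0.3],   # MIC5
        [0.2, 0.1]]   # MIC6
out = synthesize(mics, gains)  # 1.0*MIC1 - 2/3*MIC5 per frame
```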
[0158]
The directivity at 1 kHz based on this calculation result is shown in FIG. 25. The directivity at 1 kHz of the single microphone illustrated in FIG. 24 and the directivity at 1 kHz of the two-microphone synthesis illustrated in FIG. 25 are as follows.
[0159]
          Directivity factor   Front/rear ratio   Distance factor
FIG. 24   0.30539              0.11212            1.80955
FIG. 25   0.31005              0.12929            1.7959
[0160]
The directivity of the single microphone illustrated in FIG. 24 is such that the sensitivity is reduced at 120 degrees, 180 degrees and 240 degrees with respect to the front, but is hardly reduced at 60 degrees and 300 degrees.
On the other hand, the directivity of the two-microphone synthesis illustrated in FIG. 25 is such that the sensitivity is reduced at 120 degrees and 300 degrees with respect to the front, but is hardly reduced at 180 degrees. That is, in the communication devices of the first to third embodiments, in which the user selects an arbitrary microphone or the device automatically switches to the microphone with the largest input volume, by further combining the detection signals of a plurality of microphones, for example two microphones, and thereby shaping the directivity of the microphone arbitrarily, the sensitivity in the direction of a noise source, such as a projector whose cooling fan noise would disturb the conference, can be kept low no matter which microphone is switched to. As a result, clear voice can be collected without picking up noise from a particular direction. Thus, when the user of the communication device of the fourth embodiment sets the microphone in the noise direction to non-selected, a directivity can be created in which the sensitivity in that non-selected direction remains low whichever selectable microphone is selected, so that the noise is not picked up.
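The reshaping of the pattern can be reproduced with an idealized cardioid model (an assumption; the real microphones of FIG. 24 only approximate a cardioid): sensitivity (1 + cos θ)/2 for a microphone aimed at 0 degrees, with 2/3 of a second cardioid aimed at 240 degrees subtracted as in equation (1).

```python
import math

def cardioid(theta_deg, axis_deg):
    """Idealized unidirectional (cardioid) sensitivity pattern."""
    return (1.0 + math.cos(math.radians(theta_deg - axis_deg))) / 2.0

def combined(theta_deg):
    """Main cardioid at 0 degrees minus 2/3 of a cardioid at 240 degrees
    (the -120 degree position), as in equation (1)."""
    return cardioid(theta_deg, 0.0) - (2.0 / 3.0) * cardioid(theta_deg, 240.0)

# Compare single-microphone and combined sensitivity magnitudes.
for angle in (0.0, 120.0, 180.0, 300.0):
    print(angle, round(cardioid(angle, 0.0), 3), round(abs(combined(angle)), 3))
```

In this model the combined magnitude at 120 and 300 degrees is one third of the single cardioid's, while the front stays near full sensitivity, mirroring the behaviour described for FIG. 25.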
[0161]
The telephone set of the fourth embodiment has the same configuration as that illustrated in FIGS. 2 to 4, except for the points illustrated in FIGS. 23 and 24, and has the signal processing system illustrated in FIG. Therefore, except for the microphone selection method in the first DSP 25 described below, the communication device of the fourth embodiment has the same functions as the communication devices of the first to third embodiments described above.
[0162]
The outline of the microphone selection process of the fourth embodiment performed in the first DSP 25 is described below.
Prerequisites:
1. Microphones having the directivity illustrated in FIG. 24, for example, are disposed at six equal angles with the voice detection portion facing outward, as illustrated in FIG. 23 or FIG. 4, and a telephone set having such microphones is used. The communication devices of the first to third embodiments described above satisfy this condition.
2. In such a speech device, the user can select a microphone, or the microphone can be substantially selected by setting the channel gains in the configuration illustrated in FIG. 16. The communication devices of the first to third embodiments also satisfy this condition.
3. A speaker capable of outputting the selected sound is provided. This condition is also satisfied in the first to third embodiments; for example, the speaker 16 illustrated in FIG. 3 corresponds to it.
4. In the fourth embodiment, a function is added that makes it possible to designate the microphone in any one direction as non-selected, for example because a noise source is close to it.
[0163]
Under the above prerequisites, when the user selects or changes the microphone while using the speech device and a non-selected microphone has been set, the first DSP 25 performs the following determination and processing procedure.
[0164]
(1) If the microphone to be selected is the non-selected microphone, the selection is not changed.
(2) The angle between the microphone to be selected and the microphone set to non-selected is obtained; in the example illustrated in FIG. 23, it is calculated at what clockwise position the non-selected microphone sits as seen from the microphone to be selected.
(3) When the angle is, for example, 60 degrees, that is, when the non-selected microphone is the first one clockwise from the microphone to be selected, the detection signal of the sub microphone (= the microphone 120 degrees away from the microphone to be selected) is multiplied by a sub coefficient, for example 2/3, and subtracted from the signal of the main microphone (= the microphone to be selected) (or multiplied by -2/3 and added to the main microphone signal); the result is the output signal of the synthetic microphone.
(4) Similarly, when the angle is, for example, 300 degrees, that is, when the non-selected microphone is the fifth one clockwise from the microphone to be selected, the detection signal of the sub microphone (= the microphone 240 degrees away from the microphone to be selected) is multiplied by a sub coefficient, for example 2/3, and subtracted from the main microphone signal (or multiplied by -2/3 and added) to obtain the output signal of the synthetic microphone.
(5) When the angle is other than the above, the signal of the single microphone to be selected is output.
(6) By performing the above processing, whichever microphone is selected, the influence of the noise source can be kept small by generating the synthetic microphone signal, and clear speech can be obtained, output from the speaker 16, and sent to the other party's communication device.
[0165]
FIG. 26 is a flowchart showing the processing of the first DSP 25.
[0166]
Step 101: Initial Condition Setting The first DSP 25 initializes the selected microphone number =
1, the non-selected microphone number = 0, and the sub microphone number = 0.
[0167]
Step 102: Status display: The first DSP 25 displays the selected microphone; specifically, the LED illustrated in FIG. 4 corresponding to the selected microphone is turned on.
In addition, the first DSP 25 preferably displays the non-selected microphone. Since the communication device illustrated in FIG. 4 has no LED indicating a non-selected microphone, it is necessary either to add a corresponding LED or to substitute by leaving the LED of FIG. 4 unlit for the non-selected microphone.
[0168]
Step 103: Calculation processing: The first DSP 25 performs the following calculation, for example by the method illustrated in FIG. 16:

Voice output = Mc × voice signal of selected microphone + SC × voice signal of sub microphone

Note that Mc is the coefficient corresponding to the channel gain of the selected (main) microphone; in the above example, and in general, Mc = 1.0. SC is the coefficient corresponding to the channel gain of the sub microphone; in the above example, SC = -2/3.
[0169]
Step 104: Determination of presence or absence of a non-selected microphone instruction: If there is an instruction for a non-selected microphone, the process proceeds to step 105; if there is none, the process proceeds to step 106.
[0170]
Step 105: Update of Non-Selected Microphone: The first DSP 25 advances the non-selected microphone number by one. When the result exceeds the number of existing microphones (for example, six), the number is reset to 0.
[0171]
Step 106: Determination of Change of Selected Microphone: The first DSP 25 checks whether there is an instruction to change the selected microphone; if not, it returns to the processing of step 102. Step 107: Update of Selected Microphone: When instructed to change the selected microphone, the first DSP 25 advances the selected microphone number by one. When the result exceeds the number of existing microphones (for example, six), the number is reset to 0.
[0172]
Step 108: If the non-selected microphone number matches the selected microphone number, the first DSP 25 increments the selected microphone number by one. When the result exceeds the number of existing microphones (for example, six), the number is reset to 0.
[0173]
Step 109: Check of non-selected microphone number: If the non-selected microphone number is within 1 to 6, the first DSP 25 proceeds to the processing of step 110; otherwise, the process proceeds to step 114.
[0174]
Step 110: Calculation of Microphone Spacing: The first DSP 25 calculates the microphone spacing as follows:

Microphone spacing = selected microphone number (arrangement angle) - non-selected microphone number (arrangement angle)

If the calculated microphone spacing is negative, the value increased by six (the number of microphones) is used.
[0175]
Step 111: Check of Microphone Spacing: The first DSP 25 performs the following processing according to the microphone spacing.
[0176]
Step 112: If the microphone spacing is 1, the first DSP 25 calculates (sub microphone number = selected microphone number - 1). If the resulting sub microphone number is less than 1 (that is, 0), 6 is added to it.
Step 113: If the microphone spacing is 5, the first DSP 25 calculates (sub microphone number = selected microphone number - 5). If the resulting sub microphone number is less than 1 (that is, 0 or negative), 6 is added to it.
Step 114: When the microphone spacing is 2, 3 or 4, the first DSP 25 sets (sub microphone number = 0), indicating that there is no sub microphone.
[0177]
The above illustrates the case where six microphones are arranged at equal intervals (equal angles) of 60 degrees; when the number of microphones is four, eight, and so on, the processing described above is applied correspondingly.
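Steps 109 to 114 can be collected into one function for the six-microphone case (a sketch of the flowchart logic only; the function and argument names are mine):

```python
def choose_sub_mic(selected, non_selected, n_mics=6):
    """Return the sub microphone number (1..n_mics) to combine with the
    selected microphone, or 0 when a single microphone is used.
    Microphones are numbered 1..n_mics clockwise at equal angles."""
    if not 1 <= non_selected <= n_mics:
        return 0                          # steps 109/114: no non-selection set
    spacing = selected - non_selected     # step 110
    if spacing < 0:
        spacing += n_mics                 # wrap around the circle
    if spacing == 1:                      # step 112
        sub = selected - 1
    elif spacing == n_mics - 1:           # step 113 (spacing 5 for six mics)
        sub = selected - (n_mics - 1)
    else:                                 # step 114: spacing 2, 3, 4
        return 0
    if sub < 1:
        sub += n_mics
    return sub
```

The same shape covers four or eight equally spaced microphones by changing `n_mics`.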
[0178]
According to the communication device of the fourth embodiment, in addition to the uses of the communication devices of the first to third embodiments, a communication device can be realized that, when used as a microphone unit for collecting speech in a conference room, is not easily affected by noise even if a noise source such as a projector is present in its vicinity.
In addition, by using the speech apparatus as the microphone unit of a television conference system and reducing its sensitivity in the direction of the speaker of the television receiver, the margin against howling and in the echo canceller processing portion of the speech apparatus becomes large, with the effect of improving stability.
[0179]
The main coefficient (channel gain) and the sub coefficient can be defined according to the directivity of the microphones, the position and volume of the noise source, and the like, or according to the mutual relationship of the microphones to be combined.
[0180]
In the fourth embodiment described above, an example was described in which, as in the communication device of the first embodiment, the selected single microphone is used as the main microphone and one sub microphone is selected. Depending on the noise source situation, the directivity of the single microphone, and so on, a second sub microphone can also be selected to perform, for example, the following calculation.
[0181]
Audio output = Mc × voice signal of selected microphone + SC1 × voice signal of first sub microphone + SC2 × voice signal of second sub microphone

Note that Mc is a coefficient corresponding to the channel gain of the main microphone, SC1 is a coefficient corresponding to the channel gain of the first sub microphone, and SC2 is a coefficient corresponding to the channel gain of the second sub microphone.
[0182]
The first to fourth embodiments can be appropriately combined in the implementation of the
communication device of the present invention.
[0183]
FIG. 1A is a view showing an outline of a conference system as one example to which the telephone apparatus according to the present invention is applied, FIG. 1B is a view showing the state in which the telephone apparatus of FIG. 1A is placed, and FIG. 1C is a diagram showing the arrangement of the telephone set placed on the table and the conference participants.
FIG. 2 is a perspective view of the communication device according to the embodiment of the
present invention.
FIG. 3 is an internal sectional view of the communication device illustrated in FIG.
FIG. 4 is a plan view of the microphone / electronic circuit housing with the top cover of the
communication device illustrated in FIG. 1 removed.
FIG. 5 is a diagram showing the connection of the main circuits of the microphone / electronic
circuit housing, and shows the connection of the first digital signal processor (DSP1) and the
second digital signal processor (DSP2). FIG. 6 is a characteristic diagram of the microphone
illustrated in FIG. FIGS. 7A to 7D are graphs showing the results of analyzing the directivity of the
microphone having the characteristics illustrated in FIG. FIG. 8 is a partial configuration diagram
of a modification of the speech apparatus of the present invention. FIG. 9 is a graph showing an
outline of the overall processing content in the first digital signal processor (DSP1). FIG. 10 is a
view showing a filtering process in the speech apparatus of the present invention. FIG. 11 is a
frequency characteristic diagram showing the processing result of FIG. FIG. 12 is a block diagram
showing the band pass filtering process and the level conversion process of the present
invention. FIG. 13 is a graph showing a process of determining the speech start and end in the
speech apparatus of the present invention. FIG. 14 is a graph showing a flow of normal
processing in the speech apparatus of the present invention. FIG. 15 is a flow chart showing the
flow of the normal processing in the speech apparatus of the present invention. FIG. 16 is a block
diagram illustrating microphone switching processing in the speech apparatus of the present
invention. FIG. 17 is a block diagram illustrating a method of microphone switching processing in
the communication device of the present invention. FIG. 18 is a block diagram illustrating a
partial configuration of the speech apparatus according to the second embodiment of the present
invention. FIG. 19 is a block diagram illustrating a partial configuration of the speech apparatus
according to the second embodiment of the present invention. FIG. 20 is a flowchart showing the
processing method of the third embodiment of the present invention. FIG. 21 is a block diagram
of an apparatus according to a third embodiment of the present invention. FIG. 22 is a flowchart
showing details of a part of FIG. FIG. 23 is a view showing the arrangement of microphones in
the speech apparatus according to the fourth embodiment of the present invention. FIG. 24 is a
diagram showing an example of the directivity of the microphone of FIG. FIG. 25 is a diagram
showing the overall directivity when the voice signals of the directional microphone shown in
FIG. 24 are combined. FIG. 26 is a flow chart showing processing of the first DSP in the speech
apparatus of the fourth embodiment of the present invention.
Explanation of sign
[0184]
1 … microphone/speaker integrated speech device (telephone device), 11 … upper cover, 12 … sound reflection plate, 12a … sound reflection surface, 12b … restraint member fixing portion, 13 … connection member, 14 … housing portion, 14a … sound reflecting surface, 14b … bottom surface, 14c … upper surface, 14d … internal cavity, 14e … restraint member lower fixing portion, 14f … restraint member penetration portion, 15 … operating portion, 16 … speaker, 17 … restraining member, 18 … damper, 2 … microphone/electronic circuit housing portion, 21 … printed circuit board, MC1 to MC6 … microphones, 22, 22a, 22b … microphone support members, 23 … microprocessor, 24 … codec, 25 … first digital signal processor (DSP1), 301 to 306 … variable gain amplifiers, 251 … variable attenuation unit, 252 … level detection unit, 253 … level judgment / gain control unit, 254 … test signal generation unit, 255 … sound source direction identification processing means, 26 … second digital signal processor (DSP2), 261 … echo canceling transmission processing unit, 262 … echo canceling reception processing unit, 27 … A/D converter block, 28 … D/A converter block, 29 … amplifier block, 30 … microphone selection result display means, LED1 to LED6 … light emitting diodes
supplied to the determination process of the speaker direction in step 4.
[0068]
Step 4: Speaker Direction Microphone Switching Processing: In the timing determination processing for switching the speaker direction microphone, the DSP 25 compares, based on the processing of step 2 and the processing of step 3, the speaker direction detected at that time with the speaker direction selected so far; when the speaker direction differs, it instructs the microphone signal switching process of step 4 to select the microphone of the new speaker direction.
However, if the chairman's microphone has been set from the operation unit 15 and the chairman and other conference participants speak simultaneously, priority is given to the chairman's speech. At this time, the selected microphone information is displayed on the microphone selection result display means 30, for example, the light emitting diodes LED1 to LED6.
[0069]
Step 5: Transmission of Microphone Pickup Signal: In the microphone signal switching process, of the six microphone signals, only the microphone signal selected by the step 4 process is used as the transmission signal, and the communication apparatus 1 sends it through the telephone line 920 to the line unit of the other party illustrated in FIG.
[0070]
Setting of speech start level threshold and speech end level threshold. Process 1: Immediately after power-on, the floor noise of each microphone is measured for a predetermined time, for example, one second.
The DSP 25 reads out the peak-held level value of the sound pressure level detection unit at constant time intervals, in this embodiment every 10 ms, calculates the average of the values over a predetermined time, for example one minute, and takes this as the floor noise. The DSP 25 determines the speech start detection level threshold (floor noise + 9 dB) and the speech end detection level threshold (floor noise + 6 dB) based on the measured floor noise level.
The DSP 25 subsequently continues to read out the peak-held level value of the sound pressure level detector at constant time intervals. While speech is determined to have ended, these readings serve as a measurement of the floor noise, and the thresholds of the speech start detection level and the speech end detection level are updated.
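The bookkeeping of process 1 can be sketched as follows (the +9 dB and +6 dB offsets are from the text; the helper names are assumptions):

```python
def floor_noise_average(peak_levels_db):
    """Average the peak-held level values read at constant intervals
    (e.g. every 10 ms) to estimate the floor noise in dB."""
    return sum(peak_levels_db) / len(peak_levels_db)

def thresholds(floor_noise_db):
    """Speech start threshold = floor noise + 9 dB,
    speech end threshold   = floor noise + 6 dB."""
    return floor_noise_db + 9.0, floor_noise_db + 6.0

start_th, end_th = thresholds(floor_noise_average([40.0, 42.0, 41.0]))
```

Because the averaging is done per microphone, each microphone gets its own pair of thresholds, matching the per-microphone behaviour described in [0071].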
[0071]
According to this method, since the threshold setting is different for each floor noise level at the
position where the microphone is placed, the threshold can be set for each microphone, and
erroneous determination in selection of the microphone by the noise source can be prevented.
[0072]
Process 2: Handling a room with large surrounding noise (large floor noise). Process 2 is a measure for the case where the floor noise is so large that, with process 1 automatically updating the threshold levels, it becomes difficult to detect speech start and end.
The DSP 25 determines the thresholds of the speech start detection level and the speech end detection level based on a predicted floor noise level. The DSP 25 sets the speech start threshold level larger than the speech end threshold level (for example, by a difference of 3 dB or more). The DSP 25 reads out the level value peak-held by the sound pressure level detector at constant time intervals.
[0073]
According to this method, since this threshold setting has the same value for all the microphones, a person with a noise source behind him and a person without one can both be recognized as starting speech with the same level of voice.
[0074]
Speech start determination: Process 1: The output levels of the sound pressure level detectors corresponding to the six microphones are compared with the threshold of the speech start level, and speech is determined to have started when the threshold is exceeded.
However, when the output levels of the sound pressure level detectors corresponding to all the microphones exceed the threshold of the speech start level, the DSP 25 determines that the signal comes from the reception and reproduction speaker 16 and does not determine that speech has started. This is because the distance between the reception and reproduction speaker 16 and all the microphones MC1 to MC6 is the same, so the sound from the reception and reproduction speaker 16 reaches all the microphones MC1 to MC6 almost equally.
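Process 1 reduces to: declare speech start when some, but not all, of the six levels exceed the start threshold (the level values used in the example are hypothetical dB figures):

```python
def speech_started(levels_db, start_threshold_db):
    """True when at least one microphone level exceeds the speech start
    threshold, unless all of them do -- in that case the sound is taken
    to come from the reception/reproduction speaker 16, which is
    equidistant from all microphones MC1 to MC6."""
    over = [lv > start_threshold_db for lv in levels_db]
    return any(over) and not all(over)
```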
[0075]
Process 2: For the six microphones illustrated in FIG., arranged radially at equal angles and equal 60 degree spacing, three pairs of unidirectional microphones whose directivity axes point in opposite directions, shifted by 180 degrees (microphones MC1 and MC4, microphones MC2 and MC5, and microphones MC3 and MC6), are used to exploit the level differences of the microphone signals. That is, the following operations are performed.
[0076]
Absolute value of (signal level of microphone 1 - signal level of microphone 4) ... [1]
Absolute value of (signal level of microphone 2 - signal level of microphone 5) ... [2]
Absolute value of (signal level of microphone 3 - signal level of microphone 6) ... [3]
[0077]
The DSP 25 compares the absolute values [1], [2] and [3] with the threshold of the speech start level, and determines that speech has started when the threshold is exceeded.
With this processing, unlike process 1, the absolute values never all become larger than the threshold of the speech start level at once (since the sound from the reception and reproduction speaker 16 reaches all the microphones equally, the pair differences cancel), so it is unnecessary to determine whether the sound comes from the reception and reproduction speaker 16 or from a speaker.
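Process 2 can be sketched directly from expressions [1] to [3] (function names are mine; the levels in the tests are hypothetical):

```python
def pair_differences(levels):
    """Absolute level differences of the three opposed pairs
    (MC1, MC4), (MC2, MC5), (MC3, MC6); `levels` holds the levels of
    microphones 1..6 at indices 0..5."""
    return [abs(levels[0] - levels[3]),
            abs(levels[1] - levels[4]),
            abs(levels[2] - levels[5])]

def speech_started(levels, start_threshold):
    """Speaker sound hits one microphone of a pair much harder than the
    other; sound from speaker 16 hits both equally and cancels out."""
    return any(d > start_threshold for d in pair_differences(levels))
```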
[0078]
Speaker Direction Detection Process: The characteristic of the unidirectional microphone illustrated in FIG. 6 is used to detect the speaker direction. In a unidirectional microphone, the frequency characteristic and the level characteristic change according to the arrival angle of the voice from the speaker at the microphone, as illustrated in FIG. 6. The results are illustrated in FIGS. 7A to 7C, which show the results of fast Fourier transform (FFT), at constant time intervals, of the voice of a speaker placed at a predetermined distance from the communication device 1, for example 1.5 meters, collected by each microphone. The X axis represents frequency, the Y axis represents signal level, and the Z axis represents time. The horizontal lines represent the cutoff frequencies of the band pass filters; the levels of the frequency bands sandwiched by these lines are the data converted to sound pressure level through the five-band band pass filters of the microphone signal level conversion process described earlier.
[0079]
A determination method applied as the actual process for detecting the direction of the speaker in the speech apparatus 1 of the first embodiment of the present invention will be described. An appropriate weighting is applied to the output level of each band-pass filter in 1 dB full-scale (1 dBFs) steps (for example, 0 points at 0 dBFs, 3 points at -3 dBFs, or vice versa); the weighting step determines the resolution of the process. This weighting is performed every sample clock, the weighted scores of each microphone are added, the average is calculated over a certain number of samples, and the microphone signal with the smallest (largest) total score is determined to be the microphone facing the speaker. An image of the result is shown in Table 2 below.
[0080]
[0081]
In the example illustrated in Table 2, since the first microphone MC1 has the smallest total score, the DSP 25 determines that there is a sound source (a speaker) in the direction of the first microphone MC1.
The DSP 25 holds the result in the form of a sound source direction microphone number. As described above, the DSP 25 weights the output levels of the band-pass filters of each frequency band for each microphone, ranks the microphone signals in order of small (or large) score for the output of each band-pass filter, and determines the microphone signal ranked first in three or more bands to be the microphone facing the speaker. Then, on the assumption that there is a sound source (a speaker) in the direction of the first microphone MC1, the DSP 25 creates a score sheet as shown in Table 3 below. This is called a star list.
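The weighting and totaling can be sketched as below (the weighting rule follows the 1 dBFs-step description above; the band level values are hypothetical):

```python
def weight(level_dbfs):
    """0 points at 0 dBFs, one point per dB below full scale
    (so 3 points at -3 dBFs)."""
    return int(round(-level_dbfs))

def total_scores(band_levels):
    """band_levels[mic][band] -> summed weighted score per microphone;
    the smallest total marks the microphone facing the speaker."""
    return [sum(weight(lv) for lv in bands) for bands in band_levels]

# Five band-pass outputs (dBFs) for each of six microphones.
bands = [[-2, -1, -2, -3, -2],     # MIC1, facing the speaker
         [-6, -5, -7, -6, -6],     # MIC2
         [-9, -8, -9, -9, -8],     # MIC3
         [-10, -9, -10, -10, -9],  # MIC4
         [-8, -9, -8, -9, -9],     # MIC5
         [-5, -6, -5, -6, -5]]     # MIC6
scores = total_scores(bands)
best = scores.index(min(scores)) + 1  # sound source direction microphone number
```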
[0082]
[0083]
In practice, due to sound reflections and standing waves caused by the characteristics of the room, the first microphone MC1 does not necessarily rank first at all band-pass filter outputs; however, if it ranks first in a majority of the five bands, it can be determined that there is a sound source (a speaker) in the direction of the first microphone MC1.
The DSP 25 holds the result in the form of a sound source direction microphone number.
[0084]
Alternatively, the DSP 25 sums the output level data of each band-pass filter of each microphone in the form shown in Table 9 below, determines the microphone signal with the highest level to be the microphone facing the speaker, and holds the result in the form of a sound source direction microphone number.
[0085]
MIC1 Level = L1-1 + L1-2 + L1-3 + L1-4 + L1-5
MIC2 Level = L2-1 + L2-2 + L2-3 + L2-4 + L2-5
MIC3 Level = L3-1 + L3-2 + L3-3 + L3-4 + L3-5
MIC4 Level = L4-1 + L4-2 + L4-3 + L4-4 + L4-5
MIC5 Level = L5-1 + L5-2 + L5-3 + L5-4 + L5-5
MIC6 Level = L6-1 + L6-2 + L6-3 + L6-4 + L6-5
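The summation above reduces to adding each microphone's five band levels and taking the maximum. A minimal sketch, with illustrative names (`select_mic_by_total_level`, `bpf_levels`) not taken from the patent:

```python
# Sketch of the level summation in paragraph [0085]: each microphone's total
# is the sum of its five band-pass filter output levels, and the microphone
# with the highest total is taken as the one facing the speaker.

def select_mic_by_total_level(bpf_levels):
    """bpf_levels[mic] holds the five band levels L<mic>-1 .. L<mic>-5."""
    totals = [sum(levels) for levels in bpf_levels]
    return max(range(len(totals)), key=totals.__getitem__)

levels = [
    [4, 5, 4, 5, 4],  # MIC1 Level = 22
    [6, 6, 5, 6, 6],  # MIC2 Level = 29 (highest -> speaker direction)
    [2, 2, 2, 2, 2],  # MIC3 Level = 10
    [1, 1, 1, 1, 1],  # MIC4 Level = 5
    [1, 1, 1, 1, 1],  # MIC5 Level = 5
    [1, 1, 1, 1, 1],  # MIC6 Level = 5
]
print(select_mic_by_total_level(levels))  # → 1 (MIC2)
```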
[0086]
Speaker Direction Microphone Switching Timing Judgment Process This process is activated by the speech start judgment result in step 2 of FIG. 15. When the microphone of a new speaker is detected from the speaker direction detection result of step 3 and the past selection information, the DSP 25 issues a microphone signal switching command to the microphone signal selection switching process of step 5, and notifies the microphone selection result display means 30 (light emitting diodes LED1 to LED6) that the speaker's microphone has been switched, informing the speaker that the calling device 1 has responded to his or her speech.
[0087]
In a room with strong echo, in order to remove the influence of reflected sound and standing waves, the DSP 25 ignores any new microphone selection command until the speech end determination time (for example, 0.5 seconds) has elapsed after switching the microphone.
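This hold-off behaves like a debounce timer on the selection command. A minimal sketch under that reading; the class and method names (`MicSwitcher`, `request_switch`) are assumptions for illustration, not the patent's terminology:

```python
# Sketch of the switching hold-off: a new microphone selection command takes
# effect only after the speech-end judgment time (0.5 s in the text) has
# elapsed since the last switch, suppressing flutter caused by reflections
# and standing waves.

HOLD_OFF_SEC = 0.5  # speech end determination time

class MicSwitcher:
    def __init__(self):
        self.selected = None
        self.last_switch_time = float("-inf")

    def request_switch(self, mic, now):
        """Honor a switch command only once the hold-off has elapsed."""
        if now - self.last_switch_time < HOLD_OFF_SEC:
            return False  # command prohibited during the hold-off window
        if mic != self.selected:
            self.selected = mic
            self.last_switch_time = now
        return True

sw = MicSwitcher()
print(sw.request_switch(1, now=0.0))  # → True  (first selection)
print(sw.request_switch(2, now=0.2))  # → False (within the 0.5 s hold-off)
print(sw.request_switch(2, now=0.6))  # → True  (hold-off elapsed)
```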
From the microphone signal level conversion processing result of step 1 of FIG. 15 and the speaker direction detection result of step 3, two microphone selection switching timings are provided in this embodiment.
[0088]
First method: when the start of speech can be clearly determined. This applies when the speech from the selected microphone direction has ended and a new speech arrives from another direction. In this case, the DSP 25 judges that the speech has ended once the speech end determination time (for example, 0.5 seconds) has elapsed after all of the microphone signal levels (1) and microphone signal levels (2) have fallen below the speech end threshold level. It then judges that a new speech has started when any microphone signal level (1) becomes equal to or higher than the speech start threshold level, determines the microphone facing the speaker direction as the appropriate sound pickup microphone based on the sound source direction microphone number information, and starts the microphone signal selection switching process of step 5.
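The first method's end-then-start logic can be sketched as a small state machine. This is an illustrative reconstruction: the threshold values and names (`SpeechGate`, `END_THRESHOLD`, `START_THRESHOLD`) are assumptions, and the two level types (1) and (2) are collapsed into a single level list for brevity.

```python
# Sketch of the "first method" timing: speech is judged ended once all
# microphone levels stay below the end threshold for the judgment time
# (0.5 s), after which the first level to cross the start threshold marks
# the start of a new speech and identifies the candidate microphone.

END_THRESHOLD = 0.1
START_THRESHOLD = 0.3
END_JUDGE_SEC = 0.5

class SpeechGate:
    def __init__(self):
        self.below_since = None   # time when all levels fell below END_THRESHOLD
        self.speech_ended = True  # start in the idle (no speech) state

    def update(self, levels, now):
        """Returns the index of a newly speaking microphone, else None."""
        if max(levels) < END_THRESHOLD:
            if self.below_since is None:
                self.below_since = now
            elif now - self.below_since >= END_JUDGE_SEC:
                self.speech_ended = True  # end-of-speech judged
        else:
            self.below_since = None  # levels rose again: reset the timer
        if self.speech_ended:
            for mic, level in enumerate(levels):
                if level >= START_THRESHOLD:
                    self.speech_ended = False  # start-of-speech judged
                    return mic  # trigger the selection switching process
        return None

gate = SpeechGate()
print(gate.update([0.0] * 6, now=0.0))  # → None (silence, nothing to select)
levels = [0.0] * 6
levels[3] = 0.5
print(gate.update(levels, now=1.0))     # → 3 (fourth microphone starts a speech)
```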
[0089]
Second method: When a lou