DESCRIPTION JP2006339974

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
PROBLEM TO BE SOLVED: To use one microphone as both a contact microphone and a non-contact
microphone. SOLUTION: A contact / non-contact dual purpose microphone 1 is used, which collects
the air conduction sound of a person's speech when used away from the human body and collects
the body conduction sound of a person's speech when used in contact with the human body. The
dual purpose microphone 1 is held in contact with the human body by the contact holding member
21 and used as a contact microphone. Alternatively, the contact holding member 21 is locked by
the locking member 24 at a location away from the human body, and the dual purpose microphone 1
is used as a non-contact microphone. [Selected figure] Figure 2
Voice input / output device
[0001]
The present invention relates to an apparatus for inputting human speech and outputting speech.
[0002]
There is known a contact type microphone (hereinafter referred to as a microphone) which is
brought into contact with the skin of a human body to collect in-vivo conduction sound of human
speech (for example, refer to Non-Patent Document 1).
10-04-2019
1
This contact-type microphone has better S/N characteristics in a noisy environment than the
conventional non-contact-type microphone, and can pick up a person's "murmur" and "talking to
oneself".
[0003]
Prior art documents related to the invention of this application are as follows. Yuki Nakajima et
al., "Improvement of sensing method in speech recognition (NAM recognition)" Proceedings of
the Acoustical Society of Japan, 3-Q-1, pp 145-146 March 2004
[0004]
However, the contact type microphone described above has to be kept in contact with the human
body at all times, even when the voice being input and output is not confidential, and the user
may find this bothersome.
[0005]
The invention uses a contact / non-contact microphone (dual-use microphone) that collects the
air-conducted sound of human speech when used away from the human body and collects the
body-conducted sound of human speech when used in contact with the human body. The dual-use
microphone is held in contact with the human body by a contact holding member and used as a
contact type microphone. Alternatively, the contact holding member is locked by a locking
member at a location away from the human body, and the dual-use microphone is used as a
non-contact type microphone.
[0006]
According to the present invention, the functions of a stand microphone (noncontact
microphone) and a mounting microphone (contact microphone) can be realized by one
microphone.
[0007]
An embodiment in which the present invention is applied to a voice input / output device for
voice operation used in a car will be described.
The present invention is not limited to vehicles.
[0008]
FIG. 1 is a diagram showing the configuration of an embodiment.
The contact / non-contact dual-use microphone (hereinafter simply referred to as the dual-use
microphone) 1 can be used as a contact type microphone, which contacts the human body directly
or through clothes, collects the body-conducted sound of human speech and converts it into an
electric signal. It can also be used as a non-contact type microphone, which is placed apart
from the human body, collects the air-conducted sound of human speech and converts it into an
electric signal.
As a contact type microphone, a bone conduction type microphone and a NAM (Non-Audible
Murmur) microphone are known, and as a non-contact type microphone, an electret condenser
microphone (ECM) and the like are known.
[0009]
The bone conduction microphone is a microphone that picks up the vibration of the skull caused
by speech, and that applies vibration to the skull to transmit voice directly to the auditory
nerve. In recent years, such microphones have been put to practical use for transmission and
reception in mobile phones and the like.
[0010]
In addition, the NAM microphone is a microphone that contacts the skin of the human body
directly or through relatively thin clothes and collects the body-conducted sound of human
speech. The NAM microphone is formed by coating an electret film with a material such as
silicone whose acoustic impedance is similar to that of human skin. In particular, its S/N
characteristic in the frequency range of 1000 Hz or less is superior to that of a general
non-contact microphone, and it can pick up a person's inaudible "murmur" and "song".
[0011]
In this embodiment, as shown in FIG. 2, the dual-use microphone 1 is attached to a headset 21.
When the headset 21 is worn on the head, the dual-use microphone 1 is kept in contact with the
occupant's earlobe (see FIG. 2B) and can be used as a contact microphone.
[0012]
A headset speaker 9 described later is also attached to the headset 21. The dual-use microphone
1 and the headset speaker 9 are connected by a cable 22 to a microphone amplifier 2 and a
speaker amplifier 7 described later.
[0013]
On the other hand, as shown in FIG. 2 (a), a hook 24 for locking the headset 21 is installed on
an instrument panel 23 in the passenger compartment. With the headset 21 locked to the hook 24,
the dual-use microphone 1 is held away from the occupant and can be used as a non-contact
microphone.
[0014]
The shape and structure of the headset 21 are not limited to those of this embodiment. Any
shape and structure may be used as long as it can hold the dual-use microphone 1 in contact with
a person's hand, arm, head, shoulder, chest, upper back, or the like.
[0015]
In FIG. 1, a microphone amplifier 2 is an amplifier that amplifies the audio signal picked up by
the dual-use microphone 1, and its amplification gain can be adjusted.
A PTT (Push To Talk) switch 3 is a switch operated by an occupant to start voice input.
When the PTT switch 3 is turned on, the voice input / output device picks up the spoken voice
and starts voice input / output processing.
Further, the input cancel switch 4 is a switch operated by the occupant to cancel the input voice.
[0016]
The voice recognition result of the voice input / output device is broadcast by an instrument
panel speaker 8 or a headset speaker 9 described later. If the broadcast recognition result
differs from what the occupant said, the input voice can be canceled by operating the input
cancel switch 4. The PTT switch 3 and the input cancel switch 4 are installed, for example, on a
spoke of the steering wheel.
[0017]
The on-hook sensor 5 is a sensor that detects that the headset 21 is locked to the hook 24 of the
instrument panel 23, as shown in FIG. 2A.
[0018]
The controller 6 includes a CPU 6a, a ROM 6b, a RAM 6c, an A/D converter 6d, a D/A converter 6e
and the like. It executes a voice input / output program described later to input the voice of
the occupant, performs voice recognition processing, and outputs the result to the on-vehicle
equipment.
[0019]
The speaker amplifier 7 amplifies the speech word signal of the speech recognition result
inputted from the controller 6 and broadcasts it by the instrument panel speaker 8 or the
headset speaker 9.
The amplification gain of the speaker amplifier 7 is adjustable.
[0020]
The instrument panel speaker 8 is used for broadcast of speech when the headset 21 is locked to
the hook 24 of the instrument panel 23.
The broadcast by the instrument panel speaker 8 can be heard by all occupants in the vehicle
compartment. On the other hand, the headset speaker 9 is used for broadcast of speech when the
headset 21 is set on the head of the occupant. The broadcast by the headset speaker 9 can be
heard only by the occupant wearing the headset 21.
[0021]
The on-hook acoustic dictionary 10 is an acoustic dictionary referred to in order to recognize
the voice of the occupant when, as shown in FIG. 2A, the headset 21 is locked to the hook 24 of
the instrument panel 23 and the dual-use microphone 1 is used as a non-contact microphone
(hereinafter referred to as on-hook use). On the other hand, the off-hook acoustic dictionary 11
is an acoustic dictionary referred to in order to recognize the voice of the occupant when, as
shown in FIG. 2B, the occupant wears the headset 21 on the head and the dual-use microphone 1 is
used as a contact type microphone (hereinafter referred to as off-hook use).
[0022]
In this embodiment, the amplification gain of the microphone amplifier 2, the filter applied to
the audio output signal of the dual-use microphone 1, the acoustic dictionary used for speech
recognition, and the amplification gain of the speaker amplifier 7 are each switched to the
setting best suited to on-hook use or off-hook use.
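The four settings that are switched together can be pictured as one record per usage form. The following Python sketch is only an illustration: the class name, field names and all numeric values are assumptions, since the description gives no concrete gains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageProfile:
    """Settings switched together for one usage form (all values hypothetical)."""
    mic_gain_db: float        # amplification gain of microphone amplifier 2
    noise_filter: str         # filter applied to the microphone output signal
    acoustic_dictionary: str  # acoustic dictionary used for recognition
    speaker: str              # speaker that broadcasts the recognition result
    speaker_gain_db: float    # amplification gain of speaker amplifier 7


# Hypothetical values: the description only states which gain is higher or lower.
ON_HOOK = UsageProfile(30.0, "hpf_300hz", "on_hook_dictionary_10",
                       "instrument_panel_speaker_8", 20.0)
OFF_HOOK = UsageProfile(10.0, "bpf_100hz_2khz", "off_hook_dictionary_11",
                        "headset_speaker_9", 5.0)


def select_profile(headset_on_hook: bool) -> UsageProfile:
    """Pick the profile from the state reported by the on-hook sensor 5."""
    return ON_HOOK if headset_on_hook else OFF_HOOK
```

Keeping the settings in one immutable record makes it hard to switch one of them without the others.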
[0023]
Even when the speaker talks at the same volume, the level of the audio output signal of the
dual-use microphone 1 differs between on-hook use and off-hook use.
Because the distance the voice travels through the air from the speaker to the dual-use
microphone 1 is longer in on-hook use than in off-hook use, the output signal level for the same
speaking volume is lower in on-hook use than in off-hook use. Therefore, the amplification gain
of the microphone amplifier 2 at the time of on-hook use is made higher than the amplification
gain
at the time of off-hook use so that the occupant can speak at the same volume in on-hook use and
off-hook use. As a result, the occupant does not have to adjust the amplification gain of the
microphone amplifier 2 according to the type of use, and can always speak at the same volume
regardless of the type of use.
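As a rough illustration of this level compensation, the extra on-hook gain can be expressed in decibels as the ratio of the two signal levels. The function names and the example levels below are assumptions made for this sketch. The description does not say how the gains stored in the ROM 6b are determined.

```python
import math


def compensating_gain_db(level_off_hook: float, level_on_hook: float) -> float:
    """Extra gain in dB that raises the on-hook level to the off-hook level."""
    if level_off_hook <= 0 or level_on_hook <= 0:
        raise ValueError("signal levels must be positive")
    return 20.0 * math.log10(level_off_hook / level_on_hook)


def apply_gain(samples, gain_db):
    """Scale amplitude samples by a gain given in dB."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]


# Hypothetical levels: the on-hook signal arrives at one tenth of the amplitude.
extra_db = compensating_gain_db(level_off_hook=1.0, level_on_hook=0.1)
boosted = apply_gain([0.1, -0.05], extra_db)
```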
[0024]
Next, the amount of noise from inside and outside the vehicle that mixes into the dual-use
microphone 1 differs between on-hook use and off-hook use. In on-hook use, noise from inside and
outside the vehicle is mixed with the occupant's speech picked up by the dual-use microphone 1,
so the audio output signal of the dual-use microphone 1 must be filtered to remove the noise.
For example, a high pass filter (HPF) having a cutoff frequency of about 300 Hz is used to
remove low frequency noise from inside and outside the vehicle, such as engine noise and road
noise, from the audio output signal of the dual-use microphone 1. Alternatively, spectral
subtraction processing with a large estimated-noise subtraction amount is performed on the audio
output signal of the dual-use microphone 1.
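A minimal sketch of such a high pass filter is shown below, using a first-order discrete RC filter. The filter order, the 16 kHz sample rate and the function names are assumptions for illustration. The description only specifies a cutoff of about 300 Hz.

```python
import math


def high_pass(samples, cutoff_hz=300.0, sample_rate_hz=16000.0):
    """First-order RC high pass filter; attenuates content below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = []
    prev_x = prev_y = None
    for x in samples:
        # The first output sample equals the first input sample.
        y = x if prev_x is None else alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def rms(samples):
    """Root-mean-square level of a sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq_hz, n=16000, sample_rate_hz=16000.0):
    """Unit-amplitude sine wave used here as a stand-in for noise or speech."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]


# A 50 Hz tone (engine-like rumble) is strongly attenuated; a 3 kHz tone mostly passes.
rumble_level = rms(high_pass(tone(50.0)))
speech_band_level = rms(high_pass(tone(3000.0)))
```

A production filter would likely use a steeper, higher-order design. The first-order form is used here only because it fits in a few lines.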
[0025]
On the other hand, in off-hook use, little noise from inside and outside the vehicle is mixed
with the occupant's speech picked up by the dual-use microphone 1, so band pass filter (BPF)
processing of about 100 Hz to 2 kHz, or spectral subtraction processing with a small
estimated-noise subtraction amount, is performed on the audio output signal of the dual-use
microphone 1. This reduces the difference between the S/N ratio of the audio output signal of
the dual-use microphone 1 in on-hook use and in off-hook use, so that the voice of the occupant
can be correctly input and correctly recognized regardless of the type of use.
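The difference between a large and a small subtraction amount can be sketched with basic magnitude-domain spectral subtraction. The subtraction factors (2.0 and 0.5), the spectral floor and the example spectra are hypothetical. The description does not give concrete values or a specific variant of the method.

```python
def spectral_subtract(magnitudes, noise_estimate, subtraction_factor, floor=0.01):
    """Subtract a scaled noise estimate from each magnitude bin.

    Bins that would fall below floor * magnitude are clamped to that floor so
    that no magnitude becomes negative.
    """
    if len(magnitudes) != len(noise_estimate):
        raise ValueError("spectra must have the same number of bins")
    return [max(m - subtraction_factor * n, floor * m)
            for m, n in zip(magnitudes, noise_estimate)]


# Hypothetical magnitude spectrum and noise estimate (two bins only).
spectrum = [1.0, 0.5]
noise = [0.2, 0.4]

on_hook_result = spectral_subtract(spectrum, noise, subtraction_factor=2.0)
off_hook_result = spectral_subtract(spectrum, noise, subtraction_factor=0.5)
```

With the larger factor, the second bin is dominated by the noise estimate and is clamped to the floor, while the smaller factor leaves more of the original spectrum intact.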
[0026]
Furthermore, the transfer characteristics of the speech differ between on-hook use and off-hook
use. In on-hook use, the on-hook acoustic dictionary 10 is used, which combines a language
dictionary with an acoustic model that models phoneme patterns uttered in a car interior noise
environment, that is, an acoustic model that reflects the spatial transfer characteristics of
voice between the occupant and the dual-use microphone 1. On the other
hand, in off-hook use, the off-hook acoustic dictionary 11 is used, which combines a language
dictionary with an acoustic model that models phoneme patterns less affected by vehicle interior
noise, that is, an acoustic model that reflects the in-body transfer characteristics of voice
between the occupant and the dual-use microphone 1.
[0027]
Further, in this embodiment, the voice recognition result is broadcast by the instrument panel
speaker 8 in on-hook use of the dual-use microphone 1, and by the headset speaker 9 in off-hook
use. Therefore, the amplification gain of the speaker amplifier 7 for the headset speaker 9 is
made lower than the amplification gain for the instrument panel speaker 8 in order to equalize
the sound pressure of the two speakers 8 and 9 at the position of the occupant's ear. As a
result, the occupant does not have to adjust the amplification gain of the speaker amplifier 7
according to the type of use, and can always hear broadcasts at the same sound pressure
regardless of the type of use.
[0028]
FIG. 3 is a flowchart showing a voice input / output program executed by the controller 6. The
operation of the embodiment will be described with reference to this flowchart. The controller 6
executes this voice input / output program when the PTT switch 3 is turned on.
[0029]
In step 1, the on-hook sensor 5 is used to check whether or not the dual-use microphone 1 is in
the on-hook use state. If it is, the on-hook amplification gain is selected in step 2 from the
amplification gains of the microphone amplifier 2 stored in the ROM 6b, and the on-hook filter
is selected in the subsequent step 3 from the filters stored in the ROM 6b. In step 4, the
on-hook acoustic dictionary 10 is selected, and in step 5, the on-hook amplification gain is
selected from the amplification gains of the speaker amplifier 7 stored in the ROM 6b.
[0030]
On the other hand, when the dual-use microphone 1 is in the off-hook use state, the off-hook
amplification gain is selected in step 6 from the amplification gains of the microphone
amplifier 2 stored in the ROM 6b, and the off-hook filter is selected in the following step 7
from the filters stored in the ROM 6b. Also, in step 8, the off-hook acoustic dictionary 11 is
selected, and in step 9, the off-hook amplification gain is selected from the amplification
gains of the speaker amplifier 7 stored in the ROM 6b.
[0031]
After the amplification gain of the microphone amplifier 2, the filter applied to the audio
output signal of the dual-use microphone 1, the acoustic dictionary used for speech recognition,
and the amplification gain of the speaker amplifier 7 have been selected according to the usage
form of the dual-use microphone 1, in step 10 the occupant's speech signal picked up by the
dual-use microphone 1 is amplified by the microphone amplifier 2 with the amplification gain for
that usage form and is input. In the following step 11, the audio output signal of the dual-use
microphone 1 is converted into a digital signal by the A/D converter 6d and temporarily stored
in the RAM 6c.
[0032]
In step 12, the audio output signal is read out from the RAM 6c and filtered according to the
usage form. Next, in step 13, voice recognition processing is performed using the on-hook
acoustic dictionary 10 or the off-hook acoustic dictionary 11 according to the usage form. In
step 14, the word of the speech recognition result is converted to an analog signal by the D/A
converter 6e, amplified by the speaker amplifier 7 with the amplification gain for the usage
form, and broadcast by the instrument panel speaker 8 or the headset speaker 9 according to the
usage form.
[0033]
In step 15, it is checked whether or not the input cancel switch 4 is on. If the input cancel switch
4 is on, the process returns to step 1 to repeat the above-described processing. If the input cancel
switch 4 is not turned on, the process proceeds to a step 16, where the recognition result of the
uttered voice of the occupant is output to the on-vehicle device to end the voice operation.
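The control flow of the flowchart, from the on-hook check to the cancel check, can be outlined as follows. This is only a sketch: the callables passed in are hypothetical stand-ins for the hardware and the recognizer, and the `max_attempts` limit is an added assumption, since the flowchart itself simply returns to step 1 each time the input is canceled.

```python
from typing import Callable, Optional


def voice_operation(
    is_on_hook: Callable[[], bool],
    capture: Callable[[str], str],
    recognize: Callable[[str, str], str],
    announce: Callable[[str, str], None],
    cancel_pressed: Callable[[], bool],
    max_attempts: int = 5,
) -> Optional[str]:
    """Run the voice input/output flow once the PTT switch has been turned on."""
    for _ in range(max_attempts):
        # Step 1: check the usage form with the on-hook sensor.
        mode = "on_hook" if is_on_hook() else "off_hook"
        # Steps 2 to 9 (selecting gains, filter and dictionary) are represented
        # here by passing the mode to the later stages.
        audio = capture(mode)          # amplify, digitize and filter (up to step 12)
        word = recognize(audio, mode)  # step 13: recognition with the mode's dictionary
        announce(word, mode)           # step 14: broadcast through the mode's speaker
        if not cancel_pressed():       # step 15: was the input canceled?
            return word                # step 16: output to the on-vehicle device
    return None                        # gave up after repeated cancellations


# Usage with stand-in callables: the first result is canceled, the second accepted.
cancels = iter([True, False])
heard = []
result = voice_operation(
    is_on_hook=lambda: True,
    capture=lambda mode: "audio",
    recognize=lambda audio, mode: mode + ":radio",
    announce=lambda word, mode: heard.append((word, mode)),
    cancel_pressed=lambda: next(cancels),
)
```

Re-checking the on-hook sensor on every pass mirrors the return to step 1, so a headset that is put on or hung up between attempts is picked up automatically.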
[0034]
Thus, according to one embodiment, a contact / non-contact dual-use microphone 1 is used that
collects the air conduction sound of human speech when used away from the human body and
collects the body conduction sound of human speech when used in contact with the human body. The
dual-use microphone 1 is held in contact with the human body by the headset 21 and used as a
contact type microphone, and the headset 21 is locked by the hook 24 of the instrument panel 23
at a location away from the human body so that the dual-use microphone 1 is used as a
non-contact type microphone. The functions of a stand microphone (non-contact type microphone)
and a mounting microphone (contact type microphone) can therefore be realized by the single
microphone 1.
[0035]
Further, according to one embodiment, the on-hook sensor 5 detects the state in which the
headset 21 is locked to the hook 24 of the instrument panel 23, and the amplification gain of
the microphone amplifier 2, which amplifies the audio output signal of the dual-use microphone
1, is switched according to the detection result of the on-hook sensor 5. The user therefore
does not need to adjust the amplification gain of the microphone amplifier 2 in accordance with
the usage form.
[0036]
Furthermore, according to one embodiment, the on-hook sensor 5 detects the state in which the
headset 21 is locked to the hook 24 of the instrument panel 23, and the content of the filter
processing applied to the audio output signal of the dual-use microphone 1 is switched according
to the detection result of the on-hook sensor 5. This reduces the difference between the S/N
ratio of the audio output signal of the dual-use microphone 1 in on-hook use and in off-hook
use, so that the voice can be correctly input and correctly recognized.
[0037]
According to one embodiment, there are provided an on-hook acoustic dictionary 10, which is
referred to in order to recognize human speech when the dual-use microphone 1 is used as a
non-contact microphone, and an off-hook acoustic dictionary 11, which is referred to in order to
recognize human speech when the dual-use microphone 1 is used as a contact microphone. The
on-hook sensor 5 detects the state in which the headset 21 is locked to the hook 24 of the
instrument panel 23, and the on-hook acoustic dictionary 10 and the off-hook acoustic dictionary
11 are switched according to the detection result. The optimal acoustic dictionary can therefore
be used for on-hook use and for off-hook use respectively, which raises the probability that the
spoken voice is recognized correctly.
[0038]
According to one embodiment, there are provided an instrument panel speaker 8, which is located
apart from the headset 21 and outputs sound to all persons in the vicinity including the person
wearing the headset 21, and a headset speaker 9, which is provided in the headset 21 and outputs
sound only to the person wearing the headset 21. The on-hook sensor 5 detects the state in which
the headset 21 is locked to the hook 24 of the instrument panel 23, and the amplification gain
of the speaker amplifier 7, which amplifies the voice signal and outputs it to the instrument
panel speaker 8 and the headset speaker 9, is switched according to the detection result of the
on-hook sensor 5. The user therefore does not need to adjust the amplification gain of the
speaker amplifier 7 according to the usage form, and can always listen to broadcasts at the same
sound pressure regardless of the type of use.
[0039]
The correspondence between the components of the claims and the components of the
embodiment is as follows.
That is, the headset 21 corresponds to the contact holding member, the hook 24 of the instrument
panel 23 to the locking member, the microphone amplifier 2 to the microphone amplification
means, the on-hook sensor 5 to the locking state detection means, the off-hook acoustic
dictionary 11 to the contact microphone acoustic dictionary, the on-hook acoustic dictionary 10
to the non-contact microphone acoustic dictionary, the headset speaker 9 to the user-only
speaker, the instrument panel speaker 8 to the normal speaker, and the speaker amplifier 7 to
the speaker amplification means. The controller 6 constitutes the microphone gain switching
means, filter processing means, filter processing switching means, acoustic dictionary switching
means, and speaker gain switching means.
The above description is merely an example, and when interpreting the invention, the
correspondence between the items described in the above embodiment and the items described
in the claims is not limited or restricted at all.
[0040]
FIG. 1 is a diagram showing the configuration of one embodiment.
FIG. 2 is a diagram showing the usage forms of the microphone.
FIG. 3 is a flowchart showing the voice input / output processing of one embodiment.
Explanation of reference signs
[0041]
1 Contact / non-contact microphone
2 Microphone amplifier
3 PTT switch
4 Input cancel switch
5 On-hook sensor
6 Controller
6a CPU
6b ROM
6c RAM
6d A/D converter
6e D/A converter
7 Speaker amplifier
8 Instrument panel speaker
9 Headset speaker
10 On-hook acoustic dictionary
11 Off-hook acoustic dictionary
21 Headset
22 Cable
24 Hook