DESCRIPTION JP2008211526

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
A voice input / output device and a voice input / output method with improved voice quality are
provided. In a voice input / output device 1 having an external microphone 3 and speaker 7, the
input from the microphone 3 and the output to the speaker 7 are both fed to an acoustic
environment determination unit 4. The acoustic environment determination unit 4 estimates the
acoustic environment between the microphone 3 and the speaker 7, selects the necessary
acoustic processing based on that estimate, and determines the acoustic processing to be
performed by the acoustic processing unit 5. The voice input from the microphone 3 is processed
by the selected functions among the sound processing functions of the acoustic processing unit 5.
It is then transmitted to the partner terminal after necessary processing such as transmission /
reception processing. [Selected figure] Figure 1
Voice input / output device and voice input / output method
[0001]
The present invention relates to a voice input / output device and a voice input / output method
used for a real-time voice transmission / reception communication system such as a web
conference system or an IP telephone system, and a voice recording / reproducing system.
[0002]
In systems such as video conference systems and web conference systems, calls may be made
between relatively large spaces such as conference rooms.
10-04-2019
1
In such a case, sound processing is often performed in order to make a call smoothly. For
example, sound processing such as enabling an acoustic echo canceler when a call is made
hands-free, enabling a noise canceller in a noisy place, or increasing the volume may be
mentioned.
[0003]
A terminal that implements such acoustic processing may have modest specifications, or may
have to run the call alongside other applications, so it is desirable that the processing load be
as small as possible.
[0004]
Furthermore, to address this, some existing terminals allow the user to set the sound
processing manually. However, because the setting is made by a user with little expert
knowledge of sound, the setting is sometimes inappropriate, and in some cases unnecessary
processing is enabled and used.
[0005]
Examples of such prior art include the following: a "Voice input terminal and voice input
terminal and voice synthesis terminal" in which, in a multi-point video conference system, the
transmission side terminal adjusts the volume according to the capability of the reception
terminal in order to even out the volume level between models (see, for example, Patent
Document 1); a "Characteristic adjustment device, acoustic characteristic adjustment method
and program" in which a mobile phone terminal is adjusted with software (an equalizer) after
manufacture so that its acoustic performance falls within the standard (see, for example, Patent
Document 2); and a "voice communication apparatus" in which a loudspeaker telephone performs
adaptive control of an echo canceller while a sound is played beforehand, so that echo is
removed from the very beginning of the call (see, for example, Patent Document 3).
[0006]
In addition to the above, Patent Documents 4 to 6 can be mentioned as voice input / output
devices.
JP-A-09-149133
JP-A-2006-157574
JP-A-2005-217547
JP-A-2002-330500
JP-A-2005-151403
JP-A-H07-154305
[0007]
However, the above-described prior art has the following problems. The first problem is that
inappropriate or unnecessary sound processing may be performed, and the voice quality of the
call may deteriorate for the user, or the processing load may be unnecessarily increased.
[0008]
The reason is that manual sound setting by the user is assumed, so that, depending on the
user's level of expert knowledge, the setting may be improper and the call voice quality may
deteriorate. In addition, when unnecessary processing is enabled, the processing load increases,
and the perceived responsiveness of the terminal may drop or the call voice quality may
deteriorate.
[0009]
The second problem is that the user may have to make sound settings manually at the start of a
call, which takes time and effort, so that the user may be unable to concentrate on the original
purpose, such as the call itself.
[0010]
The reason is that ordinary users who do not have much knowledge about sound processing
must decide by their own judgment whether each sound process is enabled and how strong its
effect should be. They may therefore spend a long time repeating trial-and-error adjustments,
or may enter the call without proper settings.
[0011]
In summary, since the setting of sound processing is based on manual sound setting by the user,
inappropriate or unnecessary sound processing may be performed, and there have been cases in
which the call voice quality is degraded or the processing load is unnecessarily increased.
[0012]
In addition, general users who do not know much about sound processing may have to repeat
trial-and-error adjustments over a long time in order to set the presence or absence of sound
processing and the strength of its effect by their own judgment, or may enter a call without
making appropriate settings. In either case they may not be able to concentrate on the original
purpose, such as the call itself.
[0013]
Therefore, an object of the present invention is to provide a voice input / output device and a
voice input / output method with improved voice quality.
[0014]
In order to solve the above-mentioned problems, the invention according to claim 1 is a voice
input / output device that outputs voice from a speaker and inputs voice with a microphone,
comprising: sound processing means that combines one or more sound processing functions and
performs sound processing on the voice input with the microphone; and acoustic environment
judging means that estimates the acoustic transfer characteristic of the space from the speaker
to the microphone using the voice input from the microphone and the voice output from the
speaker, and acts on the sound processing means according to that acoustic transfer
characteristic.
[0015]
According to the first aspect of the present invention, when one or more sound processing
functions are combined and sound processing is performed on the sound input by the
microphone, the acoustic transfer characteristic of the space from the speaker to the microphone
is estimated using the sound input by the microphone and the sound output from the speaker,
and the sound processing is performed in accordance with that acoustic transfer characteristic.
This eliminates the variation in acoustic settings caused by human operation and limits the
processing load of the sound processing to the necessary minimum, so the voice quality can be
improved.
[0016]
The invention according to claim 2 is characterized in that, in the invention according to claim 1,
the sound processing means can enable or disable each or all of the combined sound processing
functions.
[0017]
According to the second aspect of the present invention, the sound processing means can further
improve the voice quality by enabling or disabling each or all of the combined sound processing
functions.
[0018]
The invention according to claim 3 is characterized in that, in the invention according to claim 1
or 2, sound processing means for performing sound processing on the sound outputted from the
speaker is provided.
[0019]
According to the third aspect of the present invention, by providing the sound processing means
for performing sound processing on the sound output from the speaker, it is possible to further
improve the sound quality.
[0020]
The invention according to claim 4 is the invention according to any one of claims 1 to 3,
comprising acoustic environment judging means that estimates the acoustic transfer
characteristic of the space from the speaker to the microphone using the voice input by the
microphone and the voice output from the speaker, and acts on the sound processing means that
performs sound processing on the sound output from the speaker.
[0021]
According to the invention as set forth in claim 4, the acoustic transfer characteristic of the
space from the speaker to the microphone is estimated using the sound input by the microphone
and the sound output from the speaker, and sound processing is performed on the sound output
from the speaker according to that acoustic transfer characteristic, so the sound quality can be
further improved.
[0022]
The invention according to claim 5 is characterized in that, in the invention according to any one
of claims 1 to 4, it further comprises transmitting / receiving means for transmitting / receiving
voice to / from another voice input / output device via a network.
[0023]
According to the fifth aspect of the present invention, the voice quality can be further improved
by providing the transmitting / receiving means for transmitting / receiving the voice to / from
another voice input / output device via the network.
[0024]
The invention according to claim 6 is the invention according to any one of claims 1 to 5,
comprising: calling means that indicates to the user the start of transmission and reception of
voice with another voice input / output device; and acoustic environment judging means that
estimates the acoustic transfer characteristic of the space from the speaker to the microphone
using the ringing tone output from the speaker and the sound input by the microphone, and acts
on the sound processing means in accordance with the acoustic transfer characteristic.
[0025]
According to the invention as set forth in claim 6, the start of transmission and reception of
voice with another voice input / output device is indicated to the user, and the ringing tone
output from the speaker and the voice input by the microphone are used to estimate the
acoustic transfer characteristic of the space from the speaker to the microphone. Voice quality
can be further improved by acting on the sound processing means in accordance with that
acoustic transfer characteristic.
[0026]
The invention according to claim 7 is characterized in that, in the invention according to claim 6,
the user is allowed to select the ringing tone that indicates to the user the start of transmission /
reception of voice with another voice input / output device.
[0027]
According to the seventh aspect of the present invention, the user can select the ringing tone
that indicates to the user the start of transmission / reception of voice with another voice input /
output device, so that voice quality can be further improved.
[0028]
The invention according to claim 8 is characterized in that, in the invention according to claim 7,
a test sound for estimating the acoustic transfer characteristic of the space can be added to the
ringing tone that indicates to the user the start of transmission and reception of voice with
another voice input / output device.
[0029]
According to the invention as set forth in claim 8, it is possible to add a test sound for estimating
the acoustic transfer characteristic of the space to the ringing tone indicating to the user the start
of transmission and reception of voice with another voice input / output device. As a result, voice
quality can be further improved.
[0030]
The invention according to claim 9 is the invention according to any one of claims 1 to 5,
comprising acoustic environment judging means that estimates the acoustic transfer
characteristic of the space from the speaker to the microphone using a ringing tone output from
the speaker at a timing designated by the user and the voice input by the microphone, and acts
on the sound processing means according to the acoustic transfer characteristic.
[0031]
According to the invention as set forth in claim 9, the acoustic transfer characteristic of the space
from the speaker to the microphone is estimated using the ringing tone output from the
speaker at the timing designated by the user and the voice input by the microphone. The
voice quality can be further improved by acting on the sound processing means in accordance
with that acoustic transfer characteristic.
[0032]
The invention according to claim 10 is the invention according to claim 9, which has a ringing
tone that indicates to the user the start of transmission and reception of voice with another
voice input / output device, and comprises acoustic environment judging means that, at a timing
designated by the user, estimates the acoustic transfer characteristic of the space from the
speaker to the microphone using that ringing tone output from the speaker and the sound input
from the microphone, and acts on the sound processing means according to the acoustic transfer
characteristic.
[0033]
According to the invention as set forth in claim 10, the device has a ringing tone that indicates to
the user the start of transmission and reception of voice with another voice input / output
device, and that ringing tone is output from the speaker at the timing designated by the user.
By estimating the acoustic transfer characteristic of the space from the speaker to the
microphone using the ringing tone output from the speaker and the voice input by the
microphone, and acting on the sound processing means according to that acoustic transfer
characteristic, voice quality can be further improved.
[0034]
The invention according to claim 11 is characterized in that, in the invention according to claim
10, a sound specified by the user is used as the ringing tone that indicates to the user the start of
transmission and reception of voice with another voice input / output device.
[0035]
According to the invention as set forth in claim 11, a sound specified by the user is used as the
ringing tone that indicates to the user the start of transmission and reception of voice with
another voice input / output device, so that voice quality can be further improved.
[0036]
The invention according to claim 12 is characterized in that, in the invention according to claim
11, a test sound for estimating the acoustic transfer characteristic of the space can be added to
the ringing tone that indicates to the user the start of transmission and reception of voice with
another voice input / output device.
[0037]
According to the invention as set forth in claim 12, it is possible to add a test sound for
estimating the acoustic transfer characteristic of space to the ringing sound indicating to the user
the start of transmission and reception of voice with another voice input / output device. As a
result, voice quality can be further improved.
[0038]
The invention according to claim 13 is characterized in that, in the invention according to claim
12, a test sound for estimating the acoustic transfer characteristic of the space can be added to
the ringing tone that indicates to the user the start of transmission / reception of voice with
another voice input / output device.
[0039]
According to the invention as set forth in claim 13, it is possible to add a test sound for
estimating the acoustic transfer characteristic of space to the ringing sound indicating to the user
the start of transmission and reception of voice with another voice input / output device. As a
result, voice quality can be further improved.
[0040]
The invention according to a fourteenth aspect is characterized in that, in the invention according
to any one of the first to thirteenth aspects, two or more microphones for inputting voice are
provided.
[0041]
According to the invention of claim 14, by providing two or more microphones for inputting
voice, voice quality can be further improved.
[0042]
The invention according to a fifteenth aspect is characterized in that, in the invention according
to any one of the first to fourteenth aspects, two or more speakers for outputting sound are
provided.
[0043]
According to the invention of claim 15, by providing two or more speakers for outputting sound,
it is possible to further improve the sound quality.
[0044]
The invention according to claim 16 is the invention according to any one of claims 1 to 15,
characterized in that the acoustic transfer characteristic of the space is estimated by an external
device other than the microphone, and in that it comprises means for acting on the sound
processing means according to that acoustic transfer characteristic.
[0045]
According to the invention as set forth in claim 16, the acoustic transfer characteristic of the
space is estimated by an external device other than the microphone, and the sound quality can
be further improved by acting on the sound processing means according to that acoustic
transfer characteristic.
[0046]
The invention according to claim 17 is an audio input / output device that holds a music file
internally and reproduces the music file from a speaker, comprising: audio input means for
inputting sound with a microphone; sound processing means for performing sound processing
on the sound output from the speaker; and acoustic environment judging means that estimates
the acoustic transfer characteristic of the space from the speaker to the microphone using the
sound input from the microphone and the sound output from the speaker, and acts on the sound
processing means according to that acoustic transfer characteristic.
[0047]
According to the invention described in claim 17, when sound is input by the microphone and
sound processing is performed on the sound output from the speaker, the acoustic transfer
characteristic of the space from the speaker to the microphone is estimated using the sound
input by the microphone and the sound output from the speaker. Sound quality can be further
improved by acting on the sound processing means in accordance with that acoustic transfer
characteristic.
[0048]
The invention according to claim 18 is characterized in that, in the invention according to claim
17, the sound processing means is a combination of one or more sound processing functions.
[0049]
According to the eighteenth aspect of the present invention, the sound processing means can
further improve the sound quality by combining one or more sound processing functions.
[0050]
The invention according to claim 19 is an audio input / output method for outputting audio from
a speaker and inputting audio with a microphone, characterized in that, when one or more sound
processing functions are combined and sound processing is performed on the audio input by the
microphone, the acoustic transfer characteristic of the space from the speaker to the microphone
is estimated using the sound input from the microphone and the sound output from the speaker,
and the sound processing is performed according to that acoustic transfer characteristic.
[0051]
According to the invention described in claim 19, when one or more sound processing functions
are combined and sound processing is performed on the sound input by the microphone, the
acoustic transfer characteristic of the space from the speaker to the microphone is estimated
using the sound input by the microphone and the sound output from the speaker. The sound
quality can be further improved by performing the sound processing in accordance with that
acoustic transfer characteristic.
[0052]
The invention according to claim 20 is an audio input / output method in which a music file held
internally is reproduced from a speaker, characterized in that, when sound is input with a
microphone and sound processing is performed on the sound output from the speaker, the
acoustic transfer characteristic of the space from the speaker to the microphone is estimated
using the sound input from the microphone and the sound output from the speaker, and the
sound processing is performed according to that acoustic transfer characteristic.
[0053]
According to the invention as set forth in claim 20, when sound is input by the microphone and
sound processing is performed on the sound output from the speaker, the acoustic transfer
characteristic of the space from the speaker to the microphone is estimated using the sound
input by the microphone and the sound output from the speaker. The sound quality can be
further improved by performing the sound processing according to that acoustic transfer
characteristic.
[0054]
That is, according to the present invention, first, call voice quality can be improved.
A general user who does not know much about sound processing no longer sets the presence or
absence of sound processing or the strength of its effect by his or her own judgment. Instead, an
algorithm sets the optimum sound processing, so the overall voice quality can be expected to
improve.
In addition, by adapting the sound processing as time passes, it is always possible to
communicate with a call quality suited to the acoustic environment at that time.
[0055]
Further, according to the present invention, secondly, the processing load can be reduced.
Because general users who do not know much about sound processing do not set the presence
or absence of sound processing and the strength of its effect by their own judgment, and an
algorithm sets the optimum sound processing instead, unnecessary processing is avoided and
the processing load can be reduced.
[0056]
Furthermore, according to the present invention, thirdly, the usability can be improved.
During the call, the user can concentrate on the original purpose of the call by saving the trouble
of setting up the sound at the start of the call.
[0057]
According to the present invention, when one or more sound processing functions are combined
and sound processing is performed on the sound input by the microphone, the acoustic transfer
characteristic of the space from the speaker to the microphone is estimated using the sound
input by the microphone and the sound output from the speaker, and the sound processing is
performed according to that acoustic transfer characteristic. This eliminates the variation in
acoustic settings caused by human operation and keeps the processing load of the sound
processing to a minimum, so voice quality can be improved.
[0058]
First Embodiment
A first embodiment relates to a voice input / output device that adjusts the acoustic
environment during a call.
FIG. 1 is a block diagram showing an embodiment of a voice input / output device according to
the present invention.
In the present embodiment, the case where the voice input / output device 1 and the voice input
/ output device A perform voice transmission and reception in real time via the network will be
described.
Further, as to the internal configuration of the voice input / output device 1 of FIG. 1, only the
part related to the present invention is functionally shown to simply explain the principle of the
present invention.
[0059]
First, the configuration of the voice input / output device 1 will be described.
The voice input / output device 1 has a microphone 3 and a speaker 7 outside, and performs
voice input / output.
The microphone 3 captures the voice to be transmitted.
The captured voice is also used to simultaneously determine the acoustic environment in which
the voice input / output device 1 is placed.
The sound captured by the microphone 3 is input to the acoustic processing unit 5 as the sound
to be transmitted, and is input to the acoustic environment determination unit 4 as the sound for
determining the acoustic environment.
The sound output from the speaker 7 is also input to the acoustic environment determination
unit 4.
[0060]
The acoustic environment determination unit 4 determines the environment in which the audio
input / output device 1 is installed from the input voice, and determines ON / OFF and strong /
weak control parameters of each acoustic function included in the acoustic processing unit 5.
[0061]
The sound processing unit 5 performs sound processing on the input voice.
Thereafter, through codec processing / transmission / reception processing 2, voice is
transmitted to another voice input / output device, for example, voice input / output device A via
the network.
[0062]
The voice from the voice input / output device A is subjected to reception processing in the
codec / transmission / reception processing 2 via the network and decoded.
After necessary sound processing is performed in the sound processing unit 6 on the receiving
side, the sound is output from the speaker 7.
[0063]
FIG. 2 shows a detailed example of the acoustic environment determination unit 4 and the
acoustic processing unit 5 of the voice input / output device shown in FIG. 1.
In FIG. 2, the same reference numerals are used for the same parts as in FIG. 1.
[0064]
In FIG. 2, the acoustic processing unit 5 has an acoustic echo canceller (AEC) 12, a noise
suppression function (NS) 14, an automatic volume control function (AGC) 16 and other acoustic
functions 18.
[0065]
First, an example of control of the switch 11 and the AEC 12 will be described.
The switch 11 is a switch for controlling ON / OFF of the AEC.
[0066]
The acoustic environment determination unit 4 measures the reverberation time of the acoustic
environment in which the voice input / output device 1 is installed, using the signals of the
microphone 3 and the speaker 7.
The determination result is then notified from the acoustic environment determination unit 4 to
the AEC 12 to control the performance of the AEC.
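As an illustration of how a reverberation time might be derived, the following minimal sketch assumes that an impulse response of the speaker-to-microphone path has already been estimated from the two signals (the description does not say how this is done). The function name `decay_time_ms` and the parameter `drop_db` are hypothetical. The sketch uses Schroeder backward integration, a common room-acoustics technique chosen here for illustration; it is not necessarily the method used by the acoustic environment determination unit 4.

```python
import math


def decay_time_ms(impulse_response, sample_rate_hz, drop_db=60.0):
    """Time in ms for the backward-integrated energy to fall by drop_db."""
    # Schroeder backward integration: energy remaining from sample n to the end.
    remaining = [0.0] * len(impulse_response)
    total = 0.0
    for n in range(len(impulse_response) - 1, -1, -1):
        total += impulse_response[n] ** 2
        remaining[n] = total
    if total == 0.0:
        return 0.0  # silent response: no measurable reverberation
    threshold = total * 10.0 ** (-drop_db / 10.0)
    for n, energy in enumerate(remaining):
        if energy <= threshold:
            return 1000.0 * n / sample_rate_hz
    # The decay never reached the threshold within the measured response.
    return 1000.0 * len(impulse_response) / sample_rate_hz


# Synthetic, exponentially decaying response sampled at 8 kHz (illustrative data only).
response = [math.exp(-n / 100.0) for n in range(4000)]
print(decay_time_ms(response, 8000))
```

The value returned here would play the role of the "measured reverberation time" that is passed on to the AEC 12.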
[0067]
For example, when the echo is very small, the acoustic environment determination unit 4 can
control the switch 11 to disable the echo canceller function itself, which reduces the processing
load of the terminal.
Also, if the reverberation time handled by the echo canceller is initially set to 100 ms, and the
terminal is placed in a small space where the measured reverberation time is 50 ms, the
reverberation time handled by the echo canceller can be set to 50 ms so that no extra processing
is performed.
[0068]
Conversely, if the terminal is placed in a large hall or the like and the reverberation time is
200 ms, the reverberation time handled by the echo canceller is adjusted to 200 ms, so that the
echo canceller delivers necessary and sufficient performance.
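The AEC control described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `AecSetting` and `choose_aec_setting`, the use of an echo level in dB, and the -60 dB "very small echo" threshold are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class AecSetting:
    enabled: bool   # state of switch 11
    tail_ms: float  # reverberation time the echo canceller is set to cover


def choose_aec_setting(echo_level_db, reverb_ms, min_echo_db=-60.0):
    """Pick the AEC switch state and tail length from the measured environment."""
    if echo_level_db < min_echo_db:
        # Echo is very small (hypothetical threshold): disable the AEC to save load.
        return AecSetting(enabled=False, tail_ms=0.0)
    # Otherwise cover exactly the measured reverberation time.
    return AecSetting(enabled=True, tail_ms=reverb_ms)


print(choose_aec_setting(-20.0, 50.0))   # small room: shorter tail than a 100 ms default
print(choose_aec_setting(-20.0, 200.0))  # large hall: longer tail
print(choose_aec_setting(-75.0, 50.0))   # negligible echo: AEC switched off
```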
[0069]
Next, control examples of the switch 13 and the NS 14 will be described.
The switch 13 is a switch for controlling ON / OFF of the NS.
[0070]
The acoustic environment determination unit 4 obtains the signal-to-noise ratio (S / N ratio)
from the signals of the microphone 3 and the speaker 7, which makes it possible to measure the
magnitude of noise in the environment in which the voice input / output device 1 is installed.
For example, when the S / N ratio is large, the noise in the environment is small, so the switch
13 is controlled to disable the NS function, which reduces the processing load on the terminal.
When the S / N ratio is small but still larger than a certain level, control is performed to weaken
the operation of the NS 14.
Furthermore, when the S / N ratio is smaller still, the environment is very noisy, so control is
performed so that the NS 14 works strongly.
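The three-level NS control above can be sketched as follows. The description gives no numeric thresholds, so the 30 dB and 15 dB values, like the names `snr_db` and `choose_ns_mode`, are assumptions made only for this example; signal and noise powers are assumed to be positive.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB from two positive power estimates."""
    return 10.0 * math.log10(signal_power / noise_power)


def choose_ns_mode(snr, quiet_db=30.0, moderate_db=15.0):
    """Map an S/N ratio to an NS mode (thresholds are hypothetical)."""
    if snr >= quiet_db:
        return "off"     # little noise: open switch 13, NS disabled
    if snr >= moderate_db:
        return "weak"    # some noise: NS 14 works weakly
    return "strong"      # heavy noise: NS 14 works strongly


print(choose_ns_mode(snr_db(100.0, 1.0)))
```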
[0071]
Next, control examples of the switch 15 and the AGC 16 will be described.
The switch 15 is a switch for controlling ON / OFF of the AGC.
[0072]
The sound environment determination unit 4 measures the magnitude of the signal from the
microphone 3.
For example, if the magnitude of the signal is appropriate and the S / N ratio is also large, it is not
necessary to adjust the volume by the AGC 16, so the switch 15 can disable the AGC function and
reduce the processing load on the terminal.
Also, if the above S / N ratio is used for the judgment as well, most of the input signal is likely to
be noise when the S / N ratio is small, so the AGC function can be controlled to lower the volume
as a whole.
Further, even when the S / N ratio is high, noise is suppressed when the NS 14 is effective, so
that the AGC function can be controlled to amplify the signal as a whole with the noise removed.
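One possible reading of the AGC control above is sketched below. The description does not specify levels, thresholds, or how the three cases interact, so everything here is an assumption for illustration: the names `AgcSetting` and `choose_agc_setting`, the target level of -20 dB, the 3 dB tolerance, the S/N thresholds, and the fixed 6 dB reduction used when the input is mostly noise.

```python
from dataclasses import dataclass


@dataclass
class AgcSetting:
    enabled: bool   # state of switch 15
    gain_db: float  # gain applied by the AGC 16 (negative lowers the volume)


def choose_agc_setting(level_db, snr_db, ns_active,
                       target_db=-20.0, tolerance_db=3.0,
                       good_snr_db=30.0, poor_snr_db=15.0, noise_cut_db=-6.0):
    """Decide the AGC state from input level, S/N ratio, and NS state (hypothetical rules)."""
    level_ok = abs(level_db - target_db) <= tolerance_db
    if level_ok and snr_db >= good_snr_db:
        # Level is appropriate and noise is low: no AGC needed.
        return AgcSetting(enabled=False, gain_db=0.0)
    if snr_db < poor_snr_db and not ns_active:
        # Input is mostly noise and nothing suppresses it: lower the volume overall.
        return AgcSetting(enabled=True, gain_db=noise_cut_db)
    # Noise is low or suppressed by the NS: bring the level to the target.
    return AgcSetting(enabled=True, gain_db=target_db - level_db)


print(choose_agc_setting(-20.0, 35.0, ns_active=False))
print(choose_agc_setting(-30.0, 10.0, ns_active=False))
print(choose_agc_setting(-30.0, 10.0, ns_active=True))
```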
[0073]
In addition, if there is another acoustic function, the acoustic environment determination unit 4
may control the acoustic function in the same manner as described above, as shown in the
portion of the other acoustic function 18 in FIG.
Furthermore, the AEC, NS, AGC, and other acoustic processing may not be in the positional
relationship shown in the figure, and may be in any order.
[0074]
FIGS. 3 and 4 show an example of the flow of the voice input / output method according to the
present invention.
When a call is started, the processing of the flow shown in FIG. 3 is performed.
The process is started (step 101).
The sound environment adjustment mode is turned on (step 102).
Here, it is assumed that the acoustic environment adjustment mode is a mode in which the
acoustic environment determination unit 4 determines the current acoustic environment.
The process ends (step 103).
Even if step 103 is performed, the call continues.
[0075]
When a call is started, the processing of the flow of FIG. 4 is also started (step 201).
First, it is determined whether the sound environment adjustment mode is set (step 202).
If the acoustic environment adjustment mode is ON (step 202 / ON), the acoustic environment
determination unit 4 selects the necessary acoustic processing (step 203).
The acoustic environment determination unit 4 estimates the acoustic environment using the
sound input from the microphone 3 and the sound output from the speaker 7, with the
configuration and method shown in the figure.
When estimating the acoustic environment, the voice from the voice input / output device A may
be stopped temporarily and a test sound output from the speaker 7, or the received voice from
the voice input / output device A may itself be used to judge the acoustic environment.
[0076]
Then, based on the determination of the sound environment determination unit 4, the sound
processing unit 5 changes the setting of the sound processing (step 204).
For example, sound processing is turned on / off for each function, and the strength of the
function is determined.
Thereafter, the process ends (step 206).
[0077]
On the other hand, if the sound environment adjustment mode is OFF at step 202 (step 202 /
OFF), the process proceeds to step 205 and then to the end of the process (step 206) without
changing the setting of the sound processing unit 5.
[0078]
When the above-described processing (steps 201 to 206) is performed, it is possible to measure
the acoustic environment and select necessary acoustic processing when the terminal is set to
the acoustic environment adjustment mode.
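The flow of steps 201 to 206 can be sketched as a single function. This is a hedged illustration only: `run_adjustment` and its callback parameters are hypothetical names, and the environment and settings are represented as plain dictionaries for simplicity.

```python
def run_adjustment(mode_on, estimate_environment, select_processing, apply_settings):
    """One pass through the flow of FIG. 4; returns True if settings were changed."""
    # Step 201: start. Step 202: check the sound environment adjustment mode.
    if not mode_on:
        # Step 205: leave the current sound processing settings untouched.
        return False  # step 206: end
    # Step 203: estimate the environment and select the necessary processing.
    settings = select_processing(estimate_environment())
    # Step 204: change the settings of the sound processing unit.
    apply_settings(settings)
    return True  # step 206: end


# Illustrative use with stand-in callbacks (the values are made up for the example).
applied = []
changed = run_adjustment(
    mode_on=True,
    estimate_environment=lambda: {"snr_db": 12.0},
    select_processing=lambda env: {"ns": env["snr_db"] < 15.0},
    apply_settings=applied.append,
)
print(changed, applied)
```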
[0079]
By operating according to the flow described above, appropriate sound processing is
automatically selected without the user individually determining ON / OFF of the sound
processing, so calls can always be made with appropriate voice quality regardless of the
knowledge and skill of the operator.
[0080]
In addition, since a user with insufficient knowledge is prevented from enabling more sound
processing functions than necessary, the processing load is not increased more than
necessary.
For example, it can be used to turn off the noise suppressor when the terminal is placed in a
quiet office, or turn off the echo canceler when there is little echo.
[0081]
Furthermore, since the adjustment in the sound environment adjustment mode can be performed at
the start of communication, the user can concentrate on the original purpose of the call
afterwards without worrying about the sound quality.
[0082]
Further, the determination in the sound environment determination unit 4 is not limited to one
time, and may be repeated a plurality of times.
For example, after the process of step 204, the process may return to step 202 and repeat steps
203 and 204 for as long as the sound environment adjustment mode is ON.
[0083]
In this way, even if the acoustic environment in which the voice input / output device 1 is placed
changes with time, acoustic control can be performed adaptively.
For example, if the number of people in the meeting room suddenly increases and noise
increases, it is possible to enable NS and AGC.
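As a rough illustration of this repeated determination, the sketch below re-evaluates a sequence of noise measurements and enables NS and AGC once the noise exceeds a threshold. The function name, the threshold, and the idea of representing the environment by a single noise value are all assumptions made for the example.

```python
def adapt_over_time(noise_levels, threshold=0.1):
    """Return the NS / AGC states chosen after each repeated determination.

    Each element of noise_levels stands for one pass through steps 202-204
    while the adjustment mode stays ON (threshold is an assumed value).
    """
    history = []
    for noise in noise_levels:
        noisy = noise > threshold
        history.append({"NS": noisy, "AGC": noisy})
    return history
```

For instance, a quiet measurement followed by a noisy one leaves NS and AGC off at first and turns both on at the second pass.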
[0084]
[Second Embodiment] In the first embodiment, the sound environment adjustment mode is enabled
after the start of the call to begin the determination of the sound environment. The present
embodiment instead adjusts the acoustic environment during the calling state before the start
of the call.
[0085]
In the present embodiment, the process of enabling the sound environment adjustment mode
shown in FIG. 3 is performed at the start of calling.
Also, by performing the processing shown in FIG. 4 from the start of calling, the sound
environment of the terminal is measured using the ringing tone as a test tone.
[0086]
FIG. 5 shows an example of the configuration of a voice input / output device according to the
present invention.
The configuration shown in FIG. 5 is the configuration shown in FIG. 1 to which a user-specified
ringing tone 18 and a test tone 19 are added.
Although the ringing tone may be a fixed ringing tone, as shown in FIG. 5, a sound source file
may be provided externally and a voice / sound file specified by the user may be used as the
ringing tone.
In that case, since the sound environment determination unit 14 can grasp the characteristics
of the file in advance, it can be expected that, when comparing it with the sound input from
the microphone 13, the sound environment in which the sound input / output device 11 is placed
can be grasped more accurately than when using the voice from the voice input / output
device A.
[0087]
In addition, the sound environment can be judged more easily by mixing or inserting a test
sound (test sound 19) into the fixed ringing tone or the user-specified ringing tone.
If a sound with frequency characteristics outside the human audible range is used as the test
sound, the test sound can be inserted without the user noticing it.
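Mixing a test sound into a ringing tone can be sketched as below. The sample rate, the 21 kHz tone frequency, the gain, and the function name are assumptions chosen for illustration; the description does not specify any of them. Samples are taken to be floats in the range -1.0 to 1.0.

```python
import math


def mix_test_tone(ring_samples, sample_rate=48000, tone_hz=21000.0, gain=0.05):
    """Add a low-level sine test tone to ringing-tone samples (values assumed)."""
    if tone_hz >= sample_rate / 2:
        # A tone at or above the Nyquist frequency cannot be represented.
        raise ValueError("tone_hz must be below half the sample rate")
    mixed = []
    for n, sample in enumerate(ring_samples):
        tone = gain * math.sin(2 * math.pi * tone_hz * n / sample_rate)
        # Clip to the valid sample range after mixing.
        mixed.append(max(-1.0, min(1.0, sample + tone)))
    return mixed
```

Whether a given speaker and microphone can actually reproduce and capture such a tone depends on the hardware, so the frequency would need to be chosen per device.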
[0088]
When the processing as in the present embodiment is performed, since the call can be started
after the appropriate sound processing content is set, comfortable communication can be
performed from the start of the call.
[0089]
Further, if the method described in the first embodiment is also implemented, the acoustic
processing can be implemented by following the variation of the acoustic system during a call.
[0090]
[Third Embodiment] In the first and second embodiments, the sound environment adjustment
mode is automatically enabled at the time of a call, but in the present embodiment, the sound
environment adjustment mode may be enabled only while explicitly designated by the user.
For example, the user may enable or disable the sound environment adjustment mode through an
external input device or a GUI.
[0091]
If such processing is performed, the sound processing content can be determined even when the
terminal is not in a call state.
It can also be used when the user wants to activate the sound environment adjustment mode only
for a certain period of time.
[0092]
Also in the present embodiment, the ringing tone used in the second embodiment may be used to
determine the acoustic environment.
Further, a ringing tone specified by the user may be used, or a test tone may be used.
[0093]
[Fourth Embodiment] In the first to third embodiments, the sound environment adjustment
function is added to the transmission side based on the microphone input voice. However, the
sound environment adjustment function may also be provided on the reception side, so that the
setting of the sound processing on the receiving side is controlled based on the received sound.
FIG. 6 shows another configuration example of the voice input / output device according to the
present invention.
FIG. 6 is a configuration example in which control from the acoustic environment determination
unit 14 is added to the acoustic processing unit 16 of the reception unit in addition to the
configuration of FIG. 5.
In addition to the voice input from the microphone 23 and the voice output to the speaker 27,
the voice received from the voice input / output device A is also input to the sound environment
determination unit.
The sound processing unit (reception side) 26 controls the sound output from the speaker 27 by
determining the sound environment based on the input sound.
[0094]
FIG. 7 shows a detailed example of the sound processing determination unit 24 and the sound
processing unit 25.
In FIG. 7, the control to the sound processing unit (transmission side) 25 is similar to the
processing shown in FIG. Here, examples of the sound processing unit (reception side) 26 and the
sound environment determination unit 24 have been described in detail.
[0095]
A control example of the switch 31, the AGC 32, the switch 33, and the equalizer (EQ) 34 will be
described. The switch 31 turns the AGC on / off, and the switch 33 turns the EQ on / off.
[0096]
For example, the acoustic environment determination unit 24 determines the acoustic
environment as in the first embodiment, and when the volume is appropriate, the processing load
can be reduced by operating the switch 31 to invalidate the AGC. When the environment is noisy,
the AGC 32 can be controlled to increase the volume output from the speaker 27. In addition, in
an environment with many echoes, the AGC 32 can be controlled to reduce the volume output from
the speaker 27 and thereby suppress the occurrence of echo.
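The three cases above can be summarized as a small decision function. This is a sketch under assumptions: the gain values in dB, the priority given to the echo case over the noise case, and the function and key names are hypothetical and are not stated in the description.

```python
def control_agc(volume_ok, noisy, echoey):
    """Decide the state of switch 31 and a gain offset for AGC 32 (values assumed)."""
    if echoey:
        # Many echoes: lower the speaker volume to suppress echo.
        return {"agc_on": True, "gain_db": -6.0}
    if noisy:
        # Noisy environment: raise the speaker volume.
        return {"agc_on": True, "gain_db": 6.0}
    if volume_ok:
        # Volume already appropriate: disable AGC to save processing load.
        return {"agc_on": False, "gain_db": 0.0}
    # Otherwise leave AGC running with no extra offset.
    return {"agc_on": True, "gain_db": 0.0}
```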
[0097]
Further, when the echo characteristics of the sound are judged and equalization is found to be
unnecessary, the processing load can be reduced by operating the switch 33 to disable the EQ. In
addition, if a frequency band that tends to echo is identified in the determination of the
acoustic characteristics, the EQ 34 can be controlled to reduce the volume of that frequency
band. Also, as a normal equalizing function, the received sound can be given a sound effect
preferred by the user. For example, when the sound received from the communication partner is
not a voice but a music file, an acoustic effect such as emphasizing the deep bass or making
the vocals stand out can be applied.
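The band-wise control of the EQ 34 can be sketched as follows, assuming the determination unit supplies an echo level per frequency band. The band labels, the threshold, the cut amount, and the function names are assumptions for illustration only.

```python
def eq_band_gains(echo_by_band, threshold=0.2, cut_db=-6.0):
    """Return a gain in dB per band, cutting bands that tend to echo (values assumed)."""
    return {
        band: (cut_db if level > threshold else 0.0)
        for band, level in echo_by_band.items()
    }


def eq_needed(gains):
    """Switch 33 can disable the EQ when no band needs adjusting."""
    return any(gain != 0.0 for gain in gains.values())
```

For example, `eq_band_gains({"250Hz": 0.5, "1kHz": 0.1})` cuts only the 250 Hz band, and `eq_needed` returns False when every band is left at 0 dB.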
[0098]
Although the sound environment determination unit 24 in FIG. 6 performs both the control of
the sound processing unit 25 and the control of the sound processing unit 26, a configuration
may be adopted in which a separate determination unit is provided for each.
[0099]
The operation flow of this configuration is the same as that shown in the first to third
embodiments, so the description of the operation flow is omitted.
[0100]
Further, when the processing as in this embodiment is performed, even if the voice input / output
device A of the communication partner does not have high-performance acoustic processing,
appropriate sound processing can be performed by the sound processing unit 16 on the receiving
side of the voice input / output device 11.
[0101]
[Fifth Embodiment] In the first to fourth embodiments, a transmitting / receiving terminal for
voice and a transmitting / receiving terminal for audio files are assumed, but the present
invention is not limited to this and can also be applied to a standalone terminal that does not
perform communication.
[0102]
FIG. 8 shows another configuration example of the voice input / output device according to the
present invention.
There is a music file reproduction apparatus (terminal) 41, which has a microphone 45 and a
speaker 44 outside.
An audio environment determination unit 46 is provided inside the terminal, and the audio input
by the microphone 45 and the audio output from the speaker 44 are input here.
The acoustic environment determination unit 46 determines the external acoustic environment
of the terminal, and the acoustic processing unit 43 performs control according to the external
acoustic environment.
[0103]
For example, as described in the fourth embodiment, control is performed to increase the volume
from the speaker 44 in an environment with a lot of noise, or to reduce the volume in an
environment with a lot of echo.
In addition, it is also possible to analyze the frequency components of the echo and use the
equalizer function to reduce the volume only in the frequency range that is likely to cause
echo. This can be applied, for example, when listening to music in a bathroom. In this way, the
reproduced sound can be controlled in accordance with the external sound environment.
[0104]
[Others] In the examples described in the first to fourth embodiments, one microphone and one
speaker are provided for the audio input / output device 1, but a plurality of microphones and
speakers may be connected to cope with multi-channel audio such as stereo. Also, not only voice
but also Hi-Fi audio may be used to determine the acoustic environment.
[0105]
Furthermore, external devices may be individually connected to the sound environment
determination unit 4, and the sound environment determination unit 4 may make the
determination using information other than the sound from the microphone, which is the sound
to be transmitted. For example, a camera or an infrared sensor may be used to output sound only
when a person is nearby, or a sonar may be used to grasp the size of the room and derive its
acoustic characteristics.
[0106]
In FIG. 1 and FIG. 4, the communication partner is a single site, the voice input / output
device A, but communication may be performed with a plurality of sites via a network.
[0107]
As described above, the present invention has a sound environment judging means which did not
exist in the prior art and uses it to control the sound processing means, so that sound
processed according to the sound environment in which the own terminal is placed can be sent to
the other terminal.
[0108]
[Effect of the Invention] The first effect is that variation in sound settings caused by human
factors can be eliminated.
The reason is that the presence or absence of sound processing and the strength of its effect
are not set at the discretion of a general user who does not know much about sound processing;
instead, the optimum sound processing is set by an algorithm.
[0109]
The second effect is that the processing load due to sound processing can be minimized.
The reason is that only necessary acoustic processing can be selected and implemented
according to the acoustic environment in which the own terminal is placed, and thus extra
processing can be eliminated.
[0110]
The third effect is that the convenience of the user can be improved. The reason is that since the
terminal automatically determines the implementation details of the sound processing at the start
of a call, etc., the user does not have to make an explicit selection and can concentrate on
the original purpose of the device.
[0111]
The fourth effect is that sound processing can be selected adaptively over time. The reason is
that since the necessary sound processing can follow changes over time in the sound
environment in which the own terminal is placed, communication can always be performed with a
speech quality suitable for the sound system at that time.
[0112]
The embodiments described above are examples of preferred embodiments of the present
invention. The present invention is not limited to them, and various modifications can be
made without departing from the scope of the invention.
[0113]
The present invention can be used for an apparatus, system, program, etc. that performs
real-time transmission / reception communication such as telephone, IP telephone, video
conference, and web conference. It can also be used for a music reproduction terminal, voice
output device, system, program, etc.
[0114]
FIG. 1 is a block diagram showing an embodiment of a voice input / output device according to
the present invention.
It is a figure which shows an example of the sound processing judgment part 4 and the sound
processing part 5 of the speech input / output device shown in FIG.
It is an example of the flow of the voice input / output method according to the present invention.
It is another example of the flow of the voice input / output method according to the present
invention.
It is a figure showing a configuration example of the voice input / output device according to
the present invention.
It is a figure which shows another configuration example of the voice input / output device
according to the present invention.
FIG. 7 is a diagram showing a detailed example of the sound processing determination unit 24
and the sound processing unit 25.
It is a figure which shows another configuration example of the voice input / output device
according to the present invention.
Explanation of reference signs
[0115]
1 audio transmission / reception terminal
2 codec transmission / reception processing etc.
3 microphone
4 acoustic environment judgment unit
5, 6 acoustic processing unit
7 speaker
A audio transmission / reception terminal