DESCRIPTION JP2014204318

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
Abstract: In video conference communication with a partner terminal connected via a network,
audio data can be smoothly transmitted and received even when the microphone is blocked. A
portable terminal device outputs voice data from the partner terminal connected via the network
from a speaker 12 and picks up, with a microphone 11, the voice output from the speaker 12. A
correlation operation unit 34 performs a correlation operation between the audio data output
from the speaker 12 and the audio data picked up by the microphone 11. A microphone closing
determination unit 35 uses the correlation calculation result of the correlation operation unit 34
to determine whether the microphone 11 is blocked. A warning unit 22 notifies the user that the
microphone 11 is blocked. [Selected figure] Figure 4
Mobile terminal device
[0001]
The present invention relates to a portable terminal device that is used, for example, for a
two-way video conference and that transmits and receives video data and audio data to and from a
partner terminal used by the other party to the video conference.
[0002]
In recent years, the use of portable terminal devices (for example, a smartphone or a tablet
terminal) including a camera and a microphone has been rapidly spreading.
11-04-2019
1
Such portable terminal devices are beginning to be used, for example, for video conferences
within companies. In a conventional video conference, the participants had to be present in a
dedicated conference room equipped with video conferencing equipment. By individually using
portable terminal devices connected to a network, however, participants can attend a video
conference from a remote location even when they are not in a dedicated conference room.
[0003]
As prior art that uses a portable terminal device for a video conference, the video and audio
information communication system disclosed in, for example, Patent Document 1 is known. In the
video and audio information communication system of Patent Document 1, when a video conference
participant is away from the video conference terminal device and cannot respond to, for example,
a video call or a voice call, the audio of the video conference is forwarded to the participant's
portable terminal device.
[0004]
Thus, in the video and audio information communication system of Patent Document 1, audio is
forwarded to the portable terminal device, so participants can take part in a video conference
even when they are away from the place where the video conference terminal is installed (for
example, a conference room).
[0005]
Patent Document 1: Japanese Unexamined Patent Application Publication No. 2002-335502
[0006]
When a portable terminal device is used for a video conference, a participant may unintentionally
block the microphone of the portable terminal device.
[0007]
In conventional portable terminal devices, including that of Patent Document 1, when the
participant unintentionally blocks the microphone of the portable terminal device, the
participant's speech is not correctly picked up by the microphone. As a result, the voice data of
the speech is not correctly transferred to the portable terminal device (the partner terminal)
used by the other participants in the video conference, and the speech is, for example, perceived
as noise.
[0008]
Therefore, when a conventional portable terminal device is used for a video conference and its
user unintentionally blocks the microphone with a hand, there is a problem that smooth
communication with the partner terminal in the video conference becomes difficult.
[0009]
In order to solve the above-mentioned conventional problem, the present invention aims to provide
a portable terminal device that smoothly transmits and receives audio data in video conference
communication with a partner terminal connected via a network, even when the microphone is
blocked.
[0010]
The present invention is a portable terminal device that communicates with a partner terminal
connected via a network, and includes: a voice output unit that outputs voice data from the
partner terminal; a sound collection unit that picks up the voice data output from the voice
output unit; a correlation operation unit that performs a correlation operation between the voice
data output by the voice output unit and the voice data picked up by the sound collection unit; a
closing determination unit that determines, using the correlation calculation result of the
correlation operation unit, whether the sound collection unit is blocked; and a notification unit
that notifies, using the determination result of the closing determination unit, that the sound
collection unit is blocked.
[0011]
Further, the present invention is a portable terminal device that communicates with a partner
terminal connected via a network, and includes: a voice output unit that outputs voice data from
the partner terminal; a sound collection unit that picks up sound around the portable terminal
device; a voice frequency characteristic calculation unit that computes frequency characteristics
of the voice data picked up by the sound collection unit; a closing determination unit that
determines, using the calculation result of the voice frequency characteristic calculation unit,
whether the sound collection unit is blocked; and a notification unit that notifies, using the
determination result of the closing determination unit, that the sound collection unit is
blocked.
[0012]
According to the present invention, it is possible to smoothly transmit and receive audio data
even in the case where the microphone is blocked in the video conference communication with
the other party terminal connected via the network.
[0013]
FIG. 1: (A) External view of the portable terminal device of the first embodiment; (B) explanatory
view showing a video conference using the portable terminal device of the first embodiment
FIG. 2: Explanatory views showing an operation outline of the portable terminal device of the
first embodiment; (A) state in which a finger does not block the microphone; (B) state in which a
finger blocks the microphone
FIG. 3: Block diagram showing the system configuration of a video conference system using the
portable terminal device of the first embodiment
FIG. 4: Block diagram showing the internal configuration of the voice processing unit of the
portable terminal device of the first embodiment
FIG. 5: Graph showing an example of the correlation calculation result when the microphone is not
blocked
FIG. 6: Graph showing an example of the correlation calculation result when the microphone is
blocked
FIG. 7: Flowchart explaining the operation procedure of the portable terminal device of the first
embodiment
Block diagram showing the internal configuration of the voice processing unit of the portable
terminal device of the second embodiment
Graph showing an example of the frequency characteristics of the ambient volume level
Flowchart explaining the operation procedure of the portable terminal device of the second
embodiment
[0014]
Hereinafter, each embodiment of a portable terminal device according to the present invention
will be described with reference to the drawings.
The mobile terminal device of each embodiment is, for example, a mobile phone, a smartphone, a
tablet terminal, or a PDA (Personal Digital Assistant).
[0015]
In the following embodiments, as an example of usage of the mobile terminal device according to
the present invention, a situation in which the mobile terminal device is used in a video
conference in a company will be described.
That is, in the video conference, for example, participants of the video conference individually
use the portable terminal devices of the respective embodiments.
However, the usage method of the portable terminal device of each embodiment is not limited to
the usage form in each of the following embodiments.
[0016]
The present invention can also be expressed as an audio processing method having each
operation (step) performed by the mobile terminal device.
Furthermore, the present invention may be expressed as a program that causes the mobile terminal
device, acting as a computer, to execute each operation (step) by means of a processor (for
example, a central processing unit (CPU), a micro processing unit (MPU), or a digital signal
processor (DSP)) incorporated in the mobile terminal device.
[0017]
First Embodiment
First, before describing the detailed internal configuration of the mobile terminal device 10
according to the first embodiment, an outline of the operation of the mobile terminal device 10
will be briefly described with reference to FIGS. 1A and 1B.
[0018]
FIG. 1A is an external view of the mobile terminal device 10 according to the first embodiment.
FIG. 1 (B) is an explanatory view showing a state of a video conference using the mobile terminal
device 10 of the first embodiment.
The mobile terminal device 10 and the other mobile terminal devices used by the other
participants of the video conference (hereinafter referred to as “video conference partner
terminal”) are connected to each other via the network NW.
[0019]
The portable terminal device 10 includes at least a microphone 11 that picks up the voice spoken
by the user of the mobile terminal device 10 (hereinafter also simply referred to as “user”), a
speaker 12 that outputs the voice spoken by another participant who uses the video conference
partner terminal 50, a camera 13 that images the user, and a display 14 that displays video data
(including image data; the same applies hereinafter) transmitted from the video conference
partner terminal 50 (see FIG. 1A).
In the mobile terminal device 10, the arrangement positions of the microphone 11, the speaker
12, the camera 13, and the display 14 are not limited to the arrangement positions shown in FIG.
[0020]
The display 14 displays video data of other participants participating in the video conference (see
FIG. 1B).
The microphone 11 picks up the voice of the speech content (for example, "Thank you") spoken
by the user UA.
The speaker 12 outputs the voice of the speech (for example, "You're welcome") spoken by the
other participants in the video conference.
Thereby, the user participating in the video conference and the other participants in the same
video conference can communicate smoothly using the portable terminal device 10 and the video
conference partner terminal 50.
[0021]
FIGS. 2A and 2B are explanatory views showing an operation outline of the mobile terminal
device 10 of the first embodiment.
FIG. 2A is a diagram showing a state in which the finger FG does not block the microphone 11.
FIG. 2B is a diagram showing a state in which the finger FG blocks the microphone 11.
[0022]
For example, it is assumed that the user of the mobile terminal device 10 grips, with the hand
TH, the rear surface of the housing of the mobile terminal device 10, that is, the surface
opposite to the front surface of the housing where the display 14 is disposed (see FIG. 2A). In
this case, the microphone 11 of the portable terminal device 10 is not blocked by, for example,
the finger FG of the user, and can accurately pick up the voice of the speech spoken by the user.
Therefore, the user can communicate smoothly with other participants in the video conference.
[0023]
On the other hand, it is assumed that the user of the mobile terminal device 10 blocks the
microphone 11 of the mobile terminal device 10 with, for example, a finger FG (see FIG. 2B). In
this case, it is difficult for the microphone 11 of the mobile terminal device 10 to accurately
pick up the voice of the speech spoken by the user. Therefore, it is difficult for the user to
communicate smoothly with other participants in the video conference.
[0024]
Therefore, when, for example, the finger FG of the user blocks the microphone 11 of the mobile
terminal device 10 (see FIG. 2B), the mobile terminal device 10 issues a warning message ALT
indicating that the microphone 11 is blocked. For example, the mobile terminal device 10 displays
the warning message ALT on the display 14.
[0025]
Next, the detailed internal configuration of the mobile terminal device 10 of the present
embodiment will be described with reference to FIGS. 3 and 4. FIG. 3 is a block diagram showing
the system configuration of the video conference system 100 using the mobile terminal device
10 of the first embodiment. FIG. 4 is a block diagram showing an internal configuration of the
audio processing unit 15 of the mobile terminal device 10 of the first embodiment.
[0026]
The mobile terminal device 10 shown in FIG. 3 includes a microphone 11, a speaker 12, a camera
13, a display 14, an audio processing unit 15, an audio encoder 16, an audio decoder 17, a
communication unit 18, a video encoder 19, a video decoder 20, a video processing unit 21, and a
warning unit 22. The portable terminal device 10 and the teleconference counterpart
terminal 50 are connected via the network NW, and mutually transmit and receive video data
and audio data in a teleconference.
[0027]
While the user of the mobile terminal device 10 is speaking, the microphone 11, as an example of
the sound collection unit, picks up the voice spoken by the user and the surrounding sound (for
example, environmental sound or noise). While the user of the mobile terminal device 10 is not
speaking, it picks up the surrounding sound. The collected sound is converted into an electric
signal (voice signal) by the microphone 11 and input to the voice processing unit 15.
[0028]
The speaker 12 as an example of the voice output unit outputs voice spoken by other
participants who use the video conference call partner terminal 50.
[0029]
The camera 13 captures an image of the user of the mobile terminal device 10.
The captured video is converted by the camera 13 into video data which can be subjected to
predetermined signal processing, and is input to the video processing unit 21.
[0030]
The display 14 displays video data transmitted from the video conference callee terminal 50, that
is, video data representing the status of the video conference (for example, the status in which
another participant is talking). Further, the display 14 displays a warning message ALT (see FIG.
2B) output from the video processing unit 21.
[0031]
The audio processing unit 15 performs predetermined audio processing on the audio signal
output from the microphone 11, and outputs an audio signal after predetermined audio
processing (hereinafter referred to as “microphone signal”) to the audio encoder 16. Further,
the audio processing unit 15 outputs a speaker signal (see later) output from the audio decoder
17 to the speaker 12.
[0032]
Furthermore, based on the microphone signal output from the microphone 11, the audio
processing unit 15 determines whether the microphone 11 is in a state of being blocked by, for
example, a finger FG. The voice processing unit 15 outputs the determination result as to
whether or not the microphone 11 is closed to the warning unit 22. The detailed internal
configuration of the audio processing unit 15 will be described later with reference to FIG.
[0033]
The audio encoder 16 encodes the microphone signal output from the audio processing unit 15,
and outputs the encoded microphone signal to the communication unit 18. Note that the method
of encoding processing in the audio encoder 16 and the contents thereof are known in the
portable terminal device 10 and the teleconference counterpart terminal 50.
[0034]
The audio decoder 17 decodes the audio signal output from the communication unit 18, and
outputs the audio signal after the decoding process (hereinafter, referred to as “speaker
signal”) to the audio processing unit 15. Note that the method of decoding processing in the
audio decoder 17 and the contents thereof are known in the portable terminal device 10 and the
teleconference counterpart terminal 50.
[0035]
The communication unit 18 multiplexes the encoded microphone signal output from the audio
encoder 16 and the encoded video signal output from the video encoder 19, converts the
multiplexed signal into, for example, a signal of a predetermined frequency band for
communication, and transmits it to the teleconference counterpart terminal 50.
[0036]
The communication unit 18 receives, for example, a signal of a predetermined frequency band
transmitted from the teleconference counterpart terminal 50, separates the received signal into
an audio signal and a video signal, outputs the audio signal to the audio decoder 17, and outputs
the video signal to the video decoder 20.
[0037]
The video encoder 19 encodes the video signal output from the video processing unit 21 and
outputs the encoded video signal to the communication unit 18.
Note that the method of encoding processing in the video encoder 19 and the contents thereof
are known in the portable terminal device 10 and the teleconference counterpart terminal 50.
[0038]
The video decoder 20 decodes the video signal output from the communication unit 18 and
outputs the video signal after the decoding process to the video processing unit 21.
Note that the method of decoding processing in the video decoder 20 and the contents thereof
are known in the portable terminal device 10 and the teleconference counterpart terminal 50.
[0039]
The video processing unit 21 performs predetermined video processing on the video signal
output from the camera 13, and outputs the video signal after the predetermined video
processing to the video encoder 19. Also, the video processing unit 21 causes the display 14 to
display the video signal output from the video decoder 20.
[0040]
Furthermore, according to the display instruction data output from the warning unit 22, the
video processing unit 21 displays on the display 14 a warning message ALT (see FIG. 2B) for
causing the user to recognize that the microphone 11 is blocked. The image data of the warning
message ALT is held in advance by the video processing unit 21 or the warning unit 22, but it may
instead be data that the video processing unit 21 or the warning unit 22 reads from a storage
device not shown in FIG. 3 (for example, a memory or a hard disk drive).
[0041]
The warning unit 22, as an example of the notification unit, generates display instruction data
for displaying a warning message ALT on the display 14 according to the determination result
output from the voice processing unit 15, that is, the determination result as to whether the
microphone 11 is blocked, and outputs the display instruction data to the video processing unit
21. That is, when it is determined that the microphone 11 is blocked by, for example, the finger
FG, the warning unit 22 generates display instruction data for displaying the warning message ALT
on the display 14 and outputs it to the video processing unit 21.
[0042]
In each embodiment including the present embodiment, the warning unit 22 displays a warning
message ALT on the display 14 in order to notify the user that the microphone 11 is blocked by,
for example, the finger FG of the user. However, the notification may also be made as follows.
[0043]
For example, the warning unit 22 may vibrate a vibration operation unit (not shown) provided in
the portable terminal device 10 with a vibration pattern corresponding to the warning message
ALT, may light or blink an LED (Light Emitting Diode) (not shown) provided in the portable
terminal device 10 with a lighting pattern corresponding to the warning message ALT, or may
output a sound from the speaker 12 with a sound pattern corresponding to the warning message ALT.
[0044]
The network NW may be a wireless network or a wired network.
[0045]
The teleconference counterpart terminal 50 has the same configuration as that of the mobile
terminal device 10, so the description of the configuration and the operation will be omitted.
[0046]
Here, the internal configuration of the audio processing unit 15 will be described in detail with
reference to FIG.
The voice processing unit 15 shown in FIG. 4 includes an ADC (Analog Digital Converter) 31, a
DAC (Digital Analog Converter) 32, a voice detection unit 33, a correlation operation unit 34, and
a microphone closing determination unit 35.
[0047]
The ADC 31 AD converts an analog audio signal output from the microphone 11 into a digital
audio signal at a predetermined sampling frequency (for example, 8 kHz).
The digital audio signal (microphone signal) AD-converted by the ADC 31 is input to the audio
encoder 16 and the correlation operation unit 34.
[0048]
The DAC 32 DA-converts the digital speaker signal output from the audio decoder 17 into an
analog speaker signal.
The analog speaker signal DA converted by the DAC 32 is output as sound at the speaker 12.
The digital speaker signal output from the audio decoder 17 is input to the DAC 32, the audio
detection unit 33, and the correlation operation unit 34.
[0049]
The sound detection unit 33 detects, among the speaker signals output from the audio decoder 17,
a speaker signal whose volume level exceeds a predetermined first threshold TH1. The first
threshold TH1 is, for example, a value smaller by a predetermined amount than the average volume
level of the microphone signal obtained when the voice spoken by another participant using the
video conference partner terminal 50 is output from the speaker 12 and picked up by the
microphone 11. Therefore, when the volume level of the speaker signal exceeds the first threshold
TH1, it can be assumed that the voice spoken by the other participant who uses the teleconference
counterpart terminal 50 is being output from the speaker 12.
[0050]
When the sound detection unit 33 detects a speaker signal whose sound volume level exceeds
the first threshold TH1, the sound detection unit 33 outputs, to the correlation operation unit 34,
a detection signal indicating that a speaker signal whose sound volume level exceeds the first
threshold TH1 is detected.
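The detection described in [0049] and [0050] can be sketched as follows. This is only a minimal illustration: the source does not define how the volume level is measured, so the RMS level in dB relative to a full scale of 1.0, the function names, and the value of TH1_DB are all assumptions made for this example.

```python
import math

def volume_level_db(frame):
    """Return the RMS level of a frame in dB relative to full scale 1.0 (assumed measure)."""
    if not frame:
        return float("-inf")
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20.0 * math.log10(rms) if rms > 0.0 else float("-inf")

def detect_speaker_voice(speaker_frame, th1_db):
    """Return True when the speaker-signal level exceeds the first threshold TH1."""
    return volume_level_db(speaker_frame) > th1_db

# Hypothetical frames: a 440 Hz tone sampled at 8 kHz, and silence.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(256)]
silence = [0.0] * 256
TH1_DB = -30.0  # assumed threshold, not taken from the source

print(detect_speaker_voice(tone, TH1_DB))     # True
print(detect_speaker_voice(silence, TH1_DB))  # False
```

A True result plays the role of the detection signal that the sound detection unit 33 sends to the correlation operation unit 34.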
[0051]
The correlation operation unit 34 receives the digital microphone signal output from the ADC 31
and the digital speaker signal output from the audio decoder 17.
When the correlation operation unit 34 receives from the sound detection unit 33 a detection
signal indicating that a speaker signal whose volume level exceeds the first threshold TH1 has
been detected, and the number of input sample values of the digital microphone signal output from
the ADC 31 and of the digital speaker signal output from the audio decoder 17 reaches a
predetermined number (for example, 256), the correlation operation unit 34 performs the
correlation operation between the microphone signal and the speaker signal (see equation (1)).
The correlation operation unit 34 outputs the result of the correlation operation between the
microphone signal and the speaker signal to the microphone closing determination unit 35.
[0052]
[0053]
In Equation (1), c(τ) is the correlation calculation result (correlation value) between the
microphone signal and the speaker signal, N is the predetermined number of samples (for example,
256) required to perform the correlation calculation between the microphone signal and the
speaker signal once, t is time, τ is a delay time with respect to time t, s(τ−t) is the speaker
signal delayed by τ from time t, and m(t) is the microphone signal at time t.
Note that mod N indicates that the correlation calculation is performed periodically for every N
sample values of the microphone signal and the speaker signal.
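Equation (1) itself is not reproduced in this text, so the following is only a sketch of a correlation consistent with the definitions above. It assumes an unnormalized circular cross-correlation, c(τ) = Σ m(t)·s((t − τ) mod N) summed over t = 0 … N−1, with the speaker index written as t − τ to match the description "delayed by τ from time t". The exact form and any normalization in the source may differ, and the function name is hypothetical.

```python
def circular_correlation(mic, spk):
    """Return c[tau] = sum over t of mic[t] * spk[(t - tau) mod N], for tau = 0..N-1.

    Assumed form of the correlation; the source equation is not available here.
    """
    n = len(mic)
    if n != len(spk):
        raise ValueError("mic and spk must have the same number of samples")
    return [sum(mic[t] * spk[(t - tau) % n] for t in range(n)) for tau in range(n)]

# Hypothetical check: the microphone hears the speaker impulse 3 samples late,
# attenuated to half its amplitude.
spk = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
mic = [0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]
c = circular_correlation(mic, spk)
print(c.index(max(c)))  # 3
```

The index of the largest value corresponds to the delay at which the speaker sound reaches the microphone. The direct double loop costs N² multiplications per cycle, which is modest for N = 256.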
[0054]
For example, if the sampling frequency of the ADC 31 is 8 kHz and the correlation operation unit
34 performs the correlation operation for every 256 input sample values of the microphone signal
and the speaker signal, the correlation operation unit 34 performs the correlation calculation
every 32 msec (= 256/8). The correlation operation cycle in the correlation operation unit 34 is
not limited to 32 msec, and may be changed as appropriate according to the sampling frequency of
the ADC 31 and the number of input sample values of the microphone signal and the speaker signal.
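The cycle length follows directly from the number of samples and the sampling frequency. The short sketch below reproduces the 32 msec figure; the function name and the second parameter pair (512 samples at 16 kHz) are hypothetical and serve only to show that other combinations can yield the same cycle.

```python
def correlation_cycle_ms(num_samples, sampling_hz):
    """Return the time needed to collect num_samples at sampling_hz, in milliseconds."""
    return 1000.0 * num_samples / sampling_hz

print(correlation_cycle_ms(256, 8000))   # 32.0
print(correlation_cycle_ms(512, 16000))  # 32.0
```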
[0055]
The microphone closing determination unit 35 determines, for each correlation calculation cycle,
whether the microphone 11 is blocked based on the correlation calculation result (correlation
value) of the correlation operation unit 34. When it determines that the microphone 11 is
blocked, it outputs a determination result indicating that the microphone 11 is blocked to the
warning unit 22.
[0056]
Specifically, when the correlation calculation result (correlation value) of the correlation
operation unit 34 for a correlation calculation cycle exceeds the predetermined second threshold
TH2, the microphone closing determination unit 35 determines that the microphone 11 is not
blocked. When the correlation value does not exceed the predetermined second threshold TH2, it
determines that the microphone 11 is blocked.
The second threshold TH2 is a threshold for determining whether or not the microphone 11 is
closed, and is a predetermined value.
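The decision rule in [0056] can be sketched as below. The function name and the value of TH2 are assumptions for illustration; the source states only that TH2 is a predetermined value.

```python
def is_microphone_blocked(correlation_values, th2):
    """Return True when no correlation value in the cycle exceeds the second threshold TH2."""
    peak = max(correlation_values)
    return not (peak > th2)

TH2 = 0.2  # assumed value; the source only says TH2 is predetermined

print(is_microphone_blocked([0.01, 0.45, 0.03], TH2))  # False
print(is_microphone_blocked([0.01, 0.02, 0.03], TH2))  # True
```

In the first call a peak of 0.45 exceeds TH2, so the microphone is treated as open. In the second call no value exceeds TH2, so it is treated as blocked.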
[0057]
FIG. 5 is a graph showing an example of the correlation calculation result when the microphone
11 is not blocked. FIG. 6 is a graph showing an example of the correlation calculation result
when the microphone 11 is closed. The horizontal axes of FIGS. 5 and 6 represent delay time τ,
and the vertical axes of FIGS. 5 and 6 represent correlation values.
[0058]
In a state where the microphone 11 is not blocked, it is considered that part of the audio signal
output from the speaker 12 wraps around and is readily picked up by the microphone 11.
Therefore, the correlation value between the microphone signal and the speaker signal has a peak
value (peak correlation value) exceeding the second threshold TH2 in the correlation operation
cycle (see FIG. 5).
[0059]
On the other hand, when the microphone 11 is closed by, for example, the finger FG of the user,
it is considered that a part of the audio signal output from the speaker 12 is hardly picked up by
the microphone 11. Therefore, the correlation value between the microphone signal and the
speaker signal does not have a peak value (peak correlation value) exceeding the second
threshold TH2 in the correlation operation cycle (see FIG. 6).
[0060]
Next, the operation procedure of the mobile terminal device 10 of the present embodiment will
be described with reference to FIG. FIG. 7 is a flowchart for explaining the operation procedure of
the mobile terminal device 10 according to the first embodiment.
[0061]
In FIG. 7, the sound detection unit 33 determines whether a speaker signal whose volume level
exceeds the first threshold TH1 is present among the speaker signals output from the audio
decoder 17 (S11). When the sound detection unit 33 detects a speaker signal whose volume level
exceeds the first threshold TH1 (S11, YES), it outputs to the correlation operation unit 34 a
detection signal indicating that such a speaker signal has been detected. When no speaker signal
whose volume level exceeds the first threshold TH1 is detected (S11, NO), the operation of the
mobile terminal device 10 shown in FIG. 7 ends.
[0062]
When the number of input sample values of the digital microphone signal output from the ADC 31
and of the digital speaker signal output from the audio decoder 17 reaches a predetermined number
(for example, 256), the correlation operation unit 34 performs a correlation operation between
the microphone signal and the speaker signal (S12). The correlation operation unit 34 outputs the
result of the correlation operation between the microphone signal and the speaker signal to the
microphone closing determination unit 35.
[0063]
The microphone closing determination unit 35 detects a peak value (peak correlation value)
among the correlation calculation results of step S12 (S13), and determines whether the detected
peak value exceeds the second threshold TH2 (S14). If the peak value detected in step S13 exceeds
the second threshold TH2 (S14, YES), the microphone closing determination unit 35 determines that
the microphone 11 is not blocked, and the operation of the portable terminal device 10 shown in
FIG. 7 ends.
[0064]
On the other hand, when the peak value detected in step S13 does not exceed the second threshold
TH2 (S14, NO), the microphone closing determination unit 35 determines that the microphone 11 is
blocked by, for example, the finger FG of the user, and outputs to the warning unit 22 a
determination result indicating that the microphone 11 is blocked.
[0065]
The warning unit 22 generates display instruction data for displaying a warning message ALT on
the display 14 according to the determination result that the microphone 11 is in a closed state,
and outputs the display instruction data to the video processing unit 21.
According to the display instruction data output from the warning unit 22, the video processing
unit 21 displays on the display 14 the warning message ALT (see FIG. 2B) for causing the user to
recognize that the microphone 11 is blocked (S15). Thereby, the operation of the mobile terminal
device 10 shown in FIG. 7 ends.
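Steps S11 to S15 of FIG. 7 can be put together in a short sketch. Everything here is illustrative: the linear RMS measure of volume level, the circular form of the correlation, the threshold values TH1 and TH2, the function names, and the return labels 'idle', 'ok', and 'warn' are assumptions, not details given in the source.

```python
import math

def rms(frame):
    """Root-mean-square level of a frame on a linear scale (assumed volume measure)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def peak_correlation(mic, spk):
    """Peak of the circular cross-correlation between mic and spk (S12, S13)."""
    n = len(mic)
    return max(
        sum(mic[t] * spk[(t - tau) % n] for t in range(n)) for tau in range(n)
    )

def process_cycle(mic, spk, th1, th2):
    """Run S11 to S15 once and return 'idle', 'ok', or 'warn'."""
    if not rms(spk) > th1:                # S11 NO: no far-end voice is being output
        return "idle"
    if peak_correlation(mic, spk) > th2:  # S14 YES: microphone judged not blocked
        return "ok"
    return "warn"                         # S14 NO, then S15: display warning ALT

N = 256
spk = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(N)]
mic_open = [0.3 * spk[(t - 5) % N] for t in range(N)]       # echo reaches the mic
mic_blocked = [0.001 * spk[(t - 5) % N] for t in range(N)]  # echo almost absent
TH1, TH2 = 0.1, 1.0  # assumed thresholds

print(process_cycle(mic_open, spk, TH1, TH2))        # ok
print(process_cycle(mic_blocked, spk, TH1, TH2))     # warn
print(process_cycle(mic_open, [0.0] * N, TH1, TH2))  # idle
```

The 'idle' case reflects that the determination is attempted only while the speaker is outputting far-end voice, since without that reference signal a low correlation says nothing about the microphone.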
[0066]
As described above, the mobile terminal device 10 of the present embodiment performs, in every
correlation calculation cycle, a correlation operation between the audio signal output from the
speaker 12 (speaker signal) and the audio signal picked up by the microphone 11 (microphone
signal). Using the correlation calculation result for each correlation calculation cycle, the
portable terminal device 10 can easily determine that the microphone 11 is blocked by, for
example, the user's finger FG, and can further notify the user that the microphone 11 is blocked.
[0067]
Therefore, in video conference communication with the video conference partner terminal 50
connected via the network, when the microphone 11 is blocked by, for example, the user's finger
FG, the portable terminal device 10 can prompt the user to remove the finger FG blocking the
microphone 11. After the user removes the finger FG, transmission and reception of audio data
with the video conference partner terminal 50 can be smoothly performed.
[0068]
In the first embodiment, the microphone blocking determination unit 35 detects a peak value
from the correlation calculation result in the correlation calculation unit 34 for each correlation
calculation cycle, and determines whether or not the second threshold TH2 is exceeded.
However, it is also possible to detect a peak value from the correlation calculation result in which
correlation calculation is performed over a plurality of correlation calculation cycles, and to
determine whether or not the second threshold TH2 is exceeded.
As a result, the microphone blocking determination unit 35 can suppress the influence of
ambient environmental noise (noise) and easily determine whether the microphone 11 is
blocked.
[0069]
Further, the microphone blocking determination unit 35 may add up the correlation calculation
results obtained by the correlation calculation unit 34 in each correlation calculation cycle a
predetermined Np times (Np is an integer of 2 or more), detect the peak value from the
correlation calculation result after the Np additions, and determine whether the second
threshold TH2 is exceeded. Thus, the microphone blocking determination unit 35 can suppress the
influence of ambient sound (noise) and determine with high accuracy whether or not the
microphone 11 is blocked.
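The accumulation over Np cycles can be illustrated with a minimal Python sketch. The text does
not say how the correlation itself is computed, so the direct time-domain correlation, the lag
range `max_lag`, and all function names below are assumptions made for illustration only. The
sketch reports only whether the accumulated peak exceeds TH2 and leaves the interpretation of
that comparison to the caller.

```python
from typing import List, Sequence, Tuple


def cross_correlation(speaker: Sequence[float], mic: Sequence[float],
                      max_lag: int) -> List[float]:
    """Correlate the mic signal against the speaker signal for lags 0..max_lag."""
    result = []
    for lag in range(max_lag + 1):
        # The microphone is assumed to hear the speaker signal `lag` samples late.
        acc = sum(speaker[i] * mic[i + lag]
                  for i in range(len(speaker)) if i + lag < len(mic))
        result.append(acc)
    return result


def peak_over_cycles(cycles: Sequence[Tuple[Sequence[float], Sequence[float]]],
                     max_lag: int, np_cycles: int) -> float:
    """Add the per-cycle correlation results Np times and return the peak of the sum."""
    if np_cycles < 2 or len(cycles) < np_cycles:
        raise ValueError("Np must be at least 2 and enough cycles must be supplied")
    summed = [0.0] * (max_lag + 1)
    for speaker, mic in cycles[:np_cycles]:
        for lag, value in enumerate(cross_correlation(speaker, mic, max_lag)):
            summed[lag] += value
    return max(abs(v) for v in summed)


def exceeds_th2(peak: float, th2: float) -> bool:
    """Comparison of the accumulated peak with the second threshold TH2."""
    return peak > th2
```

Because the speaker component adds coherently across cycles while uncorrelated ambient noise
tends to average out, the peak of the summed result stands out more clearly than the peak of a
single cycle.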
[0070]
Second Embodiment
In the first embodiment, depending on the arrangement of the microphone 11 and the speaker 12
in the housing of the portable terminal device 10, the audio signal (speaker signal) output from
the speaker 12 may leak into the microphone 11 even when, for example, the user's finger FG
blocks the microphone 11. In this case, the peak value of the correlation calculation result of
the correlation calculation unit 34 may exceed the second threshold TH2 even though the
microphone 11 is blocked, so that whether or not the microphone 11 is blocked may be erroneously
determined.
[0071]
Therefore, in the second embodiment, the portable terminal device 10A does not use the
correlation value between the microphone signal and the speaker signal. Instead, it calculates
the respective frequency characteristics of the noise (environmental sound) picked up by the
microphone 11 and of the voice signal (microphone signal) containing the utterance spoken by the
user of the portable terminal device 10A, and determines, based on these frequency
characteristics, whether or not the microphone 11 is in a blocked state.
[0072]
In the mobile terminal device 10A of the present embodiment, the configuration and operation of
each component other than the voice processing unit 15A are the same as those of the mobile
terminal device 10 of the first embodiment. The description of the configuration and operation
of each component other than the voice processing unit 15A is therefore omitted or simplified,
and only the differences are described.
Further, in the description of the audio processing unit 15A shown in FIG. 8, only the contents
that differ from the configuration and operation of each unit of the audio processing unit 15
shown in FIG. 4 are described. Components having the same configuration and operation are given
the same reference numerals, and their explanation is omitted or simplified.
[0073]
FIG. 8 is a block diagram showing an internal configuration of the audio processing unit 15A of
the mobile terminal device 10A of the second embodiment. The audio processing unit 15A shown
in FIG. 8 includes an ADC 31, a DAC 32, an audio detection unit 33A, a noise frequency
characteristic calculation unit 41, an audio frequency characteristic calculation unit 42, and a
microphone closing determination unit 35A.
[0074]
The voice detection unit 33A detects a microphone signal having a volume level exceeding a
predetermined third threshold TH3 among the microphone signals output from the microphone
11 every predetermined voice detection period. The third threshold TH3 is, for example, a value
smaller by a predetermined amount than the average value of the volume levels of the
microphone signals collected by the microphone 11 as the utterance content spoken by the user
of the mobile terminal device 10A. Therefore, when the volume level of the microphone signal
exceeds the third threshold TH3, the voice spoken by the user of the mobile terminal device 10A
is collected by the microphone 11.
[0075]
When the voice detection unit 33A detects a microphone signal having a volume level exceeding
the third threshold TH3, it outputs, to the voice frequency characteristic calculation unit 42,
a detection signal indicating that such a microphone signal has been detected. When the voice
detection unit 33A does not detect a microphone signal whose volume level exceeds the third
threshold TH3, it outputs, to the noise frequency characteristic calculation unit 41, a
non-detection signal indicating that no such microphone signal has been detected.
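The detection and routing described above can be sketched as follows. The text does not define
how the volume level is measured, so the RMS level in dB, the margin value, and the function
names are assumptions for illustration, not the method of the embodiment.

```python
import math
from typing import Sequence


def volume_level_db(samples: Sequence[float]) -> float:
    """RMS level of one voice detection period in dB (assumed measure of volume level)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    # Clamp to avoid log10(0) for an all-zero period.
    return 10.0 * math.log10(max(mean_square, 1e-12))


def third_threshold(average_speech_level_db: float, margin_db: float) -> float:
    """TH3: a value smaller by a predetermined amount than the average speech level."""
    return average_speech_level_db - margin_db


def route_period(samples: Sequence[float], th3_db: float) -> str:
    """Return 'voice' (to unit 42) when the level exceeds TH3, else 'noise' (to unit 41)."""
    return "voice" if volume_level_db(samples) > th3_db else "noise"
```

A period routed as "voice" corresponds to the detection signal sent to the voice frequency
characteristic calculation unit 42, and a period routed as "noise" corresponds to the
non-detection signal sent to the noise frequency characteristic calculation unit 41.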
[0076]
When the noise frequency characteristic calculation unit 41 receives from the voice detection
unit 33A a non-detection signal indicating that a microphone signal whose volume level exceeds
the third threshold TH3 was not detected in the voice detection period, it calculates the
frequency characteristic of the volume level of ambient sound (noise) around the mobile terminal
device 10A in that voice detection period. The noise frequency characteristic calculation unit
41 may also calculate the frequency characteristic of the volume level of ambient sound (noise)
around the portable terminal device 10A in a plurality of voice detection periods and calculate
the average value of those frequency characteristics.
[0077]
Based on the frequency characteristic of the volume level of ambient sound (noise) around the
portable terminal device 10A in the voice detection period, or the average value thereof, the
noise frequency characteristic calculation unit 41 calculates a fourth threshold TH4 (described
later) for determining whether or not the microphone 11 is blocked, and outputs it to the
microphone closing determination unit 35A. For example, the noise frequency characteristic
calculation unit 41 uses, as the fourth threshold TH4, the same value as the noise frequency
characteristic of the volume level that it has calculated, but it may instead use, for example,
a value reduced by a predetermined amount from that noise frequency characteristic.
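As a rough illustration of how a noise frequency characteristic and TH4 could be derived, the
following Python sketch uses a plain discrete Fourier transform. The text does not specify the
transform, the averaging domain, or the size of the predetermined amount, so the naive DFT, the
averaging of dB values, `offset_db`, and the function names are all assumptions.

```python
import cmath
import math
from typing import List, Sequence


def level_spectrum_db(frame: Sequence[float]) -> List[float]:
    """Volume level [dB] per DFT bin from 0 Hz up to the Nyquist frequency."""
    n = len(frame)
    levels = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        power = (abs(coeff) / n) ** 2
        # Clamp to avoid log10(0) for empty bins.
        levels.append(10.0 * math.log10(max(power, 1e-12)))
    return levels


def average_spectrum(spectra: Sequence[Sequence[float]]) -> List[float]:
    """Average the frequency characteristics of several voice detection periods."""
    count = len(spectra)
    return [sum(bins) / count for bins in zip(*spectra)]


def fourth_threshold(noise_spectrum_db: Sequence[float],
                     offset_db: float = 0.0) -> List[float]:
    """TH4 per bin: the noise characteristic itself, or reduced by offset_db."""
    return [level - offset_db for level in noise_spectrum_db]
```

With `offset_db` left at 0.0 the threshold equals the noise frequency characteristic, which
matches the default case described above.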
[0078]
When the voice frequency characteristic calculation unit 42 receives from the voice detection
unit 33A a detection signal indicating that a microphone signal whose volume level exceeds the
third threshold TH3 was detected in the voice detection period, it calculates the frequency
characteristic of the volume level of the audio signal (microphone signal) that includes the
utterance spoken by the user of the portable terminal device 10A in that voice detection period.
[0079]
Note that the audio frequency characteristic calculation unit 42 may calculate the frequency
characteristic of the volume level of the audio signal (microphone signal) including the
utterance spoken by the user of the mobile terminal device 10A in a plurality of audio detection
periods and calculate the average value of those frequency characteristics.
The audio frequency characteristic calculation unit 42 outputs the frequency characteristic of
the volume level of the audio signal (microphone signal) including the utterance spoken by the
user of the mobile terminal device 10A in the audio detection period, or the average value
thereof, to the microphone blocking determination unit 35A.
[0080]
Based on the fourth threshold TH4 output from the noise frequency characteristic calculation
unit 41 and the frequency characteristic of the volume level of the microphone signal, or the
average value thereof, output from the audio frequency characteristic calculation unit 42, the
microphone blocking determination unit 35A determines whether or not the microphone 11 is in a
closed state. When it determines that the microphone 11 is in a closed state, it outputs a
determination result to that effect to the warning unit 22. The operation of the warning unit 22
is the same as in the first embodiment, and thus the description thereof is omitted.
[0081]
Specifically, the microphone blockage determination unit 35A determines that the microphone 11
is not blocked when the frequency characteristic of the volume level of the microphone signal in
a predetermined band (for example, a high band described later), or the average value thereof,
exceeds the fourth threshold TH4. It determines that the microphone 11 is blocked when the
frequency characteristic in the same predetermined band, or the average value thereof, is
substantially the same as the fourth threshold TH4.
[0082]
FIG. 9 is a graph showing an example of the frequency characteristics of the volume levels of
noise and voice.
The horizontal axis of FIG. 9 represents frequency [Hz], and the vertical axis of FIG. 9
represents volume level [dB]. In FIG. 9, the solid line represents the frequency characteristic
of the volume level of noise, the dotted line represents the frequency characteristic of the
volume level of the microphone signal in a state in which the microphone 11 is not blocked, and
the dashed-dotted line represents the frequency characteristic of the volume level of the
microphone signal in a state in which the microphone 11 is blocked. For example, the frequency
characteristic of the volume level of noise (see the solid line) is used as the fourth threshold
TH4. In the present embodiment, the sampling frequency of the ADC 31 is 16 kHz.
[0083]
When the microphone 11 of the mobile terminal device 10A is blocked by, for example, the finger
FG of the user, the volume level of the microphone signal decreases in the high band (for
example, 6 kHz to 8 kHz) and increases in the low band (for example, 0 kHz to 2 kHz) (see the
dashed-dotted line and the dotted line shown in FIG. 9).
[0084]
Further, in a state in which the microphone 11 of the mobile terminal device 10A is closed by, for
example, the finger FG of the user, the frequency characteristic of the volume level of the
microphone signal becomes substantially the same as the frequency characteristic of the volume
level of noise in the high frequency band.
That is, in a state in which the microphone 11 of the mobile terminal device 10A is closed by,
for example, the finger FG of the user, the difference between the frequency characteristic of
the volume level of the microphone signal and the frequency characteristic of the volume level
of noise is only on the order of a predetermined ratio (for example, several percent).
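The high-band comparison can be sketched in Python as follows. The text gives "several percent"
only as an example and does not say what the ratio is taken relative to, so the default ratio of
0.05, its application to the magnitude of the threshold in dB, the evenly spaced bins from 0 Hz
to the Nyquist frequency, and the function names are all assumptions.

```python
from typing import Sequence


def band_mean(spectrum_db: Sequence[float], sample_rate_hz: float,
              low_hz: float, high_hz: float) -> float:
    """Mean level [dB] of the bins whose frequency lies in [low_hz, high_hz]."""
    # Bins are assumed to be evenly spaced from 0 Hz to the Nyquist frequency.
    bin_width = (sample_rate_hz / 2.0) / (len(spectrum_db) - 1)
    selected = [level for k, level in enumerate(spectrum_db)
                if low_hz <= k * bin_width <= high_hz]
    if not selected:
        raise ValueError("no bins fall inside the requested band")
    return sum(selected) / len(selected)


def substantially_same(level_db: float, threshold_db: float,
                       ratio: float = 0.05) -> bool:
    """True when the level differs from the threshold by at most the given ratio."""
    return abs(level_db - threshold_db) <= ratio * abs(threshold_db)


def is_blocked_high_band(mic_spectrum_db: Sequence[float],
                         th4_spectrum_db: Sequence[float],
                         sample_rate_hz: float = 16000.0,
                         ratio: float = 0.05) -> bool:
    """Blocked when the 6 kHz to 8 kHz microphone level is about equal to TH4."""
    mic_high = band_mean(mic_spectrum_db, sample_rate_hz, 6000.0, 8000.0)
    th4_high = band_mean(th4_spectrum_db, sample_rate_hz, 6000.0, 8000.0)
    return substantially_same(mic_high, th4_high, ratio)
```

A relative tolerance on dB values degenerates when the threshold is near 0 dB, so an absolute
tolerance in dB may be a more robust choice in practice.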
[0085]
Next, the operation procedure of the mobile terminal device 10A of the present embodiment will
be described with reference to FIG. 10. FIG. 10 is a flowchart for explaining the operation
procedure of the mobile terminal device 10A of the second embodiment.
[0086]
In FIG. 10, the voice detection unit 33A detects a microphone signal whose volume level exceeds
a predetermined third threshold TH3 among the microphone signals output from the microphone 11
every predetermined voice detection period (S21). When the voice detection unit 33A does not
detect a microphone signal whose volume level exceeds the third threshold TH3 (S21, NO), it
outputs, to the noise frequency characteristic calculation unit 41, a non-detection signal
indicating that no microphone signal whose volume level exceeds the third threshold TH3 has been
detected.
[0087]
When the noise frequency characteristic calculation unit 41 receives from the voice detection
unit 33A a non-detection signal indicating that a microphone signal whose volume level exceeds
the third threshold TH3 was not detected in the voice detection period, it calculates the
frequency characteristic of the volume level of ambient sound (noise) around the mobile terminal
device 10A in that voice detection period (S22). In addition, the noise frequency characteristic
calculation unit 41 calculates the frequency characteristic of the volume level of the ambient
sound (noise) around the mobile terminal device 10A in a plurality of voice detection periods
and calculates the average value of those frequency characteristics (S23).
[0088]
Based on the average value of the frequency characteristics of the volume level of ambient sound
(noise) around the portable terminal device 10A in the voice detection period, the noise
frequency characteristic calculation unit 41 calculates the fourth threshold TH4 for determining
whether or not the microphone 11 is blocked, and outputs it to the microphone closing
determination unit 35A (S24). Thereby, the operation of the mobile terminal device 10A shown in
FIG. 10 is ended.
[0089]
On the other hand, when the voice detection unit 33A detects a microphone signal whose volume
level exceeds the third threshold TH3 (S21, YES), it outputs, to the voice frequency
characteristic calculation unit 42, a detection signal indicating that a microphone signal whose
volume level exceeds the third threshold TH3 has been detected.
[0090]
When the voice frequency characteristic calculation unit 42 receives from the voice detection
unit 33A a detection signal indicating that a microphone signal whose volume level exceeds the
third threshold TH3 was detected in the voice detection period, it calculates the frequency
characteristic of the volume level of the audio signal (microphone signal) including the
utterance spoken by the user of the portable terminal device 10A in that voice detection period
(S25).
In addition, the audio frequency characteristic calculation unit 42 calculates the frequency
characteristic of the volume level of the audio signal (microphone signal) including the
utterance spoken by the user of the mobile terminal device 10A in a plurality of audio detection
periods and calculates the average value of those frequency characteristics (S26).
[0091]
The voice frequency characteristic calculation unit 42 outputs the average value of the frequency
characteristics of the volume level of the voice signal (microphone signal) including the content
of the speech spoken by the user of the portable terminal device 10A in the voice detection
period to the microphone blocking determination unit 35A.
[0092]
From the average value of the frequency characteristic of the volume level of the microphone
signal output from the audio frequency characteristic calculation unit 42, the microphone
closing determination unit 35A extracts the average value of the frequency characteristic of the
volume level of the microphone signal in the high band (for example, 6 kHz to 8 kHz) (S27).
[0093]
The microphone blocking determination unit 35A determines whether or not the microphone 11 is
blocked by determining whether the fourth threshold TH4 output from the noise frequency
characteristic calculation unit 41 and the average value of the frequency characteristics of the
volume level of the microphone signal in the high band extracted in step S27 are substantially
the same (S28).
When the microphone closing determination unit 35A determines that the microphone 11 is closed
(S28, YES), it outputs a determination result indicating that the microphone 11 is closed to the
warning unit 22.
[0094]
The operation of the warning unit 22 is the same as the operation of the first embodiment, and
thus the description thereof is omitted.
On the other hand, when it is determined that the microphone 11 is not blocked (S28, NO), the
operation of the mobile terminal device 10A shown in FIG. 10 ends.
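The flow of steps S21 to S28 can be summarized in a short Python sketch. To keep it small, each
voice detection period is represented only by its overall volume level and its high-band level,
both assumed to be computed elsewhere. The class name, the running averages, and the tolerance
`ratio` are hypothetical and are not taken from the embodiment.

```python
from typing import List, Optional


class BlockingDetector:
    """Keeps running averages of noise and voice high-band levels (S21 to S28)."""

    def __init__(self, th3_db: float, ratio: float = 0.05) -> None:
        self.th3_db = th3_db
        self.ratio = ratio
        self.noise_levels: List[float] = []
        self.voice_levels: List[float] = []

    def process_period(self, volume_db: float, high_band_db: float) -> Optional[bool]:
        """Return True (blocked), False (not blocked), or None when no decision is made."""
        if volume_db <= self.th3_db:
            # S21 NO -> S22: the period contains only ambient noise.
            self.noise_levels.append(high_band_db)
            return None
        # S21 YES -> S25: the period contains the user's speech.
        self.voice_levels.append(high_band_db)
        if not self.noise_levels:
            return None  # TH4 is not available yet.
        # S23/S24: TH4 from the averaged noise characteristic.
        th4 = sum(self.noise_levels) / len(self.noise_levels)
        # S26/S27: averaged high-band level of the microphone signal.
        voice_avg = sum(self.voice_levels) / len(self.voice_levels)
        # S28: blocked when the two are substantially the same.
        return abs(voice_avg - th4) <= self.ratio * abs(th4)
```

A return value of True corresponds to the branch (S28, YES) in which the determination result is
passed to the warning unit 22.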
[0095]
As described above, when the volume level of the audio signal (microphone signal) collected by
the microphone 11 is less than or equal to the third threshold TH3, the mobile terminal device
10A of the present embodiment calculates the frequency characteristic of the volume level of
ambient noise over a predetermined voice detection period. When the volume level of the
microphone signal exceeds the third threshold TH3, it calculates the frequency characteristic of
the volume level of the microphone signal including the utterance spoken by the user of the
portable terminal device 10A over the same voice detection period.
[0096]
The portable terminal device 10A calculates a fourth threshold TH4 for determining whether or
not the microphone 11 is blocked based on the frequency characteristic of the volume level of
ambient noise, and determines whether or not the microphone 11 is blocked using this fourth
threshold TH4 and the frequency characteristic of the volume level of the microphone signal
including the utterance spoken by the user of the mobile terminal device 10A.
[0097]
Therefore, even if the audio signal (speaker signal) output from the speaker 12 leaks into the
microphone 11, the portable terminal device 10A can easily and accurately determine that the
microphone 11 is blocked by, for example, the user's finger FG, and can notify the user that the
microphone 11 is blocked.
[0098]
Thereby, in the video conference communication with the video conference partner terminal 50
connected via the network NW, when the microphone 11 is blocked by, for example, the user's
finger FG, the mobile terminal device 10A can prompt the user to release the finger FG that is
blocking the microphone 11. It is therefore possible to smoothly transmit and receive audio data
with the video conference partner terminal 50 after the user releases the finger FG.
[0099]
In the second embodiment, the microphone blocking determination unit 35A may extract, in step
S27, the average values of the frequency characteristics of the volume level in both the high
band (for example, 6 kHz to 8 kHz) and the low band (for example, 0 kHz to 2 kHz).
[0100]
In this case, in step S28, the microphone blockage determination unit 35A compares the average
values of the frequency characteristics of the volume level in the high band and the low band
with the fourth threshold TH4. It may determine that the microphone 11 is blocked when, in the
high band, the average value of the frequency characteristic of the volume level of the
microphone signal is substantially the same as the fourth threshold TH4 and, in the low band,
the average value of the frequency characteristic of the volume level of the microphone signal
exceeds the fourth threshold TH4.
As a result, the mobile terminal device 10A can determine whether the microphone 11 is blocked
with higher accuracy than when only the average value of the frequency characteristic of the
volume level in the high band is used.
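The combined high-band and low-band condition can be written as a small Python predicate. The
parameter names and the tolerance `ratio` used for "substantially the same" are assumptions made
for illustration.

```python
def is_blocked_two_band(mic_high_db: float, mic_low_db: float,
                        th4_high_db: float, th4_low_db: float,
                        ratio: float = 0.05) -> bool:
    """Blocked when the high band matches TH4 and the low band exceeds TH4."""
    # High band (for example, 6 kHz to 8 kHz): level is about equal to the noise level.
    high_matches = abs(mic_high_db - th4_high_db) <= ratio * abs(th4_high_db)
    # Low band (for example, 0 kHz to 2 kHz): level rises above the noise level.
    low_exceeds = mic_low_db > th4_low_db
    return high_matches and low_exceeds
```

Requiring both conditions rejects the case in which the user is simply silent, since the low
band then stays at the noise level even though the high band also matches it.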
[0101]
Hereinafter, the configuration, operation, and effects of the portable terminal device according to
the present invention described above will be described.
[0102]
One embodiment of the present invention is a portable terminal device that communicates with a
partner terminal connected via a network, the portable terminal device including: a voice
output unit that outputs voice data from the partner terminal; a voice collection unit that
collects the voice data output by the voice output unit; a correlation calculation unit that
performs a correlation operation between the voice data output by the voice output unit and the
voice data collected by the voice collection unit in response to the voice data output by the
voice output unit; a closing determination unit that determines, using the correlation
calculation result of the correlation calculation unit, whether the voice collection unit is in
a closed state; and a notification unit that, using the determination result of the closing
determination unit, notifies that the voice collection unit is in a closed state.
[0103]
According to this configuration, the portable terminal device uses the result of the
correlation operation between the voice data output by the voice output unit and the voice data
collected by the voice collection unit in response to that output. It can therefore easily
determine, for example, that the voice collection unit is blocked by a finger, and can notify
the user that the voice collection unit is closed.
Therefore, in a video conference communication with a partner terminal connected via the
network, when the voice collection unit is blocked by, for example, a finger, the portable
terminal device can prompt the user to release the finger blocking the voice collection unit, so
that voice data can be smoothly transmitted to and received from the partner terminal after the
user releases the finger.
[0104]
Further, one embodiment of the present invention is a portable terminal device that further
includes a voice detection unit that detects voice data from the partner terminal, wherein the
correlation calculation unit performs the correlation operation when the voice detection unit
detects the voice data from the partner terminal.
[0105]
According to this configuration, the portable terminal device causes the correlation calculation
unit to perform the correlation operation only when the voice output unit outputs voice data.
Compared with the case where the correlation calculation unit always performs the correlation
operation, this reduces the power consumption of the portable terminal device. It also
eliminates the influence of correlation calculation results obtained in the silent state when
the correlation calculation results calculated a predetermined number of times are averaged, so
that a more accurate correlation calculation result can be obtained.
[0106]
One embodiment of the present invention is a portable terminal device in which the closing
determination unit determines that the voice collection unit is closed when the peak correlation
value of the correlation calculation result of the correlation calculation unit exceeds a
predetermined threshold.
[0107]
According to this configuration, the portable terminal device can easily determine the state in
which the sound collection unit is blocked.
[0108]
Further, one embodiment of the present invention is a portable terminal device that communicates
with a partner terminal connected via a network, the portable terminal device including: a voice
output unit that outputs voice data from the partner terminal; a voice collection unit that
collects voice data around the portable terminal device; a voice frequency characteristic
calculation unit that calculates the frequency characteristic of the voice data collected by the
voice collection unit; a closing determination unit that determines, using the calculation
result of the voice frequency characteristic calculation unit, whether the voice collection unit
is in a closed state; and a notification unit that, using the determination result of the
closing determination unit, notifies that the voice collection unit is in a closed state.
[0109]
According to this configuration, the portable terminal device calculates the frequency
characteristic of the voice data collected by the voice collection unit. Using the calculation
result of the frequency characteristic of the voice data, it can therefore easily determine, for
example, that the voice collection unit is closed, and can notify the user that the voice
collection unit is closed.
Therefore, in a video conference communication with a partner terminal connected via the
network, when the voice collection unit is blocked by, for example, a finger, the portable
terminal device can prompt the user to release the finger blocking the voice collection unit, so
that voice data can be smoothly transmitted to and received from the partner terminal after the
user releases the finger.
[0110]
One embodiment of the present invention is a portable terminal device that further includes: a
voice detection unit that detects, among the voice data collected by the voice collection unit,
voice data whose volume level exceeds a predetermined threshold; and a noise frequency
characteristic calculation unit that calculates the frequency characteristic of surrounding
sound data when voice data whose volume level exceeds the predetermined threshold is not
detected.
[0111]
According to this configuration, when the voice collection unit does not pick up voice data, the
portable terminal device causes the noise frequency characteristic calculation unit to calculate
the frequency characteristic of the surrounding sound data. Using the frequency characteristic
of the surrounding sound data (for example, ambient sound and noise) obtained while no voice
data is being picked up, it can properly calculate the threshold for determining whether the
voice collection unit is blocked.
[0112]
One embodiment of the present invention is a portable terminal device that further includes a
voice detection unit that detects, among the voice data collected by the voice collection unit,
voice data whose volume level exceeds a predetermined threshold, wherein the voice frequency
characteristic calculation unit calculates the frequency characteristic of the voice data
collected by the voice collection unit when voice data whose volume level exceeds the
predetermined threshold is detected.
[0113]
According to this configuration, the portable terminal device causes the voice frequency
characteristic calculation unit to calculate the voice frequency characteristic only when the
voice collection unit picks up voice data. Compared with the case where the voice frequency
characteristic calculation unit constantly calculates the voice frequency characteristic, this
reduces the power consumption of the portable terminal device. Furthermore, when the voice
frequency characteristics calculated a predetermined number of times are averaged, the influence
of the frequency characteristics obtained in silence is eliminated, so that the voice frequency
characteristic can be determined more accurately.
[0114]
Further, one embodiment of the present invention is a portable terminal device in which the
closing determination unit determines that the voice collection unit is closed when, among the
calculation results of the voice frequency characteristic calculation unit, the volume level in
a frequency band of a first predetermined range is equal to or less than a predetermined
threshold.
Note that the frequency band of the first predetermined range is, for example, the high
frequency band (high band) of the voice data collected by the voice collection unit.
[0115]
According to this configuration, the portable terminal device can determine with high accuracy
that the voice collection unit is blocked even if, for example, the voice data output by the
voice output unit leaks into the voice collection unit while the voice collection unit is
closed.
[0116]
One embodiment of the present invention is a portable terminal device in which the closing
determination unit determines that the voice collection unit is blocked when, among the
calculation results of the voice frequency characteristic calculation unit, the volume level in
the frequency band of the first predetermined range is equal to or less than a predetermined
threshold and the volume level in a frequency band of a second predetermined range, which is
lower than the first predetermined range, exceeds the predetermined threshold.
Note that the frequency band of the first predetermined range is, for example, the high
frequency band (high band) of the voice data collected by the voice collection unit.
Further, the frequency band of the second predetermined range is, for example, the low frequency
band (low band) of the voice data collected by the voice collection unit.
[0117]
According to this configuration, the portable terminal device can determine with even higher
accuracy that the voice collection unit is blocked even if, for example, the voice data output
by the voice output unit leaks into the voice collection unit while the voice collection unit is
closed.
[0118]
Although various embodiments have been described above with reference to the drawings, it
goes without saying that the present invention is not limited to such examples.
It will be apparent to those skilled in the art that various changes or modifications can be
conceived within the scope of the appended claims, and it is understood that these naturally
also fall within the technical scope of the present invention.
[0119]
The present invention is useful as a portable terminal device, such as a portable telephone, a
smartphone, or a tablet terminal, that smoothly transmits and receives audio data even when a
microphone is blocked in a video conference communication with a partner terminal connected via
a network.
[0120]
DESCRIPTION OF SYMBOLS
10, 10A Portable terminal device
11 Microphone
12 Speaker
13 Camera
14 Display
15, 15A Audio processing unit
16 Audio encoder
17 Audio decoder
18 Communication unit
19 Video encoder
20 Video decoder
21 Video processing unit
22 Warning unit
31 ADC
32 DAC
33, 33A Audio detection unit
34 Correlation operation unit
35, 35A Microphone closing determination unit
41 Noise frequency characteristic calculation unit
42 Audio frequency characteristic calculation unit
50 Video conference partner terminal