Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2014112831
Abstract: A method and system for adaptively managing a plurality of microphones and speakers in an electronic device. The operation mode of the electronic device can be determined, and the operation of at least one speaker is managed based on the determined operation mode. This management includes adaptively switching or modifying the function of the at least one speaker. For example, the at least one speaker can be configured to act as a microphone or a vibration detector. The input obtained with the at least one speaker can be used to optimize voice-related functions such as noise reduction and/or acoustic echo cancellation. [Selected figure] Figure 2
System for managing multiple microphones and speakers
[0001]
The present invention relates to speech processing. More particularly, certain embodiments of
the present disclosure relate to an adaptive system for managing multiple microphones and
speakers.
[0002]
Claim of priority: This application refers to, and claims priority and benefit from, the patent application entitled "An Adaptive System for Managing Multiple Microphones and Speakers" filed on November 8, 2012 (Patent Document 1). The application of Patent Document 1 is hereby incorporated herein by reference in its entirety.
11-04-2019
1
[0003]
Existing methods and systems for managing audio input/output components (e.g., speakers and microphones) in electronic devices may be inadequate and/or costly. Further limitations and drawbacks of such conventional approaches will become apparent to those skilled in the art through comparison with some aspects of the present method and apparatus, described in the remainder of this disclosure with reference to the drawings.
[0004]
US Provisional Patent Application No. 61/723,856
[0005]
The present invention provides a system and/or method for adaptively managing a plurality of microphones and speakers, substantially as shown in and/or described in connection with at least one of the drawings, and as set forth more completely in the claims.
[0006]
These and other advantages, aspects and novel features of the present disclosure and its
illustrative implementation details can be more fully understood from the following description
and drawings.
[0007]
FIG. 1 is an illustration showing an example of an electronic device equipped with a plurality of microphones and speakers. FIG. 2 is a diagram illustrating the architecture of an example of an electronic device equipped with a plurality of microphones and speakers. FIG. 3 is a diagram illustrating the architecture of an example of an electronic device equipped with a plurality of microphones and speakers, modified so that a speaker can be used as an audio input component. FIG. 4 is a diagram illustrating the architecture of an example of an electronic device equipped with a plurality of microphones and speakers, modified in another way so that a speaker can be used as an audio input component. FIG. 5 is an explanatory diagram showing an example of pre-processing for converting a signal obtained from a speaker into a signal matching the signal from a standard microphone, so that it can be used together with a standard audio signal obtained through a microphone. Also provided are a flowchart illustrating an exemplary process for managing a plurality of microphones and speakers in an electronic device, and a flowchart illustrating an exemplary process for generating a voice input by using vibration captured via a speaker.
[0008]
Particular embodiments of a method and system for adaptively managing, controlling, and switching the operation of a plurality of microphones and speakers mounted on an electronic device (e.g., a mobile communication system such as a mobile phone or tablet terminal) can be found herein.
In this regard, the built-in microphones and speakers of the electronic device can be used in accordance with the present disclosure without changing the positions of the microphones and speakers within the device's original structure.
Rather, the operation of the microphones and speakers of the electronic device can be managed, controlled, and switched to support enhancing and/or improving functionality within the electronic device.
For example, the built-in speakers of a standard mobile device can be used, in combination with the signal processing capabilities of the device (including hardware and software), to obtain input for use within the device.
A built-in speaker can be configured and used as a microphone and/or a vibration detector, to determine whether the user of the device is talking and/or to generate input or indications useful for performing various adaptation processes. For example, the inputs or indications generated by the speaker can be used to improve a noise reduction process or an acoustic echo cancellation process. Selection of the speaker and/or microphone to be used can be made automatically and adaptively, based on, for example, the operation mode of the present system.
[0009]
As used herein, the terms "circuit" and "circuitry" refer to physical electronic components (i.e., hardware) and to any software and/or firmware ("code") that may configure the hardware, be executed by the hardware, or otherwise be associated with the hardware. As used herein, for example, a particular processor and memory may comprise a first "circuit" when executing a first set of lines of code, and may comprise a second "circuit" when executing a second set of lines of code. As used herein, "and/or" means any one or more of the items in the list joined by "and/or". As an example, "x and/or y" means any element of the three-element set {(x), (y), (x, y)}. As another example, "x, y, and/or z" means any element of the seven-element set {(x), (y), (z), (x, y), (x, z), (y, z), (x, y, z)}. As used herein, the terms "block" and "module" refer to functions that one or more circuits can perform. As used herein, the term "example" means serving as a non-limiting example, instance, or illustration. As used herein, the terms "for example" and "e.g." introduce a list of one or more non-limiting examples, instances, or illustrations. As used herein, a circuit is "operable" to perform a function whenever the circuit comprises the necessary hardware and code (if any is necessary) to perform the function, regardless of whether performance of the function is disabled or not enabled by some user-configurable setting.
[0010]
FIG. 1 shows an example of an electronic device equipped with a plurality of microphones and
speakers. Referring to FIG. 1, an electronic device 100 is shown.
[0011]
Electronic device 100 may comprise circuitry suitable to perform or support various functions, operations, applications, and/or services. The functions, operations, applications, and/or services that the electronic device 100 implements or supports may be run or controlled based on user instructions and/or pre-configured instructions. In some cases, the electronic device 100 can support data communication, such as via wired and/or wireless connections, in accordance with one or more supported wired and/or wireless protocols or standards. In some cases, the electronic device 100 may be a handheld mobile device, i.e., intended for use on the move and/or at various locations. In this regard, the electronic device 100 can be designed and/or configured to be easy to move, so that the user can readily carry it, and the electronic device 100 can be configured to handle at least some of the functions, operations, applications, and/or services it performs or supports while the user is traveling. Examples of electronic devices include mobile communication devices (e.g., cell phones, smartphones, and tablet terminals), personal computers (e.g., laptops or desktops), and the like. However, the present disclosure is not limited to any particular type of electronic device.
[0012]
In the exemplary embodiment, the electronic device 100 can support voice input and/or output. For example, the electronic device 100 can incorporate a plurality of speakers and microphones, used to output and/or input (capture) sound, along with circuitry suitable for driving, controlling, and/or using the speakers and microphones. For example, the electronic device 100 can include a first speaker 110, a first microphone 120, a second speaker 130, and a second microphone 140. The manner of using the first speaker 110, the first microphone 120, the second speaker 130, and/or the second microphone 140 may be based on the operation of the electronic device 100. Furthermore, the electronic device 100 can support multiple operation modes, with (usually different) usage characteristics for the speakers and/or the microphones. For example, when the electronic device 100 is (or is used as) a mobile communication device (for example, a smartphone), the electronic device 100 can support modes such as a "handset mode" and a "speaker mode".
[0013]
In this respect, the handset mode can accommodate use of the electronic device 100 during a voice call in which the user holds the electronic device against the user's face (i.e., the electronic device 100 is used as a "phone" held in the typical way). For example, while in handset mode, the first speaker 110 and the first microphone 120 can be used to support the voice call service: the first speaker 110 can act as an earphone speaker, while the first microphone 120, being near the user's mouth, is used to capture speech/voice input. In the speaker mode, the second speaker 130 (i.e., a speaker that is not an earphone) can be used for outputting audio. The speaker mode can, for example, accommodate use of the electronic device 100 during a voice call in situations where the user does not hold the electronic device (e.g., using the electronic device 100 as a hands-free or speaker "phone"). In this regard, when the electronic device 100 operates in the speaker mode during a hands-free voice call, the second speaker 130 (i.e., a speaker that is not an earphone) can be used for voice output, and the second microphone 140 (which is better suited to such capture) can be used for capturing speech/voice input. The speaker mode can also support use of the electronic device 100 in providing voice services unrelated to voice calls. For example, the second speaker 130 can operate in the speaker mode when outputting music played back on the electronic device 100. The speakers 110 and 130 need not operate simultaneously: in handset mode, the main (earphone) speaker 110 can operate and be used while the second speaker 130 is inactive and/or not in use, whereas in the speaker mode, the main (earphone) speaker 110 is not active while the second speaker 130, which can usually generate stronger sound output, is active.
[0014]
In various embodiments of the present disclosure, the use and/or configuration of existing microphones and speakers can be optimized within an electronic device (e.g., the electronic device 100) to improve various voice-related functions. This can be done, for example, by capturing or acquiring an input signal using a speaker that would usually be deactivated in a particular mode. Examples of voice-related functions that can be enhanced in this way, in an optimal manner, with the existing microphones and speakers present in the device include noise reduction and/or echo cancellation.
[0015]
For example, since it is usually sought to provide high-quality voice communication, various techniques can be applied to improve the quality of the voice. One of the techniques used to improve voice quality is noise reduction (NR), a technique that reduces the ambient noise heard by the user (especially the far-end user). In some cases, noise reduction techniques can be implemented using multiple microphones. For example, two microphones may be used in the device: one microphone close to the user's mouth (used to capture the user's voice) and the other microphone away from the user's mouth (e.g., placed near the ear and/or on the opposite side of the device). The first microphone can then be used to pick up the user's voice along with the surrounding noise, while the second microphone mainly picks up the ambient noise. The two signals (coming from the two microphones) can be processed to generate a clean voice signal to send to the other party. In such a configuration, noise reduction performs well if the noise is coherent, that is, if the noise picked up by the secondary microphone and the noise picked up by the main microphone are correlated. However, in the presence of non-coherent noise, such as the reverberant noise normally present in small spaces such as offices, the noise picked up by the two microphones may not be strongly correlated, which can degrade noise reduction performance. However, noise reduction performance may be significantly improved when using microphones that are close together (e.g., 1 to 2 cm from each other), because the correlation between the noise signals picked up by the two microphones may be significantly higher.
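As an illustration outside the patent text, the two-microphone noise reduction described above can be sketched as frame-wise spectral subtraction, with the secondary microphone serving as a noise reference. This is a minimal sketch under the assumption of strongly correlated noise; all parameter values are illustrative.

```python
import numpy as np

def two_mic_noise_reduction(primary, secondary, frame=256, alpha=1.0, floor=0.05):
    """Frame-wise spectral subtraction using the secondary microphone as a
    noise reference. Assumes the two microphones are close enough that the
    noise they pick up is strongly correlated, as the text describes."""
    out = np.zeros(len(primary))
    win = np.hanning(frame)
    for start in range(0, len(primary) - frame + 1, frame // 2):
        p = np.fft.rfft(win * primary[start:start + frame])
        s = np.fft.rfft(win * secondary[start:start + frame])
        # Subtract the reference-noise magnitude, keeping a small spectral floor
        mag = np.maximum(np.abs(p) - alpha * np.abs(s), floor * np.abs(p))
        # Rebuild the frame with the primary microphone's phase (overlap-add)
        out[start:start + frame] += np.fft.irfft(mag * np.exp(1j * np.angle(p)), frame)
    return out
```

The closer the microphones, the better the secondary signal approximates the noise actually present in the primary signal, which is exactly the motivation given above for the close-microphone arrangement.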
[0016]
In some cases, various echo cancellation techniques can be used to reduce echo, so that the other party does not hear an echo of his or her own voice. Acoustic echo cancellation (AEC) technology can be based on an estimation of the noise and echo in the device's environment. Furthermore, this estimation can be made continuously, e.g., during a call, using various adaptation techniques. The adaptation techniques may be based on various considerations, such as whether the user is talking, since the user's voice may be interpreted as noise if adaptation is performed while the user is talking. The estimation of whether the user is talking, done to improve the adaptation, can be performed using various techniques. For example, a voice activity detector (VAD) can be used to analyze the captured signal to determine or estimate whether the user is talking. Most of these techniques are effective when the ambient noise level is low (i.e., the signal-to-noise ratio (SNR) is high). However, when the SNR is low (i.e., the environmental noise level is high compared to the level of the user's voice), the estimation process may fail to detect whether the user is talking, and as a result NR and AEC performance is significantly reduced.
[0017]
The arrangement of microphones and/or speakers may be optimal for a given mode of operation but may not be optimal for other audio-related functions. For example, the microphones 120 and 140 may be positioned relatively far apart from one another (as is typical for mobile communication devices), e.g., 10 to 15 cm apart at the top and bottom of the device, and/or on opposite sides of the device. However, such an arrangement may not be optimal for voice-related functions such as noise reduction (NR) and acoustic echo cancellation (AEC). Additional microphone(s) could be placed relatively close to the existing microphone(s) to provide a solution to this problem. However, adding microphone(s) may not be desirable for various reasons, such as increased cost, device design constraints, and the like. Another solution might be to adjust the microphone and speaker placement specifically to improve the performance of these voice-related functions. However, such adjustments may adversely affect the primary usage of these microphones and/or speakers and/or may not be feasible.
[0018]
Thus, in various embodiments, existing microphones and speakers (e.g., the speakers 110 and 130 and the microphones 120 and 140 of the electronic device 100) can be configured to obtain improved noise reduction (NR) and acoustic echo cancellation (AEC) performance without affecting the usage of the existing microphones and/or speakers and without requiring modification of the microphone and/or speaker placement, which can remain optimized for its primary purpose, e.g., voice calls, background audio playback, and/or stereo recording capabilities. For example, the existing (far apart) microphones and speakers may be configured to operate in a two-close-microphone arrangement in a specific operating mode (e.g., handset mode), so that improved noise reduction and/or acoustic echo cancellation performance can be obtained. This two-close-microphone arrangement can be realized by using one or more speakers to provide the required microphone function. That is, a speaker can be used as a "microphone", i.e., for capturing sound and/or generating an input signal.
[0019]
The speaker to be used may be selected automatically, according to the operation mode or the like. For example, the selected speaker can be a speaker that is otherwise inactive in the current mode of operation. The selected speaker can be used as a vibration detector, for example to indicate whether the user is talking. The selected speaker can also operate simultaneously as a speaker and a vibration detector. A system implemented in accordance with the present disclosure may be modular and/or applicable to any architecture. Speaker and microphone operation can be managed to optimally implement voice-related functions such as noise reduction and/or echo cancellation. This management can include automatically recognizing the mode of operation, indicating whether the user is talking, selecting a speaker depending on the recognized mode of operation and/or the indication of whether the user is talking, and, depending on the recognized operation mode of the mobile communication system and the indication of whether the user is talking, switching the operation of the selected speaker to function as a microphone or as a vibration detector.
[0020]
While some examples may refer to mobile phones, other mobile communication systems or any suitable electronic system may be used as well. Furthermore, the examples described may assume a specific number of speakers and microphones, a specific arrangement thereof, and a specific architecture with other specific parts that manage their operation in a specific way. It is to be understood, however, that these examples are included merely to provide a thorough understanding of the present disclosure and are not intended to limit its scope.
[0021]
FIG. 2 shows the architecture of an exemplary electronic device equipped with multiple
microphones and speakers. Referring to FIG. 2, an electronic device 200 is shown.
[0022]
The electronic device 200 may be similar to, for example, the electronic device 100 of FIG. 1. In this regard, the electronic device 200 can incorporate multiple audio output components (e.g., speakers 2301 and 2302) and audio input components (e.g., microphones 2401 and 2402). The electronic device 200 can also incorporate circuitry to support voice-related processing and/or operation. For example, the electronic device 200 can include a processor 210 and an audio codec 220.
[0023]
Processor 210 may comprise any suitable circuitry configurable to process data, control or manage operations (e.g., the operation of the electronic device 200 or parts thereof), and perform tasks and/or functions (or control such tasks/functions). The processor 210 can run and/or execute applications, programs, and/or code, which can be stored, for example, in a memory (not shown) internal or external to the processor 210. Further, processor 210 can control the operation of the electronic device 200 (or a component or subsystem thereof) using one or more control signals. Processor 210 may comprise a general-purpose processor, which may be configured to perform or support certain types of operations (e.g., voice-related operations). Processor 210 may also include a special-purpose processor. For example, processor 210 may comprise a digital signal processor (DSP), a baseband processor, and/or an application processor (e.g., an ASIC).
[0024]
Audio codec 220 may comprise suitable circuitry configurable to perform audio encoding/decoding operations. For example, the audio codec 220 may comprise one or more analog-to-digital converters (ADCs), one or more digital-to-analog converters (DACs), and at least one multiplexer (MUX), which can be used to direct the signals handled within the audio codec 220 to the appropriate input and output ports.
[0025]
In operation, the electronic device 200 can support input and/or output of audio signals. For example, the microphones 2401 and 2402 can capture analog audio input, which can then be forwarded to the audio codec 220 (as analog signals 242 and 244). The audio codec 220 may convert the analog audio input into a digital audio stream (e.g., via an ADC), which may be transmitted to the processor 210 (as digital signal 216, for example over an I2S connection). The processor 210 can then apply digital processing to the digital audio signal. On the output side, the processor 210 can generate a digital audio signal, in which case the corresponding digital audio stream is transmitted to the audio codec 220 (as digital signal 214, e.g., over an I2S connection). The audio codec 220 can process the digital audio stream, convert it to an analog signal (via a DAC), and send this analog signal to the speakers 2301 and 2302 (via analog connections 222 and 224).
[0026]
In an exemplary embodiment, the audio output signal can be sent to only one speaker. For example, the electronic device 200 can support multiple modes, such as a handset mode and a speaker mode. Thus, the audio output signal can be sent only to the speaker 2301 (which can be used as a "primary speaker") when the electronic device 200 is operating in the handset mode, and only to the speaker 2302 (which can be used as a "secondary speaker") when the electronic device 200 is operating in the speaker mode. Switching between the two speakers can be done using the MUX of the audio codec 220. Furthermore, this switching can be controlled using a control signal 212 (which can be set based on the operating mode).
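The mode-driven switching just described can be modeled, purely as an illustrative sketch and not as the patent's implementation, by a small routing object whose mode setter plays the part of the control signal. The names ("speaker_2301", "handset", etc.) follow the text but the API is an assumption.

```python
class AudioMux:
    """Minimal model of routing one output stream to the mode's active
    speaker, in the spirit of the MUX of the audio codec 220 driven by
    control signal 212. Names are illustrative, not the patent's."""

    def __init__(self):
        self.mode = "handset"

    def set_mode(self, mode):
        # Models the control signal that is set based on the operating mode.
        if mode not in ("handset", "speaker"):
            raise ValueError("unknown mode: " + mode)
        self.mode = mode

    def route(self, samples):
        # Handset mode drives the primary speaker 2301; speaker mode drives
        # the secondary speaker 2302.
        target = "speaker_2301" if self.mode == "handset" else "speaker_2302"
        return {target: samples}
```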
[0027]
In some cases, it may be desirable to obtain or generate an audio input using an audio output component (e.g., the speakers 2301 and 2302 of the electronic device 200); such input can be used to optimize or improve voice-related features such as noise reduction and/or acoustic echo cancellation. For example, when the user uses the electronic device for some voice-related service (e.g., the device may be a mobile phone that the user is using during a voice call), the device (or a part of it) may be in contact with the user's cheek. The user's utterance (i.e., voice) can cause the user's skull to vibrate, which in turn can cause the device's housing, pressed against the user's cheek, to vibrate. The device's speaker(s) are usually attached to the housing, so a speaker can be used as a vibration detector (VSensor) to sense vibrations in the housing, such as those caused by the user's voice; that is, a speaker can be used to generate a VSensor signal. The VSensor signal can be analyzed to determine whether the user is talking. In addition, the VSensor signal (possibly in combination with a signal obtained via a standard microphone) can be processed, for example, to improve the noise reduction and/or acoustic echo cancellation process. Such use of a speaker may be more appropriate in a particular mode of operation (e.g., in handset mode), but the disclosure is not so limited, and the speaker can be used in the same way in other operating modes that are generally less associated with the user's speech (e.g., in speaker mode). For example, even in speaker mode, if the device is close to the user's mouth, the user's voice may still cause the device's housing to vibrate while the user is speaking. Such vibrations can be detected with a speaker that is not normally active during the current mode of operation, for example an "earphone" speaker, which may not normally be used in modes such as speaker mode; such a speaker can be configured as, and/or act as, a vibration detector (VSensor) to capture these vibrations.
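As an illustration outside the patent text, the talk/no-talk analysis of a VSensor signal can be sketched as a simple energy detector on the (mostly low-frequency) vibration conducted through the skull and housing. The frame size, smoothing length, and threshold are illustrative assumptions.

```python
import numpy as np

def user_is_talking(vsensor, frame=160, threshold=1e-4):
    """Rough per-frame talk/no-talk decision from a VSensor signal (a speaker
    repurposed as a vibration detector). Each frame is smoothed with a short
    moving average (a crude low-pass filter) before its energy is compared
    to a threshold. The threshold value is an assumption; a real device
    would calibrate it."""
    kernel = np.ones(8) / 8.0                     # crude low-pass filter
    flags = []
    for start in range(0, len(vsensor) - frame + 1, frame):
        lp = np.convolve(vsensor[start:start + frame], kernel, mode="same")
        flags.append(float(np.mean(lp ** 2)) > threshold)
    return flags
```

The resulting per-frame flags are the kind of "user is talking" indication that, per the text, can gate NR/AEC adaptation.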
[0028]
Supporting the use of a speaker to obtain audio input (e.g., as a microphone or vibration detector) may require the addition or modification of existing components (circuits and/or software) within the electronic device. Nevertheless, such changes can be minimized and can be substantially more cost effective than adding dedicated voice input components. Examples of embodiments supporting such use of the loudspeaker are shown at least in FIGS. 3, 4 and 5.
[0029]
FIG. 3 shows the architecture of an exemplary electronic device equipped with a plurality of
microphones and speakers, which has been modified to use the speakers as an audio input
component. Referring to FIG. 3, an electronic device 300 is shown.
[0030]
The electronic device 300 may be, for example, substantially similar to the electronic device 200 of FIG. 2. However, the electronic device 300 can be configured to support using an audio output component (for example, a speaker) as an audio input component (for example, a microphone or a vibration detector), to improve specific audio-related functions (for example, noise reduction and/or acoustic echo cancellation). The electronic device 300 can include additional circuitry and/or components, that is, in addition to the circuitry and/or components described with respect to the electronic device 200, to support such optimized use of the speakers. For example, in the embodiment shown in FIG. 3, the electronic device can comprise a multiplexer (MUX) 330 and a pair of amplifiers 310 and 320. The MUX 330 and the amplifiers 310 and 320 can be used to take the input from the speakers 2301 and 2302 (via connections 312 and 322) and send this input (or inputs) to the audio codec 220. The input(s) from the speakers 2301 and 2302 can be used to enhance and/or optimize voice-related functions such as noise reduction and/or acoustic echo cancellation. In this regard, using the inputs from the speakers 2301 and 2302 may be desirable because of the placement of the speakers in the electronic device 300: for example, a speaker may be located at a distance that is preferred for capturing the inputs (for example, near one of the microphones 2401 and 2402), or may be attached to the housing of the electronic device 300, making it ideally placed to act as a vibration detector.
[0031]
In operation, the speakers 2301 and 2302 can be configured and/or used as input devices (i.e., to obtain voice or vibration input). In an exemplary use, one or both of the speakers 2301 and 2302 can be selected for use in obtaining a "microphone" input, which can then be processed in a noise reduction and/or acoustic echo cancellation process, for example in conjunction with the input obtained from a standard microphone (i.e., one or both of the microphones 2401 and 2402). The processor 210 can instruct the MUX 330 (e.g., via control signal 336) to select the inputs obtained from one of the speakers 2301 and 2302 and one of the microphones 2401 and 2402, so that they operate as two close microphones. The specific speaker-microphone pair used in this manner can be selected automatically and/or adaptively, based on the operation mode of the electronic device 300 or the like.
[0032]
For example, in handset mode, where the speaker 2301 can be in use (e.g., as the "earphone" speaker), the processor 210 can instruct the MUX 330, via the control signal 336, to select the inputs from the microphone 2401 (used as the main microphone) and the speaker 2302. Further, the processor 210 can be configured to use the speaker 2302, which is not operating as a speaker during handset mode, as a microphone, for example to obtain input that supports NR and/or AEC processing. For example, the speaker 2302 can be configured to generate an input signal using the same components used in generating output sound, but operating in the opposite direction. Further, the generated signal can be amplified via the amplifier 320 before being sent to the MUX 330. Thus, the signals selected from the components acting as close microphones (i.e., the microphone 2401 and the speaker 2302) can be sent to the audio codec 220 (via analog connections 332 and 334) and digitized by the audio codec. The corresponding digital signal may then be sent to the processor 210 (as digital signal 216) for further processing.
[0033]
In speaker mode, where the speaker 2302 can be in use (e.g., as the "non-earphone" speaker), the processor 210 can instruct the MUX 330, via the control signal 336, to select the inputs from the microphone 2402 (used as the main microphone) and the speaker 2301. The processor 210 can be configured to use the speaker 2301, which is not operating as a speaker during speaker mode, as a microphone, as described above. Thus, the microphone 2402 and the speaker 2301 can act as close microphones, and the signals input to the MUX 330 (after the signal generated by the speaker 2301 is amplified via the amplifier 310) can be sent by the MUX 330 (via connections 332 and 334) to the audio codec 220 for digitization, and the corresponding digital results can be sent to the processor 210 for further processing.
[0034]
The processor 210 may be configured to perform additional steps in handling the input signals, and may identify the source of each input signal. For example, since the frequency response of a standard microphone (e.g., the microphones 2401 and 2402) is typically different from the frequency response of a speaker (e.g., the speakers 2301 and 2302) acting as a microphone, the processor 210 can perform pre-processing of the signal from a speaker acting as a microphone, to better match the input signal coming from the standard microphone. An example of a pre-processing step to match the signal from the speaker to the signal of a standard microphone is described in more detail in FIG. 5.
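One common way to realize such response matching, offered here only as an illustrative sketch and not as the patent's own method, is to fit an FIR equalizer by least squares from simultaneous recordings of the speaker-as-microphone and a reference microphone. The function names and tap count are assumptions.

```python
import numpy as np

def fit_matching_filter(speaker_sig, mic_sig, taps=32):
    """Fit a least-squares FIR filter that maps the speaker-as-microphone
    signal onto a reference microphone signal, compensating for their
    different frequency responses. A sketch only: a real implementation
    would estimate the filter offline from calibration recordings."""
    n = len(speaker_sig) - taps + 1
    # Rows are reversed signal windows, so X @ h is a causal convolution.
    X = np.stack([speaker_sig[i:i + taps][::-1] for i in range(n)])
    y = mic_sig[taps - 1:taps - 1 + n]
    h, *_ = np.linalg.lstsq(X, y, rcond=None)
    return h

def apply_matching_filter(h, sig):
    # Filter the speaker signal so it better matches the reference microphone.
    return np.convolve(sig, h)[:len(sig)]
```

After this pre-processing, the speaker-derived signal can be combined with the standard microphone signal in the NR/AEC processing described above.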
[0035]
FIG. 4 shows the architecture of an exemplary electronic device equipped with a plurality of
microphones and speakers, which has been modified in another way so that the speakers can be
used as audio input components. Referring to FIG. 4, an electronic device 400 is shown.
[0036]
The electronic device 400 may be substantially similar to, for example, the electronic device 200
of FIG. 2. However, as with the electronic device 300 of FIG. 3, the electronic device 400 can also
be configured to support use of an audio output component (e.g., a speaker) as an audio input
component (e.g., a microphone or a vibration detector), for example, to improve certain
audio-related functions (e.g., noise reduction and/or acoustic echo cancellation). The electronic
device 400 can include additional circuitry and/or components, that is, in addition to the
circuitry and/or components described with respect to the electronic device 200, to support such
use of the speakers. For example, in the embodiment shown in FIG. 4, the electronic device 400
can include a pair of switches 410 and 420 and a pair of amplifiers 430 and 440. Each of the
switches 410 and 420 can include circuitry that allows signals to be routed adaptively, such as
based on the input port on which a signal is received. For example, the switches 410 and 420 can
be configured to forward signals from the audio codec 220 (i.e., "output" signals) to the speakers
2301 and 2302, and to forward signals obtained from the speakers 2301 and 2302 (i.e., "input"
signals) to the amplifiers 430 and 440. The switches 410 and 420 and the amplifiers 430 and
440 can thus be used to take input from the speakers 2301 and 2302 and send this input to the
audio codec 220. As noted, the input(s) from the speakers 2301 and 2302 can be used to enhance
and/or optimize voice-related functions such as noise reduction and/or acoustic echo
cancellation.
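The port-based forwarding performed by a switch such as 410 or 420 can be sketched as follows. This is a minimal illustration; the port and destination names are assumptions for this sketch, not identifiers taken from the patent.

```python
from enum import Enum, auto

class Port(Enum):
    CODEC_OUT = auto()    # "output" signal arriving from the audio codec 220
    SPEAKER_IN = auto()   # "input" signal captured by a speaker (2301/2302)

def switch_route(port):
    """Forward a signal based on the input port on which it arrives,
    as the switches 410 and 420 are described as doing."""
    if port is Port.CODEC_OUT:
        return "speaker"      # drive the speaker with the codec output
    if port is Port.SPEAKER_IN:
        return "amplifier"    # send captured input toward amplifier 430/440
    raise ValueError("unknown port")
```

In a bidirectional configuration (see paragraph [0040]), both branches can be active at once, with the output path and the input path handled independently.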
[0037]
In operation, the speakers 2301 and 2302 can be configured and/or used as input devices (i.e.,
to obtain voice or vibration input). In an exemplary use, one (or both) of the speakers 2301 and
2302 may be selected as a VSensor and configured to sense vibration and generate a
corresponding "vibration" input. This vibration input can then be processed, such as in
conjunction with the input obtained from a standard microphone (i.e., one of the microphones
2401 and 2402), as part of noise reduction and/or acoustic echo cancellation. The particular
speaker used as the VSensor can be selected automatically and/or adaptively, such as based on
the operation mode of the electronic device 400.
[0038]
For example, in handset mode, the speaker 2301 can be activated and used as the primary
speaker, while the speaker 2302 generally cannot be activated or used to support the voice call
service. Thus, the speaker 2302 can be selected when the electronic device 400 is in handset
mode, and can be configured as a VSensor. The speaker 2302 can generate a VSensor signal (for
example, when the electronic device 400 is experiencing some vibration), which can be routed to
the amplifier 440 via the switch 420 (on connection 422). The amplifier 440 amplifies the signal
and then sends it to the audio codec 220 (via connection 442). The audio codec 220 may process
the signal (e.g., apply conversion via its ADC) and send the resulting digital signal (as digital
signal 216) to the processor 210 for processing. In some cases, the processor 210 may
incorporate a dedicated application module 450 (e.g., a software module), which may be
configurable to analyze incoming VSensor signals. For example, analysis of the VSensor signal
can detect whether the corresponding vibration indicates that the user of the device is talking.
[0039]
Conversely, in speaker mode, where the speaker 2302 can be activated and used as the main
speaker but the speaker 2301 normally cannot be activated or used, the speaker 2301 can
instead be selected and configured as a VSensor. In this case, the switch 410 can route any
VSensor signal generated by the speaker 2301 (on connection 412) to the amplifier 430, which
amplifies the signal before it is sent (on connection 432) to the audio codec 220. The signal can
then be processed in the same manner as described above for handset mode.
[0040]
In some embodiments, a speaker can be configured as a VSensor (i.e., to generate a VSensor
signal) while it continues to operate and be used as a speaker. For example, in speaker mode,
where the speaker 2302 is normally operated and used as the primary speaker, the speaker
2302 can simultaneously be configured as a VSensor. The switch 420 can then be configured to
route signals in both directions as necessary, that is, to route the "output" signal received from
the audio codec 220 to the speaker 2302 and to route the "input" VSensor signal received from
the speaker 2302 to the amplifier 440.
[0041]
FIG. 5 shows an exemplary pre-processing stage for converting the signal obtained from a
speaker to match the signal from a standard microphone, so that it can be used in conjunction
with the audio signal obtained via the standard microphone. Referring to FIG. 5, a pre-processing
stage 500 is shown.
[0042]
Pre-processing stage 500 may be part of the processing circuitry in an electronic device (e.g., the
processor 210) configured to handle audio processing in the electronic device. In particular, the
pre-processing stage 500 can be configured to support handling of audio input signals obtained
from audio output components (e.g., speakers) so that they can be used in conjunction with
audio input from standard audio input components (e.g., standard microphones).
[0043]
In the exemplary embodiment shown in FIG. 5, the pre-processing stage 500 receives a
(standard) input signal 520 from a standard microphone (e.g., one of the microphones 2401 and
2402) and an input audio signal 530 from a speaker configured to act as a microphone (e.g., one
of the speakers 2301 and 2302). The pre-processing stage 500 may then process the speaker
input signal 530, generating a corresponding (modified) signal 540 that can be properly matched
with the (standard) input signal 520. For example, the speaker input signal 530 may be filtered
within the pre-processing stage 500 (e.g., via the filter 510) to make the frequency content of the
signals 520 and 540 similar. In this regard, the filter 510 can comprise suitable circuitry for
filtering the signal. The filter 510 can be configured to convert the signal in such a way that the
signal corresponding to the speaker input can be matched to the standard microphone input.
[0044]
For example, the filter 510 can be implemented as a finite impulse response (FIR) filter, which is
linear in phase, so as not to corrupt the phase of the filtered signal. Additionally, the FIR filter can
be designed such that the spectrum of the processed speaker signal (i.e., the filtered signal 540)
approximates the spectrum of the microphone signal (i.e., the signal 520). For example, assuming
that S(f) corresponds to the spectrum of the speaker acting as a microphone and SM(f) is the
spectrum of the standard microphone, the filter 510 can be configured such that the spectrum of
the processed signal, S(f)·FIR(f), approximates the microphone spectrum SM(f). Therefore, the
frequency response of the filter 510 can be configured to be FIR(f) = SM(f) / S(f). With the (FIR)
filter 510 configured in this way, the filtering compensates for the difference between the
transfer function of the standard microphone and that of the speaker acting as a microphone.
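The relation FIR(f) = SM(f) / S(f) can be turned into a concrete linear-phase filter with a frequency-sampling design. The sketch below is illustrative only: the example responses SM(f) and S(f) (a flat microphone, and a speaker that rolls off toward high frequencies) are assumptions for the demonstration, not measured values from the patent.

```python
import numpy as np

def design_matching_fir(sm_mag, s_mag, numtaps=101, nfft=1024):
    """Design a linear-phase FIR whose magnitude approximates SM(f)/S(f).

    sm_mag and s_mag are magnitude responses sampled on a uniform grid
    from DC to Nyquist (inclusive). A zero-phase desired spectrum is
    inverse-FFT'd to an even impulse response, then centered, truncated,
    and windowed, yielding a causal symmetric (linear-phase) filter.
    """
    desired = sm_mag / s_mag
    grid = np.linspace(0.0, 1.0, len(desired))
    dense = np.interp(np.linspace(0.0, 1.0, nfft // 2 + 1), grid, desired)
    h = np.fft.irfft(dense, n=nfft)          # even (symmetric) impulse response
    h = np.roll(h, numtaps // 2)[:numtaps]   # center and truncate (make causal)
    return h * np.hamming(numtaps)           # window to reduce truncation ripple

# Illustrative (assumed) responses on a normalized grid: 0 = DC, 1 = Nyquist
freqs = np.linspace(0.0, 1.0, 64)
SM = np.ones_like(freqs)               # flat standard-microphone response
S = 1.0 / (1.0 + 3.0 * freqs)          # speaker-as-microphone rolls off
fir = design_matching_fir(SM, S)       # symmetric taps -> linear phase
```

Because the taps come out symmetric, the filter has exactly linear phase, matching the requirement that the phase of the filtered signal not be corrupted.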
[0045]
The filtering function of the filter 510 can be controlled using filtering parameters, which can be
determined based on, for example, a calibration process. The calibration process can be
performed once to establish the filtering parameters, which can then be saved and reused.
Alternatively, the calibration process may be performed iteratively and/or dynamically (e.g., in
real time). The filtering function (and thus the corresponding filtering parameters) may differ
depending on the source of the signal. For example, the filtering parameters may be different if
the signal to be filtered is coming from the speaker 2301 instead of the speaker 2302. As such,
different sets of filtering parameters can be predefined for the different (available) speakers,
making it possible to select the appropriate set according to the source in each usage situation.
The signals 520 and 540 can then be used as two "microphone" signals, for example, in a noise
reduction (NR) operation that expects two microphones.
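A calibration pass of the kind described here could, for example, estimate the per-frequency gain SM(f)/S(f) from simultaneous recordings of the same stimulus on both channels. The sketch below is a hypothetical implementation; the function names and the Welch-style averaged-spectrum estimate are assumptions, not details from the patent.

```python
import numpy as np

def average_spectrum(x, nfft=512):
    """Magnitude spectrum averaged over non-overlapping windowed frames."""
    n_frames = len(x) // nfft
    frames = x[:n_frames * nfft].reshape(n_frames, nfft) * np.hanning(nfft)
    return np.mean(np.abs(np.fft.rfft(frames, axis=1)), axis=0)

def calibrate_matching_gain(mic_sig, spk_sig, nfft=512, eps=1e-12):
    """Estimate the desired gain SM(f)/S(f) from simultaneous recordings:
    mic_sig from the standard microphone, spk_sig from the speaker acting
    as a microphone. eps guards against division by empty bands."""
    return average_spectrum(mic_sig, nfft) / (average_spectrum(spk_sig, nfft) + eps)
```

The resulting per-bin gains could be stored as one parameter set per speaker (e.g., one set for 2301 and one for 2302) and selected at run time according to the signal source, as described above.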
[0046]
FIG. 6 is a flow chart illustrating an exemplary process for managing multiple microphones and
speakers in an electronic device. Referring to FIG. 6, a flow chart 600 comprising a plurality of
exemplary steps is shown, which may be performed within an electronic device (e.g., the
electronic device 300 or 400 of FIGS. 3 and 4) to facilitate optimal management of the speakers
and microphones embedded therein.
[0047]
In the first step 602, the electronic device (e.g., the electronic device 300) can be powered on and
activated. This step may include turning on the power and initializing and/or activating the
various components of the electronic device, such that the electronic device is ready to perform
or run the features, functions, and applications it supports.
[0048]
In step 604, the operation mode of the electronic device can be set (or switched) based on the
user's command/input or a pre-configured execution instruction. For example, if the electronic
device supports communication services (especially voice calls), the operation modes can include
a handset mode and/or a speaker mode. Thus, the electronic device can switch to handset mode
when the user of the device initiates (or accepts) a voice call and places the electronic device
against the user's face.
[0049]
At step 606, it may be determined, based on the current mode of operation, whether there are
any inactive speakers. For example, in mobile communication devices with multiple speakers
(e.g., mobile phones), only a particular speaker or speakers may be used in certain operating
modes, e.g., only the "earphone" speaker in handset mode. If it is determined that there are no
inactive (or unused) speakers, the process may proceed to step 612; otherwise the process
proceeds to step 608.
[0050]
At step 608, it may be determined whether any inactive (or unused) speakers need to be
configured to provide input. For example, in an electronic device with multiple microphones, it
may be possible to use the microphones to obtain input to support functions such as noise
reduction and acoustic echo cancellation. However, the performance of these functions may be
degraded if the microphones used are not optimally placed (e.g., too far away). As such, it may be
desirable to use a speaker as a "microphone" if the speaker is positioned more optimally than
one of the microphones. It may also be desirable to use a speaker as a vibration detector
(VSensor), for example, if the speaker is ideally positioned to receive vibrations propagating
through the user's skeleton and the electronic device (or its housing). If it is determined that the
inactive (or unused) speakers need not be configured to provide input, the process may proceed
to step 612; otherwise the process proceeds to step 610.
[0051]
In step 610, one or more speakers, selected, for example, based on being inactive/unused in the
current mode of operation and/or based on being optimally positioned for providing the desired
input, can be configured to provide the desired input (e.g., as a "microphone" to capture ambient
sound, or as a VSensor to capture vibrations propagating through the electronic device). In
addition, the electronic device as a whole can be configured to support the use of the selected
speaker(s) for input, for example, by activating the necessary components (amplifiers, MUXs,
switching elements, etc.) and by routing and processing the generated input.
[0052]
At step 612, the electronic device can operate according to the current mode of operation. This
step can include using the input obtained via any selected speaker(s) to improve, for example,
noise reduction and/or acoustic echo cancellation processing.
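The selection logic of steps 606-610 can be sketched as a mode-to-role mapping. The mapping of which speaker is primary in each mode follows the handset-mode and speaker-mode examples of paragraphs [0038]-[0039]; the names and enum structure below are illustrative assumptions.

```python
from enum import Enum, auto

class Mode(Enum):
    HANDSET = auto()
    SPEAKERPHONE = auto()

class Role(Enum):
    OUTPUT = auto()    # normal speaker duty
    VSENSOR = auto()   # repurposed as a vibration detector

# Assumed mapping: which speaker is the active output in each mode
PRIMARY = {Mode.HANDSET: "speaker_2301", Mode.SPEAKERPHONE: "speaker_2302"}

def assign_roles(mode, speakers=("speaker_2301", "speaker_2302")):
    """Any speaker inactive in the current mode is selected and configured
    to provide input (here, as a VSensor); the active speaker keeps its
    normal output role."""
    return {spk: (Role.OUTPUT if spk == PRIMARY[mode] else Role.VSENSOR)
            for spk in speakers}
```

Switching the operation mode then simply re-runs the assignment, which mirrors the adaptive reselection described for the flow chart 600.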
[0053]
FIG. 7 is a flow chart illustrating an exemplary process for generating an audio input using
vibration captured through a speaker. Referring to FIG. 7, a flow chart 700 is shown that includes
a plurality of exemplary steps. The exemplary steps may correspond to, for example, an
algorithm implemented via the application module 450 and/or may be implemented in
accordance with such an algorithm.
[0054]
In a first step 702, a signal can be captured via a speaker. The signal V(t) may, for example,
correspond to the vibration captured via the speaker. In step 704, this signal may be
pre-processed to generate, for example, a corresponding discrete signal V(n), where "n" denotes
the sample of the signal V(t) at discrete time nT. Such a signal V(n) may be sensitive to vibrations
due to speech, but its sensitivity to ambient noise may be significantly lower, especially at low
frequencies (e.g., up to about 1 kHz). Thus, even in noisy environments, the signal-to-noise ratio
(SNR) may be relatively high.
[0055]
At step 706, the signal can be processed to make it suitable for analysis. For example, the signal
V(n) can be filtered (e.g., using a band-pass filter, or BPF), yielding a filtered signal VBP(n).
[0056]
At step 708, the signal can be analyzed. For example, the VBP(n) signal (generated by filtering
the V(n) signal) can be processed on a sample-by-sample basis using one or more analysis
techniques. The VBP(n) signal can be analyzed using standard techniques, such as
autocorrelation, to calculate the pitch (e.g., of the person talking). The VBP(n) signal can also be
analyzed by calculating the envelope VEN(n) of the signal.
[0057]
At step 710, the analysis results can be checked to determine whether any matching criteria are
met. If it is determined that no matching criteria are met, the process can return to step 708 to
analyze the next sample. If it is determined that at least one matching criterion is met, that is,
the person is indicated to be talking, the process can proceed to step 712, in which the signal can
be used as an input signal, for example, as a voice activity detector (VAD).
[0058]
For example, the check performed at step 710 can include determining whether a pitch has been
detected and/or whether the envelope of the signal is above a predetermined threshold, e.g.,
VEN(n) > TH_env.
[0059]
The detection of the pitch can be based on calculating the pitch value by analyzing the
autocorrelation of the input signal and checking its maximum value against a predetermined
threshold.
Therefore, if the calculated maximum value (Auto_max) is larger than the predetermined
threshold (TH_pitch), the signal can be determined to be a speech signal.
[0060]
Therefore, if Auto_max > TH_pitch, or if Auto_max < TH_pitch but VEN(n) > TH_env, the signal
can be determined to be a voice frame, and the VAD flag can be set. Otherwise, the VAD flag is
cleared.
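The decision rule of paragraphs [0057]-[0060] can be sketched per frame as follows. The threshold values, the use of RMS as the envelope estimate VEN, and the pitch-lag search range (50-400 Hz) are illustrative assumptions rather than values taken from the patent.

```python
import numpy as np

def vad_frame(frame, fs=8000, th_pitch=0.4, th_env=0.01):
    """Set the VAD flag when the normalized autocorrelation peak exceeds
    TH_pitch (pitch detected) or the frame envelope exceeds TH_env."""
    x = frame - np.mean(frame)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # lags 0..N-1
    if ac[0] <= 0.0:
        return False                       # silent frame: clear the flag
    ac = ac / ac[0]                        # normalize so ac[0] == 1
    lo, hi = fs // 400, fs // 50           # lags for 50-400 Hz pitch
    auto_max = np.max(ac[lo:hi])           # Auto_max
    v_en = np.sqrt(np.mean(frame ** 2))    # envelope estimate VEN
    return bool(auto_max > th_pitch or v_en > th_env)
```

A low-level voiced frame passes on the pitch test even when its envelope is below TH_env, which matches the "Auto_max > TH_pitch, or ... VEN(n) > TH_env" rule above.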
[0061]
In the exemplary process illustrated in FIG. 7, the signal handling (calculation and/or analysis)
is performed on a sample-by-sample basis. Alternatively, this processing may be performed on
sets of samples. For example, every N samples ("N" being an integer) can be combined into one
frame, with the calculations performed frame by frame. The size of the frame can be adjusted for
optimal performance. For example, each frame can be 10 ms long (with N set so that N samples
span 10 ms).
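The framing described here can be sketched as follows (a minimal illustration; the sampling rate is an assumed example):

```python
import numpy as np

def frame_signal(v, fs, frame_ms=10.0):
    """Group V(n) samples into frames of N samples each, with N chosen so
    that each frame spans frame_ms milliseconds; trailing samples that do
    not fill a frame are dropped."""
    n = int(round(fs * frame_ms / 1000.0))   # e.g., N = 80 at fs = 8 kHz
    n_frames = len(v) // n
    return v[:n_frames * n].reshape(n_frames, n)
```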
[0062]
In some embodiments, a method of adaptively managing speakers and/or microphones can be
used in a system comprising an electronic device (e.g., the electronic device 300 or 400), where
the electronic device comprises one or more circuits (e.g., the processor 210, the audio codec
220, the switches 410 and 420, and the amplifiers 310, 320, 430, and 440) and a first speaker
and a second speaker (e.g., the speakers 2301 and 2302). The one or more circuits may be
operable to determine an operating mode of the electronic device and to manage, based on the
determined operating mode, the operation of one or both of the first speaker and the second
speaker, where the management may include adaptively switching or modifying the function of
one or both of the first and second speakers. Switching or modifying the function of one or both
of the first and second speakers can include configuring one of the first and second speakers to
be used as a microphone or as a vibration detector (VSensor). The one or more circuits may be
configured such that one of the first and second speakers continues to function as a speaker
while also being used as a microphone or as a vibration detector. The one or more circuits may
be operable to use input from the one of the first and second speakers configured as a
microphone or vibration detector to support audio enhancement functions in the electronic
device. The audio enhancement functions can include noise reduction and/or acoustic echo
cancellation. One of the first and second speakers may be configured as a vibration detector to
indicate whether the user of the electronic device is speaking. One of the first and second
speakers can be configured as a vibration detector for detecting vibration in a housing of the
electronic device. The one or more circuits may be operable to select the other of the first and
second speakers in another, different mode of operation of the electronic device.
[0063]
In some embodiments, a method of adaptively managing speakers and microphones can be used
in a mobile communication device comprising a first speaker and a second speaker (e.g., the
speakers 2301 and 2302) and a first microphone and a second microphone (e.g., the
microphones 2401 and 2402). The method includes determining a mode of operation of the
mobile communication device, generating an indication of when the user of the mobile
communication device is talking, selecting one of the first and second speakers based on the
mode of operation of the mobile communication device and the indication that the user is
talking, and managing the operation of the selected speaker based on the determined mode of
operation. The management can include determining that the input from the first microphone
and the second microphone is inadequate to support an audio enhancement function in the
mobile communication device, and adaptively switching or modifying the function of the
selected speaker to obtain input via the selected speaker. The audio enhancement function can
include noise reduction or acoustic echo cancellation. The inputs from the first and second
microphones may be determined to be inadequate to support the audio enhancement function
in the mobile communication device based on the arrangement of and/or spacing between the
first and second microphones. One of the first and second speakers may be selected based on its
placement relative to, and/or distance from, one or both of the first and second microphones.
[0064]
In another embodiment, a non-transitory computer readable medium and/or storage medium,
and/or a non-transitory machine readable medium and/or storage medium, can be provided,
storing machine code and/or a computer program having at least one code section executable
by a machine and/or computer, thereby causing the machine and/or computer to perform the
steps described herein for an adaptive system for managing a plurality of microphones and
speakers.
[0065]
Thus, the present method and/or system can be realized in hardware, software, or a combination
of hardware and software.
The method and/or system can be realized in a centralized fashion in at least one computer
system, or in a distributed fashion where different elements are spread across several
interconnected computer systems.
Any kind of computer system or other apparatus adapted for carrying out the methods
described herein is suited. A typical combination of hardware and software can be a
general-purpose computer system with a computer program that, when loaded and executed,
controls the computer system such that it carries out the methods described herein. Another
exemplary embodiment may comprise an application-specific integrated circuit or chip.
[0066]
The present method and/or system can also be embedded in a computer program product,
which comprises all the features enabling the implementation of the methods described herein,
and which, when loaded into a computer system, is able to carry out these methods.
Computer program, in the present context, means any expression, in any language, code, or
notation, of a set of instructions intended to cause a system having an information processing
capability to perform a particular function, either directly or after either or both of the
following: a) conversion to another language, code, or notation; b) reproduction in a different
material form. Accordingly, some embodiments may comprise a non-transitory machine-readable
(e.g., computer-readable) medium (e.g., a FLASH drive, an optical disk, a magnetic storage disk,
or the like) storing one or more lines of code executable by a machine, thereby causing the
machine to perform the processes described herein.
[0067]
While the present method and/or system has been described with reference to particular
embodiments, it will be understood by those skilled in the art that various changes may be made
and equivalents may be substituted without departing from the scope of the present method
and/or system. In addition, many modifications may be made to adapt a particular situation or
material to the teachings of the present disclosure without departing from its scope. Therefore, it
is intended that the present method and/or system not be limited to the particular embodiments
disclosed, but that the present method and/or system include all embodiments falling within the
scope of the appended claims.