Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2016127376
Abstract: To provide a sound output device that allows the user to grasp the surrounding situation even while using the device. A headphone device includes an acquisition unit 121 that acquires environmental sound, an identification unit 122 that identifies the user's situation, an extraction unit 123 that extracts a predetermined sound from the acquired environmental sound according to the identified user situation, and a reproduction unit 124 that executes reproduction processing based on the sound extracted by the extraction unit 123. [Selected figure] Figure 2
Sound output device and method of reproducing sound in sound output device
[0001]
The present invention relates to a sound output device such as an earphone or headphone device, and to a method of reproducing sound in the sound output device.
[0002]
In recent years, sound output devices such as earphone devices and headphone devices provided with a function for cutting surrounding sounds (environmental sounds) have been put to practical use.
For example, according to the headphone device described in Patent Document 1 below, it is possible to prevent the user from hearing noise by canceling the noise contained in the environmental sound. Noise generated by a vehicle, for example, can be considered as noise to be canceled.
[0003]
Patent Document 1: JP 2008-193420 A
[0004]
If the noise contained in the environmental sound is canceled by the above-described function while the sound output device is in use, it may be difficult for the user to grasp the surrounding situation.
For example, if the sound of an approaching vehicle is canceled as noise, the user will hardly notice the vehicle's approach, which is undesirable. In this case, the sound of the vehicle is not noise but a sound the user needs. Thus, what kind of environmental sound the user needs is considered to differ depending on the user's situation.
[0005]
The present invention has been made in view of the above problem, and it is an object of the present invention to provide a sound output device, and a method of reproducing sound in a sound output device, that allow the user to grasp the surrounding situation even while the sound output device is in use.
[0006]
A sound output device according to an aspect of the present invention includes an acquisition unit that acquires an environmental sound, an identification unit that identifies the user's situation, an extraction unit that extracts a predetermined sound from the environmental sound acquired by the acquisition unit according to the user's situation identified by the identification unit, and a reproduction unit that executes reproduction processing based on the sound extracted by the extraction unit.
[0007]
A method of reproducing sound in a sound output device according to an aspect of the present invention includes a step of acquiring an environmental sound, a step of identifying the user's situation, a step of extracting a predetermined sound from the acquired environmental sound according to the identified user situation, and a step of executing reproduction processing based on the extracted sound.
[0008]
According to the sound output device or the method of reproducing sound in the sound output device, a predetermined sound is extracted from the environmental sound according to the user's situation, and reproduction processing is executed based on the extracted sound.
For example, by executing reproduction processing that makes a required predetermined sound easier to hear, the user can notice the presence of that sound even while using the sound output device, and can thereby grasp the surrounding situation.
[0009]
In addition, the reproduction means may perform emphasis processing on the sound extracted by the extraction means.
By listening to the emphasized sound, the user can more reliably notice the presence of the predetermined sound that the user needs.
[0010]
Further, the reproduction means may execute the emphasis processing in consideration of the positional relationship between the sound source of the predetermined sound and the sound output device.
The user can thereby hear the sound in a way that reflects the position of the sound source.
[0011]
In addition, the reproduction means may execute the emphasis processing when the sound extracted by the extraction means includes a predetermined vocabulary. This makes it possible to further narrow down the sounds subjected to emphasis processing, so that the user hears the sounds that are more necessary.
[0012]
Also, the reproduction means may execute the emphasis processing by changing the reproduction state of another sound. By changing the reproduction state of other sounds so that the necessary sound is easier to hear, the user can hear the necessary sound more reliably.
[0013]
Also, the reproduction means may execute the emphasis processing by converting the sound extracted by the extraction means. For example, by converting the extracted sound into a sound that the user can easily hear and reproducing it, the user can notice the presence of the sound more reliably.
[0014]
According to the present invention, the user can grasp the surrounding situation even while using a sound output device.
[0015]
FIG. 1 is a diagram for explaining an outline of a sound output device according to an embodiment.
FIG. 2 is a diagram showing a schematic configuration of the sound output device and related devices. FIG. 3 is a diagram showing an example of the data table stored in the storage unit. FIG. 4 is a diagram for explaining an outline of the data processing for extracting the footsteps of another person. FIG. 5 is a diagram for explaining identification of the user situation based on data from the electroencephalogram sensor. FIG. 6 is a flowchart showing an example of the processing performed by the sound output device.
[0016]
Hereinafter, embodiments of the present invention will be described with reference to the
drawings. In the description of the drawings, the same elements will be denoted by the same
reference symbols and redundant description will be omitted.
[0017]
FIG. 1 is a diagram for explaining an outline of a sound output device according to the embodiment. In the example shown in FIG. 1, the sound output device is a headphone device 100, and the headphone device 100 includes a microphone 110, a control unit 120, and a speaker 140. These elements are connected via a cord C, and a terminal T is provided at the end of the cord C. By connecting the terminal T to one of the devices in the device group 200 described later and wearing the headphone device 100, the user can listen to various sounds (for example, music and game sounds) from the speaker 140. Besides sound, the headphone device 100 can acquire various information from the device group 200. Note that such sounds and information can be obtained not only by connecting the terminal T of the headphone device 100 to each device, but also by using a short-range wireless communication technology such as Bluetooth (registered trademark). The sound output device according to the embodiment is not limited to the headphone device 100 and may be an earphone device.
[0018]
In the example illustrated in FIG. 1, the device group 200 includes, for example, a terminal device 200a, a wristband 200b, glasses 200c, an illuminance meter 200d, and a wristwatch 200e.
[0019]
The terminal device 200a is, for example, a portable communication terminal device such as a smartphone.
Such a terminal device is not only capable of network communication but also lets the user enjoy music and games by running applications. When the headphone device 100 is worn, those sounds are output to the user via the speaker 140.
[0020]
The wristband 200b is worn on the user's wrist; its heart rate sensor 203 acquires the user's heart rate and transmits that information to the headphone device 100.
[0021]
The glasses 200c are worn on the user's head; their electroencephalogram sensor 202 acquires the user's brain waves and transmits that information to the headphone device 100.
[0022]
The illuminance meter 200d is carried by the user, acquires the illuminance, and transmits that information to the headphone device 100.
[0023]
The wristwatch 200e is worn on the user's arm, acquires the current time, and transmits that information to the headphone device 100.
[0024]
The functions of the wristband 200b, glasses 200c, illuminance meter 200d, wristwatch 200e, and the like described above may instead be realized in the terminal device 200a.
Likewise, the functions (described later) of the microphone 110 and the control unit 120 included in the headphone device 100 may be realized in the terminal device 200a.
[0025]
FIG. 2 is a block diagram showing a schematic configuration of the headphone device 100 and related devices.
The headphone device 100 further includes a storage unit 130 in addition to the microphone 110, the control unit 120, and the speaker 140 described with reference to FIG. 1.
The storage unit 130 stores various information necessary for the processing performed by the control unit 120.
Although the storage unit 130 is provided separately from the control unit 120 in the example illustrated in FIG. 2, the storage unit 130 may be included in the control unit 120. Alternatively, the function of the storage unit 130 may be realized in the server 400.
[0026]
First, like a general earphone or headphone device, the headphone device 100 lets the user listen to music and game sounds. For example, the sound of the music reproduction device 208 or the game machine 209 is sent to the headphone device 100 and output from the speaker 140, where the user can hear it. The functions of the music reproduction device 208 and the game machine 209 can be realized, for example, in the terminal device 200a of FIG. 1.
[0027]
In addition to the functions of the general earphone and headphone devices described above, the headphone device 100 according to the present embodiment is characterized by including components such as the microphone 110, the control unit 120, and the storage unit 130.
[0028]
The microphone 110 detects ambient sound (environmental sound).
The microphone 110 may consist of a plurality of microphones. In the example of FIG. 1 described above, the microphone 110 consists of two microphones, but the number of microphones 110 may be three or more. By using a plurality of microphones 110, it is possible to detect the difference in the time at which the sound from a sound source reaches each microphone 110, and thereby to identify the position information of the sound source. The position information of the sound source includes the distance from the headphone device 100 to the sound source, its direction, and the like.
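As an illustration of how a two-microphone arrival-time difference can yield a source direction, the following is a minimal Python sketch. The microphone spacing, sample rate, and function names are assumptions made for illustration; the patent does not specify a particular algorithm.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s, at roughly 20 degrees C
MIC_DISTANCE = 0.18      # m, assumed spacing between the two earcup microphones
SAMPLE_RATE = 48_000     # Hz, assumed

def estimate_direction(left: np.ndarray, right: np.ndarray) -> float:
    """Return the bearing of the source in degrees (0 = straight ahead,
    positive = toward the right microphone, by this sketch's convention)."""
    # Cross-correlate the two channels; the lag of the peak is the
    # arrival-time difference in samples.
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)
    delay = lag / SAMPLE_RATE                      # seconds
    # Far-field approximation: delay = d * sin(theta) / c
    sin_theta = np.clip(delay * SPEED_OF_SOUND / MIC_DISTANCE, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))
```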
[0029]
The control unit 120 includes an acquisition unit 121, an identification unit 122, an extraction unit 123, and a reproduction unit 124.
[0030]
The acquisition unit 121 is the part that acquires the environmental sound.
The environmental sound is acquired via the microphone 110. The acquisition unit 121 can also record (store) the acquired environmental sound as needed. The recorded data can be stored in the storage unit 130.
[0031]
The identification unit 122 is the part that identifies the user's situation (user situation). The user situation is identified based on the information transmitted from the device group 200. This will be described in more detail later.
[0032]
The extraction unit 123 is the part that extracts a predetermined sound from the environmental sound acquired by the acquisition unit 121 according to the user situation identified by the identification unit 122. The predetermined sound can be extracted by analyzing the environmental sound using, for example, a known speech recognition technology that includes analysis of frequency components. Which predetermined sound to extract can be determined by creating in advance a database that associates user situations with predetermined sounds, and referring to that database. Such a database can be stored in the storage unit 130. An example will be described later with reference to FIG. 3.
[0033]
The reproduction unit 124 is the part that executes reproduction processing based on the sound extracted by the extraction unit 123. The reproduction processing is processing that enables the user to notice the presence of the predetermined sound. In the reproduction processing, the extracted sound may be reproduced as it is, or may be reproduced after emphasis processing is applied. The emphasis processing is, for example, processing that increases the volume of the extracted sound, converts its frequency, or replaces (converts) it with another sound registered in advance. The reproduction unit 124 transmits the sound to be reproduced to the speaker 140, and the speaker 140 outputs the sound so that the user can hear it. Note that the music and game sounds described above are also transmitted to the speaker 140 via the reproduction unit 124, so the reproduction unit 124 can also control the reproduction of music and game sounds.
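The emphasis processing described above (boosting the extracted sound while changing how other sounds play) could, for example, look like the following sketch, assuming audio is handled as normalized NumPy arrays; the gain values and function name are illustrative assumptions, not the patent's method.

```python
import numpy as np

def emphasize_and_mix(music: np.ndarray, extracted: np.ndarray,
                      duck_gain: float = 0.2, boost_gain: float = 2.0) -> np.ndarray:
    """Duck the music and boost the extracted environmental sound."""
    n = min(len(music), len(extracted))
    mixed = duck_gain * music[:n] + boost_gain * extracted[:n]
    # Prevent clipping in the final signal sent to the speaker 140.
    return np.clip(mixed, -1.0, 1.0)
```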
[0034]
The storage unit 130 stores various data necessary for the processing performed by the control unit 120. An example of the data stored in the storage unit 130 will be described later with reference to FIG. 3.
[0035]
The speaker 140 outputs the sound transmitted from the reproduction unit 124. The speaker 140 may consist of a plurality of speakers. In the example of FIG. 1 described above, the speaker 140 consists of two speakers, which makes stereo output possible. For example, by shifting the timing of the sound output by each speaker (the timing at which the user hears the sound with the left and right ears), the user can be made to perceive the sound as coming from a specific direction. For example, when the position of the sound source has been identified by the plurality of microphones 110 described above, the sound can be output from the speakers 140 in such a way that the user can tell the position of the sound source.
[0036]
The control unit 120 described above can physically be configured as a computer that includes hardware such as one or more central processing units (CPUs), a random access memory (RAM) and a read only memory (ROM) as main storage, a communication module as a data transmission/reception device, and an auxiliary storage device such as a hard disk. Each function of the control unit 120 described with reference to FIG. 2 is realized by loading predetermined computer software onto hardware such as the CPU and RAM, operating the communication module under control of the CPU, and reading and writing data in the RAM and the auxiliary storage device. Each function of the control unit 120 can also be realized using dedicated hardware. In addition, as described above, the functions of the control unit 120 can also be realized in the terminal device 200a.
[0037]
Next, the device group 200 will be described. In the example shown in FIG. 2, the device group 200 includes an acceleration sensor 201, an electroencephalogram sensor 202, a heart rate sensor 203, an illuminance sensor 204, a timer 205, a GPS (Global Positioning System) device 206, a communication device 207, a music reproduction device 208, and a game machine 209. The functions of the acceleration sensor 201, GPS device 206, communication device 207, music reproduction device 208, and game machine 209 are implemented, for example, in the terminal device 200a. The electroencephalogram sensor 202 is mounted, for example, on the glasses 200c. The heart rate sensor 203 is mounted, for example, on the wristband 200b. The illuminance sensor 204 is mounted, for example, on the terminal device 200a or the illuminance meter 200d. The timer 205 is mounted, for example, on the terminal device 200a or the wristwatch 200e.
[0038]
The communication device 207 can communicate with the server 400 via the communication network 300. For example, cooperation between the GPS device 206 and the communication device 207 enables positioning using processing on the server 400, acquisition of the movement history, and the like. In addition, various music contents can be downloaded through cooperation between the music reproduction device 208 and the communication device 207, and various game contents can be downloaded through cooperation between the game machine 209 and the communication device 207.
[0039]
The outline of the operation of the headphone device 100 is as follows. First, when the user uses the music reproduction device 208 or the game machine 209, music or game sounds are output from the speaker 140 and the user can hear them. Furthermore, while the user is listening to music or game sounds (that is, while using the headphone device 100), if the environmental sound detected by the microphone 110 includes a predetermined sound, that sound is extracted by the extraction unit 123. The reproduction unit 124 then executes reproduction processing based on the extracted sound. The user can thus notice the presence of a predetermined sound contained in the environmental sound even while listening to music or game sounds.
[0040]
In the present embodiment, a sound considered necessary for the user is set as the predetermined sound. Since such a sound differs according to the user situation, an appropriate sound for each user situation is set as the predetermined sound. For this purpose, the storage unit 130 stores a data table in which user situations and predetermined sounds are associated with each other.
[0041]
FIG. 3 is a diagram showing an example of the data table stored in the storage unit 130. As shown in FIG. 3, the data table 130a describes user situations and predetermined sounds in association with each other.
[0042]
The user situations include, for example, "walking alone at night", "running", and "concentrating on a game in a train" (or sleeping in a train). The predetermined sounds "footsteps of others", "vehicle sounds", and "in-car announcements", respectively, correspond to these user situations.
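A minimal sketch of how the data table 130a might be represented, assuming a simple in-memory mapping (the patent does not prescribe a data structure):

```python
# Keys and values follow paragraph [0042]; the dict representation is an assumption.
PREDETERMINED_SOUNDS = {
    "walking alone at night": "footsteps of others",
    "running": "vehicle sounds",
    "concentrating on a game in a train (or sleeping)": "in-car announcements",
}

def sound_to_extract(user_situation: str) -> str | None:
    """Look up the predetermined sound for the identified user situation."""
    return PREDETERMINED_SOUNDS.get(user_situation)
```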
[0043]
The predetermined sound "footsteps of others" corresponds to the user situation "walking alone at night" because, when the user is walking alone at night, it is considered preferable to inform the user of the presence of another person's footsteps, for example when another person approaches from behind. In other words, in the user situation "walking alone at night", "footsteps of others" is judged to be a sound necessary for the user.
[0044]
Here, with reference to FIG. 4, an outline of the data processing for extracting the footsteps of another person will be described. First, with reference to FIGS. 4A and 4B, the case where no other person's footsteps exist in the environmental sound and only the user's footsteps are present will be described.
[0045]
FIG. 4A is a graph showing an example of the temporal change of the environmental sound. The horizontal axis of the graph indicates time, and the vertical axis indicates voltage. The vertical axis indicates voltage because the user's footsteps are converted into voltage signals when acquired by the microphone 110 (FIG. 2). As shown in FIG. 4A, the user's footsteps occur at predetermined intervals (one half of the user's walking cycle).
[0046]
FIG. 4B is a graph showing an example of the frequency components of the environmental sound. The horizontal axis indicates frequency, and the vertical axis indicates power. This graph can be obtained, for example, by applying a Fourier transform to the data of FIG. 4A. The Fourier transform is performed, for example, by the extraction unit 123 (FIG. 2). As shown in FIG. 4B, a power spectrum component centered on the frequency f1 (for example, 2 Hz) is observed. This component corresponds to the user's footsteps, and the frequency f1 depends on the user's walking cycle.
[0047]
Next, with reference to FIGS. 4C and 4D, the case where another person's footsteps also exist in the environmental sound in addition to the user's footsteps will be described. FIG. 4C shows the temporal change of the environmental sound when another person's footsteps are added to the situation of FIG. 4A. FIG. 4D is obtained by applying the Fourier transform to the data of FIG. 4C. In FIG. 4D, not only the power spectrum component centered on the frequency f1 but also a power spectrum component centered on the frequency f2 (for example, 1.7 Hz) is observed. The component centered on f2 corresponds to the other person's footsteps, and the frequency f2 depends on the other person's walking cycle.
[0048]
As shown in FIG. 4D, the user's footsteps and the other person's footsteps can be distinguished as a power spectrum component centered on the frequency f1 and one centered on the frequency f2. Therefore, if information on the user's own footsteps (the walking cycle and the corresponding frequency f1, etc.) is known in advance, the extraction unit 123 (FIG. 2) can distinguish the user's footsteps from another person's footsteps based on the frequency power spectrum shown in FIG. 4D and extract the other person's footsteps. The information on the user's footsteps may be stored in the storage unit 130. Observing the power spectrum obtained by the Fourier transform also prevents a non-periodic one-off sound from being misrecognized as a footstep.
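The following sketch illustrates the kind of spectral processing described here: Fourier-transform a footstep-band signal, mask out the user's own known frequency f1, and look for another periodic peak such as f2. The sampling rate, band limits, and tolerance are assumptions for illustration.

```python
import numpy as np

SAMPLE_RATE = 100.0     # Hz, assumed rate of the footstep envelope signal
USER_STEP_FREQ = 2.0    # Hz, the user's own frequency f1, known in advance
TOLERANCE = 0.2         # Hz, band around f1 attributed to the user (assumed)

def other_footstep_frequency(signal: np.ndarray) -> float | None:
    """Return the dominant footstep frequency that is NOT the user's own
    (e.g. f2 = 1.7 Hz in FIG. 4D), or None if no such peak exists."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2            # power spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / SAMPLE_RATE)
    # Keep only the plausible walking band and mask out the user's own peak.
    band = (freqs > 0.5) & (freqs < 4.0)
    not_user = np.abs(freqs - USER_STEP_FREQ) > TOLERANCE
    candidates = band & not_user
    if not np.any(candidates):
        return None
    peak = np.argmax(np.where(candidates, spectrum, 0.0))
    return float(freqs[peak])
```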
[0049]
Referring back to FIG. 3, the predetermined sound "vehicle sounds" corresponds to the user situation "running" because, when a vehicle is approaching from behind during running, for example, it may be preferable to inform the user of the presence of the vehicle sound. In other words, in the user situation "running", "vehicle sounds" is judged to be a sound necessary for the user. The sound of a vehicle can be identified, for example, by determining whether the power spectrum includes components in the frequency band of vehicle sounds.
[0050]
The predetermined sound "in-car announcements" corresponds to the user situation "concentrating on a game in a train (or sleeping)" because, when an announcement is made in the car while the user is absorbed in a game in a train or the like, it is considered preferable to inform the user of the presence of the announcement. In other words, in this user situation, "in-car announcements" is judged to be a sound necessary for the user. An in-car announcement can be identified, for example, by determining whether the power spectrum includes components in the frequency band of in-car announcements, or by voice recognition.
[0051]
By referring to the data table 130a shown in FIG. 3 in this way, an appropriate predetermined sound (a sound the user needs) can be set according to the user's situation.
[0052]
As described above, the user situation is identified by the identification unit 122 (FIG. 2) based on the information transmitted from the device group 200.
Here, the methods of identifying the user situations listed in FIG. 3 will be described concretely.
[0053]
The user situation "walking alone at night" can be identified based on the current time and information on the user's position. Specifically, when the current time is night (for example, between 20:00 and 7:00), the user is on a road of a predetermined width or less (for example, 6 m or less), the number of pedestrians at that position is relatively small (for example, 0.1 persons per square meter or less), and the place is relatively dark (for example, an illuminance of 300 lux or less), it can be identified (inferred) that the user is walking alone at night. As described with reference to FIG. 2, the current time can be obtained from the timer 205 and the user's position from the GPS device 206. Information on the number of pedestrians at the user's position can be acquired from the server 400 via the communication device 207, for example. Whether the place is relatively dark can be determined from the illuminance information acquired from the illuminance sensor 204.
[0054]
When the information on the number of pedestrians at the user's position is acquired from the server 400 as described above, that information can be produced in the server 400. Specifically, the server 400 communicates with communication terminals (for example, the terminal device 200a in FIG. 1) carried by a plurality of users, and by keeping track of the position information of each communication terminal, it can create information on the number of pedestrians at the user's position.
[0055]
The user situation "running" can be identified based on the user's physical state, information on the user's position, and the like. Specifically, when the user's heart rate is equal to or above a predetermined value (for example, 100 beats per minute) and the user's moving speed is equal to or above a predetermined value (for example, 8 km/h), it can be identified (inferred) that the user is running. As described with reference to FIG. 2, the heart rate can be obtained from the heart rate sensor 203. The moving speed can be calculated based on the movement history acquired from the GPS device 206, or based on the acceleration history acquired from the acceleration sensor 201.
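The rule-based identification in the two paragraphs above can be illustrated with the following sketch; all thresholds are the example values given in the text, and the function signature is a hypothetical simplification.

```python
def identify_situation(hour: int, road_width_m: float, pedestrian_density: float,
                       illuminance_lux: float, heart_rate_bpm: float,
                       speed_kmh: float) -> str | None:
    """Rule-based guesses following paragraphs [0053] and [0055]."""
    is_night = hour >= 20 or hour < 7              # e.g. 20:00 to 7:00
    if (is_night and road_width_m <= 6.0
            and pedestrian_density <= 0.1          # persons per square meter
            and illuminance_lux <= 300):
        return "walking alone at night"
    if heart_rate_bpm >= 100 and speed_kmh >= 8.0:
        return "running"
    return None
```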
[0056]
The user situation "concentrating on a game in a train" or "sleeping in a train" can be identified based on information on the user's position, the user's state, user operations, and the like. Specifically, it can be identified (inferred) that the user is on a train from the user's position, moving speed, movement history, and the like. Based on the state of the user's brain waves, it can be identified (inferred) whether the user is concentrating or sleeping. Furthermore, when the user is operating the game machine 209, it can be identified (inferred) that the user is playing a game. Referring to FIG. 2, the user's position, moving speed, movement history, and the like can be acquired from the GPS device 206 and the acceleration sensor 201. The state of the user's brain waves can be acquired from the electroencephalogram sensor 202. The presence or absence of user operations on the game machine 209 can be acquired from the game machine 209.
[0057]
Here, with reference to FIG. 5, identification of the user situation based on the electroencephalogram sensor will be described. In the graph shown in FIG. 5, the horizontal axis indicates time (in seconds, for example), and the vertical axis indicates the ratio (in percent, for example) of the level of the measured electroencephalogram to the electroencephalogram at normal times. Here the measurement value of the electroencephalogram sensor is calibrated against the user's normal electroencephalogram (measured and recorded in advance, for example), so that the difference between the current brain waves and the normal brain waves can be grasped. The temporal change of the electroencephalogram (that is, its waveform) differs depending on the user situation, so the user situation (concentrating, sleeping, and so on) can be identified from the information acquired by the electroencephalogram sensor. For example, when a waveform is measured in which a specific component of the electroencephalogram is 30% or more above normal, the user may be identified as being in a state of concentration. Data associating several such predetermined waveform patterns with user situations may be created in advance and stored in the storage unit 130 (FIG. 2). The identification unit 122 can then identify the user situation based on the information transmitted from the electroencephalogram sensor 202 and the data stored in the storage unit 130.
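As an illustration, the ratio-to-baseline comparison of FIG. 5 might be computed as follows; the 130% threshold reflects the "30% or more above normal" example, and the array-based interface is an assumption.

```python
import numpy as np

def eeg_ratio_to_baseline(current: np.ndarray, baseline: np.ndarray) -> np.ndarray:
    """Percentage of the measured EEG level relative to the user's normal
    (pre-measured) EEG, as plotted in FIG. 5."""
    return 100.0 * current / baseline

def is_concentrating(current: np.ndarray, baseline: np.ndarray) -> bool:
    # Paragraph [0057]: a specific component observed 30% or more above
    # normal is taken as a sign of concentration.
    return bool(np.any(eeg_ratio_to_baseline(current, baseline) >= 130.0))
```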
[0058]
FIG. 6 is a flowchart showing an example of the processing (the method of reproducing sound in the sound output device) executed by the headphone device 100. The processing of this flowchart is started, for example, when an application for music reproduction or a game is run on the terminal device 200a and music or game sound starts being output from the speaker 140 of the headphone device 100. Each step of the flowchart is executed by the control unit 120 unless otherwise noted.
[0059]
First, the headphone device 100 starts acquiring (recording) the environmental sound (step S1). This processing is executed by the acquisition unit 121, which stores the acquired environmental sound in the storage unit 130.
[0060]
Next, the headphone device 100 identifies the user situation (step S2). This processing is executed by the identification unit 122. The example shown in FIG. 6 distinguishes three user situations, A to C. Situation A is, by way of example, the user situation "walking alone at night", situation B is "running", and situation C is "concentrating on a game in a train".
[0061]
If the user situation identified in step S2 is situation A, the headphone device 100 sets the sound to be extracted to the predetermined sound A (step S3). The predetermined sound A is, for example, the footsteps of another person. The headphone device 100 then determines whether the environmental sound includes the predetermined sound A (step S4). When it does (step S4: YES), the headphone device 100 extracts the predetermined sound A from the environmental sound (step S5). If not (step S4: NO), the headphone device 100 returns the processing to step S2. These steps are performed by the extraction unit 123. Whether the environmental sound includes the predetermined sound A can be determined, for example, by executing data processing (for example, speech recognition processing) on the recorded environmental sound data from the present back to a predetermined time in the past (for example, several seconds to several tens of seconds).
[0062]
After extracting the predetermined sound A in step S5, the headphone device 100 applies the emphasis processing and executes the reproduction processing (step S6). This processing is performed by the reproduction unit 124. The emphasis processing here is, for example, reducing the volume of the music and reproducing footsteps registered in advance. The footsteps registered in advance are sounds reminiscent of footsteps, such as "katoon, katoon". The headphone device 100 then returns the processing to step S2.
[0063]
On the other hand, if the user situation identified in step S2 is situation B, the headphone device 100 sets the sound to be extracted to the predetermined sound B (step S7). The predetermined sound B is, for example, the sound of a vehicle. The headphone device 100 then determines whether the environmental sound includes the predetermined sound B (step S8). When it does (step S8: YES), the headphone device 100 extracts the predetermined sound B from the environmental sound (step S9). If not (step S8: NO), the headphone device 100 returns the processing to step S2. These steps are performed by the extraction unit 123.
[0064]
After extracting the predetermined sound B in step S9, the headphone device 100 identifies the direction of the sound source (step S10). The direction of the sound source can be identified by using the plurality of microphones 110 (FIG. 1) as described above. This processing is performed, for example, by the acquisition unit 121. For example, when a vehicle approaches, the direction of the sound source is the direction from which the vehicle approaches.
[0065]
The headphone device 100 then applies the emphasis processing and executes the reproduction processing (step S11). The emphasis processing here takes into consideration the positional relationship between the sound source of the predetermined sound B and the headphone device 100. For example, the music is reproduced in a blurred manner while the vehicle sound is reproduced clearly enough that its direction can also be recognized. Blurring the music means making it sound to the user as if it were coming from a distance; for example, by lowering its volume or shifting its frequency, the music can be blurred, which makes the vehicle sound relatively clear. Reproduction that conveys the direction of the vehicle sound can be performed by using the plurality of speakers 140 (FIG. 1) as described above. This processing is performed by the reproduction unit 124. The headphone device 100 then returns the processing to step S2.
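Direction-aware reproduction through the two speakers can be sketched as a simple interaural-time-difference pan, as below; the maximum delay value, sample rate, and stereo layout are assumptions, not the patent's method.

```python
import numpy as np

SAMPLE_RATE = 48_000   # Hz, assumed
MAX_ITD_S = 0.0007     # ~0.7 ms, roughly the largest interaural time difference

def render_directional(mono: np.ndarray, bearing_deg: float) -> np.ndarray:
    """Return an (n, 2) stereo array in which the mono sound appears to come
    from `bearing_deg` (0 = ahead, +90 = right) by delaying the far ear."""
    delay = int(abs(np.sin(np.radians(bearing_deg))) * MAX_ITD_S * SAMPLE_RATE)
    near = mono
    far = np.concatenate([np.zeros(delay), mono])[: len(mono)]
    if bearing_deg >= 0:          # source on the right: delay the left channel
        return np.stack([far, near], axis=1)
    return np.stack([near, far], axis=1)
```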
[0066]
If the user situation identified in step S2 is situation C, the headphone device 100 sets the sound to be extracted to the predetermined sound C (step S12). The predetermined sound C is, for example, an in-car announcement. The headphone device 100 then determines whether the environmental sound includes the predetermined sound C (step S13). When it does (step S13: YES), the headphone device 100 extracts the predetermined sound C from the environmental sound (step S14). If not (step S13: NO), the headphone device 100 returns the processing to step S2. These steps are performed by the extraction unit 123. Whether the environmental sound includes an in-car announcement may be determined by checking whether the environmental sound contains components in the frequency band of in-car announcements, or by using voice recognition technology.
[0067]
After extracting the predetermined sound C in step S14, the headphone device 100 determines whether the predetermined sound C includes a predetermined vocabulary (step S15). The predetermined vocabulary is, for example, the name of the station where the user should get off. This processing is performed by the extraction unit 123. When the predetermined sound C includes the predetermined vocabulary (step S15: YES), the headphone device 100 applies the emphasis processing and executes the reproduction processing (step S16). If not (step S15: NO), the headphone device 100 returns the processing to step S2. The emphasis processing in step S16 is, for example, stopping the game sound (or the music, when the user is listening to music instead of playing a game) and reproducing the in-car announcement. Alternatively, the volume of the game sound may be reduced (including to zero), or the reproduction speed of the sound may be changed (for example, reduced). That is, the emphasis processing in step S16 changes the reproduction state of another sound (for example, the game sound) so that the user can easily hear the required sound (for example, the name of the station to get off at). This processing is performed by the reproduction unit 124. After completing step S16, the headphone device 100 returns the processing to step S2. The predetermined vocabulary can be set in advance by the user, for example, and stored in the storage unit 130.
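The vocabulary check of step S15 could be sketched as follows, assuming some speech-to-text facility is available; the patent only states that voice recognition may be used, so the `transcribe` callable and the example station name are hypothetical.

```python
# Assumed example; in practice set in advance by the user and stored
# in the storage unit 130 (paragraph [0067]).
DISMOUNT_STATION = "Shibuya"

def contains_vocabulary(announcement_audio, transcribe) -> bool:
    """`transcribe` is any speech-to-text callable; not specified by the patent."""
    text = transcribe(announcement_audio)
    return DISMOUNT_STATION in text
```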
[0068]
The recorded environmental sound data whose recording was started in step S1 may be updated as appropriate. For example, only the data from the present back to a predetermined time in the past (for example, several seconds to several tens of seconds) may be kept as the recording needed for extracting a predetermined sound, and older data may be deleted. This prevents the recorded data from growing too large. The processing of the flowchart of FIG. 6 ends, for example, when the music reproduction or game application on the terminal device 200a is terminated and the output of music or game sound from the speaker 140 of the headphone device 100 stops.
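Putting the flowchart together, a hypothetical skeleton of the main loop (steps S1 to S16), including the rolling recording buffer described just above, might look like this; the four callables stand in for the units 121 to 124 and are assumptions, not the patent's API.

```python
import collections
import time

BUFFER_SECONDS = 30      # keep only the recent window, per [0068] (assumed value)
CHUNK_SECONDS = 0.5      # assumed recording chunk length

def main_loop(mic, identify, extract, reproduce):
    """Sketch of FIG. 6: S1 record, S2 identify, S3-S5/S7-S9/S12-S14 extract,
    S6/S10-S11/S15-S16 emphasize and reproduce."""
    buffer = collections.deque(maxlen=int(BUFFER_SECONDS / CHUNK_SECONDS))
    while True:                                    # runs while music/game audio plays
        buffer.append(mic.read(CHUNK_SECONDS))     # step S1: rolling recording
        situation = identify()                     # step S2
        if situation is not None:
            sound = extract(list(buffer), situation)   # set and extract the sound
            if sound is not None:
                reproduce(sound, situation)        # emphasis + reproduction
        time.sleep(CHUNK_SECONDS)
```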
[0069]
Next, the operation and effects of the headphone device 100 will be described. As shown in FIGS. 2 and 3 and elsewhere, in the headphone device 100 the acquisition unit 121 acquires the environmental sound (step S1) and the identification unit 122 identifies the user situation (step S2). The extraction unit 123 extracts a predetermined sound from the acquired environmental sound according to the identified user situation (steps S5, S9, and S14). The reproduction unit 124 executes reproduction processing based on the extracted sound (steps S6, S10, S11, S15, and S16).
[0070]
According to the headphone device 100, by executing reproduction processing that makes a necessary sound easier to hear according to the user's situation, the user can grasp the surrounding situation even while listening to music or game sounds (that is, while using the headphone device 100).
[0071]
Specifically, the reproduction unit 124 applies emphasis processing to the sound extracted by the extraction unit 123 (steps S6, S11, and S16).
By listening to the emphasized sound, the user can more reliably notice the presence of the predetermined sound that the user needs.
[0072]
In addition, the reproduction unit 124 can execute the emphasis processing in consideration of the positional relationship between the sound source of the predetermined sound and the headphone device 100 (steps S10 and S11). The user can thereby hear the sound in a way that reflects the position of the sound source.
[0073]
The reproduction unit 124 can also execute the emphasis processing when the sound extracted by the extraction unit 123 includes a predetermined vocabulary (steps S15 and S16). This further narrows down the sounds subjected to emphasis processing, so the user hears the sounds that are more necessary.
[0074]
In addition, the reproduction unit 124 can execute the emphasis processing by changing the reproduction state of another sound (step S16). By changing the reproduction state of other sounds so that the necessary sound is easier to hear, the user can hear the necessary sound more reliably.
[0075]
The reproduction unit 124 may also execute the emphasis processing by converting the sound extracted by the extraction unit 123 (step S6). For example, by converting the extracted sound into a sound that is easy for the user to hear and reproducing it, the user can grasp the presence of the sound more easily.
[0076]
As the sound output device such as the headphone device 100 described above, a so-called closed headphone device or earphone device can be suitably used. For example, when the speaker 140 is shaped so as to cover the user's entire ear when worn, as in the headphone device 100 shown in FIG. 1, it blocks the environmental sound, and the device can be called a closed-type sound output device. A so-called canal-type earphone device is likewise a closed-type sound output device. With a closed-type sound output device, the environmental sound is shut out and the user can listen to music and game sounds more comfortably. Moreover, even though the environmental sound is blocked by the closed-type sound output device, according to the present embodiment the sounds the user needs are reproduced, so the user can still grasp the surrounding situation.
[0077]
DESCRIPTION OF SYMBOLS: 100: headphone device; 110: microphone; 120: control unit; 121: acquisition unit; 122: identification unit; 123: extraction unit; 124: reproduction unit; 130: storage unit; 140: speaker; 200: device group; 300: communication network; 400: server.