Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2011180470
A voice visualization device that allows the arrival direction and timing of sound to be grasped intuitively is provided. SOLUTION: Audio data with directivity formed for each predetermined azimuth is generated, and the azimuth of a sound source is determined from the volume of each audio data stream. Based on the determination result, a circular image indicating the timing at which the sound was collected and the arrival direction of the sound is displayed, and the circular image is moved toward the center of the screen as time passes. The size of the circular image is also changed according to the volume of each audio data stream. [Selected figure] Figure 4
Voice visualization device
[0001]
The present invention relates to a technique for visualizing voice (hereinafter referred to as sound).
[0002]
Conventionally, a device such as that disclosed in Patent Document 1 has been proposed for visualizing sound during reproduction.
[0003]
The apparatus of Patent Document 1 displays a visible image (a ball image) that moves from top to bottom on a matrix display screen whose horizontal axis is associated with a keyboard and whose vertical axis is associated with a time axis. When the ball image overlaps the keyboard, a musical tone is generated, so the user can see how time elapses between the reception of data (performance information) from another device and the arrival of the sound generation timing.
[0004]
Japanese Patent No. 3922207
[0005]
Many recent recording and reproducing apparatuses can control directivity, so grasping the arrival direction of sound is important. With the apparatus of Patent Document 1, however, although the elapsed time could be grasped intuitively, the arrival direction of the sound could not.
[0006]
Therefore, an object of the present invention is to provide a voice visualization device that allows the arrival direction and passage of time of sound to be grasped intuitively.
[0007]
The voice visualization device of the present invention includes a plurality of microphones, a
directivity control unit, a temporary storage unit, a sound source direction determination unit,
and a display processing unit.
The directivity control unit generates collected sound signals in which directivity is formed for each predetermined azimuth, based on the sound collected by the plurality of microphones. For example, two directional microphones whose sound collecting directions are inclined at a predetermined angle are placed close to each other, and the gains of the sound collected by each directional microphone are varied to generate a plurality of combined signals. Changing the gain of the sound collected by each microphone changes the directivity, so collected signals with directivity formed in a plurality of azimuths can be generated. The temporary storage unit temporarily stores the collected sound signal for each azimuth.
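As a rough illustration of this gain-and-sum idea, the following Python/NumPy sketch forms one collected signal per azimuth as a weighted sum of the microphone signals. The gain table and the random "audio" here are placeholders for illustration only; the embodiment's actual gain values appear later in paragraph [0039].

```python
import numpy as np

def synthesize_directional(mic_signals: np.ndarray, gains: np.ndarray) -> np.ndarray:
    # Weighted sum across microphones: the choice of gains determines the
    # direction in which the combined directivity is formed.
    return gains @ mic_signals

# Illustrative only: random "audio" and placeholder gains for 8 azimuths.
rng = np.random.default_rng(0)
mics = rng.standard_normal((3, 48000))            # 3 mics, 1 s at 48 kHz
gain_table = rng.uniform(-0.5, 1.2, size=(8, 3))  # one gain row per azimuth
collected = np.stack([synthesize_directional(mics, g) for g in gain_table])
print(collected.shape)                            # (8, 48000)
```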
[0008]
The sound source direction determination unit compares the levels of the collected sound signals of the respective azimuths and determines that a sound source is present in any azimuth with a high level (an azimuth whose volume exceeds a predetermined level).
[0009]
Based on the determination result of the sound source direction determination unit, the display processing unit displays an image indicating the timing at which the sound was picked up by the microphones and the arrival direction of the sound, and moves the image as time passes. For example, with the screen center position representing the position of the device, a circular image is displayed toward the outside of the screen; the circumference of this circular image represents the sound collection timing. A ball image is then displayed moving from the outside of the screen toward the center with the passage of time. As a result, the arrival direction of the sound and the passage of time can be grasped intuitively.
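A minimal sketch of how such a display might map a detection to screen coordinates. The conventions here are assumptions, not the patent's: screen y grows downward, 0° is drawn straight up, and the inward travel time is a fixed parameter.

```python
import math

def ball_position(azimuth_deg: float, elapsed_s: float, travel_s: float,
                  outer_radius: float, center: tuple) -> tuple:
    # The ball appears on the outer circle at the sound's arrival azimuth
    # and moves linearly toward the screen center as time passes.
    frac = min(elapsed_s / travel_s, 1.0)      # 0 at the rim, 1 at the center
    r = outer_radius * (1.0 - frac)
    theta = math.radians(90.0 - azimuth_deg)   # assume 0 deg points up on screen
    cx, cy = center
    return (cx + r * math.cos(theta), cy - r * math.sin(theta))

# A ball that arrived 2 s ago from azimuth -140 deg, on a 5 s inward journey:
print(ball_position(-140.0, 2.0, 5.0, outer_radius=200.0, center=(240.0, 240.0)))
```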
[0010]
In addition, the display processing unit may control the size of the ball image based on the level
for each direction. In this case, it is possible to intuitively grasp the difference in volume.
[0011]
Further, the sound source direction determination unit may detect the frequency characteristics of the collected sound signal for each predetermined azimuth, and the display processing unit may control the display color of the image based on those frequency characteristics. For example, a bright color is used when the high-frequency component of the sound is strong, and a dark color is used when the low-frequency component is strong. In this case, the frequency characteristics of the sound can also be grasped intuitively.
[0012]
Furthermore, the voice visualization device may be configured to include a recording unit that records data based on the sound collected by the plurality of microphones, and a reproduction unit that reproduces the recorded data.
[0013]
In this case, the directivity control unit generates a sound collection signal for each
predetermined azimuth based on the recording data read from the recording unit.
When the recording data contains the sounds collected by the plurality of microphones recorded as they are, the collected sound signal for each azimuth is generated by adjusting and combining the gains of those sounds as described above. If the data was recorded as collected sound signals for each azimuth at recording time, it is output as-is.
[0014]
The display processing unit then moves and displays the image in synchronization with the reproduction of the recorded data by the reproduction unit. That is, when a circular image is displayed with the screen center position representing the position of the device, the circumference of the circular image represents the timing at which sound is emitted. A ball image moving with the passage of time is then displayed from the outside of the screen toward the center. When reproducing recorded data, it is also possible to read ahead the direction and timing of the sound to be reproduced and display ball images outside the circular image representing the sound emission timing, thereby visualizing sound that will be emitted in the future.
[0015]
Further, the voice visualization device may include an operation unit that receives a user's operation, and a signal processing unit that performs signal processing on the recording data or on the collected sound signal of each predetermined azimuth according to the operation received by the operation unit. In this case, the display processing unit changes the movement mode of the image according to the received operation.
[0016]
For example, when the operation unit is a touch panel integrated with the display unit, the following can be realized. When an operation is performed on the screen that blocks the movement of a ball image moving from the outside of the screen toward the center, the movement of the ball image is stopped and the level of the collected sound signal in that azimuth is set to zero so that it is not recorded. Alternatively, during reproduction of recorded data, when the operation of blocking the movement of a ball image is performed outside the circular image representing the sound emission timing, the movement of the ball image is stopped and the reproduction level of the recorded data in that azimuth is set to zero so that no sound is emitted.
[0017]
According to the present invention, the arrival direction of sound and the passage of time can be grasped intuitively.
[0018]
FIG. 1 shows the appearance of the recording and reproducing device. FIG. 2 is a block diagram showing the configuration of the recording and reproducing device. FIG. 3 shows the structure of the azimuth-specific decomposition processing unit. FIG. 4 shows the screen displayed on the display unit. FIG. 5 illustrates the movement of the ball images. FIG. 6 shows an example of the side-view display. FIG. 7 shows examples of the screen display according to user operations. FIG. 8 shows the file selection screen. FIG. 9 shows the reproduction screen.
[0019]
An embodiment of the voice visualization device of the present invention will be described. FIG. 1 shows the appearance of a recording and reproducing device that realizes the voice visualization device of the present invention, and FIG. 2 is a block diagram showing the configuration of the recording and reproducing device.
[0020]
The recording and reproducing apparatus comprises a microphone unit 1 and a recording and reproducing unit 2. The microphone unit 1 functions as a peripheral device of the recording and reproducing unit 2 and includes a multi microphone 10 incorporating three microphones: a microphone 101, a microphone 102, and a microphone 103.
[0021]
The microphone unit 1 has a cylindrical appearance, and the portion where the multi microphone 10 is disposed is formed of punched metal to ensure strength and protect against damage (while remaining acoustically open). This punched-metal portion can be bent by 90 degrees, so the sound collecting directions of the microphones incorporated in the multi microphone 10 can be changed by 90 degrees.
[0022]
The microphone unit 1 outputs the sound collected by the plurality of microphones to the recording/reproducing unit 2 either as-is or after various signal processing. The microphone unit 1 also determines the relative azimuth of a sound source based on the sound collected by the plurality of microphones and, based on the determination result, outputs to the recording and reproducing unit 2 data for displaying an image that indicates the timing at which the sound of the sound source was collected and the arrival direction of that sound.
[0023]
The recording/reproducing unit 2 is a digital audio player with a card-shaped appearance; in addition to reproducing input audio data, it can also reproduce audio data stored in its built-in storage and emit sound on its own. The recording and reproducing unit 2 has a display unit 21 on its top surface, and the display unit 21 is a touch panel that doubles as an operation unit.
[0024]
In the present embodiment, in the recording / reproducing unit 2, the direction in which the
microphone unit is connected is the X direction (right surface direction), and the opposite
direction is the −X direction (left surface direction). Further, on a plane including the X axis, a
direction inclined by 90 degrees to the left from the X direction is taken as a Y direction (front
direction), and the opposite direction is taken as a -Y direction (back direction). The direction in
which the display unit 21 is provided is the Z direction (upper surface direction), and the
opposite direction is the −Z direction (lower surface direction).
[0025]
The microphone unit 1 and the recording / reproducing unit 2 are not necessarily separate
bodies as in the present embodiment, and may be integrated.
[0026]
As shown in FIGS. 1A and 1B, the microphone 101, microphone 102, and microphone 103 of the multi microphone 10 are disposed close to one another so that there is virtually no phase difference between them. Although FIGS. 1A and 1B show an example in which the microphone 101, the microphone 102, and the microphone 103 are arranged in order from one end of the cylinder toward the other, the arrangement is not limited to this example; for example, the three microphones may be arranged in the same plane.
[0027]
The microphone 101, the microphone 102, and the microphone 103 have single directivity (cardioid characteristics) and, as shown in FIG. 1C, are arranged so that the directions in which their sensitivity is maximum (their sound collecting directions) are shifted from one another by 120 degrees in the same plane.
[0028]
As described above, the portion of the microphone unit 1 where the multi microphone 10 is disposed can be bent by 90 degrees. In the bent (L-shaped) state shown in FIG. 1A, sound on the XY plane is picked up; in the unfolded state shown in FIG. 1B, sound on the XZ plane is picked up.
[0029]
FIG. 1C shows an example of picking up sound on the XY plane. In the state shown in FIG. 1C, the sound collecting direction of the microphone 101 is the Y direction (the ±0° direction), that of the microphone 102 is inclined 120° counterclockwise from the microphone 101 (the −120° direction), and that of the microphone 103 is inclined 120° clockwise (the +120° direction). By arranging the microphones in these sound collecting directions, the microphone unit 1 can pick up sound in all azimuths on the XY plane (or the XZ plane).
[0030]
FIG. 1C shows an example in which the sound collecting directions of the microphones are directed inward so as to cross at the center position; however, they may also be directed outward. When the sound collecting directions are inward, the microphone diaphragms can be placed closer together than when they are outward, so the phase difference can be kept smaller.
[0031]
Next, as shown in FIG. 2, in addition to the multi microphone 10 described above, the microphone unit 1 includes a selector (Sel.) 11, an azimuth-specific decomposition processing unit 12, an analysis unit 13, a display processing unit 14, an interface (I/F) 15, a channel (ch)-specific decomposition processing unit 16, a signal processing unit 17, a data creation unit 18, and a control unit 19. In addition to the display unit 21 described above, the recording/reproducing unit 2 includes an interface (I/F) 22, a control unit 23, an operation unit 24 (the touch panel that doubles as the display unit 21), a storage 25, a reproduction processing unit 26, a communication unit 27, and a sensor unit 28. In the present embodiment, the A/D and D/A conversion stages are omitted from the description, and all signals are assumed to be digital audio signals unless otherwise noted.
[0032]
In accordance with instructions from the control unit 19, the selector 11 of the microphone unit 1 outputs the sound collected by the microphones of the multi microphone 10 (the output sound signal x1(t) of the microphone 101, the output sound signal x2(t) of the microphone 102, and the output sound signal x3(t) of the microphone 103) to the azimuth-specific decomposition processing unit 12, the channel-specific decomposition processing unit 16, and the data creation unit 18.
[0033]
When the sound signals collected by the microphones are input from the selector 11, the data creation unit 18 outputs them as-is to the interface 15. In this case, the audio signals output to the recording and reproducing unit 2 are recorded as recording data in the storage 25 via the interface 22 and the control unit 23 of the recording and reproducing unit 2. The recording data format may record the audio signal of each microphone as-is as a digital audio signal, or may be encoded by the data creation unit 18 according to a predetermined encoding method. Note that the recording/reproducing unit 2 may obtain predetermined additional information (map information, GPS position information, time information, etc.) from the sensor unit 28 or the communication unit 27 and record it in association with the recording data.
[0034]
When the user issues a reproduction instruction using the operation unit 24, the recording data recorded in the storage 25 is input to the reproduction processing unit 26 and is also input to the selector 11 via the control unit 23, the interface 22, the interface 15, and the control unit 19. If the recorded data is encoded, it is decoded back into an audio signal. The reproduction processing unit 26 incorporates an amplifier, a speaker, and the like, and emits sound based on the input audio signal.
[0035]
Meanwhile, the selector 11 switches between the audio signal input from the multi microphone 10 and the audio signal input from the control unit 19 in accordance with instructions from the control unit 19. That is, at the time of recording (or during recording standby), the control unit 19 instructs the selector 11 to output the audio signal input from the multi microphone 10; when the user instructs reproduction via the operation unit 24, it instructs the selector 11 to output the audio signal based on the recorded data.
[0036]
The azimuth-specific decomposition processing unit 12 corresponds to the directivity control unit of the present invention. When audio signals are input from the selector 11, it adjusts and combines their gains to form a plurality of collected sound signals, each with directivity in a predetermined azimuth. The configuration and processing of the azimuth-specific decomposition processing unit 12 will be described with reference to FIG. 3.
[0037]
The azimuth-specific decomposition processing unit 12 is provided with a plurality of signal combining units, one for each direction in which directivity is to be formed. In this example, eight signal combining units 121 to 128 are provided to form directivity in eight azimuths. Each signal combining unit includes a gain adjustment unit 31A, a gain adjustment unit 31B, a gain adjustment unit 31C, and an addition unit 32.
[0038]
The three input audio signals are gain-adjusted in the gain adjustment units of each azimuth-specific signal combining unit and then summed in the addition unit 32. By setting each gain to an arbitrary value, the azimuth-specific decomposition processing unit 12 can generate a collected sound signal with directivity formed in an arbitrary direction.
[0039]
For example, in the present embodiment, a hypercardioid characteristic, which has the sharpest directivity, is used in order to increase the accuracy of sound source direction determination (sound source separation accuracy). That is, to form single directivity in the Y direction, the gains (G1, G2, G3) of the gain adjustment unit 31A, the gain adjustment unit 31B, and the gain adjustment unit 31C are set to (G1, G2, G3) = (7/6, −1/3, −1/3). With M1(t), M2(t), and M3(t) denoting the output signals of the microphone 101, the microphone 102, and the microphone 103, the combined output signal X1(t) is X1(t) = G1 × M1(t) + G2 × M2(t) + G3 × M3(t) = (7/6) × M1(t) − (1/3) × M2(t) − (1/3) × M3(t). This orients the maximum-sensitivity angle of the combined characteristic of the three microphones in the 0° direction; that is, a collected sound signal X1(t) with single directivity in the θ = 0° direction can be generated. In the present embodiment, the other signal combining units 122 to 128 perform the same processing, so collected sound signals X1(t) to X8(t) with single directivity in the eight azimuths ±0°, +45°, +90°, +135°, ±180°, −45°, −90°, and −135° can be generated.
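Expressed in Python, the 0° synthesis above is a one-liner. Only the stated gain set is shown; the gain triples for the remaining seven azimuths are not given in the text, so this is a sketch of the single documented case, not the device's full implementation.

```python
import numpy as np

def collected_signal_0deg(m1: np.ndarray, m2: np.ndarray, m3: np.ndarray) -> np.ndarray:
    # X1(t) = (7/6)*M1(t) - (1/3)*M2(t) - (1/3)*M3(t): the gain set from
    # paragraph [0039] that points the combined hypercardioid at 0 degrees.
    # The other signal combining units 122-128 use different gain triples
    # (not spelled out in the text) to cover the remaining seven azimuths.
    g1, g2, g3 = 7/6, -1/3, -1/3
    return g1 * m1 + g2 * m2 + g3 * m3
```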
[0040]
In addition, by providing one more microphone whose maximum-sensitivity direction is orthogonal to the plane in which the three microphones are arranged and combining the output sound signals of the four microphones, the directivity can also be controlled in three dimensions. Directivity can also be formed in an arbitrary direction by using delay-and-sum combining with an array microphone.
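The delay-and-sum combining mentioned here can be sketched as follows. This is a generic textbook beamformer with non-negative integer-sample delays; the delay values and array geometry are assumptions, not taken from the patent.

```python
import numpy as np

def delay_and_sum(mic_signals: np.ndarray, delays_samples: list) -> np.ndarray:
    # Delay each array-microphone signal by a non-negative number of samples,
    # then average. Delays matched to a wavefront's arrival times steer the
    # combined directivity toward that wavefront's direction.
    num_mics, n = mic_signals.shape
    out = np.zeros(n)
    for sig, d in zip(mic_signals, delays_samples):
        if d == 0:
            out += sig
        else:
            out[d:] += sig[:n - d]   # shift right by d samples (sketch only)
    return out / num_mics
```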
[0041]
Returning to FIG. 2, the channel-specific decomposition processing unit 16 generates collected sound signals with directivity in a plurality of directions, like the azimuth-specific decomposition processing unit 12, but the number and directions of the generated signals differ. For example, in the present embodiment, assuming 5.1 ch recording, six collected sound signals are generated in the directions corresponding to L, R, SL, SR, C, and LFE (where LFE is omnidirectional). The number and directions of the generated signals can be set in various ways according to the recording mode; of course, the same number of signals may be generated in the same directions as in the azimuth-specific decomposition processing unit 12.
[0042]
The collected sound signals generated by the channel-specific decomposition processing unit 16 are subjected to predetermined signal processing (noise reduction and the like) by the signal processing unit 17 and then output to the data creation unit 18. The data creation unit 18 outputs the input collected sound signals to the interface 15 as-is or after encoding. The recording data output from the data creation unit 18 is then sent to the recording and reproducing unit 2 and recorded in the storage 25.
[0043]
Meanwhile, the collected sound signals Xn(t) of each azimuth generated by the azimuth-specific decomposition processing unit 12 are input to the analysis unit 13. The analysis unit 13 corresponds to the temporary storage unit and the sound source direction determination unit of the present invention, and determines the direction of the sound source based on the collected sound signals Xn(t) input from the azimuth-specific decomposition processing unit 12. That is, the analysis unit 13 determines whether the input collected sound signal Xn(t) of each azimuth is voiced, based on the level of each signal.
[0044]
The analysis unit 13 stores each collected sound signal Xn(t) for a predetermined time (hereinafter, the analysis time) and detects the peak level within this analysis time. It determines whether the peak level is equal to or higher than a predetermined threshold; if so, it determines that sound is present and that a sound source exists in that azimuth. In some cases, sound sources may be determined to exist in multiple azimuths simultaneously. Alternatively, the average level over the analysis time may be calculated and judged against the threshold, or the threshold comparison may be made at every sampling instant (in which case no buffer holding the collected sound signal for the analysis time is needed).
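A sketch of this per-analysis-time decision, with the peak/average choice exposed as a flag. The threshold value and buffer shapes are assumptions; the text gives neither.

```python
import numpy as np

def is_voiced(xn: np.ndarray, threshold: float, use_average: bool = False) -> bool:
    # Level over one analysis-time buffer: peak by default, average if chosen.
    level = np.mean(np.abs(xn)) if use_average else np.max(np.abs(xn))
    return level >= threshold

def detect_sources(buffers: np.ndarray, threshold: float) -> list:
    # buffers: shape (num_azimuths, samples_per_analysis_time).
    # Returns every azimuth index judged to contain a sound source; several
    # may be returned at once, as the text notes.
    return [i for i, xn in enumerate(buffers) if is_voiced(xn, threshold)]
```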
[0045]
The sound source direction determined by the analysis unit 13 is a relative azimuth, with the front direction of the recording and reproducing unit 2 taken as 0°. However, when the recording and reproducing unit 2 can acquire an absolute azimuth (for example, when an azimuth sensor is built into the sensor unit 28), it is also possible to receive azimuth information from the recording/reproducing unit 2 (information indicating which absolute azimuth the 0° direction corresponds to) and determine the absolute azimuth of the sound source.
[0046]
The analysis unit 13 outputs to the display processing unit 14 the collected sound signal Xn(t) of each azimuth determined to be voiced. The analysis unit 13 also outputs information indicating the peak level and the average level to the display processing unit 14 as information on the analysis result. The display processing unit 14 temporarily stores the input collected sound signal Xn(t) for a predetermined time (longer than the analysis time), in association with the elapsed time since the start of the input.
[0047]
The analysis unit 13 may also detect the frequency characteristics of the collected sound signal Xn(t) of each azimuth determined to be voiced. The frequency characteristics can be obtained by applying a Fourier transform (FFT) to convert the time-axis signal Xn(t) into a frequency-axis signal Xn(f). The analysis unit 13 may output the frequency-axis signal Xn(f) to the display processing unit 14 as-is, or it may divide Xn(f) into a plurality of bands (for example, a low band and a high band), calculate the peak level and average power in each band, and include the results in the information on the analysis result output to the display processing unit 14.
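A small sketch of this band analysis using NumPy's FFT. The single 1 kHz split point is an assumption for illustration; the text says only "a plurality of bands".

```python
import numpy as np

def band_powers(xn: np.ndarray, fs: float, split_hz: float = 1000.0) -> dict:
    # FFT one analysis-time buffer, then report peak level and average power
    # in a low band and a high band around the assumed split frequency.
    spectrum = np.abs(np.fft.rfft(xn))
    freqs = np.fft.rfftfreq(len(xn), d=1.0 / fs)
    results = {}
    for name, mask in (("low", freqs < split_hz), ("high", freqs >= split_hz)):
        band = spectrum[mask]
        results[name] = {"peak": float(band.max()),
                         "avg_power": float(np.mean(band ** 2))}
    return results

# Example: a 440 Hz tone lands almost entirely in the low band.
tone = np.sin(2 * np.pi * 440 * np.arange(4800) / 48000)
print(band_powers(tone, fs=48000))
```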
[0048]
The display processing unit 14 generates data for displaying an image on the display unit 21
based on the stored sound collection signal Xn (t) and the information on the analysis result, and
outputs the data to the recording and reproduction unit 2.
[0049]
An example of an image displayed on the display unit 21 will be described with reference to
FIGS. 4 and 5.
As shown in FIG. 4A, the display unit 21 displays a REC image 251, which is a circular image concentric with the screen center position, together with a sound collection timing image 252 and a recording timing image 253. A plurality of small round images (ball images) are also displayed on the screen; for example, in the lower left of the screen, four ball images 201 to 204 are displayed in order from the outermost ball image 201. Based on the stored collected sound signals Xn(t), the display processing unit 14 controls each ball image as an image indicating the timing at which the sound was collected by the microphones and the arrival direction of the sound.
[0050]
When the user operates the touch panel (operation unit 24) and presses the REC image 251, the display processing unit 14 displays the recording screen shown in FIG. 4B. In the example of FIG. 4B, the REC image 251 is enlarged into a circular image the same size as the recording timing image 253, the label RECORDING is shown, and the elapsed time since the start of recording is displayed. While the recording screen is displayed, the recording data (the audio signals of the respective microphones, the collected sound signals, and so on) is output from the data creation unit 18 and recorded in the storage 25 of the recording and reproducing unit 2. When the user operates the touch panel again and presses the REC image 251, the display processing unit 14 displays the screen shown in FIG. 4C, in which a circular file icon 255 is displayed inside the circle of the recording timing image 253. The file icon 255 shows the recording date and time and the like.
[0051]
The display control of the ball images will be described with reference to FIG. 5. The right-hand columns of FIGS. 5A to 5C show the time-axis waveforms (level changes) of the collected sound signals of each azimuth. Collected sound signals determined to be silent are also shown in the figure for explanatory purposes, but what is actually input to the display processing unit 14 are the collected sound signals of the analysis-time segments determined to be voiced.
[0052]
First, as shown in FIG. 5A, when a collected sound signal Xn(t) determined to be voiced is input from the analysis unit 13, the display processing unit 14 displays a ball image on the circumference of the sound collection timing image 252 at the azimuth for which the directivity of that collected sound signal was formed (for example, at the 0° azimuth when the collected sound signal X1(t) is input). In the example of FIG. 5A, a collected sound signal corresponding to the −140° azimuth has just been input, so the ball image 201 is displayed on the circumference of the sound collection timing image 252 at the −140° azimuth.
[0053]
Then, as time passes and the input collected sound signals change, display control is performed to move each ball image toward the screen center position, as shown in FIG. 5B. In the example of FIG. 5B, the collected sound signal corresponding to the ball image 203 has reached the recording timing.
[0054]
Here, the display control by the display processing unit 14 and the output of the recording data by the data creation unit 18 are synchronized by the control unit 19. When the user has instructed recording (the state shown in FIG. 4B), the recording data (for example, the output sound signals of the microphones) is output from the data creation unit 18 at the timing when the display processing unit 14 moves a ball image onto the circumference of the recording timing image 253 (during recording, the REC image 251). That is, the data creation unit 18 temporarily stores the input sound signals of the microphones and the azimuth-specific collected sound signals for the same predetermined time as the display processing unit 14, and outputs the temporarily stored sound signals and collected sound signals when notified by the control unit 19 that the display processing unit 14 has moved a ball image onto the circumference of the REC image 251. The collected sound signals temporarily stored by the display processing unit 14 may also be output to the storage 25 of the recording and reproducing unit 2 and recorded as recording data.
[0055]
For example, when the user performs a recording operation at the timing of FIG. 5B, the recording data from that recording timing onward (the output sound signals of the microphones or the azimuth-specific collected sound signals) is output, so the sound of the sound sources in the azimuths corresponding to the ball images 201, 202, and 203 is subsequently recorded. Since the collected sound signal corresponding to the ball image 204 has already passed the recording timing, it is not included in the recording data, and the sound of the sound source in that azimuth is not recorded.
[0056]
Further, since the collected sound signal corresponding to the ball image 204 has passed the recording timing, it is erased from the display processing unit 14. However, as shown in FIG. 5B, it may instead be retained for a certain time after the recording timing (for example, for the analysis time described above) and kept on screen until it reaches the screen center position.
[0057]
As time passes further, display control moves the ball images toward the center of the screen, as shown in FIG. 5C. In the state of FIG. 5C, the collected sound signal corresponding to the ball image 202 is at the recording timing, so it can be grasped visually that the recording data includes the sound of the sound source in this azimuth.
[0058]
Further, the display processing unit 14 changes the size and color of each ball image according to the analysis result of the corresponding collected sound signal. That is, the display processing unit 14 changes the size of the ball image based on the peak level and the average level in the analysis result: a large ball image is displayed for a high-level collected sound signal (see the ball image 201), and a small ball image is displayed for a low-level one (see the ball images 202 to 204). The display processing unit 14 also changes the color of the ball image based on the analysis result of the frequency characteristics: when the level of the high band is high, a bright-colored ball image is displayed (see the ball images 201 and 202), and when the level of the low band is high, a dark-colored ball image is displayed (see the ball images 203 and 204). When the frequency-axis collected sound signal Xn(f) is input, a color may be displayed that treats the frequency characteristics as wavelengths of visible light: for example, when the level of the low band is high, a long-wavelength red ball image is displayed, and when the level of the high band is high, a short-wavelength blue or violet ball image is displayed.
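One way to express these two mappings in code. The radius range and RGB endpoints are illustrative assumptions; the text fixes only the qualitative rules (louder means larger, low-heavy means red/dark, high-heavy means blue/violet).

```python
def ball_style(peak_level: float, low_power: float, high_power: float) -> tuple:
    # Size from level: clamp the peak to [0, 1] and map it to a 4..24 px radius.
    size = 4 + min(peak_level, 1.0) * 20
    # Color from band balance: 0 = all low band, 1 = all high band.
    total = low_power + high_power or 1.0
    ratio = high_power / total
    # Long-wavelength red for low-heavy sound, short-wavelength blue for
    # high-heavy sound, per the visible-light analogy in paragraph [0058].
    color = (int(255 * (1 - ratio)), 0, int(255 * ratio))
    return size, color

print(ball_style(0.8, low_power=0.2, high_power=0.7))  # fairly large, bluish
```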
[0059]
The moving speed of each ball image is arbitrary, but it is set according to the length of time for which the display processing unit 14 temporarily stores the collected sound signals. For example, when the display processing unit 14 stores the collected sound signals of each azimuth for 5 seconds, a ball image takes 5 seconds to move from the circumference of the sound collection timing image 252 to the circumference of the recording timing image 253.
[0060]
With the configuration described above, the recording and reproducing apparatus according to the present embodiment allows the arrival direction of sound and the passage of time to be grasped intuitively. The user can listen to the sound while referring to the display unit 21 and intuitively understand the arrival direction of the sound to be recorded and the recording start timing, realizing an operation mode that could not be achieved conventionally.
[0061]
In the above example, a heading-up display is used in which the arrival azimuth of the sound is shown as an azimuth relative to the recording and reproducing device. When an absolute azimuth can be acquired, however, it is also possible to perform a north-up display in which true north is fixed and the absolute azimuth of the sound source is displayed.
[0062]
Moreover, when the directivity is controlled three-dimensionally using the four microphones mentioned above, a side-view display as shown in FIG. 6 is also possible. When the sensor unit 28 of the recording and reproducing unit 2 includes a posture sensor (acceleration sensor), control may be performed so that the side-view display of FIG. 6 is shown when the front direction (Y direction) of the recording and reproducing unit 2 is pointed vertically upward, and the plan-view display of FIGS. 4 and 5 is shown when the top surface direction (Z direction) is pointed vertically upward.
[0063]
Next, signal processing control by user operation in the recording and reproducing apparatus of the present embodiment will be described with reference to FIG. 7. When the user performs a predetermined operation on the touch panel, the recording and reproducing apparatus applies various signal processing, such as gain control and control of frequency characteristics, to the collected sound signal of each azimuth according to the operation.
[0064]
For example, as shown in the upper right of the screen in FIG. 7, when the user traces a finger along the circumferential direction over a predetermined azimuth range so as to block the movement of the ball images, the display processing unit 14 stops the movement of the ball images corresponding to that azimuth range. The signal processing unit 17 then sets the level of the collected sound signals corresponding to that azimuth range to zero so that they are not included in the recording data.
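A sketch of the muting step, keyed by the eight formed azimuths. The gain-table representation is an assumption (the real device acts inside the signal processing unit 17), and wrap-around at ±180° is ignored for brevity.

```python
def mute_azimuth_range(gains: dict, start_deg: float, end_deg: float) -> dict:
    # Zero the collected-signal gain for every formed azimuth inside the
    # range the user traced on the touch panel, so those directions are
    # neither shown as moving balls nor written into the recording data.
    lo, hi = sorted((start_deg, end_deg))
    return {az: (0.0 if lo <= az <= hi else g) for az, g in gains.items()}

# The eight formed azimuths, all initially passed through at unit gain:
gains = {az: 1.0 for az in (-135, -90, -45, 0, 45, 90, 135, 180)}
print(mute_azimuth_range(gains, -160, -120))  # the -135 deg azimuth is muted
```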
[0065]
In addition, as shown in the lower right of the screen in FIG. 7, when the user traces a meandering path along the circumferential direction over a predetermined azimuth range, the display processing unit 14 changes the color of the ball images corresponding to that azimuth range, and the signal processing unit 17 changes the frequency characteristics of the corresponding collected sound signals. For example, low-pass filtering that attenuates the high band is applied, and the ball images moving inward from the position traced by the finger are changed to a dark color.
[0066]
As described above, the recording and reproducing apparatus according to the present embodiment allows the mode of signal processing to be instructed intuitively through various operations on the screen.
[0067]
Next, the case where the recording and reproducing apparatus reproduces the recorded data in the storage 25 will be described with reference to FIGS. 8 and 9. As shown in FIG. 4C, when the user operates the touch panel and presses the REC image 251, the circular file icon 255 is displayed inside the circle of the recording timing image 253. When a button such as a "file selection button" is operated in the state of FIG. 4C, the display transitions to the file selection screen shown in FIG. 8. The display control of the file selection screen is performed by the control unit 23 of the recording and reproducing unit 2: the control unit 23 reads out information such as the recording date and time of each item of recording data in the storage 25 and displays each item as a file icon, as shown in FIG. 8A.
[0068]
As shown in FIG. 8A, the file selection screen displays a plurality of items of past recording data side by side as file icons. In plan view, the file icons are lined up along an arc drawn clockwise from the outside of the screen toward its center (counterclockwise is of course also possible); the closer a file icon is to the center, the older the recording data it represents and the smaller it is displayed. In side view, the file icons are arranged to run from the top of the screen downward; the closer a file icon is to the bottom of the screen, the older its recording data and the smaller it is displayed. That is, the file icons are displayed as if arranged in a spiral.
[0069]
Since the sensor unit 28 of the recording and reproducing unit 2 incorporates an acceleration sensor, when the acceleration sensor detects that the user has moved the recording and reproducing unit 2 vertically downward, as shown in FIG. 8B, the control unit 23 of the recording and reproducing unit 2 performs display control so that the arc rotates counterclockwise and the file icons corresponding to older recording data become larger. Conversely, when the acceleration sensor detects an operation moving the recording/reproducing unit 2 vertically upward, the control unit 23 performs display control so that the arc rotates clockwise and the file icons corresponding to recording data with newer recording dates become smaller.
[0070]
Then, when the user performs an operation to select a file icon, the reproduction screen shown in FIG. 9 is displayed. As shown in FIG. 9A, the reproduction screen shows the same display as the recording screen of FIG. 4A, except that the innermost circular image is the reproduction image 261, labeled "PLAY". A sound generation timing image 263 representing the sound emission timing is displayed outside the reproduction image 261, and sound is emitted at the timing when a ball image reaches the circumference of the sound generation timing image 263. The outermost circular image corresponds to the sound collection timing image used during recording; since no sound is collected during reproduction, the display of the sound collection timing image 262 may be omitted. Also, although sounds earlier than the collection timing (sounds not yet input to the microphones) could not be visualized during recording, during reproduction the recorded data can be read ahead, so ball images for sounds to be emitted in the future may also be displayed outside the sound collection timing image 262.
[0071]
The synchronization between the display control and the sound reproduction control during reproduction will now be described, for the case where the recording data consists of the recorded output sound signals of the microphones (the example of FIG. 9). When the user selects a file icon and the display shifts to the reproduction screen of FIG. 9, the control unit 23 reads the corresponding recording data from the storage 25, an audio signal is output to the reproduction processing unit 26 in synchronization with the display control, and the sound is emitted. Synchronization with the display control is performed, for example, as follows.
[0072]
First, the control unit 23 outputs the read recording data (the output sound signals of the microphones) to the control unit 19 via the interface 22 and the interface 15. The control unit 19 outputs these sound signals to the azimuth-specific decomposition processing unit 12 and the channel-specific decomposition processing unit 16 via the selector 11. The two decomposition processing units each generate collected sound signals of each azimuth and output them to the analysis unit 13 and the signal processing unit 17, as during recording. As during recording, the analysis unit 13 determines whether the collected sound signal of each azimuth is voiced and outputs the voiced signals to the display processing unit 14. The signal processing unit 17 applies various signal processing to the input collected sound signals and then outputs them to the data creation unit 18. The data creation unit 18 outputs the input collected sound signals to the control unit 23 as-is or after combining them.
[0073]
The display control by the display processing unit 14 and the output of the collected sound signals by the data creation unit 18 are synchronized by the control unit 19, as during recording. However, whereas during recording the recording data is output when a ball image reaches the circumference of the recording timing image 253, during reproduction the collected sound signal is output when a ball image reaches the circumference of the sound generation timing image 263, and it is input to the reproduction processing unit 26 so that the sound is emitted.
[0074]
Meanwhile, when the user performs an operation such as rubbing the screen with a finger on the reproduction screen, as shown in FIG. 9B, the recording/reproducing apparatus according to the present embodiment can fast-forward or rewind the reproduction. When the user drags a finger from the outside of the screen toward the center so as to accelerate the movement of the ball images, the various components increase their processing speed and perform fast-forward reproduction. Conversely, when the user traces a finger from the center of the screen outward so as to push the ball images back from the center, the various components perform reverse reproduction. Of course, reproduction can also be paused when the user stops the tracing operation.
[0075]
In addition, when the operation of blocking the movement of a ball image is performed outside the sound generation timing image 263 while the recorded data is being reproduced, the movement of the ball image is stopped and the reproduction level of the recorded data in that azimuth can be set to zero so that no sound is emitted; the frequency characteristics can also be controlled, for example with a low-pass filter.
[0076]
In the above example, the audio signals of the microphones included in the recording data are converted into collected sound signals of each azimuth by the channel-specific decomposition processing unit 16 before being input to the reproduction processing unit 26 and emitted. However, the audio signals of the microphones included in the recorded data may also be input to the reproduction processing unit 26 as-is and emitted. In this case, inputting the audio signals to the channel-specific decomposition processing unit 16 is not essential, and the control unit 23 may synchronize the display control of the display processing unit 14 with the output of the audio signals to perform sound emission.
[0077]
Also, if the recorded data consists of collected sound signals of each azimuth, the azimuth-specific decomposition processing unit 12 and the channel-specific decomposition processing unit 16 are unnecessary during reproduction, and the control unit 19 inputs the collected sound signals of the recorded data directly to the analysis unit 13 and the signal processing unit 17.
[0078]
Further, if the display data output from the display processing unit 14 is recorded in the storage 25, the sound source direction need not be determined again at reproduction time. In this case, all control can be performed by the control unit 23 of the recording and reproducing unit 2, so both the ball image control and the signal processing control can be performed by the recording and reproducing unit 2 alone.
[0079]
Although the present embodiment shows an example with three microphones, sound source separation is possible with at least two microphones. For example, two directional microphones whose sound collecting directions are inclined at a predetermined angle (90°) may be placed close to each other, and the gains of the sound collected by each directional microphone may be varied to generate a plurality of combined signals. Of course, more microphones may also be arranged in the same plane.
[0080]
DESCRIPTION OF SYMBOLS 1 ... Microphone unit, 2 ... Recording and reproducing unit, 10 ... Multi microphone, 11 ... Selector, 12 ... Azimuth-specific decomposition processing unit, 13 ... Analysis unit, 14 ... Display processing unit, 15 ... Interface, 16 ... Channel-specific decomposition processing unit, 17 ... Signal processing unit, 18 ... Data creation unit, 19 ... Control unit, 21 ... Display unit, 22 ... Interface, 23 ... Control unit, 24 ... Operation unit, 25 ... Storage, 26 ... Reproduction processing unit, 27 ... Communication unit, 28 ... Sensor unit