DESCRIPTION JP2010213091

Patent Translate
Powered by EPO and Google
Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
The present invention provides a sound source position estimation device capable of accurately
estimating the arrival direction of sound even when there is an obstacle around a microphone. A
sound source position estimation apparatus (100) includes: a microphone array (110) composed of
a plurality of nondirectional microphones; a storage unit (130) that stores a correction table (TB)
based on information on an obstacle located in the vicinity of the microphone array (110); and an
arrival direction estimation processing unit (140) that calculates the arrival direction of the
sound collected by the microphone array (110) based on the arrival time difference or phase
difference of the sound and the correction table (TB) stored in the storage unit (130).
[Selected figure] Figure 1
Sound source position estimation device
[0001]
The present invention relates to a sound source position estimation apparatus, and more
particularly to a sound source position estimation apparatus capable of accurately estimating the
arrival direction of sound even when there is an obstacle around a microphone of the sound
source position estimation apparatus.
[0002]
Conventionally, there has been proposed a monitoring camera (see, for example, Patent
Document 1) which detects the generation position of an external sound and captures an image
around the generation position of the abnormal sound only when the abnormal sound is
generated.
10-04-2019
In Patent Document 1, the direction of arrival of sound is calculated from the difference in level
of sound, and a sound of a predetermined level or more is determined as an abnormal sound.
However, Patent Document 1 has a problem that the resolution of the calculated direction of
arrival is low.
[0003]
Therefore, the present applicant has already applied for a sound monitoring device that includes
a microphone array and calculates the direction of arrival of sound using the difference in time of
arrival of sound and the phase difference of sound (see Patent Document 2).
[0004]
Patent Document 1: Japanese Patent Application Publication No. 2006-94251
Patent Document 2: Japanese Patent Application No. 2007-290826
[0005]
When the arrival direction of sound is estimated using the arrival time and phase difference of
the sound arriving at each of the microphones constituting the microphone array, as in the
technique of Patent Document 2, the following points must be considered in order to perform more
accurate estimation.
First, depending on the place where the sound monitoring apparatus described in Patent
Document 2 is installed, for example, an obstacle may be present in the vicinity of the
microphone.
In this case, the sound wave is diffracted by the obstacle, and the propagation path from the
sound source to the microphone is changed as compared with the case where there is no
obstacle. Therefore, it is not possible to simply estimate the sound generation position (sound
source position) from the difference in arrival time of sound to each microphone. Further, the
sound reaching the microphone array includes not only the direct sound from the sound source,
but also the reflected sound from a wall or an obstacle, reverberation and the like, and these
effects also need to be considered.
[0006]
In the case where estimation of the sound source position and photographing of the
surroundings by the monitoring camera are performed as in Patent Document 1 and Patent
Document 2, it is conceivable to use, as the monitoring camera, a dome-shaped camera
(hereinafter also referred to as a dome camera) provided with a hemispherical transparent
camera dome covering the camera. At this time, the sound monitoring apparatus is preferably
manufactured as a product in which the microphone and the dome camera are integrated. At this
time, from the appearance of the product, it is desirable to attach a microphone without
significantly changing the appearance of the dome camera. However, depending on the mounting
position of the microphone, the camera dome itself may become an obstacle, which may affect
the estimation accuracy of the sound source position.
[0007]
Therefore, an object of the present invention is to provide a sound source position estimation
apparatus for use in, for example, a surveillance camera or a sound monitoring apparatus, which
can accurately estimate the arrival direction of sound even when there is an obstacle around the
microphone of the sound source position estimation apparatus.
[0008]
In order to solve the various problems described above, a sound source position estimation
apparatus according to the present invention includes: a microphone array consisting of a
plurality of nondirectional microphones; a storage unit that stores a correction table
(correction information) based on information on an obstacle located near the microphone array
(such as path difference, phase difference, or arrival time difference of sound); and an arrival
direction estimation processing unit that calculates the arrival direction of the sound collected
by the microphone array based on the arrival time difference or phase difference of the sound and
the correction table (correction information) stored in the storage unit.
[0009]
In the sound source position estimation device according to one embodiment of the present
invention, at least one of the obstacles is a casing (camera dome) constituting the sound source
position estimation device, and the plurality of nondirectional microphones of the microphone
array are arranged so as to be substantially in contact with the outer wall of the housing.
[0010]
Furthermore, the sound source position estimation device according to another embodiment of
the present invention further comprises a selection unit that selects, based on the level of
sound collected by at least one of the plurality of microphones or by a microphone provided
separately from the nondirectional microphone array, a sound to be used for calculation of the
direction of arrival of the sound by the arrival direction estimation processing unit.
[0011]
Furthermore, in the sound source position estimation apparatus according to another
embodiment of the present invention, the selection unit selects, from among the sounds whose
levels, as collected by at least one of the plurality of microphones or by a microphone provided
separately from the nondirectional microphone array, exceed a predetermined threshold, those
sounds that were not preceded within a predetermined time by a sound whose level exceeded the
predetermined threshold.
[0012]
Furthermore, in the sound source position estimation apparatus according to another
embodiment of the present invention, the selection unit changes the predetermined threshold
according to a change in the level of the sound collected by at least one of the plurality of
microphones or by a microphone provided separately from the nondirectional microphone array.
[0013]
According to the present invention, it is possible to provide a sound source position estimation
device, for use in, for example, surveillance cameras and sound monitoring devices, which can
accurately estimate the arrival direction of sound even if there is an obstacle around the
microphone of the sound source position estimation device.
[0014]
FIG. 1 is a schematic block diagram of a sound source position estimating device according to an
embodiment of the present invention.
FIG. 2 is a schematic block diagram of a sound monitoring device 200 provided with a sound
source position estimation device 100.
FIG. 3 is an external view of a sound monitoring device 200 provided with a sound source
position estimation device 100.
FIG. 4 is a schematic cross-sectional view of the camera dome DOM including two microphones
(MIC1, MIC3).
FIG. 5 is a graph in which the relationship between the propagation path difference and the
arrival direction θ of sound is plotted and approximated by a curve.
FIG. 6 is a time chart explaining the method of reducing the influence of reflected sound.
FIG. 7 is a view of the sound monitoring apparatus 200 installed on a ceiling or the like
indoors.
[0015]
Hereinafter, embodiments of the present invention will be described in detail with reference to
the drawings.
FIG. 1 is a schematic block diagram of a sound source position estimation apparatus according to
an embodiment of the present invention. As shown in the figure, the sound source position
estimation apparatus 100 includes a microphone array 110 including a plurality of
nondirectional microphones, a level correction unit 120, a storage unit 130, and a sound arrival
direction estimation processing unit 140. The sound source position estimation device 100
further includes a sound selection unit 150. The microphone array 110 collects the sound
generated at the installation place of the sound source position estimation apparatus 100. The
microphone array 110 is composed of at least three microphones spaced apart from each other when
the position of the sound source is detected two-dimensionally, and of at least two when it is
detected one-dimensionally. The level correction unit 120 corrects the level of the sound signal
received by the microphone array 110 using, for example, automatic gain control (AGC). That is,
when the signal level of the sound arriving at the microphone array 110 is small, the signal
level is increased before being output to each subsequent processing unit. The storage unit 130
stores a correction table TB based on information on an obstacle located in the vicinity of the
microphone array 110.
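The level correction step can be illustrated with a short sketch. It is only one possible reading of "automatic gain control": the function name `apply_gain_correction`, the target RMS level, and the gain cap are assumptions introduced here and do not come from the description.

```python
import numpy as np


def apply_gain_correction(frame, target_rms=0.1, max_gain=20.0):
    """Boost a quiet frame toward target_rms (hypothetical parameters).

    Only amplification is modelled, because the description only states
    that small signal levels are increased.
    """
    frame = np.asarray(frame, dtype=float)
    rms = float(np.sqrt(np.mean(frame ** 2)))
    if rms == 0.0:
        return frame.copy()  # silence: nothing to scale
    # Gain is limited to [1, max_gain] so that noise is not boosted without bound.
    gain = min(max(target_rms / rms, 1.0), max_gain)
    return frame * gain
```

A single gain applied to a whole frame does not move the signal in time, so under this sketch the arrival time differences used later would be unaffected.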
[0016]
The sound arrival direction estimation processing unit 140 calculates the arrival direction of
the sound collected by the microphone array 110 based on the arrival time difference (reception
time difference) or phase difference of the sound at each microphone included in the microphone
array 110 and on the distance between the microphones. At this time, based on the correction
table TB stored in the storage unit 130, the sound arrival direction estimation processing unit
140 appropriately corrects the calculated arrival direction and outputs the corrected value as
the sound arrival direction. Alternatively, the sound arrival direction estimation processing
unit 140 calculates the arrival direction directly from the correction table TB stored in the
storage unit 130 (details will be described later). The sound selection unit 150 selects the
sound used to estimate the arrival direction from the sounds collected by the microphone array
110.
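As a rough illustration of estimating a direction from an arrival time difference and the microphone spacing, the sketch below handles one microphone pair with no obstacle. The use of cross-correlation to measure the delay, the function names, the far-field plane-wave model, and the speed of sound of 343 m/s are all assumptions made for this example; the description does not specify how the time difference is measured.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def estimate_delay(sig_a, sig_b, fs):
    """Delay of sig_b relative to sig_a in seconds (positive if sig_b lags)."""
    sig_a = np.asarray(sig_a, dtype=float)
    sig_b = np.asarray(sig_b, dtype=float)
    corr = np.correlate(sig_b, sig_a, mode="full")
    # Index len(sig_a) - 1 of the full correlation corresponds to zero lag.
    lag = int(np.argmax(corr)) - (len(sig_a) - 1)
    return lag / fs


def direction_without_obstacle(delay_s, spacing_m, c=SPEED_OF_SOUND):
    """Arrival angle in degrees from broadside, assuming a plane wave."""
    ratio = np.clip(c * delay_s / spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))
```

The clipping guards against measured delays slightly larger than the spacing allows, which can happen with noisy signals or, as discussed later in the description, when an obstacle lengthens the propagation path.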
[0017]
Next, the case where the sound source position estimation apparatus 100 is used for a sound
monitoring apparatus as described in Patent Document 2 will be described as an example. FIG. 2
is a schematic block diagram of a sound monitoring device 200 provided with the sound source
position estimation device 100. It should be noted that the present invention does not
necessarily have to be implemented as a sound monitoring device.
[0018]
First, the sound monitoring apparatus 200 will be briefly described. The sound monitoring device
200 is a device that combines an image captured by the camera CA with the arrival direction of
sound estimated by the sound source position estimation device 100 and with information about
sound abnormality, and displays the combined image on, for example, an external monitor DIS. The
abnormal sound determination unit 240 determines whether the sound collected by the microphone
array 110 of the sound source position estimation device 100 is an abnormal sound different from
the environmental sound. The sound information processing unit 210 combines information on the
arrival direction of the sound estimated by the sound source position estimation device 100 and
information on the abnormal sound determined by the abnormal sound determination unit 240 with
the image captured by the camera CA, and outputs the result to the external monitor DIS or the
network processing unit 220. The alarm processing unit 230 outputs an alarm, or outputs alarm
information to the network processing unit 220, when the abnormal sound determination unit 240
determines that the sound is an abnormal sound. The network processing unit 220 outputs the
image or information output from the sound information processing unit 210 to, for example, a
mobile phone or the like via the network NET.
[0019]
First, a method for reducing the influence of an obstacle located around the microphone array
110 on the estimation of the direction of arrival of sound will be described. FIG. 3 shows an
external view of a sound monitoring device 200 provided with the sound source position
estimation device 100. In the example of FIG. 3, the camera CA of the sound monitoring apparatus
200 is realized by a dome camera, and the camera dome DOM covers, as a housing, each component of
the sound monitoring apparatus 200 including the sound source position estimation apparatus 100.
The sound monitoring device 200 is installed on a ceiling or the like indoors, for example, as
shown in FIG. 7. FIGS. 3 (a) and 3 (b) are an external view and a partially enlarged
cross-sectional view, respectively, of the sound monitoring device 200, and FIG. 3 (b) shows an
example in which the sound monitoring device 200 is attached to a ceiling.
[0020]
As described above, in the sound monitoring device in which the microphone array 110 of the
sound source position estimation device 100 and the dome camera are integrated, the camera
dome itself may become an obstacle depending on the mounting position of the microphone and
may affect the estimation accuracy of the sound source position. Therefore, in the present
embodiment, as shown in FIG. 3, for example, three microphones MIC1, MIC2, and MIC3 forming the
microphone array 110 are disposed so as to be substantially in contact with the outer wall of the
camera dome DOM. This will be described with reference to the enlarged cross-sectional view of
the portion of the microphone MIC3 in FIG. 3 (b). In the example of FIG. 3 (b), the microphone
MIC3 is approximately in contact with the outer wall of the camera dome DOM at the edge of the
camera dome DOM; that is, it is arranged so that the distance m between the microphone MIC3 and
the outer wall of the camera dome DOM is as small as possible. Attaching the microphone in this
way does not affect the appearance of the camera dome DOM. For example, in the example of
FIG. 3 (b), since the microphone is attached to the edge of the camera dome DOM and embedded in
the ceiling, it is difficult to recognize the presence of the microphone. In addition, it is
difficult for the reflected sound from the camera dome DOM to reach the microphone, and the
influence of the reflected sound can be reduced. Furthermore, as described later, the arrival
direction of the sound can be estimated accurately by correcting the influence that the camera
dome itself, as an obstacle, has on the estimation of the arrival direction by the sound source
position estimation apparatus 100.
[0021]
Next, the correction table TB used for the sound arrival direction estimation processing unit 140
to estimate the sound arrival direction will be described. Here, the sound monitoring apparatus
200 shown in FIG. 3 will be described as an example. FIG. 4 is a schematic cross-sectional view
including the two microphones (MIC1 and MIC3) of the camera dome DOM when the sound
monitoring apparatus 200 shown in FIG. 3 is installed on a ceiling. In the figure, A and B are
sound receiving points, that is, the positions of the microphones, and receive sound waves
generated from the sound source SS. In addition, the camera dome DOM has a hemispherical
shape with a radius r. Assuming that the sound source SS is sufficiently far from the microphone
and the sound wave from the sound source SS propagates as a plane wave, the propagation path
difference d of the sound reaching the sound receiving points A and B when the camera dome
DOM does not exist is shown by the broken line in FIG. 4. The relationship between the
propagation path difference d, the arrival direction θ of the sound wave, the sound velocity c,
the propagation time difference t, and the radius r of the camera dome DOM can be expressed by
Equation (1) [image not reproduced in this text; EMIRef 205986249-000003]. The arrival
direction θ therefore follows from Equation (1) as Equation (2) [image not reproduced in this
text; EMIRef 205986249-000004].
[0022]
Also, the propagation path difference d ′ of the sound reaching the sound receiving points A
and B when the camera dome DOM is present is shown by the thick line in FIG. 4. The relationship
between the propagation path difference d ′, the arrival direction θ of the sound wave, the
sound velocity c, the propagation time difference t ′, and the radius r of the camera dome DOM
can be expressed by Equation (3) [image not reproduced in this text; EMIRef 205986249-000005].
Transforming Equation (3) gives Equation (4) [image not reproduced in this text; EMIRef
205986249-000006].
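Equations (1) to (4) are embedded as images in the source and cannot be read from this text. The following is only a plausible reconstruction, not the patent's own wording. It assumes that the sound receiving points A and B lie at opposite ends of a diameter of the dome edge (separation 2r), that θ is measured in radians from the direction perpendicular to the line AB, and that with the dome present the sound reaches the far microphone along the shortest path over the dome surface.

```latex
\begin{align}
d &= c\,t = 2r\sin\theta \tag{1}\\
\theta &= \sin^{-1}\!\left(\frac{c\,t}{2r}\right) \tag{2}\\
d' &= c\,t' = r\,(\theta + \sin\theta) \tag{3}\\
\frac{d'}{2r} &= \frac{\theta + \sin\theta}{2} \tag{4}
\end{align}
```

Under these assumptions d′ is never shorter than d for θ between 0° and 90°, because θ ≥ sin θ, and Equation (3) has no closed-form solution for θ. Both properties agree with what paragraphs [0023] and [0024] state about the equations.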
[0023]
From Equations (2) and (4), the propagation path differences d and d ′, scaled by 2r, can be
plotted against the sound arrival direction θ. FIG. 5 plots the relationship between the
propagation path difference and the arrival direction θ of sound, approximated by a curve; the
solid line shows the case with a camera dome and the broken line the case without a camera dome.
The graph shows that, when the arrival directions are equal, the propagation path difference
d ′ with a camera dome is longer than the propagation path difference d without a camera dome.
This is consistent with the difference between the propagation paths d and d ′ shown in FIG. 4.
Also, when the propagation path differences are equal (that is, when the arrival time
differences of sound at the sound receiving points A and B are equal), the arrival direction θ
with a camera dome is smaller than the arrival direction θ without a camera dome.
[0024]
From equation (3), it can be understood that the arrival direction θ of the sound can be obtained
if the arrival time difference t ′ of the sound is known. However, equation (3) cannot be solved
directly for θ. Instead, if the relationship between discrete values of θ and the propagation
distance difference, that is, the arrival time difference, is held in advance as the correction
table TB as shown in FIG. 5, the arrival direction θ of the sound can be estimated from the
measured arrival time difference t ′. Therefore, the sound source position estimation
apparatus 100 according to one embodiment of the present invention stores the relationship
between the propagation path difference or arrival time difference due to the camera dome DOM
and the arrival direction as the correction table TB in the storage unit 130, and the sound
arrival direction estimation processing unit 140 estimates the arrival direction of sound based
on the arrival time difference measured by each microphone and the correction table TB.
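The table lookup described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: Equation (3) is not legible in this text, so the sketch assumes the model t′ = r(θ + sin θ)/c for the shortest path around a hemisphere of radius r. The function names, the 1° grid, linear interpolation between entries, and the speed of sound of 343 m/s are likewise assumptions.

```python
import numpy as np


def build_correction_table(radius_m, c=343.0, n_points=91):
    """Tabulate the arrival time difference t' for theta = 0..90 degrees."""
    theta_deg = np.linspace(0.0, 90.0, n_points)
    theta = np.radians(theta_deg)
    # ASSUMPTION: diffracted path difference d' = r * (theta + sin(theta)).
    t_prime = radius_m * (theta + np.sin(theta)) / c
    return theta_deg, t_prime


def lookup_direction(t_measured, table):
    """Estimate theta in degrees from a measured arrival time difference."""
    theta_deg, t_prime = table
    # t_prime increases monotonically with theta, so it can serve as the
    # x-axis of a linear interpolation; values outside the table are clamped.
    return float(np.interp(t_measured, t_prime, theta_deg))
```

For example, `lookup_direction(t, build_correction_table(0.06))` would return the tabulated direction for a hypothetical dome radius of 6 cm. A negative time difference (sound arriving from the other side) could be handled by looking up its absolute value and restoring the sign afterwards.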
[0025]
The correction table TB may store information based on the difference in the arrival direction
between the case where there is no camera dome DOM and the case where there is a camera
dome DOM. From the graph of FIG. 5, it can be seen that the difference in the propagation
distance difference depending on the presence or absence of the camera dome DOM becomes
remarkable when the sound arrival direction θ becomes larger than 30 °. Therefore, the
difference in the arrival direction due to the presence or absence of the camera dome DOM may be
stored as the correction table TB. In that case, the sound arrival direction estimation
processing unit 140 first calculates the arrival direction using Equation (2), which assumes
that there is no camera dome DOM, and, if the calculated angle is larger than 30 °, corrects the
direction of arrival with reference to the correction table TB.
[0026]
Although the correction table TB in the case where the microphone array 110 substantially
contacts the camera dome DOM has been described in the above embodiment, the present
invention is not limited to this. For example, the relationship between the propagation path
difference caused by an obstacle around the microphone array 110 and the arrival direction of
sound can be obtained geometrically in the same manner and stored in the storage unit 130 as a
correction table, so that the estimation error of the arrival direction of the sound due to the
obstacle can be corrected. Further, the correction table TB may include not only information on
the diffracted sound that has passed through the shortest path as described above, but also
information on sound waves that have passed through paths other than the shortest path.
[0027]
Further, as a secondary effect in the presence of the camera dome DOM, it can be seen from the
graph of FIG. 5 that improvement in estimation accuracy can be expected around 90 ° in the
arrival direction. In FIG. 5, when there is no camera dome DOM, the propagation distance
difference in the arrival direction near 90 ° hardly changes. On the other hand, when there is a
camera dome DOM, the propagation distance difference changes relative to the change in the
arrival direction even when the arrival direction is near 90 °. Therefore, by installing the camera
dome DOM, it is possible to obtain good angular resolution over all directions of arrival, including
around 90 °.
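The resolution argument can be checked numerically by comparing how fast the path difference changes with the arrival direction in each case. The sketch below assumes the same models as a conventional far-field analysis would, namely d = 2r sin θ without the dome and d′ = r(θ + sin θ) with it; these formulas and the function names are assumptions, since the original equations are not legible in this text.

```python
import math


def sensitivity_without_dome(theta_deg, radius_m):
    """Change of path difference per radian of theta: derivative of 2r*sin(theta)."""
    return 2.0 * radius_m * math.cos(math.radians(theta_deg))


def sensitivity_with_dome(theta_deg, radius_m):
    """Derivative of r*(theta + sin(theta)), the assumed diffracted path."""
    return radius_m * (1.0 + math.cos(math.radians(theta_deg)))
```

Under these assumptions the sensitivity without the dome falls to zero at 90°, so a small timing error maps to a large angular error there, whereas with the dome it stays at r, which matches the qualitative behaviour described for FIG. 5.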
[0028]
Next, a method of reducing the influence of the reflected sound from the wall or obstacle on the
estimation of the arrival direction of the sound will be described. FIG. 6 is a diagram for
explaining a method of reducing the influence of the reflected sound, and is shown by a time
chart in which time is taken on the horizontal axis. The time chart A of FIG. 6 shows the sound
pressure level of the sound that is collected by one of the microphones constituting the
microphone array 110 when there is a reflected sound and corrected by the level correction unit
120. Further, time charts B, C and D respectively show a peak detection section of the sound
pressure level, a hold-off section for waiting for capturing of sound, and a timeout section for the
arrival direction processing.
[0029]
The time chart of FIG. 6 will be described. As shown in the time chart A, the sound pressure level
collected by the microphone changes with the direct sound from the sound source and the
reflected sound as time passes. At this time, the sound selection unit 150 determines that the
sound different from the environmental sound is detected when the sound pressure level exceeds
the predetermined threshold Th, and as shown in the time chart B, the peak detection signal
rises. Thereafter, when the direct sound from the sound source arrives and the first peak P1 is
observed, the peak detection signal of the time chart B falls.
[0030]
As shown in the time chart A, after the direct sound from the sound source arrives and the first
peak P1 occurs, the reflected sound causes the second peak P2. At this time, it is not preferable
to perform the estimation process of the sound source position when the second peak P2 that is
the reflected sound is detected, because this causes an error in the estimation of the arrival
direction of the sound. Therefore, when the peak detection signal falls after the detection of the
first peak P1 in the time chart B, as shown in the time chart C, the hold off signal rises.
Thereafter, the hold-off signal falls when the sound pressure level falls below the predetermined
threshold Th. As described above, the section from the detection of the first peak P1 until the
sound pressure level falls below the predetermined threshold Th is a capture-prohibited section
(hold-off section) T1. In this section, even when a peak of the sound pressure level is
observed, the estimation process of the sound source position is not performed. That is, after
the first peak is observed based on the level of the sound collected by the microphone, the
sound selection unit 150 does not select, in the section where the sound pressure level still
exceeds the predetermined threshold, a sound for the sound arrival direction estimation
processing unit 140 to use in calculating the direction of arrival. By doing this, it is
possible to prevent the sound source position from being estimated from the reflected sound.
[0031]
In addition, in the time chart A, the capture-prohibited section T1 is canceled only when the
sound pressure level falls below the predetermined threshold Th, yet the next incoming sound
(third peak P3) must still be captured even if the level has not fallen that far. Therefore, an
upper limit value (timeout period) T2 is provided for the section in which hold-off is
performed, and when the sound pressure level has not fallen below the threshold Th for the
timeout period T2 or more, the hold-off times out and sound capture is resumed. That is, when
the peak detection signal falls in the time chart B, the timeout processing signal rises as
shown in the time chart D, and the timeout processing signal falls after the timeout period T2
elapses. Then, if the sound pressure level exceeds the predetermined threshold Th when the
timeout processing signal falls, it is determined that the third peak P3 is detected, and the
peak detection signal rises as shown in the time chart B. The timeout period T2 may be, for
example, 1.5 seconds at maximum.
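The peak detection, hold-off, and timeout behaviour of time charts B, C, and D can be sketched as a small state machine over a sequence of level measurements. This is an illustrative reading, not the patent's implementation: the function name, the frame-based timing, and the rule that a peak is recognised at the first frame where the level starts to fall are assumptions.

```python
def select_direct_peaks(levels, threshold, timeout_frames):
    """Return indices of peaks that would be passed on for direction estimation."""
    IDLE, DETECTING, HOLD_OFF = range(3)
    state = IDLE
    selected = []
    prev = None   # level of the previous frame
    held = 0      # frames spent in the current hold-off section
    for i, level in enumerate(levels):
        if state == IDLE:
            if level > threshold:
                state = DETECTING          # peak detection signal rises
        elif state == DETECTING:
            if level < prev:               # level began to fall: frame i-1 was the peak
                selected.append(i - 1)
                held = 0
                # Hold off only while the level is still above the threshold.
                state = HOLD_OFF if level > threshold else IDLE
        else:  # HOLD_OFF: peaks here are treated as reflections and ignored
            held += 1
            if level <= threshold:
                state = IDLE               # hold-off signal falls
            elif held >= timeout_frames:
                state = DETECTING          # timeout: resume capture
        prev = level
    return selected
```

With the 1/30 second frame mentioned later in the description, a timeout of 1.5 seconds would correspond to `timeout_frames=45`. In `select_direct_peaks([0, 0, 2, 5, 3, 4, 2, 0.5, 0, 3, 6, 1.5, 0], 1.0, 10)` the second local maximum (level 4) falls inside the hold-off section and is skipped, so only the peaks at indices 3 and 10 are selected.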
[0032]
In addition, a microphone different from those of the microphone array 110 may be provided
separately as the microphone that measures the sound pressure level. In this case, level
correction is also performed on the sound collected by the microphone for measuring the sound
pressure level.
[0033]
As described above, the sound selection unit 150 selects, from among the sounds whose level, as
collected by at least one of the microphones constituting the microphone array 110 or by a
microphone provided separately from the microphone array 110, exceeds the predetermined
threshold Th, a sound that was not preceded within a predetermined time (timeout period T2 or
hold-off period T1) by a sound exceeding the predetermined threshold Th, and outputs it to the
sound arrival direction estimation processing unit 140. By doing this, the influence of the
reflected sound is removed and the sound source position can be estimated more accurately from
the direct sound alone.
[0034]
In addition, the sound selection unit 150 can appropriately change the predetermined threshold
Th described above according to the ambient sound level. When abnormal sound is detected by the
sound monitoring apparatus 200, setting the predetermined threshold Th to a constant value has a
disadvantage in a place where people enter and leave irregularly or where the ambient noise
level differs greatly between day and night: environmental sound may be determined to be an
abnormal sound, or an abnormal sound may not be determined to be an abnormal sound. Therefore,
the sound selection unit 150 changes the value of the predetermined threshold Th by adapting to
the change of the environmental sound. Specifically, the sound selection unit 150 obtains the
maximum value (maximum sound level) for each time frame of the level of the sound corrected by
the level correction unit 120. The time frame is, for example, 1/30 second. Then, the sound
selection unit 150 calculates an average value of the maximum sound levels over several frames,
and sets the calculated value as the predetermined threshold Th. That is, assuming that the
maximum sound level of the first frame is a1, the maximum sound level of the second frame is a2,
and the maximum sound level of the nth frame is an, the average value aave of the maximum sound
levels over the n frames can be expressed by equation (5).
aave = (a1 + a2 + ... + an) / n   (5)
By using the aave thus obtained, the threshold value is adaptively changed in response to the
surrounding environmental sound, and the sound source position can be estimated more
accurately.
[0035]
In the equation (5), by changing the number n of frames used for averaging, it is possible to
change the threshold following the time variation of the environmental sound. For example, when
the fluctuation of the environmental sound is large, the number n of frames to be obtained as the
average may be reduced. On the contrary, when the variation of the environmental sound is
small, the number n of frames to be obtained as the average may be increased.
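The adaptive threshold can be sketched as a running average of per-frame maxima. The class name `AdaptiveThreshold`, the use of the absolute sample value as the "level", and the sliding window are assumptions made for this example; the description only states that the maximum of each frame is averaged over n frames.

```python
from collections import deque


class AdaptiveThreshold:
    """Average of the maximum level of the last n_frames frames."""

    def __init__(self, n_frames):
        # A bounded deque discards the oldest frame maximum automatically.
        self._maxima = deque(maxlen=n_frames)

    def update(self, frame_samples):
        """Add one frame of level-corrected samples and return the new threshold."""
        self._maxima.append(max(abs(s) for s in frame_samples))
        # Until n_frames frames have been seen, average over what is available.
        return sum(self._maxima) / len(self._maxima)
```

A smaller `n_frames` makes the threshold follow rapid changes in the environmental sound, and a larger value smooths it, which corresponds to the choice of n discussed above.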
[0036]
The advantages of the present invention will be reiterated. As described above, according to the
present invention, when the direction of arrival of sound is estimated using the arrival time and
phase difference of the sound arriving at each of the microphones constituting the microphone
array, it is possible to provide a sound source position estimation device that performs
estimation with higher accuracy even when an obstacle is present around the microphone array or
when reflected sound occurs. Further, even when the sound source position estimation device is
integrated with the dome camera, the microphone array can be placed substantially in contact
with the outer wall of the camera dome, so that the influence of the reflected sound from the
camera dome can be reduced. Furthermore, by arranging the microphone array so as to be
substantially in contact with the outer wall of the camera dome, it is possible to correct the
influence of the camera dome itself, which is an obstacle, and to estimate the arrival direction
of the sound with high accuracy. There is also the advantage that the appearance of the dome
camera is not impaired and the size of the apparatus is not increased.
[0037]
Further, according to the present invention, since the estimation of the arrival direction is not
performed on the reflected sound, the estimation accuracy of the arrival direction can be
improved.
[0038]
Although the present invention has been described based on the drawings and examples, it
should be noted that those skilled in the art can easily make various changes and modifications
based on the present disclosure.
Therefore, it should be noted that these variations and modifications are included in the scope of
the present invention. For example, functions and the like included in each component can be
rearranged so as not to be logically contradictory, and a plurality of components can be
combined into one or divided. For example, although the camera dome DOM was described as
hemispherical in the above embodiment, the present invention is not limited to this, and the
camera dome may be box-shaped. Moreover, although the case where the sound source position
estimation apparatus is provided in the sound monitoring apparatus was described, the present
invention is not limited to this. For example, the sound source position estimation device alone
can be attached to the ceiling, or it can be used for a surveillance system that does not
perform imaging with a camera.
[0039]
DESCRIPTION OF SYMBOLS
100 sound source position estimation apparatus
110 microphone array
120 level correction unit
130 storage unit
140 sound arrival direction estimation processing unit
150 sound selection unit
200 sound monitoring apparatus
210 sound information processing unit
220 network processing unit
230 alarm processing unit
240 abnormal sound determination unit
DIS monitor
CA camera
NET network
MIC1 to MIC3 microphones
P1 to P3 first to third peaks
T1 capture-prohibited section
T2 upper limit (timeout section)