DESCRIPTION JP2017112415
Kind Code: A1

PROBLEM TO BE SOLVED: To provide a sound field estimation apparatus, and a method and program therefor, whose spatial area in which extrapolation is effective is larger than in the prior art.

SOLUTION: A sound field estimation apparatus 200 includes a plane wave decomposition unit 213 that estimates a vector consisting of the strengths of the plane waves composing the sound field, using the spherical wave spectrum u_{n,m}(ω, r_a) (n = 0, 1, ..., N; m = −n, −n+1, ..., n) calculated from the collected sound signals y(t, r_a, Ω_j) (j = 1, 2, ..., J) of a spherical microphone array 1 provided with microphones at J positions (r_a, Ω_j), where r is the radius and Ω the angular position in polar coordinates; and an extrapolation estimation unit 216 that estimates the frequency-domain collected sound signal u^(ω, r, Ω) at the position of a virtual microphone, using the estimated vector a(ω) of plane wave strengths and the virtual microphone position (r, Ω). [Selected figure] Figure 2
Sound field estimation device, method and program thereof
[0001]
The present invention relates to a technique for estimating, from the collected sound signal of a microphone placed at one position, the sound signal that would be collected by a microphone placed at another position.
[0002]
In recent years, audio reproduction technology has expanded from 2-channel stereo to 5.1-channel reproduction, and research and development on 22.2-channel reproduction and wave field synthesis methods is advancing further, aiming both to greatly improve the realism of reproduction itself and to expand the area over which highly realistic reproduction is achieved.
[0003]
In order to evaluate and verify such a multi-channel audio reproduction method, it is important
to measure the reproduced sound field.
For example, in the wave field synthesis method, it is necessary to compare the actually recorded sound field with the reproduced sound field and grasp the difference between them.
This is because various factors, such as the signal processing that converts the recorded sound field into reproduction signals, the encoding and decoding of the recorded signals, and the acoustic characteristics of the room in which the reproduction apparatus is installed, affect the reproduction accuracy of the sound field, and grasping these effects is essential to establishing a reproduction method with high accuracy.
[0004]
(Conventional method 1) As a method of measuring a sound field, it is conceivable to arrange microphones locally on a part of the target measurement area and to estimate the sound field of the surrounding area from the measurement result. As one example, spherical microphone arrays are under study. A spherical microphone array is a microphone array in which several tens or more of microphone elements are arranged on a spherical surface of radius r_a, where r_a ranges from several centimeters to several tens of centimeters.
[0005]
FIG. 1 shows the signal flow of a sound field estimation process using a spherical microphone array 1 in the prior art. The time-domain signals y(t, r_a, Ω_j) collected by the J microphones disposed on the spherical surface are converted by the short-time Fourier transform unit 111 into the frequency-domain signals u(i, ω, r_a, Ω_j). Here, t is time, i is a frame number, J is an integer of 2 or more, ω is a time frequency, and j = 1, 2, ..., J. The subsequent processing is performed frame by frame, but i is omitted to simplify the notation. Ω_j is the position of the j-th microphone element on the spherical surface, specified by a pair of an elevation angle θ_j and an azimuth angle φ_j: Ω_j = (θ_j, φ_j).
[0006]
The spherical wave spectrum conversion unit 112 obtains the spherical wave spectrum u_{n,m}(ω, r_a) for each frequency ω by the following equation.
[0007]
u_{n,m}(ω, r_a) = Σ_{j=1}^{J} α_j u(ω, r_a, Ω_j) Y_n^m(θ_j, φ_j)*   (1)
[0008]
Here, α_j is a weight set appropriately so that the product-sum of equation (1) satisfies the orthogonality of the spherical harmonics expressed by the following equation.
[0009]
Σ_{j=1}^{J} α_j Y_n^m(θ_j, φ_j) Y_{n'}^{m'}(θ_j, φ_j)* = δ_{nn'} δ_{mm'}   (2)
[0010]
Y_n^m(θ_j, φ_j) is the spherical harmonic function of order n and degree m, and * denotes the complex conjugate.
Here, n = 0, 1, ..., N and m = −n, −n+1, ..., n.
δ_{nn'} is 1 when n = n' and 0 when n ≠ n', and δ_{mm'} is 1 when m = m' and 0 when m ≠ m'.
In order to obtain the spherical wave spectrum up to order N, (N+1)^2 or more microphone elements are required.
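As a concrete illustration, equation (1) can be sketched in Python as follows, assuming uniform quadrature weights α_j = 4π/J (valid only for a near-uniform microphone layout) and scipy's sph_harm argument order; a real array would use weights matched to its layout.

import numpy as np
from scipy.special import sph_harm

def spherical_wave_spectrum(u, theta, phi, N, alpha=None):
    # u: (J,) complex pickup signals at one frequency omega.
    # theta, phi: (J,) polar and azimuth angles of the J microphones.
    # Returns a dict mapping (n, m) to u_{n,m}(omega, r_a).
    J = len(u)
    if alpha is None:
        alpha = np.full(J, 4.0 * np.pi / J)  # assumed near-uniform layout
    u_nm = {}
    for n in range(N + 1):
        for m in range(-n, n + 1):
            # scipy convention: sph_harm(m, n, azimuth, polar).
            Y = sph_harm(m, n, phi, theta)
            u_nm[(n, m)] = np.sum(alpha * u * np.conj(Y))  # equation (1)
    return u_nm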
[0011]
Note that, from this point onward, we deal with measuring the sound field generated by a sound source located outside the measurement target range, that is, with the interior problem.
In other words, the sound field generated by a sound source outside the sphere of the spherical microphone array is measured.
[0012]
The sound field is considered in the polar coordinate system (r, Ω) = (r, θ, φ) with the center of
the spherical microphone array as the origin.
[0013]
The extrapolation estimation unit 116 extrapolates the sound field at frequency ω from the position of radius r_a to radius r according to the following equation, and finds the collected sound signal u(ω, r, Ω) at the position (r, Ω) = (r, θ, φ).
In other words, the sound collection signals of the microphones disposed on the spherical microphone array are used to estimate the sound field outside the sphere of the spherical microphone array.
[0014]
u(ω, r, Ω) = Σ_{n=0}^{N} Σ_{m=−n}^{n} (b_n(kr) / b_n(kr_a)) u_{n,m}(ω, r_a) Y_n^m(θ, φ)   (3)
[0015]
Here, k is the wave number, k = ω/c (c is the speed of sound), and b_n(·) is the mode intensity function.
[0016]
Non-Patent Document 1 deals with the case of an open-sphere spherical microphone array, in which the microphone elements are arranged on a hollow spherical surface.
In this case, the mode intensity function is expressed by the following equation.
[0017]
b_n(kr) = 4π i^n j_n(kr)
[0018]
Here, i is the imaginary unit, and j_n(·) is the spherical Bessel function of order n.
When the spherical microphone array is configured by arranging the microphone elements on the surface of a rigid sphere, the mode intensity function is expressed, based on Non-Patent Document 2, by the following equation.
[0019]
b_n(kr) = 4π i^n ( j_n(kr) − (j_n'(kr_a) / h_n'(kr_a)) h_n(kr) )
[0020]
Here, h_n(·) is the spherical Hankel function of the first kind of order n, and A' denotes the derivative of A.
[0021]
The short-time inverse Fourier transform unit 118 converts the spatially extrapolated sound collection signal from the frequency-domain signal u(ω, r, Ω) into the time-domain signal y(t, r, Ω) and outputs it.
[0022]
In equation (3), b_n(kr)/b_n(kr_a) is applied to the spherical wave spectrum, and the product-sum with Y_n^m(θ, φ) is taken.
The product-sum with Y_n^m(θ, φ) corresponds to the inverse spherical wave spectrum conversion.
The spatially extrapolated sound pickup signal u(ω, r, Ω) is therefore a signal in the frequency domain.
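A minimal sketch of this prior-art extrapolation, reusing spherical_wave_spectrum and the mode intensity functions from the sketches above; the mode intensity function is passed in so that either the open-sphere or the rigid-sphere form can be used.

import numpy as np
from scipy.special import sph_harm

def extrapolate_prior_art(u_nm, N, k, r_a, r, theta, phi, b_n):
    # u_nm: dict from (n, m) to coefficients measured at radius r_a.
    # b_n: mode intensity function, e.g. lambda n, kr: b_open(n, kr).
    total = 0.0 + 0.0j
    for n in range(N + 1):
        # For an open sphere the denominator vanishes at j_n(k*r_a) = 0,
        # which is exactly the singular-point problem discussed below.
        gain = b_n(n, k * r) / b_n(n, k * r_a)
        for m in range(-n, n + 1):
            total += gain * u_nm[(n, m)] * sph_harm(m, n, phi, theta)
    return total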
[0023]
In measurement with an open-sphere microphone array, the influence of singular points cannot be avoided: measurement becomes impossible at combinations of k and r for which j_n(kr) = 0. Specifically, when j_n(kr) = 0 is satisfied, the output becomes zero even if a sound field is present. A rigid-sphere microphone array, by contrast, has no singular points at which measurement becomes impossible. For this reason, using a rigid-sphere microphone array as the spherical microphone array is the mainstream approach.
[0024]
T. Abhayapala and D. Ward, "Theory and design of high order sound field microphones using spherical microphone array," in Acoustics, Speech, and Signal Processing (ICASSP), IEEE International Conference on, 2002, pp. II-1949.
J. Meyer and G. Elko, "A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield," in Acoustics, Speech, and Signal Processing (ICASSP), IEEE International Conference on, 2002, pp. II-1781-II-1784.
[0025]
In the prior art, the spherical Bessel function j_n(kr), or the spherical Bessel function j_n(kr) together with the spherical Hankel function h_n(kr), is used when extrapolating the sound field from the position of radius r_a to radius r. According to Reference 1, the global tendency of both functions as kr increases is to decay at a rate of 1/kr. (Reference 1) E. G. Williams, "Fourier Acoustics," Springer-Verlag, 2005, pp. 234-236.
[0026]
For example, when r becomes ten times as large as r_a, the extrapolated estimate drops sharply, to about 1/10. The spatial area in which extrapolation is effective is therefore limited to the vicinity of the spherical microphone array surface. For the same reason, when the frequency ω is increased, k = ω/c increases and the extrapolated estimate again decreases rapidly. In other words, as the frequency rises, the spatial area in which extrapolation is effective narrows sharply.
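The 1/kr decay can be checked numerically; this small script (with illustrative numbers only) compares the envelope of j_2 over an interval and over the same interval scaled by ten.

import numpy as np
from scipy.special import spherical_jn

kr = np.linspace(20.0, 30.0, 10000)
for scale in (1.0, 10.0):
    env = np.max(np.abs(spherical_jn(2, scale * kr)))
    print("max |j_2| over [%.0f, %.0f]: %.4f" % (20 * scale, 30 * scale, env))
# Expected: the second envelope is roughly one tenth of the first.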
[0027]
An object of the present invention is to provide a sound field estimation apparatus, and a method and program therefor, whose spatial area in which extrapolation is effective is larger than in the prior art.
[0028]
In order to solve the above-mentioned problems, according to one aspect of the present invention, a sound field estimation apparatus includes: a plane wave decomposition unit that estimates a vector consisting of the strengths of the plane waves composing the sound field, using the spherical wave spectrum u_{n,m}(ω, r_a) (n = 0, 1, ..., N; m = −n, −n+1, ..., n) calculated from the collected sound signals y(r_a, θ_j, φ_j) of a spherical microphone array provided with microphones at J positions (r_a, θ_j, φ_j), where j = 1, 2, ..., J, r_a is the radius and θ_j and φ_j are the angles in polar coordinates, and ω is an index of time frequency; and an extrapolation estimation unit that estimates the frequency-domain collected sound signal u^(ω, r, θ, φ) at the position (r, θ, φ) of a virtual microphone, where r is the radius and θ and φ are the angles in polar coordinates, using the estimated vector a(ω) of plane wave strengths and the virtual microphone position (r, θ, φ).
[0029]
In order to solve the above problems, according to another aspect of the present invention, a sound field estimation method includes: a plane wave decomposition step of estimating a vector consisting of the strengths of the plane waves composing the sound field, using the spherical wave spectrum u_{n,m}(ω, r_a) (n = 0, 1, ..., N; m = −n, −n+1, ..., n) calculated from the collected sound signals y(r_a, θ_j, φ_j) of a spherical microphone array provided with microphones at J positions (r_a, θ_j, φ_j), where j = 1, 2, ..., J, r_a is the radius and θ_j and φ_j are the angles in polar coordinates, and ω is an index of time frequency; and an extrapolation estimation step of estimating the frequency-domain collected sound signal u^(ω, r, θ, φ) at the position (r, θ, φ) of a virtual microphone, where r is the radius and θ and φ are the angles in polar coordinates, using the estimated vector a(ω) of plane wave strengths and the virtual microphone position (r, θ, φ).
[0030]
According to the present invention, the space area where extrapolation is effective is larger than
that of the prior art.
[0031]
A functional block diagram of a sound field estimation apparatus according to the prior art.
A functional block diagram of the sound field estimation apparatus according to a first embodiment.
A diagram showing an example of the processing flow of the sound field estimation apparatus according to the first embodiment.
A diagram showing an outline of the positions of the virtual microphones in the first embodiment and its first and second modifications.
A functional block diagram of the sound field estimation apparatus according to a second embodiment.
A diagram showing an example of the processing flow of the sound field estimation apparatus according to the second embodiment.
A functional block diagram of the sound field estimation apparatus according to a third embodiment.
A diagram showing an example of the processing flow of the sound field estimation apparatus according to the third embodiment.
[0032]
Hereinafter, embodiments of the present invention will be described. In the drawings used in the following description, constituent parts having the same function and steps performing the same processing are given the same reference numerals, and redundant description is omitted. In the following description, symbols such as "^" and "¯" used in the text should properly be written directly above the preceding character, but are written after it owing to the limitations of text notation. In the formulas, these symbols are written in their proper positions. Moreover, processing defined for each element of a vector or matrix is applied to all elements of that vector or matrix unless otherwise noted.
[0033]
<Point of First Embodiment> In the present embodiment, instead of extrapolating the spherical wave spectrum, a collection of plane waves constituting the sound field is obtained from the spherical wave spectrum, and the sound field is extrapolated using these plane waves. Going through plane waves makes it possible to greatly expand the spatial area in which extrapolation is effective. The method is described below.
[0034]
<Sound Field Estimation Device 200 According to First Embodiment> FIG. 2 shows a functional block diagram of the sound field estimation device 200 according to the first embodiment, and FIG. 3 shows its processing flow.
[0035]
The sound field estimation apparatus 200 includes a short-time Fourier transform unit 211, a spherical wave spectrum conversion unit 212, a plane wave decomposition unit 213, an extrapolation estimation unit 216, and a short-time inverse Fourier transform unit 218.
[0036]
The sound field estimation device 200 receives the time-domain collected sound signal y(t, r_a, Ω_j) (where j = 1, 2, ..., J) from the spherical microphone array 1 and the position information (r, Ω) of a virtual microphone, and estimates and outputs the time-domain collected sound signal y(t, r, Ω) at the position of the virtual microphone.
In the spherical microphone array 1, J microphones are arranged on a spherical surface of radius r_a, and the position of the j-th microphone is specified by Ω_j = (θ_j, φ_j).
That is, with the center of the spherical surface formed by the spherical microphone array 1 as the origin, let r_a be the radius and θ_j and φ_j the angles in polar coordinates; the position of the j-th microphone is then represented by (r_a, θ_j, φ_j).
[0037]
<Short-Time Fourier Transform Unit 211> The short-time Fourier transform unit 211 receives the time-domain collected signal y(t, r_a, Ω_j) (where j = 1, 2, ..., J) and, by short-time Fourier transformation, converts it into the frequency-domain collected signal u(i, ω, r_a, Ω_j) (where i is a frame number, ω = 1, 2, ..., F, and j = 1, 2, ..., J) (S211), which it outputs. The subsequent processing is performed for each frame i, but the frame number i is omitted to simplify the description. Any method of converting a time-domain signal into a frequency-domain signal may be used in place of the short-time Fourier transform.
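For instance, the conversion can be performed with scipy's STFT as sketched below; the sampling rate, window, and hop size are assumptions, since the unit only requires some invertible time-frequency transform.

import numpy as np
from scipy.signal import stft

fs = 48000                      # assumed sampling rate
J = 32                          # assumed number of microphones
y = np.random.randn(J, fs)      # placeholder one-second pickup signals
freqs, frames, U = stft(y, fs=fs, nperseg=1024, noverlap=512, axis=-1)
# U has shape (J, F, I): u(i, omega, r_a, Omega_j) with the frame index i
# kept explicit, matching the notation above.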
[0038]
<Spherical Wave Spectrum Conversion Unit 212> The spherical wave spectrum conversion unit 212 receives the frequency-domain collected signals u(i, ω, r_a, Ω_j) (where ω = 1, 2, ..., F and j = 1, 2, ..., J), obtains the spherical wave spectrum u_{n,m}(ω, r_a) for each frequency ω by equation (1) (S212), and outputs u_{n,m}(ω, r_a) (where n = 0, 1, ..., N, m = −n, −n+1, ..., n, and ω = 1, 2, ..., F).
[0039]
u_{n,m}(ω, r_a) = Σ_{j=1}^{J} α_j u(ω, r_a, Ω_j) Y_n^m(θ_j, φ_j)*   (1)
[0040]
Note that α_j and Y_n^m(θ_j, φ_j) are as described above in (Conventional method 1).
[0041]
<Plane Wave Decomposition Unit 213> The plane wave decomposition unit 213 receives the spherical wave spectrum u_{n,m}(ω, r_a) (where n = 0, 1, ..., N, m = −n, −n+1, ..., n, and ω = 1, 2, ..., F), estimates the vector consisting of the strengths of the plane waves composing the sound field (S213), and outputs the estimated value a(ω) (where ω = 1, 2, ..., F).
For example, the plane wave decomposition unit 213 first solves the following convex optimization problem using the L1 norm to obtain the collection of plane waves that constitute the sound field.
[0042]
a(ω) = argmin_a ||a||_1 subject to D(ω) a = u(ω), where u(ω) = [u_{0,0}(ω, r_a), u_{1,−1}(ω, r_a), ..., u_{N,N}(ω, r_a)]^T
[0043]
The matrix D(ω) is expressed by the following equation.
[0044]
D(ω) = [d_1(ω) d_2(ω) ... d_{L'}(ω)]
[0045]
The column vector in the l'-th column of the matrix D(ω) is the vector of the spherical wave spectrum calculated from the collected signals y(t, r_a) of the spherical microphone array when a single plane wave of amplitude 1 is incident at the incident angle Ω_{l'}, i.e., elevation angle θ_{l'} and azimuth angle φ_{l'}.
For example, the L' incident angles Ω_{l'} are set so that the L' plane waves sample all directions uniformly; for instance, they are set so that a plane wave is incident from the direction of each vertex of a regular polyhedron.
The column vector of the l'-th column is given by the following equation.
[0046]
d_{l'}(ω) = [b_0(kr_a) Y_0^0(θ_{l'}, φ_{l'})*, b_1(kr_a) Y_1^{−1}(θ_{l'}, φ_{l'})*, ..., b_N(kr_a) Y_N^N(θ_{l'}, φ_{l'})*]^T
[0047]
When spherical harmonics up to order N are used, the size of this vector is (N+1)^2.
Note that this size needs to be smaller than the number J of microphone elements included in the spherical microphone array 1 (thus (N+1)^2 < J).
The estimated value a(ω) is a vector composed of the estimated strengths of the plane waves: a(ω) = [a_1(ω), a_2(ω), ..., a_{l'}(ω), ..., a_{L'}(ω)]^T.
[0048]
Convex optimization with the L1 norm yields a sparse vector containing many zeros as the solution vector a(ω). Therefore, as shown in Reference 2, plane waves can be extracted well even in the redundant case where the number L' of plane waves assumed in advance greatly exceeds the number of microphones. (Reference 2) A. Wabnitz, N. Epain, A. van Schaik, C. Jin, "Reconstruction of spatial sound field using compressed sensing," in Acoustics, Speech, and Signal Processing (ICASSP), IEEE International Conference on, 2011. As an example, in the case of a spherical microphone array having 32 microphone elements, N ≦ 4. Also, L' can be set to one hundred or more.
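The decomposition can be sketched with cvxpy as follows. The dictionary column for direction Ω_{l'}, stacked over (n, m) as b_n(kr_a) Y_n^m(θ_{l'}, φ_{l'})*, follows the reconstruction above; the inequality-constrained formulation with a small tolerance eps is an assumption that absorbs noise, where the source may use an exact equality constraint.

import numpy as np
import cvxpy as cp
from scipy.special import sph_harm

def build_dictionary(k, r_a, N, thetas, phis, b_n):
    # Columns: spherical wave spectra of unit-amplitude plane waves
    # incident from the L' directions (thetas[l], phis[l]).
    cols = []
    for th, ph in zip(thetas, phis):
        col = [b_n(n, k * r_a) * np.conj(sph_harm(m, n, ph, th))
               for n in range(N + 1) for m in range(-n, n + 1)]
        cols.append(col)
    return np.array(cols).T          # shape ((N+1)^2, L')

def decompose(D, u_vec, eps=1e-6):
    # L1 minimization: a sparse strength vector a with D @ a close to u_vec.
    a = cp.Variable(D.shape[1], complex=True)
    prob = cp.Problem(cp.Minimize(cp.norm1(a)),
                      [cp.norm2(D @ a - u_vec) <= eps])
    prob.solve()
    return a.value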
[0049]
<Extrapolation Estimation Unit 216> The extrapolation estimation unit 216 receives the estimated value a(ω) and the position information (r, Ω) of the virtual microphone, estimates the frequency-domain collected sound signal u^(ω, r, Ω) (where ω = 1, 2, ..., F) at the position (r, Ω) of the virtual microphone by the following equation (S216), and outputs it.
[0050]
u^(ω, r, Ω) = Σ_{l=1}^{L'} a_l(ω) exp(i k¯_l ● r¯)
[0051]
Here, ● denotes the inner product, and k¯_l is the wave number vector corresponding to the incident direction of the l-th plane wave, expressed by the following equation.
[0052]
k¯_l = k [sin θ_l cos φ_l, sin θ_l sin φ_l, cos θ_l]^T
[0053]
^T represents transposition.
Further, r¯ is the representation of the designated position in XYZ three-dimensional space, expressed by the following equation.
[0054]
r¯ = r [sin θ cos φ, sin θ sin φ, cos θ]^T
[0055]
The position information (r, Ω) of the virtual microphone is input by the user of the sound field
estimation apparatus 200, for example.
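The extrapolation step then superposes the L' estimated plane waves at the virtual microphone position, as sketched below; the sign convention of the wave number vector follows the reconstruction above and may differ from the source equations.

import numpy as np

def extrapolate_plane_waves(a, k, thetas, phis, r, theta, phi):
    # a: (L',) plane wave strengths; thetas, phis: incidence angles of the
    # dictionary directions; (r, theta, phi): virtual microphone position.
    k_l = k * np.stack([np.sin(thetas) * np.cos(phis),
                        np.sin(thetas) * np.sin(phis),
                        np.cos(thetas)], axis=-1)   # wave number vectors (L', 3)
    r_vec = r * np.array([np.sin(theta) * np.cos(phi),
                          np.sin(theta) * np.sin(phi),
                          np.cos(theta)])           # position vector (3,)
    return np.sum(a * np.exp(1j * (k_l @ r_vec)))   # superposed plane waves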
[0056]
<Short-Time Inverse Fourier Transform Unit 218> The short-time inverse Fourier transform unit 218 receives the frequency-domain collected sound signal u^(ω, r, Ω) (where ω = 1, 2, ..., F), converts it into the time-domain collected sound signal y(t, r, Ω) by short-time inverse Fourier transformation (S218), and outputs it.
As the method of converting the frequency-domain signal into a time-domain signal, the method corresponding to the conversion method used in the short-time Fourier transform unit 211 should be used.
[0057]
<Effects> With the above configuration, it is possible to realize a sound field estimation apparatus in which the spatial area where extrapolation is effective is larger than in the prior art.
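As an end-to-end check at a single frequency, the sketches above can be chained as follows (the helper functions from the earlier sketches are assumed to be in scope, and the geometry and direction grid are illustrative): a single synthetic plane wave is encoded, recovered by the L1 decomposition, and extrapolated to a point ten times the array radius away.

import numpy as np

k, r_a, N, L = 2 * np.pi * 1000.0 / 343.0, 0.05, 4, 100   # 1 kHz, 5 cm array
rng = np.random.default_rng(0)
thetas = np.arccos(rng.uniform(-1.0, 1.0, L))   # assumed direction grid
phis = rng.uniform(0.0, 2.0 * np.pi, L)

D = build_dictionary(k, r_a, N, thetas, phis, b_open)
a_true = np.zeros(L, dtype=complex)
a_true[7] = 1.0                                 # one incident plane wave
a_est = decompose(D, D @ a_true)
u_hat = extrapolate_plane_waves(a_est, k, thetas, phis,
                                r=0.5, theta=np.pi / 2, phi=0.0)
print(np.argmax(np.abs(a_est)), abs(u_hat))     # expect index 7 recovered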
[0058]
<Modification 1> In the first embodiment, one virtual microphone is assumed and the signal picked up at that position is estimated.
As a matter of course, however, virtual microphones may be assumed at a plurality of positions.
Moreover, by arranging the virtual microphones on the same spherical surface, an open-sphere virtual microphone array of radius r can be configured.
For example, when a virtual microphone array including P virtual microphones is configured, the sound field estimation apparatus 200 receives the position information (r, Ω_p) of the P virtual microphones (where p = 1, 2, ..., P) and outputs the collected signals y(t, r, Ω_p) (where p = 1, 2, ..., P).
[0059]
When the number P of virtual microphones included in the virtual microphone array is 1, this is
the first embodiment.
[0060]
<Modification 2> In Modification 1 of the first embodiment, the center of the spherical microphone array 1 coincides with the center of the virtual microphone array.
However, the center of the virtual microphone array can be changed by the following equation.
Assuming that the center of the virtual microphone array is at the position D = [d_x d_y d_z] viewed from the center (origin) of the spherical microphone array 1, and that the position of the p-th microphone on the spherical surface of the virtual microphone array is Ω_p = (θ_p, φ_p) (where p = 1, 2, ..., P), the extrapolation estimation unit 216 estimates the frequency-domain sound collection signal u^(ω, r, Ω_p) by the following equation.
[0061]
u^(ω, r, Ω_p) = Σ_{l=1}^{L'} a_l(ω) exp(i k¯_l ● (D^T + r¯_p)), where r¯_p = r [sin θ_p cos φ_p, sin θ_p sin φ_p, cos θ_p]^T
[0062]
The sound field estimation apparatus 200 receives the center D of the virtual microphone array and the position information (r, Ω_p) (where p = 1, 2, ..., P) of the P virtual microphones, and outputs the collected sound signals y(t, r, Ω_p) (where p = 1, 2, ..., P).
[0063]
FIG. 4 shows an outline of the positions of the virtual microphones in the first embodiment, its Modification 1, and its Modification 2.
Since Modification 1 is obtained when the center [d_x d_y d_z] of the virtual microphone array of Modification 2 is [0 0 0], Modification 1 can also be regarded as an example of Modification 2.
[0064]
Second Embodiment: A description will be given focusing on the parts that differ from Modification 2 of the first embodiment.
[0065]
In Modification 2 of the first embodiment, an open-sphere type microphone array is virtually assumed and its collected sound signal is estimated.
In the second embodiment, based on the configuration of Modification 2 of the first embodiment, a rigid-sphere type microphone array is virtually assumed instead of the open-sphere type, and its collected sound signal is estimated.
[0066]
FIG. 5 shows a functional block diagram of the sound field estimation apparatus 300 according
to the second embodiment, and FIG. 6 shows its processing flow.
[0067]
The sound field estimation apparatus 300 includes a short-time Fourier transform unit 211, a spherical wave spectrum conversion unit 212, a plane wave decomposition unit 213, an extrapolation estimation unit 216, and a short-time inverse Fourier transform unit 218, and further includes an array type conversion unit 317.
[0068]
First, as the virtual spherical microphone array, it is assumed that sound is collected by the dual open-sphere microphone array of Reference 3.
In this microphone array, the microphone elements are disposed on a spherical surface of radius r or on a spherical surface of radius αr; α = 1.2 is recommended.
(Reference 3) I. Balmages, B. Rafaely, "Open-Sphere Designs for Spherical Microphone Arrays," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 2, pp. 727-732, 2007.
[0069]
For example, let Q = P × 2, and let the positions of P of the Q virtual microphone elements be the same as in Modification 2.
That is, the center of the virtual microphone array is at the position [d_x d_y d_z] with respect to the center (origin) of the spherical microphone array 1, and the position of the p-th virtual microphone on the spherical surface of the virtual microphone array is Ω_p = (θ_p, φ_p).
The remaining P virtual microphones among the Q virtual microphone elements are arranged on a sphere with center [d_x d_y d_z] and radius αr, and the position of the q-th virtual microphone (q = P+1, P+2, ..., Q) is assumed to satisfy Ω_q = (θ_q, φ_q) = (θ_p, φ_p) with p = q − P. That is, the q-th microphone and the p-th microphone lie in the same direction from the center of the virtual microphone array, with the radius to the p-th microphone being r and the radius to the q-th microphone being αr.
[0070]
The extrapolation estimation unit 216 receives the estimated value a(ω), the center D of the virtual microphone array, the position information (r, Ω_p) of the P virtual microphones (where p = 1, 2, ..., P), and the position information (αr, Ω_q) of the remaining P virtual microphones (where q = P+1, P+2, ..., Q), estimates the frequency-domain pickup signals u^(ω, r, Ω_p) (where p = 1, 2, ..., P) and u^(ω, αr, Ω_q) (where q = P+1, P+2, ..., Q) at the virtual microphone positions (r, Ω_p) and (αr, Ω_q) (S216), and outputs them. Instead of the P pieces of position information (αr, Ω_q) (where q = P+1, P+2, ..., Q), only α may be received.
[0071]
<Array Type Conversion Unit 317> The array type conversion unit 317 receives the frequency-domain collected sound signals u^(ω, r, Ω_p) (where p = 1, 2, ..., P) and u^(ω, αr, Ω_q) (where q = P+1, P+2, ..., Q), and converts them into the spherical wave spectra u_{n,m}(ω, r) and u_{n,m}(ω, αr) by the following equations.
[0072]
u_{n,m}(ω, r) = Σ_{p=1}^{P} α_p u^(ω, r, Ω_p) Y_n^m(θ_p, φ_p)*,  u_{n,m}(ω, αr) = Σ_{q=P+1}^{Q} α_q u^(ω, αr, Ω_q) Y_n^m(θ_q, φ_q)*
[0073]
In an open-sphere spherical microphone array, measurement becomes impossible at combinations of k and r where j_n(kr) = 0, owing to the influence of singular points.
However, by selecting whichever of u_{n,m}(ω, r) and u_{n,m}(ω, αr) has the larger absolute value, the dual open-sphere spherical microphone array can avoid the influence of the singular points.
[0074]
Therefore, the array type conversion unit 317 determines the spherical wave spectrum v_{n,m}(ω, r) as
[0075]
v_{n,m}(ω, r) = (b_n^(r)(kr) / b_n^(o)(kr)) u_{n,m}(ω, r)
[0076]
when |u_{n,m}(ω, r)| > |u_{n,m}(ω, αr)|, and as
[0077]
v_{n,m}(ω, r) = (b_n^(r)(kr) / b_n^(o)(kαr)) u_{n,m}(ω, αr)
[0078]
when |u_{n,m}(ω, r)| ≦ |u_{n,m}(ω, αr)|,
[0079]
where b_n^(o) is the open-sphere mode intensity function and b_n^(r) is the rigid-sphere mode intensity function.
[0080]
[0081]
The array type conversion unit 317 finally applies the inverse spherical wave spectrum conversion
[0082]
v(ω, r, Ω_p) = Σ_{n=0}^{N} Σ_{m=−n}^{n} v_{n,m}(ω, r) Y_n^m(θ_p, φ_p).
[0083]
As a result, it is possible to obtain, in the frequency domain, the sound collection signal for the case where a rigid-sphere type microphone array of radius r is installed at the position of the dual open-sphere type spherical microphone array that was virtually assumed first.
The array type conversion unit 317 outputs the frequency-domain signal v(ω, r, Ω_p) (where p = 1, 2, ..., P) to the short-time inverse Fourier transform unit 218.
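A heavily hedged sketch of the array type conversion: the selection rule follows the text (keep whichever open-sphere spectrum has the larger magnitude), while rescaling to the rigid-sphere mode intensity follows the reconstruction above, adopted so that the inverse transform yields a rigid-sphere pickup signal as paragraph [0083] states. b_open, b_rigid, and sph_harm are from the earlier sketches.

import numpy as np
from scipy.special import sph_harm

def array_type_conversion(u_r, u_ar, N, k, r, alpha, theta_p, phi_p):
    # u_r, u_ar: dicts of spectra measured on the virtual open spheres of
    # radius r and alpha*r; returns v(omega, r, Omega_p) for one microphone.
    v = 0.0 + 0.0j
    for n in range(N + 1):
        for m in range(-n, n + 1):
            if abs(u_r[(n, m)]) > abs(u_ar[(n, m)]):
                v_nm = u_r[(n, m)] * b_rigid(n, k * r, k * r) / b_open(n, k * r)
            else:
                v_nm = u_ar[(n, m)] * b_rigid(n, k * r, k * r) / b_open(n, k * alpha * r)
            v += v_nm * sph_harm(m, n, phi_p, theta_p)  # inverse transform
    return v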
[0084]
<Effects> With such a configuration, the same effects as in Modification 2 of the first embodiment can be obtained.
Furthermore, the collected sound signal for the case where a rigid-sphere type microphone array is installed can be obtained virtually.
[0085]
Third Embodiment: An application of a rigid-sphere microphone array to virtual reality is shown in Reference 4.
(Reference 4) R. Duraiswami, D. N. Zotkin, Z. Li, E. Grassi, N. A. Gumerov, L. S. Davis, "High Order Spatial Audio Capture and Binaural Head-Tracked Playback over Headphones with HRTF Cues," Proceedings of the 119th Convention of the AES, 2005.
[0086]
Reference 4 shows a method that takes as input the sound pickup signal of a fixed rigid-sphere microphone array and the direction of a virtual head, and outputs the signals that would be heard by the right ear and the left ear (a binaural signal) when the head faces the designated direction.
Since the spherical microphone array picks up sound from all directions, binaural signals corresponding to any designated direction can be generated without moving the microphone elements or the microphone array.
That is, if the head rotation of the listener is measured and input in real time, a binaural signal that follows the rotational movement can be generated and presented to the listener.
[0087]
The second embodiment showed a method of obtaining the collected sound signal of a virtually installed rigid-sphere microphone array.
The configuration of the present embodiment combines this collected sound signal with the binaural signal generation method, as shown in FIG. 7.
[0088]
A description will be given focusing on the parts that differ from the second embodiment.
[0089]
FIG. 7 shows a functional block diagram of a sound field estimation apparatus 400 according to
the third embodiment, and FIG. 8 shows its process flow.
[0090]
The sound field estimation apparatus 400 includes a short-time Fourier transform unit 211, a spherical wave spectrum conversion unit 212, a plane wave decomposition unit 213, an extrapolation estimation unit 216, an array type conversion unit 317, and a short-time inverse Fourier transform unit 218, and further includes a binaural signal generation unit 419.
[0091]
<Binaural Signal Generation Unit 419> The binaural signal generation unit 419 receives the virtual head direction (posture) and the time-domain collected sound signals y(t, r, Ω_p) (where p = 1, 2, ..., P; these correspond to the pickup signals of a rigid-sphere spherical microphone array), generates from these signals the binaural signals y(t, R) and y(t, L) at the virtual head position and direction, according to, for example, the method described in Reference 4 (S419), and outputs them as the output values of the sound field estimation apparatus 400.
The position of the virtual head corresponds to the center D = [d_x d_y d_z] of the virtual microphone array, and the time-domain collected signal y(t, r, Ω_p) corresponds to the pickup signal of a rigid-sphere spherical microphone array at the position of the virtual head.
Therefore, the binaural signal generation unit 419 can generate, from the virtual head direction (posture) and the time-domain collected sound signals y(t, r, Ω_p), the binaural signals y(t, R) and y(t, L) at the position and direction of the virtual head.
[0092]
The method of Reference 4 can follow only the rotational movement of the head and cannot cope with its translational movement.
In the configuration of the present embodiment, however, the rigid-sphere type spherical microphone array can be virtually translated.
The present embodiment therefore makes it possible to generate binaural signals that follow both the rotational and the translational movements of the head.
[0093]
<Other Modifications> The present invention is not limited to the above embodiments and modifications.
For example, the various processes described above may be executed not only in chronological order according to the description but also in parallel or individually, depending on the processing capability of the apparatus executing the processing or as necessary.
Other changes can be made as appropriate without departing from the spirit of the present invention.
[0094]
<Program and Recording Medium> The various processing functions of each device described in the above embodiments and modifications may be realized by a computer. In that case, the processing content of the functions that each device should have is described by a program, and by executing this program on the computer, the various processing functions of each device are realized on the computer.
[0095]
The program describing the processing content can be recorded on a computer-readable recording medium. As the computer-readable recording medium, any medium such as a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory may be used.
[0096]
This program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. Furthermore, the program may be stored in the storage device of a server computer and distributed by transferring it from the server computer to another computer via a network.
[0097]
A computer that executes such a program, for example, first stores the program recorded on the portable recording medium, or the program transferred from the server computer, temporarily in its own storage unit. At the time of executing the processing, the computer reads the program stored in its storage unit and executes the processing according to the read program. As another form of executing the program, the computer may read the program directly from the portable recording medium and execute the processing according to it, or, each time the program is transferred from the server computer to the computer, the processing according to the received program may be executed sequentially. The above-described processing may also be executed by a so-called ASP (Application Service Provider) type service that realizes the processing functions only through execution instructions and acquisition of results, without transferring the program from the server computer to the computer. Note that the program includes information that is provided for processing by a computer and conforms to a program (such as data that is not a direct command to the computer but has the property of defining the processing of the computer).
[0098]
In addition, although each device is described as being configured by executing a predetermined program on a computer, at least part of the processing content may be realized in hardware.