Patent Translate Powered by EPO and Google

Notice
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate, complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or financial decisions, should not be based on machine-translation output.

DESCRIPTION JP2017112415

Sound field estimation device, method and program thereof

Kind Code: A1

The present invention provides a sound field estimation apparatus, a method and a program whose spatially effective extrapolation area is larger than in the prior art. A sound field estimation apparatus 200 includes a plane wave decomposition unit 213 that estimates a vector consisting of the intensities of the plane waves composing the sound field, using the spherical wave spectrum u_{n,m}(ω, r_a) (n = 0, 1, ..., N; m = -n, -n+1, ..., n) calculated from the collected sound signals y(t, r_a, Ω_j) (j = 1, 2, ..., J) of a spherical microphone array 1 provided with microphones at J positions (r_a, Ω_j), where r is the radius and Ω the angular part of polar coordinates; and an extrapolation estimation unit 216 that estimates the frequency-domain collected sound signal u^(ω, r, Ω) at the position (r, Ω) of a virtual microphone, using the estimated vector a(ω) of plane wave intensities and the virtual microphone position (r, Ω). [Selected figure] Figure 2

[0001] The present invention relates to a technique for estimating the sound collection signal that would be obtained if a microphone were placed at another position, using the sound collection signals of microphones placed at given positions.
[0002] In recent years, audio reproduction technology has expanded from 2-channel stereo to 5.1-channel reproduction, and research and development on 22.2-channel reproduction and wave field synthesis methods is advancing further, greatly improving the realism of reproduction itself and widening the area over which high realism is reproduced.
11-04-2019 1
[0003] In order to evaluate and verify such multi-channel audio reproduction methods, it is important to measure the reproduced sound field. For example, in the wave field synthesis method, it is necessary to compare the actually recorded sound field with the reproduced sound field and grasp the difference. This is because various factors affect the reproduction accuracy of the sound field, such as the signal processing that converts the recorded sound field into reproduced signals, the encoding and decoding of the recorded signals, and the acoustic characteristics of the room in which the reproducing apparatus is installed, and because it is important to establish a method with high reproduction accuracy.
[0004] (Conventional method 1) As a method of measuring a sound field, it is conceivable to arrange microphones locally on a part of the target measurement area and estimate the sound field of the surrounding area from the measurement result. As an example, spherical microphone arrays are being studied. A spherical microphone array is a microphone array in which several tens or more microphone elements are arranged on a spherical surface of radius r_a, where r_a ranges from several centimeters to several tens of centimeters.
[0005] FIG. 1 shows the signal flow of a sound field estimation process using a spherical microphone array 1 in the prior art. The time domain signals y(t, r_a, Ω_j) collected by the J microphones disposed on the spherical surface are converted by the short time Fourier transform unit 111 into the frequency domain signals u(i, ω, r_a, Ω_j).
Here t is time, i is a frame index, J is an integer of 2 or more, ω is a time frequency, and j = 1, 2, ..., J. In the following, processing is performed frame by frame, but i is omitted to simplify the notation. Ω_j is the position of the j-th microphone element on the spherical surface, specified by the pair of elevation angle θ_j and azimuth angle φ_j: Ω_j = (θ_j, φ_j).
[0006] The spherical wave spectrum conversion unit 112 obtains the spherical wave spectrum u_{n,m}(ω, r_a) for each frequency ω by the following equation.
[0007]
u_{n,m}(ω, r_a) = Σ_{j=1}^{J} α_j Y_n^m(θ_j, φ_j)* u(ω, r_a, Ω_j)   (1)
[0008] Here α_j is a weight set appropriately so that the product-sum of equation (1) satisfies the orthogonality of the spherical harmonics expressed by the following equation.
[0009]
Σ_{j=1}^{J} α_j Y_n^m(θ_j, φ_j) Y_{n'}^{m'}(θ_j, φ_j)* = δ_{nn'} δ_{mm'}   (2)
[0010] Y_n^m(θ_j, φ_j) is the spherical harmonic of degree n and order m, and * denotes complex conjugation. n = 0, 1, ..., N and m = -n, -n+1, ..., n. δ_{nn'} is 1 when n = n' and 0 when n ≠ n', and δ_{mm'} is 1 when m = m' and 0 when m ≠ m'. To obtain the spherical wave spectrum up to degree N, (N+1)^2 or more microphone elements are required.
[0011] From this point onward we deal with measuring the sound field generated by sound sources located outside the measurement target range, that is, with the interior problem. In other words, the sound field generated by sound sources outside the sphere of the spherical microphone array is measured.
[0012] The sound field is considered in the polar coordinate system (r, Ω) = (r, θ, φ) with the center of the spherical microphone array as the origin.
[0013] The extrapolation estimation unit 116 extrapolates the sound field from radius r_a to radius r at frequency ω according to the following equation, obtaining the collected sound signal u(ω, r, Ω) at the position (r, Ω) = (r, θ, φ).
In other words, the sound collection signals of the microphones disposed on the spherical microphone array are used to estimate the sound field outside the sphere of the spherical microphone array.
[0014]
u(ω, r, Ω) = Σ_{n=0}^{N} Σ_{m=-n}^{n} (b_n(kr) / b_n(kr_a)) u_{n,m}(ω, r_a) Y_n^m(θ, φ)   (3)
[0015] Here k is the wavenumber, k = ω/c (c is the speed of sound), and b_n(·) is the mode intensity function.
[0016] Non-Patent Document 1 treats the case of an open-sphere spherical microphone array, in which the sphere is hollow and the microphone elements are arranged on its surface. In this case the mode intensity function is expressed by the following equation.
[0017]
b_n(kr) = 4π i^n j_n(kr)
[0018] Here i is the imaginary unit and j_n(·) is the spherical Bessel function of degree n. When the spherical microphone array is configured by arranging the microphone elements on the surface of a rigid sphere, the mode intensity function is expressed, based on Non-Patent Document 2, by the following equation.
[0019]
b_n(kr) = 4π i^n ( j_n(kr) - (j_n'(kr_a) / h_n'(kr_a)) h_n(kr) )
[0020] Here h_n(·) is the spherical Hankel function of the first kind of degree n, and A' denotes the derivative of A.
[0021] The short time inverse Fourier transform unit 118 converts the spatially extrapolated sound collection signal from the frequency domain signal u(ω, r, Ω) to the time domain signal y(t, r, Ω) and outputs it.
[0022] In equation (3), b_n(kr)/b_n(kr_a) is applied to the spherical wave spectrum, and a product-sum with Y_n^m(θ, φ) is taken. The product-sum with Y_n^m(θ, φ) corresponds to the inverse spherical wave spectrum conversion, so the spatially extrapolated sound pickup signal u(ω, r, Ω) is a signal in the frequency domain.
[0023] In measurement with an open-sphere microphone array the influence of singular points cannot be avoided: measurement becomes impossible at k and r for which j_n(kr) = 0. Specifically, when j_n(kr) = 0 holds, the output becomes zero even if a sound field is present. A rigid-sphere microphone array, by contrast, has no such singular points.
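As a concrete illustration of the two mode intensity functions and of the singular-point behaviour just described, the following pure-Python sketch evaluates the n = 0 case using the closed forms j_0(x) = sin x / x and y_0(x) = -cos x / x together with the identities j_0' = -j_1 and h_0' = -h_1. The numbers are illustrative only; this is not the patent's implementation.

```python
import cmath
import math

def j0(x): return math.sin(x) / x            # spherical Bessel function j_0
def j1(x): return math.sin(x) / x**2 - math.cos(x) / x
def y0(x): return -math.cos(x) / x           # spherical Neumann function y_0
def y1(x): return -math.cos(x) / x**2 - math.sin(x) / x
def h0(x): return complex(j0(x), y0(x))      # spherical Hankel function, 1st kind
def h1(x): return complex(j1(x), y1(x))

def b0_open(kr):
    """Open-sphere mode intensity for n = 0: b_0(kr) = 4*pi*j_0(kr)."""
    return 4 * math.pi * j0(kr)

def b0_rigid(kr, kra):
    """Rigid-sphere mode intensity for n = 0, using j_0' = -j_1 and h_0' = -h_1."""
    return 4 * math.pi * (j0(kr) - (-j1(kra)) / (-h1(kra)) * h0(kr))

# Open-sphere singular point: j_0(pi) = 0, so the array output vanishes
# there even when a sound field is present; the rigid sphere does not.
print(abs(b0_open(math.pi)))                  # ~0
print(abs(b0_rigid(math.pi, math.pi)))        # clearly nonzero

# Global ~1/kr decay: the |j_0| envelope around kr ~ 10*pi is roughly ten
# times smaller than around kr ~ pi, which is why extrapolation far from
# the array surface (or at high frequency) loses level quickly.
env_near = max(abs(j0(x / 100)) for x in range(315, 628))    # kr in [pi, 2pi]
env_far = max(abs(j0(x / 100)) for x in range(3142, 3456))   # kr in [10pi, 11pi]
print(env_far < env_near / 5)                 # True
```

At kr = π the open-sphere mode intensity vanishes while the rigid-sphere value stays away from zero, and the envelope comparison previews the 1/kr decay discussed below, which motivates the invention.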
Therefore, using a rigid-sphere microphone array as the spherical microphone array is the mainstream.
[0024] Non-Patent Document 1: T. Abhayapala and D. Ward, "Theory and design of high order sound field microphones using spherical microphone array", in Acoustics, Speech, and Signal Processing (ICASSP), IEEE International Conference on, 2002, pp. II-1949. Non-Patent Document 2: Meyer, Jens; Elko, Gary, "A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield", Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on, 2002, pp. II-1781 - II-1784.
[0025] In the prior art, the Bessel function j_n(kr), or the Bessel function j_n(kr) together with the Hankel function h_n(kr), is used when extrapolating the sound field from radius r_a to radius r. According to Reference 1, as a global tendency both functions decrease at a pace of 1/kr as kr increases. (Reference 1) E. G. Williams, "Fourier Acoustics", Springer, 2005, pp. 234-236.
[0026] For example, when r becomes ten times as large as r_a, the extrapolation estimate decreases sharply, to about 1/10. The space area where extrapolation is effective is therefore limited to the vicinity of the spherical microphone array surface. For the same reason, increasing the frequency ω increases k = ω/c and likewise sharply reduces the extrapolation estimate; in other words, as the frequency increases, the space area where extrapolation is effective narrows sharply.
[0027] An object of the present invention is to provide a sound field estimation device, a method and a program whose spatially effective extrapolation area is larger than in the prior art.
[0028] In order to solve the above-mentioned problems, according to one aspect of the present invention, a sound field estimation apparatus is provided. Let j = 1, 2, ..., J, let r_a be the radius of polar coordinates, let θ_j and φ_j be the polar angles, and let ω be an index of time frequency. The apparatus includes: a plane wave decomposition unit that estimates the vector consisting of the intensities of the plane waves composing the sound field, using the spherical wave spectrum u_{n,m}(ω, r_a) (n = 0, 1, ..., N; m = -n, -n+1, ..., n) calculated from the collected sound signals y(r_a, θ_j, φ_j) of a spherical microphone array having microphones at the J positions (r_a, θ_j, φ_j); and an extrapolation estimation unit that, with r the radius of polar coordinates and θ and φ the polar angles, estimates the frequency-domain collected sound signal u^(ω, r, θ, φ) at the position (r, θ, φ) of a virtual microphone, using the estimated vector a(ω) of plane wave intensities and the virtual microphone position (r, θ, φ).
[0029] In order to solve the above problems, according to another aspect of the present invention, a sound field estimation method is provided. Let j = 1, 2, ..., J, let r_a be the radius of polar coordinates, let θ_j and φ_j be the polar angles, and let ω be an index of time frequency. The method includes: a plane wave decomposition step of estimating the vector consisting of the intensities of the plane waves composing the sound field, using the spherical wave spectrum u_{n,m}(ω, r_a) (n = 0, 1, ..., N; m = -n, -n+1, ..., n) calculated from the collected sound signals y(r_a, θ_j, φ_j) of a spherical microphone array having microphones at the J positions (r_a, θ_j, φ_j); and an extrapolation estimation step of estimating, with r the radius of polar coordinates and θ and φ the polar angles, the frequency-domain collected sound signal u^(ω, r, θ, φ) at the position (r, θ, φ) of a virtual microphone, using the estimated vector a(ω) of plane wave intensities and the virtual microphone position (r, θ, φ).
[0030] According to the present invention, the space area where extrapolation is effective is larger than in the prior art.
[0031] FIG. 1 is a functional block diagram of a sound field estimation apparatus according to the prior art. FIG. 2 is a functional block diagram of the sound field estimation apparatus according to the first embodiment. FIG. 3 shows an example of the processing flow of the sound field estimation apparatus according to the first embodiment. FIG. 4 shows an outline of the positions of the virtual microphones in the first embodiment and its first and second modifications. FIG. 5 is a functional block diagram of the sound field estimation apparatus according to the second embodiment. FIG. 6 shows an example of the processing flow of the sound field estimation apparatus according to the second embodiment. FIG. 7 is a functional block diagram of the sound field estimation apparatus according to the third embodiment. FIG. 8 shows an example of the processing flow of the sound field estimation apparatus according to the third embodiment.
[0032] Hereinafter, embodiments of the present invention will be described. In the drawings used in the following description, constituent parts having the same functions and steps performing the same processing are given the same reference numerals, and redundant description is omitted. Symbols such as "^" and "<->" used in the text should properly be written directly above the preceding character, but are written after it here owing to the limitations of text notation; in the formulas they are written in their proper positions. Unless otherwise noted, processing defined per element of a vector or matrix is applied to all elements of that vector or matrix.
[0033] <Point of First Embodiment> In the present embodiment, instead of extrapolating the spherical wave spectrum, a collection of plane waves constituting the sound field is obtained from the spherical wave spectrum, and the sound field is extrapolated using these plane waves.
By going through plane waves, the space area where extrapolation is effective can be expanded greatly. The method is described below.
[0034] <Sound Field Estimation Device 200 According to First Embodiment> FIG. 2 shows a functional block diagram of the sound field estimation device 200 according to the first embodiment, and FIG. 3 shows its process flow.
[0035] The sound field estimation apparatus 200 includes a short time Fourier transform unit 211, a spherical wave spectrum conversion unit 212, a plane wave decomposition unit 213, an extrapolation estimation unit 216, and a short time inverse Fourier transform unit 218.
[0036] The sound field estimation device 200 receives the time-domain collected sound signals y(t, r_a, Ω_j) (j = 1, 2, ..., J) of the spherical microphone array 1 and the position information (r, Ω) of a virtual microphone, and estimates and outputs the time-domain collected sound signal y(t, r, Ω) at the position of the virtual microphone. In the spherical microphone array 1, J microphones are arranged on a spherical surface of radius r_a, and the position of the j-th microphone is specified by Ω_j = (θ_j, φ_j). That is, with the center of the spherical surface formed by the spherical microphone array 1 as the origin, r_a being the radius of polar coordinates and θ_j and φ_j the polar angles, the position of the j-th microphone is represented by (r_a, θ_j, φ_j).
[0037] <Short-Time Fourier Transform Unit 211> The short-time Fourier transform unit 211 receives the time-domain collected signals y(t, r_a, Ω_j) (j = 1, 2, ..., J) and converts them by short-time Fourier transform into the frequency-domain collected signals u(i, ω, r_a, Ω_j) (where i is a frame number, ω = 1, 2, ..., F, and j = 1, 2, ..., J) (S211), and outputs them. The subsequent processing is performed for each frame i, but the frame number i is omitted to simplify the description.
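Equation (1) and the orthogonality condition (2) of (Conventional method 1), which the spherical wave spectrum conversion unit 212 described below also uses, can be illustrated with a minimal pure-Python sketch. The six-microphone octahedral layout and the equal weights α_j = 4π/J are assumptions chosen here because they satisfy (2) exactly for degrees n ≤ 1; real arrays use many more elements.

```python
import cmath
import math

# Hypothetical 6-element array: microphones at the octahedron vertices
# (theta_j, phi_j); with equal weights alpha_j = 4*pi/J this layout
# satisfies the orthogonality of equation (2) exactly for n <= 1.
MICS = [(math.pi / 2, 0.0), (math.pi / 2, math.pi),
        (math.pi / 2, math.pi / 2), (math.pi / 2, 3 * math.pi / 2),
        (0.0, 0.0), (math.pi, 0.0)]
ALPHA = 4 * math.pi / len(MICS)

def sph_harm(n, m, theta, phi):
    """Complex spherical harmonics Y_n^m, closed forms for n <= 1 only."""
    if (n, m) == (0, 0):
        return complex(0.5 / math.sqrt(math.pi))
    if (n, m) == (1, 0):
        return complex(math.sqrt(3.0 / (4 * math.pi)) * math.cos(theta))
    if n == 1 and abs(m) == 1:
        c = math.sqrt(3.0 / (8 * math.pi)) * math.sin(theta)
        return (-c if m == 1 else c) * cmath.exp(1j * m * phi)
    raise ValueError("only n <= 1 is implemented in this sketch")

def spectrum(u, n, m):
    """Equation (1): u_{n,m} = sum_j alpha_j * Y_n^m(theta_j, phi_j)^* * u_j."""
    return sum(ALPHA * sph_harm(n, m, th, ph).conjugate() * uj
               for (th, ph), uj in zip(MICS, u))

# A spatially uniform pressure field excites only the n = 0 component.
u = [1.0 + 0.0j] * len(MICS)
print(abs(spectrum(u, 0, 0)))   # sqrt(4*pi) ~ 3.545
print(abs(spectrum(u, 1, 0)))   # ~0
```

The uniform field maps entirely onto u_{0,0} and leaves the n = 1 components at (numerically) zero, which is the behaviour the quadrature weights α_j are designed to guarantee.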
Any method that converts a time domain signal into a frequency domain signal may be used in place of the short time Fourier transform.
[0038] <Spherical Wave Spectrum Conversion Unit 212> The spherical wave spectrum conversion unit 212 receives the frequency-domain collected signals u(i, ω, r_a, Ω_j) (ω = 1, 2, ..., F, j = 1, 2, ..., J), obtains the spherical wave spectrum u_{n,m}(ω, r_a) for each frequency ω by equation (1) (S212), and outputs the spherical wave spectrum u_{n,m}(ω, r_a) (n = 0, 1, ..., N; m = -n, -n+1, ..., n; ω = 1, 2, ..., F).
[0039]-[0040] α_j and Y_n^m(θ_j, φ_j) are as described in (Conventional method 1) above.
[0041] <Plane Wave Decomposition Unit 213> The plane wave decomposition unit 213 receives the spherical wave spectrum u_{n,m}(ω, r_a) (n = 0, 1, ..., N; m = -n, -n+1, ..., n; ω = 1, 2, ..., F), estimates the vector consisting of the intensities of the plane waves composing the sound field (S213), and outputs the estimated value a(ω) (ω = 1, 2, ..., F). For example, the plane wave decomposition unit 213 first solves the following convex optimization problem using the L1 norm to obtain the collection of plane waves that constitute the sound field.
[0042]-[0043]
a^(ω) = argmin_{a(ω)} ||a(ω)||_1 subject to u(ω) = D(ω) a(ω),
where u(ω) is the vector whose elements are the spherical wave spectra u_{n,m}(ω, r_a). The matrix D(ω) is given by the following equation.
[0044]-[0045]
D(ω) = [ d_1(ω) d_2(ω) ... d_{L'}(ω) ]
The column vector d_{l'}(ω) in the l'-th column of the matrix D(ω) is the vector of the spherical wave spectrum calculated from the collected signals y(t, r_a, Ω_j) of the spherical microphone array when a plane wave of amplitude 1 is incident from the single incident angle Ω_{l'}, that is, elevation angle θ_{l'} and azimuth angle φ_{l'}. For example, the L' incident angles Ω_{l'} are set so that the L' plane waves sample all directions uniformly; for instance, they are set so that plane waves are incident from the directions of the vertices of a regular polyhedron.
The column vector for the l'-th incident angle is given by the following equation.
[0046]
d_{l'}(ω) = [ b_0(kr_a) Y_0^0(Ω_{l'})*  b_1(kr_a) Y_1^{-1}(Ω_{l'})*  b_1(kr_a) Y_1^0(Ω_{l'})*  ...  b_N(kr_a) Y_N^N(Ω_{l'})* ]^T
When spherical harmonics up to degree N are used, the size of this vector is (N+1)^2. Note that this size needs to be smaller than the number J of microphone elements of the spherical microphone array 1 (thus (N+1)^2 < J).
[0047] The estimated value a(ω) is the vector composed of the estimated intensities of the plane waves: a(ω) = [a_1(ω), a_2(ω), ..., a_{l'}(ω), ..., a_{L'}(ω)]^T.
[0048] Convex optimization with the L1 norm yields, as the solution a(ω), a sparse vector containing many zeros. Therefore, as shown in Reference 2, plane waves can be extracted well even in the redundant case where the number L' of plane waves assumed in advance greatly exceeds the number of microphones. (Reference 2) A. Wabnitz, N. Epain, A. van Schaik, C. Jin, "Reconstruction of spatial sound field using compressed sensing", in Acoustics, Speech, and Signal Processing (ICASSP), IEEE International Conference on, 2011. As an example, for a spherical microphone array with 32 microphone elements, N ≤ 4, and L' can be set to one hundred or more.
[0049] <Extrapolation Estimation Unit 216> The extrapolation estimation unit 216 receives the estimated value a(ω) and the position information (r, Ω) of the virtual microphone, estimates the frequency-domain collected sound signal u^(ω, r, Ω) (ω = 1, 2, ..., F) at the position (r, Ω) of the virtual microphone by the following equation (S216), and outputs it.
[0050]
u^(ω, r, Ω) = Σ_{l=1}^{L'} a_l(ω) e^{i k⃗_l · r⃗}
[0051] Here · denotes the inner product, and k⃗_l is the wave number vector corresponding to the incident direction of the l-th plane wave, expressed by the following equation.
[0052]
k⃗_l = k [ sin θ_l cos φ_l  sin θ_l sin φ_l  cos θ_l ]^T
[0053] ^T denotes transposition. r⃗ is the representation of the designated position (r, Ω) in the XYZ three-dimensional space and is expressed by the following equation.
[0054]
r⃗ = r [ sin θ cos φ  sin θ sin φ  cos θ ]^T
[0055] The position information (r, Ω) of the virtual microphone is input, for example, by the user of the sound field estimation apparatus 200.
[0056] <Short Time Inverse Fourier Transform Unit 218> The short time inverse Fourier transform unit 218 receives the frequency-domain collected sound signal u^(ω, r, Ω) (ω = 1, 2, ..., F), converts it by short time inverse Fourier transform into the time-domain collected sound signal y(t, r, Ω) (S218), and outputs it. A method corresponding to the conversion method used in the short time Fourier transform unit 211 may be used.
[0057] <Effects> With the above configuration, a sound field estimation apparatus can be realized in which the space area where extrapolation is effective is larger than in the prior art.
[0058] <Modification 1> In the first embodiment, one virtual microphone is assumed and the signal picked up at its position is estimated. Naturally, however, virtual microphones may be assumed at a plurality of positions. Moreover, by arranging virtual microphones on the same spherical surface, an open-sphere virtual microphone array of radius r can be configured. For example, when a virtual microphone array consisting of P virtual microphones is configured, the sound field estimation apparatus 200 receives the position information (r, Ω_p) (p = 1, 2, ..., P) of the P virtual microphones and outputs the collected signals y(t, r, Ω_p) (p = 1, 2, ..., P).
[0059] When the number P of virtual microphones in the virtual microphone array is 1, this reduces to the first embodiment.
[0060] <Modification 2> In Modification 1 of the first embodiment, the center of the spherical microphone array 1 coincides with the center of the virtual microphone array.
However, the center of the virtual microphone array can be shifted as follows. With the center of the virtual microphone array at the position D = [d_x d_y d_z] viewed from the center (origin) of the spherical microphone array 1, and with the p-th microphone position on the spherical surface of the virtual microphone array denoted Ω_p = (θ_p, φ_p) (p = 1, 2, ..., P), the extrapolation estimation unit 216 estimates the frequency-domain collected sound signal u^(ω, r, Ω_p) by the following equation.
[0061]
u^(ω, r, Ω_p) = Σ_{l=1}^{L'} a_l(ω) e^{i k⃗_l · (r⃗_p + D⃗)}
Here r⃗_p is the position (r, θ_p, φ_p) expressed in the XYZ three-dimensional space, and D⃗ is the vector [d_x d_y d_z]^T.
[0062] The sound field estimation apparatus 200 receives the center D of the virtual microphone array and the position information (r, Ω_p) (p = 1, 2, ..., P) of the P virtual microphones, and outputs the collected sound signals y(t, r, Ω_p) (p = 1, 2, ..., P).
[0063] FIG. 4 shows an outline of the positions of the virtual microphones in the first embodiment, its Modification 1, and its Modification 2. Since Modification 1 is obtained when the center [d_x d_y d_z] of the virtual microphone array of Modification 2 is [0 0 0], Modification 1 can also be regarded as an example of Modification 2.
[0064] <Second Embodiment> The description focuses on the parts that differ from Modification 2 of the first embodiment.
[0065] In Modification 2 of the first embodiment, an open-sphere type microphone array is virtually assumed and its collected sound signals are estimated. In the second embodiment, based on the configuration of Modification 2 of the first embodiment, a rigid-sphere type microphone array is virtually assumed instead of the open-sphere type, and its collected sound signals are estimated.
[0066] FIG. 5 shows a functional block diagram of the sound field estimation apparatus 300 according to the second embodiment, and FIG. 6 shows its processing flow.
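The plane wave extrapolation used by the extrapolation estimation unit 216 in Modification 2, and again in this second embodiment, amounts to re-synthesising each plane wave at the shifted virtual position: u^(ω, r, Ω_p) = Σ_l a_l(ω) exp(i k⃗_l · (r⃗_p + D⃗)). A small sketch with toy values (not the patent's code) also shows why this extrapolation does not decay with distance:

```python
import cmath
import math

def unit_vec(theta, phi):
    """Unit direction vector for polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def extrapolate(a, k_vecs, r_vec, d_vec=(0.0, 0.0, 0.0)):
    """u^ = sum_l a_l * exp(i * k_l . (r_p + D)), as in Modification 2."""
    pos = tuple(rc + dc for rc, dc in zip(r_vec, d_vec))
    return sum(al * cmath.exp(1j * dot(kl, pos)) for al, kl in zip(a, k_vecs))

c, freq = 343.0, 1000.0                  # speed of sound [m/s], 1 kHz (toy values)
k = 2 * math.pi * freq / c               # wavenumber k = omega/c
kl = tuple(k * x for x in unit_vec(math.pi / 3, math.pi / 4))
a = [2.0]                                # one plane wave of intensity 2

p1 = extrapolate(a, [kl], (0.0, 0.0, 0.0))
p2 = extrapolate(a, [kl], (0.1, 0.0, 0.0), d_vec=(0.0, 0.2, 0.0))
# A plane wave keeps |u| constant everywhere: unlike the b_n(kr)/b_n(kr_a)
# ratio of equation (3), plane wave extrapolation does not decay with distance.
print(abs(p1), abs(p2))
```

Only the phase changes between the two virtual positions, by exactly k⃗_l · Δr⃗; the magnitude stays at the plane wave intensity, which is the property that enlarges the spatially effective extrapolation area.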
[0067] The sound field estimation apparatus 300 includes a short time Fourier transform unit 211, a spherical wave spectrum conversion unit 212, a plane wave decomposition unit 213, an extrapolation estimation unit 216, and a short time inverse Fourier transform unit 218, and further includes an array type conversion unit 317.
[0068] First, as the virtual spherical microphone array, sound is assumed to be collected by the dual open sphere microphone array of Reference 3. In this microphone array the microphone elements are disposed on a spherical surface of radius r or on a spherical surface of radius αr, and α = 1.2 is recommended. (Reference 3) I. Balmages, B. Rafaely, "Open-Sphere Designs for Spherical Microphone Arrays", IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 2, pp. 727-732, 2007.
[0069] For example, let Q = P × 2, and let the positions of P of the Q virtual microphone elements be the same as in Modification 2. That is, the center of the virtual microphone array is at the position [d_x d_y d_z] with respect to the center (origin) of the spherical microphone array 1, and the position of the p-th virtual microphone on the spherical surface of the virtual microphone array is Ω_p = (θ_p, φ_p). The remaining P virtual microphones among the Q virtual microphone elements are arranged on a sphere with center [d_x d_y d_z] and radius αr, and the position of the q-th virtual microphone is Ω_q = (θ_q, φ_q). That is, the q-th microphone and the p-th microphone lie in the same direction from the center of the virtual microphone array; the radius to the p-th microphone is r, and the radius to the q-th microphone is αr.
[0070] The extrapolation estimation unit 216 receives the estimated value a(ω), the center D of the virtual microphone array, the P pieces of position information (r, Ω_p) (p = 1, 2, ..., P) and the P pieces of position information (αr, Ω_q) (q = P+1, P+2, ..., Q), estimates the frequency-domain collected sound signals u^(ω, r, Ω_p) (p = 1, 2, ..., P) and u^(ω, αr, Ω_q) (q = P+1, P+2, ..., Q) of the virtual microphones at the positions (r, Ω_p) and (αr, Ω_q) (S216), and outputs them. Instead of the P pieces of position information (αr, Ω_q) (q = P+1, P+2, ..., Q), only α may be received.
[0071] <Array Type Conversion Unit 317> The array type conversion unit 317 receives the frequency-domain collected sound signals u^(ω, r, Ω_p) (p = 1, 2, ..., P) and u^(ω, αr, Ω_q) (q = P+1, P+2, ..., Q) and converts them into the spherical wave spectra u_{n,m}(ω, r) and u_{n,m}(ω, αr).
[0072]-[0073] In an open-sphere spherical microphone array, measurement becomes impossible at k and r where j_n(kr) = 0 because of singular points. However, by selecting whichever of u_{n,m}(ω, r) and u_{n,m}(ω, αr) has the larger absolute value, the dual open-sphere spherical microphone array can avoid the influence of the singular points.
[0074]-[0080] Therefore, with b_n^open the open-sphere mode intensity function and b_n^rigid the rigid-sphere mode intensity function, the array type conversion unit 317 determines the spherical wave spectrum v_{n,m}(ω, r) as
v_{n,m}(ω, r) = (b_n^rigid(kr) / b_n^open(kr)) u_{n,m}(ω, r)  when |u_{n,m}(ω, r)| > |u_{n,m}(ω, αr)|, and
v_{n,m}(ω, r) = (b_n^rigid(kr) / b_n^open(kαr)) u_{n,m}(ω, αr)  when |u_{n,m}(ω, r)| ≤ |u_{n,m}(ω, αr)|.
[0081]-[0083] The array type conversion unit 317 finally applies the inverse spherical wave spectrum conversion
v(ω, r, Ω_p) = Σ_{n=0}^{N} Σ_{m=-n}^{n} v_{n,m}(ω, r) Y_n^m(θ_p, φ_p).
As a result, the sound collection signal that would be obtained if a rigid-sphere type microphone array of radius r were installed at the position of the virtually installed dual open-sphere spherical microphone array is obtained in the frequency domain.
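The selection rule above can be sketched as follows for n = 0. The j_0 values stand in for the magnitudes of the open-sphere spectra, to which they are proportional at this order, and the numbers are illustrative only:

```python
import math

def j0(x):
    return math.sin(x) / x               # spherical Bessel function j_0

def pick_spectrum(u_r, u_ar):
    """Keep the spectrum of whichever open sphere has the larger magnitude,
    so a zero of j_0(kr) on one sphere is covered by the other sphere."""
    return u_r if abs(u_r) > abs(u_ar) else u_ar

alpha = 1.2                              # radius ratio recommended by Reference 3
k, r = 1.0, math.pi                      # toy values putting kr exactly on a j_0 zero
u_r = j0(k * r)                          # ~0: the radius-r sphere is blind here
u_ar = j0(k * alpha * r)                 # nonzero: the radius-alpha*r sphere is not
print(abs(pick_spectrum(u_r, u_ar)) > 0.1)   # True
```

Because the zeros of j_0(kr) and j_0(kαr) do not coincide for α = 1.2, at least one of the two spheres always delivers a usable spectrum, which is exactly why the dual open-sphere design escapes the singular points.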
The array type conversion unit 317 outputs the frequency-domain signals v(ω, r, Ω_p) (p = 1, 2, ..., P) to the short time inverse Fourier transform unit 218.
[0084] <Effects> With this configuration, the same effects as in Modification 2 of the first embodiment can be obtained. Furthermore, the collected sound signals of a virtually installed rigid-sphere type microphone array can be obtained.
[0085] <Third Embodiment> An application of a rigid-sphere microphone array to virtual reality is shown in Reference 4. (Reference 4) R. Duraiswami, D. N. Zotkin, Z. Li, E. Grassi, N. A. Gumerov, L. S. Davis, "High Order Spatial Audio Capture and its Binaural Head-Tracked Playback over Headphones with HRTF Cues", Proceedings of the 119th Convention of the AES, 2005.
[0086] Reference 4 shows a method that takes the sound pickup signals of a fixed rigid-sphere microphone array and the direction of a virtual head as input, and outputs the signals that would be heard by the right ear and the left ear (binaural signals) when the head is turned in the designated direction. Since the spherical microphone array picks up sound in all directions, binaural signals corresponding to any designated direction can be generated without moving the microphone elements or the microphone array. That is, if the head rotation of the listener is measured and input in real time, binaural signals that follow the rotational movement can be generated and presented to the listener.
[0087] The second embodiment showed a method of obtaining the collected sound signals of a virtually installed rigid-sphere microphone array. The configuration of the present embodiment combines these collected sound signals with the binaural signal generation method, as shown in FIG. 7.
[0088] The description focuses on the parts that differ from the second embodiment.
[0089] FIG. 7 shows a functional block diagram of the sound field estimation apparatus 400 according to the third embodiment, and FIG. 8 shows its process flow.
[0090] The sound field estimation apparatus 400 includes a short time Fourier transform unit 211, a spherical wave spectrum conversion unit 212, a plane wave decomposition unit 213, an extrapolation estimation unit 216, an array type conversion unit 317, and a short time inverse Fourier transform unit 218, and further includes a binaural signal generation unit 419.
[0091] <Binaural Signal Generation Unit 419> The binaural signal generation unit 419 receives a virtual head direction (posture) and the time-domain collected sound signals y(t, r, Ω_p) (p = 1, 2, ..., P; corresponding to the sound pickup signals of a rigid-sphere type spherical microphone array), generates from these signals, for example by the method described in Reference 4, the binaural signals y(t, R) and y(t, L) at the position and direction of the virtual head (S419), and outputs them as the output values of the sound field estimation apparatus 400. The position of the virtual head corresponds to the center D = [d_x d_y d_z] of the virtual microphone array, and the time-domain collected signals y(t, r, Ω_p) correspond to the sound pickup signals of the rigid-sphere spherical microphone array at that position. Therefore, the binaural signal generation unit 419 can generate, from the virtual head direction (posture) and the time-domain collected sound signals y(t, r, Ω_p), the binaural signals y(t, R) and y(t, L) at the position and direction of the virtual head.
[0092] The method of Reference 4 can follow only the rotational movement of the head and cannot cope with its translational movement. In the configuration of the present embodiment, however, the rigid-sphere type spherical microphone array can be virtually translated.
Therefore, the present embodiment makes it possible to generate binaural signals that follow both the rotational and the translational movements of the head.
[0093] <Other Modifications> The present invention is not limited to the above embodiments and modifications. For example, the various processes described above may be executed not only in chronological order as described, but also in parallel or individually, according to the processing capability of the apparatus executing the processes or as needed. Other changes can be made as appropriate without departing from the spirit of the present invention.
[0094] <Program and Recording Medium> The various processing functions of each device described in the above embodiments and modifications may be realized by a computer. In that case, the processing contents of the functions that each device should have are described by a program, and by executing this program on a computer, the various processing functions of each device are realized on the computer.
[0095] The program describing the processing contents can be recorded on a computer-readable recording medium. The computer-readable recording medium may be any medium such as a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.
[0096] The program is distributed, for example, by selling, transferring or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. Furthermore, the program may be stored in a storage device of a server computer and distributed by transferring it from the server computer to another computer via a network.
[0097] For example, a computer that executes such a program first stores the program recorded on a portable recording medium, or the program transferred from the server computer, temporarily in its own storage unit.
Then, at the time of executing a process, the computer reads the program stored in its storage unit and executes processing according to the read program. As another form of execution, the computer may read the program directly from the portable recording medium and execute processing according to it; alternatively, each time the program is transferred from the server computer to the computer, processing according to the received program may be executed sequentially. The above-described processing may also be executed by a so-called ASP (Application Service Provider) type service, which realizes the processing functions only through execution instructions and result acquisition, without transferring the program from the server computer to the computer. Note that the program includes information that is provided for processing by a computer and is equivalent to a program (such as data that is not a direct command to the computer but has a property that defines the processing of the computer).
[0098] Although each device is configured by executing a predetermined program on a computer, at least a part of the processing contents may be realized by hardware.
