Patent Translate
Powered by EPO and Google
Notice: This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate, complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or financial decisions, should not be based on machine-translation output.
DESCRIPTION JP2015523609
Abstract: In dual-microphone voice reverberation reduction, the direction of arrival of the direct sound need not be estimated accurately, and the microphones are not required to be closely matched. A transfer function h(t) from the auxiliary microphone to the main microphone is calculated from the main microphone input signal x2(t) and the auxiliary microphone input signal x1(t), and the tail part hr(t) of h(t) is obtained. The strength of the reverberation is judged from h(t) to calculate the adjustment factor β of the gain function. x1(t) is convolved with hr(t) to obtain a late reverberation estimation signal of x2(t). The gain function is calculated from the frequency spectrum of x2(t), β, and the frequency spectrum of the late reverberation estimation signal; the frequency spectrum of x2(t) is multiplied by the gain function to obtain the dereverberated frequency spectrum of x2(t); and a frequency-to-time conversion yields the dereverberated time domain signal of x2(t). The late reverberation is thus removed from the main microphone input signal.
[Selected figure] Figure 3
Voice reverberation reduction method and apparatus based on dual microphones
[0001] The present invention relates to the field of speech enhancement, and more particularly to a method and apparatus for reducing speech reverberation based on dual microphones.
[0002] As a sound signal propagates in a room, it is reflected by hard surfaces such as walls and the floor. The sound reaching the microphone therefore contains, in addition to the direct sound transmitted straight from the sound source, acoustic signals that arrive after one or more reflections. These non-direct sounds constitute the reverberation signal. An acoustic signal that has undergone one or a small number of reflections is called an early reflection signal; early reflection signals constitute the early reverberation signal, which can have an enhancing effect on speech. An acoustic signal that has undergone many reflections is called a late reflection signal; late reflection signals constitute the late reverberation signal, and strong late reverberation reduces speech intelligibility.
[0003] In some hands-free voice communications, the talker is far from the microphone, and reverberation in the room degrades speech intelligibility and therefore call quality. Techniques for reducing reverberation and enhancing speech intelligibility are thus needed. The signal received by the microphone includes the direct sound and the reverberation signal, and, as described above, the reverberation is further divided into early reverberation and late reverberation. It is mainly the late reverberation that reduces speech intelligibility, whereas the early reverberation usually has an enhancing effect on speech. The key to improving clarity is therefore to reduce the late reverberation signal.
[0004] Among the various dereverberation techniques, dual-microphone dereverberation methods based on spectral subtraction have gained wide attention.
In the conventional dual-microphone spectral subtraction dereverberation method, an adaptive beamforming (GSC) structure provides two signal paths: the first path is the output of the delay-and-sum beamformer, and the second path is the output of the blocking matrix. An adaptive filter uses the energy envelopes of the two path signals to estimate the reverberation in the first path, and the reverberation is then removed by spectral subtraction. This method has the following disadvantages.
1) The early reverberation is also removed, so the processed sound becomes thin.
2) The strength of the reverberation is not judged, and the same spectral subtraction is applied whatever the reverberation level, so voice quality may be damaged when the reverberation is weak and speech intelligibility is already high.
3) The direction of arrival of the direct sound must be estimated accurately in order to separate the direct sound, so the microphones must be closely matched, which severely restricts the acoustic design.
[0005] SUMMARY OF THE INVENTION In view of the above problems, the present invention provides a dual-microphone-based voice reverberation reduction method and apparatus that eliminate, or at least partially eliminate, those problems.
[0006] According to one aspect of the present invention, there is provided a dual-microphone-based voice reverberation reduction method that performs the following processing for each frame: receiving the main microphone input signal and the auxiliary microphone input signal, and calculating the transfer function h(t) from the auxiliary microphone to the main microphone from these two signals; obtaining the tail part hr(t) of the transfer function h(t); judging the strength of the reverberation from the transfer function h(t) and calculating the adjustment factor β of the gain function; convolving the auxiliary microphone input signal with hr(t) to obtain a late reverberation estimation signal of the main microphone input signal; converting the late reverberation estimation signal from the time domain to the frequency domain to obtain the late reverberation spectrum of the main microphone input signal; converting the main microphone input signal from the time domain to the frequency domain to obtain the frequency spectrum of the main microphone input signal; calculating the gain function from the frequency spectrum of the main microphone input signal, the adjustment factor β of the gain function, and the late reverberation spectrum of the main microphone input signal; multiplying the frequency spectrum of the main microphone input signal by the gain function to obtain the dereverberated frequency spectrum of the main microphone input signal; converting the dereverberated frequency spectrum from the frequency domain to the time domain to obtain the dereverberated time domain signal of the main microphone input signal; and overlap-adding the dereverberated time domain signals of successive frames and outputting the continuous dereverberated signal of the main microphone input signal.
[0007] According to another aspect of the present invention, there is provided a dual-microphone-based voice reverberation reduction apparatus that processes the signals received by the main and auxiliary microphones frame by frame and includes a reverberation spectrum estimation unit and a spectral subtraction unit. The reverberation spectrum estimation unit receives the main microphone input signal and the auxiliary microphone input signal; calculates the transfer function h(t) from the auxiliary microphone to the main microphone from these two signals; obtains the tail part hr(t) of the transfer function h(t); judges the strength of the reverberation from the transfer function h(t), calculates the adjustment factor β of the gain function and outputs it to the spectral subtraction unit; convolves the auxiliary microphone input signal with hr(t) to obtain a late reverberation estimation signal of the main microphone input signal; and converts the late reverberation estimation signal from the time domain to the frequency domain to obtain the late reverberation spectrum of the main microphone input signal, which it outputs to the spectral subtraction unit. The spectral subtraction unit receives the main microphone input signal together with the adjustment factor β of the gain function and the late reverberation spectrum of the main microphone input signal output from the reverberation spectrum estimation unit; converts the main microphone input signal from the time domain to the frequency domain to obtain the frequency spectrum of the main microphone input signal; calculates the gain function from the frequency spectrum of the main microphone input signal, the adjustment factor β of the gain function and the late reverberation spectrum of the main microphone input signal; multiplies the frequency spectrum of the main microphone input signal by the gain function to obtain the dereverberated frequency spectrum of the main microphone input signal; converts the dereverberated frequency spectrum from the frequency domain to the time domain to obtain the dereverberated time domain signal of the main microphone input signal; and overlap-adds the dereverberated time domain signals of successive frames and outputs the continuous dereverberated signal of the main microphone input signal.
[0008] As can be seen from the above, the present invention calculates the transfer function h(t) from the auxiliary microphone to the main microphone from the main microphone input signal and the auxiliary microphone input signal, obtains the tail part hr(t) of h(t), judges the strength of the reverberation from h(t) to calculate the adjustment factor β of the gain function, and convolves the auxiliary microphone input signal with hr(t) to obtain the late reverberation estimation signal of the main microphone input signal. The gain function is then calculated from the frequency spectrum of the main microphone input signal, the adjustment factor β and the late reverberation spectrum of the main microphone input signal, and the frequency spectrum of the main microphone input signal is multiplied by the gain function to obtain the dereverberated frequency spectrum. In other words, the late reverberation estimation spectrum is subtracted from the frequency spectrum of the main microphone input signal by spectral subtraction. The late reverberation is therefore removed effectively from the main microphone input signal while the early reverberation is preserved, so the processed sound does not become thin and voice quality is improved. At the same time, while the late reverberation is being estimated, the intensity of the spectral subtraction is adjusted according to the reverberation strength, so that the subtraction is reduced or skipped when the reverberation is weak. This protects voice quality by ensuring that the speech is not damaged when the reverberation is weak and speech intelligibility is already high.
In addition, because this scheme does not need an accurate estimate of the direction of arrival of the direct sound, the microphones are not required to be closely matched, and there is no strict limitation on the acoustic design.
[0009] FIG. 1 is a graph of the transfer function from an excitation signal to a microphone input signal, as given in an embodiment of the present invention.
FIG. 2 is a graph of the transfer function from the auxiliary microphone to the main microphone, as given in an embodiment of the present invention.
FIG. 3 is a schematic diagram of the flow of a dual-microphone-based voice reverberation reduction method according to one embodiment of the present invention.
FIG. 4 is a schematic diagram of the overall flow of a dual-microphone-based voice reverberation reduction method according to another embodiment of the present invention.
FIG. 5A is a graph of the transfer function from the auxiliary microphone to the main microphone when the distance from the sound source to the main microphone in the embodiment of the present invention is 0.5 m.
FIG. 5B is a graph of the transfer function from the auxiliary microphone to the main microphone when the distance from the sound source to the main microphone in the embodiment of the present invention is 1 m.
FIG. 5C is a graph of the transfer function from the auxiliary microphone to the main microphone when the distance from the sound source to the main microphone in the embodiment of the present invention is 2 m.
FIG. 5D is a graph of the transfer function from the auxiliary microphone to the main microphone when the distance from the sound source to the main microphone in the embodiment of the present invention is 4 m.
FIG. 6A is a graph of the amplitude frequency characteristic of the frequency compensation filter when the spacing between the main microphone and the auxiliary microphone in the embodiment of the present invention is 6 cm.
FIG. 6B is a graph of the amplitude frequency characteristic of the frequency compensation filter when the spacing between the main microphone and the auxiliary microphone in the embodiment of the present invention is 18 cm.
A time domain plot of the main microphone input signal in the embodiment of the present invention.
A time domain plot of the main microphone signal after dereverberation in the embodiment of the present invention.
A speech spectrogram of the main microphone input signal in the embodiment of the present invention.
A speech spectrogram of the main microphone signal after dereverberation in the embodiment of the present invention.
An overall block diagram of the dual-microphone-based voice reverberation reduction apparatus in the embodiment of the present invention.
A detailed structure of the dual-microphone-based voice reverberation reduction apparatus in one preferred embodiment of the present invention, with a schematic of its inputs and outputs.
[0010] First, for simplicity, "microphone" is abbreviated as "mike" in the description of the present application. Analysis of the prior art shows that, to reduce reverberation well, the late reverberation must be removed while the direct sound and the early reverberation are protected, which requires accurate and stable late reverberation estimation and reverberation strength judgment.
[0011] The present invention presents a dereverberation scheme based on dual microphones (a main microphone and an auxiliary microphone). It makes full use of the approximate relationship between the reverberation and the transfer function of the dual-microphone space: the reverberation is estimated and its strength is judged from that transfer function, and a spectral subtraction module then achieves satisfactory intelligibility and near-optimal voice quality in a variety of reverberant environments. In addition, because the scheme neither separates the direct sound nor estimates its direction of arrival, it does not require matched microphones, which relaxes the requirements on acoustic design.
[0012] The basic principle of the present invention is as follows. Because the late reverberation is estimated from the tail of the transfer function between the two microphones, the direct sound and the early reverberation are well preserved in the spectral subtraction. Furthermore, while the late reverberation is being estimated, the degree of reverberation of the room is estimated from the energy difference between the head and the tail of the transfer function between the two microphones, and the intensity of the spectral subtraction is adjusted accordingly: the subtraction is reduced or skipped when the reverberation is weak, which protects voice quality.
[0013] In order to clarify the technical means of the present invention, its technical principle is analyzed and described below. Early reverberation can have an enhancing effect on speech, but late reverberation reduces speech intelligibility. FIG. 1 is a graph of the transfer function from the excitation signal to the microphone input signal given in the embodiment of the present invention. Referring to FIG. 1, in the transfer function from the excitation signal to the microphone input signal, the position of the maximum peak corresponds to the direct sound. Usually a point at some distance after the maximum peak is taken as the boundary point between early reflection and late reflection: the portion from the maximum peak to the boundary point corresponds to early reverberation, and the portion after the boundary point corresponds to late reverberation. In FIG. 1, the boundary point is 50 ms.
[0014] Let the excitation signal be s(t), the microphone input signal x(t), the transfer function from the excitation signal to the microphone input signal tf(t), the part of the transfer function corresponding to the direct sound and early reverberation tfd(t), and the part corresponding to the late reverberation tfr(t). The microphone input signal is then the convolution of the excitation signal and the transfer function, x(t) = s(t) * tf(t). The direct sound and early reverberation component of the microphone input signal can be expressed as xd(t) = s(t) * tfd(t), and the late reverberation component as xr(t) = s(t) * tfr(t). Therefore, the microphone input signal can also be expressed as x(t) = s(t) * tf(t) = s(t) * (tfd(t) + tfr(t)) = xd(t) + xr(t).
[0015] Speech clarity may be represented by C50, defined as follows.
$C_{50} = 10 \log_{10} \dfrac{\int_{0}^{50\,\mathrm{ms}} w^{2}(t)\,dt}{\int_{50\,\mathrm{ms}}^{\infty} w^{2}(t)\,dt}$ (1)
Here w(t) is the transfer function from the excitation signal to the microphone input signal. The interval from 0 to 50 ms corresponds to direct sound and early reverberation, and the interval after 50 ms corresponds to late reverberation. The stronger the reverberation, the smaller the value of C50. Since the improvement in C50 before and after dereverberation reflects the dereverberation effect, C50 may be used as an objective evaluation index of dereverberation.
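The clarity index described above can be evaluated numerically from a sampled impulse response. The following is a minimal sketch in Python with NumPy, assuming C50 is the ratio, in dB, of the energy of w(t) in the first 50 ms after the direct sound to the energy after 50 ms. The function name `c50_db`, the choice of the largest peak as time zero, and the synthetic decaying-noise impulse responses are assumptions made for illustration and do not come from the patent.

```python
import numpy as np

def c50_db(w, fs, boundary_ms=50.0):
    """Clarity index C50 in dB for an impulse response w sampled at fs Hz.

    Assumption of this sketch: time zero is the largest peak of w,
    taken to be the arrival of the direct sound.
    """
    w = np.asarray(w, dtype=float)
    start = int(np.argmax(np.abs(w)))                  # direct-sound arrival
    split = start + int(round(boundary_ms * 1e-3 * fs))
    early = np.sum(w[start:split] ** 2)                # direct sound + early reverberation
    late = np.sum(w[split:] ** 2)                      # late reverberation
    return 10.0 * np.log10(early / late)

# Synthetic impulse responses (illustrative only): decaying noise plus a direct peak.
fs = 16000
t = np.arange(int(0.5 * fs)) / fs
rng = np.random.default_rng(0)
noise = rng.standard_normal(t.size)
weak = noise * np.exp(-t / 0.02)     # fast decay: weak reverberation
strong = noise * np.exp(-t / 0.10)   # slow decay: strong reverberation
weak[0] = strong[0] = 10.0           # dominant direct-sound peak at t = 0

print(c50_db(weak, fs), c50_db(strong, fs))
```

The slowly decaying response puts more energy after the 50 ms boundary, so its C50 comes out lower, which matches the statement that stronger reverberation gives a smaller C50.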
[0016] In the present invention, the principle of reverberation estimation based on dual microphones (a main microphone and an auxiliary microphone) is as follows. As shown in FIG. 2, the input signal of the main microphone is denoted by x2(t), the input signal of the auxiliary microphone by x1(t), and the transfer function from the auxiliary microphone to the main microphone by h(t). FIG. 2 is a graph of the transfer function h(t) from the auxiliary microphone to the main microphone given in the embodiment of the present invention.
[0017] The input signal x2(t) of the main microphone is equal to the convolution of the input signal x1(t) of the auxiliary microphone and the transfer function h(t).
$x_2(t) = x_1(t) * h(t)$ (2)
h(t) can be divided into two parts, a head and a tail.
$h(t) = h_d(t) + h_r(t)$ (3)
Here hd(t) represents the head of h(t) and hr(t) represents the tail of h(t). Since the tail part hr(t) of h(t) reflects the multiple reflections of the signal in the space, the convolution of hr(t) with the auxiliary microphone input signal x1(t) is close to the late reverberation component of the main microphone and serves as an estimate of it. A point in h(t) is chosen as the boundary point between hd(t) and hr(t), and setting the values of h(t) before the boundary point to 0 gives hr(t). The distance from the boundary point to the maximum peak of h(t) can be set in the range of 30 ms to 80 ms (experimental values). Experience shows that when the boundary point is 50 ms or more from the maximum peak of h(t), essentially no direct sound or early reflection components remain in the late reverberation estimate of the main microphone, which reduces the damage to the voice. The embodiments of the present invention are therefore described taking 50 ms as the boundary point.
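The head and tail split and the resulting late reverberation estimate can be sketched as follows. This is a minimal illustration in Python with NumPy, assuming h(t) is already available as a sampled array and that the boundary point lies 50 ms after its largest peak; the function names `split_head_tail` and `late_reverb_estimate` are hypothetical and chosen only for this sketch.

```python
import numpy as np

def split_head_tail(h, fs, boundary_ms=50.0):
    """Split h into head hd and tail hr at boundary_ms after its largest peak."""
    h = np.asarray(h, dtype=float)
    boundary = int(np.argmax(np.abs(h))) + int(round(boundary_ms * 1e-3 * fs))
    hd = h.copy()
    hd[boundary:] = 0.0   # head: keep only the values before the boundary point
    hr = h.copy()
    hr[:boundary] = 0.0   # tail: values before the boundary point are set to 0
    return hd, hr

def late_reverb_estimate(x1, hr):
    """Late reverberation estimate for the main microphone: x1(t) convolved with hr(t)."""
    return np.convolve(x1, hr)[: len(x1)]
```

Because convolution is linear, convolving x1(t) with hd(t) and with hr(t) separately and adding the results gives the same signal as convolving x1(t) with h(t), so the tail term isolates the part of x2(t) that this model attributes to late reverberation.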
[0018] In order to further clarify the object, technical means and advantages of the present invention, embodiments of the present invention are described in more detail below with reference to the drawings. FIG. 3 is a schematic diagram of the flow of the dual-microphone-based voice reverberation reduction method according to one embodiment of the present invention. As shown in FIG. 3, the method mainly includes a reverberation spectrum estimation part and a spectral subtraction part; specifically, the following processing is performed for each frame.
[0019] Step 1.1 Receive the main microphone input signal x2(t) and the auxiliary microphone input signal x1(t), and calculate the transfer function h(t) from the auxiliary microphone to the main microphone from these two signals.
Step 1.2 Obtain the tail part hr(t) of the transfer function h(t).
Step 1.3 Judge the strength of the reverberation from the transfer function h(t) and calculate the adjustment factor β of the gain function.
Step 1.4 Convolve the auxiliary microphone input signal with hr(t) to obtain the late reverberation estimation signal of the main microphone input signal.
Step 1.5 Convert the late reverberation estimation signal of the main microphone input signal from the time domain to the frequency domain to obtain the late reverberation spectrum of the main microphone input signal.
[0020] Step 2.1 Convert the main microphone input signal x2(t) from the time domain to the frequency domain to obtain the frequency spectrum X2 of the main microphone input signal.
Step 2.2 Calculate the gain function G from the frequency spectrum X2 of the main microphone input signal, the adjustment factor β of the gain function, and the late reverberation spectrum of the main microphone input signal.
Step 2.3 Multiply the frequency spectrum X2 of the main microphone input signal by the gain function G to obtain the dereverberated frequency spectrum D of the main microphone input signal.
Step 2.4 Convert the dereverberated frequency spectrum D of the main microphone input signal from the frequency domain to the time domain to obtain the dereverberated time domain signal d(t) of the main microphone input signal.
Step 2.5 Overlap-add the dereverberated time domain signals of the main microphone input signal frame by frame, and output the continuous signal xd(t) of the main microphone input signal after reverberation reduction.
[0021] In the method shown in FIG. 3, the late reverberation estimation signal of the main microphone input signal is obtained by convolving the auxiliary microphone input signal with hr(t), and the late reverberation estimation spectrum is then subtracted from the frequency spectrum of the main microphone input signal by spectral subtraction. The late reverberation is therefore removed effectively from the main microphone input signal while the early reverberation is preserved, so voice quality is improved. Further, in the method shown in FIG. 3, the intensity of the spectral subtraction is adjusted according to the reverberation strength while the late reverberation is being estimated, and the subtraction is reduced or skipped when the reverberation is weak. This protects voice quality by ensuring that the speech is not impaired when the reverberation is weak and speech intelligibility is already high. In addition, because the method does not need an accurate estimate of the direction of arrival of the direct sound, the microphones are not required to be closely matched, and there is no strict limitation on the acoustic design.
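Steps 2.1 to 2.5 can be sketched as a short frame loop. The exact gain function is not given in this part of the description, so the sketch below assumes a common power spectral subtraction form, G = max(1 - β·|R|²/|X2|², floor), where R is the late reverberation spectrum; this form, the `floor` parameter, the 512-sample frame with 50% overlap, the square-root Hann window and the function name `dereverb_overlap_add` are all assumptions of the sketch and are not taken from the patent.

```python
import numpy as np

def dereverb_overlap_add(x2, r_hat, beta, frame_len=512, floor=0.1):
    """Frame-wise spectral subtraction with overlap-add (steps 2.1 to 2.5).

    x2    : main microphone input signal
    r_hat : late reverberation estimation signal, same length as x2
    beta  : adjustment factor of the gain function (0 disables subtraction)
    The gain formula below is an ASSUMED form, not the patent's own equation.
    """
    x2 = np.asarray(x2, dtype=float)
    r_hat = np.asarray(r_hat, dtype=float)
    hop = frame_len // 2
    # Square-root periodic Hann window: analysis times synthesis windows
    # sum to 1 at 50% overlap, so a gain of 1 reconstructs the input.
    n = np.arange(frame_len)
    win = np.sqrt(0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len))
    out = np.zeros(len(x2))
    for start in range(0, len(x2) - frame_len + 1, hop):
        seg = slice(start, start + frame_len)
        X2 = np.fft.rfft(win * x2[seg])          # step 2.1: spectrum of x2
        R = np.fft.rfft(win * r_hat[seg])        # late reverberation spectrum
        ratio = np.abs(R) ** 2 / (np.abs(X2) ** 2 + 1e-12)
        G = np.maximum(1.0 - beta * ratio, floor)  # step 2.2 (assumed gain form)
        D = G * X2                               # step 2.3: apply the gain
        d = np.fft.irfft(D, n=frame_len)         # step 2.4: back to time domain
        out[seg] += win * d                      # step 2.5: overlap-add
    return out
```

With β = 0, or with a zero late reverberation estimate, the gain is 1 in every bin and the interior of the output equals the input, which mirrors the idea that the subtraction is reduced or skipped when the reverberation is weak. The first and last half frames are covered by only one window and are attenuated in this simple version.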
[0022] In one embodiment of the present invention, the method shown in FIG. 3 is extended to address the fact that the late reverberation estimation signal of the main microphone input signal, compared with the true late reverberation component of the main microphone input signal, is underestimated in part of the frequency range. Low-pass filters chosen according to the microphone spacing are provided, and the corresponding frequency compensation is applied to the late reverberation estimation signal. Specifically, reference is made to the embodiment shown in FIG.
[0023] FIG. 4 is a schematic diagram of the overall flow of a dual-microphone-based voice reverberation reduction method according to another embodiment of the present invention. As shown in FIG. 4, the inputs of the entire system are the auxiliary microphone input signal x1(t) and the main microphone input signal x2(t), and the output is the reverberation-reduced signal xd(t). Broadly speaking, the system includes two parts, a reverberation spectrum estimation process and a spectral subtraction process. Compared with the flow of the method shown in FIG. 3, FIG. 4 adds a step that performs frequency compensation on the late reverberation estimation signal (in FIG. 4 this is step 1.45, and the time domain to frequency domain conversion step is still labeled step 1.5 as in FIG.). Hereinafter, the method will be described in detail with reference to FIG.
[0024] 1. Reverberation spectrum estimation
Input: input signal x1(t) of the auxiliary microphone, input signal x2(t) of the main microphone.
Output: adjustment factor β of the gain function (one input of the spectral subtraction process), late reverberation spectrum of the main microphone input signal (one input of the spectral subtraction process).
The reverberation spectrum estimation includes six steps: step 1.1, step 1.2, step 1.3, step 1.4, step 1.45 and step 1.5.
[0025] 2. Spectral subtraction
Input: main microphone input signal x2(t), adjustment factor β of the gain function (output of the reverberation spectrum estimation process), late reverberation spectrum of the main microphone input signal (output of the reverberation spectrum estimation process).
Output: signal xd(t) of the main microphone input signal after reverberation reduction (also the output of the entire system).
The spectral subtraction process comprises five steps: step 2.1, step 2.2, step 2.3, step 2.4 and step 2.5.
[0026] Hereinafter, each step in the reverberation spectrum estimation process and the spectral subtraction process, and the relationship between them, will be described in detail.
1. Reverberation spectrum estimation process
Step 1.1 Calculate the transfer function h(t) from the auxiliary microphone to the main microphone.
Input of step 1.1: input signal x1(t) of the auxiliary microphone and input signal x2(t) of the main microphone.
Output of step 1.1: transfer function h(t) from the auxiliary microphone to the main microphone (input of step 1.2).
[0027] In one embodiment of the present invention, the transfer function H is calculated from the cross power spectrum Px2x1 of the auxiliary microphone input signal x1(t) and the main microphone input signal x2(t) and the power spectrum Px1x1 of the auxiliary microphone input signal x1(t).
$H = \dfrac{P_{x_2 x_1}}{P_{x_1 x_1}}$ (4)
An inverse Fourier transform is applied to the frequency domain transfer function H to obtain the time domain transfer function h(t). In other embodiments of the present invention, h(t) may be calculated by other methods, such as adaptive filtering, which are not described in detail here.
[0028] Step 1.2 Find the tail part hr(t) of the transfer function h(t).
Input of step 1.2: transfer function h(t) from the auxiliary microphone to the main microphone (output of step 1.1).
Output of step 1.2: tail part hr(t) of the transfer function from the auxiliary microphone to the main microphone (input of step 1.4).
[0029] In the embodiment of the present invention, a boundary point between the early reverberation and the late reverberation is taken on the time axis of the transfer function h(t), and the values of h(t) before the boundary point are set to 0. The tail part hr(t) of the transfer function h(t) is thus obtained. In one preferred embodiment of the present invention, the point of h(t) lying 50 ms after the largest peak of h(t) is chosen, the values of h(t) before this point are set to 0, and the result is denoted as the tail part hr(t).
[0030] Step 1.3 Judge the strength of the reverberation from the transfer function h(t) from the auxiliary microphone to the main microphone, and find the adjustment factor β of the gain function.
Input of step 1.3: transfer function h(t) from the auxiliary microphone to the main microphone (output of step 1.1).
Output of step 1.3: adjustment factor β of the gain function (one input of the spectral subtraction process).
[0031] In order to reduce the damage that dereverberation does to the voice when the reverberation is weak, step 1.3 calculates the adjustment factor β of the gain function by judging the strength of the reverberation. In the embodiment of the present invention, the logarithm of the ratio of the energy of the head of the transfer function from the auxiliary microphone to the main microphone to the energy of its tail part is taken and denoted as ρ.
$\rho = 10 \log_{10} \dfrac{\int_{0}^{T} h^{2}(t)\,dt}{\int_{T}^{\infty} h^{2}(t)\,dt}$ (5)
Here h(t) is the transfer function from the auxiliary microphone to the main microphone, and T is a designated boundary point on the time axis of h(t).
The boundary point T is not necessarily the boundary point between the early reverberation and the late reverberation, but the portion before T always includes the direct sound and may further include part or all of the early reverberation.
[0032] FIG. 5A is a graph of the transfer function from the auxiliary microphone to the main microphone when the distance from the sound source to the main microphone in the embodiment of the present invention is 0.5 m. When the distance from the sound source to the main microphone is L = 0.5 m, the numerical range of T is 20 ms to 50 ms; taking T = 50 ms (that is, the boundary point T is 50 ms away from the maximum peak of h(t)), the speech intelligibility index is C50 = 12.3 dB and ρ = 9.4 dB.
[0033] FIG. 5B is a graph of the transfer function from the auxiliary microphone to the main microphone when the distance from the sound source to the main microphone in the embodiment of the present invention is 1 m. When the distance from the sound source to the main microphone is L = 1 m, the numerical range of T is 20 ms to 50 ms; taking T = 50 ms (that is, the boundary point T is 50 ms away from the maximum peak of h(t)), the speech intelligibility index is C50 = 8.1 dB and ρ = 6.0 dB.
[0034] FIG. 5C is a graph of the transfer function from the auxiliary microphone to the main microphone when the distance from the sound source to the main microphone in the embodiment of the present invention is 2 m. When the distance from the sound source to the main microphone is L = 2 m, the numerical range of T is 20 ms to 50 ms; taking T = 50 ms (that is, the boundary point T is 50 ms away from the maximum peak of h(t)), the speech intelligibility index is C50 = 5.4 dB and ρ = 3.7 dB.
[0035] FIG. 5D is a graph of the transfer function from the auxiliary microphone to the main microphone when the distance from the sound source to the main microphone in the embodiment of the present invention is 4 m. When the distance from the sound source to the main microphone is L = 4 m, the numerical range of T is 20 ms to 50 ms; taking T = 50 ms (that is, the boundary point T is 50 ms away from the maximum peak of h(t)), the speech intelligibility index is C50 = 4.5 dB and ρ = 2.2 dB.
[0036] The farther the sound source is from the microphone, the stronger the reverberation. As can be seen from FIGS. 5A to 5D, as the reverberation becomes stronger, the energy of the head of the transfer function from the auxiliary microphone to the main microphone becomes lower and the energy of the tail part becomes higher, so ρ, the logarithm of the ratio of the two, reflects the strength of the reverberation: the stronger the reverberation, the smaller the value of ρ. Therefore, the strength of the reverberation is judged from the value of ρ, and the adjustment factor β of the gain function is obtained from it.
[0037] There are several methods for calculating the adjustment factor β; equation (6) is the empirical equation used to calculate β in the embodiment of the present invention. (6) ρ1 and ρ2 are set values chosen empirically; in the embodiment of the present invention, ρ1 is 9 dB and ρ2 is 2 dB (for a microphone spacing of 6 cm).
[0038] Step 1.4 Convolve the input signal x1(t) of the auxiliary microphone with the tail part hr(t) of the transfer function from the auxiliary microphone to the main microphone to obtain the late reverberation estimation signal of the main microphone input signal.
Input of step 1.4: input signal x1(t) of the auxiliary microphone, tail part hr(t) of the transfer function from the auxiliary microphone to the main microphone (output of step 1.2).
Output of step 1.4: late reverberation estimation signal of the main microphone input signal (input of step 1.45). Specifically, it is given by equation (7).
[0039] Step 1.45: Perform frequency compensation on the late reverberation estimation signal of the main microphone input signal to obtain a compensated signal. Input of step 1.45: late reverberation estimation signal of the main microphone input signal (output of step 1.4). Output of step 1.45: frequency-compensated late reverberation estimation signal of the main microphone input signal (input of step 1.5).
[0040] Compared with the true late reverberation component of the main microphone input signal, the late reverberation estimation signal is underestimated in the low-frequency region. Therefore, in the present invention, frequency compensation is performed on the late reverberation estimation signal of the main microphone input signal. Because the distance between the main microphone and the auxiliary microphone affects the late reverberation estimation signal, the embodiment of the present invention provides a low-pass filter chosen according to the microphone spacing, applies the corresponding frequency compensation to the late reverberation estimation signal, and obtains the compensated late reverberation estimation signal.
[0041] FIG. 6A is a graph of the amplitude-frequency characteristic of the frequency compensation filter when the distance between the main microphone and the auxiliary microphone in the embodiment of the present invention is 6 cm. FIG. 6B is a graph of the amplitude-frequency characteristic of the frequency compensation filter when the distance between the main microphone and the auxiliary microphone in the embodiment of the present invention is 18 cm.
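The convolution of step 1.4 can be sketched as follows. This is only an illustration under assumptions: it processes the whole signal at once rather than frame by frame, truncates the result to the length of x1(t), and leaves out the frequency compensation of step 1.45 because the compensation filter coefficients are not given in the text. The function name is hypothetical.

```python
import numpy as np

def estimate_late_reverb(x1, h_r):
    """Step 1.4 sketch: convolve the auxiliary signal with the tail hr(t).

    The output is truncated to len(x1) so that it stays time-aligned
    with the main microphone signal (an assumption of this sketch).
    """
    x1 = np.asarray(x1, dtype=float)
    h_r = np.asarray(h_r, dtype=float)
    return np.convolve(x1, h_r)[: len(x1)]

# A unit impulse on the auxiliary microphone simply reproduces the tail.
x1 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
h_r = np.array([0.0, 0.0, 0.5])
r_hat = estimate_late_reverb(x1, h_r)
```

Because the head of hr(t) is zero, the estimate contains no direct sound or early reflections; only energy arriving after the boundary point is carried over.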
As FIGS. 6A and 6B show, in the embodiment of the present invention, the larger the distance between the main microphone and the auxiliary microphone, the smaller the degree of frequency compensation applied to the low-frequency band of the late reverberation estimation signal of the main microphone input signal.
[0042] Step 1.5: The frequency-compensated late reverberation estimation signal of the main microphone input signal is converted from the time domain to the frequency domain to obtain the late reverberation spectrum of the main microphone input signal. Input of step 1.5: frequency-compensated late reverberation estimation signal of the main microphone input signal (output of step 1.45). Output of step 1.5: late reverberation spectrum of the main microphone input signal (one input of the spectral subtraction process). Specifically, the conversion is given by equation (8).
[0043] 2. Spectral Subtraction Process. Step 2.1: Convert the main microphone input signal x2(t) from the time domain to the frequency domain and denote the result as X2. Input of step 2.1: main microphone input signal x2(t). Output of step 2.1: frequency spectrum X2 of the main microphone input signal (input to step 2.2). Specifically, it is given by equation (9).
[0044] Step 2.2: The gain function G is calculated from the frequency spectrum X2 of the main microphone input signal and the estimated late reverberation spectrum of the main microphone input signal, and the gain function is adjusted by the adjustment factor β.
Input of step 2.2: frequency spectrum X2 of the main microphone input signal (output of step 2.1), late reverberation spectrum of the main microphone input signal (output of step 1.5 in the reverberation spectrum estimation process), and adjustment factor β of the gain function (output of step 1.3 in the reverberation spectrum estimation process). Output of step 2.2: gain function G (one input of step 2.3).
[0045] In one embodiment of the present invention, the power spectral subtraction method is used to calculate the gain function G(l, k) according to equation (10), where l is the frame number, k is the frequency bin number, β is the adjustment factor of the gain function, X2 is the frequency spectrum of the main microphone input signal, and the remaining term is the late reverberation spectrum of the main microphone input signal.
[0046] As can be understood from equation (10), the magnitude of the gain function G(l, k) can be adjusted by the adjustment factor β of the gain function. In this way, spectral subtraction can be reduced or omitted when the reverberation is weak, so that speech is not damaged when the reverberation is weak and speech intelligibility is already high, and the speech quality is protected.
[0047] Step 2.3: The amplitude spectrum |X2| of the main microphone input signal is multiplied by the gain function G and combined with the phase of the main microphone input signal to obtain the dereverberated frequency spectrum D of the main microphone input signal. Input of step 2.3: frequency spectrum X2 of the main microphone input signal (output of step 2.1), and gain function G (output of step 2.2). Output of step 2.3: dereverberated frequency spectrum D of the main microphone input signal (input of step 2.4). Specifically, the dereverberated frequency spectrum D(l, k) of the main microphone input signal is calculated by equation (11), where l is the frame number, k is the frequency bin number, |X2(l, k)| is the amplitude spectrum of the main microphone input signal, G(l, k) is the gain function, and phase(l, k) is the phase of the main microphone input signal.
[0048] Step 2.4: The dereverberated frequency spectrum D of the main microphone input signal is converted to the time domain and denoted as d(t), as given by equation (12). Input of step 2.4: dereverberated frequency spectrum D of the main microphone input signal (output of step 2.3). Output of step 2.4: dereverberated time domain signal d(t) of the main microphone input signal (input of step 2.5).
[0049] Step 2.5: The dereverberated time domain signal of the main microphone input signal is superimposed and added frame by frame to obtain the continuous signal xd(t) after reverberation reduction of the main microphone input signal. Input of step 2.5: dereverberated time domain signal d(t) of the main microphone input signal (output of step 2.4). Output of step 2.5: continuous signal xd(t) after reverberation reduction of the main microphone input signal (output of the entire system).
[0050] FIG. 7A is a time-domain diagram of the main microphone input signal according to an embodiment of the present invention. FIG. 7B is a time-domain diagram of the main microphone signal after dereverberation in the embodiment of the present invention. FIG. 7C is a spectrogram of the main microphone input signal according to an embodiment of the present invention. FIG. 7D is a spectrogram of the main microphone signal after dereverberation in the embodiment of the present invention.
[0051] In FIGS. 7A to 7D of the present embodiment, the main microphone and the auxiliary microphone face the sound source, the perpendicular distance from the sound source to the dual microphones is 2 m, and the distance between the main microphone and the auxiliary microphone is 18 cm.
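Steps 2.1 to 2.5 can be sketched end to end in NumPy. Since equations (9) to (12) are not reproduced in this text, the details below are assumptions rather than the patent's formulas: a square-root Hann window with 50% overlap for analysis and synthesis, one common form of power spectral subtraction, G = sqrt(max(1 − β·|R|²/|X2|², floor²)), and a spectral floor of 0.1. All function and parameter names are hypothetical; β is taken as a given non-negative number, with β = 0 disabling the subtraction.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Steps 1.5 / 2.1 sketch: windowed frames -> one-sided spectra."""
    win = np.sqrt(np.hanning(frame_len + 1)[:-1])  # periodic sqrt-Hann
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * win
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1), win

def spectral_gain(X2, R, beta, floor=0.1, eps=1e-12):
    """Step 2.2 sketch: power spectral subtraction gain scaled by beta."""
    ratio = np.abs(R) ** 2 / (np.abs(X2) ** 2 + eps)
    return np.sqrt(np.maximum(1.0 - beta * ratio, floor ** 2))

def dereverberate(x2, r_hat, beta, frame_len=512, hop=256):
    """Steps 2.1 to 2.5 sketch for a main signal x2 and late-reverb estimate r_hat."""
    X2, win = stft(x2, frame_len, hop)
    R, _ = stft(r_hat, frame_len, hop)
    G = spectral_gain(X2, R, beta)
    # Step 2.3: scaled amplitude recombined with the phase of X2
    D = G * np.abs(X2) * np.exp(1j * np.angle(X2))
    # Step 2.4: back to the time domain, with the synthesis window applied
    d = np.fft.irfft(D, n=frame_len, axis=1) * win
    # Step 2.5: overlap-add the frames into one continuous signal
    out = np.zeros(len(x2))
    for i, frame in enumerate(d):
        out[i * hop : i * hop + frame_len] += frame
    return out

rng = np.random.default_rng(0)
x2 = rng.standard_normal(2048)
passthrough = dereverberate(x2, np.zeros_like(x2), beta=0.0)
```

With β = 0 the gain is 1 in every bin, so samples covered by two overlapping frames are reconstructed unchanged; only the first and last half-frame, which a single frame covers, are attenuated by the window. A larger β makes the subtraction more aggressive, and the floor keeps the gain from reaching zero.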
In the experiment of FIGS. 7A to 7D, the C50 of the main microphone input signal is 6.8 dB before dereverberation and 10.5 dB after dereverberation using the method shown in FIG. 4. After adopting the inventive method, C50 therefore improves by 3.7 dB.
[0052] FIG. 8 is an overall block diagram of a dual microphone based audio reverberation reduction apparatus according to an embodiment of the present invention. As shown in FIG. 8, the apparatus processes the signals received by the main microphone and the auxiliary microphone frame by frame, and includes a reverberation spectrum estimation unit 700 and a spectrum subtraction unit 800.
[0053] The reverberation spectrum estimation unit 700 is used to: receive the main microphone input signal and the auxiliary microphone input signal; calculate the transfer function h(t) from the auxiliary microphone to the main microphone from these two signals; acquire the tailing part hr(t) of the transfer function h(t); judge the strength of the reverberation from h(t), calculate the adjustment factor β of the gain function, and output β to the spectrum subtraction unit 800; convolve the auxiliary microphone input signal with hr(t) to obtain the late reverberation estimation signal of the main microphone input signal; and convert that estimation signal from the time domain to the frequency domain to obtain the late reverberation spectrum of the main microphone input signal, which is output to the spectrum subtraction unit 800.
[0054] The spectrum subtraction unit 800 is used to: receive the main microphone input signal, the adjustment factor β of the gain function output from the reverberation spectrum estimation unit 700, and the late reverberation spectrum of the main microphone input signal; convert the main microphone input signal from the time domain to the frequency domain to obtain its frequency spectrum; calculate the gain function from the frequency spectrum of the main microphone input signal, the adjustment factor β, and the late reverberation spectrum of the main microphone input signal; multiply the frequency spectrum of the main microphone input signal by the gain function to obtain the dereverberated frequency spectrum of the main microphone input signal; convert the dereverberated frequency spectrum from the frequency domain to the time domain to obtain the dereverberated time domain signal of the main microphone input signal; and superimpose and add the dereverberated time domain signal frame by frame, then output the continuous signal after reverberation reduction of the main microphone input signal.
[0055] In one embodiment of the present invention, the reverberation spectrum estimation unit 700 convolves the auxiliary microphone input signal with hr(t) to obtain the late reverberation estimation signal of the main microphone input signal, first performs frequency compensation on that estimation signal, then converts the frequency-compensated signal from the time domain to the frequency domain to obtain the late reverberation spectrum of the main microphone input signal, and outputs the spectrum to the spectrum subtraction unit 800.
[0056] FIG. 9 shows the detailed configuration of an audio reverberation reduction apparatus based on dual microphones according to a preferred embodiment of the present invention, together with a schematic diagram of its inputs and outputs. Referring to FIG. 9, the audio reverberation reduction apparatus based on dual microphones includes a reverberation spectrum estimation unit 91 and a spectrum subtraction unit 92. The reverberation spectrum estimation unit 91 includes a transfer function calculation unit 911, a transfer function tailing calculation unit 912, a reverberation strength determination unit 913, a late reverberation estimation unit 914, a frequency compensation unit 915, and a first time/frequency conversion unit 916. The spectrum subtraction unit 92 includes a second time/frequency conversion unit 921, a gain function calculation unit 922, a dereverberation unit 923, a frequency/time conversion unit 924, and a superposition and addition unit 925.
[0057] The transfer function calculation unit 911 is used to receive the main microphone input signal and the auxiliary microphone input signal, calculate the transfer function h(t) from the auxiliary microphone to the main microphone from these two signals, and output h(t) to the transfer function tailing calculation unit 912 and the reverberation strength determination unit 913.
[0058] The transfer function tailing calculation unit 912 is used to obtain the tailing part hr(t) of the transfer function h(t) and output it to the late reverberation estimation unit 914. Specifically, the transfer function tailing calculation unit 912 takes the boundary point between the early reverberation and the late reverberation on the time axis of the transfer function h(t), and sets the values of h(t) before the boundary point to 0 to obtain the tailing part hr(t) of the transfer function h(t).
[0059] The reverberation strength determination unit 913 is used to judge the strength of the reverberation from the transfer function h(t), calculate the adjustment factor β of the gain function, and output it to the gain function calculation unit 922. Specifically, the reverberation strength determination unit 913 calculates the parameter ρ representing the reverberation strength according to equation (5) above, where h(t) is the transfer function from the auxiliary microphone to the main microphone and T is the designated boundary point on the time axis of h(t). The reverberation strength determination unit 913 then calculates the adjustment factor β of the gain function by equation (6) above, where ρ1 and ρ2 are set values. For example, ρ1 is 9 dB and ρ2 is 2 dB (for a microphone spacing of 6 cm).
[0060] The late reverberation estimation unit 914 is used to receive the auxiliary microphone input signal, convolve it with hr(t) to obtain the late reverberation estimation signal of the main microphone input signal, and output that signal to the frequency compensation unit 915.
[0061] The frequency compensation unit 915 is used to perform frequency compensation on the late reverberation estimation signal of the main microphone input signal and to output the frequency-compensated signal to the first time/frequency conversion unit 916. The larger the distance between the main microphone and the auxiliary microphone, the smaller the degree of frequency compensation that the frequency compensation unit 915 applies to the late reverberation estimation signal of the main microphone input signal.
[0062] The first time/frequency conversion unit 916 is used to convert the frequency-compensated late reverberation estimation signal of the main microphone input signal from the time domain to the frequency domain to obtain the late reverberation spectrum of the main microphone input signal, and to output it to the gain function calculation unit 922.
[0063] The second time/frequency conversion unit 921 is used to receive the main microphone input signal, convert it from the time domain to the frequency domain to obtain the frequency spectrum of the main microphone input signal, and output the spectrum to the gain function calculation unit 922 and the dereverberation unit 923.
[0064] The gain function calculation unit 922 is used to calculate the gain function from the frequency spectrum of the main microphone input signal output from the second time/frequency conversion unit 921, the adjustment factor β of the gain function output from the reverberation strength determination unit 913, and the late reverberation spectrum of the main microphone input signal output from the first time/frequency conversion unit 916, and to output the result to the dereverberation unit 923. The gain function calculation unit 922 can calculate the gain function G(l, k) according to equation (10) above, where l is the frame number, k is the frequency bin number, β is the adjustment factor of the gain function, X2 is the frequency spectrum of the main microphone input signal, and the remaining term is the late reverberation spectrum of the main microphone input signal.
[0065] The dereverberation unit 923 is used to multiply the frequency spectrum of the main microphone input signal by the gain function to obtain the dereverberated frequency spectrum of the main microphone input signal, and to output that spectrum to the frequency/time conversion unit 924.
In the present embodiment, the dereverberation unit 923 calculates the dereverberated frequency spectrum D(l, k) of the main microphone input signal according to equation (11) above, where l is the frame number, k is the frequency bin number, |X2(l, k)| is the amplitude spectrum of the main microphone input signal, G(l, k) is the gain function, and phase(l, k) is the phase of the main microphone input signal.
[0066] The frequency/time conversion unit 924 is used to convert the dereverberated frequency spectrum of the main microphone input signal from the frequency domain to the time domain to obtain the dereverberated time domain signal of the main microphone input signal, and to output it to the superposition and addition unit 925.
[0067] The superposition and addition unit 925 is used to superimpose and add, frame by frame, the time domain signal output from the frequency/time conversion unit 924 to obtain the continuous signal after dereverberation of the main microphone input signal.
[0068] To summarize, the audio reverberation reduction apparatus based on dual microphones in the embodiment of the present invention processes the signals received by the main microphone and the auxiliary microphone frame by frame. The reverberation spectrum estimation unit 700 in the apparatus receives the main microphone input signal x2(t) and the auxiliary microphone input signal x1(t), calculates the transfer function h(t) from the auxiliary microphone to the main microphone from x2(t) and x1(t), obtains the tailing part hr(t) of h(t), judges the strength of the reverberation from h(t) to calculate the adjustment factor β of the gain function, and outputs β to the spectrum subtraction unit 800 in the apparatus. It also convolves x1(t) with hr(t) to obtain the late reverberation estimation signal of x2(t), converts this estimation signal from the time domain to the frequency domain to obtain the late reverberation spectrum of x2(t), and outputs the spectrum to the spectrum subtraction unit 800. The spectrum subtraction unit 800 converts x2(t) from the time domain to the frequency domain to obtain the frequency spectrum of x2(t), calculates the gain function from the frequency spectrum of x2(t), β, and the late reverberation spectrum of x2(t), multiplies the frequency spectrum of x2(t) by the gain function to obtain the dereverberated frequency spectrum of x2(t), and converts it from the frequency domain to the time domain to obtain the dereverberated time domain signal of x2(t).
[0069] In the apparatus of the present invention, the late reverberation estimation signal of the main microphone input signal x2(t) is obtained by convolving the auxiliary microphone input signal x1(t) with hr(t), and the spectral subtraction method then subtracts the estimated late reverberation spectrum from the frequency spectrum of the main microphone input signal x2(t). The late reverberation is thus effectively removed from x2(t) while the early reverberation is retained, which improves voice quality. At the same time, the strength of the spectral subtraction is adjusted according to the strength of the reverberation, so that when the reverberation is weak and speech intelligibility is high, the voice is not damaged and the voice quality is protected. Moreover, the apparatus does not need to estimate the direction of arrival of the direct sound accurately, so the microphones are not required to have high consistency, and there is no strict limitation on the acoustic design.
[0070] As can be seen from the above, the technical means of the present invention removes reverberation while effectively protecting speech, automatically estimates the degree of reverberation in a room, and selects appropriate processing in various environments, thereby achieving near-optimal voice quality. In addition, there are no strict limitations on microphone consistency or acoustic design, which makes the application more flexible and convenient.
[0071] What has been described above is merely a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention. All changes, equivalent replacements, improvements and the like made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
