Patent Translate
Powered by EPO and Google
This translation is machine-generated. It cannot be guaranteed that it is intelligible, accurate,
complete, reliable or fit for specific purposes. Critical decisions, such as commercially relevant or
financial decisions, should not be based on machine-translation output.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method, together with the electro-acoustic device it requires,
for generating an impression of space for a listener, where the space may actually exist or may be
calculated, and where an audio program is used. Monophonic, stereophonic or multi-channel audio
programs can optionally be used. Reproduction advantageously takes place binaurally via
headphones (earphones), but may also take place via loudspeakers.
2. Description of the Related Art
Every audio program generally carries a spatial acoustic characteristic that was present at the
time of recording, but the fine structure of that characteristic cannot be completely reproduced
with the conventionally known stereophonic reproduction methods. During reproduction the
listener cannot recognize that the recording was made in a room having a particular reverberation
characteristic. The conditions that allow the listener to recognize the recording space again can
be fulfilled only by additional measures, by means of a corresponding electro-acoustic device.
A simulation of a spatial acoustic event that is faithful to the original can be implemented, for
example, by convolving a binaural room impulse response, measured at a given reception location
in a room, with an arbitrary audio program. A binaural room impulse response consists of two room
impulse responses: one is associated with one ear and the other with the other ear. According to
system theory, the room together with the reception characteristics of the human ear forms a
linear transmission system, and this system is represented in the time domain by a room impulse
response. Each room impulse response is approximately the system response to an acoustic pulse
whose duration is one period of a frequency twice the upper cutoff frequency of the audio signal.
Convolving the binaural room impulse response with an arbitrary audio program yields a signal
that is well suited to electro-acoustic reproduction. With suitable reproduction at both ears, the
listener has the same auditory experience as at the original listening location where the actual
acoustic event takes place. The listener cannot distinguish whether the auditory event arose at
the location of the actual acoustic event or from the simulation process. When loudspeakers
rather than headphones (or earphones) are used for reproduction, the transmission path between
the loudspeaker and the listener's ear must be simulated in essentially the same way.
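The convolution of a binaural room impulse response with an audio program described above can be
sketched as follows. This is a minimal illustration in plain Python, not the patent's
implementation; the function names and the toy sample values are hypothetical and chosen only to
show one program being convolved with a separate impulse response for each ear.

```python
def convolve(signal, impulse_response):
    """Direct linear convolution: y[n] = sum over k of h[k] * s[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


def binaural_render(program, h_left, h_right):
    """Convolve one program with the left-ear and right-ear impulse responses."""
    return convolve(program, h_left), convolve(program, h_right)


# Toy data (hypothetical values, not measurements).
program = [1.0, 0.5, 0.0, -0.5]
h_left = [1.0, 0.0, 0.3]   # direct sound plus one later reflection
h_right = [0.8, 0.2]       # a different path to the other ear

left, right = binaural_render(program, h_left, h_right)
print(left)
print(right)
```

Each output channel is as long as the program plus the impulse response minus one sample, which
is why long room impulse responses make the direct approach expensive.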
Simulation methods that give the listener the impression that the temporal, spectral and spatial
sound field structure of the original listening location is actually present are, however,
extremely complex and expensive, especially with regard to the technical equipment required for
the simulation. In general, the convolution is performed as follows: the audio signal and the
room impulse response are digitized, the convolved signal is calculated by computer, and the
result is converted back to an analog signal. The number of calculation steps depends on the
length of the impulse response. For example, an audio signal bandwidth of 20 kHz requires a
sampling rate of approximately 50 kHz and thus a sampling period of 20 μsec, so a typical room
impulse response of 2 sec requires 10^5 sampling values, and 5 × 10^4 × 10^5 = 5 × 10^9
multiplications and additions must be performed every second when convolving the audio signal
with the corresponding room impulse response. The equipment cost for the convolution is therefore
unusually high, especially if the entire process is to run in real time. For economic reasons,
the application of such simulation processes outside the research field is therefore considered
impossible.
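The cost estimate in the preceding paragraph can be checked with a few lines of arithmetic. The
variable names are illustrative; the figures (about 50 kHz sampling rate, 2 sec impulse response)
are the ones given in the text.

```python
sampling_rate_hz = 50_000      # roughly 50 kHz for a 20 kHz audio band
response_length_s = 2          # typical room impulse response duration

sampling_period_us = 1_000_000 / sampling_rate_hz    # period in microseconds
taps = sampling_rate_hz * response_length_s          # sampling values in the response
macs_per_second = taps * sampling_rate_hz            # one multiply-add per tap per output sample

print(sampling_period_us, taps, macs_per_second)
```

Every output sample needs one multiplication and one addition per sampling value of the impulse
response, so the load grows linearly with the response length and with the sampling rate.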
Austrian patent specification No. 394650 (AT-PS 394650) describes an electro-acoustic arrangement
for reproducing stereophonic binaural audio programs over headphones (or earphones) that
simulates, with near-original fidelity, the listening situation present at a predetermined
listening location. Recordings made for stereophonic loudspeaker reproduction cause problems with
fidelity to the original sound and with the proper localization of sound sources distributed in
the room. Such recordings are prepared for faithful headphone (or earphone) reproduction as
follows: in addition to the directly arriving audio signals of the left and right channels, both
the spatial reflections in the listening room and the direction-dependent outer-ear transfer
functions are weighted and simulated. Integrating the outer-ear transfer function over all
spatial directions gives the ear an approximately flat amplitude-frequency characteristic. Since
such a complex simulation is practically impossible, one must rely on a simplified configuration.
In such a significantly simplified arrangement, each ear is provided with three different audio
signals in order to ensure a faithful auditory event.
Simulations of spatial acoustic phenomena can also be implemented using a process known, for
example, from EP-A-O 5949. In that method, a transfer function is simulated using a transfer
function simulator. The transfer function simulator comprises a sound source arranged in an
acoustic system, an acoustic receiver, and a device for measuring the acoustic transfer function.
A number of different positions between any two points in the acoustic system can be taken into
account when measuring the acoustic transfer function. A feature of this simulator is that means
are provided for estimating the poles of the transfer function: the AR coefficients, which
correspond to the physical poles of the acoustic system, are estimated on the basis of a number
of measured transfer functions. An ARMA filter, synthesized from the AR filter and an MA filter,
then simulates those of the measured transfer functions that match the acoustic system of
interest. This very complicated method is used to simulate transfer functions such as the
acoustic echo transfer function, which is also required for echo localization. The simulation of
the transmission characteristics is performed by a signal processor. In the simulation process
itself, the transfer function must be simulated in a very short time and with little
computational effort.
The simulation method described above could, with appropriate modification, be used for the
faithful reproduction of basic spatial acoustic phenomena and events. In that case, however, it
is extremely complicated and costly from a technical point of view, and so specialized that it is
of no particular use for an effective and economical application of the method in question.
Known fast convolution by means of discrete Fourier transforms is likewise unsuitable for the
simulation of spatial acoustic phenomena, because of the time delay between the source signal and
the convolved signal that is inherent to the method. It therefore offers no suitable way toward
economical equipment.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a simplified method and apparatus for
simulation by means of an electro-acoustic apparatus, such that the method and apparatus can be
realized in a technically and economically advantageous way.
The problem is solved by method steps comprising the features of claim 1.
The computational cost is reduced accordingly by selecting predetermined parts of the room
impulse response, because no calculations need to be carried out for the excluded parts of the
room impulse response.
The advantage of the new method is that the cost is significantly reduced without any degradation
of simulation quality.
Furthermore, a simplified FIR filter structure for convolution may be used.
The convolution process itself progresses in real time with no significant time delay.
The central point of the present invention is therefore that the simulation is performed using
only predetermined, selected parts of the acoustic phenomena and events.
This critical selection only requires knowledge of which parts of the room impulse response are
important for hearing.
The information about each room impulse response is obtained via real or virtual measurements.
Which parts of the room impulse response are excluded is determined in accordance with
psychoacoustic principles.
According to an important development of the method, the values of the room impulse response are
compared with a time-dependent limit value, and only the values of the room impulse response that
exceed the limit value are used. The limit value is time-dependent with respect to the room
impulse response: it has its maximum magnitude in the region of the beginning of the room impulse
response and decays from there on. As a result, wide regions of the room impulse response become
zero.
The advantage of dividing the response in this way lies in the significantly reduced
computational cost for the simulation processor. The region of the room impulse response that
contains the direct sound is combined with the region containing the reverberation, and the two
must be combined so that the original sound quality is maintained in the simulation.
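The comparison with a time-dependent limit value can be sketched as follows. The text only states
that the limit is largest at the beginning of the response and then decays; the exponential shape,
the parameter names and the sample values below are assumptions made for illustration.

```python
import math


def decaying_limit(n, start_level, decay_per_sample):
    """Assumed limit curve: largest at n = 0, decaying exponentially afterwards."""
    return start_level * math.exp(-decay_per_sample * n)


def reduce_response(h, start_level, decay_per_sample):
    """Keep values whose magnitude exceeds the limit at their time index; zero the rest."""
    return [
        v if abs(v) > decaying_limit(n, start_level, decay_per_sample) else 0.0
        for n, v in enumerate(h)
    ]


# Toy impulse response (hypothetical values).
h = [1.0, 0.05, 0.4, 0.02, 0.1, 0.01, 0.05, 0.001]
reduced = reduce_response(h, start_level=0.2, decay_per_sample=0.3)
nonzero = sum(1 for v in reduced if v != 0.0)
print(reduced, nonzero)
```

Because the limit falls over time, a late reflection of modest level can survive while an equally
small value close to the start of the response is dropped.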
Thus only those parts that contribute significantly to a faithful simulation are used for the
convolution. All remaining parts no longer appear in the simulation because they have been set to
zero, and no computational cost is incurred for them. The FIR filter used for convolution does
not require an expensive structure, and the computing power of the signal processor is needed
only where coefficients that differ from zero occur.
Such an approach significantly reduces the computational cost compared with conventional
convolution. The reduction factor may be 10 to 100 while the reverberation time of the simulated
spatial acoustic phenomenon is maintained: with a reduced room impulse response equivalent to a
full length of 10 msec, a reverberation time of 100 to 1000 msec is simulated as desired.
Here, the spatial simulation is not affected by any incidental factors.
The above process, with the required electro-acoustic devices, can be configured so that the
selection of the key parts of the room impulse response, made in order to obtain a faithful
simulation, takes the pre-masking and post-masking phenomena into account.
Due to the masking phenomena known from psychoacoustics, a second, additional sound in the
presence of a first sound is audible only when the excitation it produces in the human ear
exceeds the excitation produced by the first sound.
As a result, the audibility threshold is shifted. This shift is modelled by the above-mentioned
time-dependent limit value, so that sounds below the threshold are not perceived.
The combination of the aforementioned method steps is the optimal implementation of the process:
the efficiency relative to the computational cost and to the technical equipment required is
highest, and the result is therefore the most economical.
The simulation method according to the invention finds application in the high-end hi-fi and
studio fields, since the advantages of binaural hearing exist both in headphone (earphone)
reproduction and in loudspeaker reproduction. The device according to the invention removes the
known drawbacks of listening in acoustically dead rooms and creates good listening conditions,
faithful to the original sound, which do not disturbingly overlay the sound properties contained
in the recording. The simulation of, for example, a given loudspeaker arrangement in a given room
by headphone (or earphone) reproduction is an important application of the simulation process,
including the required electro-acoustic device.
The present invention will now be described with the aid of the drawings, together with the
necessary electro-acoustic devices.
FIG. 1 shows a possible approach for determining the room impulse response.
At the position of the sound source a measurement signal is emitted, which is received by the
measurement microphone at the listening location. A room impulse response is generated from the
received signal. When a single pulse whose duration equals one period of twice the upper cutoff
frequency of the audio band is used as the measurement signal, the received signal is equivalent
to the room impulse response h(t). Because the signal-to-noise ratio is small with this approach,
in practice a relatively long measurement signal is advantageously used, and the room impulse
response is calculated from it. The binaural room impulse response required for reproduction via
headphones (earphones) is generated as follows: the measurement microphone is placed in the ear
canal of the subject for whom the room impulse response is to be determined. The impulse response
of the path loudspeaker-room-ear is then measured, followed by the impulse response of the system
headphone-ear. The resulting impulse responses are transformed to the frequency domain, the
transformed functions are divided, and the quotient is transformed back to the time domain. When
the process is performed for both ears, a binaural room impulse response (consisting of a right
and a left room impulse response) is obtained.
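The step of transforming both measured responses to the frequency domain, dividing them, and
transforming the quotient back can be sketched as below. This is only an illustration under
stated assumptions: it uses a naive discrete Fourier transform, treats the division as circular
deconvolution, and adds a small regularization constant eps (not mentioned in the text) to avoid
dividing by values near zero. All function names and sample values are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (fine for short illustrative signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform, returning the real part of each time-domain sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def equalize(room_to_ear, headphone_to_ear, eps=1e-9):
    """Divide the loudspeaker-room-ear spectrum by the headphone-ear spectrum."""
    n = max(len(room_to_ear), len(headphone_to_ear))
    r = list(room_to_ear) + [0.0] * (n - len(room_to_ear))
    p = list(headphone_to_ear) + [0.0] * (n - len(headphone_to_ear))
    r_spec, p_spec = dft(r), dft(p)
    # a / b written as a * conj(b) / (|b|^2 + eps); eps is an assumed safeguard.
    quotient = [a * b.conjugate() / (abs(b) ** 2 + eps) for a, b in zip(r_spec, p_spec)]
    return idft(quotient)


# Toy data: the measured path equals a known target filtered by the headphone path.
target = [1.0, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
headphone = [1.0, 0.5]
measured = [1.0, 0.5, 0.3, 0.15, 0.0, 0.0, 0.0, 0.0]

recovered = equalize(measured, headphone)
print([round(v, 6) for v in recovered])
```

In this toy case the quotient recovers the target response, because the measured path was built
as the target filtered by the headphone path.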
FIG. 2 shows the course of the process for one of the two room impulse responses determined as
described above. The room impulse response h(t) is fed to the dividing circuit 1, where it is
divided into the direct sound component d(t) and the reverberation component r(t). The
reverberation component r(t) contains all the individual reflections of the measurement signal
from the walls of the room.
The room impulse response is by its nature a continuous-time signal and is digitized for
processing, so that h(n), d(n) and r(n) are formed from h(t), d(t) and r(t). The time-discrete
representation h(n) is used exclusively in the figures, since a time-discrete representation is
required for the digital filtering used here. Here n is the running index of the sampling values,
related to time by t = nτ, and τ is the period of the sampling frequency. For ease of
understanding, the curves in the figures are drawn as continuous functions.
For a room impulse response h(n) and its division into direct sound d(n) and reverberation r(n),
the corresponding time-dependent amplitude curves are shown in the figures. The direct sound has
reached the listening location after the time T = Nτ. After that, only components due to
reflections or echoes are to be expected. It should be mentioned by way of explanation that in a
frequency-linear transmission system the impulse response would consist of the first value only.
In the region of the direct sound, the impulse response outlined here is determined by the
transfer function of the transmission path from the sound source to the entrance of the ear
canal, and it extends over several msec, for example because of reflections at the head and body.
An electronic device 2 is provided for extracting predetermined components from the determined
room impulse response, which has been divided into the two acoustic components d(n) and r(n). The
predetermined components are those which represent the acoustic characteristics of the listening
room, the characteristic values of the sound field existing in the listening room, and the left
and right ear transfer functions assignable to the listener, and which, by convolution with an
audio program, guarantee a faithful simulation of the entire spatial sound event. The extraction
is performed according to the criteria described below. The extracted, or reduced, room impulse
response h(n) is convolved in the processor 3 with the signal s(n) of an arbitrarily selected
audio program, so that a signal is formed which, when properly reproduced at the ears of the
listener, achieves the listening result desired according to the invention, i.e. a faithful
simulation of the listening location in the desired listening room.
The extraction circuit 2 for selecting the main components from the determined room impulse
response will be described with reference to the circuit diagram.
Because of the above-described limited computational capacity of processor 3, it is preferred to
use only the leading portion of each determined room impulse response. For this purpose, the room
impulse response, which is applied to the input E and has been divided into direct sound and
reverberation components, is divided in function block 4 into individual sections of length T1.
FIGS. 6a-e show how the determined room impulse response, with its acoustic components d(n),
r2(n), r3(n), ..., ri(n), is divided by function block 4 into individual blocks Ti.
The division into direct sound and reverberant components is made because the direct component
of the determined room impulse response should remain unchanged, at least for studio
applications, while the reverberation component is reduced as described above. However, an
application is also possible in which both components of the determined room impulse response are
reduced.
Using the comparator 5 with direct-component separation, those values of the reverberant
component of the room impulse response which lie below a predetermined limit (according to the
criteria described below) are set to zero. The sampling values remaining in the reverberation
portion of the reduced room impulse response are counted by the coefficient counter 6. The
counter value obtained is compared in the setpoint comparator 7 with a limit value determined by
the permitted computational cost. If the limit has not yet been reached, a further block of the
determined room impulse response, as shown in FIGS. 6a-e, is requested. In this way the
computational capacity is fully exploited during the subsequent convolution with the reduced room
impulse response. Once the predetermined setpoint has been reached, the reduced room impulse
response now present is sent to the output A.
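The block-by-block extraction with comparator 5, coefficient counter 6 and setpoint comparator 7
can be sketched as a simple loop. The function name, parameter names and sample values are
hypothetical, and counting only the non-zero values is an interpretation of the text rather than
something it states explicitly.

```python
def extract_with_budget(direct, reverb, block_len, limit, max_coeffs):
    """Process the reverberation block by block until the coefficient budget is reached."""
    reduced = []
    count = 0
    for start in range(0, len(reverb), block_len):
        block = reverb[start:start + block_len]
        # Comparator 5: values below the limit are set to zero.
        kept = [v if abs(v) >= limit else 0.0 for v in block]
        reduced.extend(kept)
        # Coefficient counter 6: count the values that survived.
        count += sum(1 for v in kept if v != 0.0)
        # Setpoint comparator 7: stop requesting blocks once the budget is reached.
        if count >= max_coeffs:
            break
    # The direct sound is passed through unchanged, as for studio applications.
    return list(direct) + reduced, count


# Toy data (hypothetical values).
direct = [1.0, 0.6]
reverb = [0.3, 0.01, 0.2, 0.02, 0.15, 0.01, 0.12, 0.3, 0.01, 0.2]

h_reduced, used = extract_with_budget(direct, reverb, block_len=2, limit=0.1, max_coeffs=3)
print(h_reduced, used)
```

Because the check happens once per block, the count can overshoot the budget by up to one block;
a stricter variant would trim the last block, which the text does not specify.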
If the determined room impulse response is to be evaluated critically according to the masking
phenomenon, the arrangement shown in FIG. 4 is required. In addition to the configuration shown
in FIG. 3, a limit value adaptation circuit comprising a comparator 9 and a limit value generator
10 is provided. In the comparator 9, the determined room impulse response is compared with the
instantaneous limit value; the magnitude of the limit value is determined according to the
masking phenomenon and depends on the preceding values of the room impulse response. Through
feedback to the comparator 5 via the limit value generator 10, a dynamic adaptation to
predetermined psychoacoustic criteria according to the masking phenomenon, for example those
given by Zwicker, is realized.
The critical selection of the signal components of the determined room impulse response for the
simulation can be made, as shown in FIGS. 7a, b, as follows: all components of the room impulse
response that lie below a predetermined fixed limit value A are set to zero. These signal
components are then not taken into account in the subsequent convolution, while the signal
components, or associated sampling values, above the limit value are taken over into the reduced
room impulse response with unchanged amplitude. Since there is a direct relationship between the
strength of an acoustic reflection and the value of the determined room impulse response that can
be assigned to that reflection, the limit value criterion provides an important aid for
extracting the values of the determined room impulse response that are important for the
simulation. During convolution, only the important features of the determined room impulse
response given by the selection criterion are taken into account, whereby the required
computational cost is significantly reduced. If the signal processor can perform 25 × 10^6
multiplications and additions per second in an FIR filter, the processor 3 can simulate a room
with a reverberation time of up to 1 sec using a reduced room impulse response (corresponding to
a factor of 10 and an impulse response length of 10 ms at a sampling period of 20 μs).
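The processing budget mentioned above can be turned into a number of usable filter coefficients.
The sketch assumes a multiply-add rate of 25 × 10^6 per second, a reading that is consistent with
a 10 ms response at a 20 μs sampling period; the variable names are illustrative.

```python
sampling_rate_hz = 50_000        # corresponds to a 20 microsecond sampling period
mac_rate_per_s = 25_000_000      # assumed: 25 x 10^6 multiply-adds per second

# Coefficients that can be evaluated for every output sample.
coeffs_per_sample = mac_rate_per_s // sampling_rate_hz

# Length of a dense (unreduced) response with that many coefficients.
dense_length_ms = 1000 * coeffs_per_sample / sampling_rate_hz

# Sampling values spanned by a 1 s reverberation time, and the share that can be kept.
reverb_samples = 1 * sampling_rate_hz
kept_fraction = coeffs_per_sample / reverb_samples

print(coeffs_per_sample, dense_length_ms, reverb_samples, kept_fraction)
```

Under these assumptions the processor can evaluate 500 coefficients per sample, so a response
spanning 1 sec is only feasible if most of its sampling values have been set to zero.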
Furthermore, as shown in FIGS. 8a, b, the critical selection can also be made using a criterion
based on the masking phenomenon. Components of the determined room impulse response that cannot
be perceived when listening anyway need not be taken into account. Accordingly, the masked
components should be removed from the subsequent convolution. In this case it is no longer
necessary to distinguish between direct sound and reverberation, and the entire determined room
impulse response can be reduced from the outset as described above.
TV denotes the region of pre-masking and TN the region of post-masking. These are the periods
during which a signal below the level limit is no longer perceptible alongside the main signal,
as shown schematically in FIG. 8a. The masking effect depends on the time interval, the level
ratio, the masked signal and the masking signal, as is clear from the standard literature on the
subject. It therefore cannot be represented completely in the figure. Depending on the room
impulse response, the time and level relationships, among other things, can be adjusted. In
short, a somewhat wider range of values is used than follows directly from the limit criterion.
In addition, the range of values must be extended into the originally masked region so as not to
introduce an undesirable filter effect in the frequency domain.
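A strongly simplified selection by pre-masking and post-masking windows might look like the
sketch below. Real masking depends on time interval and level in a more complex way, as the text
notes; the rectangular windows, the constant level ratio and all names and values here are
assumptions for illustration only.

```python
def masked_selection(h, pre_samples, post_samples, level_ratio):
    """Zero every value that lies within the masking window of a much stronger value.

    A value at index n counts as masked if some other value at index m satisfies
    n - post_samples <= m <= n + pre_samples and |h[m]| * level_ratio > |h[n]|.
    Maskers before n act by post-masking (TN), maskers after n by pre-masking (TV).
    """
    out = []
    for n, v in enumerate(h):
        lo = max(0, n - post_samples)
        hi = min(len(h), n + pre_samples + 1)
        masked = any(
            m != n and abs(h[m]) * level_ratio > abs(v)
            for m in range(lo, hi)
        )
        out.append(0.0 if masked else v)
    return out


# Toy impulse response (hypothetical values).
h = [0.0, 0.02, 1.0, 0.05, 0.3, 0.02, 0.01, 0.2]
selected = masked_selection(h, pre_samples=1, post_samples=3, level_ratio=0.1)
print(selected)
```

The post-masking window is chosen longer than the pre-masking window here, reflecting the
distinction between TV and TN; the actual durations are not given in the text.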
FIGS. 9a, b show, using a trapezoidal shape, how the limit values decrease and how the signal
components for the simulation are extracted accordingly.
FIG. 10 shows, by way of example, how the architecture of a conventional FIR filter can be
implemented. In the cascade of intermediate memories Z^-1 (each memory stores one signal value
for one sampling period), one signal value is taken at each tap in every sampling cycle and
multiplied by the filter coefficient associated with that tap. The results are summed in an adder
and fed to the output, which forms a direct implementation of the convolution in a processor.
Depending on the processor, the convolution can of course also be performed with other
structures, which can save computing power. What always matters here is a temporally optimal
sequence of additions and multiplications, so that in the best case a gain in computing power by
a factor of 2 to 3 can be obtained.
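The conventional FIR structure described for FIG. 10 (a chain of one-sample memories, one
multiplication per tap, and a summing adder) can be sketched as follows. The function name and
the sample values are illustrative.

```python
from collections import deque


def fir_direct(signal, coeffs):
    """Direct-form FIR filter: a delay line, one multiply per tap, and an adder."""
    # The delay line plays the role of the cascade of one-sample memories.
    delay_line = deque([0.0] * len(coeffs), maxlen=len(coeffs))
    out = []
    for x in signal:
        delay_line.appendleft(x)          # newest value in front, oldest drops out
        out.append(sum(c * z for c, z in zip(coeffs, delay_line)))
    return out


# Feeding a unit impulse returns the coefficients themselves.
response = fir_direct([1.0, 0.0, 0.0, 0.0], [0.5, 0.25, 0.125])
print(response)
```

Every tap is multiplied in every sampling cycle, even where the coefficient is zero, which is the
cost the reduced impulse response is meant to avoid.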
FIG. 11 shows how the architecture of the FIR filter changes when convolution is performed with
the extracted room impulse response.
Here, successive sampling values of the reverberation component of the room impulse response
form the filter coefficients. Only those filter coefficients that are of great importance for a
faithful simulation are retained, i.e. those given reference numerals in the example of the
figure. In this case, the number of filter coefficients is smaller than the number of
intermediate memories by one to two orders of magnitude. Since the filter coefficients no longer
occur at equal spacing in time, the filter processor is supplied with the delay time, or the
sample index, together with each filter coefficient.
Compared with the filter shown in FIG. 10, for the same filter length and a result that the
listener perceives as equivalent, the number of operations required is smaller by one to two
orders of magnitude.
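A filter that receives each coefficient together with its delay can be sketched as a list of
(delay, coefficient) pairs. This is an illustrative reading of the modified FIR structure, with
hypothetical names and values, not the circuit of the figure.

```python
def to_sparse(h):
    """Keep only the non-zero coefficients, each paired with its delay in samples."""
    return [(delay, coeff) for delay, coeff in enumerate(h) if coeff != 0.0]


def fir_sparse(signal, taps):
    """FIR filtering that performs one multiply-add per non-zero coefficient only."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for delay, coeff in taps:
            if n - delay >= 0:
                acc += coeff * signal[n - delay]
        out.append(acc)
    return out


# Reduced impulse response: ten sampling values, only three of them non-zero.
h_reduced = [1.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.25]
taps = to_sparse(h_reduced)

impulse = [1.0] + [0.0] * 11
y = fir_sparse(impulse, taps)
print(taps)
print(y)
```

The delay line must still span the full length of the response, but the arithmetic per output
sample scales with the number of retained coefficients rather than with the filter length.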
According to the present invention, the cost of the method is significantly reduced without any
degradation of simulation quality.
Furthermore, a simplified FIR filter structure for convolution may be used. The convolution
process itself progresses in real time with no significant time delay.