
Consequences of Nyquist Theorem for Acoustic Signals Stored in Digital Format
Wayne R. Young, School of Human Communication Disorders, Dalhousie University, Halifax, Nova Scotia, Canada
This paper was published in proceedings from "Acoustic Week in Canada 1991", CAA Conference, Edmonton, Alberta, Canada, October 7-10, 1991.
Copyright 1991-2014, Digital Recordings. All Rights Reserved.
Introduction
The calculation of functions in the digital domain from analogue acoustic signals is a two-step process: analog-to-digital (A/D) conversion, followed by digital calculations performed on the digitized acoustical signal. A function V(t) will be represented digitally without any loss of information as long as sampling occurs in accordance with the Nyquist criterion [1-9]. But how can we determine the values of the digitized function at points between samples when only N samples are available? The Nyquist formula requires an infinite number of samples to accomplish this task.
When digital samples are sufficiently dense, one can approximate many continuous functions with their discrete formulations. The errors generated in these cases will be small, since they depend on the spacing between samples. The situation is different, however, when samples are coarsely spaced. For example, a sinusoidal tone of frequency f = 20,000 Hz sampled at f_{sampling} = 44,100 Hz is represented by only 2.205 samples per period. Calculating many functions (for example RMS values) directly from such samples may lead to errors.
Finite-duration sampling of a continuous signal results in errors caused by our limited knowledge of the function at all points in time. It turns out that the more samples we have around the region of interest, the more accurately we are able to reproduce the function there. This paper investigates the error caused by truncation of the Nyquist sampling formula, with the aim of quantifying it and establishing ways to minimize its effect.
Nyquist Theorem
According to the Nyquist theorem [1-9], the discrete time sequence of a sampled continuous function { V(t_{n} = n T_{S}) } contains enough information to reproduce the function V = V(t) exactly, provided that the sampling rate (f_{S} = 1/T_{S}) is at least twice the highest frequency contained in the original signal V(t):
$$ V(t) = \sum_{n=-\infty}^{+\infty} V[n] \, \frac{\sin(\pi f_{S} (t - n T_{S}))}{\pi f_{S} (t - n T_{S})} \qquad (1) $$
Nyquist Theorem's Consequences
It is worth noting that information about the signal V = V(t) at any given moment in time t ≠ n T_{S} is distributed among all discrete samples { V[n] } with appropriate weights (see eq. (1)). Realistically, we are never presented with an infinite discrete time sequence and are therefore forced to perform the summation over a finite range. This is equivalent to a loss of information about the function V = V(t) not only before and after our time window (which is understandable), but also at the time points between the sampling points. This can introduce errors into the process of reconstructing the function. Let us assume that we have available to us N digital samples of the function V = V(t) (this is illustrated in Figure 1):
$$ V[n] = V(n T_{S}), \qquad n = 0, 1, 2, \dots, (N-1) \qquad (2) $$
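Values of the function V = V(t) for the times t ∈ [ 0 ; (N-1) T_{S} ] can be estimated by a truncated version of formula (1):
$$ V(t) \approx \sum_{n=0}^{N-1} V[n] \, \frac{\sin(\pi f_{S} (t - n T_{S}))}{\pi f_{S} (t - n T_{S})} \qquad (3) $$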
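The truncated sum of eq. (3) can be explored numerically. The following is a minimal sketch, not the author's program: the function name `truncated_reconstruction`, the window length N = 2048 and the choice of a unit-amplitude 20,000 Hz test tone are assumptions made for illustration. It relies on NumPy's normalized `np.sinc(x) = sin(pi x) / (pi x)`.

```python
import numpy as np

def truncated_reconstruction(samples, fs, t):
    """Evaluate the truncated sum of eq. (3) at the times t (in seconds)."""
    samples = np.asarray(samples, dtype=float)
    t = np.atleast_1d(np.asarray(t, dtype=float))
    n = np.arange(samples.size)
    # np.sinc(x) = sin(pi x) / (pi x); here x = f_S (t - n T_S) = f_S t - n
    kernel = np.sinc(fs * t[:, None] - n[None, :])
    return kernel @ samples

# Illustrative values (assumptions): the 20 kHz tone from the Introduction
fs = 44_100.0
f = 20_000.0
N = 2048
samples = np.sin(2 * np.pi * f * np.arange(N) / fs)

def midpoint_error(m):
    """|truncated sum - true value| halfway between samples m and m + 1."""
    t_mid = (m + 0.5) / fs
    estimate = truncated_reconstruction(samples, fs, t_mid)[0]
    return abs(estimate - np.sin(2 * np.pi * f * t_mid))

edge_error = midpoint_error(0)          # first midpoint in the window
centre_error = midpoint_error(N // 2)   # midpoint deep inside the window
print(edge_error, centre_error)
```

Under these assumptions the error at the first midpoint should come out far larger than the error in the middle of the window, which is the behaviour the rest of the paper quantifies.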
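The errors resulting from truncation are ε_{LEFT}(t) and ε_{RIGHT}(t). They represent the "LEFT" and "RIGHT" portions of the sum (with respect to the time axis) in eq. (1) that are omitted in eq. (3), and can be represented mathematically by the following formulas:
$$ \varepsilon_{LEFT}(t) = \sum_{n=-\infty}^{-1} V[n] \, \frac{\sin(\pi f_{S} (t - n T_{S}))}{\pi f_{S} (t - n T_{S})} \qquad (4\,a) $$
$$ \varepsilon_{RIGHT}(t) = \sum_{n=N}^{+\infty} V[n] \, \frac{\sin(\pi f_{S} (t - n T_{S}))}{\pi f_{S} (t - n T_{S})} \qquad (4\,b) $$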
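The sum truncation error is generated when eq. (3) is used instead of eq. (1), and is given by the following formula:
$$ \varepsilon_{TOTAL}(t) = \varepsilon_{LEFT}(t) + \varepsilon_{RIGHT}(t) \qquad (5) $$
where ε_{LEFT}(t) and ε_{RIGHT}(t) are the omitted "LEFT" and "RIGHT" portions of the sum in eq. (1).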
In the next section we will try to evaluate the sum truncation error for different cases.
Evaluation of Sum Truncation Error
If a priori information is given that V(t) = 0 for t ∉ [ 0 ; (N-1) T_{S} ], then ε_{TOTAL} = 0 (since ε_{LEFT} = 0 and ε_{RIGHT} = 0) and we can use the truncated version of the Nyquist sum (eq. (3)) with full confidence. Otherwise, using the sum from eq. (3) is equivalent to the zero-extension (or zero-padding) method [3,5].
If a priori information is given that the function V = V(t) is periodic (i.e. V(t) = V(t + T)), then this information can be used in formula (1). In the special case when the function period T = N_{0} T_{S}, where N_{0} ≤ N, one can use formula (1) directly, since all values of V[n] are known for n ∈ ( -∞ ; +∞ ). In this case one can also use the Discrete Fourier Transform (DFT) on N_{0} consecutive data points (from the available N data points V[n], where n = 0, 1, 2, ..., (N-1)) to calculate the amplitudes A_{i} and phases φ_{i} of the periodic signal, and then use the formula:
where:
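The DFT route for a periodic signal can be sketched as follows. This is an illustration under stated assumptions rather than the paper's own formula: it assumes a cosine convention, V(t) = sum over i of A_i cos(2 pi i t / (N_0 T_S) + phase_i), with amplitudes and phases taken from `np.fft.rfft`. The function name `periodic_reconstruction` and the test signal (harmonics of 100 Hz, N_0 = 441) are hypothetical.

```python
import numpy as np

def periodic_reconstruction(period_samples, fs, t):
    """Evaluate a periodic signal at arbitrary times t from one period of samples."""
    x = np.asarray(period_samples, dtype=float)
    n0 = x.size
    spectrum = np.fft.rfft(x) / n0
    amplitudes = 2.0 * np.abs(spectrum)   # one-sided amplitudes A_i
    amplitudes[0] /= 2.0                  # the DC term is not doubled
    if n0 % 2 == 0:
        amplitudes[-1] /= 2.0             # neither is the Nyquist bin (even N_0)
    phases = np.angle(spectrum)           # phases in the cosine convention
    freqs = np.arange(spectrum.size) * fs / n0
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return np.sum(amplitudes * np.cos(2 * np.pi * freqs * t[:, None] + phases), axis=1)

# Hypothetical test signal: period of 441 samples at 44,100 Hz (100 Hz fundamental)
fs = 44_100.0
n0 = 441

def signal(t):
    return 0.8 * np.sin(2 * np.pi * 100.0 * t) + 0.3 * np.cos(2 * np.pi * 2000.0 * t + 0.4)

period = signal(np.arange(n0) / fs)
t_mid = (np.arange(n0) + 0.5) / fs        # midpoints between the samples
max_error = np.max(np.abs(periodic_reconstruction(period, fs, t_mid) - signal(t_mid)))
print(max_error)
```

Because the test signal is exactly periodic in N_0 samples and contains no component at f_S / 2, the reconstruction at the midpoints should agree with the true signal to within rounding error.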
When a priori information about the function V = V(t) is not available, direct use of the truncated Nyquist sum (eq. (3)) leads to the truncation error ε_{TOTAL} given by eq. (4) and (5). The values V[n] in eq. (4 a) and (4 b) can be arbitrary numbers, since we do not have any a priori information about the signal. In the next section we investigate this in greater detail to estimate the possible errors.
Estimation of Truncation Error for General Case
Let us consider for simplicity the "LEFT" error given by eq. (4 a) (the "RIGHT" error is estimated in an identical way). The total error given by eq. (4 a) is a sum of contributions from data points V[n], where n = -1, -2, -3, ..., -∞. The error contribution from the n-th point is given by:
$$ \varepsilon_{n}(t) = V[n] \, \frac{\sin(\pi f_{S} (t - n T_{S}))}{\pi f_{S} (t - n T_{S})} \qquad (7) $$
where n = -1, -2, -3, ... indexes the unknown samples to the left of the time window.
The function sin(x) / x (where x = π f_{S} (t - n T_{S})) is equal to 0 at the time points t = m T_{S}; m = 0, 1, 2, 3, ..., (N-1) (which are the sampling points in our time window). Therefore truncation contributes no error to V(t) at the sampling points. The function sin(x) / x also has its local maxima and minima approximately at the middle points between adjacent sampling points in our time window. This can be shown easily by taking the first derivative of sin(x) / x; the approximation gets better for larger values of x. Therefore from now on we will consider the error at the middle points between samples V[n] in our time window (see Fig. 1):
$$ t = (m + 1/2) \, T_{S}, \qquad m = 0, 1, 2, \dots, (N-2) \qquad (8) $$
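Both properties of sin(x) / x used above (zeros at the sampling points, extrema close to the midpoints) can be checked numerically. This is a small sketch assuming time is measured in sample periods (u = t / T_S) and that the unknown sample is n = -1; the helper name `kernel` and the grid resolution are illustrative choices.

```python
import numpy as np

def kernel(u, n=-1):
    """sin(x)/x weight of sample n at time u = t / T_S, with x = pi (u - n)."""
    return np.sinc(u - n)   # np.sinc(x) = sin(pi x) / (pi x)

m = np.arange(0, 8)

# Weight of the unknown sample V[-1] at the sampling points t = m T_S
at_samples = kernel(m.astype(float))

# Position of the extremum inside each interval [m, m + 1], found on a fine grid
offsets = []
for k in m:
    u = np.linspace(k, k + 1, 100_001)
    offsets.append(u[np.argmax(np.abs(kernel(u)))] - k)

print(np.max(np.abs(at_samples)))
print(offsets)
```

The offsets should lie slightly below 0.5 and move toward 0.5 as m grows, in line with the statement that the midpoint approximation improves for larger x.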
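Substituting time t from eq. (8) into formula (7) we get:
$$ \varepsilon_{n}[m + 1/2] = V[n] \, \frac{\sin(\pi (m + 1/2 - n))}{\pi (m + 1/2 - n)} = V[n] \, \frac{(-1)^{m-n}}{\pi (m + 1/2 - n)} \qquad (9) $$
where m = 0, 1, 2, ..., (N-2) and n = -1, -2, -3, ...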
An interesting question to ask is how large the index m in eq. (9) must be for the absolute value of the error ε_{-1}[m+1/2] to be equal to or smaller than 1/2 of the quantization step Δ, which is defined as the difference between adjacent quantization levels [1,3,8,9]. This is a reasonable comparison, since all samples V[n] carry the quantization error inherent in the process of A/D conversion [1,3,8,9]. Quantization error is uniformly distributed in the range [ -Δ/2 ; Δ/2 ], where Δ is the step size of the A/D converter [1,3,8,9]. From eq. (9) we have, for n = -1 and V[-1] = V_{MAX} = 2^{B-1} (where V_{MAX} is the maximum possible signal amplitude in the A/D converter, expressed in quantization steps Δ, and B is the number of bits in the A/D converter):
$$ | \varepsilon_{-1}[m + 1/2] | = \frac{2^{B-1} \Delta}{\pi (m + 3/2)} \le \frac{\Delta}{2} \quad \Longrightarrow \quad m \ge \frac{2^{B}}{\pi} - \frac{3}{2} \qquad (10) $$
From inequality (10) we get (times are calculated for f_{S} = 44,100 Hz):
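The depth m implied by inequality (10) can be computed directly. The sketch below assumes that the contribution of V[-1] = V_MAX = 2^(B-1) steps at the midpoint m + 1/2 is V_MAX / (pi (m + 3/2)) and requires it to be at most half a step. The function name `min_depth` and the bit depths listed are illustrative and are not necessarily those tabulated in the original results.

```python
import math

def min_depth(bits):
    """Smallest integer m with 2**(bits - 1) / (pi * (m + 3/2)) <= 1/2."""
    return max(0, math.ceil(2 ** bits / math.pi - 1.5))

fs = 44_100.0   # sampling rate used in the text, in Hz
for bits in (8, 12, 16):   # illustrative bit depths (assumption)
    m = min_depth(bits)
    print(f"B = {bits:2d}: m >= {m:6d} samples = {1000 * m / fs:.3f} ms")
```

Under these assumptions a 16-bit converter gives m = 20860 samples, which is roughly 0.47 s at f_S = 44,100 Hz.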
The results in (11) show that if we have no information about the signal before our time window, then in order to avoid errors associated with the unknown values V[-1], V[-2], V[-3], ..., one has to be m samples deep inside the time window. We can then use formula (3) for any time t as long as we stay away from the ends of the time window by m samples ( t ∈ [ m T_{S} ; (N-1-m) T_{S} ] ).
Another interesting question is which sequence of samples V[-1], V[-2], V[-3], ... will generate the largest error at the middle points in our time window. Taking the summation of eq. (9) from n = -1 to n = -∞ we get:
$$ \varepsilon_{LEFT}[m + 1/2] = \sum_{n=-1}^{-\infty} V[n] \, \frac{(-1)^{m-n}}{\pi (m + 1/2 - n)} \qquad (12) $$
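If we take the sequence of samples V[n] = V_{MAX} (-1)^{n}, then we get from eq. (12):
$$ \varepsilon_{LEFT}[m + 1/2] = (-1)^{m} \, \frac{V_{MAX}}{\pi} \sum_{n=-1}^{-\infty} \frac{1}{m + 1/2 - n} \qquad (13) $$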
Unfortunately our choice of { V[n] } in eq. (13) was inappropriate, because the sum of this series diverges to ∞. This happens because this particular series corresponds to the digital representation of a sine wave at the Nyquist frequency f = f_{S} / 2. Such a frequency can't exist in the digital domain, since it can't be recorded via A/D conversion. A realizable choice of samples would be, for example, one which represents a sinusoidal wave with frequency f < f_{S} / 2:
where:
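The contrast between the alternating sequence of eq. (13) and a sinusoidal sequence below the Nyquist frequency can be illustrated with partial sums of the "LEFT" error at a midpoint. This is a sketch under explicit assumptions: the sinusoid is taken as V[n] = V_MAX sin(2 pi f n T_S + phase) with a hypothetical phase of 0 and f = 0.25 f_S, which need not match the paper's exact form, and V_MAX is normalized to 1. The helper names are illustrative.

```python
import numpy as np

def left_partial_sum(past_values, m):
    """Partial sum of the LEFT error at midpoint m + 1/2.

    past_values[k - 1] holds V[-k] for k = 1, 2, ..., K.
    """
    past_values = np.asarray(past_values, dtype=float)
    k = np.arange(1, past_values.size + 1)
    signs = np.where((m + k) % 2 == 0, 1.0, -1.0)   # (-1)**(m - n) with n = -k
    return float(np.sum(past_values * signs / (np.pi * (m + 0.5 + k))))

m = 10_000
v_max = 1.0   # normalized amplitude (assumption)

def alternating(count):
    k = np.arange(1, count + 1)
    return v_max * np.where(k % 2 == 0, 1.0, -1.0)   # V[-k] = V_MAX (-1)**k

def sinusoid(count, f_over_fs, phase=0.0):
    k = np.arange(1, count + 1)
    return v_max * np.sin(-2 * np.pi * f_over_fs * k + phase)   # V[-k]

counts = (10**3, 10**4, 10**5, 10**6)
alternating_sums = [left_partial_sum(alternating(c), m) for c in counts]
sinusoid_sums = [left_partial_sum(sinusoid(c, 0.25), m) for c in counts]
single_term = v_max / (np.pi * (m + 1.5))   # contribution of V[-1] = V_MAX alone

print(alternating_sums)
print(sinusoid_sums, single_term)
```

For the alternating sequence the partial sums keep growing as more terms are included (logarithmically in the number of terms), while for the assumed sinusoid at 0.25 f_S they stay below the contribution of a single full-scale sample V[-1].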
Substituting (14) in eq. (12) we get:
Computer calculations of the error were performed using formula (15) for m = 10,000 (that is, the error at the middle point between the 10,000th and 10,001st samples in the time window). The results were as follows:
where S(10,000) represents the summation of the first 10,000 terms in eq. (15).
The results in (16) show that the maximum error is obtained when the frequency f approaches the Nyquist frequency f_{S} / 2. The error from the first term (V[-1]) is larger than that from the sinusoidal wave as long as the frequency f is smaller than about 0.490 f_{S} (which is 98% of the Nyquist frequency f_{S} / 2). This seems reasonable, since the input anti-aliasing filter would usually eliminate all frequencies above 0.46 f_{S} (see for example data for PCM or DAT recorders). The errors resulting from sinusoidal waves are similar to the error contribution from V[-1]; therefore we expect that the recommendation given in (11) should be valid for any arbitrary sequence V[-1], V[-2], V[-3], ..., since such a sequence is a linear combination of pure tones according to the Fourier theorem. However, further investigation and computer simulations are required to substantiate this claim.
Conclusions
In this paper we investigated the errors due to finite-duration sampling of a continuous signal and determined that this error can be considerable near the beginning and the end of the sampling time window. These errors tend to get larger at higher frequencies, as they approach the Nyquist frequency (f_{S} / 2), for signals near the inside boundaries of the time window. At this time, however, we don't know which physically realizable sequence of samples V[-1], V[-2], V[-3], ... will produce the largest error inside the time window. Further tests and computer simulations are required.
