Digital Recording Techniques

Introduction to Digital Recording Techniques
by Marek Roland-Mieszkowski, M.Sc., Ph.D., Digital Recordings
This paper was published in Canadian Acoustical Association Journal, 1989
Copyright 1989-2014, Digital Recordings. All Rights Reserved.

Content

Introduction
Digital Recording Principles
A Digital Recording/Processing System
Digital Advantages
Digital Formats
Applications
Conclusions
References

Introduction
In the simplest case, the word digital refers to the representation of a quantity in numerical form and analog refers to a continuous physical quantity. To digitize means to convert an analog physical quantity into a numerical value. For example, if we represent the intensity of a sound by numbers proportionally related to the intensity, the analog value of the intensity has been represented digitally. The accuracy of the digital conversion depends upon the number of discrete numerical values that can be assigned and the rate at which these numerical measurements are made. For example, 4 numerical levels will represent changes in the amplitude of sound less accurately than 256 numerical levels, and a rate of 8 conversions/sec will be less accurate than a rate of 10,000 conversions/sec.
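The effect of the number of levels can be illustrated with a short sketch. It is only an illustration under stated assumptions: the test signal (a 1 Hz sine in the range -1 to 1), the helper names quantize and rms_error, and the choice of evenly spaced levels are not taken from the paper.

```python
import math

def quantize(x, levels, lo=-1.0, hi=1.0):
    """Map x in [lo, hi] to the nearest of `levels` evenly spaced values."""
    step = (hi - lo) / (levels - 1)
    index = round((x - lo) / step)
    index = max(0, min(levels - 1, index))
    return lo + index * step

def rms_error(levels, rate, freq=1.0, duration=1.0):
    """RMS difference between a sine wave and its quantized samples."""
    n = int(rate * duration)
    total = 0.0
    for k in range(n):
        x = math.sin(2 * math.pi * freq * k / rate)
        total += (x - quantize(x, levels)) ** 2
    return math.sqrt(total / n)

coarse = rms_error(levels=4, rate=10_000)
fine = rms_error(levels=256, rate=10_000)
print(f"4 levels:   RMS error = {coarse:.4f}")
print(f"256 levels: RMS error = {fine:.6f}")
```

The sketch varies only the number of levels; the effect of the conversion rate is treated under Discrete Time.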
The process for digitally coding sound by computer was first developed in 1957 by Max Mathews of Bell Telephone Laboratories in Murray Hill (Mathews, 1963). Other advances in digital electronics and microchips led to the development of the first digital Pulse Code Modulation (PCM) audio recorder in 1967 at the NHK Technical Research Institute (Nakajima, 1983). This machine was a 12-bit companded scheme (using a compression/expansion of sound to improve dynamic range) with a 30 kHz sampling rate. Data were recorded on a one-track, two-head helical scan VTR (Video Tape Recorder). The first commercial PCM/digital recording session was performed by DENON in 1972 (Takeaki, 1989).

Digital Recording Principles
During digital recording of the analog signal, analog to digital (A/D) conversion takes place from continuous time-amplitude coordinates to discrete time-amplitude coordinates as illustrated in Figure 1. The difference between the instantaneous analog signal and the digital representation is the digital error.
[...]

Figure 1: Use of an A/D (or D/A) converter to convert a continuous function (time-amplitude) to a discrete function (discrete time - discrete amplitude). Conversion introduces a digital error in the signal - digital noise.

We will separately consider the consequences of discrete time and discrete amplitude coordinates on the representation of the analog signal.
Discrete Time
Nyquist theorem. The Nyquist theorem states that if a signal V(t) does not contain frequencies higher than f_s/2 (where f_s = 1/T_s), then it can be fully recovered from its sampled values V(nT_s) at discrete times t_n = nT_s, where n = ..., -1, 0, 1, 2, 3, ...

[...]
(1)

where:

f_s = 1/T_s, the sampling frequency

V(t) = value of signal at arbitrary time t.

This is a remarkable result. The recovered signal will have all the frequencies in the range from 0 to f_s/2 Hz.
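The recovery of a signal from its samples can be sketched numerically. Equation (1) is not reproduced in this text, so the sketch below assumes the standard Whittaker-Shannon (sinc) interpolation formula, V(t) = sum over n of V(nT_s) sinc((t - nT_s)/T_s), and truncates the infinite sum to a finite record. The sampling frequency, tone frequency and function names are illustrative choices.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, fs, t):
    """Estimate V(t) from samples V(n / fs), n = 0 .. len(samples) - 1.

    The theorem uses an infinite sum; a finite record gives an approximation
    that is best near the middle of the record.
    """
    return sum(v * sinc(fs * t - n) for n, v in enumerate(samples))

fs = 1000.0   # sampling frequency in Hz (example value)
f0 = 50.0     # tone frequency in Hz, well below fs / 2
samples = [math.sin(2 * math.pi * f0 * n / fs) for n in range(2000)]

t = 1.00037   # an instant between two sample times, mid-record
estimate = reconstruct(samples, fs, t)
exact = math.sin(2 * math.pi * f0 * t)
print(f"reconstructed = {estimate:.4f}, exact = {exact:.4f}")
```

The small residual difference comes from truncating the sum, not from the sampling itself.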
Discrete Amplitude
The term bit stands for binary digit and is associated with a two-choice situation (0 and 1). Thus, any digital system with just two levels has a 1 bit resolution. Generally, the logarithm to the base 2 is used to convert the number of available quantization levels to the number of bits. A device with two stable positions, such as a relay or a flip-flop, can store 1 bit of information. N such devices can store N bits of information, because the total number of possible states is 2^N and the amount of information is equal to log₂(2^N) = N bits (Shannon, 1949/1975). Thus, 4 levels is 2 bits, 8 is 3 bits, 16 is 4 bits, etc. For an N-bit A/D or D/A converter

No. of levels = 2^N (2)

Example:

N = 8: No. of levels = 256
N = 12: No. of levels = 4,096
N = 16: No. of levels = 65,536
N = 20: No. of levels = 1,048,576

When a voltage amplitude from 0 to V_max is used (for example from 0 to 1 Volt), then one quantization step Δ will be:

Δ = V_max / No. of levels = V_max/2^N (3)

At an adequately high level and complexity of input signal V(t), the digital error (difference between analog signal and stored digital value) from sample to sample will be statistically independent and uniformly distributed in the range of [-Δ/2, Δ/2], where Δ is the step size in the A/D converter.
Thus, the maximum Signal-to-Noise Ratio (S/N) in decibels can be calculated to be (Nakajima, 1983; Mieszkowski, 1987):

S/N = 20 log (V_{SIGNAL RMS} / V_{NOISE RMS}) = 6.02 N + 1.76 [dB] ≈ 6 N [dB] (4)
for all practical purposes.
Thus, converter resolutions of 8, 12, 16, and 20 bits would allow S/N ratios of approximately 48, 72, 96, and 120 dB, respectively.
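Equation (4) can be checked numerically. The sketch below quantizes a full-scale sine wave and compares the measured S/N with 6.02 N + 1.76 dB. The quantizer design (a mid-rise quantizer over the range -1 to 1), the test frequency and the function names are assumptions made for illustration, not details from the paper.

```python
import math

def theoretical_snr_db(bits):
    """Equation (4): maximum S/N of an N-bit converter for a full-scale sine."""
    return 6.02 * bits + 1.76

def measured_snr_db(bits, n_samples=50_000):
    """Quantize a full-scale sine to `bits` bits and measure the resulting S/N."""
    half = 2 ** bits // 2
    step = 1.0 / half                      # 2^bits steps cover the range [-1, 1)
    signal_power = 0.0
    noise_power = 0.0
    for k in range(n_samples):
        # An irrational frequency ratio spreads the samples over all phases.
        x = math.sin(2 * math.pi * math.sqrt(2) * k / 100)
        index = max(-half, min(half - 1, math.floor(x / step)))
        q = (index + 0.5) * step           # reconstruct at the centre of the step
        signal_power += x * x
        noise_power += (x - q) ** 2
    return 10 * math.log10(signal_power / noise_power)

results = {bits: (theoretical_snr_db(bits), measured_snr_db(bits)) for bits in (8, 12, 16)}
for bits, (theory, measured) in results.items():
    print(f"{bits:2d} bits: theory {theory:6.2f} dB, measured {measured:6.2f} dB")
```

The measured values depend slightly on the test signal, which is why the paper's rounded figure of 6 dB per bit is adequate in practice.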

A Digital Recording/Processing System
A block diagram of a digital recording/processing system is shown in Figure 2. The processes at each of the numbered blocks 1 to 7 are described below:

[...]

Figure 2: Block diagram of digital recording/processing system. Both sources of noise [N₁(t), N₂(t)] are needed in order to avoid digital distortions of the signal V(t) in the form of coherent noise N_D(t). Properly chosen N₁(t) and N₂(t) add only a little noise to the output, but remove coherence of N_D(t) (digital noise) with the signal V(t).

Following Nakajima (1983), Mieszkowski (1989) and Wannamaker, Lipshitz and Vanderkooy (1989), analog dither must be added to the input signal in order to
a) linearize the A/D converter
b) make possible improvement of S/N by an averaging process according to the formula:

(S/N)_{after averaging} = (S/N)_{before averaging} × n^{1/2} (5)

where: n = No. of averaged signals
c) eliminate harmonic distortions (created when digital noise N_D(t) is coherent with signal V(t)).
d) eliminate intermodulation distortion (also created when digital noise N_D(t) is coherent with signal V(t)).
e) eliminate "digital deafness" (when the signal V(t) falls below Δ, where Δ is the step size in the A/D converter, the signal will not be recorded at all unless there is a noise N₁(t) on the input).
f) eliminate noise modulation by the signal
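Items b) and e) of the list above can be demonstrated with a small simulation. This is a sketch under stated assumptions: the dither is taken to be uniformly distributed over one quantization step, the quantizer rounds to the nearest step, and the signal amplitude of 0.3 steps and all names are illustrative only.

```python
import math
import random

STEP = 1.0  # quantization step size, arbitrary units

def quantize(x):
    """Round to the nearest multiple of STEP."""
    return STEP * round(x / STEP)

def averaged_output(amplitude, n_passes, dither, seed=0):
    """Quantize n_passes copies of a small sine and average them sample by sample."""
    rng = random.Random(seed)
    n = 64
    out = [0.0] * n
    for _ in range(n_passes):
        for k in range(n):
            x = amplitude * math.sin(2 * math.pi * k / n)
            # Assumed dither: uniform noise spanning one quantization step.
            noise = rng.uniform(-STEP / 2, STEP / 2) if dither else 0.0
            out[k] += quantize(x + noise) / n_passes
    return out

amplitude = 0.3 * STEP
reference = [amplitude * math.sin(2 * math.pi * k / 64) for k in range(64)]

undithered = averaged_output(amplitude, 1, dither=False)
dithered = averaged_output(amplitude, 4000, dither=True)

print("peak of undithered output:", max(abs(v) for v in undithered))
print("max error after averaging dithered passes:",
      max(abs(a - b) for a, b in zip(dithered, reference)))
```

Without dither the small signal is lost entirely. With dither each single pass is noisy, but averaging many passes recovers the waveform, with the residual noise shrinking roughly as the square root of the number of passes, in line with equation (5).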

The input low pass filter (antialiasing filter) should eliminate all frequencies above f_s/2, where f_s = sampling frequency, in order to avoid aliasing distortion (folding of frequencies into the passband: f_new = f_s - f_original, where f_original > f_s/2).
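The folding relation can be written as a small function. The version below is a sketch that also handles inputs above f_s by first reducing the frequency modulo f_s, a generalization that is not stated in the text; the function name and the example frequencies are illustrative.

```python
def alias_frequency(f_original, fs):
    """Frequency at which a tone appears after sampling at fs with no antialiasing filter."""
    f = f_original % fs          # sampling cannot distinguish f from f + k * fs
    return fs - f if f > fs / 2 else f

fs = 44_100
for f in (10_000, 30_000, 43_100):
    print(f"{f} Hz sampled at {fs} Hz appears at {alias_frequency(f, fs)} Hz")
```

A 30,000 Hz tone sampled at 44,100 Hz therefore folds to 14,100 Hz, inside the audio passband, which is why it must be removed before conversion.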

A/D converter converts analog signal into a digital number (for example, 10110110 represents a binary coded 8-bit amplitude). Sampling speeds range from 2 kHz to 10 GHz and amplitude resolution ranges from 4 bits to 20 bits.

If DSP is performed on the signal, one must add digital dither N₂(t) (box 5) to avoid digital distortions and coherent noise N_D(t) on the output of the D/A converter. Digital processing should also be performed using sufficiently precise real numbers to avoid round-off errors.
Storage of digital data can be performed on magnetic tape, optical disk, magnetic disk, or RAM (Random Access Memory). Prior to storage, extra code is generated to allow for error correction. This error correction code allows detection and correction of errors during playback of the audio signal. Redundant information must be added to the original signal in order to combat noise inherent in any storage/communication system. The particular type of code and error correction system depends on the storage medium, the communication channel used, and the required immunity from errors (an arbitrarily small probability of error can be obtained; Nakajima, 1983; Shannon, 1949/1975).
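The principle of adding redundancy so that errors can be corrected on playback can be shown with the simplest possible code, a three-fold repetition code with majority voting. This is only an illustration of the idea; it is not the code used by any of the recording systems discussed here, which rely on far more efficient schemes. All names below are illustrative.

```python
def encode(bits):
    """Repeat every bit three times."""
    return [b for b in bits for _ in range(3)]

def decode(coded):
    """Majority vote over each group of three stored bits."""
    return [1 if sum(coded[i:i + 3]) >= 2 else 0 for i in range(0, len(coded), 3)]

data = [1, 0, 1, 1, 0, 1, 1, 0]      # the 8-bit amplitude 10110110
stored = encode(data)
stored[4] ^= 1                        # simulate a bit error on the storage medium
stored[21] ^= 1                       # and a second error in a different group
recovered = decode(stored)
print("recovered correctly:", recovered == data)
```

The code corrects one error per group of three bits at the cost of tripling the storage demand, which illustrates the trade-off between redundancy and immunity from errors.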

Prior to D/A conversion, digital dither must be added to the numbers representing the amplitude of the signal if DSP has been performed. Optimal digital dither has a triangular probability density function (PDF) (Wannamaker et al., 1989).
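Dither with a triangular PDF can be generated as the sum of two independent uniformly distributed random numbers. The sketch below applies such dither when shortening a word from 16 to 8 bits; the word lengths, the sample value and the function names are assumptions chosen for illustration.

```python
import random

def tpdf_dither(rng, lsb=1.0):
    """Triangular-PDF dither: sum of two independent uniform values, range -lsb to +lsb."""
    return rng.uniform(-lsb / 2, lsb / 2) + rng.uniform(-lsb / 2, lsb / 2)

def requantize(sample, shift, rng):
    """Reduce the word length by `shift` bits, adding TPDF dither before rounding."""
    lsb = 1 << shift                       # size of the new LSB in old units
    return round((sample + tpdf_dither(rng, lsb)) / lsb)

rng = random.Random(1)
value = 1000                               # e.g. a 16-bit result of DSP
outputs = [requantize(value, 8, rng) for _ in range(20_000)]
mean = sum(outputs) / len(outputs) * 256   # back in 16-bit units

print("without dither:", round(value / 256) * 256)
print("mean with TPDF dither:", round(mean, 1))
```

Without dither the value 1000 is always rounded to 1024, a fixed error that is coherent with the signal. With dither the individual outputs vary, but their mean stays close to the original value, so the error behaves as noise rather than distortion.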

D/A converter converts digital numbers into analog signal. Available conversion speeds are 2 kHz to 200 MHz and available amplitude resolution is 4 bits to 20 bits.

Output low pass filter should eliminate all frequencies above f_s /2 which are generated during D/A conversion.


Digital Advantages
Table I summarizes the author's comparison of a studio quality reel-to-reel analog tape recorder with a 16 bit digital recorder. These data are derived from specifications by various manufacturers of analog and digital audio products. This table implies that the digital recorder has many advantages over its analog counterpart. Performance of the analog recorder depends very much on the calibration and tape used, as well as on environmental conditions such as temperature and humidity. This is not the case for a digital recorder, as long as the errors generated are within the limits of error correctability of the particular device.

| Parameter | Reel to Reel Tape Recorder | 16 Bit Digital Recorder |
|---|---|---|
| S/N Ratio | 65 dB (linear system) | 93 dB (linear system) |
| Total harmonic distortion | 0.2% | 0.005% |
| Wow and Flutter | 0.03% | Unmeasurable (quartz accuracy) |
| Frequency Response | 30-20,000 Hz (3 dB) | 0-20,000 Hz (0.5 dB) |
| Loss of S/N during copying | 3 dB | None (perfect copy), as long as done in digital domain |
| Deterioration over time | Yes | None (as long as within limits of correctability of the system) |

Table I - Comparison of analog reel to reel recorder with 16 bit digital recorder (any type).


Digital Formats
Common coding systems
Below is a short list of commonly used digital coding algorithms (using as an example a single channel digital recording system with sampling frequency f_s = 44,100 Hz and 16 bit A/D and D/A conversion). The data compression algorithms, which are more efficient than PCM (use less storage space), preserve the information content of the signal. Not mentioned here are data reduction/compression algorithms, which reduce the information content of the original signal (arbitrarily or on the basis of psychoacoustics research results).
PCM - Pulse Code Modulation. PCM was invented by A.H. Reeves in 1939 (American Patent 2272070, 1942-2; see Nakajima, 1983) and was analyzed and developed as a modulation system from the point of view of communication theory by C.E. Shannon (1949). Using only two alternative pulse values (0 and 1), a 16-pulse train is generated which indicates the sampled value (for example, 1010 1111 0110 1101, a binary coded 16 bit number). During conversion, 16 bit amplitudes A1, A2, A3 ... are generated at a rate of 44,100/sec. The demand on the storage device and speed of transmission channel is 88,200 Bytes/sec. This is a 'brute force' approach, which is not the most effective way of using the storage device and transmission channel.
DPCM - Differential Pulse Code Modulation. During conversion only 4 bit (for example) differences between consecutive amplitudes are generated (A2-A1), (A3-A2), (A4-A3) ... at the rate of 44,100 /sec. Demand on the storage device and speed of transmission channel is 22,050 Bytes/sec.
ADPCM - Adaptive Differential Pulse Code Modulation. Depending on the signal, the number of available bits to represent the difference between consecutive 16 bit samples is varied. For example, for the case of total quiet at the input (or small signal) the difference could be switched off totally or represented only by 1 bit. Demand on the storage device and the speed of transmission channel could vary between 0 Bytes/sec and 88,200 Bytes/sec depending on signal complexity. This is probably the most effective way of coding. Similar means of coding could be used for video signals because there is not much change from frame to frame most of the time.
ΔM - Delta Modulation. During coding only 1 bit differences between consecutive amplitudes are generated at a high conversion speed, indicating whether the signal increased or decreased (from the previous sample). Demand on the storage device and the speed of transmission channel is very high in comparison to the PCM system for the same quality of signal (Nakajima et al., 1983).
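The storage demands quoted for PCM and DPCM, and the idea of storing differences instead of amplitudes, can be sketched as follows. The encoder below clamps each difference to the available 4 bits and tracks the value the decoder will reconstruct, which is a common refinement and an assumption here; the function names and sample values are illustrative.

```python
def bytes_per_second(bits_per_sample, fs=44_100):
    """Storage demand of a single channel at sampling frequency fs."""
    return bits_per_sample * fs / 8

def dpcm_encode(samples, bits=4):
    """Encode each sample as a clamped difference from the decoder's running value."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1   # -8 .. 7 for 4 bits
    codes, predicted = [], 0
    for s in samples:
        d = max(lo, min(hi, s - predicted))
        codes.append(d)
        predicted += d          # track what the decoder will reconstruct
    return codes

def dpcm_decode(codes):
    """Rebuild the amplitudes by accumulating the differences."""
    out, value = [], 0
    for d in codes:
        value += d
        out.append(value)
    return out

print("PCM, 16 bit: ", bytes_per_second(16), "Bytes/sec")
print("DPCM, 4 bit: ", bytes_per_second(4), "Bytes/sec")

samples = [0, 3, 7, 12, 15, 14, 10, 5, 1, -2]
codes = dpcm_encode(samples)
print("codes:", codes)
print("lossless here:", dpcm_decode(codes) == samples)
```

The round trip is exact only while every difference fits in 4 bits. A faster-changing signal would be clamped and reconstructed with an error, which is the situation ADPCM addresses by varying the number of bits.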
Recording/Storage Systems
Listed below are current common recording/storage systems for digital audio data.
PCM unit + VCR recorder - 2 and 4 channels.
These are professional and semi-professional systems with 14 bit or 16 bit resolution and 44,056 Hz or 44,100 Hz sampling frequency. The PCM signal is stored on video tape in pseudo-video format. Most of the early systems were of this type.
DASH (Digital Audio Stationary Head Recorder)
This is a professional 16 bit system with up to 48 tracks. Available are 44,056, 44,100 and 48,000 Hz sampling frequencies.
R-DAT (Rotating Head Digital Audio Tape Recorder)
This is a professional and consumer 2-channel system with 16 bit resolution and 32,000, 44,056, 44,100, 48,000 Hz sampling frequencies.
Magnetic Hard Disk and RAM (Random Access Memory) based Recorders.
These are computer based professional and semi-professional recording systems having from 1 to 24 tracks. The resolution is from 8 to 18 bits. Sampling frequencies are from 2 kHz to 250 kHz. The computers may be common microcomputers as well as main-frame computers. They offer the highest flexibility in terms of digital editing of stored sound and in the author's opinion are the trend of the future.
Optical WMRM (Write Many Read Many), Erasable Optical Disk based Recorders.
This format is becoming popular for audio applications because the removable optical cartridge can store about 600 MBytes of data and is more robust than magnetic media. Writing and reading is done by laser without physical contact with the disk. The NeXT computer has the first commercially available optical disk drive with 256 MBytes capacity (Thompson and Baran, 1988). Also, Nakamichi recently showed a working prototype of an optical disk recorder, similar to a CD player, during the AES 7th International Conference (Mascenik, 1989).

Applications
Digital techniques for storing and transmission of audio signals are attractive because they offer high quality signals, which do not deteriorate with transmission distance, number of copies or time. Digital information when properly stored and transmitted maintains its 100% integrity in contrast to analog information which deteriorates during each transmission and storage cycle.
DSP is also far more powerful than ASP (Analog Signal Processing). First, the quality of the signal is maintained during DSP. Second, most DSP devices are very flexible because one can run many different applications on the same hardware by changing the software. Analog devices are devoted to particular tasks and are not as flexible. Third, digital signal processing can perform operations impossible in the analog domain.
Some of the functions which could be performed by DSP devices are: filtering, equalization, compression/expansion of dynamic range, time compression/expansion, delay, reverberation, pitch change, generation of arbitrary signal or noise, music and voice synthesis, noise reduction, signal restoration, automatic pattern and voice recognition, time-reverse, noise gate, automatic gain control, mixing of signals, and FFT (Fast Fourier Transform).
In recent years DSP units have become relatively affordable. Also, there are many products available as plug-in cards for popular microcomputers, which contain DSP chips from such manufacturers as Motorola or Texas Instruments. DSP systems based on microcomputers are relatively fast (but not as fast as dedicated hardware) and very flexible.

Conclusions
The future of digital recording and DSP looks very bright. Higher speeds of microprocessors and DSP chips make real time applications of even complex algorithms realistic. Falling prices of RAM chips and storage devices like the erasable optical disk make them affordable for many researchers and musicians.

In the author's opinion it is almost certain that the majority of future recording and DSP equipment will be based on microcomputers. Storage media of the future will probably be erasable optical disks and RAM cards. With falling prices of RAM chips and already available 4 Mbit chips in a single package, one can expect portable RAM based ADPCM recorders to replace mechanically complex R-DAT machines in the near future.


References

Anazawa Takeaki, et al. (1989). A Historical Overview of the Development of PCM/Digital Recording Technology at DENON. AES 7th International Conference, 'Audio in Digital Times', Toronto, May 14-17.
Mascenik, S. (1989). A Magnetoptical Disk Digital Audio Recorder. AES 7th International Conference, 'Audio in Digital Times', Toronto, May 14-17.
Mathews, M. V. (1963). The digital computer as a musical instrument. Science, 142, 553-557.
Mieszkowski, M. (1987). Commercially Available Hardware and Software for Data Acquisition and Analysis. Invited Paper, 'Data Acquisition Seminar/Workshop', Technical University of Nova Scotia, Continuing Education Division, November 6-17. Manuscript available from author.
Nakajima, H., et al. (1983). The Sony Book of Digital Audio Technology. TAB Books Inc.
Shannon, C. E. (1949/1975). The mathematical theory of communication. Urbana: University of Illinois Press (1st published, 1948).
Wannamaker, R. A., Lipshitz, S. P., and Vanderkooy, J. (1989). Dithering to Eliminate Quantization Distortion. Halifax, N.S., Canada, October 16-19 (see these Proceedings, 78-86).

