PS AUDIO: “So for a moment let’s imagine a digital audio scheme that has no practical limits in loudness”

 

Paul McGowan writes: I wrote a long post about a bit of high-end history covering the invention of the separate DAC for high-end audio. I promised another about a future idea that might be interesting. First, a primer.

A DAC is a digital to analog converter – its name describing exactly what it does. The beginning of this chain is called an A to D converter (analog to digital) which does the opposite. The idea is to convert the audio (continuous stream) from a microphone into something a computer can understand: bits. To do this, the continuous audio is broken down into particles or discrete quanta that are then stored by optical means (CD) or magnetic means (hard drive). The DAC reverses this process.

Digital audio has limits – depending on the number of bits. This has always been a problem for me because in real life there are no limits: sounds can get as loud or soft as they do without restriction.

Some of my colleagues will argue that these limits are meaningless since 24 bit audio has a dynamic range of 144 dB which far exceeds analog and, for that matter, human hearing (and 32 bit audio even higher).
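As a rough check on those figures, here is a minimal sketch that assumes plain integer (linear PCM) samples, where dynamic range is taken as the ratio between full scale and a single quantization step. The function name `dynamic_range_db` is made up for this illustration.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of integer PCM: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)

# Each extra bit adds roughly 6.02 dB.
for bits in (16, 24, 32):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```

For 24 bits this comes out at about 144.5 dB, which matches the figure quoted above.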

While the numbers are correct, I would argue that most of the dynamic range available is unusable because it is far below what we can hear – and the usable range is not much more than our ability to hear. Certainly we could manage to use what dynamic range we have available better, but currently that’s not the case.

So for a moment let’s imagine a digital audio scheme that has no practical limits in loudness. If we had a microphone that could capture everything from the quietest sounds to the loudest (we don’t), such a system could record it all and play it back.

What I like to call the “Vector DAC” is such a system: it has basically unlimited dynamic range, and its recordings could be compressed or expanded without degradation. The idea for it came to me through photography.

All photography (digital or film) is much like today’s DACs – based on discrete quanta or bits. In film it is called grain (actual grains of silver) and in digital photography it’s called pixels. Look too closely and what you see is not a picture, but bits (like Legos) that when viewed from a distance fool us into believing they are smooth and continuous.

You cannot scale a photo up or down without degrading the original because the bits get messed up. Same with audio – compressed you lose and expanded you lose.
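One way to see the audio half of that claim is to turn integer samples down and then back up. This is only a sketch under stated assumptions: the sample values and the helper `attenuate_then_restore` are invented for illustration, and the loss shown comes from rounding to whole numbers at the quieter level.

```python
def attenuate_then_restore(samples, factor):
    """Divide integer samples by `factor`, round to integers, then multiply back."""
    quiet = [round(s / factor) for s in samples]
    return [q * factor for q in quiet]

original = [1000, 1001, 1002, 1003, -517, 12345]  # hypothetical 16-bit sample values
restored = attenuate_then_restore(original, 16)
errors = [r - o for r, o in zip(restored, original)]

# For comparison: the same round trip kept in floating point, with no rounding step.
float_restored = [(s / 16) * 16 for s in original]

print(restored != original)        # the integer round trip changed the samples
print(float_restored == original)  # the floating-point round trip did not
```

Neighbouring values such as 1001, 1002 and 1003 collapse onto the same number after the integer round trip, which is the audio equivalent of a photo losing detail when it is shrunk and enlarged again.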

Then we learned about vector-based imagery. In the late 1980s a company called Adobe introduced a program called Illustrator, and this is where many of us first learned vector graphics. Unlike pixel-based systems, vector graphics can be scaled up or down without degradation of any kind – bigger or smaller – it’s all the same.

Vectors work by the computer recording a vector: a point together with an angle and a length – basically a mathematical description of a line. The line is going in this direction (angle), and continues for a certain length. Using this plus a few other parameters – color, width, even speed – we can describe just about anything and have a complete scalable model of it.

Why not apply this to audio? After all, an analog signal can be represented by the angle, duration and speed of its movement and if you know all that, you know where it is at any one time. There are no bits or discrete quanta that cannot be scaled.

In the Vector DAC idea we simply record the vectors and related data which are completely scalable without degradation.
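To make the idea concrete, here is one possible reading of it as a sketch, not a worked-out design. It assumes the signal is stored as straight-line segments, each with a duration and a slope (standing in for the “angle”). The names `Segment`, `value_at` and `scale_amplitude` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    duration: float  # how long the segment lasts, in seconds
    slope: float     # amplitude change per second (the "angle")

def value_at(start, segments, t):
    """Amplitude at time t, walking the segments from the starting level."""
    level = start
    for seg in segments:
        if t <= seg.duration:
            return level + seg.slope * t
        level += seg.slope * seg.duration
        t -= seg.duration
    return level  # past the end: hold the final level

def scale_amplitude(segments, gain):
    """Louder or softer: only the slopes change, the timing stays the same."""
    return [Segment(s.duration, s.slope * gain) for s in segments]

# A hypothetical triangle pulse: up for one second, down for one second.
triangle = [Segment(1.0, 2.0), Segment(1.0, -2.0)]
louder = scale_amplitude(triangle, 3.0)
print(value_at(0.0, triangle, 0.5), value_at(0.0, louder, 0.5))
```

Because the description is a formula rather than a table of samples, the level can be asked for at any instant and at any gain. How well straight segments would capture real music is a separate question that this sketch does not address.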

I think this would be a real breakthrough. Now, if only we had the time and resources.

Paul McGowan

2 thoughts on “PS AUDIO: “So for a moment let’s imagine a digital audio scheme that has no practical limits in loudness””

  1. Is this not a variation on MP3 and lossy compression? The idea is that you find an equation that fits the sampled data points and store/stream the coefficients of that equation, rather than the values of the data points themselves. At the other end, the software puts the coefficients into the equation and reconstitutes sampled data points from the calculation.

    Many forms of equation or vector could be used, but by representing the waveform in the frequency domain, which is similar to how the ear responds to sound, there is an opportunity to discard those coefficients which we deem contribute the least to the sound – but of course we needn’t throw away any of the coefficients if we don’t want to. The frequency domain representation can also be used for frequency-selective filtering which has many applications in audio.

    If the final output has to come from some form of DAC (just like a computer graphics image), and we are wanting to simply scale the amplitude of the signal, then I don’t see what advantage the vector system gives us over simply multiplying sample points in a suitably high resolution numerical format, as ultimately the output has to be quantized by the DAC.

    I would say that a more fitting analogy with graphics is when we want to dilate or compress the waveform with respect to time, rather than amplitude. So if we wish to shift the pitch or duration of the sound, then it makes perfect sense to represent it in a vector form. However, this is not usually a requirement for straightforward audio reproduction.

  2. And as I pressed ‘Submit’ of course I also realised I had forgotten to mention resampling of audio at an arbitrary sample rate, which is a variation on time/pitch shifting, and for which we also use a vector (frequency domain) representation.
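The coefficient idea raised in these comments can be sketched with a plain discrete Fourier transform. This is a stand-in chosen for brevity, not how MP3 itself works (MP3 uses a different transform plus a psychoacoustic model). The test signal and the helper names `dft`, `inverse_dft` and `keep_largest` are assumptions made for this illustration.

```python
import cmath
import math

def dft(samples):
    """Frequency-domain coefficients of a block of samples."""
    n = len(samples)
    return [sum(samples[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n))
            for f in range(n)]

def inverse_dft(coeffs):
    """Rebuild real-valued samples from the coefficients."""
    n = len(coeffs)
    return [(sum(coeffs[f] * cmath.exp(2j * math.pi * f * k / n) for f in range(n)) / n).real
            for k in range(n)]

def keep_largest(coeffs, count):
    """Zero all but the `count` largest-magnitude coefficients (the lossy step)."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:count])
    return [c if i in keep else 0 for i, c in enumerate(coeffs)]

n = 32
# Hypothetical signal: a strong tone plus a much weaker one.
samples = [math.sin(2 * math.pi * 3 * k / n) + 0.01 * math.sin(2 * math.pi * 7 * k / n)
           for k in range(n)]

coeffs = dft(samples)
exact = inverse_dft(coeffs)                    # keep every coefficient
lossy = inverse_dft(keep_largest(coeffs, 2))   # keep only the strong tone

exact_error = max(abs(a - b) for a, b in zip(exact, samples))
lossy_error = max(abs(a - b) for a, b in zip(lossy, samples))
print(exact_error, lossy_error)
```

Keeping every coefficient reproduces the samples to within floating-point rounding, while dropping the small ones loses only the weak tone. Either way the result is a set of sample points again, which still has to pass through a DAC.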
