SA-CD.net - Playback Disappointment in Linear PCM Recording Systems

Post by Fitzcaraldo215 November 7, 2011 (11 of 15)

We cannot get to the full paper, of course. But conversion from DSD to PCM may well be problematic. The two formats represent the signal in entirely different ways, as we know. It may be a false assumption, given how different the two formats are, but let's assume we need at least equal bandwidth or bit rate for the two to sound nearly equal after conversion. If so, then the common 88K or 96K does not have a bit rate equivalent to SACD. If we do the math, these PCM resolutions come up somewhat short:

DSD = 2.8224 MHz x 1-bit/channel = 2,822,400 bits/sec.
88k PCM = 88K x 24 bits/channel = 2,112,000 bits/sec.
96K PCM = 96K x 24 bits/channel = 2,304,000 bits/sec.
176K PCM = 176K x 24 bits/channel = 4,224,000 bits/sec.
192K PCM = 192K x 24 bits/channel = 4,608,000 bits/sec.
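
As a rough check on the arithmetic above, here is a minimal sketch that recomputes the per-channel bit rates. It assumes the nominal rates behind the shorthand (88.2 kHz and 176.4 kHz rather than 88K and 176K), so two of the totals come out slightly higher than the rounded figures listed. The function name and the table layout are illustrative only.

```python
# Raw per-channel bit rate = sample rate (Hz) x bits per sample.
# Assumption: "88K" and "176K" stand for the nominal 88.2 kHz and 176.4 kHz.

def bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Raw (uncompressed) bits per second for one channel."""
    return sample_rate_hz * bits_per_sample

# (sample rate in Hz, bits per sample)
FORMATS = {
    "DSD64":         (2_822_400, 1),   # 64 x 44.1 kHz, normal SACD rate
    "DSD128":        (5_644_800, 1),   # 128 x 44.1 kHz, double rate
    "PCM 88.2k/24":  (88_200, 24),
    "PCM 96k/24":    (96_000, 24),
    "PCM 176.4k/24": (176_400, 24),
    "PCM 192k/24":   (192_000, 24),
}

def table() -> dict:
    """Map each format name to its raw per-channel bit rate."""
    return {name: bit_rate(*spec) for name, spec in FORMATS.items()}

if __name__ == "__main__":
    rates = table()
    dsd64 = rates["DSD64"]
    for name, bps in rates.items():
        # Last column shows each rate as a multiple of the DSD64 rate.
        print(f"{name:14s} {bps:>10,d} bits/sec  ({bps / dsd64:.2f} x DSD64)")
```

Either way the conclusion is the same: 24-bit PCM at 88.2K or 96K carries fewer raw bits per second than 2.8 MHz DSD, while 176.4K and 192K carry more.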

The Linn site offers downloads of some DSD material converted to both 96K and 192K, the latter in stereo only and at higher cost. A friend and I have compared them sonically, but not blind or double blind. The Oppo BDP-93 we used always converted DSD to 88K PCM, feeding an Anthem D2V, which could not accept DSD input. The 96K download was essentially indistinguishable from the silver disk converted at 88K. But we felt the 192K download sounded somewhat better, somewhat truer to our sense of live performance. This is, of course, something anyone can try for themselves and reach their own conclusions. Some 2L SACD/PCM 2-disk packages offer the same opportunity for comparison. Incidentally, Linn believes that sticking with even multiples of 44K in the conversion is unnecessary.

In any case, I would like to see players convert DSD to PCM at higher resolution than the prevalent 88K. But, at the end of the day, I believe the difference is still small. The payoff from ever higher resolutions appears to me to have sharply diminishing returns, considering the cost of the added bandwidth.

Of course, if the author is using double the normal SACD DSD bit rate, then converting at 96K might exhibit somewhat larger differences from the original, though likely still small in my view. On the other hand, the purist in me would have no objection to consumer products and media operating at DXD resolution, and I am sure we will see them someday.

Post by tailspn November 7, 2011 (12 of 15)

AmonRa said:

Says who? To me it sounds like 5.6 MHz DSD was used in a null test to compare before and after mix down samples done in 24/96 PCM.

Or have you read the paper? (I have not)

Not being a current AES member, I have not read the document. It's not the normal AES preprint, which is easy to purchase as a non-member. John Bailey mentions 5.6MHz DSD being used in his null testing, but his Sept 30 post, which johnbs linked previously here, mentions 2.8MHz DSD null testing.

I cannot comment on the sonics of 5.6MHz vs 2.8MHz DSD (128fs vs 64fs), but engineers who have used it state there is a discernible difference. But I understand it's nowhere near the magnitude of the difference I've heard numerous times between 88.2kHz or 96kHz PCM and 64fs DSD, recorded simultaneously from analog through separate PCM and DSD analog to digital converters. I'm talking about the unprocessed monitor mix, prior to any degradation caused by mastering. You can prove this yourself with a Korg MR2000-S or a Tascam DV-RA1000HD.

Post by Beetle November 7, 2011 (13 of 15)

RWetmore said:

I wonder why the choice of 5.6 Mhz? Isn't 2.8 Mhz more comparable to 96khz/24bit PCM?

Hi Guys,

Just popping my head in here to say 'Hi' and address a couple of important points.

First of all, I had originally done all of my test files at 5.6MHz on a Korg MR2000S recorder (at the time I submitted the synopsis), and ended up having to re-lay all of the test files at 2.8MHz because there is (to my knowledge) no system that will allow me to edit, null, and compare 5.6MHz DSD files (128x-fs).

I am comparing 3 test files. The first pass is the 'live' mix from my workstation feeding its DA converters. The second pass is the workstation playing back the WAV file that was captured while the 'live' mix was running. The third pass is another playback pass, exactly like #2.

I did my comparison tests on a Pyramix workstation in DXD mode at 8x-fs (352.8 kHz, 32-bit floating point), which is the minimum amount of data to represent a 2.8MHz DSD stream. The (inverted) comparison file could only be nudged in one-sample steps at 8x-fs to get close to a perfect null (a one-sample nudge at 352.8kHz moves by 8 DSD samples at a time). Even so, I was still able to demonstrate a near-perfect null of two playback passes, when compared to the 'live' pass.
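
The null test described above can be sketched roughly as follows: invert one pass, sum it with the other, and slide it in whole-sample steps until the residual is smallest. This is a toy illustration in plain Python on short lists of samples, not the Pyramix workflow. The names `residual_db` and `best_null` and the search range are hypothetical choices made for the example.

```python
import math

DSD64_RATE = 2_822_400   # Hz, 64 x 44.1 kHz
DXD_RATE = 352_800       # Hz, 8 x 44.1 kHz

# One DXD sample spans this many DSD64 samples, so a one-sample nudge
# at 352.8 kHz moves the file by 8 DSD samples.
DSD_SAMPLES_PER_NUDGE = DSD64_RATE // DXD_RATE

def residual_db(reference, candidate, offset):
    """Level of (reference - shifted candidate) relative to reference, in dB.

    Subtracting is the same as inverting the candidate and summing.
    Only the region where the two passes overlap is compared.
    """
    sig = 0.0
    res = 0.0
    for i, r in enumerate(reference):
        j = i + offset
        if 0 <= j < len(candidate):
            sig += r * r
            res += (r - candidate[j]) ** 2
    if sig == 0.0:
        raise ValueError("no overlapping signal energy")
    if res == 0.0:
        return float("-inf")  # perfect null
    return 10.0 * math.log10(res / sig)

def best_null(reference, candidate, max_nudge=16):
    """Try every whole-sample nudge in range; return (offset, residual in dB)."""
    best = min(range(-max_nudge, max_nudge + 1),
               key=lambda k: residual_db(reference, candidate, k))
    return best, residual_db(reference, candidate, best)

if __name__ == "__main__":
    # Synthetic "live" pass, and a "playback" pass that is the same
    # signal arriving 3 samples late.
    live = [math.sin(2 * math.pi * i / 200) for i in range(1000)]
    playback = [0.0, 0.0, 0.0] + live[:-3]
    offset, level = best_null(live, playback)
    print(f"best nudge: {offset} samples, residual: {level} dB")
    print(f"each nudge = {DSD_SAMPLES_PER_NUDGE} DSD samples")
```

With real recordings the residual never reaches silence, so the useful number is how many dB below the reference the best-aligned difference sits.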

The purpose of the test was to demonstrate the difference between a live digital stream (AES/EBU or S/PDIF) and the ability of a digital recorder to play back the digital stream from the resulting WAV file. My daily experience here in the studio has shown that there is a very clear difference between what I hear coming out of my workstation when a mix is running 'live' compared to when it's played back from a WAV file – at any sample rate, although the effect is more pronounced (to me) at 24/96.

The interesting thing is that, if I record what I'm hearing while the mix is running 'live' to a DSD recorder, and down-convert it to the destination sample rate, the result is MUCH more pleasing (and sounds closer to the original) than the digital stream recorded directly to a WAV file. It seems like PCM digital can work OK as a container for the audio, but our existing framework to capture the stream properly with its full, proper relationship to the PLL clock is flawed.

The difference in ambience, transient response, and 'graininess' when comparing the two down-converted sources carries through all the way down to a 256 kbps MP3 file. I'm quite confident I could identify the difference in one of my own mixes 100% of the time.

The most important observation for me is that, as a listener, I would much rather hear a DSD recording played through an analogue low-pass filter at 25kHz than a wide-open 24-bit/96kHz (48kHz audio bandwidth) playback of the same. It seems that for me, at least, the time domain is much more important than the frequency domain in terms of representing the details I care about. So in the analogue world, I suppose I would much rather hear something played through a high slew-rate amplifier with average frequency response than a high-bandwidth amplifier with slow transistors.

The implication of this is that there is some degradation happening EVERY time an audio file is captured from an AD converter, or from a digital stream. The way the AD converters sound in the control room when recording is as good as it's ever going to get. When you play back what you recorded, there is a slight 'Disappointment'.

Although I have essentially described it, I will post the PDF of my eBrief on a download link and make it available for 14 days, if anyone's interested....

Cheers,
JB

Post by tailspn November 7, 2011 (14 of 15)

Great post John! In your comparison study, was the "live" mix the mic feeds through the console? Is the "third pass" therefore the live mix recorded to DSD, then down sampled? If so, what was used to do the down sample and conversion? Weiss Saracon? Also, could you provide some instruction on finding your eBrief site?

Thanks,
Tom

Post by Fitzcaraldo215 November 9, 2011 (15 of 15)

Beetle said:

The interesting thing is that, if I record what I'm hearing while the mix is running 'live' to a DSD recorder, and down-convert it to the destination sample rate, the result is MUCH more pleasing (and sounds closer to the original) than the digital stream recorded directly to a WAV file. It seems like PCM digital can work OK as a container for the audio, but our existing framework to capture the stream properly with its full, proper relationship to the PLL clock is flawed.

Nice post, Beetle. If I am reading you correctly, then the issue you refer to might appear to be the translation to WAV or, less likely, something inherent in WAV itself. It's less about DSD vs. PCM for recording "accuracy", which is a different tangent altogether.

The Linn stereo downloads I referred to in my post above were all FLAC played back from hard drive via the Oppo's eSATA port. We have had problems with the center channel dropping out in Mch WAV playback, but not in FLAC. Since my friends and I are Mch devotees, we tend to avoid WAV. On the other hand, rips of 44K CDs to WAV have generally sounded slightly but noticeably better to several of us when played back from the hard drive as opposed to the silver disk, at least from an Oppo 93. We also heard no meaningful differences, if any, with CDs or WAV rips played back via SPDIF or HDMI. The WAV CD rips, incidentally, were fed to the Oppo 93 player via USB from the PC.

So, there may be a lot going on here. Differences in equipment used or DSD-PCM conversion algorithms might possibly explain all of it. And some choices may well be better than others - DSD, PCM, file formats, conversion algorithms, ADCs, output DACs, etc. - for playing back music most truthfully and satisfyingly with available gear. But, where to start?