Thread: Meyer Moran result debunked - again

Posts: 111

Post by Kal Rubinson September 8, 2010 (11 of 111)
eesau said:

So, don't you think that the results are somewhat contradictory … and this, again, shows that the AES is not very discriminating when accepting papers for its conventions.

Generally, the screening of papers for scientific meetings is based on the procedures/methods and statistics, not necessarily on the consistency of results among different portions of the study or on the interpretation of the results.

Kal

Post by canonical September 8, 2010 (12 of 111)
Fitzcaraldo215 said:

I think there are numerous problems with ABX and how some ABX tests are conducted.

[...]


Bottom line: ABX is a crazy, unnatural way to listen that is fraught with problems. I have no doubt it can identify big differences. But, when we get to the smaller differences that are quite prevalent in audio, it's just not reliable. And, of course, an ABX result of no "statistically significant" difference does not mean there is no difference in reality.

Yup - absolutely correct.

ABX is a very complicated procedure ... it requires that you listen to ... not 2 ... but 3 different sources. First, you listen to source A, then you listen to source B, then listen to source X, and then you have to say whether X is the same as A or whether it is the same as B. That's a very difficult thing to do ... to remember what 3 different sound sources sound like ... as the music unfolds in front of you.

A far simpler way to run these sorts of studies is simply run an AB test: "Do you prefer A or B?" That is all that is needed.


Because ABX is such a cumbersome and complicated procedure, ABX testing hardly ever yields statistically significant results. The standard conclusion drawn is that we were unable to prove that product A is statistically different to product B. There have been papers which, for example, use ABX testing to 'conclude' that mp3 is statistically indistinguishable from CD. Which is rather silly.
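To put rough numbers on the significance question, here is a minimal Python sketch. It is not taken from any of the papers discussed; it assumes each ABX trial is an independent guess with a 50% chance of being right when no difference is audible, and the trial count (16) and the listener hit rates are illustrative assumptions only. It computes the one-sided exact binomial p-value for a score, and the chance that a listener with a given true hit rate reaches significance.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def abx_power(trials: int, p_true: float, alpha: float = 0.05) -> float:
    """Chance that a listener who is right with probability `p_true` per trial
    reaches a score that is significant at level `alpha`."""
    # Scores whose guessing p-value is at or below alpha.
    passing = [c for c in range(trials + 1) if abx_p_value(c, trials) <= alpha]
    if not passing:
        return 0.0  # too few trials: no score can ever reach significance
    threshold = min(passing)
    return sum(
        comb(trials, k) * p_true ** k * (1 - p_true) ** (trials - k)
        for k in range(threshold, trials + 1)
    )

if __name__ == "__main__":
    trials = 16  # hypothetical session length
    for hit_rate in (0.6, 0.75, 0.9):  # hypothetical listener abilities
        print(hit_rate, round(abx_power(trials, hit_rate), 3))
```

Under these assumptions, a listener who hears a small difference only some of the time often fails to reach significance in a short session, while a listener who hears it nearly every time almost always passes. That is one way to read the claim that ABX mainly picks up big differences, and it is also why a non-significant result is not proof of no difference.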

As Fitzcaraldo215 notes, ABX testing only ever finds really BIG DIFFERENCES between audio sources. Yes - big differences ... And that is what impresses me about this paper ... the fact that the paper yields any statistically significant results using ABX testing ... even when comparing just 88.2 kHz vs 44.1 kHz ... because the differences must be BIG for ABX testing to pick them up.

Hardly surprising to those of us who specially purchase hi-rez recordings (whether on SACD or other) ... but it's clearly flat-earth time for all the disbelievers stuck in low-rez 1980.

Post by Fitzcaraldo215 September 8, 2010 (13 of 111)
canonical said:


ABX is a very complicated procedure ... it requires that you listen to ... not 2 ... but 3 different sources. First, you listen to source A, then you listen to source B, then listen to source X, and then you have to say whether X is the same as A or whether it is the same as B. That's a very difficult thing to do ... to remember what 3 different sound sources sound like ... as the music unfolds in front of you.

A far simpler way to run these sorts of studies is simply run an AB test: "Do you prefer A or B?" That is all that is needed.


Because ABX is such a cumbersome and complicated procedure, ABX testing hardly ever yields statistically significant results. The standard conclusion drawn is that we were unable to prove that product A is statistically different to product B. There have been papers which, for example, use ABX testing to 'conclude' that mp3 is statistically indistinguishable from CD. Which is rather silly.

Yes, to confound matters, I think ABX has a superficial elegance and seeming simplicity about it, wrapped in a "scientific" methodology. It's so straightforward, how could it possibly fail to reveal differences if in fact they are there? Well, I agree, it is deceptively complex when you look at it carefully. I have to admit, I used to be seduced by the "power and science" of ABX as I read about it. But, I have come full circle.

I also agree about double-blind A vs. B preferences as a much simpler and more reliable testing protocol that tells us what we really want to know: which sounds better? That's how we, or at least I, are used to listening to different components, though not double-blind. It's normally what we do in showrooms or in our homes with loaners. It also considerably shortens the listening time required to reach a "decision", reducing test fatigue and lessening the reliance on acoustic memory.

I think it's quite intuitive for almost anyone to listen for preference between A and B. It's not intuitive or natural at all to listen in such a way that you can identify X as either A or B.

If Meyer-Moran or a host of other ABX'ers had done A-B preference instead, whatever the result, I think there would be less controversy. But, on the other hand, the AES would likely not publish such a study because it was not "scientific" enough.

Post by eesau September 8, 2010 (14 of 111)
Arnaldo said:


As for picking a few samples from the Pras/Guastavino AES paper and using them selectively in order to (try to) make a specific point, well, it's not very scientific to say the least...

Heh,

you obviously don't hold the paper in your hand like I do.

Those were ALL the results of this paper.

They must have been desperately in need of publishing something ....

Esa

Post by audioholik September 9, 2010 (15 of 111)
Arnaldo said:

One intriguing point about the Pras/Guastavino AES paper is why they chose to compare 44.1kHz to just 88.2kHz.

Maybe they didn't have the appropriate equipment (many converters cap the sampling rate at 96kHz)...

BTW other studies suggest that 192kHz should be a minimum sampling rate.

http://www.cas.sc.edu/dean/news/2009/HiFi-Critic-Kunchur.pdf

Post by david moran September 28, 2010 (16 of 111)
RWetmore said:

That's most interesting. I would like to hear David Moran's comments on it.

Well, it is seriously impressive to see that the hostility, stupidity, ignorance (about ABX testing and so much else), nonstop insulting / personal namecalling remain unabated from so many on this forum. Such energy you have. And the mistakes! (I only wish I were also a recording engineer, and as for our having used CD-rez recordings, wow.) Arnaldo, canonical, fitzcaraldo et al. -- y'all seriously should consider some chill pills.

And you also need to do your own authentically scientific tests, as so many others have pointed out all along.

Thank God for esa.

The chief thing worth responding to here and now, though, is to point out that this McGill work appears to be, uh, a preprint given at an AES convention. Has it been published in the AESJ, does anyone know? Am I mistaken in this conclusion? Did everybody miss that crucial distinction, or am I wrong? I have not seen it or found it as such, a reviewed and published paper. Any fool can write a preprint, and many have. There is no review process, and it is open to anyone. It is something else entirely to get your work through the peer process and published in the Journal.

If you don't get that, and if you think this is a valid refutation, well, no wonder you don't get so much else.

In any case, we have said all along we welcomed further investigation using blind level-matched methods, and if this preprint really is that investigation, it's important news, and yay, I say. It's how science gets done.

Post by Fitzcaraldo215 September 28, 2010 (17 of 111)
david moran said:

Well, it is seriously impressive to see that the hostility, stupidity, ignorance (about ABX testing and so much else), nonstop insulting / personal namecalling remain unabated from so many on this forum. Such energy you have. And the mistakes! (I only wish I were also a recording engineer, and as for our having used CD-rez recordings, wow.) Arnaldo, canonical, fitzcaraldo et al. -- y'all seriously should consider some chill pills.

Gee, I feel especially honored to be singled out in your condemnation. I think if you read my posts with an open and inquiring mind, I raised some potential questions that perhaps you have answers for. I do not see where I was guilty at all of "hostility, stupidity, ignorance (about ABX testing and so much else), nonstop insulting / personal namecalling".

I think it is you, not me, who has taken this personally. Perhaps I am indeed ignorant of the process. So, rather than a hostile tirade, perhaps you could address some of the issues I have raised in my previous posts and educate me, as well as all of us here.

So, let me restate my main concerns to save you the trouble of looking up my previous posts. I stated an issue with some ABX tests, not yours, which do not use perfectly time-synchronized source material. But, as we know, that does not apply to your tests, which were done with only a single SACD source. (See, I read your paper!) It was either played back directly or via the 44K A to D to A process, giving us our two test choices A and B. Hence, yours were perfectly time-synchronized.

But, here are my questions, and I think you will easily see where I am going with my concerns. (1.) Could the test subjects, not the testers, freely "rewind" and repeat specific short, musical sub-passages of their own choosing for A, B and X as much as they wished, switching back and forth during mid passage? (2.) Were they in any way encouraged to do so, as opposed to just being shown how to operate the test apparatus? (3.) Do we have any way of knowing if they actually did this during the test? (4.) Were there any time limits or time pressures on the test sessions?

I am assuming also that the switchover time from A to B or X was also quite short - i.e., under a second. Am I correct?

I have not found clear answers to these questions in your paper or on the BAS site.

Post by david moran September 28, 2010 (18 of 111)
Fitzcaraldo215 said:

Gee, I feel especially honored to be singled out in your condemnation. I think if you read my posts with an open and inquiring mind, I raised some potential questions that perhaps you have answers for. ...

So, let me restate my main concerns to save you the trouble of looking up my previous posts. I stated an issue with some ABX tests, not yours, which do not use perfectly time-synchronized source material. But, as we know, that does not apply to your tests, which were done with only a single SACD source. (See, I read your paper!) It was either played back directly or via the 44K A to D to A process, giving us our two test choices A and B. Hence, yours were perfectly time-synchronized.

But, here are my questions, and I think you will easily see where I am going with my concerns. [[questions omitted]]

I am assuming also that the switchover time from A to B or X was also quite short - i.e., under a second. [[ etc. ]]

Wow, I certainly owe you a strong apology for lumping you with the others. I am sorry; I did just reread your postings, and I obviously got tangled up in my own defenses. I apologize to you.

As to your concerns:
There is nothing in ABX testing using the comparator we used to prevent any of what you pose in your questions, and yes, for sure, all 55+ subjects did just that. We did not constrain them in any way (one of the many variances from strictest statistical hygiene; see our AESJ reply to Dranove for more detail on that). We resolved to look as hard and thoroughly as possible for anyone who could tell the difference, any difference whatsoever, b/w hi-rez and hi-rez 'degraded' through the RBCD loop. As you know, we did not find anyone who could show that they could detect the difference and the alleged degradation. (This is NOT averaging our successes against failures or whatever insane notion someone else proposed.)
Everyone could and did rewind and repeat, switching back and forth as much as desired; there were no time limits or pressures whatsoever (another violation of strictest protocols).
Most people grow frustrated at an undetectable test like this one (what appears to be the case, at least in our approach); it is not like ABX testing Brad has done with power amps and cassette playback, where subjects actually quickly home in on differences. Subjects are / were all encouraged to have as good a time as they could, or whatever the right phrase might be, and enjoy themselves if possible -- including up to their choices of material and kinds of material and recording processes.
One working engineer brought his own 18-bit material (private) to audition and compare to its 'degraded' version.

Switchover is instantaneous to the ear.

As for 'any way of knowing', only from taking my and Brad's word for it, from our being present.

Oh, it just occurs to me, some of what I just stated may not have been true for the large group of recording engineering students Brad worked with at a local university. In other words I am not sure they had the same degree of freedom and leisureliness; I think not.
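As an illustrative aside on the difference between averaging everyone's scores and looking for any single listener who can tell: the Python sketch below is not from the paper or the AESJ reply, and the scores in it are invented purely for illustration. It assumes each trial is an independent 50/50 guess when no difference is heard, and compares a pooled analysis with a per-listener one.

```python
from math import comb

def p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Hypothetical (correct, trials) scores, invented for illustration only.
scores = {
    "listener_1": (5, 10),
    "listener_2": (4, 10),
    "listener_3": (10, 10),  # one imagined listener who hears the difference
    "listener_4": (5, 10),
    "listener_5": (4, 10),
}

# Pooled analysis: lump every trial together as if from one listener.
pooled_correct = sum(c for c, _ in scores.values())
pooled_trials = sum(t for _, t in scores.values())
pooled_p = p_value(pooled_correct, pooled_trials)

# Per-listener analysis: score each person separately.
per_listener_p = {name: p_value(c, t) for name, (c, t) in scores.items()}

if __name__ == "__main__":
    print("pooled:", pooled_correct, "/", pooled_trials, "p =", round(pooled_p, 3))
    for name, p in per_listener_p.items():
        print(name, "p =", round(p, 4))
```

In this made-up data the pooled score looks like guessing, while the per-listener view flags the one perfect score. With many listeners, the per-listener threshold would normally be tightened, since a few people will score well by luck alone.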

Post by david moran September 28, 2010 (19 of 111)
Anyone who is not tired of all this, and would like to see the exchange b/w David Dranove of this forum, who sent quite the letter to the AESJ, and Brad Meyer's and my reply to it, which contains further results sorting and further explanation of the tests, can email me and I will send you a pdf. (I already sent the exchange to Fitzcaraldo.)

Post by RWetmore September 28, 2010 (20 of 111)
No need for any insults here.
