Friday, September 15, 2006

An Example of a Blinded Test

A couple of weeks ago, my wife and I found some interesting bottles of wine on the closeout rack of a wine store we frequent (to digress, closeout racks are great). We found bottles of Jewel Shiraz and Jewel Syrah at a very low price. I immediately found the labeling odd, since Shiraz and Syrah are actually the same variety of grape under two different names. The grape is usually called Shiraz by Australian growers, while French growers use Syrah, but both of these wines were made from grapes grown in California. Consequently, they should be exactly the same (they were also bottled the same year, 2002), so why give them different labels?

I asked a store clerk about the labeling, and she told me that the Syrah should actually taste a little softer and richer. That sounded nonsensical to me, but why not take the opportunity to conduct a little test, especially since these bottles were marked down to very manageable prices?

We therefore bought a bottle of Shiraz and a bottle of Syrah and took them with us to the apartment of some friends of ours -- Lord William and Lady Juliana. My job would be to devise a test to see if anyone else at the table could distinguish between the Shiraz and the Syrah.

The test would be a simple “Which of these things is not like the others?” I started with four identical wine glasses. I then tied some colored thread provided by Juliana around the stems to uniquely identify each glass: gold, light blue, white, and green. Rolling a die, I randomly assigned Syrah to one of the glasses and wrote its color on a piece of paper to stick in my pocket. I poured Syrah in that glass and poured Shiraz in the other three. I made sure they were all filled to the same height and then took them to the dinner table. I provided blank sheets of paper for my three testers. Their task was to write down the thread color of the glass that held different wine from the other three. Instructions given, I left the room. Juliana's parents came by a little later, and they got pulled into the test, too, so I had a sample size of five participants.
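
For anyone who wants to repeat the setup without a physical die, here is a minimal sketch of the random assignment in Python. The rule for mapping a six-sided die onto four glasses (reroll on a 5 or 6) is my assumption for illustration, and the names `COLORS` and `roll_for_glass` are hypothetical.

```python
import random

# Thread colors used to tell the four identical glasses apart.
COLORS = ["gold", "light blue", "white", "green"]

def roll_for_glass(colors, rng=random):
    """Pick one glass by rolling a six-sided die.

    Assumed rule: faces 1..len(colors) map to a glass; any higher face
    is rerolled, so every glass is equally likely. Needs 1 to 6 glasses.
    """
    while True:
        roll = rng.randint(1, 6)  # randint includes both endpoints
        if roll <= len(colors):
            return colors[roll - 1]

odd_glass = roll_for_glass(COLORS)

# One glass gets Syrah; the other three get Shiraz.
pours = {color: ("Syrah" if color == odd_glass else "Shiraz") for color in COLORS}
print(pours)
```

Rerolling the unused faces keeps each glass at a one-in-four chance, which a shortcut such as taking the roll modulo four would not.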

By random chance alone, each tester has a one-in-four (25%) chance of picking the right glass, so I should reasonably expect one or two of the five testers to get it right. Three or more correctly identifying the odd glass (the Syrah) would indicate that there probably was a detectable difference between the two wines, be it flavor, aroma, color, etc.
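
The chance expectation can be worked out with a binomial calculation. This is a rough sketch that assumes all five testers guess independently, each with a 25% chance of picking the odd glass; the function name `binom_pmf` is just for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k correct answers among n independent guessers."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n = 5      # testers
p = 0.25   # chance of guessing the odd glass out of four

expected_correct = n * p
p_three_or_more = sum(binom_pmf(k, n, p) for k in range(3, n + 1))

print(f"expected correct by chance: {expected_correct}")
for k in range(n + 1):
    print(f"P(exactly {k} correct) = {binom_pmf(k, n, p):.3f}")
print(f"P(3 or more correct) = {p_three_or_more:.3f}")
```

Under these assumptions, pure guessing produces three or more correct answers about 10% of the time, so that threshold is suggestive rather than decisive with only five tasters.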

As it turned out, no one identified the glass of Syrah correctly, which convinced me that there was no detectable difference between the Shiraz and the Syrah (except, perhaps, that the Syrah had a classier label).

There were some flaws in this test. Given time and resources, I would have liked to make the following changes:
  • It would have been nice to have an intermediary between me and the testers so that I -- as the person pouring the wine -- would have no interaction with the people tasting it; that would make it a truly double-blind test (as it is, it's single-blind).
  • It would be nice to have more people involved in the test for statistically better results.
  • More test scenarios would be good: three glasses of Syrah and one of Shiraz, for instance.
  • I also didn’t think to ban table-talk before the test, so the comments of Juliana -- the first tester -- may have prejudiced the next two somewhat; fortunately, Juliana's parents arrived later and didn’t hear the table-talk, and I was able to forbid discussion of wine opinions before they’d written their choices.
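
To put a number on "more people," here is a hedged sketch of how many correct answers a group of a given size would need before luck becomes an unlikely explanation. The 5% cutoff is a common convention that I am assuming here, testers are again assumed to guess independently, and `tail_prob` and `min_correct` are hypothetical helper names.

```python
from math import comb

def tail_prob(k, n, p=0.25):
    """Chance that k or more of n independent guessers pick the odd glass."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def min_correct(n, p=0.25, alpha=0.05):
    """Smallest number of correct answers whose chance probability is below alpha.

    Returns None when even a perfect score would not get below alpha.
    """
    for k in range(n + 1):
        if tail_prob(k, n, p) < alpha:
            return k
    return None

for testers in (5, 10, 20, 40):
    print(testers, "testers -> need at least", min_correct(testers), "correct")
```

With five testers, this sketch asks for four correct answers rather than three, since three or more turns up by chance about one time in ten.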


Jim Anderson said...

In other science news, even bad wine can be good with cheese.

Edmund said...

I believe it is typical in wine tasting to rinse one's mouth between glasses, and since you did not mention doing this, it could be another potential flaw.

Of course it's possible that there are minor differences in the manufacturing process depending on the name, and it's also possible your testers simply weren't sensitive enough.

But from what I gather, calling it Syrah versus Shiraz is simply a marketing ploy. Traditional or modern? There's a name for either mood, but it's the same juice.

I suggest doing a similar test with beer, but instead see if one can predict whether a sample is an expensive import or a frat beer. Get some Busch, Pabst, Keystone, Budweiser, and so on, and compare them to Foster's, Red Stripe, etc.

Anonymous said...

What you're missing is a control. Perform the same experiment, but pour the same wine in all four glasses.

The results should prove interesting.

(Nevertheless, I think you proved your point.)