      11-27-2012, 05:17 PM   #30

Drives: 2013 M3
Join Date: Jun 2010
Location: Vero Beach, FL

Building a statistically accurate model of an entire system from a sample of data is incredibly difficult. Each of these data sets has its own issues. The biggest one is that the sample sizes are usually too small to support an accurate inference.

The USDoT says there were 5.6 million passenger cars sold last year, as well as 4.1 million trucks [1]. I don't know what portion of those trucks were bought by your average buyer, but I know that Ford sold half a million F-series pickups. If we assume that about half of those 4.1 million were consumer truck sales, the total number of vehicles sold comes to roughly 7.65 million, so call it 7.5 million in round figures. 7.5 million is a HUGE number.
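For anyone who wants to check the arithmetic, here is a quick Python sketch of that estimate. The 50% consumer share of truck sales is my guess from the paragraph above, not a USDoT number, and the variable names are just made up for illustration.

```python
# Figures quoted above from the USDoT, in millions of vehicles sold last year.
passenger_cars = 5.6
trucks = 4.1

# Assumption (a guess, not a USDoT figure): about half of truck sales
# go to ordinary consumers.
consumer_truck_share = 0.5

total_consumer_sales = passenger_cars + consumer_truck_share * trucks
print(f"Estimated consumer vehicle sales: {total_consumer_sales:.2f} million")
```

Change consumer_truck_share to see how sensitive the total is to that guess; even at 100% the total only reaches 9.7 million.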

Looking at the total number of cars on the road, the USDoT puts that at around 250 million registered vehicles, 190 million of which are "Light duty vehicle, short wheel base" [2].

The CarMD data FAQ [3] gives some information, but I find it a little confusing. More to the point, the report posted here says:

"CarMD's network of thousands of certified automotive technicians and database of more than 3 million verified repairs. The November 2012 Index statistically analyzes repairs that apply to roughly 136 million model year 2002 to 2012 vehicles, taking place in the U.S. from Sept. 1, 2011 through Sept. 1, 2012."

OK, so the CarMD index is based on 3 million verified repairs that apply to 136 million vehicles over a one-year period. Those 136 million vehicles are over half (about 54%) of the roughly 250 million registered vehicles on the road. That's looking pretty good from a statistical standpoint.
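Here is the same comparison as a rough Python sketch, using only the figures quoted in this post. The "repairs per 1,000 vehicles" number is my own derived ratio for illustration, not something CarMD reports.

```python
# Figures quoted in this post.
registered_vehicles = 250e6  # USDoT, all registered vehicles
covered_vehicles = 136e6     # CarMD, model year 2002 to 2012 vehicles
verified_repairs = 3e6       # CarMD, verified repairs in the database

# Share of the registered fleet that the index applies to.
coverage = covered_vehicles / registered_vehicles

# Derived ratio (not a CarMD statistic): repairs recorded per 1,000 covered vehicles.
repairs_per_1000 = verified_repairs / covered_vehicles * 1000

print(f"Fleet coverage: {coverage:.1%}")
print(f"Repairs per 1,000 covered vehicles: {repairs_per_1000:.1f}")
```

Note that the actual observations are the 3 million repairs; the 136 million is the population they are meant to describe.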

The WhatCar survey posted above uses data from Warranty Direct, a company that sells extended warranties. I searched the source website but was unable to find any information on their sample size. They use only vague figures (they spend millions of pounds a year on claims). If we divided "millions of pounds" by the average repair cost in the CarMD data (after converting currencies), we could be looking at a sample size in the hundreds of thousands. It's hard to say, and they're not forthcoming. That doesn't inspire confidence.

Let's look at JD Power:

"The study, which is based on responses from more than 31,000 original owners of 2009 model-year vehicles after three years of ownership, measures problems experienced during the previous 12 months by those original owners."

So the sample size is 31,000 for a single model year. OK, so you don't have to have a master's in statistical analysis to recognize the problems here:

1) It's a survey, so you're relying on PEOPLE to accurately report their experience. This is a bad idea, because people are extremely susceptible to bias.

2) The sample size is only 31,000. That's 0.4% of the number of cars sold each year. You'd have to have some serious science to back up your selection criteria in order for this sample size to reflect the full 7.5 million car data set.
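The 0.4% figure in point 2 can be checked with a couple of lines of Python. The 7.5 million denominator is my rounded estimate of yearly sales from earlier in this post, not the actual number of 2009 model-year vehicles, so treat the result as a ballpark.

```python
# JD Power sample size quoted above.
survey_responses = 31_000

# Rounded estimate of yearly consumer vehicle sales from earlier in this post
# (an assumption, not the real 2009 model-year sales figure).
vehicles_sold_per_year = 7.5e6

sample_fraction = survey_responses / vehicles_sold_per_year
vehicles_per_response = vehicles_sold_per_year / survey_responses

print(f"Sample fraction: {sample_fraction:.2%}")
print(f"About one response per {vehicles_per_response:.0f} vehicles sold")
```

The fraction by itself says nothing about how the 31,000 owners were selected, which is the real question in point 2.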

So, for my money, the CarMD index is looking pretty damn good. It's certainly possible that their data is flawed as well, but the fundamentals look good.

This reminds me a little bit of the whole Nate Silver / 538 election prediction story. No one wanted to believe him, but his models were based on simple mathematics, which is hard to refute.

1 -

2 -

3 -
His: 2013 ///M3 - Interlagos Blue Black M-DCT
Hers: 2013 X3 28i - N20 Mineral Silver / Sand Beige / Premium, Tech
Past: 2010 135i - TiAg Coral Red 6MT ///M-Sport