South Carolina Lazy? I don’t think so!

By: Scott Moore
August 6, 2010 · Posted in statistics 

Lazy: When Noise Interferes with the Signal

Recently, the Post and Courier ran an article highlighting a Business Week analysis that said South Carolina was the eighth laziest state in the union! Subjective words used to describe data typically raise a red flag for me, warning of impending data misuse doom.

The Data Set

The American Time Use Survey (ATUS) measures the time people spend doing various activities such as work, childcare, housework, watching television, volunteering and socializing. Hence this is an activity survey, not a “lazy” survey. The data are collected by the Census Bureau and sponsored by the Bureau of Labor Statistics (BLS). I ran a query to understand the nature of the survey, data availability and error rates. I called in the big guns from Global Pragmatica LLC to assist in converting the data from an ASCIDAT file to my JMP statistical software package format. These folks are experts in scripting and were a huge help. Thank you!

These data are collected regionally but analyzed nationally. There is about a 90-percent chance, or level of confidence, that an estimate based on a sample will differ by no more than 1.6 standard errors from the “true” population value because of sampling error. No estimates are made for state-level data, and one University of Minnesota analyst stated she was not aware of state-level error estimates.
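To make the 1.6-standard-error rule concrete, here is a minimal Python sketch. The function names and the two state estimates are hypothetical placeholders, not ATUS figures; the only number taken from the survey documentation is the 1.6 multiplier. It shows how two estimates that look different can have overlapping 90-percent intervals, in which case a ranking between them is not supported by the data.

```python
def interval_90(estimate, standard_error, multiplier=1.6):
    """Return the (low, high) 90-percent interval around an estimate.

    The 1.6 multiplier is the figure quoted above; the estimate and
    standard error passed in are whatever the analyst has computed.
    """
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin


def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical average sports minutes per day and standard errors,
# chosen only for illustration (not real survey results).
state_a = interval_90(estimate=20.0, standard_error=4.0)
state_b = interval_90(estimate=16.0, standard_error=4.0)

print("State A interval:", state_a)
print("State B interval:", state_b)
print("Intervals overlap:", intervals_overlap(state_a, state_b))
```

When the intervals overlap like this, the difference between the two estimates could be nothing more than sampling error.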

It is inappropriate to analyze these data at the state level without calculating the error inherent in the data. If you skip that step, the analysis may be interesting, but it is useless for comparing one state to another. Why?

Sports Activity Variable Analysis

For a test sample, I chose state-level geography, with sports as the activity variable. This category captures the respondent’s participation in sports, exercise and recreational activities. To extract the data from the system, I used a tool created by the University of Minnesota called the American Time Use Survey-X. The data need to be processed by a statistical package, in this case my JMP program.

An analysis of people participating in sports activities indicates that South Carolina would rank 22nd out of 50 states in terms of average minutes spent participating in sports in a 24-hour period – not bad. However, upon further inspection of South Carolina’s 2009 detailed weighted data, the state could rank anywhere from 12th to 23rd, based on national error rates!

Unfortunately, since these are state data, the results are meaningless. That’s because the sample is simply too small, which is one of many buried statistical problems. South Carolina’s 2009 sample included a total of 200 people, of whom 166 recorded zero sports activity minutes. In fact, the median is zero, which is another red flag for this data set. A review of other states’ data revealed the same issue.

This is a fascinating national data set. But unfortunately, analysis of non-national geographies yields unreliable results.
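The zero-heavy sample described above is easy to check with a few lines of Python. This is only a sketch: the counts (200 respondents, 166 zeros) come from the figures above, but the 60-minute value assigned to the remaining 34 respondents is a made-up placeholder, and the survey weights are ignored. The function name is hypothetical.

```python
from statistics import median


def zero_share_and_median(minutes):
    """Return the fraction of zero reports and the median of the sample."""
    zeros = sum(1 for m in minutes if m == 0)
    return zeros / len(minutes), median(minutes)


# 166 zero reports out of 200 respondents, as described above.
# The 34 nonzero values are placeholders (60 minutes each), not real data.
sample = [0] * 166 + [60] * 34

share, med = zero_share_and_median(sample)
print(f"{share:.0%} reported zero minutes; median = {med}")
```

The median stays at zero no matter what minutes the 34 active respondents report, so the state average rests entirely on that small group. That is why a state ranking built on it is so unstable.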
