Consumer Price Index (CPI)

By: Scott Moore
January 19, 2012 · Posted in economics

Issue

The Consumer Price Index (CPI) is one measure of consumer prices. The Bureau of Labor Statistics (BLS) CPI program produces monthly data on changes in the prices paid by urban consumers for a representative basket of goods and services. The BLS data sets allow us to review price increases (or declines) for more than 200 item categories. (Link)

While my home shopping econometric expert and I were comparing prices from a recent trip to the grocery store, we realized there had been a significant price increase in our Cream of Wheat®. In fact, the price of the cereal had increased by more than 6 percent. However, the November CPI report from the BLS stated:

The Consumer Price Index for All Urban Consumers (CPI-U) was unchanged in November (2011) on a seasonally adjusted basis, the U.S. Bureau of Labor Statistics reported today. Over the last 12 months, the all items index increased 3.4 percent before seasonal adjustment.

So how do the prices of the items we purchase relate to the CPI?

CPI: Index Weighting

It turns out there is a relationship between our breakfast cereal and the CPI. Although our individual cereal price is in the “basket of goods”, it is a small contributor to the overall index. The index is heavily weighted toward food, but even more so toward housing and transportation. (Pie Chart)

This is one criticism of the index – that it does not clearly reflect the items that we purchase every day. Who among us purchases a home once a week?! In other words, our cereal is mixed in with a variety of goods and services, some of which we seldom use.
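To see why a 6 percent jump in one cereal barely moves the overall number, here is a rough sketch of how a weighted index combines price changes. The weights below are made up for illustration (they are not BLS relative-importance figures), and the real CPI aggregation is more involved than a simple weighted average.

```python
# Rough sketch: combine price changes using basket weights.
# All weights are hypothetical, NOT actual BLS relative-importance values.

def contribution(weight, pct_change):
    """Percentage points that one item adds to the overall index change."""
    return weight * pct_change


def index_change(basket):
    """Weighted average of percent changes.

    basket is a list of (weight, pct_change) pairs.
    """
    total_weight = sum(weight for weight, _ in basket)
    return sum(contribution(w, c) for w, c in basket) / total_weight


basket = [
    (0.003, 6.0),  # hypothetical weight for our cereal, up 6 percent
    (0.997, 3.4),  # everything else, assumed to rise 3.4 percent
]

cereal_points = contribution(0.003, 6.0)
overall = index_change(basket)
print(f"Cereal adds {cereal_points:.3f} points to an overall change of {overall:.2f} percent")
```

With a weight this small, even a large cereal increase is only a sliver of the total, which is the point of the criticism above.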

CPI: Applied

The CPI, however, is actually a pretty handy consumer tool, especially if one is going to make a major purchase beyond daily consumables. As an example, in 2006 a price drop in televisions at Best Buy® caught my eye. I decided to save the January/February sale catalog and watch prices over the years. After our recent morning cereal discussion, I recovered the January 2006 catalog and started comparing prices. What I found is that a 32-inch television in the 2006 catalog had decreased in price by 86.3 percent by 2012. When I looked at the “basket” of television items in the CPI, the index noted a 61.5 percent decrease from 2006 to 2010, and an 82.7 percent decrease from 2001 to 2010. (PDF) Another way to look at this is that your 32-inch television, if you purchased one in 2006, is worth 61.5 percent less, not including depreciation!
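A cumulative decline over several years can be hard to compare with a one-year figure. As a rough sketch, the following converts a total percent decrease into an equivalent constant yearly rate. It assumes prices fell at the same pace every year, which is a simplification of my own and not something the CPI data claims.

```python
def annualized_decline(total_decline_pct, years):
    """Constant yearly percent decline that compounds to the total decline."""
    remaining = 1.0 - total_decline_pct / 100.0    # fraction of the price left
    yearly_remaining = remaining ** (1.0 / years)  # fraction left after one year
    return (1.0 - yearly_remaining) * 100.0


# CPI television figure quoted above: 61.5 percent decrease from 2006 to 2010
rate = annualized_decline(61.5, 2010 - 2006)
print(f"Roughly {rate:.1f} percent per year")
```

The same function works for the 2001 to 2010 figure by passing 82.7 and 9 years.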

Conclusion

The CPI is a good tool not only for reviewing price trends, but also for forecasting what we might expect in price increases or decreases across a large assortment of items we purchase. This broad base of items ranges from daily purchases of consumables to housing purchases, which may only happen once or twice in a lifetime.

Maybe I will wait another year before I get that 52-inch big screen.

For a monthly analysis of these data, see the Center for Economic and Policy Research (CEPR).

Bowl Championship Series: BCS

By: Scott Moore
January 17, 2012 · Posted in statistics

Issue

The Bowl Championship Series (BCS) ranking process is a failure by any measure. The good news is that it finally appears the powers-that-be are going to work out a playoff system. But what is the root cause of the problematic BCS rankings? Why don’t they work? And what type of numerical system might meet the needs of a college football ranking system?

Statistics: Or Lack of!

A cursory review of BCS statistics quickly identifies the main problem, which is that the people who created these “methods” do not appear to use any form of statistics. Further limiting the public’s understanding of these data is that the methods used to calculate rankings are not available. In other words, they have not been peer-reviewed in any meaningful way – and subscribe to the “trust me” method!

We know the accuracy is questionable at best or scandalous at worst, since we never read or hear about odds, confidence intervals, error, probability or other common statistical references when referring to these data. We also know intuitively that around each number there is error. If the error is not displayed, we know we can trust neither the numbers nor the authors – hence the ruckus around these rankings.

The Champ: Play-off

The great thing about a playoff for college football, like every other major sports league, is that you know the answer at the end. The best team on that day is the final one standing. End of debate. Rodney Harrison was recently asked who he liked in the NFL playoff and his answer was that it is hard to estimate since anything can happen in a playoff game. Well said. The challenge with a college playoff system is not that it wouldn’t work, because it would. Rather, it cuts the number of bowl games in half. Ouch, that is a lot of lost revenue!

The Champ: Numerical Calculation

I will disclose my bias for a playoff system since, as Rodney stated, anything can happen. But I believe there is likely a method that would, in fact, provide a numerical answer that most would agree with. First, the method needs to be made public, and it should be a method that has a history of success. “Odds” are, of course, one system, but in reviewing the odds estimates for the BCS championship game, there were many conflicting estimates with some odds makers suggesting a difference of only a point or two. In other words, it was too close to call.

Odds is an interesting process (better than the “look what I made up” numerical process), but probability estimates are the only real tool we have that could pick a winner. Odds and probability sound similar but in fact are quite different. The difference:

  • Probability is used to express sensitivity, specificity and predictive value. It is the proportion of people in whom a particular characteristic, such as a positive test, is present.
  • Odds is the ratio of two complementary probabilities. (PDF)
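The two definitions above can be turned into a pair of small conversions. This is only an illustrative sketch: the function names are my own, and the 0.75 win probability is a made-up number, not an estimate for any real team.

```python
def probability_to_odds(p):
    """Odds in favor: the ratio of a probability to its complement."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be at least 0 and less than 1")
    return p / (1.0 - p)


def odds_to_probability(odds):
    """Invert the conversion: odds of 3 (3 to 1) means a probability of 0.75."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


win_probability = 0.75  # hypothetical chance that a team wins
odds = probability_to_odds(win_probability)
print(f"Probability {win_probability} corresponds to odds of {odds} to 1")
```

A probability always stays between 0 and 1, while odds can grow without limit, which is one reason the two are easy to confuse but not interchangeable.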

Along the probability line is a process called Evidence-Based Management (EBM), which uses Bayesian analysis.

Bayes’ Theorem: a statistical principle for combining prior knowledge of the classes with new evidence gathered from data. (See Introduction to Data Mining, Chapter 5, pp. 228-229) (PDF)

EBM with Bayesian analysis states: What was thought before the test was done, combined with the test result, yields what is thought after the test result. In other words, what you thought you knew before the football contest, the game itself, and what you think afterward – LSU is still No. 1 syndrome! It is this process that could provide an answer to who is No. 1 regardless of the date, time or opponent,* effectively removing the Rodney effect, but not likely the debate!
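As a rough illustration of the before-and-after idea, here is Bayes’ Theorem applied to a single game. Every number is invented for the example: the prior belief that Team A is the better team and the chances of winning in each case are assumptions, not estimates from any ranking system.

```python
def bayes_update(prior, p_result_if_true, p_result_if_false):
    """Posterior belief in a claim after seeing one result.

    prior: belief in the claim before the game
    p_result_if_true: chance of seeing this result if the claim is true
    p_result_if_false: chance of seeing this result if the claim is false
    """
    numerator = prior * p_result_if_true
    evidence = numerator + (1.0 - prior) * p_result_if_false
    return numerator / evidence


# Claim: "Team A is the better team." All values are hypothetical.
before = 0.6                          # what we thought before the game
after = bayes_update(before, 0.7, 0.3)  # Team A wins the game
print(f"Belief before: {before:.2f}, after the win: {after:.2f}")
```

Feeding each week’s posterior back in as the next week’s prior is one way such a ranking could be updated through a season.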

Conclusion

I am not sure that the BCS question is all that important or worth a lot of time in the context of solving the world’s problems, but if we are going to do the math, let’s at least try to make the process transparent, thoughtful and based on some sort of peer-reviewed science. Frankly, that is the only way my team will EVER have a chance at a BCS championship!

*Note: I do not address “style” points: a non-sportsmanship concept.

Gini Coefficient

By: Scott Moore
January 10, 2012 · Posted in statistics

Issue

The Gini Coefficient, developed by the Italian statistician Corrado Gini, is the most commonly used measure of inequality. The coefficient varies between 0, which reflects complete equality, and 1, which indicates complete inequality (one person has all the income or consumption, all others have none). (The World Bank) We wanted to use this method to look at income distribution throughout South Carolina, but first we had to understand the formula.

At first glance, there is a fair amount of math needed to calculate the coefficient. Make no mistake, this is and can be a very complex formula, utilizing probability sampling, bootstrapping, confidence intervals and other statistical methodology. We, however, tried to keep it applied, and therefore used the most basic variation:

Gini Formula

After sorting out the symbolism, we created a sample problem (PDF). The sample problem allowed us to work through the math in a structured process. The value of “doing the math” is that one gains an understanding of how different variables affect the formula. The PDF contains two versions of the sample problem, one showing the formula and the other with the numbers plugged in. Note that, unlike most of the available examples, we show a calculation needed prior to using the formula: in this case, (dollars strata) TIMES (number of persons). That’s because the analyst may need to do a number of calculations prior to applying the formula.
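For readers who prefer code to a worksheet, here is a minimal sketch of a grouped-data Gini calculation. It uses the common trapezoid approximation of the Lorenz curve, which I am assuming is close to the basic variation, and it is not a copy of the sample problem in the PDF. The function name and the two-person example are hypothetical.

```python
def gini_from_strata(avg_dollars, persons):
    """Gini coefficient from grouped income data.

    avg_dollars: average income in each stratum
    persons: number of persons (or households) in each stratum
    """
    # Sort strata from lowest to highest average income.
    strata = sorted(zip(avg_dollars, persons))

    # The calculation needed before the formula: dollars TIMES persons.
    total_income = sum(dollars * count for dollars, count in strata)
    total_persons = sum(count for _, count in strata)

    cum_pop = 0.0     # cumulative share of persons
    cum_income = 0.0  # cumulative share of income
    area = 0.0        # twice the area under the Lorenz curve
    for dollars, count in strata:
        prev_pop, prev_income = cum_pop, cum_income
        cum_pop += count / total_persons
        cum_income += dollars * count / total_income
        area += (cum_pop - prev_pop) * (cum_income + prev_income)

    return 1.0 - area


# Tiny made-up example: one person earning 10 and one earning 20.
print(round(gini_from_strata([10, 20], [1, 1]), 4))
```

If everyone has the same income the function returns 0, and the value climbs toward 1 as income concentrates in the top stratum.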

The Formula: Results

We applied the formula to the classic income distribution (wealth share) problem, using the Census Household and Family Income Report B19001 for each county in South Carolina. These data have 16 income strata. We found the formula is particularly sensitive to changes in the top two strata, not necessarily in the number of persons, but in the average dollar value. In other words, “the tail wags the dog” in this formula. The other critical piece of information needed is what value to assign the highest stratum. The Census Bureau uses approximately $400,000 as the average dollar figure for the top stratum. They calculate this number using volumes of data, so it’s good enough for me.
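The sensitivity to the top stratum is easy to see with a quick experiment. The sketch below recomputes a grouped-data Gini (trapezoid approximation, my assumption about the basic variation) while changing only the dollar value assigned to the top stratum. The four strata and their counts are invented for illustration; they are not the 16 strata of B19001 or any county’s data.

```python
def gini_from_strata(avg_dollars, persons):
    """Grouped-data Gini using the trapezoid approximation of the Lorenz curve."""
    strata = sorted(zip(avg_dollars, persons))
    total_income = sum(dollars * count for dollars, count in strata)
    total_persons = sum(count for _, count in strata)
    cum_pop = cum_income = area = 0.0
    for dollars, count in strata:
        prev_pop, prev_income = cum_pop, cum_income
        cum_pop += count / total_persons
        cum_income += dollars * count / total_income
        area += (cum_pop - prev_pop) * (cum_income + prev_income)
    return 1.0 - area


# Hypothetical strata: average dollars for the lower three groups,
# and household counts for all four groups.
lower_dollars = [15_000, 40_000, 80_000]
households = [300, 400, 250, 50]

# Change only the value assigned to the top stratum; counts stay the same.
results = {}
for top_value in (250_000, 400_000, 600_000):
    results[top_value] = gini_from_strata(lower_dollars + [top_value], households)
    print(f"Top stratum at ${top_value:,}: Gini = {results[top_value]:.3f}")
```

No households move between groups here, yet the coefficient shifts with each assumed top value, which is why the choice of that figure matters so much.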

After making our calculations, the formula really did reveal a number of interesting trends. One, the impact of the economy on higher wage earners – in the case of these data – is very delayed. In other words, higher income households continued to make money well into the latest recession. The other revealing attribute is the effect of a rising tide. A rising tide does in fact lift boats, but some higher than others, and in the process it also sinks a few! In this case, households with higher incomes grew at a proportionally higher rate than those with lower incomes, and in some counties, household income (high and low) was hit particularly hard.

Conclusions

Now that you understand the formula, if you use these data, the Census Bureau has already done the Gini Coefficient income calculations for you! Yes, to my surprise, the Bureau has been doing this calculation since the 1990s. The file is B19083. It may sound like I have given you a shortcut, but now you have to figure out the new American Community Survey GUI. Good luck!

Acknowledgement: Thank you to the staff at the US Census Bureau for assisting me in understanding key drivers of the Gini Coefficient.