Monday, November 23, 2009

Variable Selection Methods in Regression

There has been discussion at recent journal clubs about some related econometric issues such as omitted variable bias, multicollinearity and control variables, parsimony in regressions, and, more broadly, how to select variables for an econometric model. This article by Bruce Ratner ("Variable Selection Methods in Regression: Many Statisticians Know Them, But Few Know They Produce Poorly Performing Models") is a useful overview of five widely used selection methods (Forward Selection, Backward Elimination, Stepwise, R-squared, and All-possible Subsets).
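For readers who have not met these procedures, here is a minimal sketch of the first of them, forward selection, written with NumPy. It is an illustration only and is not taken from Ratner's article: the function names, the simulated data, and the use of AIC as the entry criterion are all assumptions made for this example (textbook versions often use an F-test or p-value threshold instead).

```python
import numpy as np


def rss(design, y):
    """Residual sum of squares from an OLS fit of y on the design matrix."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)


def forward_selection(X, y, names):
    """Greedy forward selection; AIC as the entry rule is an assumption here."""
    n = len(y)
    intercept = np.ones((n, 1))

    def aic(cols):
        # Gaussian AIC up to an additive constant: n*log(RSS/n) + 2k
        design = np.hstack([intercept, X[:, cols]])
        return n * np.log(rss(design, y) / n) + 2 * design.shape[1]

    selected = []
    remaining = list(range(X.shape[1]))
    best = aic(selected)  # start from the intercept-only model
    while remaining:
        # Try adding each remaining variable and keep the best candidate.
        score, j = min((aic(selected + [j]), j) for j in remaining)
        if score >= best:
            break  # no candidate improves the criterion, so stop
        best = score
        selected.append(j)
        remaining.remove(j)
    return [names[j] for j in selected]


# Simulated data (hypothetical): only x0 and x2 enter the true model.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=200)
names = [f"x{i}" for i in range(5)]
chosen = forward_selection(X, y, names)
print(chosen)
```

The loop makes the mechanical character of the method plain: variables enter purely on the fit criterion, with no role for theory, and a pure-noise regressor can still slip in when it happens to improve the criterion in a given sample.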

4 comments:

Kevin Denny said...

These approaches are very much a statistician's way of thinking about the problem and don't feature that much in econometrics. That is partly because economists are guided more by theory (a Mincer equation has a particular form, for example), but also because we tend to focus on a small number of parameters, often just one. In other words, we are not that interested in model building; maybe we should be. Gelman's review of "Mostly harmless..." [in the Stata Journal, I think] has a nice discussion of this issue.

Anonymous said...

Thanks for the ref (and comments) Kevin; will investigate.

Kevin Denny said...

I'm sure you can get it on his web site. Given the way a lot of labour economics is going with increasing use of more ad hoc variables, perhaps there should be more concentration on variable selection methods. Maybe we can learn something from statisticians after all [I jest]. I suppose to an economist these approaches seem a bit mechanical.

Anonymous said...

It is certainly worthwhile to reflect on what makes econometrics different from applied statistics. We have a theory of human behaviour! At least some of the time...