Examples and data borrowed from Oscar Torres-Reyna’s excellent tutorial on panel data.

library( foreign )
library( car )
Panel <- read.dta("http://dss.princeton.edu/training/Panel101.dta")

Panel$y <- Panel$y / 1000000  # change to millions of dollars

Group Bias

A problem that we encounter often in statistics is bias that is correlated with group structure in the data. This is sometimes referred to as Heterogeneity.

If we have panel data, data where the study variables are measured repeatedly for each group over time, we can account for much of this bias using a “fixed-effect”, a separate intercept term for each group within the data.

All Data Plotted Together

plot( Panel$x1, Panel$y, 
      col=gray(0.5,0.5), 
      pch=19, cex=3, bty="n", 
      xlab="X", ylab="Y" )

abline( lm( y~x1, data=Panel ), col="red", lwd=2 )

Data Plotted by Group

plot( Panel$x1, Panel$y, 
      col=factor(Panel$country), 
      pch=19, cex=3, bty="n", 
      xlab="X", ylab="Y" )

Data Plotted by Group with Fit

fe  <- lm( y ~ x1 + factor(country) - 1, data=Panel )
yhat <- fe$fitted

plot( Panel$x1, yhat, col=factor(Panel$country), 
      pch=19, cex=2, xlab="X", ylab="Y", bty="n" )

# Or use the scatterplot() function in the car library

scatterplot( yhat~Panel$x1|Panel$country, 
             boxplots=FALSE, xlab="x1", 
             ylab="yhat",smooth=FALSE )
abline(  lm(Panel$y~Panel$x1),  lwd=3, col="red")



Fixed Effect Regression Model

We run a fixed effects regression by adding a term for the group strucure, which in this example is “country”. Since this is a categorical variable, R will create multiple dummy variables that each represent a unique intercept for a group. When each group has an intercept like this, we need to omit the model-level intercept by adding a negative one to the model command.



# Regular OLS regression

ols <- lm( y ~ x1, data=Panel )

# Fixed Effects Model

fe  <- lm( y ~ x1 + factor(country) - 1, data=Panel )


library( stargazer )

#  To add a stargazer table to your markdown file, 
#  add this argument to your code chunk:
#
#  {r, results='asis'}

stargazer( ols, fe, type="html", 
           omit.stat = c("adj.rsq", "f","ser"), 
           digits=2)
Dependent variable:
y
(1) (2)
x1 494.99 2,475.62**
(778.86) (1,106.68)
factor(country)A 880.54
(961.81)
factor(country)B -1,057.86
(1,051.07)
factor(country)C -1,722.81
(1,631.51)
factor(country)D 3,162.83***
(909.46)
factor(country)E -602.62
(1,064.29)
factor(country)F 2,010.73*
(1,122.81)
factor(country)G -984.72
(1,492.72)
Constant 1,524.32**
(621.07)
Observations 70 70
R2 0.01 0.44
Note: p<0.1; p<0.05; p<0.01



If we would like to improve the print-out by suppressing the fixed effects in the table, you can use these arguments in stargazer.


# Suppress the printout of all the intercepts

stargazer( ols, fe, type="html", 
           omit.stat = c("adj.rsq", "f","ser"), 
           omit="country",            
           add.lines=list(c("Country Fixed Effects?", "No","Yes")),
           digits=2)
Dependent variable:
y
(1) (2)
x1 494.99 2,475.62**
(778.86) (1,106.68)
Constant 1,524.32**
(621.07)
Country Fixed Effects? No Yes
Observations 70 70
R2 0.01 0.44
Note: p<0.1; p<0.05; p<0.01



As we would expect based upon the picture, the slope in the fixed effects model is larger than the slope in the OLS model. We also see that is it not significant in the OLS model, but it is in the fixed effects model because the fit is not better (each data point is closer to the line of fit since there are now several lines of best fit).