Handcoding a Panel Model

The most basic panel estimator is the pooled OLS model. It combines all observations across both indices (here county and year), ignores the panel structure, and performs a regular Ordinary Least Squares estimation.

{% highlight r %}
# load the plm library for panel estimation
library(plm)

# load the Crime data set
data(Crime)
{% endhighlight %}

{% highlight r %}
# define the model
m1 <- formula(crmrte ~ prbarr + prbconv + polpc)

# create a panel data.frame (pdata.frame) object
PanelCrime <- pdata.frame(Crime, index = c("county", "year"))

# estimate pooled OLS using the basic lm function
lm(formula = m1, data = Crime)
{% endhighlight %}

{% highlight text %}

Call:
lm(formula = m1, data = Crime)

Coefficients:
(Intercept)       prbarr      prbconv        polpc
   0.043643    -0.050993    -0.003251     3.055626

{% endhighlight %}

{% highlight r %}
# estimate the pooled OLS using the plm package
plm(formula = m1, data = PanelCrime, model = "pooling")
{% endhighlight %}

{% highlight text %}

Model Formula: crmrte ~ prbarr + prbconv + polpc

Coefficients:
(Intercept)       prbarr      prbconv        polpc
   0.043643    -0.050993    -0.003251     3.055626

{% endhighlight %}
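
Since this post is about handcoding, it may help to see what both calls compute under the hood. The following is a minimal sketch, assuming a small simulated panel rather than the Crime data so that it runs in base R without any packages. The names `panel`, `X`, `beta_hand` and `beta_lm` are made up for this illustration. It solves the normal equations directly and compares the result with `lm`.

```r
# Simulated stand-in for a panel: 20 units observed over 5 periods
# (hypothetical data, not the Crime data set)
set.seed(1)
n_units   <- 20
n_periods <- 5
panel <- data.frame(
  id   = rep(seq_len(n_units), each = n_periods),
  time = rep(seq_len(n_periods), times = n_units),
  x1   = rnorm(n_units * n_periods),
  x2   = rnorm(n_units * n_periods)
)
panel$y <- 1 + 2 * panel$x1 - 0.5 * panel$x2 + rnorm(nrow(panel))

# Design matrix with an intercept column; the id and time columns are ignored,
# which is exactly what "pooling" means
X <- cbind(`(Intercept)` = 1, as.matrix(panel[, c("x1", "x2")]))
y <- panel$y

# Solve the normal equations (X'X) b = X'y
beta_hand <- drop(solve(crossprod(X), crossprod(X, y)))

# Reference fit with lm()
beta_lm <- coef(lm(y ~ x1 + x2, data = panel))

# Compare the two coefficient vectors
all.equal(beta_hand, beta_lm)
```

The same steps should carry over to the Crime data by building `X` from `prbarr`, `prbconv` and `polpc`. Solving the linear system with `solve(A, b)` rather than inverting `X'X` explicitly is a choice made here for numerical stability.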

A more complex estimation method is the fixed-effects (or within) estimator. If our data contains only two time periods, the results of this estimator are equivalent to an OLS estimation on the first-differenced variables.

{% highlight r %}
# create data.frame with only years 81 and 82
Crime8182 <- subset(Crime, year %in% c(81, 82))

# put into panel data.frame form (pdata.frame)
PanelCrime8182 <- pdata.frame(Crime8182, index = c("county", "year"))

# first difference the non-panel data.frame
library(dplyr)
{% endhighlight %}

{% highlight text %}

Attaching package: ‘dplyr’

The following object is masked from ‘package:plm’:

    between

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

{% endhighlight %}

{% highlight r %}
# difference each variable within county
# (assumes rows are sorted by year within each county)
Crime8182FD <- Crime8182 %>%
  group_by(county) %>%
  summarise(crmrte  = diff(crmrte),
            prbarr  = diff(prbarr),
            prbconv = diff(prbconv),
            polpc   = diff(polpc))

# use lm to estimate the two-period fixed-effects model
lm(formula = m1, data = Crime8182FD)
{% endhighlight %}

{% highlight text %}

Call:
lm(formula = m1, data = Crime8182FD)

Coefficients:
(Intercept)       prbarr      prbconv        polpc
 -6.133e-05   -1.965e-02   -1.537e-03    3.358e+00

{% endhighlight %}

{% highlight r %}
# verify with the plm package
plm(formula = m1, data = PanelCrime8182, model = "fd")
{% endhighlight %}

{% highlight text %}

Model Formula: crmrte ~ prbarr + prbconv + polpc

Coefficients:
(intercept)       prbarr      prbconv        polpc
-6.1332e-05  -1.9645e-02  -1.5365e-03   3.3584e+00

{% endhighlight %}
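
To see why first differencing and the within transformation coincide with two periods, here is a minimal sketch on simulated data (base R only). The data and the names `two_period`, `fd_slope` and `within_slope` are assumptions made for this illustration. With two periods, each demeaned value is plus or minus half the first difference, so a regression through the origin gives the same slope either way.

```r
# Simulated two-period panel with a unit-specific effect (hypothetical data)
set.seed(2)
n_units <- 30
two_period <- data.frame(
  id   = rep(seq_len(n_units), each = 2),
  time = rep(1:2, times = n_units),
  x    = rnorm(2 * n_units)
)
alpha <- rep(rnorm(n_units), each = 2)  # constant over time within each unit
two_period$y <- alpha + 1.5 * two_period$x + rnorm(2 * n_units)

# First differences: period 2 minus period 1 for every unit
# (both subsets keep the units in the same order)
first  <- two_period[two_period$time == 1, ]
second <- two_period[two_period$time == 2, ]
d_x <- second$x - first$x
d_y <- second$y - first$y
fd_slope <- coef(lm(d_y ~ 0 + d_x))

# Within transformation: subtract each unit's own mean (ave() defaults to mean)
w_x <- two_period$x - ave(two_period$x, two_period$id)
w_y <- two_period$y - ave(two_period$y, two_period$id)
within_slope <- coef(lm(w_y ~ 0 + w_x))

# The two slopes agree up to floating point error
all.equal(unname(fd_slope), unname(within_slope))
```

Note that this sketch drops the intercept from the differenced regression. Keeping it, as `lm` does by default, corresponds to adding a period dummy to the within model, so the slopes from a plain within fit can then differ slightly.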

If our data set contains more than two time periods, we need to estimate a proper fixed-effects model. This is done by creating a fixed-effect (dummy) variable for every level of the cross-sectional index (i.e. the non-time index). A simple way of doing this is to encode the cross-sectional index as a factor and include that factor in the regression (more on factors/categorical variables in the post on Handcoding a Linear Model).

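{% highlight r %}
# include one dummy per county via factor()
fe <- lm(formula = crmrte ~ prbarr + prbconv + polpc + factor(county), data = Crime)

# show only the slope coefficients, skipping the intercept and county dummies
fe$coefficients[2:4]
{% endhighlight %}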

{% highlight text %}

      prbarr      prbconv        polpc
-0.008008440 -0.001010476  2.029003066

{% endhighlight %}

{% highlight r %}
# verify with the plm package
plm(formula = m1, data = PanelCrime, model = "within")
{% endhighlight %}

{% highlight text %}

Model Formula: crmrte ~ prbarr + prbconv + polpc

Coefficients:
    prbarr    prbconv      polpc
-0.0080084 -0.0010105  2.0290031

{% endhighlight %}
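
The dummy-variable regression above is one way to handcode the estimator. The name "within" points to another: subtract each unit's mean from every variable and run OLS on the demeaned data. Below is a minimal sketch, again assuming simulated data and base R only. The helper `demean` and the object names are made up for this illustration. It checks that both routes give the same slopes.

```r
# Simulated panel with unit-specific effects (hypothetical data)
set.seed(3)
n_units   <- 15
n_periods <- 6
panel <- data.frame(
  id = rep(seq_len(n_units), each = n_periods),
  x1 = rnorm(n_units * n_periods),
  x2 = rnorm(n_units * n_periods)
)
alpha <- rep(rnorm(n_units, sd = 2), each = n_periods)
panel$y <- alpha + 0.8 * panel$x1 - 1.2 * panel$x2 + rnorm(nrow(panel))

# Route 1: dummy-variable regression, one dummy per unit via factor()
lsdv <- lm(y ~ x1 + x2 + factor(id), data = panel)
beta_lsdv <- coef(lsdv)[c("x1", "x2")]

# Route 2: within transformation, i.e. subtract the unit mean from each variable
demean <- function(v, g) v - ave(v, g)  # ave() returns the group mean per row
within_data <- data.frame(
  y  = demean(panel$y,  panel$id),
  x1 = demean(panel$x1, panel$id),
  x2 = demean(panel$x2, panel$id)
)
beta_within <- coef(lm(y ~ 0 + x1 + x2, data = within_data))

# Both routes yield the same slope estimates
all.equal(beta_lsdv, beta_within)
```

Only the coefficients are compared here. The standard errors that `lm` reports for the demeaned regression do not account for the estimated unit means in the degrees of freedom, so they should not be taken at face value.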

This post is licensed under CC BY 4.0 by the author.