# Interrupted time series & regression discontinuity

Causal impact assessment workshop

In this practical, you will create several versions of interrupted time-series models to estimate counterfactual cigarette sales. For the advanced time-series models with autoregression and differencing, we will use the package `fpp3`. First, load the following two packages:

```
library(tidyverse)
library(fpp3)
```

We will again be using the Proposition 99 dataset:

`prop99 <- read_rds("raw_data/proposition99.rds")`

## Data preparation

In this practical, we will need to transform our dataset into a `tsibble` (a time-series table object). This is necessary for the `fpp3` package to figure out which column indicates time. We also only need the cigarette sales data from California for this practical. You can prepare the data by running the following code:

```
# try to figure out what each line does!
prop99_ts <-
  prop99 |>
  filter(state == "California") |>
  select(year, cigsale) |>
  mutate(prepost = factor(year > 1988, labels = c("Pre", "Post"))) |>
  as_tsibble(index = year) |>
  mutate(year0 = year - 1989)
```

Note that we have also already included a `prepost` variable in the preparation, which can be used to filter the pre- or post-intervention time-series.

## Growth curve

Through estimating the effect of time on the pre-intervention time-series, we can create a prediction for the post-intervention counterfactual which includes a trend. In some fields, this approach is called estimating a “growth curve”.
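As a rough illustration of this idea, the sketch below fits a linear trend to the pre-intervention years only and extrapolates it into the post-intervention period. It uses a small simulated series, not the real Proposition 99 data; the object names `toy`, `fit_growth`, `cf`, and `effect` are made up for this example, and a straight-line trend is only one possible choice of growth curve.

```r
# Simulated stand-in for the California series (NOT the real prop99 data).
# Column names mirror those created in the data preparation step.
set.seed(1)
toy <- data.frame(year = 1970:2000)
toy$year0   <- toy$year - 1989
toy$prepost <- factor(toy$year > 1988, labels = c("Pre", "Post"))
toy$cigsale <- 120 - 1.5 * (toy$year - 1970) -
  15 * (toy$prepost == "Post") + rnorm(nrow(toy), sd = 2)

# Fit a linear trend on the pre-intervention years only
pre  <- subset(toy, prepost == "Pre")
post <- subset(toy, prepost == "Post")
fit_growth <- lm(cigsale ~ year0, data = pre)

# Extrapolate that trend into the post period: this is the counterfactual,
# with a prediction interval (columns fit, lwr, upr)
cf <- predict(fit_growth, newdata = post, interval = "prediction")

# Estimated effect in each post-intervention year: observed minus counterfactual
effect <- post$cigsale - cf[, "fit"]
mean(effect)
```

In the practical itself, you would replace `toy` with `prop99_ts` and inspect whether a linear trend is a reasonable description of the pre-intervention years before trusting the extrapolation.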

## Time-series model

Time-series techniques also take into account autocorrelations and the idea that recent values of the outcome of interest have more predictive power over the current value than values far in the past. With the `fpp3` package, we can perform data-driven model selection for the Proposition 99 dataset and produce a counterfactual with automatic uncertainty quantification.
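One way this could look is sketched below: `ARIMA()` from the `fable` package (loaded by `fpp3`) chooses the model orders automatically on the pre-intervention years, and `forecast()` then produces the counterfactual as a forecast distribution. The series is simulated for illustration only, and the names `toy_ts`, `fit_arima`, `fc`, and `fc_int` are assumptions of this example rather than part of the practical.

```r
library(fpp3)

# Simulated stand-in series (NOT the real prop99 data)
set.seed(2)
toy_ts <- tibble(
  year    = 1970:2000,
  cigsale = 120 - 1.5 * (year - 1970) - 15 * (year > 1988) + rnorm(31, sd = 2)
) |>
  mutate(prepost = factor(year > 1988, labels = c("Pre", "Post"))) |>
  as_tsibble(index = year)

# Let ARIMA() select the orders automatically, using pre-intervention years only
fit_arima <- toy_ts |>
  filter(prepost == "Pre") |>
  model(arima = ARIMA(cigsale))

# Forecast as many years ahead as there are post-intervention observations
n_post <- sum(toy_ts$prepost == "Post")
fc <- forecast(fit_arima, h = n_post)

# Add 95% prediction intervals to the forecast table
fc_int <- hilo(fc, level = 95)

# Observed minus forecast mean gives a per-year effect estimate
post_obs <- filter(toy_ts, prepost == "Post")
effect <- post_obs$cigsale - fc$.mean
```

Calling `report(fit_arima)` shows which model was selected, and `autoplot(fc, toy_ts)` plots the forecast with its intervals on top of the observed series.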

## Regression Discontinuity

Regression discontinuity designs (RDDs) share many similarities with Interrupted Time Series approaches. In an RDD analysis, we typically fit a **piecewise** linear model of some kind, and test whether the relationship between two variables changes on either side of a threshold. In the context of Interrupted Time Series, this often amounts to fitting a **growth-curve** type model on the full time-series, including main and interaction effects of an intervention indicator.
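A minimal sketch of such a model, assuming a linear trend on each side of the threshold, is shown below. The data are simulated (with a built-in drop in level and a steeper decline after the intervention), and the names `toy` and `fit_rdd` are hypothetical. Because `year0` is zero in 1989, the `prepostPost` coefficient can be read as the change in level at the first post-intervention year, and `year0:prepostPost` as the change in slope.

```r
# Simulated stand-in series (NOT the real prop99 data):
# a level drop of 15 and an extra slope of -1 per year after the intervention
set.seed(3)
toy <- data.frame(year = 1970:2000)
toy$year0   <- toy$year - 1989
toy$prepost <- factor(toy$year > 1988, labels = c("Pre", "Post"))
is_post     <- toy$prepost == "Post"
toy$cigsale <- 120 - 1.5 * (toy$year - 1970) -
  15 * is_post - 1 * toy$year0 * is_post + rnorm(nrow(toy), sd = 2)

# Piecewise linear model on the full series:
# main effects of time and intervention, plus their interaction
fit_rdd <- lm(cigsale ~ year0 * prepost, data = toy)

# Coefficient table: intercept, pre-intervention slope,
# level change at year0 = 0, and slope change
coef(summary(fit_rdd))
```

Note that the standard errors from `lm()` assume independent residuals, which a time series may well violate, so they should be read with some caution.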

## Conclusion

In this practical, you have used growth curves and time-series models to estimate the effect of the Proposition 99 policy intervention. You have seen that these models can be used to “impute” the counterfactual or to directly parameterize the change in the target variable after the intervention. There are many details we skipped over here, such as how best to perform model selection and the many different RDD-type analyses you could perform, but this provides a basic starting point. Notice how different model types can give both different point estimates of the causal effect and very different quantifications of our uncertainty around that effect. In particular, ARIMA-type models, which are explicitly designed for forecasting, will often reflect the idea that we become more and more uncertain about the future the farther ahead we want to predict. There is likely no simple answer to the question of which of these models or approaches should be preferred in practice.