
Method | An Introduction to Time Series Regression

Posted on: March 18, 2024

Table of contents

1. Introduction

This section briefly discusses what time series data is, using the static model and the finite distributed lag model as two simple examples.

1.1 What is a Time Series?

A time series is a sequence of observations on one or more variables ordered in time, e.g. $\{y_t : t = 1, \dots, n\}$. Unlike cross-sectional data, the temporal ordering matters, and the observed series can be viewed as one realization of an underlying stochastic process.

1.2 Static Model

A static model describes a contemporaneous relationship between $y_t$ and $z_t$ (e.g. a static Phillips curve relating inflation to the unemployment rate in the same year):

$$y_t = \beta_0 + \beta_1 z_t + u_t$$

1.3 Finite Distributed Lag Model (FDL)

A finite distributed lag model includes lagged values of the independent variable:

$$y_t = \alpha_0 + \delta_0 z_t + \delta_1 z_{t-1} + \cdots + \delta_q z_{t-q} + u_t$$

Coefficient interpretations:

- $\delta_0$ is the impact propensity: the immediate change in $y$ following a one-unit increase in $z$ at time $t$.
- $\delta_0 + \delta_1 + \cdots + \delta_q$ is the long-run propensity (LRP): the eventual change in $y$ after a permanent one-unit increase in $z$.

Remarks:

2. Classical Assumptions

i.e. Finite Sample Properties of OLS estimators

2.1 Assumptions for Unbiasedness

TS1: Linear in Parameters

The stochastic process $\{(\mathbf{x}_t, y_t)\}$ follows the linear model ($t = 1, \dots, n$):

$$y_t = \beta_0 + \beta_1 x_{t1} + \cdots + \beta_k x_{tk} + u_t$$

Remarks: $x_{tj}$ can be a lag term of an independent or dependent variable, such as $y_{t-1}$ or $y_{t-2}$

TS2: No Perfect Collinearity

In the time series process, no independent variable is constant or a perfect linear combination of the others.

TS3: Zero Conditional Mean (i.e. Strict Exogeneity)

$$E(u_t \mid \mathbf{X}) = 0, \quad t = 1, \dots, n$$

It implies both contemporaneous exogeneity:

$$E(u_t \mid \mathbf{x}_t) = 0 \quad \text{for all } t$$

and non-contemporaneous exogeneity:

$$E(u_t \mid \mathbf{x}_s) = 0 \quad \text{for all } t \ne s$$

Remarks:

2.2 Additional Assumptions for BLUE

TS4: Homoskedasticity

$$Var(u_t \mid \mathbf{X}) = Var(u_t) = \sigma^2$$

TS5: No Serial Correlation

$$Cov(u_t, u_s \mid \mathbf{X}) = 0 \quad \text{for all } t \ne s$$

Remarks:

2.3 Additional Assumptions for MVUE

TS6: Normality

$$u_t \perp \mathbf{X}, \qquad u_t \overset{i.i.d.}{\sim} N(0, \sigma^2)$$

Remarks:

2.4 Other Topic: Trend and Seasonality

3. Modern Assumptions

3.1 Prerequisites: Stationarity and Weak Dependence

Definition of Stationarity: The stochastic process $\{x_t\}$ is stationary if for every collection of time indices $1 \le t_1 < t_2 < \cdots < t_m$, the joint distribution of $(x_{t_1}, x_{t_2}, \dots, x_{t_m})$ is the same as the joint distribution of $(x_{t_1+h}, x_{t_2+h}, \dots, x_{t_m+h})$ for all integers $h \ge 1$.

$$E(x_t) = \text{constant}, \qquad Var(x_t) = \text{constant}, \qquad Cov(x_t, x_{t+h}) = f(h), \quad \forall t,\ h \ge 1$$

Definition of Weak Dependence: The stochastic process $\{x_t\}$ is weakly dependent if $x_t$ and $x_{t+h}$ are "almost independent" as $h$ increases without bound.

$$Cov(x_t, x_{t+h}) \rightarrow 0 \ \text{as}\ h \rightarrow \infty$$
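For example, the AR(1) process $x_t = \rho x_{t-1} + e_t$ with $|\rho| < 1$ and i.i.d. innovations $e_t$ (mean $0$, variance $\sigma_e^2$) is covariance stationary and weakly dependent, since its autocovariances decay geometrically:

$$Cov(x_t, x_{t+h}) = \rho^h \, Var(x_t) = \rho^h \, \frac{\sigma_e^2}{1 - \rho^2} \rightarrow 0 \ \text{as}\ h \rightarrow \infty$$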

Remarks:

3.2 Assumptions for Consistency

TS1’: Linearity + Stationarity & Weak Dependence

TS2’: No Perfect Collinearity

TS3’: Zero Conditional Mean

TS1’ + TS2’ + TS3’ → OLS estimators are consistent.

3.3 Additional Assumptions for Asymptotic Normality

TS4’: Homoskedasticity

TS5’: No Serial Correlation

TS1’ + TS2’ + TS3’ + TS4’ + TS5’ → OLS estimators are asymptotically normally distributed.

4. Serial Correlation

Serial correlation indicates a violation of TS5 or TS5’, which might be the most common problem in time series analysis. This section discusses the potential consequences of this violation, how to test for serial correlation, and how to address this problem.

4.1 Consequence

In the context of classical assumptions, serial correlation does not affect unbiasedness as long as the model satisfies TS1 to TS3. Similarly, under modern assumptions, serial correlation does not affect consistency as long as the model meets assumptions TS1’ to TS3’.

However, it does invalidate the usual OLS standard errors and other test statistics, rendering the associated statistical tests unreliable. As for R-squared: if the time series is stationary and weakly dependent, R-squared remains useful even in the presence of serial correlation.

A common misconception is that the OLS estimator is inconsistent whenever lagged dependent variables are present ❌. The two examples below show that the OLS estimator of an AR model can be either consistent or inconsistent.

Example: AR(1) model with contemporaneous exogeneity

$$y_t = \beta_0 + \beta_1 y_{t-1} + u_t, \qquad E(u_t \mid y_{t-1}) = 0$$

Here contemporaneous exogeneity holds, so OLS is consistent, even though strict exogeneity fails and the estimator is biased in finite samples.

Example: AR(1) model with AR(1) errors

$$y_t = \beta_0 + \beta_1 y_{t-1} + u_t, \qquad u_t = \rho u_{t-1} + e_t$$

Here the error $u_t$ is correlated with the regressor $y_{t-1}$ (both depend on $u_{t-1}$), so even contemporaneous exogeneity fails and OLS is inconsistent.
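A quick simulation sketch illustrates the contrast (Python with made-up parameters $\beta_1 = 0.5$, $\rho = 0.5$, and intercepts set to zero for simplicity): OLS recovers $\beta_1$ in the first model but converges to a different value in the second.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
beta1, rho = 0.5, 0.5   # made-up true values for the sketch

def ols_slope(y):
    """OLS slope from regressing y_t on y_{t-1} (with an intercept)."""
    x, yy = y[:-1], y[1:]
    return np.cov(x, yy)[0, 1] / np.var(x)

e = rng.standard_normal(n)

# Case 1: AR(1) with i.i.d. errors -- contemporaneous exogeneity holds
y1 = np.zeros(n)
for t in range(1, n):
    y1[t] = beta1 * y1[t-1] + e[t]

# Case 2: AR(1) with AR(1) errors -- u_t is correlated with y_{t-1}
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t-1] + e[t]
y2 = np.zeros(n)
for t in range(1, n):
    y2[t] = beta1 * y2[t-1] + u[t]

print(ols_slope(y1))  # close to the true beta1 = 0.5
print(ols_slope(y2))  # noticeably above 0.5 -- OLS is inconsistent here
```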

4.2 Testing

1) Regressors are strictly exogenous

When the independent variables are strictly exogenous, we can test for AR(1) serial correlation,

$$u_t = \rho u_{t-1} + e_t,$$

using the OLS residuals $\hat{u}_t$:

- t test: regress $\hat{u}_t$ on $\hat{u}_{t-1}$ and test $H_0: \rho = 0$ with the usual t statistic
- Durbin-Watson test

2) Regressors are not strictly exogenous

However, when the independent variables are not strictly exogenous (e.g. an AR(1) model, or any model with a lagged dependent variable among the regressors), or when we know nothing about strict exogeneity, we should test for AR(1) serial correlation by regressing the OLS residual $\hat{u}_t$ on all the regressors $x_{t1}, \dots, x_{tk}$ and $\hat{u}_{t-1}$, then applying a t test to the coefficient on $\hat{u}_{t-1}$; including the regressors controls for their possible correlation with the lagged residual.

Meanwhile, this approach is easily extended to higher orders of serial correlation: include $\hat{u}_{t-1}, \dots, \hat{u}_{t-q}$ in the auxiliary regression and test their joint significance with an F test (the Breusch-Godfrey test).

4.3 Correction

1) Regressors are strictly exogenous

$$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \cdots + \beta_k x_{tk} + u_t, \qquad u_t = \rho u_{t-1} + e_t$$

If we quasi-difference (subtract $\rho$ times the lagged equation), the transformed error is the serially uncorrelated $e_t$:

$$y_t - \rho y_{t-1} = (1-\rho)\beta_0 + \beta_1(x_{t1} - \rho x_{t-1,1}) + \cdots + e_t$$

Hence, we could use feasible GLS:

1. Run OLS of $y_t$ on $x_{t1}, \dots, x_{tk}$ and obtain the residuals $\hat{u}_t$.
2. Regress $\hat{u}_t$ on $\hat{u}_{t-1}$ to obtain $\hat{\rho}$.
3. Re-run OLS on the quasi-differenced equation, using $\hat{\rho}$ in place of $\rho$ (iterating if desired).

Remarks: This process is called the Cochrane-Orcutt procedure, which omits the first observation in the final estimation step. Another similar procedure, called Prais-Winsten estimation, includes the first observation in the last step. However, asymptotically, it makes no difference whether or not the first observation is used.

2) Regressors are not strictly exogenous

In this case, we still use OLS to obtain consistent estimates, but compute corrected standard errors and test statistics separately (i.e. serial correlation-robust standard errors).

E.g. Newey-West standard errors (with truncation parameter $g$):

```r
library(sandwich)

# Newey-West covariance matrix of the OLS estimates (truncation lag g = 4)
Sigma <- NeweyWest(model, lag = 4)

# Newey-West standard errors for spdlaw and beltlaw (vs. the usual OLS se)
se_spdlaw  <- sqrt(diag(Sigma))["spdlaw"]      # 0.02547 (OLS se: 0.02057)
se_beltlaw <- sqrt(diag(Sigma))["beltlaw"]     # 0.03336 (OLS se: 0.02323)
```

The parameter $g$ (the `lag` argument) controls how much serial correlation we allow for when computing the standard errors; larger values account for correlation between errors at longer lags.