Statistics 730 Time Series Analysis

Spring 2017 MW 5-6:15, Mth 1313

Instructor: Eric Slud, Statistics Program, Math. Dept. Office: Mth 2314, x5-5469

For a set of Sample Problems for the In-Class Test, click here.

Course Text: R. Shumway & D. Stoffer, Time Series Analysis and its Applications, 2nd ed. 2006, Springer.

(This text is free as an e-book to UMCP students through the library: see also the website with datasets and errata.)

Recommended text: H. Lütkepohl, New Introduction to Multiple Time Series Analysis, 2005, Springer. (Also free as e-book.)

Overview: This course covers the concepts and tools of statistical time series analysis, both from a mathematical and a data-analytic viewpoint. Course segments on mathematical tools will be interleaved with segments emphasizing model-building, statistical analysis (in R), and simulation. The course introduces methods both in the time and frequency domains. The mathematical theorems and proofs are an essential part of the course. Students will be required to make further mathematical arguments and extensions in graded homework problems, and understanding of the conditions under which the techniques are valid will be tested.

Prerequisite: Stat 700 plus a graduate course in mathematical analysis, plus some computing familiarity.

Course requirements and Grading: There will be 7 or 8 graded homework sets (one every 1½ to 2 weeks) which together will count 40% of the course grade. There will also be an in-class test and a final course project (or take-home test), each of which will count as 30% of the course grade.

NOTE ON USE OF THEORETICAL MATERIAL. Both in homeworks and the in-class test, there will be theoretical material involving probability theory as needed to apply the law of large numbers and central limit theorem, along with the 'delta method' (Taylor linearization), linear algebra and other manipulations at advanced-calculus level, in some cases verging on measure-theoretic probability techniques. (Look at Appendix A of the Shumway-Stoffer book to see what I mean.) There will also be some use of Hilbert space methods. The theoretical material in the Shumway and Stoffer book is concentrated in the Appendices, but that material will be supplemented in class.

Course Coverage: Chapters 1-5 and Appendices A, B, C of the Shumway and Stoffer book, plus material from Chapters 6-7 as time permits.

NOTE ON COMPUTING. Both in the homework sets and the course project, you will be required to do computations on real datasets using a statistical-computing platform such as R or SAS or MATLAB. The book and the various class demonstrations and scripts on this web-page use R, and that is the only software platform that I will use or provide help with. If you are learning one of these packages for the first time, I strongly recommend R, and I will provide links to free online materials introducing it. In addition, there is a concise introduction to R commands in time series analysis that you should consult.

COMPUTER ACCOUNTS. Math, Stat, and AMSC graduate students have access to R, MATLAB and SAS through their Mathnet and glue accounts. R is freely available in Unix or PC form through this link.

Getting Started with R

**Note:** In this course, the book and I will make many references to the **R**
language and statistical programming platform. This is a free software package. If you are new to **R**, you should get started as soon as
possible, using it either on your university *Glue account* in a *Linux* setting, or on a
workstation or PC, either at the University or on your home computer by downloading the software
following instructions at the **R website**.
For the systematic **Introduction to R** and **R reference manual** distributed with the R
software, either download from the R website or simply
invoke the command

> help.start()

from within R. For a quick start, see my own
Rbasics handout originally
intended for a Survival Analysis class, and then read more about **R** objects and syntax in the
Venables and Ripley text, in my Stat 705 Lecture Notes, and in the R introduction manual distributed with
the R software.
A really useful short summary of a lot of R commands can be found here. See also the previously mentioned concise introduction to R commands in time series analysis.
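As a first taste of the time series commands referred to above, here is a minimal sketch in base R (no add-on packages). The seed, the series length, and the AR coefficient 0.6 are illustrative assumptions and do not come from the course materials.

```r
# Minimal sketch: simulate an AR(1) series and compare sample and theoretical ACF.
set.seed(730)                      # arbitrary seed, for reproducibility only
n   <- 500                         # illustrative series length
phi <- 0.6                         # illustrative AR(1) coefficient (an assumption)

# arima.sim returns a 'ts' object; innovations are standard normal by default
x <- arima.sim(model = list(ar = phi), n = n)

# Sample autocorrelations at lags 0..10, without drawing the plot
rho_hat <- acf(x, lag.max = 10, plot = FALSE)$acf[, 1, 1]

# Theoretical AR(1) autocorrelations (phi^h) for comparison
rho_true <- ARMAacf(ar = phi, lag.max = 10)

print(round(cbind(lag = 0:10, sample = rho_hat, theory = rho_true), 3))
```

Calling `acf(x)` or `plot(x)` without `plot = FALSE` draws the usual correlogram or time plot.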

R Logs

For R practice logs that will periodically illustrate R commands related to time series

data and exercises, see RLogsS730 Directory.

Assignment 1. (First 1½ weeks of course, HW with 7 Problems due Mon., Feb. 6).

Read Chapter 1 through Sec. 1.5, plus Section A.1 (Appendix A).

Solve problems #1.3, 1.8, 1.9, 1.13, 1.15, 1.16(b), plus one more problem, below.

In #1.3, you may use R commands as in Example 1.10 as the problem suggests, or you may code the generation of the time-series variables directly.

Extra Problem, not in text: (i) Prove that if X(t) is a stochastic process with finite second moments for integer indices t, and for each n ≥ 1, X_{n}(t) is a strictly stationary process in t, also with finite second moments, such that E(X_{n}(t)-X(t))^{2} → 0 as n → ∞, then X(t) is strictly stationary.

(ii) Prove the same assertion as in (i) with "strictly stationary" replaced by "weakly stationary".

Assignment 2. (Second 1½ weeks of course, HW with 7 Problems due Wed., Feb. 15).

Read Chapter 2 through Example 2.7, plus Sections A.2 and B.1 (Appendices A, B).

Solve problems #2.2, 2.4, 2.5, 2.6, 2.8 (counts as 2 problems), plus one more problem, below.

Extra Problem, not in text: Suppose that w_{t} for all integer t is a (0,σ^{2}) White Noise, and that X_{t} = ∑_{j=-2}^{2} w_{t-j}, Y_{t} = X_{t-1} + X_{t} + X_{t+1}. (i) Derive γ_{X}(h) and γ_{Y}(h). (ii) Express Y_{t} as a Moving Average of w_{t}. (iii) Prove that w_{t} cannot be expressed as a finite-order moving average of X_{t}.

Assignment 3. (Weeks 4-5 of course, HW with 7 Problems due Wed., Mar. 1).

Read Chapter 3 through Section 3.4, plus Sections B.2 and B.4 (Appendix B).

Solve problems #3.2 (counts as 2 problems), 3.3, 3.6, 3.8, plus two more problems, below.

Extra Problems, not in text: (I). Suppose that (X_{t}, t=1,2,3,4) are jointly Gaussian mean 0 with 4x4 covariance matrix Σ_{j,k} = r(j-k) where r(0) = 2, r(1) = r(-1) = 1, r(2) = r(-2) = 0.5, r(3) = r(-3) = 0.

(a) Find the partial correlation of X_{1} and X_{3} (given X_{2}).

(b) Find the partial correlation of X_{1} and X_{4} (given X_{2}, X_{3}).

(II). Show that in order for the AR(2) with autoregressive polynomial φ(z) = 1 - c_{1}z - c_{2}z^{2} to be causal, the parameters (c_{1}, c_{2}) must lie in the region of pairs such that c_{1} + c_{2} < 1, c_{2} - c_{1} < 1, and |c_{2}| < 1. Are these conditions sufficient for causality?

Assignment 4. (Weeks 6-7 of course, HW with 7 Problems due Fri., Mar. 17 6pm).

Read Chapter 3, Sections 3.4 through 3.8.

Solve problems 3.11, 3.12 (proof of P3.4 only), 3.15, 3.17, 3.23, 3.27, plus one more.

Extra Problem: Suppose that X, Y are scalar r.v.'s and Z a p-vector variable, and denote the covariance matrix (assumed finite) of the p+2 dimensional random vector (X,Y,Z) as B. Show that the partial correlation of X, Y given Z is 0 if and only if the (1,2) entry of B^{-1} is 0.

Assignment 5. (Weeks 8-9 of course, HW with 7 Problems due Wed. April 5 in-class).

Read Chapter 4, Sections 4.1-4.8.

Solve problems 3.40 (Hint: project onto the space spanned by {w_{j}-w_{0}, j=1,...,n}), plus 4.4, 4.5, 4.6, 4.10, 4.13, 4.20.

Assignment 6. (Weeks 10-11 of course, HW with 9 Problems due Mon. May 1).

Read Chapter 4, Sections 4.5-4.8, 4.10, 4.11.

Solve problems 4.8, 4.23, 4.25, 4.28 plus 5 more, immediately following.

(I). (Counts as 2 problems.) (a) Simulate a long (n ≥ 1000) time series with the stationary ARMA(1,2) model X_{t} - 0.3 X_{t-1} = (1-0.5B)(1-0.2B)W_{t}, with W_{t} standard normal.

Verify that your estimates of the parameters γ(0) and γ(1) agree reasonably closely

with the theoretically correct values of these parameters.

(b) Find an analytical expression for the spectral density of the X_{t}process, and plot it in

a suitably labeled graph.

(c) Overplot on the same graph (with a different line-type or color) a smoothed periodogram

estimator (with no tapering) based on a Daniell kernel with L=21 points (each with weight 1/21).

(d) Also overplot on the same graph (again with a different line-type or color) another

smoothed periodogram estimator of the spectral density which gives greater weight to

periodogram ordinates near the center of the lag window consisting of 21 points, specifying

what kernel you used and how you implemented it in the software you used.

(e) Make sure in your solution to part (d) that your scaling of the spectral density and

periodogram are such that the smoothed periodograms are reasonably close to the true spectral density.
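One possible setup for the simulation, the theoretical autocovariances, and the equal-weight Daniell smoother in problem (I) can be sketched in base R as follows. This is only a sketch under stated assumptions: the seed, n = 2000, the truncation of the MA(∞) weights at 200 terms, and the helper name `arma_spec` are illustrative choices, not part of the assignment. It relies on two R conventions: `arima.sim` writes the MA polynomial with plus signs, so (1-0.5B)(1-0.2B) = 1 - 0.7B + 0.1B² is entered as `ma = c(-0.7, 0.1)`, and `spec.pgram` scales the spectrum as a density over frequencies in (-1/2, 1/2] cycles per time step.

```r
# Sketch: ARMA(1,2) simulation, theoretical gamma(0), gamma(1), and a Daniell smoother.
set.seed(1)                    # arbitrary seed
n  <- 2000
ar <- 0.3
ma <- c(-0.7, 0.1)             # (1 - 0.5B)(1 - 0.2B) = 1 - 0.7B + 0.1B^2

# arima.sim uses standard normal innovations by default
x <- arima.sim(model = list(ar = ar, ma = ma), n = n)

# Theoretical gamma(0), gamma(1) from truncated MA(infinity) weights psi_j (psi_0 = 1)
psi    <- c(1, ARMAtoMA(ar = ar, ma = ma, lag.max = 200))
gamma0 <- sum(psi^2)
gamma1 <- sum(psi[-length(psi)] * psi[-1])

# Sample autocovariances at lags 0 and 1
g_hat <- acf(x, lag.max = 1, type = "covariance", plot = FALSE)$acf[, 1, 1]

# Hypothetical helper: ARMA spectral density at frequency nu (cycles per time step),
# f(nu) = sigma2 * |theta(e^{-2 pi i nu})|^2 / |phi(e^{-2 pi i nu})|^2
arma_spec <- function(nu, ar, ma, sigma2 = 1) {
  sapply(nu, function(v) {
    z     <- exp(-2i * pi * v)
    theta <- sum(c(1, ma)  * z^(0:length(ma)))
    phi   <- sum(c(1, -ar) * z^(0:length(ar)))
    sigma2 * Mod(theta)^2 / Mod(phi)^2
  })
}

# Smoothed periodogram: Daniell kernel with 2*10 + 1 = 21 equal weights, no taper
sp     <- spec.pgram(x, kernel = kernel("daniell", 10), taper = 0, plot = FALSE)
f_true <- arma_spec(sp$freq, ar, ma)

print(rbind(theory = c(gamma0, gamma1), sample = g_hat))

# Overplot only in an interactive session
if (interactive()) {
  plot(sp$freq, f_true, type = "l", ylim = range(f_true, sp$spec),
       xlab = "frequency (cycles per time step)", ylab = "spectral density")
  lines(sp$freq, sp$spec, lty = 2)
  legend("topleft", c("true", "Daniell, L = 21"), lty = 1:2)
}
```

If you prefer frequencies in radians on (-π, π], the horizontal axis and the density must both be rescaled consistently, which is the point of part (e).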

(II). Simulate a long (n ≥ 1000) stationary time series with spectral density very close to f(x) = (1 - (x/π)^{2}) for -π < x ≤ π. You can find a two-sided MA process ∑_{j: -a < j ≤ b} c_{j} W_{t-j} with large positive a, b to accomplish this. Overplot a graph of this spectral density f with a smoothed periodogram estimate of the spectral density to show that you did this correctly (and say what lag-window smoother you used, and show the computer code that generated your picture).

(III). (Counts as 2 problems.) Simulate a pair of long, dependent, stationary time series X_{t}, Y_{t} (t=1,...,n, n ≥ 1000) with the model X_{t} - 0.9 X_{t-1} = W_{t} and Y_{t} = 0.5 X_{t-3} + 0.5 V_{t}, where

W_{t}and V_{t}are independent white-noise sequences with Uniform[-1,1] distribution.

(a) Find the theoretical form for the cross-covariance γ_{YX}(h), and show that the form you

find is reproduced in a plot of the estimated cross-covariance from your simulated pair of

time-series.

(b) Find the theoretical form of the cross-spectral density and coherence of X_{t} and Y_{t}.

Assignment 7. Applied Data Analysis HW set, will be due Friday, May 12.

Note that 2 problems have been deleted (because previously assigned) and one substituted, which is just like number (II) from HW6 (see HW6Notes in Rlogs for method).

Read Sections 2.3, 3.7-3.9, 4.10, 5.3, 5.5, 5.6, 6.1 and 6.2.

Do problems 3.31, 3.32, plus 3 more, immediately following.

(A). Consider the SOI series, which we found to have several prominent autocorrelations at

lags k*12, filtered by the seasonal detrending operator 1-B^{12}.

(i) Show that this series has two spectral peaks, when the periodogram is only very slightly

smoothed. Do you think they are both real? Try to smooth the periodogram with lag windows weighting more heavily toward the center of the window.

(ii) Follow the stepwise stochastic linear regression steps we previously used for the original SOI

series on this filtered series. Do you find that the residuals from your fitted models now pass the

Ljung-Box test for model adequacy?

(iii) If not, explain which lags in the residuals contributed most heavily to your Ljung-Box statistic.

(B). Using the smoothed bivariate periodogram tmp = spec.pgram(SOI.Rec, kernel("daniell",4), taper=0) as in the R Log TSAdataAnalysis.txt covered in class, find by inverse FFT the weights for the optimal linear filter approximating Rec[t] by ∑_{j} b_{j} * SOI[t-j].

(C). Simulate a long (n ≥ 1000) stationary time series with spectral density very close to f(x) = 1 for -π/2 < x ≤ π/2 and = 0 for x ≤ -π/2 and x > π/2. You can find a one-sided MA process ∑_{j: 0 ≤ j ≤ b} c_{j} W_{t-j} with large positive b to accomplish this. Overplot a graph of this spectral density f

with a smoothed periodogram estimate of the spectral density to show that you did this correctly (and

say what lag-window smoother you used, and show the computer code that generated your picture).
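One way to approach problem (C) can be sketched as follows: take the Fourier coefficients of the ideal low-pass transfer function, truncate them at |j| ≤ m, and delay the filter by m steps so that it becomes one-sided (a pure delay leaves the spectral density unchanged). This is a sketch under assumptions that are not in the assignment: the spectral density is normalized so that unit-variance white noise has f ≡ 1 (a 1/(2π) convention only rescales the coefficients), the white noise is Gaussian, and m = 200, n = 4000, the seed, and the helper name `transfer` are arbitrary.

```r
# Sketch: one-sided MA whose spectral density approximates the indicator of |omega| <= pi/2.
set.seed(3)                                # arbitrary seed
n <- 4000                                  # length of the simulated series
m <- 200                                   # truncation point (assumption)

# Fourier coefficients of the ideal low-pass transfer function on |omega| <= pi/2
j  <- -m:m
cj <- ifelse(j == 0, 0.5, sin(pi * j / 2) / (pi * j))

# Delay by m steps: b[k + 1] = c_{k - m}, k = 0..2m, giving a one-sided filter
b <- cj

# Apply the filter to Gaussian white noise; the first 2m outputs are NA and are dropped
w <- rnorm(n + 2 * m)
x <- stats::filter(w, filter = b, method = "convolution", sides = 1)
x <- as.numeric(x)[-(1:(2 * m))]

# Hypothetical helper: transfer function of the truncated filter at angular frequency omega
transfer <- function(omega) {
  sapply(omega, function(o) sum(b * exp(-1i * o * (0:(2 * m)))))
}

# Smoothed periodogram (Daniell, 21 equal weights, no taper) on the cycles-per-step scale
sp       <- spec.pgram(ts(x), kernel = kernel("daniell", 10), taper = 0, plot = FALSE)
omega    <- 2 * pi * sp$freq               # convert to radians in (0, pi]
f_target <- as.numeric(omega <= pi / 2)

if (interactive()) {
  plot(omega, f_target, type = "s", ylim = range(0, sp$spec),
       xlab = "frequency (radians)", ylab = "spectral density")
  lines(omega, sp$spec, lty = 2)
}
```

The squared modulus `Mod(transfer(omega))^2` gives the spectral density actually achieved by the truncated filter, which is a useful check separate from the noisy periodogram estimate.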

**SYLLABUS for Stat 730**

**I. Definitions and Constructions of Time Series Models.** (*2 weeks, Ch. 1 & Appendix A*)

A. White Noise, AR, MA, Random Sinusoids

i. **R** basics and time series commands

B. Autocovariance and autocorrelation functions.

C. Strong and Weak Stationarity

D. Review of Multivariate Normal, Convergence of RVs and Distributions, and Limit Theorems (leading to Thm A.2).

**II. Exploratory Data Analysis for Time Series.** (*2 weeks, Ch. 2*)

A. Regression and ANOVA (Gaussian case)

i. Information Criteria and Model Building

ii. Differencing

B. Autocorrelation and Spectrum Estimation (Periodogram)

C. Kernel and Spline Smoothing

**III. Autoregressive Integrated Moving Average (ARIMA) Models.** (*4½ weeks, Ch. 3 & Appendices A,B*)

A. Definitions, Relation to Difference Eq'ns

i. Autocorrelation and Partial Autocorrelation

ii. Prediction; Nonstationary Models

B. Estimation, Model-building

C. Decomposition into Signal, Noise, and Seasonal Components

**IV. Spectral (Fourier) Analysis & Periodogram.** (*4 weeks, Ch. 4 & Appendix C*)

A. Filtered Series, Periodogram & Discrete Fourier Transform

B. Nonparametric vs. Parametric Spectral Estimation

C. Fourier Analysis vs. Wavelets

D. Estimation, Prediction, & Filtering

E. Extensions to Multiple (Vector) Time Series

**V. Miscellaneous Topics.** (*2-3 weeks, Ch. 5 & 6*)

A. GARCH, Long-memory and ARMAX Models

B. State Space Models & Methods

C. Likelihoods in Time-Domain and Spectral Forms, Maximum Likelihood, Missing Data, Structural Models

Project Ideas -- for a list of Project paper Guidelines, click here.

Suggestions for ideas and papers which might be used as the basis for a final report or project will be added

here from time to time. **The Final Project will be due by 5pm Fri., May 19.**

**(1)** Time series methods are sometimes used in connection with repeatedly collected survey data. Two technical

reports that provide good exposition of how sample survey theory and time series ideas combine are

Bell & Hillmer 1987 and Bell & Hillmer 1989, and there are many later references to sample-survey data with

a history of using time-series methods, such as the Current Population Survey monthly employment numbers.

stationary-residual structure and forecast on the basis of it. This approach is especially prominent in econometric

time series, under the heading of "seasonal adjustment" -- the idea is to separate longer-term trends and aspects

of the business cycle from the stationary time series residuals. One of the papers that started all this off is

Cleveland, W., Tiao, G. (1976). Decomposition of seasonal time series: A model for the Census X-11

program. Jour. Amer. Statist. Assoc. 71:581–587.

which is covered in many well-known books and papers, and also in recent papers emphasizing specific methods

for the choice of good smoothers, e.g.

P. Stoica and T. Sundin (1999) Optimally Smoothed Periodogram, Signal Processing Volume 78(3), pp. 253–264,

http://doi.org/10.1016/S0165-1684(99)00066-3.

or a war or market-crash) that causes a dislocation of a previously stationary series in a way that decays

over further time and can be modeled. A famous and seminal paper on this idea is

Box, G. and Tiao, G. (1975), Intervention analysis with applications to economic and environmental

problems, Jour. Amer. Statist. Assoc. vol.70, pp.70-79.

the Box-Ljung-Pierce Q statistic. The Box-Ljung and Pierce papers or a chapter on this topic in some other time

series book could form a very good topic for an expository term project, possibly augmented with real or

simulated-data examples.
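For orientation, the Ljung-Box form of the Q statistic is available in base R through `Box.test`. The sketch below applies it to the residuals of an AR(1) fit to simulated data and reproduces the statistic by hand as Q = n(n+2) ∑_{h=1}^{H} r_{h}^{2}/(n-h). The simulated model, the seed, and the choice H = 20 are illustrative assumptions.

```r
# Sketch: Ljung-Box Q statistic on residuals from a fitted AR(1), two ways.
set.seed(2017)                                   # arbitrary seed
x   <- arima.sim(model = list(ar = 0.5), n = 400)
fit <- arima(x, order = c(1, 0, 0))
res <- residuals(fit)

H  <- 20                                         # number of lags (assumption)
# fitdf = 1 subtracts one degree of freedom for the estimated AR coefficient
lb <- Box.test(res, lag = H, type = "Ljung-Box", fitdf = 1)

# Hand computation from the residual autocorrelations r_1, ..., r_H
n       <- length(res)
r       <- acf(res, lag.max = H, plot = FALSE)$acf[-1, 1, 1]
contrib <- n * (n + 2) * r^2 / (n - 1:H)         # per-lag contributions to Q
Q       <- sum(contrib)
p_value <- pchisq(Q, df = H - 1, lower.tail = FALSE)

print(lb)
print(c(Q = Q, p_value = p_value, largest_lag = which.max(contrib)))
```

The vector `contrib` shows which lags contribute most heavily to the statistic.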

There are parametric-bootstrapping methods (which require specifying the White-Noise error distribution), or

methods based on bootstrapping residuals from fitted models (which do not require specifying error distributions),

or nonparametric methods involving bootstrapping of blocks. There are various papers you might use, especially

one of Politis-Paparoditis cited in Shumway and Stoffer.
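To make the second of these approaches concrete, here is a minimal sketch of a residual bootstrap for the coefficient of a fitted AR(1), written in base R. It is an illustration only, not a method taken from the cited papers: the simulated model, the least-squares fit in the hypothetical helper `fit_ar1`, the burn-in of 50 steps, and B = 500 replications are all assumptions.

```r
# Sketch: bootstrap of residuals from a fitted AR(1) to get a standard error for phi-hat.
set.seed(42)                                     # arbitrary seed
n <- 300
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))

# Hypothetical helper: least-squares AR(1) coefficient of the mean-centered series
fit_ar1 <- function(x) {
  xc <- x - mean(x)
  sum(xc[-1] * xc[-length(xc)]) / sum(xc[-length(xc)]^2)
}

phi_hat <- fit_ar1(x)

# Centered residuals from the fitted model (no error distribution is specified)
xc  <- x - mean(x)
res <- xc[-1] - phi_hat * xc[-n]
res <- res - mean(res)

# Resample residuals, rebuild a series by the fitted recursion, and re-estimate phi
B    <- 500
burn <- 50
phi_star <- replicate(B, {
  e  <- sample(res, size = n + burn, replace = TRUE)
  xs <- stats::filter(e, filter = phi_hat, method = "recursive")
  fit_ar1(as.numeric(xs)[-(1:burn)])
})

se_boot <- sd(phi_star)
print(c(phi_hat = phi_hat, se_boot = se_boot,
        se_asymptotic = sqrt((1 - phi_hat^2) / n)))
```

A parametric bootstrap would replace the `sample(res, ...)` step by draws from a specified error distribution, and a block bootstrap would resample blocks of the observed series instead of residuals.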

Additional Computing Resources. There are many publicly available datasets for practice data-analyses.

Many of them are taken from journal articles and/or textbooks and documented or interpreted. A good

place to start is Statlib. Datasets needed in the course will either be posted to the course web-page,

or indicated by links which will be provided here.

To begin, here are a few time-series websites:

Time Series Data Library

Economic Indicators and Time Series (BLS)

What is a Time Series?

**CourseEvalUM main page: https://www.CourseEvalUM.umd.edu (top button)**

**First Class: Wed., January 25, 2017**

**Spring Break: Mon., Mar. 20 -- Fri., Mar. 24, NO CLASS**

**Mon., April 10, 2017: In-class test**

**Last Class: Wed., May 10, 2017**

**Term Project Due: Fri., May 19, 2017 by 5pm.**


© Eric V Slud, May 8, 2017.