Statistics 710  Advanced Statistics:
Large-Sample Statistical Theory

MWF 10am,    MTH1313          Spring 2011

Instructor: Eric Slud, Statistics Program, Math. Dept.

Office:    Mth 2314, x5-5469, email evs@math.umd.edu

Office hours: initially M3, Th2

Course Text: A. van der Vaart, Asymptotic Statistics (2000),
           Cambridge University Press (paperback).

Assigned work and Grading: the course grade will be based on 7
           homework problem sets assigned throughout the course.

Prerequisite: Stat 700 and Stat 600.

For current HW set, click here.

Class on Friday Feb. 11 will be an in-class presentation of problem
solutions for HW1 by you, the students.

Class on Friday, Feb. 18 will be CANCELLED.


Most of this web-page was designed for Stat 710 as I gave it in the Fall of 2007.
For the Spring of 2011, the emphasis will be more on the early parts of the
van der Vaart book, especially the material on U-statistics, Estimating Equations
(including ML), and Contiguity, and less on Empirical Processes (material from Ch.19).
The latter will be introduced and a few key results from that Chapter will be used
throughout to help prove important rigorous results related to the asymptotic
behavior of solutions of estimating equations.


This course consists of five topical modules on advanced probability and statistical theory, with the common theme of statistical inference from large-sample data. Three of the modules are mostly about Probability Theory tools:

(I) Empirical processes: material generalizing the Law of Large Numbers to provide results about uniform almost-sure convergence of empirical averages of random variables like f(Xj) (for iid r.v.'s Xj), where "uniformity" is over classes of functions f.

For this material, the references are: Chapter 19 of the Van der Vaart book; a 1980 book, "Convergence of Stochastic Processes" by David Pollard; and some results from a 1996 book of Van der Vaart and Wellner, "Weak Convergence and Empirical Processes".
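
As a compact reminder of what "uniform" means in module (I), the Glivenko-Cantelli property of a function class can be written as below. The symbols \mathbb{P}_n f, Pf and \mathcal{F} are notation assumed here for illustration (they follow the usual empirical-process conventions and are not introduced on this page), and measurability questions are ignored in this sketch.

```latex
\[
\mathbb{P}_n f = \frac{1}{n}\sum_{j=1}^{n} f(X_j), \qquad
Pf = E\, f(X_1), \qquad
\sup_{f \in \mathcal{F}} \bigl| \mathbb{P}_n f - Pf \bigr| \;\longrightarrow\; 0
\quad \text{almost surely as } n \to \infty .
\]
```

The ordinary Strong Law of Large Numbers is the special case in which the class \mathcal{F} contains a single integrable function f.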

(II) Contiguity Theory and Local Asymptotic Normality. References here are Chapters 6 and 7 of Van der Vaart's book, possibly supplemented with the books Le Cam, L. and Yang, G. L. (1990), "Asymptotics in Statistics: Some Basic Concepts", or Greenwood, P. E. and Shiryaev, A. N. (1985), "Contiguity and the Statistical Invariance Principle".

Our main application of this material will be to Relative Efficiency of Estimators, Sample Size Formulas, and least-favorable alternatives, with a little exposure to 'asymptotically linear estimators' and influence functions, regular estimators, and the Hájek convolution theorem. (The reference for this latter material is Chapter 8 of van der Vaart.)
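
For orientation, the standard definition of contiguity used in module (II) can be stated as follows. The notation is assumed here for illustration: P_n and Q_n are probability laws on a common measurable space for each n, and A_n ranges over sequences of measurable sets.

```latex
\[
Q_n \triangleleft P_n
\quad\Longleftrightarrow\quad
\Bigl( P_n(A_n) \to 0 \;\Longrightarrow\; Q_n(A_n) \to 0
\ \text{ for every sequence of measurable sets } A_n \Bigr).
\]
```

In words, any sequence of events that is asymptotically negligible under P_n is also asymptotically negligible under Q_n.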

(III) Estimating Equations. Maximum likelihood and generalizations. Minimum contrast, misspecified likelihood, and M estimators. The reference is Chapter 5 of van der Vaart, plus other materials to be filled in later, in conjunction with module (IV) on efficient estimating equations.

(IV) U statistics and Projections. References are Chapters 11 and 12 of van der Vaart.
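
To make the idea of a U statistic in module (IV) concrete, here is a minimal Python sketch that averages a symmetric kernel over all subsets of a sample. The function names, the kernel, and the data are hypothetical choices made for illustration; the example uses the well-known fact that the order-2 kernel h(x, y) = (x - y)^2 / 2 yields the unbiased sample variance.

```python
from itertools import combinations
from math import comb
from statistics import variance


def u_statistic(data, kernel, order):
    """Average a symmetric kernel over all size-`order` subsets of `data`."""
    total = sum(kernel(*subset) for subset in combinations(data, order))
    return total / comb(len(data), order)


def variance_kernel(x, y):
    # Symmetric kernel of order 2; its U statistic is the unbiased sample variance.
    return 0.5 * (x - y) ** 2


# Hypothetical data, chosen only for illustration.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
u_value = u_statistic(sample, variance_kernel, 2)

# The two numbers should agree up to floating-point rounding.
print(u_value, variance(sample))
```

Brute-force enumeration of subsets is only practical for small samples, but it mirrors the definition directly.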

(V) Counting processes, compensators, martingales, and statistics defined in terms of stochastic integrals with respect to compensated counting process martingales. References from several books on martingales (e.g. Bremaud 1981, "Point Processes and Queues, Martingale Dynamics") and Survival Analysis, such as Fleming, T. and Harrington, D. (1991), "Counting Processes and Survival Analysis", plus my own notes.



Spring 2011 Lectures and Reading Assignments

The first lecture will be an overview lecture on the interplay between probabilistic limit theorems and statistical large-sample theory, sketching the kinds of results we will cover in the course.

The second lecture, going on for the next couple of weeks, will motivate the study of uniform limit theorems by considering the large-sample consistency and asymptotic normality of ML and estimating equation estimators. The reading is Chapter 5 of the van der Vaart text, pp. 41-59. From there, we will branch to Chapter 19 and introduce just enough empirical process theory to complete Theorem 5.23 via Lemma 19.31.

Homework problems for Spring 2011 are posted below. You can see older homework problem sets and some problem solutions on the Old-homework web-page for Fall 2007 (from the van der Vaart text) and, for Fall 2002 from a different book, in the OldHW directory.


Homework 1, due Wednesday, Feb. 9, 2011: in Van der Vaart, Chapter 5,
         do problems numbered   7, 8, 13, 19, 24, 25.   (Solutions not given.)

Homework 2, due Friday, March 4, 2011: in Chapter 19 of the van der Vaart
text, do #19.3, 19.4, 19.5, 19.6, 19.7, and 19.10.
Notes. Problem 19.3 involves only checking equality of covariances and
invoking an appropriate Theorem to imply that a unique set of finite-
dimensional distributions determines a unique stochastic process law.
In problem 19.4, the meaning of the notations Fm, Gn is different from
the empirical-process usage of the chapter: here they are "empirical
distribution functions". That is, Fm(t) is the proportion of
observations X1, ..., Xm less than or equal to t, and Gn(t) is the
proportion of observations Y1, ..., Yn less than or equal to t.
Problems 19.4(c) and 19.10 are exercises in formulating limiting
probabilities using empirical process convergence plus the continuous
mapping theorem. Problem 19.5 is about bracketing and is fairly
straightforward. 19.6 and 19.7 give some practice in estimating the
VC numbers used to measure the size of function classes used in
proving GC and Donsker properties.   Solutions.
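
The definition of the empirical distribution functions Fm and Gn given in the notes above can be sketched in a few lines of Python. The helper name empirical_cdf and the two small samples are hypothetical, chosen only to illustrate "proportion of observations less than or equal to t".

```python
from bisect import bisect_right


def empirical_cdf(observations):
    """Return the function t -> proportion of observations <= t."""
    ordered = sorted(observations)
    n = len(ordered)

    def cdf(t):
        # bisect_right counts the entries of the sorted list that are <= t.
        return bisect_right(ordered, t) / n

    return cdf


# Hypothetical samples standing in for X1, ..., Xm and Y1, ..., Yn.
x_sample = [0.3, 1.2, 2.5, 2.5, 4.0]
y_sample = [0.9, 1.1, 3.3, 5.0]

F_m = empirical_cdf(x_sample)
G_n = empirical_cdf(y_sample)

print(F_m(2.5), G_n(2.5))
```

Here four of the five X values and two of the four Y values are at most 2.5, so the two printed proportions are 0.8 and 0.5.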


Homework Set #3, due Monday March 28: Chap. 6, p.91: # 1, 2, 3, 4, 6.

Additional Problem: Give a general sufficient condition (on f, a density
with mean 0) using Le Cam's Third Lemma for the probability laws Qn
corresponding to n-tuples X1, ..., Xn of iid scalar random variables
with density f(x - c/n^{1/2}) to be contiguous (for each fixed c)
with respect to the probability laws Pn for the same random n-tuples
with density f(x). Use this to derive and justify the power function
for a large-sample one-sided test which has significance level 1-α
based on these data, of the null hypothesis EX1 <= 0. Solutions.

Homework Set #4, due Friday April 15: Ch.11 # 2, 7;   and  
           Ch.12 #1, 2, 6, 9, 10.

Homework Set #5, due Friday April 29. Ch.8 #3,   Ch.12 #3, 5, 8,
           and   Ch. 14 #5.

NOTE: There will be one more HW due May 11, on the martingale and
counting-process material.

Homework Set #6, due Wednesday May 11.

(I). In the Exponential Frailty Example 5.26, find the asymptotic
relative efficiency of the estimating equation estimator given by
the author, versus the MLE and versus the martingale estimating equation
estimator for θ , under the (restrictive) hypothesis that all frailties
zi are equal to (an unknown) constant μ.

(II). Problems #1, p.4; #2, p.8; and #3, p.11, in the Martingale & Counting
Process handout chapter.



NOTES

(1). A very useful general lemma on uniform convergence of random functions Mn(θ)
defined in terms of data (and which will be maximized to estimate θ) is
given in Appendix II (p.1116) of

P. K. Andersen and R. D. Gill (1982), Cox's Regression Model for Counting Processes:
A Large Sample Study, Annals of Statistics 10, No. 4 (Dec., 1982), pp. 1100-1120,

which can be found in JSTOR. The Lemma and proof are restricted to a single page,
and can be found here.

(2). A really nice article by Peter Bickel along the lines of our semiparametric
efficiency discussion is "On Adaptive Estimation", the 1980 Wald Memorial Lectures
published in Annals of Statistics (1982) 10, 647-671. The Stable URL is
http://links.jstor.org/sici?sici=0090-5364%28198209%2910%3A3%3C647%3AOAE%3E2.0.CO%3B2-1 .

(3). A set of Chapters I wrote on Martingale Methods in Statistics can be found
here as reading material for the last segment of the course.

(4). To see statistical applications and developments of the ideas we studied
under the heading of asymptotic relative efficiencies and contiguous alternatives,
you may be interested in a paper I wrote with Sudip Bose of GWU, on combining tests
to be simultaneously powerful against several designated alternative directions.



Important Dates

  • First Class: Monday, January 24, 2011.
  • Class Cancelled:   Friday, Feb. 18, 2011.
  • Last class: Monday, May 9, 2011.


    Last updated: May 4, 2011.