Stat 700, Fall '14: PROBLEM ASSIGNMENTS

All section and problem numbers are as in Casella and Berger, Statistical Inference, 2nd edition. Problem sets listed here for later in the course may change by the time they are assigned.

Problem Set 1, due Monday, Sept. 15, in class.

(1) Prove that the logistic density is symmetric about 0 [see exercise #2.26 for definitions and properties of symmetric densities], show that its mean is 0, and evaluate its variance (as given on p. 624 for mu=0, beta=1) either by integration or by finding a valid series expansion for it.

(2) Suppose that the distribution function of a random variable X is given by:
    F(t) = 0                    if t < 0
         = 1/4 + t/12           if 0 <= t < 1
         = 1/3 + (t-1)^2/3      if 1 <= t < 2
         = 1                    if t >= 2
Find the mean, variance, and moment generating function of X.

(3) Verify in two ways --- both analytically and by verifying the conditions of Theorem 2.4.3 or 2.4.8 in Casella & Berger --- that dM(t; lambda)/dt = E( X exp(t X) ), where the r.v. X ~ Poisson(lambda) and M(t) = M(t; lambda) = E(exp(tX)) is its moment generating function.

(4) Prove that the first term on the right-hand side of the last displayed equation on p. 124 (3rd line from the end of the proof of Lemma 3.6.5) is 0, under the conditions of the Lemma.

(5) Prove [the assertion in Casella-Berger problem #2.7(b)] that Theorem 2.1.8 continues to hold if the sets A_0, A_1, ..., A_k partition a set LARGER than the set (2.1.7) of positivity of the density f_X. (This means that the disjoint sets A_i are allowed to contain points x where f_X(x) = 0.) Use this fact (or other results in Chapter 2 of Casella and Berger) to find the density of Y = (X-2)^2, where X is a random variable uniformly distributed on [0,3].

(6) Problems #3.20 and 3.34.

(7) Suppose that a random variable T has the survival function S(t) = 1 - F(t) = exp(-b t^a) for t > 0, for positive constants a, b. Show that for each fixed value of a > 0, this family of probability distributions parameterized by b > 0 is a SCALE FAMILY, and also that the distribution of the random variable log(T) is a LOCATION-SCALE family with respect to the parameters (a, b).

--------------------------

Problem Set 2, due Monday, Sept. 29, in class.

(1) (a) Give an algorithm to simulate on the computer a random variable whose distribution function is given by the F(t) in Problem (2) of HW Set 1. (b) Give an algorithm to simulate on the computer a r.v. whose density is 2 exp(-x) (1 - exp(-x)) for x > 0 (and 0 for x <= 0).
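Aside (an optional illustration, not part of the assignment): below is a minimal, grid-based Python sketch of the inverse-CDF idea that part (a) asks about. The function names, grid, and sample sizes are arbitrary choices of mine, and a full solution would invert F analytically piece by piece (including its jumps) rather than numerically on a grid.

    import numpy as np

    def sample_from_cdf(cdf, grid, n, rng=None):
        # Approximate inverse-CDF sampling: evaluate F on a fine increasing grid
        # covering the support and return, for each U ~ Uniform[0,1], the left-most
        # grid point t with F(t) >= U (the generalized inverse).  Jumps of F are
        # handled automatically: a whole interval of U-values maps to (a grid point
        # just above) the jump location.
        rng = np.random.default_rng() if rng is None else rng
        u = rng.uniform(size=n)
        F = cdf(grid)
        idx = np.searchsorted(F, u, side="left")
        return grid[np.minimum(idx, len(grid) - 1)]

    # Illustrative use with the distribution function from Problem (2) of HW Set 1:
    def F(t):
        t = np.asarray(t, dtype=float)
        return np.select([t < 0, t < 1, t < 2],
                         [0.0, 1/4 + t/12, 1/3 + (t - 1)**2 / 3], 1.0)

    draws = sample_from_cdf(F, np.linspace(-0.5, 2.5, 100_001), 10_000)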
(2) Find the moment generating function and variance of a random variable Z defined as the sum of a random number M of independent identically distributed N(1,4) random variables X_1, X_2, ..., where M is a Poisson(lambda) random variable independent of the set of X_j's.

(3) Find the distribution function and variance of a random variable Y defined as follows. Let V be a Bernoulli(p) random variable, W a Geometric(0.1) random variable (with values 1,2,...), and X an Exponential r.v. with mean 0.5; assume that V, W, and X are jointly independent, and define Y = V W + (1-V) X.

(4) Suppose that (X,Y) is a bivariate-normal random vector with E(X)=0, E(Y)=2, Var(X)=1, Var(Y)=4, Cov(X,Y)=1. Find E( (X+Y)^4 | Y ) and Var( X+2Y | Y ).

(5) [Counts as 2 problems] The probability Q of Heads in a coin-toss experiment is determined by selecting a coin from a population of biased coins in such a way that Q ~ Beta(4,4). Then, given fixed Q, that coin is tossed independently and repeatedly until the second time that Heads occurs. Let M be the number of tosses this takes (which means that among the first M-1 tosses, exactly 1 Head has appeared, and on the M'th toss a Head also occurs).
(a) Find the unconditional distribution of M.
(b) Suppose you do the coin-flipping experiment with the same coin (and therefore the same Q) 10 times, recording numbers M_1,...,M_10, each the number of tosses needed until the occurrence of the 2nd Head. Find the probability distribution of L_1 = M_1+...+M_10, and find its variance.
(c) Suppose that in (b) the separate variables M_1,...,M_10 are obtained by tossing different independently selected coins with respective Heads probabilities Q_1,...,Q_10 ~ Beta(4,4). Compare the variance of the total L_2 = M_1+...+M_10 in this series of experiments with the variance you found in part (b). Which one is larger? Why? Would this fact still be true if the Beta(4,4) distribution for Heads-probabilities of individual coins were replaced by some other Beta(alpha,beta) distribution?

(6) If Z_1, Z_2, Z_3, and Z_4 are independent identically distributed standard-normal random variables, and Z denotes the column vector with these four random variables as components, then find the density of the random variable
    (Z_1+Z_3)^2 + (Z_2-Z_4)^2 + (Z_1-Z_3+Z_2+Z_4)^2 + (Z_1-Z_3-Z_2-Z_4)^2 .
[The answer in this problem can be found in closed form.]

---------------------------

Problem Set 3, due Monday, Oct. 6, in class.

(1) Let (Z_1, Z_2, Z_3) be iid standard normal random variables, and let Y_1 = max(Z_1-Z_2, Z_1+Z_3), Y_2 = min(Z_2,Z_3). Find the joint density of (Y_1, Y_2).

(2) It was shown in class by an argument involving the "memoryless property" that if X_1, X_2, ..., X_k are independent identically distributed Expon(lambda) random variables, with order statistics denoted X_(1), X_(2), ..., X_(k), then X_(1), X_(2)-X_(1), X_(3)-X_(2), ..., X_(k)-X_(k-1) are independent random variables. Prove this fact directly, by regarding this last random vector as an invertible transformation of the order-statistics vector.

(3) (a) Find the density of X = Z_1/Z_2, where Z_1 and Z_2 are independent standard normal random variables. (b) Show that E(|X|) is infinite, but that E(sqrt(|X|)) is finite. (c) Show from the definition that if X_1,...,X_n are independent and identically distributed with the same density as X, then Xbar = (X_1+...+X_n)/n also has exactly the same distribution as X.

(4) [Counts as 2 problems] Consider the noncentral chi-squared random variable
    Y = (Z_1+a_1)^2 + (Z_2+a_2)^2 + ... + (Z_p+a_p)^2 ,
where a_1, a_2, ..., a_p are any real numbers and Z_1,...,Z_p are iid standard normal, for any positive integer p.
(a) Show that the distribution of Y depends on the numbers a_1,...,a_p only through the noncentrality parameter lambda = (a_1^2+...+a_p^2)/2. HINT: to do this, expand the squares in the definition of Y and show that there is an orthogonal linear transformation from (Z_1,...,Z_p) to (W_1,...,W_p) such that W_1 = (a_1*Z_1+...+a_p*Z_p)/sqrt(2*lambda).
(b) So by (a) there is no loss of generality in assuming that a_2 = ... = a_p = 0 in part (a). Show that the moment generating function of the random variable Y is exp(-lambda) * (1-2t)^(-p/2) * exp(lambda/(1-2t)).
(c) By expanding the final exponential in the mgf of part (b), show that this moment generating function is exactly the same as the mgf of the mixture density given in formula (4.4.3) of Casella and Berger, which is the mixture with weights exp(-lambda) lambda^k / k! of chi-square(p+2k) densities over nonnegative integers k. (Note: justify your use of series expansion inside an integral!)
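Aside (an optional illustration, not part of the assignment): a rough Monte-Carlo sanity check, in Python, of the mgf formula stated in problem (4)(b). The dimension p, the constants a_i, the sample size, and the grid of t-values below are arbitrary illustrative choices of mine.

    import numpy as np

    rng = np.random.default_rng(0)
    p, a = 3, np.array([1.0, -0.5, 0.5])          # arbitrary illustrative constants
    lam = np.sum(a**2) / 2.0                      # noncentrality parameter lambda
    Z = rng.standard_normal((200_000, p))
    Y = np.sum((Z + a)**2, axis=1)                # noncentral chi-squared draws
    for t in (0.1, 0.2, 0.3):                     # mgf exists for t < 1/2
        mc = np.mean(np.exp(t * Y))               # Monte-Carlo estimate of E exp(tY)
        formula = np.exp(-lam) * (1 - 2*t)**(-p/2) * np.exp(lam / (1 - 2*t))
        print(t, mc, formula)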
---------

Problem Set 4, due Friday, Oct. 24 (under my door). NOTE THE CHANGED DUE-DATE and also the removal of the INCOMPLETENESS assertion in 4.(c).

(1) Give two different algorithms to generate (as a function of a sequence of iid Uniform[0,1] random variable values U_1, U_2, ...) a random vector (V,W) which is uniformly distributed on the triangle {(v,w): 0 0 while mu could be any real number.
(a) Find the sufficient statistic vector T obtained by representing this density as an exponential family, and explain how you know its components are sufficient.
(b) Show that your sufficient statistic vector is minimal.
(c) Show that when lambda=1 is known, a sub-vector S of your sufficient statistic vector T is minimal sufficient for mu.

(5) [Counts as 2 problems] Use Basu's Theorem to show that:
(a) When X_1,...,X_n is a sample from N(0, b^2), the vector (sign(X_1),...,sign(X_n)) is independent of X_1^2+...+X_n^2.
(b) When X_1,...,X_n is a continuously distributed sample (of scalar r.v.'s), the rank vector (R_1,...,R_n), with $R_k = \sum_{i=1}^n I[X_i \le X_k]$, is independent of the order-statistic vector.
(c) When X_1,...,X_n is a sample from Gamma(alpha,lambda), the vector (X_1,...,X_n)/(X_1+...+X_n) is independent of X_1+...+X_n.
(d) When X_1,...,X_n is a sample from Uniform[0,L], (X_(1),...,X_(n-1))/X_(n) is independent of X_(n).
IN EACH CASE in (5), THE METHOD IS TO SHOW THAT ONE STATISTIC IS SUFFICIENT FOR AN UNKNOWN PARAMETER (perhaps with other parameter(s) fixed) AND THE OTHER IS ANCILLARY. In part (b) the parameter is the unknown distribution function.

---------------

Problem Set 5, due Friday, Nov. 7 (under my door). All problems in this HW set count as 2 (so total points are 80).

(1) (Compare #6.12.) Suppose that we perform a comparison of the lifetimes of a certain kind of device under two operating conditions A and B, as follows. For each of 100 independently and identically produced specimens of this device, a biased coin is flipped independently, with probability 3/5 of Heads and 2/5 of Tails. Each device for which the coin falls Heads is operated under condition A, and the others under condition B. Let e_j for j=1,...,100 be 1 if the j'th coin falls Heads, 0 otherwise. Assume that the device lifetimes are independent and Expon(lambda_A) and Expon(lambda_B) respectively under conditions A and B. Each one of the 100 devices, j=1,...,100, is operated to destruction and its lifetime T_j noted. Let n_A be the number of coins falling Heads, n_B = 100 - n_A, and let S_A = sum_{j=1}^{100} e_j T_j and S_B = sum_{j=1}^{100} (1-e_j) T_j be the respective total lifetimes of all the condition-A and condition-B devices.
(a) Prove that (S_A, S_B, n_A) is minimal sufficient for theta = (lambda_A, lambda_B), and that n_A is ancillary for theta.
(b) Conditionally given 0 < n_A < 100, find the variance-covariance matrix of the (conditionally) unbiased estimator (S_A/n_A, S_B/n_B) of theta. NOTE that the conditionality principle says we should perform inferences about theta conditionally given n_A, which is completely reasonable in this setting.

(2) (a) Give two completely different methods [HINT: mixture, and conditional probability integral theorem] for simulating, as a function of i.i.d. Uniform[0,1] random variables U_1, U_2, ..., a random vector (X,Y) with joint density f_{X,Y}(x,y) = 2 ( x y + (1-x)(1-y) ) I[0 0, v^2+w^2<1], but justify that your function of U_1, U_2, etc. has the stated joint density.
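Aside (an optional illustration, not part of the assignment): whatever exact method you give in problem (2), one generic way to produce or check uniform draws on a bounded planar region is rejection sampling from a bounding box, sketched below in Python. The function name, the bounding box, and the half-disc used in the example call are my own illustrative choices, not necessarily the exact region in the problem statement.

    import numpy as np

    def uniform_on_region(indicator, box, n, rng=None):
        # Rejection sampling: propose points uniformly on the bounding box
        # box = (vlo, vhi, wlo, whi) and keep those for which indicator(v, w) is
        # True; the accepted points are uniformly distributed on the region.
        rng = np.random.default_rng() if rng is None else rng
        vlo, vhi, wlo, whi = box
        pts = []
        while len(pts) < n:
            v = rng.uniform(vlo, vhi, size=n)
            w = rng.uniform(wlo, whi, size=n)
            keep = indicator(v, w)
            pts.extend(zip(v[keep], w[keep]))
        return np.array(pts[:n])

    # Example: uniform draws on the half-disc {(v,w): v > 0, v^2 + w^2 < 1}.
    pts = uniform_on_region(lambda v, w: (v > 0) & (v**2 + w**2 < 1),
                            (0.0, 1.0, -1.0, 1.0), 1000)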
(3) Suppose that for i=1,...,n, the iid discrete random variables X_i take values 0, 1, and 2 with respective probabilities 1-alpha-beta, alpha, and beta, where alpha and beta are unknown positive parameters with alpha + beta < 1. Based on data X_1,...,X_n:
(a) Find the method-of-moments estimators of alpha, beta.
(b) Find the MLE of theta = (alpha, beta).
(c) Find the Bayes (posterior-mean) estimator of (alpha,beta), using the prior g(alpha,beta) = 2 I[alpha, beta > 0, alpha+beta < 1].
HINT: your calculations will be easier in (b) and (c) if you first establish that N_1 = number of X_i's equal to 1 and N_2 = number of X_i's equal to 2 are sufficient statistics for (alpha, beta). Also: your calculation in (c) will be related to the Dirichlet distribution; look it up on Wikipedia if you do not want to derive the formulas yourself based on what you know about Beta densities.

(4) Find formulas for the mean-squared errors (MSEs), as functions of alpha and beta, for each of the three estimators you found in problem (3). Compare these MSEs as well as you can: does any of these estimators dominate the others? Is any uniformly worse than the others? You may base your answer on numerical comparisons for particular choices of alpha, beta or on general theorems stated in class.

---------------

Problem Set 6, due Monday, Nov. 24.

(1) (a) Find the UMVUE for lambda^2 based on a sample X_1,...,X_n of Poisson(lambda) data. (b) Find the Cramer-Rao lower bound for an unbiased estimator of lambda^2 based on the same sample. Is the lower bound achieved by the UMVUE? Explain why or why not. (c) Find the variance of the UMVUE.

(2) Suppose that V_1,...,V_n are a sample from the density (2v/theta^2) I[0 0 is given for the unknown parameter lambda of a Poisson(lambda) distributed data sample Y = (Y_1,..., Y_n), where gamma > 0 is known. Based on this sample and this prior, find the estimator tilde{lambda} = tilde{lambda}(Y) that minimizes the expectation over lambda and Y of the loss L(lambda, tilde{lambda}(Y)), where L(lambda, a) = (lambda - a)^2/lambda for a, lambda > 0.

---------------

Problem Set 7, due Wednesday, Dec. 10, in class, or Friday, Dec. 12, at the 4pm review session.

(1) An observer records a single data-value X in (0, 1), where X has one of two possible densities according to the value theta = 1 or 2, with f(x,theta) = 2(1-x) I[0 theta], and restricts his attention to the class of nonrandomized decision rules T_s(x) indexed by s in (0, 1) which are defined by T_s(x) = 1 + I[x>s]. Show that in this statistical decision problem, there is a unique minimax rule (find it and justify it!). Also show that every decision rule of the form tau_t(x) = 1+I[x1]-1)), 0 < w < 2, where -1 < theta < 1 is an unknown parameter.
(a) Find a UMP test of size exactly alpha=.05 based on W_1,...,W_30 (n=30) for the null hypothesis H_0: theta=0 versus the alternative H_1: theta = 1/2. (HINT: why is auxiliary randomization necessary in this problem?)
(b) Find the power against theta=0.5 of the test you find in part (a).

(5) Consider the distributional setting of problem (4), still with n=30 and the same null hypothesis H_0 as in problem (4), but now let the alternative hypothesis be H_2: |theta| = 0.5.
(a) Is there a UMP test of H_0 versus H_2 of size alpha=.05? Give details showing why or why not.
(b) Find the Likelihood Ratio Test of size alpha=.05 for testing H_0 versus H_2, and find its power against theta = 0.5.
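Aside (an optional illustration, not part of the assignment): once you have an explicit rejection rule, its size and power (asked for in problems (4)(b) and (5)(b)) can always be double-checked by simulation. The Python sketch below illustrates this simulate-and-count idea on a made-up test (a one-sided normal-mean test with n=30), not on the W_i model of the problems.

    import numpy as np

    # Generic Monte-Carlo check of a test's size and power: simulate data under a
    # given parameter value and record how often the test rejects.  The test used
    # here (reject H_0: mu=0 when the mean of n=30 N(mu,1) observations exceeds
    # 1.645/sqrt(30)) is an illustrative stand-in only.
    rng = np.random.default_rng(1)
    n, reps, cutoff = 30, 100_000, 1.645 / np.sqrt(30)

    def rejection_rate(mu):
        xbar = rng.normal(mu, 1.0, size=(reps, n)).mean(axis=1)
        return np.mean(xbar > cutoff)

    print("size :", rejection_rate(0.0))    # should be close to 0.05
    print("power:", rejection_rate(0.5))    # estimated power against mu = 0.5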
(6) Let Z_1,...,Z_n be a N(mu,1) sample, and define H_0: mu=0 and H_2: |mu| > 0. Suppose that your prior probability distribution for mu is a mixed-type distribution with a point mass of 0.4 at mu=0 and a continuous density component equal to 0.6 times a Normal(0, tau) density. Define the loss function by L(mu, a) = 3 * I[mu=0 and a=1] + mu^2 * I[|mu| >= 1 and a=0], where the possible actions in this testing problem are a=0,1, with a=1 interpreted as rejection of the null hypothesis.
(a) For n=10, tau=1, find the Bayes-optimal testing procedure, and find its size and power.
(b) What is the posterior conditional probability that your test in (a) rejects, given that |mu| = 1.5?
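Aside (an optional illustration, not part of the assignment): one ingredient of any Bayes test in problem (6) is the posterior probability of the point null given the data, which depends on the sample only through Zbar. The minimal Python sketch below shows that single computation; it treats tau as the prior VARIANCE of the continuous component (the problem's "Normal(0, tau)" could also mean standard deviation), and the sample-mean value in the example call is made up.

    import numpy as np

    def normal_pdf(x, var):
        return np.exp(-x**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

    def post_prob_null(zbar, n, tau=1.0, prior_null=0.4):
        # Marginal density of the sufficient statistic Zbar under each prior
        # component: Zbar ~ N(0, 1/n) when mu = 0, and Zbar ~ N(0, tau + 1/n)
        # when mu is drawn from the N(0, tau) continuous component.
        m0 = normal_pdf(zbar, 1.0 / n)
        m1 = normal_pdf(zbar, tau + 1.0 / n)
        return prior_null * m0 / (prior_null * m0 + (1 - prior_null) * m1)

    print(post_prob_null(zbar=0.3, n=10))   # illustrative made-up sample mean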