Stat 700 Fall '14
PROBLEM ASSIGNMENTS
All section and problem numbers are as in Casella and Berger,
Statistical Inference, 2nd edition. Problem Sets listed here
for later in the course may change by the time they are assigned.
Problem Set 1, due Monday, Sept. 15 in class.
(1) Prove that the logistic density is symmetric about 0 [see exercise
#2.26 for definitions and properties of symmetric densities], show that
its mean is 0, and evaluate its variance (as given on p. 624 for mu=0, beta=1)
either by direct integration or by finding a valid series expansion for it.
(2) Suppose that the distribution function of a random variable X
is given by:
F(t) = 0 if t < 0
= 1/4 + t/12 if 0 <= t < 1
= 1/3 + (t-1)^2/3 if 1 <= t < 2
= 1 if t >= 2
Find the mean, variance, and moment generating function of X.
(3) Verify in two ways --- analytically, and by checking the
conditions of Theorem 2.4.3 or 2.4.8 in Casella & Berger --- that
dM(t; lambda)/dt = E( X exp(t X) )
where the r.v. X ~ Poisson(lambda) and M(t) = M(t; lambda) = E(exp(tX))
is its moment generating function.
(4) Prove that the first term on the right-hand side of the last
displayed equation on p.124 (3rd line from the end of the proof
of Lemma 3.6.5) is 0, under the conditions of the Lemma.
(5) Prove [the assertion in Casella-Berger problem #2.7(b)] that
Theorem 2.1.8 continues to hold if the sets A_0, A_1, ... A_k partition
a set LARGER than the set (2.1.7) of positivity of the density f_X.
(This means that the disjoint sets A_i are allowed to contain points x where
where f_X(x) = 0.) Use this fact (or other results in Chapter 2 of
Casella and Berger) to find the density of Y = (X-2)^2 where X is a random
variable which is uniformly distributed on [0,3].
(6) Problems # 3.20 and 3.34.
(7) Suppose that a random variable T has the survival function
S(t) = 1-F(t) = exp(- b t^a) for t > 0, for positive constants a,b.
Show that for each fixed value of a > 0, this family of probability
distributions parameterized by b > 0 is a SCALE FAMILY, and also that the
distribution of the random variable log(T) is a LOCATION-SCALE family with
respect to the parameters (a,b).
--------------------------
Problem Set 2, due Monday, Sept. 29, in class.
(1) (a) Give an algorithm to simulate on the computer a random variable
whose distribution function is given by F(t) in Problem (2) of HW Set 1.
(b) Give an algorithm to simulate on the computer a r.v. whose density is
f(x) = 2 exp(-x) (1-exp(-x)) for x > 0 (and = 0 for x <= 0).
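OPTIONAL ILLUSTRATION: one possible Python sketch of the inversion
(quantile-transform) method for both parts; the function names, and the
closed-form inverse x = -log(1-sqrt(u)) obtained from the CDF (1-exp(-x))^2
in part (b), are my own illustrative choices.

  import numpy as np

  rng = np.random.default_rng()

  def sim_F(n):
      # Generalized inverse of the mixed-type F(t) from HW Set 1, Problem (2):
      # atom of mass 1/4 at t=0, linear piece on [0,1), quadratic piece on
      # [1,2), atom of mass 1/3 at t=2.
      u = rng.uniform(size=n)
      t = np.zeros(n)                       # u < 1/4  ->  atom at t = 0
      mid1 = (u >= 1/4) & (u < 1/3)
      t[mid1] = 12*(u[mid1] - 1/4)          # invert F(t) = 1/4 + t/12
      mid2 = (u >= 1/3) & (u < 2/3)
      t[mid2] = 1 + np.sqrt(3*u[mid2] - 1)  # invert F(t) = 1/3 + (t-1)^2/3
      t[u >= 2/3] = 2.0                     # atom at t = 2
      return t

  def sim_density_b(n):
      # The density 2 exp(-x)(1-exp(-x)) on x > 0 has CDF (1-exp(-x))^2,
      # so inversion gives x = -log(1 - sqrt(u)).
      u = rng.uniform(size=n)
      return -np.log(1 - np.sqrt(u))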
(2) Find the moment generating function and variance of a random variable Z
defined as the sum of a random number M of independent identically distributed
N(1,4) random variables X_1, X_2, ... where M is a Poisson(lambda) random
variable independent of the set of X_j's.
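OPTIONAL ILLUSTRATION: a Monte Carlo sanity check for your closed-form
answers; the value lambda = 2 below is arbitrary.

  import numpy as np

  rng = np.random.default_rng()
  lam = 2.0                                # arbitrary illustrative value
  m = rng.poisson(lam, size=100_000)       # the random number of summands M
  z = np.array([rng.normal(1.0, 2.0, size=k).sum() for k in m])  # N(1,4): sd = 2
  print(z.mean(), z.var())                 # compare with your formulas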
(3) Find the distribution function and variance of a random variable Y defined
as follows. Let V be a Bernoulli(p) random variable, W be a Geometric(0.1)
random variable (with values 1,2,...), and X an Exponential r.v. with mean 0.5,
and assume that V, W, and X are jointly independent, and that Y is defined by
Y = V W + (1-V) X.
(4) Suppose that (X,Y) is a bivariate-normal random vector with E(X)=0, E(Y)=2,
Var(X) = 1, Var(Y)=4, Cov(X,Y)=1. Find E( (X+Y)^4 | Y ) and Var( X+2Y | Y ).
(5) [Counts as 2 problems] The probability Q of Heads in a coin-toss experiment is
determined by selecting a coin from a population of biased coins in such a way that
Q ~ Beta(4,4). Then, given fixed Q, that coin is tossed independently and repeatedly
until the second time that Heads occurs. Let M be the number of tosses this takes
(which means that among the first M-1 tosses, exactly 1 Head has appeared, and on
the M'th toss a Head also occurs).
(a) Find the unconditional distribution of M.
(b) Suppose you do the coin-flipping experiment with the same coin (and
therefore the same Q) 10 times, recording numbers M_1,...,M_10, each the number
of tosses needed until the occurrence of the 2nd head. Find the probability
distribution of L_1 = M_1+...+M_10, and find its variance.
(c) Suppose that in (b) the separate variables M_1,...,M_10 are obtained
by tossing different independently selected coins with respective Heads
probabilities Q_1,...,Q_10 ~ Beta(4,4). Compare the variance of the total
L_2 = M_1+...+M_10 in this series of experiments with the variance you found
in part (b). Which one is larger ? Why ? Would this fact still be true if the
Beta(4,4) distribution for Heads-probabilities of individual coins were replaced
by some other Beta(alpha,beta) distribution ?
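OPTIONAL ILLUSTRATION: the variance comparison in (b)-(c) can be checked by
simulation; below, each M is generated as 2 plus a negative-binomial count of
Tails before the 2nd Head, which matches the definition of M above.

  import numpy as np

  rng = np.random.default_rng()
  reps = 200_000

  # (b): one coin, the same Q shared by all 10 experiments
  q_shared = rng.beta(4, 4, size=(reps, 1))
  L1 = (2 + rng.negative_binomial(2, q_shared, size=(reps, 10))).sum(axis=1)

  # (c): an independently selected coin Q_i for each of the 10 experiments
  q_fresh = rng.beta(4, 4, size=(reps, 10))
  L2 = (2 + rng.negative_binomial(2, q_fresh, size=(reps, 10))).sum(axis=1)

  print(L1.var(), L2.var())    # empirical variances of L_1 and L_2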
(6) If Z_1, Z_2, Z_3 and Z_4 are independent identically distributed standard-normal
random variables, and Z denotes the column vector with these four random variables
as components, then find the density of the random variable
(Z_1+Z_3)^2 + (Z_2-Z_4)^2 + (Z_1-Z_3+Z_2+Z_4)^2 + (Z_1-Z_3-Z_2-Z_4)^2 .
[The answer in this problem can be found in closed form.]
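OPTIONAL ILLUSTRATION: a quick Monte Carlo check of whatever closed form you
derive; compare a histogram of y with your density, or the printed moments
with those of your answer.

  import numpy as np

  rng = np.random.default_rng()
  z1, z2, z3, z4 = rng.standard_normal(size=(4, 500_000))
  y = (z1+z3)**2 + (z2-z4)**2 + (z1-z3+z2+z4)**2 + (z1-z3-z2-z4)**2
  print(y.mean(), y.var())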
---------------------------
Problem Set 3, due Monday Oct. 6 in class.
(1) Let (Z_1, Z_2, Z_3) be iid standard normal random variables, and
Y_1 = max(Z_1-Z_2, Z_1+Z_3), Y_2 = min(Z_2,Z_3). Find the joint density of
Y_1, Y_2.
(2) It was shown in class by an argument involving the "memoryless property"
that if X_1, X_2, ..., X_k are independent identically distributed Expon(lambda)
random variables, with order statistics denoted X_(1), X_(2),... X_(k) , then
X_(1), X_(2)-X_(1), X_(3)-X_(2), ... , X_(k)-X_(k-1) are independent random
variables. Prove this fact directly, by regarding this last random vector as
an invertible transformation of the order-statistics vector.
(3) (a) Find the density of X = Z_1/Z_2 where Z_1 and Z_2 are
independent standard normal random variables.
(b) Show that E(|X|) is infinite, but that E(sqrt(|X|)) is finite.
(c) Show from the definition that if X_1,...,X_n are independent and
identically distributed with the same density as X, then Xbar = (X_1+...+ X_n)/n
also has exactly the same distribution as X.
(4) [Counts as 2 problems] Consider the noncentral chi-squared random variable
Y = (Z_1+a_1)^2 + (Z_2+a_2)^2 + ... + (Z_p+a_p)^2 , where a_1,a_2,...,a_p
are any real numbers, and Z_1,...,Z_p are iid standard normal for any positive
integer p.
(a) Show that the distribution of Y depends on the numbers a_1,...,a_p
only through the noncentrality parameter lambda = (a_1^2+...+a_p^2)/2.
HINT: to do this, expand the squares in the definition of Y and show that there
is a linear orthogonal transformation from (Z_1,...,Z_p) to (W_1,...,W_p) such
that W_1 = (a_1*Z_1+...+a_p*Z_p)/sqrt(2*lambda).
(b) So by (a) there is no loss of generality in assuming that a_2=...=a_p=0 in
part (a). Show that the moment generating function of the random variable Y is
exp(-lambda) * (1-2t)^(-p/2) * exp(lambda/(1-2t)).
(c) By expanding the final exponential in the mgf of part (b), show that this
moment generating function is exactly the same as the mgf of the mixture density
given in formula (4.4.3) of Casella and Berger, which is the mixture with weights
exp(-lambda) lambda^k / k! of chi-square(p+2k) densities over nonnegative
integers k . (Note: justify your use of series expansion inside an integral !)
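OPTIONAL ILLUSTRATION: the mgf in (b) is easy to sanity-check numerically,
since by (a) one may take a_1 = sqrt(2*lambda) and all other a_j = 0; the
values lambda = 1.5, p = 3, t = 0.1 below are arbitrary.

  import numpy as np

  rng = np.random.default_rng()
  lam, p, t = 1.5, 3, 0.1          # any t < 1/2 works
  z = rng.standard_normal(size=(400_000, p))
  z[:, 0] += np.sqrt(2*lam)        # noncentrality lambda = a_1^2 / 2
  y = (z**2).sum(axis=1)
  print(np.exp(t*y).mean())        # Monte Carlo estimate of E exp(tY)
  print(np.exp(-lam) * (1-2*t)**(-p/2) * np.exp(lam/(1-2*t)))  # formula in (b)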
---------
Problem Set 4, due Friday Oct. 24 (under my door).
NOTE CHANGED DUE-DATE and also removal of INCOMPLETENESS assertion in 4.(c).
(1) Give two different algorithms to generate (as a function of a sequence of
iid Uniform[0,1] random variable values U_1, U_2, ...) a random vector (V,W)
which is uniformly distributed on a triangle in the plane.
(4) Suppose that X_1,...,X_n is a sample from a two-parameter density indexed
by (mu, lambda), where lambda > 0 while mu could be any real number.
(a) Find the sufficient statistic vector T obtained by representing this density
as an exponential family, and explain how you know they are sufficient.
(b) Show that your sufficient statistic vector is minimal.
(c) Show that when lambda=1 is known, a sub-vector S of your sufficient statistic
vector T is minimal sufficient for mu.
(5) [Counts as 2 problems] Use Basu's Theorem to show that:
(a) When X_1,...,X_n is a sample from N(0, b^2), the vector (sign(X_1),...,
sign(X_n)) is independent of X_1^2+...+X_n^2.
(b) When X_1,...,X_n is a continuously distributed sample (of scalar r.v.'s),
the rank-vector (R_1,...,R_n) with R_k = sum_{i=1}^n I[X_i <= X_k] is independent
of the order-statistic vector.
(c) When X_1,...,X_n is a sample from Gamma(alpha,lambda), then the vector
(X_1,...,X_n)/(X_1+...+X_n) is independent of X_1+...+X_n.
(d) When X_1,...,X_n is a sample from Uniform[0,L], then (X_(1),...,X_(n-1))/X_(n)
is independent of X_(n).
IN EACH CASE in (5), THE METHOD IS TO SHOW THAT ONE STATISTIC IS SUFFICIENT FOR AN
UNKNOWN PARAMETER (perhaps, with other parameter(s) fixed) AND THE OTHER IS ANCILLARY.
In part (b) the parameter is the unknown distribution function.
---------------
Problem Set 5, due Friday Nov. 7 (under my door).
All problems in this HW set count as 2 (so total points are 80).
(1) (Compare #6.12.) Suppose that we perform a comparison of the lifetimes of
a certain kind of device under two operation conditions A and B, as follows.
For each of 100 independently and identically produced specimens of this device,
a biased coin is flipped independently, with probability 3/5 of Heads and 2/5
of Tails. Each device for which the coin falls Heads is operated under condition A,
and the others under condition B. Let e_j for j=1,...,100 be 1 if the j'th coin
falls Heads, 0 otherwise. Assume that the device lifetimes are independent
and Expon(lambda_A) and Expon(lambda_B) respectively under conditions A and B. Each
one of the 100 devices, j=1,...,100, is operated to destruction and its lifetime T_j
noted. Let n_A be the number of coins falling Heads, n_B = 100 - n_A, and let
S_A = sum_{j=1}^{100} e_j T_j and S_B = sum_{j=1}^{100} (1-e_j) T_j
be the respective total lifetimes of all of the condition-A and condition-B devices.
(a) Prove that (S_A, S_B, n_A) is minimal sufficient for theta = (lambda_A,lambda_B),
and that n_A is ancillary for theta.
(b) Conditionally given 0 < n_A < 100, find the variance-covariance matrix for
the (conditionally) unbiased estimator (S_A/n_A, S_B/n_B) of theta.
NOTE that the conditionality principle says we should perform inferences about theta
conditionally given n_A, which is completely reasonable in this setting.
(2) (a) Give two completely different methods [HINT: mixture, and conditional
probability integral theorem] for simulating as a function of i.i.d. Uniform[0,1]
random variables U_1,U_2,... a random vector (X,Y) with joint density
f_{X,Y}(x,y) = 2 ( x y + (1-x)(1-y) ) I[0 < x, y < 1],
and justify that your function of U_1, U_2, etc. has the stated joint density.
(b) Do the same for a random vector (V,W) with joint density
f_{V,W}(v,w) = (2/pi) I[w > 0, v^2+w^2 < 1] (the uniform density on a half-disc).
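OPTIONAL ILLUSTRATION: for the mixture method in (a), one possible Python
sketch (my own decomposition of the density; verify it before relying on it):

  import numpy as np

  rng = np.random.default_rng()

  def sim_xy(n):
      # Mixture: 2(xy + (1-x)(1-y)) = (1/2)*(2x)(2y) + (1/2)*(2(1-x))(2(1-y)),
      # and the density 2x on (0,1) is simulated by sqrt(U) (probability
      # integral transform), the density 2(1-x) by 1 - sqrt(U).
      u1, u2, u3 = rng.uniform(size=(3, n))
      x = np.where(u3 < 1/2, np.sqrt(u1), 1 - np.sqrt(u1))
      y = np.where(u3 < 1/2, np.sqrt(u2), 1 - np.sqrt(u2))
      return x, y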
(3) Suppose that for i=1,...,n, the iid discrete random variables X_i take values
0, 1, and 2 with respective probabilities 1-alpha-beta, alpha, and beta, where
alpha and beta are unknown positive parameters with alpha + beta < 1. Based on data
X_1,...,X_n:
(a) Find the method of moments estimators of alpha, beta.
(b) Find the MLE of theta = (alpha, beta).
(c) Find the Bayes (posterior-mean) estimator of (alpha,beta), using the prior
g(alpha,beta) = 2 I[alpha, beta > 0, alpha+beta < 1].
HINT: your calculations will be easier in (b) and (c) if you first establish
that N_1 = number of X_i's equal to 1 and N_2 = number of X_i's equal to 2 are
sufficient statistics for (alpha, beta). Also: your calculation in (c) will be
related to the Dirichlet Distribution; look it up on Wikipedia if you do not want
to derive the formulas yourself based on what you know about Beta densities.
(4) Find formulas for the mean-squared errors (MSEs), as a function of alpha and
beta, for each of the three estimators you found in problem (3). Compare these
MSEs as well as you can: does any of these estimators dominate the others ? Is any
uniformly worse than the others ? You may base your answer on numerical
comparisons for particular choices of alpha, beta or on general theorems stated
in class.
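OPTIONAL ILLUSTRATION: a template for numerical MSE comparisons; plug each of
your three estimators from problem (3) in for est (the plug-in example in the
last line is illustrative only, not necessarily one of your answers).

  import numpy as np

  rng = np.random.default_rng()

  def mse(est, alpha, beta, n=50, reps=20_000):
      # est maps the sufficient counts (N_1, N_2, n) to (alpha_hat, beta_hat)
      counts = rng.multinomial(n, [1-alpha-beta, alpha, beta], size=reps)
      n1, n2 = counts[:, 1], counts[:, 2]
      a_hat, b_hat = est(n1, n2, n)
      return ((a_hat - alpha)**2).mean(), ((b_hat - beta)**2).mean()

  print(mse(lambda n1, n2, n: (n1/n, n2/n), alpha=0.3, beta=0.2))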
---------------
Problem Set 6, due Monday Nov. 24.
(1) (a) Find the UMVUE for lambda^2 based on a sample X_1,...,X_n of Poisson(lambda) data.
(b) Find the Cramer-Rao lower bound for an unbiased estimator of lambda^2 based on
the same sample. Is the lower bound achieved by the UMVUE ? Explain why or why not.
(c) Find the variance of the UMVUE.
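OPTIONAL ILLUSTRATION: a template for checking (b)-(c) numerically; substitute
your UMVUE for the naive plug-in estimator shown, and compare the printed
variance with your Cramer-Rao bound.

  import numpy as np

  rng = np.random.default_rng()

  def mc_variance(estimator, lam, n=25, reps=200_000):
      # estimator acts row-wise on a (reps, n) array of Poisson(lam) samples
      x = rng.poisson(lam, size=(reps, n))
      return estimator(x).var()

  print(mc_variance(lambda x: x.mean(axis=1)**2, lam=2.0))  # naive plug-in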
(2) Suppose that V_1,...,V_n are a sample from the density (2v/theta^2) I[0 < v < theta]
with unknown parameter theta > 0.
(3) Suppose that a prior density proportional to exp(-gamma*lambda) on lambda > 0 is given
for the unknown parameter lambda of a Poisson(lambda) distributed data sample
Y = (Y_1,..., Y_n), where gamma > 0 is known. Based on this sample, and this prior,
find the estimator tilde{lambda} = tilde{lambda}(Y) that minimizes the expectation
over lambda and Y of the loss L(lambda, tilde{lambda}(Y)), where
L(lambda, a) = (lambda - a)^2/lambda for a, lambda > 0.
---------------
Problem Set 7, due Wednesday Dec. 10 in class or Friday Dec. 12 at 4pm review session.
(1) An observer records a single data-value X in (0, 1), where X
has one of two possible densities according to the value theta = 1 or 2, with
f(x,theta=1) = 2(1-x) I[0 < x < 1] and f(x,theta=2) = 2x I[0 < x < 1]. The
observer uses 0-1 loss, L(theta,a) = I[a != theta], and restricts his
attention to the class of nonrandomized decision rules T_s(x) indexed by s in (0, 1)
which are defined by: T_s(x) = 1 + I[x>s].
Show that in this statistical decision problem, there is a unique minimax rule
(find it and justify it!). Also show that every decision rule of the form
tau_t(x) = 1+I[x<t], 0 < t < 1, is inadmissible.
(4) Suppose that W_1,...,W_n are iid with the common density
f(w;theta) = (1/2)*(1 + theta*(2*I[w>1]-1)) , 0 < w < 2,
where -1 < theta < 1 is an unknown parameter.
(a) Find a UMP test of size exactly alpha=.05 based on W_1,...,W_30 (n=30) for
the null hypothesis H_0: theta=0 versus the alternative H_1: theta = 1/2.
(HINT: why is auxiliary randomization necessary in this problem ?)
(b) Find the power against theta=0.5 of the test you find in part (a).
(5) Consider the distributional setting of problem (4), still with n=30 and the
same null hypothesis H_0 as in problem (4), but now let the alternative
hypothesis be H_2: |theta| = 0.5.
(a) Is there a UMP test of H_0 versus H_2 of size alpha=.05 ? Give details
showing why or why not.
(b) Find the Likelihood Ratio Test of size alpha=.05 for testing H_0 versus
H_2, and find its power against theta = 0.5.
(6) Let Z_1,...,Z_n be a N(mu,1) sample, and define H_0: mu=0 and H_2: |mu| > 0.
Suppose that your prior probability distribution for mu is a mixed-type distribution
with a point mass of 0.4 at mu=0 and continuous density-component 0.6 times a
Normal(0, tau) density. Define the loss-function by
L(mu, a) = 3 * I[ mu=0 and a=1] + mu^2*I[ |mu| >=1 and a=0] ,
where the possible actions in this testing problem are a=0,1, with 1 interpreted as
rejection of the null hypothesis.
(a) For n=10, tau=1, find the Bayes-optimal testing procedure, and find its size
and power.
(b) What is the posterior conditional probability that your test in (a) rejects
given that |mu| = 1.5 ?