Stat 700, Fall '14: PROBLEM ASSIGNMENTS

All section and problem numbers are as in Casella and Berger, Statistical Inference, 2nd edition. Problem sets listed here for later in the course may change by the time they are assigned.

Problem Set 1, due Monday, Sept. 15, in class.

(1) Prove that the logistic density is symmetric about 0 [see exercise #2.26 for definitions and properties of symmetric densities], show that its mean is 0, and evaluate its variance (as given on p. 624 for mu=0, beta=1) either by integration or by finding a valid series expansion for it.

(2) Suppose that the distribution function of a random variable X is given by:
    F(t) = 0                    if t < 0
         = 1/4 + t/12           if 0 <= t < 1
         = 1/3 + (t-1)^2/3      if 1 <= t < 2
         = 1                    if t >= 2
Find the mean, variance, and moment generating function of X.

(3) Verify in two ways --- both analytically and by verifying the conditions of Theorem 2.4.3 or 2.4.8 in Casella & Berger --- that dM(t; lambda)/dt = E( X exp(t X) ), where the r.v. X ~ Poisson(lambda) and M(t) = M(t; lambda) = E(exp(tX)) is its moment generating function.

(4) Prove that the first term on the right-hand side of the last displayed equation on p. 124 (3rd line from the end of the proof of Lemma 3.6.5) is 0, under the conditions of the Lemma.

(5) Prove [the assertion in Casella-Berger problem #2.7(b)] that Theorem 2.1.8 continues to hold if the sets A_0, A_1, ..., A_k partition a set LARGER than the set (2.1.7) of positivity of the density f_X. (This means that the disjoint sets A_i are allowed to contain points x where f_X(x) = 0.) Use this fact (or other results in Chapter 2 of Casella and Berger) to find the density of Y = (X-2)^2, where X is a random variable uniformly distributed on [0,3].

(6) Problems #3.20 and 3.34.

(7) Suppose that a random variable T has the survival function S(t) = 1 - F(t) = exp(-b t^a) for t > 0, for positive constants a, b. Show that for each fixed value of a > 0, this family of probability distributions parameterized by b > 0 is a SCALE FAMILY, and also that the distribution of the random variable log(T) is a LOCATION-SCALE family with respect to the parameters (a, b).

--------------------------

Problem Set 2, due Monday, Sept. 29, in class.

(1) (a) Give an algorithm to simulate on the computer a random variable whose distribution function is given by the F(t) in Problem (2) of HW Set 1. (b) Give an algorithm to simulate on the computer a r.v. whose density is 2 exp(-x) (1 - exp(-x)) for x > 0 (and 0 for x <= 0).
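Aside (an optional illustration, not part of the assignment): below is a minimal, grid-based Python sketch of the inverse-CDF idea that part (a) asks about. The function names, grid, and sample sizes are arbitrary choices of mine, and a full solution would invert F analytically piece by piece (including its jumps) rather than numerically on a grid.

    import numpy as np

    def sample_from_cdf(cdf, grid, n, rng=None):
        # Approximate inverse-CDF sampling: evaluate F on a fine increasing grid
        # covering the support and return, for each U ~ Uniform[0,1], the left-most
        # grid point t with F(t) >= U (the generalized inverse).  Jumps of F are
        # handled automatically: a whole interval of U-values maps to (a grid point
        # just above) the jump location.
        rng = np.random.default_rng() if rng is None else rng
        u = rng.uniform(size=n)
        F = cdf(grid)
        idx = np.searchsorted(F, u, side="left")
        return grid[np.minimum(idx, len(grid) - 1)]

    # Illustrative use with the distribution function from Problem (2) of HW Set 1:
    def F(t):
        t = np.asarray(t, dtype=float)
        return np.select([t < 0, t < 1, t < 2],
                         [0.0, 1/4 + t/12, 1/3 + (t - 1)**2 / 3], 1.0)

    draws = sample_from_cdf(F, np.linspace(-0.5, 2.5, 100_001), 10_000)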
(2) Find the moment generating function and variance of a random variable Z defined as the sum of a random number M of independent identically distributed N(1,4) random variables X_1, X_2, ..., where M is a Poisson(lambda) random variable independent of the set of X_j's.

(3) Find the distribution function and variance of a random variable Y defined as follows. Let V be a Bernoulli(p) random variable, W a Geometric(0.1) random variable (with values 1,2,...), and X an Exponential r.v. with mean 0.5; assume that V, W, and X are jointly independent, and define Y = V W + (1-V) X.

(4) Suppose that (X,Y) is a bivariate-normal random vector with E(X)=0, E(Y)=2, Var(X)=1, Var(Y)=4, Cov(X,Y)=1. Find E( (X+Y)^4 | Y ) and Var( X+2Y | Y ).

(5) [Counts as 2 problems] The probability Q of Heads in a coin-toss experiment is determined by selecting a coin from a population of biased coins in such a way that Q ~ Beta(4,4). Then, given fixed Q, that coin is tossed independently and repeatedly until the second time that Heads occurs. Let M be the number of tosses this takes (which means that among the first M-1 tosses, exactly 1 Head has appeared, and on the M'th toss a Head also occurs).
(a) Find the unconditional distribution of M.
(b) Suppose you do the coin-flipping experiment with the same coin (and therefore the same Q) 10 times, recording numbers M_1,...,M_10, each the number of tosses needed until the occurrence of the 2nd Head. Find the probability distribution of L_1 = M_1+...+M_10, and find its variance.
(c) Suppose that in (b) the separate variables M_1,...,M_10 are obtained by tossing different independently selected coins with respective Heads probabilities Q_1,...,Q_10 ~ Beta(4,4). Compare the variance of the total L_2 = M_1+...+M_10 in this series of experiments with the variance you found in part (b). Which one is larger? Why? Would this fact still be true if the Beta(4,4) distribution for Heads-probabilities of individual coins were replaced by some other Beta(alpha,beta) distribution?

(6) If Z_1, Z_2, Z_3, and Z_4 are independent identically distributed standard-normal random variables, and Z denotes the column vector with these four random variables as components, then find the density of the random variable
    (Z_1+Z_3)^2 + (Z_2-Z_4)^2 + (Z_1-Z_3+Z_2+Z_4)^2 + (Z_1-Z_3-Z_2-Z_4)^2 .
[The answer in this problem can be found in closed form.]

---------------------------

Problem Set 3, due Monday, Oct. 6, in class.

(1) Let (Z_1, Z_2, Z_3) be iid standard normal random variables, and let Y_1 = max(Z_1-Z_2, Z_1+Z_3), Y_2 = min(Z_2,Z_3). Find the joint density of (Y_1, Y_2).

(2) It was shown in class by an argument involving the "memoryless property" that if X_1, X_2, ..., X_k are independent identically distributed Expon(lambda) random variables, with order statistics denoted X_(1), X_(2), ..., X_(k), then X_(1), X_(2)-X_(1), X_(3)-X_(2), ..., X_(k)-X_(k-1) are independent random variables. Prove this fact directly, by regarding this last random vector as an invertible transformation of the order-statistics vector.

(3) (a) Find the density of X = Z_1/Z_2, where Z_1 and Z_2 are independent standard normal random variables. (b) Show that E(|X|) is infinite, but that E(sqrt(|X|)) is finite. (c) Show from the definition that if X_1,...,X_n are independent and identically distributed with the same density as X, then Xbar = (X_1+...+X_n)/n also has exactly the same distribution as X.

(4) [Counts as 2 problems] Consider the noncentral chi-squared random variable
    Y = (Z_1+a_1)^2 + (Z_2+a_2)^2 + ... + (Z_p+a_p)^2 ,
where a_1, a_2, ..., a_p are any real numbers and Z_1,...,Z_p are iid standard normal, for any positive integer p.
(a) Show that the distribution of Y depends on the numbers a_1,...,a_p only through the noncentrality parameter lambda = (a_1^2+...+a_p^2)/2. HINT: to do this, expand the squares in the definition of Y and show that there is an orthogonal linear transformation from (Z_1,...,Z_p) to (W_1,...,W_p) such that W_1 = (a_1*Z_1+...+a_p*Z_p)/sqrt(2*lambda).
(b) So by (a) there is no loss of generality in assuming that a_2 = ... = a_p = 0 in part (a). Show that the moment generating function of the random variable Y is exp(-lambda) * (1-2t)^(-p/2) * exp(lambda/(1-2t)).
(c) By expanding the final exponential in the mgf of part (b), show that this moment generating function is exactly the same as the mgf of the mixture density given in formula (4.4.3) of Casella and Berger, which is the mixture with weights exp(-lambda) lambda^k / k! of chi-square(p+2k) densities over nonnegative integers k. (Note: justify your use of series expansion inside an integral!)
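Aside (an optional illustration, not part of the assignment): a rough Monte-Carlo sanity check, in Python, of the mgf formula stated in problem (4)(b). The dimension p, the constants a_i, the sample size, and the grid of t-values below are arbitrary illustrative choices of mine.

    import numpy as np

    rng = np.random.default_rng(0)
    p, a = 3, np.array([1.0, -0.5, 0.5])          # arbitrary illustrative constants
    lam = np.sum(a**2) / 2.0                      # noncentrality parameter lambda
    Z = rng.standard_normal((200_000, p))
    Y = np.sum((Z + a)**2, axis=1)                # noncentral chi-squared draws
    for t in (0.1, 0.2, 0.3):                     # mgf exists for t < 1/2
        mc = np.mean(np.exp(t * Y))               # Monte-Carlo estimate of E exp(tY)
        formula = np.exp(-lam) * (1 - 2*t)**(-p/2) * np.exp(lam / (1 - 2*t))
        print(t, mc, formula)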
---------

Problem Set 4, due Friday, Oct. 24 (under my door). NOTE THE CHANGED DUE-DATE and also the removal of the INCOMPLETENESS assertion in 4.(c).

(1) Give two different algorithms to generate (as a function of a sequence of iid Uniform[0,1] random variable values U_1, U_2, ...) a random vector (V,W) which is uniformly distributed on the triangle {(v,w): 0 0 while mu could be any real number.
(a) Find the sufficient statistic vector T obtained by representing this density as an exponential family, and explain how you know its components are sufficient.
(b) Show that your sufficient statistic vector is minimal.
(c) Show that when lambda=1 is known, a sub-vector S of your sufficient statistic vector T is minimal sufficient for mu.

(5) [Counts as 2 problems] Use Basu's Theorem to show that:
(a) When X_1,...,X_n is a sample from N(0, b^2), the vector (sign(X_1),...,sign(X_n)) is independent of X_1^2+...+X_n^2.
(b) When X_1,...,X_n is a continuously distributed sample (of scalar r.v.'s), the rank vector (R_1,...,R_n), with $R_k = \sum_{i=1}^n I[X_i \le X_k]$, is independent of the order-statistic vector.
(c) When X_1,...,X_n is a sample from Gamma(alpha,lambda), the vector (X_1,...,X_n)/(X_1+...+X_n) is independent of X_1+...+X_n.
(d) When X_1,...,X_n is a sample from Uniform[0,L], (X_(1),...,X_(n-1))/X_(n) is independent of X_(n).
IN EACH CASE in (5), THE METHOD IS TO SHOW THAT ONE STATISTIC IS SUFFICIENT FOR AN UNKNOWN PARAMETER (perhaps with other parameter(s) fixed) AND THE OTHER IS ANCILLARY. In part (b) the parameter is the unknown distribution function.

---------------

Problem Set 5, due Friday, Nov. 7 (under my door). All problems in this HW set count as 2 (so total points are 80).

(1) (Compare #6.12.) Suppose that we perform a comparison of the lifetimes of a certain kind of device under two operating conditions A and B, as follows. For each of 100 independently and identically produced specimens of this device, a biased coin is flipped independently, with probability 3/5 of Heads and 2/5 of Tails. Each device for which the coin falls Heads is operated under condition A, and the others under condition B. Let e_j for j=1,...,100 be 1 if the j'th coin falls Heads, 0 otherwise. Assume that the device lifetimes are independent and Expon(lambda_A) and Expon(lambda_B) respectively under conditions A and B. Each one of the 100 devices, j=1,...,100, is operated to destruction and its lifetime T_j noted. Let n_A be the number of coins falling Heads, n_B = 100 - n_A, and let S_A = sum_{j=1}^{100} e_j T_j and S_B = sum_{j=1}^{100} (1-e_j) T_j be the respective total lifetimes of all the condition-A and condition-B devices.
(a) Prove that (S_A, S_B, n_A) is minimal sufficient for theta = (lambda_A, lambda_B), and that n_A is ancillary for theta.
(b) Conditionally given 0 < n_A < 100, find the variance-covariance matrix of the (conditionally) unbiased estimator (S_A/n_A, S_B/n_B) of theta. NOTE that the conditionality principle says we should perform inferences about theta conditionally given n_A, which is completely reasonable in this setting.

(2) (a) Give two completely different methods [HINT: mixture, and conditional probability integral theorem] for simulating, as a function of i.i.d. Uniform[0,1] random variables U_1, U_2, ..., a random vector (X,Y) with joint density f_{X,Y}(x,y) = 2 ( x y + (1-x)(1-y) ) I[0 0, v^2+w^2<1], but justify that your function of U_1, U_2, etc. has the stated joint density.
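Aside (an optional illustration, not part of the assignment): whatever exact method you give in problem (2), one generic way to produce or check uniform draws on a bounded planar region is rejection sampling from a bounding box, sketched below in Python. The function name, the bounding box, and the half-disc used in the example call are my own illustrative choices, not necessarily the exact region in the problem statement.

    import numpy as np

    def uniform_on_region(indicator, box, n, rng=None):
        # Rejection sampling: propose points uniformly on the bounding box
        # box = (vlo, vhi, wlo, whi) and keep those for which indicator(v, w) is
        # True; the accepted points are uniformly distributed on the region.
        rng = np.random.default_rng() if rng is None else rng
        vlo, vhi, wlo, whi = box
        pts = []
        while len(pts) < n:
            v = rng.uniform(vlo, vhi, size=n)
            w = rng.uniform(wlo, whi, size=n)
            keep = indicator(v, w)
            pts.extend(zip(v[keep], w[keep]))
        return np.array(pts[:n])

    # Example: uniform draws on the half-disc {(v,w): v > 0, v^2 + w^2 < 1}.
    pts = uniform_on_region(lambda v, w: (v > 0) & (v**2 + w**2 < 1),
                            (0.0, 1.0, -1.0, 1.0), 1000)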
(3) Suppose that for i=1,...,n, the iid discrete random variables X_i take values 0, 1, and 2 with respective probabilities 1-alpha-beta, alpha, and beta, where alpha and beta are unknown positive parameters with alpha + beta < 1. Based on data X_1,...,X_n:
(a) Find the method-of-moments estimators of alpha, beta.
(b) Find the MLE of theta = (alpha, beta).
(c) Find the Bayes (posterior-mean) estimator of (alpha,beta), using the prior g(alpha,beta) = 2 I[alpha, beta > 0, alpha+beta < 1].
HINT: your calculations will be easier in (b) and (c) if you first establish that N_1 = number of X_i's equal to 1 and N_2 = number of X_i's equal to 2 are sufficient statistics for (alpha, beta). Also: your calculation in (c) will be related to the Dirichlet distribution; look it up on Wikipedia if you do not want to derive the formulas yourself based on what you know about Beta densities.

(4) Find formulas for the mean-squared errors (MSEs), as functions of alpha and beta, for each of the three estimators you found in problem (3). Compare these MSEs as well as you can: does any of these estimators dominate the others? Is any uniformly worse than the others? You may base your answer on numerical comparisons for particular choices of alpha, beta or on general theorems stated in class.

---------------

Problem Set 6, due Monday, Nov. 24.

(1) (a) Find the UMVUE for lambda^2 based on a sample X_1,...,X_n of Poisson(lambda) data. (b) Find the Cramer-Rao lower bound for an unbiased estimator of lambda^2 based on the same sample. Is the lower bound achieved by the UMVUE? Explain why or why not. (c) Find the variance of the UMVUE.

(2) Suppose that V_1,...,V_n are a sample from the density (2v/theta^2) I[0 0 is given for the unknown parameter lambda of a Poisson(lambda) distributed data sample Y = (Y_1,..., Y_n), where gamma > 0 is known. Based on this sample and this prior, find the estimator tilde{lambda} = tilde{lambda}(Y) that minimizes the expectation over lambda and Y of the loss L(lambda, tilde{lambda}(Y)), where L(lambda, a) = (lambda - a)^2/lambda for a, lambda > 0.

---------------

Problem Set 7, due Wednesday, Dec. 10, in class, or Friday, Dec. 12, at the 4pm review session.

(1) An observer records a single data-value X in (0, 1), where X has one of two possible densities according to the value theta = 1 or 2, with f(x,theta) = 2(1-x) I[0 theta], and restricts his attention to the class of nonrandomized decision rules T_s(x) indexed by s in (0, 1) which are defined by T_s(x) = 1 + I[x>s]. Show that in this statistical decision problem, there is a unique minimax rule (find it and justify it!). Also show that every decision rule of the form tau_t(x) = 1+I[x1]-1)), 0 < w < 2, where -1 < theta < 1 is an unknown parameter.
(a) Find a UMP test of size exactly alpha=.05 based on W_1,...,W_30 (n=30) for the null hypothesis H_0: theta=0 versus the alternative H_1: theta = 1/2. (HINT: why is auxiliary randomization necessary in this problem?)
(b) Find the power against theta=0.5 of the test you find in part (a).

(5) Consider the distributional setting of problem (4), still with n=30 and the same null hypothesis H_0 as in problem (4), but now let the alternative hypothesis be H_2: |theta| = 0.5.
(a) Is there a UMP test of H_0 versus H_2 of size alpha=.05? Give details showing why or why not.
(b) Find the Likelihood Ratio Test of size alpha=.05 for testing H_0 versus H_2, and find its power against theta = 0.5.
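Aside (an optional illustration, not part of the assignment): once you have an explicit rejection rule, its size and power (asked for in problems (4)(b) and (5)(b)) can always be double-checked by simulation. The Python sketch below illustrates this simulate-and-count idea on a made-up test (a one-sided normal-mean test with n=30), not on the W_i model of the problems.

    import numpy as np

    # Generic Monte-Carlo check of a test's size and power: simulate data under a
    # given parameter value and record how often the test rejects.  The test used
    # here (reject H_0: mu=0 when the mean of n=30 N(mu,1) observations exceeds
    # 1.645/sqrt(30)) is an illustrative stand-in only.
    rng = np.random.default_rng(1)
    n, reps, cutoff = 30, 100_000, 1.645 / np.sqrt(30)

    def rejection_rate(mu):
        xbar = rng.normal(mu, 1.0, size=(reps, n)).mean(axis=1)
        return np.mean(xbar > cutoff)

    print("size :", rejection_rate(0.0))    # should be close to 0.05
    print("power:", rejection_rate(0.5))    # estimated power against mu = 0.5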
(6) Let Z_1,...,Z_n be a N(mu,1) sample, and define H_0: mu=0 and H_2: |mu| > 0. Suppose that your prior probability distribution for mu is a mixed-type distribution with a point mass of 0.4 at mu=0 and a continuous density component equal to 0.6 times a Normal(0, tau) density. Define the loss function by L(mu, a) = 3 * I[mu=0 and a=1] + mu^2 * I[|mu| >= 1 and a=0], where the possible actions in this testing problem are a=0,1, with a=1 interpreted as rejection of the null hypothesis.
(a) For n=10, tau=1, find the Bayes-optimal testing procedure, and find its size and power.
(b) What is the posterior conditional probability that your test in (a) rejects, given that |mu| = 1.5?
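Aside (an optional illustration, not part of the assignment): one ingredient of any Bayes test in problem (6) is the posterior probability of the point null given the data, which depends on the sample only through Zbar. The minimal Python sketch below shows that single computation; it treats tau as the prior VARIANCE of the continuous component (the problem's "Normal(0, tau)" could also mean standard deviation), and the sample-mean value in the example call is made up.

    import numpy as np

    def normal_pdf(x, var):
        return np.exp(-x**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

    def post_prob_null(zbar, n, tau=1.0, prior_null=0.4):
        # Marginal density of the sufficient statistic Zbar under each prior
        # component: Zbar ~ N(0, 1/n) when mu = 0, and Zbar ~ N(0, tau + 1/n)
        # when mu is drawn from the N(0, tau) continuous component.
        m0 = normal_pdf(zbar, 1.0 / n)
        m1 = normal_pdf(zbar, tau + 1.0 / n)
        return prior_null * m0 / (prior_null * m0 + (1 - prior_null) * m1)

    print(post_prob_null(zbar=0.3, n=10))   # illustrative made-up sample mean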