DEMONSTRATION LOG OF PROBABILITY SAMPLING using R functions.
====================================================   9/7/05

NOTE: lines of R code are given following the standard prompt > ,
      while comments and summaries are given following #.

> yvals <- c(1,2,4,4,7,7,7,8)
     # These are the attribute values y_i given in Example 2.1, p. 26.

     # Consider all possible samples of size 4.
     # Their index sets are placed below into a matrix with 70 rows.
     # NB: these rows are all distinct, but the corresponding samples
     #     of yvals are not all distinct, because yvals has ties !!

> sampmat <- NULL
> for(i in 1:5) for (j in (i+1):6) for (k in (j+1):7) for(m in (k+1):8)
       sampmat <- rbind(sampmat, c(i,j,k,m))
> dim(sampmat)
[1] 70  4

> tvec <- apply(sampmat,1, function(irow) 2*sum(yvals[irow]))
     ## This compact line of R code applies the same rule to each
     ## possible sample of indices: twice the sum of the yvals
     ## values at the 4 sampled indices (the factor 2 is N/n = 8/4).
     ## It produces a vector of length 70: all possible t-hat
     ## estimates of the y-attribute total based on the 70
     ## equally likely samples that could be drawn.

> table(tvec)
tvec
22 28 30 32 34 36 38 40 42 44 46 48 50 52 58
 1  6  2  3  7  4  6 12  6  4  7  3  2  6  1
     ## Each of these counts divided by 70 gives the probability
     ## of the corresponding estimate of the total of y attributes.

> mean(tvec)
[1] 40
> var(tvec)*69/70      ## correction because this is a theoretical
[1] 54.85714           ## variance, not a 'sample' variance !

     ## Formulas in Ch.2 are: mean = sum(yvals), and
     ##     var = (8^2/4)*(1-4/8)*var(yvals)
> sum(yvals)
[1] 40
> (8^2/4)*(1-4/8)*var(yvals)
[1] 54.85714

====================================================
NOTE for IN-CLASS comparison with the WITH-REPLACEMENT
     SAMPLING DESIGN:

> mean(yvals)          ## Mean
[1] 5
> (7/8)*var(yvals)     ## and Theoretical Variance
[1] 6                  ## of a single equiprobably sampled
                       ## value from yvals
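
The with-replacement comparison named in the last note can be sketched by brute-force enumeration, in the same spirit as the 70-row matrix used for the without-replacement design. This is only an illustration under stated assumptions: it assumes the with-replacement estimator uses the same rule as t-hat, (N/n) times the sum of the n drawn values, and the names N, n, t_srs, t_wr, draws and pvar are made up here and do not appear in the log.

```r
# Sketch only. Assumes the with-replacement estimator is (N/n) * (sum of the n draws),
# i.e. the same rule as t-hat; every name except yvals is hypothetical.
yvals <- c(1, 2, 4, 4, 7, 7, 7, 8)
N <- length(yvals)   # population size, 8
n <- 4               # sample size

# Without replacement: combn() visits each of the choose(8,4) = 70 index sets once
t_srs <- combn(N, n, FUN = function(idx) (N / n) * sum(yvals[idx]))

# With replacement: all N^n = 4096 equally likely ordered draws of indices
draws <- as.matrix(expand.grid(rep(list(seq_len(N)), n)))
t_wr  <- apply(draws, 1, function(irow) (N / n) * sum(yvals[irow]))

# Theoretical variance over an enumerated set: divide by the count, not count - 1
pvar <- function(x) mean((x - mean(x))^2)

c(mean_srs = mean(t_srs), var_srs = pvar(t_srs))
c(mean_wr  = mean(t_wr),  var_wr  = pvar(t_wr))

# Closed form for the with-replacement case: N^2 * sigma^2 / n,
# where sigma^2 = ((N-1)/N) * var(yvals) is the single-draw variance (6 here)
N^2 * ((N - 1) / N) * var(yvals) / n
```

Both designs give an estimator with mean 40, but the with-replacement variance is 64*6/4 = 96, against 54.85714 without replacement. The ratio between the two is the factor (N-n)/(N-1) = 4/7.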