Solution to HW 20.
=================

> dim(mtcars)
[1] 32 11
> names(mtcars)
 [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
[11] "carb"

> maxfit = lm(mpg ~ ., data=mtcars)
> extractAIC(maxfit)
[1] 11.00000 70.89774

> tmpfit = step(frstfit, list(upper= ~ qsec+vs+gear+carb
        + cyl+ disp+ hp+ drat+ wt+ am, lower = ~1), dire="for", k=4)

Step:  AIC= 69.2
 mpg ~ wt + cyl

       Df Sum of Sq     RSS     AIC
<none>              191.172  69.198
+ hp    1    14.551 176.621  70.665

> summary(tmpfit)$coef
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 39.686261  1.7149840 23.140893 3.043182e-20
wt          -3.190972  0.7569065 -4.215808 2.220200e-04
cyl         -1.507795  0.4146883 -3.635972 1.064282e-03

### Running same "step" function with upper = ~qsec+vs+gear+carb + 
              (cyl+ disp+ hp+ drat+ wt+ am)^2
### leads to "best" model

Step:  AIC= 66.89
 mpg ~ wt + cyl + wt:cyl

### but improper termination !!?

> fitfinal = update(tmpfit, formula=.~. + wt*cyl)
> extractAIC(fitfinal, k=4)
[1]  4.00000 66.89147
> summary(fitfinal)$coef
              Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 54.3068062   6.127535  8.862749 1.288912e-09
wt          -8.6555590   2.320122 -3.730648 8.609666e-04
cyl         -3.8032187   1.005028 -3.784193 7.471791e-04
wt:cyl       0.8083947   0.327322  2.469723 1.988242e-02
### STILL A VERY STRONG AND SIGNIFICANT MODEL

> backfit = step(lm(mpg ~ qsec+vs+gear+carb + cyl+ disp+ hp+ 
        drat+ wt+ am + wt:cyl, data=mtcars), k=4)

Step:  AIC= 65.75
 mpg ~ qsec + cyl + wt + cyl:wt

         Df Sum of Sq     RSS     AIC
<none>                133.680  65.751
- qsec    1    23.296 156.976  66.891
- cyl:wt  1    46.925 180.605  71.378

> summary(backfit)$coef
              Estimate Std. Error   t value     Pr(>|t|)
(Intercept)  41.642888  8.2001797  5.078290 2.467874e-05
qsec          0.757161  0.3490572  2.169160 3.904427e-02
cyl          -3.392655  0.9632556 -3.522072 1.543295e-03
wt          -10.819534  2.3977297 -4.512408 1.128707e-04
cyl:wt        0.976600  0.3172252  3.078570 4.734899e-03
### STILL VERY STRONG and all coefficients sensible
> plot(backfit)
### Residuals plot generall sensible, but a few dramatic outliers!
(Toyota Corolla, Corona, and Fiat 128): without these three (17,20,21)
get 

> backf2 = step(lm(mpg ~ qsec+vs+gear+carb + cyl+ disp+ hp+ 
     drat+ wt+ am + wt:cyl, data=mtcars[c(1:17,19,22:32),]), k=4)
Step:  AIC= 44.67
 mpg ~ cyl + wt + am + cyl:wt         ### model very similar

         Df Sum of Sq    RSS    AIC
<none>                67.899 44.671
- am      1    12.966 80.865 45.739
- cyl:wt  1    22.229 90.128 48.884

> summary(backf2)$coef
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 54.382372  5.6124728  9.689556 9.067462e-10
cyl         -3.511505  0.7878346 -4.457160 1.651637e-04
wt          -8.457864  1.9597174 -4.315859 2.365715e-04
am          -2.044559  0.9550441 -2.140801 4.265448e-02
cyl:wt       0.718421  0.2563001  2.803046 9.859160e-03

## previous bad residuals were + - 4, now (Firebird, Imperial, Camaro) + - 3

### Sequential selection seems dramatically dependent on the order
###  chosen, partly because of outliers.

### NOW SAS 

  data home.mtcars;
        infile "mtcars.dat";
        input carname $ mpg cyl  disp  hp drat
             wt  qsec vs am gear carb ;
        V23 = cyl*disp;
        V24 = cyl*hp;
        V25 = cyl*drat;
        V26 = cyl*wt;
        V29 = cyl*am;
        V34 = disp*hp;
        V35 = disp*drat;
        V36 = disp*wt;
        V39 = disp*am;   
        V45 = hp*drat;
        V46 = hp*wt;
        V49 = hp*am;
        V56 = drat*wt;
        V59 = drat*am;
        V69 = wt*am;       run;

### 32 x 27 !!!

   proc reg data=home.mtcars;
        model mpg = cyl  disp  hp drat
             wt  qsec vs am gear carb V23 V24 
             V25 V26 V29 V34 V35 V36 V39
             V45 V46 V49 V56 V59 V69 / selection = stepwise
             slentry = .1 slstay = .05 ;
      run;


                      Summary of Stepwise Selection

     Variable  Variable  Number  Partial   Model
Step Entered   Removed   Vars In R-Square R-Square  C(p)   F Value Pr > F

  1  V26                     1    0.7821   0.7821  73.8147  107.69  <.0001

###ONLY variable entered is wt*cyl which for us means that we should
###  enter also wt*cyl. But as we saw in R, that is not quite the
###  best model !!