Homework 13, Assigned 10/26/17, due Monday 11/6 in class
========================================================       16 points

(A) Consider the dataset "debt" from package "faraway" on characteristics of 464  
people answering questions on a (UK) postal survey regarding attitudes to debt.

The response variable of interest is "prodebt". Delete the column "ccarduse" and all rows 
in which the variable "prodebt" is missing. Then recode the two columns "bankacc" and 
"bsocacc"  into a single variable "savings" using the rule:

savings = 1   if either bankacc = 1 or bsocacc = 1 (even if the other value is NA)
          0   if both bankacc = 0 and bsocacc = 0, or if one is NA and the other is 0
          NA  if both bankacc and bsocacc are NA

Then remove all rows of the remaining data-frame in which any NA values occur.

Your resulting data-frame "debt.edt" should have 355 rows and 11 columns.
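
One possible way to carry out the recoding is sketched below. It relies on the fact that
pmax(..., na.rm = TRUE) returns NA only when every argument is NA, which matches the rule
for "savings". The helper name "recode_savings" and the toy vectors are illustrative
assumptions, not part of the assignment; the second half runs only if package "faraway"
is installed.

```r
# Hypothetical helper (name is an assumption): combine two 0/1 indicators.
# pmax with na.rm = TRUE ignores a single NA and returns NA only if both are NA.
recode_savings <- function(bankacc, bsocacc) {
  pmax(bankacc, bsocacc, na.rm = TRUE)
}

# Toy vectors covering every case of the rule
bank <- c(1, 1, 0, 0, NA, NA, 1, NA)
bsoc <- c(1, 0, 0, NA, 0, NA, NA, 1)
toy_savings <- recode_savings(bank, bsoc)
print(toy_savings)

# Apply the same steps to the real data, if the package is available
if (requireNamespace("faraway", quietly = TRUE)) {
  data(debt, package = "faraway")
  d <- debt[!is.na(debt$prodebt), ]      # drop rows with missing response
  d$ccarduse <- NULL                     # delete the ccarduse column
  d$savings <- recode_savings(d$bankacc, d$bsocacc)
  d$bankacc <- NULL                      # the two source columns are replaced
  d$bsocacc <- NULL                      #   by the single variable savings
  debt.edt <- na.omit(d)                 # remove rows with any remaining NA
  print(dim(debt.edt))                   # compare with the dimensions stated above
}
```

Other approaches (for example nested ifelse calls) would work equally well; check the
dimensions of your result against those stated above.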

PROBLEM: Fit the best linear model you can to this data-frame "debt.edt" with response variable
"prodebt" in terms of the other variables in the data frame. Candidate terms are the
data-frame columns ("main-effect terms") and their pairwise products ("interaction terms"),
with the constraint that an interaction term is included only if both of its main-effect factors
are also included. The criteria for "best model" are "parsimony" (do not include predictors
that are very weak unless they appear in highly significant interactions) and "patternless
residuals". You may generate your model mainly via stepwise model selection, but then check
whether further changes are indicated to remove weak predictors.
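
A minimal sketch of the stepwise workflow follows. It uses the built-in "swiss" data frame
as a stand-in (an assumption made only so the example is self-contained); for the homework
you would substitute "debt.edt" and response "prodebt". The function step() adds or drops
terms by AIC and considers an interaction only when its main effects are already in the
model, so the hierarchy constraint is respected. This is one reasonable starting point,
not the required method.

```r
# Stand-in data: built-in 'swiss', NOT the assignment's debt.edt
dat <- swiss

# Start from the model with all main effects
fit_main <- lm(Fertility ~ ., data = dat)

# Search between the intercept-only model and all pairwise interactions
sel <- step(fit_main,
            scope = list(lower = ~ 1, upper = ~ .^2),
            direction = "both", trace = 0)

# Inspect the selected terms and look for weak predictors to remove by hand
print(formula(sel))
print(drop1(sel, test = "F"))

# Residual check: plot residuals against fitted values and look for patterns
if (interactive()) {
  plot(fitted(sel), resid(sel), xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 2)
}
```

Because AIC tends to retain marginal terms, the drop1() table is worth reading before
settling on a final model.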

(B) Fit a logistic regression model for the proportion of esophageal cancer cases among
cases and controls combined, in terms of the other (categorical) predictor variables in the dataset
"esoph" contained in the standard R "datasets" package. Are any of the pairwise interactions
between the "agegp", "alcgp" and "tobgp" variables significant predictors of esophageal
cancer in these data?
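
As a hedged starting point for (B), the sketch below fits the main-effects binomial model
with a two-column (cases, controls) response and then tests each pairwise interaction
one at a time with a likelihood-ratio (chi-square) test. The object names "fit_main" and
"int_tests" are illustrative assumptions; interpreting the resulting p-values is left to you.

```r
# 'esoph' ships with the standard datasets package
data(esoph)

# Main-effects logistic regression; response is (number of cases, number of controls)
fit_main <- glm(cbind(ncases, ncontrols) ~ agegp + alcgp + tobgp,
                family = binomial, data = esoph)

# Likelihood-ratio test for adding each pairwise interaction separately
int_tests <- add1(fit_main, scope = ~ .^2, test = "Chisq")
print(int_tests)
```

A joint test of all three interactions at once could instead compare fit_main with
update(fit_main, . ~ .^2) using anova(..., test = "Chisq").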