Statistics 430 Introduction to Statistical Computing & SAS 
Section 0201, MWF 2, Rm MTH B0421          Fall 2008

Click here for a generic Stat 430 Course Syllabus.

Pointer to new Homework Assignments and old HW solutions

Pointer to Illustrative Scripts

Instructor: Eric Slud, Statistics Program, Math. Dept.

Office:    Mth 2314, x5-5469, email evs@math.umd.edu

Office hours:    M 10-12, Th 1-2

This course is an introduction to statistical and graphical techniques of data analysis and their implementation in the SAS programming language/platform. The emphasis is on data analysis skills. Because one important skill is justifying assumptions and understanding the rationale behind analyses, the course also develops ideas and explains concepts from statistical theory.
 

Prerequisite: Stat 400. The material needed is mostly definitions and concepts, and some basic algebraic manipulations involving probabilities. But later in the course, some understanding of Stat 400 material on distributions of functions of random variables will help you make sense of statistical simulation methods. (That material will be reviewed as needed.)
 

Text: Ronald Cody and Jeffrey Smith, Applied Statistics and the SAS Programming Language,
5th ed., Prentice-Hall, 2006.

Course requirements and Grading:    Most of the work for the course will consist of 8-10 graded Problem Sets. These will involve writing and running small SAS programs and interpreting the sequence of data-analysis operations and outputs.
While you will be permitted to share hints and information concerning SAS programming, the reasoning behind analyses, summaries of them, interpretation of results, and the edited copies you hand in must be exclusively your own work.

In addition, there will be an in-class test toward the end of October, on basics of the SAS language and concepts underlying data-display and statistics in categorical data, two-sample comparisons, and simple linear regression. Finally, there will be a slightly more ambitious data-analysis term project in place of a Final Exam (due Thu., Dec. 18, 4pm).
The course grade will be based on a weighted average of your homework, test, and project grades, with 50% weight on Homework scores (with none dropped) and 25% for the Test and 25% for the Term Project.


MIDTERM TEST

The in-class midterm test is tentatively scheduled for Friday October 27, 2006.   It will cover
material from Chapters 1-3 (omitting sections 3.M-P and R), 5 through 5.F, 6.A-B,
13.A-D,H-I, 14.A-D. Only material covered in class and scripts will be within scope
for the test. I will post a sample test to the web-page around 10/20/06.
NOTE: you can bring one 2-sided notebook sheet to the test as a memory aid.
Except for your notebook sheet, the test is closed-book. You can use a calculator,
but I will not ask for much if any arithmetic.


Click here for Data Analysis Term Project Guidelines.     (Due Date: Thursday December 18, 4pm)


The University of Maryland, College Park has a nationally recognized Code of Academic Integrity, administered by the Student Honor Council. This Code sets standards for academic integrity at Maryland for all undergraduate and graduate students. As a student you are responsible for upholding these standards for this course. It is very important for you to be aware of the consequences of cheating, fabrication, facilitation, and plagiarism. For more information on the Code of Academic Integrity or the Student Honor Council, please click here.


If you need help...

My office hours are (tentatively) Monday 10-12 and Thursday 1-2.  I will often be available at other times too, except on Tuesdays, but please send an e-mail or arrange with me in class for an office appointment.


Homework Assignments and Solutions.

The first problem assignment was given on the Course Outline page. Solutions to
that assignment and selected problem solutions (other than those included in example
Scripts) will be posted to the HWSoln Directory as the term progresses.


           HOMEWORK GUIDELINE

Please remember, for all Homework and Project papers handed in for this course,
the consistent guideline: hand in only as much SAS code as will show that you did
the computations correctly using SAS, and only as much output as is needed to answer
the questions asked and to justify the steps and conclusions you have made. Where
narrative and explanations are requested, edit that output into a coherent narrative.

You will be graded down for handing in lots of extraneous material!



      HANDOUTS   (with many more to come)

(1).    Click SASintro for a step-by-step discussion about how to get started in SAS on
University or other machines.
        By clicking here, you can download free 'X-windows' software that will allow you
to create the X-windows needed to use SAS in your WAM account from a home PC.
NOTE: a student has reminded me that the xlivecd software works only for Windows versions up to and including XP, not Vista. Another approach, which I have just tried and which works very well, is to install the free software Xming; do this by following the instructions here VERY closely.

(2).    Click Plotting for some information about how to generate high-quality plots in SAS.

(3).    Click here for a handout containing a useful list of available SAS functions (of which
the Sample Statistics, Quantile Functions, and Probability & Density functions will be
the most useful in this course).

(4).    Click here to find a copy of the course outline and the first problem assignment.

(5).    I have provided a series of illustrative scripts, including handouts
from class and expanded examples of working SAS programs discussed in class.
Click Scripts to find the directory of text Logs and Scripts of SAS example sessions.

(6).    For a handout discussing the relative interpretability of relative risks and odds
ratios in analyzing two-way frequency-table datasets, click here.

(7).    Click here for a sample test indicating coverage by topics along with some
sample questions. For a current Sample Test, click here.

(8).    A handout giving the theoretical formulas for confidence and prediction
intervals in simple linear (normal-errors) regression can be found here. It contains
justifications and formulas for the calculations SAS does of CLM and CLI confidence
and prediction limits.
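
As a rough guide to what handout (8) covers, the standard textbook forms of these intervals are sketched below. The notation is an assumption of this sketch and may differ from the handout's: the model is Y_i = beta_0 + beta_1 x_i + e_i with independent normal errors of variance sigma^2, b_0 and b_1 are the least-squares estimates, s^2 = SSE/(n-2) is the residual variance estimate, and x_0 is the predictor value of interest. The first interval is for the mean response at x_0 (the CLM limits); the second is for a new individual observation at x_0 (the CLI limits).

```latex
% Notation assumed for this sketch (not taken from the handout):
%   \hat{y}_0 = b_0 + b_1 x_0,   S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2,
%   s^2 = \mathrm{SSE}/(n-2),    t_{n-2,\,1-\alpha/2} = upper t quantile on n-2 df.

% Confidence interval for the mean response at x_0 (CLM):
\hat{y}_0 \;\pm\; t_{n-2,\,1-\alpha/2}\; s\,
  \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}

% Prediction interval for a new observation at x_0 (CLI):
\hat{y}_0 \;\pm\; t_{n-2,\,1-\alpha/2}\; s\,
  \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
```

The only difference between the two is the extra 1 under the square root, which accounts for the variance of the new observation's own error. This is why the CLI limits are always wider than the CLM limits at the same x_0.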


DATA DIRECTORY:  Click here to find a directory of available Datasets.

Throughout the term, additional links will be posted here to various online
data sources and repositories:

  •   UCI Machine Learning Repository containing many datasets with
              challenging structure.
  •   StatLib has subdirectories "data", "jasadata", "disease", and "DASL"
              containing datasets on specialized topics, from methodological
              journal articles, etc.
  •   Examples and datasets from a web-page in a data-analysis course Biostat 510
              taught by Kathy Welch at U of Michigan can be found here (scroll down to
              BioStat 510) or in the corresponding file directory.

  • Important Dates


    © Eric V Slud, Sept. 2, 2008.