
Introduction to SAS PROCedures: Data Analysis
last updated: 13SEP03

- Let's PROCeed… the Saga!
- Data sets to practice with
- Data description and simple inference
- Multiple Regression
- ANOVA- balanced design
- ANOVA- UNbalanced design
- Repeated Measures
- MANOVA
As you know, SAS has two steps: DATA and PROCs. You are probably
by now familiar with the DATA step, which is discussed in detail in our
Introductory SAS seminars. You should also
refer to Introduction to SAS PROCedures for a brief
listing of the most common procedures used for statistical data analysis.
Data sets to practice with
- Datasets from the book: A Handbook of Small Data Sets, edited by D.J. Hand, et al.
- STATS@uic.edu: Statistics, Data Resources, and Advanced topics
- STATS@uic.edu: SAS II Seminar– Looking ahead and Examples

- Examine variable distributions and tests of normality
PROC UNIVARIATE DATA=mydata NORMAL PLOT ; VAR v1 v2; RUN;
Options:
- NORMAL - runs tests of normality
- PLOT - generates stem-and-leaf, box, and normal probability plots.
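To try the PROC UNIVARIATE call above without a data set at hand, a small temporary file can be simulated first. This is only a sketch: the data values, the seed, and the choice of distributions are assumptions made for illustration, and only the names mydata, v1, and v2 follow the convention used on this page.

```sas
/* Hypothetical practice data: 50 simulated observations (not from the seminar) */
DATA mydata;
  CALL STREAMINIT(2003);          /* fixed seed so the run is repeatable */
  DO i = 1 TO 50;
    v1 = RAND('NORMAL', 10, 2);   /* roughly symmetric variable */
    v2 = RAND('EXPONENTIAL');     /* clearly skewed variable */
    OUTPUT;
  END;
  DROP i;
RUN;

/* NORMAL requests the tests of normality, PLOT the text-based plots */
PROC UNIVARIATE DATA=mydata NORMAL PLOT;
  VAR v1 v2;
RUN;
```

Comparing the output for v1 and v2 should make it easier to see what the normality tests and the normal probability plot look like for a well-behaved and a skewed variable.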
NOTE: These examples assume that you are using temporary data files. For
permanent data files, add the library name to the file names. The
appropriate <options> have been written in to get the
desired results; other options are not mentioned here. For convenience,
SAS syntax is in capital letters and user input in small letters. In some
cases the DATA= option has been omitted, assuming that the
PROC is being
run on the most recently created data set.
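The difference between temporary and permanent files mentioned in the note can be sketched as follows. The libref mylib and the directory 'C:\mydir' are hypothetical placeholders; substitute your own.

```sas
/* 'C:\mydir' and the libref mylib are hypothetical placeholders */
LIBNAME mylib 'C:\mydir';

/* Temporary file: one-level name, stored in the WORK library */
PROC MEANS DATA=mydata;
  VAR v1 v2;
RUN;

/* Permanent file: two-level name of the form libref.dataset */
PROC MEANS DATA=mylib.mydata;
  VAR v1 v2;
RUN;
```

A one-level name such as mydata is shorthand for WORK.mydata, which is deleted when the SAS session ends.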
Data description and simple inference
- Histogram
PROC GCHART; VBAR v1 v2; RUN;
- Scatterplot
PROC GPLOT; PLOT v1*v2; RUN;
PROC CORR DATA=mydata ; VAR v1 v2; RUN;
- Includes setup of 2 symbols to be used
SYMBOL1 V=dot; SYMBOL2 V=triangle; PROC GPLOT; PLOT v1*v2=v3; RUN; Where v3 is a categorical variable.
PROC SORT DATA=mydata; BY v3; RUN; PROC CORR DATA=mydata ; VAR v1 v2; BY v3; RUN;
Multiple Regression
- Create new output file with standardized residuals and Cook's distances.
PROC REG DATA=mydata ; MODEL y1=x1--x10/SELECTION=STEPWISE; OUTPUT OUT=mydatanew PREDICTED=rhat STUDENT=residstd COOKD=cookdist; RUN;
PROC UNIVARIATE DATA=mydatanew NORMAL PLOT; VAR residstd; RUN;
PROC GPLOT DATA=mydatanew ; PLOT residstd * (x1 x2 x3); PLOT residstd * rhat; RUN;
- Plot Cook's distances against observation number; the DATA step creates the observation number to be used for the plot.
DATA plotcook; SET mydatanew; obsnum=_N_ ; RUN; SYMBOL1 I=needle; PROC GPLOT; PLOT cookdist * obsnum; RUN;
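Once cookdist is in mydatanew, the needle plot can be complemented by listing the observations that stand out. A minimal sketch, assuming the output data set from the PROC REG step above: the 4/n cutoff is a common rule of thumb rather than something this page prescribes, and the data set name flagged is hypothetical.

```sas
/* Assumes mydatanew comes from the PROC REG step and contains cookdist */
DATA flagged;
  SET mydatanew NOBS=n;     /* n = number of observations in mydatanew */
  obsnum = _N_;             /* same observation number as in the plot */
  IF cookdist > 4 / n;      /* subsetting IF: keep only the large values */
RUN;

PROC PRINT DATA=flagged;
  VAR obsnum cookdist residstd;
RUN;
```

Observations with a missing cookdist are dropped by the subsetting IF, since a missing value never exceeds the cutoff.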
ANOVA- balanced design
- Includes interaction effects and multiple comparisons of means.
PROC ANOVA DATA=mydata ; CLASS x1 x2; MODEL y1= x1 x2 x1*x2; MEANS x1 x2/SCHEFFE; RUN;
- Examine the dependent variable for each factor, by level
PROC SORT DATA=mydata; BY x1; RUN; PROC UNIVARIATE DATA=mydata PLOT; VAR y1; BY x1; RUN; PROC SORT DATA=mydata; BY x2; RUN; PROC UNIVARIATE DATA=mydata PLOT; VAR y1; BY x2; RUN;
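PROC ANOVA is intended for balanced designs, so it can help to confirm that every x1 by x2 cell has the same number of observations before relying on it. One way to check, sketched here with the same mydata, x1, and x2 names used above:

```sas
/* Cell counts only: suppress row, column, and overall percentages */
PROC FREQ DATA=mydata;
  TABLES x1*x2 / NOROW NOCOL NOPERCENT;
RUN;
```

If the cell counts differ, the design is unbalanced and PROC GLM is the safer choice.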
ANOVA- UNbalanced design
- Order of categorical variables changed to get the desired Type I SS.
PROC GLM; CLASS x1 x2; MODEL y1= x1 x2 x1*x2; RUN; PROC GLM; CLASS x1 x2; MODEL y1= x2 x1 x1*x2; RUN;
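Type I SS are sequential, so they change with the order of the effects in the MODEL statement, which is why the two runs above can differ for unbalanced data. As a sketch, both Type I and Type III SS can be requested explicitly with the SS1 and SS3 options of the MODEL statement; DATA=mydata is an assumption here, since the runs above rely on the most recently created data set.

```sas
PROC GLM DATA=mydata;
  CLASS x1 x2;
  MODEL y1 = x1 x2 x1*x2 / SS1 SS3;   /* Type I and Type III sums of squares */
RUN;
QUIT;                                 /* PROC GLM stays active until QUIT */
```

PROC GLM prints both types by default, so the options mainly make the intent explicit. Unlike Type I SS, Type III SS do not depend on the order in which the effects are listed.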
PROC MEANS DATA=mydata <options> ; VAR <variables> ; RUN;
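As an illustration of the template above, the placeholders might be filled in as follows. The particular statistics chosen and the use of v3 as a grouping variable are assumptions made for the example.

```sas
/* Selected statistics, rounded to two decimals */
PROC MEANS DATA=mydata N MEAN STD MIN MAX MAXDEC=2;
  VAR v1 v2;
RUN;

/* Same statistics within each level of the categorical variable v3 */
PROC MEANS DATA=mydata N MEAN STD MIN MAX MAXDEC=2;
  CLASS v3;
  VAR v1 v2;
RUN;
```

Unlike a BY statement, the CLASS statement does not require the data to be sorted first.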
2003-9-14 VDC: