Introduction to SAS: Part I

These handouts were written by me


Introduction to SAS for Windows/X-Windows: Part I

 VDC. Last update: 01FEB2002


ALL STATISTICAL SEMINARS REQUIRE that you have WINDOWS knowledge or
experience at INTERMEDIATE level.

 

 
 
 

 

 
 
 

Summary

 

  • In this seminar, we try to cover…

    • General overview of the program and options
    • Introduction to PROGRAM EDITOR, LOG, and OUTPUT windows
    • How to use SAS online HELP, ONLINE DOCS, and other help facilities
      like ASSIST
    • How to read in an external text [ASCII] file containing data
    • The elements and purpose of the DATA step
    • The elements and purpose of the PROC step
    • How to write a simple program in the PROGRAM EDITOR window
  • PLATFORM: Windows or UNIX
  • DURATION: Two hours
  • PREREQUISITE: Introduction to Windows
  • HANDS ON: Yes. Example files are located in the network directory

    n:\clasfile\stats\.


 

 
 
 

Fundamental SAS concepts and terminology

 

  • What is SAS?

    SAS is a programming language with powerful capabilities for
    data manipulation, statistical analysis, report writing, and generating
    plots.
    Once learned, it is quite easy to transfer what you have learned to
    any other platform where SAS is available, because the program looks and
    works identically irrespective of the platform.
  • How do I run SAS?

    SAS can be invoked in any one of the following ways…

    • Interactively

      Using SAS interactively means that you run SAS and submit statements
      directly to the program. SAS returns output and messages as you work.
      Under MS Windows, this now also means that you are running the *full
      graphical* version of the program. In the past, you could run SAS
      interactively via SAS/ASSIST or via the SAS Display Manager.

      • SAS/ASSIST is a menu-driven help environment originally
        developed for completing tasks on
        operating systems that did not support graphical interfaces.
        It is still available in current versions of SAS, but it is quite
        archaic compared to the Windows environment options and pulldown menus.
        When you call ASSIST, you are presented with a set of icons that
        let you complete tasks with a point-and-click approach.
      • Using the SAS “Display Manager System” (DMS).

        SAS Display Manager is the term for the original interactive windowing
        environment that SAS developed for text-based operating systems. Under MS
        Windows, this concept is largely obsolete.
        The Display Manager System was SAS's original response to the need for an
        icon-guided system on older platforms that only used text displays
        [e.g. UNIX, CMS, MVS, etc.]. Now that personal computers use graphical
        displays, understanding how the DMS works only becomes critical on those
        occasions when a graphical display is not
        available. Under UNIX, this is a practical way of
        using SAS without a graphical interface (see SAS under UNIX
        below), but access to this display depends on the version of SAS you are
        using. The Display Manager is not
        available in UNIX for SAS V8.

        The Display Manager is now becoming an obsolete term, and it should be
        equated with the windows and pulldown menus available under the MS Windows
        environment. Such pulldown menus are quite practical for completing
        simple tasks and taking advantage of new features of the SAS
        system for version 8 or higher.

    • Non-interactive or “batch” mode

      The term “batch processing” has its origins in prompt-based operating
      systems, where a job was submitted to a program and a background process
      completed the task, yielding some output.
      With this method, you first edit your code with any text
      [ASCII] editor, and then submit the
      code by invoking SAS at the prompt of your operating system. For
      example, if I have a file named mycode.sas, I would write
      at the DOS prompt: sas c:\datatemp\mycode.sas. A set of
      options is available to you when you submit the code at
      the prompt of your operating system. This method is available under both
      UNIX and MS Windows, and it bypasses the graphical
      version of the program. It is suggested if the graphical version is not
      available, or if the program you are submitting is so demanding of
      operating system resources that you are better
      off not opening the SAS windowing environment itself. Note: If you have a visual
      impairment that requires you to use a speech synthesizer, this is your
      best option for interacting with SAS.

      With this method, SAS
      runs as a background process and, when finished, writes
      log and output files to the current directory or to the location
      specified in the submitted code.
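
      As a sketch of how a submitted program can control where its log and
      output end up, the hypothetical file mycode.sas below uses PROC PRINTTO
      to route both to explicit files. The directory c:\datatemp and the
      output file names are assumptions made for illustration.

      ```sas
      /* mycode.sas: hypothetical batch program                      */
      /* Route the log and the procedure output to explicit files.   */
      /* NEW overwrites the files instead of appending to them.      */
      proc printto log='c:\datatemp\mycode.log'
                   print='c:\datatemp\mycode.lst' new;
      run;

      data squares;            /* temporary dataset in WORK */
        do x = 1 to 5;
          y = x**2;
          output;
        end;
      run;

      proc print data=squares;
      run;

      /* Restore the default log and output destinations */
      proc printto;
      run;
      ```

      After the job finishes, the log file is the first place to look for
      errors, since a batch run shows no messages on screen.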

  • Fundamental Concepts

    • SAS Library

      A SAS library is the physical location where files are kept, that is,
      the specific directory where files are saved. The default
      SAS library is WORK. You can define other libraries with
      the LIBNAME statement, which assigns a nickname, or LIBREF, to the library
      you create. How you identify the directory location
      depends on the platform you are using.
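
      A minimal sketch of assigning a library, assuming the Windows directory
      c:\temp exists. The libref mylib is a hypothetical name chosen for
      illustration; under UNIX only the quoted path would change.

      ```sas
      /* Assign the libref MYLIB to a directory (path is an assumption) */
      libname mylib 'c:\temp';

      /* Under UNIX the same statement might look like this:            */
      /* libname mylib '~/sasdata';                                     */

      /* List the SAS files stored in the library; NODS suppresses      */
      /* the variable-level details of each dataset                     */
      proc contents data=mylib._all_ nods;
      run;

      /* Release the libref when it is no longer needed */
      libname mylib clear;
      ```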

    • SAS dataset vs. text data files and portable files

      A SAS dataset is a set of data written out by the SAS program.
      It is
      written in a binary format and cannot be read
      by a text editor; if you try to open it with a text editor, you
      will see unreadable characters. A SAS dataset takes
      less disk space and is easier for SAS to read in and manipulate.

      A data file is written as an ASCII text file, which can be read by any
      text editor. When you save a SAS dataset as a portable file, SAS
      strips out all platform-related information and writes the descriptor
      portion of the data and the data itself to a file that can be moved
      between platforms.
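
      One way to produce such a portable file is the XPORT engine on the
      LIBNAME statement, sketched below. The file path, the libref xout, and
      the small dataset mydata are assumptions made for illustration.

      ```sas
      /* A small temporary dataset to export */
      data mydata;
        input id score;
        datalines;
      1 90
      2 85
      3 77
      ;
      run;

      /* Point a libref at a transport file through the XPORT engine */
      libname xout xport 'c:\temp\mydata.xpt';

      /* Copy the dataset from WORK into the transport file */
      proc copy in=work out=xout memtype=data;
        select mydata;
      run;

      libname xout clear;
      ```

      On the receiving platform, the same engine reads the file back: assign
      a libref to the transport file with XPORT and copy from it into a
      regular library.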

    • Permanent vs. Temporary Datasets

      A temporary dataset exists only as long as the current
      SAS session is running. A temporary dataset has a one-level name.
      It is discarded the moment you close the SAS
      session, and the variables and data it contains will NOT be saved.
      An example is the dataset named:

      mydata

      A permanent dataset is a SAS dataset which is SAVED and can be used in later
      sessions. A permanent dataset has a two-level
      name, which consists of a library nickname
      [LIBREF] and a file name [FILENAME]. You create a permanent dataset by
      using the syntax:
      libref.filename

      For example, if I submit the command:

      libname purple "a:/";

      Then I can save the dataset mydata as:

      purple.mydata
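
      The difference between the two kinds of names can be sketched as
      follows. The example reuses the libref purple from above and assumes
      the a: drive is available; the variable x is made up for illustration.

      ```sas
      libname purple "a:/";

      /* One-level name: temporary, stored in WORK, gone at session end */
      data mydata;
        x = 1;
      run;

      /* Two-level name: permanent, stored in the directory behind PURPLE */
      data purple.mydata;
        set mydata;      /* copy the temporary dataset */
      run;

      /* WORK.MYDATA is the explicit two-level name of the temporary copy */
      proc print data=work.mydata;
      run;
      ```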

     

     
     
     

    SAS versions: 6.1x vs 8.x

     

     

     
     
     

    SAS Programming: DATA and PROCedures

     

    • DATA vs. PROC step

      SAS has two types of steps: DATA and PROCedure steps. The DATA step is
      used to perform data and variable manipulation, database
      management, and programming. The PROC step runs a prepackaged procedure
      on a SAS dataset, for example to print, summarize, or analyze it. Each
      type of step has its own set of options and
      statements, which make up the syntax submitted to
      SAS for processing.
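
      A small sketch of the two kinds of steps working together. The dataset
      sales and its variables are invented for illustration.

      ```sas
      /* DATA step: build a dataset and derive a new variable */
      data sales;
        input item $ price qty;
        total = price * qty;      /* computed for every observation */
        datalines;
      pens 1.50 10
      ink 4.25 3
      paper 6.00 2
      ;
      run;

      /* PROC step: report on the dataset the DATA step created */
      proc means data=sales n mean sum;
        var total;
      run;
      ```

      Each step ends at its RUN statement, so the PROC step only starts once
      the DATA step has finished writing the dataset.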

    • INFILE vs. CARDS or DATALINES vs. SET statement

      The INFILE statement is used to read in text files [ASCII].
      You need to know how the file is organized.
      For example, on a personal computer you would
      write something like,

      
      INFILE "c:\sas612\sasuser\mydata.dat";
      

      Under UNIX, you might write something like:

      INFILE "~netid/sasdata/mydata.dat";
      


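      If you include your data as part of your DATA step, you will use the CARDS
      or the DATALINES statement. The semicolon that ends the data must sit on
      a line of its own; a data line that contains a semicolon is not read.

      CARDS;
      1 2 3
      2 3 4
      3 4 5
      ;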
      

      or…


      DATALINES;
      1 2 3
      2 3 4
      3 4 5
      ;
      

      The SET statement is only used to read in SAS datasets. It cannot
      be used to read in any file format other than SAS.
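
      As a sketch of the SET statement, the program below first builds a SAS
      dataset from in-stream data and then reads it back with SET. The
      dataset and variable names are assumptions made for illustration.

      ```sas
      /* Read raw values that are part of the program itself */
      data scores;
        input id test1 test2;
        datalines;
      1 2 3
      2 3 4
      3 4 5
      ;
      run;

      /* SET reads an existing SAS dataset, one observation at a time */
      data scores2;
        set scores;
        avg = mean(test1, test2);   /* add a variable to each observation */
      run;

      proc print data=scores2;
      run;
      ```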

    • INPUT and INFORMATS

      The INPUT statement names
      the variables SAS is going to read in and gives their location or order.
      You can also control how SAS reads each variable. For this, you use
      an INFORMAT, which is a format applied to a variable as it is read in.
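
      A sketch of informats at work, using made-up staff records. The colon
      modifier tells SAS to read the next blank-delimited value with the
      informat that follows, so the example does not depend on exact column
      positions.

      ```sas
      data staff;
        /* name:   character value                              */
        /* hired:  DATE9. turns 15JAN2001 into a SAS date value */
        /* salary: COMMA7. removes the comma while reading      */
        input name $ hired :date9. salary :comma7.;
        format hired date9. salary comma7.;   /* display formats */
        datalines;
      Ana 15JAN2001 41,500
      Luis 03MAR1999 38,250
      ;
      run;

      proc print data=staff;
      run;
      ```

      Without the FORMAT statement the dates would print as plain numbers,
      because SAS stores a date as a count of days.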

     

     
     
     

    Elementary DATA step and PROC step: Your first SAS program!

     

    • setup.sas

      libname cdir "c:\temp\";
      libname ddata "d:\sasdata\";
      libname dv8 "d:\sasdata\courses\v8enhance\";
      libname dv6data "d:\sasdata\courses\v8enhance\v6data\";
      libname dv8data "d:\sasdata\courses\v8enhance\v8data\";

    • We are going to use the data file ex1.dat, which is available at
      n:\clasfile\stats\ex1.dat and which contains the data we will use
      throughout our SAS seminars.

      libname adir 'a:\';
      libname cdir 'c:\temp\';
       /* if I wanted this to be a temporary dataset
      I would use just the filename test1 */
      
      data cdir.test1;  /* this creates a permanent file */
      infile 'n:\clasfile\stats\ex1.dat';
      input  region $ 1-2 citysize $ 4-4 pop 6-12 product $ 14-17
      saletype $ 19-19 quantity 21-23 amount 26-34;
      run;
      
      proc print data=cdir.test1 noobs;  /* NOOBS suppresses the Obs column */
      title "Seminar 601: Introduction to SAS";
      run;
      
      proc means data=cdir.test1;
      var pop quantity amount;
      run;
      

     

     
     
     

    Some elementary PROCedures.

     

      PROC PRINT      -Prints the observations in a SAS dataset.
      PROC PRINTTO    -Redirects the log or standard print file.
      PROC SUMMARY    -Computes descriptive statistics and frequencies
                       [similar to MEANS].
      PROC TABULATE   -Prints tables of descriptive statistics and frequencies.
      PROC CONTENTS   -Describes the contents of a SAS dataset.
      PROC CORR       -Computes correlations between variables.
      PROC FORMAT     -Defines and prints formats and informats.
      PROC SORT       -Sorts observations in a SAS dataset.
      PROC FREQ       -Prints tables of frequencies.
      PROC MEANS      -Equivalent to the SUMMARY proc with the PRINT option.
      PROC UNIVARIATE -Computes detailed descriptive statistics for
                       individual variables.
      PROC PLOT       -Produces scatter graphs using text characters.
      PROC TRANSPOSE  -Transposes datasets; converts observations to variables
                       and variables to observations.
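
      A few of these procedures can be tried on the dataset cdir.test1
      created by the first program above. This sketch assumes the libref
      cdir is still assigned and that the dataset has the variables named in
      that program's INPUT statement; the dataset name sorted is made up.

      ```sas
      /* Describe the variables stored in the dataset */
      proc contents data=cdir.test1;
      run;

      /* Sort by region; OUT= keeps the original dataset unchanged */
      proc sort data=cdir.test1 out=sorted;
        by region;
      run;

      /* Frequency table of sale type within each region */
      proc freq data=sorted;
        by region;
        tables saletype;
      run;

      /* Correlation between quantity and amount */
      proc corr data=sorted;
        var quantity amount;
      run;
      ```

      The BY statement in PROC FREQ requires the data to be sorted by region
      first, which is why PROC SORT comes before it.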
      

     

     
     
     

    Where else can I go for more information?

     

     

     


      2004-8-3  VDC: mailboxWWWSTATS@uic.edu

