Example SAS Program to Read the Survey Data Set

The following SAS program can be used to read and list the data values contained in the survey data set:


OPTIONS LINESIZE=72 ;

* The following data are from a survey of the quality ratings of three products. The variable INC is the annual income of each person, expressed in thousands of dollars ;

TITLE 'Product Survey' ;

DATA work.survey ;
     INPUT id 1-3 sex $ 6 age 9-10
           inc 12-13 r1 16 r2 17 r3 18;
     r_tot = r1 + r2 + r3 ;
CARDS ;
  1  F  35 17  722
 17  M  50 14  553
 33  F  45  6  727
 49  M  24 14  757
 65  F  52  9  477
 81  M  44 11  777
  2  F  34 17  653
 18  M  40 14  752
 34  F  47  6  656
 50  M  35 17  575
PROC PRINT ;
RUN ;


SAS programs typically consist of two parts: DATA steps and PROC (procedure) steps. However, a program may consist solely of a DATA step which creates a SAS Data Set for later analysis, or of a single PROC step which accesses an existing SAS Data Set. Each SAS statement ends in a semicolon.

DATA statement: A data step begins with a DATA statement, e.g., "DATA work.survey ;" above. SAS data set names consist of two parts, a library reference name (work above) and a filename (survey) which are separated from one another by a period (.).

INPUT statement: The INPUT statement specifies the names of the variables which are to be read and, optionally, the format used to read these variables.

CARDS statement: The "CARDS;" statement specifies that the data values to be read by the INPUT statement follow immediately. When the "CARDS;" statement is included, the end of the input data is indicated by the first line containing a semicolon (;), and this line is then interpreted as a SAS statement. Data can be read from External Raw Data Files by use of an INFILE statement placed prior to the INPUT statement; do not include a "CARDS;" statement when reading from an external data file.
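As a minimal sketch of the INFILE alternative, the following step reads the same columns from an external file instead of from lines following a CARDS statement. The file name survey.dat is a hypothetical example and is assumed to hold data lines laid out exactly like those in the example program; the way a file name is written depends on your operating system.

```sas
* Sketch only: the file name survey.dat is hypothetical ;
DATA work.survey ;
     INFILE 'survey.dat' ;
     INPUT id 1-3 sex $ 6 age 9-10
           inc 12-13 r1 16 r2 17 r3 18 ;
     r_tot = r1 + r2 + r3 ;
RUN ;

PROC PRINT ;
RUN ;
```

Note that no CARDS statement appears, since the data values do not follow the DATA step.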

Conditional, Computational, and Transformation Statements: The statement "r_tot = r1 + r2 + r3 ;" is an example of a data transformation statement and illustrates how SAS can create new variables from existing ones. This statement directs the SAS program to create a new variable by adding together the values of variables r1, r2, and r3. Data Assignment and Transformation Statements may contain any of the operators and functions recognized by the SAS System. Loop and conditional processing structures are also available.
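The following sketch illustrates a function call, a conditional, and a loop within a single DATA step. It assumes the data set work.survey from the example program already exists; the data set name survey2, the new variable names, and the cutoff values 15 and 7 are arbitrary choices made for illustration only.

```sas
* Sketch only: survey2, r_mean, inc_grp, n_high and the cutoffs are illustrative ;
DATA work.survey2 ;
     SET work.survey ;

     * Function call: average of the three ratings ;
     r_mean = MEAN(r1, r2, r3) ;

     * Conditional processing: classify income ;
     IF inc >= 15 THEN inc_grp = 'high' ;
     ELSE inc_grp = 'low' ;

     * Loop: count how many ratings are 7 or above ;
     ARRAY r{3} r1-r3 ;
     n_high = 0 ;
     DO i = 1 TO 3 ;
          IF r{i} >= 7 THEN n_high = n_high + 1 ;
     END ;
     DROP i ;
RUN ;

PROC PRINT DATA=work.survey2 ;
RUN ;
```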

Note: All data transformation statements must be included within a data step.

PROC statements: SAS procedures begin with the word PROC. PRINT, MEANS, and PLOT are examples of commonly used SAS procedures. Each PROC statement may be followed by additional statements to specify supplemental details of procedure processing. For example, the statements "PROC PLOT ; PLOT inc*age ;" would create a scatter plot with inc as the vertical axis and age as the horizontal axis. Since no supplemental statements appear with the PROC PRINT statement in the example program, the default output for this procedure is generated for all variables in the data set.
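As a sketch of how supplemental statements narrow the default output, the steps below add a VAR statement to PROC PRINT and PROC MEANS so that only the listed variables are processed. The choice of variables is illustrative, and the data set work.survey from the example program is assumed to exist.

```sas
* List only three of the variables ;
PROC PRINT DATA=work.survey ;
     VAR id sex age ;
RUN ;

* Summary statistics for two numeric variables ;
PROC MEANS DATA=work.survey ;
     VAR age inc ;
RUN ;
```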

SAS programs may also include supplemental statements such as OPTIONS, comments, and TITLES as illustrated above.

Changing Default Program Options

The OPTIONS statement provides you with the capability to change default parameters set by the SAS processor. For example, the LINESIZE option specifies the width of printed output; a width of 72 characters creates easy-to-read output files.
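Several options can be set in one OPTIONS statement. In the sketch below, PAGESIZE and NOCENTER are shown alongside LINESIZE purely as an illustration; the values chosen are assumptions, not recommendations from this document.

```sas
* 72 characters per line, 60 lines per page, output left-aligned ;
OPTIONS LINESIZE=72 PAGESIZE=60 NOCENTER ;
```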

Commenting Your Program for Legibility

All text following an asterisk (*) at the start of a statement will be ignored by the SAS processor until a semicolon, which indicates the end of the comment, is encountered. If more than one SAS statement appears on a line, place an asterisk before each of the statements which is to be disregarded.

A comment may also be initiated by placing /* within a line. This type of comment is terminated by placing a */ at the end of the text to be included in the comment. If /* is used to begin a comment, avoid placing the /* in column 1 if the program might ever be submitted to MVS for processing as this pair of characters is interpreted by MVS as the "end-of-job" indicator.
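Both comment styles can be sketched as follows. The statements being commented out are arbitrary examples; note that the /* is deliberately not placed in column 1.

```sas
* This whole statement is a comment and ends at the semicolon ;

* Two statements on one line are disabled and only RUN remains active ;
*PROC PRINT ; *VAR id sex age ; RUN ;

 /* This style of comment can span several lines
    and may contain semicolons ; */
PROC MEANS ;
RUN ;
```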

Including Output Titles

The TITLE statement provides a title for the SAS procedure output. The text of the title should be enclosed in single or double quotation marks. Titles are centered on pages unless the NOCENTER option is in effect. If a single quote is to be included in a title enclosed in single quotes, it should be indicated by two single quotes. Up to 10 title lines (TITLE1 through TITLE10) can be coded per procedure.

A TITLE statement remains in effect until another TITLE statement of the same level or lower numeric value appears. For example, if a new TITLE2 statement were specified in a program that already contains a TITLE1 and a TITLE3 statement, the TITLE1 statement would remain in effect, but the TITLE3 would be canceled. The TITLE1 statement can be abbreviated as TITLE.
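The behavior of title levels can be sketched as follows. The title texts are hypothetical and were chosen only to illustrate the rules; the survey data set from the example program is assumed to be the most recently created data set.

```sas
TITLE1 'Product Survey' ;
TITLE3 'Listing of All Respondents' ;
PROC PRINT ;
RUN ;

* A new TITLE2 keeps TITLE1 in effect but cancels TITLE3 ;
TITLE2 'Respondents'' Summary Statistics' ;
PROC MEANS ;
RUN ;
```

The doubled single quote in the TITLE2 text produces one apostrophe in the printed title.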

Note: If you do not include a title specification, SAS may write 'The SAS System' at the top of the program output as the default title. If you do not wish to have any title at the top of your output, include TITLE; before the first procedure statement in the program.

Defining Input Data

The DATA statement initializes a new data set and is usually one of the first statements in a SAS program. For example, the statement "DATA work.survey ;" assigns the name survey to the current data set, which will be stored in the library named work.

For information on using SAS data sets, see: Using SAS Data Sets.

Note: It is not necessary to give data sets a name. A DATA; statement is sufficient to specify a new data set; however, it is good practice to assign names to data sets so that you can refer to these names in subsequent data and procedure steps. Data Sets are named using the same conventions as for SAS variables. If you are unable to choose descriptive names for your data sets, the names one, two, three, etc., are commonly employed.

In the above example, the libref "work" is a special libref which is preassigned by the SAS System as the library used to store temporary "work" data sets. These data sets are deleted when the current SAS session ends.

Note: Since "work" is the default data library used when no libref is given, it is not necessary to specify the libref "work" if you do not wish to save your SAS data set for a future run. Thus, the following two statements are equivalent; each creates the temporary SAS data set "survey", which is deleted from disk storage when the current SAS session ends:

  • DATA work.survey ;

  • DATA survey ;

    With either of the above, the SASLOG will refer to the data set by the name WORK.SURVEY.

Unless a two-part name (libref and filename separated by a period) with a libref other than work is specified on the DATA statement, the data set is temporary and will be discarded when your SAS session ends. If you wish to use your SAS data sets in a future SAS session, use a libref other than work and use a LIBNAME statement within your SAS program to define the location of the corresponding data library. Use of the LIBNAME statement is discussed in: Using SAS Data Sets.
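A minimal sketch of saving the survey data set for a later session follows. The libref mylib and the directory path are hypothetical; the form of the path depends on your operating system, and the directory is assumed to exist already.

```sas
* Sketch only: the libref mylib and the directory path are hypothetical ;
LIBNAME mylib '/home/user/sasdata' ;

* Copy the temporary data set into the permanent library ;
DATA mylib.survey ;
     SET work.survey ;
RUN ;
```

In a later session, the same LIBNAME statement would make the data set available again under the name mylib.survey.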

The INPUT statement indicates the names of each of the variables used in the analysis and specifies how the data are to be read. In the example program above, the INPUT statement directs the SAS program to read variable id from columns 1 to 3, sex from column 6, age from columns 9 and 10, inc from columns 12 and 13, r1 from column 16, r2 from column 17, and r3 from column 18. The dollar sign $ following variable sex indicates that this variable is a character variable and may contain non-numeric values. For additional information on reading input data and available formats, see: Input Data Specifications.

When several SAS variable names begin with the same combination of alphabetic characters and end in consecutive numbers, a set of variable names can be abbreviated by indicating the range to be included. For example, the set of variables r1 r2 r3 r4 r5 r6 r7 r8 could be specified by r1-r8. In future references, the set of variables sex, age, inc, r1, and r2 in the survey data set could be abbreviated as sex--r2.
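Both kinds of abbreviated variable lists can be sketched with the survey data set, assuming it was created by the example program so that its variables are stored in the order id, sex, age, inc, r1, r2, r3, r_tot.

```sas
* Numbered range: r1-r3 stands for r1 r2 r3 ;
PROC MEANS DATA=work.survey ;
     VAR r1-r3 ;
RUN ;

* Positional range: sex--r2 stands for sex age inc r1 r2 ;
PROC PRINT DATA=work.survey ;
     VAR sex--r2 ;
RUN ;
```

The double-dash form depends on the order in which the variables were defined, so it is used here with PROC PRINT, which accepts the character variable sex.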

The CARDS; statement, placed before your data, indicates that the next lines are to be read as data. When a semicolon is encountered, the SAS processor exits from data input mode and interprets that line (the line containing the semicolon), and all lines which follow, as data specification or procedure statements. Your data should not contain any blank lines or semicolons.

Note: The LINES; statement is equivalent to the CARDS; statement and can be used in its place in most implementations of Version 6 of the SAS System. Although the LINES statement is more meaningful than CARDS, which has become obsolete with the discontinuance of computer punch cards, this documentation will continue to use the CARDS; statement as this usage is consistent with most current documentation from SAS Institute.

Note: If you would like to be able to read data into a SAS program from an external data file, see: Using External Raw Data Files.

Listing Values in a Data Set

SAS uses the word PROC to identify procedures used in data processing. The statement PROC PRINT tells the SAS processor to print the contents of the most recently created data set to the output listing file. Options are available for printing selected variables or observations. PROC PRINT and other SAS data processing procedures are discussed in more detail in Selected SAS Procedures.

Normally a copy of the input data is only printed if you include a PROC PRINT statement in the program. Another way to get a copy of the input data is the LIST statement. The LIST statement is typically placed just before the CARDS statement, and it instructs SAS to print the data lines as they appear after the CARDS statement. The output from the LIST statement is written into the SASLOG.
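The placement of the LIST statement can be sketched as follows. The data set name listdemo is hypothetical, and only the first two data lines of the example program are repeated here to keep the sketch short.

```sas
* Sketch only: the data set name listdemo is hypothetical ;
DATA work.listdemo ;
     INPUT id 1-3 sex $ 6 age 9-10
           inc 12-13 r1 16 r2 17 r3 18 ;
     LIST ;
CARDS ;
  1  F  35 17  722
 17  M  50 14  553
;
RUN ;
```

After the step runs, the two data lines should appear in the SASLOG rather than in the output listing.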

Ending Your Program

When a SAS program is executed interactively, using the Display Manager System, a RUN; statement should be included after the last statement in the program prior to submitting it for execution.

Executing Additional SAS Procedures Using the Survey Data Set

Execute each of the following sets of SAS statements, one RUN statement at a time, and observe the results generated by each:
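The sets of statements below are offered only as a sketch of the kind of steps that could be tried; the choice of procedures and variables is an assumption made for illustration. Each set assumes the data set work.survey from the example program exists in the current session.

```sas
* Frequency counts for the character variable sex ;
PROC FREQ DATA=work.survey ;
     TABLES sex ;
RUN ;

* Sort by sex, then list the observations in groups ;
PROC SORT DATA=work.survey ;
     BY sex ;
RUN ;

PROC PRINT DATA=work.survey ;
     BY sex ;
RUN ;

* Scatter plot of income (vertical) against age (horizontal) ;
PROC PLOT DATA=work.survey ;
     PLOT inc*age ;
RUN ;
QUIT ;
```

The QUIT statement is included because PROC PLOT remains active after a RUN statement when SAS is used interactively.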


Virginia Tech Computing Center--Distributed Information Systems
Last updated: December 12, 1997