This handout presents several optional SAS program statements that may be used to further define your data set. Special attention is given to statements that allow the user to create new variables, modify the values of existing variables, and control the manipulation of individual observations in the data set.

**Part 1: Assignment Statements**

Assignment statements are used to create new variables and modify values of existing variables. These statements must be used in the DATA step. In the event, that you need to create/modify variables during a procedural step you must first interject a new DATA step in the SAS program.

Form: variable = expression;

An expression is a sequence of variable names, constants, and possibly function names linked together by operators. When an assignment statement is executed, the expression is evaluated and the result of the expression is assigned to the variable.

Example 1: To create a new variable

DATA Census90;

INPUT State $ Pop Area;

PPrcnt = Pop * 100 / Area;

CARDS;

NC 5.062 48985

Example 2: To modify an existing variable

DATA Census90;

INPUT State $ Pop Area;

Pop = Pop * 100 / Area;

CARDS;

NC 5.062 48985

Result: Example 1, creates a new variable called, ‘PPrcnt’ representing population density. The data set will have four variables: State, Pop, Area, and PPrcnt. Example 2, creates the same concept (density) but replaces the original value of the population variable. The data set in the second example will only contain three variables: State, Pop, and Area. And Pop will now have the modified values. The original data for the POP variable is now inaccessible in the data set.

**Part 2: Arithmetic Operators**

The prioritized list of the arithmetic operators recognized by SAS is shown below. The priority of the evaluation can be changed by the user by incorporating an expression in parentheses.

Operators Expressions

** Exponentiation A = B**2

* Multiplication A = B*C

/ Division A = B/C

+ Addition A = B+C

– Subtraction A = B-C

Example: An expression performing a multiplication before an exponentiation

A = B ** (2*C)

**Part 3: SAS Functions**

Functions can also appear in an expression. A SAS function is an instruction to perform a given series of calculations on an expression. A large library of functions in several categories including mathematics, statistics, character, and data and time is available.

Form: Keyword(argument1, argument2, … );

Where keyword names the function and the argument(s) or expression are enclosed in parentheses. Multiple arguments are separated by commas.

Example: ANS = 2 * LOG(X+Y);

**Part 4: IF/THEN/ELSE Statements**

Conditional execution of data step program statements is implemented using the IF/THEN/ELSE statements.

Form: IF expression THEN statement;

ELSE statement;

Observe that IF/THEN and ELSE are two separate SAS statements. Each time the IF statement is executed the expression following the IF is evaluated. When the expression is true for the observation, the statement following the THEN is executed. The ELSE statement, which is optional, can be used to control a specific action if the IF condition is false. In the event an ELSE statement is not specified no action is taken if the expression is false.

Comparison Operators

The following is a list of comparison operators that can be used to express a relationship between two quantities in an expression following the IF.

= or EQ equal to

^= or NE not equal to

<> or NE not equal to

> or GT greater than

< or LT less than

>= or GE greater than equal to

<= or LE less than equal to

^< or NL not less than

^> or NG not greater than

Example: DATA Census90;

INPUT State $ Pop Area;

IF State = ‘NC’ THEN Pop = Pop * 1000;

Result: The POP values are modified only for the state of NC.

Logical Operators

Expression containing comparison operators usually include logical operators. The following is a list of possible logical operators.

OR Execute statement if either comparison operator is true.

& or AND Execute the statement only if both comparison operators are true.

^^ or NOT If the comparison operator is false then the result of the logical operator is true or vice versa.

Example: DATA Census90;

INPUT State $ Pop Area;

IF State = ‘NC’ OR state = ‘SC’ THEN Pop = Pop * 1000;

ELSE Pop = Pop * 10000;

Result: Pop values are multiplied by 1000 for the states of SC and NC. All other states have their POP values multiplied by 10000.

IN Operator

The IN operator facilitates making comparisons to a list of items. For instance the Logical Operator example is rewritten as follows:

Example: DATA Census90;

INPUT State $ Pop Area;

IF State IN (‘NC’, ‘SC’) THEN Pop = Pop * 1000;

ELSE Pop = Pop * 10000;

**Part 5: The Subsetting IF And IF/THEN DELETE Statements**

These statements control which observations are written to a SAS data set. A subsetting IF statement controls which observations are included in the data set. The IF/THEN DELETE statement controls which observations are deleted from the data set.

Form: The general form for a subsetting IF is:

IF expression;

The general form for an IF/THEN DELETE statement is:

IF expression THEN DELETE;

Example 1: For subsetting IF

DATA se;

INPUT State $ Pop Area Region $;

IF Region = ‘SE';

Example 2: For IF/THEN DELETE

DATA NotSe;

INPUT State $ Pop Area Region $;

IF Region = ‘SE’ THEN DELETE;

Result: Example 1, will select only those states in the SE region. Example 2, will select only those states that are NOT is the SE region. That is, all observations with a region code of SE will be deleted.

**Part 6: WHERE Processing**

The WHERE statement is used to select a subset of observations from an existing SAS data set that satisfies one or more conditions. WHERE processing is very similar to IF statement processing.

Form: WHERE expression;

The word expression is as defined in part 4, & 5. However, there are several special WHERE statement operators that may also be used in the expression. These are stated and explained below.

Special WHERE Operators

The five new operators are:

BETWEEN – AND

CONTAINS or ?

LIKE

NULL or IS MISSING

=*

The following data set is defined for the purpose of exemplifying the form and uses of the WHERE operators.

ID NAME MILES CITY

WD2327 Masters 60000 Dallus

WD8734 Morris 27000 Boston

WD6743 Ashton 32000 Dallas

. Cannon 18000 Seattle

WD0354 Cash 75000 Chicago

BETWEEN-AND

Allow the user to select observations based on a range of variable values.

Example: WHERE Miles BETWEEN 30000 AND 60000;

Result: Only observation 1, and 3 will be selected.

CONTAINS or ?

Only those observations are selected that contain a specified character string.

Example: WHERE Name CONTAINS ‘ASH';

Result: The new data set will have only observations 3, and 5.

LIKE

Select only those observations that satisfy a pattern matching criteria.

Example: WHERE Name LIKE ‘M%';

Result: Only observation 1, and 2 are selected.

Example: WHERE Name LIKE ‘M_R%';

Result: Only observation 2 is selected.

IS NULL or IS MISSING

Select observations for which the value of the variable is missing or null.

Example: WHERE ffid IS MISSING;

Result: Only observation 4 is selected

=* (Sounds-Like Operator)

Select only those observations that contain a spelling variation of the word or words specified in the WHERE expression.

Example: WHERE city =* ‘DALLAS';

Result: Observations 1 and 3 are selected.