For this handout we will examine a dataset that is part of the data collected from “A study of preventive lifestyles and women’s health,” conducted by a group of students in the School of Public Health at the University of Michigan during the 1997 winter term. The study includes 370 women aged 40 to 91 years.
Description of variables:
Variable Name   Description                                       Column Location
IDNUM           Identification number                             1-4
STOPMENS        1= Yes, 2= No, 9= Missing                         5
AGESTOP1        88= NA (haven't stopped), 99= Missing             6-7
NUMPREG1        88= NA (no births), 99= Missing                   8-9
AGEBIRTH        88= NA (no births), 99= Missing                   10-11
MAMFREQ4        1= Every 6 months                                 12
                2= Every year
                3= Every 2 years
                4= Every 5 years
                5= Never
                6= Other
                9= Missing
DOB             01/01/00 to 12/31/57                              13-20
                99/99/99= Missing
EDUC            1= No formal school                               21-22
                2= Grade school
                3= Some high school
                4= High school graduate/Diploma equivalent
                5= Some college education/Associate’s degree
                6= College graduate
                7= Some graduate school
                8= Graduate school or professional degree
                9= Other
                99= Missing
TOTINCOM        1= Less than $10,000                              23
                2= $10,000 to 24,999
                3= $25,000 to 39,999
                4= $40,000 to 54,999
                5= More than $55,000
                8= Don’t know
                9= Missing
SMOKER          1= Yes, 2= No, 9= Missing                         24
WEIGHT1         999= Missing                                      25-27
The yearcutoff option sets the first year of the 100-year window that SAS uses to interpret two-digit years. We set yearcutoff=1900, so the window runs from 1900 to 1999 and a date of birth of 12/21/05 is read as Dec 21, 1905, rather than as Dec 21, 2005 (the default yearcutoff for SAS 9.2 is 1920).
options yearcutoff=1900;
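To see the effect of the option, the short sketch below reads the same two-digit date under two different cutoffs. It is an illustration only and not part of the original handout: the variable name d and the use of the INPUT function with the mmddyy8. informat are choices made for this example.

```sas
/* Illustration only: the same text date read under two YEARCUTOFF values. */
/* The variable name d is hypothetical and used just for this sketch.      */

options yearcutoff=1900;     /* two-digit years fall in 1900-1999 */
data _null_;
   d = input('12/21/05', mmddyy8.);
   put d= date9.;            /* expected in the log: d=21DEC1905 */
run;

options yearcutoff=1920;     /* two-digit years fall in 1920-2019 */
data _null_;
   d = input('12/21/05', mmddyy8.);
   put d= date9.;            /* expected in the log: d=21DEC2005 */
run;

/* Restore the setting used in the rest of this handout. */
options yearcutoff=1900;
```

Because every date of birth in this dataset falls between 1900 and 1957, a cutoff of 1900 keeps all of them in the intended century.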
The data step commands read in the raw data and set up the missing value codes. We set up the missing value code for DOB to be 09/09/99, using a SAS date constant