Analysis of variance and regression, November 13, 2007: SAS language
SAS language
- The SAS environments
- Reading in, data-step
- Summary statistics
- Subsetting data
- More on reading in, missing values
- Combination of data sets
Lene Theil Skovgaard
Dept. of Biostatistics, Institute of Public Health, University of Copenhagen
e-mail: L.T.Skovgaard@biostat.ku.dk
http://staff.pubhealth.ku.dk/~lts/regression07_2
SAS language, November 2007 1
SAS exercises on this course
- Two teachers to help you
- Private user names and passwords!
- Two students share each machine
- Many of you know SAS ANALYST from the course on basic statistics, but here we focus on the SAS language
- References:
– Aa. T. Andersen, T.V. Bedsted, M. Feilberg, R.B. Jakobsen and A. Milhøj: Elementær indføring i SAS. Akademisk Forlag, 2002 (in Danish).
– Aa. T. Andersen, M. Feilberg, R.B. Jakobsen and A. Milhøj: Statistik med SAS. Akademisk Forlag, 2002 (in Danish).
– R.P. Cody and J.K. Smith: Applied statistics and the SAS programming language. 4th ed., Prentice Hall, 1997.
Menus vs. Language
- Menus
+ No learning by heart
+ No syntax errors
+ Stepwise learning
− Inflexible
− A bit hard to find your way around
− Does not contain everything
− Tedious in the long run
Menus vs. Language
- Language:
− Some learning by heart
− Many syntax errors in the beginning
+ Logically coherent
+ Reproducible
+ Easier to document
+ Easier to communicate
Basic structure
- SAS Core
– Database system (“Engine”)
– Programming language
- SAS Base
– Data manipulation: DATA, SORT, PRINT, (PLOT)
– Minimal statistics: MEANS, UNIVARIATE, TABULATE
- Special modules
– SAS/STAT: TTEST, GLM, GENMOD, etc.
– SAS/GRAPH: GPLOT
– SAS/ASSIST, QC, ETS, FSP, IML, ...
– SAS ANALYST
– SAS Enterprise
SAS in a nutshell
Program + (raw data or data file) → (data file or output) + log
- Batch SAS:
– *.sas Program file
– *.log Log file
– *.lst Output file
- SAS Display Manager: environment for program development and data handling
– Program editor: common or enhanced
– Output window
– Log window
– Graphics window
– Explorer, Viewtable, Toolbar, Results
Note: Program code must be saved
Example
O’Neill et al. (1983): Lung function for 25 patients with cystic fibrosis.
Some of these data may be found in the text file T:\pemax.txt (created using e.g. Wordpad):
age sex height weight fev1 pemax
7 1 109 13.1 32 95
7 2 112 12.9 19 85
8 1 124 14.1 22 100
8 2 125 16.2 41 85
8 1 127 21.5 52 95
9 1 130 17.5 44 80
11 2 139 30.7 28 65
12 2 150 28.4 18 110
12 1 146 25.1 24 70
13 2 155 31.5 23 95
13 1 156 39.9 39 110
14 2 153 42.1 26 90
14 1 160 45.6 45 100
15 2 158 51.2 45 80
16 2 160 35.9 31 134
17 2 153 34.8 29 134
17 1 174 44.7 49 165
17 2 176 60.1 29 120
17 1 171 42.6 38 130
19 2 156 37.2 21 85
19 1 174 54.6 37 85
20 1 178 64.0 34 160
23 1 180 73.8 57 165
23 1 175 51.1 33 95
23 1 179 71.5 52 195
Reading in data (more later on...)
data sasuser.pemax;
  infile 'T:\pemax.txt' firstobs=2;
  input age sex height weight fev1 pemax;
run;

To execute the program, we click on the 'running man' icon, and then we look at the log file:

NOTE: 25 records were read from the infile 'pemax.txt'.
      The minimum record length was 21.
      The maximum record length was 21.
NOTE: The data set SASUSER.PEMAX has 25 observations and 6 variables.
NOTE: DATA statement used:
      real time 0.11 seconds
      cpu time 0.01 seconds
No output
What if it did not work as intended?
1. Find out why!
2. Correct
3. Try again
SAS is executed sequentially. If we want to add something, we can just do it later.
Recall commands
- When a program bit has been executed, it may sometimes disappear from the program editor
- Earlier bits may be recovered using F4
- Note that the bits accumulate: if you use F4 several times, you will get the previous bits one after another
Definition of new variables, transformation
We want to study body mass index, bmi:
data sasuser.pemax;
  infile 'T:\pemax.txt' firstobs=2;
  input age sex height weight fev1 pemax;
  bmi=weight/(height/100)**2;
run;
proc print data=sasuser.pemax;
run;

Obs age sex height weight fev1 pemax bmi
1 7 1 109 13.1 32 95 11.0260
2 7 2 112 12.9 19 85 10.2838
3 8 1 124 14.1 22 100 9.1701
Transformations
- Arithmetic
– The usual operators: + - * /
– Raising to a power: **, e.g. x**2
– Square root: sqrt(x)
– Logarithms: log(x) (natural logarithm), log10(x), log2(x)
All logarithms are proportional, e.g. log2(x) = log(x)/log(2)
- Relations:
= < > <= >= ^= (unequal)
eq lt gt le ge ne (alternative notation)
Note: <> means unequal only in WHERE expressions; in DATA step expressions it is the MAX operator, so use ne or ^= there.
- Logical operators:
and
or
not
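The operators above can be tried out in a small DATA step. This is only a sketch: the data set name transf, the variable names and the three values of x are made up for the illustration.

```sas
/* Sketch only: a temporary data set with made-up values */
data transf;
  input x;
  sq    = x**2;               /* raising to a power */
  root  = sqrt(x);            /* square root */
  ln    = log(x);             /* natural logarithm */
  lg2   = log2(x);            /* base-2 logarithm */
  check = log(x)/log(2);      /* equals lg2 up to rounding */
  big   = (x ge 4 and not (x eq 16));  /* 1 if true, 0 if false */
  datalines;
1
4
16
;
run;

proc print data=transf;
run;
```

A comparison in parentheses evaluates to 1 (true) or 0 (false), so big can be used directly as an indicator variable.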
Other types of variable definitions
data sasuser.pemax;
  infile 'T:\pemax.txt' firstobs=2;
  input age sex height weight fev1 pemax;
  bmi=weight/(height/100)**2; /* bmi must be defined before it is used below */
  length csex $ 6; /* in order to avoid truncation */
  if sex=1 then csex='male';
  if sex=2 then csex='female';
  fat=(bmi>18);
run;
proc print data=sasuser.pemax;
  var csex age bmi fat;
run;

Obs csex age bmi fat
1 female 7 11.0260 0
2 male 7 10.2838 0
. . . . .
. . . . .
14 male 15 20.5095 1
Ingredients of a DATA step
- Specification line (name of new data set)
- Data source (here: read from file)
- Variables to read in
- Possible calculations
- Possible redefinitions
- To be concluded with run;
Variables
- The columns in a data set
- May be numerical variables (contain numbers)
- or character variables (contain text strings, letters)
- Values of a character variable are enclosed in quotation marks, e.g. 'male' (except in data files)
- Period (.) denotes a missing value for a numerical variable
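A missing value carries over into calculations. The sketch below uses a made-up two-line data set (the name miss is hypothetical) to show that bmi becomes missing whenever height or weight is missing.

```sas
/* Sketch only: made-up values, second line has a missing height */
data miss;
  input height weight;
  bmi = weight/(height/100)**2;   /* missing if height or weight is missing */
  datalines;
109 13.1
.   12.9
;
run;

proc print data=miss;
run;
```

The log will contain a note saying that missing values were generated as a result of performing an operation on missing values.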
Variable names
- SAS does not care about upper/lower case (SEX, sex and Sex refer to the same variable)
- Names may be up to 32 characters long (previously only 8)
- Names may contain English letters, digits and underscore (_)
- but they are not allowed to start with a digit
Calculation of summary statistics in SAS
proc means data=sasuser.pemax;
run;

The MEANS Procedure
Variable   N          Mean       Std Dev       Minimum       Maximum
age       25    14.4800000     5.0589854     7.0000000    23.0000000
sex       25     1.4400000     0.5066228     1.0000000     2.0000000
fev1      25    34.7200000    11.1971723    18.0000000    57.0000000
pemax     25   109.1200000    33.4369058    65.0000000   195.0000000
bmi       25    15.3422331     3.8633242     9.1701353    22.7777778

- These statistics are the default; others may be chosen as options
From Help pages:
/* Some of the keywords available with PROC MEANS:
   N        - number of observations
   MEAN     - mean value
   MIN      - minimum value
   MAX      - maximum value
   SUM      - total of values
   NMISS    - number of missing values
   MAXDEC=n - set maximum number of decimal places */

statistic-keyword(s)
specifies which statistics to compute and the order to display them in the output. The available keywords in the PROC statement are

Descriptive statistic keywords
CLM              RANGE
CSS              SKEWNESS|SKEW
CV               STDDEV|STD
KURTOSIS|KURT    STDERR
LCLM             SUM
MAX              SUMWGT
MEAN             UCLM
MIN              USS
N                VAR
NMISS

Quantile statistic keywords
MEDIAN|P50       Q3|P75
P1               P90
P5               P95
P10              P99
Q1|P25           QRANGE

Hypothesis testing keywords
PROBT            T
If we want to see the medians:
proc means data=sasuser.pemax median;
  var age bmi fev1;
run;

The MEANS Procedure
Variable        Median
age         14.0000000
bmi         14.8660771
fev1        33.0000000

- Oops: now we got only the median!
proc means data=sasuser.pemax N mean median;
  var age bmi fev1;
run;

The MEANS Procedure
Variable   N          Mean        Median
age       25    14.4800000    14.0000000
bmi       25    15.3422331    14.8660771
fev1      25    34.7200000    33.0000000
Sorting the data
- Often used because other procedures demand this
- Example:
proc sort data=sasuser.pemax out=sorted_pemax;
  by sex descending weight;
run;
- If out=xxx is omitted, the original data set will be replaced by the sorted data.
- Note the option DESCENDING in front of weight
BY statement
- may be found in many procedures (MEANS, REG, GLM, ...)
- performs the analyses within each group separately
- demands sorted data
proc sort data=sasuser.pemax;
  by sex;
run;
proc means;
  where sex ne .;
  by sex;
run;
- Remember to exclude missing values, otherwise they will form a separate group
Output data set
proc sort data=sasuser.pemax;
  by sex;
run;
proc means noprint data=sasuser.pemax mean;
  by sex;
  var age bmi fev1;
  output out=summa mean=mage mbmi mfev1;
run;
proc print data=summa; /* a temporary data set */
run;
Obs sex _TYPE_ _FREQ_ mage mbmi mfev1
1 0 0 14 15.2143 15.6578 39.8571
2 1 0 11 13.5455 14.9406 28.1818
Alternative procedure: UNIVARIATE
proc univariate data=sasuser.pemax normal;
  var bmi;
run;

A lot of output... (shown on the next slide)
Several tests for normality are produced by the option normal:

Tests for Normality
Test                  --Statistic---    -----p Value-----
Shapiro-Wilk          W     0.967049    Pr < W      0.5715
Kolmogorov-Smirnov    D     0.069046    Pr > D     >0.1500
Cramer-von Mises      W-Sq  0.025458    Pr > W-Sq  >0.2500
Anderson-Darling      A-Sq  0.217215    Pr > A-Sq  >0.2500
The UNIVARIATE Procedure
Variable: bmi

Moments
N                  25           Sum Weights        25
Mean               15.3422331   Sum Observations   383.555827
Std Deviation      3.86332415   Variance           14.9252735
Skewness           0.27214922   Kurtosis           -0.7599282
Uncorrected SS     6242.80947   Corrected SS       358.206564
Coeff Variation    25.1809768   Std Error Mean     0.77266483

Basic Statistical Measures
Location               Variability
Mean     15.34223      Std Deviation          3.86332
Median   14.86608      Variance               14.92527
Mode     .             Range                  13.60764
                       Interquartile Range    5.36231

Tests for Location: Mu0=0
Test           -Statistic-     -----p Value------
Student's t    t   19.85626    Pr > |t|    <.0001
Sign           M   12.5        Pr >= |M|   <.0001
Signed Rank    S   162.5       Pr >= |S|   <.0001

Quantiles (Definition 5)
Quantile      Estimate
100% Max      22.77778
99%           22.77778
95%           22.31516
90%           20.50953
75% Q3        17.98454
50% Median    14.86608
25% Q1        12.62222
10%           10.35503
5%            10.28380
1%             9.17014
0% Min         9.17014

Extreme Observations
------Lowest------    -----Highest-----
Value       Obs       Value       Obs
 9.17014      3       19.4021      18
10.28380      2       20.1995      22
10.35503      6       20.5095      14
10.36800      4       22.3152      25
11.02601      1       22.7778      23
Tables
One-way tables:
proc freq;
  tables csex;
run;

                                 Cumulative    Cumulative
csex      Frequency    Percent    Frequency       Percent
female           14      56.00           14         56.00
male             11      44.00           25        100.00

Two-way tables (cross tabulation):
proc freq;
  tables csex*fat / nopercent nocol;
run;

Table of csex by fat
csex      fat
Frequency|
Row Pct  |       0|       1|  Total
---------+--------+--------+
female   |     10 |      4 |     14
         |  71.43 |  28.57 |
---------+--------+--------+
male     |      9 |      2 |     11
         |  81.82 |  18.18 |
---------+--------+--------+
Total          19        6       25
            76.00    24.00   100.00
Filtering data (selecting subsets)
- In the DATA step
– Regarding observations: IF, WHERE, DELETE
– Regarding variables: DROP, KEEP
- In procedures
– Regarding observations: WHERE
– Regarding variables: VAR statement (depending on procedure)
WHERE
If we only want to look at the girls:
data pemax; /* temporary data set */
  set sasuser.pemax;
  where csex='female';
run;
proc print data=pemax;
  var csex age bmi;
run;

Obs csex age bmi
1 female 7 10.2838
2 female 8 10.3680
3 female 11 15.8894
4 female 12 12.6222
.............
IF, DELETE
Alternative ways of writing:
- if csex='female';
- if csex ne 'male';
Look out if the data contain missing values!
- if csex='male' then delete;
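Why missing values matter here can be seen in a small sketch. The data set demo and its three lines are made up for the illustration; the third person has a missing csex (a period in list input is read as a missing character value).

```sas
/* Sketch only: id 3 has a missing value for csex */
data demo;
  length csex $ 6;
  input id csex $;
  datalines;
1 male
2 female
3 .
;
run;

data girls1;
  set demo;
  if csex='female';       /* keeps id 2 only */
run;

data girls2;
  set demo;
  if csex ne 'male';      /* keeps id 2 and id 3 (missing csex) */
run;
```

The two conditions are therefore only equivalent when csex is never missing.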
Filterings may be combined:
data pemax;
  set sasuser.pemax;
  where csex='female' and age>12;
run;
proc print data=pemax;
  var csex age bmi;
run;

Obs csex age bmi
1 female 13 13.1113
2 female 14 17.9845
3 female 15 20.5095
4 female 16 14.0234
5 female 17 14.8661
6 female 17 19.4021
7 female 19 15.2860
DROP and KEEP
Now that we have bmi, we may not need height and weight:
data pemax;
  set sasuser.pemax;
  drop height weight;
run;

If you only want to keep a single variable:
data pemax;
  set sasuser.pemax (keep=bmi);
run;
WHERE in procedures
If we only want to look at the girls for a specific procedure (but continue to work with all data):
proc print data=sasuser.pemax;
  where csex='female';
  var csex age bmi;
run;

Obs csex age bmi
1 female 7 10.2838
2 female 8 10.3680
3 female 11 15.8894
4 female 12 12.6222
.............
Here, we have created no new data set, neither temporary nor permanent.
Reading in
- from file
- data lines directly in program
- character variables
- columns or free format
- data separation (delimiters)
- missing values
- import from Excel
Data lines directly in program
data sasuser.pemax;
  input age sex height weight fev1 pemax;
  bmi=weight/(height/100)**2;
  datalines;
7 1 109 13.1 32 95
7 2 112 12.9 19 85
8 1 124 14.1 22 100
23 1 179 71.5 52 195
;
run;
Reading in character variables
data sasuser.pemax;
  length sex $ 6; /* to avoid truncation */
  input age sex $ height weight fev1 pemax;
  bmi=weight/(height/100)**2;
  datalines;
7 male 109 13.1 32 95
7 female 112 12.9 19 85
8 male 124 14.1 22 100
23 male 179 71.5 52 195
;
run;
Semicolon separated data
Until now, data have been nicely separated by blanks. Now it looks a bit different:
age;sex;height;weight;fev1;pemax
7;male;109;13.1;32;95
7;female;112;12.9;19;85
8;male;124;14.1;22;100
8;female;125;16.2;41;85
.....
.....

data sasuser.pemax;
  * we now specify a list of possible delimiters;
  infile 'pemax2.txt' firstobs=2 dlm=';';
  input age sex $ height weight fev1 pemax;
run;
Formatted input
Now, the values are not separated at all! (often useful for many binary observations, e.g. questionnaire data)
data sasuser.pemax;
  length sex $ 1;
  input age 1-2 sex $ 3 height 4-6 weight 7-10 fev1 11-12 pemax 13-15;
  datalines;
 7M10913.132 95
 7F11212.919 85
 8M12414.122100
23M17971.552195
;
run;
Missing values
- Numeric variables (numbers): a missing value must be coded as '.' (period)
- Character variables (letters):
– 'NA', 'missing value', '.' etc.
– blanks? These will of course not work if blanks are used as delimiters
- Take care with codes such as -9, 999 etc.: SAS treats them as ordinary numbers
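If a data file uses codes such as -9 or 999 for missing values, they can be recoded to a period in the DATA step. This is a minimal sketch with made-up values; the data set name clean and the choice of codes are assumptions for the illustration.

```sas
/* Sketch only: 999 and -9 are assumed to be the codes for 'missing' */
data clean;
  input id height;
  if height in (-9, 999) then height = .;   /* recode to SAS missing */
  datalines;
1 109
2 999
3 -9
;
run;

proc means data=clean n nmiss mean;
  var height;
run;
```

Without the recoding, 999 and -9 would enter the mean as if they were real heights.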
Example 1: List input, numeric variables
data sasuser.pemax;
  input age sex height weight fev1 pemax;
  datalines;
7 1 . 13.1 32 95
7 2 112 12.9 19 85
8 . 124 14.1 22 100
.....
.....
;
run;
Example 2: Formatted input, blanks
data sasuser.pemax;
  length sex $ 1;
  input age 1-2 sex $ 3 height 4-6 weight 7-10 fev1 11-12 pemax 13-15;
  datalines;
 7M   13.132 95
 7F11212.919 85
 8 12414.122100
.....
.....
;
run;
Example 3: Semicolon separated, no symbols for missing
data sasuser.pemax;
  infile 'pemax2.tal' firstobs=2 dlm='; ,' dsd;
  input age csex $ height weight fev1 pemax;
  bmi=weight/(height/100)**2;
  datalines;
7;male,;13.1;32;95
7;female;112;12.9;19;85
8;;124;14.1;22;100
.....
;
run;

Option dsd means:
DSD: Changes how SAS treats delimiters when list input is used and sets the default delimiter to a comma. When you specify DSD, SAS treats two consecutive delimiters as a missing value.
General options (e.g. in first line of the program):
options obs=100 nocenter linesize=75 pagesize=60 missing='-';
- OBS=: Number of observations (from start) to be included in the analyses
- CENTER/NOCENTER: Output appearance
- LINESIZE=: Maximum number of characters on each line (at most 256)
- PAGESIZE=: Maximum number of lines on each page
- MISSING=: Specifies the symbol for missing values (default is .)
Excel files may be imported directly to SAS
proc import datafile='tables.xls'
  out=sasuser.tables
  dbms=excel2000 replace;
  getnames=yes;
run;
SAS is a programming language
- A program is like “a knitting recipe”: a series of instructions which have to be executed in a specific order.
- Note: SAS is not a spreadsheet. Output does not change if you change the data (rerun the program to update it)
A simple SAS program
data sasuser.pemax;                     \
  infile 'N:\pemax.txt' firstobs=2;     |
  input age sex height                  |
        weight fev1 pemax;              | Data Step
  bmi=weight/(height/100)**2;           |
run;                                    /
proc print data=sasuser.pemax;          \
run;                                    |
proc means data=sasuser.pemax;          | Proc Steps
  var age bmi;                          |
run;                                    /
DATA steps and PROC steps
- A SAS program distinguishes between two types of operations (“steps”)
- DATA steps, which define data sets: reading from text files, calculation of derived variables, selection of cases, etc.
- PROC steps, which contain standard procedures operating on data sets. Note: it is in general not possible to calculate anything in a PROC step.
- Traditionally, a SAS program is arranged so that the DATA step is at the top, but they may be mixed if you define new data sets along the way.
Basics regarding SAS language
- Almost everything (except for calculations in a data step) starts with a keyword and ends with a semicolon
- Statements are bits of code separated by semicolons
OPTIONS ls=80;
PROC GLM data=sasuser.pemax;
  MODEL height = age / solution;
RUN;
- Keywords: OPTIONS, PROC, GLM, MODEL, RUN
- Certain statements belong together in blocks
Things to keep in mind
- The slash (/) is often used to mark the start of options
- Semicolon and slash are necessary:
’solution’ is not a variable name and ’run’ is not an option.
Formatting code / designing programs
- SAS generally does not care about extra blanks and line breaks
- It is, however, considered good practice to write at most one statement on each line
- Indenting makes the code considerably easier to read. (Sooner or later, you will end up reading your own old code!)
How to organize your analyses etc.?
If only we knew. But it is to some extent a matter of taste. Some thoughts, though...
- Interactive program execution is easy, but may be dangerous!
- Do you remember what was done? What if you have to 'just correct a few data values'?
- Remember to save the code, at least for the most important analyses.
- Collect bits and pieces into a more coherent program and save it as a .sas file which can stand alone
- Look out for 'carry-over' effects when using Display Manager. Try out your program in a fresh SAS session.
- Save the log files as well. They are as important as the output files.
SAS libraries
- SAS “is born” with four libraries, the most important being WORK and SASUSER
- Look at Properties in the Explorer to see exactly where they are located
- WORK is a temporary library, which means that it disappears (with all its contents) when SAS is closed. These data files are denoted work.pemax or simply pemax
- SASUSER is permanent. Data sets stored here will also be there the next time you enter SAS. These files are denoted sasuser.pemax
Private libraries, LIBNAME
- If you want a separate library for each project (surely the best in the long run), you will have to use a LIBNAME statement, e.g.
libname mysas 'p:\paper1\sasdata';
data mysas.pemax;
  infile .....
Note: The folder has to be created before use!
- This LIBNAME statement may be saved in autoexec.sas (too advanced for this course)
SAS data set
- First part (before the period) is the SAS-library
- The second part is the name of the data set
- If the first part is omitted, the library is taken to be WORK
- Seen from the Windows point of view, the data files have the extension sas7bdat
- Data sets have two logical parts: a descriptive part and the data itself
- PROC CONTENTS and PROC PRINT, respectively, will show these
PROC CONTENTS
The CONTENTS Procedure

Data Set Name: SASUSER.PEMAX                     Observations:         25
Member Type:   DATA                              Variables:            6
Engine:        V8                                Indexes:
Created:       14:13 Wednesday, April 14, 2004   Observation Length:   48
Last Modified: 14:13 Wednesday, April 14, 2004   Deleted Observations: 0
Protection:                                      Compressed:           NO
Data Set Type:                                   Sorted:               NO
Label:
-----Engine/Host Dependent Information-----
Data Set Page Size:         8192
Number of Data Set Pages:   1
First Data Page:            1
Max Obs per Page:           169
Obs in First Data Page:     25
Number of Data Set Repairs: 0
File Name:                  /saswork/SAS_workC3CA00000753_rasch/pemax.sas7bdat
Release Created:            8.0202M0
Host Created:               SunOS
Inode Number:               7884
Access Permission:          rw-r--r--
Owner Name:                 lts
File Size (bytes):          16384

-----Alphabetic List of Variables and Attributes-----
#   Variable   Type   Len   Pos
1   age        Num      8     0
5   fev1       Num      8    32
3   height     Num      8    16
6   pemax      Num      8    40
2   sex        Num      8     8
4   weight     Num      8    24
Combination of data sets
- “vertically”: more cases, identical variables
- “horizontally” (merging): new variables, same cases
More cases is easy: several data sets may be listed in the same SET statement of a data step.
- Ex. Combining two groups:
data all;
  set group1 group2;
run;
If the data sets do not have exactly the same variables, missing values are filled in
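A small sketch of the last point, using two made-up data sets (the contents of group1 and group2 are invented for the illustration): only group2 contains bmi, so the rows coming from group1 get a missing bmi.

```sas
/* Sketch only: group1 lacks the variable bmi */
data group1;
  input id age;
  datalines;
1 7
2 8
;
run;

data group2;
  input id age bmi;
  datalines;
3 12 15.9
;
run;

data all;
  set group1 group2;   /* bmi is missing (.) for the rows from group1 */
run;

proc print data=all;
run;
```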
Merging (horizontal combination)
- the data sets ought to have a common key variable, e.g. id
- all data sets have to be sorted according to id
proc sort data=info1;
  by id;
run;
proc sort data=info2;
  by id;
run;
data info;
  merge info1 info2;
  by id;
run;
- BY may be omitted, but only if data are complete and are sorted in the same way (not recommended, take good care!)
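To see what happens when the data are not complete, here is a sketch with made-up contents for info1 and info2, where id 2 is absent from info2. The IN= data set option (not used elsewhere in these slides) creates temporary indicators, here called in1 and in2, telling which data sets contributed to each row.

```sas
/* Sketch only: id 2 has no record in info2 */
data info1;
  input id age;
  datalines;
1 7
2 8
3 11
;
run;

data info2;
  input id bmi;
  datalines;
1 11.0
3 15.9
;
run;

proc sort data=info1; by id; run;
proc sort data=info2; by id; run;

data info;
  merge info1 (in=in1) info2 (in=in2);
  by id;
  both = (in1 and in2);   /* 1 if the id occurs in both data sets */
run;

proc print data=info;
run;
```

For id 2, bmi is missing and both is 0. Without the BY statement, the second line of info2 would simply be paired with the second line of info1.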
Lists of variables
- Sometimes you may want to refer to many variables at one time, e.g. in case of repeated measurements or just many connected variables
proc freq;
  tables ques1-ques392;
run;
- Names looking alike: x1-x20
- SAS order of appearance: age--weight
- All character variables: _CHARACTER_
- All numerical variables: _NUMERIC_
- All variables: _ALL_
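As a sketch of how such lists may be used, assuming the data set sasuser.pemax has been created as on the earlier slides, with the variables read in the order age sex height weight fev1 pemax:

```sas
/* Sketch only: assumes sasuser.pemax exists as created earlier */
proc means data=sasuser.pemax n mean;
  var age--weight;     /* age, sex, height, weight (order in the data set) */
run;

proc means data=sasuser.pemax n mean;
  var _numeric_;       /* all numerical variables */
run;
```

Note the double hyphen in age--weight: it refers to the position of the variables in the data set, not to their names.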
Formats: Information on how to read or write a variable
- Built-in formats (numerical, dates, characters)
- User defined formats
Why use formats?
- Nicer looking output
- Grouping for creation of tables
- Protection against errors in data
Formats, continued
- Standard formats: 10.3, best12., E12., $10., date10., yymmdd10.
- They always contain a period (keep that in mind!)
- They are associated permanently with the variable in the DATA step, or:
- they are specified ad hoc with a FORMAT statement in PROC steps.
- User defined formats are created with PROC FORMAT
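Grouping for tables could look like the sketch below. The format name bmigrp and the cut points 15 and 18 are chosen for the illustration only (18 matches the limit used for the variable fat earlier), and sasuser.pemax is assumed to contain bmi.

```sas
/* Sketch only: format name and cut points are assumptions */
proc format;
  value bmigrp low -< 15  = 'below 15'
               15  -< 18  = '15 to <18'
               18  - high = '18 or more';
run;

proc freq data=sasuser.pemax;
  tables bmi;
  format bmi bmigrp.;   /* ad hoc format: bmi is tabulated in three groups */
run;
```

The data set itself is unchanged; only the presentation in this procedure is grouped.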
Example of use of formats
proc format;
  invalue sexin 'M'=1 'F'=2;
  value sexout 1='male' 2='female';
data;
  informat sex sexin.;
  format sex sexout.;
  input sex;
  datalines;
M
F
m
;
proc print;
run;

which creates the output:

Obs sex
1 male
2 female
3 .
Example of date formats:
data;
  input x yymmdd10.; /* this covers several formats */
  format x yymmddd.;
  cards;
2004-05-31
04/1/1
001201
;
proc print;
run;

Obs x
1 04-05-31
2 04-01-01
3 00-12-01

The value actually stored is the number of days since 1 January 1960.
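The stored value can be inspected by copying the date to a variable without a date format. A minimal sketch, with a made-up data set name (dates) and made-up dates:

```sas
/* Sketch only: raw holds the same number as x, but is printed unformatted */
data dates;
  input x yymmdd10.;
  raw = x;
  format x yymmdd10.;
  datalines;
1960-01-01
1960-01-02
2004-05-31
;
run;

proc print data=dates;
run;
```

The first two lines give raw values 0 and 1, so differences between dates are differences in days.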
Times in SAS
Example with longitudinal measurements of blood pressure during the day:
data longitudinal;
  informat time TIME10.2;
  input person time bp;
  datalines;
1 04:45:23.12 123
1 07:21:06.32 145
2 05:15:23.42 132
2 09:18:18.02 153
;
run;
proc print data=longitudinal;
run;

Obs time person bp
1 17123.12 1 123
2 26466.32 1 145
3 18923.42 2 132
4 33498.02 2 153
We wish to refer to 'time since start of treatment'.
First we have to pick out the starting times, i.e. the first observation for each individual:
data starttimes;
  set longitudinal;
  by person;
  if first.person;
  start=time;
run;
proc print data=starttimes;
run;

Obs time person bp start
1 17123.12 1 123 17123.12
2 18923.42 2 132 18923.42
and then we have to merge the two data sets, and calculate differences:
data merge_two;
  merge longitudinal starttimes;
  by person;
  timer=(time-start)/60**2;
run;
proc print data=merge_two;
run;

Obs time person bp start timer
1 17123.12 1 123 17123.12 0.00000
2 26466.32 1 145 17123.12 2.59533
3 18923.42 2 132 18923.42 0.00000
4 33498.02 2 153 18923.42 4.04850
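The same result can probably be obtained in a single DATA step with a RETAIN statement. This is an alternative sketch, not the approach used on the slides; the data set name since_start is made up, and the data set longitudinal is assumed to exist as created above.

```sas
/* Sketch only: alternative to the two-step merge above */
proc sort data=longitudinal;
  by person time;
run;

data since_start;
  set longitudinal;
  by person;
  retain start;                         /* keep start from one row to the next */
  if first.person then start = time;    /* first measurement for each person */
  timer = (time - start)/60**2;         /* hours since the first measurement */
run;

proc print data=since_start;
run;
```

Times are stored in seconds, so dividing by 60**2 converts the difference to hours.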