Statistical Graphics for the SAS System Computing for Research I - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Statistical Graphics for the SAS System

Computing for Research I 01/24/2012

  • N. Baker
slide-2
SLIDE 2

Introduction to SAS/GRAPH

  • SAS/GRAPH is the primary graphics component of the SAS system.
  • Includes charts, plots, and maps in both 2 and 3 dimensions.
  • Procedures include GCHART, GPLOT, GMAP, GCONTOUR, etc.

  • We will focus on PROC GPLOT
slide-3
SLIDE 3

Examples

What Can Be Done using SAS GRAPH

slide-4
SLIDE 4

What can be done with SAS/GRAPH?

These samples courtesy of Robert Allison


slide-8
SLIDE 8

Introduction

Elements of SAS/GRAPH

slide-9
SLIDE 9

Elements of SAS/GRAPH

Overview

Taken from SAS 9.2 documentation

[Figure: overview of a SAS/GRAPH program, with callouts for ODS destination elements, global statements, and the procedure step.]

slide-10
SLIDE 10

Elements of SAS/GRAPH

PROC GPLOT: Specifying an input data set

Similar to all other SAS PROCs

– proc gplot data=<libname>.<data set> <options>;

Options include specifying annotate data sets, image mapping for drill-down plots in web applications, creating uniform axes across plots, and specifying the SAS catalog in which the output is placed.
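
As a minimal sketch of a PROC GPLOT statement that uses some of these options, assuming a data set work.dose (variables x and y1), an annotate data set work.anno, and an output catalog work.mygraphs, all of which are hypothetical names:

```sas
/* Hypothetical names: work.dose, work.anno, work.mygraphs */
proc gplot data=work.dose
           annotate=work.anno   /* annotate data set applied to the plots   */
           uniform              /* same axis scaling for all graphs produced */
           gout=work.mygraphs;  /* catalog in which the output is stored     */
  plot y1*x;
run;
quit;
```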


slide-12
SLIDE 12

Elements of SAS/GRAPH

PROC GPLOT: Plotting

  • You can use up to 2 plot statements at a time; however, at least one PLOT statement is required.
  • The PLOT statement is used to control the axes, plotting points, labels, tick marks, and the plot legend.
  • The only required arguments are:

– plot <Y variable>*<X variable> / <options>;
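
A minimal sketch of the required arguments in use. The data set name work.dose and the five data rows are made up purely for illustration:

```sas
/* Made-up example data: x = dose (mg/24 hrs), y1 = response */
data work.dose;
  input x y1;
  datalines;
0.1 0.6
0.5 1.1
1.0 1.8
1.5 2.4
1.9 2.9
;
run;

proc gplot data=work.dose;
  plot y1*x;   /* vertical variable * horizontal variable */
run;
quit;
```

With no options after the slash, SAS chooses the axes, symbols, and labels itself.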

slide-13
SLIDE 13

Elements of SAS/GRAPH

PROC GPLOT: Plotting Options

  • Options for plotting

– Plot options

  • legend= or nolegend: specifies figure legend options
  • overlay: allows overlay of more than one Y variable
  • skipmiss: breaks the plotting line where Y values are missing

– Appearance options

  • Axis (haxis=, vaxis=): specifies axis label and value options
  • Symbol: specifies symbol options
  • href, vref: draws vertical or horizontal reference lines on the plot
  • frame/fr or noframe/nofr: specifies whether or not to frame the plot
  • caxis/ca, cframe/cfr, chref/ch, cvref/cv, ctext/c: specifies colors used for axis, frame, text or reference lines.
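
The following sketch combines several of these options in one PLOT statement. It assumes a hypothetical data set work.dose with variables x, y1, and y2. The reference-line positions are arbitrary.

```sas
/* Assumes a hypothetical data set work.dose with variables x, y1, y2 */
proc gplot data=work.dose;
  plot y1*x y2*x / overlay        /* both Y variables on one set of axes   */
                   legend         /* default legend for the overlaid plots */
                   href=1.0       /* vertical reference line at x = 1.0    */
                   vref=2.0       /* horizontal reference line at y = 2.0  */
                   chref=gray cvref=gray
                   caxis=black ctext=black
                   noframe;       /* no frame around the plot area         */
run;
quit;
```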

slide-14
SLIDE 14

Introduction to SAS/GRAPH

  • We will begin with rather simple code and

let SAS decide how our graph will look.

  • Then we will step through a few options

that allow us to control and adjust the graphic output.

slide-15
SLIDE 15

Examples

2 Variable Plotting / Scatter plots

slide-16
SLIDE 16

Examples

2 Variables

  • Suppose subjects are given doses of an experimental medication based on body weight over a 24-hour period (mg/24 hrs). Variable X
  • On the following day, each subject had their Vascular Cell Adhesion Molecule (μg/ml) levels measured. Variable Y1
  • The investigators are interested in seeing a plot of the dose given vs. the plasma VCAM levels to see if there may be an effect of the drug dose.

slide-17
SLIDE 17

Examples

2 Variables

[Figure: default scatter plot of y1 (vertical axis, 1 to 3) against x (horizontal axis, 0.0 to 1.9 in steps of 0.1).]

A very basic plot: we get all of the default options. Not very exciting. Definitely not publication quality.

slide-18
SLIDE 18

Examples

2 Variables

[Figure: the same default scatter plot of y1 against x, with callouts.]

A very basic plot: we get all of the default options. Not very exciting. Definitely not publication quality.

  • Cannot read axis marks
  • Axis labels don’t describe the data
  • Crowded axis

slide-19
SLIDE 19

Examples

2 Variables: AXIS Statements

  • AXIS<1..99> <options>;

– Label option

  • angle/a=degrees (0-359)
  • color/c=text color
  • font/f=font
  • height/h=text height (default=1)
  • justify=(left/center/right)
  • label=("text string")

– Options precede the text string they apply to

  • axis1 label=(a=90 c=black f="arial" h=1.2 "time" a=90 c=black f="arial" h=1.0 "hours");

slide-20
SLIDE 20

Examples

2 Variables: AXIS Statements

  • AXIS<1..99> <options>;

– Order option

  • order=(a to b by c): major tick marks will show up at intervals based on c.

– Example: order=(0 to 3 by 1);

– Value option

  • value=("" "" ""): applies a text label to each major tick.

– Example: value=("Start" "Middle" "End")

slide-21
SLIDE 21

Examples

2 Variables: AXIS Statements

[Code screenshot with callouts:]

  • Resets previous options
  • Horizontal axis (X variable)
  • Vertical axis (Y variable)
  • Call AXIS statements

NOTE: you can also place the AXIS statements within the GPLOT procedure.
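
The code from this slide did not survive in the transcript. The sketch below shows one way the four callouts could fit together. The data set work.dose, its variables x and y1, and the label text are assumptions, not the author's original code.

```sas
goptions reset=all;   /* resets previously set graphics options */

/* Horizontal axis (X variable) */
axis1 label=(c=black f="arial" h=1.2 "Dose (mg/24 Hrs)");

/* Vertical axis (Y variable), label rotated 90 degrees */
axis2 label=(a=90 c=black f="arial" h=1.2 "Plasma Level");

/* Hypothetical data set work.dose with variables x and y1 */
proc gplot data=work.dose;
  plot y1*x / haxis=axis1 vaxis=axis2;   /* call the AXIS statements */
run;
quit;
```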

slide-22
SLIDE 22

Examples

2 Variables: AXIS Statements

[Figure: scatter plot with vertical axis "Plasma Level" (1 to 3) and horizontal axis "Dose mg/24 Hrs" (0.0 to 1.9 in steps of 0.1).]

The LABEL options helped make the axis labels meaningful, but the axis tick marks remain crowded.

slide-23
SLIDE 23

Examples

2 Variables: AXIS Statement

Added ORDER option to the AXIS statement
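
A sketch of an AXIS statement with ORDER= added. The label text is assumed, and the tick range is inferred from the ticks in the resulting figure (0.0 to 2.0 in steps of 0.5):

```sas
/* Major ticks at 0, 0.5, 1, 1.5, 2 instead of one every 0.1 */
axis1 label=(c=black h=1.2 "Dose (mg/24 Hrs)")
      order=(0 to 2 by 0.5);
```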

slide-24
SLIDE 24

Examples

2 Variables: AXIS Statement

[Figure: scatter plot with vertical axis "Plasma Level" (1 to 3) and horizontal axis "Dose mg/24 Hrs" (0.0 to 2.0 in steps of 0.5).]

The axes are less crowded but still very hard to read; using the VALUE= option will help.

slide-25
SLIDE 25

Examples

2 Variables: AXIS Statement

Added VALUE option to the AXIS statement
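
The slide's actual VALUE= settings are not in the transcript. As an assumed illustration, VALUE= can carry text options only, which enlarges the tick labels without replacing them:

```sas
/* Text height of 1.5 for the tick labels; label text and ranges are assumptions */
axis1 label=(c=black h=1.2 "Dose (mg/24 Hrs)")
      order=(0 to 2 by 0.5)
      value=(h=1.5);

axis2 label=(a=90 c=black h=1.2 "Plasma Level")
      order=(0 to 3 by 1)
      value=(h=1.5);
```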

slide-26
SLIDE 26

Examples

2 Variables

[Figure: scatter plot with vertical axis "Plasma Level" (0.0 to 3.0) and horizontal axis "Dose mg/24 Hrs" (0.0 to 2.0 in steps of 0.5).]

Now about those data points!

slide-27
SLIDE 27

Examples

2 Variables: Symbol Statement

  • SYMBOL<1…255> <options>;

– Symbol options

  • color=value color
  • ci=line color
  • height=symbol height
  • line=line type
  • value=symbol
  • width=thickness factor
  • interpol=point interpolation
slide-28
SLIDE 28

Examples

2 Variables: Symbol Statement

  • Symbol<1…255> <options>;

– Symbol options

  • Interpolation options

– Join, box, hilo interpolation, regression, spline, standard deviations.

  • value options

– Dot, circle, star, square, plus, minus, “text value”.

  • Color options

– 256 colors available: www.devenezia.com/docs/SAS/sas-colors.html

slide-29
SLIDE 29

Examples

2 Variables: Symbol Statement

Symbol options

  • Interpolation options

– None
– Join: points connected by straight line
– Needle: vertical line from horizontal axis to point
– Stepx: (L,R,C) step function; stepxJ will add a vertical line to each step plot
– stdkxxx: (M,P,J,B,T), k=1,2,3 (standard deviations)
» stdM=SEM, stdP=uses pooled sample variance, stdJ=joins the errors; T gives tops and bottoms to error lines, whereas B requests error bars.
– HILOxxx: (T,B,C,J)

slide-30
SLIDE 30

Examples

2 Variables: Symbol Statement

Symbol options

  • Interpolation options

– R-series interpolation
– Rxxxxxxx
» RL: linear regression
» RQ: quadratic regression
» RC: cubic regression
» CLM: CI for mean predicted values
» CLI: CI for individual predicted values
» 90, 95, 99: confidence limits
» Example: RLCLM95 gives a linear regression line with the 95% CL for mean predicted values
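
A sketch of the RLCLM95 example in a SYMBOL statement. The data set work.dose and its variables are hypothetical. The REGEQN option on the PLOT statement is an addition not mentioned on the slide; it asks GPLOT to print the fitted equation on the graph.

```sas
/* Filled dots with a linear regression line and 95% limits for the mean */
symbol1 value=dot height=1 color=black
        interpol=rlclm95 ci=blue width=2;

/* Hypothetical data set work.dose with variables x and y1 */
proc gplot data=work.dose;
  plot y1*x / regeqn;   /* REGEQN displays the regression equation on the plot */
run;
quit;
```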

slide-31
SLIDE 31

Examples

2 Variables: SYMBOL Statement

[Figure: scatter plot of Plasma Level (0.0 to 3.0) against Dose mg/24 Hrs (0.0 to 2.0).]

slide-32
SLIDE 32

Examples

2 Variables: Adding Regression Lines

[Figure: scatter plot of Plasma Level (0.0 to 3.0) against Dose mg/24 Hrs (0.0 to 2.0) with a fitted regression line.]

Regression Equation: y1 = 0.481173 + 1.269433*x

slide-33
SLIDE 33

Examples

Grouping Variables

  • Many times we want to look at group

differences.

  • Demographic groups, treatment groups,

etc…

  • Grouping variable must be in the data file.
slide-34
SLIDE 34

Examples

Grouping Variables

You need to add a new SYMBOL statement for each additional group.

Add the grouping variable to the PLOT statement.
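
A sketch of both steps, assuming a hypothetical data set work.dose with variables x, y1, and a two-level grouping variable gender:

```sas
/* One SYMBOL statement per group level */
symbol1 value=dot    color=red  interpol=none;
symbol2 value=circle color=blue interpol=none;

proc gplot data=work.dose;
  plot y1*x=gender;   /* the third variable assigns a symbol to each group */
run;
quit;
```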

slide-35
SLIDE 35

Examples

Grouping Variables

[Figure: scatter plot of Plasma Level (0.0 to 3.0) against Dose mg/24 Hrs (0.0 to 2.0), with a default legend for gender: Female, Male.]

Not bad, but the default figure legend is not well placed.

slide-36
SLIDE 36

Examples

Grouping Variables: Legend Statement

  • LEGEND<1…99> <options>;

– Legend options

  • across=: number of columns
  • down=: number of rows
  • frame/noframe
  • position=(bottom, middle, top) (left, center, right) (inside, outside)
  • origin=(x,y)
  • label=
  • order=
  • value=

The label=, order=, and value= options are the same as within the AXIS statement discussed earlier.

slide-37
SLIDE 37

Examples

Grouping Variables: Legend Statement

[Code screenshot with callouts:]

  • LEGEND statement
  • Call LEGEND statement
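
The slide's code is not in the transcript. The sketch below defines a LEGEND statement and calls it from the PLOT statement. The option values, the data set work.dose, and the variable gender are assumptions.

```sas
legend1 label=(h=1.2 "Gender")
        value=(h=1.2 "Female" "Male")
        position=(top left inside)   /* inside the axis area, upper left */
        across=1                     /* one column */
        frame;

proc gplot data=work.dose;
  plot y1*x=gender / legend=legend1;   /* call the LEGEND statement */
run;
quit;
```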

slide-38
SLIDE 38

Examples

Grouping Variables: Legend Statement

[Figure: scatter plot of Plasma Level (0.0 to 3.0) against Dose mg/24 Hrs (0.0 to 2.0), with a repositioned legend "Gender": Female, Male.]

slide-39
SLIDE 39

Examples

Repeated Measures/Longitudinal Plotting

slide-40
SLIDE 40

Examples

Repeated Measures/Longitudinal Plotting

  • Suppose that you have many observations on

each subject taken at various time points.

  • 40 subjects
  • 2 treatments (Placebo and Active med)
  • 5 time points (baseline plus 4 1-week intervals)

– During the last week, both treatment groups receive Placebo

  • Data should be in the Long format

At diagnosis, subjects are randomized to an experimental treatment or placebo. During the final week of treatment, all subjects will receive active medication.

slide-41
SLIDE 41

Examples

Repeated Measures/Longitudinal Plotting

Create appropriate AXIS and LEGEND statements as before.

  • AXIS for X (time) variable
  • AXIS for Y (response) variable
  • Added TITLE statement for plot
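
A sketch of these three statements, with tick ranges and text taken from the resulting figure. The coding of the time variable as 0 to 4 is an assumption.

```sas
title "Individual Disease Progression";

/* AXIS for the X (time) variable; assumes week is coded 0, 1, 2, 3, 4 */
axis1 label=("Time Since Diagnosis: Weeks")
      order=(0 to 4 by 1)
      value=("Baseline" "1" "2" "3" "4");

/* AXIS for the Y (response) variable */
axis2 label=(a=90 "Response")
      order=(10 to 100 by 10);
```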

slide-42
SLIDE 42

Examples

Repeated Measures/Longitudinal Plotting

[Figure: "Individual Disease Progression": Response (10 to 100) against Time Since Diagnosis: Weeks (Baseline, 1, 2, 3, 4).]

slide-43
SLIDE 43

Examples

Repeated Measures/Longitudinal Plotting

Joins the dots, by ID

[Figure: "Individual Disease Progression": Response (10 to 100) against Time Since Diagnosis: Weeks (Baseline, 1, 2, 3, 4), with each subject's points joined by a line.]
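
One way to join the dots by ID, sketched under the assumption of a hypothetical long-format data set work.long with variables id, week, and response (one row per subject and time point):

```sas
/* REPEAT= reuses this one definition for all 40 subjects */
symbol1 interpol=join value=dot color=black repeat=40;

/* Sort so each subject's points are joined in time order */
proc sort data=work.long;
  by id week;
run;

proc gplot data=work.long;
  plot response*week=id / nolegend;   /* one joined line per subject */
run;
quit;
```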

slide-44
SLIDE 44

Examples

Repeated Measures/Longitudinal Plotting

[Figure: "Individual Disease Progression": Response (10 to 100) against Time Since Diagnosis: Weeks (Baseline, 1, 2, 3, 4), with legend "Treatment Group": Treatment A, Placebo.]

Plot the data by treatment group and create a SYMBOL statement for each group.

slide-45
SLIDE 45

Examples

Using the Overlay statement to stack plots

slide-46
SLIDE 46

Examples

Overlay 2 plots w/ the same data

Suppose that you are asked to graphically show progression of tumor growth for a group of subjects and overlay the progression of each treatment group.

50 subjects are randomized to either low or high dose medication. Tumor size is measured at baseline as well as the following 9 weeks. The investigator would like an easy-to-present plot containing both pieces of information for a presentation to his peers.

slide-47
SLIDE 47

Examples

Overlay 2 plots w/ the same data

[Figure, first panel: "Individual Disease Progression": Tumor Growth (10 to 80) against Time Since Diagnosis: Weeks (Baseline, 1 to 9).]

[Figure, second panel: the same axes, with legend "Treatment Group": Low Dose, High Dose.]

  • Plot of individual values as before
  • Plot of treatment group means and standard errors as before
  • Grouping variable
  • Symbol repeats
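
The slide's code is not in the transcript. One possible way to stack the two plots is a PLOT statement for the individual values and a PLOT2 statement for the group summaries. This is a sketch, not the author's method. The data set work.tumor and the variables id, week, size, and trt are hypothetical, and the sketch assumes PLOT2 picks up the SYMBOL definitions left over after PLOT has used its own.

```sas
/* Same vertical scale on both sides so the two plots line up */
axis2 order=(10 to 80 by 10);

/* Individual trajectories: one gray joined line per subject (50 subjects) */
symbol1 interpol=join value=none color=gray repeat=50;

/* Group means with 1-standard-error bars (M), joined (J), with tops (T) */
symbol2 interpol=std1mjt value=none color=blue width=2;
symbol3 interpol=std1mjt value=none color=red  width=2;

/* Hypothetical data set work.tumor, sorted by id and week */
proc gplot data=work.tumor;
  plot  size*week=id  / nolegend vaxis=axis2;   /* individual values       */
  plot2 size*week=trt / vaxis=axis2;            /* treatment group summary */
run;
quit;
```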

slide-48
SLIDE 48

Examples

Overlay 2 plots w/ the same data

slide-49
SLIDE 49

Examples

Overlay 2 plots w/ the same data

[Figure: "Individual Disease Progression": Tumor Growth (10 to 80 on both vertical axes) against Time Since Randomization: Weeks (Baseline, 1 to 9), with legend "Treatment Group": Low Dose, High Dose.]

slide-50
SLIDE 50

Examples

Overlay multiple plots from different variables

Use proc logistic to output the predicted probability of developing nephropathy given the baseline Oxidized LDL immune complex level, as well as the 95% confidence limits. Many PROCs can output predicted values and adjusted means, along with pointwise confidence limits.
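
A sketch of this workflow, assuming a hypothetical data set work.renal with a 0/1 outcome nephropathy and a continuous predictor oxldl. The output variable names phat, lcl, and ucl are chosen for illustration.

```sas
/* Hypothetical data set work.renal: nephropathy (0/1), oxldl (baseline level) */
proc logistic data=work.renal;
  model nephropathy(event='1') = oxldl;
  output out=work.pred p=phat lower=lcl upper=ucl;   /* 95% limits by default */
run;

/* Sort by the X variable so the joined lines are drawn left to right */
proc sort data=work.pred;
  by oxldl;
run;

symbol1 interpol=join value=none color=black line=1;   /* predicted probability */
symbol2 interpol=join value=none color=black line=2;   /* lower limit */
symbol3 interpol=join value=none color=black line=2;   /* upper limit */

proc gplot data=work.pred;
  plot phat*oxldl lcl*oxldl ucl*oxldl / overlay;
run;
quit;
```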

slide-51
SLIDE 51

Examples

Overlay multiple plots from different variables

[Figure: Prob. of Nephropathy (95% CI), 0.0 to 0.8, against Baseline Ox LDL-IC, 1.80 to 6.90.]

slide-52
SLIDE 52

Examples

Overlay multiple plots from different variables

[Figure: "Baseline Characteristics": Baseline LDL & HDL (mg/dl, 50 to 250) and Baseline OxLDL-IC (100 to 500) by Baseline OxLDL-IC Quartile (1st, 2nd, 3rd, 4th); legend: OxLDL-IC, LDL Cholesterol, HDL Cholesterol.]

slide-53
SLIDE 53

Examples

Overlay multiple plots from different variables

slide-54
SLIDE 54

The Annotate Facility

slide-55
SLIDE 55

The Annotate Facility

Introduction

The Annotate Facility allows SAS users to customize graphical output. The customizations can be data driven or user specified. Text, shapes, lines and images can be added to output graphics.

Step 1. Create an annotate data set

  • This data file will give commands to SAS/GRAPH.
  • Specific variables must be in the annotate data set. Others are allowed but ignored.
  • What, how, and where are defined by these variables. Table 1 lists important variables.

slide-56
SLIDE 56

The Annotate Facility

Introduction

slide-57
SLIDE 57

The Annotate Facility

Introduction

The Annotate FUNCTION variable tells SAS what to do.

The annotate coordinate system allows for flexibility in placing objects within the output. There are 12 possible conditions.
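
A minimal sketch of an annotate data set that writes one text label. The data set names, coordinates, and label text are hypothetical. XSYS and YSYS select the coordinate system, here '2' for data values.

```sas
/* Hypothetical annotate data set that draws one text label */
data work.anno;
  length function $8 text $40 color $8;
  xsys='2'; ysys='2';        /* where: coordinates are in data values */
  function='label';          /* what to do: draw text */
  x=1.0; y=2.5;              /* where: position of the label */
  text='Target dose';        /* what: the text to draw */
  color='red'; size=1.2;     /* how: color and text size */
  position='6';              /* how: text placed to the right of the point */
  output;
run;

/* Hypothetical data set work.dose with variables x and y1 */
proc gplot data=work.dose;
  plot y1*x / annotate=work.anno;
run;
quit;
```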
slide-58
SLIDE 58

The Annotate Facility

Introduction

slide-59
SLIDE 59

The Annotate Facility

Introduction

slide-60
SLIDE 60

The Annotate Facility

[Figure: "Mean HbA1c % during DCCT/EDIC study": HbA1c (6.0% to 10.0%) against Study Time in Years (BL, 1 to 10, then 1 to 9).]

PROC GPLOT global options help make graphs more pleasing; however, there are cases where more work is needed to fully explain the data.

slide-61
SLIDE 61

The Annotate Facility

  • Created shaded regions to designate study sections
  • Deleted regions of non-interest
  • Added treatment group and study section labels
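
A shaded region can be sketched with the Annotate MOVE and BAR functions. The boundary at x=6.5, the data set work.hba1c, and the variables hba1c, year, and trt are all hypothetical.

```sas
/* Gray rectangle from x=0 to x=6.5, spanning the full height of the plot */
data work.shade;
  length function $8 color $8 style $8;
  xsys='2';   /* x in data values */
  ysys='1';   /* y as a percentage of the data area */
  when='b';   /* draw before (behind) the plotted data */
  function='move'; x=0;   y=0;   output;                /* lower-left corner  */
  function='bar';  x=6.5; y=100;
  color='graycc'; style='solid'; output;                /* upper-right corner */
run;

/* Hypothetical data set work.hba1c */
proc gplot data=work.hba1c;
  plot hba1c*year=trt / annotate=work.shade;
run;
quit;
```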

slide-62
SLIDE 62

The Annotate Facility

slide-63
SLIDE 63

The Annotate Facility

[Figure: "Individual Net Worth As a Function of Original Worth": net worth ($0.00 to $1.00) against Year (Jan. 2007 to Jan. 2011).]

BORING!

Suppose you want to jazz up your plots for a presentation. You can place a picture or graphic behind your data to accent the results. We are going to place an image behind the data, but only below the data series. NEAT!

slide-64
SLIDE 64

The Annotate Facility

  • Anno data set 1: will place the image of the dollar over the plotting area.
  • Anno data set 2: will create white space above the plotted line over time.
  • SET the anno data sets and call them in the GPLOT statement.
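
A partial sketch of these steps. It builds only the image data set. The second data set, work.anno_mask, is assumed to exist already and is not shown, so the block runs only once that data set is supplied. The file name dollar.jpg, the data set work.networth, and the variables worth and date are hypothetical.

```sas
/* Anno data set 1: stretch an image over the plotting area */
data work.anno_img;
  length function $8 style $8 imgpath $200;
  xsys='1'; ysys='1';        /* percentage of the data area */
  when='b';                  /* draw behind the plotted data */
  function='move';  x=0;   y=0;   output;   /* lower-left corner */
  function='image'; x=100; y=100;           /* upper-right corner */
  style='fit';               /* stretch the image to fill the area */
  imgpath='dollar.jpg';      /* hypothetical image file */
  output;
run;

/* SET the two anno data sets; work.anno_mask is assumed, not defined here */
data work.anno_all;
  set work.anno_img work.anno_mask;
run;

proc gplot data=work.networth;
  plot worth*date / annotate=work.anno_all;
run;
quit;
```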

slide-65
SLIDE 65

The Annotate Facility

[Figure: "Individual Net Worth As a Function of Original Worth": net worth ($0.00 to $1.00) against Year (Jan. 2007 to Jan. 2011), with the image shown below the data series.]

slide-66
SLIDE 66

Awesome SAS v9.3 Upgrades

  • You no longer need to turn ODS Graphics on/off for modeling outputs (proc phreg, proc reg, proc model, etc.).
  • Using png outputs, SAS/GRAPH now uses anti-aliasing to create smoother, more publication-quality lines.
  • Fill and text colors on graphs can now be specified to be semi-transparent using Alpha Channel Color Transparency.
  • Plot markers such as “square” can now be filled with the specified marker color using the command “squarefilled” or “trianglefilled”.
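
A short sketch of a filled marker in a SYMBOL statement, assuming SAS 9.3 or later and a hypothetical data set work.dose with variables x and y1:

```sas
/* Filled square markers; requires SAS 9.3 or later */
symbol1 value=squarefilled color=blue height=1.5 interpol=none;

proc gplot data=work.dose;
  plot y1*x;
run;
quit;
```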

slide-67
SLIDE 67

The End