Unit 15: Road Map (VERBAL) - PowerPoint PPT Presentation



slide-1
SLIDE 1
  • Unit 15: Road Map (VERBAL)

[Slide text garbled in extraction. Recoverable terms: the variables READING, RACE, HOMEWORK, FREELUNCH and ESL, and the regression checks for outliers, linearity, normality, homoskedasticity and independence.]

slide-2
SLIDE 2

Unit 15: Road Map (Schematic)

Single Predictor

Outcome \ Predictor   Continuous            Dichotomous                  Polychotomous
Continuous            Regression            Regression, ANOVA, T-tests   Regression, ANOVA
Dichotomous           Logistic Regression   Chi Squares                  Chi Squares
Polychotomous         (none listed)         Chi Squares                  Chi Squares

Multiple Predictors

Outcome \ Predictor   Continuous            Dichotomous                  Polychotomous
Continuous            Multiple Regression   Regression, ANOVA            Regression, ANOVA
Dichotomous           Logistic Regression   Chi Squares                  Chi Squares
Polychotomous         (none listed)         Chi Squares                  Chi Squares

Units 11-14, 19, B: Dealing with Assumption Violations

slide-3
SLIDE 3

Unit 15: Roadmap (SPSS Output)

[Figure: SPSS output with callouts labeled Unit 4 and Unit 15.]

slide-4
SLIDE 4
Unit 15: Partial Correlation Matrices

Unit 15 Post Hole: Interpret a correlation matrix and/or partial correlation matrix and note what they may foreshadow about multiple regression.

Unit 15 Technical Memo and School Board Memo: Use a correlation matrix and a partial correlation matrix to get a handle on four variables of your choice (one continuous outcome variable, one predictor variable, and two control variables) in preparation for multiple regression.

Unit 15 Review: Review Units 4 and 5.

slide-5
SLIDE 5
Unit 15: Technical Memo and School Board Memo

Work Products (Part I of II):

I. Technical Memo: Have one section per analysis. For each section, follow this outline.
  A. Introduction
    i. State a theory (or perhaps hunch) for the relationship: think causally, be creative. (1 Sentence)
    ii. State a research question for each theory (or hunch): think correlationally, be formal. Now that you know the statistical machinery that justifies an inference from a sample to a population, begin each research question, “In the population, …” (1 Sentence)
    iii. List your variables, and label them “outcome” and “predictor,” respectively.
    iv. Include your theoretical model.
  B. Univariate Statistics. Describe your variables, using descriptive statistics. What do they represent or measure?
    i. Describe the data set. (1 Sentence)
    ii. Describe your variables. (1 Paragraph Each)
      a. Define the variable (parenthetically noting the mean and s.d. as descriptive statistics).
      b. Interpret the mean and standard deviation in such a way that your audience begins to form a picture of the way the world is. Never lose sight of the substantive meaning of the numbers.
      c. Polish off the interpretation by discussing whether the mean and standard deviation can be misleading, referencing the median, outliers and/or skew as appropriate.
      d. Note validity threats due to measurement error.
  C. Correlations. Provide an overview of the relationships between your variables using descriptive statistics. Focus first on the relationship between your outcome and question predictor, second (tied) on the relationships between your outcome and control predictors, second (tied) on the relationships between your question predictor and control predictors, and fourth on the relationship(s) between your control variables.
      a. Include your own simple/partial correlation matrix with a well-written caption.
      b. Interpret your simple correlation matrix. Note what the simple correlation matrix foreshadows for your partial correlation matrix; “cheat” here by peeking at your partial correlation and thinking backwards. Sometimes, your simple correlation matrix reveals possibilities in your partial correlation matrix. Other times, your simple correlation matrix provides foregone conclusions. You can stare at a correlation matrix all day, so limit yourself to two insights.
      c. Interpret your partial correlation matrix controlling for one variable. Note what the partial correlation matrix foreshadows for a partial correlation matrix that controls for two variables. Limit yourself to two insights.

slide-6
SLIDE 6

Unit 15: Technical Memo and School Board Memo

Work Products (Part II of II):

I. Technical Memo (continued)
  D. Regression Analysis. Answer your research question using inferential statistics. Weave your strategy into a coherent story.
    i. Include your fitted model.
    ii. Use the R² statistic to convey the goodness of fit for the model (i.e., strength).
    iii. To determine statistical significance, test each null hypothesis that the magnitude in the population is zero, reject (or not) the null hypothesis, and draw a conclusion (or not) from the sample to the population.
    iv. Create, display and discuss a table with a taxonomy of fitted regression models.
    v. Use spreadsheet software to graph the relationship(s), and include a well-written caption.
    vi. Describe the direction and magnitude of the relationship(s) in your sample, preferably with illustrative examples. Draw out the substance of your findings through your narrative.
    vii. Use confidence intervals to describe the precision of your magnitude estimates so that you can discuss the magnitude in the population.
    viii. If regression diagnostics reveal a problem, describe the problem and the implications for your analysis and, if possible, correct the problem.
      i. Primarily, check your residual-versus-fitted (RVF) plot. (Glance at the residual histogram and P-P plot.)
      ii. Check your residual-versus-predictor plots.
      iii. Check for influential outliers using leverage, residual and influence statistics.
      iv. Check your main effects assumptions by checking for interactions before you finalize your model.
  X. Exploratory Data Analysis. Explore your data using outlier resistant statistics.
    i. For each variable, use a coherent narrative to convey the results of your exploratory univariate analysis of the data. Don’t lose sight of the substantive meaning of the numbers. (1 Paragraph Each)
      1. Note if the shape foreshadows a need to nonlinearly transform and, if so, which transformation might do the trick.
    ii. For each relationship between your outcome and predictor, use a coherent narrative to convey the results of your exploratory bivariate analysis of the data. (1 Paragraph Each)
      1. If a relationship is non-linear, transform the outcome and/or predictor to make it linear.
      2. If a relationship is heteroskedastic, consider using robust standard errors.
II. School Board Memo: Concisely, precisely and plainly convey your key findings to a lay audience. Note that, whereas you are building on the technical memo for most of the semester, your school board memo is fresh each week. (Max 200 Words)
III. Memo Metacognitive
slide-7
SLIDE 7
Unit 15: Research Question

[Slide text garbled in extraction. The recoverable variable names are GENERALKNOWLEDGE, HEADSTARTHOURS, SES, ESL and AGE.]

$$\text{GENERALKNOWLEDGE} = \beta_0 + \beta_1 \text{HEADSTARTHOURS} + \beta_2 \text{SES} + \beta_3 \text{ESL} + \beta_4 \text{AGE} + \varepsilon$$

slide-8
SLIDE 8
  • SPSS DATA
slide-9
SLIDE 9
  • Simple Correlation Matrix

Cell annotations: R² = .01, R² = .00, R² = .19, R² = .11, R² = .06, R² = .04, R² = .00, R² = .00, R² = .06, R² = .02

Let’s call an R² statistic of .00 “no correlation” (even though, if we go out to enough decimal places, there will be some correlation).
Let’s call an R² statistic from .01 to .05 a “weak correlation.”
Let’s call an R² statistic from .06 to .15 a “moderate correlation.”
Let’s call an R² statistic greater than .15 a “strong correlation.”

Whether a correlation is strong or weak is relative. Never believe a chart that implies otherwise.
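The labeling rule above can be written down as a small function. This is only a sketch of the slide's rule of thumb: the function name `label_r_squared` is ours, and rounding to two decimal places first is an assumption made so that the stated cutoffs (.00, .05, .15) apply cleanly.

```python
def label_r_squared(r2: float) -> str:
    """Label a bivariate R-squared using the rule of thumb from this slide."""
    r2 = round(r2, 2)  # assumption: the cutoffs are applied at two decimal places
    if r2 == 0.00:
        return "no correlation"
    if r2 <= 0.05:
        return "weak correlation"
    if r2 <= 0.15:
        return "moderate correlation"
    return "strong correlation"


# A few of the R-squared values annotated on the matrix.
for value in (0.00, 0.01, 0.06, 0.19):
    print(f"R2 = {value:.2f}: {label_r_squared(value)}")
```

The labels are relative, as the slide insists, so the cutoffs should be read as a convention for this unit rather than a general standard.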

slide-10
SLIDE 10
  • Simple Correlation Matrix

Notice that GENERALKNOWLEDGE and SES have a strong correlation.
Notice that GENERALKNOWLEDGE and HEADSTARTHOURS have a weak correlation.
Notice that HEADSTARTHOURS and SES have a moderate correlation.
Also notice that AGE has a moderate correlation with GENERALKNOWLEDGE but no correlation with HEADSTARTHOURS, ESL or SES.

slide-11
SLIDE 11
  • What Do Those Circles Really Represent? Variance (Unit 5 Redux I)

This square represents the average squared mean deviation, in a word, THE variance.

Mean = 509, Variance = 1600, SD = 40
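To make "average squared mean deviation" concrete, here is a minimal sketch. The four scores are hypothetical, chosen only so that the summary statistics match the slide (mean 509, variance 1600, SD 40); they are not the course data.

```python
from statistics import mean, pvariance

# Hypothetical scores, picked so every score sits 40 points from the mean.
scores = [469, 549, 469, 549]

center = mean(scores)
squared_deviations = [(x - center) ** 2 for x in scores]

variance = mean(squared_deviations)  # the average squared deviation from the mean
sd = variance ** 0.5                 # the side of the "variance square"

print(center, variance, sd)
print(pvariance(scores) == variance)  # the library's population variance agrees
```

The standard deviation is the side of the square and the variance is its area, which is why an SD of 40 goes with a variance of 1600.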


slide-12
SLIDE 12

What Do Those Circles Really Represent? Variance (Unit 5 Redux II)

The mean square residual (or mean square error) is the variance* of the residuals.

*Not quite… notice the degrees of freedom.
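The "not quite" can be seen by computing both quantities side by side. The sketch below uses synthetic data (the seed, sample size and coefficients are arbitrary assumptions, not course data) and a one-predictor regression, so the mean square residual divides by n - 2 rather than n.

```python
import random

random.seed(0)
n = 50
# Synthetic predictor and outcome, for illustration only.
x = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]

# Ordinary least squares for one predictor.
mean_x, mean_y = sum(x) / n, sum(y) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

sum_of_squares = sum(e ** 2 for e in residuals)
residual_variance = sum_of_squares / n           # plain average squared residual
mean_square_residual = sum_of_squares / (n - 2)  # divides by the degrees of freedom

print(residual_variance, mean_square_residual)
```

The two numbers differ only by the factor n / (n - 2), so with large samples the distinction is small, which is presumably why the slide treats it as a footnote.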


slide-13
SLIDE 13

What Do Those Circles Really Represent? Variance (Unit 5 Redux III)

The mean square residual (or mean square error) is the variance* of the residuals.

*Not quite… notice the degrees of freedom.

That small square is the variance in the outcome still in need of predicting AFTER the (one) predictor has done all its predictive work. To see how small is small, we can compare the variance-still-in-need-of-predicting with the original variance…

This square represents the average squared mean deviation, in a word, THE variance.

[Diagram labels: Outcome Variance, Predictor Variance, Unpredicted Variance, Predicted Variance. Predicted by what?]

Notice that the outcome variance and the predictor variance are identical in size. That’s because (for conceptual purposes) we standardized both the outcome and predictor so that each mean is zero and each standard deviation is one. If the standard deviation is one, then the variance is also one. I.e., if a side of the square is one, then the area of the square is also one. By standardizing, we compare apples to apples.

Also, notice that if the predictor overlaps 95% of the outcome, then the outcome overlaps 95% of the predictor. I.e., the outcome predicts the predictor just as well as the predictor predicts the outcome. Correlations are symmetrical!

Now, what do those circles really represent? They just represent variance. Instead of squares, we use circles. Although I’m tempted to use “Boolean squares” in the future for the sake of clarity.
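The symmetry claim can be checked numerically. In this sketch (synthetic data; the helper names `standardize` and `slope` are ours), we standardize both variables and regress each on the other: the two slopes come out identical, and each is the correlation.

```python
import random


def standardize(values):
    """Rescale to mean zero and standard deviation one."""
    n = len(values)
    m = sum(values) / n
    sd = (sum((v - m) ** 2 for v in values) / n) ** 0.5
    return [(v - m) / sd for v in values]


def slope(predictor, outcome):
    """Least-squares slope from regressing outcome on predictor."""
    n = len(predictor)
    mp, mo = sum(predictor) / n, sum(outcome) / n
    return (sum((p - mp) * (o - mo) for p, o in zip(predictor, outcome))
            / sum((p - mp) ** 2 for p in predictor))


random.seed(1)
x = [random.gauss(0, 1) for _ in range(200)]      # synthetic predictor
y = [0.6 * xi + random.gauss(0, 1) for xi in x]   # synthetic outcome

zx, zy = standardize(x), standardize(y)
slope_y_on_x = slope(zx, zy)
slope_x_on_y = slope(zy, zx)

print(slope_y_on_x, slope_x_on_y)  # the same number both ways: the correlation r
```

Without standardizing, the two slopes would generally differ, because each would carry the units of its own outcome.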

slide-14
SLIDE 14
  • Three’s Company

Notice that GENERALKNOWLEDGE and SES have a strong correlation.
Notice that GENERALKNOWLEDGE and HEADSTARTHOURS have a weak correlation.
Notice that HEADSTARTHOURS and SES have a moderate correlation.

If we look at all three circles overlapping simultaneously, the yellow and blue can stay where they are, and the red must take a small bite out of the yellow and a medium bite out of the blue. Graphically, to get the right sized bites (more or less), we’ll squish the red circle, but this is purely graphical, not conceptual. Conceptually, there are different ways to get the right size bites.

slide-15
SLIDE 15
  • Three Possibilities

A: Variation in GENERALKNOWLEDGE uniquely predicted by HEADSTARTHOURS.
B: Variation in GENERALKNOWLEDGE uniquely predicted by SES.
C: Variation in GENERALKNOWLEDGE jointly predicted by HEADSTARTHOURS and SES.
D: Variation in GENERALKNOWLEDGE unpredicted by HEADSTARTHOURS and SES.

[Three Venn diagrams, with regions labeled A, B, C and D.]

1. HEADSTARTHOURS and SES each uniquely predict variation in GENERALKNOWLEDGE, but they do not jointly predict variation in GENERALKNOWLEDGE.
2. HEADSTARTHOURS and SES jointly predict variation in GENERALKNOWLEDGE, but only SES uniquely predicts variation in GENERALKNOWLEDGE.
3. HEADSTARTHOURS and SES each uniquely predict variation in GENERALKNOWLEDGE, but they also jointly predict variation in GENERALKNOWLEDGE.

slide-16
SLIDE 16

Determining Uniquely Predicted Variation: R² Change (I of III)

“Unique” is relative to the other predictors in the model. In other words, uniquely predicted variation is predicted variation unique from the variation predicted by the “control” predictors in the model.

Model 1:
$$\text{GENERALKNOWLEDGE} = \beta_0 + \beta_1 \text{SES} + \varepsilon$$

Model 2:
$$\text{GENERALKNOWLEDGE} = \beta_0 + \beta_1 \text{SES} + \beta_2 \text{HEADSTARTHOURS} + \varepsilon$$

$$R^2 = 1 - \frac{\sigma^2_{\text{Residual}}}{\sigma^2_{\text{Outcome}}}$$

GENERALKNOWLEDGE variance = 47.8
Model 1 residuals variance = 38.8; Model 1: R² = .19
Model 2 residuals variance = 38.8; Model 2: R² = .19

HEADSTARTHOURS does not uniquely predict variation in GENERALKNOWLEDGE over and above the variation predicted by SES. The addition of HEADSTARTHOURS to our model does not decrease the residual variance. I.e., it does not tell us anything we did not know with SES alone!
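The arithmetic behind those two R² statistics is short enough to check by hand or in a few lines. The variances below are the ones reported on this slide; the function name `r_squared` is ours.

```python
def r_squared(residual_variance, outcome_variance):
    """R-squared as one minus the share of outcome variance left unpredicted."""
    return 1 - residual_variance / outcome_variance


outcome_variance = 47.8                         # GENERALKNOWLEDGE variance
model_1 = r_squared(38.8, outcome_variance)     # SES only
model_2 = r_squared(38.8, outcome_variance)     # SES plus HEADSTARTHOURS
r2_change = model_2 - model_1

print(round(model_1, 2), round(model_2, 2), round(r2_change, 2))
```

Because the residual variance is 38.8 under both models (to the precision reported), the change in R² is zero to that same precision.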

slide-17
SLIDE 17
  • Determining Uniquely Predicted Variation: R² Change (II of III)

GENERALKNOWLEDGE variance = 47.8 (A+B+C+D)
Model 1 residuals variance = 38.8 (A+D); Model 1: R² = .19
Model 2 residuals variance = 38.8 (D); Model 2: R² = .19

[Venn diagram: circles for the outcome (GK), the predictor (HS) and the control (SES), with regions B, C and D and no region A.]

$$\text{Model 1: } R^2 = 1 - \frac{A+D}{A+B+C+D} = \frac{B+C}{A+B+C+D} = .19$$

$$\text{Model 2: } R^2 = 1 - \frac{D}{A+B+C+D} = \frac{A+B+C}{A+B+C+D} = .19$$

$$\therefore A \approx 0$$

We have been training ourselves to think of variables in terms of “outcomes” and “predictors.” Now we are seeing that there are two types of predictors: question predictors (“predictors,” for short) and control predictors (“controls,” for short).

slide-18
SLIDE 18
  • Determining Uniquely Predicted Variation: R² Change (III of III)

Change in the R² statistic is one way to determine uniquely predicted variation. Think of models nested within models. Model 1 is tightly nested within Model 2 if Model 2 has not only the same outcome and predictors as Model 1 but also one additional predictor. The additional predictor uniquely predicts variation in the outcome if and only if there is an increase in the R² statistic from the tightly nested model to the tightly nesting model; this increase is called the partial R² statistic.

Model 1:
$$\text{GENERALKNOWLEDGE} = \beta_0 + \beta_1 \text{SES} + \varepsilon$$

Model 2:
$$\text{GENERALKNOWLEDGE} = \beta_0 + \beta_1 \text{SES} + \beta_2 \text{HEADSTARTHOURS} + \varepsilon$$

GENERALKNOWLEDGE variance = 47.8
Model 1 residuals variance = 38.8; Model 1: R² = .19
Model 2 residuals variance = 38.8; Model 2: R² = .19

Models 1 and 2 form a (small) set of hierarchically nested models.

INCREASE? (R² goes from .19 in Model 1 to .19 in Model 2.)
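A rough sketch of the nested-model comparison follows. The data are synthetic (the seed, sample size and coefficients are assumptions, not the course data), built so that the added predictor is related to the control but has no effect of its own. For brevity the two-predictor R² is computed from the three pairwise correlations, a standard identity, rather than by fitting the model directly; the helper names are ours.

```python
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5


def r2_one_predictor(outcome, p):
    return corr(outcome, p) ** 2


def r2_two_predictors(outcome, p1, p2):
    """R-squared for two predictors, from the three pairwise correlations."""
    ry1, ry2, r12 = corr(outcome, p1), corr(outcome, p2), corr(p1, p2)
    return (ry1 ** 2 + ry2 ** 2 - 2 * ry1 * ry2 * r12) / (1 - r12 ** 2)


random.seed(15)
ses = [random.gauss(0, 1) for _ in range(500)]
hours = [-0.3 * s + random.gauss(0, 1) for s in ses]      # related to SES only
knowledge = [0.45 * s + random.gauss(0, 1) for s in ses]  # no effect of hours

r2_model_1 = r2_one_predictor(knowledge, ses)           # tightly nested model
r2_model_2 = r2_two_predictors(knowledge, ses, hours)   # tightly nesting model
r2_change = r2_model_2 - r2_model_1                     # the partial R-squared

print(round(r2_model_1, 3), round(r2_model_2, 3), round(r2_change, 3))
```

In a sample, the R² change is almost never exactly zero, so in practice the question is whether the increase is large enough to matter, not whether it is strictly positive.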

slide-19
SLIDE 19
  • Determining Uniquely Predicted Variation: Partial Correlation (I of IV)

Partial correlation (i.e., the partial r statistic) is another way to determine uniquely predicted variation. The partial correlation measures the relationship after we partial out a control variable (or set of control variables). A partial correlation can be greater or less than the simple correlation. Partial correlations can change signs from their simple correlations!

If we ignore positive/negative signs, we can get a good handle on partial correlations through the R² statistic.* Recall that when we square a Pearson correlation (r), we get an R² statistic. We lose the sign but we get a cool interpretation in terms of proportion of predicted variance. Because we lose the sign, we cannot get back to the Pearson correlation by square rooting the R² statistic, but we can get to the absolute value of the Pearson correlation: |r|.

*The partial R² from the previous slides is NOT directly analogous to the partial r.

A control model is a model in which all the predictor variables are control predictors.

[Venn diagram: circles for the outcome, the predictor and the control, with regions A, B, C and D.]

The square of the Pearson product moment correlation:

$$r^2 = R^2 = 1 - \frac{\sigma^2_{\text{Residual, Simple Model}}}{\sigma^2_{\text{Outcome}}} = 1 - \frac{B+D}{A+B+C+D} = \frac{A+C}{A+B+C+D}$$

The square of a partial correlation between a predictor and an outcome controlling for one or more variables:

$$(\text{partial } r)^2 = 1 - \frac{\sigma^2_{\text{Residual, Control-Plus-Predictor Model}}}{\sigma^2_{\text{Residual, Control Model}}} = 1 - \frac{D}{A+D} = \frac{A}{A+D}$$

slide-20
SLIDE 20

Determining Uniquely Predicted Variation: Partial Correlation (II of IV)

GENERALKNOWLEDGE variance = 47.8 (A+B+C+D)
Model 1 residuals variance = 38.8 (A+D)
Model 2 residuals variance = 38.8 (D)

Model 2: R² = .19; (partial r)² = .00

Whereas the R² statistic uses the outcome variance as its base, the (partial r)² statistic for HEADSTARTHOURS uses the residual variance from the control model (e.g., Model 1) as its base.

$$R^2 = 1 - \frac{\sigma^2_{\text{Model 2 Residual}}}{\sigma^2_{\text{Outcome}}} = 1 - \frac{D}{A+B+C+D} = \frac{A+B+C}{A+B+C+D} = .19$$

$$(\text{partial } r)^2 = 1 - \frac{\sigma^2_{\text{Model 2 Residual}}}{\sigma^2_{\text{Model 1 Residual}}} = 1 - \frac{D}{A+D} = \frac{A}{A+D} = .00$$

The insight here is that we can use residuals as the basis for our calculations.

[Venn diagram: circles for the outcome, the predictor and the control, with regions A, B, C and D.]
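The change of base is easy to see in code. The first call below uses the residual variances reported on this slide; the second uses a made-up full-model residual variance (34.92) purely to show what a nonzero result would look like. The function name is ours.

```python
def partial_r_squared(residual_variance_full, residual_variance_control):
    """Share of the control model's leftover variance picked up by the added predictor."""
    return 1 - residual_variance_full / residual_variance_control


# Slide values: Model 2 (full) and Model 1 (control) both leave 38.8 unpredicted.
print(partial_r_squared(38.8, 38.8))

# Hypothetical contrast: if the full model had left only 34.92 unpredicted.
print(round(partial_r_squared(34.92, 38.8), 2))
```

Note that the denominator is the control model's residual variance (A+D), not the outcome variance (A+B+C+D), which is exactly the distinction the slide draws.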

slide-21
SLIDE 21

Determining Uniquely Predicted Variation: Partial Correlation (III of IV)

Partial correlation (partial r) is a correlation between two sets of residuals. Here, we are using residuals as controlled observations (which we have done in previous units to identify subjects who were performing better or worse than expected). One set of residuals comes from a regression of our outcome variable on our control variable(s). The other set of residuals comes from a regression of our predictor variable on our control variable(s). The correlation between the two sets of residuals (i.e., the partial correlation) tells us not whether the observations are correlated, but rather whether the controlled observations are correlated.

Notice in the diagram from Part I of our exposition on partial correlation that, after we partial out the control variable, we have less variation in the outcome variable and the predictor variable (i.e., each full moon becomes a crescent moon). The crescents represent residuals, and where they overlap, the overlap represents their correlation.

Model GKONSES:
$$\text{GENERALKNOWLEDGE} = \beta_0 + \beta_1 \text{SES} + \varepsilon$$

Let ε from Model GKONSES be called GKONSESERROR and its z-transformation ZGKONSESERROR.

Model HSONSES:
$$\text{HEADSTARTHOURS} = \beta_0 + \beta_1 \text{SES} + \varepsilon$$

Let ε from Model HSONSES be called HSONSESERROR and its z-transformation ZHSONSESERROR.

Model Voila:
$$\text{ZGKONSESERROR} = \beta_0 + \beta_1 \text{ZHSONSESERROR} + \varepsilon$$

β1 equals the partial correlation between GENERALKNOWLEDGE and HEADSTARTHOURS, controlling for SES. Recall that when we regress a standardized outcome on a standardized predictor, the slope coefficient is the Pearson correlation (r).

[Venn diagram: circles for the outcome, the predictor and the control, with regions A, B, C and D.]
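The two-sets-of-residuals recipe can be sketched as follows. The data are synthetic (seed, sample size and coefficients are assumptions), and the helper names `corr` and `residuals` are ours. Since a correlation is unchanged by standardizing, the sketch correlates the raw residuals directly instead of z-transforming them first, and then cross-checks the result against the usual three-correlation formula for a first-order partial correlation.

```python
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5


def residuals(outcome, control):
    """Residuals from a simple regression of outcome on control."""
    n = len(outcome)
    mo, mc = sum(outcome) / n, sum(control) / n
    slope = (sum((c - mc) * (o - mo) for c, o in zip(control, outcome))
             / sum((c - mc) ** 2 for c in control))
    return [o - (mo + slope * (c - mc)) for c, o in zip(control, outcome)]


random.seed(21)
ses = [random.gauss(0, 1) for _ in range(400)]
hours = [-0.3 * s + random.gauss(0, 1) for s in ses]
knowledge = [0.45 * s + 0.1 * h + random.gauss(0, 1) for s, h in zip(ses, hours)]

gk_on_ses_error = residuals(knowledge, ses)  # outcome, controlling for SES
hs_on_ses_error = residuals(hours, ses)      # predictor, controlling for SES
partial_r = corr(gk_on_ses_error, hs_on_ses_error)

# Cross-check with the formula based on the three simple correlations.
r_gh, r_gs, r_hs = corr(knowledge, hours), corr(knowledge, ses), corr(hours, ses)
formula = (r_gh - r_gs * r_hs) / ((1 - r_gs ** 2) * (1 - r_hs ** 2)) ** 0.5

print(round(partial_r, 4), round(formula, 4))
```

The two routes agree to floating-point precision, which is a useful sanity check when building a partial correlation by hand.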

slide-22
SLIDE 22

Determining Uniquely Predicted Variation: Partial Correlation (IV of IV)

Regression of GENERALKNOWLEDGE on SES: β1 = slope = 4.68
One unit of SES is positively associated with 4.68 points on the general knowledge test.

Regression of HEADSTARTHOURS on SES: β1 = slope = 2.95
One unit of SES is positively associated with 2.95 hours per week of Head Start.

Regression of the (standardized) residuals, i.e., the controlled observations, controlling for SES: β1 = slope = partial r = -.020
Note: the slope is the correlation because the outcome and predictor are standardized.

When controlling for SES, hours per week of Head Start has a partial correlation of -.020 with scores on the general knowledge test.

slide-23
SLIDE 23

Comparing the Simple Correlation to the Partial Correlation

Uncontrolled relationship: β1 = slope = r = -.122
Controlled relationship: β1 = slope = partial r = -.020
Note: in both cases the slope is the correlation because the outcome and predictor are standardized.

Surprising things can happen upon statistical control. The correlations between residuals (i.e., controlled observations) can be very different from the correlations between variables (i.e., uncontrolled observations).

We can make a few observations about the differences between the controlled relationship (controlling for SES) and the uncontrolled relationship of GENERALKNOWLEDGE and HEADSTARTHOURS.

The controlled relationship is weaker, with a partial r of -.020 compared with an r of -.122. Upon statistical control, the relationship is no longer statistically significant (p = .576, compared with p < .001 for the uncontrolled relationship). This makes substantive sense to me. Head Start is a program for educationally at-risk children, with low SES being a primary risk factor. Head Start participants are likely to read worse, and that is precisely why they are Head Start participants. The question is not whether Head Start participants read better or worse than non-participants. Rather, the question is whether they read better than they would if they hadn’t participated in Head Start. We need treatment and control groups that are equal (in expectation) to answer that question. In the absence of a control group, we can use statistical control, which is infinitely less valid but often the best we have. A randomized control group controls for all variables observed, unobserved and unobservable, whereas statistical control controls for a few observed variables.

It appears that the normality assumption (and perhaps the linearity assumption) is better met in the controlled relationship. GLM assumption violations can appear or disappear upon statistical control.

slide-24
SLIDE 24

A Partial Correlation Matrix (Partialling Out SES)

Cell annotations: (partial r)² = .00, (partial r)² = .07, (partial r)² = .00, (partial r)² = .08, (partial r)² = .00, (partial r)² = .01

You can see that, as with simple correlation matrices, partial correlation matrices are symmetric about the diagonal, so which variables we consider the outcome or predictor in any given cell is arbitrary.
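One way to build such a matrix by hand is to residualize every variable on the control and then correlate the residuals pairwise. The sketch below does this for three hypothetical variables named a, b and c and one control, all synthetic; it is meant to illustrate the symmetry, not to reproduce the SPSS output.

```python
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5


def residuals(outcome, control):
    """Residuals from a simple regression of outcome on control."""
    n = len(outcome)
    mo, mc = sum(outcome) / n, sum(control) / n
    slope = (sum((c - mc) * (o - mo) for c, o in zip(control, outcome))
             / sum((c - mc) ** 2 for c in control))
    return [o - (mo + slope * (c - mc)) for c, o in zip(control, outcome)]


random.seed(24)
n = 300
control = [random.gauss(0, 1) for _ in range(n)]
data = {  # hypothetical variables, each partly driven by the control
    "a": [0.5 * z + random.gauss(0, 1) for z in control],
    "b": [-0.3 * z + random.gauss(0, 1) for z in control],
    "c": [random.gauss(0, 1) for _ in control],
}

controlled = {name: residuals(values, control) for name, values in data.items()}
matrix = {(i, j): corr(controlled[i], controlled[j]) for i in data for j in data}

for i in data:
    print(i, [round(matrix[i, j], 3) for j in data])
```

Each off-diagonal entry appears twice with the same value, which is why only the lower (or upper) triangle is usually reported.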

slide-25
SLIDE 25

A Partial Correlation Matrix (Partialling Out SES)

slide-26
SLIDE 26

A Simple/Partial Correlation Matrix

                   GENERALKNOWLEDGE   HEADSTARTHOURS   AGE      ESL
HEADSTARTHOURS     -.122***
                   -.020
AGE                 .247***            .019
                    .258***            .028
ESL                -.332***            .152***         -.038
                   -.277***            .109**          -.032
SES                 .433***           -.242***          .033    -.201***
                    --                 --               --       --

Key: *p<.05, **p<.01, ***p<.001

Figure 15.1. A simple/partial correlation matrix in which the top entry in each cell denotes the simple correlation and the bottom entry of each cell denotes the partial correlation controlling for SES (n = 816).

GENERALKNOWLEDGE and HEADSTARTHOURS have a weak negative correlation (r = -.122, p < .001) that all but disappears when we control for SES (partial r = -.020, not statistically significant). The correlations between GENERALKNOWLEDGE and AGE and between GENERALKNOWLEDGE and ESL are moderate, and they remain moderate when we partial out SES. Of particular interest is ESL, which not only remains moderately correlated with our outcome GENERALKNOWLEDGE upon statistical control of SES (as we just mentioned) but also remains correlated with our question predictor, HEADSTARTHOURS. This suggests that if we control for ESL in addition to SES, the relationship between GENERALKNOWLEDGE and HEADSTARTHOURS may differ from the simple and partial (controlling for SES) correlations. On the other hand, AGE is not correlated with both GENERALKNOWLEDGE and HEADSTARTHOURS but only GENERALKNOWLEDGE. This suggests that if we control for AGE in addition to SES, the correlation between GENERALKNOWLEDGE and HEADSTARTHOURS will increase. (Note: You should be able to nail the first two sentences. For the following sentences, I want you to try your hand at foreshadowing. Use the “Extreme Scenarios” slide as a guide.)
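As a cross-check on Figure 15.1, a first-order partial correlation can be recovered from three simple correlations with the standard formula. The sketch below plugs in the simple correlations reported in the figure; the function name `partial_r` and the variable names are ours.

```python
def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation of x and y controlling for z, from simple correlations."""
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


# Simple correlations taken from the top entries of Figure 15.1.
gk_ses = 0.433
hs_ses = -0.242
age_ses = 0.033
gk_hs = -0.122
gk_age = 0.247

gk_hs_given_ses = partial_r(gk_hs, hs_ses, gk_ses)
gk_age_given_ses = partial_r(gk_age, age_ses, gk_ses)

print(f"{gk_hs_given_ses:.3f}")   # compare with the bottom entry for GK/HEADSTARTHOURS
print(f"{gk_age_given_ses:.3f}")  # compare with the bottom entry for GK/AGE
```

Small discrepancies in the third decimal place can arise because the inputs are themselves rounded to three places.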

slide-27
SLIDE 27

Dig the Post Hole

Unit 15 Post Hole: Interpret a correlation matrix and/or partial correlation matrix and note what they may foreshadow about multiple regression.

  • Interpret the partial correlation matrix in the same way as you would a simple correlation matrix, but be sure to note, “Controlling for….”
  • Try your best with the foreshadowing. After a few minutes, take a stab. Use extreme correlations, high (near ±1) or low (near 0), in conjunction with the necessary consequences from the following “Extreme Scenarios” slide. When the outcome, predictor and control are all moderately correlated among themselves, anything can happen!

slide-28
SLIDE 28

Partial Correlations Can Be Greater/Less Than Their Simple Correlations

[Venn diagrams: circles for the outcome, the predictor and the control, with regions A, B, C and D.]

Small Simple Correlation, Large Partial Correlation:

$$\frac{A+C}{A+B+C+D} < \frac{A}{A+D}$$

Large Simple Correlation, Small Partial Correlation:

$$\frac{A+C}{A+B+C+D} > \frac{A}{A+D}$$
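Both directions are easy to produce with made-up numbers. The sketch below applies the standard three-correlation formula for a partial correlation to two hypothetical sets of simple correlations (the values are illustrative assumptions, not course data): one where control enlarges the correlation and one where it shrinks it.

```python
def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation of x and y controlling for z, from simple correlations."""
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


# Hypothetical case 1: the control predicts the outcome well but is unrelated
# to the predictor, so it shrinks the variance left to predict.
small_simple = 0.2
large_partial = partial_r(small_simple, r_xz=0.0, r_yz=0.9)

# Hypothetical case 2: the control is strongly related to both predictor and
# outcome, so most of their shared variation is jointly predicted.
large_simple = 0.5
small_partial = partial_r(large_simple, r_xz=0.7, r_yz=0.7)

print(round(large_partial, 3), round(small_partial, 3))
```

In Venn terms, case 1 shrinks D while leaving A alone, and case 2 is mostly region C with very little region A.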

slide-29
SLIDE 29

Extreme Scenarios For Conceptual/Foreshadowing Purposes

Outcome Variable: READING – a standardized reading score
Predictor Variable: HOMEWORK – self-reported hours spent per week on homework
Control Variable: SES – a socio-economic status composite score

We are interested in the relationship between READING and HOMEWORK, and we want to look past the universal confound of SES.

Each extreme scenario below is followed by its consequence for the correlation between READING and HOMEWORK controlling for SES.

#1: SES is perfectly correlated with READING (r = 1.00).
    partial r = 0.00. Why? There is no unique variation left in READING for HOMEWORK to predict.
#2: SES is perfectly uncorrelated with READING (r = 0.00).
    partial r = simple r. Why? Any variation that HOMEWORK predicts in READING will be unique from the variation that SES predicts (because SES does not predict any!).
#3: HOMEWORK is perfectly correlated with READING (r = 1.00).
    partial r = 1.00 (unless #1). Why? HOMEWORK predicts all the unique variation in READING after SES predicts its variation.
#4: HOMEWORK is perfectly uncorrelated with READING (r = 0.00).
    partial r = 0.00. Why? HOMEWORK predicts no variation at all, so it cannot predict any unique variation.
#5: HOMEWORK is perfectly correlated with SES (r = 1.00).
    partial r = 0.00. Why? HOMEWORK predicts the same variation as SES, so it cannot predict any unique variation.
#6: HOMEWORK is perfectly uncorrelated with SES (r = 0.00).
    partial r ≥ simple r. Why? Any variation that HOMEWORK predicts in READING will be unique from the variation that SES predicts in READING, but SES will decrease the variation in need of predicting insofar as it is correlated with READING.

slide-30
SLIDE 30

Answering our Roadmap Question

Unit 15: What are the correlations among reading, ESL, and homework, controlling for SES?

First, let’s speculate based on this simple correlation matrix and our substantive knowledge or hunches (or prejudices?). We know that free lunch eligibility, our proxy for low SES, is negatively correlated with reading scores. We see that homework hours is correlated with reading scores, but we have to wonder: Is SES a confounding third variable in the correlation between homework and reading? Perhaps the homework/reading correlation is just the SES/reading correlation in disguise? We must wonder this insofar as homework and SES are correlated. In fact, homework and SES are not highly correlated (r = -0.09). Nevertheless, we still have to wonder how much of the homework/reading correlation is uniquely predicted (aside from the SES/reading correlation). It is possible, perhaps likely, that at least some of the same variation in reading scores is jointly predicted by both homework and SES.

slide-31
SLIDE 31

Answering our Roadmap Question

Unit 15: What are the correlations among reading, ESL, and homework, controlling for SES?

Controlling for free lunch eligibility, there remains a positive correlation between hours spent on homework per week and reading scores (partial r = .165, p < .001). Thus, homework predicts unique variation in reading scores over and above the variation predicted by free lunch eligibility. We may consider further controlling for ESL status, but its correlations with both homework and reading scores are so low that it will probably not inform the relationship between homework and reading scores.

slide-32
SLIDE 32

Answering our Roadmap Question

Unit 15: What are the correlations among reading, ESL, and homework, controlling for SES?

              READING     HOMEWORK    ESL
HOMEWORK       .183***
               .165***
ESL           -.053***     .005
              -.029***     .014
FREELUNCH     -.267***    -.092***     .093***
               --          --          --

Key: *p<.05, **p<.01, ***p<.001

Figure 15.2. A simple/partial correlation matrix in which the top entry in each cell denotes the simple correlation and the bottom entry of each cell denotes the partial correlation controlling for free lunch eligibility (n = 816).

Notice that the partial correlations for READING/HOMEWORK and READING/ESL are less than their simple correlations, but the partial correlation for HOMEWORK/ESL is greater than its simple correlation. Substantively, the differences seem trivial, but, pedagogically, this is a good illustration of the possibilities. Sometimes, the direction of correlation can switch upon statistical control. We will see why in Unit 16.
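For readers who want to see where a bottom entry comes from, the READING/HOMEWORK partial correlation can be worked out from the three simple correlations in Figure 15.2 using the standard first-order partial correlation formula. The subscripts R, H and F (for READING, HOMEWORK and FREELUNCH) are our shorthand, and the inputs are the rounded values from the figure.

```latex
\begin{aligned}
r_{RH \cdot F}
  &= \frac{r_{RH} - r_{RF}\, r_{HF}}{\sqrt{\left(1 - r_{RF}^2\right)\left(1 - r_{HF}^2\right)}} \\
  &= \frac{.183 - (-.267)(-.092)}{\sqrt{\left(1 - .267^2\right)\left(1 - .092^2\right)}} \\
  &= \frac{.183 - .0246}{\sqrt{(.9287)(.9915)}}
   \approx \frac{.1584}{.9596}
   \approx .165
\end{aligned}
```

The result agrees with the partial correlation reported in the figure, and it shows why the drop from .183 is small: the product of the two FREELUNCH correlations is only about .02.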

slide-33
SLIDE 33

Unit 15 Appendix: Key Concepts

  • Why Residuals? Unaccounted Variables, Measurement Error, Individual Variation
  • “Unique” is relative to the other predictors in the model. In other words, uniquely predicted variation is predicted variation unique from the variation predicted by the “control” predictors in the model.
  • Partial correlations can change signs from their simple correlations!
  • The partial R² from the previous slides is NOT directly analogous to the partial r.
  • Surprising things can happen upon statistical control. The correlations between residuals (i.e., controlled observations) can be very different from the correlations between variables (i.e., uncontrolled observations).
  • GLM assumption violations can appear or disappear upon statistical control.

slide-34
SLIDE 34

Unit 15 Appendix: Key Interpretations

  • HEADSTARTHOURS and SES each uniquely predict variation in GENERALKNOWLEDGE, but they do not jointly predict variation in GENERALKNOWLEDGE.
  • HEADSTARTHOURS and SES jointly predict variation in GENERALKNOWLEDGE, but only SES uniquely predicts variation in GENERALKNOWLEDGE.
  • HEADSTARTHOURS and SES each uniquely predict variation in GENERALKNOWLEDGE, but they also jointly predict variation in GENERALKNOWLEDGE.
  • HEADSTARTHOURS does not uniquely predict variation in GENERALKNOWLEDGE over and above the variation predicted by SES.
  • When controlling for SES, hours per week of Head Start has a partial correlation of -.020 with scores on the general knowledge test.
  • GENERALKNOWLEDGE and HEADSTARTHOURS have a weak negative correlation (r = -.122, p < .001) that all but disappears when we control for SES (partial r = -.020, not statistically significant). The correlations between GENERALKNOWLEDGE and AGE and between GENERALKNOWLEDGE and ESL are moderate, and they remain moderate when we partial out SES. Of particular interest is ESL, which not only remains moderately correlated with our outcome GENERALKNOWLEDGE upon statistical control of SES (as we just mentioned) but also remains correlated with our question predictor, HEADSTARTHOURS. This suggests that if we control for ESL in addition to SES, the relationship between GENERALKNOWLEDGE and HEADSTARTHOURS may differ from the simple and partial (controlling for SES) correlations. On the other hand, AGE is not correlated with both GENERALKNOWLEDGE and HEADSTARTHOURS but only GENERALKNOWLEDGE. This suggests that if we control for AGE in addition to SES, the correlation between GENERALKNOWLEDGE and HEADSTARTHOURS will increase. (Note: You should be able to nail the first two sentences. For the following sentences, I want you to try your hand at foreshadowing. Use the “Extreme Scenarios” slide as a guide.)
  • Controlling for free lunch eligibility, there remains a positive correlation between hours spent on homework per week and reading scores (partial r = .165, p < .001). Thus, homework predicts unique variation in reading scores over and above the variation predicted by free lunch eligibility. We may consider further controlling for ESL status, but its correlations with both homework and reading scores are so low that it will probably not inform the relationship between homework and reading scores.

slide-35
SLIDE 35

Unit 15 Appendix: Key Terminology

Variance is just a hardworking number trying, trying, trying to summarize the variation of a univariate distribution. It is one of many statistical summaries of variation, including range, midspread and standard deviation. Variance is the average squared deviation from the mean.

The mean square residual (or error) represents the variance in the outcome that is left over after we fit our model. It is an average. Every observation has a residual. We can square that residual. The mean square residual is just the average squared residual.

We have been training ourselves to think of variables in terms of “outcomes” and “predictors.” Now we are seeing that there are two types of predictors: question predictors (“predictors,” for short) and control predictors (“controls,” for short).

Change in the R² statistic is one way to determine uniquely predicted variation. Think of models nested within models. Model 1 is tightly nested within Model 2 if Model 2 has not only the same outcome and predictors as Model 1 but also one additional predictor. The additional predictor uniquely predicts variation in the outcome if and only if there is an increase in the R² statistic from the tightly nested model to the tightly nesting model; this increase is called the partial R² statistic.

Partial correlation (i.e., the partial r statistic) is another way to determine uniquely predicted variation. The partial correlation measures the relationship after we partial out a control variable (or set of control variables). A partial correlation can be greater or less than the simple correlation.

A control model is a model in which all the predictor variables are control predictors.

Partial correlation (partial r) is a correlation between two sets of residuals. Here, we are using residuals as controlled observations (which we have done in previous units to identify subjects who were performing better or worse than expected). One set of residuals comes from a regression of our outcome variable on our control variable(s). The other set of residuals comes from a regression of our predictor variable on our control variable(s). The correlation between the two sets of residuals (i.e., the partial correlation) tells us not whether the observations are correlated, but rather whether the controlled observations are correlated.
slide-36
SLIDE 36

Unit 15 Appendix: Formulas

slide-37
SLIDE 37

Unit 15 Appendix: SPSS Syntax

PARTIAL CORR
  /VARIABLES=READING HOMEWORK ESL BY FREELUNCH
  /SIGNIFICANCE=TWOTAIL
  /MISSING=LISTWISE.

slide-38
SLIDE 38

Analyze > Correlate > Partial…

slide-39
SLIDE 39


slide-40
SLIDE 40
Perceived Intimacy of Adolescent Girls (Intimacy.sav)

  • Overview: Data set contains self-ratings of the intimacy that adolescent girls perceive themselves as having with: (a) their mother and (b) their boyfriend.
  • Source: HGSE thesis by Dr. Linda Kilner entitled Intimacy in Female Adolescent's Relationships with Parents and Friends (1991). Kilner collected the ratings using the Adolescent Intimacy Scale.
  • Sample: 64 adolescent girls in the sophomore, junior and senior classes of a local suburban public school system.
  • Variables: [variable list not recoverable from the extracted text]

slide-41
SLIDE 41
Perceived Intimacy of Adolescent Girls (Intimacy.sav)

slide-42
SLIDE 42

Perceived Intimacy of Adolescent Girls (Intimacy.sav)

slide-43
SLIDE 43

Perceived Intimacy of Adolescent Girls (Intimacy.sav)

slide-44
SLIDE 44
  • ? .)

!$%;

HighSchoolandBeyond(HSB.sav)

H Source:SubsetofdatagraciouslyprovidedbyValerieLee,Universityof Michigan. H Sample:Thissubsamplehas1044studentsin205schools.Missing data

  • ntheoutcometestscoreandfamilySESwereeliminated.Inaddition,

schoolswithfewerthan3studentsincludedinthissubsetofdatawere excluded. H Variables:

'((J

5&)#,5&),% "#,",% :#,<,D 5C!#5! .2#>.2 .3#>.23 5C@#5&; 557&#5&& <!7&#<<81&&

'((E&J

.&D#K>.&; >A#>A .&*#K>.&; 5C!I#2;!> .2I#2;.2> .23I#2;.23> 5C@I#2;&> 557&I#2;(&&> <!7&I#2;81&&>

  • Overview: High School & Beyond – Subset of data focused on selected student and school characteristics as predictors of academic achievement.

SLIDE 48

Understanding Causes of Illness (ILLCAUSE.sav)

  • Source: Perrin E.C., Sayer A.G., and Willett J.B. (1991). Sticks And Stones May Break My Bones: Reasoning About Illness Causality And Body Functioning In Children Who Have A Chronic Illness, Pediatrics, 88(3), 608-19.
  • Sample: 301 children, including a sub-sample of 205 who were described as asthmatic, diabetic, or healthy. After further reductions due to the list-wise deletion of cases with missing data on one or more variables, the analytic sub-sample used in class ends up containing: 33 diabetic children, 68 asthmatic children and 93 healthy children.
  • Variables:

/""72!# 7E;/7 !# 7E! ;&8!$# ..'@# 7E&.(.&'&(@ 2!# 7E2;/D !!2# 7E&;@ 7&/# ,2&*(&,> 2&# ,2&,> *(&# ,*(&,>

  • Overview: Data for investigating differences in children’s understanding of the causes of illness, by their health status.

SLIDE 52

Children of Immigrants (ChildrenOfImmigrants.sav)

  • Source: Portes, Alejandro, & Ruben G. Rumbaut (2001). Legacies: The Story of the Immigrant Second Generation. Berkeley CA: University of California Press.
  • Sample: Random sample of 880 participants obtained through the website.
  • Variables:

;#

;2&&

<&# K&8;(&; D#

,D,<

*#

*& >;&#

!#

7!&

  • Overview: “CILS is a longitudinal study designed to study the adaptation process of the immigrant second generation which is defined broadly as U.S.-born children with at least one foreign-born parent or children born abroad but brought at an early age to the United States. The original survey was conducted with large samples of second-generation children attending the 8th and 9th grades in public and private schools in the metropolitan areas of Miami/Ft. Lauderdale in Florida and San Diego, California” (from the website description of the dataset).

SLIDE 56

Human Development in Chicago Neighborhoods (Neighborhoods.sav)

  • Source: Sampson, R. J., Raudenbush, S. W., & Earls, F. (1997). Neighborhoods and violent crime: A multilevel study of collective efficacy. Science, 277, 918-924.
  • Sample: The data described here consist of information from 343 Neighborhood Clusters in Chicago, Illinois. Some of the variables were obtained by project staff from the 1990 Census and city records. Other variables were obtained through questionnaire interviews with 8782 Chicago residents who were interviewed in their homes.
  • Variables:

># >&&$ D# >& *# 7&*; /I7&#/; (# ( .# . 7!# 7&!&& '&# K66'&'& .&'# K6.&'&

  • These data were collected as part of the Project on Human Development in Chicago Neighborhoods in 1995.

SLIDE 60

4-H Study of Positive Youth Development (4H.sav)

  • Sample: These data consist of seventh graders who participated in Wave 3 of the 4-H Study of Positive Youth Development at Tufts University. This subfile is a substantially sampled-down version of the original file, as all the cases with any missing data on these selected variables were eliminated.
  • Variables:

:<# ,<,D D!# CDE!& # 1 *# * 7# </# <E ./& .# . *# , 1*# ,C 9L*#

  • 4-H Study of Positive Youth Development
  • Source: Subset of data from IARYD, Tufts University

2&7# 1.&2&&7& &7# 1.&&7& .7# 1.&.&7& .2# 1.&.&2& 75# 1.&7&5 6# 16
