
ST 430/514 Introduction to Regression Analysis/Statistics for Management and the Social Sciences II

Principles of Model Building: Coding Independent Variables

Coded variables

Some variables can be represented on different scales, e.g., temperature in degrees Celsius or Fahrenheit.

Suppose some response $Y$ is modeled as a linear function of temperature: $E(Y) = \beta_0 + \beta_1 x$, with $x$ = temperature in degrees Fahrenheit.

If $x^*$ = temperature in degrees Celsius, then $x = 32 + 1.8 x^*$. So

$$E(Y) = \beta_0 + \beta_1 (32 + 1.8 x^*) = (\beta_0 + 32 \beta_1) + (1.8 \beta_1) x^* = \beta_0^* + \beta_1^* x^*,$$

where $\beta_0^* = \beta_0 + 32 \beta_1$ and $\beta_1^* = 1.8 \beta_1$.
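The relations between the two sets of coefficients can be checked numerically. The sketch below fits the same simulated response against temperature on both scales; the data, the seed, and the names `celsius`, `fahrenheit`, `fit_f`, and `fit_c` are illustrative assumptions, not course data.

```r
# Simulated data (an assumption, for illustration only)
set.seed(1)
celsius    <- runif(50, min = 0, max = 40)  # x*: degrees Celsius
fahrenheit <- 32 + 1.8 * celsius            # x : degrees Fahrenheit
y <- 5 + 0.3 * fahrenheit + rnorm(50)       # response, linear in x

fit_f <- lm(y ~ fahrenheit)  # estimates beta_0 and beta_1
fit_c <- lm(y ~ celsius)     # estimates beta*_0 and beta*_1

b      <- unname(coef(fit_f))
b_star <- unname(coef(fit_c))

# Check beta*_0 = beta_0 + 32 beta_1 and beta*_1 = 1.8 beta_1
intercept_ok <- isTRUE(all.equal(b_star[1], b[1] + 32 * b[2]))
slope_ok     <- isTRUE(all.equal(b_star[2], 1.8 * b[2]))
print(c(intercept_ok = intercept_ok, slope_ok = slope_ok))
```

Both fits describe the same line, so the fitted values and residuals are identical; only the coefficients change.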

So if $Y$ is linearly related to $x$, then it is also linearly related to $x^*$, with different coefficients $\beta_0^*$ and $\beta_1^*$.

We sometimes code variables to make an equation more easily interpreted.

When a variable takes only two distinct values, we often code them as $-1$ and $+1$. E.g., if $x$ is temperature with levels 80 °F and 100 °F, and $x^* = (x - 90)/10$, then $x^* = -1$ when $x = 80$, and $x^* = 1$ when $x = 100$.

A variable with three levels can similarly be coded as $-1$, $0$, and $+1$, provided the three levels are equally spaced.

The interpretation of the corresponding coefficient $\beta^*$ is, as always, the change in $E(Y)$ when $x^*$ changes by 1, with all other variables fixed.

But with a variable coded like this, a change of 1 in $x^*$ means moving, say, from the midpoint value to the high value.

The corresponding change in $E(Y)$ is often called the effect of the variable.
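A minimal sketch of this coding in R, assuming three equally spaced temperature levels (80, 90, 100 °F) and a simulated response; the variable names and data are hypothetical.

```r
# Simulated data (an assumption, for illustration only)
set.seed(2)
temp       <- rep(c(80, 90, 100), each = 5)  # three equally spaced levels
temp_coded <- (temp - 90) / 10               # coded as -1, 0, +1
y <- 20 + 0.5 * temp + rnorm(length(temp))

fit_raw   <- lm(y ~ temp)
fit_coded <- lm(y ~ temp_coded)

# The coded coefficient is the "effect": the fitted change in E(Y)
# when moving from the midpoint (90) to the high level (100).
effect <- unname(coef(fit_coded)["temp_coded"])
pred   <- predict(fit_raw, newdata = data.frame(temp = c(90, 100)))
change_mid_to_high <- unname(pred[2] - pred[1])
print(c(effect = effect, change_mid_to_high = change_mid_to_high))
```

The two printed numbers agree, because one unit of the coded variable corresponds to 10 °F on the original scale.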

When a variable takes more than two or three values, it is sometimes standardized:

$$x_i^* = u_i = \frac{x_i - \bar{x}}{s_x}.$$

All coefficients are then in the units of $Y$, so they can be compared numerically.

If $Y$ is also standardized, the coefficients are dimensionless. These are called standardized regression coefficients, and are widely used in some fields.

Despite what the text says, standardization has no effect on computational errors, with modern algorithms.
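The effect of standardization on the coefficients can be sketched with `scale()`. The example uses the `mtcars` data set that ships with R purely as a stand-in (an assumption; it is not one of the course data sets), with `wt` and `hp` as predictors of `mpg`.

```r
# mtcars is used only as a convenient built-in example (an assumption)
predictors <- c("wt", "hp")
fit_raw <- lm(mpg ~ wt + hp, data = mtcars)

# Standardize the predictors: u_i = (x_i - mean(x)) / s_x
std       <- as.data.frame(scale(mtcars[, predictors]))
std$mpg   <- mtcars$mpg
std$mpg_z <- as.numeric(scale(mtcars$mpg))  # standardized response

fit_std_x  <- lm(mpg ~ wt + hp, data = std)    # coefficients in units of Y
fit_std_xy <- lm(mpg_z ~ wt + hp, data = std)  # dimensionless coefficients

s_x <- sapply(mtcars[, predictors], sd)
s_y <- sd(mtcars$mpg)

# Standardizing x multiplies each slope by s_x;
# standardizing Y as well divides it by s_y.
x_ok  <- isTRUE(all.equal(coef(fit_std_x)[-1],  coef(fit_raw)[-1] * s_x))
xy_ok <- isTRUE(all.equal(coef(fit_std_xy)[-1], coef(fit_raw)[-1] * s_x / s_y))
print(c(x_ok = x_ok, xy_ok = xy_ok))
```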

Models with One Qualitative Variable

Recall: a qualitative variable with $l$ levels is represented by $(l - 1)$ indicator (or dummy) variables.

    For a chosen reference level, all the indicator variables are 0;
    For each other level, the corresponding indicator variable is 1, and the others are 0.
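R builds these indicator variables automatically for a factor. As a small sketch, `model.matrix()` shows the columns it would use for a hypothetical three-level `state` factor (the five values below are made up for illustration).

```r
# A made-up factor with l = 3 levels
state <- factor(c("Kansas", "Kentucky", "Texas", "Kansas", "Texas"))

# Design matrix: an intercept plus l - 1 = 2 indicator columns.
# Kansas (first level alphabetically) is the reference level.
X <- model.matrix(~ state)
print(X)
```

Rows for Kansas have 0 in both indicator columns; rows for Kentucky and Texas have a single 1 in the matching column.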

Example

Per-user software maintenance cost, by state (sample of 10 users per state).

    path <- file.path("Text", "Exercises&Examples", "BIDMAINT.txt")
    # stringsAsFactors = TRUE makes STATE a factor, which the plot below
    # needs in R >= 4.0 (where character columns are no longer converted)
    maint <- read.table(path, header = TRUE, stringsAsFactors = TRUE)
    plot(COST ~ STATE, maint)          # side-by-side boxplots by state
    summary(lm(COST ~ STATE, maint))   # one indicator per non-reference state

    Call:
    lm(formula = COST ~ STATE, data = maint)

    Residuals:
        Min      1Q  Median      3Q     Max
    -299.80  -95.83  -37.90  153.32  295.20

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)     279.60      53.43   5.233 1.63e-05 ***
    STATEKentucky    80.30      75.56   1.063   0.2973
    STATETexas      198.20      75.56   2.623   0.0141 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 168.9 on 27 degrees of freedom
    Multiple R-squared:  0.205,  Adjusted R-squared:  0.1462
    F-statistic: 3.482 on 2 and 27 DF,  p-value: 0.04515

The fitted equation is

$$E(Y) = 279.6 + 80.3 x_1 + 198.2 x_2$$

where:

    $x_1$ = indicator variable for Kentucky,
    $x_2$ = indicator variable for Texas.

For Kansas, $x_1 = x_2 = 0$, so $E(Y) = 279.6$.

That is, the “intercept” is actually the expected value for the reference state, Kansas.

For Kentucky, $x_1 = 1$ and $x_2 = 0$, so $E(Y) = 279.6 + 80.3 = 359.9$.

That is, the coefficient STATEKentucky is the difference between the expected value for Kentucky and the expected value for the reference state.

Similarly, the coefficient STATETexas is the difference between the expected value for Texas and the expected value for the reference state.
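The same arithmetic can be written out for all three states. This sketch simply re-enters the estimates from the regression output above by hand (the vector names are hypothetical) and adds each coefficient to the intercept.

```r
# Estimates copied from the summary(lm(...)) output above
coefs <- c("(Intercept)" = 279.60, STATEKentucky = 80.30, STATETexas = 198.20)

# Expected cost = reference-state value + that state's coefficient
expected_cost <- c(
  Kansas   = coefs[["(Intercept)"]],
  Kentucky = coefs[["(Intercept)"]] + coefs[["STATEKentucky"]],
  Texas    = coefs[["(Intercept)"]] + coefs[["STATETexas"]]
)
print(expected_cost)
```

With the data frame available, `predict()` on a data frame containing one row per state would be expected to give the same three values.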

In R, the default reference level is the first in alphabetical order. The default can be overridden using the factor() function.

Often these differences themselves are of no special interest, and the focus is on testing whether there are any differences: $H_0: \beta_1 = \beta_2 = \cdots = \beta_{l-1} = 0$.

The value of the $F$-statistic is unaffected by the choice of reference level.
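The invariance of the $F$-statistic can be illustrated as follows. Since BIDMAINT.txt is not reproduced here, the sketch uses simulated costs for three states (an assumption), and changes the reference level by passing an explicit `levels` argument to `factor()`.

```r
# Simulated stand-in for the maintenance data (an assumption)
set.seed(42)
d <- data.frame(state = factor(rep(c("Kansas", "Kentucky", "Texas"), each = 10)))
d$cost <- c(280, 360, 480)[as.integer(d$state)] + rnorm(30, sd = 170)

# Default: first level in alphabetical order (Kansas) is the reference
fit_default <- lm(cost ~ state, data = d)

# Override: list Texas first so that it becomes the reference level
d$state_tx <- factor(d$state, levels = c("Texas", "Kansas", "Kentucky"))
fit_tx <- lm(cost ~ state_tx, data = d)

# The coefficients differ, but the overall F-statistic does not
f_default <- summary(fit_default)$fstatistic
f_tx      <- summary(fit_tx)$fstatistic
print(rbind(f_default, f_tx))
```

`relevel(d$state, ref = "Texas")` is another way to obtain the same reordering.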

Two Qualitative Variables

E.g., two brands of diesel engine and three types of fuel.

    path <- file.path("Text", "Exercises&Examples", "DIESEL.txt")
    # stringsAsFactors = TRUE so that FUEL and BRAND are factors (R >= 4.0)
    diesel <- read.table(path, header = TRUE, stringsAsFactors = TRUE)
    par(mfrow = c(1, 2)); plot(PERFORM ~ FUEL + BRAND, diesel)

Try main-effects model (additive, no interaction):

    summary(aov(PERFORM ~ FUEL + BRAND, diesel))

Alternative interaction model:

    summary(aov(PERFORM ~ FUEL * BRAND, diesel))

Graph the interactions:

    with(diesel, interaction.plot(FUEL, BRAND, PERFORM))
    with(diesel, interaction.plot(BRAND, FUEL, PERFORM))

Complicated story:

    For F1 and F2, effects are additive, with B1 performing better than B2;
    For F3, B2 performs better than B1.
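One way to compare the main-effects and interaction models formally is a nested-model $F$-test with `anova()`. DIESEL.txt is not reproduced here, so the sketch below simulates data whose cell means merely mimic the pattern described above; the cell means, noise level, and replicate count are all assumptions.

```r
# Simulated stand-in for the diesel data (an assumption)
set.seed(3)
sim <- expand.grid(FUEL = c("F1", "F2", "F3"), BRAND = c("B1", "B2"),
                   rep = 1:4, stringsAsFactors = TRUE)

# B1 better than B2 for F1 and F2 (additive), reversed for F3
cell_mean <- with(sim, ifelse(
  FUEL == "F3",
  ifelse(BRAND == "B2", 60, 50),
  ifelse(BRAND == "B1", 70, 60) + ifelse(FUEL == "F2", 5, 0)
))
sim$PERFORM <- cell_mean + rnorm(nrow(sim), sd = 3)

fit_add <- lm(PERFORM ~ FUEL + BRAND, data = sim)  # main effects only
fit_int <- lm(PERFORM ~ FUEL * BRAND, data = sim)  # adds FUEL:BRAND

# F-test of the (3 - 1) * (2 - 1) = 2 interaction terms
cmp <- anova(fit_add, fit_int)
print(cmp)
```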

Three or More Qualitative Variables

With a response y and independent variables a, b, c, ..., a model might contain:

    main effects: y ~ a + b + c + ... ;
    two-way interactions: y ~ a + b + c + a:b + a:c + b:c + ... ;
    higher-order interactions: y ~ a + b + c + a:b + a:c + b:c + a:b:c + ... ;

Often only main effects and low-order interactions are significant.
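R's formula syntax has shorthand for these models. As a brief sketch, `terms()` can be used to expand a formula and list the terms it contains, without needing any data (the names y, a, b, c are the placeholders used above).

```r
# (a + b + c)^2 expands to all main effects and two-way interactions
labels_two_way <- attr(terms(y ~ (a + b + c)^2), "term.labels")
print(labels_two_way)

# a * b * c expands to all main effects and interactions up to a:b:c
labels_full <- attr(terms(y ~ a * b * c), "term.labels")
print(labels_full)
```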

To estimate the highest-order interactions, we need observations for all possible combinations of levels: a factorial design.

E.g., $2 \times 3 = 6$ combinations for the diesel engines.

With several variables, all with at least 2 levels, the number of combinations can be large.

Sometimes a carefully chosen fraction of all possible combinations is used: a fractional factorial design.
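The combinations in a full factorial design can be enumerated with `expand.grid()`. The sketch below does this for the diesel example, and then counts the runs for a hypothetical five-factor experiment whose level counts are made up for illustration.

```r
# Full factorial for the diesel example: 3 fuels x 2 brands
design <- expand.grid(FUEL = c("F1", "F2", "F3"), BRAND = c("B1", "B2"))
print(design)

# Hypothetical experiment with five factors (level counts are assumptions):
# the number of combinations is the product of the numbers of levels.
levels_per_factor <- c(2, 3, 4, 3, 2)
n_runs <- prod(levels_per_factor)
print(n_runs)
```

The rapid growth of that product is what motivates using only a fraction of the combinations.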

Models with Both Quantitative and Qualitative Variables

Example

Diesel engine performance $Y$, as a function of:

    engine speed, $x_1$;
    fuel type, with levels $F_1$, $F_2$, and $F_3$; take $F_1$ as the reference level, and $x_2$ and $x_3$ as indicators for $F_2$ and $F_3$, respectively.

Simple model, ignoring fuel type: second-order model in $x_1$:

$$E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2.$$

Additive model: include main effects of fuel type:

$$E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_2 + \beta_4 x_3.$$

Switching fuel from $F_1$ to $F_2$ adds $\beta_3$ to the performance $Y$, independently of engine speed $x_1$.

Interaction model:

$$E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_2 + \beta_4 x_3 + \beta_5 x_1 x_2 + \beta_6 x_1 x_3 + \beta_7 x_1^2 x_2 + \beta_8 x_1^2 x_3.$$
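These three models map directly onto R formulas. The sketch below fits them to simulated data; the data frame `engine`, its columns `speed`, `fuel`, and `perform`, and the true coefficients used to generate the data are all assumptions made for illustration.

```r
# Simulated stand-in data (an assumption)
set.seed(4)
engine <- data.frame(
  speed = runif(60, min = 1, max = 5),
  fuel  = factor(rep(c("F1", "F2", "F3"), times = 20))  # F1 is the reference
)
fuel_shift <- unname(c(F1 = 0, F2 = 3, F3 = -2)[as.character(engine$fuel)])
engine$perform <- 10 + 8 * engine$speed - engine$speed^2 +
  fuel_shift + rnorm(60)

# Simple second-order model in speed: beta_0, beta_1, beta_2
fit_simple <- lm(perform ~ speed + I(speed^2), data = engine)

# Additive model: adds beta_3, beta_4 for the two fuel indicators
fit_additive <- lm(perform ~ speed + I(speed^2) + fuel, data = engine)

# Interaction model: adds beta_5, ..., beta_8, so that each fuel
# gets its own quadratic curve in speed
fit_interaction <- lm(perform ~ (speed + I(speed^2)) * fuel, data = engine)

n_coef <- c(simple      = length(coef(fit_simple)),
            additive    = length(coef(fit_additive)),
            interaction = length(coef(fit_interaction)))
print(n_coef)
```

The coefficient counts (3, 5, and 9) match the numbers of $\beta$ terms in the three equations above.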
