Week 1: Introduction to Remote Learning Format, Policies, Guiding Principles

BUS 41100 Applied Regression Analysis
Week 1: Introduction to Remote Learning Format, Policies, Guiding Principles
Max H. Farrell
The University of Chicago Booth School of Business

Remote Instruction Guiding Principles
◮ Be patient
◮ Be flexible
◮ Learn something
◮ Student interaction
When in doubt, ask! I haven’t thought of everything, and everyone’s needs are different.

What is class going to look like?
Synchronous (but recorded)
◮ Lectures: live during class, format will evolve over time
◮ Office hours: twice a week, times TBD
Your work
◮ Group work: homework & project. Randomly assigned groups to facilitate interaction.
◮ Midterm exam: on your own.
Resources
◮ Course website: slides, data, etc.
◮ Piazza: Q & A
◮ Textbook: Sheather. Recommended, not required; see syllabus.

Your work
Turned-in work: clear, concise, and on message
◮ Fewer plots usually better
◮ Results and analysis, not output/code
Homework
◮ Not exam practice! Not similar at all
◮ Reinforce & extend ideas, challenge you
◮ Open-ended analysis
Exams
◮ Narrower scope
◮ Test core concepts/abilities
◮ Look at sample exams to get a sense of style
Project: Your glimpse at real life!

Course Overview
Rough outline
◮ Weeks 1 – 4: Simple and Multiple Linear Regression
◮ Weeks 5 – 6: Panel and Time Series Data
◮ Week 7: Logistic Regression
◮ Weeks 8 – 9: Model Building
◮ Week 10: Presentations
But . . . we will be flexible and patient
◮ Cover the material we can learn well
◮ Fit an exam in somewhere

BUS 41100 Applied Regression Analysis
Week 1: Introduction, Simple Linear Regression
Data visualization, conditional distributions, correlation, and least squares regression
Max H. Farrell
The University of Chicago Booth School of Business

The basic problem
Available data on two or more variables
⇒ Formulate a model to predict or estimate a value of interest
⇒ Use estimate to make a (business) decision

Regression: What is it?
◮ Simply: The most widely used statistical tool for understanding relationships among variables
◮ A conceptually simple method for investigating relationships between one or more factors and an outcome of interest
◮ The relationship is expressed in the form of an equation or a model connecting the outcome to the factors

Regression in business
◮ Optimal portfolio choice:
  - Predict the future joint distribution of asset returns
  - Construct an optimal portfolio (choose weights)
◮ Determining price and marketing strategy:
  - Estimate the effect of price and advertisement on sales
  - Decide what is the optimal price and ad campaign
◮ Credit scoring model:
  - Predict the future probability of default using known characteristics of the borrower
  - Decide whether or not to lend (and if so, how much)

Regression in everything
Straight prediction questions:
◮ What price should I charge for my car?
◮ What will the interest rates be next month?
◮ Will this person like that movie?
Explanation and understanding:
◮ Does your income increase if you get an MBA?
◮ Will tax incentives change purchasing behavior?
◮ Is my advertising campaign working?

Data Visualization
Example: pickup truck prices on Craigslist
We have 4 dimensions to consider.
> data <- read.csv("pickup.csv", stringsAsFactors=TRUE)  # make must be a factor; not the default since R 4.0
> names(data)
[1] "year"  "miles" "price" "make"
A simple summary is
> summary(data)
      year          miles            price          make
 Min.   :1978   Min.   :  1500   Min.   : 1200   Dodge:10
 1st Qu.:1996   1st Qu.: 70958   1st Qu.: 4099   Ford :12
 Median :2000   Median : 96800   Median : 5625   GMC  :24
 Mean   :1999   Mean   :101233   Mean   : 7910
 3rd Qu.:2003   3rd Qu.:130375   3rd Qu.: 9725
 Max.   :2008   Max.   :215000   Max.   :23950

First, the simple histogram (for each continuous variable).
> par(mfrow=c(1,3))
> hist(data$year)
> hist(data$miles)
> hist(data$price)
[Figure: three panels, "Histogram of data$year", "Histogram of data$miles", "Histogram of data$price"; y-axis: Frequency]
The data are “binned”, and each bar’s height is the count in that bin.
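The binning behind a histogram can be checked by hand. The following is a minimal sketch, assuming a small made-up `miles` vector (hypothetical values, not taken from pickup.csv) and bin edges every 50,000 miles chosen only for illustration. It asks `hist` for the counts without drawing and reproduces them with `cut` and `table`.

```r
# Hypothetical mileage values, for illustration only (not from pickup.csv)
miles <- c(12000, 48000, 52000, 75000, 98000, 101000, 130000, 160000)

# Bin edges every 50,000 miles (an arbitrary choice for this sketch)
edges <- seq(0, 200000, by = 50000)

# plot = FALSE returns the binning without drawing anything
h <- hist(miles, breaks = edges, plot = FALSE)
h$counts   # bar heights: number of observations in each bin

# The same counts computed directly: assign each value to a bin, then tabulate
bins <- cut(miles, breaks = edges, include.lowest = TRUE)
manual_counts <- as.vector(table(bins))
manual_counts
```

Both approaches use right-closed intervals by default, so a value exactly on an edge is counted in the bin to its left.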

We can use scatterplots to compare two dimensions.
> par(mfrow=c(1,2))
> plot(data$year, data$price, pch=20)
> plot(data$miles, data$price, pch=20)
[Figure: two scatterplots, data$price against data$year and data$price against data$miles]

Add color to see another dimension.
> par(mfrow=c(1,2))
> plot(data$year, data$price, pch=20, col=data$make)
> legend("topleft", levels(data$make), fill=1:3)
> plot(data$miles, data$price, pch=20, col=data$make)
[Figure: the same two scatterplots, data$price against data$year and data$miles, with points colored by make; legend: Dodge, Ford, GMC]
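Why `col=data$make` lines up with `fill=1:3` in the legend: a factor is stored as integer codes, one per level, and those codes index the current color palette. A minimal sketch, assuming a tiny made-up `make` factor (hypothetical rows, not the real data):

```r
# Hypothetical makes, for illustration only
make <- factor(c("GMC", "Dodge", "Ford", "GMC"))

levels(make)               # levels are sorted alphabetically by default
codes <- as.integer(make)  # integer code of each observation's level
codes

# Each code picks a color from the current palette,
# which is how plot() interprets a factor passed as col
point_colors <- palette()[codes]
point_colors
```

Because the legend lists `levels(data$make)` in the same order and fills with colors 1 to 3, each label is matched to the color used for its points. This relies on `make` being a factor; a plain character column would not work as `col`.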

Boxplots are also super useful.
> year_boxplot <- factor(1*(data$year<1995) + 2*(1995<=data$year & data$year<2000) +
+     3*(2000<=data$year & data$year<2005) + 4*(2005<=data$year & data$year<2009),
+     labels=c("<1995", "'95-'99", "2000-'04", "'05-'09"))
> boxplot(price ~ make, data=data, ylab="Price ($)", main="Make")
> boxplot(price ~ year_boxplot, data=data, ylab="Price ($)", main="Year")
[Figure: two boxplot panels of Price ($), "Make" (Dodge, Ford, GMC) and "Year" (<1995, '95-'99, 2000-'04, '05-'09)]
The box is the interquartile range (IQR; i.e., the 25th to 75th percentile), with the median in bold. Each whisker extends to the most extreme point that is no more than 1.5 times the IQR from the box.
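The box-and-whisker rule can be reproduced numerically. This is a minimal sketch, assuming a made-up `price` vector (hypothetical values, not the pickup data); it computes the box, the 1.5 × IQR fences, and the whisker ends by hand and compares them with what `boxplot.stats` reports. Note that R draws the box at the "hinges" returned by `fivenum`, which are close to, but not always identical to, the 25th and 75th percentiles.

```r
# Hypothetical prices, for illustration only
price <- c(1200, 3500, 4100, 5000, 5600, 6200, 9700, 12000, 23950)

q <- fivenum(price)        # min, lower hinge, median, upper hinge, max
iqr <- q[4] - q[2]         # width of the box

# Points beyond these fences are drawn individually as outliers
lower_fence <- q[2] - 1.5 * iqr
upper_fence <- q[4] + 1.5 * iqr

inside <- price >= lower_fence & price <= upper_fence
whiskers <- range(price[inside])   # most extreme points within the fences
outliers <- price[!inside]

# What boxplot() itself uses
b <- boxplot.stats(price)
b$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
b$out     # points plotted beyond the whiskers
```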

Regression is what we’re really here for.
> plot(data$year, data$price, pch=20, col=data$make)
> abline(lm(price ~ year, data=data), lwd=1.5)
[Figure: data$price against data$year and data$miles, points colored by make (Dodge, Ford, GMC), with the fitted line added by abline]
◮ Fit a line through the points, but how?
◮ lm stands for linear model
◮ Rest of the course: formalize and explore this idea

Predicting house prices
Problem:
◮ Predict market price based on observed characteristics
Solution:
◮ Look at property sales data where we know the price and some observed characteristics.
◮ Build a decision rule that predicts price as a function of the observed characteristics.
⇒ We have to define the variables of interest and develop a specific quantitative measure of these variables

What characteristics do we use?
◮ Many factors or variables affect the price of a house
  - size of house
  - number of baths
  - garage, air conditioning, etc.
  - size of land
  - location
◮ Easy to quantify price and size, but what about other variables such as location, aesthetics, workmanship, etc.?

To keep things super simple, let’s focus only on size of the house.
The value that we seek to predict is called the dependent (or output) variable, and we denote this as
◮ Y = price of house (e.g. thousands of dollars)
The variable that we use to guide prediction is the explanatory (or input) variable, and this is labelled
◮ X = size of house (e.g. thousands of square feet)

What do the data look like?
> size <- c(.8,.9,1,1.1,1.4,1.4,1.5,1.6,
+     1.8,2,2.4,2.5,2.7,3.2,3.5)
> price <- c(70,83,74,93,89,58,85,114,
+     95,100,138,111,124,161,172)
> plot(size, price, pch=20)
[Figure: scatterplot of price against size]

There appears to be a linear relationship between price and size:
◮ as size goes up, price goes up.
Fitting a line by the “eyeball” method:
> abline(35, 40, col="red")
[Figure: scatterplot of price against size, with the eyeballed line (intercept 35, slope 40) in red]
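One way to judge an eyeballed line is to compare it with the line `lm` would choose. The following is a minimal sketch using the `size` and `price` vectors from the slides. It assumes the sum of squared residuals as the yardstick, which is the least squares criterion named in the lecture title; the variable names `sse_eye`, `sse_ls`, `b0`, and `b1` are introduced here for illustration.

```r
# House data from the slides (size in 1000s of sq ft, price in $1000s)
size  <- c(.8, .9, 1, 1.1, 1.4, 1.4, 1.5, 1.6,
           1.8, 2, 2.4, 2.5, 2.7, 3.2, 3.5)
price <- c(70, 83, 74, 93, 89, 58, 85, 114,
           95, 100, 138, 111, 124, 161, 172)

# Eyeball line from abline(35, 40): intercept 35, slope 40
eye_fitted <- 35 + 40 * size
sse_eye <- sum((price - eye_fitted)^2)   # sum of squared residuals

# Least squares line from lm
fit <- lm(price ~ size)
sse_ls <- sum(resid(fit)^2)

# Closed-form least squares estimates for a single predictor
b1 <- cov(size, price) / var(size)       # slope
b0 <- mean(price) - b1 * mean(size)      # intercept

coef(fit)            # should agree with c(b0, b1)
c(sse_eye, sse_ls)   # the least squares line cannot have the larger value
```

Adding `abline(fit)` after the plot would draw the fitted line next to the red eyeball line for a visual comparison.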
