Data Representation and Intro SQL January 28, 2020 Data Science - - PowerPoint PPT Presentation

data representation and intro sql
SMART_READER_LITE
LIVE PREVIEW

Data Representation and Intro SQL January 28, 2020 Data Science - - PowerPoint PPT Presentation

Data Representation and Intro SQL January 28, 2020 Data Science CSCI 1951A Brown University Instructor: Ellie Pavlick HTAs: Josh Levin, Diane Mutako, Sol Zitter 1 But first! Waitlistwe are working our way through Top Hat; Have


slide-1
SLIDE 1

Data Representation and Intro SQL

January 28, 2020 Data Science CSCI 1951A Brown University Instructor: Ellie Pavlick HTAs: Josh Levin, Diane Mutako, Sol Zitter

1

slide-2
SLIDE 2
  • Waitlist—we are working our way through
  • Top Hat; Have pen/paper or sit by someone who does—this will help

for working through longer in-class exercises

  • Start/end times. Please don’t leave early! 3 minutes per day = one

whole lecture! 😪

  • Sign the collab policy!
  • SQL out, labs starting this week
  • Project Mixer next week
  • I have office hours today—come say hi! Talk to me about project ideas

But first!

2

slide-3
SLIDE 3

But first! Burning Questions?

3

slide-4
SLIDE 4

Data Representation and Intro SQL

4

slide-5
SLIDE 5

5

slide-6
SLIDE 6

DATABASES FOR DATA SCIENTIST

Requirement Engineering Conceptual Modeling Logical and Physical Modeling Asking and Answering Questions (Analysis) “Book of Duty” Conceptual Design (ER) Logical Design (schema, table names, data types), Physical Design (indices, memory layout, optimizations) Relational Algebra, SQL

https://cs.brown.edu/courses/csci1951-a/res/static/lec02_databases.pdf

6

slide-7
SLIDE 7

DATABASES FOR DATA SCIENTIST

Requirement Engineering Conceptual Modeling Logical and Physical Modeling Asking and Answering Questions (Analysis) “Book of Duty” Conceptual Design (ER) Logical Design (schema, table names, data types), Physical Design (indices, memory layout, optimizations) Relational Algebra, SQL

https://cs.brown.edu/courses/csci1951-a/res/static/lec02_databases.pdf

7

slide-8
SLIDE 8
  • Informal description of data domain
  • In natural language:
  • What are the objects you care about?
  • What properties/attributes of those objects are you measuring?
  • What are the relationships between them?
  • What assumptions are we making? (E.g. sizes, cardinalities)
  • What is the workload on the database (Read-only? Read/write?)
  • Permissions and privacy concerns?

“Book of Duty”/“Miniworld”

8

slide-9
SLIDE 9
  • Informal description of data domain
  • In natural language:
  • What are the objects you care about?
  • What properites/attributes of those objects are you measuring?
  • What are the relationships between them?
  • What assumptions are we making? (E.g. sizes, cardinalities)
  • What is the workload on the database (Read-only? Read/write?)
  • Permissions and privacy concerns?

“Book of Duty”

9

slide-10
SLIDE 10
  • Project: We want to analyze political trends

surrounding 2020 primary candidates

  • Plan: Crawl Twitter for posts from or about 2020

primary candidates. We want to analyze the spread of

  • pinions, through through following/follower

relationships, share/retweet chains, and language use

Description of “Miniworld”

10

slide-11
SLIDE 11
  • Objects used: People, Tweets, Candidates
  • Domains of attributes of objects: Tweets have timestamps (date), authors (Person), text

(max 140 characters), attachments, hashtags…

  • Identifiers, references / relationships: People have unique IDs, People author Tweets, People

retweet Tweets, People like Tweets, People follow People, Tweets mention People…

  • Cardinalities: Every tweet has exactly one author, one author can have many tweets, a

tweet can mention 0 to many candidates, a candidate can be mentioned in 0 to many tweets

  • Distributions: Some people tweet a lot, others just follow; some candidates are

mentioned a lot; follower/followee asymmetries

  • Workload: scrape and populate ones, read often
  • Priorities and service level agreements: “right to be forgotten” rules…

Description of “Miniworld”

11

slide-12
SLIDE 12
  • Objects used: People, Tweets, Candidates
  • Domains of attributes of objects: Tweets have timestamps (date), authors (Person), text

(max 140 characters), attachments, hashtags…

  • Identifiers, references / relationships: People have unique IDs, People author Tweets, People

retweet Tweets, People like Tweets, People follow People, Tweets mention People…

  • Cardinalities: Every tweet has exactly one author, one author can have many tweets, a

tweet can mention 0 to many candidates, a candidate can be mentioned in 0 to many tweets

  • Distributions: Some people tweet a lot, others just follow; some candidates are

mentioned a lot; follower/followee asymmetries

  • Workload: scrape and populate ones, read often
  • Priorities and service level agreements: “right to be forgotten” rules…

Description of “Miniworld”

12

slide-13
SLIDE 13
  • Objects used: People, Tweets, Candidates
  • Domains of attributes of objects: Tweets have timestamps (date), authors (Person), text

(max 140 characters), attachments, hashtags…

  • Identifiers, references / relationships: People have unique IDs, People author Tweets, People

retweet Tweets, People like Tweets, People follow People, Tweets mention People…

  • Cardinalities: Every tweet has exactly one author, one author can have many tweets, a

tweet can mention 0 to many candidates, a candidate can be mentioned in 0 to many tweets

  • Distributions: Some people tweet a lot, others just follow; some candidates are

mentioned a lot; follower/followee asymmetries

  • Workload: scrape and populate ones, read often
  • Priorities and service level agreements: “right to be forgotten” rules…

Description of “Miniworld”

13

slide-14
SLIDE 14
  • Objects used: People, Tweets, Candidates
  • Domains of attributes of objects: Tweets have timestamps (date), authors (Person), text

(max 140 characters), attachments, hashtags…

  • Identifiers, references / relationships: People have unique IDs, People author Tweets, People

retweet Tweets, People like Tweets, People follow People, Tweets mention People…

  • Cardinalities: Every tweet has exactly one author, one author can have many tweets, a

tweet can mention 0 to many candidates, a candidate can be mentioned in 0 to many tweets

  • Distributions: Some people tweet a lot, others just follow; some candidates are

mentioned a lot; follower/followee asymmetries

  • Workload: scrape and populate ones, read often
  • Priorities and service level agreements: “right to be forgotten” rules…

Description of “Miniworld”

14

slide-15
SLIDE 15
  • Objects used: People, Tweets, Candidates
  • Domains of attributes of objects: Tweets have timestamps (date), authors (Person), text

(max 140 characters), attachments, hashtags…

  • Identifiers, references / relationships: People have unique IDs, People author Tweets, People

retweet Tweets, People like Tweets, People follow People, Tweets mention People…

  • Cardinalities: Every tweet has exactly one author, one author can have many tweets, a

tweet can mention 0 to many candidates, a candidate can be mentioned in 0 to many tweets

  • Distributions: Some people tweet a lot, others just follow; some candidates are

mentioned a lot; follower/followee asymmetries

  • Workload: scrape and populate ones, read often
  • Priorities and service level agreements: “right to be forgotten” rules…

Description of “Miniworld”

15

slide-16
SLIDE 16
  • Objects used: People, Tweets, Candidates
  • Domains of attributes of objects: Tweets have timestamps (date), authors (Person), text

(max 140 characters), attachments, hashtags…

  • Identifiers, references / relationships: People have unique IDs, People author Tweets, People

retweet Tweets, People like Tweets, People follow People, Tweets mention People…

  • Cardinalities: Every tweet has exactly one author, one author can have many tweets, a

tweet can mention 0 to many candidates, a candidate can be mentioned in 0 to many tweets

  • Distributions: Some people tweet a lot, others just follow; some candidates are

mentioned a lot; follower/followee asymmetries

  • Workload: scrape and populate ones, read often
  • Priorities and service level agreements: “right to be forgotten” rules…

Description of “Miniworld”

16

slide-17
SLIDE 17
  • Objects used: People, Tweets, Candidates
  • Domains of attributes of objects: Tweets have timestamps (date), authors (Person), text

(max 140 characters), attachments, hashtags…

  • Identifiers, references / relationships: People have unique IDs, People author Tweets, People

retweet Tweets, People like Tweets, People follow People, Tweets mention People…

  • Cardinalities: Every tweet has exactly one author, one author can have many tweets, a

tweet can mention 0 to many candidates, a candidate can be mentioned in 0 to many tweets

  • Distributions: Some people tweet a lot, others just follow; some candidates are

mentioned a lot; follower/followee asymmetries

  • Workload: scrape and populate once, read often
  • Priorities and service level agreements: “right to be forgotten” rules…

Description of “Miniworld”

17

slide-18
SLIDE 18
  • Objects used: People, Tweets, Candidates
  • Domains of attributes of objects: Tweets have timestamps (date), authors (Person), text

(max 140 characters), attachments, hashtags…

  • Identifiers, references / relationships: People have unique IDs, People author Tweets, People

retweet Tweets, People like Tweets, People follow People, Tweets mention People…

  • Cardinalities: Every tweet has exactly one author, one author can have many tweets, a

tweet can mention 0 to many candidates, a candidate can be mentioned in 0 to many tweets

  • Distributions: Some people tweet a lot, others just follow; some candidates are

mentioned a lot; follower/followee asymmetries

  • Workload: scrape and populate once, read often
  • Priorities and service level agreements: “right to be forgotten” rules…

Description of “Miniworld”

18

slide-19
SLIDE 19

DATABASES FOR DATA SCIENTIST

Requirement Engineering Conceptual Modeling Logical and Physical Modeling Asking and Answering Questions (Analysis) “Book of Duty” Conceptual Design (ER) Logical Design (schema, table names, data types), Physical Design (indices, memory layout, optimizations) Relational Algebra, SQL

https://cs.brown.edu/courses/csci1951-a/res/static/lec02_databases.pdf

19

slide-20
SLIDE 20

Entity-Relationship (ER) Model

20

slide-21
SLIDE 21

ENTITY Attribute

RELATIONSHIP

Entity-Relationship (ER) Model

21

slide-22
SLIDE 22

PERSON Name ID TWEET Timestamp Author Text Mentions CANDIDATE Name ID Party

Entity-Relationship (ER) Model

22

slide-23
SLIDE 23

TWEET Timestamp Author Text Mentions 01/26/2019 03:14:15 JoeSchmo

Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed SenGillibrand

Entity-Relationship (ER) Model

23

slide-24
SLIDE 24

TWEET Timestamp Author Text Mentions 01/26/2019 03:14:15 JoeSchmo

Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed SenGillibrand

Entity-Relationship (ER) Model

Multivalued Attribute

24

slide-25
SLIDE 25

TWEET Timestamp Author Text Mentions 01/26/2019 03:14:15 JoeSchmo

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

Entity-Relationship (ER) Model

Multivalued Attribute

25

slide-26
SLIDE 26

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

Composite Attribute

26

slide-27
SLIDE 27

Clicker Question!

27

slide-28
SLIDE 28

TWEET Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

Clicker Question! Which representation is better? (a) Composite (b) Normal

TWEET Timestamp 01/26/2019 03:14:15

28

slide-29
SLIDE 29

TWEET Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

Clicker Question! Which representation is better? (a) Composite (b) Normal

TWEET Timestamp 01/26/2019 03:14:15

29

slide-30
SLIDE 30

TWEET Timestamp Author Text Mentions 01/26/2019 03:14:15 JoeSchmo

Entity-Relationship (ER) Model

Find all tweets sent between 2am and 4am that mention democratic primary candidates

❌❌❌

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

30

slide-31
SLIDE 31

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Find all tweets sent between 2am and 4am that mention democratic primary candidates

✔✔✔

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

31

slide-32
SLIDE 32

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

Key Attribute: Designated attribute that uniquely identifies the entry

32

slide-33
SLIDE 33

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

…? Key Attribute: Designated attribute that uniquely identifies the entry

33

slide-34
SLIDE 34

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

…? Key Attribute: Designated attribute that uniquely identifies the entry

34

slide-35
SLIDE 35

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

…? Key Attribute: Designated attribute that uniquely identifies the entry

35

slide-36
SLIDE 36

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

…? Key Attribute: Designated attribute that uniquely identifies the entry

36

slide-37
SLIDE 37

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

…? Key Attribute: Designated attribute that uniquely identifies the entry

37

slide-38
SLIDE 38

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

…? Key Attribute: Designated attribute that uniquely identifies the entry

38

slide-39
SLIDE 39

Clicker Question!

39

slide-40
SLIDE 40

Entity-Relationship (ER) Model

Clicker Question! Is it a good idea to use author +timestamp as a key? (a) Yes! (b) No!

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed

DNC, SenGillibrand

40

slide-41
SLIDE 41

Entity-Relationship (ER) Model

Clicker Question! Is it a good idea to use author +timestamp as a key? (a) Yes! (b) No!

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed

DNC, SenGillibrand

41

slide-42
SLIDE 42

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

Key

Key Attribute…?

42

slide-43
SLIDE 43

TWEET Author Text Mentions JoeSchmo Time Day Month Year Hour Min. 2019 01 26 03 14

Entity-Relationship (ER) Model

@DNC Thinkin bout @SenGillibrand. Is it Kristen? Kirsten? Keirsten? idk so confused help #gillibrand2020 #maga #blessed DNC, SenGillibrand

Key 83948547823

Key Attribute…?

43

slide-44
SLIDE 44

Mentions PERSON Name ID TWEET Timestamp Author Text CANDIDATE Name ID Party

Entity-Relationship (ER) Model

ID

Relationships

44

slide-45
SLIDE 45

Mentions PERSON Name ID TWEET Timestamp Author Text CANDIDATE Name ID Party

Entity-Relationship (ER) Model

ID

Relationships

45

slide-46
SLIDE 46

Mentions

TWEET Timestamp

Author

Text Name ID Party

Entity-Relationship (ER) Model

ID CANDIDATE

Relationships

46

slide-47
SLIDE 47

TWEET Timestamp

Author

Text Name ID Party

Entity-Relationship (ER) Model

ID CANDIDATE

Mentioned_in

Relationships

47

slide-48
SLIDE 48

TWEET Timestamp

Author

Text Name ID Party

Entity-Relationship (ER) Model

ID CANDIDATE

Mentions

Relationships

48

slide-49
SLIDE 49

TWEET Timestamp

Author

Text Name ID Party

Entity-Relationship (ER) Model

ID CANDIDATE

Mentions

Direct/ Indirect? Sentiment

Relationships

49

slide-50
SLIDE 50

TWEET PERSON

Author_of

CANDIDATE

Mentions

Entity-Relationship (ER) Model

Name ID Timestamp Text Name ID Party ID Direct/ Indirect? Sentiment

50

slide-51
SLIDE 51

TWEET PERSON

Author_of

CANDIDATE

Mentions

(1,1) (0,N)

Entity-Relationship (ER) Model

Name ID Timestamp Text Name ID Party ID Direct/ Indirect? Sentiment

cardinalities (min, max)

51

slide-52
SLIDE 52

TWEET PERSON

Author_of

CANDIDATE

Mentions

(1,1) (0,N)

Entity-Relationship (ER) Model

Name ID Timestamp Text Name ID Party ID Direct/ Indirect? Sentiment

cardinalities (min, max)

a tweet can have any number of mentions, or no mentions

52

slide-53
SLIDE 53

TWEET PERSON

Author_of

CANDIDATE

Mentions

(1,1) (0,N)

Entity-Relationship (ER) Model

Name ID Timestamp Text Name ID Party ID Direct/ Indirect? Sentiment

cardinalities (min, max)

each mention corresponds

  • ne and only one

candidate

53

slide-54
SLIDE 54

Design Decisions

  • You can talk about things that don’t exist! Make

sure the representation supports the analysis you want to do.

  • Should this concept be an entity? Attribute?

Relation?

  • As with most things, there is no good answer
  • Draft, refine, document, iterate…

54

slide-55
SLIDE 55

Design Decisions

  • You can’t talk about things that don’t exist! Make

sure the representation supports the analysis you want to do.

  • Should this concept be an entity? Attribute?

Relation?

  • As with most things, there is no good answer
  • Draft, refine, document, iterate…

55

slide-56
SLIDE 56

Design Decisions

  • You can’t talk about things that don’t exist! Make

sure the representation supports the analysis you want to do.

  • Should this concept be an entity? Attribute?

Relation?

  • As with most things, there is no good answer
  • Draft, refine, document, iterate…

56

slide-57
SLIDE 57

Design Decisions

  • You can’t talk about things that don’t exist! Make

sure the representation supports the analysis you want to do.

  • Should this concept be an entity? Attribute?

Relation?

  • As with most things, there is no good answer
  • Draft, refine, document, iterate…

57

slide-58
SLIDE 58

Design Decisions

  • You can’t talk about things that don’t exist! Make

sure the representation supports the analysis you want to do.

  • Should this concept be an entity? Attribute?

Relation?

  • As with most things, there is no good answer
  • Draft, refine, document, iterate…

58

slide-59
SLIDE 59

Before we proceed… Burning Questions?

59

slide-60
SLIDE 60

DATABASES FOR DATA SCIENTIST

Requirement Engineering Conceptual Modeling Logical and Physical Modeling Asking and Answering Questions (Analysis) “Book of Duty” Conceptual Design (ER) Logical Design (schema, table names, data types), Physical Design (indices, memory layout, optimizations) Relational Algebra, SQL

https://cs.brown.edu/courses/csci1951-a/res/static/lec02_databases.pdf

60

slide-61
SLIDE 61

Relational Model

61

slide-62
SLIDE 62

Relational Model

TWEET ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

62

slide-63
SLIDE 63

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET Relation

63

slide-64
SLIDE 64

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET Relation Name

64

slide-65
SLIDE 65

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET Attribute Domain: D = dom(Timestamp) = Valid time strings = ##/##/#### ##:##

65

slide-66
SLIDE 66

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET Tuple

66

slide-67
SLIDE 67

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET Relation Schema (R)

67

slide-68
SLIDE 68

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET Relation Schema (R) Relation State r(R)

68

slide-69
SLIDE 69

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET Intension Extension

69

slide-70
SLIDE 70

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET

Find all the tweets by authors named Diane.

70

slide-71
SLIDE 71

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET

SELECT * FROM TWEET WHERE Name is “Diane” Find all the tweets by authors named Diane.

71

slide-72
SLIDE 72

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET

SELECT * FROM TWEET WHERE Name is “Diane” Find all the tweets by authors named Diane.

72

slide-73
SLIDE 73

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET

SELECT * FROM TWEET WHERE Name is “Diane” Find all the tweets by authors named Diane.

∅ ✔

73

slide-74
SLIDE 74

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET

SELECT * FROM TWEET WHERE Name is “Diane” Find all the tweets by authors named Diane.

∅ ✔

“Closed world assumption”

74

slide-75
SLIDE 75

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET

Find all the tweets which weight less than 45lbs

75

slide-76
SLIDE 76

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET

SELECT * FROM TWEET WHERE Weight < 45 Find all the tweets which weight less than 45lbs

76

slide-77
SLIDE 77

ID Timestamp Author Text Mentions 389472 1/1/19 12:34 Bob hey NULL 123794 1/1/19 12:32 Maria lol {Bob} 596208 1/2/19 1:04 Yu :-D NULL

Relational Model

TWEET

SELECT * FROM TWEET WHERE Weight < 45 Find all the tweets which weight less than 45lbs

???? ❌

77

slide-78
SLIDE 78

SQL

78

slide-79
SLIDE 79

SQL

  • Data Definition Language (DDL): Defining data

types and Relation Schemas (intensions!)

  • Data Manipulation and Query Language (DML):
  • Populating/updating data bases (extensions!)
  • Querying data bases

79

slide-80
SLIDE 80

Creating and Manipulating Tables

80

slide-81
SLIDE 81

TWEET PERSON Author_of CANDIDATE Mentions Name ID Timestamp Text Name ID Party ID Direct/ Indirect? Sentiment

Creating and Manipulating Tables

81

slide-82
SLIDE 82

Data Types

  • Numeric: INT, FLOAT, REAL, DOUBLE
  • Character Strings: CHAR(n), VARCHAR(n), CLOB(size)
  • CLOB(2MB) for large objects e.g. documents/web pages
  • Bit Strings: BIT(n), BIT VARYING(n), BLOB
  • BLOB(20MB) e.g. for images
  • Boolean
  • Dates: DATE, TIME, TIMESTAMP, TIME WITH TIME ZONE

82

slide-83
SLIDE 83

Data Types

  • Numeric: INT, FLOAT, REAL, DOUBLE
  • Character Strings: CHAR(n), VARCHAR(n), CLOB(size)
  • CLOB(2MB) for large objects e.g. documents/web pages
  • Bit Strings: BIT(n), BIT VARYING(n), BLOB
  • BLOB(20MB) e.g. for images
  • Boolean
  • Dates: DATE, TIME, TIMESTAMP, TIME WITH TIME ZONE

83

slide-84
SLIDE 84

Data Types

  • Numeric: INT, FLOAT, REAL, DOUBLE
  • Character Strings: CHAR(n), VARCHAR(n), CLOB(size)
  • CLOB(2MB) for large objects e.g. documents/web pages
  • Bit Strings: BIT(n), BIT VARYING(n), BLOB
  • BLOB(20MB) e.g. for images
  • Boolean
  • Dates: DATE, TIME, TIMESTAMP, TIME WITH TIME ZONE

84

slide-85
SLIDE 85

Data Types

  • Numeric: INT, FLOAT, REAL, DOUBLE
  • Character Strings: CHAR(n), VARCHAR(n), CLOB(size)
  • CLOB(2MB) for large objects e.g. documents/web pages
  • Bit Strings: BIT(n), BIT VARYING(n), BLOB
  • BLOB(20MB) e.g. for images
  • Boolean
  • Dates: DATE, TIME, TIMESTAMP, TIME WITH TIME ZONE

85

slide-86
SLIDE 86

Data Types

  • Numeric: INT, FLOAT, REAL, DOUBLE
  • Character Strings: CHAR(n), VARCHAR(n), CLOB(size)
  • CLOB(2MB) for large objects e.g. documents/web pages
  • Bit Strings: BIT(n), BIT VARYING(n), BLOB
  • BLOB(20MB) e.g. for images
  • Boolean
  • Dates: DATE, TIME, TIMESTAMP, TIME WITH TIME ZONE

86

slide-87
SLIDE 87

Data Types

  • Numeric: INT, FLOAT, REAL, DOUBLE
  • Character Strings: CHAR(n), VARCHAR(n), CLOB(size)
  • CLOB(2MB) for large objects e.g. documents/web pages
  • Bit Strings: BIT(n), BIT VARYING(n), BLOB
  • BLOB(20MB) e.g. for images
  • Boolean
  • Dates: DATE, TIME, TIMESTAMP, TIME WITH TIME ZONE

87

slide-88
SLIDE 88

Data Types

  • Numeric: INT, FLOAT, REAL, DOUBLE
  • Character Strings: CHAR(n), VARCHAR(n), CLOB(size)
  • CLOB(2MB) for large objects e.g. documents/web pages
  • Bit Strings: BIT(n), BIT VARYING(n), BLOB
  • BLOB(20MB) e.g. for images
  • Boolean
  • Dates: DATE, TIME, TIMESTAMP, TIME WITH TIME ZONE

88

slide-89
SLIDE 89

Data Types

  • Numeric: INT, FLOAT, REAL, DOUBLE
  • Character Strings: CHAR(n), VARCHAR(n), CLOB(size)
  • CLOB(2MB) for large objects e.g. documents/web pages
  • Bit Strings: BIT(n), BIT VARYING(n), BLOB
  • BLOB(20MB) e.g. for images
  • Boolean
  • Dates: DATE, TIME, TIMESTAMP, TIME WITH TIME ZONE

89

slide-90
SLIDE 90

Creating Tables

90

slide-91
SLIDE 91

Creating Tables

TWEET: <ID, Time, Text>

91

slide-92
SLIDE 92

Creating Tables

create table TWEET ( ID INT, Time TIMESTAMP, Text ??? ); TWEET: <ID, Time, Text>

92

slide-93
SLIDE 93

Creating Tables

create table TWEET ( ID INT, Time TIMESTAMP, Text ??? ); TWEET: <ID, Time, Text> CHAR(n), VARCHAR(n), CLOB(size) ??

93

slide-94
SLIDE 94

Creating Tables

create table TWEET ( ID INT, Time TIMESTAMP, Text VARCHAR(140) ); TWEET: <ID, Time, Text>

94

slide-95
SLIDE 95

Creating Tables

create table TWEET ( ID INT, Time TIMESTAMP, Text VARCHAR(140) ); TWEET: <ID, Time, Text>

  • CHAR(n): faster -> can use static

memory allocation; no length checks in

  • perations, so less overhead
  • VARCHAR(n): uses less space on

average

95

slide-96
SLIDE 96

create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000) ); PERSON: <Handle, Name>

Creating Tables

96

slide-97
SLIDE 97

create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ProfilePic ??? ); PERSON: <Handle, Name, ProfilePic>

Creating Tables

97

slide-98
SLIDE 98

create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ProfilePic BLOB(20MB), ); PERSON: <Handle, Name, ProfilePic>

Creating Tables

98

slide-99
SLIDE 99

create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ProfilePic BLOB(20MB), ProfilePage ??? ); PERSON: <Handle, Name, ProfilePic, ProfilePage>

Creating Tables

99

slide-100
SLIDE 100

create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ProfilePic BLOB(20MB), ProfilePage CLOB(20MB) ); PERSON: <Handle, Name, ProfilePic, ProfilePage>

Creating Tables

100

slide-101
SLIDE 101

Clicker Question!

101

slide-102
SLIDE 102

TWEET: <ID:INT, Time:TIMESTAMP, Text:VARCHAR(140)>

create table AUTHOR ( ??? );

PERSON: <Handle:VARCHAR(100), Name:VARCHAR(1000)>

TWEET PERSON Author_of Name Handle Timestamp Text ID

(1,1) (1,1)

Clicker Question!

102

slide-103
SLIDE 103

TWEET: <ID:INT, Time:TIMESTAMP, Text:VARCHAR(140)>

create table AUTHOR ( ??? );

PERSON: <Handle:VARCHAR(100), Name:VARCHAR(1000)>

TWEET PERSON Author_of Name Handle Timestamp Text ID

(1,1) (1,1)

Clicker Question!

create table TABLE_NAME ( Attr Data_type, … );

103

slide-104
SLIDE 104

TWEET PERSON Author_of Name Handle Timestamp Text ID

create table AUTHOR ( Tweet INT, Person VARCHAR(100), ); create table AUTHOR ( Tweet INT, Person VARCHAR(1000), ); create table AUTHOR ( Tweet INT, Person INT, );

(1,1) (1,1)

(a) (b) (c)

TWEET: <ID:INT, Time:TIMESTAMP, Text:VARCHAR(140)> PERSON: <Handle:VARCHAR(100), Name:VARCHAR(1000)>

104

slide-105
SLIDE 105

TWEET PERSON Author_of Name Handle Timestamp Text ID

create table AUTHOR ( Tweet INT, Person VARCHAR(100), ); create table AUTHOR ( Tweet INT, Person VARCHAR(1000), ); create table AUTHOR ( Tweet INT, Person INT, );

(1,1)

(a) (b) (c)

TWEET: <ID:INT, Time:TIMESTAMP, Text:VARCHAR(140)> PERSON: <Handle:VARCHAR(100), Name:VARCHAR(1000)>

(1,1)

105

slide-106
SLIDE 106

TWEET PERSON Author_of Name Handle Timestamp Text ID

create table AUTHOR ( Tweet INT, Person VARCHAR(100), ); create table AUTHOR ( Tweet INT, Person VARCHAR(1000), ); create table AUTHOR ( Tweet INT, Person INT, );

(1,1) (1,1)

(a) (b) (c)

TWEET: <ID:INT, Time:TIMESTAMP, Text:VARCHAR(140)> PERSON: <Handle:VARCHAR(100), Name:VARCHAR(1000)> Should use handle because they will be unique.

106

slide-107
SLIDE 107

Inserting Tuples

107

slide-108
SLIDE 108

ID Timestamp Text TWEET: <ID:INT, Time:TIMESTAMP, Text:VARCHAR(140)>

Inserting Tuples

108

slide-109
SLIDE 109

ID Timestamp Text 389472 2019-01-01 12:34:56 hey TWEET: <ID:INT, Time:TIMESTAMP, Text:VARCHAR(140)>

insert into TWEET values( 389472, 2019-01-01 12:34:56, “hey”);

Inserting Tuples

109

slide-110
SLIDE 110

ID Timestamp Text 389472 2019-01-01 12:34:56 hey NULL 2019-01-01 12:34:57 lol

insert into TWEET(Timestamp, Text) values( 2019-01-01 12:34:57, “lol”);

create table TWEET ( ID INT, Time TIMESTAMP, Text VARCHAR(140) );

110

slide-111
SLIDE 111

ID Timestamp Text 389472 2019-01-01 12:34:56 hey NULL 2019-01-01 12:34:57 lol

insert into TWEET(Timestamp, Text) values( 2019-01-01 12:34:57, “lol”);

create table TWEET ( ID INT, Time TIMESTAMP, Text VARCHAR(140) );

111

slide-112
SLIDE 112

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 2019-01-01 12:34:57 lol

insert into TWEET(Timestamp, Text) values( 2019-01-01 12:34:57, “lol”);

create table TWEET ( ID INT DEFAULT 0, Time TIMESTAMP, Text VARCHAR(140) );

112

slide-113
SLIDE 113

ID Timestamp Text 389472 2019-01-01 12:34:56 hey

insert into TWEET(Timestamp, Text) values( 2019-01-01 12:34:57, “lol”);

create table TWEET ( ID INT NOT NULL, Time TIMESTAMP, Text VARCHAR(140) );

113

slide-114
SLIDE 114

Primary Keys

114

slide-115
SLIDE 115

create table TWEET ( ID INT, Time TIMESTAMP, Text VARCHAR(140), ); TWEET: <ID, Time, Text>

Primary Keys

115

slide-116
SLIDE 116

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP, Text VARCHAR(140), ); TWEET: <ID, Time, Text>

Primary Keys

116

slide-117
SLIDE 117

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP, Text VARCHAR(140), ); TWEET: <ID, Time, Text> Enforces “NOT NULL ”

Primary Keys

117

slide-118
SLIDE 118

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP, Text VARCHAR(140), ); TWEET: <ID, Time, Text> Enforces Uniqueness

Primary Keys

118

slide-119
SLIDE 119

create table TWEET ( ID INT, Time TIMESTAMP, Text VARCHAR(140), PRIMARY KEY (ID) );

TWEET: <ID, Time, Text>

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP, Text VARCHAR(140), );

=

Primary Keys

119

slide-120
SLIDE 120

create table AUTHOR ( Tweet INT, Person VARCHAR(100), PRIMARY KEY (Tweet, Person) );

Primary Keys

120

slide-121
SLIDE 121

Clicker Question!

121

slide-122
SLIDE 122

Clicker Lightening Round!

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

👏

(b)

👎

(a)

122

slide-123
SLIDE 123

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); insert into TWEET values(5, “2019-01-01 12:34:57”, “lol”);

👏

Clicker Lightening Round! (b)

👎

(a)

123

slide-124
SLIDE 124

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); insert into TWEET values(5, “2019-01-01 12:34:57”, “lol”);

👏

Clicker Lightening Round! (b)

👎

(a)

124

slide-125
SLIDE 125

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); insert into TWEET values(E7w3WKVDB, “2019-01-01 12:34:57”, “lol”);

👏

Clicker Lightening Round! (b)

👎

(a)

125

slide-126
SLIDE 126

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); insert into TWEET values(E7w3WKVDB, “2019-01-01 12:34:57”, “lol”);

👏

Clicker Lightening Round! (b)

👎

(a)

data type mismatch

126

slide-127
SLIDE 127

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); insert into TWEET(Timestamp, Text) values(“2019-01-01 12:34:57”, “lol”);

👏

Clicker Lightening Round! (b)

👎

(a)

127

slide-128
SLIDE 128

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); insert into TWEET(Timestamp, Text) values(“2019-01-01 12:34:57”, “lol”);

👏

Clicker Lightening Round! (b)

👎

(a)

requires NOT NULL

128

slide-129
SLIDE 129

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

create table TWEET ( ID INT NOT NULL, Time TIMESTAMP, Text VARCHAR(140) DEFAULT “lol” ); insert into TWEET(ID, Text) values(389472, “2019-01-01 12:34:57”);

👏

Clicker Lightening Round! (b)

👎

(a)

129

slide-130
SLIDE 130

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

create table TWEET ( ID INT NOT NULL, Time TIMESTAMP, Text VARCHAR(140) DEFAULT “lol” ); insert into TWEET(ID, Text) values(389472, “2019-01-01 12:34:57”);

👏

Clicker Lightening Round! (b)

👎

(a)

weird, but technically fine

130

slide-131
SLIDE 131

create table TWEET ( ID INT NOT NULL, Time TIMESTAMP, Text VARCHAR(140) DEFAULT “lol” ); insert into TWEET(ID, Text) values(389472, “2019-01-01 12:34:57”);

👏

Clicker Lightening Round! (b)

👎

(a)

weird, but technically fine

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D 389472 NULL 2019-01-01 12:34:57

131

slide-132
SLIDE 132

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); insert into TWEET values(389472, “2019-01-04 12:14:37”, “ugh”);

👏

Clicker Lightening Round! (b)

👎

(a)

132

slide-133
SLIDE 133

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D

create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); insert into TWEET values(389472, “2019-01-04 12:14:37”, “ugh”);

👏

Clicker Lightening Round! (b)

👎

(a)

primary key needs to be unique

133

slide-134
SLIDE 134

create table AUTHOR ( Tweet INT, Person VARCHAR(100), PRIMARY KEY (Tweet, Person) ); create table TWEET ( ID INT, Time TIMESTAMP, Text VARCHAR(140) ); create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), );

Foreign Keys

134

slide-135
SLIDE 135

create table AUTHOR ( Tweet INT, Person VARCHAR(100), PRIMARY KEY (Tweet, Person) FOREIGN KEY (Tweet) REFERENCES TWEET(ID), FOREIGN KEY (Person) REFERENCES PERSON(Handle), ); create table TWEET ( ID INT, Time TIMESTAMP, Text VARCHAR(140) ); create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), );

Foreign Keys

135

slide-136
SLIDE 136

Foreign Keys

  • Not required to be Primary Keys, but!
  • Have to be unique
  • Have to be not NULL
  • NULLs are all considered distinct, i.e. NULL != NULL
  • So! Generally stick to the rule of making FK reference a PK
  • If you can’t do this, try refactoring your DB to make it

possible (if you are in a position to do this)

136

slide-137
SLIDE 137

Referential Integrity

137

slide-138
SLIDE 138

create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), PRIMARY KEY (Handle) );

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D 782138 2019-01-04 12:34:57 1951A 4 lyfe

PERSON

Handle Name j Josh d Diane s Sol

AUTHOR

Person Tweet j 389472 d 123794 s 596208 d 782138 create table AUTHOR ( Tweet INT, Person VARCHAR(100), FOREIGN KEY (Tweet) REFERENCES TWEET(ID), FOREIGN KEY (Person) REFERENCES PERSON(Handle), ); create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) );

138

slide-139
SLIDE 139

PERSON

Handle Name j Josh d Diane s Sol

AUTHOR

Person Tweet j 389472 d 123794 s 596208 d 782138 create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), PRIMARY KEY (Handle) );

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D 782138 2019-01-04 12:34:57 1951A 4 lyfe create table AUTHOR ( Tweet INT, Person VARCHAR(100), FOREIGN KEY (Tweet) REFERENCES TWEET(ID), FOREIGN KEY (Person) REFERENCES PERSON(Handle), ); create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) );

139

slide-140
SLIDE 140

PERSON

Handle Name j Josh d Diane s Sol

AUTHOR

Person Tweet j 389472 d 123794 s 596208 d 782138 create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), PRIMARY KEY (Handle) );

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D 782138 2019-01-04 12:34:57 1951A 4 lyfe create table AUTHOR ( Tweet INT, Person VARCHAR(100), FOREIGN KEY (Tweet) REFERENCES TWEET(ID), FOREIGN KEY (Person) REFERENCES PERSON(Handle), ); create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); DELETE FROM TWEET WHERE ID = “596208”

140

slide-141
SLIDE 141

PERSON

Handle Name j Josh d Diane s Sol

AUTHOR

Person Tweet j 389472 d 123794 s 596208 d 782138 create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), PRIMARY KEY (Handle) );

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 782138 2019-01-04 12:34:57 1951A 4 lyfe create table AUTHOR ( Tweet INT, Person VARCHAR(100), FOREIGN KEY (Tweet) REFERENCES TWEET(ID), FOREIGN KEY (Person) REFERENCES PERSON(Handle), ); create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); DELETE FROM TWEET WHERE ID = “596208”

😲

141

slide-142
SLIDE 142

PERSON

Handle Name j Josh d Diane s Sol

AUTHOR

Person Tweet j 389472 d 123794 s NULL d 782138 create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), PRIMARY KEY (Handle) );

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 782138 2019-01-04 12:34:57 1951A 4 lyfe create table AUTHOR ( Tweet INT, Person VARCHAR(100), FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE SET NULL, FOREIGN KEY (Person) REFERENCES PERSON(Handle), ); create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); DELETE FROM TWEET WHERE ID = “596208”

142

slide-143
SLIDE 143

create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), PRIMARY KEY (Handle) );

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 782138 2019-01-04 12:34:57 1951A 4 lyfe

PERSON

Handle Name j Josh d Diane s Sol

AUTHOR

Person Tweet j 389472 d 123794 d 782138 create table AUTHOR ( Tweet INT, Person VARCHAR(100), FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE CASCADE, FOREIGN KEY (Person) REFERENCES PERSON(Handle), ); create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); DELETE FROM TWEET WHERE ID = “596208”

143

slide-144
SLIDE 144

PERSON

Handle Name j Josh d Diane s Sol

AUTHOR

Person Tweet j 389472 d 123794 s NULL d 782138 create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), PRIMARY KEY (Handle) );

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 782138 2019-01-04 12:34:57 1951A 4 lyfe create table AUTHOR ( Tweet INT, Person VARCHAR(100), FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE RESTRICT, FOREIGN KEY (Person) REFERENCES PERSON(Handle), ); create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); DELETE FROM TWEET WHERE ID = “596208”

144

slide-145
SLIDE 145

PERSON

Handle Name j Josh d Diane s Sol

AUTHOR

Person Tweet j 389472 d 123794 s 596208 d 782138 create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), PRIMARY KEY (Handle) );

TWEET

ID Timestamp Text 389472 2019-01-01 12:34:56 hey 123794 2019-01-01 12:34:57 lol 596208 2019-01-02 3:14:15 :-D 782138 2019-01-04 12:34:57 1951A 4 lyfe create table AUTHOR ( Tweet INT, Person VARCHAR(100), FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE RESTRICT ON UPDATE CASCADE, FOREIGN KEY (Person) REFERENCES PERSON(Handle), ); create table TWEET ( ID INT PRIMARY KEY, Time TIMESTAMP , Text VARCHAR(140) ); DELETE FROM TWEET WHERE ID = “596208”

145

slide-146
SLIDE 146

Clicker Question!

146

slide-147
SLIDE 147

TWEET

ID Text Author 1 hey s 2 lol s 3 :-D d

Clicker Question!

PERSON

Handle Name j Josh d Diane s Sol

RETWEET

Person Tweet j 1 j 2 d 1 create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ); create table TWEET ( ID INT, Text VARCHAR(140), Author VARCHAR(100), FOREIGN KEY (Author) REFERENCES PERSON(Handle) ON DELETE CASCADE, ); create table RETWEET ( Person VARCHAR(100), Tweet INT, FOREIGN KEY (Person) REFERENCES PERSON(Handle) ON DELETE SET NULL, FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE SET NULL, );

147

slide-148
SLIDE 148

TWEET

ID Text Author 1 hey s 2 lol s 3 :-D d

PERSON

Handle Name j Josh d Diane s Sol

RETWEET

Person Tweet j 1 j 2 d 1

DELETE FROM PERSON WHERE Handle = “s”

What happens…?

create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ); create table TWEET ( ID INT, Text VARCHAR(140), Author VARCHAR(100), FOREIGN KEY (Author) REFERENCES PERSON(Handle) ON DELETE CASCADE, ); create table RETWEET ( Person VARCHAR(100), Tweet INT, FOREIGN KEY (Person) REFERENCES PERSON(Handle) ON DELETE SET NULL, FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE SET NULL, );

148

slide-149
SLIDE 149

Clicker Question!

ID Text Author 3 :-D d Handle Name j Josh d Diane Person Tweet Handle Name j Josh d Diane Handle Name j Josh d Diane Person Tweet j NULL j NULL d NULL ID Text Author 3 :-D d

PERSON TWEET RETWEET

(a) (b) (c)

Person Tweet NULL NULL NULL NULL NULL NULL ID Text Author 3 :-D d

149

slide-150
SLIDE 150

Clicker Question!

ID Text Author 3 :-D d Handle Name j Josh d Diane Person Tweet Handle Name j Josh d Diane Handle Name j Josh d Diane Person Tweet j NULL j NULL d NULL ID Text Author 3 :-D d

PERSON TWEET RETWEET

(a) (b) (c)

Person Tweet NULL NULL NULL NULL NULL NULL ID Text Author 3 :-D d

150

slide-151
SLIDE 151

TWEET

ID Text Author 1 hey s 2 lol s 3 :-D d

RETWEET

Person Tweet j 1 j 2 d 1

DELETE FROM PERSON WHERE Handle = “s”

What happens…?

PERSON

Handle Name j Josh d Diane s Sol create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ); create table TWEET ( ID INT, Text VARCHAR(140), Author VARCHAR(100), FOREIGN KEY (Author) REFERENCES PERSON(Handle) ON DELETE CASCADE, ); create table RETWEET ( Person VARCHAR(100), Tweet INT, FOREIGN KEY (Person) REFERENCES PERSON(Handle) ON DELETE SET NULL, FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE SET NULL, );

151

slide-152
SLIDE 152

TWEET

ID Text Author 1 hey s 2 lol s 3 :-D d

RETWEET

Person Tweet j 1 j 2 d 1

DELETE FROM PERSON WHERE Handle = “s”

What happens…?

PERSON

Handle Name j Josh d Diane create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ); create table TWEET ( ID INT, Text VARCHAR(140), Author VARCHAR(100), FOREIGN KEY (Author) REFERENCES PERSON(Handle) ON DELETE CASCADE, ); create table RETWEET ( Person VARCHAR(100), Tweet INT, FOREIGN KEY (Person) REFERENCES PERSON(Handle) ON DELETE SET NULL, FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE SET NULL, );

152

slide-153
SLIDE 153

TWEET

ID Text Author 1 hey s 2 lol s 3 :-D d

RETWEET

Person Tweet j 1 j 2 d 1

DELETE FROM PERSON WHERE Handle = “s”

What happens…?

PERSON

Handle Name j Josh d Diane create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ); create table TWEET ( ID INT, Text VARCHAR(140), Author VARCHAR(100), FOREIGN KEY (Author) REFERENCES PERSON(Handle) ON DELETE CASCADE, ); create table RETWEET ( Person VARCHAR(100), Tweet INT, FOREIGN KEY (Person) REFERENCES PERSON(Handle) ON DELETE SET NULL, FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE SET NULL, );

153

slide-154
SLIDE 154

TWEET

ID Text Author 3 :-D d

RETWEET

Person Tweet j 1 j 2 d 1

DELETE FROM PERSON WHERE Handle = “s”

What happens…?

PERSON

Handle Name j Josh d Diane create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ); create table TWEET ( ID INT, Text VARCHAR(140), Author VARCHAR(100), FOREIGN KEY (Author) REFERENCES PERSON(Handle) ON DELETE CASCADE, ); create table RETWEET ( Person VARCHAR(100), Tweet INT, FOREIGN KEY (Person) REFERENCES PERSON(Handle) ON DELETE SET NULL, FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE SET NULL, );

154

slide-155
SLIDE 155

TWEET

ID Text Author 3 :-D d

RETWEET

Person Tweet j 1 j 2 d 1

DELETE FROM PERSON WHERE Handle = “s”

What happens…?

PERSON

Handle Name j Josh d Diane create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ); create table TWEET ( ID INT, Text VARCHAR(140), Author VARCHAR(100), FOREIGN KEY (Author) REFERENCES PERSON(Handle) ON DELETE CASCADE, ); create table RETWEET ( Person VARCHAR(100), Tweet INT, FOREIGN KEY (Person) REFERENCES PERSON(Handle) ON DELETE SET NULL, FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE SET NULL, );

155

slide-156
SLIDE 156

TWEET

ID Text Author 3 :-D d

RETWEET

Person Tweet j NULL j NULL d NULL

DELETE FROM PERSON WHERE Handle = “s”

What happens…?

PERSON

Handle Name j Josh d Diane create table PERSON ( Handle VARCHAR(100), Name VARCHAR(1000), ); create table TWEET ( ID INT, Text VARCHAR(140), Author VARCHAR(100), FOREIGN KEY (Author) REFERENCES PERSON(Handle) ON DELETE CASCADE, ); create table RETWEET ( Person VARCHAR(100), Tweet INT, FOREIGN KEY (Person) REFERENCES PERSON(Handle) ON DELETE SET NULL, FOREIGN KEY (Tweet) REFERENCES TWEET(ID) ON DELETE SET NULL, );

156

slide-157
SLIDE 157

k bye

157