Physical Database Design
Ramakrishnan & Gehrke, Chapter 17 & 18
340151 Big Data & Cloud Services (P. Baumann)

Alternative File Organizations
• File organization = method of arranging a file of records on external storage
  - Goal: quickly find the records a query needs
• Several alternatives:
  - Heap files
  - Sorted files
  - Indexes

Index Selection Guidelines
• Understand the workload:
  - Queries vs. updates
  - Which relations (sizes!), attributes, conditions, joins (selectivity!), ...?
• Attributes in the WHERE clause are candidates for index keys
  - An exact-match condition suggests a hash index; a range query suggests a tree index
  - Consider multi-attribute search keys when the WHERE clause has several conditions
  - The order of attributes matters for range queries
• Choose indexes that benefit as many queries as possible
  - Impact on updates: indexes make queries faster, updates slower
  - Indexes require disk space
• Understand how the DBMS evaluates queries and creates query evaluation plans
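The point about attribute order in a multi-attribute search key can be tried out with a small sketch. It assumes SQLite through Python's standard sqlite3 module (the slides name no particular DBMS), and the table `contracts`, the index name `idx_contracts` and the helper `plan_for` are hypothetical. SQLite only offers B-tree indexes, so the hash-versus-tree choice is not shown.

```python
import sqlite3

# One equality condition (did) and one range condition (val).
QUERY = "SELECT cid FROM contracts WHERE did = 5 AND val > 10000"

def plan_for(index_columns):
    """Return SQLite's plan text for QUERY given one index on index_columns."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE contracts (cid INTEGER PRIMARY KEY, sid INTEGER, "
        "jid INTEGER, did INTEGER, pid INTEGER, qty INTEGER, val REAL)"
    )
    con.execute(
        f"CREATE INDEX idx_contracts ON contracts ({', '.join(index_columns)})"
    )
    # The fourth column of each EXPLAIN QUERY PLAN row is the readable detail.
    rows = con.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    con.close()
    return " | ".join(row[3] for row in rows)

plan_did_val = plan_for(["did", "val"])  # equality attribute first
plan_val_did = plan_for(["val", "did"])  # range attribute first

print(plan_did_val)
print(plan_val_did)
```

With the equality attribute first, the plan can typically use both conditions as index bounds; with the range attribute first, only the range on val narrows the search. The exact plan text depends on the SQLite version.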

Decisions to Make
• What indexes?
  - Which relations? Which field(s) as search key? Several indexes?
  - For each index, what kind of index should it be?
• Change the conceptual schema? Guided by workload, in addition to redundancy issues
  - Consider alternative normalized schemas? (many choices!)
  - "Undo" some decompositions and settle for a lower normal form, such as 3NF? (denormalization)
  - Horizontal partitioning, replication, views ... see manuals
• If made after a database is in use, such changes are called schema evolution

Example Schema
  Contracts (Cid, Sid, Jid, Did, Pid, Qty, Val)
  Depts (Did, Budget, Report)
  Suppliers (Sid, Address)
  Parts (Pid, Cost)
  Projects (Jid, Mgr)
• Contracts = CSJDPQV; ICs: JP → C, SD → P; C is the primary key
  - Superkey?
  - What normal form?
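The superkey question can be checked mechanically with an attribute-closure computation. The following is a minimal sketch, assuming the single-letter attribute names of the slide; the function names `closure` and `is_superkey` are chosen here for illustration.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

ALL = set("CSJDPQV")
# C is the primary key, so C determines every attribute; plus the two ICs.
FDS = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]

def is_superkey(attrs):
    """A set of attributes is a superkey if its closure is the whole schema."""
    return closure(attrs, FDS) == ALL

print(is_superkey("C"))   # True
print(is_superkey("JP"))  # True
print(is_superkey("SD"))  # False
```

Because SD is not a superkey, SD → P violates BCNF. P is part of the candidate key JP, however, so the dependency is permitted by 3NF.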

Denormalization
  Contracts (Cid, Sid, Jid, Did, Pid, Qty, Val)
• Suppose the following query is important:
  - "Is the value of a contract less than the department's budget?"
• To speed it up, add a field Budget (B) from Depts to Contracts
  - New FD for department/budget: Did → B
  - Contracts is no longer in 3NF
• We might still choose to modify Contracts this way
  - Is the query sufficiently important? Is there no good performance otherwise?
  - "Otherwise" means: by adding indexes or by choosing an alternative 3NF schema
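A small sketch of the trade-off, assuming SQLite via Python's sqlite3 module. The reduced tables, the name `contracts_denorm` and the sample rows are made up for illustration. The normalized schema needs a join to answer the query; the denormalized one answers it from a single table.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE depts (did INTEGER PRIMARY KEY, budget REAL);
    CREATE TABLE contracts (cid INTEGER PRIMARY KEY, did INTEGER, val REAL);
    -- Denormalized variant: budget is copied into every contract row.
    CREATE TABLE contracts_denorm (cid INTEGER PRIMARY KEY, did INTEGER,
                                   val REAL, budget REAL);
    INSERT INTO depts VALUES (1, 5000), (2, 20000);
    INSERT INTO contracts VALUES (10, 1, 7000), (11, 2, 7000), (12, 2, 30000);
    INSERT INTO contracts_denorm
        SELECT c.cid, c.did, c.val, d.budget
        FROM contracts c JOIN depts d ON c.did = d.did;
""")

# Normalized schema: the comparison needs a join with depts.
with_join = con.execute(
    "SELECT c.cid FROM contracts c JOIN depts d ON c.did = d.did "
    "WHERE c.val < d.budget ORDER BY c.cid"
).fetchall()

# Denormalized schema: a single-table selection answers the same question.
without_join = con.execute(
    "SELECT cid FROM contracts_denorm WHERE val < budget ORDER BY cid"
).fetchall()

print(with_join, without_join)
```

The price is redundancy: the budget is stored once per contract, and every budget change has to be propagated to all contract rows of that department.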

Decomposition of a BCNF Relation
• Suppose the decomposition { SDP, CSJDQV } is in BCNF
  - No reason to decompose further (assuming that all known ICs are FDs)
• However, suppose that these queries are important:
  - "Find the contracts held by supplier S"
  - "Find the contracts that department D is involved in"
• Decomposing CSJDQV further into CS, CD and CJQV:
  - Could speed up these queries (Why?)
  - Makes the following query slower: "Find the total value of all contracts held by supplier S."
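The effect of the further decomposition can be sketched as follows, assuming SQLite via Python's sqlite3 module; the lowercase table names and the sample rows are hypothetical. The supplier lookup touches only the narrow table cs, while the total-value query has to join cs with cjqv.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE cs   (cid INTEGER PRIMARY KEY, sid INTEGER);
    CREATE TABLE cd   (cid INTEGER PRIMARY KEY, did INTEGER);
    CREATE TABLE cjqv (cid INTEGER PRIMARY KEY, jid INTEGER,
                       qty INTEGER, val REAL);
    INSERT INTO cs   VALUES (10, 7), (11, 7), (12, 8);
    INSERT INTO cd   VALUES (10, 1), (11, 2), (12, 2);
    INSERT INTO cjqv VALUES (10, 100, 5, 7000), (11, 101, 2, 3000),
                            (12, 100, 9, 500);
""")

# "Find the contracts held by supplier S": only the two-column table is read.
contracts_of_supplier = con.execute(
    "SELECT cid FROM cs WHERE sid = 7 ORDER BY cid"
).fetchall()

# "Find the total value of all contracts held by supplier S": needs a join.
total_value = con.execute(
    "SELECT SUM(v.val) FROM cs s JOIN cjqv v ON s.cid = v.cid WHERE s.sid = 7"
).fetchone()[0]

print(contracts_of_supplier, total_value)
```

All three fragments share the key C, so the original relation can be rebuilt by joining on cid; the cost shows up only in queries that need attributes from more than one fragment.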

Masking Conceptual Schema Changes
  CREATE VIEW Contracts(cid, sid, jid, did, pid, qty, val) AS
    SELECT * FROM LargeContracts
    UNION
    SELECT * FROM SmallContracts
• Replacing Contracts by LargeContracts + SmallContracts can be masked by the view
• Queries with the condition val > 10000 should preferably be posed against LargeContracts for efficient execution
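The masking can be tried out with a short sketch, assuming SQLite via Python's sqlite3 module and assuming that the partition boundary is val > 10000 (the slide suggests this threshold but does not define the split). The sample rows are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cols = ("cid INTEGER, sid INTEGER, jid INTEGER, did INTEGER, "
        "pid INTEGER, qty INTEGER, val REAL")
con.executescript(f"""
    CREATE TABLE LargeContracts ({cols});  -- assumed to hold val > 10000
    CREATE TABLE SmallContracts ({cols});  -- assumed to hold the rest
    CREATE VIEW Contracts(cid, sid, jid, did, pid, qty, val) AS
        SELECT * FROM LargeContracts
        UNION
        SELECT * FROM SmallContracts;
    INSERT INTO LargeContracts VALUES (1, 7, 100, 1, 50, 5, 25000);
    INSERT INTO SmallContracts VALUES (2, 8, 101, 2, 51, 3, 4000);
""")

# Old queries keep working against the view ...
via_view = con.execute(
    "SELECT cid FROM Contracts WHERE val > 10000 ORDER BY cid"
).fetchall()

# ... but the same answer is available from one partition alone.
direct = con.execute(
    "SELECT cid FROM LargeContracts WHERE val > 10000 ORDER BY cid"
).fetchall()

print(via_view, direct)
```

Both queries return the same rows; the direct one avoids evaluating the union over both partitions.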

Tuning Queries and Views
• Query runs slower than expected? Check whether an index needs to be rebuilt or the statistics are too old
• The DBMS may not be executing the plan you had in mind. Common problems:
  - Selections involving null values
  - Selections involving arithmetic or string expressions
  - Selections involving OR conditions
  - Lack of evaluation features like index-only strategies or certain join methods, or poor size estimation
• Check the plan used; adjust the choice of indexes or rewrite the query/view
  - Avoid nested queries, temporary relations, complex conditions, and operations like DISTINCT and GROUP BY
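One of the listed problems, a selection involving an arithmetic expression, can be observed directly in a query plan. This is a sketch assuming SQLite via Python's sqlite3 module; the table, the index name `idx_val` and the helper `plan` are hypothetical, and other systems may behave differently.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE contracts (cid INTEGER PRIMARY KEY, sid INTEGER, val REAL);
    CREATE INDEX idx_val ON contracts (val);
""")

def plan(sql):
    """Return the readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    rows = con.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Plain comparison on the indexed column: the index can be used.
plain = plan("SELECT * FROM contracts WHERE val > 10000")

# Same condition hidden inside an arithmetic expression: the optimizer
# does not rewrite it, so the index on val is not applicable.
wrapped = plan("SELECT * FROM contracts WHERE val * 2 > 20000")

print(plain)
print(wrapped)
```

Rewriting the condition so that the indexed column stands alone on one side of the comparison is the kind of query rewrite the slide refers to.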

PS: A Moderately Complex Query

Key Performance Factors
• Ref: discussion "what are the key points to improve the query performance" on the LinkedIn Database list, 2012-07-20

Summary
• Many alternative file organizations, each appropriate in some situation
• If selection queries are frequent: sort the file or build an index
  - Hash vs. tree indexes vs. sorted files
• Understand the workload and the DBMS query plans
