CS377: Database Systems: Distributed Databases

Department of Mathematics and Computer Science, Emory University

Centralized DBMS on a Network [figure]

Distributed DBMS Environment [figure]

Distributed Database System
• A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network.
• A distributed database management system (D-DBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users.
• Distributed database system (DDBS) = DDB + D-DBMS

Distributed Database System
The EMPLOYEE, PROJECT, and WORKS_ON tables may be fragmented horizontally and stored with possible replication. [figure]

Distributed DBMS Promises
• Transparent management of distributed, fragmented, and replicated data
• Improved reliability/availability through distributed transactions
• Improved performance
• Easier and more economical system expansion

Distributed DBMS Issues
• Distributed database design
  • How to distribute the database
• Query processing
  • Optimize cost = data transmission + local processing

Distributed DBMS Issues
• Concurrency control
  • Synchronization of concurrent accesses
  • Consistency and isolation of transactions' effects
  • Deadlock management
• Reliability
  • How to make the system resilient to failures
  • Atomicity and durability

Distributed database design
• Data distribution
  • Top-down: mostly in designing systems from scratch
  • Bottom-up: when the databases already exist at a number of sites
• Unit of distribution
  • Relation
  • Fragments of relations (sub-relations)
    • Data are inherently fragmented, e.g. by locality
    • Allows concurrent execution of a number of transactions that access different portions of a relation

Example
Employee relation E(#, name, loc, sal, …)
• 40% of queries: Qa: select * from E where loc=Sa and …
• 40% of queries: Qb: select * from E where loc=Sb and …
Motivation: two sites, Sa and Sb [figure: Qa → ← Qb]

Fragmentation Alternatives – Horizontal
• PROJ1: projects with budgets less than $200,000
• PROJ2: projects with budgets greater than or equal to $200,000
[figure: tables for PROJ1 and PROJ2]

Fragmentation Alternatives – Vertical
• PROJ1: information about project budgets
• PROJ2: information about project names and locations
[figure: tables for PROJ1 and PROJ2]

Data Fragmentation, Replication and Allocation
• Horizontal fragment
  • A horizontal subset of a relation that contains those tuples which satisfy a selection condition.
  • E.g. Employee relation with selection condition (DNO = 5)
  • Can be specified by a σ_Ci(R) operation in the relational algebra.
• Complete horizontal fragmentation
  • A set of horizontal fragments whose conditions C1, C2, …, Cn include all the tuples in R; that is, every tuple in R satisfies (C1 OR C2 OR … OR Cn).
  • Disjoint complete horizontal fragmentation: no tuple in R satisfies (Ci AND Cj) where i ≠ j.
  • How to reconstruct R from complete horizontal fragments?
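The definitions above can be checked mechanically. The following is a minimal sketch, assuming a small made-up EMPLOYEE relation and three hypothetical conditions on DNO; it is not the course's own code. It builds the fragments σ_Ci(R), tests completeness and disjointness, and reconstructs R with the UNION of the fragments, which is the usual answer to the reconstruction question for a complete horizontal fragmentation.

```python
# Hypothetical EMPLOYEE tuples (ssn, name, dno); the data is invented for illustration.
EMPLOYEE = [
    (1, "Smith", 5),
    (2, "Wong", 5),
    (3, "Zelaya", 4),
    (4, "Borg", 1),
]

# Selection conditions C1..C3 (assumed), one per horizontal fragment.
CONDITIONS = [
    lambda t: t[2] == 5,
    lambda t: t[2] == 4,
    lambda t: t[2] not in (4, 5),
]


def horizontal_fragments(relation, conds):
    """Return one fragment sigma_Ci(R) per condition Ci."""
    return [[t for t in relation if c(t)] for c in conds]


def is_complete(relation, conds):
    """Every tuple satisfies (C1 OR C2 OR ... OR Cn)."""
    return all(any(c(t) for c in conds) for t in relation)


def is_disjoint(relation, conds):
    """No tuple satisfies (Ci AND Cj) for i != j."""
    return all(sum(1 for c in conds if c(t)) <= 1 for t in relation)


def reconstruct(fragments):
    """UNION of the fragments (set semantics, so duplicates collapse)."""
    result = set()
    for fragment in fragments:
        result.update(fragment)
    return result


fragments = horizontal_fragments(EMPLOYEE, CONDITIONS)
print(is_complete(EMPLOYEE, CONDITIONS), is_disjoint(EMPLOYEE, CONDITIONS))
print(reconstruct(fragments) == set(EMPLOYEE))
```

If the fragmentation is complete but not disjoint, the union still reconstructs R because set union removes the repeated tuples.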

Three common horizontal partitioning techniques
• Round robin
• Hash partitioning
• Range partitioning

• Round robin
  Tuples of R are placed on disks D0, D1, D2 in turn:
  t1 → D0, t2 → D1, t3 → D2, t4 → D0, ...
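Round robin placement can be sketched in a few lines. This is an illustrative sketch only; the function name and the tuple labels are assumptions, with disks modelled as plain Python lists.

```python
def round_robin_partition(tuples, n_disks):
    """Place the i-th tuple (0-based) on disk i mod n_disks."""
    disks = [[] for _ in range(n_disks)]
    for i, t in enumerate(tuples):
        disks[i % n_disks].append(t)
    return disks


# Hypothetical tuples t1..t5 spread over three disks D0, D1, D2.
rr_disks = round_robin_partition(["t1", "t2", "t3", "t4", "t5"], 3)
for d, content in enumerate(rr_disks):
    print(f"D{d}: {content}")
```

The placement ignores tuple values, so the disks stay evenly loaded, but a lookup on any attribute has to consult every disk.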

• Hash partitioning
  Tuple ti with key ki is placed on disk D_h(ki):
  t1 → h(k1)=2 → D2
  t2 → h(k2)=0 → D0
  t3 → h(k3)=0 → D0
  t4 → h(k4)=1 → D1
  ...
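A minimal sketch of hash partitioning follows. The slide does not say what h is, so the sketch assumes integer keys and h(k) = k mod n; the key values 2, 3, 6, 4 are made up so that the hash values match the ones on the slide (2, 0, 0, 1).

```python
def hash_partition(tuples, n_disks, key, h=None):
    """Place each tuple on disk h(key(tuple)); h defaults to k mod n_disks (an assumption)."""
    if h is None:
        h = lambda k: k % n_disks
    disks = [[] for _ in range(n_disks)]
    for t in tuples:
        disks[h(key(t))].append(t)
    return disks


# Hypothetical (label, key) pairs; keys chosen so that h gives 2, 0, 0, 1.
R = [("t1", 2), ("t2", 3), ("t3", 6), ("t4", 4)]
hash_disks = hash_partition(R, 3, key=lambda t: t[1])
for d, content in enumerate(hash_disks):
    print(f"D{d}: {[label for label, _ in content]}")
```

An equality lookup on the partitioning key only needs the one disk that h points to; range queries on the key still touch all disks.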

• Range partitioning
  Tuples of R are placed on disks D0, D1, D2 according to the range that attribute A falls in:
  t1: A=5, t2: A=8, t3: A=2, t4: A=3, ...
  [figure: range boundaries and resulting placement]
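Range partitioning can be sketched with a sorted list of boundaries. The boundaries are not legible in the slide, so the values 4 and 7 below are an assumption chosen purely for illustration, as is the convention that a value equal to a boundary goes to the higher disk.

```python
from bisect import bisect_right


def range_partition(tuples, boundaries, key):
    """Place each tuple on the disk whose range contains key(tuple).

    With boundaries [b0, b1]: key < b0 -> D0, b0 <= key < b1 -> D1, key >= b1 -> D2.
    """
    disks = [[] for _ in range(len(boundaries) + 1)]
    for t in tuples:
        disks[bisect_right(boundaries, key(t))].append(t)
    return disks


# A values from the slide; the boundaries [4, 7] are hypothetical.
R_RANGE = [("t1", 5), ("t2", 8), ("t3", 2), ("t4", 3)]
range_disks = range_partition(R_RANGE, [4, 7], key=lambda t: t[1])
for d, content in enumerate(range_disks):
    print(f"D{d}: {[label for label, _ in content]}")
```

Unlike hashing, this keeps neighbouring A values together, so a range query on A touches only the disks whose ranges overlap the query.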

Data Fragmentation, Replication and Allocation
• Vertical fragment
  • A vertical subset of a relation that contains a subset of columns.
  • E.g. Employee relation: a vertical fragment of Name, Bdate, Sex
  • Can be specified by a Π_Li(R) operation in the relational algebra.
  • Each fragment must include the primary key attribute of the parent relation (here, Employee).
• Complete vertical fragmentation
  • A set of vertical fragments whose projection lists L1, L2, …, Ln include all the attributes in R but share only the primary key of R.
  • L1 ∪ L2 ∪ ... ∪ Ln = ATTRS(R)
  • Li ∩ Lj = PK(R) for any i ≠ j
  • How to reconstruct R from complete vertical fragments?
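The vertical case can be sketched the same way. The rows, attribute names, and helper functions below are assumptions made for illustration. Each fragment is a projection that always carries the primary key, and R is rebuilt by joining the fragments on that key, which is the usual answer to the reconstruction question for a complete vertical fragmentation.

```python
# Hypothetical EMPLOYEE rows; the values are invented for illustration.
EMPLOYEE_ROWS = [
    {"ssn": 1, "name": "Smith", "bdate": "1965-01-09", "sex": "M", "salary": 30000, "dno": 5},
    {"ssn": 2, "name": "Wong", "bdate": "1955-12-08", "sex": "M", "salary": 40000, "dno": 5},
    {"ssn": 3, "name": "Zelaya", "bdate": "1968-01-19", "sex": "F", "salary": 25000, "dno": 4},
]
PK = "ssn"


def vertical_fragment(relation, attrs, pk=PK):
    """Projection onto attrs, always including the primary key."""
    cols = [pk] + [a for a in attrs if a != pk]
    return [{a: row[a] for a in cols} for row in relation]


def is_complete_vertical(attr_lists, all_attrs, pk=PK):
    """Lists cover ATTRS(R), and any two lists share only the primary key."""
    lists = [set(attrs) | {pk} for attrs in attr_lists]
    covers = set().union(*lists) == set(all_attrs)
    share_only_pk = all(
        lists[i] & lists[j] == {pk}
        for i in range(len(lists))
        for j in range(i + 1, len(lists))
    )
    return covers and share_only_pk


def reconstruct_vertical(fragments, pk=PK):
    """Join the fragments on the primary key by merging rows with equal keys."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[pk], {}).update(row)
    return list(merged.values())


L1 = ["name", "bdate", "sex"]
L2 = ["salary", "dno"]
v_fragments = [vertical_fragment(EMPLOYEE_ROWS, L1), vertical_fragment(EMPLOYEE_ROWS, L2)]
print(is_complete_vertical([L1, L2], EMPLOYEE_ROWS[0].keys()))
print(reconstruct_vertical(v_fragments) == EMPLOYEE_ROWS)
```

The merge behaves like a full outer join on the key; for a complete vertical fragmentation every key appears in every fragment, so the result coincides with the natural join.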

Data Fragmentation, Replication and Allocation
• Mixed (hybrid) fragmentation
  • A combination of vertical fragmentation and horizontal fragmentation.
  • This is achieved by SELECT-PROJECT operations, represented by Π_Li(σ_Ci(R)).
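A mixed fragment is a selection followed by a projection. The sketch below assumes a made-up relation and hypothetical helper names; it computes Π_L(σ_C(R)) for C = (dno = 5) and L = (ssn, name).

```python
# Hypothetical rows; the values are invented for illustration.
EMP = [
    {"ssn": 1, "name": "Smith", "salary": 30000, "dno": 5},
    {"ssn": 2, "name": "Wong", "salary": 40000, "dno": 5},
    {"ssn": 3, "name": "Zelaya", "salary": 25000, "dno": 4},
]


def select(relation, cond):
    """sigma_C(R): keep the rows that satisfy cond."""
    return [row for row in relation if cond(row)]


def project(relation, attrs):
    """pi_L(R): keep only the listed attributes (no duplicate removal here)."""
    return [{a: row[a] for a in attrs} for row in relation]


def mixed_fragment(relation, cond, attrs):
    """pi_L(sigma_C(R)): horizontal selection followed by vertical projection."""
    return project(select(relation, cond), attrs)


fragment = mixed_fragment(EMP, lambda row: row["dno"] == 5, ["ssn", "name"])
print(fragment)
```

Because the projection list keeps the primary key, the projection cannot produce duplicate rows, so skipping duplicate removal is safe in this example.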

Data Fragmentation, Replication and Allocation
• Fragmentation schema
  • A definition of a set of fragments (horizontal, vertical, or mixed) that can reconstruct the original database.
• Allocation schema
  • Distribution of fragments to sites of distributed databases. The database can be fully replicated, partially replicated, or partitioned.
• Replication
  • Full replication: the database is replicated to all sites.
  • Partial replication: some selected part is replicated.
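One way to picture an allocation schema is as a mapping from each fragment to the set of sites that store a copy. The sketch below assumes hypothetical fragment and site names and a hypothetical classifier function; it labels a schema as fully replicated, partitioned, or partially replicated, reading "partitioned" as every fragment being stored at exactly one site.

```python
def classify_allocation(allocation, sites):
    """Classify an allocation schema given as {fragment: set of sites holding a copy}."""
    all_sites = set(sites)
    copies = [set(s) for s in allocation.values()]
    if all(c == all_sites for c in copies):
        return "full replication"      # every fragment is stored at every site
    if all(len(c) == 1 for c in copies):
        return "partitioned"           # no fragment is stored more than once
    return "partial replication"       # some fragments have extra copies


# Hypothetical sites and fragments.
SITES = ["site1", "site2", "site3"]
full = {"EMP_D5": {"site1", "site2", "site3"}, "EMP_D4": {"site1", "site2", "site3"}}
partial = {"EMP_D5": {"site1", "site2"}, "EMP_D4": {"site3"}}
partitioned = {"EMP_D5": {"site1"}, "EMP_D4": {"site2"}}

for name, schema in [("full", full), ("partial", partial), ("partitioned", partitioned)]:
    print(name, "->", classify_allocation(schema, SITES))
```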
