
CS377: Database Systems - Distributed Databases

Department of Mathematics and Computer Science, Emory University


  1. CS377: Database Systems - Distributed Databases. Department of Mathematics and Computer Science, Emory University

  2. Centralized DBMS on a Network (diagram)

  3. Distributed DBMS Environment (diagram)

  4. Distributed Database System
     • A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network.
     • A distributed database management system (D-DBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users.
     • Distributed database system (DDBS) = DDB + D-DBMS

  5. Distributed Database System
     • The EMPLOYEE, PROJECT, and WORKS_ON tables may be fragmented horizontally and stored with possible replication (figure not included in this transcript).

  6. Distributed DBMS Promises
     • Transparent management of distributed, fragmented, and replicated data
     • Improved reliability/availability through distributed transactions
     • Improved performance
     • Easier and more economical system expansion

  7. Distributed DBMS Issues
     • Distributed database design
       - How to distribute the database
     • Query processing
       - Optimize cost = data transmission + local processing

  8. Distributed DBMS Issues
     • Concurrency control
       - Synchronization of concurrent accesses
       - Consistency and isolation of transactions' effects
       - Deadlock management
     • Reliability
       - How to make the system resilient to failures
       - Atomicity and durability

  9. Distributed database design
     • Data distribution
       - Top-down: mostly in designing systems from scratch
       - Bottom-up: when the databases already exist at a number of sites
     • Unit of distribution
       - Relation
       - Fragments of relations (sub-relations)
         - Data are inherently fragmented, e.g. by locality
         - Allows concurrent execution of a number of transactions that access different portions of a relation

  10. Example
     • Employee relation E (#, name, loc, sal, …)
     • 40% of queries: Qa: select * from E where loc=Sa and ...
     • 40% of queries: Qb: select * from E where loc=Sb and ...
     • Motivation: two sites, Sa and Sb; Qa → Sa, Sb ← Qb

  11. Fragmentation Alternatives – Horizontal
     • PROJ1: projects with budgets less than $200,000
     • PROJ2: projects with budgets greater than or equal to $200,000
     • (Example tables for PROJ, PROJ1, and PROJ2 omitted.)

  12. Fragmentation Alternatives – Vertical
     • PROJ1: information about project budgets
     • PROJ2: information about project names and locations
     • (Example tables for PROJ, PROJ1, and PROJ2 omitted.)

  13. Data Fragmentation, Replication and Allocation
     • Horizontal fragmentation
       - A horizontal fragment is a subset of the tuples of a relation: those that satisfy a selection condition.
       - E.g. the Employee relation with selection condition (DNO = 5)
       - Can be specified by a σ_Ci(R) operation in the relational algebra.
     • Complete horizontal fragmentation
       - A set of horizontal fragments whose conditions C1, C2, …, Cn include all the tuples in R: every tuple in R satisfies (C1 OR C2 OR … OR Cn).
       - Disjoint complete horizontal fragmentation: no tuple in R satisfies (Ci AND Cj) where i ≠ j.
       - How to reconstruct R from complete horizontal fragments?
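The definitions on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the tuples, the `ssn` and `name` attributes, and the three conditions are assumptions made up for the example (only the DNO = 5 condition comes from the slide). It also shows one answer to the closing question: a complete set of horizontal fragments is reconstructed by taking their union.

```python
# Hypothetical EMPLOYEE tuples; only the DNO attribute comes from the slide's example.
EMPLOYEE = [
    {"ssn": 1, "name": "Ann", "dno": 5},
    {"ssn": 2, "name": "Bob", "dno": 4},
    {"ssn": 3, "name": "Cy", "dno": 5},
    {"ssn": 4, "name": "Di", "dno": 1},
]

# Selection conditions C1..Cn, one per fragment (assumed for illustration).
CONDITIONS = [
    lambda t: t["dno"] == 5,
    lambda t: t["dno"] == 4,
    lambda t: t["dno"] not in (4, 5),
]


def select(relation, condition):
    """sigma_C(R): keep the tuples of R that satisfy C."""
    return [t for t in relation if condition(t)]


def is_complete(relation, conditions):
    """True if every tuple satisfies (C1 OR C2 OR ... OR Cn)."""
    return all(any(c(t) for c in conditions) for t in relation)


def is_disjoint(relation, conditions):
    """True if no tuple satisfies (Ci AND Cj) for i != j."""
    return all(sum(1 for c in conditions if c(t)) <= 1 for t in relation)


def reconstruct(fragments):
    """Union of the fragments; duplicates from overlapping fragments are dropped."""
    seen, result = set(), []
    for fragment in fragments:
        for t in fragment:
            key = tuple(sorted(t.items()))
            if key not in seen:
                seen.add(key)
                result.append(t)
    return result


fragments = [select(EMPLOYEE, c) for c in CONDITIONS]
print(is_complete(EMPLOYEE, CONDITIONS), is_disjoint(EMPLOYEE, CONDITIONS))
print(sorted(t["ssn"] for t in reconstruct(fragments)))
```

When the fragmentation is disjoint, the union never meets a duplicate, so the deduplication step only matters for overlapping fragments.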

  14. Three common horizontal partitioning techniques
     • Round robin
     • Hash partitioning
     • Range partitioning

  15. Round robin: the tuples of R are assigned to disks D0, D1, D2 in turn: t1 → D0, t2 → D1, t3 → D2, t4 → D0, t5 → D1, ...

  16. Hash partitioning: each tuple of R goes to the disk given by a hash of its key: t1 → h(k1)=2 → D2, t2 → h(k2)=0 → D0, t3 → h(k3)=0 → D0, t4 → h(k4)=1 → D1, ...

  17. Range partitioning: each tuple of R goes to the disk whose range contains its value of attribute A. Example tuples: t1: A=5, t2: A=8, t3: A=2, t4: A=3, ... (the range boundaries for D0, D1, D2 are illegible in the source).
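The three partitioning techniques from the preceding slides can be sketched as follows. This is a hedged illustration: the function names, the use of Python's built-in `hash` as h, and the boundaries `[3, 6]` are assumptions (the slide's own range boundaries and hash function are not recoverable), so the resulting placements need not match the slide figures. The A values 5, 8, 2, 3 are the ones shown on the range-partitioning slide.

```python
import bisect


def round_robin(tuples, n_disks):
    """The i-th tuple (counting from 0) goes to disk i mod n_disks."""
    disks = [[] for _ in range(n_disks)]
    for i, t in enumerate(tuples):
        disks[i % n_disks].append(t)
    return disks


def hash_partition(tuples, n_disks, key, h=hash):
    """A tuple goes to disk h(key(t)) mod n_disks."""
    disks = [[] for _ in range(n_disks)]
    for t in tuples:
        disks[h(key(t)) % n_disks].append(t)
    return disks


def range_partition(tuples, boundaries, key):
    """With sorted boundaries [b0, b1, ...]: disk 0 holds key < b0,
    disk 1 holds b0 <= key < b1, and the last disk holds the rest."""
    disks = [[] for _ in range(len(boundaries) + 1)]
    for t in tuples:
        disks[bisect.bisect_right(boundaries, key(t))].append(t)
    return disks


# (tuple name, value of attribute A); A values taken from the slide.
R = [("t1", 5), ("t2", 8), ("t3", 2), ("t4", 3)]


def by_a(t):
    return t[1]


print(round_robin(R, 3))
print(hash_partition(R, 3, key=by_a))           # hash of a small int is the int itself in CPython
print(range_partition(R, [3, 6], key=by_a))     # boundaries 3 and 6 are hypothetical
```

Round robin balances load but gives no help in locating a tuple by value; hash partitioning supports point lookups on the key; range partitioning also supports range queries on A, at the risk of skew if the boundaries are poorly chosen.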

  18. Data Fragmentation, Replication and Allocation
     • Vertical fragmentation
       - A vertical subset of a relation that contains a subset of columns.
       - E.g. Employee relation: a vertical fragment of Name, Bdate, Sex
       - Can be specified by a Π_Li(R) operation in the relational algebra.
       - Each fragment must include the primary key attribute of the parent relation Employee
     • Complete vertical fragmentation
       - A set of vertical fragments whose projection lists L1, L2, …, Ln include all the attributes in R but share only the primary key of R.
       - L1 ∪ L2 ∪ ... ∪ Ln = ATTRS(R)
       - Li ∩ Lj = PK(R) for any i ≠ j
       - How to reconstruct R from complete vertical fragments?
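A small Python sketch of vertical fragmentation, under stated assumptions: `ssn` is taken to be the primary key of Employee and `salary` is an extra attribute invented for the example (the slide names only Name, Bdate, Sex). It checks the two completeness conditions and reconstructs R by joining the fragments on the primary key, which is why every fragment must carry it.

```python
# Assumed schema: ssn is the primary key; salary is a made-up extra attribute.
PK = ("ssn",)
ATTRS = ("ssn", "name", "bdate", "sex", "salary")
EMPLOYEE = [
    {"ssn": 1, "name": "Ann", "bdate": "1990-01-01", "sex": "F", "salary": 50},
    {"ssn": 2, "name": "Bob", "bdate": "1985-06-30", "sex": "M", "salary": 60},
]

# Projection lists; each one includes the primary key.
L1 = ("ssn", "name", "bdate", "sex")
L2 = ("ssn", "salary")


def project(relation, attrs):
    """pi_L(R): keep only the listed columns (no duplicates arise because L contains the key)."""
    return [{a: t[a] for a in attrs} for t in relation]


def is_complete_vertical(lists, attrs, pk):
    """True if the lists cover ATTRS(R) and any two lists share exactly PK(R)."""
    covers = set().union(*map(set, lists)) == set(attrs)
    share_pk_only = all(
        set(a) & set(b) == set(pk)
        for i, a in enumerate(lists)
        for b in lists[i + 1:]
    )
    return covers and share_pk_only


def reconstruct(fragments, pk):
    """Join the fragments on the primary key."""
    merged = {}
    for fragment in fragments:
        for t in fragment:
            merged.setdefault(tuple(t[k] for k in pk), {}).update(t)
    return list(merged.values())


fragments = [project(EMPLOYEE, L1), project(EMPLOYEE, L2)]
print(is_complete_vertical([L1, L2], ATTRS, PK))
print(reconstruct(fragments, PK) == EMPLOYEE)
```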

  19. Data Fragmentation, Replication and Allocation
     • Mixed (hybrid) fragmentation
       - A combination of vertical fragmentation and horizontal fragmentation.
       - This is achieved by SELECT-PROJECT operations, represented by Π_Li(σ_Ci(R)).
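A mixed fragment can be sketched as a selection followed by a projection. The relation, attribute names, and condition below are hypothetical; the point is only the order of operations, since the selection condition may refer to attributes (here `dno`) that the projection then drops.

```python
def select(relation, condition):
    """sigma_C(R): keep the tuples that satisfy C."""
    return [t for t in relation if condition(t)]


def project(relation, attrs):
    """pi_L(R): keep only the listed columns."""
    return [{a: t[a] for a in attrs} for t in relation]


def mixed_fragment(relation, attrs, condition):
    """pi_L(sigma_C(R)): select first, then project."""
    return project(select(relation, condition), attrs)


# Hypothetical EMPLOYEE tuples for illustration.
EMPLOYEE = [
    {"ssn": 1, "name": "Ann", "dno": 5},
    {"ssn": 2, "name": "Bob", "dno": 4},
    {"ssn": 3, "name": "Cy", "dno": 5},
]

frag = mixed_fragment(EMPLOYEE, ("ssn", "name"), lambda t: t["dno"] == 5)
print(frag)
```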

  20. Data Fragmentation, Replication and Allocation
     • Fragmentation schema
       - A definition of a set of fragments (horizontal, vertical, or mixed) from which the original database can be reconstructed.
     • Allocation schema
       - Distribution of fragments to the sites of the distributed database. The data can be fully replicated, partially replicated, or partitioned.
     • Data replication
       - Full replication: the database is replicated to all sites.
       - Partial replication: some selected part is replicated.
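One way to picture an allocation schema is as a mapping from each fragment to the set of sites holding a copy. The sketch below classifies such a mapping using the slide's three cases; the dictionary representation, the fragment names, and the site names are assumptions for illustration, and "partitioned" is read here as every fragment being stored at exactly one site.

```python
def classify_allocation(allocation, sites):
    """Classify an allocation given as {fragment name: set of sites holding a copy}."""
    all_sites = set(sites)
    copies = [set(s) for s in allocation.values()]
    if any(not c or not c <= all_sites for c in copies):
        raise ValueError("each fragment needs at least one copy, on known sites only")
    if all(c == all_sites for c in copies):
        return "fully replicated"      # every fragment is stored at every site
    if all(len(c) == 1 for c in copies):
        return "partitioned"           # every fragment is stored at exactly one site
    return "partially replicated"      # some fragments have several copies


SITES = {"S1", "S2", "S3"}  # hypothetical site names

print(classify_allocation({"EMP_D5": {"S1"}, "EMP_D4": {"S2"}}, SITES))
print(classify_allocation({"EMP_D5": {"S1", "S2"}, "EMP_D4": {"S2"}}, SITES))
print(classify_allocation({"EMP_D5": SITES, "EMP_D4": SITES}, SITES))
```

With a single site the first two cases coincide; this sketch reports such an allocation as fully replicated.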
