SLIDE 1
Structural and Computational Properties of Possibilistic Armstrong Databases
Seyeong Jeong, Haoming Ma, Ziheng Wei, Sebastian Link
The University of Auckland, New Zealand
ER 2020, Vienna, Austria
SLIDE 6
Context of the Work
Apply possibility theory to schema design for uncertain data
Develop computational support for acquiring possibilistic functional dependencies that serve as meaningful input to schema design
Mining constraints from data
Iterative example-based acquisition (Armstrong relations)
Validation of integrity requirements for uncertain data
SLIDE 7
The Big Picture
SLIDE 26
Possibilistic Relations and Functional Dependencies
β1-cut Σ1: MT → R
α4-cut r4: tuples with α ≥ α4
BCNF for Σ1:
  R1 = MTR with key MT
  R2 = MTP with key MTP
α4-lossless, no α4-redundancy, β1-preserving
SLIDE 28
Possibilistic Relations and Functional Dependencies
β2-cut Σ2: MT → R, RT → P
α3-cut r3: tuples with α ≥ α3
BCNF for Σ2:
  R1 = RTP with key RT
  R2 = RTM with key MT
α3-lossless, no α3-redundancy, β2-preserving
SLIDE 30
Possibilistic Relations and Functional Dependencies
β3-cut Σ3: MT → R, RT → P, PT → M
α2-cut r2: tuples with α ≥ α2
BCNF for Σ3:
  PTMR with keys MT, RT, PT
α2-lossless, no α2-redundancy, β3-preserving
SLIDE 32
Possibilistic Relations and Functional Dependencies
β4-cut Σ4: MT → R, RT → P, P → M
α1-cut r1: tuples with α = α1
BCNF for Σ4:
  R1 = PM with key P
  R2 = MTR with key MT
  R3 = RTP with key RT
α1-lossless, no α1-redundancy, β4-preserving
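The keys claimed for the four decompositions can be re-checked mechanically with attribute closure. The following is only a minimal sketch: the helper names `closure`, `is_key`, and `check_all` are illustrative and not from the talk, and the check covers key minimality only, not the BCNF, losslessness, or preservation claims.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under FDs given as (lhs, rhs) strings."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    return closed

def is_key(key, schema, fds):
    """True if key determines all of schema and no proper subset of key does."""
    if not set(schema) <= closure(key, fds):
        return False
    return all(
        not set(schema) <= closure(sub, fds)
        for n in range(len(key))
        for sub in combinations(key, n)
    )

# FD sets of the four beta-cuts (M = Manager, T = Time, R = Room, P = Project).
SIGMA = {
    1: [("MT", "R")],
    2: [("MT", "R"), ("RT", "P")],
    3: [("MT", "R"), ("RT", "P"), ("PT", "M")],
    4: [("MT", "R"), ("RT", "P"), ("P", "M")],
}

# (sub-schema, claimed key) pairs per cut, copied from the slides.
DECOMP = {
    1: [("MTR", "MT"), ("MTP", "MTP")],
    2: [("RTP", "RT"), ("RTM", "MT")],
    3: [("PTMR", "MT"), ("PTMR", "RT"), ("PTMR", "PT")],
    4: [("PM", "P"), ("MTR", "MT"), ("RTP", "RT")],
}

def check_all():
    """For each cut i, report whether every claimed key is a minimal key."""
    return {i: all(is_key(k, s, SIGMA[i]) for s, k in DECOMP[i]) for i in SIGMA}

print(check_all())
```

Computing closures under the full Σi and then restricting to the sub-schema is enough here, because an FD over a sub-schema is implied by the projected FD set exactly when it is implied by Σi.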
SLIDE 37
Computational Support for Finding Meaningful pFDs
Definition (Possibilistic Armstrong Relation): A p-relation r over a p-schema (R, α1 > ... > αk+1) is Armstrong for a given set Σ of pFDs over the p-schema if and only if for all pFDs ϕ over the p-schema, r satisfies ϕ if and only if Σ implies ϕ.

R = {Project, Time, Manager, Room}
β1 > β2 > β3 > β4 > β5
Σ consists of:
  (Manager, Time → Room, β1)
  (Room, Time → Project, β2)
  (Project, Time → Manager, β3)
  (Project → Manager, β4)
Perhaps, (Project → Manager, β3) should hold?

Possibilistic Armstrong Relation for Σ
Project  Time       Manager  Room  αi
Eagle    Mon, 9am   Ann      Aqua  α1
Hippo    Mon, 1pm   Ann      Aqua  α1
Kiwi     Mon, 1pm   Pete     Buff  α1
Kiwi     Tue, 2pm   Pete     Buff  α1
Lion     Tue, 4pm   Gill     Buff  α1
Lion     Wed, 9am   Gill     Cyan  α1
Lion     Wed, 11am  Bob      Cyan  α2
Lion     Wed, 11am  Jack     Cyan  α3
Lion     Wed, 11am  Pam      Lava  α3
Tiger    Wed, 11am  Pam      Lava  α4
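To see why this relation invites the question about (Project → Manager, β3), one can compute the highest certainty degree with which an FD holds in it. The sketch below assumes, as the cut pairings on the earlier slides suggest, that a pFD (X → Y, βi) is satisfied when X → Y holds on the cut r_{k+1−i} with k = 4. The function names and the integer encoding of degrees are illustrative choices, not part of the talk.

```python
# Example p-relation from the slide; the last entry i stands for degree alpha_i.
ROWS = [
    ("Eagle", "Mon, 9am",  "Ann",  "Aqua", 1),
    ("Hippo", "Mon, 1pm",  "Ann",  "Aqua", 1),
    ("Kiwi",  "Mon, 1pm",  "Pete", "Buff", 1),
    ("Kiwi",  "Tue, 2pm",  "Pete", "Buff", 1),
    ("Lion",  "Tue, 4pm",  "Gill", "Buff", 1),
    ("Lion",  "Wed, 9am",  "Gill", "Cyan", 1),
    ("Lion",  "Wed, 11am", "Bob",  "Cyan", 2),
    ("Lion",  "Wed, 11am", "Jack", "Cyan", 3),
    ("Lion",  "Wed, 11am", "Pam",  "Lava", 3),
    ("Tiger", "Wed, 11am", "Pam",  "Lava", 4),
]
COL = {"P": 0, "T": 1, "M": 2, "R": 3}  # Project, Time, Manager, Room
K = 4  # number of non-bottom degrees in this example

def cut(rows, i):
    """alpha_i-cut r_i: all tuples whose degree is alpha_1, ..., alpha_i."""
    return [t for t in rows if t[4] <= i]

def fd_holds(rows, lhs, rhs):
    """Classical FD check: equal lhs values force equal rhs values."""
    seen = {}
    for t in rows:
        key = tuple(t[COL[a]] for a in lhs)
        val = tuple(t[COL[a]] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

def pfd_holds(rows, lhs, rhs, beta):
    """Assumed semantics: (lhs -> rhs, beta_i) holds iff the FD holds on r_{K+1-i}."""
    return fd_holds(cut(rows, K + 1 - beta), lhs, rhs)

def best_beta(rows, lhs, rhs):
    """Index of the highest certainty degree at which the FD holds (K+1 if none)."""
    for beta in range(1, K + 1):
        if pfd_holds(rows, lhs, rhs, beta):
            return beta
    return K + 1

for lhs, rhs in [("MT", "R"), ("RT", "P"), ("PT", "M"), ("P", "M")]:
    print(lhs, "->", rhs, "holds with beta index", best_beta(ROWS, lhs, rhs))
```

Under these assumptions, Project → Manager holds only on r1: the tuple with manager Bob at degree α2 violates it, which is the kind of counterexample a domain expert would be asked to confirm or reject.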
SLIDE 42
Computing Possibilistic Armstrong Relations
Maximal sets maxΣi(R) of R for FD set Σi: the subsets X ⊆ R such that, for some attribute A ∈ R, X is maximal with the property that X → A is not implied by Σi
As classically: introduce tuples that realize maximal sets
Strategy:
For i = 1, . . . , k, compute maxi(R) for Σi − Σi−1 with Σ0 = ∅
For i = k, . . . , 1, realize sets in maxi(R) with tuples of degree αk+1−i
Algorithm returns a p-Armstrong relation for input pFD set Σ since every maximal set of Σi is an agree set in rk+1−i
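The definition of maximal sets and the first step of the strategy can be illustrated on the running example. This is a brute-force sketch that enumerates all attribute subsets and recomputes each family from scratch, rather than the incremental computation for Σi − Σi−1 named on the slide. The names `maximal_sets`, `families`, and `fresh_sets` are hypothetical.

```python
from itertools import combinations

ATTRS = "PTMR"  # Project, Time, Manager, Room
K = 4
# pFDs of the running example as (lhs, rhs, i) for certainty degree beta_i.
PFDS = [("MT", "R", 1), ("RT", "P", 2), ("PT", "M", 3), ("P", "M", 4)]

def closure(attrs, fds):
    """Attribute closure of attrs under FDs given as (lhs, rhs) strings."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    return closed

def maximal_sets(fds, attrs=ATTRS):
    """All X that are, for some A, maximal with X -> A not implied by fds."""
    result = set()
    for a in attrs:
        # Candidates: attribute sets whose closure does not contain a.
        cands = [
            frozenset(c)
            for n in range(len(attrs) + 1)
            for c in combinations(attrs, n)
            if a not in closure(c, fds)
        ]
        result |= {x for x in cands if not any(x < y for y in cands)}
    return result

def families(pfds, k=K):
    """max_i(R) for each beta_i-cut Sigma_i (FDs with certainty index <= i)."""
    return {
        i: maximal_sets([(l, r) for l, r, b in pfds if b <= i])
        for i in range(1, k + 1)
    }

def fresh_sets(fam, k=K):
    """Map degree index k+1-i to the sets of level i not seen at a level j > i."""
    out, seen = {}, set()
    for i in range(k, 0, -1):
        out[k + 1 - i] = fam[i] - seen
        seen |= fam[i]
    return out

fam = families(PFDS)
for degree, sets in sorted(fresh_sets(fam).items()):
    print("alpha", degree, sorted("".join(sorted(x)) for x in sets))
```

The exhaustive enumeration is exponential in the number of attributes, so it is only meant to make the definition concrete on a four-attribute schema.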
SLIDE 72
Example for Computing the Maximal Set Families
R = {Project, Time, Manager, Room} where Σ is:
  (Manager, Time → Room, β1)
  (Room, Time → Project, β2)
  (Project, Time → Manager, β3)
  (Project → Manager, β4)
Maximal set computation:
A        maxΣ1(A)  maxΣ2(A)  maxΣ3(A)  maxΣ4(A)
Project  {MTR}     {MR, T}   {MR, T}   {MR, T}
Time     {MRP}     {MRP}     {MRP}     {MRP}
Manager  {PTR}     {PTR}     {PR, T}   {R, T}
Room     {MP, PT}  {MP, PT}  {MP, T}   {MP, T}
αi       α4        α3        α2        α1
SLIDE 85
Example for Computing a Possibilistic Armstrong Relation
R = {Project, Time, Manager, Room} with Σ
  (Manager, Time → Room, β1)
  (Room, Time → Project, β2)
  (Project, Time → Manager, β3)
  (Project → Manager, β4)
Computation of p-Armstrong relation
  max4(R): MR, T, MRP, R, MP
  max3(R) − max4(R): PR
  max2(R) − ∪j=3,4 maxj(R): PTR, PT
  max1(R) − ∪j=2,3,4 maxj(R): MTR

Project  Time  Manager  Room  αi
0        0     0        0     α1
1        1     0        0     α1
2        1     2        2     α1
2        3     2        2     α1
4        4     4        2     α1
4        5     4        5     α1
4        6     6        5     α2
4        6     7        5     α3
4        6     8        8     α3
9        6     8        8     α4

Project  Time       Manager  Room  αi
Eagle    Mon, 9am   Ann      Aqua  α1
Hippo    Mon, 1pm   Ann      Aqua  α1
Kiwi     Mon, 1pm   Pete     Buff  α1
Kiwi     Tue, 2pm   Pete     Buff  α1
Lion     Tue, 4pm   Gill     Buff  α1
Lion     Wed, 9am   Gill     Cyan  α1
Lion     Wed, 11am  Bob      Cyan  α2
Lion     Wed, 11am  Jack     Cyan  α3
Lion     Wed, 11am  Pam      Lava  α3
Tiger    Wed, 11am  Pam      Lava  α4
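The numeric tuples above appear to follow a simple chaining rule: each new tuple copies the previous tuple on the maximal set it realizes and takes a fresh value (its own row index) everywhere else. The sketch below reproduces the table under that reading. It assumes the first tuple is 0 in every column, and the names `TO_REALIZE`, `realize`, and `agree` are illustrative only.

```python
ATTRS = "PTMR"  # column order: Project, Time, Manager, Room

# Sets to realize per degree index, in the order listed in the example.
TO_REALIZE = [
    (1, ["MR", "T", "MRP", "R", "MP"]),  # max4(R), degree alpha_1
    (2, ["PR"]),                         # max3(R) minus max4(R), degree alpha_2
    (3, ["PTR", "PT"]),                  # fresh sets of max2(R), degree alpha_3
    (4, ["MTR"]),                        # fresh set of max1(R), degree alpha_4
]

def realize(to_realize, attrs=ATTRS):
    """Build (values, degree) rows; row j agrees with row j-1 exactly on its set."""
    rows = [([0] * len(attrs), 1)]  # assumed starting tuple of degree alpha_1
    for degree, sets in to_realize:
        for x in sets:
            prev = rows[-1][0]
            j = len(rows)  # fresh value, unused in any earlier row
            new = [prev[c] if a in x else j for c, a in enumerate(attrs)]
            rows.append((new, degree))
    return rows

def agree(t, u, attrs=ATTRS):
    """Agree set of two value lists: attributes on which they coincide."""
    return {a for c, a in enumerate(attrs) if t[c] == u[c]}

ROWS = realize(TO_REALIZE)
for values, degree in ROWS:
    print(*values, "alpha", degree)
```

Replacing each number by a distinct domain value per column (0 → Eagle, 1 → Hippo, and so on for Project) yields the concrete relation shown with project names, times, managers, and rooms.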
SLIDE 86
Experiments for fixed number k of p-degrees in schema size n
For fixed k, the output size and times are both low-degree polynomial in n (same average behavior as exhibited for certain data)
SLIDE 87
Experiments for fixed schema size n in number k of p-degrees
For fixed n, the output size is logarithmic in k and the computation time is constant in k
Size growth results from significant number of maximal sets realized by small k
Computation time is agnostic to k, as each FD is visited only once
SLIDE 88
Tool
SLIDE 89
Summary
Perfect sample: pFDs exhibit highest certainty degree by which they are implied
Computational support for identifying meaningful possibilistic FDs
The latter serve as input to schema design algorithms for uncertain data
Size of samples grows logarithmically in number of available degrees
Computation time for samples is constant in number of available degrees
Future work
How to combine sampling with discovery to identify dirty data and meaningful rules?
How do human experts best benefit from the computational support?