
Repairing Entities using Star Constraints in Multi-relational Graphs

Repairing Entities using Star Constraints in Multi-relational Graphs. Peng Lin (1), Qi Song (1), Yinghui Wu (2,3), Jiaxing Pi (4). Erroneous entities: how to capture? Multi-relational graphs: a labeled graph with attributes on nodes.


  1. Repairing Entities using Star Constraints in Multi-relational Graphs
     Peng Lin (1), Qi Song (1), Yinghui Wu (2,3), Jiaxing Pi (4)

  2. Erroneous entities: how to capture?
     - Multi-relational graphs: a labeled graph with attributes on nodes
     [Figure: Graph G, a football database. Player v_0 (name: VanPersie) connects through playsFor, teammate, and coachedBy edges to a Coach (name: Wenger), two Clubs (name: AFC; name: MU), and a Player (name: Rooney). Edges labeled operates, trainsAt, and worksAt lead to the Stadium and Facility nodes v_1 to v_4, whose (name, owner, city) values are (ATC, AHP, LDN), (EM, AHP, BZ), (OT, MUP, MAN), and (AON, MUP, LD).]

  3. Erroneous entities: how to capture?
     - Multi-relational graphs: a labeled graph with attributes on nodes
     - Entity errors: incorrect node attributes
     [Figure: Graph G, a football database]

  4. Erroneous entities: how to capture?
     - Multi-relational graphs: a labeled graph with attributes on nodes
     - Entity errors: incorrect node attributes
     - Semantics: relevant paths from a center node
       "For a stadium and a facility relevant to a Premier League player (v_0): if they have the same owner, then they should be located in the same city."
     [Figure: Graph G, a football database]

  5. Regular path queries
     - Regular expressions: R = l | l^{-1} | R · R | R ∪ R
     [Figure: Graph G, a football database]

  6. Regular path queries
     - Regular expressions: R = l | l^{-1} | R · R | R ∪ R
     - Paths from Player to Stadium: R_1 = (playsFor · operates) ∪ (coachedBy · worksAt)
     [Figure: Graph G, a football database]

  7. Regular path queries
     - Regular expressions: R = l | l^{-1} | R · R | R ∪ R
     - Paths from Player to Stadium: R_1 = (playsFor · operates) ∪ (coachedBy · worksAt)
     - Paths from Player to Facility: R_2 = (playsFor · operates) ∪ (teammate^{-1} · trainsAt)
     [Figure: Graph G, a football database]
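The two path expressions above can be evaluated with a simple label-by-label traversal. The following is a minimal sketch, not the authors' implementation: the edge list, the node names used as identifiers, the `NODE_LABEL` table, and the helper names `build_index`, `follow`, and `evaluate` are all assumptions made for illustration, and the topology is only an assumed reading of the football example. An inverse label such as teammate^{-1} is handled by indexing every edge in both directions.

```python
from collections import defaultdict

# Hypothetical toy graph: (source, edge label, target). The topology is an
# assumed reading of the football example, not the exact figure.
EDGES = [
    ("VanPersie", "playsFor", "MU"),
    ("VanPersie", "coachedBy", "Wenger"),
    ("Rooney", "teammate", "VanPersie"),
    ("MU", "operates", "OT"),
    ("Wenger", "worksAt", "EM"),
    ("Rooney", "trainsAt", "AON"),
]
# Assumed node labels for the target nodes.
NODE_LABEL = {"OT": "Stadium", "EM": "Stadium", "AON": "Facility"}


def build_index(edges):
    """Index edges by label; 'l^-1' traverses an l-edge backwards."""
    index = defaultdict(lambda: defaultdict(set))
    for src, label, dst in edges:
        index[label][src].add(dst)
        index[label + "^-1"][dst].add(src)
    return index


def follow(index, start, path):
    """Nodes reached from `start` by following the labels in `path` in order."""
    frontier = {start}
    for label in path:
        frontier = {dst for node in frontier for dst in index[label][node]}
    return frontier


def evaluate(index, start, union_of_paths, target_label):
    """Evaluate a union of concatenations and keep nodes with `target_label`."""
    reached = set()
    for path in union_of_paths:
        reached |= follow(index, start, path)
    return {n for n in reached if NODE_LABEL.get(n) == target_label}


# Each expression is a union (list) of concatenations (tuples of labels).
R1 = [("playsFor", "operates"), ("coachedBy", "worksAt")]
R2 = [("playsFor", "operates"), ("teammate^-1", "trainsAt")]

INDEX = build_index(EDGES)
stadiums = evaluate(INDEX, "VanPersie", R1, "Stadium")
facilities = evaluate(INDEX, "VanPersie", R2, "Facility")
print(sorted(stadiums), sorted(facilities))  # ['EM', 'OT'] ['AON']
```

Representing an expression as a list of label tuples covers only unions of concatenations, which is all the two example expressions need; a general regular path query evaluator would need an automaton-based traversal instead.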

  8. Contributions
     StarRepair framework: Graph G, StarFDs Σ → Error detection (G does not satisfy Σ) → Repair (G′ satisfies Σ) → Repair G′

  9. Contributions
     - StarFDs: star functional dependencies, new constraints for graphs
     - Entity repair problem: minimum editing cost, NP-hard and APX-hard
     - Feasible framework with provable guarantees whenever possible
     StarRepair framework: Graph G, StarFDs Σ → Error detection (G does not satisfy Σ) → Repair (G′ satisfies Σ) → Repair G′

  10. Contributions
     - StarFDs: star functional dependencies, new constraints for graphs
     - Entity repair problem: minimum editing cost, NP-hard and APX-hard
     - Feasible framework with provable guarantees whenever possible
     StarRepair framework: Graph G, StarFDs Σ → Error detection (G does not satisfy Σ) → Repair (G′ satisfies Σ) → Repair G′
     Repair workflow:
     - Is approximable? No: heuristic solution. Yes: check whether it is optimal repairable.
     - Is optimal repairable? Yes: optimal solution. No: approximation solution.

  11. Star constraints
     - StarFDs: φ = (P(u_0), X → Y)
     - Star pattern: P(u_0)
     - Value constraints: X → Y

  12. Star constraints
     - StarFDs: φ = (P(u_0), X → Y)
     - Star pattern P(u_0):
         - A two-level tree with center node u_0
         - Each branch is a regular expression
     - Value constraints: X → Y
     [Figure: star pattern with center u_0 (Player), branch R_1 to u_1 (Stadium) and branch R_2 to u_2 (Facility)]
     R_1 = (playsFor · operates) ∪ (coachedBy · worksAt)
     R_2 = (playsFor · operates) ∪ (teammate^{-1} · trainsAt)

  13. Star constraints
     - StarFDs: φ = (P(u_0), X → Y)
     - Star pattern P(u_0):
         - A two-level tree with center node u_0
         - Each branch is a regular expression
     - Value constraints: X → Y
         - X and Y are two sets of literals
         - Literals: u.A = c, or u.A = u′.A′
     [Figure: star pattern with center u_0 (Player), branch R_1 to u_1 (Stadium) and branch R_2 to u_2 (Facility)]
     R_1 = (playsFor · operates) ∪ (coachedBy · worksAt)
     R_2 = (playsFor · operates) ∪ (teammate^{-1} · trainsAt)
     X: u_0.league = EPL, u_1.owner = u_2.owner
     Y: u_1.city = u_2.city

  14. Star constraints
     - Matching semantics: maximum set matched by star pattern
     [Figure: star pattern P(u_0) with center u_0 (Player), branch R_1 to u_1 (Stadium) and branch R_2 to u_2 (Facility)]
     X: u_0.league = EPL, u_1.owner = u_2.owner
     Y: u_1.city = u_2.city

  15. Star constraints
     - Matching semantics: maximum set matched by star pattern
         - u_0 matches v_0
         - u_1 matches v_1 and v_4
         - u_2 matches v_2 and v_3
     [Figure: star pattern P(u_0) shown next to Graph G, a football database]
     X: u_0.league = EPL, u_1.owner = u_2.owner
     Y: u_1.city = u_2.city

  16. Star constraints
     - Matching semantics: maximum set matched by star pattern
         - u_0 matches v_0
         - u_1 matches v_1 and v_4
         - u_2 matches v_2 and v_3
     - Inconsistencies I: matches for which X holds but Y does not hold
     [Figure: star pattern P(u_0) shown next to Graph G, a football database]
     X: u_0.league = EPL, u_1.owner = u_2.owner
     Y: u_1.city = u_2.city
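Once the star pattern has been matched, detecting inconsistencies amounts to checking the value constraint on every (stadium, facility) pair around the center player. The sketch below is illustrative only: the (name, owner, city) values are taken from the example figure, but which node counts as a Stadium or a Facility, the player's `league` value, and the function name `find_inconsistencies` are assumptions. It reports the pairs for which the premise (Premier League player, same owner) holds while the consequence (same city) fails.

```python
from itertools import product

# Attribute values follow the example figure; the Stadium/Facility assignment
# and the player's league value are assumptions made for this sketch.
CENTER = {"name": "VanPersie", "league": "EPL"}
STADIUMS = [
    {"name": "EM", "owner": "AHP", "city": "BZ"},
    {"name": "OT", "owner": "MUP", "city": "MAN"},
]
FACILITIES = [
    {"name": "ATC", "owner": "AHP", "city": "LDN"},
    {"name": "AON", "owner": "MUP", "city": "LD"},
]


def find_inconsistencies(center, stadiums, facilities):
    """Return (stadium, facility) name pairs where the premise holds but the consequence fails."""
    # Premise, part 1: the center player must be in the EPL.
    if center.get("league") != "EPL":
        return []
    violations = []
    for stadium, facility in product(stadiums, facilities):
        # Premise, part 2: the stadium and the facility share an owner.
        premise_holds = stadium["owner"] == facility["owner"]
        # Consequence: they must then be located in the same city.
        consequence_holds = stadium["city"] == facility["city"]
        if premise_holds and not consequence_holds:
            violations.append((stadium["name"], facility["name"]))
    return violations


print(find_inconsistencies(CENTER, STADIUMS, FACILITIES))
# [('EM', 'ATC'), ('OT', 'AON')]
```

With these values both same-owner pairs disagree on the city, so each pair is reported as an inconsistency.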

  17. Summary of results
     - Satisfiability. Input: Σ. Decide whether there exists a graph G that satisfies Σ. Hardness: NP-complete.
     - Implication. Input: Σ and φ. Decide whether every G that satisfies Σ also satisfies φ. Hardness: coNP-hard.
     - Error detection. Input: G and Σ. Output: all inconsistencies I. Hardness: PTIME. Solution: evaluate regular path queries and validate values (validation); time complexity O(|Σ||V| + |V|(|V| + |E|)).
     - Repair. Input: Σ and G that does not satisfy Σ. Output: G′ that satisfies Σ with least repair cost. Hardness: NP-hard, APX-hard. Solution:
         - Approximable cases (PTIME checkable): time complexity O(|I||Σ|^? + |I|(|I||Σ|^? + |I||Σ|)); approximation ratio |I||Σ|^?
         - Optimal cases: time complexity O(|I||Σ|)
         - Heuristic cases: time complexity O(|I||Σ|^? + |I|(|I||Σ|^? + |I||Σ|)); bounded repairable: cost ≤ |I|
     (^? marks an exponent that cannot be recovered from the slide text.)
     Notations: G: graph; V: nodes; E: edges; Σ: a set of StarFDs; φ: a single StarFD; I: all inconsistencies.

  18. Updates and repairs
     - Updates O: operators o = (v.A, a, c) with editing cost cost(O) = ∑_{o ∈ O} cost(o)
     - Repair O: applying O to G yields a graph G′ that satisfies Σ
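The update model can be illustrated with a few lines of code. This is a sketch under stated assumptions rather than the authors' cost model: it reads an operator as "change attribute A of node v from its current value to a new value", charges a unit cost per changed value (the slides do not define the cost function), and uses hypothetical names (`Update`, `cost`, `total_cost`, `apply_updates`). The choice to repair the stadium's city rather than the facility's is also arbitrary.

```python
from typing import NamedTuple


class Update(NamedTuple):
    node: str  # node whose attribute is edited
    attr: str  # attribute name
    old: str   # value before the edit
    new: str   # value after the edit


def cost(update: Update) -> int:
    """Assumed unit cost: 1 for every value that actually changes."""
    return 0 if update.old == update.new else 1


def total_cost(updates) -> int:
    """Editing cost of a set of updates: the sum of the individual costs."""
    return sum(cost(u) for u in updates)


def apply_updates(graph, updates):
    """Return a repaired copy of `graph` (a node -> attribute dict mapping)."""
    repaired = {node: dict(attrs) for node, attrs in graph.items()}
    for u in updates:
        # Refuse an update whose expected old value does not match the graph.
        if repaired[u.node][u.attr] != u.old:
            raise ValueError(f"{u.node}.{u.attr} is not {u.old!r}")
        repaired[u.node][u.attr] = u.new
    return repaired


GRAPH = {
    "EM": {"owner": "AHP", "city": "BZ"},
    "ATC": {"owner": "AHP", "city": "LDN"},
}
UPDATES = [Update("EM", "city", "BZ", "LDN")]  # hypothetical choice of repair
REPAIRED = apply_updates(GRAPH, UPDATES)
print(REPAIRED["EM"]["city"], total_cost(UPDATES))  # LDN 1
```

After the update, the two nodes with the same owner agree on the city, so the same-owner, same-city constraint of the running example holds for this pair at a total cost of 1. The original graph is left unchanged because the function works on a copy.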
