A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs

slide-1
SLIDE 1

A Dichotomy for Homomorphism-Closed Queries on Probabilistic Graphs

Antoine Amarilli (Télécom Paris) and İsmail İlkan Ceylan (University of Oxford)

September 16, 2020

1/7

slide-8
SLIDE 8

Uncertain data management

In this talk, we manage data represented as a labeled graph

WorksAt: (Antoine, Télécom Paris), (Antoine, Paris Sud), (Benny, Paris Sud), (Benny, Technion), (İsmail, U. Oxford)

MemberOf: (Télécom Paris, ParisTech), (Télécom Paris, IP Paris), (Paris Sud, IP Paris), (Paris Sud, Paris-Saclay), (Technion, CESAER)

[Figure: the same facts drawn as a labeled graph on the vertices Antoine, Benny, İsmail, Télécom Paris, Paris Sud, Technion, U. Oxford, ParisTech, IP Paris, Paris-Saclay, CESAER]

→ Problem: we are not certain about the true state of the data

2/7

slide-15
SLIDE 15

Uncertain data model

[Figure: the same graph, with the people abbreviated to A., B., İ., and its edges annotated with the probabilities 80%, 10%, 40%, 80%, 100%, 90%, 90%, 50%, 90%, 100%]

  • Uncertain data model: TID, for tuple-independent database
  • Each fact (edge) carries a probability
  • Each fact exists with its given probability
  • All facts are independent
  • Possible world W: subset of facts
  • Probability of W:

$$\Pr(W) = \Bigl(\prod_{F \in W} \Pr(F)\Bigr) \times \Bigl(\prod_{F \notin W} \bigl(1 - \Pr(F)\bigr)\Bigr)$$

3/7
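The probability of a possible world can be sketched in a few lines of Python. This is only an illustration: the dictionary `tid`, the function name `world_probability`, and the three probabilities are assumptions made for the example, and the assignment of probabilities to edges is not taken from the figure.

```python
from math import prod

# Hypothetical TID: each fact (label, source, target) maps to its probability.
# These numbers are made up for illustration, not read off the slide's figure.
tid = {
    ("WorksAt", "Antoine", "Paris Sud"): 0.8,
    ("WorksAt", "Benny", "Technion"): 0.5,
    ("MemberOf", "Technion", "CESAER"): 1.0,
}

def world_probability(tid, world):
    """Pr(W): product of Pr(F) over facts in W, times product of 1 - Pr(F) over facts not in W."""
    present = prod(p for fact, p in tid.items() if fact in world)
    absent = prod(1.0 - p for fact, p in tid.items() if fact not in world)
    return present * absent

# A possible world is any subset of the facts.
world = {
    ("WorksAt", "Antoine", "Paris Sud"),
    ("MemberOf", "Technion", "CESAER"),
}
print(world_probability(tid, world))  # 0.8 * 1.0 * (1 - 0.5)
```

Because all facts are independent, the probabilities of the 2^n possible worlds of an n-fact TID sum to 1.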
slide-20
SLIDE 20

Homomorphism-closed queries

  • Query: maps a graph (without probabilities) to YES/NO
  • Conjunctive query (CQ): can I find a match of a pattern? e.g., [figure: a pattern on the variables x, y, z]
  • Union of conjunctive queries (UCQ): does one of the CQs match?

→ Homomorphism-closed query Q: if G satisfies Q and G has a homomorphism to G′, then G′ also satisfies Q

They generalize CQs and UCQs, but also regular path queries (RPQs), Datalog, etc.

4/7
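The definition can be made concrete with a brute-force sketch in Python. Everything named here is an assumption for illustration: graphs are encoded as sets of (label, source, target) triples, `has_homomorphism` and `cq` are hypothetical helpers, and the two-edge pattern is a guess at a plausible example, since the extracted slide only shows the variable names x, y, z.

```python
from itertools import product

def has_homomorphism(g, h):
    """True if some vertex map sends every labeled edge of g to an edge of h.

    Graphs are sets of (label, source, target) triples. This tries every
    vertex map, so it is only suitable for tiny examples.
    """
    g_vertices = sorted({v for _, s, t in g for v in (s, t)})
    h_vertices = sorted({v for _, s, t in h for v in (s, t)})
    for image in product(h_vertices, repeat=len(g_vertices)):
        f = dict(zip(g_vertices, image))
        if all((label, f[s], f[t]) in h for label, s, t in g):
            return True
    return False

def cq(pattern):
    """A Boolean CQ seen as a query: a graph satisfies it iff the pattern maps into it."""
    return lambda graph: has_homomorphism(pattern, graph)

# Hypothetical pattern: x -WorksAt-> y -MemberOf-> z.
q = cq({("WorksAt", "x", "y"), ("MemberOf", "y", "z")})

g = {("WorksAt", "Benny", "Technion"), ("MemberOf", "Technion", "CESAER")}
g_prime = {("WorksAt", "a", "b"), ("MemberOf", "b", "b")}

print(q(g))                          # g satisfies q
print(has_homomorphism(g, g_prime))  # g maps into g_prime
print(q(g_prime))                    # so g_prime satisfies q as well
```

CQs are closed under homomorphisms because homomorphisms compose: a match of the pattern in G followed by a homomorphism from G to G′ is a match of the pattern in G′.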

slide-24
SLIDE 24

Problem statement: Probabilistic query evaluation (PQE)

Here is the problem PQE(Q):

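  • We fix a query Q: [figure: a pattern on the variables x, y, z]
  • The input is a TID D: [figure: the probabilistic graph on A., B., İ., Télécom Paris, Paris Sud, Technion, U. Oxford, ParisTech, IP Paris, Paris-Saclay, CESAER, with edge probabilities 80%, 10%, 40%, 80%, 100%, 90%, 90%, 50%, 90%, 100%]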

  • The output is the probability that the query is true

→ Question: What is the complexity of PQE(Q) depending on the query Q?

5/7
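A naive way to solve PQE(Q) is to sum the probabilities of the possible worlds that satisfy Q. The sketch below does exactly that, under assumptions made for illustration only: the function names, the three-fact TID, its probabilities, and the two-edge query are all hypothetical.

```python
from itertools import combinations
from math import prod

def pqe_brute_force(query, tid):
    """Sum Pr(W) over all possible worlds W (subsets of facts) that satisfy the query."""
    facts = list(tid)
    total = 0.0
    for k in range(len(facts) + 1):
        for chosen in combinations(facts, k):
            world = set(chosen)
            if query(world):
                total += prod(tid[f] if f in world else 1.0 - tid[f] for f in facts)
    return total

def works_at_then_member_of(world):
    """Hypothetical query: is there a WorksAt edge followed by a MemberOf edge?"""
    return any(
        l1 == "WorksAt" and l2 == "MemberOf" and t1 == s2
        for (l1, _, t1) in world
        for (l2, s2, _) in world
    )

# Hypothetical TID; the probabilities are made up for the example.
tid = {
    ("WorksAt", "Benny", "Technion"): 0.5,
    ("MemberOf", "Technion", "CESAER"): 0.9,
    ("WorksAt", "Benny", "Paris Sud"): 0.4,
}

# The query holds exactly when the first two facts are both present,
# so the result is 0.5 * 0.9 up to floating-point rounding.
print(pqe_brute_force(works_at_then_member_of, tid))
```

This enumeration takes time exponential in the number of facts, which is why the question is for which queries Q the problem can be solved in polynomial time.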

slide-29
SLIDE 29

Results on PQE

Existing dichotomy for unions of conjunctive queries (UCQs):

Theorem [Dalvi and Suciu, 2012]

  • Some UCQs Q are safe and PQE(Q) is in PTIME
  • All others are unsafe and PQE(Q) is #P-hard

We study PQE for homomorphism-closed queries and show:

Theorem [Amarilli and Ceylan, 2020] For any query Q closed under homomorphisms:

  • Either Q is equivalent to a safe UCQ and PQE(Q) is in PTIME
  • In all other cases, PQE(Q) is #P-hard

So bad news: every homomorphism-closed query is #P-hard unless it is equivalent to a safe UCQ

6/7
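To give a feel for the tractable side, here is a sketch for one very simple query that is not from the talk: "some edge with a given label exists". Since facts are independent, its probability has the closed form 1 − ∏(1 − p) over the edges with that label, which is computed in linear time. The function names and the TID below are assumptions for illustration, and the brute-force version is included only as a cross-check.

```python
from itertools import product
from math import prod

def prob_some_edge(tid, label):
    """Closed form: Pr(at least one edge with this label exists) = 1 - prod(1 - p)."""
    return 1.0 - prod(1.0 - p for (l, _, _), p in tid.items() if l == label)

def prob_some_edge_brute_force(tid, label):
    """Same probability, obtained by summing over all possible worlds."""
    facts = list(tid)
    total = 0.0
    for bits in product([False, True], repeat=len(facts)):
        if any(b and f[0] == label for f, b in zip(facts, bits)):
            total += prod(tid[f] if b else 1.0 - tid[f] for f, b in zip(facts, bits))
    return total

# Hypothetical TID; the probabilities are made up for the example.
tid = {
    ("WorksAt", "Antoine", "Paris Sud"): 0.8,
    ("WorksAt", "Benny", "Paris Sud"): 0.4,
    ("MemberOf", "Paris Sud", "Paris-Saclay"): 0.9,
}

print(prob_some_edge(tid, "WorksAt"))              # 1 - 0.2 * 0.6
print(prob_some_edge_brute_force(tid, "WorksAt"))  # agrees up to rounding
```

For the #P-hard queries, no such polynomial-time shortcut is expected to exist.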

slide-32
SLIDE 32

What’s next?

  • The result only applies to graphs, not higher-arity databases
  • We conjecture that it holds for arbitrary arity
  • Adapting to unweighted PQE, where all probabilities are 1/2?
  • We have a recent result on non-hierarchical self-join-free CQs [Amarilli and Kimelfeld, 2020]
  • Recent paper by Suciu and Kenig [Kenig and Suciu, 2020]

Thanks for your attention!

tcs4f.org

nofreeviewnoreview.org

7/7

slide-33
SLIDE 33

References i

Amarilli, A. and Ceylan, I. I. (2020). A dichotomy for homomorphism-closed queries on probabilistic graphs. In ICDT.

Amarilli, A. and Kimelfeld, B. (2020). Uniform reliability of self-join-free conjunctive queries. arXiv preprint arXiv:1908.07093.

Dalvi, N. and Suciu, D. (2012). The dichotomy of probabilistic inference for unions of conjunctive queries. J. ACM, 59(6).

slide-34
SLIDE 34

References ii

Kenig, B. and Suciu, D. (2020). A dichotomy for the generalized model counting problem for unions of conjunctive queries. arXiv preprint arXiv:2008.00896.