SLIDE 1

Causality and Algebraic Geometry Andrew Critch, UC Berkeley critch@math.berkeley.edu

Causality and Algebraic Geometry

Andrew Critch

UC Berkeley

September, 2012

SLIDE 2

Outline

1. Coin- and die-biasing games
2. The causal inference problem
3. Macaulay2 Demonstration...
4. Finer algebraic invariants
5. Binary hidden Markov models
6. MPS-entangled qubits
7. Further reading

SLIDE 3

1. Coin- and die-biasing games

SLIDE 4

Coin-biasing games

Consider a game consisting of coin flips where earlier coin outcomes affect the biases of later coins in a prescribed way.

(Imagine I have some clear, heavy plastic that I can stick to the later coins to give them any bias I want, on the fly.)

SLIDE 5

Example: “DACB”

We can specify a coin biasing game with a diagram of how the coins influence each other, i.e. a graph on the coin names with a list of biases called a conditional probability table (CPT), e.g.:

SLIDE 6

Example: “DACB”

Here, the first two coin flips are from two different coins, and the outcomes (0 or 1) are labelled D and A. (I’m using the letters out of sequence on purpose.)
SLIDE 7

Example: “DACB”

Based on the outcome DA, a bias is chosen for another coin, which we flip, labelling its outcome C. Similarly, C determines a bias for the B coin.

SLIDE 8

Thus, a coin-biasing game is specified by data (V, G, Θ), where:

  • V is a set of binary random variables,
  • G is a directed acyclic graph (DAG) called the structure, whose vertices are the variables, and
  • Θ is a conditional probability table (CPT) specifying the values $P(V_i = v \mid \mathrm{parents}(V_i) = w)$ for all i, v, and w.

Note: without the binarity restriction, this is the definition of a Bayesian network or Bayes net [J. Pearl, 1985].
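The triple (V, G, Θ) can be sketched as plain data plus a sampler. This is only an illustration: the slides give no numerical biases, so every number in `CPT` is made up, and the names `PARENTS`, `CPT`, `ORDER`, and `play_once` are hypothetical. The graph is the “DACB” structure as described in the text (D and A flipped first, both feeding C, which feeds B).

```python
import random

# G: each variable mapped to its parents (D -> C <- A, C -> B).
PARENTS = {"D": (), "A": (), "C": ("D", "A"), "B": ("C",)}

# Theta: P(variable = 1 | parent values). All biases are invented for illustration.
CPT = {
    "D": {(): 0.5},
    "A": {(): 0.6},
    "C": {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.7, (1, 1): 0.9},
    "B": {(0,): 0.8, (1,): 0.2},
}
ORDER = ["D", "A", "C", "B"]  # a topological order of the DAG


def play_once(rng):
    """Flip the coins in topological order and return a receipt in the order A, B, C, D."""
    value = {}
    for name in ORDER:
        parent_values = tuple(value[p] for p in PARENTS[name])
        bias = CPT[name][parent_values]
        value[name] = 1 if rng.random() < bias else 0
    return "".join(str(value[v]) for v in "ABCD")


rng = random.Random(0)
receipts = [play_once(rng) for _ in range(10_000)]
```

Each entry of `receipts` is a four-character string such as `"0110"`, which is the only thing an outside observer of the game gets to see.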

SLIDE 9

Now suppose the “DACB” game is running inside a box, but we don’t know its structure graph G or the CPT parameters Θ. Each time it runs, it prints out a receipt showing the values of the variables A, B, C, and D, in that order, but nothing else.

Say we got 10,000 such receipts, from which we estimate a probability table for the 16 possible outcomes...

SLIDE 10

P(0000) = 0.0343   P(0001) = 0.0261   P(0010) = 0.0162   P(0011) = 0.0520
P(0100) = 0.1437   P(0101) = 0.1125   P(0110) = 0.0038   P(0111) = 0.0126
P(1000) = 0.0508   P(1001) = 0.0353   P(1010) = 0.0500   P(1011) = 0.0909
P(1100) = 0.1919   P(1101) = 0.1438   P(1110) = 0.0122   P(1111) = 0.0239

From this probability table we can infer any correlational relationships we want. How about causality?

Stats 101 quiz: From the probabilities alone, can we infer G, the causal structure of the game? What extra information is needed?
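As a rough illustration of “inferring correlational relationships” from the table, the sketch below computes two covariances from the 16 numbers. It is a crude look, not a statistical test. The helper names are hypothetical, and the 0.0909 entry is taken to be P(1011), the only outcome otherwise missing from the list of 16.

```python
# Estimated probabilities from the slide, keyed by the receipt string "abcd".
# The 0.0909 entry is read as P(1011), the one outcome otherwise absent.
TABLE = {
    "0000": 0.0343, "0001": 0.0261, "0010": 0.0162, "0011": 0.0520,
    "0100": 0.1437, "0101": 0.1125, "0110": 0.0038, "0111": 0.0126,
    "1000": 0.0508, "1001": 0.0353, "1010": 0.0500, "1011": 0.0909,
    "1100": 0.1919, "1101": 0.1438, "1110": 0.0122, "1111": 0.0239,
}


def marginal(table, positions):
    """Sum the table down to the variables at the given receipt positions."""
    out = {}
    for outcome, prob in table.items():
        key = tuple(int(outcome[i]) for i in positions)
        out[key] = out.get(key, 0.0) + prob
    return out


def covariance(table, i, j):
    """P(X_i = 1, X_j = 1) - P(X_i = 1) P(X_j = 1) for receipt positions i and j."""
    pij = marginal(table, (i, j))[(1, 1)]
    pi = marginal(table, (i,))[(1,)]
    pj = marginal(table, (j,))[(1,)]
    return pij - pi * pj


A, B, C, D = 0, 1, 2, 3
total = sum(TABLE.values())
cov_ad = covariance(TABLE, A, D)  # small in magnitude
cov_ac = covariance(TABLE, A, C)  # noticeably larger in magnitude
```

With these numbers the A–D covariance is several times smaller in magnitude than the A–C covariance, which is at least consistent with A and D being independent coins that both influence C.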
SLIDE 14

Nothing more is needed!

The probability data alone is enough information to reliably distinguish the causal structure G of the “DACB” game from other structures on 4 binary variables.

The reason is that, by arising from G, the 16 probabilities $p_{0000}, p_{0001}, \ldots, p_{1111}$ are forced to satisfy a system of 13 polynomial equations $f_j = 0$ which encode [SECRET!] properties readable from the graph that do not depend on the CPT Θ [Pistone, Riccomagno, Wynn, 2001].

These equations are almost never satisfied by coin-biasing games arising from other graphs that aren’t subgraphs of G, and coin-biasing games arising from G almost never satisfy the conditional independence properties of its proper subgraphs.
SLIDE 16

For now, the take-away is:

SLIDE 17

2. The causal inference problem

SLIDE 18

Wait, what does causality even mean?

In short, causality is the extent to which we can employ directed graphical models to predict and control real-world phenomena. I.e., it’s how well we can pretend nature is a die-biasing game.

Definition [J. Pearl, 2000; awarded the 2011 Turing Award]: A (fully specified) causal theory is defined by an ordered triple (V, G, Θ): a set of random variables, a DAG on those variables, and a compatible CPT. If not all of V is observed, often a subset O ⊂ V of observed variables is also specified, and the others are called hidden variables.

(Note: This formal framework is enough to discuss any other notion of causality I’ve seen, including all those listed on the Wikipedia and Stanford Encyclopedia of Philosophy entries.)

SLIDE 19

Wait, what does causality even mean?

A joint probability distribution P on the random variables V is generated by (G, Θ) in the obvious way (like a die-biasing game),

$$P(v_1 \ldots v_n) = \prod_i P(v_i \mid \mathrm{parents}(v_i)).$$

With this framework in place, we can say that

  • causal hypotheses are partial specifications of causal theories. For example, perhaps only (V, G) is described, or only part of G.
  • causal inference is the problem of recovering information about (G, Θ) from the probabilities P or other partial information.
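The product formula can be evaluated directly. The sketch below does so for the “DACB” structure with invented biases (the slides give none); `PARENTS`, `CPT`, `prob`, and `joint` are hypothetical names, and outcomes are indexed in the order A, B, C, D.

```python
from itertools import product

PARENTS = {"D": (), "A": (), "C": ("D", "A"), "B": ("C",)}
# P(variable = 1 | parent values); all numbers are made up for illustration.
CPT = {
    "D": {(): 0.5},
    "A": {(): 0.6},
    "C": {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.7, (1, 1): 0.9},
    "B": {(0,): 0.8, (1,): 0.2},
}


def prob(var, val, assignment):
    """P(var = val | parents(var)) read off the CPT."""
    p1 = CPT[var][tuple(assignment[p] for p in PARENTS[var])]
    return p1 if val == 1 else 1.0 - p1


def joint(assignment):
    """P(v_1 ... v_n) as the product over i of P(v_i | parents(v_i))."""
    result = 1.0
    for var, val in assignment.items():
        result *= prob(var, val, assignment)
    return result


names = ["A", "B", "C", "D"]
P = {bits: joint(dict(zip(names, bits))) for bits in product((0, 1), repeat=4)}
total = sum(P.values())
p_a1_d1 = sum(p for bits, p in P.items() if bits[0] == 1 and bits[3] == 1)
```

Here `total` comes out as 1 up to rounding, and `p_a1_d1` equals the product of the two root biases, reflecting that A and D are independent in any distribution generated this way.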

SLIDE 20

Wait, what does causality even mean?

Empirically, causal hypotheses make two kinds of predictions:

  1. Interventional predictions – how you expect the system to respond if you start controlling parts of it, and
  2. Observational predictions – things you can see without manipulating the system.

SLIDE 21

1. Interventional predictions

How to think about the intervention “set C = 1” as “graph surgery”:
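One common way to phrase this surgery in code: cut every edge into C and replace C’s CPT row by a constant. The sketch below assumes the “DACB” structure with invented biases; `do`, `joint_table`, and `prob_of` are hypothetical helper names, not anything from the talk.

```python
from itertools import product

PARENTS = {"D": (), "A": (), "C": ("D", "A"), "B": ("C",)}
# P(variable = 1 | parent values); all numbers are made up for illustration.
CPT = {
    "D": {(): 0.5},
    "A": {(): 0.6},
    "C": {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.7, (1, 1): 0.9},
    "B": {(0,): 0.8, (1,): 0.2},
}
NAMES = ["A", "B", "C", "D"]


def do(parents, cpt, var, value):
    """Return the altered (parents, cpt) for the intervention var := value."""
    new_parents = dict(parents)
    new_cpt = dict(cpt)
    new_parents[var] = ()  # cut every edge pointing into var
    new_cpt[var] = {(): 1.0 if value == 1 else 0.0}  # var becomes a constant
    return new_parents, new_cpt


def joint_table(parents, cpt):
    """Joint distribution over (A, B, C, D) from the product formula."""
    table = {}
    for bits in product((0, 1), repeat=4):
        a = dict(zip(NAMES, bits))
        p = 1.0
        for var in NAMES:
            p1 = cpt[var][tuple(a[q] for q in parents[var])]
            p *= p1 if a[var] == 1 else 1.0 - p1
        table[bits] = p
    return table


def prob_of(table, index, value):
    """Marginal probability that the variable at the given index takes the value."""
    return sum(p for bits, p in table.items() if bits[index] == value)


before = joint_table(PARENTS, CPT)
after = joint_table(*do(PARENTS, CPT, "C", 1))
```

After the surgery, B follows its CPT row for C = 1, while the upstream coins A and D keep their original marginals: forcing C tells us nothing about its former causes.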

SLIDE 22

1. Interventional predictions

In social sciences, interventions are often too expensive, unethical, or impossible to do in practice, so we need to understand the purely observational predictions of causal theories in order to test them!
SLIDE 23

2. Observational predictions

For a given causal structure G on a set of observed variables V and unobserved variables U, there is a semialgebraic set, or semivariety, of probability distributions $M_G$ which can arise from some parameter assignment (CPT, Θ) to G. It turns out that $M_G$ is often a proper subset of the set $\Delta_V$ of all possible probability distributions on V. Thus, it constrains what we expect to see in the world.

SLIDE 24

2. Observational predictions

Exercise: for each of the following pairs of coin biasing graphs, try to determine if they are observationally equivalent, i.e., give rise to the same semivariety of distributions:

SLIDE 25

2. Observational predictions

Answers:

SLIDE 26

2. Observational predictions

How can we tell them apart? The simplest observational predictions made by a causal theory that can be used to distinguish it are conditional independence statements implied by its graph G. These can be thought of as equational constraints on its semivariety of distributions.

“A ⊥⊥ C | B” is read “A is independent of C conditional on B”, and intuitively means “When B is known, learning A gives no new information about C.” In terms of probabilities, this is

$$P(A \mid B) \cdot P(C \mid B) = P(AC \mid B).$$
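To see why such a statement is an equational (polynomial) constraint, one can clear denominators. The short derivation below is for binary variables; the notation $p_{abc} = P(A{=}a, B{=}b, C{=}c)$, with a $+$ marking a summed-out index, is introduced here for illustration and is not from the slides.

```latex
\begin{align*}
P(A{=}a \mid B{=}b)\, P(C{=}c \mid B{=}b) &= P(A{=}a,\, C{=}c \mid B{=}b)
  && \text{for all } a, c \\
\iff\quad p_{ab+}\; p_{+bc} &= p_{abc}\; p_{+b+}
  && \text{(multiply by } p_{+b+}^2 > 0\text{)} \\
\iff\quad p_{0b0}\, p_{1b1} - p_{0b1}\, p_{1b0} &= 0
  && \text{(the } 2 \times 2 \text{ slice } (p_{abc})_{a,c} \text{ has rank} \le 1\text{)}
\end{align*}
```

So for binary variables, A ⊥⊥ C | B amounts to two quadratic equations in the joint probabilities, one for each value of b.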

SLIDE 27

Example: In a world where mating happens completely randomly...

[Figure: three causal diagrams on A, B, and C, each followed by the independence statements it implies]

  ⇒ A ⊥⊥ C | B, but A ⊥/⊥ C
  ⇒ A ⊥⊥ C | B, but A ⊥/⊥ C
  ⇒ A ⊥/⊥ C | B, but A ⊥⊥ C.
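The diagrams themselves are not preserved here, so the sketch below assumes the standard reading: a chain A → B → C for the first pattern of statements and a collider A → B ← C for the last. All biases are invented, and the helper names are hypothetical.

```python
from itertools import product


def joint_from(parents, cpt):
    """Joint table over (A, B, C) from P(v) = prod_i P(v_i | parents(v_i))."""
    names = ["A", "B", "C"]
    table = {}
    for bits in product((0, 1), repeat=3):
        a = dict(zip(names, bits))
        p = 1.0
        for var in names:
            p1 = cpt[var][tuple(a[q] for q in parents[var])]
            p *= p1 if a[var] == 1 else 1.0 - p1
        table[bits] = p
    return table


def marginal_gap(table):
    """P(A=1, C=1) - P(A=1) P(C=1); zero exactly when binary A and C are independent."""
    pac = sum(p for (a, b, c), p in table.items() if a == 1 and c == 1)
    pa = sum(p for (a, b, c), p in table.items() if a == 1)
    pc = sum(p for (a, b, c), p in table.items() if c == 1)
    return pac - pa * pc


def conditional_gap(table, b):
    """2x2 determinant of the slice B = b; zero exactly when A is independent of C given B = b."""
    return table[(0, b, 0)] * table[(1, b, 1)] - table[(0, b, 1)] * table[(1, b, 0)]


# Assumed graphs with made-up biases.
chain = joint_from(
    {"A": (), "B": ("A",), "C": ("B",)},
    {"A": {(): 0.3}, "B": {(0,): 0.2, (1,): 0.9}, "C": {(0,): 0.1, (1,): 0.7}},
)
collider = joint_from(
    {"A": (), "C": (), "B": ("A", "C")},
    {"A": {(): 0.3}, "C": {(): 0.6},
     "B": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.95}},
)
```

For the chain, both conditional gaps vanish while the marginal gap does not; for the collider it is the other way around, matching the last line of statements above.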

SLIDE 30

Subtleties

As you can see, many mathematical subtleties arise in trying to infer the underlying graph of a causal process. Even die-biasing games on 3 and 4 random variables have causally important mathematical properties that are not immediately intuitive:

  • Not every causal structure G can be recovered uniquely from the outputs of a die-biasing game on it. Instead, DAGs come in small equivalence classes with other DAGs that are “observationally indistinguishable” from them.
  • At least 3 random variables are required to test any causal relationship, observationally. (I.e., on two variables/dice, only the DAG with no edge can be recovered, so A → B is indistinguishable from B → A.)

SLIDE 31

Subtleties

SLIDE 32

Subtleties

The observational equivalence class of a causal diagram is determined by its “v-pattern”, i.e. the occurrence of induced subgraphs of the form • → • ← • (see Pearl, 2000):

SLIDE 33

Subtleties

In other words, two causal structures on a set of variables V have the same semivariety if and only if they have the same v-pattern.
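A minimal sketch of this test, assuming “v-pattern” is read in the usual way: the undirected skeleton together with the set of v-structures x → z ← y whose endpoints x and y are not adjacent. DAGs are represented as sets of (parent, child) pairs, and the function names are hypothetical.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of a DAG given as a set of (parent, child) pairs."""
    return {frozenset(e) for e in edges}


def v_structures(edges):
    """All induced subgraphs x -> z <- y with x and y non-adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, ps in parents.items():
        for x, y in combinations(sorted(ps), 2):
            if frozenset((x, y)) not in skel:
                found.add((x, child, y))
    return found


def same_v_pattern(edges1, edges2):
    """True when the two DAGs share a skeleton and a set of v-structures."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))


chain = {("A", "B"), ("B", "C")}
fork = {("B", "A"), ("B", "C")}
collider = {("A", "B"), ("C", "B")}
```

Under this reading the chain and the fork land in the same class, the collider stands apart, and the two orientations of a single edge are indistinguishable.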

SLIDE 34

Subtleties

The DACB graph has the nice property that it is the only graph in its observational equivalence class, so it uniquely determines its semivariety, and in fact its variety (the Zariski closure of its semivariety).
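The uniqueness claim can be checked by brute force on four nodes. The sketch below assumes the DACB graph is D → C ← A, C → B (as described earlier in the talk) and that observational equivalence means equal skeleton and equal v-structures; it enumerates every DAG on {A, B, C, D} and keeps those matching DACB. All names are hypothetical.

```python
from itertools import combinations, product

NODES = "ABCD"
PAIRS = list(combinations(NODES, 2))  # the 6 possible adjacencies


def is_acyclic(edges):
    """Repeatedly delete nodes with no incoming edge; a cycle leaves some behind."""
    remaining = set(NODES)
    while remaining:
        sources = [n for n in remaining
                   if not any(c == n and p in remaining for p, c in edges)]
        if not sources:
            return False
        remaining -= set(sources)
    return True


def all_dags():
    """Every DAG on NODES: each pair is absent, oriented x -> y, or oriented y -> x."""
    for choice in product((None, 0, 1), repeat=len(PAIRS)):
        edges = set()
        for (x, y), c in zip(PAIRS, choice):
            if c == 0:
                edges.add((x, y))
            elif c == 1:
                edges.add((y, x))
        if is_acyclic(edges):
            yield frozenset(edges)


def v_pattern(edges):
    """Skeleton plus the set of v-structures x -> z <- y with x, y non-adjacent."""
    skel = frozenset(frozenset(e) for e in edges)
    vs = frozenset(
        (min(x, y), z, max(x, y))
        for (x, z) in edges for (y, z2) in edges
        if z == z2 and x != y and frozenset((x, y)) not in skel
    )
    return skel, vs


DACB = frozenset({("D", "C"), ("A", "C"), ("C", "B")})
dags = list(all_dags())
equivalents = [g for g in dags if v_pattern(g) == v_pattern(DACB)]
```

The list `equivalents` contains only DACB itself: reversing C → B would create new v-structures at C, and reversing either edge into C would destroy the existing one.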

SLIDE 35

Subtleties

Perhaps such combinatorial and algebraic subtleties are the reason philosophers were so confused about causality for so long?

... but not anymore! Spirtes, Glymour, and Scheines (2001), Causation, Prediction, and Search, gives an excellent and thorough discussion of the problem of learning causal structures from data.

SLIDE 37

Correlation ≠ causation, but...

In summary, causal hypotheses:

  • can still be made mathematically precise;
  • imply testable predictions in the form of interventions and conditional independences; and
  • under the right circumstances can be reliably inferred from probabilities observed without interventions (controlled experiments).

As of 2010, the statistical package pcalg for R (Kalisch, Maechler, Colombo) incorporates algorithms for inferring DAG structures from data on causal structures of discrete and Gaussian variables.

SLIDE 38

Alert: When possible, controlled experiments remain an invaluable tool for testing a causal theory G: they can distinguish graphs that are observationally equivalent, they test the additional hypothesis that we can intervene in a way that respects the hypothesized causal structure, and they can identify many observationally indistinguishable causal relations in the presence of hidden variables.

Remark: In practice, it is difficult to falsify a causal theory G from probability observation alone, because we usually have high prior confidence in the existence of hidden variables. But observed CI relations can still serve as strong evidence in favor of the theory, because they almost never occur unless the induced subgraph on the observed variables implies them.

SLIDE 39

Perhaps the best feature of graphical models: Such caveats can be made precise by writing down the relevant graphs, and analyzed accordingly (which is great for mathematicians, who usually hate talking about subjects that lack precise notation).

A hope for observational sciences: I personally hope that such discoveries can inform public policy and medical decisions on questions previously considered unanswerable.

SLIDE 41

Why didn’t anybody tell me?

Probably because they didn’t know either! Graphical modeling is by far the most structured and rigorous framework for understanding causality to date, and it’s not very old. Major advancements occurred in the late 1980s, 90s and 2000s, and there’s still a lot of work to be done...

SLIDE 42

3. Macaulay2 Demonstration...

SLIDE 43

Let’s work through what all this means computationally in an example.

(Break for Macaulay2 demo.)

SLIDE 44

4. Finer algebraic invariants

SLIDE 45

Finer algebraic invariants

When models involve hidden variables, even if they are not observationally equivalent, conditional independences alone are often not enough to tell them apart, and finer algebraic invariants or inequalities are needed to distinguish their semivarieties.
SLIDE 46

For example, the following two models on binary variables satisfy the same conditional independencies (none!), but their semivarieties are respectively 9- and 15-dimensional:

What’s a good example of this phenomenon, on binary variables, where both models involve a hidden variable?

SLIDE 47

Example candidate 1:

These two models are observationally equivalent, even if we’re allowed to view H.

SLIDE 48

Example candidate 2:

These models can be distinguished by conditional independencies among observed nodes, namely, whether A ⊥⊥ B.

SLIDE 49

Example candidate 3:

These models are different, and cannot be distinguished by conditional independencies, so let’s examine this example.

SLIDE 50

Model # 1 (“naive Bayes”)

The semivariety of this model lives in $\Delta_{15} \subset \mathbb{C}^{15} \subset \mathbb{P}^{15}$, and (Proposition) its Zariski closure is the first secant variety of the Segre embedding of $\mathbb{P}^1 \times \mathbb{P}^1 \times \mathbb{P}^1 \times \mathbb{P}^1$. By [Raicu 2010], this variety is cut out by the 3 × 3 minors of the 4 × 4 flattenings of the 2 × 2 × 2 × 2 tensor $(p_{abcd})$. This ideal is minimally generated by 32 of the minors, which look like this:

$$p_{0111}p_{1010}p_{1101} - p_{0110}p_{1011}p_{1101} - p_{0111}p_{1001}p_{1110} + p_{0101}p_{1011}p_{1110} + p_{0110}p_{1001}p_{1111} - p_{0101}p_{1010}p_{1111}$$

SLIDE 51

Model # 1 (“naive Bayes”)

In other words, by known results from algebraic geometry we can already write down 32 cubic equations which all the distributions arising from this model must satisfy, and which generate all other such equations.
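These cubics can be checked numerically on a randomly generated distribution. The sketch below assumes the naive Bayes model has one binary hidden variable with four binary observed children, draws arbitrary parameters, and evaluates every 3 × 3 minor of the three 4 × 4 flattenings. The function names are hypothetical.

```python
import random
from itertools import combinations, product


def naive_bayes_tensor(rng):
    """Random p[a,b,c,d] = sum_h P(h) P(a|h) P(b|h) P(c|h) P(d|h) with binary h."""
    ph1 = rng.random()
    cond = [[rng.random() for _h in range(2)] for _leaf in range(4)]  # P(leaf = 1 | h)
    p = {}
    for bits in product((0, 1), repeat=4):
        total = 0.0
        for h in (0, 1):
            term = ph1 if h == 1 else 1.0 - ph1
            for leaf, bit in enumerate(bits):
                q = cond[leaf][h]
                term *= q if bit == 1 else 1.0 - q
            total += term
        p[bits] = total
    return p


def flattening(p, row_axes):
    """4 x 4 matrix whose rows are indexed by the two axes in row_axes."""
    col_axes = [i for i in range(4) if i not in row_axes]
    matrix = []
    for r in product((0, 1), repeat=2):
        row = []
        for c in product((0, 1), repeat=2):
            bits = [0] * 4
            for axis, v in zip(row_axes, r):
                bits[axis] = v
            for axis, v in zip(col_axes, c):
                bits[axis] = v
            row.append(p[tuple(bits)])
        matrix.append(row)
    return matrix


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def largest_3x3_minor(matrix):
    """Largest absolute value among all 3 x 3 minors of a 4 x 4 matrix."""
    best = 0.0
    for rows in combinations(range(4), 3):
        for cols in combinations(range(4), 3):
            sub = [[matrix[r][c] for c in cols] for r in rows]
            best = max(best, abs(det3(sub)))
    return best


rng = random.Random(1)
p = naive_bayes_tensor(rng)
worst = max(largest_3x3_minor(flattening(p, axes))
            for axes in [(0, 1), (0, 2), (0, 3)])
```

Each flattening is a sum of two rank-one matrices, one per hidden state, so `worst` is zero up to floating-point rounding.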

SLIDE 52

Model # 2:

The second model is equivalent to the model on the right, the Zariski closure of whose semivariety is the first secant variety of the Segre embedding of $\mathbb{P}^3 \times \mathbb{P}^1 \times \mathbb{P}^1$, whose equations we also know by [Raicu, 2010]!

SLIDE 53

Model # 2:

Namely, they are the 3 × 3 minors of the unique 4 × 4 flattening of the 4 × 2 × 2 tensor $(p_{(ab)cd})$.

SLIDE 54

Question: How can we use invariants like these, or related algebraic tools, to discover the relevance of hidden variables in natural datasets, like social network data, or medical survey data? How much data would we need? How much noise could we tolerate?

SLIDE 55

5. Binary hidden Markov models

SLIDE 56

HMM(d, k, n) are models like the one above, with n hidden nodes having d states each and n visible nodes having k states each, where all the horizontal edges have the same CPT T and all the vertical edges have the same CPT E.

Bray and Morton [2005] investigated defining equations for these models, and Schönhuth [2011] found a complete answer (ideal generators) for HMM(2, 2, 3).

SLIDE 57

The parametrization map $(\pi, T, E) \mapsto p \in \Delta_{k^n-1}$ is generically d!-to-1, but for d = 2 hidden states we can do better:

Theorem (C-, 2012): For n ≥ 3, HMM(2, k, n) can be generically parametrized by a birationally invertible map $\psi : \mathbb{C}^5 \to \mathbb{P}^{k^n-1}$ with an explicitly known inverse $\psi^{-1}$.
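The “d!-to-1” statement reflects the fact that renaming the hidden states does not change the visible distribution. The sketch below computes the HMM(2, 2, 3) distribution with the forward algorithm and compares it with the distribution from swapped hidden labels. The parameter values are invented and the function names are hypothetical; π is taken to be the initial hidden-state distribution.

```python
from itertools import product


def hmm_distribution(pi, T, E, n):
    """p[v_1..v_n] = sum over hidden paths of pi[h_1] * prod T[h_t][h_t+1] * prod E[h_t][v_t]."""
    d, k = len(pi), len(E[0])
    p = {}
    for visible in product(range(k), repeat=n):
        # Forward algorithm: alpha[h] = P(v_1..v_t, H_t = h).
        alpha = [pi[h] * E[h][visible[0]] for h in range(d)]
        for v in visible[1:]:
            alpha = [sum(alpha[g] * T[g][h] for g in range(d)) * E[h][v]
                     for h in range(d)]
        p[visible] = sum(alpha)
    return p


def relabel(pi, T, E, perm):
    """Rename hidden state h to perm[h] in all three parameter blocks."""
    d = len(pi)
    inv = [perm.index(h) for h in range(d)]
    return ([pi[inv[h]] for h in range(d)],
            [[T[inv[g]][inv[h]] for h in range(d)] for g in range(d)],
            [E[inv[h]] for h in range(d)])


# Made-up parameters for HMM(2, 2, 3).
pi = [0.3, 0.7]
T = [[0.9, 0.1], [0.4, 0.6]]
E = [[0.8, 0.2], [0.25, 0.75]]

p = hmm_distribution(pi, T, E, n=3)
p_swapped = hmm_distribution(*relabel(pi, T, E, [1, 0]), n=3)
```

The two parameter triples differ, yet `p` and `p_swapped` agree on all 8 visible outcomes, which is the 2!-to-1 ambiguity for d = 2.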

SLIDE 58

Using the birational parametrization ψ and the cumulant coordinates of Sturmfels-Zwiernik, one can compute that

Theorem (C-, 2012): The semivariety of HMM(2, 2, 4) distributions is cut out in $\Delta_{2^n-1}$ by 21 quadrics and 29 cubics (minimally generating its ideal), and inequalities easily computable from $\psi^{-1}$.

SLIDE 59

Question: What do the higher-dimensional analogues of these invariants mean about the structure of natural language?

SLIDE 60

6. MPS-entangled qubits

SLIDE 61

Probability distributions are very similar to quantum states, except the latter allow arbitrary complex coefficients and are typically normalized by $L^2$ instead of $L^1$.

Graphical causal models are likewise similar to tensor network models of finite-dimensional model quantum states, where instead of causality we have quantum entanglement, and similar algebraic techniques apply to both.

In particular, reparametrization techniques similar to those used in (C-, 2012) for hidden Markov models allow us (C-, Morton) to compute defining equations for the variety of quantum states arising from certain matrix product state (MPS) models.

SLIDE 62

Given a tensor $(A^{j}_{ik}) \in \mathbb{C}^{2 \times 2 \times 2}$, we consider a system of n entangled qubits $\Psi \in (\mathbb{C}^2)^{\otimes n}$ with coordinates (entries) given by the formula

$$\psi_{i_1 \ldots i_n} = \sum_{j \in \{0,1\}^n} A^{j_1}_{i_1 j_2} A^{j_2}_{i_2 j_3} \cdots A^{j_n}_{i_n j_1}.$$

We write CMPS(2, 2, 4) for the set of such cyclic matrix product states. If we write A in Penrose diagram notation with its i-index below it, then for n = 4, Ψ is given by the diagram below:
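Reading $A^{j}_{ik}$ as the (j, k) entry of a 2 × 2 matrix $A_i$, the sum over j is a trace of a matrix product, which is easy to compute. The sketch below uses an arbitrary random complex tensor; `cyclic_mps` and `matmul` are hypothetical names, and the nested-list layout `A[i][j][k]` is an assumption made for illustration.

```python
import random
from itertools import product


def matmul(X, Y):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(X[r][m] * Y[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]


def cyclic_mps(A, n):
    """psi[i_1..i_n] = trace(A[i_1] A[i_2] ... A[i_n]).

    A[i][j][k] stands for A^j_{ik}, so the sum over j in {0,1}^n in the
    formula above becomes the trace of a product of 2 x 2 matrices.
    """
    psi = {}
    for idx in product((0, 1), repeat=n):
        M = [[1, 0], [0, 1]]
        for i in idx:
            M = matmul(M, A[i])
        psi[idx] = M[0][0] + M[1][1]
    return psi


rng = random.Random(2)
# A random complex 2 x 2 x 2 tensor; the entries are arbitrary test values.
A = [[[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _k in range(2)]
      for _j in range(2)] for _i in range(2)]
psi = cyclic_mps(A, 4)
```

Because a trace is invariant under cyclic rotation of the factors, the 16 entries of `psi` take at most 6 distinct values, one per rotation class of the index string.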

SLIDE 63

Theorem (C-, Morton): Inside the $\mathbb{C}^6$ cut out by the cyclic symmetries $\psi_{ijk\ell} - \psi_{\ell ijk}$, the variety CMPS(2, 2, 4) of 4-qubit states arising as limits of matrix product states of the form

$$\psi_{i_1 i_2 i_3 i_4} = \sum_{j \in \{0,1\}^4} A^{j_1}_{i_1 j_2} A^{j_2}_{i_2 j_3} A^{j_3}_{i_3 j_4} A^{j_4}_{i_4 j_1}$$

is an irreducible sextic hypersurface defined by the following 30-term polynomial:

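SLIDE 64

$$
\begin{aligned}
&\psi_{1010}^2\psi_{1100}^4 - 2\psi_{1100}^6 - 8\psi_{1000}\psi_{1010}\psi_{1100}^3\psi_{1110} + 12\psi_{1000}\psi_{1100}^4\psi_{1110} \\
&- 4\psi_{1000}^2\psi_{1010}^2\psi_{1110}^2 + 2\psi_{0000}\psi_{1010}^3\psi_{1110}^2 + 16\psi_{1000}^2\psi_{1010}\psi_{1100}\psi_{1110}^2 \\
&- 4\psi_{0000}\psi_{1010}^2\psi_{1100}\psi_{1110}^2 - 16\psi_{1000}^2\psi_{1100}^2\psi_{1110}^2 + 4\psi_{0000}\psi_{1010}\psi_{1100}^2\psi_{1110}^2 \\
&- 4\psi_{0000}\psi_{1100}^3\psi_{1110}^2 - 4\psi_{0000}\psi_{1000}\psi_{1010}\psi_{1110}^3 + 8\psi_{0000}\psi_{1000}\psi_{1100}\psi_{1110}^3 \\
&- \psi_{0000}^2\psi_{1110}^4 + 2\psi_{1000}^2\psi_{1010}^3\psi_{1111} - \psi_{0000}\psi_{1010}^4\psi_{1111} \\
&- 4\psi_{1000}^2\psi_{1010}^2\psi_{1100}\psi_{1111} + 4\psi_{1000}^2\psi_{1010}\psi_{1100}^2\psi_{1111} \\
&+ 2\psi_{0000}\psi_{1010}^2\psi_{1100}^2\psi_{1111} - 4\psi_{1000}^2\psi_{1100}^3\psi_{1111} + \psi_{0000}\psi_{1100}^4\psi_{1111} \\
&- 4\psi_{1000}^3\psi_{1010}\psi_{1110}\psi_{1111} + 4\psi_{0000}\psi_{1000}\psi_{1010}^2\psi_{1110}\psi_{1111} \\
&+ 8\psi_{1000}^3\psi_{1100}\psi_{1110}\psi_{1111} - 8\psi_{0000}\psi_{1000}\psi_{1010}\psi_{1100}\psi_{1110}\psi_{1111} \\
&- 2\psi_{0000}\psi_{1000}^2\psi_{1110}^2\psi_{1111} + 2\psi_{0000}^2\psi_{1010}\psi_{1110}^2\psi_{1111} - \psi_{1000}^4\psi_{1111}^2 \\
&+ 2\psi_{0000}\psi_{1000}^2\psi_{1010}\psi_{1111}^2 - \psi_{0000}^2\psi_{1010}^2\psi_{1111}^2.
\end{aligned}
$$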
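As a sanity check, the sextic can be evaluated on explicit cyclic matrix product states. The sketch below encodes each of the 30 terms as a coefficient plus exponents, then evaluates the polynomial at one hand-picked integer tensor and one random complex tensor. The encoding, the function names, and both test tensors are choices made here for illustration; only the polynomial itself comes from the slide.

```python
import random
from itertools import product

KEYS = ["0000", "1000", "1010", "1100", "1110", "1111"]
# (coefficient, exponents of psi_0000, psi_1000, psi_1010, psi_1100, psi_1110, psi_1111)
TERMS = [
    (1, (0, 0, 2, 4, 0, 0)), (-2, (0, 0, 0, 6, 0, 0)), (-8, (0, 1, 1, 3, 1, 0)),
    (12, (0, 1, 0, 4, 1, 0)), (-4, (0, 2, 2, 0, 2, 0)), (2, (1, 0, 3, 0, 2, 0)),
    (16, (0, 2, 1, 1, 2, 0)), (-4, (1, 0, 2, 1, 2, 0)), (-16, (0, 2, 0, 2, 2, 0)),
    (4, (1, 0, 1, 2, 2, 0)), (-4, (1, 0, 0, 3, 2, 0)), (-4, (1, 1, 1, 0, 3, 0)),
    (8, (1, 1, 0, 1, 3, 0)), (-1, (2, 0, 0, 0, 4, 0)), (2, (0, 2, 3, 0, 0, 1)),
    (-1, (1, 0, 4, 0, 0, 1)), (-4, (0, 2, 2, 1, 0, 1)), (4, (0, 2, 1, 2, 0, 1)),
    (2, (1, 0, 2, 2, 0, 1)), (-4, (0, 2, 0, 3, 0, 1)), (1, (1, 0, 0, 4, 0, 1)),
    (-4, (0, 3, 1, 0, 1, 1)), (4, (1, 1, 2, 0, 1, 1)), (8, (0, 3, 0, 1, 1, 1)),
    (-8, (1, 1, 1, 1, 1, 1)), (-2, (1, 2, 0, 0, 2, 1)), (2, (2, 0, 1, 0, 2, 1)),
    (-1, (0, 4, 0, 0, 0, 2)), (2, (1, 2, 1, 0, 0, 2)), (-1, (2, 0, 2, 0, 0, 2)),
]


def cyclic_mps_4(A):
    """psi[i1 i2 i3 i4] = trace(A[i1] A[i2] A[i3] A[i4]) for 2 x 2 matrices A[0], A[1]."""
    def mul(X, Y):
        return [[sum(X[r][m] * Y[m][c] for m in range(2)) for c in range(2)]
                for r in range(2)]
    psi = {}
    for idx in product((0, 1), repeat=4):
        M = [[1, 0], [0, 1]]
        for i in idx:
            M = mul(M, A[i])
        psi["".join(map(str, idx))] = M[0][0] + M[1][1]
    return psi


def evaluate(psi):
    """Return (value of the sextic, sum of the absolute values of its terms) at psi."""
    value, scale = 0, 0.0
    for coeff, exps in TERMS:
        term = coeff
        for key, e in zip(KEYS, exps):
            term *= psi[key] ** e
        value += term
        scale += abs(term)
    return value, scale


# A hand-picked integer tensor (exact arithmetic) and a random complex one.
A_int = [[[1, 1], [0, 1]], [[1, 0], [1, 1]]]
value_int, _ = evaluate(cyclic_mps_4(A_int))

rng = random.Random(3)
A_rand = [[[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
           for _ in range(2)] for _ in range(2)]
value_rand, scale_rand = evaluate(cyclic_mps_4(A_rand))
```

For the integer tensor the six coordinates are (2, 5, 7, 6, 5, 2) in the order of `KEYS`, and the 30 terms cancel exactly; for the random tensor the value is zero up to rounding relative to `scale_rand`.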

SLIDE 65

7. Further reading

SLIDE 66

For historical perspective:

  • Pearl (1988) Probabilistic Reasoning in Intelligent Systems – graphs becoming popular in CS for subjective belief propagation networks, or Bayes nets.
  • Lauritzen (1996) Graphical Models – beginning to view graphs as generative processes underlying statistical theories.
  • Pearl (2000) Causality: Models, Reasoning, and Inference – Pearl advocating understanding graphical models, as a framework for stating statistical theories, by essentially all scientists and medical professionals.
  • Spirtes, Glymour, and Scheines (2001) Causation, Prediction, and Search

SLIDE 67

For commutative algebraists and algebraic geometers interested in algebraic statistics:

  • Pistone, Riccomagno, Wynn (2001) Algebraic Statistics
  • Pachter, Sturmfels (2005) Algebraic Statistics for Computational Biology, Cambridge University Press.
  • Drton, Sturmfels, Sullivant (2009) Lectures on Algebraic Statistics, Springer.

SLIDE 68

For everyone else interested in algebraic statistics:

My thesis?

SLIDE 69

On the models in this talk:

  • C. Raicu (2010), Secant Varieties of Segre-Veronese Varieties, arXiv:1011.5867
  • A. C. (2012), Binary hidden Markov models and varieties, arXiv:1206.0500
  • A. C., J. Morton (2012), Polynomial constraints on representing entangled qubits as matrix product states, arXiv:1210.2812

Thank you!
