Reverse engineering using computational algebra (Elena Dimitrova)

Reverse engineering using computational algebra Elena Dimitrova School of Mathematical and Statistical Sciences Clemson University http://edimit.people.clemson.edu/ Algebraic Biology E. Dimitrova (Clemson) Reverse engineering using computational algebra Algebraic Biology 1 / 57

What is reverse engineering?

Systems biology is the study of systems of biological components. A central problem in systems biology is to use experimental data to infer the structure of a system, such as a gene regulatory network.

Modeling approaches
Bottom-up: Build a network from the known local information about every single object.
Top-down (“reverse engineering”): View the system as a black box, then use the available data to build a model.

Previously, we’ve mostly studied the first approach to modeling. In this lecture, we’ll focus on the second approach. Many problems in statistics (e.g., linear regression) also take the second approach.

The blind men and the elephant

An old parable from India tells of several blind men who try to determine what an elephant looks like just by touch. The blind men are trying to reverse engineer an elephant from just a few data points.

Inferring a Boolean model (elephant) from data (observations)

Consider a Boolean network on $n$ nodes, with update function $f : \mathbb{F}_2^n \to \mathbb{F}_2^n$. There are $2^n$ input states. Suppose we don’t know the actual function $f$, but through experimental data, we are able to observe several transitions $s_i \mapsto t_i$:

\[
\begin{aligned}
s_1 = (s_{11}, s_{12}, \dots, s_{1n}) &\longmapsto t_1 = (t_{11}, t_{12}, \dots, t_{1n}) \\
s_2 = (s_{21}, \dots, s_{2n}) &\longmapsto t_2 = (t_{21}, \dots, t_{2n}) \\
&\;\;\vdots \\
s_m = (s_{m1}, \dots, s_{mn}) &\longmapsto t_m = (t_{m1}, \dots, t_{mn})
\end{aligned}
\]

Reverse engineering
Start with experimental data (observations) and reconstruct the model (elephant). The two main features are:
(i) the network topology, or wiring diagram,
(ii) the Boolean functions at each node: $f = (f_1, \dots, f_n)$.

This problem is not limited to models over $\mathbb{F}_2 = \{0, 1\}$; the same approach works for models over larger finite fields $F$. We will call such models local models.

Inferring a Boolean network (elephant) from data (observations)

Consider the following Boolean network:

\[
\begin{aligned}
f_1(x_1, x_2, x_3) &= x_1 \wedge x_2 = x_1 x_2 \\
f_2(x_1, x_2, x_3) &= x_1 \wedge x_2 \wedge x_3 = x_1 x_2 x_3 \\
f_3(x_1, x_2, x_3) &= x_1 \wedge x_2 = x_1 x_2
\end{aligned}
\]

The state space of $f = (f_1, f_2, f_3)$ is the following graph:

[Figure: state space graph on the eight states 000, 001, 010, 011, 100, 101, 110, 111.]

Question
What if we only knew part of this state space, e.g.,
\[ (1, 1, 0) \longrightarrow (1, 0, 1) \longrightarrow (0, 0, 0) \longrightarrow (0, 0, 0)? \]
Could we recover the individual functions? How many possible models could yield this “fragment”?
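As a sanity check on the example above, the following Python sketch enumerates the state space of the three-node network exactly as written on the slide and confirms that the observed fragment is consistent with it. The names `step`, `state_space`, and `fragment_ok` are illustrative and do not come from the lecture.

```python
from itertools import product

def step(state):
    """Apply f = (f1, f2, f3) once; products are taken over F_2."""
    x1, x2, x3 = state
    return (x1 * x2 % 2, x1 * x2 * x3 % 2, x1 * x2 % 2)

# Full state space: every state in F_2^3 mapped to its successor.
state_space = {s: step(s) for s in product((0, 1), repeat=3)}

for s, t in sorted(state_space.items()):
    print(s, "->", t)

# The fragment from the slide: (1,1,0) -> (1,0,1) -> (0,0,0) -> (0,0,0).
fragment = [(1, 1, 0), (1, 0, 1), (0, 0, 0), (0, 0, 0)]
fragment_ok = all(step(a) == b for a, b in zip(fragment, fragment[1:]))
print("fragment consistent:", fragment_ok)
```

Each consecutive pair in the fragment is one observed transition, so the fragment supplies three distinct input states out of the eight possible ones.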

Reverse engineering the model space

Broad goal
Find “the best” local model $f = (f_1, \dots, f_n)$ that fits the data:

Input states: $s_1, \dots, s_m \in F^n$
Output states: $t_1, \dots, t_m \in F^n$
with $f(s_i) = t_i$.

Note that:
\[ f(s_i) = (f_1(s_i), f_2(s_i), \dots, f_n(s_i)) = (t_{i1}, t_{i2}, \dots, t_{in}) = t_i . \]

Question
What if no models fit the data? (This is actually impossible as long as the data are consistent, i.e., no input state is observed with two different outputs: over a finite field, every function $F^n \to F$ is a polynomial.)
What if many models fit the data?

First, we’ll find all local models that fit the data. This is called the model space:
\[ \mathcal{F}_1 \times \cdots \times \mathcal{F}_n = \bigl\{ (f_1, \dots, f_n) \mid f_j(s_i) = t_{ij} \text{ for all } i \text{ and } j \bigr\} . \]
Once we do this, the new problem becomes choosing the “best” one. This is called model selection.
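To get a feel for how large a model space can be, here is a brute-force sketch that counts the models fitting the three-node Boolean fragment $(1,1,0) \to (1,0,1) \to (0,0,0) \to (0,0,0)$ from the earlier example. It represents each candidate $f_j$ by its truth table rather than by a polynomial, which is only feasible for very small $n$; the helper name `models_at_node` is hypothetical.

```python
from itertools import product

# Observed transitions s_i -> t_i from the three-node fragment.
data = {
    (1, 1, 0): (1, 0, 1),
    (1, 0, 1): (0, 0, 0),
    (0, 0, 0): (0, 0, 0),
}
states = list(product((0, 1), repeat=3))

def models_at_node(j):
    """All truth tables f_j : F_2^3 -> F_2 with f_j(s_i) = t_ij for every i."""
    fits = []
    for values in product((0, 1), repeat=len(states)):
        table = dict(zip(states, values))
        if all(table[s] == t[j] for s, t in data.items()):
            fits.append(table)
    return fits

sizes = [len(models_at_node(j)) for j in range(3)]
total = sizes[0] * sizes[1] * sizes[2]
print(sizes, total)  # [32, 32, 32] 32768
```

Three of the eight truth-table entries are pinned down at each node, leaving $2^5 = 32$ choices per node and $32^3 = 32768$ models in total, which is why a model selection step is needed.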

Similar problems in other areas of mathematics

1. Parametrize a line in $\mathbb{R}^n$.
2. Parametrize a plane in $\mathbb{R}^n$.
3. Solve the underdetermined system $Ax = b$.
4. Solve the differential equation $x'' + x = 2$.

Parametrize a line in $\mathbb{R}^n$

Suppose we want to write the equation for a line that contains a vector $v \in \mathbb{R}^n$.

[Figure: in $xyz$-space, the line $tv$ through the origin and $v$, and the parallel line $tv + w$ through $w$ and $v + w$.]

This line, which contains the zero vector, is $tv = \{ tv : t \in \mathbb{R} \}$.

Now, what if we want to write the equation for a line parallel to $v$? This line, which does not contain the zero vector, is $tv + w = \{ tv + w : t \in \mathbb{R} \}$.

Note that ANY particular $w$ on the line will work!

Solve an underdetermined system $Ax = b$

Suppose we have a system of equations that has “too many variables,” so there are infinitely many solutions. For example:

\[
\begin{aligned} 2x + y + 3z &= 4 \\ 3x - 5y - 2z &= 6 \end{aligned}
\qquad \text{“$Ax = b$ form”:} \qquad
\begin{bmatrix} 2 & 1 & 3 \\ 3 & -5 & -2 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} 4 \\ 6 \end{bmatrix} .
\]

How to solve:
1. Solve the related homogeneous equation $Ax = 0$ (this is the null space, $\mathrm{NS}(A)$);
2. Find any particular solution $x_p$ to $Ax = b$;
3. Add these together to get the general solution: $x = \mathrm{NS}(A) + x_p$.

This works because geometrically, the solution space is just a line, plane, etc. Here are two possible ways to write the solution:

\[
C \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} + \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix},
\qquad
C \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} + \begin{bmatrix} 10 \\ 8 \\ -8 \end{bmatrix} .
\]
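The claim that both presentations describe the same solution set can be checked numerically. The sketch below uses plain Python integer arithmetic; the helper `matvec` and the variable names are illustrative only.

```python
A = [[2, 1, 3],
     [3, -5, -2]]
b = [4, 6]

def matvec(M, v):
    """Matrix-vector product with plain Python lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

null_vec = [1, 1, -1]                    # spans NS(A)
particulars = ([2, 0, 0], [10, 8, -8])   # two choices of x_p

# The null-space direction solves the homogeneous system Ax = 0.
homogeneous_ok = matvec(A, null_vec) == [0, 0]

# Every C * null_vec + x_p solves Ax = b, for either particular solution.
general_ok = all(
    matvec(A, [C * n + p for n, p in zip(null_vec, xp)]) == b
    for xp in particulars
    for C in range(-3, 4)
)

# The two particular solutions differ by an element of NS(A).
difference = [q - p for p, q in zip(*particulars)]
print(homogeneous_ok, general_ok, difference)  # True True [8, 8, -8]
```

The difference $(8, 8, -8) = 8 \cdot (1, 1, -1)$ lies in the null space, so the second presentation is the first one with the parameter shifted by 8.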

Linear differential equations

Solve the differential equation $x'' + x = 2$.

How to solve:
1. Solve the related homogeneous equation $x'' + x = 0$. The solutions are $x_h(t) = a \cos t + b \sin t$.
2. Find any particular solution $x_p(t)$ to $x'' + x = 2$. By inspection, we see that $x_p(t) = 2$ works.
3. Add these together to get the general solution:
\[ x(t) = x_h(t) + x_p(t) = a \cos t + b \sin t + 2 . \]

Note that while the general solution above is unique, its presentation need not be. For example, we could write it this way:
\[ x(t) = x_h(t) + x_p(t) = a (2 \cos t - 3 \sin t) + b \sin t + (2 - \cos t + 8 \sin t) . \]
Here, the particular solution has (unnecessary) “extra terms,” $-\cos t + 8 \sin t$, which themselves solve the homogeneous equation $x'' + x = 0$.
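As a short check that the alternative presentation is legitimate, the following worked computation verifies that $2 - \cos t + 8 \sin t$ is a particular solution and that $2\cos t - 3\sin t$ and $\sin t$ still span the homogeneous solutions.

```latex
\begin{align*}
x_p(t) &= 2 - \cos t + 8 \sin t, \\
x_p''(t) &= \cos t - 8 \sin t, \\
x_p''(t) + x_p(t) &= (\cos t - 8 \sin t) + (2 - \cos t + 8 \sin t) = 2 . \\[6pt]
\cos t &= \tfrac{1}{2}\bigl[(2 \cos t - 3 \sin t) + 3 \sin t\bigr],
\end{align*}
% so every a cos t + b sin t is a combination of (2 cos t - 3 sin t) and sin t.
```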

Reverse engineering: Problem statement

Recall that a local model over $F$ is an $n$-tuple $f = (f_1, \dots, f_n)$ of functions $f_i : F^n \to F$. The associated finite dynamical system (FDS) map is
\[ f : F^n \longrightarrow F^n, \qquad f : x \longmapsto (f_1(x), \dots, f_n(x)) . \]
If $F = \mathbb{F}_p$ then each $f_i : \mathbb{F}_p^n \to \mathbb{F}_p$ is a polynomial in $\mathbb{F}_p[x_1, \dots, x_n] / \langle x_1^p - x_1, \dots, x_n^p - x_n \rangle$.

Goal
Given a set of data:

Input states: $s_1, \dots, s_m \in F^n$
Output states: $t_1, \dots, t_m \in F^n$
with $f(s_i) = t_i$,

construct the model space $\mathcal{F}_1 \times \cdots \times \mathcal{F}_n$ of all local models $f = (f_1, \dots, f_n)$ that fit the data:
\[ f(s_i) = (f_1(s_i), \dots, f_n(s_i)) = (t_{i1}, \dots, t_{in}) = t_i . \]
We’ll find each $\mathcal{F}_1, \dots, \mathcal{F}_n$ separately.

Reverse engineering: How to find $\mathcal{F}_j$

We wish to find the set $\mathcal{F}_j$ of all local functions (polynomials!) $f_j$ that fit the data:
\[ \mathcal{F}_j = \{ f_j : f_j(s_1) = t_{1j}, \dots, f_j(s_m) = t_{mj} \} . \]
Define the set $I$ (it is actually an “ideal” of the polynomial ring $F[x_1, \dots, x_n]$):
\[ I = \{ h : h(s_i) = 0 \text{ for all } i = 1, \dots, m \} = \{ \text{all polynomials that vanish on the data} \} . \]

Theorem
The set of polynomials that fit the data at node $j$ is
\[ \mathcal{F}_j = f_j + I = \{ f_j + h : h \in I \} , \]
where $f_j$ is any one particular polynomial that fits the data.

Thus, to find $\mathcal{F}_j$, we need to do two things:
1. Find the ideal $I$ (all solutions to $\{ h(s_i) = 0, \ \forall i \}$);
2. Find any polynomial $f_j$ that fits the data (one solution to $\{ f_j(s_i) = t_{ij}, \ \forall i \}$).
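The theorem can be checked by brute force on the earlier three-node Boolean fragment at node $j = 1$. The sketch below is only an illustration under a simplifying assumption: it works with functions $\mathbb{F}_2^3 \to \mathbb{F}_2$ stored as value tables (equivalently, polynomials modulo $x_i^2 - x_i$) instead of the full polynomial ring, and all names in it are hypothetical.

```python
from itertools import product

states = list(product((0, 1), repeat=3))
index = {s: k for k, s in enumerate(states)}

inputs = [(1, 1, 0), (1, 0, 1), (0, 0, 0)]   # s_1, s_2, s_3
targets = [1, 0, 0]                          # t_11, t_21, t_31

# Every function F_2^3 -> F_2, as a tuple of values ordered like `states`.
all_functions = list(product((0, 1), repeat=len(states)))

def add(f, g):
    """Pointwise sum of two functions over F_2."""
    return tuple((a + b) % 2 for a, b in zip(f, g))

# One particular solution: f_1 = x1 * x2.
f1 = tuple(x1 * x2 % 2 for x1, x2, x3 in states)

# Functions vanishing on the data (the image of I) and functions fitting the data.
vanishing = [h for h in all_functions if all(h[index[s]] == 0 for s in inputs)]
fitting = {f for f in all_functions
           if all(f[index[s]] == t for s, t in zip(inputs, targets))}

coset = {add(f1, h) for h in vanishing}
print(len(vanishing), coset == fitting)  # 32 True
```

Replacing `f1` with any other element of `fitting` produces the same coset, which mirrors the remark that any one particular solution will do.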

Reverse engineering: How to find $I$ and $f_j$

1. Finding $I$: Define $I(s_i)$ to be the set of polynomials that vanish on $s_i$:
\[
\begin{aligned}
I(s_i) &= \{ \text{all polynomials } h_i \text{ such that } h_i(s_i) = 0 \} \\
&= \{ (x_1 - s_{i1}) g_1(x) + (x_2 - s_{i2}) g_2(x) + \cdots + (x_n - s_{in}) g_n(x) \} \\
&= \langle x_1 - s_{i1}, x_2 - s_{i2}, \dots, x_n - s_{in} \rangle .
\end{aligned}
\]
Clearly, the set $I$ of polynomials that vanish on all $s_i$ (for $i = 1, \dots, m$) is
\[ I = \bigcap_{i=1}^{m} I(s_i) . \]

2. Finding $f_j$: There are many algorithms. Lagrange interpolation is one of them:
\[ f(x_1, \dots, x_n) = \sum_{(c_1, \dots, c_n) \in V} f(c_1, \dots, c_n) \prod_{i=1}^{n} \bigl[ 1 - (x_i - c_i)^{p-1} \bigr] . \]

In this lecture, we will learn another method which has the Chinese remainder theorem lurking behind the scenes.
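The Lagrange interpolation formula can be sketched directly as a function evaluator over $\mathbb{F}_p$. This is a minimal illustration, not the lecture's implementation: it assumes $V$ is the set of input data points (the slide does not define $V$), it evaluates the interpolant rather than expanding it into a polynomial, and the name `lagrange_interpolant` is hypothetical.

```python
def lagrange_interpolant(points, p):
    """Return the function x -> sum_c f(c) * prod_i [1 - (x_i - c_i)^(p-1)] mod p.

    `points` maps each c in V (a tuple over F_p) to the prescribed value f(c).
    By Fermat's little theorem, (x_i - c_i)^(p-1) is 0 if x_i == c_i and 1
    otherwise, so each product is the indicator function of the point c.
    """
    def f(x):
        total = 0
        for c, value in points.items():
            term = value
            for xi, ci in zip(x, c):
                term *= 1 - pow((xi - ci) % p, p - 1, p)
            total += term
        return total % p
    return f

# Node 1 of the Boolean fragment: f_1(1,1,0) = 1, f_1(1,0,1) = 0, f_1(0,0,0) = 0.
f1 = lagrange_interpolant({(1, 1, 0): 1, (1, 0, 1): 0, (0, 0, 0): 0}, p=2)
print(f1((1, 1, 0)), f1((1, 0, 1)), f1((0, 0, 0)))  # 1 0 0

# A small made-up example over F_3.
g = lagrange_interpolant({(0, 2): 1, (1, 1): 2}, p=3)
print(g((0, 2)), g((1, 1)), g((2, 2)))  # 1 2 0
```

Under the stated assumption on $V$, the interpolant is zero at every point outside the data, so it is just one convenient particular solution; any element of $f_j + I$ fits the data equally well.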
