U-curve Search for Biological States Characterization and Genetic Network Design


  1. U-curve Search for Biological States Characterization and Genetic Network Design
     Marcelo Ris – Universidade de São Paulo – Instituto de Matemática e Estatística
     Junior Barrera – Universidade de São Paulo – Instituto de Matemática e Estatística
     Helena Brentani – Hospital do Câncer, Fundação Antônio Prudente

  2. Outline
     • Introduction
     • Feature selection problem
     • U-curve search algorithm
     • Characterization of biological states
     • Genetic network design
     • Application

  3. Section: Introduction

  4. • Biological problems
       – P1. Characterization of biological states
       – P2. Genetic network design
     • Gene expression data
       – P1. State samples
       – P2. Time-course samples
     • Mathematical approach
       – Feature selection problem
       – State of the art: heuristic optimization
       – U-curve algorithm

  5. Section: Feature selection problem

  6. • Joint distribution: $P(X, Y)$
     • Samples: $\{(x[0], y[0]), (x[1], y[1]), \ldots, (x[m], y[m])\}$
     • Feature vector: $x[t] = (x_1[t], x_2[t], \ldots, x_n[t])^{T}$, with $x_i[t] \in R = \{r_1, r_2, \ldots, r_l\}$
     • Label: $y[t] \in K = \{1, \ldots, c\}$
     • Feature selection: choose $A \subset \{1, 2, \ldots, n\}$ and $\psi : R^{|A|} \to K$

  7. • Distribution of Y
       $P : \{-1, 0, 1\} \to [0, 1]$, with $\sum_{y \in \{-1, 0, 1\}} P(y) = 1$
     • Entropy
       $H(Y) = -\sum_{y \in \{-1, 0, 1\}} P(y) \log P(y)$
       $H(Y) > H(Y')$ and $H(Y') = H(Y'')$
       [Figure: histograms of P(Y), P(Y') and P(Y'') over the values −1, 0, 1]
     • Mutual information
       $I(X, Y) = H(Y) - H(Y \mid X) \ge 0$
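
The entropy and mutual information definitions on this slide can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, probabilities are taken as empirical frequencies, and base-2 logarithms are assumed (the slide does not state the base).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of `labels`."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mutual_information(xs, ys):
    """I(X, Y) = H(Y) - H(Y|X), with H(Y|X) averaged over the observed x values."""
    n = len(xs)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    h_cond = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(ys) - h_cond

# A uniform Y has higher entropy than a peaked Y'; relabeling the
# values of Y' (giving Y'') leaves the entropy unchanged.
y_uniform = [-1, 0, 1]
y_peaked = [0, 0, 0, 1]
y_relabeled = [1, 1, 1, -1]
print(entropy(y_uniform) > entropy(y_peaked))
print(math.isclose(entropy(y_peaked), entropy(y_relabeled)))
```

Both comparisons mirror the relations $H(Y) > H(Y')$ and $H(Y') = H(Y'')$ stated on the slide.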

  8. • Mean conditional entropy
       $E[H(Y \mid X)] = -\sum_{x \in \{-1, 0, 1\}} P_X(x) \sum_{y \in \{-1, 0, 1\}} P_{Y|X}(y \mid x) \log P_{Y|X}(y \mid x)$
     • Estimation
       $\hat{E}[H(Y \mid X)] = -\sum_{x \in \{-1, 0, 1\}} \hat{P}_X(x) \sum_{y \in \{-1, 0, 1\}} \hat{P}_{Y|X}(y \mid x) \log \hat{P}_{Y|X}(y \mid x)$
     • Mean mutual information
       $E[I(X, Y)] = H(Y) - E[H(Y \mid X)]$
     • Estimation
       $\hat{E}[I(X, Y)] = \hat{H}(Y) - \hat{E}[H(Y \mid X)]$
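
A minimal sketch of the estimated mean conditional entropy for a feature subset A, assuming a plain plug-in estimate (relative frequencies, base-2 logarithm) with no correction for rarely observed values of x. The function name and the toy data are illustrative, not taken from the presentation.

```python
import math
from collections import Counter, defaultdict

def mean_conditional_entropy(samples, labels, subset):
    """Plug-in estimate of E[H(Y | X_A)] in bits.

    samples: list of feature tuples, labels: list of class labels,
    subset: iterable of feature indices (the candidate set A).
    """
    idx = sorted(subset)
    n = len(samples)
    by_x = defaultdict(Counter)          # label counts for each observed x restricted to A
    for x, y in zip(samples, labels):
        by_x[tuple(x[i] for i in idx)][y] += 1
    total = 0.0
    for counts in by_x.values():
        n_x = sum(counts.values())
        h_x = sum((c / n_x) * math.log2(c / n_x) for c in counts.values())
        total -= (n_x / n) * h_x         # weight each H(Y | x) by the frequency of x
    return total

# Toy data: the label equals the first feature; the second feature is irrelevant.
samples = [(-1, 0), (-1, 1), (0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, 0, 0, 1, 1]
for A in [(), (0,), (1,)]:
    print(A, mean_conditional_entropy(samples, labels, A))
```

With the empty subset the estimate equals the estimated H(Y); conditioning on the informative feature drives it to zero, while the irrelevant feature leaves it unchanged.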

  9. • Problem
       – Find the subset A that optimizes the cost function
       – Example cost function: the mean conditional entropy, to be minimized
       – The number of candidate subsets is exponential in n (2^n)
     • Search space
       – Complete Boolean lattice of order n
       – Each node represents a candidate subset A
       – The cost function is estimated at each node
       – Goal: find the node with the minimum cost
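
To make the size of the search space concrete, the sketch below enumerates every node of the Boolean lattice of order n and picks the minimum of a cost function by brute force. `toy_cost` is a made-up stand-in for the estimated cost, and both function names are hypothetical.

```python
from itertools import combinations

def lattice_nodes(n):
    """Yield every subset of {0, ..., n-1}, smallest subsets first."""
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            yield frozenset(subset)

def exhaustive_minimum(n, cost):
    """Visit all 2**n nodes and return the first one with minimum cost."""
    return min(lattice_nodes(n), key=cost)

def toy_cost(subset):
    # Hypothetical cost: prefers subsets of size 2 that contain feature 1.
    return abs(len(subset) - 2) + (0.0 if 1 in subset else 0.5)

n = 4
print(sum(1 for _ in lattice_nodes(n)))   # number of nodes: 2**4
print(sorted(exhaustive_minimum(n, toy_cost)))
```

Brute force is only feasible for small n, which is the motivation for the heuristic and branch-and-bound searches discussed in the following slides.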

  10. [Figure: Boolean lattice of order 4, with a 4-element chain emphasized]
      • Heuristics: SFS, SFFS
        – Incremental
        – Do not search the whole candidate space
        – May fail to find the best result
          • Example: two features that each make the result worse on their own may improve it considerably together
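
The failure mode of incremental heuristics can be reproduced on a small XOR-style data set. The sketch below assumes a plug-in mean conditional entropy as the cost and a fixed target size k; the data and the function names are invented for illustration. The label is the XOR of features 0 and 1, and feature 2 is a noisy copy of the label.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

def cond_entropy(samples, labels, subset):
    """Plug-in estimate of the mean conditional entropy (bits) given the features in `subset`."""
    idx = sorted(subset)
    n = len(samples)
    by_x = defaultdict(Counter)
    for x, y in zip(samples, labels):
        by_x[tuple(x[i] for i in idx)][y] += 1
    total = 0.0
    for counts in by_x.values():
        n_x = sum(counts.values())
        total -= (n_x / n) * sum((c / n_x) * math.log2(c / n_x) for c in counts.values())
    return total

def sfs(samples, labels, k):
    """Sequential forward selection: greedily add the feature that lowers the cost most."""
    n = len(samples[0])
    chosen = []
    while len(chosen) < k:
        rest = [i for i in range(n) if i not in chosen]
        chosen.append(min(rest, key=lambda i: cond_entropy(samples, labels, chosen + [i])))
    return tuple(sorted(chosen))

def best_of_size(samples, labels, k):
    """Exhaustive search over all subsets of size k."""
    n = len(samples[0])
    return min(combinations(range(n), k), key=lambda s: cond_entropy(samples, labels, s))

samples = [(0, 0, 0), (0, 0, 0), (0, 1, 1), (0, 1, 0),
           (1, 0, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
labels = [0, 0, 1, 1, 1, 1, 0, 0]   # label = feature 0 XOR feature 1

greedy = sfs(samples, labels, 2)
optimal = best_of_size(samples, labels, 2)
print(greedy, cond_entropy(samples, labels, greedy))    # (1, 2) 0.5
print(optimal, cond_entropy(samples, labels, optimal))  # (0, 1) 0.0
```

Features 0 and 1 are useless individually, so the greedy search commits to feature 2 first and never recovers the pair that determines the label exactly.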

  11. Section: U-curve search algorithm

  12. • U-curve property of Ê[H(Y|X)]
        – For a fixed number of samples
        – For any chain of the search space
        – Ê[H(Y|X)] forms a U-curve
        [Figure: Ê[H(Y|X)] plotted against |A|]
        – Why? The estimate is composed of:
          • Real measure: decreases from H(Y) to the real value E[H(Y|X)]
          • Estimation error: increases as more attributes are added to X

  13. • Features of the algorithm
        – Branch-and-bound: covers the whole search space without having to visit every candidate
        – Stochastic
        – Some definitions:
          • U-cost Boolean lattice
          • Local minimum
          • Exhausted minimum
          • Global minimum

  14. • Search space characterized by:
        – Upper bound list
        – Lower bound list
      • An element is reachable if there is a chain to it from an element of the upper or lower bound list
      • At each step:
        – Select, with some probability, the list to start from
        – Select a random element from this list
        – Build a chain iteratively:
          • Insert into the chain a random reachable element adjacent to the last one
          • Stop when the cost of the last element is greater than the cost of the previous one
      [Figure: example chain through the nodes 1110, 0110, 0100 and 0000, labeled 10, 6, 7 and 9 respectively, with the prune procedure marked]
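
The chain-building step can be sketched in a much simplified form. The code below assumes the chain always starts from the empty set and moves upward, and it leaves out the bound lists, the reachability test and the pruning; `walk_chain_up` and the toy cost are hypothetical names used only for this illustration.

```python
import random

def walk_chain_up(n, cost, rng):
    """Build one chain from the empty set, adding one random feature per step.

    Stops as soon as the cost of the new element exceeds the cost of the
    previous one. Returns (last element before the cost increased, chain visited).
    """
    current = frozenset()
    chain = [current]
    while len(current) < n:
        candidates = [i for i in range(n) if i not in current]
        nxt = current | {rng.choice(candidates)}   # adjacent element: one more feature
        chain.append(nxt)
        if cost(nxt) > cost(current):
            break                                  # the cost went up: stop this chain
        current = nxt
    return current, chain

# Hypothetical U-shaped cost that depends only on the subset size (minimum at size 2).
u_cost = lambda subset: (len(subset) - 2) ** 2

best, chain = walk_chain_up(5, u_cost, random.Random(0))
print(sorted(best), [sorted(s) for s in chain])
```

Under the U-curve assumption, the element returned is a minimum of the visited chain; the full algorithm would then use it to prune the bound lists before starting the next chain.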

  15. • Additional procedures
        – Minimum exhausting
          • Avoids visiting the same candidate more than once
          • Uses a stack
        – Pruning from an element E
          • Upper bound list: remove the elements U that contain E, and insert the elements reachable from U that do not contain E
          • Lower bound list: remove the elements L that are contained in E, and insert the elements reachable from L that are not contained in E

  16. Section: Characterization of biological states

  17. • Joint distribution: $P(X, Y)$
      • Quantized microarray: $\{(x[0], y[0]), (x[1], y[1]), \ldots, (x[m], y[m])\}$
      • Quantized values: $x[t] = (x_1[t], x_2[t], \ldots, x_n[t])^{T}$, with $x_i[t] \in R = \{r_1, r_2, \ldots, r_l\}$
      • Biological states: $y[t] \in K = \{1, \ldots, c\}$
      • U-curve algorithm: choose $A \subset \{1, 2, \ldots, n\}$ and $\psi : R^{|A|} \to K$

  18. Section: Genetic network design

  19. • Dynamical systems
        – State: vector $x$
        – Transition function $\phi$
        – $x[t+1] = \phi(x[t])$
      • Stochastic process
        – Stochastic transition function
        – Next state: realization of a random vector
        – Example: Markov chain $(\pi_{Y|X}, \pi_0)$
      • Time-discrete, finite-size vector, finite domain
      • Random state sequence

      $$\pi_0 = \begin{pmatrix} p_1 \\ p_2 \\ p_3 \\ \vdots \\ p_{|R|^n} \end{pmatrix}, \qquad
      \pi_{Y|X} = \begin{pmatrix}
      p_{1 \mid 1} & p_{2 \mid 1} & p_{3 \mid 1} & \cdots & p_{|R|^n \mid 1} \\
      p_{1 \mid 2} & p_{2 \mid 2} & p_{3 \mid 2} & \cdots & p_{|R|^n \mid 2} \\
      p_{1 \mid 3} & p_{2 \mid 3} & p_{3 \mid 3} & \cdots & p_{|R|^n \mid 3} \\
      \vdots & \vdots & \vdots & \ddots & \vdots \\
      p_{1 \mid |R|^n} & p_{2 \mid |R|^n} & p_{3 \mid |R|^n} & \cdots & p_{|R|^n \mid |R|^n}
      \end{pmatrix}$$
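
A minimal sketch of how a random state sequence arises from a Markov chain given by an initial distribution and a transition matrix. States are encoded as integer indices, and row x of the matrix holds the probabilities of each next state y given x; the function name and the three-state example are assumptions made for illustration.

```python
import random

def simulate(pi_0, pi, steps, rng):
    """Draw x[0] from pi_0, then x[t+1] from row x[t] of the transition matrix pi."""
    states = range(len(pi_0))
    x = rng.choices(states, weights=pi_0)[0]
    path = [x]
    for _ in range(steps):
        x = rng.choices(states, weights=pi[x])[0]
        path.append(x)
    return path

# Degenerate (fully deterministic) chain: 0 -> 1 -> 2 -> 0 -> ...
pi_0 = [1.0, 0.0, 0.0]
pi = [[0.0, 1.0, 0.0],
      [0.0, 0.0, 1.0],
      [1.0, 0.0, 0.0]]
print(simulate(pi_0, pi, 4, random.Random(0)))   # [0, 1, 2, 0, 1]
```

Replacing the zeros and ones with intermediate probabilities turns the fixed cycle into a genuinely random sequence while keeping the same sampling loop.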

  20. • Probabilistic Genetic Networks (PGN): a Markov chain $(\pi_{Y|X}, \pi_0)$ satisfying the following axioms:
        a. $\pi_{Y|X}$ is homogeneous, that is, $p_{y|x}$ does not depend on $t$
        b. $p_{y|x} > 0$ for all $x, y \in R^n$
        c. $\pi_{Y|X}$ is conditionally independent, that is, for all $x, y \in R^n$, $p_{y|x} = \prod_{i=1}^{n} p(y_i \mid x)$
        d. $\pi_{Y|X}$ is almost deterministic, that is, for all $x \in R^n$ and $i \in N = \{1, \ldots, n\}$, there is $r \in R$ such that $p(y_i = r \mid x) \approx 1$
        e. For all $x \in R^n$ and $i \in N$, there is a subspace of dimension $j$, $j \ll n$, such that $p(y_i \mid x) = p(y_i \mid \tilde{x})$, where $\tilde{x}$ is the projection of $x$ on this subspace
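
Axioms b, c and d can be illustrated with a two-gene toy network. In this sketch each gene has its own almost-deterministic conditional distribution, and the joint transition probability is their product. The value of `eps`, the target functions and all names are assumptions made for the example, not part of the presentation.

```python
import math
from itertools import product

R = (-1, 0, 1)   # quantized expression levels

def gene_conditional(target, eps=0.02):
    """Almost-deterministic p(y_i | x): mass 1 - eps on target(x), eps shared by the other levels."""
    def p(y_i, x):
        return 1 - eps if y_i == target(x) else eps / (len(R) - 1)
    return p

def joint_transition(y, x, conditionals):
    """Conditional independence: p(y | x) is the product of the per-gene p(y_i | x)."""
    return math.prod(p(y_i, x) for p, y_i in zip(conditionals, y))

# Hypothetical two-gene network: gene 0 copies gene 1, gene 1 inverts gene 0.
conditionals = [gene_conditional(lambda x: x[1]),
                gene_conditional(lambda x: -x[0])]

x = (1, -1)
total = sum(joint_transition(y, x, conditionals) for y in product(R, repeat=2))
print(total)                                          # close to 1
print(joint_transition((-1, -1), x, conditionals))    # close to 0.98 ** 2
```

Because `eps` is positive, every transition keeps a non-zero probability, while the most likely next state still carries almost all of the mass.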

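  21. • Markov chain

      $$\pi_{Y|X} = \begin{pmatrix}
      p_{1 \mid 1} & p_{2 \mid 1} & p_{3 \mid 1} & \cdots & p_{3^n \mid 1} \\
      p_{1 \mid 2} & p_{2 \mid 2} & p_{3 \mid 2} & \cdots & p_{3^n \mid 2} \\
      p_{1 \mid 3} & p_{2 \mid 3} & p_{3 \mid 3} & \cdots & p_{3^n \mid 3} \\
      \vdots & \vdots & \vdots & \ddots & \vdots \\
      p_{1 \mid 3^n} & p_{2 \mid 3^n} & p_{3 \mid 3^n} & \cdots & p_{3^n \mid 3^n}
      \end{pmatrix}$$

      • Probabilistic Genetic Networks (PGN): $P_{X_1|X}, P_{X_2|X}, \ldots, P_{X_n|X}$, each almost deterministic

      $$P_{X_i|X} = \begin{pmatrix}
      p_{r_1 \mid 1} & p_{r_2 \mid 1} & p_{r_3 \mid 1} & \cdots & p_{r_l \mid 1} \\
      p_{r_1 \mid 2} & p_{r_2 \mid 2} & p_{r_3 \mid 2} & \cdots & p_{r_l \mid 2} \\
      p_{r_1 \mid 3} & p_{r_2 \mid 3} & p_{r_3 \mid 3} & \cdots & p_{r_l \mid 3} \\
      \vdots & \vdots & \vdots & \ddots & \vdots \\
      p_{r_1 \mid |R|^n} & p_{r_2 \mid |R|^n} & p_{r_3 \mid |R|^n} & \cdots & p_{r_l \mid |R|^n}
      \end{pmatrix}$$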

  22. Time-course gene expression data
      [Figure: expression-versus-time plots for Gene 1, Gene 2, Gene 3, ..., Gene n; expression measurement techniques yield the samples x[1], x[2], ..., x[m]]

  23. • Joint distributions: $P(X, Y_j)$, $j = 1, \ldots, n$
      • Quantized microarray at $t$: $\{(x[0], y_j[0]), (x[1], y_j[1]), \ldots, (x[m], y_j[m])\}$
      • Quantized values: $x[t] = (x_1[t], x_2[t], \ldots, x_n[t])^{T}$, with $x_i[t] \in R = \{r_1, r_2, \ldots, r_l\}$
      • Quantized expression of gene $j$ at $t+1$: $y_j[t] = x_j[t+1]$
      • U-curve algorithm: choose $A_j \subset \{1, 2, \ldots, n\}$ and $\psi_j : R^{|A_j|} \to K$
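
The relation $y_j[t] = x_j[t+1]$ turns one time course into a separate training set for each target gene. A minimal sketch, with a hypothetical function name and a made-up quantized series:

```python
def training_pairs(series, j):
    """Pair each state x[t] with the next-step value of gene j, that is y_j[t] = x_j[t+1]."""
    return [(series[t], series[t + 1][j]) for t in range(len(series) - 1)]

# Made-up quantized time course for two genes, x[0] .. x[3].
series = [(0, 1), (1, 1), (1, 0), (0, 0)]
print(training_pairs(series, 1))   # [((0, 1), 1), ((1, 1), 0), ((1, 0), 0)]
```

Each of the n resulting sample sets can then be passed to the feature selection step to choose the predictor subset $A_j$ for gene j.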
