In-Place (Bijective) BWT Transforms Dominik Köppl Kyushu University Daiki Hashimoto Tohoku University Diptarama Ayumi Shinohara

data structures
● Burrows-Wheeler Transform (BWT) [Burrows, Wheeler '94]
● Bijective BWT (BBWT) [Gil, Scott '12]

BWT of bacabbabb
T = bacabbabb$
● list all suffixes, each with its preceding character (the whole text is preceded by $)
● sort the suffixes lexicographically ($ < a < b < c)
● BWT: the preceding characters, read in sorted order

prev. char   suffix (lex. order)
b            $
b            abb$
c            abbabb$
b            acabbabb$
b            b$
b            babb$
$            bacabbabb$
a            bb$
a            bbabb$
a            cabbabb$
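The construction on this slide can be replayed with a few lines of Python. This is only a naive sketch for checking small examples: the function name `bwt_naive` is made up here, and sorting explicit suffixes is neither in-place nor efficient.

```python
def bwt_naive(text: str, sentinel: str = "$") -> str:
    """BWT of text + sentinel by explicit suffix sorting (illustration only)."""
    t = text + sentinel
    # Sort the suffix start positions by the suffixes they start.
    # In ASCII, "$" is smaller than every lowercase letter.
    order = sorted(range(len(t)), key=lambda i: t[i:])
    # For each suffix, take the preceding character; t[-1] wraps around
    # to the sentinel for the suffix starting at position 0.
    return "".join(t[i - 1] for i in order)


print(bwt_naive("bacabbabb"))  # bbcbbb$aaa
```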

the BBWT is the BWT of
1. the Lyndon factorization of an input text
2. with respect to ≺ω

Lyndon words
● a Lyndon word is smaller than
  – any proper suffix
  – any rotation
● Lyndon words: a, aabab
● not Lyndon words:
  – abaab (rotation aabab is smaller)
  – abab (abab is not smaller than its suffix ab)

Lyndon factorization [Chen+ '58]
● input: text T
● output: factorization T = T_1 T_2 ⋯ T_t with
  – each T_x is a Lyndon word
  – T_x ≥_lex T_{x+1}
● the factorization is uniquely defined (Chen-Fox-Lyndon theorem)
● computable in linear time [Duval '88]

example
T = bacabbabb
Lyndon factorization: b|ac|abb|abb
– b, ac, abb, and abb are Lyndon words
– b >_lex ac >_lex abb ≥_lex abb
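As a sketch of how the factorization in this example can be obtained, here is the textbook form of Duval's algorithm. The function name `lyndon_factorization` is chosen for illustration and is not taken from the slides.

```python
def lyndon_factorization(text: str) -> list[str]:
    """Split text into non-increasing Lyndon words (Duval's algorithm)."""
    factors = []
    n = len(text)
    i = 0
    while i < n:
        # text[i:j] is a power of a Lyndon word of length j - k,
        # possibly followed by a proper prefix of that word.
        j, k = i + 1, i
        while j < n and text[k] <= text[j]:
            if text[k] < text[j]:
                k = i        # the whole run is one longer Lyndon word
            else:
                k += 1       # the run stays periodic
            j += 1
        # Emit every complete repetition of the Lyndon word found.
        while i <= k:
            factors.append(text[i:i + j - k])
            i += j - k
    return factors


print(lyndon_factorization("bacabbabb"))  # ['b', 'ac', 'abb', 'abb']
```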

≺ω order
● u ≺ω w :⟺ uuuu⋯ <_lex wwww⋯
● ab <_lex aba
● but aba ≺ω ab, since abaabaaba⋯ <_lex abababab⋯
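The ≺ω comparison can be sketched without building infinite strings. The helper name `omega_less` is hypothetical, and both arguments are assumed to be non-empty. Inspecting the first |u| + |w| characters is enough, because two infinite powers that agree that far are equal (Fine and Wilf's periodicity lemma).

```python
def omega_less(u: str, w: str) -> bool:
    """Return True if u ≺ω w, i.e. uuu... is lexicographically smaller than www..."""
    for i in range(len(u) + len(w)):
        a = u[i % len(u)]   # i-th character of uuu...
        b = w[i % len(w)]   # i-th character of www...
        if a != b:
            return a < b
    return False            # the two infinite powers are equal


print("ab" < "aba")             # True: ab is a prefix of aba
print(omega_less("aba", "ab"))  # True: abaaba... < ababab...
```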

BBWT of bacabbabb
Lyndon factorization: b|ac|abb|abb
● collect all rotations of every Lyndon factor
● sort them with respect to ≺ω
● BBWT: the last characters, read in sorted order

rotation (≺ω order)   last char
abb                   b
abb                   b
ac                    c
bab                   b
bab                   b
bba                   a
bba                   a
b                     b
ca                    a

BBWT(T) = bbcbbaaba
BWT(T$) = bbcbbb$aaa
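The definition on this slide translates directly into a naive reference implementation, useful for checking small inputs. All names below are illustrative, and the sketch uses linear extra space for the rotation list, so it is not the in-place method of the talk.

```python
from functools import cmp_to_key


def lyndon_factorization(text: str) -> list[str]:
    """Duval's algorithm: split text into non-increasing Lyndon words."""
    factors, i, n = [], 0, len(text)
    while i < n:
        j, k = i + 1, i
        while j < n and text[k] <= text[j]:
            k = i if text[k] < text[j] else k + 1
            j += 1
        while i <= k:
            factors.append(text[i:i + j - k])
            i += j - k
    return factors


def omega_cmp(u: str, w: str) -> int:
    """Three-way comparison of the infinite powers uuu... and www..."""
    for i in range(len(u) + len(w)):
        a, b = u[i % len(u)], w[i % len(w)]
        if a != b:
            return -1 if a < b else 1
    return 0


def bbwt_naive(text: str) -> str:
    """BBWT by sorting all rotations of all Lyndon factors with respect to ≺ω."""
    rotations = []
    for f in lyndon_factorization(text):
        rotations.extend(f[i:] + f[:i] for i in range(len(f)))
    rotations.sort(key=cmp_to_key(omega_cmp))
    # The BBWT consists of the last character of each sorted rotation.
    return "".join(r[-1] for r in rotations)


print(bbwt_naive("bacabbabb"))  # bbcbbaaba
```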

motivation
properties of the BBWT:
● no $ necessary
● more compressible than the BWT for various inputs [Gil, Scott '12]
● indexable (full-text index)
● computable in O(n) time with O(n) words [Bannai+ '19]
however, O(n) words can be too much for large n

in-place computation
● Σ: alphabet, σ := |Σ| alphabet size
● T: text, n := |T|
● L: workspace of n lg σ bits, initially storing T (e.g. L := bacabbabb)
● aim: in-place transforms T ↔ BWT ↔ BBWT with |L| + O(lg n) bits of workspace

known solutions

workspace        input   output   time                  reference
in-place         text    BWT      O(n^2)                Crochemore+ '15
in-place         BWT     text     O(n^{2+ε})            Crochemore+ '15
O(n lg σ) bits   text    BBWT     O(n lg n / lg lg n)   Bonomo+ '14

σ: alphabet size, n: text length, ε: a constant with 0 < ε < 1

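in-place conversions
(diagram: conversions between text, BWT, and BBWT, labeled with running times)
● text ↔ BWT (known): O(n^2) and O(n^{2+ε})
● text ↔ BBWT: O(n^2) and O(n^{2+ε})
● BWT ↔ BBWT: O(n^{2+ε})
working space: n lg σ + O(lg n) bits (including text)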

forward search
T = bacabbabb$
F: first column (sorted characters), L: last column (BWT); each character is numbered by its rank among the equal characters of its column

F.rank_{F[i]}(i)   F   L   L.rank_{L[i]}(i)
1                  $   b   1
1                  a   b   2
2                  a   c   1
3                  a   b   3
1                  b   b   4
2                  b   b   5
3                  b   $   1
4                  b   a   1
5                  b   a   2
1                  c   a   3

● can be calculated with rank and select on F and L
● FL mapping: FL(i) = L.select_{F[i]}( F.rank_{F[i]}(i) )

backward search
T = bacabbabb$, with the same columns F and L as in the forward search
● LF mapping:
  LF(i) := F.select_{L[i]}( L.rank_{L[i]}(i) )
         = F.select_{L[i]}(1) + L.rank_{L[i]}(i) − 1
         = |{ j : L[j] < L[i] }| + L.rank_{L[i]}(i)
● FM index [Ferragina, Manzini '00]

LF: time complexity
if we store BWT(T) in L:
– L[i] = BWT[i]: O(1) time
– for any c: L.rank_c(i) in O(n) time
– LF(i) = |{ j : L[j] < L[i] }| + L.rank_{L[i]}(i): each of the two terms takes O(n) time
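A minimal sketch of the scan-based LF mapping: both terms are obtained by counting over L, so one call takes O(n) time and keeps only a constant number of counters. Positions are 1-based as on the slides. The names `lf` and `invert_bwt` are illustrative, and the inversion shown builds an output list, so it only demonstrates the mapping and is not the in-place inversion.

```python
def lf(L: str, i: int) -> int:
    """LF(i) = |{j : L[j] < L[i]}| + L.rank_{L[i]}(i), with 1-based i."""
    c = L[i - 1]
    smaller = sum(1 for x in L if x < c)           # first term, one scan
    rank = sum(1 for j in range(i) if L[j] == c)   # second term, one scan
    return smaller + rank


def invert_bwt(L: str, sentinel: str = "$") -> str:
    """Recover the text (without sentinel) by walking LF from row 1."""
    out = []
    i = 1                      # row 1 is the suffix consisting of the sentinel
    while L[i - 1] != sentinel:
        out.append(L[i - 1])   # characters appear from last to first
        i = lf(L, i)
    return "".join(reversed(out))


print(lf("bbcbbb$aaa", 1))       # 5
print(invert_bwt("bbcbbb$aaa"))  # bacabbabb
```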

FL: time complexity
● FL(i) = L.select_{F[i]}( F.rank_{F[i]}(i) ) = L.select_{F[i]}( i − |{ j : L[j] < F[i] }| )
● if we know F[i]: FL(i) in O(n) time
● however, the fastest in-place computation of F[i] takes O(n^{1+ε}) time [Munro, Raman '96] for any constant ε with 0 < ε < 1
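The FL formula can be sketched as follows, again with 1-based positions. Note the assumption: `f_at` below finds F[i] by plain counting in O(n^2) time, which is a deliberately simple stand-in and not the O(n^{1+ε}) selection of Munro and Raman. Both function names are hypothetical.

```python
def f_at(L: str, i: int) -> str:
    """F[i]: the i-th smallest character of L (1-based), found by counting only."""
    for c in L:
        smaller = sum(1 for x in L if x < c)
        not_larger = sum(1 for x in L if x <= c)
        if smaller < i <= not_larger:
            return c
    raise IndexError("i must be between 1 and len(L)")


def fl(L: str, i: int) -> int:
    """FL(i) = L.select_{F[i]}( i - |{j : L[j] < F[i]}| ), with 1-based i."""
    c = f_at(L, i)
    k = i - sum(1 for x in L if x < c)   # F.rank_{F[i]}(i)
    # L.select_c(k): position of the k-th occurrence of c in L.
    for j, x in enumerate(L, start=1):
        if x == c:
            k -= 1
            if k == 0:
                return j
    raise ValueError("unreachable: F is a permutation of L")


print(fl("bbcbbb$aaa", 7))  # 4
```

With L = bbcbbb$aaa, row 7 is the text itself (F[7] = b), and FL(7) = 4 is the row of the suffix acabbabb$, which starts one position further to the right.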

road map
(diagram: conversions between text, BWT, and BBWT)
1. text ↔ BBWT: O(n^2) and O(n^{2+ε})
2. BWT ↔ BBWT: O(n^{2+ε})
working space: n lg σ + O(lg n) bits (including text)

text → BBWT [Bonomo+ '14]
for each Lyndon factor T_x with x = 1 up to t:
    prepend T_x[|T_x|] to BBWT
    p ← 1 (insert position in BBWT)
    for each i = |T_x| − 1 down to 1:
        p ← LF(p) + 1
        insert T_x[i] at BBWT[p]
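The pseudocode above can be traced with the following sketch. It uses 0-based positions, so "p ← LF(p) + 1" becomes an insert at index `lf0(bbwt, p) + 1`. A Python list stands in for the BBWT being built, so this shows the insertion order only and makes no claim about the bit-level space bound. All function names are illustrative.

```python
def lyndon_factorization(text: str) -> list[str]:
    """Duval's algorithm: split text into non-increasing Lyndon words."""
    factors, i, n = [], 0, len(text)
    while i < n:
        j, k = i + 1, i
        while j < n and text[k] <= text[j]:
            k = i if text[k] < text[j] else k + 1
            j += 1
        while i <= k:
            factors.append(text[i:i + j - k])
            i += j - k
    return factors


def lf0(bbwt: list[str], p: int) -> int:
    """0-based LF on the current (partial) BBWT, computed by scanning."""
    c = bbwt[p]
    smaller = sum(1 for x in bbwt if x < c)
    rank = sum(1 for j in range(p + 1) if bbwt[j] == c)
    return smaller + rank - 1


def bbwt_by_insertion(text: str) -> str:
    """Build the BBWT factor by factor, inserting characters right to left."""
    bbwt: list[str] = []
    for factor in lyndon_factorization(text):
        bbwt.insert(0, factor[-1])        # prepend the last character T_x[|T_x|]
        p = 0                             # insert position in BBWT
        for ch in reversed(factor[:-1]):  # i = |T_x| - 1 down to 1
            p = lf0(bbwt, p) + 1
            bbwt.insert(p, ch)
    return "".join(bbwt)


print(bbwt_by_insertion("bacabbabb"))  # bbcbbaaba
```

For the running example the intermediate strings are b, then cb and cba for the factor ac, then bcba, bcbba and bcbaba for the first abb, and finally bbcbbaaba after the second abb.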

text → BBWT
T = bacabbabb
● Lyndon factorization: b|ac|abb|abb
● first: insert b, giving F = b and L = b
● how to calculate the final BBWT(T)?

F.rank   F   L   L.rank
1        a   b   1
2        a   b   2
3        a   c   1
1        b   b   3
2        b   b   4
3        b   a   1
4        b   a   2
5        b   b   5
1        c   a   3

BBWT(T_1 T_2)
T = b|ac|abb|abb = T_1 T_2 T_3 T_4
● next Lyndon factor: ac
● start: F = b, L = b
● prepend c: F = bc, L = cb
● insert a: F = abc, L = cba

BBWT(T_1 T_2 T_3)
T = b|ac|abb|abb
● next Lyndon factor: abb
● start: F = abc, L = cba
● prepend b: F = abbc, L = bcba
