Symbolic Aggregate ApproXimation (SAX) under Interval Uncertainty

Symbolic Aggregate ApproXimation (SAX) under Interval Uncertainty
Chrysostomos D. Stylios (1) and Vladik Kreinovich (2)
(1) Laboratory of Knowledge and Intelligent Computing, Department of Computer Engineering, Technological Educational Institute of Epirus, 47100 Kostakioi, Arta, Greece, stylios@teiep.gr
(2) Department of Computer Science, University of Texas at El Paso, 500 W. University, El Paso, Texas 79968, USA, vladik@utep.edu

1. Formulation of the Problem
• Need for diagnostics: often, we are monitoring a certain process for possible problems; e.g.:
– we check whether the observed vibrations of a mechanical system indicate an abnormality;
– we check the vital signs of a patient to see if an urgent medical intervention is needed.
• Sometimes, we have an algorithm that, based on the observations, decides whether intervention is needed.
• However, in most practical applications, especially in medicine, no such algorithm is readily available.
• What we have instead is numerous past data series corresponding both:
– to cases when the situation turned out to be normal,
– and to cases with abnormality.

2. Formulation of the Problem (cont-d)
• We have numerous past data series corresponding both:
– to cases when the situation turned out to be normal,
– and to cases with abnormality.
• We thus need to extract such an algorithm from all these examples, i.e., use machine learning.
• Most machine learning algorithms work well if we have up to dozens of inputs.
• However, as a result of monitoring, we get values $x(t)$ corresponding to hundreds of moments of time $t$.
• So, to efficiently apply machine learning algorithms, we first need to compress the input data.

3. Symbolic Aggregate approXimation (SAX): Main Idea
• The main objective of monitoring is to catch deviations from the normal regimes as early as possible.
• As a result, monitoring is performed at a high rate, to catch a deviation while this deviation is small.
• Thus, when the monitoring is arranged properly, values change very little from one moment to the next.
• So, we can safely replace the original function $x(t)$ with a piece-wise constant approximation.
• On each interval, we store only its endpoints and the value of the function on this interval.
• This representation indeed leads to a drastic reduction in data size.
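The piece-wise constant step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `piecewise_constant`, the choice of equal-length pieces, and the use of the mean as the stored value are all assumptions made for the example.

```python
from statistics import fmean


def piecewise_constant(values, n_segments):
    """Approximate a series by n_segments constant pieces.

    Returns (start, end, level) triples: start/end are index endpoints
    (end exclusive) and level is the mean of the series on that piece.
    Equal-length pieces and the mean are assumptions of this sketch.
    """
    n = len(values)
    if not 1 <= n_segments <= n:
        raise ValueError("need 1 <= n_segments <= len(values)")
    pieces = []
    for s in range(n_segments):
        start = s * n // n_segments
        end = (s + 1) * n // n_segments
        pieces.append((start, end, fmean(values[start:end])))
    return pieces


# Hypothetical slowly varying series: 8 samples stored as 2 triples.
series = [1.0, 1.1, 0.9, 1.0, 2.0, 2.1, 1.9, 2.0]
pieces = piecewise_constant(series, 2)
```

Only the endpoints and one level per piece are kept, which is where the reduction in data size comes from.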

4. Symbolic Aggregate approXimation (cont-d)
• A further compression is possible since:
– a computer-represented real number requires dozens of bits to store, corresponding to ten decimal digits,
– but measurement accuracy is usually 1–10%, so two decimal digits are enough.
• Symbolic Aggregate approXimation (SAX) is a technique for such a reduction.
• In the interval $[\underline{x}, \overline{x}]$ of possible values of $x(t)$, we select thresholds $x_0 = \underline{x}, x_1, x_2, \ldots, x_m$.
• Then, for each moment of time $t$, instead of storing $x(t)$, we store the index $i$ for which $x(t) \in [x_i, x_{i+1}]$.
• At present, SAX is the most efficient data compression technique.
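Replacing each value by the index of its interval can be sketched as below. The function name `to_symbols`, the sample thresholds, and the clipping of out-of-range values are assumptions of this example; the slides do not specify how values outside the range are handled.

```python
from bisect import bisect_right


def to_symbols(values, thresholds):
    """Map each value to the index i of the interval starting at x_i.

    thresholds is the sorted list x_0 <= x_1 <= ... <= x_m, with x_0 the
    lower endpoint of the range. Values below x_0 are clipped to index 0;
    values at or above x_m get index m (clipping is an assumption here).
    """
    last = len(thresholds) - 1
    return [min(max(bisect_right(thresholds, v) - 1, 0), last) for v in values]


# Hypothetical thresholds x_0..x_3 and a few measured values.
thresholds = [0.0, 1.0, 2.0, 3.0]
symbols = to_symbols([0.2, 1.5, 2.9, 3.7, 1.0], thresholds)
```

Each stored symbol needs only about $\log_2(m+1)$ bits instead of a full floating-point number.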

5. SAX: Details and Successes
• To maximize the amount of information after compression, SAX takes into account that:
– the maximum amount of Shannon's information $-\sum_{i=0}^{m} p_i \cdot \log_2(p_i)$, where $p_i = \mathrm{Prob}(x(t) \in [x_i, x_{i+1}])$,
– is attained when all the probabilities $p_i$ are equal to each other and are, thus, equal to $p_i = \frac{1}{m+1}$.
• Thus, SAX selects the thresholds $x_i$ for which $p_i = \mathrm{Prob}(x(t) \in [x_i, x_{i+1}]) = \frac{1}{m+1}$.
• SAX techniques led to many practical applications ranging from engineering to medicine.
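The equal-probability rule can be illustrated with empirical quantiles. This is a sketch under assumptions: the helper names `equiprobable_thresholds` and `entropy_bits` are invented for the example, and estimating the probabilities $p_i$ from sample frequencies is one possible choice, not something the slides prescribe.

```python
from bisect import bisect_right
from collections import Counter
from math import log2


def equiprobable_thresholds(sample, m):
    """Return x_0..x_m so that the m+1 intervals hold about equally many points."""
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[i * n // (m + 1)] for i in range(m + 1)]


def entropy_bits(sample, thresholds):
    """Shannon entropy -sum p_i log2(p_i) of the interval indices, in bits."""
    counts = Counter(max(bisect_right(thresholds, v) - 1, 0) for v in sample)
    n = len(sample)
    return -sum(c / n * log2(c / n) for c in counts.values())


# Hypothetical sample: 100 evenly spread values, m = 3 (four intervals).
sample = [float(v) for v in range(100)]
thresholds = equiprobable_thresholds(sample, m=3)
bits = entropy_bits(sample, thresholds)
```

With four equally populated intervals the entropy reaches its maximum of $\log_2 4 = 2$ bits per symbol.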

6. SAX: Problem
• Measurement errors were a motivation for SAX techniques.
• However, SAX does not take measurement errors into account.
• So, we often get thresholds $x_i$ and $x_{i+1}$ which are much closer to each other than the measurement accuracy.
• Sometimes, $x_i$ and $x_{i+1}$ differ by 5% while the measurement accuracy is 10%.
• In this case, we cannot tell whether the actual value $x(t)$ was in the $i$-th interval or in the next interval.
• It is therefore desirable to explicitly take measurement uncertainty into account in SAX techniques.
• This is what we do in this paper.
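The problem described above can be detected with a simple check. This sketch assumes a relative accuracy measured against the lower threshold of each pair; the function name `indistinguishable_pairs` and that error model are assumptions for illustration only.

```python
def indistinguishable_pairs(thresholds, rel_accuracy):
    """Return indices i where x_{i+1} - x_i is below the measurement error.

    The error bound is taken as rel_accuracy * |x_i|, an assumed model of
    relative measurement accuracy.
    """
    return [
        i
        for i, (lo, hi) in enumerate(zip(thresholds, thresholds[1:]))
        if hi - lo < rel_accuracy * abs(lo)
    ]


# Hypothetical thresholds: the first two differ by 5%, accuracy is 10%.
too_close = indistinguishable_pairs([1.0, 1.05, 1.5, 2.0], rel_accuracy=0.10)
```

Any index returned marks a pair of thresholds that the measurements cannot reliably separate.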

7. Case When Measurement Inaccuracy Can Be Ignored (Reminder)
• Based on the observed values $x(t)$, we can find the probabilities with which different values of $x$ occur.
• These probabilities can be naturally described by a probability density function $\rho(x)$, with $\int \rho(x)\,dx = 1$.
• In many practical situations, the observed signal is a joint effect of many different independent processes.
• In such situations, the Central Limit Theorem implies that the resulting distribution is Gaussian.
• We want to select the thresholds $x_1, x_2, \ldots$
• We can describe, for every value $x$, the number $\rho_t(x)$ of thresholds per unit length; the total is $\int \rho_t(x)\,dx = m$.
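For a Gaussian density, thresholds can be computed from the inverse cumulative distribution function. The sketch below combines the Gaussian assumption of this slide with the equal-probability rule of slide 5; the function name `gaussian_thresholds` and the example parameters are assumptions, and it is not the criterion developed later in the talk.

```python
from statistics import NormalDist


def gaussian_thresholds(mean, stdev, m):
    """Thresholds x_1..x_m splitting N(mean, stdev^2) into m+1 equiprobable parts."""
    dist = NormalDist(mean, stdev)
    return [dist.inv_cdf(i / (m + 1)) for i in range(1, m + 1)]


# Standard normal signal, m = 3 thresholds (four intervals of probability 1/4).
cuts = gaussian_thresholds(0.0, 1.0, m=3)
```

The thresholds come out symmetric around the mean and are denser where $\rho(x)$ is larger, which is what a threshold density $\rho_t(x)$ proportional to $\rho(x)$ would give.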

8. Case of No Measurement Inaccuracy (cont-d)
• After the data compression, the only information that we have about each value $x(t)$ is the index $i$.
• So, to reconstruct the value $x(t)$ based on this information, we select the midpoint $\widetilde{x}(t)$ of the $i$-th subinterval.
• This reconstruction is approximate: there is an approximation error $\varepsilon(t) \stackrel{\text{def}}{=} \widetilde{x}(t) - x(t) \ne 0$.
• Ideally, we would like to have all these errors as close to 0 as possible.
• The vector $\varepsilon = (\varepsilon(t_1), \varepsilon(t_2), \ldots)$ of these errors should be close to the zero vector $\vec{0} = (0, 0, \ldots)$: $d(\varepsilon, \vec{0}) = \sqrt{\sum_k (\varepsilon(t_k))^2} \to \min$.
• In the continuous approximation, this is equivalent to minimizing $\int (\varepsilon(t))^2\,dt$.
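Midpoint reconstruction and the resulting Euclidean distance can be sketched as follows. The names `reconstruct` and `l2_distance`, the sample edges (which here include the upper endpoint of the range), and the clipping of out-of-range values are assumptions of the example.

```python
from bisect import bisect_right
from math import sqrt


def reconstruct(values, edges):
    """Replace each value by the midpoint of the subinterval containing it.

    edges is the sorted list of subinterval boundaries, including the upper
    endpoint of the range (an assumption of this sketch).
    """
    last = len(edges) - 2
    out = []
    for v in values:
        i = min(max(bisect_right(edges, v) - 1, 0), last)
        out.append((edges[i] + edges[i + 1]) / 2)
    return out


def l2_distance(values, approx):
    """Euclidean distance between the error vector and the zero vector."""
    return sqrt(sum((a - v) ** 2 for v, a in zip(values, approx)))


# Hypothetical two-interval example.
edges = [0.0, 1.0, 2.0]
values = [0.25, 1.75, 1.0]
approx = reconstruct(values, edges)
dist = l2_distance(values, approx)
```

Moving the edges changes the midpoints and hence the distance, which is the quantity the least-squares criterion minimizes.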

9. Alternative Ideas
• The least-squares approach is vulnerable to outliers.
• The second idea is to avoid this sensitivity by using $\ell_p$-estimates: $\int |\varepsilon(t)|^p\,dt \to \min$.
• The third idea is to explicitly minimize the number of bits needed to describe all the thresholds.
• If $x_{i+1} - x_i \approx 2^{-b}$, then it is sufficient to describe the first $b$ binary digits of the corresponding interval.
• Thus, the number of bits needed to store each threshold is approximately equal to $b \approx -\log_2(x_{i+1} - x_i)$.
• So, we minimize the average number of bits, i.e., the sum $-\sum \log_2(x_{i+1} - x_i)$ or the corresponding integral.
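The two alternative objectives can be evaluated on discrete data as in the sketch below. The function names `lp_error` and `threshold_bits` are hypothetical, the integral is replaced by a plain sum over samples, and the thresholds are assumed to be scaled to a unit range so that a gap of $2^{-b}$ corresponds to $b$ bits.

```python
from math import log2


def lp_error(errors, p):
    """Discrete analogue of the integral of |eps(t)|^p over the samples."""
    return sum(abs(e) ** p for e in errors)


def threshold_bits(thresholds):
    """Approximate bit count: minus the sum of log2 of consecutive gaps.

    Assumes thresholds are scaled to a unit range (an assumption here).
    """
    return -sum(log2(hi - lo) for lo, hi in zip(thresholds, thresholds[1:]))


# Hypothetical approximation errors and thresholds on [0, 1].
errors = [0.25, -0.25, 0.5]
l1 = lp_error(errors, 1)
l2 = lp_error(errors, 2)
bits = threshold_bits([0.0, 0.25, 0.5, 1.0])
```

Compared with $p = 2$, the $p = 1$ objective gives the single large error less relative weight, which is the sense in which $\ell_p$-estimates with smaller $p$ are less sensitive to outliers.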
