Data Processing in the Presence of Interval Uncertainty and Erroneous Measurements: Practical Problems, Results, Challenges

M. Ceberio, O. Kosheleva, V. Kreinovich, G. R. Keller, R. Araiza, M. Averill, and G. Xiang
University of Texas at El Paso, 500 W. University, El Paso, TX 79968, USA, olgak@utep.edu

1. Formulation of the Problem

• There are two main reasons why measurement results differ from the actual values of the measured quantities:
  – There is a small difference caused by the inaccuracy of the measuring instrument. This inaccuracy is characterized by probabilistic or interval uncertainty.
  – Sometimes, due to an instrument malfunction or a human error, we get an erroneous measurement result (outlier) that is drastically different from the actual value. This uncertainty is usually characterized by the proportion of measurement results that could be erroneous (e.g., ≤ 1%).
• Situation: most data processing algorithms based on interval computations only take into account the first type of uncertainty.
• Problem: take the presence of erroneous measurements into account as well.

2. Sometimes, It Is Relatively Easy to Detect Outliers

• In some cases, when the data is smooth, we can (rather easily) detect the corresponding outliers.
• Traditional engineering approach: a new measurement result $x$ is classified as an outlier if $x \notin [L, U]$, where
  $L \stackrel{\text{def}}{=} E - k_0 \cdot \sigma$, $U \stackrel{\text{def}}{=} E + k_0 \cdot \sigma$,
  and $k_0 > 1$ is pre-selected (most frequently, $k_0 = 2$, 3, or 6).
• Minor problem: in some practical situations, we only have intervals $\mathbf{x}_i = [\underline{x}_i, \overline{x}_i]$.
• For different values $x_i \in \mathbf{x}_i$, we get different $k_0$-sigma intervals $[L, U]$.
• A value $x$ is a guaranteed outlier if $x \notin [\underline{L}, \overline{U}]$, i.e., if $x$ is outside every possible $k_0$-sigma interval.
• Conclusion: to detect outliers, we must know the ranges of $L = E - k_0 \cdot \sigma$ and $U = E + k_0 \cdot \sigma$.
• Good news: there exist algorithms for computing these ranges.
• Not so good news: in many practical situations, e.g., in non-destructive testing (NDT) of aeroplanes and roads, and in geophysical analysis, we are actually interested in unusual non-smooth data points.
• Problem: separating correct but unusual measurement results from the erroneous measurement results is a challenge.
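The traditional $k_0$-sigma test can be sketched as follows for the simplest case of exact (non-interval) data. This is only an illustration: the function name `is_outlier`, the sample values, and the use of the population standard deviation for $\sigma$ are assumptions, not part of the slides. The interval case is not covered here, since it needs the range-computation algorithms mentioned above.

```python
import statistics

def is_outlier(x, data, k0=3.0):
    """Return True if x lies outside [E - k0*sigma, E + k0*sigma].

    E is the mean of `data`; sigma is taken to be the population
    standard deviation (an assumption made for this sketch).
    """
    E = statistics.mean(data)
    sigma = statistics.pstdev(data)
    L = E - k0 * sigma
    U = E + k0 * sigma
    return not (L <= x <= U)

# Hypothetical measurement results clustered around 10.
data = [10.1, 9.9, 10.0, 10.2, 9.8]
print(is_outlier(10.5, data))  # well outside the 3-sigma interval
print(is_outlier(10.3, data))  # inside the 3-sigma interval
```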

3. Presence of Erroneous Measurements Makes Problems Computationally Difficult

• Known fact: the presence of outliers turns easy-to-solve interval problems into difficult-to-solve (NP-hard) ones.
• New result: this difficulty may appear even without interval uncertainty.
• Situation: we know how the measured quantity $y$ is related to the desired parameters $x_j$.
• Simplest case: linear dependence, i.e., $\sum_{j=1}^{n} a_{ij} \cdot x_j = y_i$, where $y_i$ is the result of the $i$-th measurement, and $a_{ij}$ are (known) parameters corresponding to the $i$-th measurement.
• Problem: given $a_{ij}$, $y_i$, and $\varepsilon \in (0, 1)$, and constraints
  $\sum_{j=1}^{n} a_{ij} \cdot x_j = y_i$, $i = 1, \ldots, N$,
  check whether we can select a consistent set of $N \cdot (1 - \varepsilon)$ constraints.

4. Result: The Problem Is NP-hard Even for the Linear Case

• Idea: reduce a known NP-hard problem to our problem.
• Subset sum: given positive integers $s_1, \ldots, s_n$, and $s$, check whether $s = \sum_{i=1}^{n} x_i \cdot s_i$ for some $x_i \in \{0, 1\}$.
• Reduction: $N = n/\varepsilon$ constraints:
  – $2n$ constraints $x_1 = 0$, $x_1 = 1$, ..., $x_n = 0$, $x_n = 1$;
  – $N - 2n$ identical constraints $\sum s_i \cdot x_i = s$.
• Since $0 \ne 1$, at most one constraint from each pair $x_i = 0$, $x_i = 1$ can hold, so at most $N - n$ constraints are satisfied.
• If the subset sum problem has a solution, then:
  – all $N - 2n$ constraints $\sum s_i \cdot x_i = s$ are satisfied,
  – and for each $i$, $x_i = 0$ or $x_i = 1$,
  for a total of $N - n = N \cdot (1 - \varepsilon)$ satisfied constraints.
• Conversely, if $N - n$ constraints are satisfied, then all $N - 2n$ sum constraints hold and, for every $i$, $x_i \in \{0, 1\}$, which is a solution to the subset sum problem.
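The reduction can be checked on a small instance. The sketch below builds the $N = n/\varepsilon$ constraints for a hypothetical subset sum instance and counts how many of them a candidate vector satisfies. The function names and the sample numbers are illustrative assumptions; $\varepsilon$ is passed as an exact fraction so that $N$ is an integer.

```python
from fractions import Fraction

def build_constraints(sizes, target, eps):
    """Build the N = n/eps linear constraints of the reduction.

    Each constraint is a pair (coeffs, rhs) meaning sum(coeffs[j]*x[j]) == rhs.
    """
    n = len(sizes)
    total = Fraction(n) / Fraction(eps)
    if total.denominator != 1:
        raise ValueError("n/eps must be an integer")
    N = int(total)
    if N <= 2 * n:
        raise ValueError("need N > 2n, i.e. eps < 1/2")
    constraints = []
    for i in range(n):
        unit = [0] * n
        unit[i] = 1
        constraints.append((unit, 0))  # x_i = 0
        constraints.append((unit, 1))  # x_i = 1
    # N - 2n identical copies of sum(s_i * x_i) = s
    constraints += [(list(sizes), target)] * (N - 2 * n)
    return constraints

def count_satisfied(constraints, x):
    """Count the constraints that hold exactly for the vector x."""
    return sum(
        1 for coeffs, rhs in constraints
        if sum(a * v for a, v in zip(coeffs, x)) == rhs
    )

# Hypothetical instance: is 10 a sum of some of 3, 5, 7?  (Yes: 3 + 7.)
cons = build_constraints([3, 5, 7], 10, Fraction(1, 4))
print(len(cons))                         # N = 3 / (1/4) = 12
print(count_satisfied(cons, (1, 0, 1)))  # 9 = N - n = N * (1 - eps)
print(count_satisfied(cons, (1, 1, 0)))  # 3: only one constraint per pair holds
```

A 0/1 vector reaches the bound $N - n$ exactly when it solves the subset sum instance, which is what makes the consistency question as hard as subset sum.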

5. Constraint Propagation Techniques (Semenov, Numerica, Jaulin, etc.): Reminder

• Constraint propagation is a traditional technique for solving constraint satisfaction problems.
• We start with the intervals $[\underline{x}_1, \overline{x}_1], \ldots, [\underline{x}_n, \overline{x}_n]$ containing the actual values of the unknowns $x_1, \ldots, x_n$.
• On each iteration:
  – select $i$ and a constraint $f_j(x_1, \ldots, x_n) = 0$,
  – replace $[\underline{x}_i, \overline{x}_i]$ with the new interval
    $\mathbf{x}_i^{(j)} = [\underline{x}_i^{(j)}, \overline{x}_i^{(j)}] \stackrel{\text{def}}{=} \{x_i : x_i \in [\underline{x}_i, \overline{x}_i] \ \&\ f_j(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n) = 0 \text{ for some } x_k \in [\underline{x}_k, \overline{x}_k]\}$.
• If the process stalls, we bisect the interval for one of the variables into two and try to decrease both resulting half-boxes.
• Problem: we cannot use this technique if not all constraints are valid.

6. Traditional Interval-Related Constraint Propagation Techniques: Example

• Toy problem: find $x \in [-5, 5]$ for which $x - x^2 = 0$.
• Pre-processing: parse the expression:
  $r = x^2$; $x - r = 0$.
• Originally: $X = [-5, 5]$, $R = (-\infty, \infty)$.
• Use the first constraint: $x \in [-5, 5]$ implies $r \in [0, 25]$, so for $r$, the new interval is $(-\infty, \infty) \cap [0, 25] = [0, 25]$:
  $X = [-5, 5]$, $R = [0, 25]$.
• Use the second constraint: for $x$, we have $[-5, 5] \cap [0, 25] = [0, 5]$, and similarly for $r$, so
  $X = [0, 5]$, $R = [0, 5]$.
• Use the first constraint: $x = \sqrt{r}$, hence
  $X = [0, 2.24]$, $R = [0, 5]$.
• Use the second constraint:
  $X = [0, 2.24]$, $R = [0, 2.24]$.
• After a while, we stall at $X = R \approx [0, 1]$, so we bisect $X$ into $[0, 1/2]$ and $[1/2, 1]$.
• Then, we converge to $x = 0$ and $x = 1$.
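The toy example can be replayed with a few lines of code. This is a minimal sketch that hard-codes the two constraints $r = x^2$ and $x - r = 0$ and uses ordinary floating-point arithmetic rather than rigorous outward rounding; the function name `propagate` and the fixed number of sweeps are assumptions made for illustration.

```python
import math

def propagate(x, sweeps=60):
    """Contract the interval x = (lo, hi) under r = x**2 and x - r = 0.

    Returns the contracted interval, or None if it becomes empty.
    Plain floats are used, so the bounds are not rigorously rounded.
    """
    r = (-math.inf, math.inf)
    for _ in range(sweeps):
        # First constraint, forward: r must lie in the range of x**2.
        lo, hi = x
        sq_hi = max(lo * lo, hi * hi)
        sq_lo = 0.0 if lo <= 0.0 <= hi else min(lo * lo, hi * hi)
        r = (max(r[0], sq_lo), min(r[1], sq_hi))
        # Second constraint: x = r, so both get the intersection.
        x = r = (max(x[0], r[0]), min(x[1], r[1]))
        if x[0] > x[1]:
            return None
        # First constraint, backward: x >= 0 by now, so x = sqrt(r).
        x = (max(x[0], math.sqrt(r[0])), min(x[1], math.sqrt(r[1])))
        if x[0] > x[1]:
            return None
    return x

print(propagate((-5.0, 5.0)))  # stalls near [0, 1]
print(propagate((0.0, 0.5)))   # left half after bisection: shrinks to 0
print(propagate((0.5, 1.0)))   # right half after bisection: shrinks to 1
```

After one sweep the first call reproduces the slide's intermediate box $X = [0, 2.24]$; the stall at $[0, 1]$ happens because both roots stay inside the box, which is why bisection is needed.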

7. New Idea

• On each iteration, we still select a variable $x_i$, but:
  – instead of selecting a single constraint,
  – we try all $N$ constraints, and get $N$ resulting intervals $[\underline{x}_i^{(j)}, \overline{x}_i^{(j)}]$.
• We know that $\ge N \cdot (1 - \varepsilon)$ constraints are satisfied.
• Hence $x_i \le \overline{x}_i^{(j)}$ for $\ge N \cdot (1 - \varepsilon)$ different values $j$.
• Let us sort all $N$ upper endpoints $\overline{x}_i^{(j)}$ ($1 \le j \le N$) into an increasing sequence $u_1 \le u_2 \le \ldots \le u_N$.
• Then we can guarantee that $x_i$ is smaller than (or equal to) at least $N \cdot (1 - \varepsilon)$ terms in this sequence, i.e., at most $N \cdot \varepsilon$ terms can be smaller than $x_i$.
• So, $x_i \le u_{N \cdot \varepsilon + 1}$.
• Similarly, if we sort the lower endpoints $\underline{x}_i^{(j)}$ into a decreasing sequence $l_1 \ge \ldots \ge l_N$, then $x_i \ge l_{N \cdot \varepsilon + 1}$.
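The sorting step can be sketched as follows. Since at most $N \cdot \varepsilon$ of the $N$ intervals may come from erroneous constraints, the sketch skips that many of the smallest upper endpoints and that many of the largest lower endpoints. The function name `robust_bounds`, the sample intervals, and the rounding of $N \cdot \varepsilon$ down to an integer are assumptions made for illustration; list indices in the code are 0-based.

```python
import math

def robust_bounds(intervals, eps):
    """Bounds on x_i that hold if at most N*eps of the intervals are wrong.

    `intervals` is a list of (lower, upper) pairs, one per constraint.
    """
    N = len(intervals)
    # Largest possible number of erroneous constraints; the small offset
    # guards against floating-point error in N * eps.
    k = math.floor(N * eps + 1e-9)
    uppers = sorted(hi for _, hi in intervals)                # increasing
    lowers = sorted((lo for lo, _ in intervals), reverse=True)  # decreasing
    # Skip the k smallest upper endpoints and the k largest lower ones.
    return lowers[k], uppers[k]

# Hypothetical intervals for one variable; the last one is an outlier.
intervals = [(0.9, 1.1), (0.95, 1.2), (1.0, 1.15), (0.8, 1.05), (5.0, 5.2)]
print(robust_bounds(intervals, 0.2))  # (1.0, 1.1)
```

In this example the four consistent intervals intersect in $[1.0, 1.05]$, which lies inside the returned bounds $[1.0, 1.1]$: the result is guaranteed but can be wider than the exact intersection, because the method does not know which interval is the erroneous one.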
