
Estimating Statistical Characteristics Under Interval Uncertainty and Constraints: Mean, Variance, Covariance, and Correlation - PowerPoint PPT Presentation



Estimating Statistical Characteristics Under Interval Uncertainty and Constraints: Mean, Variance, Covariance, and Correlation

Ali Jalal-Kamali
Department of Computer Science
The University of Texas at El Paso
El Paso, TX 79968, USA

December 2011

1. Need for Estimating Statistical Characteristics

• Often, we have a sample of values x_1, ..., x_n corresponding to objects of a certain type.
• A standard way to describe the population is to describe its mean, variance, and standard deviation:

      E = (1/n) · Σ_{i=1}^n x_i;   V = (1/n) · Σ_{i=1}^n (x_i − E)^2;   σ = √V.

• When we measure two quantities x and y:
  – we describe the means E_x, E_y, variances V_x, V_y, and standard deviations σ_x, σ_y of both;
  – we also estimate their covariance and correlation:

      C_{x,y} = (1/n) · Σ_{i=1}^n (x_i − E_x) · (y_i − E_y);   ρ_{x,y} = C_{x,y} / (σ_x · σ_y).
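To make the definitions concrete, here is a small Python sketch of these point estimates (not part of the original slides; the sample data is made up):

    import math

    def mean(xs):
        # E = (1/n) * sum of x_i
        return sum(xs) / len(xs)

    def variance(xs):
        # V = (1/n) * sum of (x_i - E)^2  (population variance, as on the slide)
        e = mean(xs)
        return sum((x - e) ** 2 for x in xs) / len(xs)

    def covariance(xs, ys):
        # C_{x,y} = (1/n) * sum of (x_i - E_x) * (y_i - E_y)
        ex, ey = mean(xs), mean(ys)
        return sum((x - ex) * (y - ey) for x, y in zip(xs, ys)) / len(xs)

    def correlation(xs, ys):
        # rho_{x,y} = C_{x,y} / (sigma_x * sigma_y)
        return covariance(xs, ys) / (math.sqrt(variance(xs)) * math.sqrt(variance(ys)))

    # made-up example
    x = [1.0, 2.0, 3.0, 4.0]
    y = [2.1, 3.9, 6.2, 8.1]
    print(mean(x), variance(x), covariance(x, y), correlation(x, y))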

2. Case of Interval Uncertainty

• The above formulas assume that we know the exact values of the characteristics x_1, ..., x_n.
• In practice, values usually come from measurements, and measurements are never absolutely exact.
• The measurement results x̃_i are, in general, different from the actual (unknown) values x_i: x̃_i ≠ x_i.
• Often, it is assumed that we know the probability distribution of the measurement errors Δx_i := x̃_i − x_i.
• However, often, the only information available is an upper bound on the measurement error: |Δx_i| ≤ Δ_i.
• In this case, the only information that we have about the actual value x_i is that x_i belongs to the interval x_i = [x_i⁻, x_i⁺], where x_i⁻ = x̃_i − Δ_i and x_i⁺ = x̃_i + Δ_i.
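An illustration of how such intervals arise from measurement results and error bounds (a sketch, not from the slides; the numbers are made up):

    # measured values x~_i and guaranteed error bounds Delta_i (made-up numbers)
    measured = [10.2, 9.8, 10.5]
    delta    = [0.1, 0.1, 0.2]

    # each interval [x~_i - Delta_i, x~_i + Delta_i] is guaranteed to contain the actual value x_i
    intervals = [(m - d, m + d) for m, d in zip(measured, delta)]
    print(intervals)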

3. Need to Preserve Privacy in Statistical Databases

• In order to find relations between different quantities, we collect a large amount of data.
• Example: we collect medical data to try to find correlations between a disease and lifestyle factors.
• In some cases, we are looking for commonsense correlations, e.g., between smoking and lung diseases.
• For statistical databases to be most useful, we need to allow researchers to ask arbitrary questions.
• However, this may inadvertently disclose some private information about the individuals.
• Therefore, it is desirable to preserve privacy in statistical databases.

4. Intervals as a Way to Preserve Privacy in Statistical Databases

• One way to preserve privacy is to store ranges (intervals) rather than the exact data values.
• This makes sense from the viewpoint of a statistical database.
• In general, this is how data is often collected:
  – we set some threshold values t_0, ..., t_N and
  – ask a person whether the actual value x_i is in the interval [t_0, t_1], or ..., or in the interval [t_{N−1}, t_N].
• As a result, for each quantity x and for each person i:
  – instead of the exact value x_i,
  – we store an interval x_i = [x_i⁻, x_i⁺] that contains x_i.
• Each of these intervals coincides with one of the given ranges [t_0, t_1], [t_1, t_2], ..., [t_{N−1}, t_N].
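One possible way to implement this kind of binning in Python (an illustrative sketch, not from the slides; the thresholds and the helper name to_interval are made up):

    import bisect

    # made-up thresholds t_0 < t_1 < ... < t_N (e.g., age brackets)
    thresholds = [0, 18, 30, 45, 60, 120]

    def to_interval(x):
        # return the range [t_{k-1}, t_k] that contains x
        k = bisect.bisect_left(thresholds, x)
        k = max(1, min(k, len(thresholds) - 1))
        return thresholds[k - 1], thresholds[k]

    print(to_interval(37))   # (30, 45)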

5. Need to Estimate Statistical Characteristics S(x_1, ...) Under Interval Uncertainty

• In both situations of measurement errors and privacy:
  – instead of the actual values x_i (and y_i),
  – we only know the intervals x_i (and y_i) that contain the actual values.
• Different values of x_i (and y_i) from these intervals lead, in general, to different values of each characteristic.
• It is desirable to find the range S of possible values of these characteristics when x_i ∈ x_i (and y_i ∈ y_i):

      S = {S(x_1, ..., x_n) : x_1 ∈ x_1, ..., x_n ∈ x_n};
      S = {S(x_1, ..., x_n, y_1, ..., y_n) : x_1 ∈ x_1, ..., x_n ∈ x_n, y_1 ∈ y_1, ..., y_n ∈ y_n}.
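To see that different choices within the intervals indeed lead to different values of a characteristic, here is a naive Monte Carlo sketch (illustration only, not an algorithm from the slides; random sampling only gives an inner approximation of the true range, and the interval data is made up):

    import random

    def variance(xs):
        e = sum(xs) / len(xs)
        return sum((x - e) ** 2 for x in xs) / len(xs)

    # made-up intervals [x_i_lower, x_i_upper]
    boxes = [(10.1, 10.3), (9.7, 9.9), (10.3, 10.7)]

    samples = [variance([random.uniform(lo, hi) for lo, hi in boxes])
               for _ in range(100_000)]
    print(min(samples), max(samples))   # lies inside the exact range of V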

6. Estimating Statistical Characteristics under Interval Uncertainty: What is Known

• The mean E = (1/n) · Σ_{i=1}^n x_i is an increasing function of all its inputs x_1, ..., x_n.
• Hence, E is the smallest when all the inputs x_i ∈ [x_i⁻, x_i⁺] are the smallest (x_i = x_i⁻):

      E⁻ = (1/n) · Σ_{i=1}^n x_i⁻;   E⁺ = (1/n) · Σ_{i=1}^n x_i⁺.

• However, variance, covariance, and correlation are, in general, non-monotonic.
• It is known that computing the ranges of these characteristics under interval uncertainty is NP-hard.
• The problem gets even more complex because in practice, we often have additional constraints.
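Because the mean is monotone in each input, its range follows directly from the interval endpoints; a short sketch (not from the slides; made-up intervals):

    # made-up intervals [x_i_lower, x_i_upper]
    boxes = [(10.1, 10.3), (9.7, 9.9), (10.3, 10.7)]

    n = len(boxes)
    E_lower = sum(lo for lo, _ in boxes) / n   # all inputs at their lower endpoints
    E_upper = sum(hi for _, hi in boxes) / n   # all inputs at their upper endpoints
    print(E_lower, E_upper)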

7. Formulation of the Problem and What We Did

• Reminder: under interval uncertainty,
  – in the absence of constraints, computing the range [E⁻, E⁺] of the mean E is feasible;
  – computing the ranges [V⁻, V⁺], [C⁻, C⁺], and [ρ⁻, ρ⁺] is NP-hard.
• Problem: find practically useful cases in which feasible algorithms are possible.
• What is known: for V, we can feasibly compute:
  – one of the endpoints (V⁻) – always; and
  – both endpoints – in the privacy case.
• We designed feasible algorithms for computing:
  – the range of E under constraints;
  – the range of C in the privacy case; and
  – one of the endpoints ρ⁻ or ρ⁺.
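For example, the lower endpoint V⁻ can be approximated numerically by minimizing the (convex) variance over the box of intervals; the sketch below uses scipy's SLSQP solver on made-up data and is only an illustration, not the feasible algorithm referred to on the slide:

    import numpy as np
    from scipy.optimize import minimize

    boxes = [(10.1, 10.3), (9.7, 9.9), (10.3, 10.7)]   # made-up intervals

    def variance(x):
        x = np.asarray(x)
        return float(np.mean((x - x.mean()) ** 2))

    x0 = np.array([(lo + hi) / 2 for lo, hi in boxes])   # start at the midpoints
    res = minimize(variance, x0, bounds=boxes, method="SLSQP")
    print(res.fun)   # estimate of V_lower; variance is convex, so a local minimum is global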

8. Computing E under Variance Constraints

• In the previous expressions, we assumed only that x_i belongs to the interval x_i = [x_i⁻, x_i⁺].
• In some cases, we have an additional a priori constraint on the x_i: V ≤ V_0, for a given V_0.
• For example, we may know that, within a species, the variation of a certain characteristic is at most 0.1.
• Thus, we arrive at the following problem:
  – given: n intervals x_i = [x_i⁻, x_i⁺] and a number V_0 ≥ 0;
  – compute: the range

        [E⁻, E⁺] = {E(x_1, ..., x_n) : x_i ∈ x_i & V(x_1, ..., x_n) ≤ V_0};

  – under the assumption that there exist values x_i ∈ x_i for which V(x_1, ..., x_n) ≤ V_0.
• This is a problem that we will solve in this thesis.
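As a naive numerical baseline (not the algorithm developed in the thesis), the constrained range of E can be approximated with an off-the-shelf solver; the intervals and the bound V0 below are made up:

    import numpy as np
    from scipy.optimize import minimize

    boxes = [(10.1, 10.3), (9.7, 9.9), (10.3, 10.7)]   # made-up intervals
    V0 = 0.1                                           # made-up variance bound

    def mean(x):
        return float(np.mean(x))

    def variance(x):
        x = np.asarray(x)
        return float(np.mean((x - x.mean()) ** 2))

    cons = [{"type": "ineq", "fun": lambda x: V0 - variance(x)}]   # enforces V(x) <= V0
    x0 = np.array([(lo + hi) / 2 for lo, hi in boxes])             # midpoints (feasible here)

    E_lo = minimize(mean, x0, bounds=boxes, constraints=cons, method="SLSQP").fun
    E_hi = -minimize(lambda x: -mean(x), x0, bounds=boxes,
                     constraints=cons, method="SLSQP").fun
    print(E_lo, E_hi)   # numerical estimate of the constrained range of E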

9. Cases Where This Problem Is (Relatively) Easy to Solve

• First case: V_0 is ≥ the largest possible value V⁺ of the variance corresponding to the given sample.
• In this case, the constraint V ≤ V_0 is always satisfied.
• Thus, in this case, the desired range simply coincides with the range of all possible values of E.
• Second case: V_0 = 0.
• In this case, the constraint V ≤ V_0 means that the variance V must be equal to 0, i.e., x_1 = ... = x_n.
• In this case, we know that this common value x_i belongs to each of the n intervals x_i.
• So, the set E of all possible values of E is the intersection: E = x_1 ∩ ... ∩ x_n.
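In the V_0 = 0 case, the range of E is just the intersection of the given intervals; a short sketch (not from the slides; made-up intervals, and the result is None when the intervals have no common point):

    def intersection(boxes):
        # common part of the intervals [lo_i, hi_i]; None if they do not all overlap
        lo = max(lo for lo, _ in boxes)
        hi = min(hi for _, hi in boxes)
        return (lo, hi) if lo <= hi else None

    print(intersection([(1.0, 4.0), (2.0, 5.0), (3.0, 6.0)]))   # (3.0, 4.0)
    print(intersection([(1.0, 2.0), (3.0, 4.0)]))               # None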
