2020 On a Projective Ensemble Approach to Two Sample Test for Equality of Distributions



slide-1
SLIDE 1

2020

On a Projective Ensemble Approach to Two Sample Test for Equality of Distributions

Zhimei Li Yaowu Zhang Shanghai University of Finance and Economics

slide-2
SLIDE 2

CONTENTS

1 Introduction
2 Projective Ensemble Test
3 Numerical Studies
4 Conclusion and Discussion

slide-3
SLIDE 3

Introduction

1.1 Research Question

Testing whether two samples come from the same population is one of the most fundamental problems in statistics and has applications in a wide range of areas.

1.2 Value of Research

For example, we can check the consistency of the distribution of training samples and test samples.

slide-4
SLIDE 4

Introduction

1.3 Research Method

  • We apply the idea of projections and develop a new projective ensemble approach for testing equality of distributions.
  • This method has the following advantages:
  • 1. It has a simple closed form and no tuning parameters,
  • 2. It can be computed in quadratic time,
  • 3. It is insensitive to the dimension and consistent against all fixed alternatives,
  • 4. It requires no moment assumption and is robust to outliers.

slide-5
SLIDE 5

Introduction

Motivation

Some existing methods can be implemented in quadratic time but have been reported to be sensitive to heavy-tailed data. Robust counterparts are computationally challenging, with cubic time complexity.

We therefore want to improve the approach proposed by Kim et al. (2020) and propose a test that is robust while reducing the computational cost.

slide-6
SLIDE 6

Introduction

1.4 Related literature

Normality assumption:

  • Examples: tests on the mean vector and covariance matrices, such as the Student's t test; Hotelling's T^2 test; Bai & Saranadasa (1996); Li & Chen (2012); Cai et al. (2014); Cai & Liu (2016).
  • Disadvantages: the first two moments are not sufficient to characterize the distribution, and the tests may be inconsistent when the normality assumption is violated.

slide-7
SLIDE 7

Introduction

Nonparametric approaches:

Use a measure of difference between F_m and G_n as the test statistic.

Examples:

  • Kolmogorov-Smirnov test statistic (Smirnov, 1939);
  • Cramér-von Mises (CvM) test statistic (Anderson, 1962);
  • Anderson-Darling statistic (Darling, 1957).
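As a concrete instance of such a measure of difference, the sketch below computes the two-sample Kolmogorov-Smirnov statistic for univariate samples, that is, the largest absolute gap between the two empirical distribution functions. The function names `ecdf` and `ks_statistic` are illustrative and not taken from the slides, and the naive evaluation shown here is quadratic in the pooled sample size.

```python
def ecdf(sample, t):
    """Empirical distribution function of `sample` evaluated at t."""
    return sum(1 for v in sample if v <= t) / len(sample)


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: sup_t |F_m(t) - G_n(t)|.

    Both empirical distribution functions are step functions that jump only
    at observed values, so the supremum is attained at a pooled data point.
    """
    pooled = list(xs) + list(ys)
    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in pooled)


if __name__ == "__main__":
    # Made-up univariate samples, for illustration only.
    print(ks_statistic([0.1, 0.4, 0.7], [0.2, 0.5, 0.9]))
```

A statistic of 0 means the two empirical distribution functions coincide at every pooled point, and a statistic of 1 means the samples are completely separated.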

slide-8
SLIDE 8

Introduction

Advantages (when p = 1):

  • Consistent against any fixed alternatives and distribution free under the null,
  • No moment conditions are required,
  • Free of tuning parameters.

Disadvantages:

  • Difficult to generalize to multivariate cases (Kim et al., 2020).
  • Suffer from significant power loss when p increases.

slide-9
SLIDE 9

Introduction

Reproducing kernel Hilbert space (RKHS) tests:

  • Maximum mean discrepancy (MMD) test statistic based on RKHS;
  • Energy statistic (a special case of the MMD).

Graph-based tests:

  • k minimum spanning tree graphs;
  • k nearest neighbor graphs.

Disadvantages:

  • Inconsistent
  • Rely on selecting tuning parameters

slide-10
SLIDE 10

Introduction

Kim et al. (2020)

(1)

Where:

  • λ(β) is the uniform probability measure on the p-dimensional unit sphere;
  • τ = lim m/(m + n) as min(m, n) → ∞.

Energy statistic (Baringhaus & Franz, 2004)

slide-11
SLIDE 11

Introduction

Table: Comparison of Projection-averaging approach and energy statistic

Advantages (both):

  • nonnegative and equal to zero if and only if F = G
  • have a simple closed-form expression
  • free of tuning parameters

Advantages (projection-averaging approach): robust to heavy-tailed distributions or outliers

Advantages (energy statistic): quadratic computations

Disadvantages (projection-averaging approach): cubic computations

Disadvantages (energy statistic): energy distance is only well-defined under the moment condition (finite first moment)

slide-12
SLIDE 12

Introduction

The projection-averaging approach focuses on the case where β^T x and β^T y have continuous distribution functions for all β ∈ S^{p−1}, whereas we target a more general case and do not need such a continuity assumption. These observations motivate us to carefully choose other weight functions such that

1. The integration in (2) equals zero if and only if x and y are equally distributed;
2. The choice of H(β, t) does not depend on unknown functions which are difficult to estimate;
3. The integration in (2) has a closed-form expression, and is finite without any moment conditions.

We apply the idea of projections and develop a new projective ensemble approach for testing equality of distributions.

slide-13
SLIDE 13

Projective Ensemble Test

2.1 Motivation

The integration in Eq. (2) can be rewritten as a sum of three integrations. To obtain a closed-form expression, we need to evaluate each of these three integrations. Take the first integration as an example. By Fubini's theorem, it suffices to find H(β, t) such that the resulting integration over (β, t) has a closed form for given x1 and x2.

slide-14
SLIDE 14

Projective Ensemble Test

By treating x1 and x2 as constants and (β, t)^T as a (p + 1)-dimensional multivariate normal random vector with cumulative distribution function H(β, t), the integration can be expressed as an expectation with respect to (β, t).
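The display that followed on this slide is not preserved in the transcript. As a hedged illustration of why a normal weight leads to a closed form, assume (β, t) is standard normal in p + 1 dimensions (zero mean and identity covariance are assumptions here; the slide only says "joint normal"). The event β^T x ≤ t then says that (β, t) has a nonpositive inner product with the augmented vector (x, −1), and two such events occur together with probability 1/2 − θ/(2π), where θ is the angle between (x1, −1) and (x2, −1). The Python sketch below compares this expression with a Monte Carlo estimate; all function names are illustrative.

```python
import math
import random


def augmented_angle(x1, x2):
    """Angle between the augmented vectors (x1, -1) and (x2, -1)."""
    dot = 1.0 + sum(a * b for a, b in zip(x1, x2))
    n1 = math.sqrt(1.0 + sum(a * a for a in x1))
    n2 = math.sqrt(1.0 + sum(b * b for b in x2))
    cosine = max(-1.0, min(1.0, dot / (n1 * n2)))  # guard against rounding
    return math.acos(cosine)


def closed_form(x1, x2):
    """P(beta'x1 <= t, beta'x2 <= t), assuming standard normal (beta, t)."""
    return 0.5 - augmented_angle(x1, x2) / (2.0 * math.pi)


def monte_carlo(x1, x2, draws=20000, seed=0):
    """Estimate the same probability by simulating (beta, t)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        beta = [rng.gauss(0.0, 1.0) for _ in x1]
        t = rng.gauss(0.0, 1.0)
        proj1 = sum(b * a for b, a in zip(beta, x1))
        proj2 = sum(b * a for b, a in zip(beta, x2))
        if proj1 <= t and proj2 <= t:
            hits += 1
    return hits / draws


if __name__ == "__main__":
    # Made-up points in p = 2 dimensions.
    x1, x2 = [1.0, 0.0], [0.0, 2.0]
    print(closed_form(x1, x2), monte_carlo(x1, x2))
```

The two printed numbers should agree up to simulation error, which shrinks as `draws` grows.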

slide-15
SLIDE 15

Projective Ensemble Test

Consequently, the integration in (2) can be expressed in a closed form, which is shown in the following Theorem.

slide-16
SLIDE 16

Projective Ensemble Test

2.2 Asymptotic properties

At the sample level, we estimate T1, T2, and T3 by V-statistics.

Complexity: O{(m + n)^2}
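The closed-form expressions for T1, T2 and T3 are not preserved in this transcript, so the sketch below should not be read as the authors' exact formula. It assumes a standard normal weight for (β, t), under which the integrated squared difference between the projected empirical distribution functions reduces to pairwise angles between augmented points (x, −1), and it plugs in V-statistics, meaning double sums that include the diagonal terms. The names `pe_statistic` and `mean_angle` are hypothetical. The nested loops make the quadratic cost in m + n explicit.

```python
import math


def augmented_angle(a, b):
    """Angle between the augmented vectors (a, -1) and (b, -1)."""
    dot = 1.0 + sum(u * v for u, v in zip(a, b))
    na = math.sqrt(1.0 + sum(u * u for u in a))
    nb = math.sqrt(1.0 + sum(v * v for v in b))
    cosine = max(-1.0, min(1.0, dot / (na * nb)))  # guard against rounding
    return math.acos(cosine)


def mean_angle(first, second):
    """V-statistic style average of the angle over all pairs."""
    total = sum(augmented_angle(a, b) for a in first for b in second)
    return total / (len(first) * len(second))


def pe_statistic(xs, ys):
    """Plug-in estimate under the standard normal weight (an assumption).

    Equals the integral of {F_m(beta, t) - G_n(beta, t)}^2 against that
    weight, so it is nonnegative up to floating-point rounding.
    """
    cross = mean_angle(xs, ys)
    within_x = mean_angle(xs, xs)
    within_y = mean_angle(ys, ys)
    return (2.0 * cross - within_x - within_y) / (2.0 * math.pi)


if __name__ == "__main__":
    # Made-up samples in p = 2 dimensions; ys is xs shifted by 3.
    xs = [(0.0, 0.0), (1.0, 0.5), (-0.5, 1.0)]
    ys = [(3.0, 3.0), (4.0, 3.5), (2.5, 4.0)]
    print(pe_statistic(xs, ys))
```

Because the angle is bounded between 0 and π, every term is finite whatever the tails of the data look like, which is consistent with the "no moment assumption" property claimed in the slides.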

slide-17
SLIDE 17

Projective Ensemble Test

Asymptotic properties of the test statistic under the null hypothesis:

  • No moment condition
  • No continuity assumption

slide-18
SLIDE 18

Projective Ensemble Test

Under the global alternative, F ≠ G and the difference between the two distribution functions does not vary with the sample size.

slide-19
SLIDE 19

Projective Ensemble Test

Under the local alternative, F ≠ G but the difference between the two distribution functions diminishes as the sample size increases. We consider a sequence of local alternatives.

That is, as long as the difference is larger than O{(m + n)^{−1/2}}, it can be consistently detected by our proposed test with probability tending to one.

slide-20
SLIDE 20


Projective Ensemble Test

slide-21
SLIDE 21

Numerical Studies

Throughout the experiments, we set the significance level at 0.05. We repeat each experiment 1000 times and determine the critical values with 1000 permutations.

1. Normal distributions, n_x = n_y = n_z = 20, p = 10;
2. Cauchy distributions, n_x = n_y = n_z = 20, p = 10;
3. Cauchy distributions, n_x = 20, n_y = 20, n_z = 40, p = 100;
4. Normal distributions, n_x = n_y = 20, 50, 100, p = 10.

  • Compare x and y to inspect location shift
  • Compare y and z to inspect scale difference
  • Compare x and z to inspect both location shift and scale difference
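The slides state that critical values come from 1000 permutations but do not spell out the procedure. The sketch below shows one common way to calibrate a two-sample statistic by permutation, reported as a p-value that is compared with the significance level. The helper `abs_mean_difference` is only a stand-in so that the block runs on its own; it is not the projective ensemble statistic, and all names are illustrative.

```python
import random


def permutation_pvalue(xs, ys, statistic, n_perm=1000, seed=0):
    """Permutation p-value for a statistic where large values reject.

    The pooled sample is reshuffled n_perm times and split into groups of
    the original sizes; the p-value counts how often the permuted statistic
    is at least as large as the observed one (plus one for the observed).
    """
    rng = random.Random(seed)
    observed = statistic(xs, ys)
    pooled = list(xs) + list(ys)
    m = len(xs)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if statistic(pooled[:m], pooled[m:]) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


def abs_mean_difference(xs, ys):
    """Placeholder univariate statistic, used here for illustration only."""
    return abs(sum(xs) / len(xs) - sum(ys) / len(ys))


if __name__ == "__main__":
    # Made-up data: two univariate samples of size 20 with a location shift.
    xs = [0.1 * i for i in range(20)]
    ys = [5.0 + 0.1 * i for i in range(20)]
    p = permutation_pvalue(xs, ys, abs_mean_difference, n_perm=1000, seed=1)
    print(p, p < 0.05)
```

Rejecting when the p-value falls below 0.05 matches using the upper 5% point of the permutation distribution as the critical value, up to the handling of ties.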

slide-22
SLIDE 22

Numerical Studies

We compare the performance of the projective ensemble based test (“PE”) with other competing nonparametric tests.

  • 1. the projection-averaging based Cramér-von Mises test (Kim et al., 2020, “CvM”),
  • 2. the k nearest neighbor test (Henze, 1988, “NN”),
  • 3. the modified k nearest neighbor test (Mondal et al., 2015, “MGB”),
  • 4. the energy statistic based test (Székely & Rizzo, 2004, “Energy”),
  • 5. the inter-point distance test (Biswas & Ghosh, 2014, “BG”),
  • 6. the cross-match test (Rosenbaum, 2005, “CM”),
  • 7. the ball divergence test (Pan et al., 2018, “Ball”).
slide-23
SLIDE 23

Numerical Studies

Case 1: Normal distributions, n_x = n_y = n_z = 20, p = 10;

The cross-match test is not efficient in detecting the scale difference, perhaps mainly because it relies on some tuning parameters.

slide-24
SLIDE 24

Numerical Studies

Case 2: Cauchy distributions, n_x = n_y = n_z = 20, p = 10;

Case 3: Cauchy distributions, n_x = 20, n_y = 20, n_z = 40, p = 100;

slide-25
SLIDE 25

Numerical Studies

Case 4: Normal distributions, n_x = n_y = 20, 50, 100, p = 10.

Heavy computations

slide-26
SLIDE 26

Numerical Studies

Summary

  • Our method is comparable with the projection-averaging based Cramér-von Mises test in terms of power performance,
  • superior to the other tests across almost all the cases, especially in the presence of heavy-tailed distributions,
  • and more computationally efficient than the projection-averaging based Cramér-von Mises test.

slide-27
SLIDE 27

Numerical Studies

Application

Dataset: UCI machine learning repository, Daily Demand Forecasting Orders Data Set.

Question: inspect whether the demand on Friday is significantly different from other weekdays.

Features: Non-urgent orders (X1), Urgent orders (X2), Three order types (X3, X4, X5), Fiscal sector orders (X6), Orders from the traffic controller sector (X7), Three kinds of banking orders (X8, X9, X10), Total orders (X11).

slide-28
SLIDE 28

Numerical Studies

Permutation 1000 times, α = 0.05

Cauchy combination test statistic:

  • The corresponding p-value is 0.0164.
  • The demand on Friday is significantly different from other weekdays.
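The formula of the statistic is not preserved in this transcript. In its standard form, the Cauchy combination test maps each p-value p_i to tan{(0.5 − p_i)π}, takes a weighted sum with nonnegative weights that add up to one, and converts the sum back to a p-value through the standard Cauchy distribution. The sketch below assumes equal weights by default. Which p-values were combined in the slides, and with what weights, is not stated, and the inputs in the example are made up.

```python
import math


def cauchy_combination(pvalues, weights=None):
    """Combine p-values with the standard Cauchy combination rule.

    Returns the combined statistic and its p-value under the standard
    Cauchy reference distribution. Equal weights are an assumption.
    """
    k = len(pvalues)
    if weights is None:
        weights = [1.0 / k] * k
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues)
    )
    combined_p = 0.5 - math.atan(statistic) / math.pi
    return statistic, combined_p


if __name__ == "__main__":
    # Made-up p-values, for illustration only.
    stat, p = cauchy_combination([0.01, 0.20, 0.65])
    print(stat, p)
```

With a single input the rule returns that p-value unchanged, which is a quick sanity check on the transformation.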

slide-29
SLIDE 29

Conclusion and Discussion

Conclusion

◆ We apply the idea of projections and propose a robust test for the multivariate two-sample problem.
◆ It is demonstrated that with a suitable choice of the ensemble approach, we can obtain a test which is superior to most existing tests, especially in the presence of heavy-tailed distributions.
◆ Moreover, it is comparable with the projection-averaging based Cramér-von Mises test in terms of power performance, but much more efficient in terms of computation.

slide-30
SLIDE 30

Conclusion and Discussion

Discussion

It is necessary to continue reducing the computational cost:

◆ In univariate cases, we can adopt an AVL tree-type implementation to develop an efficient algorithm with complexity O{(m + n) log(m + n)}.
◆ In multivariate cases, we can approximate the test statistic with random projections, whose computational cost can be reduced to O{K(m + n) log(m + n)} and memory cost to O{max(m + n, K)}, where K is the number of random projections.
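The slide does not describe the random-projection algorithm, so the following is a deliberately simpler Monte Carlo stand-in rather than the authors' method. It draws K pairs (β, t) from a standard normal distribution (an assumption) and averages the squared difference between the two empirical distribution functions of the projected samples at threshold t. Its cost is O{K(m + n)p} with no sorting step, so it does not reproduce the log factor quoted on the slide. The names `projected_fraction` and `random_projection_statistic` are hypothetical.

```python
import random


def projected_fraction(sample, beta, t):
    """Fraction of points whose projection onto beta is at most t."""
    count = 0
    for point in sample:
        if sum(b * v for b, v in zip(beta, point)) <= t:
            count += 1
    return count / len(sample)


def random_projection_statistic(xs, ys, num_projections=500, seed=0):
    """Monte Carlo average of {F_m(beta, t) - G_n(beta, t)}^2.

    (beta, t) is drawn from a standard normal distribution, which is an
    assumption made for this sketch.
    """
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(num_projections):
        beta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        t = rng.gauss(0.0, 1.0)
        diff = projected_fraction(xs, beta, t) - projected_fraction(ys, beta, t)
        total += diff * diff
    return total / num_projections


if __name__ == "__main__":
    # Made-up samples in p = 2 dimensions; ys is xs shifted by 3.
    xs = [(0.0, 0.0), (1.0, 0.5), (-0.5, 1.0)]
    ys = [(3.0, 3.0), (4.0, 3.5), (2.5, 4.0)]
    print(random_projection_statistic(xs, ys))
```

Only one pair (β, t) and a running total are held at a time, so the extra memory beyond the data is small; a larger K trades computation for a less noisy approximation.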

slide-31
SLIDE 31

THANK YOU
