
7/25/2011

A Distribution-free Control Chart for Monitoring Location and Scale

  • S. Chakraborti, Department of Information Systems, Statistics and Management Science, University of Alabama, U.S.A.
  • A. Mukherjee, Department of Mathematics, Aalto University, Finland

ISSPC 2011 Conference, PUC, Rio de Janeiro, Brazil, July 2011

Outline

  • Introduction
  • Normal Distribution
  • Joint Monitoring of Mean and Variance
  • Connections to statistical testing problems
  • Classes of charts
  • Brief literature review
  • Nonparametric
  • Joint monitoring of location and scale
  • Proposed chart
  • Implementation
  • Example
  • Performance
  • On-going/Future Work

Joint Monitoring -- Normal

  • Variables Data
      • Process output normally distributed, X ~ N(µ, σ²)
      • Monitor both mean µ and variance σ²
  • Approaches
      • Two Chart Schemes:
          • Use a mean chart and a variance chart
          • Driven by the independence of sample mean and variance under normality
          • Several charts are available
      • One Chart Schemes:
          • Use a plotting statistic which is a function of two statistics, one for the mean and one for the variance (also driven by normality)
          • Several charts are available

Joint Monitoring

  • Two Chart Schemes
      • Shewhart-type
      • EWMA-type
      • Others
  • One Chart Schemes
      • Shewhart-type
      • Hybrid-EWMA
      • Others
  • See Cheng and Thaga (2006) for an overview and references up to 2005
  • See McCracken et al. (2011) (in progress) for a more recent overview


Some Recent Literature

  • Zhang, J., Zou, C., & Wang, Z. (2011). “A new chart for detecting the process mean and variability,” Communications in Statistics - Simulation and Computation, 40(5), 728-743.
  • Maboudou-Tchao, E. & Hawkins, D. (2011). “Self-Starting Multivariate Control Charts for Location and Scale,” Journal of Quality Technology, 43(2), 113-126.
  • Huang, C.C., & Chen, F.L. (2010). “Economic Design of Max Charts,” Communications in Statistics - Theory and Methods, 39(16), 2961-2976.
  • Khoo, M.B.C. et al. (2010a). “Using one EWMA chart to jointly monitor the process mean and variance,” Computational Statistics, 25, 299-316.
  • Khoo, M.B.C. et al. (2010b). “Monitoring process mean and variability with one double EWMA Chart,” Communications in Statistics - Theory and Methods, 39(20), 3678-3694.
  • Li, Z., Zhang, J., & Wang, Z. (2010). “Self-starting control chart for simultaneously monitoring process mean and variance,” International Journal of Production Research, 48(15), 4537-4553.
  • Zhang, J., Zou, C., & Wang, Z. (2010). “A Control Chart Based on Likelihood Ratio Test for Monitoring Process Mean and Variability,” Quality and Reliability Engineering International, 26, 63-73.
  • Zhang, J., Li, Z., & Wang, Z. (2010). “A multivariate control chart for simultaneously monitoring process mean and variability,” Computational Statistics & Data Analysis, 54(10), 2244-2252.

Some More Literature

  • Zhou, Q., Luo, Y., & Wang, Z. (2010). “A control chart based on likelihood ratio test for detecting patterned mean and variance shifts,” Computational Statistics & Data Analysis, 54(6), 1634-1645.
  • Hawkins, D.M., & Deng, Q. (2009). “Combined Charts for Mean and Variance Information,” Journal of Quality Technology, 41(4), 415-425.
  • Chao, M.T. & Cheng, S.W. (2008). “On 2-D Control Charts,” Quality Technology & Quantitative Management, 5(3), 243-261.
  • Wu, Z., Zhang, S., & Wang, P. (2007). “A CUSUM scheme with variable sample sizes and sampling intervals for monitoring the process mean and variance,” Quality and Reliability Engineering International, 23(2), 157-170.
  • Reynolds, M. R. & Stoumbos, Z. G. (2006). “Comparisons of some exponentially weighted moving average control charts for monitoring the process mean and variance,” Technometrics, 48(4), 550-567.
  • Yeh, A.B., Lin, D.K.J., & Venkataramani, C. (2004). “Unified CUSUM charts for monitoring process mean and variability,” Quality Technology & Quantitative Management, 1(1), 65-86.
  • Costa, A.B.F., & Rahim, M.A. (2004). “Monitoring process mean and variability with one non-central chi-square chart,” Journal of Applied Statistics, 31(10), 1171-1183.
  • Chen, G., Cheng, S.W., & Xie, H. (2001). “Monitoring process mean and variability with one EWMA chart,” Journal of Quality Technology, 33(2), 223-233.
  • …….

Parametric Control Charts: Key issues

  • Form of the distribution is assumed known
      • e.g. normal
      • Is this ever really true?
  • Chart properties (of normal theory charts) are not always robust
      • False alarm rate
      • ARL, SDRL, … all are affected
      • Charts may lose value for practice!
  • Not applicable with all types of data such as ranks

Distribution-free/Nonparametric

Nonparametric statistical inference is a collective term given to inferences that are valid under less restrictive assumptions than with classical (parametric) statistical inference. The assumptions that can be relaxed include specifying the probability distribution of the population from which the sample was drawn and the level of measurement required of the sample data. For example, we may have to assume that the population is symmetric, which is much less restrictive than assuming the population is the normal distribution. The data may be ranks, i.e., measurements on an ordinal scale, instead of precise measurements on an interval or ratio scale. Or the data may be counts. In nonparametric inference, the null distribution of the statistic on which the inference is based does not depend on the probability distribution of the population from which the sample was drawn. In other words, the statistic has the same sampling distribution under the null hypothesis, irrespective of the form of the parent population. This statistic is therefore called distribution-free, and, in fact, the field of nonparametric statistics is sometimes called distribution-free statistics. Nonparametric methods are often based on ranks, scores, or counts. This allows us to make less restrictive assumptions and still make an inference such as calculate a P-value or find a confidence interval. Strictly speaking, the term nonparametric implies an inference that may or may …

Distribution-free/Nonparametric: Statistical methods that require minimal assumptions about the form of the distribution to make an inference based on a test statistic, a p-value, a control chart or a confidence interval!

  - Gibbons and Chakraborti (2010)

Distribution-free/Nonparametric

A test of hypothesis is nonparametric (NP) or distribution-free (DF) if the Type I error probability is the same for all continuous distributions
  - Gibbons and Chakraborti (2010), Nonparametric Statistical Inference, 5th ed., CRC Press
  -- similar definition for a NP or DF confidence interval in terms of the coverage!

A control chart is nonparametric (NP) or distribution-free (DF) if the in-control run length distribution is the same for all continuous distributions
  - Chakraborti, van der Laan and Bakir (2001)

So all in-control performance characteristics remain the same and known for all continuous distributions!!!

Example: the sign test and the sign control chart

Nonparametric Control Charts

Advantages:
  • Often a natural thing to do (intuitive)
  • Distributional assumption not required
  • In-control properties are known (stable)
  • Robust
  • Comparable detection power

Challenges:
  • Not as widely known
  • Not always easy to construct
  • Not available for all problems
  • Erase doubts about efficiency

Nonparametric Control Charts

Some Literature:

Univariate:
  • Overview Papers:
      • Chakraborti et al. (2010, 2007, 2001)
  • Other recent papers
      • Balakrishnan et al. (2009, 2010), Li et al. (2010), Hawkins and Deng (2010), Zou et al. (2010), Human et al. (2010), Memar and Niaki (2010), Chatterjee and Qiu (2009), Zhou et al. (2009), …

Multivariate:
  • Boone (2010) and references therein
  • Other papers

Nonparametric Control Charts

Types of charts:
  • Shewhart charts
  • EWMA charts
  • CUSUM charts
  • Other charts

Based on
  • Signs
  • Signed-ranks
  • Sample Quantiles (Order statistics)
  • Ranks ….
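The sign statistic gives perhaps the simplest illustration of why such charts are distribution-free. As a minimal sketch, assume the in-control median is known (taken as 0 here) and that the plotted statistic is the number of observations in a subgroup of size n that exceed it. The function names, the two-sided signalling rule and the limits below are illustrative assumptions, not taken from the slides. Under any continuous in-control distribution that count is Binomial(n, 1/2), so the false alarm rate follows from binomial tail probabilities alone.

```python
import random
from math import comb

def sign_chart_false_alarm(n, lower, upper):
    """Exact P(count <= lower or count >= upper) for count ~ Binomial(n, 1/2)."""
    hits = sum(comb(n, k) for k in range(n + 1) if k <= lower or k >= upper)
    return hits / 2 ** n

def simulated_false_alarm(draw, n, lower, upper, reps, seed):
    """Monte Carlo estimate of the same rate for a given in-control sampler."""
    rng = random.Random(seed)
    alarms = 0
    for _ in range(reps):
        # Count observations above the (assumed known) in-control median 0.
        count = sum(draw(rng) > 0.0 for _ in range(n))
        if count <= lower or count >= upper:
            alarms += 1
    return alarms / reps

# Two in-control distributions with median 0 (illustrative choices).
def normal(rng):
    return rng.gauss(0.0, 1.0)

def laplace(rng):
    return rng.choice((-1.0, 1.0)) * rng.expovariate(1.0)

exact = sign_chart_false_alarm(n=5, lower=0, upper=5)
rate_normal = simulated_false_alarm(normal, 5, 0, 5, 20000, seed=1)
rate_laplace = simulated_false_alarm(laplace, 5, 0, 5, 20000, seed=2)
print(exact, rate_normal, rate_laplace)
```

With n = 10 and a signal only when all or none of the observations exceed the median, the exact rate is 2/1024, which corresponds to an in-control ARL of 512 for independent subgroups, whatever the continuous distribution.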


Nonparametric Control Charts

Parameters Known, Univariate:
  • Charts for location
      • Shewhart-type charts
          • Sign charts
          • Signed-rank charts
          • Refinements (runs rules, etc.)
      • CUSUM-type charts
          • Sign charts
          • Signed-rank charts
          • Refinements
      • EWMA-type charts
          • Sign charts
          • Signed-rank charts
          • Refinements

Nonparametric Control Charts

Parameters Unknown, Univariate:
  • Charts for location
      • Shewhart-type charts
          • Precedence
          • Rank-sum
          • Refinements
      • CUSUM-type charts
          • Precedence
          • Rank-sum
          • Refinements
      • EWMA-type charts
          • Precedence
          • Rank-sum
          • Refinements

Nonparametric Control Charts

Parameters Unknown, Univariate:
  • Charts for scale
      • Some work has been done
      • Work in progress
  • Charts for location and scale
      • Shewhart-type charts
      • Other charts
      • Work in progress

Proposed NP Chart

  • A continuous distribution is assumed but not its shape or form
  • Location and Scale parameters are unknown
  • A reference sample is available from Phase I
  • A Phase II chart is proposed for subgroup data
  • A Shewhart-type chart
  • Nonparametric: IC run length distribution is completely known and does not depend on the underlying distribution


Proposed NP Chart

  • The plotting statistic is the one by Lepage (1971): a combination of the Wilcoxon rank-sum statistic for location and the Ansari-Bradley statistic for scale.
  • For details of nonparametric tests, see, e.g., Gibbons and Chakraborti (2010).
  • The resulting control chart is referred to as the Shewhart-Lepage (SL) control chart.
  • Post signal diagnostics: In case of a signal, look for a shift in either location or in scale or in both.

Details

  • Let (U1, U2, ..., Um) and (V1, V2, ..., Vn) be two independent random samples from continuous distribution functions F(u) and G(v) = F(δv + θ), respectively, where F is some unknown continuous distribution function.
  • The process is in-control (IC) if θ = 0 and δ = 1.
  • For testing equality of locations, a popular nonparametric test is the Wilcoxon rank-sum (WRS) test. The WRS test statistic, say T1, is defined as

        T1 = Σ_{k=1}^{N} k Zk,

    where Zk = 1 when the k-th order statistic of the combined N (= m + n) observations is a V and Zk = 0 if it is a U.

More Details

  • For testing equality of scales the Ansari-Bradley (AB) test is well-known. The AB statistic, T2, is defined as

        T2 = Σ_{k=1}^{N} |k − (N + 1)/2| Zk.

  • It can be shown that, when the process is IC,

        E(T1) = n(N + 1)/2,   Var(T1) = mn(N + 1)/12,
        E(T2) = nN/4   and   Var(T2) = mn(N² − 4)/[48(N − 1)]   when N is even,
        E(T2) = n(N² − 1)/(4N)   and   Var(T2) = mn(N + 1)(N² + 3)/(48N²)   when N is odd.

  • See Gibbons and Chakraborti (2010)
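As a minimal sketch, the two rank statistics can be computed directly from the indicators Zk. The code assumes the rank-sum form T1 = Σ k Zk and the Ansari-Bradley form T2 = Σ |k − (N + 1)/2| Zk (one of several conventions in the literature), together with their standard in-control moments; the function names are hypothetical. A brute-force enumeration over all placements of the V's, which are equally likely when the process is IC, checks the closed-form moments for small m and n.

```python
from itertools import combinations

def indicators(u, v):
    """Z_k = 1 if the k-th smallest pooled observation is a V, else 0 (no ties assumed)."""
    pooled = sorted([(x, 0) for x in u] + [(y, 1) for y in v])
    return [label for _, label in pooled]

def wrs_ab(z):
    """Wilcoxon rank-sum T1 and Ansari-Bradley T2 from the indicator vector."""
    N = len(z)
    t1 = sum(k * zk for k, zk in enumerate(z, start=1))
    t2 = sum(abs(k - (N + 1) / 2) * zk for k, zk in enumerate(z, start=1))
    return t1, t2

def ic_moments(m, n):
    """Closed-form in-control (mu1, var1, mu2, var2) for T1 and T2."""
    N = m + n
    mu1, var1 = n * (N + 1) / 2, m * n * (N + 1) / 12
    if N % 2 == 0:
        mu2, var2 = n * N / 4, m * n * (N * N - 4) / (48 * (N - 1))
    else:
        mu2 = n * (N * N - 1) / (4 * N)
        var2 = m * n * (N + 1) * (N * N + 3) / (48 * N * N)
    return mu1, var1, mu2, var2

def enumerated_moments(m, n):
    """Same moments obtained by listing every equally likely placement of the n V's."""
    N = m + n
    values = []
    for positions in combinations(range(N), n):
        z = [1 if k in positions else 0 for k in range(N)]
        values.append(wrs_ab(z))

    def mean_var(xs):
        mu = sum(xs) / len(xs)
        return mu, sum((x - mu) ** 2 for x in xs) / len(xs)

    mu1, var1 = mean_var([t1 for t1, _ in values])
    mu2, var2 = mean_var([t2 for _, t2 in values])
    return mu1, var1, mu2, var2

print(wrs_ab(indicators([1.0, 3.0, 5.0], [2.0, 4.0])))
print(ic_moments(3, 2), enumerated_moments(3, 2))
```

The enumeration is only feasible for very small samples; it is included here as a check on the formulas, not as a way to compute them in practice.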

Step-by-Step

  • Step 1. Gather reference sample of size m: Xm
  • Step 2. Collect the i-th Phase II test sample of size n: Yi,n
  • Step 3. Calculate the Wilcoxon rank-sum statistic T1i and the Ansari-Bradley statistic T2i between the i-th test sample and the reference sample.
  • Step 4. Calculate two standardized statistics: S1i = (T1i − μ1)/σ1 and S2i = (T2i − μ2)/σ2
  • Step 5. Calculate the Shewhart-Lepage (SL) plotting statistic Si² = S1i² + S2i² for i = 1, 2, ....
  • Step 6. Plot Si² against i.
  • Step 7. Plot the upper control limit (UCL) H. The lower control limit (LCL) is 0.
  • Step 8. Declare the process out of control if Si² > H and do a follow-up analysis. If not, the process is thought to be in control; testing continues.
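The steps above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the rank-sum and Ansari-Bradley forms T1 = Σ k Zk and T2 = Σ |k − (N + 1)/2| Zk with their standard in-control means and variances, no tied observations, and hypothetical function names. The control limit H must be supplied by the user.

```python
def lepage_components(reference, test):
    """Return (S1^2, S2^2) for one test sample against the reference sample."""
    m, n = len(reference), len(test)
    N = m + n
    pooled = sorted([(x, 0) for x in reference] + [(y, 1) for y in test])
    test_ranks = [k for k, (_, is_test) in enumerate(pooled, start=1) if is_test]
    t1 = sum(test_ranks)                                  # Step 3: Wilcoxon rank-sum
    t2 = sum(abs(k - (N + 1) / 2) for k in test_ranks)    # Step 3: Ansari-Bradley
    mu1, var1 = n * (N + 1) / 2, m * n * (N + 1) / 12
    if N % 2 == 0:
        mu2, var2 = n * N / 4, m * n * (N * N - 4) / (48 * (N - 1))
    else:
        mu2 = n * (N * N - 1) / (4 * N)
        var2 = m * n * (N + 1) * (N * N + 3) / (48 * N * N)
    # Step 4, already squared: S1^2 and S2^2.
    return (t1 - mu1) ** 2 / var1, (t2 - mu2) ** 2 / var2

def run_sl_chart(reference, test_samples, H):
    """Steps 5-8: return (i, S_i^2) at the first signal, or None if there is none."""
    for i, sample in enumerate(test_samples, start=1):
        s1_sq, s2_sq = lepage_components(reference, sample)
        s_sq = s1_sq + s2_sq
        if s_sq > H:
            return i, s_sq
    return None

reference = [1.0, 3.0, 5.0]          # toy Phase I data, far too small in practice
tests = [[2.0, 4.0], [6.0, 7.0]]
print(run_sl_chart(reference, tests, H=3.0))
```

In this toy run the first test sample interleaves with the reference sample and does not signal, while the second lies entirely above it and does.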


Post Signal Diagnostics

  • If the process is declared out of control:
      • Compare each of S1i² and S2i² with specified constants H1 and H2, respectively.
      • If only S1i² exceeds H1, a shift in location is indicated.
      • If only S2i² exceeds H2, a shift in scale is implied.
      • But if both occur, a shift in both location and scale is indicated.

Implementation

  • Determination of H:
      • Setting the in-control average run length at a desired level, say ARL0
  • For post signal diagnostics
      • Determination of H1 and H2
  • One can bypass the complexities of solving integral equations by using Monte-Carlo techniques
  • Tables for charting constants are provided
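The Monte-Carlo route can be sketched as follows, under stated assumptions: the Lepage statistic is computed with the rank-sum and Ansari-Bradley forms T1 = Σ k Zk and T2 = Σ |k − (N + 1)/2| Zk, observations are drawn from a uniform distribution (any continuous distribution would do, since the chart is distribution-free), and run lengths are capped at a hypothetical `max_run` so that the simulation always terminates. Each replication draws a fresh reference sample and then test samples until the first signal. H would then be tuned, for example by a grid or bisection search, until the estimated ARL0 is close to the target.

```python
import random

def lepage_sq(reference, test):
    """Squared Lepage statistic S^2 = S1^2 + S2^2 (no ties assumed)."""
    m, n = len(reference), len(test)
    N = m + n
    pooled = sorted([(x, 0) for x in reference] + [(y, 1) for y in test])
    ranks = [k for k, (_, is_test) in enumerate(pooled, start=1) if is_test]
    t1 = sum(ranks)
    t2 = sum(abs(k - (N + 1) / 2) for k in ranks)
    mu1, var1 = n * (N + 1) / 2, m * n * (N + 1) / 12
    if N % 2 == 0:
        mu2, var2 = n * N / 4, m * n * (N * N - 4) / (48 * (N - 1))
    else:
        mu2 = n * (N * N - 1) / (4 * N)
        var2 = m * n * (N + 1) * (N * N + 3) / (48 * N * N)
    return (t1 - mu1) ** 2 / var1 + (t2 - mu2) ** 2 / var2

def estimate_arl0(m, n, H, reps, seed, max_run=10000):
    """Average in-control run length over `reps` simulated Phase I / Phase II sequences."""
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        reference = [rng.random() for _ in range(m)]   # new Phase I sample each replication
        run = 1
        while run < max_run:                           # run lengths are censored at max_run
            test = [rng.random() for _ in range(n)]
            if lepage_sq(reference, test) > H:
                break
            run += 1
        total += run
    return total / reps

# Small, fast settings for illustration only; real use needs many more replications.
for H in (1.0, 4.0):
    print(H, estimate_arl0(m=10, n=3, H=H, reps=200, seed=7))
```

Averaging over fresh reference samples targets the unconditional ARL0; holding one reference sample fixed would instead estimate the conditional ARL0 for that particular Phase I sample.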

Charting Constants

Charting constants for the SL chart for various m and n with ARL0 = 500.

  m     n     H       H1     H2
  30    5     9.40    5.75   3.65
  30    11    9.24    5.00   4.24
  30    25    8.40    4.30   4.10
  50    5     10.32   6.52   3.80
  50    11    10.10   6.10   4.00
  50    25    9.50    5.00   4.50
  100   5     11.25   7.25   4.00
  100   11    11.07   6.35   4.72
  100   25    10.74   5.40   5.34
  150   5     11.50   7.65   3.85
  150   11    11.45   6.80   4.65
  150   25    11.17   5.61   5.56

Data Example

Piston Ring Data: Montgomery (2001)
m = 125, n = 5; Target ARL0 = 250.


Data Example

  • Process is IC for the first eleven samples and goes OOC at sample 12.
  • Samples 12, 13 and 14 appear to come from an OOC process, indicating a shift in either the location, the scale or both.
  • Is the signal due to a shift in the location, the scale, or both?
  • For these samples we calculate

        S12² = 13.30, S1,12² = 9.01 and S2,12² = 4.29;
        S13² = 15.58, S1,13² = 9.95 and S2,13² = 5.63;
        S14² = 21.39, S1,14² = 12.16 and S2,14² = 9.23

    and with m = 125, n = 5 and H = 10.2 (for ARL0 = 250), we find H1 = 6.4 and thus H2 = 3.8, using simulations.
  • Hence the process has shifted both in location and scale at sample 12.
  • For these data Montgomery (page 220, Figures 5 and 6) found a shift in the mean at sample number 12, while the variance was seen to be in control, based on a 3-sigma X-bar and a 3-sigma R chart, run separately.
      - the ARL0 of this two-chart normal theory scheme is suspect at best!!
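The follow-up rule can be applied mechanically to the three signalling samples. The sketch below uses the values reported for this example (H = 10.2, H1 = 6.4, H2 = 3.8); the function name and the label returned when neither component exceeds its constant are illustrative assumptions, since the slides do not cover that case.

```python
def diagnose(s1_sq, s2_sq, H, H1, H2):
    """Classify one plotted point using the overall limit H and the constants H1, H2."""
    if s1_sq + s2_sq <= H:
        return "in control"
    location = s1_sq > H1
    scale = s2_sq > H2
    if location and scale:
        return "location and scale"
    if location:
        return "location"
    if scale:
        return "scale"
    return "signal, source unclear"   # assumption: this case is not addressed in the slides

# (S1^2, S2^2) for samples 12, 13 and 14 as reported in the example.
components = {12: (9.01, 4.29), 13: (9.95, 5.63), 14: (12.16, 9.23)}
for sample, (s1_sq, s2_sq) in components.items():
    label = diagnose(s1_sq, s2_sq, H=10.2, H1=6.4, H2=3.8)
    print(sample, round(s1_sq + s2_sq, 2), label)
```

All three samples exceed both H1 and H2, which matches the conclusion that location and scale have both shifted.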

Performance

  • Performance is measured in terms of the run length distribution.
  • Since the run length distribution is skewed, it is recommended to look at summary measures such as the average, the standard deviation and a number of percentiles, including the 5th and 95th percentiles and the first, second and third quartiles.
  • We consider both the IC and the OOC set up. For the IC set up the reference and the test samples are simulated from the standard normal distribution with m = 30, 50, 100 and 150 and n = 5, 11 and 25, respectively.
  • Different choices of H are considered for a given pair of (m, n), and we search for the H for which ARL0 is about 500. The findings under the IC set up are shown in Table 2.
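To see why the mean alone is a poor summary, the sketch below computes the same summary measures for simulated run lengths. For illustration only, it assumes geometric run lengths with signal probability 1/500, which would hold for a chart with known parameters; the SL chart's run lengths are not geometric, because every test sample is compared with the same reference sample. The function names are hypothetical.

```python
import math
import random
import statistics

def geometric_run_length(p, rng):
    """Draw a run length with P(RL = k) = (1 - p)^(k - 1) p by inversion."""
    u = 1.0 - rng.random()                      # uniform on (0, 1]
    return max(1, math.ceil(math.log(u) / math.log(1.0 - p)))

def summarize(run_lengths):
    """ARL, SDRL and the percentiles reported in the performance tables."""
    pct = statistics.quantiles(run_lengths, n=100, method="inclusive")
    return {
        "ARL": statistics.fmean(run_lengths),
        "SDRL": statistics.stdev(run_lengths),
        "5th": pct[4],
        "Q1": pct[24],
        "Median": pct[49],
        "Q3": pct[74],
        "95th": pct[94],
    }

rng = random.Random(2011)
runs = [geometric_run_length(1 / 500, rng) for _ in range(20000)]
print(summarize(runs))
```

Even in this idealized case the median sits well below the mean, which is why the tables report percentiles alongside the ARL and SDRL.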

Performance -- IC

IC performance characteristics of the SL chart for ARL0 = 500.

  m     n     H       H1     Standard    5th          1st        Median   3rd        95th
                             Deviation   Percentile   Quartile            Quartile   Percentile
  30    5     9.4     5.4    1216.59     9            59         176      486        1956
  30    11    9.24    5.0    978.05      8            58         187      533        1978
  30    25    8.4     4.3    1027.58     5            38         148      503        2175
  50    5     10.32   7.1    918.88      13           78         215      534        1886
  50    11    10.1    6.1    860.88      11           73         219      572        1936
  50    25    9.5     5.0    890.65      8            57         190      556        2000
  100   5     11.25   7.25   690.00      21           108        274      606        1710
  100   11    11.07   6.35   703.58      18           99         281      611        1742
  100   25    10.74   5.40   703.81      13           89         255      648        1837
  150   5     11.50   7.65   692.79      18           113        287      632        1633
  150   11    11.45   6.80   627.38      20           113        291      659        1675
  150   25    11.17   5.60   660.91      16           95         272      645        1712

Performance -- OOC

Performance characteristics of the run length distribution of the SL chart for the normal distribution with ARL0 = 500

m = 30, n = 5, H = 9.4

  mean   stdev   ARL      Standard    5th          1st        Median   3rd        95th
                          Deviation   Percentile   Quartile            Quartile   Percentile
  0      1       500.79   1216.59     9            59         176      486        1956
  0.25   1       369.08   984.81      5            34         107      323        1500.1
  0.5    1       145.18   474.79      2            12         37       113        571.05
  0.75   1       42.49    134.36      1            5          13       34         152.05
  1      1       13.09    34.62       1            2          5        12         46
  1.25   1       5.45     10.97       1            1          3        6          17.05
  1.5    1       2.67     3.24        1            1          2        3          8
  0      1.25    114.11   210.38      4            18         50       125        420
  0.25   1.25    86.72    171.41      3            13         36       92         332
  0.5    1.25    45.57    89.61       2            7          18       47         173.05
  0.75   1.25    19.9     41.60       1            4          9        21         70
  1      1.25    9.27     18.47       1            2          5        10         30
  1.25   1.25    4.79     6.36        1            1          3        6          15
  1.5    1.25    2.89     3.08        1            1          2        3          8


Performance

Performance characteristics of the run length distribution of the SL chart for the Laplace distribution with ARL0 = 500

m = 30, n = 5, H = 9.4

  mean   stdev   ARL      Standard    5th          1st        Median   3rd        95th
                          Deviation   Percentile   Quartile            Quartile   Percentile
  0      1       487.67   939.52      10           61         182      504        1942
  0.25   1       394.53   896.41      5            34         116      356        1630
  0.5    1       207.50   736.13      2            10         33       120        818
  0.75   1       71.90    381.40      1            3          9        31         244
  1      1       25.56    245.16      1            1          3        9          55
  1.25   1       7.78     150.72      1            1          2        4          16
  1.5    1       2.36     6.67        1            1          1        2          6
  0      1.25    171.96   363.68      4            25         68       183        670
  0.25   1.25    140.08   312.71      3            16         48       133        539
  0.5    1.25    81.19    322.74      1            7          20       60         307
  0.75   1.25    29.86    136.21      1            3          7        20         106
  1      1.25    10.15    28.48       1            2          3        8          38
  1.25   1.25    4.31     9.67        1            1          2        4          13
  1.5    1.25    2.41     4.16        1            1          1        2          7

Effect of Estimation

100 Phase I samples from N(0,1); attained IC ARL and SDRL for m = 100, n = 5 and ARL0 = 500. Target mean = 0, SD = 1.

The Phase I samples are grouped by the direction of bias: upward bias in sample mean and downward bias in sample SD; downward bias in both sample mean and sample SD; upward bias in both sample mean and sample SD; downward bias in sample mean and upward bias in sample SD.

Marginal bias in mean only (4 cases)

  Phase-I mean   Phase-I SD   ARL0 attained   (SDRL0)
   0.01101       0.999        508.80          (518.72)
  -0.0219        1.007        290.54          (288.16)
   0.0204        0.997        558.88          (550.48)
  -0.0380        1.008        473.61          (478.34)

Marginal bias in both mean and SD (18 cases)

  Phase-I mean   Phase-I SD   ARL0 attained   (SDRL0)
   0.0186        0.973        406.48          (400.52)
  -0.0254        0.986        331.35          (346.05)
   0.0110        1.017        604.22          (612.58)
  -0.0003        1.040        674.51          (656.41)
   0.0249        0.960        284.40          (278.24)
  -0.0282        0.969        350.54          (354.51)
   0.0197        1.023        375.92          (368.03)
  -0.0438        1.013        487.31          (479.41)
   0.0385        0.976        401.34          (412.97)
  -0.0474        0.976        408.82          (402.83)
   0.0461        1.027        551.19          (540.47)
  -0.0237        1.024        692.64          (665.74)
  -0.0220        0.982        377.80          (380.78)
   0.0419        1.0425       582.69          (605.31)
  -0.0220        1.040        512.49          (542.036)
  -0.0187        0.987        417.31          (383.66)
   0.0081        1.040        616.18          (642.35)
   0.0007        1.043        669.90          (696.30)

Effect of Estimation

[Figure: attained ARL0 (nominal ARL0 = 500) for m = 100 and m = 150]

  m     Mean    StDev   Minimum   Q1      Median   Q3      Maximum
  100   455.0   315.9   113.2     252.6   373.7    558.9   2081.4
  150   515.5   229.0   133.5     340.4   449.9    652.1   1276.1

Summary statistics for the conditional IC run length distribution of the SL chart for 100 different Phase-I representative samples for m = 100 and 150.

Recommendations

  • A large number of Phase I observations is needed so that the chart has the specified unconditional ARL0.
  • 100-150 Phase I observations are needed for thin-tailed distributions such as the normal.
  • For moderate to heavy tailed distributions, preliminary calculations suggest that the m needed is likely to be higher, closer to 400.
  • These are not very different from the more recent recommendations for normal charts for the mean.


Summary

  • The SL chart monitors the location and the scale simultaneously on a single chart
  • It maintains the nominal ARL0 for all continuous distributions
  • It provides an overall decision (some change in the process) and a follow-up diagnostic decision (a shift in either location or scale or both)
  • Does not require the practitioner to make the assumption of normality or any other distribution for the validity of the decision.
  • Thus the proposed chart can be useful in practical applications.

Current Work

  • In Progress
      • Joint Monitoring -- Nonparametric
          • Consider refinements to Lepage statistics
          • Consider other than Shewhart-type charts
          • Study performance
      • Joint Monitoring -- Parametric
          • New Normal theory charts
          • Charts for other distributions
  • Future …. ?

Thank you!