SLIDE 1

Bayesian Inference Technique for Data Mining for Yield Enhancement in Semiconductor Manufacturing Data

Presenter: M. Khakifirooz
Co-authors: C.-F. Chien, Y.-J. Chen
National Tsing Hua University

ISMI 2015, 16th-18th Oct., KAIST, Daejeon, Korea
SLIDE 2

Outline

  • The Purpose of Bayesian Inference
  • Data Structure Provided by the Data Model
  • Data Analysis Approach
      • Bayesian Variable Selection (BVS)
      • Data Clearance
      • Yield Classification
  • Conclusive Research Framework
  • Final Decision Table
  • Conclusion & Path Forward

SLIDE 3

The Purpose of Bayesian Inference

Bayesian inference techniques: Naïve Bayesian Classifier, Gaussian Bayesian Classifier, …, Bayesian Networks

Learning Curve

SLIDE 4

The Purpose of Bayesian Inference

Yield Learning Curve of Semiconductor Manufacturing (Human Experience vs. Human Experience + System Analysis): in addition to data analytics, cumulative engineering training and experience significantly enhance yield improvement.

Effron (1996), Tobin et al. (1999)

SLIDE 5

Data Structure Provided by the Data Model

Notation:

  j = 1, …, N                # of process stages (sample size)
  1 ≤ l_j ≤ O                # of specified tools at each stage
  o_jk, k = 1, …, l_j        frequency of each specified tool
  1 ≤ Q_{o_jk} ≤ o_jk        # of existing chambers for each tool
  q_m, m = 1, …, Q_{o_jk}    frequency of each existing chamber

  O = Σ_{k=1}^{l_j} Σ_{m=1}^{Q_{o_jk}} q_m,    O × N = Σ_{j=1}^{N} Σ_{k=1}^{l_j} Σ_{m=1}^{Q_{o_jk}} q_m

Response variable: %Yield (continuous)
Explanatory variables: stages (tools-chambers) (nominal); stages (process time) (continuous)

Nominal variables (example):

  Obs.  var1  var2
  n1    a1    a2
  n2    b1    b2
  n3    a1    NA

Dummy variables:

  Obs.  var1-a1  var1-b1  var2-a2  var2-b2
  n1    1        0        1        0
  n2    0        1        0        1
  n3    1        0        0        0
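The nominal-to-dummy expansion shown above can be sketched with pandas; the frame below uses illustrative variable names and levels (var1/var2 with levels a1/b1 and a2/b2), not the real production data:

```python
import pandas as pd

# Illustrative stage variables for three observations (names/levels hypothetical)
df = pd.DataFrame({
    "var1": ["a1", "b1", "a1"],   # tool-chamber level at stage 1
    "var2": ["a2", "b2", "b2"],   # tool-chamber level at stage 2
})

# One-hot (dummy) encoding: every nominal level becomes a 0/1 indicator column,
# which is where the high dimensionality of the dummy design matrix comes from
dummies = pd.get_dummies(df, dtype=int)
print(list(dummies.columns))        # ['var1_a1', 'var1_b1', 'var2_a2', 'var2_b2']
print(dummies["var1_a1"].tolist())  # [1, 0, 1]
```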

SLIDE 6

Data Structure Provided by the Data Model

Tool assignments per stage:

  Obs.    Stage 1  Stage 2
  Obs. 1  Tool 1   Tool 2
  Obs. 2  Tool 1   Tool 1
  Obs. 3  Tool 2   Tool 2

Chamber assignments per stage:

  Obs.    Stage 1    Stage 2
  Obs. 1  Chamber 1  Chamber 2
  Obs. 2  Chamber 2  Chamber 1
  Obs. 3  Chamber 1  Chamber 2

Combined tool-chamber assignments per stage:

  Obs.    Stage 1           Stage 2
  Obs. 1  Tool 1.Chamber 1  Tool 2.Chamber 2
  Obs. 2  Tool 1.Chamber 2  Tool 1.Chamber 1
  Obs. 3  Tool 2.Chamber 1  Tool 2.Chamber 2

Process dates per stage:

  Obs.    Stage 1   Stage 2
  Obs. 1  Date 1.1  Date 1.2
  Obs. 2  Date 2.1  Date 2.2
  Obs. 3  Date 3.1  Date 3.2

Dummy encoding of the stage-tool-chamber combinations (Yield is the response; a 1 marks the combination used at each stage, with zeros elsewhere):

  Obs.    S1.T1.Ch1  S1.T1.Ch2  S1.T2.Ch1  S2.T1.Ch1  S2.T2.Ch2
  Obs. 1  1          0          0          0          1
  Obs. 2  0          1          0          1          0
  Obs. 3  0          0          1          0          1

The same dummy structure can carry the process date of each visited stage-tool-chamber combination (Date i.j entries in place of the 1s).

SLIDE 7

Data Structure Provided by the Data Model

  Obs.  var1-a1  var1-b1  var1-c1
  n1    1        0        0
  n2    0        0        1
  n3    0        1        0

  Pr(i-th variable selected) = 1/3 for each of var1-a1, var1-b1, var1-c1
  (selection probability based on engineer experience)

Selection indicator: e ~ Multinomial(1; 1/3, 1/3, 1/3), yielding the one-hot vectors (1,0,0), (0,1,0), (0,0,1) for var1-a1, var1-b1, var1-c1.

To randomly pick a point in this space, we need a continuous distribution. The distribution over the multinomial parameters (the posterior distribution) is the Dirichlet distribution.
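The selection mechanism above (a one-hot multinomial draw whose probabilities themselves receive a continuous Dirichlet distribution) can be sketched with NumPy; the uniform prior counts are an assumption standing in for the engineer-elicited probabilities:

```python
import numpy as np

rng = np.random.default_rng(42)

# Prior counts for the three candidate dummies (var1-a1, var1-b1, var1-c1);
# uniform here, matching the 1/3 selection probabilities on the slide
prior_counts = np.array([1.0, 1.0, 1.0])

# A Dirichlet draw is a continuous point on the probability simplex ...
theta = rng.dirichlet(prior_counts)

# ... and Multinomial(1, theta) turns it into a one-hot selection vector
pick = rng.multinomial(1, theta)
print(pick)   # e.g. [0, 1, 0] -- exactly one variable selected
```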

slide-8
SLIDE 8

๏ฎ Critical Phenomena: i.

High dimensionality caused by transforming categorical variables to dummies

ii.

Multicollinearity caused by dummies nature

iii.

Complicated posterior distribution caused hardness for direct variable selection

  • Remedy:

Approximate Inference with Sampling

Use random sampling (MCMC techniques: Gibbs sampler, Metropolis-Hastings,โ€ฆ) to approximate the distribution and selecting significant explanatories

8

Data Analysis Approach

SLIDE 9

Data Analysis Approach: Gibbs Sampler

Suppose x1, x2 ~ Pr(x1, x2). Beginning with initial values x1^(0), x2^(0), sample at iteration t as follows:

  Iteration t:  sample x1^(t) ~ Pr(x1 | x2^(t-1)),  then sample x2^(t) ~ Pr(x2 | x1^(t))

Iterate the above step until the sampled values have the same distribution as if they were sampled from the true joint posterior distribution.

Based on the frequency of visits, select the most probable variables.
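As a concrete, runnable instance of the two-step recipe above, here is a Gibbs sampler for a standard bivariate normal with correlation ρ (a textbook example, not the slide's yield posterior); each conditional Pr(x1 | x2) is the univariate normal N(ρ·x2, 1 − ρ²):

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter=5000, seed=0):
    """Gibbs-sample a standard bivariate normal with correlation rho.

    Conditionals: x1 | x2 ~ N(rho*x2, 1 - rho^2), and symmetrically for x2.
    """
    rng = np.random.default_rng(seed)
    x1, x2 = 0.0, 0.0                  # initial values x1^(0), x2^(0)
    sd = np.sqrt(1.0 - rho**2)
    samples = np.empty((n_iter, 2))
    for t in range(n_iter):
        x1 = rng.normal(rho * x2, sd)  # x1^(t) ~ Pr(x1 | x2^(t-1))
        x2 = rng.normal(rho * x1, sd)  # x2^(t) ~ Pr(x2 | x1^(t))
        samples[t] = (x1, x2)
    return samples

samples = gibbs_bivariate_normal(rho=0.8)
# After burn-in the draws behave like draws from the true joint distribution
print(np.corrcoef(samples[1000:].T)[0, 1])   # close to 0.8
```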

SLIDE 10

Data Analysis Approach: Data Clearance

When X is categorical (dummy variables) and Y is a quantitative variable:

  • parametric or non-parametric?
  • dependent or independent?
  • unbalanced classes?

Yield classification by cut points:

  Bad yield:     yield < 53.12
  Middle yield:  53.12 ≤ yield ≤ 57.51  (ignored)
  Good yield:    yield > 57.51
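The cut-point rule above maps directly to a small helper (the thresholds 53.12 and 57.51 are the ones on the slide; the function name is illustrative):

```python
def yield_class(y, low=53.12, high=57.51):
    """Classify a wafer's %yield as 'bad', 'middle', or 'good'."""
    if y < low:
        return "bad"
    if y > high:
        return "good"
    return "middle"   # middle-yield wafers are ignored downstream

print([yield_class(y) for y in (50.0, 55.0, 60.0)])   # ['bad', 'middle', 'good']
```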

SLIDE 11

Data Analysis Approach: Data Clearance

Variable I has levels a and b; Variable II has levels c (frequencies f_ca, f_cb) and d (frequencies f_da, f_db).

If both var. I and var. II are explanatory:

  • test the interchangeability of the measures
  • measure the degree of homogeneity

If var. I is explanatory and var. II is the response:

  • measure the reliability of the instrument (test/scale)
  • measure the objectivity, or lack of bias

MEASUREMENT OF AGREEMENT

  • W. S. Robinson (1957)

Cohen's Kappa κ:

  κ < 0:         "No agreement"
  0 ≤ κ < 0.2:   "Slight agreement"
  0.2 ≤ κ < 0.4: "Fair agreement"
  0.4 ≤ κ < 0.6: "Moderate agreement"
  0.6 ≤ κ < 0.8: "Substantial agreement"
  0.8 ≤ κ ≤ 1:   "Almost perfect agreement"
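Cohen's κ for a pair of 0/1 dummy variables can be computed directly from observed and chance agreement; this is a minimal sketch (a production pipeline might instead call, e.g., scikit-learn's cohen_kappa_score):

```python
from collections import Counter

def cohens_kappa(x, y):
    """Cohen's kappa between two equal-length label sequences."""
    n = len(x)
    p_observed = sum(a == b for a, b in zip(x, y)) / n
    # Chance agreement under independent marginal label frequencies
    cx, cy = Counter(x), Counter(y)
    p_expected = sum(cx[k] * cy.get(k, 0) for k in cx) / n**2
    return (p_observed - p_expected) / (1 - p_expected)

# Two dummy variables agreeing on 5 of 6 wafers
a = [1, 1, 0, 0, 1, 0]
b = [1, 1, 0, 0, 0, 0]
print(round(cohens_kappa(a, b), 3))   # 0.667 -> "substantial agreement"
```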

SLIDE 12

Research Framework (I): A Bayesian Framework for Semiconductor Manufacturing Data

Problem Definition → Data Preparation → Data Mining & Key Factor Screening:

  • Data integration
  • Dummy variable construction for the integrated variables (1,460 variables)
  • Cohen's Kappa statistics for each pair of input variables
      • Agreement → wrap the associated variables
      • No agreement → assign cutting points and Bad/Middle/Good wafers

THE CLASS DISTRIBUTION OF THE KAPPA TEST FOR EACH PAIR OF INPUT VARIABLES

  Almost perfect agreement        3
  Substantial agreement         109
  Moderate agreement          1,764
  Fair agreement             24,539
  Slight agreement          280,081
  No agreement              758,574

SLIDE 13

Research Framework (II): Data Mining & Key Factor Screening

BVS via Gibbs sampler → Data Clearance (κ ≤ 0.2) → Cohen's Kappa statistics for each pair of X and Y:

  • No agreement → define abnormal devices & times
  • Agreement → GLM construction with a Gaussian distribution & repeated random sub-sampling validation; comparison to the wrapped variables

Model construction, evaluation & interpretation:

  Model        RMSE                        Adjusted R-squared
               Min     Median  Max         Min     Median  Max
  Gibbs + GLM  1.842   2.653   2.841       0.046   0.371   0.711
  GBM + GLM    2.534   3.051   3.332       0.000   0.053   0.337
  RF + GLM     2.268   2.838   3.660       0.016   0.293   0.507
  GLM          7.951   34.60   139.8       0.000   0.029   0.214

  (Number of resamples: 20; number of iterations: 2)
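The repeated random sub-sampling validation behind the RMSE table can be sketched as follows; the synthetic data and the plain least-squares fit are stand-ins for the real wafer data and the Gibbs-selected GLM:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in: 200 wafers, 10 selected dummy predictors, linear yield signal
X = rng.integers(0, 2, size=(200, 10)).astype(float)
beta = rng.normal(size=10)
y = X @ beta + rng.normal(scale=0.5, size=200)

rmses = []
for _ in range(20):                     # 20 resamples, as reported on the slide
    idx = rng.permutation(200)
    train, test = idx[:150], idx[150:]  # fresh random 75/25 split each round
    coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
    resid = y[test] - X[test] @ coef
    rmses.append(float(np.sqrt(np.mean(resid**2))))

# Summarize like the slide's table: min / median / max RMSE over resamples
print(min(rmses), float(np.median(rmses)), max(rmses))
```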

SLIDE 14

Decision Graph

(Figure: decision graph of the selected factors; legend: High Yield, Middle Yield, Low Yield)

SLIDE 15

Decision Table

  Factor                              Date (Bad)                                 Date (Good)
  Stage10 - Tool2 - Chamber3          before 8/29/2014 2:32                      after 8/29/2014 12:50
  Stage12 - Tool2 - Chamber1          between 8/30/2014 3:26 & 8/30/2014 3:43    before 8/29/2014 10:55
  Stage12 - Tool2 - Chamber4          after 8/29/2014 7:36 till 8/30/2014 3:44   before 8/29/2014 7:36
  Stage13 - Tool5 - Chamber2          generally affected the high yield
  Stage17 - Tool2 - Chamber2          after 8/30/2014 12:21                      before 8/30/2014 10:37
  Stage23 - Tool3 - Chamber2          generally affected the high yield
  Stage44 - Tool7 - Chambers 2 and 3  at 9/3/2014                                at 9/1/2014
  Stage49 - Tool1 - Chamber4          at 9/3/2014                                at 9/2/2014
  Stage57 - Tool1 - Chamber3          generally affected the high yield

SLIDE 16

Conclusion & Path Forward

  • Based on the empirical results, we validate that the proposed approach is practically viable: adding domain knowledge and engineering experience to the system can improve the results.

  • One use of domain knowledge is to restrict the conjunctions in rules to tools, chambers, and steps whose events occur within a reasonable time frame.

  • The data are not sampled from a stationary population; hence, over time the results may change significantly, or some empirical answers may be rejected on the basis of engineering domain knowledge, which does not mean that a result is incorrect.

  • A result may be a proxy for one or more events occurring elsewhere or at other periods of time; hence, a simulation study is an essential tool for evaluating the accuracy of the proposed method.
