A possible Validation Metrics for the ACTD C. W. Gear July 19, - - PowerPoint PPT Presentation


SLIDE 1

A possible Validation Metrics for the ACTD

  • C. W. Gear

July 19, 2004 (a modification of a presentation of June 16 by CWG and Skip Knowles)

SLIDE 2

Making the World Safer

Background

  • Typically, calculations are visually compared with an “acceptable” experimental data set or an analytic solution

– Quality of the calculation is determined by “expert” judgment
– Different experts may reach different conclusions
– No audit trail defining how the conclusions were reached

  • Which aspects of the calculations were deemed to be important?
  • What is the relative ranking of each of these aspects?

– Implicit rules for ranking calculations established after the calculations are completed

  • May lead to sense of unfairness where relative rankings may impact future contract work

– Quality of the experimental data can be a key factor

  • A more orderly and well defined procedure is highly desirable
SLIDE 3

Metrics

  • Provide an unbiased mathematical basis for comparing calculations with experimental data or analytic solutions
  • Emphasis on ACTD applications
  • Recommended metrics to be evaluated against early JOLT experimental data
  • Refinements to be made as appropriate
SLIDE 4

Some Definitions

  • Normalization – scaling of the result to achieve some objective. This could include:

– Range Normalization

  • Limit the metric values to a fixed, predetermined range

– Amplitude Normalization

  • Normalize the difference between calculated and measured values
  • Weighting

– 1) Multipliers to assign different importance to different parts of the calculation, e.g., the waveform
– 2) Multipliers to assign different quality to different experimental data

SLIDE 5

Validation Metric Objectives

  • Permit an assessment of the state of the art of each of the theoretical predictions of interest to the ACTD customer
  • Provide a clear and unbiased means of comparing different calculations

– Ground rules established in advance of comparison

  • Reproduce the results of truly expert judgment comparisons

– A validation metric should not be a replacement for expert judgment

  • Define and quantify the expert judgment process

– What is important
– What is the relative importance of different aspects

  • Include consideration of experimental data uncertainties if appropriate
  • Provide an audit trail for the assessment process
SLIDE 6

What Aspects of the Calculations are Important?

  • Emphasis should be on what is important for the customer

– Avoid concentrating on interesting nuances

  • Key Issues

– Peak environments (velocity, displacement, strain)
– Tunnel damage (rubble depth, rubble velocity, material damage)

  • Significant issues

– Wave shape near the peak
– Significant TOA (shock, peak velocity, reliefs, reflections, etc.)

  • Unimportant Issues

– Very low amplitude TOA resulting from numerics
– Wave shape in the tail
– High frequency content (unless peak accelerations are of interest)

  • Different problems may require different emphasis
SLIDE 7

Example of TOA Issues

  • Calculation “a” compares well with data although calculation “b” may be closer in many metrics
  • Difference in TOA is of no significance for predicting tunnel damage
  • While we probably want to favor a computation that gets a good approximation to the TOA, we propose to evaluate the TOA separately from the waveshape in making the comparison

SLIDE 8

Is a Single Metric Sufficient?

  • Single metric cannot capture complex aspects of importance to the customer

– Database of tunnel damage insufficient for direct comparison
– Must calculate environment and predict tunnel damage
– Multiple aspects of environments must be considered to have confidence in the predictions

  • Must compare code results with what can be reliably measured
  • Important to tie into existing database

– IGVN approach

SLIDE 9

Relative Importances – a proposal (scale of 0-10)

  • Key Issues

– (10) Peak environments (velocity, displacement, strain)
– (10) Tunnel damage (rubble depth, rubble velocity, material damage)

  • Significant issues

– (5) Wave shape near the peak
– (2) Significant TOA (shock, peak velocity, reliefs)

  • Unimportant Issues

– (0) Very low amplitude TOA resulting from numerics
– (1) Wave shape in the tail
– (1) High frequency content (higher if peak accelerations are of interest)

We propose to measure each component separately and combine them using these weights (or others to be decided).

SLIDE 10

For the waveshape metric we propose a weighted sum of squares

$$M = \left[ \sum_{i=1}^{N} w_i (c_i - m_i)^2 \right] / \text{Normalization}$$

where $m_i$ is the measured value of a function at time $t_i$ and $c_i$ is the computed value at the corresponding time*. Normalization is a divisor to be discussed later, and $w_i$ is a weight, also to be discussed later.

Important points about weights:

  • 1. They depend on m only, not on c, so that the same weights are used for comparisons of all computations
  • 2. They will be chosen to give greater weight in regions of large m since those are considered more important to failure modes.

* The corresponding time is a time shifted to remove time of arrival differences so that we are measuring only the waveshape differences in this calculation.

SLIDE 11

What is the Rationale for the Sum of the Squares?

If M (measurement) and C (computation) are data from the same model, we expect to see close values more often than very different values:

[Figure: sketch of the probability of a difference plotted against the difference d, with regions labeled “much less”, “close”, and “much more”]

If each measurement contains the sum of many small errors due to various causes (rock variations, instrumentation, …), its error is approximately Gaussian (by the central limit theorem, loosely called the “law of large numbers”). The probability looks like

$$A \exp(-d^2/\sigma^2)$$

where $\sigma$ is an estimate of how much difference, $d$, we might expect (the standard deviation). If we have many measurements, $d_i$, each with its own $\sigma_i$, the joint probability is proportional to the product

$$\exp[-(d_1/\sigma_1)^2] \, \exp[-(d_2/\sigma_2)^2] \cdots \exp[-(d_n/\sigma_n)^2] = \exp\left[-\sum_i (d_i/\sigma_i)^2\right]$$

Thus, $SS = \sum_i (d_i/\sigma_i)^2$ is an appropriate combination of the various differences. Zero means “perfect” and infinity means “perfectly awful.”
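The step from a product of Gaussian factors to a single sum of squares can be checked numerically. The following is a minimal sketch only: the differences and standard deviations are made-up illustrative numbers, and the function names are hypothetical.

```python
import math

def sum_of_squares(diffs, sigmas):
    """SS = sum((d_i / sigma_i)**2) over paired differences and deviations."""
    return sum((d / s) ** 2 for d, s in zip(diffs, sigmas))

def joint_relative_probability(diffs, sigmas):
    """Product of the individual factors exp(-(d_i / sigma_i)**2)."""
    prob = 1.0
    for d, s in zip(diffs, sigmas):
        prob *= math.exp(-((d / s) ** 2))
    return prob

# Illustrative (made-up) differences d_i and expected deviations sigma_i.
diffs = [0.5, -1.0, 0.2]
sigmas = [1.0, 2.0, 0.5]

ss = sum_of_squares(diffs, sigmas)
product = joint_relative_probability(diffs, sigmas)

# The product of exponentials equals the exponential of minus the sum.
assert math.isclose(product, math.exp(-ss))
```

A perfect match (all differences zero) gives SS = 0 and a relative probability of 1; larger SS values push the relative probability toward 0.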

SLIDE 12

Why the Sum of the Squares - continued

If we had statistical information about the distributions of experimental errors, and of data such as rock characteristics, I would argue that we should stay with a metric that ranges from 0 to infinity, normalizing it by the known/estimated standard deviations. This would give us information that could be used in Bayesian updating of probabilities of damage. Then, if you really want the range [0,1] you can take the exponential of its negative to get a probability.

However, in this case, we do not have any statistical information, so I will propose a normalization in which a metric value of 0.4 means a 40% deviation from the waveform. That is, if the calculated waveform is exactly 60% of the measured, or exactly 140%, the metric will be 0.4 (and similarly for other values).

SLIDE 13

Weighting of Experimental Data

  • Better data should count more than poor data
  • Currently data quality is determined by expert judgment

– Usually done by the experimenter
– Based on cleanness of the data, observed scatter, etc.

  • For now, consider weighting experimental data points to reflect data quality & quantity

– Give each data set used in a metric (i.e., the waveform from a gage, or the TOA data from a gage) a rating 0 ≤ Q, with larger numbers meaning better quality/quantity of data

  • More post shot data analysis is desirable

– e.g., better comparisons of data from different gages
– Could improve quality and assessment of data

SLIDE 14

The Waveform Metric

We compute the following:

$$m_{\max} = \max_i |m_i| \qquad \text{(maximum measured)}$$

$$Q = \sum_{i=1}^{N} \left( \frac{|m_i|}{m_{\max}} \right)^{p} (t_{i+1} - t_{i-1}), \quad \text{with } t_0 = t_1 \text{ and } t_{N+1} = t_N \qquad \text{(Qual…ntity)}$$

$$QS = \sum_{i=1}^{N} \left( \frac{|m_i|}{m_{\max}} \right)^{p} m_i^2 \, (t_{i+1} - t_{i-1}) \qquad \text{(Q scaled factor)}$$

$$M_W = \sqrt{ \sum_{i=1}^{N} \left( \frac{|m_i|}{m_{\max}} \right)^{p} (m_i - c_i)^2 \, (t_{i+1} - t_{i-1}) \, / \, QS } \qquad \text{(metric value)}$$

MW = 0 if waveforms agree (after TOA shift). MW = 1 if c = 0 or c = 2m.

For now, I recommend a value p = 1. This places more importance on larger m regions.

The sum is an approximation to twice the integral, so it is nearly time spacing independent.

Q is the area under the graph of $(|m|/m_{\max})^2$. If we had statistical knowledge of data quality, that should also be factored in.

QS is chosen such that the metric for c = 0 is 1.
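A minimal sketch of how Q, QS and MW might be evaluated for one gage is given below. It assumes that the weight is (|m_i|/m_max)^p, that the end points use t_0 = t_1 and t_{N+1} = t_N, and that MW is the square root of the normalized weighted sum (the reading under which a uniform 40% amplitude error scores 0.4). The function name and the sample waveform are hypothetical.

```python
import math

def waveform_metric(t, m, c, p=1.0):
    """Return (Q, QS, MW) for measured m and TOA-shifted computed c at times t.

    Assumes weights (|m_i| / m_max)**p and a square root on the final ratio.
    """
    n = len(t)
    m_max = max(abs(v) for v in m)
    q = qs = num = 0.0
    for i in range(n):
        # End-point convention: t_0 = t_1 and t_{N+1} = t_N.
        t_prev = t[i - 1] if i > 0 else t[0]
        t_next = t[i + 1] if i < n - 1 else t[-1]
        w = (abs(m[i]) / m_max) ** p * (t_next - t_prev)
        q += w
        qs += w * m[i] ** 2
        num += w * (m[i] - c[i]) ** 2
    return q, qs, math.sqrt(num / qs)

# Made-up negative-going pulse, used only to exercise the limiting cases.
t = [0.0, 0.5, 1.0, 1.5, 2.0]
m = [0.0, -40.0, -90.0, -30.0, -5.0]

_, _, mw_same = waveform_metric(t, m, m)                      # identical waveforms
_, _, mw_zero = waveform_metric(t, m, [0.0] * len(m))         # c = 0
_, _, mw_double = waveform_metric(t, m, [2.0 * v for v in m]) # c = 2m
_, _, mw_60 = waveform_metric(t, m, [0.6 * v for v in m])     # c = 0.6 m
```

Under these assumptions the identical case gives 0, the cases c = 0 and c = 2m both give 1, and a computed waveform at 60% of the measured amplitude gives 0.4.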

SLIDE 15

Combining multiple waveform metrics

If we have a set of waveforms and have evaluated their metrics, $M_{Wj}$, $j = 1, 2, \ldots, m$, and the corresponding $Q_j$ values, we compute the combined metric and Q value as:

$$Q = \sum_{j=1}^{m} Q_j$$

(this means we have more data as the basis for our metric)

$$M_W = \sqrt{ \sum_{j=1}^{m} Q_j M_{Wj}^2 \, / \, Q }$$

(this is a weighted average)

Similar approach applies to combination of any metrics of same type
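The combination rule can be sketched as below, assuming the quadratic (root-mean-square) form M_W = sqrt(Σ Q_j M_Wj² / Q). The function name is hypothetical. The inputs are the Q values and LLL waveform metrics listed in the Jolt-1 example results for gages A04-A06 and Q01-Q06.

```python
import math

def combine_same_type(metrics, qs):
    """Return (Q, M) with Q = sum(Q_j) and M = sqrt(sum(Q_j * M_j**2) / Q)."""
    q_total = sum(qs)
    weighted = sum(q * m * m for q, m in zip(qs, metrics))
    return q_total, math.sqrt(weighted / q_total)

# Values as listed in the Jolt-1 example table (LLL column).
q_values = [0.8250, 0.8715, 0.9906, 0.8932, 0.5678,
            1.0083, 0.5713, 0.5916, 0.9005]
lll = [0.7110, 0.6805, 0.7630, 0.9038, 0.7778,
       0.8815, 0.7830, 0.8369, 0.9122]

q_total, m_lll = combine_same_type(lll, q_values)
print(q_total, m_lll)
```

With these inputs the quadratic form agrees with the listed combined values (Q of 7.2197 and a combined LLL waveform metric of 0.8126) to within the rounding of the tabulated numbers.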

SLIDE 16

TOA Metrics

  • TOA is initially estimated as the time at which the waveform first reaches p% of the maximum (if positive going). (p has to be chosen based on the initial shape.)
  • The TOA of the computed waveform is then “fine tuned” by comparison with the experimental data by minimizing the function

$$\sum_{i=1}^{N} [m_i - c_i(\delta t)]^2 + \mu \, \delta t^2$$

Here $\delta t$ is a small shift in the relative time of the computed waveform, $c_i(\delta t)$ is the (interpolated) value of $c_i$ after the additional time shift of $\delta t$, and $\mu$ is a penalty function multiplier to prevent large values of $\delta t$ when the waveforms are very different.

If the TOAs of computed and measured are TOAc and TOAm, then the TOA metric is

$$M_T = |TOA_c - TOA_m| / TOA_m$$

For now, its Q value is the same as that of the waveform
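The two TOA steps can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the threshold is applied to the magnitude of the signal, c_i(δt) is taken to be the computed waveform linearly interpolated at t_i + δt, and the minimization is a plain grid search. All function names, the grid parameters and the sample data are hypothetical.

```python
def initial_toa(t, y, p_percent):
    """Time at which |y| first reaches p% of its peak magnitude."""
    threshold = (p_percent / 100.0) * max(abs(v) for v in y)
    for i, v in enumerate(y):
        if abs(v) >= threshold:
            if i == 0:
                return t[0]
            v0 = abs(y[i - 1])
            frac = (threshold - v0) / (abs(v) - v0)
            return t[i - 1] + frac * (t[i] - t[i - 1])
    return t[-1]

def interp(t, y, x):
    """Linear interpolation of (t, y) at x, holding end values outside."""
    if x <= t[0]:
        return y[0]
    if x >= t[-1]:
        return y[-1]
    for i in range(1, len(t)):
        if x <= t[i]:
            frac = (x - t[i - 1]) / (t[i] - t[i - 1])
            return y[i - 1] + frac * (y[i] - y[i - 1])

def fine_tune_shift(t, m, c, mu, max_shift, steps=200):
    """Grid-search dt minimizing sum((m_i - c(t_i + dt))**2) + mu * dt**2."""
    best_dt, best_cost = 0.0, float("inf")
    for k in range(-steps, steps + 1):
        dt = max_shift * k / steps
        cost = mu * dt * dt
        for ti, mi in zip(t, m):
            cost += (mi - interp(t, c, ti + dt)) ** 2
        if cost < best_cost:
            best_dt, best_cost = dt, cost
    return best_dt

def toa_metric(toa_c, toa_m):
    """M_T = |TOA_c - TOA_m| / TOA_m."""
    return abs(toa_c - toa_m) / toa_m

# Made-up data: the computed pulse arrives one time unit late.
t = [float(i) for i in range(11)]
m = [0.0, 0.0, 0.0, 1.0, 3.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
c = [0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 2.0, 1.0, 0.0, 0.0, 0.0]

toa_m = initial_toa(t, m, 10.0)
toa_c = initial_toa(t, c, 10.0)
shift = fine_tune_shift(t, m, c, mu=0.01, max_shift=2.0)
```

In this example the grid search recovers the one-unit delay. A larger μ pulls the selected shift toward zero, which is the stated purpose of the penalty term when the waveforms differ strongly.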

SLIDE 17

Combining Metrics of Different Forms

Importance factors, $IMP_W$, $IMP_T$, …, are assigned for each type of metric. The metrics are then combined as a weighted average:

$$M = \frac{IMP_W \, Q_W M_W + IMP_T \, Q_T M_T + \cdots}{IMP_W \, Q_W + IMP_T \, Q_T + \cdots}$$
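A minimal sketch of the weighted average, taken literally as written on this slide, is shown below. The function name and the numerical inputs are hypothetical and serve only to show the arithmetic.

```python
def combine_metric_types(entries):
    """Weighted average over a list of (importance, Q, metric) triples."""
    num = sum(imp * q * m for imp, q, m in entries)
    den = sum(imp * q for imp, q, _ in entries)
    return num / den

# Made-up values: a waveform metric of 0.5 with importance 10 and a
# TOA metric of 0.2 with importance 2, both with Q = 2.0.
entries = [(10.0, 2.0, 0.5), (2.0, 2.0, 0.2)]
overall = combine_metric_types(entries)
# (10*2*0.5 + 2*2*0.2) / (10*2 + 2*2) = 10.8 / 24 = 0.45
```

Because each term is scaled by both its importance factor and its Q, a metric type backed by little good data contributes less even when its importance factor is high.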

SLIDE 18

Example: gage A04 – unshifted data

[Figure: A04 raw data. SRI black, LLL red, SNL green, TRT blue, WAI cyan]

SLIDE 19

Example: gage A04 – TOA corrected data

[Figure: A04 shifted TOA raw data. SRI black, LLL red, SNL green, TRT blue, WAI cyan]

SLIDE 20

Example results from Jolt-1, gages A4-A6 and Q1-Q6 (two slides)

WAVEFORM METRICS

Test   Q        QS        LLL      SNL      TRT      WAI
A04    0.8250    343.72   0.7110   1.8575   0.2327   0.4017
A05    0.8715    296.06   0.6805   2.1307   0.1889   0.3485
A06    0.9906    240.67   0.7630   1.4542   0.5622   0.4921
Q01    0.8932   2509.80   0.9038   3.0051   0.6363   0.9075
Q02    0.5678   2207.81   0.7778   2.2435   0.2991   0.4868
Q03    1.0083   2568.52   0.8815   4.5505   0.3306   0.6341
Q04    0.5713   5584.08   0.7830   2.3099   0.4413   0.7109
Q05    0.5916   3197.17   0.8369   2.4539   0.4265   0.8393
Q06    0.9005   3219.64   0.9122   2.5757   0.5675   0.8165

Waveform combined         0.8126   2.7167   0.4436   0.6510

SLIDE 21

Example results from Jolt-1, gages A4-A6 and Q1-Q6 (slide two)

TOA (SRI) and DIFFERENCES divided by SRI

Test   Q        SRI       LLL       SNL       TRT       WAI
A04    0.8250   0.2704    0.2168    0.1290    0.2435    0.3110
A05    0.8715   0.2726    0.2070    0.1199    0.2060    0.3299
A06    0.9906   0.3519    0.2930    0.0415    0.0976    0.3507
Q01    0.8932   0.3551    0.2617    0.0744    0.0769    0.3170
Q02    0.5678   0.1675   -0.0627    0.0951   -0.0171   -0.0604
Q03    1.0083   0.3015    0.3931    0.2139    0.1872    0.4190
Q04    0.5713   0.1541   -0.0295    0.1892   -0.0482   -0.0831
Q05    0.5916   0.2211    0.0823    0.1963    0.0743    0.1240
Q06    0.9005   0.3283    0.2594    0.1148    0.0853    0.2787

Scaled TOA diffs combined           0.2484    0.1389    0.1428    0.2997

Using IMP(Waveform) = 10.000, IMP(TOA) = 2.000

          Q        LLL      SNL      TRT      WAI
Overall   7.2197   0.7487   2.4807   0.4091   0.6068
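As a check on the example numbers, the sketch below recombines the listed waveform and TOA values for each calculator. It assumes the two submetrics are combined quadratically, as sqrt(Σ IMP·Q·M² / Σ IMP·Q). That assumption is what reproduces the Overall row to within rounding; a linear weighted average of the same inputs gives about 0.7186 for LLL rather than 0.7487. The function name is hypothetical.

```python
import math

def combine_rms(entries):
    """sqrt(sum(IMP * Q * M**2) / sum(IMP * Q)) over (IMP, Q, M) triples."""
    num = sum(imp * q * m * m for imp, q, m in entries)
    den = sum(imp * q for imp, q, _ in entries)
    return math.sqrt(num / den)

IMP_W, IMP_T = 10.0, 2.0
Q = 7.2197  # the TOA Q value is taken to be the same as the waveform's

# Combined values as listed in the example tables.
waveform = {"LLL": 0.8126, "SNL": 2.7167, "TRT": 0.4436, "WAI": 0.6510}
toa = {"LLL": 0.2484, "SNL": 0.1389, "TRT": 0.1428, "WAI": 0.2997}
listed = {"LLL": 0.7487, "SNL": 2.4807, "TRT": 0.4091, "WAI": 0.6068}

overall = {
    name: combine_rms([(IMP_W, Q, waveform[name]), (IMP_T, Q, toa[name])])
    for name in waveform
}
for name, value in overall.items():
    print(name, round(value, 4), "listed:", listed[name])
```

Since both submetrics share the same Q here, Q cancels and the result depends only on the importance factors 10 and 2.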

SLIDE 22

SUMMARY

The five most important points in this proposal are:

  • 1. We use a modified version of the sum of squares as a metric, where high signal regions are more heavily weighted to reflect their relative importance
  • 2. The measured (experimental) values are treated as special and any weighting is done with respect to them, not to computed variables (which change from calculator to calculator so would result in different weighting being used by different calculators)
  • 3. We apply different “importance weights” to different features to reflect expert judgment of the relative importance of the various features.
  • 4. We separate the Time of Arrival from the waveform comparison so that it can be given a different importance weight in the final metric
  • 5. We estimate the amount of “good data”, Q, in each submetric, so that when submetrics are combined we can give more weight to those with larger amounts of good data.

SLIDE 23

[Figure: A05 shifted TOA raw data. SRI black, LLL red, SNL green, TRT blue, WAI cyan]

SLIDE 24

[Figure: A06 shifted TOA raw data. SRI black, LLL red, SNL green, TRT blue, WAI cyan]

SLIDE 25

[Figure: Q01 shifted TOA raw data. SRI black, LLL red, SNL green, TRT blue, WAI cyan]

SLIDE 26

[Figure: Q02 shifted TOA raw data. SRI black, LLL red, SNL green, TRT blue, WAI cyan]

SLIDE 27

[Figure: Q03 shifted TOA raw data. SRI black, LLL red, SNL green, TRT blue, WAI cyan]

SLIDE 28

[Figure: Q04 shifted TOA raw data. SRI black, LLL red, SNL green, TRT blue, WAI cyan]

SLIDE 29

[Figure: Q05 shifted TOA raw data. SRI black, LLL red, SNL green, TRT blue, WAI cyan]

SLIDE 30

[Figure: Q06 shifted TOA raw data. SRI black, LLL red, SNL green, TRT blue, WAI cyan]