Spatial forecast verification
Manfred Dorninger, University of Vienna (PowerPoint presentation transcript)


SLIDE 1

Spatial forecast verification

Manfred Dorninger
University of Vienna, Vienna, Austria
manfred.dorninger@univie.ac.at

Thanks to: B. Ebert, B. Casati, C. Keil

7th Verification Tutorial Course, Berlin, 3-6 May, 2017

SLIDE 2

Motivation:

Model vs. VERA: RMSE of mean sea level pressure (pmsl), 13-24 h

[Figure: RMSE of pmsl for Aladin (1.76), LM (1.80) and ECMWF (1.33); model grid spacings ~9 km, ~25 km and ~2 km]

SLIDE 3

Motivation:  The „double“ penalty problem

Analysis FC-model I (coarse) FC-model II (fine)

+

  • FC-AN
  • fine-scale model catches the small-

scale trough but at the wrong place (or time)

  • gets penalized twice
  • increases quadratic measures

compared to coarse model

  • true for other continuous variables

as well (e.g., precipitation, wind speed, etc.)

SLIDE 4

Traditional spatial verification (grid-point-wise approach)

Compute statistics on forecast-observation pairs

– Continuous values (e.g., precipitation amount, temperature, NWP variables):

  • mean error, MSE, RMSE, correlation
  • anomaly correlation, S1 score

– Categorical values (e.g., precipitation occurrence):

  • Contingency table statistics (POD, FAR, etc.)
SLIDE 5

Anomaly correlation

  • AC (uncentered):

    AC = Σ (F - C)(O - C) / sqrt[ Σ (F - C)² · Σ (O - C)² ]

  • AC (centered): as above, but with the domain-mean anomaly removed first, i.e. (F - C) and (O - C) replaced by (F - C) - mean(F - C) and (O - C) - mean(O - C)
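The two versions can be sketched in a few lines of numpy (the function name and array inputs are illustrative, not from the slides; F, O and C are the forecast, observation and climatology fields):

```python
import numpy as np

def anomaly_correlation(f, o, c, centered=True):
    """Anomaly correlation of forecast f and observation o against
    climatology c (all arrays of the same shape)."""
    fa, oa = f - c, o - c              # anomalies w.r.t. climatology
    if centered:                       # remove the domain-mean anomaly
        fa = fa - fa.mean()
        oa = oa - oa.mean()
    return (fa * oa).sum() / np.sqrt((fa ** 2).sum() * (oa ** 2).sum())
```

A perfect forecast gives AC = 1 in both variants.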

SLIDE 6

Traditional spatial verification using categorical scores

[Figure: forecast vs. observed areas: hits, misses, false alarms]

Contingency Table:

                 Observed yes   Observed no
Predicted yes    hits           false alarms
Predicted no     misses         correct negatives

POD = hits / (hits + misses)
FAR = false alarms / (hits + false alarms)
FBI = (hits + false alarms) / (hits + misses)
TS  = hits / (hits + misses + false alarms)
ETS = (hits - hits_random) / (hits + misses + false alarms - hits_random)
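The table scores above can be computed directly from the four cell counts; a minimal sketch (function name is illustrative, hits_random is the expected number of chance hits given the marginals):

```python
def categorical_scores(hits, misses, false_alarms, correct_negatives):
    """Scores from a 2x2 contingency table (cell names as on the slide)."""
    n = hits + misses + false_alarms + correct_negatives
    # expected hits for a random forecast with the same marginal totals
    hits_random = (hits + misses) * (hits + false_alarms) / n
    return {
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "FBI": (hits + false_alarms) / (hits + misses),
        "TS":  hits / (hits + misses + false_alarms),
        "ETS": (hits - hits_random)
               / (hits + misses + false_alarms - hits_random),
    }
```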

SLIDE 7

POD=0.39, FAR=0.63, CSI=0.24

SLIDE 8

Traditional spatial verification

  • Requires an exact match between forecasts and observations at every grid point

[Figure: idealised forecast vs. observation fields.
Hi-res forecast: RMS ~ 4.7, POD = 0, FAR = 1, TS = 0.
Low-res forecast: RMS ~ 2.7, POD ~ 1, FAR ~ 0.7, TS ~ 0.3]

  • Problem of "double penalty": an event is predicted where it did not occur, and no event is predicted where it did occur
  • Traditional scores do not say very much about the source or nature of the errors
SLIDE 9

What do traditional scores not tell us?

  • Traditional approaches provide overall measures of skill, but...
  • They don't provide much diagnostic information about the forecast:

– What went wrong? What went right?
– How close is the forecast to the observation (in terms of spatial thinking)?
– Does the forecast look realistic?
– How can I improve this forecast?
– How can I use it to make a decision?

  • Best performance for smooth forecasts!
  • Some scores are insensitive to the size of the errors
SLIDE 10

Spatial forecasts

New spatial verification techniques aim to:

  • account for field spatial structure
  • provide information on error in physical terms
  • account for uncertainties in location (and timing)

Weather variables defined over spatial domains have coherent spatial structure and features.

[Figure: WRF model vs. Stage II radar]

SLIDE 11

Spatial verification types

  • Neighborhood (fuzzy) verification methods
    • give credit to "close" forecasts
  • Scale separation methods
    • measure scale-dependent error
  • Features-based methods
    • evaluate attributes of identifiable features
  • Field deformation
    • evaluate phase errors
SLIDE 12

Spatial verification types

Gilleland, et al. 2009

SLIDE 13

Gilleland, et al. 2009

SLIDE 14

Neighborhood (fuzzy) verification methods: give credit to "close" forecasts

[Figure: forecast vs. observation examples, "close" and "not close"]

SLIDE 15

Why is it called "fuzzy"?

[Figure: observation vs. forecast, sharp and blurred. Squint your eyes!]

Neighborhood verification methods

  • Don't require an exact match between forecasts and observations

– Unpredictable scales
– Uncertainty in observations

  • Look in a space / time neighborhood around the point of interest
  • Evaluate using categorical, continuous, or probabilistic scores / methods

[Figure: space-time neighborhood (t - 1, t, t + 1) and frequency distribution of forecast values]

SLIDE 16

Neighborhood verification methods

Treatment of forecast data within a window:

– Mean value (upscaling)
– Occurrence of event* somewhere in window
– Frequency of events in window → probability
– Distribution of values within window

May also look in a neighborhood of observations.

* Event defined as a value exceeding a given threshold, for example, rain exceeding 1 mm/hr

[Figure: rainfall forecast vs. observation, event frequency]
SLIDE 17

Moving windows

For each combination of neighborhood size and intensity threshold, accumulate scores as windows are moved through the domain.

[Figure: moving windows over observation and forecast fields]
SLIDE 18

Neighborhood verification framework

Neighborhood methods use one of two approaches to compare forecasts and observations:

– single observation vs. neighborhood forecast (SO-NF, user-oriented)
– neighborhood observation vs. neighborhood forecast (NO-NF, model-oriented)

[Figure: SO-NF and NO-NF schematics]

SLIDE 19

Different neighborhood verification methods have different decision models for what makes a useful forecast.

*NO-NF = neighborhood observation-neighborhood forecast, SO-NF = single observation-neighborhood forecast

from Ebert, Meteorol. Appl., 2008

SLIDE 20

Detailed description of the Fraction Skill Score (FSS) (Roberts and Lean, 2008)

  • We want to know

– How forecast skill varies with neighborhood size
– The smallest neighborhood size that can be used to give sufficiently accurate forecasts
– Whether higher-resolution NWP provides more accurate forecasts on scales of interest (e.g., river catchments)

Step 1: FC and Observation/Analysis have to be on the same grid.
Step 2: Choose suitable thresholds q (e.g.: 0.5, 1, 2, 4 mm).
Step 3: Convert FC/AN fields to binary fields IO and IM according to the threshold.

SLIDE 21

Detailed description of the Fraction Skill Score (FSS) (Roberts and Lean, 2008)

Step 4: Generate fractions for all thresholds:

[Figure: Pobs 1x1, Pfcst 1x1, Pobs 35x35, Pfcst 35x35]

SLIDE 22

Detailed description of the Fraction Skill Score (FSS) (Roberts and Lean, 2008)

Step 5: Compute the fraction skill score for all thresholds:

FSS = 1 - MSE / MSE_ref

with

MSE     = (1/N) Σ_i (P_fcst,i - P_obs,i)²
MSE_ref = (1/N) [ Σ_i P_fcst,i² + Σ_i P_obs,i² ]

Maximum estimation (low-skill reference) of MSE:

(P_fcst - P_obs)² = P_fcst² - 2 P_fcst P_obs + P_obs² ~ P_fcst² + P_obs² = MSE_ref
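Steps 3-5 can be sketched in numpy; a minimal version, assuming an odd window size n and zero padding at the domain border (function names are illustrative):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def neighborhood_fractions(binary, n):
    """Fraction of event pixels in an n x n window around each grid
    point (n odd; the field is zero-padded at the border)."""
    pad = n // 2
    padded = np.pad(binary.astype(float), pad)
    return sliding_window_view(padded, (n, n)).mean(axis=(2, 3))

def fss(fcst, obs, threshold, n):
    """Fractions skill score for one threshold and one window size n."""
    pf = neighborhood_fractions(fcst >= threshold, n)   # steps 3-4
    po = neighborhood_fractions(obs >= threshold, n)
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)       # low-skill reference
    return 1.0 - mse / mse_ref if mse_ref > 0 else np.nan
```

A displaced feature scores FSS = 0 at the grid scale but gains skill as the window grows past the displacement, which is exactly the behaviour described in step 6.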

SLIDE 23

Detailed description of the Fraction Skill Score (FSS) (Roberts and Lean, 2008)

Step 6: Graphical presentation for each threshold and spatial scale.

Interpretation:

  • Skill increases with spatial scale
  • The smaller the displacement error, the faster the skill increases with increasing spatial scale
  • When the length of the moving window is smaller than or equal to the displacement error, there is no skill and FSS = 0

SLIDE 24

Detailed description of the Fraction Skill Score (FSS) (Roberts and Lean, 2008)

Q: What happens if the size of the moving window is equal to the domain size?
Q: What are useful (skillful) values of FSS?

FSS_useful = 0.5 + f0/2, where f0 is the domain-wide observed fraction on the grid scale (e.g., for f0 = 0.2 (20%), the target skill is FSS = 0.5 + 0.2/2 = 0.6)

SLIDE 25

Detailed description of the Fraction Skill Score (FSS) (Roberts and Lean, 2008)

SLIDE 26

Scale separation methods: measure scale-dependent error

  • 1. Which spatial scales are well represented and which scales have error?
  • 2. How does the skill depend on the precipitation intensity?

NOTE: scale = single-band spatial filter → features of different scales → feedback on different physical processes and model parameterizations.

In neighborhood-based (fuzzy) verification, the scale is the neighborhood size (a low-pass filter): as the scale increases, the exact positioning requirements are more and more relaxed.

SLIDE 27

What is the difference between neighborhood and scale separation approaches?

  • Neighborhood verification methods
    • get scale information by filtering out higher-resolution scales
  • Scale separation methods
    • get scale information by isolating scales of interest
SLIDE 28

Nimrod case study: intense storm displaced

Step 1: Gridded data, square domain with dimension 2^n.
The method can be applied to any meteorological field; however, it was specifically designed for spatial precipitation forecasts.

SLIDE 29

Step 2: Intensity: threshold to obtain binary images (categorical approach)

[Figure: binary forecast, binary analysis and binary error image for threshold u = 1 mm/h]

SLIDE 30

Step 3: Scale: wavelet decomposition of the binary error

[Figure: wavelet components of the binary error: mean (1280 km), scale l=8 (640 km), l=7 (320 km), l=6 (160 km), l=5 (80 km), l=4 (40 km), l=3 (20 km), l=2 (10 km), l=1 (5 km)]

The binary error field and its MSE decompose into the scale components:

E_u = Σ_{l=1..L} E_{u,l}

MSE_u = Σ_{l=1..L} MSE_{u,l}

SLIDE 31

Step 4: MSE skill score for each threshold and scale component:

SS_{u,l} = (MSE_{u,l} - MSE_{u,l,random}) / (MSE_{u,l,best} - MSE_{u,l,random})
         = 1 - MSE_{u,l} / [ 2 ε (1 - ε) / L ]

where ε is the sample climatology (base rate).
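The scale decomposition of step 3 can be sketched with Haar (block-mean) wavelets on a 2^n x 2^n field; the components are mutually orthogonal, so the MSE of the binary error splits exactly scale by scale (function names are illustrative, not from the slides):

```python
import numpy as np

def block_mean(field, k):
    """Average the field over k x k blocks, broadcast back to full size."""
    s = field.shape[0]
    coarse = field.reshape(s // k, k, s // k, k).mean(axis=(1, 3))
    return np.kron(coarse, np.ones((k, k)))

def haar_scale_components(field):
    """Split a 2^n x 2^n field into orthogonal Haar components, one per
    scale (finest first), plus the domain mean; they sum to the input."""
    comps, current, k = [], field.astype(float), 2
    while k <= field.shape[0]:
        smooth = block_mean(current, k)
        comps.append(current - smooth)   # detail at this scale
        current = smooth
        k *= 2
    comps.append(current)                # domain-mean component
    return comps
```

Because the components are orthogonal, the mean square of the error equals the sum of the per-scale mean squares, which is the decomposition MSE_u = Σ_l MSE_{u,l} used above.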

SLIDE 32

Strengths:
  • Categorical approach → robust and resistant
  • Wavelets cope with spatially discontinuous fields characterized by a few sparse non-zero features → suitable for spatial precipitation forecasts

Weaknesses:
  • Needs gridded data on a square domain with dimension 2^n

SLIDE 33

Dorninger and Gorgas, 2012

SLIDE 34

Features-based methods: evaluate attributes of features

SLIDE 35

Feature-based approach (CRA)

Ebert and McBride, J. Hydrol., 2000

  • Define entities using a (user-defined) threshold (Contiguous Rain Areas)
  • Horizontally translate the forecast until a pattern-matching criterion is met:

– minimum total squared error between forecast and observations
– maximum correlation
– maximum overlap

  • The displacement is the vector difference between the original and final locations of the forecast.

[Figure: observed and forecast rain entities]

SLIDE 36

CRA error decomposition

Total mean squared error (MSE) before shifting:

MSE_total = MSE_displacement + MSE_volume + MSE_pattern

The displacement error is the difference between the mean squared error before and after shifting:

MSE_displacement = MSE_total - MSE_shifted

The volume error is the bias in mean intensity:

MSE_volume = (F̄ - X̄)²

where F̄ and X̄ are the mean forecast and observed values after shifting.

The pattern error, computed as a residual, accounts for differences in the fine structure:

MSE_pattern = MSE_shifted - MSE_volume
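A minimal sketch of this decomposition, using a brute-force search over integer shifts (np.roll wraps periodically at the domain edge, a simplification of the CRA translation; function name and arguments are illustrative):

```python
import numpy as np

def cra_decomposition(fcst, obs, max_shift=3):
    """Translate fcst within +/- max_shift grid points to minimise MSE,
    then split MSE_total into displacement, volume and pattern parts."""
    best_mse, best = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            mse = np.mean((np.roll(fcst, (dy, dx), axis=(0, 1)) - obs) ** 2)
            if mse < best_mse:
                best_mse, best = mse, (dy, dx)
    shifted = np.roll(fcst, best, axis=(0, 1))
    mse_total = np.mean((fcst - obs) ** 2)
    mse_volume = (shifted.mean() - obs.mean()) ** 2
    return {
        "shift": best,
        "total": mse_total,
        "displacement": mse_total - best_mse,     # MSE_total - MSE_shifted
        "volume": mse_volume,                     # (F_bar - X_bar)^2
        "pattern": best_mse - mse_volume,         # residual
    }
```

For a pattern that is merely displaced, the whole error is attributed to displacement and the volume and pattern terms vanish.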

SLIDE 37

Example: CRA verification of precipitation forecast over the USA

  • 1. What is the location error of the forecast?
  • 2. How do the forecast and observed rain areas compare? Average values? Maximum values?
  • 3. How do the displacement, volume, and pattern errors contribute to the total error?

[Figure: three numbered CRAs in forecast and observations]

SLIDE 38

5th Int'l Verification Methods Workshop, Melbourne, 1-3 December 2011

1st CRA

SLIDE 39

2nd CRA

SLIDE 40

Sensitivity to rain threshold

[Figure: CRA matches for thresholds of 1, 5 and 10 mm h-1]

SLIDE 41

Strengths of CRA

The entity-based CRA verification method has a number of attractive features: (a) it is intuitive, quantifying what we can see by eye; (b) it estimates the location error in the forecast; (c) the total error can be decomposed into contributions from location, intensity, and pattern; (d) forecast events can be categorized as hits, misses, etc. These descriptions could prove a useful tool for monitoring forecast performance over time.

SLIDE 42

Weaknesses of CRA

There are also some drawbacks to this approach:

(a) Pattern matching: it must be possible to associate entities in the forecast with entities in the observations. This means that the forecast must be halfway decent. The verification results for a large number of CRAs will be biased toward the "decent" forecasts, i.e., those for which location and intensity errors could reliably be determined.

(b) The user must choose the pattern-matching method as well as the isoline used to define the entities. The verification results will be somewhat dependent on these (subjective) choices.

(c) When a forecast and/or observed entity extends across the boundary of the domain, it is not possible to be sure whether the pattern match is optimal. If the CRA has a reasonably large area still within the domain, then the probability of a good match is high. Ebert and McBride (2000) suggest applying a minimum area criterion to address this issue.

SLIDE 43

Structure-Amplitude-Location (SAL)

Wernli et al., Mon. Wea. Rev., 2008

  • Verification of rain forecasts in a defined domain
  • No matching of objects

SAL consists of three components:

  • S: structure
  • A: amplitude
  • L: location

Perfect forecast: S = A = L = 0
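Two of the components are simple to sketch: A compares domain-mean precipitation, and the first part of L compares the centres of mass of the two fields (function names are illustrative; the S component, based on scaled object volumes, is omitted here):

```python
import numpy as np

def sal_amplitude(fcst, obs):
    """A component: normalised difference of the domain-mean rain;
    ranges from -2 to +2, with 0 for a perfect amplitude."""
    df, do = fcst.mean(), obs.mean()
    return (df - do) / (0.5 * (df + do))

def sal_location_l1(fcst, obs, d_max):
    """First part of the L component: distance between the centres of
    mass of the fields, scaled by the largest distance d_max in the domain."""
    def centre_of_mass(field):
        idx = np.indices(field.shape).reshape(2, -1).astype(float)
        w = field.ravel()
        return (idx * w).sum(axis=1) / w.sum()
    return np.linalg.norm(centre_of_mass(fcst) - centre_of_mass(obs)) / d_max
```

Note that both components are invariant under a pure rotation of the rain pattern about the domain centre of mass, consistent with the S = A = L = 0 example later in the deck.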

SLIDE 44

Step 1: Definition of precipitation objects

SLIDE 45
SLIDE 46
SLIDE 47

Dorninger, Verifikation WS 2015

4.2.1 SAL

SLIDE 48
SLIDE 49
SLIDE 50
SLIDE 51
SLIDE 52
SLIDE 53
SLIDE 54
SLIDE 55
SLIDE 56

Q: Look at the precipitation fields (FC vs. OBS). What do you expect for S, A and L?
A: S = A = L = 0; SAL is invariant under pure rotation.

SLIDE 57
SLIDE 58

SAL Examples

SLIDE 59

Field deformation: evaluate phase errors

SLIDE 60
DAS in a nutshell (Keil and Craig, WAF 2009)

The Displacement and Amplitude Score (DAS)

  • constitutes a spatial measure belonging to the field verification techniques
  • is based on an areal image matcher using a classical optical flow technique
  • has two components, DIS and AMP (normalized with characteristic values)
  • is applied in both observation and forecast space (to account for misses and false alarms)
  • has been used in deterministic mode
  • is coded in Python and freely available

SLIDE 61

Optical flow algorithm: Pyramidal Matching

  • 1. Project observed and simulated images onto the same grid
  • 2. Coarse-grain both images by averaging 2^F pixels onto one pixel element
  • 3. Compute a displacement vector field that minimizes the RMSE within a range of +/- 2 pixel elements
  • 4. Repeat step 2 at successively finer scales
  • 5. The displacement vector for every pixel results from the sum over all scales

(Mannstein et al., 2002)
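Steps 2 and 3 above can be sketched as follows (a minimal, assumption-laden version: the shift uses a periodic np.roll rather than a true image warp, and function names are illustrative):

```python
import numpy as np

def coarse_grain(field, k):
    """Step 2: average k x k pixel blocks onto one pixel element."""
    s0, s1 = (field.shape[0] // k) * k, (field.shape[1] // k) * k
    return field[:s0, :s1].reshape(s0 // k, k, s1 // k, k).mean(axis=(1, 3))

def best_shift(fcst, obs, search=2):
    """Step 3: integer displacement within +/- search pixel elements
    that minimises the RMSE between the shifted forecast and obs."""
    best, best_rmse = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rmse = np.sqrt(np.mean(
                (np.roll(fcst, (dy, dx), axis=(0, 1)) - obs) ** 2))
            if rmse < best_rmse:
                best, best_rmse = (dy, dx), rmse
    return best
```

In the full pyramidal scheme the search is repeated at successively finer coarse-graining levels, morphing the forecast with the displacement found so far before each new search.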

SLIDE 62

Pyramidal Image Matching

Step 1: projection onto the same grid

[Figure: observation and forecast]

SLIDE 63

Pyramidal Image Matching

Step 2: coarse graining, F = 4

[Figure: observation and forecast]

SLIDE 64

Pyramidal Image Matching

Step 3: compute displacement vector by minimizing RMSE

[Figure: observation and forecast]

SLIDE 65

Pyramidal Image Matching

Step 3: compute displacement vector by minimizing RMSE

[Figure: observation and forecast]

SLIDE 66

Pyramidal Image Matching

Step 3: compute displacement vector by minimizing RMSE

[Figure: observation and forecast]

SLIDE 67

Pyramidal Image Matching

Step 4: cycle on finer scales using the morphed image

[Figure: observation and morphed forecast]

SLIDE 68

Pyramidal Image Matching

Step 4: cycle on finer scales using the morphed image

[Figure: observation and morphed forecast]

SLIDE 69

Pyramidal Image Matching

Step 5: sum over all scales

[Figure: observation and morphed forecast]

SLIDE 70

Pyramidal Image Matching

Step 5: sum over all scales

[Figure: observation and morphed forecast]

SLIDE 71

Pyramidal Image Matching

Step 5: sum over all scales

[Figure: observation and morphed forecast]

SLIDE 72

Displacement error field DIS

[Figure: DIS field]

SLIDE 73

Displacement error field DIS and amplitude error field AMP

[Figure: DIS and AMP fields]

SLIDE 74

DAS

DAS field: combined DIS and AMP fields

SLIDE 75

Displacement and Amplitude Score DAS (Keil and Craig, WAF 2009)

DAS has two components:
1. displacement error (of observed and forecast imagery)
2. amplitude error (RMSE of observed and morphed forecast imagery)

  • DAS is applied in observation and forecast space:

DIS = (1/n) Σ_A DIS(x, y)

AMP = [ (1/n) Σ_A AMP(x, y)² ]^(1/2)

The observation- and forecast-space displacement components are combined as:

DIS = (n_obs · DIS_obs + n_fct · DIS_fct) / (n_obs + n_fct)

SLIDE 76
Displacement and Amplitude Score DAS (Keil and Craig, WAF 2009)

  • DAS shall be a single-valued measure of forecast quality:

DAS = DIS / D_max + AMP / I_0

  • underlying principle: a complete miss = 100% AMP error
  • D_max: maximum search distance
  • I_0: characteristic intensity, chosen to be typical of the amplitude of the observed features
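The final combination can be sketched in one small function; note that applying the same n_obs/n_fct weighting to AMP as to DIS is an assumption here (the slides spell out the weighted combination only for DIS), and the function name is illustrative:

```python
def das_score(dis_obs, dis_fct, amp_obs, amp_fct, n_obs, n_fct, d_max, i0):
    """DAS = DIS/D_max + AMP/I_0, combining observation- and
    forecast-space components weighted by their pixel counts.
    Assumption: AMP is weighted the same way as DIS."""
    dis = (n_obs * dis_obs + n_fct * dis_fct) / (n_obs + n_fct)
    amp = (n_obs * amp_obs + n_fct * amp_fct) / (n_obs + n_fct)
    return dis / d_max + amp / i0
```

Normalizing by D_max and I_0 makes the two components dimensionless and comparable, so a complete miss contributes a full unit of amplitude error, as stated above.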

SLIDE 77

Conclusions

  • What method should you use for spatial verification?

– It depends on what question(s) you would like to address

  • Many spatial verification approaches:

– Neighborhood: credit for "close" forecasts
– Scale separation: scale-dependent error
– Features-based: attributes of features
– Field deformation: phase and amplitude errors

SLIDE 78

What method(s) could you use to verify a wind forecast (sea breeze)?

– Neighborhood: credit for "close" forecasts
– Scale separation: scale-dependent error
– Features-based: attributes of features
– Field deformation: phase and amplitude errors

SLIDE 79

What method(s) could you use to verify a cloud forecast?

[Figure: forecast vs. observed cloud]

– Neighborhood: credit for "close" forecasts
– Scale separation: scale-dependent error
– Features-based: attributes of features
– Field deformation: phase and amplitude errors

SLIDE 80

What method(s) could you use to verify a mean sea level pressure forecast (5-day forecast vs. analysis)?

– Neighborhood: credit for "close" forecasts
– Scale separation: scale-dependent error
– Features-based: attributes of features
– Field deformation: phase and amplitude errors

SLIDE 81

What method(s) could you use to verify a tropical cyclone forecast (3-day forecast vs. observed)?

– Neighborhood: credit for "close" forecasts
– Scale separation: scale-dependent error
– Features-based: attributes of features
– Field deformation: phase and amplitude errors