Understanding Climate Change: A Data Driven Approach (Vipin Kumar)
NSF Expeditions in Computing
Understanding Climate Change: A Data Driven Approach
Vipin Kumar, University of Minnesota
kumar@cs.umn.edu
http://climatechange.cs.umn.edu
March 4, 2014 Slide 2
Expeditions Team
- Vipin Kumar, UM
- Auroop Ganguly, NEU
- Nagiza Samatova, NCSU
- Arindam Banerjee, UM
- Fred Semazzi, NCSU
- Joe Knight, UM
- Shashi Shekhar, UM
- Peter Snyder, UM
- Jon Foley, UM
- Alok Choudhary, NW
- Ankit Agrawal, NW
- Abdollah Homaifar, NCA&T
- Michael Steinbach, UM
- Snigdhansu Chatterjee, UM
- Karsten Steinhaeuser, UM
- Stefan Liess, UM
- Shyam Boriah, UM
March 4, 2014 Slide 3
Understanding Climate Change - Motivation
March 4, 2014 Slide 4
Understanding Climate Change – Physics-Based Approach
General Circulation Models: Mathematical models with physical equations based on fluid dynamics
Parameterization and non-linearity of differential equations are sources of uncertainty!
Cell Clouds Land Ocean
Figure Courtesy: NCAR
March 4, 2014 Slide 5
Understanding Climate Change - Physics Based Approach
General Circulation Models: Mathematical models with physical equations based on fluid dynamics
Cell Clouds Land Ocean
Figure Courtesy: NCAR Figure Courtesy: ORNL
March 4, 2014 Slide 6
Understanding Climate Change - Physics Based Approach
Projection of temperature increase under different Special Report on Emissions Scenarios (SRES) by 24 different GCM configurations from 16 research centers used in the Intergovernmental Panel on Climate Change (IPCC) 4th Assessment Report.
Figure Courtesy: ORNL
March 4, 2014 Slide 7
Physics based models are essential but insufficient
“The sad truth of climate science is that the most crucial information is the least reliable” (Nature, 2010)
– Relatively reliable predictions at global scale for ancillary variables such as temperature
– Least reliable predictions for variables that are crucial for impact assessment such as regional precipitation
Regional hydrology exhibits large variations among major IPCC model projections
Disagreement between IPCC models
Physics based models:
- Low uncertainty: Temperature, Pressure, Large-scale wind
- High uncertainty: Hurricanes, Extremes, Precipitation
- Out of scope: Fires, Malaria outbreaks, Landslides
March 4, 2014 Slide 8
Data-Driven Knowledge Discovery in Climate Science
Transformation from Data-Poor to Data-Rich
- Sensor Observations
- Reanalysis Data
- Model Simulations
A new and transformative data-driven approach that:
- Makes use of wealth of observational and simulation data
- Advances understanding of climate processes
- Informs climate change impacts and adaptation
“Climate change research is now ‘big science,’ comparable in its magnitude, complexity, and societal importance to human genomics and bioinformatics.” (Nature Climate Change, Oct 2012)
March 4, 2014 Slide 9
Need for data driven analysis
- Low uncertainty: Temperature, Pressure, Large-scale wind
- High uncertainty: Global hurricanes, Extremes, Precipitation
- Out of scope: Global fires, Malaria outbreaks, Landslides
Figure labels: Global fires, Atlantic hurricanes, Global sea surface temperatures
March 4, 2014 Slide 10
Need for data driven analysis
Figure: Correlation with fires in Amazon (Chen et al., Science, 2011)
Figure: SST Anomaly Time Series in the ENSO region
Figure: Average June-October Atlantic Tropical Cyclones (1979 - 2010) for El Niño, Neutral, and La Niña conditions; values shown: 14.4, 11.33, 8
March 4, 2014 Slide 11
Challenges in data driven analysis
- Spatio-temporal auto- and cross-correlation
- Noisy, heterogeneous, and uncertain
- Evolutionary processes
- Multiple spatio-temporal scales
- Unknown, non-linear, and long-range dependency structure
- Variability
- Class imbalance
- Multivariate non-stationary
- Large unlabeled datasets
- Significance testing
Faghmous and Kumar (2013)
March 4, 2014 Slide 12
Guiding Theme
The discovery and characterization of patterns and dependencies have emerged as the primary research tasks because they…
- 1. Provide an empirical understanding of physical processes…
- finding pressure dipole between Tahiti and Darwin led to the understanding of modulation of the Walker Circulation
- 2. Allow for prediction of unknown quantities…
- where observations are sparse
- for statistical downscaling
- where physical models are inadequate (e.g., predicting the number of hurricanes using a large number of covariates)
- 3. Enable long-range projection of highly stochastic processes…
- deriving climate extremes or hurricanes from low-resolution global model simulations
March 4, 2014 Slide 13
Project vision and scope
- Process Understanding
- Extreme Events: Heat Waves, Rainfall Extremes, Droughts, Hurricanes
- Model Evaluation
- Downscaling: Statistical, Dynamical
- Ocean-Atm.-Land Interactions
- Change Detection: Abrupt vs. Gradual, Point vs. Regions/Intervals, Change in Extremes
Spatio-Temporal Classification Sparse/High-Dim. Methods Causal Relationships Networks/Graphs HPC Computational Innovations Understanding Climate Change
Transformative Computer Science Research Advancing Climate Change Science
March 4, 2014 Slide 14
Pattern Mining: Ocean Eddies Monitoring
- Scalable spatio-temporal pattern mining algorithms for noisy and continuous data
- Novel multiple object tracking for uncertain features
- Detect more accurate features and tracks for improved ocean dynamics monitoring
- Open source database of 20+ years of eddies and eddy tracks available for scientific applications
Faghmous et al. AAAI (2012a)
Faghmous et al. CIDU (2012b), Best student paper award
Faghmous et al. AAAI (2013)
NSF Nordic Research Opportunity Grant to conduct research at the Bjerknes Centre for Climate Research in Norway
March 4, 2014 Slide 15
Network analysis: Climate teleconnections
Kawale et al. SDM (2011a)
Kawale et al. CIDU (2011b), Best student paper award
Kawale et al. ACM SIGKDD (2012)
Steinhaeuser et al. Climate Dynamics (2012)
SC’11: Exploration in Science through Computation Award
Grace Hopper ’12: Best Poster Award (Winner of the ACM Student Research Competition)
- Scalable method for discovering anti-correlated graph regions
- Novel dynamic graph clustering for dense directed graphs
- Significance testing for spatio-temporal patterns
- Discovered previously unknown climate teleconnection
- Analyzed climate network properties to better understand global climate dynamics
- Method used to compare climate models
Figure: Climate Network
March 4, 2014 Slide 16
Predictive Modeling: Regression, Ensembles, Inference
- Hierarchical sparse regression: rates of convergence with low samples
- Multi-task learning with spatial smoothing
- Primal decomposition based LP solver for max-cut type problems (~10 million+ node graphs)
- Regional land-climate predictions from observations over oceans
- Combining multiple GCM outputs more accurately than the state of the art
- Mega-drought detection, trends over past 100-1000 years
Fig. RMSE vs. Model Complexity of OLS and Sparse Regression Methods
Fig. Prediction RMSE from spatially smoothened Multi-model ensemble
Fu et al. UAI (2013)
Subbian et al. SDM (2013), Best Application Paper Award
Hsieh et al. NIPS (2012)
Wang et al. ICML (2012)
Chatterjee et al. SDM (2012), Best Student Paper Award
Fu et al. SDM (2012)
March 4, 2014 Slide 17
Relationship mining: Seasonal hurricane activity
- Contrast-based network mining for discriminatory signatures
- Novel dynamic graph clustering for dense directed graphs
- Statistically robust methodology for automatic inference of modulating networks
- Improved forecast skill for seasonal hurricane activity
- Discovered key factors and mechanisms modulating NA hurricane variability
- Discovered novel climate index with much improved correlation with NA hurricane variability: 0.69 vs 0.49
Figure labels: High activity, Low activity
NSF News, DOE Research News, Science360
Sencan et al. IJCAI (2011)
Pendse et al. SIAM SDM (2012)
Chen et al. Data Mining & Knowledge Discovery (2012)
Chen et al. SIAM SDM (2013)
Chen et al. IJCAI (2013)
Semazzi et al. in review at journal (2013)
March 4, 2014 Slide 18
Extremes and uncertainty: Heat waves, heavy rainfall, …
Ghosh et al. Nature Climate Change (2012)
Parish et al. Computers & Geosciences (2012)
Kodra et al. Environmental Research Letters (2012)
Ganguly et al. Climate Extremes & UQ: Book Ch. (2013)
Kodra et al. in revision at journal (2013)
Kumar et al. in review at journal (2013)
- Extreme value theory in space-time and dependence of extremes on covariates
- Mutual information and copula-methods for space-time extremes dependence
- Uncertainty quantification with Bayesian and resampling techniques
- Physics-guided data mining and quantification of uncertainty
- Spatiotemporal trends in heat waves, cold snaps, and heavy rain with climate change
- Climate model evaluation and physics-guided uncertainty quantification
- Covariate-based improvement of extremes projections under climate change
- Translation to adaptation and stakeholder relevant metrics
Press Release 11-266: Journal Piece Reveals New Data-Driven Methods for Understanding Climate Change
Geographical variability of rainfall extremes in India enhances interpretation of climate change data
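The resampling-based uncertainty quantification listed on this slide can be illustrated with a percentile bootstrap. This is only a minimal sketch, not the project's method: the function name `bootstrap_ci`, the choice of the mean as the statistic, and the rainfall values are all hypothetical.

```python
import random


def bootstrap_ci(sample, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(sample)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        statistic([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def mean(values):
    return sum(values) / len(values)


# Hypothetical annual maximum daily rainfall (mm) at one location.
annual_max_rain_mm = [82.0, 95.5, 110.2, 76.3, 130.8, 88.9, 101.4, 142.0, 91.7, 99.3]
low, high = bootstrap_ci(annual_max_rain_mm, mean)
print(f"mean = {mean(annual_max_rain_mm):.1f} mm, 95% CI = ({low:.1f}, {high:.1f})")
```

The same helper accepts any statistic, so a high quantile or a fitted return level could replace the mean if an extreme-value model were available.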
March 4, 2014 Slide 19
High Performance Tools and Methods
Jin et al. EuroMPI (2011)
Patwary et al. SC (2012)
Hentrix et al. HPC (2012)
Kumar et al. IPDPS (2011)
Rangel et al. in review (2013)
Jin et al. in review (2013)
- Created a library of common data mining / machine learning kernels for clustering, classification, PCA, etc.
- Many algorithms have shown speedups of two to three orders of magnitude.
- Developed technologies for compressing and querying huge datasets, and for performing similarity searches with a more than 10-fold speed-up
- Devised an image indexing technique based on a new Locality Sensitive Hashing (LSH) scheme.
- Developing HPC solutions for our collaborators, including bootstrapping methods for extreme value prediction and Markov Random Field based abrupt change detection
Figure: Improving I/O for the Global Cloud Resolving Model
March 4, 2014 Slide 20
Case Study: Data-Driven Discovery of Dipoles
Dipoles represent a class of teleconnections characterized by anomalies of opposite polarity at two locations at the same time.
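In practice, "anomalies of opposite polarity at the same time" shows up as a strong negative correlation between the anomaly time series at the two locations. The sketch below assumes monthly data and defines anomalies as departures from each calendar month's mean; the helper names and the toy series are hypothetical.

```python
from math import sqrt


def monthly_anomalies(series, period=12):
    """Remove the mean seasonal cycle: subtract each calendar month's mean."""
    means = [sum(series[m::period]) / len(series[m::period]) for m in range(period)]
    return [x - means[i % period] for i, x in enumerate(series)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy data (hypothetical): three years of monthly values at two sites that
# share a seasonal cycle but carry a signal of opposite sign.
seasonal = [10, 11, 13, 16, 19, 22, 24, 23, 20, 16, 13, 11]
signal = [((7 * t) % 11 - 5) / 5 for t in range(36)]
site_a = [seasonal[t % 12] + signal[t] for t in range(36)]
site_b = [seasonal[t % 12] - signal[t] for t in range(36)]

r_raw = pearson(site_a, site_b)
r_anom = pearson(monthly_anomalies(site_a), monthly_anomalies(site_b))
print(f"raw correlation: {r_raw:.2f}, anomaly correlation: {r_anom:.2f}")
```

The raw series correlate positively because both follow the same seasonal cycle; the dipole-like relationship only becomes visible once that cycle is removed.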
March 4, 2014 Slide 21
Importance of Dipoles
Figure: Correlation of land temperature anomalies with NAO
Figure: Correlation of land temperature anomalies with SOI
SOI strongly influences global climate variability. NAO influences sea level pressure (SLP) and temperature over the Northern Hemisphere.
Dipoles are crucial for understanding the climate system and are known to cause temperature and precipitation anomalies throughout the globe.
March 4, 2014 Slide 22
List of Major Climate Oscillations
AO: EOF Analysis of 20N-90N Latitude
AAO: EOF Analysis of 20S-90S Latitude
Discovered primarily by human observation or by EOF analysis.
van Loon & Rogers, 1978; Wallace & Gutzler, 1981; von Storch & Zwiers, 2002
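For readers unfamiliar with EOF analysis: the leading EOF is the dominant eigenvector of the spatial covariance matrix of the anomaly field, and its projection onto the data gives the index time series. The sketch below finds it by power iteration on a tiny toy field; it is an illustration under these assumptions, not the procedure used to define AO or AAO, and production code would normally use an SVD routine from a numerical library.

```python
from math import sqrt


def leading_eof(anomalies, iterations=200):
    """Leading EOF of a (time x grid point) anomaly matrix via power iteration.

    Returns (eof, pc, explained): the unit-norm spatial pattern, its time
    series, and the fraction of total variance it explains.
    """
    n_time, n_space = len(anomalies), len(anomalies[0])
    # Spatial covariance matrix (time means are assumed to be removed already).
    cov = [[sum(row[i] * row[j] for row in anomalies) / (n_time - 1)
            for j in range(n_space)] for i in range(n_space)]

    def apply(v):
        return [sum(cov[i][j] * v[j] for j in range(n_space)) for i in range(n_space)]

    # Arbitrary non-uniform start vector; a uniform one could be orthogonal
    # to a pure dipole pattern and stall the iteration.
    v = [1.0 + i for i in range(n_space)]
    for _ in range(iterations):
        w = apply(v)
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    eigenvalue = sum(a * b for a, b in zip(v, apply(v)))
    total_variance = sum(cov[i][i] for i in range(n_space))
    pc = [sum(x * e for x, e in zip(row, v)) for row in anomalies]
    return v, pc, eigenvalue / total_variance


# Toy field (hypothetical): grid points 0 and 1 oscillate out of phase,
# grid point 2 carries only weak variability.
s = [1.0, -0.5, 0.8, -1.2, 0.3, -0.4]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
field = [[s[t], -s[t], noise[t]] for t in range(6)]

eof, pc, explained = leading_eof(field)
print("EOF pattern:", [round(e, 3) for e in eof])
print(f"explained variance fraction: {explained:.3f}")
```

The sign of an EOF is arbitrary; only the relative signs across grid points carry meaning, which is why the two out-of-phase points receive loadings of opposite sign.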
March 4, 2014 Slide 23
Motivation for Automatic Discovery of Dipoles
- The known dipoles are defined by static locations but the underlying phenomenon is dynamic
- Manual discovery can miss many dipoles
- EOF and other types of eigenvector analysis find the strongest signals, and the physical interpretation of those can be difficult.
Figure: Dynamic behavior of the high and low pressure fields corresponding to NAO climate index (Portis et al., 2001)
AO: EOF Analysis of 20N-90N Latitude
AAO: EOF Analysis of 20S-90S Latitude
March 4, 2014 Slide 24
Challenges in studying dipoles
- The distribution of positive and negative edges around the Earth is uneven, as most of the highly positive edges come from nearby locations due to spatial autocorrelation. The area weighted correlation shows that the equator is dominant.
- If we remove all edges between locations less than 5000 km apart, the distribution is balanced.
- The number of negative edges around the globe is very high, so an algorithm focusing on negative edges will not scale.
Figure: Distribution of edges around the Earth
Figure: Distribution of edges > 5000km away
Figure: Distribution of negative edges
Figure: Distribution of edges around the Earth with abs correlation > 0.5
Figure: Distribution of edges around the Earth having a distance > 5000km and abs correlation > 0.2
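The 5000 km cut described on this slide can be applied with a great-circle distance between the two endpoints of each edge. The following is a minimal sketch using the haversine formula; the mean Earth radius of 6371 km, the approximate coordinates for Tahiti and Darwin, and the correlation values are assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)


def great_circle_km(p, q):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, p)
    lat2, lon2 = map(radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def long_range_edges(edges, min_km=5000.0):
    """Keep only edges whose endpoints are more than min_km apart."""
    return {pair: r for pair, r in edges.items() if great_circle_km(*pair) > min_km}


# Hypothetical edges: ((lat, lon), (lat, lon)) -> correlation.
tahiti, darwin = (-17.5, -149.5), (-12.5, 131.0)  # approximate coordinates
edges = {
    (tahiti, darwin): -0.8,         # distant, anti-correlated pair
    ((0.0, 0.0), (0.0, 5.0)): 0.9,  # neighbours: spatial autocorrelation
}

kept = long_range_edges(edges)
quarter_circumference = great_circle_km((0.0, 0.0), (0.0, 90.0))
print(len(kept), "long-range edge(s) kept")
```

Dropping the short edges removes most of the positive correlations that arise from spatial autocorrelation alone, which is the imbalance the slide points out.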
March 4, 2014 Slide 25
Graph-Based Approach for Dipole Discovery
Nodes in the graph correspond to grid points on the globe.
Edge weight corresponds to correlation between the two anomaly time series.
Figure labels: Climate Network, Discovered Dipoles
Steinbach et al., 2003
Tsonis et al., 2004, 2006
Donges et al., 2009a,b
Kawale et al., 2011
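The graph construction on this slide (grid points as nodes, correlation between anomaly time series as edge weight) can be sketched as follows. The threshold, the function names, and the three toy grid points are hypothetical, and the all-pairs loop is only practical for small examples; it is not the scalable method of the cited papers.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def build_climate_network(anomalies, threshold=0.5):
    """Return {(node_a, node_b): correlation} for pairs with |corr| >= threshold.

    anomalies maps a grid point (lat, lon) to its anomaly time series.
    """
    edges = {}
    for a, b in combinations(sorted(anomalies), 2):
        r = pearson(anomalies[a], anomalies[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


# Toy anomaly series (hypothetical values) at three grid points.
toy = {
    (65.0, -20.0): [1.0, -0.5, 0.8, -1.2, 0.3, -0.4],
    (38.0, -25.0): [-0.9, 0.6, -0.7, 1.1, -0.2, 0.1],
    (0.0, 100.0): [0.5, 0.5, -0.5, -0.5, 0.5, -0.5],
}

network = build_climate_network(toy)
# Strongly negative edges are the candidates for dipole end points.
dipole_candidates = {pair: r for pair, r in network.items() if r < 0}
for pair, r in dipole_candidates.items():
    print(pair, round(r, 2))
```

Only the first two grid points are connected here, by a strongly negative edge; the third point correlates too weakly with either to pass the threshold.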
March 4, 2014 Slide 26
Benefits of Automatic Dipole Discovery
- Detection of Global Dipole Structure
- Most known dipoles discovered
- New dipoles may represent previously unknown phenomena.
- Enables analysis of relationships between different dipoles
- Location based definition possible for some known indices that are defined using EOF analysis.
- Dynamic versions are often better than static
- Dipole structure provides an alternate method to analyze GCM performance
CIDU’11: Best Student Paper Award
SC’11: Explorations in Science through Computation Award
Grace Hopper’12: Best Poster Award (Winner of the ACM Student Research Competition)
Kawale et al., 2011a,b, 2012
Slide 27 March 4, 2014
Comparing Dipole Structure in Historical (Reanalysis) Data
Panels: NCEP 1979-2000, ERA-Interim 1979-2000, JRA-25 1979-2000, MERRA 1979-2000
Slide 28 March 4, 2014
Static vs Dynamic NAO Index - Impact on land temperature
The dynamic index shows a stronger impact on land temperature anomalies than the static index.
The figure to the right shows the aggregate area-weighted correlation for networks computed for different 20-year periods during 1948-2008.
Area-weighted Score
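The slides do not spell out how the area-weighted score is computed. One common convention on a regular latitude-longitude grid, assumed here, weights each grid cell by the cosine of its latitude, since cells shrink toward the poles. The sketch below averages the absolute correlation between an index and the local land temperature anomaly under that assumption; the function name and input values are hypothetical.

```python
from math import cos, radians


def area_weighted_score(correlations):
    """Cosine-latitude-weighted mean of |correlation| over grid cells.

    correlations maps a grid point (lat, lon) to the correlation between a
    climate index and the local land temperature anomaly.
    """
    weighted_sum = sum(cos(radians(lat)) * abs(r) for (lat, _), r in correlations.items())
    total_weight = sum(cos(radians(lat)) for (lat, _) in correlations)
    return weighted_sum / total_weight


# Hypothetical correlations at two grid points.
toy = {(0.0, 0.0): 0.5, (60.0, 0.0): -1.0}
score = area_weighted_score(toy)
print(f"area-weighted score: {score:.3f}")
```

A cell at 60° latitude counts half as much as one on the equator, so a static and a dynamic index can be compared by a single score without high-latitude cells dominating. An unnormalized sum would serve equally well for ranking indices on the same grid.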
Slide 29 March 4, 2014
Slide 30 March 4, 2014
Location Based definition of AO
- Mean correlation between static and dynamic index: 0.84
- Impact on land temperature anomalies is comparable for the static and dynamic index
Figure: Impact on Land Temperature Anomalies using Static and Dynamic AO
Static AO: EOF Analysis of 20N-90N Latitude
Panels: EOF-AO, Dynamic Dipole-AO
Composite maps for time series from both approaches on Hadley Centre SLP data (1979-2011).
Slide 31 March 4, 2014
Location Based definition of AAO
- Mean correlation between static and dynamic index: 0.88
- Impact on land temperature anomalies is comparable for the static and dynamic index
Figure: Impact on Land Temperature Anomalies using Static and Dynamic AAO
Static AAO: EOF Analysis of 20S-90S Latitude
Panels: EOF-AAO, Dynamic Dipole-AAO
Composite maps for time series from both approaches on Hadley Centre SLP data (1979-2011).
Slide 32 March 4, 2014
A New Dipole near Australia?
- Comparison of dipoles by looking at land temperature impact.
- Significant difference between the AAO impact and that due to dipoles 1, 2, 3, which are similar.
Figure labels: dipoles 1, 2, 3; AAO
Slide 33 March 4, 2014 March 6, 2013
Composites of ASO dipole from Hadley Centre SLP data on Hadley Centre SLP at 95% confidence
Panels: ASO, AAO, SOI; Whole year, Southern winter (JJA)
March 4, 2014 Slide 34
Composites of ASO dipole from Hadley Centre SLP data on GPCP precipitation data at 95% confidence
Panels: ASO, AAO, SOI; Whole year, Southern winter (JJA)
Slide 35 March 4, 2014
Model Analysis : Dipole Structure in CCSM and GFDL
The dipole structure of the top 2 models from CMIP3 to CMIP5
- GFDL (CMIP3): SOI present
- CCSM (CMIP3): SOI absent
- CCSM (CMIP5): SOI present
- GFDL (CMIP5): SOI present
Slide 36 March 4, 2014
Slide 37 March 4, 2014
Slide 38 March 4, 2014
Surface Temperature Correlated with SOI
- Reanalysis: NCEP2, JRA
- CMIP3: CCSM3, MIROC-3.2medres, GFDL-CM2.1
- CMIP5: CCSM4, MIROC5, GFDL-CM3
Slide 39 March 4, 2014
Precipitation Correlated with SOI
- Observation: GPCP
- CMIP3: CCSM3, MIROC-3.2medres, GFDL-CM2.1
- CMIP5: CCSM4, MIROC5, GFDL-CM3
March 4, 2014 Slide 40
Conclusion
- Global climate change is a defining societal challenge for our generation
- Data-guided discovery methods can play a major role in answering some of these key questions
- Significant advances in spatio-temporal data mining