Processing and analysis of Earth Observation data




slide-1
SLIDE 1

Processing and analysis of Earth Observation data

Carsten Brockmann, Brockmann Consult GmbH

ESA Climate Change Initiative Toolbox Science Lead

Big Data Analytics & GIS, Münster 20.-21. September 2017.

slide-2
SLIDE 2

Earth Observation

Managing big EO data is increasingly complex. But not just technically.

  • Collaboration. Culture. Organisation. Structure.
slide-3
SLIDE 3

Lake MacKay, Australia

slide-4
SLIDE 4

Tonga, Pacific

slide-5
SLIDE 5

Arctic Ocean

slide-6
SLIDE 6

Karavasta Lagoon, Albania

slide-7
SLIDE 7

Satellite Images = Measurement Data

slide-8
SLIDE 8

Turning image data into information

Satellite Images = Measurement Data

Generic Tools

  • Data selection & access
  • Visualisation
  • Analysis
  • Processing
  • Export

Instrument specific tools

  • System correction
  • Data processing (L1 -> L2)

Thematic processing, synergy

  • Programming tools
  • API for Python and Java
  • Scripting
  • Graph model builder
slide-9
SLIDE 9

SNAP Architecture

SNAP layer

  • SNAP Desktop
  • SNAP Engine
  • Sentinel-1 Toolbox (S1TBX)
  • Sentinel-2 Toolbox (S2TBX)
  • Sentinel-3 Toolbox (S3TBX)

3rd-party library layer

  • NetBeans RCP
  • GeoTools
  • JAI
  • NetCDF
  • …

Programming language layer

  • Java SE 8 Platform
  • Python

Any combination of toolboxes and add-ons is allowed, even none, as SNAP Desktop is already a useful stand-alone application for EO data exploitation.

slide-10
SLIDE 10


slide-11
SLIDE 11

SNAP Application Modes

slide-12
SLIDE 12

Golden Age of Earth Observation

By the end of 2017, the operational Sentinel-1, -2, -3 and -5p satellites alone will continuously collect a volume of 27 Terabytes per day / 10 PB per year.

It could take around 2.5 years to download 1 Petabyte of Sentinel-1 data and a staggering 63 years to pre-process it on your own computer (Wagner, 2015)
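
The volume figures can be cross-checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and a 365-day year; the function names are illustrative only. The sustained download rate is an inference from the quoted numbers, not a figure stated on the slide.

```python
# Back-of-the-envelope check of the data volumes quoted on the slide.
# Assumptions (not from the slide): decimal units and a 365-day year.
TB = 10**12  # bytes
PB = 10**15  # bytes
DAYS_PER_YEAR = 365
SECONDS_PER_DAY = 86_400


def yearly_volume_pb(tb_per_day):
    """Convert a daily volume in terabytes into a yearly volume in petabytes."""
    return tb_per_day * TB * DAYS_PER_YEAR / PB


def implied_rate_mbit_per_s(volume_bytes, years):
    """Sustained rate (Mbit/s) needed to move volume_bytes within the given years."""
    seconds = years * DAYS_PER_YEAR * SECONDS_PER_DAY
    return volume_bytes * 8 / seconds / 1e6


if __name__ == "__main__":
    print(f"27 TB/day is about {yearly_volume_pb(27):.1f} PB/year")
    print(f"1 PB in 2.5 years needs about {implied_rate_mbit_per_s(PB, 2.5):.0f} Mbit/s sustained")
```

Under these assumptions, 27 TB per day comes to just under 10 PB per year, and 1 PB in 2.5 years corresponds to a sustained rate on the order of 100 Mbit/s.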

slide-13
SLIDE 13

Data Local Processing

  • Optimising data transfer
  • Sharing input data
  • Sharing results
  • Rapid turn-around cycles

slide-14
SLIDE 14

Data Local Processing

Hadoop approach

  • Concurrent data-local processing
  • Tasks are transferred over the network
  • Good scalability

Archive-centric approach

  • Network storage
  • Data are transferred over the network
  • Risk of network bottleneck
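
The difference between the two approaches can be illustrated with a toy scheduler that sends each task to a node already holding the input file. This is a minimal sketch, not Hadoop's actual scheduling logic; `assign_tasks`, the node names and the file names are hypothetical.

```python
def assign_tasks(file_locations, nodes):
    """Assign one task per file, preferring nodes that store the file locally.

    file_locations maps a file name to the list of nodes holding a replica.
    Returns a dict: file name -> (chosen node, True if the read is local).
    """
    load = {node: 0 for node in nodes}
    assignment = {}
    for name, holders in sorted(file_locations.items()):
        local = [n for n in holders if n in load]
        # Fall back to any node (data must then cross the network).
        candidates = local if local else list(nodes)
        # Pick the least-loaded candidate; ties are broken by node name.
        node = min(candidates, key=lambda n: (load[n], n))
        load[node] += 1
        assignment[name] = (node, node in holders)
    return assignment


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3"]
    files = {
        "a.nc": ["node1", "node2"],
        "b.nc": ["node1"],
        "c.nc": ["node3"],
        "d.nc": ["node9"],  # replica only on a node outside this cluster
    }
    for name, (node, is_local) in assign_tasks(files, nodes).items():
        print(name, "->", node, "(local)" if is_local else "(remote read)")
```

Only the small task description travels to the node in the local case; a remote read is the situation the archive-centric approach has for every file.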

slide-15
SLIDE 15

Calvalus Processing System for EO Data

Components: Calvalus On-demand Portal, Calvalus Bulk Production, Calvalus Adapters, SNAP GPF Operators and Graphs, SNAP Aggregators, Linux Executables

Apply Apache Hadoop to earth observation

  • Transfer the algorithm to the data (data-local in a narrower sense)
  • Avoid an archive-centric approach

Add Calvalus software layers for EO data processing and validation

  • EO processing workflows
  • Data processor plug-in framework
  • Bulk production control
  • Portal

Integrate data processors

  • Linux executables
  • SNAP & BEAM GPF operators and aggregators
  • Open for other frameworks

slide-16
SLIDE 16

Data Local Processing - Level 2 Processing Workflow on Calvalus

  • distributed file system HDFS on local disks of compute nodes
  • transparent, optimised data-local access

Workflow (in parallel for each input file): L1 File -> L2 Processor (Mapper Task) -> L2 File

  • MERIS RR L1, North Sea, 3 days
  • CoastColour NN L2 processor
  • 6 minutes (22 nodes)
  • output: L2 files
  • Only Mapper Tasks, no reduce step necessary
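
A map-only job of this kind can be sketched as one independent function call per input file, with no combining step afterwards. The example below is an illustration under stated assumptions: `l2_processor` is a hypothetical stand-in (it merely scales values), and a thread pool stands in for the cluster's mapper tasks.

```python
from concurrent.futures import ThreadPoolExecutor


def l2_processor(l1_product):
    """Hypothetical stand-in for an L2 processor: one L1 product in, one L2 product out."""
    return {
        "name": l1_product["name"].replace("L1", "L2"),
        "values": [v * 0.5 for v in l1_product["values"]],
    }


def run_map_only(l1_products, workers=4):
    """Apply the processor to every input independently; there is no reduce step."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(l2_processor, l1_products))


if __name__ == "__main__":
    inputs = [
        {"name": "MER_L1_a", "values": [2.0, 4.0]},
        {"name": "MER_L1_b", "values": [6.0]},
    ]
    for product in run_map_only(inputs):
        print(product["name"], product["values"])
```

Because each output depends on exactly one input, the work scales with the number of nodes until the input files run out.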
slide-17
SLIDE 17

Map-Reduce on Calvalus

  • Algorithm defined by Google employees Dean and Ghemawat in 2004
  • Idea: partition data into chunks, compute chunks locally (map), combine intermediate results into the final result (reduce)
  • Allows for a high degree of parallelisation
  • Fits very well with the principle of data locality
  • distributed file system HDFS on local disks of compute nodes
  • transparent, optimised data-local access
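
The partition / map / reduce idea can be shown in a few lines. This is a minimal single-process sketch, not Hadoop code: the map step computes a partial (sum, count) per chunk, and the reduce step merges the partials into a global mean. All function names are illustrative.

```python
from functools import reduce


def partition(data, chunk_size):
    """Split the data into consecutive chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def map_chunk(chunk):
    """Map step: compute a partial result for one chunk, independently of the others."""
    return (sum(chunk), len(chunk))


def reduce_partials(a, b):
    """Reduce step: merge two partial results."""
    return (a[0] + b[0], a[1] + b[1])


def mean_map_reduce(data, chunk_size=3):
    """Mean of the data, computed via partition, map and reduce."""
    if not data:
        raise ValueError("data must not be empty")
    partials = map(map_chunk, partition(data, chunk_size))
    total, count = reduce(reduce_partials, partials, (0.0, 0))
    return total / count


if __name__ == "__main__":
    print(mean_map_reduce([1, 2, 3, 4, 5, 6, 7]))
```

The map calls share no state, so they can run on whichever nodes hold the chunks; only the small partial results travel to the reducer.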

slide-18
SLIDE 18

Map-Reduce on Calvalus - Temporal and Spatial Integration

Workflow: L1 File -> L2 Proc. + Spat. Binning (Mapper Task) -> Spatial Bins (in parallel for each input file) -> Temp. Binning (Reducer Task) -> Temp. Bins -> L3 Formatting -> L3 File

  • MERIS RR L1, global, 10-day
  • SNAP "C2RCC" Water processor
  • 20 mins (100 nodes)
  • Output: 1 L3 product
  • distributed file system HDFS on local disks of compute nodes
  • transparent, optimised data-local access
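
Spatial binning in the mapper followed by temporal binning in the reducer can be sketched as follows. The sketch assumes a simple regular latitude/longitude grid and an unweighted mean; operational L3 binners typically use more elaborate grids and weighting. Function names and sample pixels are hypothetical.

```python
import math
from collections import defaultdict


def spatial_bin(pixels, cell_deg=1.0):
    """Mapper: aggregate the (lat, lon, value) pixels of one file into grid cells.

    Returns {cell: (sum, count)} so that results from several files can be merged.
    """
    bins = defaultdict(lambda: [0.0, 0])
    for lat, lon, value in pixels:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        bins[cell][0] += value
        bins[cell][1] += 1
    return {cell: tuple(acc) for cell, acc in bins.items()}


def temporal_bin(spatial_bins_per_file):
    """Reducer: merge the spatial bins of all files and return the mean per cell."""
    merged = defaultdict(lambda: [0.0, 0])
    for bins in spatial_bins_per_file:
        for cell, (total, count) in bins.items():
            merged[cell][0] += total
            merged[cell][1] += count
    return {cell: total / count for cell, (total, count) in merged.items()}


if __name__ == "__main__":
    day1 = [(54.2, 7.1, 1.0), (54.8, 7.9, 3.0)]
    day2 = [(54.5, 7.5, 5.0), (55.1, 7.2, 2.0)]
    level3 = temporal_bin([spatial_bin(day1), spatial_bin(day2)])
    print(level3)
```

Keeping (sum, count) pairs instead of per-file means is what makes the reducer's merge exact.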

slide-19
SLIDE 19

Map-Reduce on Calvalus - Match-up analysis

Workflow: L1 File -> L2 Proc. & Matcher (Mapper Task) -> Output Records (in parallel for each input file) -> Matchup Analysis (Reducer Task) -> MA Report. The diagram also shows Input Records as an input.

  • MERIS RR L1, global, 3 months
  • "CoastColour C2W" processor
  • NOMAD in-situ dataset
  • 1.5 minutes (100 nodes)
  • Output: scatter-plots and pixel extraction tables
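
A match-up analysis in this pattern pairs satellite pixels with in-situ records in the mapper and computes summary statistics in the reducer. The sketch below is illustrative only: the matching thresholds, the record layout and the choice of bias and RMSE as statistics are assumptions, not details taken from the slide.

```python
import math


def match(satellite_pixels, in_situ_records, max_deg=0.1, max_hours=3.0):
    """Mapper: emit (in-situ value, satellite value) pairs that are close in space and time."""
    pairs = []
    for rec in in_situ_records:
        for pix in satellite_pixels:
            close = (abs(pix["lat"] - rec["lat"]) <= max_deg
                     and abs(pix["lon"] - rec["lon"]) <= max_deg)
            timely = abs(pix["hour"] - rec["hour"]) <= max_hours
            if close and timely:
                pairs.append((rec["value"], pix["value"]))
    return pairs


def analyse(pairs_per_file):
    """Reducer: pool the pairs from all files and compute bias and RMSE."""
    pairs = [pair for pairs in pairs_per_file for pair in pairs]
    n = len(pairs)
    if n == 0:
        return {"n": 0, "bias": None, "rmse": None}
    diffs = [sat - ref for ref, sat in pairs]
    return {
        "n": n,
        "bias": sum(diffs) / n,
        "rmse": math.sqrt(sum(d * d for d in diffs) / n),
    }


if __name__ == "__main__":
    in_situ = [{"lat": 54.0, "lon": 8.0, "hour": 10.0, "value": 1.0}]
    file_a = [{"lat": 54.05, "lon": 8.05, "hour": 11.0, "value": 1.5},
              {"lat": 60.0, "lon": 8.0, "hour": 11.0, "value": 9.0}]
    file_b = [{"lat": 53.95, "lon": 7.95, "hour": 12.0, "value": 0.5}]
    print(analyse([match(file_a, in_situ), match(file_b, in_situ)]))
```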
slide-20
SLIDE 20

Streaming on Calvalus - With SNAP

Supported by SNAP Graph Processing Framework

  • Access to data via reader/writer objects instead of files
  • Operator chaining to build processors from modules
  • Tile cache and pull principle for in-memory processing
  • Hadoop MapReduce for partitioning and streaming
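
Operator chaining with a tile cache and the pull principle can be illustrated with two small classes. This is a conceptual sketch and does not use the SNAP GPF API; `Source`, `Operator` and `get_tile` are hypothetical names. A tile is computed only when something downstream asks for it, and each operator remembers the tiles it has already produced.

```python
class Source:
    """Reader object: hands out tiles on request and counts how often it is read."""

    def __init__(self, tiles):
        self.tiles = tiles
        self.reads = 0

    def get_tile(self, index):
        self.reads += 1
        return self.tiles[index]


class Operator:
    """Computes a tile on demand by pulling the same tile from its upstream source."""

    def __init__(self, source, func):
        self.source = source
        self.func = func
        self._cache = {}  # tile cache: index -> computed tile

    def get_tile(self, index):
        if index not in self._cache:
            upstream = self.source.get_tile(index)
            self._cache[index] = [self.func(v) for v in upstream]
        return self._cache[index]


if __name__ == "__main__":
    reader = Source({0: [1.0, 2.0], 1: [3.0, 4.0]})
    # Chain two operators: scale by 2, then add 1.
    chain = Operator(Operator(reader, lambda v: v * 2), lambda v: v + 1)
    print(chain.get_tile(1))  # pulls tile 1 through the whole chain
    print(chain.get_tile(1))  # served from the cache, no further read
    print("reads from source:", reader.reads)
```

Tile 0 is never read in this usage, because nothing requested it; intermediate results stay in memory instead of being written to files between processing steps.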
slide-21
SLIDE 21

EO Data & Data-Processing Platforms

European Space Agency & national Space Agencies

  • Thematic Exploitation Platforms
  • Mission Exploitation Platforms

European Commission:

  • Copernicus Data and Information Access Services (DIAS)

Copernicus Collaborative Ground Segments

Private offers

  • Google Earth Engine
  • Amazon Web Services
slide-22
SLIDE 22

The Urban Thematic Exploitation Platform

Diagram labels: Visualisation & Analysis, Urban TEP Processing Centres, Urban TEP portal + gateway, gateway to ...

slide-23
SLIDE 23

Portal functions

  • Datasets and services
  • Geo-browser
  • Processing request forms and result access

slide-24
SLIDE 24

Analysis and visualisation

  • Combination of satellite products and socio-economic data
  • Derivation of new criteria

slide-25
SLIDE 25

Processing request form

slide-26
SLIDE 26

GUF (Global Urban Footprint)

Global binary raster mask showing the location of human settlements (12 m / 75 m)

Istanbul, Moscow, Sao Paulo

slide-27
SLIDE 27

▪ SAR4Urban (2015-2016)

Beijing

ERS-2 PRI & ASAR IMP, VV, 2002-2003, 15 m spatial resolution, 48 scenes

Urban growth

slide-28
SLIDE 28

Beijing

S1A IW GRDH, VV, 2014-2015, 10 m spatial resolution, 31 scenes

Urban growth

slide-29
SLIDE 29

Urban TEP is ...

  • attractive high-quality datasets that meet space, time and feature dimensions of the domain
  • the capability to generate them
  • the facilities to access and use them, and to generate more of them

slide-30
SLIDE 30

Processor development model

Diagram labels: Package, Upload, Local test processing, Deployment, Processing Request, Concurrent processing, VM for download, Browser for request submission, Urban TEP portal, Urban TEP processing centres

slide-31
SLIDE 31

Systematic or on-demand processing

  • Datasets may be pre-generated, providing access to them as a product
      ▪ for long-running processes
      ▪ for global datasets with high complexity/information reduction
      ▪ in order to be able to visualise them
      ▪ Example: GUF
  • Datasets may be processed on-demand, providing a service instead
      ▪ for short-running processes
      ▪ for selected areas
      ▪ in case of user-defined parameterisation
      ▪ to avoid storage of large output datasets
      ▪ Example: Sentinel-2 timescan service (unless generated systematically)
slide-32
SLIDE 32

Urban TEP processing centres

Centres (in column order): IT4Innovations, Brockmann Consult, DLR

  • Infrastructure: cluster (Salomon HPC); cluster (Calvalus/Hadoop), YARN scheduler; virtualised env. (GeoFarm) + cluster (Calvalus/Hadoop)
  • Data: Sentinel-2 (urban areas, Africa), OLCI, MERIS; Sentinel-1 and other datasets
  • Interfaces: Geoserver WPS + own backend implementation; Calvalus WPS + Urban TEP config + extension; Geoserver WMS
  • Processing: large-scale global Landsat timescan processing; GUF subsetting, Sentinel-2 timescan processing; GUF and other Urban datasets
  • Strengths: fast internet access, HPC, host of portal and analysis/visualisation; distributed data-local processing and concurrent aggregation; systematic generation of datasets

slide-33
SLIDE 33

Copernicus Data and Exploitation Platform – Deutschland

  • National entry point to the EU Copernicus Sentinel Satellite Systems, their data products and the products of the Copernicus Services
  • Processing facilities on the platform

slide-34
SLIDE 34

EU DIAS

slide-35
SLIDE 35

Confusing?

YES!

slide-36
SLIDE 36

Climate Monitoring Data

Climate change is a global challenge. Open climate data is crucial.

slide-37
SLIDE 37

The objective of the Climate Change Initiative (CCI) is to realise the full potential of the long-term global Earth Observation archives that ESA together with its Member States have established over the last 30 years, as a significant and timely contribution to the Essential Climate Variable databases required by the United Nations Framework Convention on Climate Change (UNFCCC).

ESA Climate Change Initiative (CCI)

slide-38
SLIDE 38

ESA Climate Change Initiative (CCI)

  • 16 projects
  • >300 scientists
  • >100 organisations
  • 18 countries
  • Since 2009

slide-39
SLIDE 39

7 years. X individuals X organisations X projects

slide-40
SLIDE 40
slide-41
SLIDE 41

Climate Monitoring Data

An Overview of Climate Data Production.

slide-42
SLIDE 42

Step 1. Deciding what to actually measure.

Essential Climate Variables have been defined by the global science community to support the United Nations Framework Convention on Climate Change (UNFCCC).

slide-43
SLIDE 43

Criteria of Essential Climate Variables (ECV)

  • Relevance. Critical for climate monitoring.
  • Feasibility. Global measurement is feasible.

  • Cost effective. Using proven technology.

slide-44
SLIDE 44

Satellites can help

  • Global. Observe entire Earth.
  • Uniformity. Same instrument everywhere.

  • Rapid Measurement. Constant watch.

  • Continuity. Long time series to monitor change in climate.
slide-45
SLIDE 45

Step 2. Get the raw satellite data. Current data. Archived data. Planning for the future.

slide-46
SLIDE 46

Example

slide-47
SLIDE 47

Step 3. Process the data. Gridding, Homogenisation, Calibration & Validation, Quality. Scientific processing - application of state-of-the-art algorithms distilled from the very latest scientific reasoning.

slide-48
SLIDE 48

Step 4. Distribution of Climate Data Products. “Just give me the data”.

slide-49
SLIDE 49

ESA Climate Change Initiative (CCI)

Managing Complexity of Climate Data Production. Open Data Challenges & Approaches

slide-50
SLIDE 50

Meaningfulness & Community.

Managing Open Data Complexity

slide-51
SLIDE 51

Ease of Data Access.

Managing Open Data Complexity

slide-52
SLIDE 52
slide-53
SLIDE 53
slide-54
SLIDE 54

Bespoke Open Standards.

Managing Open Data Complexity

slide-55
SLIDE 55

Machine & Human Readable Standards.

Managing Open Data Complexity

slide-56
SLIDE 56

Interoperability & Collaboration.

Managing Open Data Complexity

slide-57
SLIDE 57

Open-source Tooling

Managing Open Data Complexity

slide-58
SLIDE 58

CATE

“Climate Analysis Toolbox for ESA”: software to facilitate processing and analysis of all the data products generated by the ESA Climate Change Initiative Programme (CCI).

slide-59
SLIDE 59


slide-60
SLIDE 60

Diagram labels:

  • Desktop App (GUI)
  • Command-Line App (CLI)
  • Web Service (WebAPI), RESTful
  • Python Core Lib (API): data sources, operations, workflows
  • Plugins (Plugin 1, Plugin 2)
  • ESA Open Data Portal and other data services
  • Processes (Process 1, Process 2, Process 3)

slide-61
SLIDE 61

Cate Desktop


slide-62
SLIDE 62

  • Browse datasets published by CCI Open Data Portal
  • Download full datasets or just subsets
      ▪ Temporal subset
      ▪ Spatial subset
      ▪ Variable subset
  • Manage also your local data sources
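
The three kinds of subset can be illustrated on a toy dataset. The sketch below is not Cate's API; the `subset` function and the record layout (ISO date string, latitude, longitude, a dict of variables) are assumptions made for illustration.

```python
def subset(records, variables=None, time_range=None, bbox=None):
    """Return the records restricted by time range, bounding box and variable names.

    time_range is (start, end) as ISO date strings, inclusive.
    bbox is (lon_min, lat_min, lon_max, lat_max), inclusive.
    variables is an iterable of variable names to keep.
    """
    result = []
    for rec in records:
        # Temporal subset: ISO dates compare correctly as strings.
        if time_range and not (time_range[0] <= rec["time"] <= time_range[1]):
            continue
        # Spatial subset.
        if bbox:
            lon_min, lat_min, lon_max, lat_max = bbox
            inside = lon_min <= rec["lon"] <= lon_max and lat_min <= rec["lat"] <= lat_max
            if not inside:
                continue
        # Variable subset.
        values = rec["vars"]
        if variables is not None:
            values = {k: v for k, v in values.items() if k in variables}
        result.append({**rec, "vars": values})
    return result


if __name__ == "__main__":
    data = [
        {"time": "2010-01-15", "lat": 54.0, "lon": 8.0, "vars": {"sst": 280.1, "chl": 1.2}},
        {"time": "2010-02-15", "lat": 54.0, "lon": 8.0, "vars": {"sst": 279.5, "chl": 1.0}},
        {"time": "2010-02-15", "lat": -10.0, "lon": 120.0, "vars": {"sst": 301.2, "chl": 0.2}},
    ]
    print(subset(data, variables=["sst"],
                 time_range=("2010-02-01", "2010-02-28"),
                 bbox=(0.0, 50.0, 10.0, 60.0)))
```

Applying the filters before download is what keeps the transferred volume small.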


slide-63
SLIDE 63


slide-64
SLIDE 64


slide-65
SLIDE 65


slide-66
SLIDE 66


slide-67
SLIDE 67

  • Every operation is a new workflow step
  • Workflows can be executed from Python or from the Command-Line Interface (CLI)
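
The idea that each operation becomes a recorded, replayable workflow step can be sketched as below. This is a conceptual illustration, not Cate's workflow format or API: the `Workflow` class, the `OPERATIONS` registry and the two toy operations are hypothetical.

```python
import json

# Hypothetical operation registry: name -> function(values, **params).
OPERATIONS = {
    "scale": lambda values, factor: [v * factor for v in values],
    "offset": lambda values, amount: [v + amount for v in values],
}


class Workflow:
    """Records operations as steps so that the same sequence can be re-run later."""

    def __init__(self):
        self.steps = []

    def add_step(self, op, **params):
        """Record one operation call as a new workflow step."""
        self.steps.append({"op": op, "params": params})
        return self

    def run(self, values):
        """Execute all recorded steps in order."""
        for step in self.steps:
            values = OPERATIONS[step["op"]](values, **step["params"])
        return values

    def to_json(self):
        """Serialise the steps, e.g. for execution from a script or command line."""
        return json.dumps(self.steps)

    @classmethod
    def from_json(cls, text):
        workflow = cls()
        workflow.steps = json.loads(text)
        return workflow


if __name__ == "__main__":
    wf = Workflow().add_step("scale", factor=2.0).add_step("offset", amount=1.0)
    print(wf.run([1.0, 2.0]))
    print(Workflow.from_json(wf.to_json()).run([1.0, 2.0]))
```

Storing steps as plain data, rather than as code, is what allows the same workflow to be driven from a GUI, a script or a command line.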


slide-68
SLIDE 68

Python Programming & Batch Processing

  • Exported CLI Calls
  • Exported Python Code
slide-69
SLIDE 69

Execution Scenarios


Cate Desktop Cate Desktop Cate Desktop

slide-70
SLIDE 70


slide-71
SLIDE 71

Processing and analysis of Earth Observation data

Managing big EO data is increasingly complex. But not just technically.

  • Collaboration. Culture. Organisation. Structure.
  • Communication. Learning. Exchange.

  • Sentinel Toolbox SNAP: step.esa.int
  • CCI Toolbox: github.com/CCI-Tools/CCI-Tools.github.io, ect-core.readthedocs.io, cci-tools.github.io
  • Earth System Datacube: earthsystemdatacube.net/