Topics
Part I: BFAST R package optimizations
Part II: Scalable EO data management with SciDB
Part III: Hands-on with SciDB, Landsat, and BFAST
1. SciDB installation (with Docker)
2. Data ingestion
3. Analysis (practical part)

BFAST on large datasets: bfastSpatial and raster
• works well with out-of-memory data
• supports multicore parallel processing
• difficult to stack data from different tiles due to overlap and different recording dates
• does not scale beyond a single machine on its own

SciDB for large EO datasets
• Array-based data management and analytical system [1]
• Runs on single computers as well as on large clusters
• Open-source version available
• Sparse storage
• Basic data representation as multidimensional arrays
• n dimensions, m attributes (bands) with different data types
[Figure: arrays with the dimensions latitude, longitude, and time]
[1] Stonebraker, M., Brown, P., Zhang, D., & Becla, J. (2013). SciDB: A database management system for applications with complex analytics. Computing in Science & Engineering, 15(3), 54-62.

Distributing arrays by chunking
• arrays are divided into equally sized chunks
• chunks are distributed over many SciDB instances
• instances may run on the same or different machines in a shared-nothing cluster
→ distributing storage and computational load
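
The chunking idea above can be sketched in a few lines of Python. This is only an illustration: the array shape, chunk sizes, and the round-robin placement in `assign_instance` are assumptions made for the example and do not describe how SciDB actually places chunks on instances.

```python
from math import ceil

def chunk_index(coord, origin, chunk_size):
    """Return the per-dimension index of the chunk that contains a cell."""
    return tuple((c - o) // s for c, o, s in zip(coord, origin, chunk_size))

def chunks_per_dimension(shape, chunk_size):
    """Number of chunks needed along each dimension (last chunk may be partial)."""
    return tuple(ceil(n / s) for n, s in zip(shape, chunk_size))

def assign_instance(chunk_idx, chunk_counts, n_instances):
    """Hypothetical round-robin placement, NOT SciDB's real distribution scheme:
    linearize the chunk index and take it modulo the number of instances."""
    linear = 0
    for idx, count in zip(chunk_idx, chunk_counts):
        linear = linear * count + idx
    return linear % n_instances

# Illustrative array: 1000 x 1000 cells, 365 time steps, full time axis per chunk.
shape = (1000, 1000, 365)
chunk_size = (256, 256, 365)

counts = chunks_per_dimension(shape, chunk_size)
idx = chunk_index((300, 700, 10), (0, 0, 0), chunk_size)
instance = assign_instance(idx, counts, n_instances=4)
```

Because every cell maps to exactly one chunk and every chunk to exactly one instance, both storage and computation are spread across instances without shared state.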

Query language and functionality
• SciDB query language: Array Functional Language (AFL)
• Native functionality:
– Load / write arrays from / to files
– Arithmetic operations
– Subsetting by dimensions and / or attributes
– Aggregations (window, aggregate)
– Array joins
– Changing array schemas (repartitioning, redimensioning)
– Linear algebra routines (GEMM, GESVD, basic statistics)
– …

SciDB: extensions for EO data
SciDB
• can load data from CSV and custom binary files only
• does not understand the spatial / temporal reference of arrays
→ spacetime extensions [1]:
– scidb4geo (https://github.com/appelmar/scidb4geo)
– scidb4gdal (https://github.com/appelmar/scidb4gdal)
[1] Appel, M., Lahn, F., Pebesma, E., Buytaert, W., & Moulds, S. (2016). Scalable Earth-observation Analytics for Geoscientists: Spacetime Extensions to the Array Database SciDB. Accepted for poster presentation at EGU General Assembly 2016, Vienna, Austria, April 17-22, 2016.

scidb4geo
New AFL (Array Functional Language) operators

Operator      Description
eo_arrays()   Lists geographically referenced arrays
eo_setsrs()   Sets the spatial reference of existing arrays
eo_getsrs()   Gets the spatial reference of existing arrays
eo_extent()   Computes the geographic extent of referenced arrays
eo_settrs()   Sets the temporal reference of arrays
eo_gettrs()   Gets the temporal reference of arrays
eo_setmd()    Sets key-value metadata of arrays and array attributes
eo_getmd()    Gets key-value metadata of arrays and array attributes
eo_over()     Overlays two arrays by space and / or time

scidb4gdal
• supports ingestion and download of images to and from SciDB
• GDAL supports > 100 raster formats
• ingestion automatically combines images by space and time (mosaicing)

Interfacing R
R as a client: packages scidb [1] and scidbst [2]
• works with proxy objects and lazy evaluation
→ starts computations when you want to read the data
• overwrites R methods, e.g. %*%
• limited to native SciDB functionality
Running R within SciDB: stream [3] and r_exec [4]
• apply arbitrary R functions in parallel on chunks
[1] https://github.com/Paradigm4/SciDBR
[2] https://github.com/flahn/scidbst
[3] https://github.com/Paradigm4/stream
[4] https://github.com/Paradigm4/r_exec
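
The proxy-object and lazy-evaluation behaviour of the client packages can be illustrated with a small Python sketch. The `LazyArray` class, the `fake_executor` function, and the array name `L7_NDVI` are hypothetical and exist only for this example; they are not part of the scidb or scidbst packages. The sketch only shows the general pattern: operations compose a query string, and nothing is sent for execution until the data is read.

```python
class LazyArray:
    """Hypothetical proxy object: holds a query string, not the data."""

    def __init__(self, query, executor):
        self.query = query
        self._executor = executor

    def filter(self, expr):
        # Wrap the current query in an AFL-style filter(); nothing runs yet.
        return LazyArray(f"filter({self.query}, {expr})", self._executor)

    def apply(self, name, expr):
        # Wrap the current query in an AFL-style apply(); nothing runs yet.
        return LazyArray(f"apply({self.query}, {name}, {expr})", self._executor)

    def read(self):
        # Only now is the composed query handed to the database.
        return self._executor(self.query)


executed = []

def fake_executor(query):
    """Stand-in for a database connection: records the query, returns no data."""
    executed.append(query)
    return []

ndvi = LazyArray("L7_NDVI", fake_executor)
scaled = ndvi.filter("ndvi > -9999").apply("ndvi_scaled", "ndvi / 10000.0")
queries_before_read = len(executed)  # still 0: no computation has started
scaled.read()                        # triggers one combined query
```

Composing the whole expression before execution lets the database evaluate one nested query instead of materializing each intermediate array on the client.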

BFAST within SciDB
• Idea: organize chunk sizes such that one chunk contains the complete time series of a small region, e.g. 50x50 pixels
• Use stream or r_exec to run bfast in parallel
• R and the bfast package must be installed on all SciDB servers
→ scalability with relatively little reimplementation needed
→ move the computations to the data instead of moving the data to the computations
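
The chunk layout described above can be sketched as follows. This is a simplified Python illustration, not the R code run through stream or r_exec: `mean_shift` is a deliberately trivial stand-in for a per-pixel analysis such as bfastmonitor, and the sparse dictionary layout and sample values are assumptions made for the example.

```python
from collections import defaultdict

def group_by_chunk(cells, block=50):
    """Group sparse cells {(x, y, t): value} into spatial blocks.
    Each block holds the complete time series of its pixels."""
    chunks = defaultdict(lambda: defaultdict(list))
    for (x, y, t), value in cells.items():
        chunks[(x // block, y // block)][(x, y)].append((t, value))
    return chunks

def mean_shift(series, start_monitor):
    """Trivial stand-in for a per-pixel analysis (NOT BFAST):
    mean of the monitoring period minus mean of the history period."""
    history = [v for t, v in series if t < start_monitor]
    monitor = [v for t, v in series if t >= start_monitor]
    if not history or not monitor:
        return None
    return sum(monitor) / len(monitor) - sum(history) / len(history)

def process_chunk(pixels, start_monitor):
    """Work done independently per chunk; chunks could run in parallel."""
    return {xy: mean_shift(sorted(series), start_monitor)
            for xy, series in pixels.items()}

# Two pixels in different 50x50 blocks, four time steps each (made-up values).
cells = {
    (0, 0, 0): 0.8, (0, 0, 1): 0.8, (0, 0, 2): 0.4, (0, 0, 3): 0.4,
    (60, 10, 0): 0.5, (60, 10, 1): 0.5, (60, 10, 2): 0.5, (60, 10, 3): 0.5,
}
chunks = group_by_chunk(cells)
results = {c: process_chunk(p, start_monitor=2) for c, p in chunks.items()}
```

Since each chunk already contains every observation of its pixels, `process_chunk` needs no data from other chunks, which is what allows the per-chunk calls to be distributed.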

Study case: Monitoring changes in NDVI time series of Landsat 7 in south-west Ethiopia
• Landsat 7 data from 12 tiles captured between 2003-07-21 and 2014-12-27
→ 1975 scenes
• Derived NDVI product from ESPA
• approx. 325,000 km²
• monitor changes starting with 2010-01-01, with ROC history model

Landsat 7 in SciDB
1. Ingestion:
– For all *_ndvi.tif images:
  • extract date from filename
  • reproject / warp to the same spatial reference system
  • upload to SciDB
2. Repartition the array such that chunks contain complete time series of 64x64 pixels
3. Preprocessing:
– remove any values <= -9999 or > 10000
– rescale to the range [-1, 1]
• Ingestion of all scenes took around 4 days
• Repartitioning took around 2 days
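
Two of the steps above, extracting the date from the filename and the preprocessing rule, can be sketched in Python. The filename layout is an assumption: the sketch expects the pre-collection Landsat scene ID (sensor, path, row, then four-digit year and three-digit day of year), and the example filename is made up. The divisor of 10000 is also an assumption, inferred from the valid range given on the slide.

```python
from datetime import datetime

def date_from_filename(filename):
    """Parse the acquisition date from a scene-ID style filename.
    Assumes characters 9-15 hold YYYYDDD (year and day of year)."""
    return datetime.strptime(filename[9:16], "%Y%j").date()

def preprocess(value, scale=10000.0):
    """Drop invalid values and rescale the rest to [-1, 1].
    Returns None for values <= -9999 or > 10000."""
    if value <= -9999 or value > 10000:
        return None
    return value / scale

# Hypothetical filename: Landsat 7 ETM+, path 169, row 055, year 2003, day 202.
acquired = date_from_filename("LE71690552003202ASN00_ndvi.tif")
cleaned = [preprocess(v) for v in (-9999, 2500, 10000, 20000)]
```

In a sparse array, the values mapped to `None` here would simply be left as empty cells rather than stored.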

Landsat 7 in SciDB
The data is represented in SciDB as a three-dimensional array with daily temporal resolution and
• 49548 x 47713 x 4177 cells in total
• 64 x 64 x 4177 cells per chunk
• Only 0.5% (54 ⋅ 10^9) of the cells contain data
• SciDB has sparse storage
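
The figures on this slide can be cross-checked with a little arithmetic. The Python sketch below uses only numbers stated in the slides (array shape, chunk shape, number of filled cells, and the acquisition date range of the study case); the variable names are illustrative.

```python
from datetime import date
from math import ceil

shape = (49548, 47713, 4177)   # x, y, time (days)
chunk = (64, 64, 4177)         # full time series per 64x64 block

# Total number of cells in the dense array, roughly 9.9 * 10^12.
total_cells = shape[0] * shape[1] * shape[2]

# Chunks needed along each dimension and in total.
chunks_per_dim = tuple(ceil(n / c) for n, c in zip(shape, chunk))
n_chunks = chunks_per_dim[0] * chunks_per_dim[1] * chunks_per_dim[2]

# Share of cells that actually hold data (54 * 10^9 filled cells).
filled_fraction = 54e9 / total_cells

# Days between first and last acquisition, for comparison with the time axis.
days = (date(2014, 12, 27) - date(2003, 7, 21)).days
```

The day difference between the first and last acquisition is 4177, which matches the length of the time dimension, and the filled share comes out at roughly half a percent, in line with the slide.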

Scalability with SciDB instances
• 16 SciDB instances on one machine used (64 CPU cores, 256 GB main memory)
• running bfastmonitor repeatedly with different numbers of available CPU cores on a small subset

Study case: results
• Running bfastmonitor on the complete dataset took 8 days

Conclusions
• SciDB is able to make BFAST scalable even in large cluster environments
• The multidimensional array model, chunking, and sparse storage are well suited to represent large EO datasets from many scenes
• Ingestion and data restructuring are time-consuming; alternatives to GDAL are needed
• Installation and data ingestion are not straightforward
• Analysis from R is relatively easy to learn for experienced R users (see hands-on part)

Thank you Questions?
