EDC Forum 2017: Big Spatial Data in Agriculture
Marlena Götza, Thilo Steckel, Heinrich Warkentin
CLAAS E-Systems
- 1. CLAAS & GIS Technologies
- 2. Hadoop as a „Big Data“ Ecosystem
- 3. Big Data & GIS Technologies
- 4. Résumé
21.09.2017
Company Presentation CLAAS E-Systems | CLAAS Group
Product Range
- Telehandlers
- Balers
- Service & Parts
- Software and systems
- Combine harvesters
- Forage harvesters
- Tractors
- Forage harvesting machines
Company Presentation CLAAS E-Systems | Trends and Challenges
Agricultural Engineering in the past
Have we reached our limits?
Precision Agriculture
GIS in Agriculture
- 1. CLAAS & GIS Technologies
- 2. Hadoop as a „Big Data“ Ecosystem
- 3. Big Data & GIS Technologies
- 4. Résumé
Research Project AGATA: Analysis of large data volumes in processing operations (Analyse großer Datenmengen in Verarbeitungsprozessen)
Hadoop
- Core Concepts
  - HDFS
    - Physical replication of data
    - Fault tolerant through redundancy
  - MapReduce Framework
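To make the MapReduce idea concrete, the following is a minimal single-process sketch in Python, not Hadoop code. It only illustrates the three phases (map, shuffle, reduce). The record layout `(machine_id, value)` and all function names are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(record):
    """Emit (key, value) pairs for one input record (hypothetical layout)."""
    machine_id, value = record
    yield machine_id, value

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Collapse all values of one key into a single result, here the mean."""
    return key, sum(values) / len(values)

def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory list of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

records = [("m1", 10.0), ("m2", 4.0), ("m1", 14.0)]
result = run_job(records)
print(result)
```

In a real cluster the map and reduce functions run in parallel on the nodes that hold the HDFS blocks; the sketch only shows the data flow.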
Data
- Machine Data
  - GPS position
  - Operating data
  - Master data
- Field Data
  - Polygons
  - Documentation
Hadoop
- Good at:
  - Storing, processing, querying big data sets
  - "batch" processing of data
- Bad at:
  - Processing spatial data
  - Handling time and space components
  - Visualization of (spatial / temporal) data
- 1. CLAAS & GIS Technologies
- 2. Hadoop as a „Big Data“ Ecosystem
- 3. Big Data & GIS Technologies
- 4. Résumé
How can the current big data infrastructure be extended with ESRI technology to support the spatial component of the data during processing at CLAAS? Does the use of GIS technologies add value to information processing at CLAAS?
Research cooperation with 52N and ESRI
GIS Tools for Hadoop:
- open-source
- Esri Geometry API for Java: Java-based API
- Spatial Framework for Hadoop: adds User Defined Functions (UDFs) for spatial queries
- Geoprocessing Tools for Hadoop: connection to ArcGIS Desktop
Configuration 1: GIS Tools for Hadoop
- selecting the DEM points (DGM, digitales Geländemodell, in the table names) that lie within a 5 m buffer around the GPS point
- calculating the average height of the selected DEM points
- assigning the average height to the GPS point
-- A buffer of 0.000045 degrees corresponds to roughly 5 m of latitude
SELECT tm_gps.*,
       AVG(dgm_dt.dgm_height) AS avg_gps_height
FROM tm_gps, dgm_dt
WHERE ST_Contains(
        ST_Buffer(ST_Point(tm_gps.gps_long, tm_gps.gps_lat), 0.000045),
        ST_Point(dgm_dt.dgm_long, dgm_dt.dgm_lat))
GROUP BY tm_gps.id, tm_gps.gps_long, tm_gps.gps_lat, tm_gps.gps_height;
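The buffer distance in the query is given in degrees, not metres. The following Python sketch shows where a value like 0.000045 can come from and mimics the selection and averaging in memory. It assumes roughly 111,320 m per degree of latitude; the function names and sample coordinates are hypothetical.

```python
import math

# Assumption: approximate length of one degree of latitude in metres.
METERS_PER_DEGREE_LAT = 111_320.0

def meters_to_degrees_lat(meters):
    """Convert a north-south distance in metres to degrees of latitude."""
    return meters / METERS_PER_DEGREE_LAT

def avg_height_near(gps_long, gps_lat, dem_points, radius_deg=0.000045):
    """Average the heights of all DEM points within radius_deg of the GPS point.

    Distances are measured in plain degrees, like the buffer in the query,
    so the search area is only approximately a 5 m circle on the ground.
    """
    heights = [h for (lon, lat, h) in dem_points
               if math.hypot(lon - gps_long, lat - gps_lat) <= radius_deg]
    return sum(heights) / len(heights) if heights else None

# Hypothetical sample data: (longitude, latitude, height in metres)
dem_points = [(8.0, 52.0, 100.0), (8.00002, 52.00002, 104.0), (8.001, 52.0, 200.0)]
buffer_deg = meters_to_degrees_lat(5.0)
avg = avg_height_near(8.0, 52.0, dem_points)
print(buffer_deg, avg)
```

Because a degree of longitude is shorter than a degree of latitude away from the equator, a buffer expressed in degrees covers less than 5 m in the east-west direction at mid-latitudes. Projecting the data to a metric coordinate system first would avoid this.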
Configuration 1: GIS Tools for Hadoop - Example
Enrichment of altitude data (code and explanation)
- spatial binning: partitioning space by a grid of fixed resolution
- aggregating the height values from the DEM in each cell
- each cell has a unique ID; the aggregated height values are assigned to the grid cells
- determining the bin IDs of the machine GPS points
- joining the machine data to the height values
-- Aggregate the DEM heights per grid cell (bin size 0.0005 degrees)
CREATE VIEW height_agg_bin AS
SELECT bin_id,
       ST_BinEnvelope(0.0005, bin_id) shape,
       COUNT(*) count,
       AVG(dgm_height) height,
       MAX(dgm_height) max,
       MIN(dgm_height) min
FROM (
  SELECT ST_Bin(0.0005, ST_Point(dgm_dt.dgm_long, dgm_dt.dgm_lat)) bin_id, *
  FROM dgm_dt
) bins
GROUP BY bin_id;

-- Determine the bin ID of each GPS point and join the aggregated heights
SELECT *
FROM (
  SELECT *, ST_Bin(0.0005, ST_Point(gps_long, gps_lat)) AS bin_id
  FROM tm_gps
) t1
LEFT OUTER JOIN height_agg_bin t2
ON (t1.bin_id = t2.bin_id);
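The binning approach replaces a point-in-buffer test per pair of points with an equality join on a cell ID. The Python sketch below mirrors that idea in memory. The bin ID scheme (column and row index of the cell) is an assumption for illustration and is not necessarily how `ST_Bin` numbers its cells; the sample coordinates are hypothetical.

```python
import math
from collections import defaultdict

def bin_id(lon, lat, bin_size):
    """Hypothetical bin ID: (column, row) of the grid cell that contains the point."""
    return (math.floor(lon / bin_size), math.floor(lat / bin_size))

def aggregate_heights(dem_points, bin_size):
    """Aggregate DEM heights per grid cell, like the height_agg_bin view."""
    cells = defaultdict(list)
    for lon, lat, height in dem_points:
        cells[bin_id(lon, lat, bin_size)].append(height)
    return {
        b: {"count": len(hs), "height": sum(hs) / len(hs), "max": max(hs), "min": min(hs)}
        for b, hs in cells.items()
    }

def join_gps(gps_points, height_bins, bin_size):
    """Left outer join: keep every GPS point; the height is None if its cell has no DEM data."""
    return [(p, height_bins.get(bin_id(p[0], p[1], bin_size))) for p in gps_points]

# Hypothetical sample data
dem_points = [(8.00012, 52.00012, 100.0), (8.00031, 52.00027, 110.0), (8.00262, 52.00012, 90.0)]
gps_points = [(8.00040, 52.00040), (8.01021, 52.00040)]

height_bins = aggregate_heights(dem_points, 0.0005)
joined = join_gps(gps_points, height_bins, 0.0005)
for point, stats in joined:
    print(point, stats)
```

The trade-off is precision: every GPS point in a cell receives the same aggregated height, so the bin size determines the spatial resolution of the result.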
Configuration 1: GIS Tools for Hadoop - Example
Enrichment of altitude data: spatial binning (code and explanation)
ArcGIS Enterprise Stack
- Hadoop cluster is integrated as a Big Data File Share
- extension of the Hadoop system with ArcGIS software
- ArcGIS API for Python to use the ArcGIS Enterprise components in scripts
- ArcGIS Pro to operate processes on ArcGIS Enterprise
Configuration 2: ArcGIS Enterprise
Use case example using the GeoAnalytics Server
- detection of field boundaries on the basis of GPS points using GIS technologies
Use Case example Field boundaries
Step 1: Preprocessing
Input data points: GPS position + timestamp; attributes = sensor data
Step 1: Preprocessing
Filtering of non-relevant data (road travel, U-turns, null values, …)
Step 2: Reconstruction of field trajectories
Using the “Reconstruct Tracks” tool
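The general idea behind track reconstruction can be sketched in plain Python: group the points per machine, order them by time, and start a new track whenever the time gap becomes too large. This is not the “Reconstruct Tracks” tool itself; the point layout `(machine_id, timestamp_s, lon, lat)` and the 60 s gap threshold are assumptions for illustration.

```python
from collections import defaultdict

def reconstruct_tracks(points, max_gap_s=60):
    """Build time-ordered tracks per machine and split them at large time gaps.

    points: iterable of (machine_id, timestamp_s, lon, lat) tuples (hypothetical layout).
    Returns a list of tracks; each track is a list of points sorted by timestamp.
    """
    by_machine = defaultdict(list)
    for p in points:
        by_machine[p[0]].append(p)

    tracks = []
    for machine_id in sorted(by_machine):
        ordered = sorted(by_machine[machine_id], key=lambda p: p[1])
        current = [ordered[0]]
        for prev, nxt in zip(ordered, ordered[1:]):
            if nxt[1] - prev[1] > max_gap_s:
                # Gap too large: close the current track and start a new one.
                tracks.append(current)
                current = []
            current.append(nxt)
        tracks.append(current)
    return tracks

# Hypothetical sample data, deliberately unordered
points = [
    ("m1", 10, 8.0001, 52.0),
    ("m1", 0, 8.0000, 52.0),
    ("m2", 5, 8.1000, 52.1),
    ("m1", 510, 8.0101, 52.0),
    ("m1", 500, 8.0100, 52.0),
]
tracks = reconstruct_tracks(points, max_gap_s=60)
for track in tracks:
    print([p[1] for p in track])
```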
Step 3: Grouping
Grouping field tracks into fields, either by dissolving the tracks or by rasterizing and grouping them.
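One way to read the rasterizing variant is as a connected-components problem: mark every grid cell that contains a track point and treat each cluster of touching cells as one field. The sketch below assumes coordinates already projected to metres and a hypothetical cell size of 10 m; it illustrates the idea and is not the workflow used in ArcGIS.

```python
def rasterize(track_points, cell_size):
    """Map each (x, y) point in metres to the grid cell that contains it."""
    return {(int(x // cell_size), int(y // cell_size)) for x, y in track_points}

def group_cells(cells):
    """Group occupied cells into connected components (8-neighbourhood)."""
    remaining = set(cells)
    groups = []
    while remaining:
        seed = remaining.pop()
        group, stack = {seed}, [seed]
        while stack:
            cx, cy = stack.pop()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (cx + dx, cy + dy)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        group.add(neighbour)
                        stack.append(neighbour)
        groups.append(group)
    return groups

# Hypothetical sample data: two clusters of points far apart
points_m = [(1, 1), (12, 3), (25, 8), (205, 210), (214, 222)]
fields = group_cells(rasterize(points_m, 10.0))
print(sorted(len(g) for g in fields))
```

The cell size is the main tuning parameter: too small and a single field falls apart into several groups, too large and neighbouring fields merge.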
Step 4: Generating field boundaries
Extracting the field polygons by expanding the field tracks to the working width of the machine.
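Expanding a track to the working width can be illustrated for a single straight track segment: offset the segment by half the working width to both sides and connect the four corners. The sketch assumes coordinates in metres and a hypothetical working width of 6 m. A complete field polygon would additionally require merging the polygons of all segments, which is not shown here.

```python
import math

def swath_polygon(p1, p2, work_width_m):
    """Rectangle covered by a machine driving straight from p1 to p2 (coordinates in metres)."""
    (x1, y1), (x2, y2) = p1, p2
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        raise ValueError("segment has zero length")
    # Unit vector perpendicular to the driving direction
    nx, ny = -(y2 - y1) / length, (x2 - x1) / length
    half = work_width_m / 2
    return [
        (x1 + nx * half, y1 + ny * half),
        (x2 + nx * half, y2 + ny * half),
        (x2 - nx * half, y2 - ny * half),
        (x1 - nx * half, y1 - ny * half),
    ]

def polygon_area(polygon):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for (xa, ya), (xb, yb) in zip(polygon, polygon[1:] + polygon[:1]):
        total += xa * yb - xb * ya
    return abs(total) / 2

swath = swath_polygon((0.0, 0.0), (100.0, 0.0), 6.0)
area_m2 = polygon_area(swath)
print(swath, area_m2)
```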
Known Issues:
- Hadoop is secured by the Apache Knox Gateway; a workaround is required
- 1. CLAAS & GIS Technologies
- 2. Hadoop as a „Big Data“ Ecosystem
- 3. Big Data & GIS Technologies
- 4. Résumé
Conclusion
- 1. GIS Tools for Hadoop:
  - using the Spatial Framework for Hadoop (open source and easy to integrate)
  - next steps: using Spark on Hadoop
  - GeoSpark and other geospatial packages for Spark
- 2. Desktop GIS
  - provides additional GIS tools that are not available in the GIS Tools for Hadoop
- 3. ArcGIS Enterprise:
  - Big Data technology stack can enhance the analysis of machine data
  - integration with the big data infrastructure does not work reliably
Company Presentation CLAAS E-Systems | CLAAS E-Systems
Occupational Areas
Employment Figures
Entry Opportunities
Thank you for your attention!