Data Mining on Agriculture Data using Neural Networks



SLIDE 1

Outline Motivation Available Data Points of interest Data Modeling Results Work in Progress: Self-Organizing Maps

Data Mining on Agriculture Data using Neural Networks

Georg Ruß, Rudolf Kruse, Martin Schneider, Peter Wagner July 16th, 2008

Georg Ruß, Rudolf Kruse, Martin Schneider, Peter Wagner Data Mining on Agriculture Data using Neural Networks

SLIDE 2

Outline

◮ Motivation
◮ Available Data
  ◮ Data Details
  ◮ Data Overview
◮ Points of interest
◮ Data Modeling
◮ Results
◮ Work in Progress: Self-Organizing Maps

SLIDE 3

Motivation: Precision Farming

Figure: Precision Farming: from uniform field to small-scale area (diagram labels: field, uniform treatment, small-scale precision treatment)

SLIDE 4

Motivation: Precision Farming

◮ precision farming
  ◮ divide field into small-scale parts
  ◮ treat small parts independently instead of uniformly
  ◮ cheap data collection
  ◮ GPS-based technology
◮ lots of data (sensors, imagery, GPS-tagged)
◮ use data mining to
  ◮ improve efficiency
  ◮ improve yield

SLIDE 5

Data Flow Model

Stages: acquire data, preprocess, build model, evaluate model, optimize / use

Figure: Data Mining Context

SLIDE 6

Nitrogen Fertilizer

◮ easy to measure when manuring
◮ nitrogen fertilizer is applied at three points in the growing season
◮ three attributes: N1, N2, N3

SLIDE 7

Vegetation Measuring

◮ Red Edge Inflection Point
◮ first derivative value along the red edge region
◮ aerial photography or tractor-mounted sensor
◮ larger value means more vegetation
◮ measured before N2 and N3
◮ two attributes: REIP32, REIP49
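
The red edge inflection point is commonly read as the wavelength at which the first derivative of the reflectance spectrum peaks inside the red edge region. The following is a minimal sketch of that idea, not the processing used by the sensors in this talk. The function name `reip`, the 680 to 780 nm window, the 1 nm sampling and the synthetic logistic spectrum are all assumptions made for illustration.

```python
import math

def reip(wavelengths, reflectance):
    """Return the wavelength where the first derivative of reflectance is largest."""
    best_wl, best_slope = None, float("-inf")
    # central finite differences on the interior sample points
    for i in range(1, len(wavelengths) - 1):
        slope = (reflectance[i + 1] - reflectance[i - 1]) / (
            wavelengths[i + 1] - wavelengths[i - 1])
        if slope > best_slope:
            best_wl, best_slope = wavelengths[i], slope
    return best_wl

# synthetic spectrum (illustrative only): logistic step centred at 725 nm
wavelengths = list(range(680, 781))
reflectance = [0.05 + 0.45 / (1.0 + math.exp(-(wl - 725) / 8.0))
               for wl in wavelengths]
print(reip(wavelengths, reflectance))
```

On the synthetic spectrum the steepest point is the centre of the logistic step, which lies in the same range as the REIP32 and REIP49 values reported in the attribute overview.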

SLIDE 8

Electric Conductivity

◮ measure apparent conductivity of soil down to 1.5 m
◮ uses commercial sensors
◮ one attribute: EM38

SLIDE 9

Yield

◮ measure yield when harvesting
◮ data from 2003 (previous year) and 2004 (current year)
◮ two attributes: Yield03, Yield04

SLIDE 10

Table: Attributes overview

Attr.      min      max      mean     std
N1                  100      57.7     13.5
N2                  100      39.9     16.4
N3                  100      38.5     15.3
REIP32     721.1    727.2    725.7    0.64
REIP49     722.4    729.6    728.1    0.65
EM38       17.97    86.45    33.82    5.27
Yield03    1.19     12.38    6.27     1.48
Yield04    6.42     11.37    9.14     0.73

SLIDE 11

Splitting the data

Table: Overview: available data sets for three fertilization times (FT)

FT1    Yield03, EM38, N1
FT2    Yield03, EM38, N1, REIP32, N2
FT3    Yield03, EM38, N1, REIP32, N2, REIP49, N3

◮ FT1 ⊂ FT2 ⊂ FT3 (in terms of attributes)
◮ size of data sets: ≈ 5000 records
◮ For each FT*: Variable to predict is Yield04
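
The nesting of the three data sets can be sketched as attribute lists that grow with each fertilization time. This is only an illustration of the split described above. The helper `split_record` and the dict-based record layout are assumptions, and the sample record simply reuses the column means from the attribute overview rather than real field data.

```python
# attribute sets available at each fertilization time (from the table above)
FT1 = ["Yield03", "EM38", "N1"]
FT2 = FT1 + ["REIP32", "N2"]
FT3 = FT2 + ["REIP49", "N3"]
TARGET = "Yield04"

def split_record(record, attributes, target=TARGET):
    """Return (input values, target value) for one record given as a dict."""
    return [record[a] for a in attributes], record[target]

# hypothetical record: column means from the attribute overview, illustration only
record = {"Yield03": 6.27, "EM38": 33.82, "N1": 57.7, "REIP32": 725.7,
          "N2": 39.9, "REIP49": 728.1, "N3": 38.5, "Yield04": 9.14}

for name, attrs in (("FT1", FT1), ("FT2", FT2), ("FT3", FT3)):
    inputs, target = split_record(record, attrs)
    print(name, len(inputs), "inputs, target", target)
```

Building FT2 and FT3 by extension makes the subset relation hold by construction, so a model for a later fertilization time always sees everything an earlier one saw.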

SLIDE 12

Research Questions

◮ How much does fertilization influence current-year yield?
◮ Is there a correlation between data attributes that influences yield?
◮ How well can modeling techniques predict Yield04?
◮ Can we model the data with a multi-layer perceptron? (reproducing earlier results)
◮ What would be the optimal MLP’s topology (number of neurons per layer)?

SLIDE 13

Data Modeling: Multi-Layer Perceptron

◮ Feedforward artificial neural network
◮ Maps a set of input data onto output data
◮ Mapping can be learned
◮ Here: predict current year’s yield from current data
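
To make the input-to-output mapping concrete, here is a minimal forward pass through a small feedforward network. It is a sketch under assumptions, not the network used in the talk: the tanh hidden activation, the linear output, the layer sizes and the random untrained weights are all chosen for illustration, and learning the weights is left out.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random weights per neuron; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]

def layer_forward(weights, inputs, activation):
    """Weighted sum plus bias for every neuron, passed through the activation."""
    return [activation(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in weights]

def mlp_forward(layers, inputs):
    """Feed inputs through tanh hidden layers and a linear output layer."""
    values = inputs
    for weights in layers[:-1]:
        values = layer_forward(weights, values, math.tanh)
    return layer_forward(layers[-1], values, lambda z: z)

rng = random.Random(0)
sizes = [3, 8, 4, 1]  # 3 inputs, two hidden layers, 1 output (the yield)
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
prediction = mlp_forward(layers, [0.1, -0.3, 0.7])  # untrained: value is meaningless
print(prediction)
```

Each entry of `sizes` after the first is one layer, so trying a different topology only means changing that list.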

SLIDE 14

Data Modeling: Multi-Layer Perceptron

◮ Use different-size multi-layer perceptrons for modeling
◮ Try to determine optimal layer size (number of hidden layers: 2)
◮ Compare MLPs for different data sets
◮ Use cross-validation and mean squared error for performance measuring
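
The evaluation scheme named above, cross-validation scored by mean squared error, can be sketched independently of the model. The number of folds, the function names and the synthetic data below are assumptions. `train_mean_baseline` is only a placeholder standing in for an MLP trainer, so that the sketch stays self-contained.

```python
import random

def mse(predicted, actual):
    """Mean squared error between two equally long sequences."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(train, X, y, k=10):
    """Average test MSE over k folds; train(X, y) must return predict(x)."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predict = train([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(mse([predict(X[i]) for i in test_idx],
                          [y[i] for i in test_idx]))
    return sum(scores) / len(scores)

def train_mean_baseline(X, y):
    """Placeholder for an MLP trainer: always predicts the mean training target."""
    mean = sum(y) / len(y)
    return lambda x: mean

# synthetic data, illustration only
rng = random.Random(1)
X = [[rng.random() for _ in range(3)] for _ in range(50)]
y = [9.0 + x[0] for x in X]
print(cross_validate(train_mean_baseline, X, y, k=5))
```

For a topology search, the same `cross_validate` call would be repeated once per pair of hidden-layer sizes, with the resulting scores collected into a grid like the MSE plots that follow.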

SLIDE 15

MSE plot for FT1

Figure: MSE for first data set (axes: size of first hidden layer, size of second hidden layer, mse)

SLIDE 16

MSE plot for FT2

Figure: MSE for second data set (axes: size of first hidden layer, size of second hidden layer, mse)

SLIDE 17

MSE plot for FT3

Figure: MSE for third data set (axes: size of first hidden layer, size of second hidden layer, mse)

SLIDE 18

MSE difference plot between FT1 and FT2

Figure: MSE difference from first to second data set (axes: size of first hidden layer, size of second hidden layer, difference of mse)

SLIDE 19

MSE difference plot between FT2 and FT3

Figure: MSE difference from second to third data set (axes: size of first hidden layer, size of second hidden layer, difference of mse)

SLIDE 20

MSE difference plot between FT1 and FT3

Figure: MSE difference from first to third data set (axes: size of first hidden layer, size of second hidden layer, difference of mse)

SLIDE 21

Summary MLP

◮ data can be modeled well with an MLP
  ◮ low overall error
  ◮ prediction accuracy of between 0.45 and 0.55 t/ha at an average yield of 9.14 t/ha
◮ prediction gets better with more data
  ◮ expected behaviour
  ◮ shown by difference plots
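
As a small worked example, the quoted accuracy can be put in relation to the average yield. The second print rests on an assumption the slides do not state: that the 0.45 to 0.55 t/ha range is the square root of MSE values of roughly 0.2 to 0.3, the low end of the MSE plots.

```python
import math

mean_yield = 9.14            # t/ha, mean of Yield04 in the attribute overview
error_range = (0.45, 0.55)   # t/ha, accuracy range quoted in the summary

# error as a percentage of the average yield
relative = [round(100 * e / mean_yield, 1) for e in error_range]
print(relative)              # [4.9, 6.0]

# assumption: the quoted range is the square root of MSE values near 0.2 and 0.3
print([round(math.sqrt(m), 2) for m in (0.2, 0.3)])   # [0.45, 0.55]
```

Read this way, the prediction error amounts to roughly 5 to 6 percent of the average yield.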

SLIDE 22

Using the MLP predictor

◮ use MLP predictor to optimize fertilization
◮ get new data and try to understand MLP’s predictions
◮ ⇒ that’s what’s next

SLIDE 23

Data Modeling: Self-Organizing Maps

◮ Unsupervised artificial neural network
◮ Maps high-dimensional data onto two-dimensional plane
◮ Preserves neighborhood relations
◮ Here:
  ◮ recognition of correlations
  ◮ understanding of data
  ◮ visualization of data
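
How a self-organizing map places high-dimensional records on a two-dimensional grid while keeping neighbours together can be sketched in a few lines. This is a minimal illustration, not the tool used for the results that follow. The grid size, the Gaussian neighbourhood, the linearly shrinking learning rate and radius, and the synthetic two-cluster data are all assumptions.

```python
import math
import random

def best_matching_unit(weights, x):
    """Grid position (row, col) of the unit whose weight vector is closest to x."""
    return min(weights, key=lambda pos: sum((w - v) ** 2
                                            for w, v in zip(weights[pos], x)))

def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a rows x cols map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        # learning rate and neighbourhood radius shrink linearly over time
        frac = 1.0 - epoch / epochs
        lr, radius = lr0 * frac, max(radius0 * frac, 0.5)
        for x in rng.sample(data, len(data)):
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                dist2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-dist2 / (2.0 * radius ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])  # pull unit towards the sample
    return weights

# two synthetic clusters in three dimensions, illustration only
rng = random.Random(1)
data = ([[rng.gauss(0.2, 0.03) for _ in range(3)] for _ in range(30)] +
        [[rng.gauss(0.8, 0.03) for _ in range(3)] for _ in range(30)])
som = train_som(data, rows=4, cols=4)
low = best_matching_unit(som, [0.2, 0.2, 0.2])
high = best_matching_unit(som, [0.8, 0.8, 0.8])
print(low, high)
```

Because every update also moves the grid neighbours of the winning unit, records that are similar in the input space end up on nearby units, which is what makes the per-attribute maps readable side by side.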

SLIDE 24

Data split

Table: Overview on available data sets for specific fertilization strategies for different fields

F131-all    yield05, em38, n1, reip32, n2, reip49, n3, yield06, fert. strategy
F131-net    subset of F131-all where fertilization strategy is neural network
F330-all    yield05, em38, n1, reip32, n2, reip49, n3, yield06, fert. strategy
F330-net    subset of F330-all where fertilization strategy is neural network

SLIDE 25

Results for F131-all, Labels/U-Matrix

(a) Labels (b) U-Matrix

Figure: F131-all, U-Matrix and Labels
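
A U-Matrix shows, for each map unit, how far its weight vector lies from those of its grid neighbours, so that large values mark borders between clusters. The sketch below is a simplified variant that stores one averaged value per unit and uses 4-connected neighbours on a rectangular grid. The function name, the grid layout and the toy weights are assumptions and do not reproduce the map in the figure.

```python
import math

def u_matrix(weights, rows, cols):
    """Mean Euclidean distance from each unit to its 4-connected grid neighbours."""
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[(r, c)], weights[(nr, nc)]))
            # a 1 x 1 map has no neighbours; report 0.0 in that case
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result

# toy 2 x 3 map: the two left columns are similar, the right column differs
weights = {(0, 0): [0.0, 0.0], (0, 1): [0.0, 0.1], (0, 2): [1.0, 1.0],
           (1, 0): [0.1, 0.0], (1, 1): [0.1, 0.1], (1, 2): [1.0, 0.9]}
um = u_matrix(weights, 2, 3)
for row in um:
    print([round(v, 2) for v in row])
```

In the toy map the values rise towards the right-hand column, where the weights jump, which is the kind of ridge a U-Matrix plot makes visible.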

SLIDE 26

Results for F131-all, Nitrogen

(a) n1 (b) n2 (c) n3

Figure: F131-all, n1, n2, n3

SLIDE 27

Results for F131-all, REIP, Yield

(a) reip49 (b) yield05 (c) yield06

Figure: F131-all, reip49 vs. yield05 vs. yield06

SLIDE 28

Results for F131-all, correlation

(a) n3 vs. yield06, F131-all (b) n3 vs. yield06, F131-net

Figure: F131-all, correlation between n3 and yield06

SLIDE 29

Results for F131-all, correlation

(a) reip49 / yield06, F131 (b) reip49 / yield06, F330

Figure: F131-all, correlation between reip49 and yield06

SLIDE 30

Summary SOM

◮ very good tool for visualizing the data
◮ helps find correlations easily without correlation plots
◮ helps find attributes that can be used for predicting yield

SLIDE 31

Further Work

◮ evaluate further modeling techniques
◮ compare techniques on further (already available) data sets
◮ generate optimized decision rules for, e.g., usage of fertilizer or pesticides

SLIDE 32

Questions / Discussion

◮ Questions?
