Modeling


SLIDE 1

Modeling

The data analysis process and the guiding questions of each phase:

Project understanding: What exactly is the problem and the expected benefit? What would a solution look like? What is known about the domain?

Data understanding: What data do we have available? Is the data relevant to the problem? Is it valid? Does it reflect our expectations? Is the data quality, quantity, and recency sufficient?

Data preparation: Which data should we concentrate on? How is the data best transformed for modeling? How may we increase the data quality?

Modeling: What kind of model architecture suits the problem best? What is the best technique/method to get the model? How well does the model perform technically?

Evaluation: How good is the model in terms of the project requirements? What have we learned from the project?

Deployment: How is the model best deployed? How do we know that the model is still valid?

Decision points in the process: if the data does not suit the problem (no/partially), revise the objective or cancel the project; if the technical quality is likely improvable, revise the objective; if the business objective is achieved, close the project.

Compendium slides for “Guide to Intelligent Data Analysis”, Springer 2011. © Michael R. Berthold, Christian Borgelt, Frank Höppner, Frank Klawonn and Iris Adä



SLIDE 6

The Four Steps of Modeling

1. Select the model class

The general structure of the analysis result: the "architecture" or "model class".
Example: linear or quadratic functions for a regression problem.

2. Select the score function

Possible "models" are evaluated using a score function.

3. Apply the algorithm

Models are compared through the score function. But: how do we find the models?

4. Validate the results

We know the best model among the chosen ones. But: is this the best among very good or among very bad choices?


SLIDE 7

Model class?

Model = the form or structure of the analysis result. At this point only the type is selected; the parameters are not yet fixed. Examples:

Linear models (y = ax + b)
Constant values (e.g. the mean)
Rule-based models (e.g. "if A buys product one, then the weather is sunny")



SLIDE 9

Model class - Requirements

Simplicity

Occam's razor: choose the simplest model that still "explains" the data. Or: "Numquam ponenda est pluralitas sine necessitate" [plurality must never be posited without necessity].

A simpler model is easier to understand, has lower complexity, and helps to avoid overfitting (see Slide 21 ff.).

Interpretability

Black boxes are mostly not a proper choice. But: they can achieve very good accuracy (e.g. neural networks).


SLIDE 10

Global vs. local models

Global models provide a (not necessarily good) description of the whole data set. Example: a regression line.

Local models or patterns provide a description of only a part or subset of the data set. Example: association rules.



SLIDE 14

Fitting Criteria and Score Function

Find an objective function f: M → ℝ that evaluates the quality of a model, in order to detect the "best" model.

Example

Given: a data set D = {d₁, d₂, ..., dₙ} ⊂ ℝᵐ and a "model" M: ℝᵐ → ℝᵐ (M predicts a value for a given data point).

Mean squared error: f(M) = (1/n) · Σᵢ₌₁ⁿ (dᵢ − M(dᵢ))²

Mean absolute error: f(M) = (1/n) · Σᵢ₌₁ⁿ |dᵢ − M(dᵢ)|
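A minimal Python sketch of these two score functions, written here for labeled pairs (x, y) rather than abstract data points (the helper names and toy data are illustrative, not from the slides):

```python
def mse(data, model):
    """Mean squared error of `model` over `data`, a list of (x, y) pairs."""
    return sum((y - model(x)) ** 2 for x, y in data) / len(data)

def mae(data, model):
    """Mean absolute error of `model` over `data`."""
    return sum(abs(y - model(x)) for x, y in data) / len(data)

# Score a constant model (predicting the mean) on a toy data set.
data = [(1, 1.0), (2, 2.1), (3, 2.9)]
mean_y = sum(y for _, y in data) / len(data)
constant_model = lambda x: mean_y
print(mse(data, constant_model))
print(mae(data, constant_model))
```

The MSE punishes large deviations more strongly than the MAE, so the two score functions can rank the same set of models differently.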


SLIDE 15

Short comment: What is classification?

Example

Imagine a cup factory that wants to classify its cups as good or broken.


SLIDE 16

Error functions for classification problems

How do we set up an error function for such classification problems? A very common choice is the misclassification rate:

misclassification rate = (# wrongly classified) / (# total classified)

A low misclassification rate does not necessarily tell us anything about the quality of a classifier when the classes are unbalanced. (E.g. when 99% of the production is ok, a classifier always predicting ok will have a misclassification rate of 1%.)
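This pitfall can be reproduced in a few lines of Python (the 99/1 production batch is the hypothetical example from the slide):

```python
def misclassification_rate(true_labels, predicted_labels):
    """Fraction of wrongly classified instances."""
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return wrong / len(true_labels)

# 99 ok cups and 1 broken cup; a trivial classifier always predicts "ok".
true_labels = ["ok"] * 99 + ["broken"]
predicted = ["ok"] * 100
print(misclassification_rate(true_labels, predicted))  # 0.01
```

The trivial classifier scores 1% error while never detecting a single broken cup, which is exactly why the misclassification rate alone can be misleading.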


SLIDE 17

Cost matrix

A more general approach than the misclassification rate: a cost function or cost matrix. The consequences (costs) of misclassification for one class might be different than for another class.

Example

Tea cup production. Cost matrix (rows: true class, columns: predicted class):

                 predicted class
true class     OK     broken
OK             0      c1
broken         c2     0


SLIDE 18

Cost matrix

General form of a cost matrix for a multi-class classification problem (rows: true class, columns: predicted class):

                 predicted class
true class   c1      c2      ...    cm
c1           0       c1,2    ...    c1,m
c2           c2,1    0       ...    c2,m
...          ...     ...     ...    ...
cm           cm,1    cm,2    ...    0


SLIDE 19

Cost matrix

When such a cost matrix is provided, the expected loss

loss(cᵢ | E) = Σⱼ₌₁ᵐ P(cⱼ | E) · cⱼ,ᵢ

should be minimized. E is the evidence, i.e. the observed values of the predictor attributes used for the classification. P(cⱼ | E) is the predicted probability that the true class is cⱼ given observation E.


SLIDE 20

Cost matrix

Example

(Hypothetical) cost matrix for the tea cup production problem:

                 predicted class
true class     OK     broken
OK             0      1
broken         10     0

A classifier might assign a specific cup to the class ok with 80% and to the class broken with 20%.
Expected loss for choosing ok: 0.8 · 0 + 0.2 · 10 = 2.
Expected loss for choosing broken: 0.8 · 1 + 0.2 · 0 = 0.8.
Choose broken in this case to minimize the expected loss!
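The expected-loss decision rule with this cost matrix can be sketched as follows (the dictionary-based representation is an illustrative choice):

```python
# Cost matrix as a dict: keys are (true class, predicted class) pairs.
classes = ["ok", "broken"]
cost = {("ok", "ok"): 0, ("ok", "broken"): 1,
        ("broken", "ok"): 10, ("broken", "broken"): 0}

def expected_loss(probs, predicted):
    """Expected loss of predicting `predicted`, given class probabilities."""
    return sum(probs[true] * cost[(true, predicted)] for true in classes)

# The classifier's predicted probabilities for a specific cup.
probs = {"ok": 0.8, "broken": 0.2}
best = min(classes, key=lambda c: expected_loss(probs, c))
print(expected_loss(probs, "ok"))      # 2.0
print(expected_loss(probs, "broken"))  # 0.8
print(best)                            # broken
```

Although the cup is more likely ok, the asymmetric costs make broken the loss-minimizing prediction.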


SLIDE 21

Cost matrix

Using the cost matrix with cost 0 on the diagonal and cost 1 everywhere else,

                 predicted class
true class   c1    c2    ...   cm
c1           0     1     ...   1
c2           1     0     ...   1
...          ...   ...   ...   ...
cm           1     1     ...   0

corresponds to minimising the misclassification rate.


SLIDE 22

Algorithms for model fitting

The objective function (scoring function) for models does not tell us how to find the best or a good model; it only provides a means for comparing models. Optimisation algorithms are needed to find the best, or at least a good, model.


SLIDE 23

Closed form solutions

In the best case, an explicit solution can be provided.

Example

Find a regression line y = ax + b that minimizes the mean squared error for the data set (x₁, y₁), ..., (xₙ, yₙ). Computing the partial derivatives of the objective (error) function

E(a, b) = (1/n) · Σᵢ₌₁ⁿ (axᵢ + b − yᵢ)²

w.r.t. the parameters a and b and setting them to zero yields

∂E/∂a = (2/n) · Σᵢ₌₁ⁿ (axᵢ + b − yᵢ) xᵢ = 0,
∂E/∂b = (2/n) · Σᵢ₌₁ⁿ (axᵢ + b − yᵢ) = 0.


SLIDE 24

Closed form solutions

Example

The solution of this system of equations is

a = ( n · Σᵢ₌₁ⁿ xᵢyᵢ − (Σᵢ₌₁ⁿ xᵢ)(Σᵢ₌₁ⁿ yᵢ) ) / ( n · Σᵢ₌₁ⁿ xᵢ² − (Σᵢ₌₁ⁿ xᵢ)² ),   b = ȳ − a·x̄

where x̄ = (1/n) · Σᵢ₌₁ⁿ xᵢ and ȳ = (1/n) · Σᵢ₌₁ⁿ yᵢ.
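The closed-form solution can be written directly in Python (the helper name `fit_line` and the toy data are illustrative):

```python
def fit_line(points):
    """Closed-form least-squares fit of y = a*x + b to (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b = sy / n - a * sx / n          # b = mean(y) - a * mean(x)
    return a, b

# Points lying exactly on y = 2x + 1 recover a = 2, b = 1.
a, b = fit_line([(0, 1), (1, 3), (2, 5), (3, 7)])
print(a, b)  # 2.0 1.0
```

No iteration is needed here; this is what makes the closed-form case the best case among the fitting algorithms discussed next.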


SLIDE 25

Algorithms for model fitting

For differentiable score functions, gradient methods can be applied.

[Figure: for k = 2 parameters, the mean squared error E(a, b) forms a smooth landscape; a second landscape shows many local minima and maxima.]

For k > 2, the objective function corresponds to a landscape in (k+1)-dimensional space.

Problems:
Gradient methods will only find local optima.
Parameters (step width) must be adjusted or computed in each iteration step.
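As an illustration of a gradient method, a minimal gradient-descent sketch for the regression-line example above (the learning rate `eta` and the step count are illustrative choices, not from the slides):

```python
def gradient_step(points, a, b, eta):
    """One gradient-descent step on the MSE of the line y = a*x + b."""
    n = len(points)
    grad_a = (2 / n) * sum((a * x + b - y) * x for x, y in points)
    grad_b = (2 / n) * sum((a * x + b - y) for x, y in points)
    return a - eta * grad_a, b - eta * grad_b

points = [(0, 1), (1, 3), (2, 5), (3, 7)]  # lie on y = 2x + 1
a, b = 0.0, 0.0
for _ in range(5000):
    a, b = gradient_step(points, a, b, eta=0.05)
print(a, b)  # close to 2.0 and 1.0
```

Here the step width `eta` is fixed; as noted above, in general it must be adjusted or computed in each iteration step.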



SLIDE 27

Algorithms for model fitting

For discrete problems with a finite search space (like finding association rules), combinatorial optimization strategies are needed. In principle, an exhaustive search of the finite domain M is possible, however, in most cases it is not feasible, since M is much too large.

Example

Finding the best possible association rules over an underlying set of 1000 items (products): every combination of items, i.e. every nonempty subset, is a possible candidate set from which several rules may be constructed. The number of nonempty subsets alone is 2¹⁰⁰⁰ − 1 > 10³⁰⁰. Heuristic strategies are therefore needed.


SLIDE 28

Algorithms for model fitting: Heuristic strategies

Random search. Create random solutions and choose the best one among them. Very inefficient.

Greedy strategies. Formulate an algorithm that tries to improve the solution in each step.

Example: the gradient method.
Example: hill climbing. Start with a random solution and generate new solutions in its "neighbourhood". If a new solution is better than the old one, continue from it and generate new solutions in its "neighbourhood".

Greedy strategies can find a solution quickly, but may get stuck in local optima.
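A hill-climbing sketch for a one-dimensional toy problem (the score function, the neighbourhood of random perturbations, and the step count are all illustrative assumptions):

```python
import random

def hill_climb(score, start, neighbour, steps=1000):
    """Greedy hill climbing: move to a neighbour whenever it scores better."""
    current = start
    for _ in range(steps):
        candidate = neighbour(current)
        if score(candidate) > score(current):
            current = candidate
    return current

# Maximize -(x - 3)^2; neighbours are small random perturbations.
random.seed(0)
best = hill_climb(score=lambda x: -(x - 3) ** 2,
                  start=0.0,
                  neighbour=lambda x: x + random.uniform(-0.5, 0.5))
print(best)  # close to 3
```

On this unimodal score function hill climbing works well; on a multimodal one it would stop at whichever local optimum it reaches first.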


SLIDE 29

Algorithms for model fitting: Heuristic strategies

Simulated annealing is a mixture of random search and a greedy strategy: a modified version of hill climbing that sometimes replaces better solutions by worse ones with a (low) probability. This probability is decreased in each iteration step.

Evolutionary algorithms like evolution strategies or genetic algorithms combine random with greedy components, using a population of solutions in order to explore the search space in parallel and efficiently.

Alternating optimisation can be applied when the set of parameters can be split into disjoint subsets in such a way that for each subset an analytical solution for the optimum can be provided, given that the parameters in the other subsets are fixed. Alternating optimisation computes the analytical solution for the parameter subsets alternatingly and iterates this scheme until convergence.
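The simulated-annealing idea can be sketched as follows; the 1/step cooling schedule, the Boltzmann-style acceptance probability, and the test function are illustrative assumptions, not prescribed by the slides:

```python
import math
import random

def simulated_annealing(score, start, neighbour, steps=5000, t0=1.0):
    """Hill climbing that accepts worse solutions with a shrinking probability."""
    current, best = start, start
    for step in range(1, steps + 1):
        temperature = t0 / step          # assumed cooling schedule: T = t0/step
        candidate = neighbour(current)
        delta = score(candidate) - score(current)
        # Always accept improvements; accept worse ones with prob. e^(delta/T).
        if delta > 0 or random.random() < math.exp(delta / temperature):
            current = candidate
        if score(current) > score(best):
            best = current
    return best

# A score function with a local maximum near x = -1 and a better one near x = 1.
random.seed(1)
f = lambda x: -(x ** 2 - 1) ** 2 - (x - 2) ** 2 / 10
best = simulated_annealing(f, start=-1.0,
                           neighbour=lambda x: x + random.uniform(-0.3, 0.3))
```

Early on, the high temperature lets the search escape the starting basin; as the temperature drops, the behaviour approaches plain hill climbing.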




SLIDE 34

Overfitting

A good fit to the data is not necessarily the same as a good fit to the concept!

[Figure: the true concept fC(x), example data sampled from it, and an overfitted hypothesis fH(x) that follows the data points exactly.]


SLIDE 35

Overfitting

Overfitting

Overfitting means fitting the noise rather than the underlying relationship. A typical indicator of overfitting: a "perfect fit", e.g. the error gets close to zero. ONE solution: choose a less flexible model.


SLIDE 36

Model error

There are four potential origins of error, which sum up to the overall error measure:

Error = Experimental error + Sample error + Model error + Algorithmic error


SLIDE 37

Experimental Error

The pure error or experimental error is inherent in the data and due to noise, random variations, imprecise measurements or the influence of hidden variables that cannot be observed. It is impossible to overcome this error by the choice of a suitable model. Also called intrinsic error. In the context of classification problems it is also called Bayes error.


SLIDE 38

ROC curves

How can a classifier be judged when no explicit cost matrix is known and the misclassification rate might not be a good choice? Consider a two-class problem (positive and negative). Very often, classifiers provide a score for each class, e.g. a likelihood.


SLIDE 39

ROC curves

Example

Assume only the attribute Sepal Length should be used to distinguish Iris versicolor from Iris virginica, by a simple rule of the form: if Sepal Length < c, then versicolor, otherwise virginica, where c can be chosen in the range of the values of the attribute Sepal Length. The attribute Sepal Length provides the "score" in this case.


SLIDE 40

ROC curves

Sepal Length and species, sorted by Sepal Length (excerpt, three columns):

S. Length  Species      S. Length  Species      S. Length  Species
4.9        versicolor   5.6        versicolor   5.9        versicolor
4.9        virginica    5.6        versicolor   5.9        versicolor
5.0        versicolor   5.6        virginica    5.9        virginica
5.0        versicolor   5.7        versicolor   6.0        versicolor
5.1        versicolor   5.7        versicolor   6.0        versicolor
5.2        versicolor   5.7        versicolor   6.0        versicolor
5.4        versicolor   5.7        versicolor   6.0        versicolor
5.5        versicolor   5.7        versicolor   6.0        virginica
5.5        versicolor   5.7        virginica    6.0        virginica
5.5        versicolor   5.8        versicolor   6.1        versicolor
5.5        versicolor   5.8        versicolor   6.1        versicolor
5.5        versicolor   5.8        versicolor   6.1        versicolor
5.6        versicolor   5.8        virginica    6.1        versicolor
5.6        versicolor   5.8        virginica    6.1        virginica
5.6        versicolor   5.8        virginica    ...        ...


SLIDE 41

ROC curves

Consider versicolor as the "positive" and virginica as the "negative" class. The higher the threshold c is chosen, the more instances are classified as "positive" (versicolor). There are four possibilities:

true positive (TP): classified as positive and is positive
false positive (FP): classified as positive and is negative
true negative (TN): classified as negative and is negative
false negative (FN): classified as negative and is positive

Increasing the threshold c will increase the true positives, but also the false positives. Ideal case: only true positives and no false positives.


SLIDE 42

ROC curves

The ROC curve (receiver operating characteristic curve) plots the true positive rate against the false positive rate (depending on the choice of the threshold c).


SLIDE 43

ROC curves

[Figure: ROC curves of a better-performing classifier and of a classifier with a worse performance; both axes (false positive rate, true positive rate) run from 0 to 1.]

The area under curve (AUC), i.e. the area under the ROC curve, is an indicator of how well the classifier solves the problem. The larger the area, the better the solution of the classification problem.
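Computing ROC points and the AUC by sweeping the threshold over the scores can be sketched as follows (the helper names and toy scores are illustrative):

```python
def roc_points(scores, labels):
    """ROC points (FPR, TPR) for every threshold; higher score = positive."""
    pos = sum(labels)
    neg = len(labels) - pos
    pts = []
    for c in sorted(set(scores)) + [float("inf")]:
        tp = sum(1 for s, l in zip(scores, labels) if s >= c and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= c and l == 0)
        pts.append((fp / neg, tp / pos))
    return sorted(pts)

def auc(points):
    """Area under the ROC curve via the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

scores = [0.1, 0.4, 0.35, 0.8]   # toy classifier scores
labels = [0, 0, 1, 1]            # 1 = positive class
pts = roc_points(scores, labels)
print(auc(pts))  # 0.75
```

A perfect classifier would reach AUC = 1.0, while random guessing stays near the diagonal with AUC ≈ 0.5.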


SLIDE 44

Confusion matrix

A confusion matrix is a table whose rows represent the true classes and whose columns the predicted classes. Each entry specifies how many objects of the class in its row were classified into the class of its column.

A possible confusion matrix for the Iris data set:

                  predicted class
true class        setosa   versicolor   virginica
Iris setosa       50       0            0
Iris versicolor   0        47           3
Iris virginica    0        2            48
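Building such a table amounts to counting (true, predicted) pairs; a minimal two-class sketch (names and toy labels are illustrative):

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Confusion matrix: rows are true classes, columns predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

true_labels = ["a", "a", "b", "b", "b"]
predicted   = ["a", "b", "b", "b", "a"]
print(confusion_matrix(true_labels, predicted, classes=["a", "b"]))
# [[1, 1], [1, 2]]
```

The diagonal holds the correctly classified objects; everything off the diagonal is a misclassification, so the misclassification rate is the off-diagonal sum divided by the total.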



SLIDE 47

Sample Error

Sample Error

The data sample is not a perfect representation of the underlying distribution. The smaller the sample, the smaller the probability of obtaining a perfect model.

Example

Rolling a die. Sample bias: what if the die was not fair?

[Figure: frequency of each number of pips after n rolls of the die.]


SLIDE 48

Model Error

There are different models for the data:

simpler model ⇒ bigger error
more complex model ⇒ overfitting and larger error on new data
type of model ⇒ different "fit" to the data



SLIDE 55

Algorithmic Error

Based on the selected algorithm. For example:

Gradient descent ⇒ local minima
Randomized method ⇒ too much randomness

The algorithmic error can often not be measured (several runs are similarly biased). Normally we assume that our algorithm is good enough (otherwise: choose another).



SLIDE 57

Machine Learning Bias and Variance

Machine learners have a slightly different view:

Machine learning bias: model and algorithmic error
Variance: intrinsic and sample error

Alternative view, the error decomposition:

MSE = Var(θ*) + (Bias(θ*))²

(θ* is an estimator for the unknown parameter(s) θ; MSE: mean squared error.)



slide-60
SLIDE 60

Learning without Bias?

Can we find a good model without model or algorithmic bias? Remember version space learning? We cannot learn without a bias.


slide-61
SLIDE 61

Model Validation

The error on unseen data will almost always be bigger than on the data used for training. How do we find out which model is actually best suited to our problem?


slide-62
SLIDE 62

Training and Test Data

Split the data into two subsets: training and test data. Train your model on the training data and measure the model quality on the test data.

Typically 2/3 training, 1/3 test (usually more training). Splitting strategies:

Random (the distribution in both sets should be roughly the same)
Stratification (i.e. the distribution of one class should be preserved)

Split into training, test and validation:

Choose for each model kind the best one based on the test data. Test the best models on the validation data set.
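A stratified split can be sketched in plain Python (a minimal illustration, not a library routine; the labels and fractions are hypothetical):

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=1/3, seed=42):
    """Return (train_idx, test_idx) so that each class keeps roughly
    the same proportion in the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        cut = int(round(len(idx) * test_frac))
        test.extend(idx[:cut])
        train.extend(idx[cut:])
    return train, test

# Hypothetical labels: 60 of class "a", 30 of class "b".
labels = ["a"] * 60 + ["b"] * 30
train, test = stratified_split(labels)
# 2/3 training, 1/3 test, with the 2:1 class ratio preserved in both sets
```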


slide-63
SLIDE 63

Validation of Models

Estimating the generalization ability of a model using a separate data set T_eval:

T = T_train ∪ T_eval,  T_train ∩ T_eval = ∅

P(E) ≈ (1 / |T_eval|) · Σ_{x ∈ T_eval} g(H(x), C(x))

with g(x, y) = 1 if x ≠ y and g(x, y) = 0 if x = y
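The estimate is simply the fraction of evaluation points where the hypothesis disagrees with the true class; a small sketch (the threshold classifier and values are invented for illustration):

```python
def holdout_error(H, C, T_eval):
    """P(E) estimate: the fraction of evaluation points where the
    hypothesis H disagrees with the true class C (g counts mismatches)."""
    return sum(1 for x in T_eval if H(x) != C(x)) / len(T_eval)

# Toy example with a hypothetical threshold classifier:
H = lambda x: x > 0.5
C = lambda x: x > 0.4
err = holdout_error(H, C, [0.1, 0.45, 0.55, 0.9])  # disagreement only at 0.45
```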


slide-64
SLIDE 64

Cross-Validation

Split the data multiple times to validate the results and diminish the effect of a single estimate. Use a combination of the resulting models, or the best one.

K-fold cross-validation (e.g. k = 10):

Divide the data into k subsets. In each round another subset is the test data, the rest is training data. The average of the k model errors is taken as the model error.

Leave-one-out method:

For very small data sets. Use everything except one data point for training; this single data point is used for testing.
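The k-fold scheme can be sketched in a few lines (a plain version; real implementations also shuffle and stratify, and the per-fold "errors" below are dummies):

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k-fold cross-validation.
    Each index lands in exactly one test fold."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# Dummy per-fold "errors"; in practice: train on `train`, score on `test`.
errors = [len(test) / 10 for train, test in k_fold_indices(10, 5)]
cv_error = sum(errors) / len(errors)  # average of the k fold errors
```

With k = n this degenerates into the leave-one-out method: each test fold holds a single point.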


slide-74
SLIDE 74

Bootstrapping

Overall goal: draw samples multiple times to substantiate your model (e.g. with the variance of the parameters). Pseudo-algorithm:

Draw k bootstrap samples from the data. Learn the model on each sample. Calculate the mean and standard deviation of the parameters. A small standard deviation supports the model.

Bagging (use bootstrapping to improve the results): the final parameters can be obtained by averaging the k sets of parameters. We will discuss bagging and more ensemble methods in a later lecture.
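The pseudo-algorithm above can be sketched as follows (the data values, k, and the estimator are illustrative assumptions):

```python
import random

def bootstrap_param(data, estimator, k=200, seed=1):
    """Draw k bootstrap samples (with replacement), fit the estimator
    on each, and report mean and standard deviation of the parameters."""
    rng = random.Random(seed)
    params = []
    for _ in range(k):
        sample = [rng.choice(data) for _ in data]
        params.append(estimator(sample))
    mean = sum(params) / k
    std = (sum((p - mean) ** 2 for p in params) / k) ** 0.5
    return mean, std

# Hypothetical measurements; the "parameter" here is simply the mean.
data = [2.0, 2.1, 1.9, 2.2, 1.8, 2.0, 2.05, 1.95]
mean, std = bootstrap_param(data, lambda s: sum(s) / len(s))
# A small std relative to the mean supports the fitted parameter
```

Averaging the k parameter sets instead of reporting their spread gives the bagged estimate mentioned above.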


slide-83
SLIDE 83

Measures for Model Complexity

The goal is to follow Occam’s razor: choose the simplest model that still explains the data. How do we measure the complexity of a model? Two ideas:

The Minimum Description Length principle
Akaike’s and the Bayesian Information Criterion


slide-84
SLIDE 84

The Minimum Description Length Principle

Basic idea: regard the model as a way of compressing the data. To recreate the data, two things are needed:

The decompression rule
The compressed data

The quality is then measured by the number of bits needed to code these two. The simplest (degenerate) cases:

Compressed data of size 1, by putting the data into the rule (if the compressed data is 1, output the original data)
Decompression rule of size 1, by storing the real data as the compressed data

Goal: find a solution in between.


slide-93
SLIDE 93

Example for the MDL

We restrict the coding of the decimals to two digits, written in reverse, and ignore the sign. Examples:

code 0.73 as 37
code 1.23 as 321
code 0.06 as 6
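Under this coding, the cost of a number is just its digit count after dropping leading zeros. A sketch that reproduces the slide's examples (the residual values passed to `mdl_cost` are hypothetical):

```python
def digits_cost(x):
    """Digit count of |x| rounded to two decimals, leading zeros dropped:
    0.73 -> 2, 1.23 -> 3, 0.06 -> 1, 0.00 -> 0 (an assumption matching
    the slide's examples; reversing the digits does not change the count)."""
    n = round(abs(x) * 100)
    return len(str(n)) if n else 0

def mdl_cost(params, residuals):
    """MDL score = cost of the decompression rule (model parameters)
    plus cost of the compressed data (residuals)."""
    return sum(digits_cost(p) for p in params) + sum(digits_cost(r) for r in residuals)

# E.g. a one-parameter model with nine hypothetical 2-digit errors:
cost_const = mdl_cost([1.92], [0.10] * 9)  # 3 + 9 * 2 = 21
```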


slide-94
SLIDE 94

Example for the MDL

Coding the constant function: 3 digits for the value 1.92, and 2 digits for each “error”, results in: 21 = 3 + 9 ∗ 2


slide-95
SLIDE 95

Example for the MDL

Coding the linear function: 3 digits for the value 1.14 and 2 digits for the offset 0.19. Errors: 7 times 2 digits, 1 digit for point 1, and 0 digits for point 2: 20 = 5 + 7 ∗ 2 + 1 + 0


slide-96
SLIDE 96

Example for the MDL

Coding the quadratic function: 3 digits for the value 1.31, 1 digit for the slope 0.05, and 1 digit for 0.02. Errors: 7 times 2 digits, and only 1 digit each for IDs 2 and 5: 21 = 5 + 7 ∗ 2 + 2 ∗ 1


slide-97
SLIDE 97

Example for the MDL

Summary: constant = 21, linear = 20, quadratic = 21. The recommendation would be to use the “linear function” model for this data set!



slide-99
SLIDE 99

Other Model Selection Criteria

Akaike’s Information Criterion measures

model complexity by the number of parameters (k)
fit to the data by the likelihood of the data under the model (L)

AIC = 2k − 2 ln(L)

Notes: for errors assumed to be normally distributed, the MSE models the likelihood directly. Other measures: the Bayesian Information Criterion.
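A hedged sketch of the criterion (the MSE-based form assumes normally distributed errors and drops an additive constant; the example numbers are invented):

```python
import math

def aic(k, log_likelihood):
    """AIC = 2k - 2 ln(L); `log_likelihood` is ln(L)."""
    return 2 * k - 2 * log_likelihood

def aic_from_mse(k, n, mse):
    """Under Gaussian errors, ln(L) reduces (up to a constant) to
    -(n/2) * ln(MSE), so models compare via MSE and parameter count."""
    return 2 * k + n * math.log(mse)

# Hypothetical comparison on n = 100 points: a 2-parameter model with a
# slightly lower MSE still loses if the fit gain is too small.
a1 = aic_from_mse(1, 100, 0.500)
a2 = aic_from_mse(2, 100, 0.499)
# a1 < a2, so the simpler model is preferred here
```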


50 / 50