slide-1
SLIDE 1

Interpretation of Dimensionally-Reduced Crime Data

A Study with Untrained Domain Experts

Dominik Jäckle Florian Stoffel Sebastian Mittelstädt Daniel Keim Harald Reiterer

slide-2
SLIDE 2

Introduction to Domain Experts

Jäckle et al. | Interpretation of Dimensionally-Reduced Crime Data

Data analysts of a Law Enforcement Agency (LEA)

  • Work with tabular data on a daily basis
  • Identification of patterns & suspects
  • Comparative case analysis (consider similarities & correlations)

slide-3
SLIDE 3

Introduction to Domain Experts

Data analysts of a Law Enforcement Agency (LEA)

  • Work with tabular data on a daily basis
  • Identification of patterns & suspects
  • Comparative case analysis (consider similarities & correlations)

Challenge: consider multiple attributes simultaneously

slide-4
SLIDE 4

Planar Data Projections

Multidimensional Scaling (MDS) = Distance-Preserving Projection

Data: records = crimes, n attributes

  A  ...  ...  ...
  B  ...  ...  ...
  C  ...  ...  ...

Overall goal: $\mathbb{R}^n \to \mathbb{R}^m$; $m < n$

slide-5
SLIDE 5

Planar Data Projections

Multidimensional Scaling (MDS) = Distance-Preserving Projection

Data: records = crimes, n attributes

  A  ...  ...  ...
  B  ...  ...  ...
  C  ...  ...  ...

Compute Distances

Distance Matrix

     A    B    C
  A  0    ...  ...
  B  ...  0    ...
  C  ...  ...  0

Overall goal: $\mathbb{R}^n \to \mathbb{R}^m$; $m < n$

slide-6
SLIDE 6

Planar Data Projections

Multidimensional Scaling (MDS) = Distance-Preserving Projection

Data: records = crimes, n attributes

  A  ...  ...  ...
  B  ...  ...  ...
  C  ...  ...  ...

Compute Distances

Distance Matrix

     A    B    C
  A  0    ...  ...
  B  ...  0    ...
  C  ...  ...  0

Projection

2D Scatterplot (points A, B, C)

Overall goal: $\mathbb{R}^n \to \mathbb{R}^m$; $m < n$

slide-7
SLIDE 7

Planar Data Projections

Multidimensional Scaling (MDS) = Distance-Preserving Projection

Data: records = crimes, n attributes

  A  ...  ...  ...
  B  ...  ...  ...
  C  ...  ...  ...

Compute Distances

Distance Matrix

     A    B    C
  A  0    ...  ...
  B  ...  0    ...
  C  ...  ...  0

Projection

2D Scatterplot (points A, B, C)

Overall goal: $\mathbb{R}^n \to \mathbb{R}^m$; $m < n$

Main problem: interpretation of the visual depiction
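
The pipeline on this slide (data, distance matrix, 2D projection) can be illustrated with a minimal sketch of classical MDS. This is an assumption-laden example, not the implementation used in the talk: the slides do not say which MDS variant was applied, the distance matrix for A, B and C is made up, and the helper names (`double_center`, `top_eigenpair`, `classical_mds`) are hypothetical. Power iteration is used only to keep the example free of third-party libraries.

```python
import math

def double_center(dist):
    """Turn a symmetric distance matrix into a Gram matrix: -1/2 * J * D^2 * J."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # The input is symmetric, so column means equal row means.
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
             for j in range(n)] for i in range(n)]

def top_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(matrix)
    vec = [1.0 + i for i in range(n)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # matrix is numerically zero, nothing left to extract
            return 0.0, [0.0] * n
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n))
    return value, vec

def classical_mds(dist, n_components=2):
    """Return n_components coordinates per record from a distance matrix."""
    gram = double_center(dist)
    n = len(dist)
    coords = [[0.0] * n_components for _ in range(n)]
    for c in range(n_components):
        value, vec = top_eigenpair(gram)
        scale = math.sqrt(max(value, 0.0))  # clip small negative values from rounding
        for i in range(n):
            coords[i][c] = vec[i] * scale
        # Deflate: remove the component just found before extracting the next one.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)] for i in range(n)]
    return coords

# Made-up distance matrix for three records A, B, C.
dist_matrix = [[0.0, 3.0, 4.0],
               [3.0, 0.0, 5.0],
               [4.0, 5.0, 0.0]]
coords = classical_mds(dist_matrix)  # one (x, y) pair per record for the scatterplot
```

Because the three made-up distances are Euclidean, the 2D points reproduce them exactly up to rounding. With real crime records and more than two meaningful dimensions, the projection only approximates the distances, which is where the interpretation problem named on this slide comes from.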

slide-8
SLIDE 8

Previous Work

Includes Domain Experts No Study (any)

slide-9
SLIDE 9

Previous Work

Ward & Martin (1995); Buja (1996)

Includes Domain Experts No Study (any)

slide-10
SLIDE 10

Previous Work

Ward & Martin (1995); Buja (1996)

Includes Domain Experts No Study (any)

Seo & Shneiderman (2005); Nam & Mueller (2013); Krause et al. (2016)

Application Examples Case Studies

Johansson & Johansson (2009); Ingram et al. (2010); Turkay et al. (2011); Fernstad et al. (2013); Turkay et al. (2012); Yuan et al. (2013); Liu et al. (2014); Jeong et al. (2009)

slide-11
SLIDE 11

Previous Work

Ward & Martin (1995); Buja (1996)

Includes Domain Experts No Study (any)

Seo & Shneiderman (2005); Nam & Mueller (2013); Krause et al. (2016)

Application Examples Case Studies

Johansson & Johansson (2009); Ingram et al. (2010); Turkay et al. (2011); Fernstad et al. (2013); Turkay et al. (2012); Yuan et al. (2013); Liu et al. (2014); Jeong et al. (2009)

User Studies without Domain Experts

Yi et al. (2005); Brown et al. (2012); Sedlmair et al. (2013); Stahnke et al. (2016)

slide-12
SLIDE 12

Previous Work

Ward & Martin (1995); Buja (1996)

Includes Domain Experts No Study (any)

Seo & Shneiderman (2005); Nam & Mueller (2013); Krause et al. (2016)

Application Examples Case Studies

Johansson & Johansson (2009); Ingram et al. (2010); Turkay et al. (2011); Fernstad et al. (2013); Turkay et al. (2012); Yuan et al. (2013); Liu et al. (2014); Jeong et al. (2009)

User Studies without Domain Experts

Yi et al. (2005); Brown et al. (2012); Sedlmair et al. (2013); Stahnke et al. (2016)

Our Study

slide-13
SLIDE 13

Can domain experts not trained in advanced statistics interpret the depiction of a data projection?

slide-14
SLIDE 14

Data: San Francisco Crimes

Source: https://data.sfgov.org/

Attributes: Category, Description, DayOfWeek, Date, Time, PdDistrict, Resolution, Address, Location

slide-15
SLIDE 15

Data: San Francisco Crimes

Attributes: Category, Description, DayOfWeek, Date, Time, PdDistrict, Resolution, Address, Location

Example record:

  Category: DISORDERLY CONDUCT
  Description: MAINTAINING A PUBLIC NUISANCE AFTER NOTIFICATION
  DayOfWeek: Sunday
  Date: 08/21/2016 12:00:00 AM
  Time: 6:36
  PdDistrict: TENDERLOIN
  Resolution: ARREST, BOOKED
  Address: 400 Block of LEAVENWORTH ST
  Location: (37.7851373814889°, -122.414457162309°)

Source: https://data.sfgov.org/

slide-16
SLIDE 16

Data Types

categorical: DISORDERLY CONDUCT
textual: MAINTAINING A PUBLIC NUISANCE AFTER NOTIFICATION
numerical: 08/21/2016 00:06:36 AM

slide-17
SLIDE 17

Data Types

categorical: DISORDERLY CONDUCT
textual: MAINTAINING A PUBLIC NUISANCE AFTER NOTIFICATION
numerical: 08/21/2016 00:06:36 AM

Similarity between ...

numerical values: $\mathrm{sim}(V_1, V_2) = V_1 - V_2$

textual attributes: $\mathrm{sim}(v_1, v_2) = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \cdot \lVert v_2 \rVert}$

categorical values: $\mathrm{sim}(V_1, V_2) = (V_1 \neq V_2)$

How to combine different data types?
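
The three per-type measures on this slide could be sketched as follows. Several details are assumptions, since the slides do not spell them out: the numerical difference is taken as an absolute value and scaled by a caller-supplied range, text is turned into bag-of-words count vectors split on whitespace, and the cosine similarity is converted to a distance as `1 - cosine` so that all three measures grow with dissimilarity. All function names are hypothetical.

```python
import math
from collections import Counter

def numerical_distance(v1, v2, value_range):
    """Absolute difference, scaled to [0, 1] by the attribute's range (assumed scaling)."""
    return abs(v1 - v2) / value_range if value_range else 0.0

def categorical_distance(v1, v2):
    """1.0 if the two categories differ, 0.0 if they match."""
    return 1.0 if v1 != v2 else 0.0

def cosine_similarity(text1, text2):
    """Cosine of the angle between bag-of-words count vectors of two texts."""
    c1 = Counter(text1.lower().split())
    c2 = Counter(text2.lower().split())
    dot = sum(c1[word] * c2[word] for word in c1)  # missing words count as 0
    norm1 = math.sqrt(sum(n * n for n in c1.values()))
    norm2 = math.sqrt(sum(n * n for n in c2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0

def textual_distance(text1, text2):
    """Turn the cosine similarity into a distance (assumed conversion)."""
    return 1.0 - cosine_similarity(text1, text2)

# Illustrative values only; the second category and the hours are made up.
same_category = categorical_distance("DISORDERLY CONDUCT", "DISORDERLY CONDUCT")
other_category = categorical_distance("DISORDERLY CONDUCT", "ASSAULT")
half_day = numerical_distance(6, 18, 24)
```

Each function returns a value between 0 and 1 for these inputs, which is what makes it possible to combine them in a single weighted sum, the question this slide ends on.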

slide-18
SLIDE 18

Interactive Visualization

Dimension/Variable

  $D_1$: $\mathrm{sim}_1$, $w_1$
  $D_2$: $\mathrm{sim}_2$, $w_2$
  $D_3$: $\mathrm{sim}_3$, $w_3$
  …
  $D_n$: $\mathrm{sim}_n$, $w_n$

Weighting & Similarity

Visual Data Exploration

Projection Steering

slide-19
SLIDE 19

Weighting and Similarity

Interactive weighting = impact of an attribute

Integration of diverse data types

Gower Metric: $\mathrm{dist}(A, B) = \frac{\sum_{i=1}^{|\mathrm{dim}|} \mathrm{sim}_i(A_i, B_i) \cdot w_i}{|\mathrm{dim}|}$
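
A minimal sketch of the weighted Gower metric above, under stated assumptions: each record is a tuple with one value per dimension, each dimension has its own measure function, and the sum is divided by the number of dimensions as in the formula on the slide. The two measure functions, the record values and the weights are hypothetical illustrations, not taken from the study.

```python
def gower_distance(record_a, record_b, measures, weights):
    """Weighted per-dimension measures, summed and divided by the number of dimensions."""
    if not (len(record_a) == len(record_b) == len(measures) == len(weights)):
        raise ValueError("records, measures and weights need one entry per dimension")
    total = sum(
        measure(a, b) * weight
        for a, b, measure, weight in zip(record_a, record_b, measures, weights)
    )
    return total / len(measures)

def mismatch(a, b):
    """Hypothetical categorical measure: 1.0 if the values differ, else 0.0."""
    return 1.0 if a != b else 0.0

def hour_gap(a, b):
    """Hypothetical numerical measure: hour difference scaled by a 24-hour range."""
    return abs(a - b) / 24.0

# Made-up records: (Category, DayOfWeek, hour of day).
record_a = ("DISORDERLY CONDUCT", "Sunday", 18)
record_b = ("DISORDERLY CONDUCT", "Monday", 6)
measures = (mismatch, mismatch, hour_gap)

equal_weights = gower_distance(record_a, record_b, measures, (1.0, 1.0, 1.0))
weekday_off = gower_distance(record_a, record_b, measures, (1.0, 0.0, 1.0))
```

Setting a weight to zero removes that attribute's contribution entirely, which is one way to read "interactive weighting = impact of an attribute".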

slide-20
SLIDE 20

Weighting and Similarity

$\mathrm{dist}(A, B) = \frac{\sum_{i=1}^{|\mathrm{dim}|} \mathrm{sim}_i(A_i, B_i) \cdot w_i}{|\mathrm{dim}|}$

slide-21
SLIDE 21

Visual Data Exploration

Overview Detail

slide-22
SLIDE 22

Visual Data Exploration

Overview Detail Projection

slide-23
SLIDE 23

Visual Data Exploration

Overview Detail Projection Content Lens

slide-24
SLIDE 24

Visual Data Exploration

Overview Detail Projection Content Lens Tooltip Data View

slide-25
SLIDE 25

Visual Data Exploration

$\mathrm{dist}(A, B) = \frac{\sum_{i=1}^{|\mathrm{dim}|} \mathrm{sim}_i(A_i, B_i) \cdot w_i}{|\mathrm{dim}|}$
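
To illustrate how the weights in this formula can steer what the projection shows, the sketch below recomputes a pairwise distance matrix under two weight settings. It is only an illustration under assumptions: the three records are made up, every dimension uses a simple mismatch measure, and the function names are hypothetical. The resulting matrix would then be handed to the projection step.

```python
def mismatch(a, b):
    """Hypothetical categorical measure: 1.0 if the values differ, else 0.0."""
    return 1.0 if a != b else 0.0

def distance_matrix(records, measures, weights):
    """Pairwise weighted distances, divided by the number of dimensions."""
    n_dims = len(measures)

    def dist(a, b):
        return sum(m(x, y) * w for x, y, m, w in zip(a, b, measures, weights)) / n_dims

    return [[dist(a, b) for b in records] for a in records]

# Made-up records: (DayOfWeek, PdDistrict).
records = [
    ("Monday", "TENDERLOIN"),
    ("Monday", "MISSION"),
    ("Tuesday", "TENDERLOIN"),
]
measures = (mismatch, mismatch)

# Only the weekday counts: records 0 and 1 coincide, record 2 is separated.
by_day = distance_matrix(records, measures, weights=(1.0, 0.0))
# Only the district counts: records 0 and 2 coincide, record 1 is separated.
by_district = distance_matrix(records, measures, weights=(0.0, 1.0))
```

The same three records form different groups depending on which attribute carries weight, so a cluster in the scatterplot can be explained by switching dimensions on and off and watching which separation disappears.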

slide-26
SLIDE 26

Interpretation Study

slide-27
SLIDE 27

Study Design

3 LEA data analysts (1 female)

  • worked with data tables on a daily basis
  • not used to working with abstract data representations

4 consecutive tasks

  • Each analyst was confronted with the same task order
  • Each task was introduced as a new, subsequent analysis question
slide-28
SLIDE 28

Study Design

San Francisco Crime Data

  • Week from Monday, July 25, 2016 to Monday, August 1, 2016
  • 13 dimensions
  • 36 different crime categories

After the study, the analysts filled out a questionnaire regarding:

  • basic understanding
  • interaction concepts
  • extraction of knowledge
slide-29
SLIDE 29

Tasks

slide-30
SLIDE 30

Task 1: Is there a pattern among dimensions between days?

slide-31
SLIDE 31

Task 1: Model Solution

slide-32
SLIDE 32

Task 2: Why is Monday separated from all other days of the week? What is special about the Date distribution?

slide-33
SLIDE 33

Task 2: Model Solution

slide-34
SLIDE 34

Task 3: Which distribution of dimension values can you find for the rest of the week?

slide-35
SLIDE 35

Task 3: Model Solution

slide-36
SLIDE 36

Task 4: Leaving the temporal aspect behind, is there a pattern based on places or crime types?

slide-37
SLIDE 37

Task 4: Model Solution

slide-38
SLIDE 38

Findings

slide-39
SLIDE 39

F1: The analysis starts with an already known hypothesis.

slide-40
SLIDE 40

Crime Routine Activity (L. E. Cohen, 1979)

Place: District / Street / GPS
Time: Date / Time / Weekday
Occasion: Crime Opportunity

slide-41
SLIDE 41

F1: The analysis starts with an already known hypothesis.
F2: Analysts always consider adding or removing dimensions in the depiction to explain a cluster separation.

slide-42
SLIDE 42

F1: The analysis starts with an already known hypothesis.
F2: Analysts always consider adding or removing dimensions in the depiction to explain a cluster separation.
F3: Analysts do not add or remove dimensions to explain an anomaly they are unsure about.

slide-43
SLIDE 43

F1: The analysis starts with an already known hypothesis.
F2: Analysts always consider adding or removing dimensions in the depiction to explain a cluster separation.
F3: Analysts do not add or remove dimensions to explain an anomaly they are unsure about.
F4: Analysts untrained in dimensionality reduction (DR) understand a multivariate depiction well when the use case relates to their domain.

slide-44
SLIDE 44

Conclusion

  • Interactive visualization system to explore the data using the Gower Metric
  • Qualitative study of subjective experiences of domain experts

slide-45
SLIDE 45

THANK YOU!

Dominik Jäckle Florian Stoffel Sebastian Mittelstädt Daniel Keim Harald Reiterer

http://www.dominikjaeckle.com/projects/2017/crime_interpret