Artworks and Articles Meet MAPPER and Persistent Homology

SLIDE 1

Artworks and Articles Meet MAPPER and Persistent Homology

Presented by Alicia Ledesma Alonso and Hongyuan Zhang

Presentation design adapted from slidesgala.com https://slidesgala.com/sheldon/

SLIDE 2

Why TDA?

  • Coordinate freeness
  • Deformation invariance
  • Compressed representations
SLIDE 3

Topology

  • Topology and topological spaces
  • Distance and metrics
  • Simplicial Complex
  • Persistent Homology

SLIDE 4

Pipeline

Raw Data -> Cleaned/Filtered data -> Persistent Homology / Mapper -> Analyze

SLIDE 5

What is persistent homology?

[Figures: Filtration example; Barcodes]
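
To make the filtration and barcode idea concrete, here is a minimal sketch that computes dimension-0 barcodes of a Vietoris-Rips filtration in plain Python. It is an illustration under stated assumptions, not the code used in this project: the function name `h0_barcodes`, the toy point cloud, and the choice of Euclidean distance are all hypothetical. In dimension 0, every point is born at scale 0, and a connected component dies at the scale where an edge first merges it into another component.

```python
import math
from itertools import combinations


def h0_barcodes(points):
    """Return dimension-0 persistence intervals (birth, death) of the
    Vietoris-Rips filtration on `points`, using Euclidean distance.

    Hypothetical helper, for illustration only.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of increasing length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge, so one of them dies at this scale.
            parent[ri] = rj
            bars.append((0.0, length))
    # The last remaining component never dies.
    bars.append((0.0, math.inf))
    return bars


# Toy data (made up): two tight pairs of points that are far apart.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
bars = h0_barcodes(cloud)
print(bars)
```

On this toy cloud the two short bars come from points merging inside each pair, while the longer finite bar reflects the gap between the two pairs. Long bars like that one are the "persistent" features.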

SLIDE 6

What is Mapper?

Ideally, we can recover the topological features of the original data cloud from the resulting simplicial complex.

Credit to: “A User’s Guide to Topological Data Analysis” by Elizabeth Munch
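
The Mapper construction can be sketched in a few lines of plain Python. This is a simplified illustration, not the implementation used in this project: the function names `mapper_graph` and `single_linkage_clusters`, the parameters `n_intervals`, `overlap`, and `eps`, and the toy data are all assumptions. The sketch covers the range of a lens function with overlapping intervals, clusters the points that fall in each interval, and connects two clusters whenever they share a data point.

```python
import math
from itertools import combinations


def single_linkage_clusters(indices, points, eps):
    """Group `indices` into clusters: two points share a cluster when a
    chain of points with consecutive distances <= eps connects them."""
    clusters = []
    unvisited = set(indices)
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            p = frontier.pop()
            near = {q for q in unvisited
                    if math.dist(points[p], points[q]) <= eps}
            unvisited -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_graph(points, lens, n_intervals=4, overlap=0.25, eps=1.5):
    """Return (nodes, edges); each node is a frozenset of point indices."""
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    nodes = []
    for k in range(n_intervals):
        # Interval k, widened on both sides so that neighbours overlap.
        a = lo + k * width - overlap * width
        b = lo + (k + 1) * width + overlap * width
        members = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(single_linkage_clusters(members, points, eps))
    # Connect two clusters whenever they share a data point.
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]
    return nodes, edges


# Toy data (made up): eight points arranged in a loop.
loop = [(0, 0), (1, 1), (2, 1), (3, 1), (1, -1), (2, -1), (3, -1), (4, 0)]
# Lens: the x-coordinate of each point.
nodes, edges = mapper_graph(loop, lens=lambda p: p[0])
print(len(nodes), "nodes,", len(edges), "edges")
```

On this toy loop the resulting graph is itself a cycle, which is the sense in which the simplicial complex recovers the shape of the data cloud. The result depends on the lens, the cover, and the clustering scale, so in practice these are explored rather than fixed.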

SLIDE 7

arXiv

  • arXiv Data
      • arXiv online API and Amazon S3
  • arXiv persistent homology
      • Select random samples
      • Identify persistent intervals
      • Identify differences
  • arXiv Mapper
      • Color by academic categories
      • Explore various lenses
      • Compare
SLIDE 8

arXiv metric

How do we measure distance between two articles?

  • L. Carlsson, G. Carlsson, and M. Vejdemo-Johansson. Fibres of Failure: Classifying errors in predictive processes. arXiv e-prints, February 2018.
SLIDE 9

arXiv Persistent Homology - Dionysus

SLIDE 10

arXiv Color Function

SLIDE 11

SLIDE 12

SLIDE 13

Met

  • Met Data
      • Official MET GitHub
  • Met persistent homology
      • Select random samples
      • Identify persistent intervals
      • Identify differences
  • Met Mapper
      • Identify subgroups
      • Select significant features
      • Compare
SLIDE 14

Met Metric

Q: How do we measure the distance between two artworks?
A: The data are of mixed type, so each type is measured with a different metric:
  • Categorical features -> Jaccard distance
  • Numerical features -> difference divided by max distance
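
A minimal sketch of such a mixed-type distance follows. The slide does not say how the per-feature distances are combined, so the equal-weight average below is an assumption, as are the function names, the field names, and the example records (they are not the Met schema or data).

```python
def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B| for two sets of category labels."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # assumption: two empty sets are treated as identical
    return 1.0 - len(a & b) / len(a | b)


def scaled_difference(x, y, max_distance):
    """Absolute difference divided by the largest difference in the data."""
    return abs(x - y) / max_distance


def artwork_distance(u, v, categorical, numerical, max_distances):
    """Assumption: per-feature distances are averaged with equal weight."""
    parts = [jaccard_distance(u[f], v[f]) for f in categorical]
    parts += [scaled_difference(u[f], v[f], max_distances[f])
              for f in numerical]
    return sum(parts) / len(parts)


# Hypothetical records, for illustration only.
a = {"tags": {"print", "etching"}, "year": 1650}
b = {"tags": {"print", "woodcut"}, "year": 1850}
d = artwork_distance(a, b, ["tags"], ["year"], {"year": 400})
print(round(d, 4))
```

Both component distances lie between 0 and 1, so no single feature dominates just because of its units.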

SLIDE 15

Met Mapper

SLIDE 16

Statistical Analysis

SLIDE 17

Model Comparison

Model 1: “Is Public Domain” ~ “Drawings and Prints”
Model 2: “Is Public Domain” ~ all variables

Model Accuracy Scores (using Python Sklearn score() method):
  • Model 1: 52.17%
  • Model 2: 73.83%

Mapper is effective in guiding feature selection!
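
For scikit-learn classifiers, `score()` returns the mean accuracy, that is, the fraction of predictions that match the true labels. Assuming the two models are classifiers, the sketch below shows that calculation in plain Python. The helper name `accuracy` and the labels are made up for illustration and are unrelated to the Met data or the scores reported above.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration (1 = public domain, 0 = not).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc = accuracy(y_true, y_pred)
print(f"{acc:.2%}")
```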

SLIDE 18

Met Persistent Homology

[4, 5) is a relatively persistent interval for both groups in Dimension 1!

Persistent Homology can help classification!

Comparing the number of persistent barcodes and the distributions of variables
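
One simple way to compare groups by their number of persistent bars is to keep only the intervals longer than a chosen threshold and count them. The sketch below is an assumption about how such a comparison could be set up; the helper `persistent_intervals`, the threshold, and the barcodes are made up and are not the Met results.

```python
def persistent_intervals(bars, min_length):
    """Keep the (birth, death) bars whose length is at least `min_length`."""
    return [(b, d) for b, d in bars if d - b >= min_length]


# Made-up dimension-1 barcodes for two groups, for illustration only.
group_a = [(1.0, 1.2), (4.0, 5.0), (2.0, 2.1)]
group_b = [(4.0, 5.0), (3.0, 4.5), (0.5, 0.6)]

long_a = persistent_intervals(group_a, min_length=1.0)
long_b = persistent_intervals(group_b, min_length=1.0)
print(len(long_a), len(long_b))
```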

SLIDE 19

Thank you to Professor Marcos Ortiz for his mentorship, Grinnell College and the NSF for providing funding, and the Department of Mathematics and Statistics of Grinnell College for providing this opportunity.

Thank you!