
Aug 11, 2023


SLIDE 1

Carlos Ramos Carreño

Grupo de Aprendizaje Automático, Department of Computer Science, Universidad Autónoma de Madrid (UAM)

SLIDE 2

Who are we?

Carlos Ramos Carreño (carlos.ramos@uam.es)¹
José Luis Torrecilla Noguerales (joseluis.torrecilla@uam.es)²
Alberto Suárez (alberto.suarez@uam.es)¹
Miguel Carbajo Berrocal
Pablo Marcos Manchón
Amanda Hernando Bernabé
Pablo Pérez Manso

¹ Department of Computer Science, Universidad Autónoma de Madrid (UAM)
² Department of Mathematics, Universidad Autónoma de Madrid (UAM)

SLIDE 3

What is scikit-fda?

A software package for Functional Data Analysis (FDA)
Preprocessing, exploration and machine learning tools

Fully integrated in the Python science ecosystem
Efficient, flexible and easy to use

SLIDE 4

Which other tools for FDA are available?

Mainly R software:

General purpose

○ fda
○ fda.usc
○ tidyfun

Representation

○ funData

Registration

○ fdasrvf

Robust analysis

○ roahd

FPCA

○ fdapace
○ MFPCA

Regression

○ refund
○ refund.wave
○ fdaPDE
○ sparseFLMM
○ FDBoost

SLIDE 5

Which other tools for FDA are available?

Mainly R software:

Visualization

○ rainbow

Variable selection

○ RFgroove

Time series

○ fds
○ ftsa

Clustering

○ Funclustering
○ funcy
○ funFEM
○ funHDDC

SLIDE 6

Why Python?

Powerful, easy to use, general-purpose programming language
The SciPy environment:

○ NumPy: N-dimensional arrays and linear algebra
○ SciPy: Utilities (statistics, integration, formats…)
○ Matplotlib: Plotting
○ Jupyter: Interactive notebooks
○ and much more...

SLIDE 7

Why scikit?

SciPy Toolkits (SciKits)
Specialized science packages

SLIDE 8

scikit-fda

○ representation
○ preprocessing
○ exploratory analysis
○ statistical inference
○ machine learning

SLIDE 9

representation

○ discretized representation: regularly sampled, irregularly sampled
○ basis representation

SLIDE 10

Discretized representation

Each curve is evaluated at the same points
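
The idea can be sketched in plain Python as a shared grid plus a matrix of values, one row per curve. This is only an illustration of the data layout, not scikit-fda's API; the names grid, data_matrix and pointwise_mean are made up for the example.

```python
import math

# Shared evaluation grid: 5 equally spaced points in [0, 1].
n_points = 5
grid = [i / (n_points - 1) for i in range(n_points)]

# Two illustrative curves, each evaluated at the same grid points.
curves = [math.sin, math.cos]
data_matrix = [[f(t) for t in grid] for f in curves]


def pointwise_mean(matrix):
    """Mean curve, computed point by point over the shared grid."""
    return [sum(column) / len(column) for column in zip(*matrix)]


# Row i holds the samples of curve i; column j corresponds to grid[j].
mean_curve = pointwise_mean(data_matrix)
```

Because every curve shares the grid, pointwise operations such as the mean reduce to column operations on the matrix.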

SLIDE 11

Basis representation

Expansion in a truncated basis of functions
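
As a rough sketch of what such an expansion looks like, the plain-Python example below projects a sampled curve onto a truncated Fourier basis. It assumes a uniform, periodic grid on [0, 1), where the coefficients reduce to discrete inner products; the helper names (fourier_basis, project, evaluate) are hypothetical and are not scikit-fda functions.

```python
import math


def fourier_basis(n_pairs):
    """Constant function plus n_pairs sine/cosine pairs, orthonormal on [0, 1]."""
    basis = [lambda t: 1.0]
    for k in range(1, n_pairs + 1):
        basis.append(lambda t, k=k: math.sqrt(2) * math.sin(2 * math.pi * k * t))
        basis.append(lambda t, k=k: math.sqrt(2) * math.cos(2 * math.pi * k * t))
    return basis


def project(values, grid, basis):
    """Coefficients as discrete inner products (uniform periodic grid assumed)."""
    n = len(grid)
    return [sum(v * phi(t) for v, t in zip(values, grid)) / n for phi in basis]


def evaluate(coefficients, basis, t):
    """Value of the basis expansion at an arbitrary point t."""
    return sum(c * phi(t) for c, phi in zip(coefficients, basis))


n = 64
grid = [j / n for j in range(n)]
values = [3.0 + 2.0 * math.sin(2 * math.pi * t) - math.cos(4 * math.pi * t)
          for t in grid]

basis = fourier_basis(n_pairs=2)            # truncated basis: 5 functions
coefficients = project(values, grid, basis)
```

The curve is now stored as five coefficients instead of 64 samples and can be evaluated at any point, not only on the original grid.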

SLIDE 12

preprocessing

○ smoothing
○ registration
○ dimensionality reduction

SLIDE 13

Registration

Alignment of the curves, so that common features (peaks, valleys...) are at the same points
Typically, a warping function is used to transform the input
Several methods

○ Shift registration
○ Landmark registration
○ Elastic registration
○ ...

SLIDE 14

Shift registration

Warpings are translations
Tries to minimize the least squares criterion
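
A minimal sketch of the idea, assuming periodic curves on a uniform grid and a brute-force search over whole-sample shifts. A real implementation would typically optimize a continuous shift; the function names here are illustrative only.

```python
import math


def shifted(values, k):
    """Periodic curve translated by k grid steps: t -> f(t + k * step)."""
    n = len(values)
    return [values[(j + k) % n] for j in range(n)]


def sse(a, b):
    """Sum of squared differences between two sampled curves."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_shift(values, target):
    """Shift (in samples) that minimizes the least squares criterion."""
    return min(range(len(values)), key=lambda k: sse(shifted(values, k), target))


n = 100
grid = [j / n for j in range(n)]
target = [math.sin(2 * math.pi * t) for t in grid]
# Same shape as the target, delayed by 0.1 time units.
curve = [math.sin(2 * math.pi * (t - 0.1)) for t in grid]

k = best_shift(curve, target)
registered = shifted(curve, k)
```

Here the recovered shift corresponds to the 0.1 delay (10 grid steps of size 0.01), after which the curve coincides with the target.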

SLIDE 15

Landmark registration

Warping functions to move the predefined landmarks to fixed positions
Landmarks should be specified by the user
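
One simple way to build such a warping, shown here only as a sketch, is piecewise linear interpolation between the landmarks. The example assumes a single landmark on the domain [0, 1] with fixed endpoints; interpolate, make_warping and bump are hypothetical helpers, not library functions.

```python
import math


def interpolate(x, xs, ys):
    """Piecewise linear interpolation through the points (xs[i], ys[i])."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x outside the interpolation range")


def make_warping(landmarks, targets, domain=(0.0, 1.0)):
    """Warping h with h(target) = landmark that keeps the domain endpoints fixed."""
    xs = [domain[0], *targets, domain[1]]
    ys = [domain[0], *landmarks, domain[1]]
    return lambda t: interpolate(t, xs, ys)


def bump(center):
    """Curve with a single peak, located at `center`."""
    return lambda t: math.exp(-((t - center) ** 2) / 0.005)


curve = bump(0.3)                                     # user-specified landmark: peak at 0.3
warping = make_warping(landmarks=[0.3], targets=[0.5])


def registered(t):
    """Registered curve: the original curve composed with the warping."""
    return curve(warping(t))
```

After registration the peak sits at the fixed position 0.5, since registered(0.5) evaluates the original curve at its landmark 0.3.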

SLIDE 16

Elastic registration

Uses the square root velocity framework (Srivastava et al., 2011 <arXiv:1103.3817> and Tucker et al., 2014 <doi:10.1016/j.csda.2012.12.001>)
Also available in the R package fdasrvf
Unsupervised method
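
In this framework each curve f is replaced by its square root velocity function q = f' / sqrt(|f'|) before alignment. The sketch below only computes that transform with forward differences on a uniform grid; it does not perform the alignment itself, and the function name srvf is chosen for the example.

```python
import math


def srvf(values, step):
    """Square root velocity function q = f' / sqrt(|f'|), via forward differences."""
    q = []
    for a, b in zip(values, values[1:]):
        slope = (b - a) / step
        # sign(f') * sqrt(|f'|) equals f' / sqrt(|f'|) and is safe when f' = 0.
        q.append(math.copysign(math.sqrt(abs(slope)), slope))
    return q


n = 1000
step = 1.0 / n
grid = [j * step for j in range(n + 1)]
values = [t ** 2 for t in grid]          # f(t) = t^2, so f'(t) = 2t

q = srvf(values, step)
# The squared L2 norm of q is the integral of |f'|, the total variation of f.
squared_norm = sum(x * x for x in q) * step
```

For this increasing curve the squared norm equals f(1) - f(0) = 1, which is a quick sanity check of the transform.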

SLIDE 17

exploratory analysis

○ descriptive statistics
○ outliers
○ depth
○ visualization

SLIDE 18

Functional data boxplot

Similar to the boxplot of univariate data
A depth function must be chosen
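
To make the role of the depth function concrete, the sketch below uses the modified band depth (one common choice, assumed here for illustration) to order the curves, picks the deepest one as the median and builds the envelope of the deepest half, which plays the role of the box. The function names are hypothetical.

```python
import math
from itertools import combinations


def modified_band_depth(curves):
    """For each curve, mean fraction of grid points lying inside the band
    spanned by each pair of curves in the sample."""
    pairs = list(combinations(curves, 2))
    depths = []
    for f in curves:
        inside = 0
        for g, h in pairs:
            inside += sum(min(a, b) <= x <= max(a, b) for x, a, b in zip(f, g, h))
        depths.append(inside / (len(pairs) * len(f)))
    return depths


def central_envelope(curves, depths, fraction=0.5):
    """Pointwise minimum and maximum of the deepest `fraction` of the curves."""
    n_keep = math.ceil(fraction * len(curves))
    order = sorted(range(len(curves)), key=lambda i: depths[i], reverse=True)
    deepest = [curves[i] for i in order[:n_keep]]
    lower = [min(column) for column in zip(*deepest)]
    upper = [max(column) for column in zip(*deepest)]
    return lower, upper


grid = [j / 10 for j in range(11)]
# Five parallel lines f_c(t) = t + c; the middle one (c = 0) should be deepest.
offsets = [-2.0, -1.0, 0.0, 1.0, 2.0]
curves = [[t + c for t in grid] for c in offsets]

depths = modified_band_depth(curves)
median_index = max(range(len(curves)), key=lambda i: depths[i])
lower, upper = central_envelope(curves, depths)
```

Choosing a different depth function changes the ordering of the curves, and therefore the median, the central region and which curves are flagged as outliers.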

SLIDE 19

statistical inference

○ estimation
○ confidence intervals
○ statistical hypothesis testing

SLIDE 20

machine learning

○ clustering
○ regression
○ classification

SLIDE 21

K-means clustering

Predefined number of clusters
Finds the best position of the centroids of the clusters
A functional metric must be chosen
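
The procedure can be sketched with Lloyd's algorithm on discretized curves, taking the L2 distance as the functional metric. The metric, the fixed initial centroids and all names below are assumptions made for the illustration, not the library's implementation.

```python
import math


def l2_distance(f, g, step):
    """Discretized L2 distance between two curves on a uniform grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) * step)


def mean_curve(curves):
    """Pointwise mean of a list of sampled curves."""
    return [sum(column) / len(column) for column in zip(*curves)]


def k_means(curves, initial_centroids, step, n_iter=20):
    """Lloyd's algorithm: assign each curve to the nearest centroid,
    then move each centroid to the mean of its assigned curves."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iter):
        labels = [
            min(range(len(centroids)),
                key=lambda k: l2_distance(f, centroids[k], step))
            for f in curves
        ]
        for k in range(len(centroids)):
            members = [f for f, label in zip(curves, labels) if label == k]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[k] = mean_curve(members)
    return labels, centroids


n = 50
step = 1.0 / n
grid = [j * step for j in range(n)]
shifts = [-0.1, 0.0, 0.1]
curves = [[math.sin(2 * math.pi * t) + a for t in grid] for a in shifts]
curves += [[math.cos(2 * math.pi * t) + a for t in grid] for a in shifts]

# Predefined number of clusters: 2, initialized with one curve of each shape.
labels, centroids = k_means(curves, [curves[0], curves[3]], step)
```

Swapping l2_distance for another functional metric changes both the assignments and, in general, the notion of centroid that is appropriate.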

SLIDE 22

Fuzzy K-means

Fuzzy version of K-means
Each observation does not necessarily belong to only one cluster: it has a degree of membership in each of them
The degrees of membership add up to one
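
The membership step can be sketched as follows, using the standard fuzzy c-means formula with a fuzzifier of 2 and the L2 distance, both assumed here for illustration. Only the computation of the degrees of membership is shown, not the full iterative algorithm, and the names are hypothetical.

```python
import math


def l2_distance(f, g, step):
    """Discretized L2 distance between two curves on a uniform grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) * step)


def memberships(curve, centroids, step, fuzzifier=2.0):
    """Degrees of membership of one curve to each cluster (fuzzy c-means rule)."""
    distances = [l2_distance(curve, c, step) for c in centroids]
    # A curve that coincides with a centroid belongs fully to that cluster.
    if 0.0 in distances:
        hits = [d == 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (fuzzifier - 1.0)
    weights = [d ** -exponent for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


n = 10
step = 1.0 / n
centroids = [[0.0] * n, [1.0] * n]      # two constant centroid curves
curve = [0.25] * n                      # closer to the first centroid

degrees = memberships(curve, centroids, step)
```

The curve gets a high degree of membership in the nearer cluster and a low one in the other, and the normalization by the total weight is what makes the degrees add up to one.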

SLIDE 23

Documentation

Up to date and available online
Easily searchable
Cross-referenced
Detailed examples and interactive notebooks
Examples downloadable as Python source files or Jupyter notebooks

SLIDE 24

Where can I find more?

PyPI: https://pypi.org/project/scikit-fda/
GitHub page: https://github.com/GAA-UAM/scikit-fda/
Documentation: https://fda.readthedocs.io

SLIDE 25

Thanks for your attention!!