Datawarehousing para datos genéticos, socioeconómicos y fenotípicos, con visualización 3D



SLIDE 1

Datawarehousing para datos genéticos, socioeconómicos y fenotípicos, con visualización 3D

(Data warehousing for genetic, socioeconomic, and phenotypic data, with 3D visualization)

SciPy 2018

Luciano Serruya Aloisi, Pablo Toledo Margalef

Universidad Nacional de la Patagonia San Juan Bosco

August 31, 2018

SLIDE 2

Roadmap

✓ Introduction
✓ A little bit of software engineering
    What we did
    Why we did it that way
    How we did it
✓ Demo time
✓ Conclusions
✓ The end

SLIDE 3

whoami

Luciano: Linux, Python, and JavaScript (@LucianoSerruya, @LucianoFromTrelew)

Pablo: Linux and Python. FP enthusiast (@T_Papablo, @PaPablo)

Both students at UNPSJB, Trelew

SLIDE 4

Getting in context

IPCSH-CONICET studies how traditions, customs, and ancestral heritage relate to physical variables of medical interest.

The RAICES Project (IPCSH-CONICET) aims to build a Patagonian biobank (a genetic data bank).

It is intended to support future applications and the design of public health policies.

SLIDE 5

Getting in context (cont.)

RAICES Project sampling consists of a survey completed by the volunteers (people born in Argentina) and several other exams.

The surveys were (and still are) completed via a Google Form and then exported to an XLS (Excel) file.

The exams also output their own files.

SLIDE 6

Getting in context (cont.)

Sampled data:

    Phenotypic data (whole-body videos, 3D scans)
    Socioeconomic data (monthly income, lifestyle, number of home appliances owned)
    Ancestry (where their parents and grandparents come from, which languages they speak)
    Drinking, smoking, and eating habits
    ...

SLIDE 7

Internship @ CENPAT

This work started as an internship at CENPAT-CONICET.

The project's researchers needed software to handle all that data and those files.

That is where we came in: to develop a data warehouse and a web application to meet their needs.

SLIDE 8

Design decisions

Web application
REST architecture
NoSQL database (MongoDB)
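
A document store fits data like this because a volunteer's answers, ancestry, and file references nest naturally in one record. The sketch below only illustrates that shape in plain Python. The class and field names (`Volunteer`, `Ancestry`, `scan_files`, and so on) are hypothetical and are not the project's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Ancestry:
    # Hypothetical fields, loosely based on the survey topics listed earlier.
    parents_birthplaces: List[str] = field(default_factory=list)
    languages: List[str] = field(default_factory=list)


@dataclass
class Volunteer:
    volunteer_id: str
    monthly_income: Optional[float] = None  # None when the answer is missing
    ancestry: Ancestry = field(default_factory=Ancestry)
    scan_files: List[str] = field(default_factory=list)  # paths to exam outputs


volunteer = Volunteer(
    volunteer_id="V001",
    ancestry=Ancestry(languages=["Spanish"]),
    scan_files=["V001_scan.obj"],
)

# asdict() recursively converts nested dataclasses into a JSON-like dict,
# which is the shape a document database would store.
document = asdict(volunteer)
print(json.dumps(document, indent=2))
```

Optional or missing answers stay in the record as `None` instead of forcing a schema change.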

SLIDE 9

Development

Data warehouse: Python + MongoDB (MongoEngine FTW)

pandas for data processing

DRF (Django REST Framework) + Vue.js

three.js

Bokeh

SLIDE 10

Problems we came across

Missing data
Incorrectly formatted data
Lack of documentation about Django + MongoDB
The three.js API is not ES6-friendly
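
A minimal sketch of one way to handle the first two problems with pandas: coerce each column to its expected type so that unparseable values become missing values, then count them. The column names and sample values are invented for illustration; the real survey fields are not shown in these slides.

```python
import pandas as pd

# Hypothetical survey export with missing and badly formatted answers.
raw = pd.DataFrame(
    {
        "volunteer_id": ["V001", "V002", "V003", "V004"],
        "monthly_income": ["25000", " 18500 ", "", "n/a"],
        "birth_date": ["1990-05-17", "1985-11-02", "17/05/85", None],
    }
)


def clean_survey(df: pd.DataFrame) -> pd.DataFrame:
    """Return a typed copy; values that cannot be parsed become NaN/NaT."""
    out = df.copy()
    # Strip stray whitespace, then convert to numbers; bad values become NaN.
    out["monthly_income"] = pd.to_numeric(
        out["monthly_income"].str.strip(), errors="coerce"
    )
    # Accept only ISO dates; anything else becomes NaT instead of raising.
    out["birth_date"] = pd.to_datetime(
        out["birth_date"], format="%Y-%m-%d", errors="coerce"
    )
    return out


cleaned = clean_survey(raw)

# Per-column count of values that are missing or could not be parsed.
print(cleaned.isna().sum())
```

Coercing instead of raising keeps the load going and leaves a clear list of cells to review by hand.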

SLIDE 11

IT’S DEMO TIME

SLIDE 12

Conclusions & Resolutions

You will have to integrate your frontend application with your API sooner or later.

Decoupled architectures generate coupling if there is a lack of communication between the two development teams.

Working with files over REST is not the happiest thing to do (it was not for us, at least).

SLIDE 13

Conclusions & Resolutions (cont.)

If you get to work with 3D visualization, keep an eye on the following:

    Size and scale of your mesh
    Camera and mesh position and angle
    Lighting (if you are working with textures)
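
A rough sketch of the first two points, in plain Python rather than three.js: center the mesh on the origin, scale it to a known size, and place the camera far enough away for the mesh's bounding sphere to fit the field of view. The function names and sample vertices are hypothetical, and the 50 degree field of view is only an assumed default.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def bounding_box(vertices: List[Vec3]) -> Tuple[Vec3, Vec3]:
    """Return the (min corner, max corner) of the axis-aligned bounding box."""
    xs, ys, zs = zip(*vertices)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def normalize(vertices: List[Vec3], target_size: float = 1.0) -> List[Vec3]:
    """Center the mesh on the origin and scale its longest side to target_size."""
    lo, hi = bounding_box(vertices)
    center = tuple((a + b) / 2 for a, b in zip(lo, hi))
    longest = max(b - a for a, b in zip(lo, hi))
    if longest == 0:
        raise ValueError("degenerate mesh: all vertices coincide")
    scale = target_size / longest
    return [tuple((c - m) * scale for c, m in zip(v, center)) for v in vertices]


def camera_distance(vertices: List[Vec3], fov_degrees: float = 50.0) -> float:
    """Distance from the origin at which the bounding sphere fits the given FOV."""
    radius = max(math.sqrt(x * x + y * y + z * z) for x, y, z in vertices)
    return radius / math.sin(math.radians(fov_degrees) / 2)


# Hypothetical scan whose coordinates are in millimetres.
scan = [(0.0, 0.0, 0.0), (400.0, 0.0, 0.0), (400.0, 1800.0, 0.0), (0.0, 1800.0, 300.0)]
normalized = normalize(scan)
distance = camera_distance(normalized)
print(bounding_box(normalized), distance)
```

Normalizing first means the camera distance no longer depends on the units the scanner happened to export.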

SLIDE 14

I would like to know some more about it, please

These slides - https://github.com/LucianoFromTrelew/scipy2018-raices-dw.git
RAICES Project - https://twitter.com/raices_proyecto
IPCSH - https://ipcsh.conicet.gov.ar/
CENPAT - http://www.cenpat-conicet.gob.ar/

SLIDE 15

We are happy to take your questions

SLIDE 16

The end

¡Muchas gracias! ¡Muito obrigado! Thank you very much!
