FolderShare: Building a data sharing cloud on Drupal 8 for researchers

FolderShare: Building a data sharing cloud on Drupal 8 for researchers Amit Chourasia, David Nadeau & Michael Norman San Diego Supercomputer Center, UC San Diego Project code: dibbs.seedme.org/downloads or drupal.org/projects/foldershare


slide-1
SLIDE 1

Amit Chourasia, David Nadeau & Michael Norman San Diego Supercomputer Center, UC San Diego

Project code: dibbs.seedme.org/downloads or drupal.org/projects/foldershare, drupal.org/projects/smalldata

Trial website: sandbox.seedme.org

Project website: dibbs.seedme.org

FolderShare: Building a data sharing cloud on Drupal 8 for researchers

slide-2
SLIDE 2

About me

Amit Chourasia

San Diego Supercomputer Center @ UC San Diego

  • Visualization scholar/evangelist
  • Lecturer/Instructor
  • Interests

– High performance computing – Data stewardship – Computer graphics

Drupal user since ~2006 | version 4.7.4

  • Personal website (Drupal 4 - 8: 150 pages)
  • Project website (Drupal 5: 50,000+ pages)
  • SeedMe2 cloud service (Drupal 7: 150,000+ pages)
  • SeedMe2 platform (Drupal 8: anticipate 1M+ items)

Seeking: PHP programmers
Desired: Deep knowledge of Drupal 8 core

slide-3
SLIDE 3

Presentation Overview

1. Background & motivation
2. Architecture
3. SeedMe2 platform

FolderShare: Virtual file system
» Entity data model & access control
» File management & security
» Views integration
» UI & Command plugins
» File formatters
» Web services

Small data module and API
– Visualization

4. Target users/Use cases
5. Screenshots
6. Demo

slide-4
SLIDE 4

SeedMe 1

a.k.a. Stream encode explore and disseminate My experiments

  • Based on Drupal 7
  • In production as Platform as a Service (PaaS)
  • Video encoding was the main focus

SeedMe 2 : Data sharing building blocks

Evolution of the original SeedMe project (complete rewrite)

  • Based on Drupal 8
  • Incorporates user feedback from original SeedMe project
  • Built for distribution and extension
  • Data sharing and data management is the main focus

SeedMe Project

slide-5
SLIDE 5

SeedMe 2’s focus

Enable rapid access to data
Consumable data
Can be handled by stock web browser (upload/download, < 2GB per file)
Displayable on many devices (phone to PC)

slide-6
SLIDE 6

Data management stumbling blocks

Transfer, Storage, Collaboration, Automation, Access control

But what about Presentation and Discovery?

slide-7
SLIDE 7

Data management stumbling blocks

Transfer, Storage, Collaboration, Automation, Access control

Issues due to content dispersion

Description in someone’s mind
Data in the Cloud
Discussion on emails

Three D’s: Data, Description, Discussion

slide-8
SLIDE 8

Related solutions

  • Filesystem based solutions
  • Content management system based solutions: HubZero, FigShare
  • Middleware: Globus, iRODS, NEWT
  • Software repositories: GitHub, SVN, CVS
  • File hosting: Cloud drives (Dropbox, etc.), WebDAV
  • Tools: SCP, FTP

Limitations of existing solutions

  • Lack extensibility
  • Lack support for rich content (description, discussion, etc.)
  • Lack independent developer support
  • Lack 3-way interaction via web browser, command line & API
  • Resource restricted

slide-9
SLIDE 9

Workflow

[Figure: workflow diagram with four numbered steps. Labels: Sign in; Create project (Description, Sharing); Add folders; Upload files; View, Search, Download; Update as desired. Access via web browser, command line, REST or app.]

slide-10
SLIDE 10

Users Drupal 8

Architecture

Modules

  • Virtual file system
  • Access control
  • Hierarchical storage
  • Command plugins
  • UI and display
  • Search / index
  • Web services
  • File formatters
  • Quick visualization

Webserver (Apache + PHP), Database (MySQL), Web browser, REST clients, Command line, Small data (PHP library)

Drupal 8 contributed modules, e.g. federated authentication via OIDC module

Project contributions

slide-11
SLIDE 11

SeedMe Platform Ecosystem

Drupal (Content Management System)

  • Widely used in industry, academia and government

(third most popular CMS on web after WordPress & Joomla)

  • Modular architecture with large ecosystem

(over 1,000 contributed modules)

  • Large developer & support community

(4,000 contributors to core + thousands more)

  • Security advisories and updates for core and stable contributed modules every month
  • Versatile deployment options

(personal hosting, institutional hosting, cloud hosting)

slide-12
SLIDE 12

FolderShare module

  • Required dependencies (11 - All in Drupal core)

– Datetime – Field – File – Image – Link – Media – Options – Text

  • Optional dependencies (3 - All in Drupal core)

– Comment – Help – RESTful web services

  • HTTP basic authentication
  • Serialization

– Search – Filter – System – User – Views

  • Core modules recommended

– Text editor – Field UI (may be) – Views UI (may be)

  • Contributed modules recommended

– Real name – REST UI (may be) – Small Data (for quick visualization of CSV, JSON files)

slide-13
SLIDE 13

FolderShare module

  • Virtual file system (fieldable):

– Entity type & API – Access controls – Usage tracking – Views, displays, breadcrumbs, forms – Plugins for field formatters, search, views, actions, and queue workers

  • Configurable by sites

– e.g. Keywords, comments, flags, DOIs

  • Extensible by developers
slide-14
SLIDE 14

Files & folders

  • Children point to parents

– Parent IDs enable fast queries for all children of a folder – Root IDs enable fast queries for access controls and breadcrumbs

parentid points to immediate parent
rootid points to top folder
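The parent and root pointers described above can be illustrated with a minimal Python sketch. It is not FolderShare code (the module is written in PHP): the `Item` class, the helper functions, and the sample rows are hypothetical, and only the field names `parentid` and `rootid` come from the slide. Walking parent links is just one straightforward way to build a breadcrumb trail; the slides do not show how FolderShare itself does it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Item:
    """A file or folder row. Hypothetical class; field names mirror the slide."""
    id: int
    name: str
    parentid: Optional[int]  # immediate parent; None for a top folder
    rootid: Optional[int]    # top folder of the hierarchy; None for a top folder


def children_of(items: Dict[int, Item], folder_id: int) -> List[Item]:
    """Direct children of a folder: a single filter on parentid, no tree walk."""
    return [item for item in items.values() if item.parentid == folder_id]


def top_folder_of(item: Item) -> int:
    """The top folder is known from the row itself, without climbing the tree."""
    return item.id if item.rootid is None else item.rootid


def breadcrumbs(items: Dict[int, Item], item_id: int) -> List[str]:
    """Follow parentid links upward, then reverse to get the path from the top."""
    path: List[str] = []
    current = items.get(item_id)
    while current is not None:
        path.append(current.name)
        if current.parentid is None:
            break
        current = items.get(current.parentid)
    return list(reversed(path))


# Sample rows (made up for illustration).
items = {
    1: Item(1, "Classification Collection", None, None),
    2: Item(2, "Preliminary Results", 1, 1),
    3: Item(3, "villi.png", 2, 1),
}
```

In a database, `children_of` corresponds to one indexed lookup on the parent column, which is the "fast query" the slide refers to.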

slide-15
SLIDE 15

Abstracted file storage

Folders exist in the database
Every uploaded file has a File ID (sequential)

Internal file organization and storage based on 16 bit pattern
Bit pattern 0000 0000 0000 0000
Physical hierarchy as 0000/0000/0000/0000/file_id
Each underlying folder stores 9,999 files
Total file handling: 65,535 * 9,999 = 655,284,465 (~655 million)

slide-16
SLIDE 16

Access controls

  • Drupal account-based
  • Permissions + access control list on top folders

– List of users that can view and author

  • Top folder controls entire hierarchy

– Simpler than desktop OSes – Similar to file sharing services – Fast to check access
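As a rough illustration of why this is fast to check, the Python sketch below resolves access for any item by looking only at the access control list of its top folder. The `TopFolderAcl` class, the `may_access` function, and the "view" and "author" operation names are hypothetical stand-ins; the Drupal permission layer that sits alongside the list is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class TopFolderAcl:
    """Hypothetical access control list stored on a top folder."""
    owner: str
    viewers: Set[str] = field(default_factory=set)
    authors: Set[str] = field(default_factory=set)


def may_access(acls: Dict[int, TopFolderAcl], rootid: int, user: str, operation: str) -> bool:
    """Check access to any item by consulting only its top folder's list."""
    acl = acls[rootid]
    if user == acl.owner:
        return True  # assumption: the owner can always view and author
    if operation == "view":
        return user in acl.viewers
    if operation == "author":
        return user in acl.authors
    raise ValueError(f"unknown operation: {operation}")


# Sample data (made up): top folder 1 is owned by amit and shared with dave for viewing.
acls = {1: TopFolderAcl(owner="amit", viewers={"dave"})}
```

Because every file and folder carries the ID of its top folder, the check needs one lookup regardless of how deeply the item is nested.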

slide-17
SLIDE 17

File storage

  • Folders only exist in database
  • Files described in database & stored on disk
  • Disk directory !== folder hierarchy

– Better for security and load balancing – Files have generated names

  • Avoids character set and name length limits

– Files have no extensions

  • Avoids accidental server execution of “.php”, etc.
slide-18
SLIDE 18

Views

  • List personal, public, and shared files & folders

– Pages & embedded views in folder pages

  • Integrated desktop-like UI

– Select files and folders – Then choose menu command

  • Three UI variants:

– No scripting – Scripting but no AJAX – Scripting with AJAX

slide-19
SLIDE 19

Plugins

  • Field formatters

– Folder names, entity references, MIME-type icons

  • Search

– Index and present results

  • Queue worker

– Update folder hierarchy sizes in background

  • Actions & custom commands

– Menu UI items to add, delete, etc.
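The queue worker item above (updating folder hierarchy sizes in the background) can be sketched as follows. This is a Python illustration under assumed data structures, not the module's PHP queue worker plugin: the function name, the dictionaries, and the propagation rule are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Optional


def process_size_queue(
    parentid: Dict[int, Optional[int]],
    file_size: Dict[int, int],
    folder_size: Dict[int, int],
    queue: Deque[int],
) -> None:
    """Drain a queue of folder IDs, refreshing each folder's cached total size."""
    while queue:
        folder = queue.popleft()
        total = 0
        for item, parent in parentid.items():
            if parent != folder:
                continue
            # Files contribute their own size; subfolders their cached total.
            total += file_size.get(item, folder_size.get(item, 0))
        if folder_size.get(folder) != total:
            folder_size[folder] = total
            parent = parentid.get(folder)
            if parent is not None:
                queue.append(parent)  # the change propagates upward


# Sample hierarchy (made up): folder 1 contains folder 2 and file 4; folder 2 contains file 3.
parentid = {1: None, 2: 1, 3: 2, 4: 1}
file_size = {3: 100, 4: 50}
folder_size = {1: 0, 2: 0}
queue = deque([2])  # an upload into folder 2 queued it for an update
process_size_queue(parentid, file_size, folder_size, queue)
```

Deferring this work to a queue keeps uploads and deletes quick, since the user's request does not wait for sizes to be recomputed up the hierarchy.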

slide-20
SLIDE 20

Code trivia

Total lines of PHP code
Node: 25,639 (Drupal core 8.5.0)
FolderShare: 50,156 (Alpha1 version)

[Chart: FolderShare main code (PHP, docs), lines of code per release, v1 to v5 and alpha]

[Chart: FolderShare miscellaneous code (JS, YML, CSS, Twig, TXT), lines of code per release, v1 to v5 and alpha]

slide-21
SLIDE 21

FolderShare API Documentation

slide-22
SLIDE 22

SmallData API & Module

  • Structured data parsers & writers

– Tables, trees, and graphs – JSON, CSV, TSV, TXT, etc.

  • Field formatters

– Light-weight visualization – Line plots, bar charts, pie charts, etc.

SmallData

slide-23
SLIDE 23

FOR ADMINISTRATORS

FolderShare configuration

Admin menu Structure > FolderShare

slide-24
SLIDE 24

FolderShare configuration

Fields
Manage fields, forms & display

Configuration located in admin menu Structure > FolderShare

slide-25
SLIDE 25

FolderShare configuration

Files
Storage location & upload restrictions

slide-26
SLIDE 26

FolderShare configuration

Interface Command plugins

slide-27
SLIDE 27

FolderShare configuration

Lists
Manage listing of files and folders

slide-28
SLIDE 28

FolderShare configuration

Search (optional)

slide-29
SLIDE 29

FolderShare configuration

Security
Manage sharing capabilities

slide-30
SLIDE 30

FolderShare configuration

Web services (optional)
Manage REST capabilities

slide-31
SLIDE 31

FolderShare REST settings

Requires REST UI contributed module
Manage REST operations

slide-32
SLIDE 32

FolderShare Usage

Admin menu Reports > FolderShare usage

slide-33
SLIDE 33

FolderShare permissions

slide-34
SLIDE 34

FOR USERS

slide-35
SLIDE 35

Top folders owned by you
Top folders shared with you
Public top folders

Menu
These lists display top-level folders
Folders may have a description

slide-36
SLIDE 36

Menu options – with no selection

slide-37
SLIDE 37

Menu options – with selection

slide-38
SLIDE 38

Sharing form to restrict access

slide-39
SLIDE 39

Sortable listing of files and folders. Different users

slide-40
SLIDE 40

Breadcrumbs show path

slide-41
SLIDE 41

Every folder and file may have a description.

(Formatted text field, a.k.a. Body field in Drupal’s Node)

slide-42
SLIDE 42

Add custom fields such as Comments to FolderShare.

The FolderShare entity is fieldable.

slide-43
SLIDE 43

Menu options change on selection

slide-44
SLIDE 44

View subfolder
Breadcrumbs show path

slide-45
SLIDE 45

Visualizations can be switched interactively to different chart types

Quick visualization of CSV & JSON files

slide-46
SLIDE 46

Sample command line interaction

foldershare --help
foldershare --host http://demo.seedme.org --user dave --password 'cliRocks!' help

slide-47
SLIDE 47

Sample command line interaction

foldershare --help
foldershare --host http://demo.seedme.org --user dave --password 'cliRocks!' help
ls --help
ls /
ls -l "/Classification Collection"
ls -l "/Classification Collection/Preliminary Results"
mkdir --help
mkdir "/test"
mkdir "/test/data"
upload --help
upload "plots/villi.png" "/test"
upload "plots/composite.png" "/test"
upload "sample-small-data/OpenGL mesh memory use.csv" "/test/data"
upload "sample-small-data/Image classification breakdown schedule.json" "/test/data"
download --help
download "/test" "/Users/amit/downloads"
rm --help
rm -rf "/test"

slide-48
SLIDE 48

SeedMe2 – Target users/Use cases

  • Researchers

– Collaboration hub or personal dashboard

  • Project repositories

– Include project specific customization (e.g. taxonomy, keywords)

  • Developers

– Integrate your scientific application

  • Science gateways

– Data sharing – Data publishing – Data escrow service

  • CyberInfrastructure providers

– Offer SeedMe platform to your user base/communities

slide-49
SLIDE 49

Real usage by physicists

slide-50
SLIDE 50

Deployment/service options

DIY - Run own instance (Your branding + your domain)

  • On your own hardware
  • Condo hardware

Provider/vendor runs an instance

  • Your institution
  • CI centers
  • Commercial vendors

As a cloud service (Our branding & our domain)

  • dibbs.seedme.org
slide-51
SLIDE 51

Explorers welcome (web browser needed)

Trial website: sandbox.seedme.org

Try it yourself!

Developers invited

Download code: https://dibbs.seedme.org/downloads

slide-52
SLIDE 52

Team

Amit Chourasia, David Nadeau, Dmitry Mishin & Michael Norman San Diego Supercomputer Center | University of California San Diego

SeedMe

Acknowledgements

All users and application integrators for their valuable feedback
National Science Foundation

This material is based upon work supported by the National Science Foundation under Grant No. ACI-1235505 and ACI-1443083.

"Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation."

slide-53
SLIDE 53

DEMO

slide-54
SLIDE 54

Talk to us

amit sdsc.edu

  • Keen to learn about potential FolderShare use cases in your work
  • Seeking: PHP programmers

Desired: Deep knowledge of Drupal 8 core

  • Single sign-on: Check out http://www.cilogon.org/drupal

Project code: dibbs.seedme.org/downloads or drupal.org/projects/foldershare, drupal.org/projects/smalldata

Trial website: sandbox.seedme.org

Project website: dibbs.seedme.org