A Multi-Tenancy Cloud-Native Digital Library Platform Yinlin Chen, - - PowerPoint PPT Presentation



slide-1
SLIDE 1

A Multi-Tenancy Cloud-Native Digital Library Platform

Yinlin Chen, Jim Tuttle, William A. Ingram

{ylchen, jim.tuttle, waingram}@vt.edu

Information Technologies and Services Virginia Tech Libraries

slide-2
SLIDE 2

Agenda

  • Cloud-native concept
  • Virginia Tech Digital Library Platform (VTDLP)
  • Design strategy
  • Architecture overview
  • Implementation overview
  • VTL experiences
slide-3
SLIDE 3

Cloud-native Concept

  • Entire infrastructure is deployed in the Cloud (AWS)
  • Platform is composed of a suite of microservices and managed services

  • Focus on the business logic and workflow
  • Utilize the advantages provided by the Cloud
slide-4
SLIDE 4

Virginia Tech Digital Library Platform (VTDLP)

  • New services to Digital Library Platform

– ID Minting Service, Access Service, Metadata Service, …

  • Migrating legacy services to Digital Library Platform

– IAWA, VTechWorks, …

[Diagram labels: Preservation, Data Modeling, Presentation]

slide-5
SLIDE 5

VTDLP Overview

[Diagram: VTDLP overview]

Labels: Preservation staging, SW Virginia, IAWA, Images, VTechWorks, ETDs, Storage, APTrust, Amazon S3, Presentation, Others, BeyondVT

Services: Batch Metadata Service, Metadata Service, ID Minting Service, Resolution Service, Serialization Service, Other Services

slide-6
SLIDE 6

Design Strategy

  • Cloud native (AWS ecosystem)
  • Microservice/SOA (AWS Lambda)
  • Serverless (AWS managed services)
  • CI/CD Pipeline
  • Caching as much as possible

– Static files
– Lambda functions

  • Automation as much as possible

– Infrastructure as code
– No manual provisioning or managing servers

slide-7
SLIDE 7

AWS Ecosystem

Services: Amazon S3, Amazon Glacier, AWS Lambda, Amazon DynamoDB, Amazon CloudFront, Amazon Route 53, Amazon CloudWatch, AWS CloudTrail, AWS CloudFormation, IAM, Amazon API Gateway, Amazon SQS, Amazon SNS, AWS CLI, AWS Organizations, Amazon Pinpoint, Amazon Cognito, Amazon EC2, Amazon ES, AWS Certificate Manager, AWS Amplify

Categories: Network & Content Delivery, Compute & Database, Management, Messaging, Security & Identity, Storage Services

slide-8
SLIDE 8

Software stacks

[Diagram labels: React, AWS Amplify, Node.js, Python, Web App, Microservice (AWS Lambda), AWS AppSync]

slide-9
SLIDE 9

Preservation Pipeline

[Diagram labels: Checksums, Fixity, Virus Scan, AWS S3, APTrust, PREMIS, Apache Airflow]

slide-10
SLIDE 10

Lambda Example – Metadata file

1. A file is uploaded to S3
2. S3 triggers a Lambda function
3. The Lambda function parses the file content and inserts or updates a record in DynamoDB
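The three steps above can be sketched as a Lambda handler. This is a minimal illustration under stated assumptions, not the platform's actual code: the slides do not say what format the metadata file uses, so the sketch assumes a CSV file with an `identifier` column, and the table name `metadata` is hypothetical. The S3 and DynamoDB clients are passed in so the parsing logic can be exercised without AWS access.

```python
import csv
import io
from urllib.parse import unquote_plus


def records_from_csv(text):
    """Parse CSV metadata text into dicts, skipping rows without an identifier."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader if row.get("identifier")]


def handle_s3_event(event, s3_client, table):
    """Read each object named in an S3 event notification and store its rows."""
    written = 0
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
        for item in records_from_csv(body.decode("utf-8")):
            # put_item creates the record or replaces one with the same key.
            table.put_item(Item=item)
            written += 1
    return written


def lambda_handler(event, context):
    import boto3  # imported here so the functions above need no AWS SDK

    s3_client = boto3.client("s3")
    table = boto3.resource("dynamodb").Table("metadata")  # hypothetical name
    return {"written": handle_s3_event(event, s3_client, table)}
```

Because `put_item` overwrites an existing item with the same primary key, one call covers both the insert and the update case named in step 3.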

slide-11
SLIDE 11

Lambda Example – DynamoDB / ES

1. Data modifications in DynamoDB trigger a Lambda function
2. The Lambda function captures the changes and updates Amazon ES
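A minimal sketch of the change-capture step follows. It assumes the table has a DynamoDB stream enabled with a view type that includes new images, and that the key attribute is called `identifier`; the index name `metadata` and the shape of the returned actions are hypothetical. The sketch only translates stream records into index and delete actions; sending them to Amazon ES is left to whichever client the deployment uses.

```python
def from_attr(value):
    """Convert one DynamoDB attribute value ({"S": "x"}, {"N": "1"}, ...) to plain Python."""
    tag, inner = next(iter(value.items()))
    if tag == "S":
        return inner
    if tag == "N":
        try:
            return int(inner)
        except ValueError:
            return float(inner)
    if tag == "BOOL":
        return inner
    if tag == "NULL":
        return None
    if tag == "L":
        return [from_attr(v) for v in inner]
    if tag == "M":
        return {k: from_attr(v) for k, v in inner.items()}
    if tag == "SS":
        return list(inner)
    raise ValueError(f"unsupported attribute type: {tag}")


def stream_to_actions(event, index="metadata", key_attr="identifier"):
    """Turn DynamoDB stream records into a list of index/delete actions."""
    actions = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        doc_id = from_attr(data["Keys"][key_attr])
        if record["eventName"] == "REMOVE":
            actions.append({"op": "delete", "index": index, "id": doc_id})
        else:
            # INSERT and MODIFY both carry the full new item in NewImage.
            doc = {k: from_attr(v) for k, v in data["NewImage"].items()}
            actions.append({"op": "index", "index": index, "id": doc_id, "doc": doc})
    return actions
```

Indexing the whole new image on every insert or modify keeps the search index a straightforward mirror of the table.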

slide-12
SLIDE 12

Presentation - Multi-Tenant Architecture

[Diagram labels: DB, Application Hub, Search, App1, App2, …, AppN]

slide-13
SLIDE 13

Web App

[Diagram: web app components inside the AWS Cloud]

Amazon Route 53, Amazon API Gateway, Amazon CloudFront, AWS Certificate Manager, Amazon Cognito, Amazon S3, AWS Lambda, Amazon DynamoDB, Amazon Elasticsearch Service

slide-14
SLIDE 14

The International Archive of Women in Architecture

  • An IIIF Level 0 compliant image server using Amazon S3 and Amazon CloudFront
  • Image tiles, manifest JSON files, etc.
  • Terabytes of scanned images to be processed
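A Level 0 image server can be nothing more than pre-generated files, which is why S3 and CloudFront are sufficient. The sketch below enumerates the tile paths such a static layout would need for one image. It assumes IIIF Image API 2.x style URLs (`{region}/{size}/{rotation}/{quality}.{format}` with the `w,` size form); the slides do not state the API version, tile size, or scale factors, so those parameters are illustrative only.

```python
import math


def tile_paths(width, height, tile_size=512, scale_factors=(1, 2, 4)):
    """List the relative paths of static tiles for an image of the given size."""
    paths = []
    for scale in scale_factors:
        span = tile_size * scale  # full-resolution pixels covered by one tile
        for y in range(0, height, span):
            for x in range(0, width, span):
                w = min(span, width - x)
                h = min(span, height - y)
                # A region covering the whole image is written as "full".
                if x == 0 and y == 0 and w == width and h == height:
                    region = "full"
                else:
                    region = f"{x},{y},{w},{h}"
                out_w = math.ceil(w / scale)  # tile width after downscaling
                paths.append(f"{region}/{out_w},/0/default.jpg")
    return paths
```

Each path would be stored as an object under the image's identifier prefix, so a viewer's tile request resolves directly to a file.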
slide-15
SLIDE 15

Image processing workflow

[Diagram labels: AWS Lambda, Amazon S3, Raw images, AWS Batch, Batch Job – image set 1, 2, 3, …, N, Tiles & Manifest, Amazon CloudWatch Rule, Amazon Elastic File System, Amazon EC2]

slide-16
SLIDE 16

Batch job - IIIF_S3 Docker

  • Command
  • Parameters
  • Environment variables
  • vCPUs
  • Memory

[Diagram labels: Amazon S3, IIIF, Tiles & Manifest, Amazon Elastic File System, AWS Batch]
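The job settings listed above (command, parameters, environment variables, vCPUs, memory) map onto the arguments of an AWS Batch job submission. The following sketch builds such a request as a plain dictionary. The function name, the `iiif-tiles-` job name prefix, and all values in the usage example are hypothetical; only the request field names follow the AWS Batch `SubmitJob` API.

```python
def build_job_request(image_set, job_queue, job_definition, command,
                      environment, vcpus, memory_mib, parameters=None):
    """Build keyword arguments for one AWS Batch job processing one image set."""
    return {
        "jobName": f"iiif-tiles-{image_set}",  # hypothetical naming scheme
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "parameters": dict(parameters or {}),
        "containerOverrides": {
            "command": list(command),
            "environment": [
                {"name": name, "value": value}
                for name, value in sorted(environment.items())
            ],
            # Resource values are passed as strings in this API.
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
        },
    }
```

With the AWS SDK available, the result could be submitted with `boto3.client("batch").submit_job(**request)`, one call per image set.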

slide-17
SLIDE 17

CI/CD with AWS

Developers

[Diagram: a seven-step flow from developers through AWS CodePipeline, AWS CodeBuild, AWS CloudFormation, Amazon S3, AWS Lambda, and Amazon API Gateway]

slide-18
SLIDE 18

Cloud benefit - Backup examples

  • S3

– Amazon S3 is designed for 99.999999999% durability and 99.99% availability.
– For 10,000 stored objects, expect on average to lose a single object once every 10 million years or so.
– Cross-region replication

  • DynamoDB

– Point-in-time recovery (last 35 days)
– On-demand backup (stored in S3)

  • Elasticsearch

– Daily snapshots (last 14 days)
– On-demand backup (stored in S3)
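The S3 durability figure and the "10,000 objects, 10 million years" statement are the same number expressed two ways. The short calculation below shows the arithmetic, assuming the durability percentage is an annual, per-object figure; the function name is illustrative.

```python
from fractions import Fraction


def years_per_lost_object(durability: Fraction, objects: int) -> Fraction:
    """Average number of years between single-object losses."""
    annual_loss_per_object = 1 - durability
    expected_losses_per_year = objects * annual_loss_per_object
    return 1 / expected_losses_per_year


# 99.999999999% written as an exact fraction (eleven nines).
ELEVEN_NINES = Fraction(99_999_999_999, 100_000_000_000)

print(years_per_lost_object(ELEVEN_NINES, 10_000))  # prints 10000000
```

Exact fractions are used instead of floats so the eleven-nines value is not rounded away.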

slide-19
SLIDE 19

VTL Experiences

  • Entire development team is AWS certified
  • One AWS Certification Subject Matter Expert (SME)
  • AWS training and conferences
  • Thinking and implementing new ideas the Cloud way
slide-20
SLIDE 20

Q & A

Thank You!