Healthcare Transformation from Data and System Perspectives, Beng Chin OOI (www.comp.nus.edu.sg/~ooibc)



slide-1
SLIDE 1

Beng Chin OOI
www.comp.nus.edu.sg/~ooibc

Healthcare Transformation from Data and System Perspectives

1

slide-2
SLIDE 2

Contents

  • Healthcare Problems
  • Challenges
  • Our Healthcare Data Analytics Stack
  • GEMINI
      • Cleaning, De-biasing, Regularizing
  • ForkBase
      • Storage Engine for Collaborative Analytics and Forkable Applications
  • Foodlg / Foodhealth
      • Pre-diabetes app
  • MediLOT
      • A blockchain solution
  • Conclusions

2

slide-3
SLIDE 3

3

The Ministry of Health (MOH) Office for Healthcare Transformation (MOHT), formed in 2018, aims to shape the future of healthcare in Singapore. This is done by identifying, developing and experimenting with game-changing systems-level concepts and innovations in the key areas of health promotion, illness prevention and the delivery of care.

AI in Health Grand Challenge (ongoing large grant call by AI.SG: 3 x 5 mil in the first phase and 1 x 20 mil in the second phase)

“How can Artificial Intelligence (AI) help primary care teams stop or slow disease progression and complication development in 3H – Hyperglycemia (diabetes), Hypertension (high blood pressure) and Hyperlipidemia (high cholesterol) patients by 20% in 5 years?”

slide-4
SLIDE 4

3H Problems: Where/what Can We Contribute?

[Diagram labels]
Risk factors and conditions: Life Style; Hyperglycemia; Hypertension; Hyperlipidemia; Drug Compliance + Pharmacogenomics
Complications: Eye (DME, retinopathy, glaucoma, …); Kidney (AKI, ESRF, …); Cardiac (AMI); Stroke (AF, fall, …); Limb Salvage/amputation
Areas of contribution: Personal Health Coach; Hospital System; Sensors + Cameras; Chatbot + Behavior; Telemedicine; Healthcare Analytics; Primary Care; Secondary Care ++

4

slide-5
SLIDE 5

[Figure: COPD Workflow Version 1.1 (Carehub), Phase 1: rule-based learning (RBL). The diagram shows current and proposed care pathways for infective exacerbation (Inf Ex) of COPD, graded mild, moderate and severe. Stage labels: Pre-disease, Emergency Dept, Ward/ICU, SOC, Discharge, Rehab, Home rehab, Home care, Telehealth, SSW, Follow up, Primary care, Community care. Each pathway is annotated with stage durations in days and a LOS:LOC ratio (LOS:LOC = length of stay : care). Other labels: DISCOVERY AI; Screening enrichment AI tool COPD; RBL tool function; Alerts to Dr; SMS Alerts to patient; Checkpoint 1; Checkpoint 2; Book Home rehab; Learnt patient characteristics, behaviors and outcomes; READMISSION (smoking, Fhx, compliance, etc.); High risk COPD; Step-up care; SMS cascade; Carehub @AH; hand over to GP; Multidisciplinary Teams; Integrated General Hospital @AH.]

5

slide-6
SLIDE 6
  • Increase the accuracy of diagnoses
  • Improve preventive medicine
  • Optimize insurance product costs
  • Better understand the needs for medications
  • Cut costs on healthcare facility management etc

A unified end-to-end engine to integrate all available data sources and provide a holistic view of medical data, from where we support all sorts of medical applications.

Healthcare System/AI’s Objective

This is beyond typical database query processing

6

slide-7
SLIDE 7

The Reality of Exploiting AI

  • The actual implementation of the ML algorithm is usually less than 5% of the lines of code in a real, non-trivial application
  • The main effort (i.e. the other 95% of the LOC) is spent on:
      • Data cleaning & annotation
      • Data extraction, transformation, loading
      • Data integration & pruning
      • Parameter tuning
      • Model training & deployment
      • … …
  • This blurs the line between DB and “non-DB” processing, and calls for better integration

7

These are what we have been doing!

slide-8
SLIDE 8

The BIG Data Analytics Pipeline*

8

Pipeline stages: Acquisition → Extraction/Cleaning/Annotation → Integration → Analytics/Modeling → Interpretation/Visualization

Labels: Big Data; Data Science; Application of AI/ML

*Alexandros Labrinidis, H. V. Jagadish: Challenges and Opportunities with Big Data. PVLDB 5(12): 2032-2033 (2012)

slide-9
SLIDE 9

Challenges

9

slide-10
SLIDE 10

[Diagram labels]
Applications: Readmission; DPM; Radiology App; Prediabetes Prev.; … …
GEMINI Platform
Research
Clinical Needs: Readmission; Disease Progression Modelling (DPM); …
Support

Identifying Common Challenges

10

slide-11
SLIDE 11

…… more

China Healthcare Providers/Hospitals

11

slide-12
SLIDE 12

12

Time-consuming data extraction

  • Different storage formats
  • Unstructured data

Difficult data cleaning

  • Missing data
  • Duplications
  • Different coding standards

Doctors-in-the-loop data annotation (medical expertise)

  • Missing code filling
  • Standardized diagnoses

Bias in observation data

  • Observation data is biased from the actual conditions of the patients

Complexity of medical features

  • Numerous concepts
  • Heterogeneous data
  • Complex relations

Demanding data storage requirements

  • Multi-source and heterogeneous data formats

  • Reuse of datasets
  • Provenance

Challenges

slide-13
SLIDE 13

Challenge 1: Data Preprocessing

13

time-consuming data extraction

different storage formats, unstructured data

difficult and expensive data cleaning

missing data, duplications, different coding standards

medical expertise required for data annotation

standardizing diagnoses, missing code filling

Unstructured Text Data, Diagnoses, Lab Tests, Medications, Procedures, Image Data
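A minimal Python sketch of two of the cleaning steps named on this slide: removing duplicates and mapping different coding standards onto one. The record fields, the `LOCAL_TO_STANDARD` mapping and the function name are assumptions made for illustration; they are not taken from the GEMINI pipeline.

```python
# Hypothetical mapping from site-specific codes to one standard (illustrative only).
LOCAL_TO_STANDARD = {"DM2": "E11", "HTN": "I10"}


def clean_records(records):
    """Normalise codes, drop duplicate rows, and flag rows with a missing code."""
    seen = set()
    cleaned = []
    for rec in records:
        code = rec.get("code")
        # Map local codes to the standard; unknown codes are kept unchanged.
        code = LOCAL_TO_STANDARD.get(code, code)
        key = (rec.get("patient_id"), rec.get("date"), code)
        if key in seen:
            continue  # same patient, date and code: treat as a duplicate
        seen.add(key)
        cleaned.append({
            "patient_id": rec.get("patient_id"),
            "date": rec.get("date"),
            "code": code,
            # Missing codes are left for a clinician to fill in.
            "needs_annotation": code is None,
        })
    return cleaned
```

Rows flagged with `needs_annotation` correspond to the "missing code filling" step, which the slide assigns to people with medical expertise rather than to code.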

slide-14
SLIDE 14

Challenge 2: Bias in EMR Data

14

slide-15
SLIDE 15

Challenge 3: Complex Feature Relations

NUH surgery dataset: 22987 medical features

  • 12319 diagnosis codes
  • 2335 lab test codes
  • 6932 medication names
  • 1401 procedure codes
  • 8 demographic features (BirthYear, Gender etc)

Numerous Concepts: UMLS consists of over 2.97 million concepts and 10+ million terms.

Multi-source and Heterogeneous Data: Medical data consists of diagnoses, lab tests, procedures, etc.

Complex Relations: Complex relations among different sources of medical data

15

slide-16
SLIDE 16

Challenge 4: Dataset Management in Healthcare

  • Dataset Cleansing
      • Track evolution history to ensure correctness
  • Dataset Transformation
      • Save different formats for future reuse
  • Dataset Sharing/redundancy
      • Avoid data redundancy to reduce storage overhead
  • Dataset Security
      • Impose access control to healthcare data

16

slide-17
SLIDE 17

Challenge 5: Data Prior

  • Existing ML algorithms work well for image classification and sequence prediction, but not for healthcare problems
  • Images are not random pixels
      • Neighboring pixels are most correlated --> CNN
      • Color channel prior --> haze removal/super-resolution
  • Sequences are not random numbers/words
      • Latent state at each time point --> RNN, LSTM
  • Prior for healthcare?
      • How to find and formulate?
      • How to create algo/model to utilize them?

17

slide-18
SLIDE 18

Matching Data and Model/Algorithm

  • No Free Lunch Theorem [1997]
  • Checklist for useful AI:
      • Lots of data
      • Flexible models
      • Efficient system and algorithm design
      • Powerful priors that can defeat the curse of dimensionality
  • Opportunities come from utilizing data distribution information
      • Can we learn prior from data? (Domain-specific AutoML)

18

slide-19
SLIDE 19

Development Pipeline

  • Parameterize existing data processing solutions to meet the characteristics of healthcare data

19

  • Data Acquisition: Hospital Data, Genome Data, Medical KB, CT/MRI Images
  • Integration & Augmentation: AE/D Data Cleaning, Collaborate Analytics, KB Data Enrichment, Image Augmentation
  • Understanding & Interpretation: EMR Bias Resolving, EMR Imputation, EMR Embedding, EMR Pattern Mining
  • Application Deployment: Standard Model Pool, Adaptive Regularizer, KB Hashing, Model Bagging & Evaluation

Data at each stage: Extensive Raw Data → Cleaned Data with Rich Semantics → Extracted Effective Feature Sets → Medical Insights

slide-20
SLIDE 20

Enabling Global Optimization

  • SINGA – RAFIKI (MLaaS) -- PANDA mainly for healthcare

20

PANDA Healthcare vs. current AI systems:

  • Aim: defining new AI problems (PANDA Healthcare); optimizing for existing AI problems (current AI systems)
  • Iteration: doctors take part in the development circle (PANDA Healthcare); data scientists as the agent (current AI systems)
  • Key Techs: efficient declarative interaction (PANDA Healthcare); ML model and platform (current AI systems)
  • Domain Knowledge: instilled by doctors (PANDA Healthcare); understood by data scientists (current AI systems)
  • Delivery: explored together with doctors (PANDA Healthcare); plain model outputs (current AI systems)

  • J. Gao, W. Wang, M. Zhang, G. Chen, H.V. Jagadish, G. Li, T.K. Ng, B.C. Ooi, S. Wang, J. Zhou: PANDA: Facilitating Usable AI Development. https://arxiv.org/pdf/1804.09997.pdf 2018.
  • W. Wang, S. Wang, J. Gao, M. Zhang, G. Chen, T.K. Ng, B.C. Ooi, J. Shao: Rafiki: Machine Learning as an Analytics Service System. 2018
slide-21
SLIDE 21

Healthcare Data Analytics Stack

GEMINI (GEneralisable Medical Information aNalysis and Integration platform)

21

  • Z.J. Ling, Q.T. Tran, J. Fan, G.C.H. Koh, T. Nguyen, C.S. Tan, J.W.L. Yip and M. Zhang: GEMINI: An Integrative Healthcare Analytics System. PVLDB 7(13): 1766-1771, 2014.

slide-22
SLIDE 22

AI Implementation at NUH

[Figure: AI implementation at NUH. Labels, in extraction order:
Data sources (CDOC, CCDR): Demographic information; ED notes; Dispensed medication; Visits and encounters; Labtest results; Radiology reports; Procedures; Discharge summaries; Vital signs; Inpatient medications; Inpatient notes; Outpatient notes
Pre-processing filter matrix
H-Cloud
GEMINI: Deep machine learning; Reinforced learning
Production AI Modules: Diagnosis module; Readmissions module; Complications module; Disease progression module; VDO module; Future Extensions
Output: Predicted clinical WARNING]

22

slide-23
SLIDE 23

Example: Readmission Prediction

23

slide-24
SLIDE 24

WARNING

88.6%

Chance of readmission

Ranked Factors:
1. Uncontrolled diabetes H/C 16
2. > 6 medications
3. 72.3% chance of post-op wound infection
4. Past readmissions due to social factors

Acknowledge

Common Alert Platform

24

slide-25
SLIDE 25

GEMINI Platform (2011 - )

[Figure: GEMINI platform stack. Component labels: iDat, DICE, CohAna, CDAS, epiC, SINGA, ForkBase, EMR-T (EMR Transformation), GAM. Function and layer labels: Application (Healthcare), Visualization, Cohort Analysis, Machine/Deep Learning, Crowdsourcing, Data Integration, Big Data Processing, Data Analysis Pipeline, Malleable, Semantic Storage, Infrastructure (CPU-GPU Cluster), Raw Data, EMR.]

25

slide-26
SLIDE 26

26

[Figure: DISCOVERY AI SandBox architecture. Labels, in extraction order:
Database Layer: CDOC|CCDR; Tissue Repository; LSI; CSI; SPH
Pre-Processing layer: MASTER MAPPER; MAPFORCE AUTO
Active layer: ForkBase working storage; expandable storage; LSI database; CSI database; extracted trial data; SPH, CSI, LSI, SDSD, REDCAP, I2B2
Learning layer: Augurium learning database; SPH database; RA learning database; DPM learning database
AUGURIUM models: Readmissions; Disease Progression Model; General complications model; Lab test model]

slide-27
SLIDE 27

Making Healthcare Data Usable

  • J. Dai, M. Zhang, G. Chen, J. Fan, K.Y. Ngiam, B.C. Ooi: Fine-grained Concept Linking using Neural Networks in Healthcare. ACM SIGMOD 2018
  • X. Cai, J. Gao, K. Y. Ngiam, B. C. Ooi, Y. Zhang, and X. Yuan. Medical concept embedding with time-aware attention. IJCAI 2018.

27

slide-28
SLIDE 28

Healthcare Data Usability

If a doctor wants to analyze the medical records related to “chronic kidney disease” …

Round 1
1.1 Query “chronic kidney”
1.2 Returned result set
1.3 Manually curate the results
1.4 Confirmed results

Round 2
2.1 Query “chronic renal failure”, ”ckd”
2.2 Returned result set
2.3 Manually curate the results
2.4 Confirmed results

…

28

slide-29
SLIDE 29

Healthcare Data Usability

  • Two reasons cause the healthcare data usability problem.
  • Different writing styles.

Real-world healthcare data

2 recent cva posterior circulation transient ischaemic infarct multi infarct cva with dementia massive ischemic stroke with hemorrhagic conversion acute stroke infarct 2 rt sided cva with gd recovery 1994 5 r groin hematoma cerebellar stroke acute left pontine cva acute cva left ic laci acute cva left sided weakness basal ganglion infarct

These entries refer to:

Concept code: I63.50
Canonical description: Cerebral infarction due to unspecified occlusion or stenosis of unspecified cerebral artery

29

slide-30
SLIDE 30

Healthcare Data Usability

  • Two reasons cause the healthcare data usability problem.
  • Different writing styles.
  • Different medical standards.

Real-world healthcare data

internal haemorrhoid prolapsed haemorrhoid bleeding ligated 3 degree pile prolapsed haemorrhoid 3rd degree prolasped piles, not thrombosed thrombosed internal haemorrhoid 3rd degree pile x 1 haemorrhoid 3rd degree external hemorrhoids hemorrhoids prolapsing piles haemorrhoids no complication prolapsed and thrombosed haemorrhoid at 4 clock

Standard, concept code, canonical description:

ICD-10-CM K64.2 Third degree hemorrhoids
ICD-9-CM 455.0 Internal hemorrhoids without mention of complication
ICD-9-CM 455.1 Internal thrombosed hemorrhoids
ICD-9-CM 455.2 Internal hemorrhoids with other complication
ICD-9-CM 455.5 External hemorrhoids with other complication
ICD-9-CM 455.6 Unspecified hemorrhoids without mention of complication
ICD-9-CM 455.7 Unspecified thrombosed hemorrhoids
ICD-9-CM 455.8 Unspecified hemorrhoids with other complication

30

slide-31
SLIDE 31

Healthcare Data Usability

  • Two reasons cause the healthcare data usability problem.
  • Different writing styles.
  • Different medical standards.
  • To improve healthcare data usability, we need a linker that can automatically link a medical record to a unified concept ontology.

Concept linker

31
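To make the linking problem concrete, here is a naive string-similarity baseline in Python. It is not the neural concept linking approach of the talk. It only illustrates what "linking a record to a concept" means, and why surface matching breaks down on abbreviations such as "cva" that share almost no characters with the canonical description. The two-entry `ONTOLOGY` and the function name are assumptions for illustration; the codes and descriptions are the ones quoted on the preceding slides.

```python
import difflib

# Tiny illustrative ontology: concept code -> canonical description.
ONTOLOGY = {
    "K64.2": "third degree hemorrhoids",
    "I63.50": ("cerebral infarction due to unspecified occlusion "
               "or stenosis of unspecified cerebral artery"),
}


def link_by_string_similarity(mention, ontology=ONTOLOGY):
    """Return (code, score) for the description most similar to the mention."""
    def score(description):
        matcher = difflib.SequenceMatcher(None, mention.lower(), description)
        return matcher.ratio()

    best_code = max(ontology, key=lambda code: score(ontology[code]))
    return best_code, score(ontology[best_code])


if __name__ == "__main__":
    # A spelling variant links well; an abbreviation-heavy entry gets a low score.
    print(link_by_string_similarity("3rd degree hemorrhoids"))
    print(link_by_string_similarity("acute cva left sided weakness"))
```

A baseline like this handles spelling variants, but it has no notion that "cva" and "cerebral infarction" describe related conditions, which is the gap a learned linker is meant to close.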

slide-32
SLIDE 32

Neural Concept Linking

  • We have developed a neural concept linking framework for healthcare concept linking.

32

slide-33
SLIDE 33

Neural Concept Linking

Concept representations Word representations p(s|c)=0.016

33

slide-34
SLIDE 34

Example Results

34

Input: “chr iron deficiency anemia”
  • NCL: D50.0, iron deficiency anemia secondary to blood loss (chronic)
  • Other linkers: D53.0, protein deficiency anemia

Input: “adenocarcinoma of colon”
  • NCL: C18.9, malignant neoplasm of colon, unspecified
  • Other linkers: K63.5, polyp of colon

We cleaned 13 years of NUHS data: 90% done by machine, 10% done by human

slide-35
SLIDE 35

Resolving “bias”

35

  • K. Zheng, J. Gao, K. Y. Ngiam, B. C. Ooi and W.L.J. Yip: Resolving the Bias in Electronic Medical Records. ACM KDD, 2017.
  • Z. Luo, S. Cai, J. Gao, M. Zhang, K.Y. Ngiam, G. Chen and W. Lee: Adaptive Lightweight Regularization Tool for Complex Analytics. ICDE, 2018.
  • K. Yang, Z. Luo, J. Gao, J. Zhao, B.C. Ooi, B. Xie: Knowledge Driven Regularization. 2019

slide-36
SLIDE 36

Similar Pattern and yet Different Results

  • Patient1 always visits hospital due to respiratory infection
      • Can we conclude that Patient1 has respiratory infection every day?
  • Patient2 always visits hospital due to chronic kidney disease
      • Can we conclude that Patient2 has chronic kidney disease every day?

  • What is the difference?

36

slide-37
SLIDE 37

Bias in EMR Data

  • If a doctor or analyst wants to analyze the EMR data with missing values, they may employ traditional imputation methods directly
  • → Misinterpretation

[Figure: two timelines with observed and missing ("?") time points. Acute kidney failure (AKF): code N17.9 is observed at some time points and the gaps are filled by last observation carried forward. Glomerular filtration rate (GFR): values such as 40 and 20 are observed and the gaps are filled by mean imputation.]

37
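The two traditional imputation methods named on this slide can be sketched in a few lines of Python. The series below reuse the values shown on the slide (code N17.9, GFR 40 and 20); the function names and the use of `None` for a missing time point are assumptions for illustration.

```python
def locf(series):
    """Last observation carried forward; gaps before the first value stay None."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def mean_impute(series):
    """Replace every missing value with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]


if __name__ == "__main__":
    akf = [None, "N17.9", None, None, "N17.9", None]
    gfr = [None, 40, None, 20, None]
    print(locf(akf))         # every later gap now claims acute kidney failure
    print(mean_impute(gfr))  # every gap now claims the average GFR
```

Both outputs look complete, which is the risk the slide points at: LOCF asserts an acute condition at time points when the patient may have recovered, and mean imputation hides the downward GFR trend.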

slide-38
SLIDE 38

Bias in EMR Data

  • Bias – recorded EMR series is different from patients’ actual hidden conditions
  • Patients tend to visit hospital more often when they feel sick
  • Doctors tend to prescribe the lab examinations that show abnormality
  • To Solve Bias Challenge – EMR Regularization
  • Transform the biased EMR series into unbiased EMR series

38

slide-39
SLIDE 39

Resolving Bias in EMR Data

  • Condition Change Rate (CCR)
      • measures how likely a medical feature is to change from its condition in the previous observation
  • Observation Rate (OR)
      • measures the probability that a medical feature is exposed at a time point based on its actual condition at that time point

Time Slice 𝑢 Time Slice 𝑢 + 1 Time Slice 0

39
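A toy simulation can show why an observation rate that depends on the actual condition biases the record. This is not the model from the KDD 2017 paper: it assumes a single binary feature, one constant CCR, and two hypothetical observation rates (`or_abnormal`, `or_normal`), all chosen for illustration.

```python
import random


def simulate(n_steps=20000, ccr=0.1, or_abnormal=0.9, or_normal=0.1, seed=0):
    """Return (true abnormal fraction, abnormal fraction among recorded points)."""
    rng = random.Random(seed)
    abnormal = False
    true_abnormal = 0
    observed = 0
    observed_abnormal = 0
    for _ in range(n_steps):
        # The hidden condition flips with probability CCR at each time slice.
        if rng.random() < ccr:
            abnormal = not abnormal
        true_abnormal += abnormal
        # The feature is recorded with a probability that depends on the condition.
        rate = or_abnormal if abnormal else or_normal
        if rng.random() < rate:
            observed += 1
            observed_abnormal += abnormal
    return true_abnormal / n_steps, observed_abnormal / observed


if __name__ == "__main__":
    true_fraction, recorded_fraction = simulate()
    print(f"abnormal in reality: {true_fraction:.2f}")
    print(f"abnormal in the record: {recorded_fraction:.2f}")
```

Under these assumptions the patient is abnormal about half of the time, yet the recorded series is dominated by abnormal observations, because sick periods are recorded far more often than healthy ones.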

slide-40
SLIDE 40

Resolving Bias in EMR Data

  • Imputation accuracy evaluation
  • Benefits for analytic tasks
  • In-hospital mortality prediction, Diagnosis by category prediction
  • Disease progression modelling

40

slide-41
SLIDE 41

Disease Progression Modeling

[Figure: longitudinal patient matrix. Rows are medical features grouped as Diag, Lab, Med and Proc (labels include Diabetes, Kidney Disease, Blood Pressure, Insulin, Cholesterol, Amputation, HbA1C); columns are time points up to the prediction time point. Also shown: demographics (Age, Race, Gender, Education, …) and severity labels over time.]

[Figure: GFR value (axis 10 to 70) against time (2012-01-01 to 2012-12-26). Panel “Comparably Stable Progression Trajectory”: Patient1 to Patient6. Panel “Deteriorating Progression Trajectory”: Patient1 to Patient3.]

41
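As a rough illustration of the two trajectory types in the figure, the sketch below fits a least-squares slope to a patient's GFR readings and labels the trajectory. This is a deliberately simple stand-in, not the disease progression model used in GEMINI; the threshold of -0.02 GFR units per day and both function names are hypothetical.

```python
def slope_per_day(days, values):
    """Ordinary least-squares slope of values against days."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    denominator = sum((x - mean_x) ** 2 for x in days)
    return numerator / denominator


def label_trajectory(days, gfr, threshold=-0.02):
    """Label a GFR series; the threshold is an illustrative assumption."""
    if slope_per_day(days, gfr) < threshold:
        return "deteriorating"
    return "comparably stable"


if __name__ == "__main__":
    days = [0, 60, 120, 180]  # roughly the 60-day spacing of the figure's axis
    print(label_trajectory(days, [60, 50, 40, 30]))
    print(label_trajectory(days, [55, 56, 54, 55]))
```

A single slope ignores everything else in the longitudinal patient matrix (diagnoses, medications, procedures, demographics), which is what a learned progression model is expected to exploit.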

slide-42
SLIDE 42

Advice to Doctors on Intervention

  • Our model would suggest guaranteed monitoring for Patient 1 → may need dialysis or kidney transplant
  • Our model would suggest that healthcare workers provide more aggressive interventions to Patient 2 in advance
  • Our model would suggest guaranteed monitoring for Patient 3

Powered by GEMINI

Lower is more severe

42

slide-43
SLIDE 43

Facilitating Data Sharing and Provenance

  • S. Wang, T. T. A. Dinh, Q. Lin, Z. Xie, M. Zhang, Q. Cai, G. Chen, B.C. Ooi, P. Ruan: ForkBase: An Efficient Storage Engine for Blockchain and Forkable Applications. VLDB 2018

43

slide-44
SLIDE 44

ForkBase Designs

  • Versioning & Tamper Evidence: Merkle DAG
  • Indexing & Deduplication: SIRI indexes
  • Collaboration Workflows: Fork Semantics

Combines properties of: git (versioning), database (query), blockchain (integrity)

44

slide-45
SLIDE 45

ForkBase Storage Stack

Nodes: Node 𝑩, Node 𝑪

APIs:
  • put(object) → version
  • get(version) → {objects}
  • merge({objects}) → object

Properties:
  • Access Control: branch-based
  • Data Security: integrity
  • Consistency: merge semantics

Stack layers:
  • Applications: Documents Hosting, Git, Collaborative Dataset Mgmt, Blockchain
  • Semantic Views (application-oriented)
  • Data Access APIs (data types, fork semantics)
  • Branch Representation (versioning, tamper evidence)
  • Chunk Storage (deduplication, immutability)

45
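The put/get/merge interface above can be mimicked with a small content-addressed store. The Python sketch below is a toy under stated assumptions, not ForkBase's API: the class name, the JSON encoding, the `parents` field and the "second version wins" merge rule are all invented for illustration.

```python
import hashlib
import json


class ToyVersionedStore:
    """Content-addressed store: a version id is the hash of the stored payload."""

    def __init__(self):
        self._objects = {}

    def put(self, obj, parents=()):
        payload = json.dumps({"value": obj, "parents": list(parents)},
                             sort_keys=True)
        version = hashlib.sha256(payload.encode()).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self._objects[version] = payload
        return version

    def get(self, version):
        payload = self._objects[version]
        # Tamper evidence: the stored bytes must still hash to the version id.
        if hashlib.sha256(payload.encode()).hexdigest() != version:
            raise ValueError("object does not match its version id")
        return json.loads(payload)["value"]

    def merge(self, version_a, version_b):
        # Hypothetical merge rule: dictionary union, second version wins on conflict.
        merged = {**self.get(version_a), **self.get(version_b)}
        return self.put(merged, parents=sorted([version_a, version_b]))
```

Because the version id is derived from the content, deduplication and integrity checking fall out of the same mechanism, which mirrors the "deduplication, immutability" and "versioning, tamper evidence" labels on the storage layers.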

slide-46
SLIDE 46

SIRI Indexes & POS-tree

  • An Index Class: Structurally-Invariant Reusable Indexes
  • Structurally Invariant, Recursively Identical, Universally Reusable …
  • An Implementation: Pattern-Oriented-Split Tree

[Figure: POS-tree with a hashed root (“Root with Hash”). Legend: Index Node = {‹split-key, H({elements})›}; Data Node = {elements}; M = Node Meta; Node Pattern.]

  • Content-determined Structure (-> Deduplication)
  • Native Merkle Tree (-> Tamper Evidence)
  • Probabilistically Balanced Tree (-> Query Efficiency)

46

slide-47
SLIDE 47

Blockchain Data Model in ForkBase

  • KV Store
      • Customized structures
          • Linked block
          • State Merkle tree
          • State delta
      • Hard to implement
  • ForkBase
      • Achieve with built-in types
          • UBlob
          • UMap
      • Easy to maintain
          • 10+ lines for each structure

[Figure: blockchain data model on ForkBase compared with a conventional blockchain internal structure. ForkBase side: blocks (FID, Txns, prev_hash) stored as Blobs; a Map keyed by Smart Contract ID pointing to a Map from Data Key to Data Version; data stored as Blobs. Conventional side: each Block holds State Hash, Txns and prev_hash, with a State Delta and a State Merkle Tree over a RocksDB KV store (Contract ID, Key, Value).]

47
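The “linked block” structure with `prev_hash` can be sketched in Python as below. The field names follow the labels in the figure, while the JSON encoding and the function names are assumptions for illustration; this is neither ForkBase code nor any particular blockchain's block format.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's canonical JSON encoding."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


def append_block(chain, txns, state_hash):
    """Append a block that commits to its predecessor through prev_hash."""
    prev_hash = block_hash(chain[-1]) if chain else None
    chain.append({"prev_hash": prev_hash, "txns": txns, "state_hash": state_hash})


def verify(chain):
    """Check that every block still points at the hash of the block before it."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))


if __name__ == "__main__":
    chain = []
    append_block(chain, ["grant consent"], "s1")
    append_block(chain, ["read record"], "s2")
    append_block(chain, ["revoke consent"], "s3")
    print(verify(chain))
    chain[0]["txns"] = ["forged"]
    print(verify(chain))
```

Editing an early block changes its hash and breaks the link held by the next block. Detecting a change to the newest block additionally requires keeping its hash somewhere outside the chain, which this sketch does not do.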

slide-48
SLIDE 48

Analytic-Ready Blockchain Backend

  • Analytics on a blockchain is expensive
      • Need to scan the whole block history to extract information
  • Built-in data types in ForkBase support fast analytics

State Scan Query Block Scan Query

48

slide-49
SLIDE 49

Prevention is Better Than Cure

  • L. Long, W. Wang, J. Wen, M. Zhang, Q. Lin, B.C. Ooi: Object-Level Representation Learning for Few-Shot Image Classification. arXiv preprint arXiv:1805.10777. 2018

49

slide-50
SLIDE 50

Lifestyle InterVENtion Programme (LIVEN)

50

The effect of a behaviour-based lifestyle change program using combined face and remote sessions on weight, diet intake and physical activity level in people at-risk of diabetes: a Randomised Controlled Trial

Diabetes Prevention Programme

US

UK

Remote Sessions Face to Face Sessions

slide-51
SLIDE 51

51

Effecting Behavioral Change

  • Self-monitoring with pre-set goals and intuitive nutrition information
  • Peer-to-peer monitoring of dietary and physical activity goals
  • Daily and weekly reports of progress
  • Remote monitoring by healthcare professionals for timely and meaningful feedback
  • Quick and easy way to record dietary intake
  • A deep learning image-based food recognition for a faster, closest food match and handy recording

Snap Track Feedback

slide-52
SLIDE 52

Diabetes Prevention

Technologies: Image Recognition, Knowledge Base, Healthcare Analytics, Social Network
User actions: Scan, Diary, Review, Share

Activity Plan Recommendation

Healthy Diet + Exercise

52

slide-53
SLIDE 53

Administrator/Dietician Portal

  • Dietary Review + Chat
  • Review user’s weekly meal (photo) history

Realtime Chat with Dietician provides instant feedback to users

53

slide-54
SLIDE 54

Foodhealth/Foodlg

STEP 1: Collect training images from heterogeneous sources and label them via crowdsourcing
STEP 2: Train deep learning models for food recognition
STEP 3: Food recognition and health analysis using images and other information from the Foodlg app

Phases: Off-line, On-line

54

slide-55
SLIDE 55

Personalizing and Decentralizing Healthcare

55

slide-56
SLIDE 56

AI + BlockChain + Cloud + big Data

56

BigData/ DBMS Objectives:

  • 1. Transparency
  • 2. Accountability
  • 3. Auditability
  • 4. Governance
  • 5. Security
  • 6. …

Analytics/ DataScience

slide-57
SLIDE 57

BlockChain enabled Healthcare

  • BlockChain (BC) acts as a tamper-evident storage for archiving Healthcare Records from different healthcare providers
  • BlockChain acts as a “Central Healthcare Record Repository”
  • It enables Data Provenance, Data Analytics, and Medical-care everywhere based on patient’s preference
  • It may help transform Healthcare management and research

57

slide-58
SLIDE 58
The MediLOT Solution

  • 1. Holistic: Every patient will have a complete longitudinal health record: their own health story that they can access at any institution
  • 2. Patient-centric: The patient holds his/her own private key and has fine control over who can view their medical records
  • 3. Personalised: Using an advanced analytics overlay (GEMINI), MediLOT facilitates personalised treatment strategies
  • 4. Decentralised: Patients’ data is stored in different locations, eliminating the risk of a single catastrophic breach

slide-59
SLIDE 59

Dual BlockChain Schema

Participants: Hospital, Patient, Data Requestor

  • Permissioned (Hyperledger++): Responsible for aggregation of patient EHR
  • Public (Ethereum): Allows for transfer and crediting of ERC20 LOT tokens (MediLOT utility token)

Smart contracts: ERC20 Token Contract, Registry Contract, Consent Contract

Who will Pay?

slide-60
SLIDE 60

On-Chain Scalability

  • Consensus Layer (PBFT, PoW, PoS, etc.)
  • Smart Contract Execution Engine (Virtual Machine, Docker, etc.)
  • Data Model Layer (LevelDB, RocksDB, etc.)

60

  • T. T. A. Dinh, J. Wang, G. Chen, R. Liu, B. C. Ooi, K.-L. Tan: BLOCKBENCH: A Framework for Analysing Private Blockchains. ACM SIGMOD 2017
  • T. T. A. Dinh, R. Liu, M. Zhang, G. Chen, B.C. Ooi, J. Wang: Untangling Blockchain: A Data Processing View of Blockchain Systems. IEEE TKDE, 2018.
slide-61
SLIDE 61

MediLOT’s Technologies

Dual Blockchain: Ethereum & Hyperledger++

  • Enhanced Hyperledger with scalable consensus and sharding
  • Throughput up by 15x

Analytics: GEMINI

The underlying healthcare suite that supports big data analytics and personalised medicine

Data Storage: ForkBase

Proprietary storage with rich semantics, immutability and data sharing; a blockchain-optimised native storage system

61

slide-62
SLIDE 62

Conclusions

  • Healthcare is a complex but impactful/meaningful application
      • Domain Knowledge
      • Verification and Validation – a tedious process
  • A good (example) application that calls for better integration of AI/ML and Database technologies, and possibly Blockchain technologies
  • We have addressed some of the challenges, and have implemented:
      • GEMINI (DICE, CDAS, epiC, Apache SINGA, ForkBase) is being used by 2 major hospitals in Singapore
      • Foodhealth (foodlg) is used by 3 hospitals in Singapore
      • MediLOT is in testnet phase and used by hospitals in China
  • Objectives:
      • To predict, prevent/pre-empt, personalize for more effective healthcare
      • Be Good. If you can’t, be Safe. Live well …

62

Minority Report In Healthcare?

slide-63
SLIDE 63

Acknowledgements

  • Collaborators: Gang Chen, H.V. Jagadish, Kee Yuan Ngiam, James Yip++
  • Collaborators (ex-students): Meihui Zhang, Wei Wang, Jinyang Gao, Chang Yao
  • Visitors: Divy Agrawal, H.V. Jagadish, Dave Maier, Renée Miller, Tamer Özsu, Amit Sheth, Wang-Chien Lee, Wang-Chew Tan, Ju Fan, ++
  • Current set of 6-10-10 bosses: Zhaojing Luo, Kaiping Zheng, Jian Dai, Sheng Wang, Shaofeng Cai, Lei Zhu, Qian Lin, Pingcheng Ruan, Qingchao Cai, Anh Dinh, Zhongle Xie, Piaopiao Feng ++

  • Ex-Research Fellows and RAs/Engineers/Students: ….

63

slide-64
SLIDE 64

Healthcare AI Success Factors

Six factors (numbered 01 to 06 on the slide):

  • Clinical problems and clinician drivers
  • Data, data, data
  • Data scientists
  • Scalable, secure hardware
  • Clinical trials and Clinicians
  • Deployment Platforms/Productisation

Foundational factors:

  • Funding
  • Ethics
  • Trusted custodian
  • Central governance
  • Freedom to innovate and implement

64

slide-65
SLIDE 65

Thanks!

65