Serverless Beacon: Helping take genomic analysis from the cloud to the clinic
HEALTH AND BIOSECURITY
Brendan Hosking
October 2019, HISA Data Analytics 2019
Genomic data discovery: Beacons
Used by
Variant Call File
Slide labels: Records, Samples, Beacon Dataset

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT HG00096 HG00097 HG00099 HG00100 HG00101 HG00102 HG00103 HG00105 HG00106 HG00107
1 15820 rs2691315 G T 100 PASS AC=6;AN=20;VT=SNP;EX_TARGET GT 1|0 0|1 0|1 0|0 0|0 1|0 1|0 0|0 0|0 0|1
1 15903 rs557514207 G GC 100 PASS AC=8;AN=20;VT=INDEL;EX_TARGET GT 0|1 0|1 0|0 1|0 0|1 0|0 0|1 0|1 0|1 0|1
1 69761 rs200505207 A T 100 PASS AC=2;AN=20;VT=SNP;EX_TARGET GT 0|0 0|0 0|0 0|0 1|1 0|0 0|0 0|0 0|0 0|0
1 69897 rs200676709 T C 100 PASS AC=17;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 0|0 1|1 1|1 1|0 1|1 1|1 1|1 1|1
1 876499 rs4372192 A G 100 PASS AC=20;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 877831 rs6672356 T C 100 PASS AC=20;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 878314 rs142558220 G C 100 PASS AC=2;AN=20;VT=SNP;EX_TARGET GT 0|0 0|0 1|1 0|0 0|0 0|0 0|0 0|0 0|0 0|0
1 881070 rs41285794 G A 100 PASS AC=1;AN=20;VT=SNP;EX_TARGET GT 0|0 0|0 0|0 1|0 0|0 0|0 0|0 0|0 0|0 0|0
1 881627 rs2272757 G A 100 PASS AC=13;AN=20;VT=SNP;EX_TARGET GT 0|0 0|0 1|1 1|1 1|0 1|1 1|1 1|0 1|0 1|1
1 881918 rs35471880 G A 100 PASS AC=2;AN=20;VT=SNP;EX_TARGET GT 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 1|1
1 887560 rs3748595 A C 100 PASS AC=20;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 887801 rs3828047 A G 100 PASS AC=20;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 888639 rs3748596 T C 100 PASS AC=20;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 888659 rs3748597 T C 100 PASS AC=20;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 889158 rs13303056 G C 100 PASS AC=20;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 889159 rs13302945 A C 100 PASS AC=20;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 894573 rs13303010 G A 100 PASS AC=18;AN=20;VT=SNP;EX_TARGET GT 1|0 1|1 1|1 1|1 1|1 1|1 1|1 1|0 1|1 1|1
1 897216 rs186126206 C T 100 PASS AC=1;AN=20;VT=SNP;EX_TARGET GT 0|0 0|0 0|0 0|0 1|0 0|0 0|0 0|0 0|0 0|0
1 897325 rs4970441 G C 100 PASS AC=20;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 899928 rs6677386 G C 100 PASS AC=20;AN=20;VT=SNP;EX_TARGET GT 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1 1|1
1 1564952 rs535125876;rs112177324 TG TGG,T 100 PASS AC=0,8;AN=20;VT=INDEL;MULTI_ALLELIC;EX_TARGET GT 2|2 0|0 0|0 2|0 0|2 2|0 0|2 2|0 2|0 0|0
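To make the layout of the Variant Call File concrete, here is a minimal Python sketch that reads one record from the excerpt and tallies the alleles across its sample columns. The helper names `parse_record` and `allele_counts` are hypothetical and not part of Serverless Beacon. Real VCF files are tab-delimited, and splitting on any whitespace as done here handles both tabs and the spaces used on the slide.

```python
from collections import Counter

def parse_record(line, sample_names):
    """Split one VCF data line into named fields (hypothetical helper)."""
    fields = line.split()
    chrom, pos, vid, ref, alt, qual, flt, info, fmt = fields[:9]
    info_map = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        info_map[key] = value  # flags such as EX_TARGET get an empty value
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": vid,
        "ref": ref,
        "alts": alt.split(","),
        "info": info_map,
        # one genotype string per sample column, e.g. "1|0"
        "genotypes": dict(zip(sample_names, fields[9:])),
    }

def allele_counts(record):
    """Count how often each allele index (0 = REF, 1 = first ALT, ...) is called."""
    counts = Counter()
    for genotype in record["genotypes"].values():
        for allele in genotype.replace("/", "|").split("|"):
            if allele.isdigit():  # skip missing calls written as "."
                counts[int(allele)] += 1
    return counts

samples = ["HG00096", "HG00097", "HG00099", "HG00100", "HG00101",
           "HG00102", "HG00103", "HG00105", "HG00106", "HG00107"]
line = ("1 15820 rs2691315 G T 100 PASS AC=6;AN=20;VT=SNP;EX_TARGET GT "
        "1|0 0|1 0|1 0|0 0|0 1|0 1|0 0|0 0|0 0|1")

record = parse_record(line, samples)
counts = allele_counts(record)
print(record["id"], dict(counts))
```

For this record the tally of allele 1 agrees with the AC value in the INFO column, and the total number of called alleles agrees with AN.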
1. POST request to create/update a dataset. The request includes VCF locations and dataset metadata.
2. Invoke the submitDataset lambda function.
3. Validate and insert the dataset metadata into Datasets DynamoDB.
4. If a change was made to the VCFs in a dataset, publish the dataset to summariseDataset SNS.
5. Read the dataset id from summariseDataset SNS.
6. Read the VCF locations from Datasets DynamoDB.
7. Check VcfSummaries DynamoDB to see if all the VCFs have been summarised.
8. If any VCF is missing call, variant or sample count information, publish that VCF to summariseVCF SNS.
9. Read the VCF location from summariseVCF SNS.
10. Attempt to enter the slices of the VCF in the VCF location item in VcfSummaries DynamoDB.
11. If there are already values in the toUpdate attribute for that item, abort.
12. Read the number of samples from the VCF location.
13. Enter the sample count for the VCF location in VcfSummaries DynamoDB.
14. Publish each region slice to summariseSlice SNS.
15. Read the region and VCF location from summariseSlice SNS.
16. Parse the region in the VCF and count the total number of variants and calls.
17. Remove the region and add its counts to the VCF item in VcfSummaries DynamoDB.
18. Record the slices that remain to be updated.
19. If there are no more slices to update, get all datasets that use that VCF from Datasets DynamoDB.
20. For each dataset found, publish that dataset to the summariseDataset SNS.
21. Read the dataset id from summariseDataset SNS.
22. Read the VCF locations from Datasets DynamoDB.
23. Read the VCF summaries from VcfSummaries DynamoDB.
24. If all the VCFs have been summarised, aggregate the counts and enter them in Datasets DynamoDB.
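The slice bookkeeping in steps 16 to 19 can be sketched in plain Python, with an in-memory dictionary standing in for the VCF item in VcfSummaries. This is a sketch under stated assumptions, not the project's implementation: the function names, the `variantCount` and `callCount` keys and the region tuples are hypothetical (only `toUpdate` is named in the steps), and it assumes that a variant is one listed ALT allele and that the calls for a record are given by its AN value.

```python
def summarise_slice(vcf_lines, chrom, start, end):
    """Count variants and calls for the records that fall inside one region."""
    variants = 0
    calls = 0
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # header line
        fields = line.split()
        if fields[0] != chrom or not (start <= int(fields[1]) <= end):
            continue
        # Assumption: each comma-separated ALT allele counts as one variant.
        variants += len(fields[4].split(","))
        # Assumption: the number of calls for a record is its AN value.
        info = dict(item.partition("=")[::2] for item in fields[7].split(";"))
        calls += int(info["AN"])
    return variants, calls

def record_slice(item, region, variants, calls):
    """Remove a finished region from toUpdate and add its counts to the item.

    Returns True once no slices remain to be updated.
    """
    item["toUpdate"].discard(region)
    item["variantCount"] += variants
    item["callCount"] += calls
    return not item["toUpdate"]

# Records taken from the excerpt; sample columns are cut to one for brevity.
vcf_lines = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT HG00096",
    "1 15820 rs2691315 G T 100 PASS AC=6;AN=20;VT=SNP;EX_TARGET GT 1|0",
    "1 15903 rs557514207 G GC 100 PASS AC=8;AN=20;VT=INDEL;EX_TARGET GT 0|1",
    "1 1564952 rs535125876;rs112177324 TG TGG,T 100 PASS "
    "AC=0,8;AN=20;VT=INDEL;MULTI_ALLELIC;EX_TARGET GT 2|2",
]
slices = [("1", 1, 1000000), ("1", 1000001, 2000000)]
item = {"toUpdate": set(slices), "variantCount": 0, "callCount": 0}

finished = []
for region in slices:
    variants, calls = summarise_slice(vcf_lines, *region)
    finished.append(record_slice(item, region, variants, calls))
print(item, finished)
```

In the deployed system each slice would be handled by a separate lambda invocation, so the removal from `toUpdate` and the count update would need to be a single atomic write to the table. The sequential loop here only illustrates the arithmetic.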
1. GET request for a summary of the available datasets.
2. Invoke the getInfo lambda function.
3. Read the Datasets DynamoDB to get the summary of each dataset.
4. Return the dataset summary information to the API Gateway.
5. Return the summary of each dataset to the client.
6. GET request for information about variants in a particular region, perhaps on a subset of datasets.
7. Invoke the queryDatasets lambda function.
8. Collect metadata as well as the VCF location for each dataset from Datasets DynamoDB.
9. Invoke splitQuery for each dataset, with the desired region and variant type.
... each combination.
... queryDatasets.
... information, to API Gateway.
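The core of the query path, asking each dataset whether a given allele is observed in a region, can be sketched in plain Python. This is an illustration only: `vcf_has_allele` and `query_datasets` are hypothetical names, the datasets are in-memory lists of VCF lines rather than files looked up in DynamoDB, the query is expressed as a REF/ALT pair rather than a variant type, and the fan-out across lambdas described in the steps is replaced by a simple loop. An allele is treated as present only when its AC value is above zero, which is an assumption about the desired behaviour.

```python
def vcf_has_allele(vcf_lines, chrom, start, end, ref, alt):
    """Return True if the REF/ALT pair is observed inside the region."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # header line
        fields = line.split()
        if fields[0] != chrom or not (start <= int(fields[1]) <= end):
            continue
        alts = fields[4].split(",")
        if fields[3] != ref or alt not in alts:
            continue
        info = dict(item.partition("=")[::2] for item in fields[7].split(";"))
        allele_counts = [int(count) for count in info["AC"].split(",")]
        # Assumption: an allele listed with AC=0 is not reported as present.
        if allele_counts[alts.index(alt)] > 0:
            return True
    return False

def query_datasets(datasets, chrom, start, end, ref, alt):
    """Collect one yes/no answer per dataset, each dataset being a list of VCFs."""
    return {
        dataset_id: any(
            vcf_has_allele(vcf, chrom, start, end, ref, alt) for vcf in vcfs
        )
        for dataset_id, vcfs in datasets.items()
    }

# Records taken from the excerpt; sample columns are cut to one for brevity.
datasets = {
    "ds1": [[
        "1 15820 rs2691315 G T 100 PASS AC=6;AN=20;VT=SNP;EX_TARGET GT 1|0",
    ]],
    "ds2": [[
        "1 1564952 rs535125876;rs112177324 TG TGG,T 100 PASS "
        "AC=0,8;AN=20;VT=INDEL;MULTI_ALLELIC;EX_TARGET GT 2|2",
    ]],
}

snp_answer = query_datasets(datasets, "1", 15000, 16000, "G", "T")
deletion_answer = query_datasets(datasets, "1", 1500000, 1600000, "TG", "T")
insertion_answer = query_datasets(datasets, "1", 1500000, 1600000, "TG", "TGG")
print(snp_answer, deletion_answer, insertion_answer)
```

The third query shows why the AC check matters for the multi-allelic record: TGG is listed as an ALT allele but has AC=0 in these samples, so neither dataset reports it.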
IM&T Administered
User Administered
Project Infrastructure Definition Continuous Deployment Secure Account Storage Deployment Mechanics
rapid prototyping and scalability.
future-ready.
practice: let’s build a healthier future together!
Serverless Innovation for Health | Denis C. Bauer | @allPowerde
Denis Bauer, PhD Rob Dunne, PhD Piotr Szul
Transformational Bioinformatics
Collaborators News Software Lynn Langit
Top 10 Australian IT stories of 2017
You?
We are hiring… …email Denis
Suzanne Scott Oscar Luo, PhD Arash Bayat, PhD Natalie Twine, PhD Genome Insight Yatish Jain Aidan O’Brien Laurence Wilson, PhD Brendan Hosking Aidan Tay Daniel Reti Digital Genome Engineering Suzanne Scott
Mumbai 2019