Scalable Distributed Lineage Authentication
Ashish Gehani
Scalable Distributed Lineage Authentication – p. 1/59
What is data lineage?
[Figure: (a) a primitive operation, taking Input 1 ... Input n through an Operation to an Output; (b) a compound operation tree.]
myGrid - Biology Grid workflows
[Figure: a Producer's Output and a Consumer's Input, with the question "Input = Output?" between them.]
accessed, modified files
[Figure: timeline of a process execution. The process reads File 1 and File 2 and writes File 3, with a close() marked for each file. The resulting record for File 3 lists File 1, File 2, and the Owner.]
[Figure: lineage record layout. Labels: Output; Input 1 ... Input n; Executor; Signature; Net Address; Inode; Time.]
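+-------------+----------+--------+-------+--------+--------+
| Workload    | Steps: 1 | 2      | 3     | 4      | 5      |
+-------------+----------+--------+-------+--------+--------+
| Instruction | 0.4 KB   | 3 KB   | 31 KB | 253 KB | 2 MB   |
| Research    | 0.2 KB   | 0.8 KB | 2 KB  | 8 KB   | 29 KB  |
| Web         | 1 KB     | 39 KB  | 1 MB  | 29 MB  | 813 MB |
| Windows     | 0.2 KB   | 0.8 KB | 2 KB  | 9 KB   | 30 KB  |
+-------------+----------+--------+-------+--------+--------+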
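Time (in ms) to read tree in open():
+-------------+----------+------+------+-------+
| Workload    | Steps: 1 | 2    | 3    | 4     |
+-------------+----------+------+------+-------+
| Instruction | 0.04     | 0.05 | 0.11 | 1.72  |
| Research    | 0.05     | 0.05 | 0.04 | 0.04  |
| Web         | 0.06     | 0.13 | 6.42 | 997.5 |
| Windows     | 0.07     | 0.04 | 0.04 | 0.04  |
+-------------+----------+------+------+-------+

Time (in ms) to write tree in close():
+-------------+----------+------+------+--------+
| Workload    | Steps: 1 | 2    | 3    | 4      |
+-------------+----------+------+------+--------+
| Instruction | 0.20     | 0.28 | 0.32 | 0.84   |
| Research    | 0.16     | 0.19 | 2.39 | 3.1    |
| Web         | 0.16     | 0.24 | 4.82 | 579.14 |
| Windows     | 0.16     | 0.50 | 5.34 | 3.17   |
+-------------+----------+------+------+--------+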
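+-------------+---------+
| Workload    | Storage |
+-------------+---------+
| Instruction | 0.4 KB  |
| Research    | 0.2 KB  |
| Web         | 1 KB    |
| Windows     | 0.2 KB  |
+-------------+---------+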
Algorithm: CHECKLINEAGE(D)
    {E, S, O, I1, ..., In} ← GETROOT(D)
    OUTPUT(E)
    PE ← PKILOOKUP(E)
    if I1, ..., In = {}
        then Result ← VERIFY(PE, S, E, O)
             if Result = FALSE
                 then CheckFailed
        else Result ← VERIFY(PE, S, E, O|I1|...|In)
             if Result = TRUE
                 then for i ← 1 to n
                          do CHECKLINEAGE(Ii)    ← Reliability drops
                 else CheckFailed
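The recursive check in CHECKLINEAGE can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the implementation behind the slides: the standard library has no public-key signatures, so an HMAC with a per-executor secret stands in for VERIFY, and the `KEYS` dictionary stands in for PKILOOKUP. The names `LineageNode`, `KEYS`, `signed_payload`, `sign`, and `check_lineage` are hypothetical, as is the choice to represent each input Ii by its output identifier inside the signed message O|I1|...|In.

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineageNode:
    executor: str                  # E: who ran the operation
    output: bytes                  # O: identifier of the output (assumed: a name or hash)
    signature: bytes = b""         # S: executor's signature over O|I1|...|In
    inputs: List["LineageNode"] = field(default_factory=list)


# Hypothetical key directory standing in for PKILOOKUP (assumption).
KEYS = {"alice": b"alice-secret", "bob": b"bob-secret"}


def signed_payload(node: LineageNode) -> bytes:
    # O alone for a leaf, otherwise O|I1|...|In.
    return b"|".join([node.output] + [child.output for child in node.inputs])


def sign(node: LineageNode) -> None:
    # HMAC is a stand-in for a real digital signature (assumption).
    key = KEYS[node.executor]
    node.signature = hmac.new(key, signed_payload(node), hashlib.sha256).digest()


def check_lineage(node: LineageNode, executors: List[str]) -> bool:
    executors.append(node.executor)        # OUTPUT(E)
    key = KEYS.get(node.executor)          # PKILOOKUP(E)
    if key is None:
        return False                       # unknown executor: CheckFailed
    expected = hmac.new(key, signed_payload(node), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, node.signature):
        return False                       # VERIFY failed: CheckFailed
    # Recurse into every input; a leaf has none and passes here.
    return all(check_lineage(child, executors) for child in node.inputs)


leaf = LineageNode("alice", b"file1")
sign(leaf)
root = LineageNode("bob", b"file3", inputs=[leaf])
sign(root)

seen: List[str] = []
ok = check_lineage(root, seen)
```

Each recursive call depends on one more executor's key and signature, which is one way to read the "Reliability drops" annotation on the recursion step: the deeper the tree, the more parties must all check out.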
[Figure: lineage tree annotated with λ and the pruned levels. Legend: stored locally; pruned (must be recovered from a remote node).]
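One way to picture the pruning is the sketch below. It assumes that λ counts the levels kept locally and that every node records the host from which its deeper lineage could be fetched; the figure labels do not settle either point, so both are assumptions. The dictionary layout and the function name `prune` are hypothetical.

```python
from typing import Any, Dict

Node = Dict[str, Any]  # {"id": str, "host": str, "inputs": [Node, ...]}


def prune(tree: Node, lam: int) -> Node:
    """Return a copy that keeps the top `lam` levels of `tree`.

    Nodes below that depth are replaced by stubs holding only the
    identifier and the host to ask for the missing subtree.
    """
    if lam <= 0:
        # Stub: the rest must be recovered from the remote node.
        return {"id": tree["id"], "host": tree["host"], "pruned": True}
    return {
        "id": tree["id"],
        "host": tree["host"],
        "inputs": [prune(child, lam - 1) for child in tree.get("inputs", [])],
    }


full = {
    "id": "out", "host": "h1",
    "inputs": [
        {"id": "mid", "host": "h2",
         "inputs": [{"id": "raw", "host": "h3", "inputs": []}]},
    ],
}

local = prune(full, 2)
```

With `lam = 2` the local copy keeps `out` and `mid` in full, while `raw` shrinks to a stub that names `h3` as the place to recover it from.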
[Figure: GADU workflow across trust domains. Labels: New Data Query; GADU Server; Pegasus Planner; Comparative Analysis Database; NCBI, TIGR, PDB, Swiss-Prot, JGI, PFAM, BLOCKS; BLAST, THMM; 300 Nodes (Globus Node, Globus Node, Globus Node). Regions: user's administrative domain; external trusted database; grid nodes (trust but verify).]
+-----------+--------------+
| Field     | Type         |
+-----------+--------------+
| LPID      | int(11)      |
| Host      | varchar(256) |
| IP        | char(16)     |
| Time      | datetime     |
| PID       | int(11)      |
| PID_Name  | varchar(256) |
| PPID      | int(11)      |
| PPID_Name | varchar(256) |
| UID       | int(11)      |
| UID_Name  | char(32)     |
| GID       | int(11)      |
| GID_Name  | char(32)     |
| CmdLine   | varchar(256) |
| Environ   | text         |
+-----------+--------------+
+------------+--------------+
| Field      | Type         |
+------------+--------------+
| LFID       | int(11)      |
| FileName   | varchar(256) |
| Time       | datetime     |
| NewTime    | datetime     |
| RdWt       | int(11)      |
| LPID       | int(11)      |
| Hash       | varchar(256) |
| Signature  | varchar(256) |
+------------+--------------+
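The two schemas share the LPID column, which suggests that each file record points back to the process record that touched it. The sketch below shows how a file's inputs could be recovered by joining on LPID. It uses an in-memory SQLite database rather than the original database, keeps only a subset of the columns, and assumes that RdWt = 0 marks a read and RdWt = 1 a write; the slides do not define that encoding. The table names `lineage_process` and `lineage_file` and the helper `inputs_of` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE lineage_process (
        LPID    INTEGER PRIMARY KEY,
        Host    TEXT,
        PID     INTEGER,
        CmdLine TEXT
    );
    CREATE TABLE lineage_file (
        LFID     INTEGER PRIMARY KEY,
        FileName TEXT,
        RdWt     INTEGER,   -- assumed encoding: 0 = read, 1 = write
        LPID     INTEGER,   -- process record that accessed the file
        Hash     TEXT
    );
    """
)

# Made-up sample rows: one process reads in1 and in2, then writes out.
conn.execute(
    "INSERT INTO lineage_process VALUES (1, 'node1', 4242, 'tool in1 in2 out')"
)
conn.executemany(
    "INSERT INTO lineage_file (FileName, RdWt, LPID, Hash) VALUES (?, ?, ?, ?)",
    [("in1", 0, 1, "h1"), ("in2", 0, 1, "h2"), ("out", 1, 1, "h3")],
)


def inputs_of(db: sqlite3.Connection, filename: str) -> list:
    """Files read by the process that wrote `filename`."""
    rows = db.execute(
        """
        SELECT r.FileName
        FROM lineage_file AS w
        JOIN lineage_file AS r ON r.LPID = w.LPID AND r.RdWt = 0
        WHERE w.FileName = ? AND w.RdWt = 1
        ORDER BY r.FileName
        """,
        (filename,),
    ).fetchall()
    return [name for (name,) in rows]


print(inputs_of(conn, "out"))  # ['in1', 'in2']
```

Applying `inputs_of` again to each returned file name would walk one more level up the lineage tree.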