iRODS Client: AWS Lambda Function for S3 1.0


Terrell Russell, Ph.D. (@terrellrussell), Chief Technologist, iRODS Consortium. iRODS User Group Meeting 2020, Virtual Event, June 9-12, 2020.


SLIDE 1

iRODS Client: AWS Lambda Function for S3 1.0

Terrell Russell, Ph.D.
@terrellrussell
Chief Technologist, iRODS Consortium

June 9-12, 2020
iRODS User Group Meeting 2020
Virtual Event

SLIDE 2

iRODS Client: AWS Lambda Function for S3 1.0

Design Goals

  • Play nicely with the universe of tools that already know how to write to S3 directly.
  • Allow those updates within the S3 namespace to flow smoothly into the iRODS Catalog.
  • Trigger automated data management when data crosses the policy boundary.

SLIDE 3

iRODS Client: AWS Lambda Function for S3 1.0

Considerations

  • Lambda can run Python code.
  • iRODS provides a Python client library.

Success would be near-real-time, asynchronous catalog updates for creates, moves, and deletes.

SLIDE 4

iRODS Client: AWS Lambda Function for S3 1.0

Files created, renamed, or deleted in S3 appear quickly in iRODS.

  • iRODS is assumed to have its associated S3 Storage Resource(s) configured with HOST_MODE=cacheless_attached.
  • You must configure your Lambda to trigger on all ObjectCreated and ObjectRemoved events for a connected S3 bucket.
  • The iRODS connection information is stored in the AWS Systems Manager > Parameter Store as a JSON object string.
  • SSL to iRODS is supported by placing a certificate in a relative path within the Lambda package.

[Diagram: Lambda and S3]
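As a rough illustration of the trigger described above, the following Python sketch sorts the records of an S3 notification into catalog actions. The function name classify_s3_records and the action labels "register" and "unregister" are hypothetical and are not taken from the released Lambda; only the S3 event fields (eventName, s3.bucket.name, s3.object.key) follow the standard S3 notification format.

```python
from urllib.parse import unquote_plus


def classify_s3_records(event):
    """Return a list of (action, bucket, key) tuples for an S3 event.

    The action labels are hypothetical placeholders for whatever the
    real function does against the iRODS Catalog.
    """
    actions = []
    for record in event.get("Records", []):
        event_name = record["eventName"]  # e.g. "ObjectCreated:Put"
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded, with spaces encoded as "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        if event_name.startswith("ObjectCreated"):
            actions.append(("register", bucket, key))
        elif event_name.startswith("ObjectRemoved"):
            actions.append(("unregister", bucket, key))
    return actions
```

Matching on the ObjectCreated and ObjectRemoved prefixes covers every event subtype (Put, Copy, Delete, and so on) without listing each one.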

SLIDE 5

iRODS Client: AWS Lambda Function for S3 1.0

This Lambda function can be configured to receive events from multiple sources at the same time.

[Diagram: one Lambda receiving events from three S3 buckets]

If the irods_default_resource is NOT defined in the environment in the Parameter Store, then the Lambda function will derive the name of a target iRODS Resource. By default, the Lambda function will append _s3 to the incoming bucket name.
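The resource-name rule can be sketched in a few lines of Python. This is a minimal sketch, not the released code: the function name target_resource and the example values are hypothetical, and the only key taken from the slide is irods_default_resource.

```python
import json


def target_resource(parameter_value, bucket_name, suffix="_s3"):
    """Pick the iRODS resource for an incoming bucket.

    parameter_value is the JSON object string read from Parameter Store.
    If it defines irods_default_resource, that name wins; otherwise the
    name is derived from the bucket by appending the suffix.
    """
    environment = json.loads(parameter_value)
    if "irods_default_resource" in environment:
        return environment["irods_default_resource"]
    return bucket_name + suffix
```

For example, with no irods_default_resource defined, events from a bucket named mybucket would target a resource named mybucket_s3, which lets one Lambda serve several buckets that each map to their own resource.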

SLIDE 6

iRODS Client: AWS Lambda Function for S3 1.0

The following AWS configurations are supported at this time:

  • S3 -> Lambda
  • S3 -> SNS -> Lambda
  • S3 -> SQS -> Lambda
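Each of these delivery paths wraps the S3 notification differently by the time it reaches the Lambda. The sketch below shows one way the three envelopes could be unwrapped into plain S3 records. The function name extract_s3_records is hypothetical; the envelope fields (eventSource, EventSource, Sns.Message, body) follow the standard AWS event formats, and this is not necessarily how the released function does it.

```python
import json


def extract_s3_records(event):
    """Return the S3 records carried by a direct, SNS, or SQS event."""
    s3_records = []
    for record in event.get("Records", []):
        # SNS records spell the field "EventSource"; S3 and SQS use "eventSource".
        source = record.get("eventSource") or record.get("EventSource")
        if source == "aws:s3":
            s3_records.append(record)
        elif source == "aws:sns":
            # The S3 notification is a JSON string in the SNS message.
            inner = json.loads(record["Sns"]["Message"])
            s3_records.extend(inner.get("Records", []))
        elif source == "aws:sqs":
            # The S3 notification is a JSON string in the SQS message body.
            inner = json.loads(record["body"])
            s3_records.extend(inner.get("Records", []))
    return s3_records
```

Normalizing the envelopes first means the rest of the handler only ever deals with one record shape.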

SLIDE 7

iRODS Client: AWS Lambda Function for S3 1.0

Limitations

  • S3 is decoupled from the Lambda. A rename is actually a create message and a delete message. To iRODS, this becomes a new data object, so any metadata AVUs associated with the now-deleted data object are lost. This could be remedied with a full checksum comparison. Other ideas welcome.
  • SQS configuration is limited to batch_size = 1. Operating on more than one message at a time would reduce the cost of running this Lambda at AWS, but it is unclear how to signal partial success at this time.
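The checksum-comparison idea could look something like the Python sketch below. It is only an illustration of the idea floated on this slide, not part of the released function: pair_renames and its inputs are hypothetical, and it assumes checksums for both the deleted and the created objects are available at the same moment, which the decoupled create and delete messages do not guarantee.

```python
def pair_renames(deleted, created):
    """Pair deleted and created paths that share a checksum.

    deleted and created map object path -> checksum. Returns a list of
    (old_path, new_path) tuples that are candidate renames. Matching
    checksums only show identical content, not that a rename occurred.
    """
    by_checksum = {}
    for path, checksum in deleted.items():
        by_checksum.setdefault(checksum, []).append(path)

    renames = []
    for new_path, checksum in created.items():
        candidates = by_checksum.get(checksum)
        if candidates:
            # Use each deleted path at most once.
            renames.append((candidates.pop(0), new_path))
    return renames
```

For a candidate pair, the AVUs from the old data object could be carried over to the new one instead of being lost.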

SLIDE 8

Questions?

Thank You!

Pre-release testing environment provided by Bristol Myers Squibb.

https://github.com/irods/irods_client_aws_lambda_s3
