
Rapid Processing of Synthetic Seismograms using Windows Azure Cloud


  1. Rapid Processing of Synthetic Seismograms using Windows Azure Cloud
     Vedaprakash Subramanian, Liqiang Wang
     Department of Computer Science, University of Wyoming
     En-Jui Lee, Po Chen
     Department of Geology and Geophysics, University of Wyoming

  2. Introduction
     • Scientific applications require
       – Large amounts of computing resources
       – Massive storage for datasets
     • Traditionally handled by HPC clusters
       – But cost-ineffective
     • Claim: cloud computing is a better substitute

  3. … more
     • Conduct numerical simulation of synthetic seismograms
     • Implemented a system on the Azure cloud
       – to generate synthetic seismograms on the fly based on user queries
       – to deliver them in real time

  4. Synthetic Seismograms
     • A seismogram is a record of the ground shaking
     • Synthetic seismograms are generated by solving the seismic wave equation
     • The method is the rapid centroid moment tensor (CMT) inversion method
       – Based on a 3D velocity model
     • To increase efficiency, we store Receiver Green Tensor strain fields
       – Generated at the receiver’s location

  5. … more
     • The input parameters for this wave equation are
       – latitude, longitude and depth of the earthquake locations
       – strike, dip, and rake (source parameters)

  6. … more
     Why are synthetic seismograms important?
     • Help to map the Earth’s internal structure
     • Locate and measure the size of different seismic sources
     • For realistic interpretation of structures
     • Seismic hazard analysis

  7. Windows Azure

  8. Windows Azure
     • Windows Azure is a Platform as a Service architecture
     • Provides
       – Scalable cloud operating system
       – Data storage system
     • User controls the hosted application
     • User cannot control the underlying infrastructure

  9. Service Architecture
     • An Azure service consists of:
       – Web role
         • For web application
         • For user interface
       – Worker role
         • For generalized development
         • For background processing
     • Roles are virtual machines
     • For example, 2 instances of a worker role = 2 virtual machines running the worker role code

  10. Storage Abstractions
     • Blobs
     • Tables
     • Queues
     • Drives
     • Our system uses only the first three

  11. Blob Storage
     • Blobs
       – Interface for storing files
       – Two types, namely page blobs and block blobs
       – Containers for grouping
     [Diagram: Account → Container → Blob hierarchy; example account "Geo" with containers "California" and "Texas" holding blobs such as File1.txt and File2.txt]

  12. Table Storage
     • Tables
       – Structured storage
       – Consist of a set of entities, each of which contains a set of properties
       – Partition key and row key
     [Diagram: Account → Table → Entity hierarchy; example account "Geo" with table "Customers" (Name = …, Email = …) and table "Photos" (Photo Id = …, Date = …)]

  13. Queue Storage
     • Queues
       – Reliable storage and delivery of messages
       – Communication between roles
     [Diagram: Account → Queue → Message hierarchy; example account "Geo" with queue "Orders" holding messages (ID = …)]

  14. Load Balancing
     • Azure master system
       – Automatically load balances based on the partition key
     • Partition key for the various storage abstractions
       – Blobs: Container Name + Blob Name
       – Entities: Table Name + Partition Key
       – Messages: Queue Name

  15. Implementation of the System

  16. Overview of the System
     • The architecture of the system:
       – Web Role
         • User interface
       – Job Manager
         • Coordinates the computation
         • Monitors the system
     [Diagram labels: User Request Queue, Computation Input Queue, Computation Output Queue, Windows Azure Storage (Blob, Table, Queue)]

  17. … more
       – Computation Worker Role
         • Performs the computation
       – 3 Azure Queues
         • Communication interface between the roles
     [Diagram labels: User Request Queue, Computation Input Queue, Computation Output Queue, Windows Azure Storage (Blob, Table, Queue)]

  18. Web Role
     • Receives the user input
     • Places the request as a message in the request queue
     [Diagram labels: User Request Queue, Windows Azure Storage (Blob, Table, Queue)]

  19. Job Manager
     • Retrieves the message from the request queue
     • Reads the job
     • Divides the job into sub-jobs
     • Places the sub-jobs as messages in the computation input queue
     [Diagram labels: Request Queue, Computation Input Queue, Computation Output Queue, Windows Azure Storage (Blob, Table, Queue)]

  20. Coordinate the Computation
     • Number of computation worker roles: 5
     • Number of CPU cores in each instance: 8
     • Input: 1000 source locations
     • Number of queue messages: 1000 / 8 = 125
     • 125 queue messages served by 5 computation worker roles
     • Performance gain factor (theoretical) = number of CPU cores * number of instances = 8 * 5 = 40
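The arithmetic on this slide can be reproduced with a short sketch. It assumes, as the numbers suggest, that each queue message carries one source location per CPU core; the function names `split_into_messages` and `theoretical_gain` are illustrative and not taken from the authors' system.

```python
from typing import List, Sequence


def split_into_messages(sources: Sequence, cores_per_instance: int) -> List[list]:
    """Group source locations so each message holds one sub-job per core (assumed policy)."""
    return [
        list(sources[i:i + cores_per_instance])
        for i in range(0, len(sources), cores_per_instance)
    ]


def theoretical_gain(cores_per_instance: int, instances: int) -> int:
    """Upper bound on speed-up: every core of every instance is busy at once."""
    return cores_per_instance * instances


# Stand-in for 1000 source locations.
sources = list(range(1000))
messages = split_into_messages(sources, cores_per_instance=8)

print(len(messages))           # 125 queue messages
print(theoretical_gain(8, 5))  # 40
```

The gain factor is an upper bound: it ignores queue latency, storage access and any imbalance between messages.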

  21. Inside Computation Worker Role
     • Retrieves the message
     • Processes the sub-jobs in parallel using the .NET Task Parallel Library
     • Writes the result
     • Sends a message stating that the job is completed
     [Diagram labels: Computation Input Queue, Computation Output Queue, Windows Azure Storage (Blob, Table, Queue)]
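The worker loop described on this slide might look roughly like the sketch below. It is not the authors' code: the original system uses the .NET Task Parallel Library and Azure queues, whereas here Python's `concurrent.futures` and in-memory `queue.Queue` objects stand in for them, and `compute_seismogram` is a hypothetical placeholder for the real computation.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def compute_seismogram(source):
    """Hypothetical placeholder for the per-source seismogram computation."""
    return {"source": source, "status": "done"}


def run_worker(input_queue, output_queue, results, max_workers=8):
    """Drain the input queue, processing each message's sub-jobs in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            try:
                message = input_queue.get_nowait()   # retrieve the message
            except queue.Empty:
                break
            # Process the sub-jobs in parallel (stand-in for the TPL).
            outputs = list(pool.map(compute_seismogram, message["sub_jobs"]))
            results[message["id"]] = outputs         # write the result
            output_queue.put({"id": message["id"], "status": "completed"})


input_queue, output_queue, results = queue.Queue(), queue.Queue(), {}
input_queue.put({"id": 0, "sub_jobs": list(range(8))})
run_worker(input_queue, output_queue, results)
completion = output_queue.get_nowait()
print(completion["status"], len(results[0]))
```

A thread pool keeps the sketch short; for CPU-bound work in Python a process pool would be the closer analogue of one task per core.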

  22. Monitor System Response
     • Based on the response time of the message
     • Threshold response time is 2 ms
     • If the threshold is exceeded
       – Allocate a new instance of the computation worker role
       – Record its details

  23. … more
     • De-allocate an instance if
       – it is a newly allocated instance
       – and its lifetime > one hour
       – and there is no message in the queue
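The allocation and de-allocation rules from the two slides above can be written as a pair of predicates. This is a minimal sketch under stated assumptions: the `Instance` record, its field names and the use of seconds as the time unit are all illustrative; only the 2 ms threshold, the one-hour lifetime and the empty-queue condition come from the slides.

```python
from dataclasses import dataclass

RESPONSE_THRESHOLD_S = 0.002   # 2 ms threshold from the slide
MIN_LIFETIME_S = 3600          # one hour


@dataclass
class Instance:
    name: str
    allocated_at: float                  # allocation time in seconds (assumed unit)
    dynamically_allocated: bool = True   # False for the initial instances


def should_allocate(response_time_s: float) -> bool:
    """Allocate a new worker when a message's response time exceeds the threshold."""
    return response_time_s > RESPONSE_THRESHOLD_S


def should_deallocate(instance: Instance, now: float, queue_length: int) -> bool:
    """De-allocate only extra instances older than one hour when the queue is empty."""
    return (
        instance.dynamically_allocated
        and now - instance.allocated_at > MIN_LIFETIME_S
        and queue_length == 0
    )


extra = Instance("worker-6", allocated_at=0.0)
print(should_allocate(0.005))                             # True
print(should_deallocate(extra, now=4000.0, queue_length=0))  # True
print(should_deallocate(extra, now=4000.0, queue_length=3))  # False
```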

  24. Monitor System Response
     [Diagram labels: Request to allocate VM; messages (msg) in the Computation Input Queue; Distributed File System]

  25. Job Manager
     • Allocation and de-allocation are asynchronous
     • So, a mutual exclusion lock is enforced between
       – Job assignment
       – De-allocation of an instance
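One way to picture the mutual exclusion between job assignment and de-allocation is a single lock guarding both operations, as in the sketch below. The `JobManager` class, its methods, and the rule that a busy instance is never removed are assumptions made for illustration; the slides state only that a lock is enforced between the two actions.

```python
import threading


class JobManager:
    """Toy job manager: one lock serialises assignment and de-allocation."""

    def __init__(self, instances):
        self._lock = threading.Lock()
        self._instances = set(instances)
        self.assignments = {}            # job id -> instance name

    def assign_job(self, job_id):
        with self._lock:
            if not self._instances:
                return None
            # Deterministic choice for the sketch; a real policy would balance load.
            instance = sorted(self._instances)[0]
            self.assignments[job_id] = instance
            return instance

    def deallocate(self, name):
        with self._lock:
            # Assumed rule: never remove an instance that still holds a job.
            if name in self.assignments.values():
                return False
            self._instances.discard(name)
            return True


manager = JobManager(["w1", "w2"])
print(manager.assign_job("job-1"))   # w1
print(manager.deallocate("w1"))      # False, w1 holds job-1
print(manager.deallocate("w2"))      # True
```

Because both methods take the same lock, a job can never be assigned to an instance that is in the middle of being removed.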

  26. Data Storage
     • Seismic wave observations are stored as data files
     • Each data file is represented by its own latitude and longitude
     • So, blob name = latitude + longitude
       – Example: point (40’59’’N, 122’7’’W) → data file 4059-1227
     • For grouping the blobs, the region of California is divided into blocks
       – based on the seismic wave observation stations

  27. … more
     • Currently 4096 stations = 4096 blocks
     • Each block is characterized by a range of latitude and longitude
     • So, block identification number = range of latitude + longitude
       – Example: block with corners (35’25’’N, 119’3’’W), (35’25’’N, 116’47’’W), (34’51’’N, 119’3’’W), (34’51’’N, 116’47’’W) → 3525-3451-1193-11647
     • Container name = identification number
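The naming scheme can be checked against the two examples on the slides with a small sketch. Coordinates are represented here as pairs of integers, one per component of the slide notation; the helper names are illustrative, and the plain concatenation without zero padding is inferred from the examples rather than stated by the authors.

```python
def coord_token(first: int, second: int) -> str:
    """Concatenate the two components of a coordinate, e.g. (40, 59) -> '4059'."""
    return f"{first}{second}"


def blob_name(lat, lon) -> str:
    """Blob name = latitude + longitude of the observation point."""
    return f"{coord_token(*lat)}-{coord_token(*lon)}"


def container_name(lat_north, lat_south, lon_west, lon_east) -> str:
    """Container name = latitude range + longitude range of the block."""
    return "-".join(
        coord_token(*c) for c in (lat_north, lat_south, lon_west, lon_east)
    )


print(blob_name((40, 59), (122, 7)))
# 4059-1227
print(container_name((35, 25), (34, 51), (119, 3), (116, 47)))
# 3525-3451-1193-11647
```

Note that unpadded concatenation is not always reversible (1193 could be read more than one way), so the names work as identifiers rather than as something to parse coordinates back out of.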

  28. … more
     • Divide the California region into 16 parts
     • 1 table for 1 part
     • Store the list of blocks in each part in its table
     • Helps in better retrieval

  29. Data Query Algorithm
     • Locate the blob corresponding to the given point
       – Retrieve the container name and blob name
     • Retrieve the container name from the table
       – Locate the table
       – Do a linear search inside the table
     • Blob name = latitude + longitude
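The query steps above can be sketched as follows. How the right table is located is not described on the slides, so this sketch assumes each of the 16 tables records the bounds of its part; `Region`, `PartTable` and `locate_blob` are hypothetical names, coordinates are integer pairs compared component by component, and the part bounds and query point in the usage are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Coord = Tuple[int, int]   # the two components of a coordinate in the slide notation


@dataclass
class Region:
    lat_north: Coord
    lat_south: Coord
    lon_west: Coord    # west longitudes: the larger value lies further west
    lon_east: Coord

    def contains(self, lat: Coord, lon: Coord) -> bool:
        return (self.lat_south <= lat <= self.lat_north
                and self.lon_east <= lon <= self.lon_west)

    def name(self) -> str:
        parts = (self.lat_north, self.lat_south, self.lon_west, self.lon_east)
        return "-".join(f"{a}{b}" for a, b in parts)


@dataclass
class PartTable:
    bounds: Region                         # assumed: each table knows its part's bounds
    blocks: List[Region] = field(default_factory=list)


def locate_blob(tables: List[PartTable], lat: Coord, lon: Coord) -> Optional[Tuple[str, str]]:
    """Return (container name, blob name) for a point, or None if no block matches."""
    for table in tables:                               # locate the table
        if table.bounds.contains(lat, lon):
            for block in table.blocks:                 # linear search inside the table
                if block.contains(lat, lon):
                    return block.name(), f"{lat[0]}{lat[1]}-{lon[0]}{lon[1]}"
    return None


block = Region((35, 25), (34, 51), (119, 3), (116, 47))
table = PartTable(bounds=Region((36, 0), (34, 0), (120, 0), (116, 0)), blocks=[block])
print(locate_blob([table], (35, 10), (118, 20)))
# ('3525-3451-1193-11647', '3510-11820')
```

Splitting the blocks across 16 tables keeps each linear search short: with 4096 blocks, a table holds about 256 entries on average.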

  30. Data Query Algorithm
     [Diagram: grid of blocks with corner coordinates ranging from 39’11’’N to 41’43’’N and from 120’43’’W to 124’6’’W; the query point (40’59’’N, 122’7’’W) is marked inside the grid]

  31. Experiment

  32. Experiment
     • Performance was evaluated on
       – various configurations and numbers of instances of the computation worker role
       – datasets from different numbers of seismic wave observation stations

  33. Performance Measurement – Execution Time
     [Figure: execution time (seconds, 0 to 1000) versus number of stations (0 to 1000) for five configurations: single worker (4 core); four workers (4 core); single worker (4 core) + TPL; four workers (4 core) + TPL; two workers (8 core) + TPL]

  34. Conclusion

  35. Conclusion
     • Implemented the system on Windows Azure
     • Hence, the cloud is a better substitute for HPC clusters
     • Future work:
       – Add a seismic hazard analysis feature to the system

  36. Thanks
