Rapid Processing of Synthetic Seismograms using Windows Azure Cloud



SLIDE 1

Rapid Processing of Synthetic Seismograms using Windows Azure Cloud

Vedaprakash Subramanian, Liqiang Wang
Department of Computer Science, University of Wyoming

En-Jui Lee, Po Chen
Department of Geology and Geophysics, University of Wyoming

1

SLIDE 2
  • Scientific applications require

– Large amounts of computing resources
– Massive storage for datasets

  • Traditionally handled by HPC clusters

– But cost-ineffective

  • Claim: cloud computing is a better substitute

2

Introduction

SLIDE 3
  • Conduct numerical simulation of synthetic seismograms
  • Implemented a system on the Azure cloud

– to generate synthetic seismograms on the fly based on user queries
– to deliver them in real time

3

… more

SLIDE 4
  • A seismogram is a record of ground shaking
  • Synthetic seismograms are generated by solving the seismic wave equation
  • The method is rapid centroid moment tensor (CMT) inversion

– Based on a 3D velocity model

  • To increase efficiency, we store Receiver Green Tensor strain fields

– Generated at the receiver’s location

4

Synthetic Seismograms

SLIDE 5
  • The input parameters for the wave equation are

– latitude, longitude, and depth of the earthquake location
– strike, dip, and rake (source parameters)

5

… more

SLIDE 6

… more

  • Helps to map the Earth’s internal structure
  • Locate and measure the size of different seismic sources
  • For realistic interpretation of structures
  • Seismic Hazard Analysis

Why are synthetic seismograms important?

6

SLIDE 7

Windows Azure

7

SLIDE 8

8

  • Windows Azure is a Platform as a Service (PaaS) architecture
  • Provides

– Scalable cloud operating system
– Data storage system

  • User controls the hosted application
  • User cannot control the underlying infrastructure

Windows Azure

SLIDE 9
  • An Azure service consists of:

– Web role

  • For web applications
  • For the user interface

– Worker role

  • For generalized development
  • For background processing

  • Roles are virtual machines
  • For example, 2 instances of a worker role = 2 virtual machines running the worker role code

Service Architecture

SLIDE 10
  • Blobs
  • Tables
  • Queues
  • Drives
  • Our system uses only the first three

Storage Abstractions

SLIDE 11
  • Blobs

– Interface for storing files
– Two types: Page blobs and Block blobs
– Containers for grouping

[Diagram: Account → Container → Blob. Account "Geo" holds container "California" (File1.txt, File2.txt) and container "Texas" (File1.txt)]

Blob Storage

SLIDE 12
  • Tables

– Structured storage

– Consists of a set of entities, each of which contains a set of properties
– Partition key and Row key

[Diagram: Account → Table → Entity. Account "Geo" holds table "Customers" (two entities with Name = …, Email = …) and table "Photos" (entity with Photo Id = …, Date = …)]

Table Storage

SLIDE 13
  • Queues

– Reliable storage and delivery of messages
– Communication between roles

[Diagram: Account → Queue → Message. Account "Geo" holds queue "Orders" (two messages with ID = …)]

Queue Storage

SLIDE 14
  • Azure master system

– Automatically balances load based on the partition key

  • Partition key for the various storage abstractions

– Blobs: Container Name + Blob Name
– Entities: Table Name + Partition Key
– Messages: Queue Name

14

Load Balancing
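The partition keys listed on this slide can be sketched as simple string compositions. This is only an illustration in Python: the function names and the "/" separator are assumptions made here, not part of the Azure API, and the real service derives these keys internally.

```python
def blob_partition_key(container_name: str, blob_name: str) -> str:
    """Blobs are partitioned by container name plus blob name."""
    # The "/" separator is a hypothetical choice for readability.
    return f"{container_name}/{blob_name}"


def entity_partition_key(table_name: str, partition_key: str) -> str:
    """Table entities are partitioned by table name plus partition key."""
    return f"{table_name}/{partition_key}"


def message_partition_key(queue_name: str) -> str:
    """All messages of one queue share a partition: the queue name."""
    return queue_name


print(blob_partition_key("California", "File1.txt"))
print(entity_partition_key("Customers", "some-partition"))
print(message_partition_key("Orders"))
```

One consequence worth noting: every message in a queue maps to the same key, while each blob gets its own, so blobs can be spread across servers more finely than queue messages.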

SLIDE 15

Implementation of the System

15

SLIDE 16
  • The architecture of the system :

– Web Role

  • User interface

– Job Manager

  • Coordinate the computation
  • Monitor the system

[Diagram: User Request Queue, Computation Input Queue, and Computation Output Queue in Windows Azure Storage (Blob, Table, Queue)]

Overview of the System

SLIDE 17

– Computation Worker Role

  • Performs the computation

– 3 Azure Queues

  • Communication interface between the roles

[Diagram: User Request Queue, Computation Input Queue, and Computation Output Queue in Windows Azure Storage (Blob, Table, Queue)]

… more

SLIDE 18
  • Receive the user input
  • Place the request as a message in the user request queue

Web Role

User Request Queue

Windows Azure Storage (Blob, Table, Queue)

SLIDE 19
  • Retrieve the message from the request queue
  • Read the job
  • Divide the job into sub-jobs
  • Place the sub-jobs as messages in the computation input queue

Job Manager

[Diagram: Request Queue, Computation Input Queue, and Computation Output Queue in Windows Azure Storage (Blob, Table, Queue)]

SLIDE 20
  • Number of computation worker roles: 5
  • Number of CPU cores in each instance: 8
  • Input: 1000 source locations
  • Number of queue messages: 1000 / 8 = 125
  • 125 queue messages served by 5 computation worker roles
  • Theoretical performance gain factor

= Number of CPU cores * Number of instances = 8 * 5 = 40

Coordinate the Computation
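The arithmetic on this slide can be checked with a short Python sketch. It assumes that each queue message carries as many source locations as an instance has cores, which is how the slide arrives at 1000 / 8 = 125; the helper name `split_into_messages` and the integer stand-ins for source locations are hypothetical.

```python
def split_into_messages(sources, per_message):
    """Group sources into chunks of at most per_message items, one chunk per queue message."""
    return [sources[i:i + per_message] for i in range(0, len(sources), per_message)]


CORES_PER_INSTANCE = 8
WORKER_INSTANCES = 5

# Integers stand in for the 1000 source locations of the example.
source_locations = list(range(1000))

messages = split_into_messages(source_locations, CORES_PER_INSTANCE)
theoretical_gain = CORES_PER_INSTANCE * WORKER_INSTANCES

print(len(messages))       # 125
print(theoretical_gain)    # 40
```

The gain of 40 is an upper bound: it ignores queue latency, storage access, and uneven sub-job cost.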

SLIDE 21
  • Retrieve the message
  • Process the sub-jobs in parallel using the .NET Task Parallel Library

  • Write the result
  • Send message stating job completed
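The four steps above can be sketched in Python as a rough analogue. The original system uses the .NET Task Parallel Library; here `concurrent.futures` stands in for it, in-memory `queue.Queue` objects stand in for the Azure queues, a dictionary stands in for result storage, and `compute_seismogram` is a hypothetical placeholder for the real computation.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue


def compute_seismogram(source):
    """Hypothetical placeholder for the per-source computation."""
    return {"source": source, "done": True}


def worker_step(input_queue, output_queue, results, max_workers=8):
    """Run one retrieve / process / write / notify cycle."""
    # 1. Retrieve the message from the computation input queue.
    message = input_queue.get()
    # 2. Process the sub-jobs of this message in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(compute_seismogram, message["sources"]))
    # 3. Write the result (a dict stands in for blob storage here).
    results[message["id"]] = outputs
    # 4. Send a message stating that the job completed.
    output_queue.put({"id": message["id"], "status": "completed"})


input_queue, output_queue, results = Queue(), Queue(), {}
input_queue.put({"id": "job-1", "sources": list(range(8))})
worker_step(input_queue, output_queue, results)
```

For CPU-bound work in Python, `ProcessPoolExecutor` would be the closer match to TPL on a multi-core instance; threads are used here only to keep the sketch short.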

Inside Computation Worker Role

[Diagram: Computation Input Queue and Computation Output Queue in Windows Azure Storage (Blob, Table, Queue)]

SLIDE 22
  • Monitoring is based on the response time of a message
  • Threshold response time is 2 ms
  • If the threshold is exceeded

– Allocate a new instance of the computation worker role
– Maintain its details

Monitor System Response

SLIDE 23
  • De-allocate if

– any new instance has been allocated
– and its lifetime > one hour
– and no message in the queue

… more
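The allocation and de-allocation rules of the two monitoring slides can be written as two small predicates. This is a sketch under assumptions: the function names and parameters are hypothetical, times are expressed in seconds, and how the job manager actually measures response time is not stated in the slides.

```python
THRESHOLD_SECONDS = 0.002      # 2 ms threshold response time from the slide
MIN_LIFETIME_SECONDS = 3600    # one hour


def should_allocate(response_time_seconds: float) -> bool:
    """Allocate a new worker instance when a message's response time exceeds the threshold."""
    return response_time_seconds > THRESHOLD_SECONDS


def should_deallocate(extra_instances: int, lifetime_seconds: float, queue_length: int) -> bool:
    """De-allocate only if an extra instance exists, it has lived over an hour, and the queue is empty."""
    return (
        extra_instances > 0
        and lifetime_seconds > MIN_LIFETIME_SECONDS
        and queue_length == 0
    )


print(should_allocate(0.005))            # True
print(should_deallocate(1, 4000.0, 0))   # True
print(should_deallocate(1, 4000.0, 3))   # False
```

All three de-allocation conditions must hold together, so an instance with pending work is never released no matter how long it has been running.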

SLIDE 24

[Diagram: Distributed File System; Computation Input Queue holding five messages (msg); Request to allocate VM]

Monitor System Response

SLIDE 25
  • Allocation and de-allocation are asynchronous
  • So, a mutual exclusion lock is enforced between

– Job assignment
– De-allocation of an instance

Job Manager
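A minimal Python sketch of the mutual exclusion described on this slide follows. The class `InstancePool` and its methods are hypothetical; the slides do not show how the job manager tracks instances. The point is only that job assignment and de-allocation take the same lock, so an instance cannot be removed while a job is being handed to it.

```python
import threading


class InstancePool:
    """Tracks worker instances; one lock guards both assignment and de-allocation."""

    def __init__(self, instances):
        self._lock = threading.Lock()
        self._instances = set(instances)
        self._busy = set()

    def assign_job(self):
        """Pick an idle instance and mark it busy, or return None if none is idle."""
        with self._lock:
            idle = self._instances - self._busy
            if not idle:
                return None
            instance = min(idle)  # deterministic choice for the example
            self._busy.add(instance)
            return instance

    def finish_job(self, instance):
        """Mark an instance idle again."""
        with self._lock:
            self._busy.discard(instance)

    def deallocate(self, instance):
        """Remove an instance only if it exists and is not running a job."""
        with self._lock:
            if instance not in self._instances or instance in self._busy:
                return False
            self._instances.remove(instance)
            return True


pool = InstancePool(["worker-1", "worker-2"])
assigned = pool.assign_job()
```

In this example `worker-1` receives the job, so a concurrent request to de-allocate it is refused until `finish_job` is called.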

SLIDE 26
  • Seismic wave observations are stored as data files
  • Each data file is identified by its own latitude and longitude
  • So, blob name = latitude + longitude
  • For grouping the blobs, the region of California is divided into blocks

– based on the seismic wave observation stations

Point (40’59’’N, 122’7’’W)

4059-1227

26

Data Storage
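The naming rule can be illustrated with the slide's own example. The Python sketch below treats each coordinate as the two numeric components written on the slide (for instance "40" and "59") and concatenates them without padding, which is what reproduces 4059-1227; the function name and the tuple representation are assumptions, and the slides do not say what unit the second component is in.

```python
def blob_name(latitude, longitude):
    """Build a blob name from two (first, second) coordinate components, given as strings."""
    lat_code = "".join(latitude)
    lon_code = "".join(longitude)
    return f"{lat_code}-{lon_code}"


# The slide's example point (40’59’’N, 122’7’’W).
name = blob_name(("40", "59"), ("122", "7"))
print(name)  # 4059-1227
```

Because the components are not zero-padded, different points could in principle produce the same name (components "122" and "7" versus "12" and "27"); whether the original system guards against this is not stated.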

SLIDE 27
  • Currently 4096 stations = 4096 blocks
  • Each block is characterized by a range of latitude and a range of longitude
  • So, block identification number = range of latitude + range of longitude
  • Container name = identification number

3525-3451-1193-11647

Block corners: 34’51’’N, 116’47’’W; 35’25’’N, 116’47’’W; 34’51’’N, 119’3’’W; 35’25’’N, 119’3’’W

27

… more
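The container name in the example, 3525-3451-1193-11647, appears to follow the order northern latitude, southern latitude, western longitude, eastern longitude. The Python sketch below reproduces it under that reading; the function name, the argument order, and the representation of each coordinate as two string components are assumptions inferred from the single example on the slide.

```python
def container_name(north, south, west, east):
    """Join the four block bounds, each given as (first, second) string components."""
    bounds = (north, south, west, east)
    return "-".join("".join(components) for components in bounds)


# Block with corners 34’51’’N to 35’25’’N and 116’47’’W to 119’3’’W.
block_id = container_name(("35", "25"), ("34", "51"), ("119", "3"), ("116", "47"))
print(block_id)  # 3525-3451-1193-11647
```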

SLIDE 28
  • Divide the California region into 16 parts
  • 1 table for 1 part
  • Store the list of blocks in each part in its table
  • Improves retrieval

28

… more

SLIDE 29
  • Locate the blob corresponding to the given point

– Retrieve the container name and blob name

  • Retrieve the container name from the table

– Locate the table
– Do a linear search inside the table

  • Blob name = latitude + longitude

29

Data Query Algorithm
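The container lookup can be sketched as a two-level search: first find the table whose part covers the point, then scan that table's blocks linearly. The Python below is illustrative only. It assumes coordinates are decimal degrees with longitude measured westward, and the `Box` and `Block` types, the bounding boxes, and the container names "block-a" to "block-c" are made-up values, not data from the system.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    south: float
    north: float
    east: float   # degrees west; a smaller value lies further east
    west: float

    def contains(self, lat: float, lon_w: float) -> bool:
        return self.south <= lat <= self.north and self.east <= lon_w <= self.west


class Block(NamedTuple):
    box: Box
    container: str


def find_container(tables, lat: float, lon_w: float) -> Optional[str]:
    """Locate the table for the point, then linearly search its blocks."""
    for part_box, blocks in tables.items():
        if not part_box.contains(lat, lon_w):
            continue                      # not the table for this part
        for block in blocks:              # linear search inside the table
            if block.box.contains(lat, lon_w):
                return block.container
    return None


# Two hypothetical parts (tables), each listing its blocks.
tables = {
    Box(34.0, 36.0, 116.0, 120.0): [
        Block(Box(34.0, 35.0, 116.0, 118.0), "block-a"),
        Block(Box(35.0, 36.0, 118.0, 120.0), "block-b"),
    ],
    Box(36.0, 38.0, 116.0, 120.0): [
        Block(Box(36.0, 38.0, 116.0, 120.0), "block-c"),
    ],
}

print(find_container(tables, 35.5, 119.0))  # block-b
```

Splitting the blocks over 16 tables bounds the linear search to one table's blocks rather than all 4096, which is presumably the retrieval benefit the previous slide refers to.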

SLIDE 30

Point: 40’59’’N, 122’7’’W

Region corner labels: 41’43’’N, 124’6’’W; 41’43’’N, 120’43’’W; 39’11’’N, 120’43’’W; 39’11’’N, 124’6’’W; 40’27’’N, 124’6’’W; 39’11’’N, 122’52’’W; 40’27’’N, 122’52’’W; 40’27’’N, 123’56’’W; 40’85’’N, 122’52’’W; 40’85’’N, 123’56’’W

30

Data Query Algorithm

SLIDE 31

Experiment

31

SLIDE 32
  • Performance was evaluated on

– various configurations and numbers of instances of the computation worker role
– datasets from different numbers of seismic wave observation stations

32

Experiment

SLIDE 33

[Figure: Execution Time (seconds) versus Number of stations, both axes marked from 100 to 1000, for five configurations: Single Worker (4 core); Four Worker (4 core); Single Worker (4 core) + TPL; Four Worker (4 core) + TPL; Two Worker (8 core) + TPL]

33

Performance Measurement – Execution Time

SLIDE 34

Conclusion

34

SLIDE 35
  • Implemented the system on Windows Azure
  • Hence, the cloud is a better substitute for HPC clusters
  • Future work:

– Add a Seismic Hazard Analysis feature to the system

35

Conclusion

SLIDE 36

Thanks

36