SLIDE 1

MicroMon: A Monitoring Framework for Tackling Distributed Heterogeneity

Babar Khalid*+, Nolan Rudolph†+, Ramakrishnan Durairajan†, Sudarsun Kannan* *Rutgers University, †University of Oregon (+co-primary authors)

SLIDE 2

Background

  • Modern applications are increasingly becoming geo-distributed
  • e.g., Cassandra, Apache Spark
  • Geo-distributed datacenters (DCs) use heterogeneous resources
  • storage heterogeneity (e.g., SSD, NVMe, hard disk)
  • WAN heterogeneity (e.g., fiber optics, InfiniBand)
  • Hardware heterogeneity in DCs avoids vendor lock-in and reduces operational cost (by combining older/cheaper and newer/more expensive hardware)

  • Careful provisioning can provide high performance at lower cost
SLIDE 3

Problem With Current Systems

  • Current monitoring frameworks for geo-distributed applications are unidimensional
  • can only monitor hosts, storage devices, and networks in isolation
  • Lack hardware heterogeneity awareness
  • e.g., no awareness of storage heterogeneity
  • could impact I/O-intensive applications
  • Coarse-grained monitoring
  • unaware of host-level micrometrics in software and hardware
  • e.g., page cache, node-level I/O traffic, a node's network queue delays
SLIDE 4

Our Solution - MicroMon

  • MicroMon is a fine-grained monitoring, dissemination, and inference framework
  • Collects fine-grained software and hardware metrics (micrometrics) in end hosts and the network
  • e.g., page cache utilization, disk read/write throughput in the end host
  • Filters micrometrics into anomaly reports for efficient dissemination
  • Enables better replica selection for geo-distributed Cassandra
  • A preliminary study of MicroMon integrated with geo-distributed Cassandra shows high throughput gains

SLIDE 5

Outline

  • Background
  • Case Study
  • Design
  • Evaluation
  • Conclusion

SLIDE 6

Case Study - Cassandra

  • Distributed NoSQL database system deployed geographically
  • Manages large amounts of structured data on commodity servers
  • Provides a highly available service with no single point of failure
  • Typically favors availability and partition tolerance
SLIDE 7

Cassandra – Replication

[Figure: a client issues Update(key) to a five-node Cassandra cluster (Node 1 through Node 5)]

SLIDE 8

Cassandra – Replication

[Figure: rack awareness; the same five-node cluster, with the client's Update(key) replicated across Rack 1 and Rack 2]

SLIDE 9

Cassandra – Replication

[Figure: DC awareness; two five-node clusters, one in DC: US and one in DC: Europe, with replicas of Update(key) spread across racks in both datacenters]

SLIDE 10

Cassandra’s Snitch Monitoring

  • Cassandra uses a snitch to monitor network topology and route requests across replicas
  • also provides the capability to spread replicas across DCs to avoid correlated failures
  • The snitch monitors (read) latencies to avoid non-responsive replicas
  • Different types: e.g., GossipingPropertyFileSnitch, Ec2MultiRegionSnitch
  • The gossiping snitch uses rack and datacenter information to gossip across nodes and collect latency information
  • Problem: no hardware heterogeneity awareness
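The latency-only monitoring described above can be sketched as follows. This is an illustrative Python sketch of a snitch-like ranker, not Cassandra's actual (Java) implementation; the class name, smoothing factor, and replica names are all invented for illustration. It shows why two replicas with equal network latency but very different storage hardware look identical to the snitch:

```python
class LatencySnitch:
    """Tracks per-replica read latencies and ranks replicas accordingly.

    Illustrative only; Cassandra's dynamic snitch uses a different scheme.
    """

    def __init__(self, alpha=0.75):
        self.alpha = alpha    # weight given to the newest latency sample
        self.scores = {}      # replica -> smoothed read latency (ms)

    def record(self, replica, latency_ms):
        old = self.scores.get(replica, latency_ms)
        # Exponentially weighted moving average of observed read latency.
        self.scores[replica] = self.alpha * latency_ms + (1 - self.alpha) * old

    def rank(self, replicas):
        # Prefer replicas with the lowest smoothed latency; unseen ones last.
        return sorted(replicas, key=lambda r: self.scores.get(r, float("inf")))


snitch = LatencySnitch()
snitch.record("replica-ssd", 2.0)   # same network latency...
snitch.record("replica-hdd", 2.0)   # ...so storage speed stays invisible
print(snitch.rank(["replica-hdd", "replica-ssd"]))
```

Because only latency is tracked, the SSD and HDD replicas tie, which is exactly the heterogeneity blindness the next slides quantify.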
SLIDE 11

Analysis Goal and Methodology

  • Goal: highlight the lack of heterogeneity awareness
  • Replica configuration
  • SSD replica: sequential storage bandwidth 600 MB/s, random bandwidth 180 MB/s
  • HDD replica: sequential storage bandwidth 120 MB/s, random bandwidth 10 MB/s
  • Network latency across replicas is the same (for this analysis)
  • Workload: YCSB benchmark
  • workload A (50% reads, 50% writes)
  • workload B (95% reads)
  • workload C (100% reads)
SLIDE 12

Impact of Storage Heterogeneity Awareness

  • Significant performance impact relative to the optimal SSD-only configuration
  • Snitch: lacks awareness of storage hardware heterogeneity

[Figure: YCSB throughput (ops/sec) for workloads A, B, and C under HDD-only, SSD-only, and Snitch configurations]

SLIDE 13

Outline

  • Background
  • Case Study
  • Design
  • Evaluation
  • Conclusion

SLIDE 14

Our Design: MicroMon

  • Monitoring and inference framework for geo-distributed applications
  • Performs micrometrics monitoring at the host and network level
  • micrometrics include fine-grained software and hardware metrics
  • Efficiently disseminates collected micrometrics
  • Ongoing: distributed inference engines to guide application requests to the best replica

SLIDE 15

MicroMon Challenges

  • Selection Problem: What micrometrics to consider?
  • Dissemination Problem: How to send all micrometrics?
  • Inference Problem: How to quickly infer from micrometrics?
SLIDE 16

Design - Micrometrics Selection

  • Huge number of micrometric combinations across the app, host OS, and network
  • Micrometrics can vary for different application-level metrics
  • e.g., micrometrics for latency differ from those for throughput
  • Our approach: start with storage and network micrometrics
  • identify hardware and software micrometrics using resource usage
  • e.g., high storage usage -> monitor page cache, read/write latency
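The usage-driven selection above can be sketched as a simple rule: enable fine-grained monitoring only for resources whose utilization is high. This is a minimal Python sketch; the metric names, the utilization threshold, and the function signature are assumptions for illustration, not MicroMon's actual API:

```python
# Hypothetical micrometric names for each resource class (illustrative only).
STORAGE_MICROMETRICS = ["page_cache_hit_ratio", "read_latency", "write_latency"]
NETWORK_MICROMETRICS = ["rtt", "window_size", "goodput"]

def select_micrometrics(storage_util, network_util, threshold=0.7):
    """Enable fine-grained monitoring only for heavily used resources.

    Utilizations are fractions in [0, 1]; the 0.7 threshold is assumed.
    """
    selected = []
    if storage_util > threshold:   # high storage usage -> watch the I/O path
        selected += STORAGE_MICROMETRICS
    if network_util > threshold:   # high network usage -> watch the net stack
        selected += NETWORK_MICROMETRICS
    return selected

# A storage-bound node enables only the storage micrometrics.
print(select_micrometrics(storage_util=0.9, network_util=0.3))
```

This keeps the set of actively collected micrometrics small, which matters for the dissemination cost discussed later.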
SLIDE 17

MicroMon High-level Design

[Figure: MicroMon's high-level design; servers in Enterprise DC A and DC B, connected through switches over an enterprise backbone, with micrometrics collected at both end hosts and switches]

Storage stack micrometrics at the DC:
  • Page cache (SW)
  • File system (SW)
  • Block device driver (SW)
  • Hard disk (HW)

Networking stack micrometrics at the DC:
  • Transport: flags (syn, ack, etc.), window size, goodput, bytes transmitted/received, round-trip time
  • Application: throughput

Networking stack micrometrics at switches:
  • Ingress/egress: port, packet count, byte count, drop count, utilization
  • Buffer: avg. queue length, queue drop count, congestion status
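One way to picture the per-node record these stacks produce is a pair of simple data structures, one for the host and one for the switch. This is an illustrative Python sketch; the field names mirror the figure but the types and defaults are assumptions:

```python
from dataclasses import dataclass

@dataclass
class HostMicrometrics:
    """Micrometrics collected at an end host (field names follow the figure)."""
    # Storage stack (SW + HW)
    page_cache_hits: int = 0
    fs_ops: int = 0
    block_io_ops: int = 0
    disk_busy_pct: float = 0.0
    # Transport-level networking
    window_size: int = 0
    goodput_mbps: float = 0.0
    rtt_ms: float = 0.0

@dataclass
class SwitchMicrometrics:
    """Micrometrics collected at a switch port."""
    packet_count: int = 0
    byte_count: int = 0
    drop_count: int = 0
    avg_queue_len: float = 0.0

host = HostMicrometrics(rtt_ms=0.4, goodput_mbps=512.0)
print(host.rtt_ms)
```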

SLIDE 18

Reducing Dissemination – Anomaly Reports

  • Problem: prohibitive cost of dissemination across thousands of nodes
  • cost increases with the number of hardware and software components
  • e.g., an SSD's SMART data alone contains close to 32 counters
  • Observation: OSes already expose anomalies (indirectly)
  • e.g., high I/O wait time of a process -> higher page cache misses
  • e.g., sustained storage bandwidth near the hardware maximum
  • e.g., network I/O queue wait time alludes to TCP congestion
  • Proposed idea: instead of sending thousands of micrometrics to the decision agent, report only OS-perceived anomalies
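The filtering step above can be sketched as a threshold check: a node sends only the micrometrics that are out of range, not the full set. This is a minimal sketch; the metric names and threshold values are made up for illustration and echo the three example anomalies on this slide:

```python
# Assumed anomaly thresholds (illustrative values, not from the paper).
THRESHOLDS = {
    "io_wait_pct":      20.0,  # high I/O wait hints at page-cache misses
    "storage_bw_ratio":  0.9,  # sustained BW near the hardware maximum
    "net_queue_wait_ms": 5.0,  # queueing delay alludes to TCP congestion
}

def anomaly_report(micrometrics):
    """Return only the metrics that exceed their anomaly threshold."""
    return {name: value
            for name, value in micrometrics.items()
            if value > THRESHOLDS.get(name, float("inf"))}

sample = {"io_wait_pct": 35.0, "storage_bw_ratio": 0.4, "net_queue_wait_ms": 1.2}
print(anomaly_report(sample))   # only the I/O wait anomaly is disseminated
```

With thousands of micrometrics per node, most of which are in range most of the time, this reduces dissemination to a handful of anomaly entries per report.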

SLIDE 19

Reducing Dissemination - Network Telemetry

  • Network telemetry offers aggregated stats about the state of the network
  • Idea: co-design in-band network telemetry (INT) with the end-host OS
  • monitor packets sent from the end host carry anomaly reports as payload
  • network anomaly reports are obtained using INT
  • Pre-established anomaly thresholds further reduce the total aggregated stats

[Figure: a monitor packet with an INT header and payload, combining network anomalies gathered in-band with end-host anomalies]
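The monitor-packet layout can be sketched with Python's `struct` module. The field layout and sizes here are assumptions for illustration (real INT headers, as specified by P4.org, carry more fields); the point is that the end host packs its anomaly report as the payload, and switches along the path append their own telemetry in-band:

```python
import struct

# Assumed layouts (illustrative, not the real INT header format).
INT_HDR = struct.Struct("!BBH")   # version, hop count, payload length
ANOMALY = struct.Struct("!Hf")    # metric id, anomalous value

def build_monitor_packet(anomalies):
    """Pack end-host anomalies as the payload of a telemetry monitor packet.

    anomalies: list of (metric_id, value) pairs observed at the end host.
    """
    payload = b"".join(ANOMALY.pack(mid, val) for mid, val in anomalies)
    # Hop count starts at 0; switches on the path append their own stats.
    header = INT_HDR.pack(1, 0, len(payload))
    return header + payload

pkt = build_monitor_packet([(7, 35.0)])   # metric id 7: e.g. I/O wait anomaly
version, hops, length = INT_HDR.unpack_from(pkt)
print(version, hops, length)
```

Because only threshold-crossing anomalies are packed, the payload stays a few bytes per report rather than thousands of raw micrometrics.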

SLIDE 20

Scalable Inference - Scoring-based Inference

  • Cassandra uses simple scoring-based inference
  • replicas sorted and ranked by network latency
  • Problem: bandwidth-sensitive applications need higher weights for WAN-based micrometrics than for host-level micrometrics
  • Our approach:
  • assign equal weights to all software and hardware micrometrics by default
  • use collected micrometrics to calculate a replica score
  • route requests to replicas with higher scores
  • flexibility to assign higher weights to WAN-based micrometrics
  • Ongoing: designing a generic, self-adaptive inference engine
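The scoring scheme above can be sketched as a weighted sum over normalized micrometrics. This is a minimal sketch under assumptions: the metric names, the normalization (higher = better), and the sample weights are invented for illustration, not MicroMon's actual values:

```python
def replica_score(metrics, weights=None):
    """Combine normalized micrometrics (higher = better) into one score.

    With no weights given, all micrometrics count equally (the default
    described on the slide).
    """
    weights = weights or {m: 1.0 for m in metrics}
    return sum(weights.get(m, 1.0) * v for m, v in metrics.items())

# Hypothetical normalized scores per replica (higher = better).
replicas = {
    "nvme-utah": {"storage_score": 0.9, "wan_score": 0.5},
    "hdd-apt":   {"storage_score": 0.2, "wan_score": 0.9},
}

# Equal weights: the storage-fast replica wins.
best = max(replicas, key=lambda r: replica_score(replicas[r]))
print(best)

# A bandwidth-sensitive app can weight WAN micrometrics more heavily,
# flipping the decision toward the replica with the better WAN path.
wan_heavy = {"storage_score": 1.0, "wan_score": 3.0}
best_wan = max(replicas, key=lambda r: replica_score(replicas[r], wan_heavy))
print(best_wan)
```

Exposing the weights is what gives the flexibility mentioned above; making them self-adjusting is the ongoing self-adaptive inference work.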
SLIDE 21

Outline

  • Background
  • Case Study
  • Design
  • Evaluation
  • Conclusion

SLIDE 22

Evaluation Goals

  • Understand the impact of storage heterogeneity with MicroMon
  • Understand the impact of storage heterogeneity + network latency
  • Analyze the page cache impact (see paper for details)
SLIDE 23

Analysis Methodology

  • Multiple DCs from the CloudLab infrastructure
  • three nodes located in the Utah, APT, and Emulab DCs
  • Replica configuration
  • Utah replica: NVMe storage (seq bw 600 MB/s, rand bw 180 MB/s)
  • APT replica: HDD (seq bw 120 MB/s, rand bw 10 MB/s)
  • Emulab master node: HDD (same as above)
  • Network latencies
  • 400 µs between the Utah (NVMe) replica and the master node
  • 600 µs between the APT (HDD) replica and the master node
  • Workload: YCSB benchmark
  • workload A (50% reads, 50% writes)
  • workload B (95% reads)
  • workload C (100% reads)
SLIDE 24

MicroMon – Storage Heterogeneity

  • Snitch lacks storage heterogeneity awareness
  • MicroMon's storage heterogeneity awareness provides performance close to the SSD-only (optimal) configuration
  • Performance improves by up to 49% for large thread configurations

[Figure: ops/sec for workloads A, B, and C at 32, 64, and 128 clients under HDD-only, SSD-only, Snitch, and MicroMon configurations]

SLIDE 25

Storage Heterogeneity + Network Latency

  • Introduce network latency for the SSD-only node
  • At high network latencies (e.g., beyond 10 ms), the SSD's benefits diminish

[Figure: throughput (ops/s) of Snitch vs. MicroMon as network latency grows from 0 ms to 25 ms]
SLIDE 26

Conclusion

  • Datacenter systems are becoming increasingly heterogeneous
  • Deploying geo-distributed applications in heterogeneous datacenters requires a redesign of monitoring mechanisms
  • We propose MicroMon, a fine-grained micrometric monitoring, dissemination, and inference framework
  • Our ongoing work focuses on efficient dissemination and self-adaptive inference mechanisms

SLIDE 27

Thanks!

Questions?

Contact:

sudarsun.kannan@rutgers.edu, ram@cs.uoregon.edu