A Sleep-based Communication Mechanism to Save Processor Utilization in Distributed Streaming Systems

SLIDE 1

A Sleep-based Communication Mechanism to Save Processor Utilization in Distributed Streaming Systems Shoaib Akram, Angelos Bilas Outline Introduction Our Work Experimental Platform Results A Broader Picture of Our Work Conclusions

A Sleep-based Communication Mechanism to Save Processor Utilization in Distributed Streaming Systems

Shoaib Akram Angelos Bilas

Foundation for Research and Technology - Hellas (FORTH) Institute of Computer Science (ICS)

May 1, 2011

SLIDE 2

  1. Introduction
  2. Our Work
  3. Experimental Platform
  4. Results
  5. A Broader Picture of Our Work
  6. Conclusions

SLIDE 3

Efficiency in Back-end Processing

  • Efficiency in back-end processing is important.
  • Scalability is important, but the software stacks of individual nodes are becoming complex:
      • Runtime bloat (Nick Mitchell).
      • Complex messaging protocols.
      • Layers of software, libraries, etc.
  • This leads to over-provisioning of resources for back-end processing.

SLIDE 4

Distributed Streaming Systems

  • Recently gaining attention due to the large amounts of data to be processed/filtered.
  • Static queries and moving data.
  • Operators similar to those of traditional databases.
  • Reasons for adopting a distributed model:
      • Geographically distributed sources of data.
      • Speed-up of application queries.
  • Borealis (academic consortium) and System S (IBM) are common examples.

SLIDE 5

Key Requirements of Distributed Streaming Systems

  • Scalability to many nodes.
  • Provisioning for heavy inter-node communication.
  • Rich library of stream operators.
  • Communication protocol and operators should be decoupled.

SLIDE 6

The Architecture of Borealis - Event Structure

  • Event-driven architecture.
  • The notion of streams and tuples.

[Figure: structure of a Borealis event. Labels: Stream Info, Source of Tuples, Number of Tuples, Size of Tuple, Tuples (Tuple1 to Tuple4), Time Stamp, Field1, Field2, Field3.]
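
The notion of streams and tuples can be made concrete with a small data-structure sketch. The class and field names below (`Event`, `StreamTuple`, `source`, `tuple_size`) are hypothetical and only read off the labels of the event figure; they are not the actual Borealis types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StreamTuple:
    """One tuple: a time stamp plus data fields (illustrative layout)."""
    timestamp: float
    fields: List[int]


@dataclass
class Event:
    """A batch of tuples from one stream, preceded by stream information."""
    source: str        # where the tuples come from
    tuple_size: int    # size of one tuple in bytes (assumed fixed per stream)
    tuples: List[StreamTuple] = field(default_factory=list)

    @property
    def num_tuples(self) -> int:
        """Number of tuples carried by this event."""
        return len(self.tuples)


event = Event(source="node-0", tuple_size=128)
event.tuples.append(StreamTuple(timestamp=0.0, fields=[1, 2, 3]))
event.tuples.append(StreamTuple(timestamp=0.1, fields=[4, 5, 6]))
```

Grouping several tuples under one header is what lets a single event carry a whole batch of tuples.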

SLIDE 7

The Architecture of Borealis - Threads and Data Structures

  • Four threads that work asynchronously:
      • receive thread
      • process thread
      • prepare thread
      • send thread
  • Data structures for inter-thread communication.
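
A minimal sketch of such a four-thread pipeline, assuming plain FIFO queues as the inter-thread data structures. The stage functions (parsing, doubling, re-joining) are placeholders invented for illustration and do not reflect what the Borealis threads actually compute.

```python
import queue
import threading

STOP = object()  # sentinel passed down the pipeline to shut it down


def run_stage(work, inbox, outbox):
    """Take items from inbox, apply work, and hand results to outbox."""
    while True:
        item = inbox.get()  # blocks until the upstream thread delivers
        if item is STOP:
            outbox.put(STOP)
            return
        outbox.put(work(item))


# One queue between each pair of neighbouring threads.
wire_in, to_process, to_prepare, to_send, wire_out = (queue.Queue() for _ in range(5))

stages = [
    ("receive", lambda raw: [int(x) for x in raw.split(",")], wire_in, to_process),
    ("process", lambda tup: [x * 2 for x in tup], to_process, to_prepare),
    ("prepare", lambda tup: ",".join(str(x) for x in tup), to_prepare, to_send),
    ("send", lambda msg: msg, to_send, wire_out),
]
threads = [
    threading.Thread(target=run_stage, args=(work, inbox, outbox), name=name)
    for name, work, inbox, outbox in stages
]
for t in threads:
    t.start()

for raw in ["1,2,3", "4,5,6"]:
    wire_in.put(raw)
wire_in.put(STOP)
for t in threads:
    t.join()

# Drain what the send thread "put on the wire".
results = []
while True:
    item = wire_out.get()
    if item is STOP:
        break
    results.append(item)
```

Each thread only touches its own inbox and outbox, so the stages run asynchronously and the queues absorb differences in their speeds.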

SLIDE 8

Communication Subsystems in Distributed Middleware Systems

  • Send/Receive operations are implemented using:
      • Interrupts - High overhead at high network speeds and large message rates.
      • Polling - Wastes CPU cycles at low network rates.
  • Send/Receive API provided by Linux sockets:
      • Blocking sockets (interrupts).
      • Non-blocking sockets (polling).
      • Monitoring multiple sockets (blocking call to select).
  • Problems with monitoring multiple sockets with select.
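
The three socket modes listed above can be tried out with a short sketch. It uses Python's standard `socket` and `select` modules on a local socket pair rather than the C sockets API, so it only illustrates the behaviour, not the cost, of each mode.

```python
import select
import socket

a, b = socket.socketpair()  # two connected local sockets, for demonstration

# Non-blocking socket (polling): recv fails immediately if nothing is queued.
b.setblocking(False)
try:
    b.recv(16)
    polled_empty = False
except BlockingIOError:
    polled_empty = True

# select on several sockets: blocks until one is readable or the timeout expires.
readable, _, _ = select.select([a, b], [], [], 0.05)
idle_ready = list(readable)  # nothing has been sent yet

a.sendall(b"tuple")
readable, _, _ = select.select([a, b], [], [], 1.0)
ready_after_send = list(readable)

# Blocking socket: recv waits in the kernel until data arrives.
b.setblocking(True)
data = b.recv(16)

a.close()
b.close()
```

A polling loop would repeat the non-blocking `recv` continuously, which is where the wasted CPU cycles at low network rates come from.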

SLIDE 9

Sleeping - An Alternative Approach

  • Sleep for a specific amount of time if no communication is expected.
  • Regulation of sleeping time:
      • Kernel issues.
      • Multiple applications.
      • Parameters of a single application change.
  • Granularity of sleeping time may change with a different kernel.
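
One way to see the granularity issue on a given machine is to compare requested and measured sleep times. The sketch below is a generic measurement using Python's `time.sleep` and `time.perf_counter`; the helper name `measure_sleep` and the chosen intervals are illustrative assumptions, and the numbers it prints depend on the kernel and the system load.

```python
import time


def measure_sleep(requested_s, repeats=20):
    """Return the average time actually spent in time.sleep(requested_s)."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        time.sleep(requested_s)
        total += time.perf_counter() - start
    return total / repeats


for requested_ms in (0.1, 1.0, 10.0):
    actual_ms = measure_sleep(requested_ms / 1000.0) * 1000.0
    print(f"requested {requested_ms:5.1f} ms, slept {actual_ms:6.2f} ms on average")
```

If the measured values for short requests are much larger than the requested ones, the effective granularity of sleeping on that system is coarser than the interval being asked for.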

SLIDE 10

Our Approach: Distribution/Accumulation of Work

  • A typical configuration of a data streaming system is a pipeline of senders/receivers.
  • Send and receive threads work asynchronously.
  • Goal of send thread:
      • Node downstream has enough work to perform.
  • Goal of receive thread:
      • Unpack the events and give work to process thread.
      • Layers above the communication protocol have enough work to do.

SLIDE 11

Working in Waves

  • Both send and receive threads maintain messaging queues.
  • The receive thread informs the send thread of the availability of free slots in the queue by sending a message (credit message).
  • After processing a few buffers, the receive thread sends a credit message to the send thread.
  • The credit message allows the send thread to send data in buffers that the receive thread has already made available.
  • If the send thread cannot find a credit message, it sleeps.

SLIDE 12

Working in Waves

  • The receive thread unpacks the events, hands the events to the event handler and then checks for an event in the next slot in the queue.
  • If the receive thread cannot find data in the buffer, it sleeps.
  • While it is sleeping, the send thread fills up the queue with new events.

SLIDE 13

Working in Waves: Summary

  • Sleeping criterion for send thread:
      • Criterion: Sleep for a fixed amount of time if no credits are available.
      • Rationale: Receiver is busy unpacking messages and will send credits at some point.
  • Sleeping criterion for receive thread:
      • Criterion: Sleep for a fixed amount of time if no new message is available.
      • Rationale:
          • All the available messages were unpacked and distributed to the layer above.
          • Processing is much heavier than unpacking.
          • Collect work while consuming no extra CPU cycles.
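
The two sleeping criteria can be sketched as a pair of threads exchanging buffers and credit messages. This is a toy simulation under stated assumptions: the constants (`RECV_SLOTS`, `CREDIT_EVERY`, `SLEEP_S`, `TOTAL`) are made-up demo values, in-process queues stand in for the network, and none of the names come from Borealis or Myrinet MX.

```python
import queue
import threading
import time

RECV_SLOTS = 8      # slots in the receive-side queue (assumed demo value)
CREDIT_EVERY = 4    # receiver returns credits after this many buffers
SLEEP_S = 0.001     # fixed sleeping time (assumed demo value)
TOTAL = 50          # buffers to transfer in this demo

data_q = queue.Queue()    # buffers in flight, sender -> receiver
credit_q = queue.Queue()  # credit messages, receiver -> sender
delivered = []
sleeps = {"send": 0, "recv": 0}


def send_thread():
    credits = RECV_SLOTS  # all receive slots start out free
    sent = 0
    while sent < TOTAL:
        try:
            while True:  # collect any credit messages that have arrived
                credits += credit_q.get_nowait()
        except queue.Empty:
            pass
        if credits == 0:
            sleeps["send"] += 1  # no credits: receiver is busy, so sleep
            time.sleep(SLEEP_S)
            continue
        data_q.put(sent)
        credits -= 1
        sent += 1


def recv_thread():
    since_credit = 0
    while len(delivered) < TOTAL:
        try:
            buf = data_q.get_nowait()
        except queue.Empty:
            sleeps["recv"] += 1  # nothing new: sleep while work accumulates
            time.sleep(SLEEP_S)
            continue
        delivered.append(buf)  # "unpack" and hand to the layer above
        since_credit += 1
        if since_credit == CREDIT_EVERY:
            credit_q.put(since_credit)  # credit message: these slots are free again
            since_credit = 0


workers = [threading.Thread(target=send_thread), threading.Thread(target=recv_thread)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print("sleeps:", sleeps)
```

Because the sender never holds more than `RECV_SLOTS` outstanding credits, at most that many buffers are in flight, and whenever the sender runs out of credits the receiver still has unprocessed buffers, so neither thread can sleep forever.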

SLIDE 14

Machine Parameters and Benchmark for Evaluation

  • Four server-type systems running Linux CentOS release 5.4.
  • Two quad-core Intel Xeon processors (2-way hyper-threaded).
  • 14 Gbytes DRAM.
  • 10 Gbits/s Ethernet NIC from Myrinet.
  • 10 Gbits/s Ethernet HP ProCurve 3400cl switch.
  • A custom benchmark that filters the incoming data (the filter condition is always true, to load the network).
  • The first node generates the tuples, the next two process the tuples.
  • The last node receives the tuples and consumes them internally.

SLIDE 15

Some Parameters of Borealis

  • No. of instances of Borealis (8).
  • Batching factor (varying).
  • Tuple size (varying).
  • Size of send-side queue (10).
  • Size of receive-side queue (100).
  • Frequency of exchanging credits (every 10 buffers).
  • Sleeping time is 10 ms.
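
The fixed parameters above can be collected in a small configuration object, which also makes two simple consequences explicit. The class and method names are hypothetical; only the numeric defaults come from the slide, and the derived values are plain arithmetic on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BorealisParams:
    """Fixed evaluation parameters (batching factor and tuple size vary)."""
    instances: int = 8
    send_queue_slots: int = 10
    recv_queue_slots: int = 100
    credit_every_buffers: int = 10
    sleep_ms: float = 10.0

    def max_wakeups_per_second(self) -> float:
        """Upper bound on wake-ups of a thread that does nothing but sleep."""
        return 1000.0 / self.sleep_ms

    def credit_messages_per_full_queue(self) -> int:
        """Credit messages sent while draining a full receive-side queue."""
        return self.recv_queue_slots // self.credit_every_buffers


params = BorealisParams()
print(params.max_wakeups_per_second(), params.credit_messages_per_full_queue())
```

With a 10 ms sleeping time an idle thread wakes at most 100 times per second, and draining a full receive-side queue of 100 buffers produces 10 credit messages.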

SLIDE 16

Myrinet MX - A User-level Networking API

  • Provides a user-level networking API.
  • Baseline throughput is higher:
      • Removes one copy on the send side.
      • Removes two copies on the receive path.
      • Reduces the number of interrupts on the receive side.
  • Fine-grained control for managing buffers.
  • Ease of implementation of flow-control mechanisms.

SLIDE 17

Our Configurations of Borealis for Evaluation

  • tcp: Baseline version of Borealis with TCP/IP.
  • mx-poll: Borealis with Myrinet MX protocol and polling operations for testing buffers.
  • mx-int: Borealis with Myrinet MX protocol and polling operations for testing buffers.
  • mx-sleep: Borealis with Myrinet MX protocol and using the sleep system call.

SLIDE 18

Baseline Throughput of Borealis with TCP and Myrinet MX

[Figure: throughput (ktuples/sec) versus event size (bytes, 128 to 16K) for tcp, mx-poll, mx-int and mx-sleep. Panels: (a) 128 bytes, (b) 512 bytes.]

  • mx-int improves the throughput of Borealis compared to tcp (22%).
  • mx-poll has lower throughput compared to mx-int.
  • mx-adp gives better throughput compared to tcp (23-63%).

SLIDE 19

Throughput and CPU Utilization - All Configurations

[Figure: throughput (ktuples/sec) and CPU utilization (%) versus event size (bytes, up to 16K) for mx-poll, mx-int and mx-sleep. Throughput panels: (c) 128 bytes, (d) 256 bytes, (e) 512 bytes. Utilization panels: (f) 128 bytes, (g) 256 bytes, (h) 512 bytes.]

SLIDE 20

General Trends in Writing Middleware Systems

  • Modules are written by different developers.
  • Accounting for heterogeneous architectures.
  • Accounting for slow networks.
  • Over-provisioning for memory.

SLIDE 21

General Trends in Writing Middleware Systems

  • Buffer management across threads/modules:
      • (buffer ptr, size).
      • Copying a buffer and passing it.
  • Serialization-Deserialization - Heterogeneity and Portability:
      • Communication among heterogeneous nodes.
      • Packing data structures spread in different parts of memory.
      • Overhead of copies.
      • Use a separate send operation to send each field of a data structure.
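
The trade-off between packing a data structure into one buffer and sending each field separately can be sketched with Python's `struct` and `socket` modules. The tuple layout (`TUPLE_FORMAT`: a time stamp plus three integer fields) is a hypothetical example, and the local socket pair only stands in for a network connection.

```python
import socket
import struct

# Hypothetical tuple layout: time stamp (double) plus three 32-bit integers,
# in network byte order so that heterogeneous nodes agree on the encoding.
TUPLE_FORMAT = "!diii"
size = struct.calcsize(TUPLE_FORMAT)


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed early")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


timestamp, f1, f2, f3 = 12.5, 1, 2, 3
a, b = socket.socketpair()

# Option 1: one send operation per field (no packing copy, more send calls).
per_field_sends = 0
for fmt, value in (("!d", timestamp), ("!i", f1), ("!i", f2), ("!i", f3)):
    a.sendall(struct.pack(fmt, value))
    per_field_sends += 1

# Option 2: pack all fields into one contiguous buffer (extra copy, one send).
a.sendall(struct.pack(TUPLE_FORMAT, timestamp, f1, f2, f3))

# Both options put the same bytes on the wire, so the receiver decodes them alike.
first = struct.unpack(TUPLE_FORMAT, recv_exact(b, size))
second = struct.unpack(TUPLE_FORMAT, recv_exact(b, size))

a.close()
b.close()
```

Which option is cheaper depends on the relative cost of a copy and of a send operation, which is exactly the overhead the bullets above point at.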

SLIDE 22

General Trends in Writing Middleware Systems

  • Message Queuing for Asynchronous Operation:
      • Threads might block on slow networks.
      • Buffering provides asynchronous operation.
      • Not necessary on fast networks.
      • Send the event from the prepare thread and block if necessary.
  • Flow Control:
      • Memory is usually over-provisioned.
      • Virtual memory is backed up by swap space on disk.
      • Proper flow-control involves accounting for the memory in use (by different threads).
      • Proper inter-thread flow-control saves memory resources for other tasks in the system.
      • Different structures could possibly allow flow-control (which one to choose).
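
One of the simplest structures that gives inter-thread flow-control is a bounded queue: the producer blocks once a fixed number of buffers is outstanding, so the memory held between two threads stays within a known budget. The sketch below uses Python's `queue.Queue` with a `maxsize`; the budget `MAX_BUFFERS`, the buffer size, and the thread names are illustrative assumptions.

```python
import queue
import threading

MAX_BUFFERS = 4  # memory budget between the two threads, in buffers (assumed)
handoff = queue.Queue(maxsize=MAX_BUFFERS)
peak = 0
consumed = []


def prepare_thread(n):
    """Produce n buffers; put() blocks while the queue is full."""
    global peak
    for _ in range(n):
        handoff.put(bytes(1024))
        peak = max(peak, handoff.qsize())
    handoff.put(None)  # end-of-stream marker


def send_thread():
    """Consume buffers until the end-of-stream marker arrives."""
    while True:
        buf = handoff.get()
        if buf is None:
            return
        consumed.append(len(buf))


producer = threading.Thread(target=prepare_thread, args=(20,))
consumer = threading.Thread(target=send_thread)
producer.start()
consumer.start()
producer.join()
consumer.join()
```

An unbounded queue would accept all 20 buffers at once; the bounded one never holds more than `MAX_BUFFERS`, leaving the rest of the memory to other tasks.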

SLIDE 23

Observations from the Borealis Communication Flow

[Figure: Borealis communication flow through the receive, process, prepare and send threads. Labels: stack space, new event, global buffer (rbuf), global buffer (wbuf), new event (string), serialized event; steps numbered 1, 2, 3a, 3b, 4, 5a, 5b, 6, 7.]

SLIDE 24

Conclusions

  • Sleep-based communication policies can save CPU cycles for other tasks.
  • The main problem is to find a criterion for sleeping.
  • Portability is a concern.
  • Save CPU cycles for a given application:
      • Less power.
  • Give CPU cycles to some other application:
      • Improves (overall) energy efficiency of a system.
  • Too much focus on scaling?