Runtime School of Computer Science Jose E. Labra Gayo Course - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Software Architecture

School of Computer Science University of Oviedo

Runtime

Jose E. Labra Gayo Course 2019/2020

slide-2
SLIDE 2


Runtime behaviour

Also called: Components and connectors

slide-3
SLIDE 3


Taxonomy

slide-4
SLIDE 4


Batch Pipes & Filters

Pipes & Filters with uniform interface

slide-5
SLIDE 5


Batch

Independent programs are executed sequentially
Data is passed from one program to the next

Note: the batch style is the grandfather of software architectural styles

[Diagram: a sequence of stages joined by connectors, with labelled write and read ports.]

slide-6
SLIDE 6


Batch

Elements:

Independent executable programs

Constraints

The output of one program is linked to the input of the next
A program usually waits for the previous one to finish its execution
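The batch constraints above can be sketched in a few lines. This is only an illustration under assumptions: the stage functions `extract`, `sort_records` and `report` are hypothetical stand-ins for independent executable programs, and a list in memory stands in for the file handed from one program to the next.

```python
from typing import Callable, List

Stage = Callable[[List[str]], List[str]]

# Hypothetical stages: each one consumes the complete output of the
# previous stage before producing its own output.
def extract(_: List[str]) -> List[str]:
    return ["3", "1", "2"]

def sort_records(records: List[str]) -> List[str]:
    return sorted(records)

def report(records: List[str]) -> List[str]:
    return [f"record {r}" for r in records]

def run_batch(stages: List[Stage]) -> List[str]:
    data: List[str] = []
    for stage in stages:
        # The next stage starts only after the previous one has finished.
        data = stage(data)
    return data

result = run_batch([extract, sort_records, report])
```

Because no stage starts before the previous one ends, nothing overlaps in time, which is where the low throughput and high latency of the style come from.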

slide-7
SLIDE 7


Batch

Advantages

Low coupling between components
Re-configurability
Debugging: it is possible to debug each input independently

Challenges

It does not offer an interactive interface
Requires external intervention
No support for concurrency: low throughput, high latency

Definitions

Throughput: rate at which something can be processed. Example: number of jobs per second
Latency: time delay experienced by a process. Example: 2 seconds

slide-8
SLIDE 8


Pipes & Filters

Data flows through pipes and is processed by filters

[Diagram: filters connected by pipes, with labelled write and read ports.]

slide-9
SLIDE 9


Pipes & Filters

Elements

Filter: component that transforms data

Filters can be executed concurrently

Pipe: Takes output data from one filter to the input of another filter

Properties: buffer size, data format, interaction protocol

slide-10
SLIDE 10


Pipes & Filters

Constraints

Pipes connect outputs from one filter to inputs of other filters
Filters must agree on the exchange format they admit
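A minimal sketch of filters and pipes, assuming Python generators play the role of pipes and a stream of integers is the agreed exchange format. The filter names `read_numbers`, `keep_even` and `square` are hypothetical. Generators pass items lazily, one at a time; they do not run the filters on separate threads, so this shows the data flow rather than true concurrency.

```python
from typing import Iterable, Iterator

def read_numbers(limit: int) -> Iterator[int]:
    """Source filter: emits 0, 1, ..., limit - 1."""
    yield from range(limit)

def keep_even(numbers: Iterable[int]) -> Iterator[int]:
    """Filter: passes only even numbers downstream."""
    for n in numbers:
        if n % 2 == 0:
            yield n

def square(numbers: Iterable[int]) -> Iterator[int]:
    """Filter: transforms each number into its square."""
    for n in numbers:
        yield n * n

# Each call wires the output of one filter to the input of the next.
pipeline = square(keep_even(read_numbers(10)))
squares = list(pipeline)
```

Each filter can be tested on its own by feeding it a plain list, which is the testability advantage the style claims.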

slide-11
SLIDE 11


Pipes & Filters

Advantages

Better understanding of global system

Total behavior = sum of each filter behavior

Reusability:

Filters can be recombined

Evolution and extensibility:

It is possible to create/add new filters It is possible to substitute old filters by new ones

Testability

Independent verification of each filter

Performance

It enables concurrent execution of filters

Challenges

Possible delays in case of long pipes
It may be difficult to pass complex data structures
Non-interactivity: a filter cannot interact with its environment

slide-12
SLIDE 12


Pipes & Filters

Examples & Applications

Unix

who | wc -l

Java

Classes in java.io (PipedReader, PipedWriter)

Yahoo Pipes

slide-13
SLIDE 13


Pipes & Filters - uniform interface

Variant of Pipes & Filters where filters have the same interface

Elements

The same as in Pipes & Filters

Constraints

Filters must have a uniform interface
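One way to picture the uniform interface constraint, as a sketch under assumptions: every filter has the same hypothetical signature (an iterable of text lines in, an iterable of text lines out), similar in spirit to Unix programs reading stdin and writing stdout. The names `Filter`, `compose`, `strip_blank`, `to_upper` and `number` are invented for the illustration.

```python
from functools import reduce
from typing import Callable, Iterable, List

# The uniform interface: lines of text in, lines of text out.
Filter = Callable[[Iterable[str]], Iterable[str]]

def strip_blank(lines: Iterable[str]) -> Iterable[str]:
    return (line for line in lines if line.strip())

def to_upper(lines: Iterable[str]) -> Iterable[str]:
    return (line.upper() for line in lines)

def number(lines: Iterable[str]) -> Iterable[str]:
    return (f"{i}: {line}" for i, line in enumerate(lines, start=1))

def compose(filters: List[Filter]) -> Filter:
    """Chain any filters in the given order; possible because they share one interface."""
    def run(lines: Iterable[str]) -> Iterable[str]:
        return reduce(lambda data, f: f(data), filters, lines)
    return run

numbered = list(compose([strip_blank, to_upper, number])(["a", "", "b"]))
```

Since all filters share one signature, they can be reordered or swapped without touching their code. The cost named in the challenges shows up when richer data has to be converted to and from text lines.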

slide-14
SLIDE 14


Pipes & Filters - uniform interface

Advantages:

Independent development of filters
Re-configurability
Facilitates system understanding

Challenges:

Performance can be affected if data have to be converted to the uniform interface

Marshalling

slide-15
SLIDE 15


Pipes & Filters - uniform interface

Examples:

Unix operating system

Programs with a text input (stdin) and 2 text outputs (stdout and stderr)

Web architecture: REST

slide-16
SLIDE 16


Master-Slave

slide-17
SLIDE 17


Master-Slave

The master divides the work into sub-tasks
It assigns each sub-task to a different node
The computational result is obtained by combining the results of the slaves

[Diagram: the master splits the problem into task 1 ... task N, sends them to slave 1 ... slave N, and combines result 1 ... result N into the solution.]

slide-18
SLIDE 18


Master-Slave

Elements

Master: coordinates execution
Slave: does a task and returns the result

Constraints

Slave nodes are only in charge of the computation
Control is done by the master node
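A small sketch of the division of roles, assuming a thread pool stands in for the slave nodes. The names `worker_sum` and `master_sum` are hypothetical; the master splits the problem, hands out sub-tasks and combines the partial results, while each worker only computes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

def worker_sum(chunk: List[int]) -> int:
    """Slave role: pure computation on one sub-task."""
    return sum(chunk)

def master_sum(numbers: List[int], n_workers: int = 4) -> int:
    """Master role: split the problem, assign sub-tasks, combine results."""
    size = max(1, -(-len(numbers) // n_workers))  # ceiling division
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(worker_sum, chunks))
    return sum(partial_results)

total = master_sum(list(range(101)))
```

All control flow lives in `master_sum`, which is also why the style depends so heavily on the master node.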

slide-19
SLIDE 19


Master-Slave

Advantages

Parallel computation
Fault tolerance

Challenges

Difficult to coordinate work between slaves
Dependency on the master node
Dependency on the physical configuration

slide-20
SLIDE 20


Master-Slave

Applications:

Process control systems
Embedded systems
Fault tolerant systems
Search systems

slide-21
SLIDE 21


MVC: Model - View - Controller
MVC variants
PAC: Presentation - Abstraction - Control

slide-22
SLIDE 22


MVC

MVC: Model - View - Controller

Proposed by Trygve Reenskaug (end of the 1970s)
Solution for GUIs
The controller separates the model from the view
"Mental model" offered through views

slide-23
SLIDE 23


MVC

Elements

Model: represents business logic and state
View: offers a representation of the state to the user
Controller: coordinates interaction, views and model

[Diagram: the user, the user's mental model, View 1 and View 2, the controller and the model.]

slide-24
SLIDE 24


MVC

Constraints

Controller processes user events

Creates/removes views
Handles interaction

Views only show values
Models are independent of controllers/views
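The MVC constraints can be sketched as follows. This is an illustration only: `CounterModel`, `TextView` and `CounterController` are hypothetical classes, and a string attribute stands in for what a real GUI toolkit would draw on screen.

```python
from typing import Callable, List

class CounterModel:
    """Model: state and business logic; it knows nothing about views or controllers."""
    def __init__(self) -> None:
        self.value = 0
        self._listeners: List[Callable[[int], None]] = []

    def subscribe(self, listener: Callable[[int], None]) -> None:
        self._listeners.append(listener)

    def increment(self) -> None:
        self.value += 1
        for listener in self._listeners:
            listener(self.value)

class TextView:
    """View: only shows the values it is given."""
    def __init__(self, label: str) -> None:
        self.label = label
        self.rendered = ""

    def render(self, value: int) -> None:
        self.rendered = f"{self.label}: {value}"

class CounterController:
    """Controller: processes user events and creates views."""
    def __init__(self, model: CounterModel) -> None:
        self.model = model
        self.views: List[TextView] = []

    def add_view(self, label: str) -> TextView:
        view = TextView(label)
        self.model.subscribe(view.render)  # keeps this view synchronized
        view.render(self.model.value)
        self.views.append(view)
        return view

    def on_click(self) -> None:
        self.model.increment()

model = CounterModel()
controller = CounterController(model)
first = controller.add_view("A")
second = controller.add_view("B")
controller.on_click()
controller.on_click()
```

Both views end up showing the same count, which is the "multiple synchronized views of one model" advantage. The model only holds callbacks, so it stays independent of the view and controller classes.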

slide-25
SLIDE 25


MVC

Advantages

Supports multiple views of the same model
Views synchronization
Separation of concerns: interaction (controller), state (model)
It is easy to create new views and controllers
Easy to modify look & feel
Creation of generic frameworks

Challenges

Increases complexity of GUI development
Coupling between controllers and views
Controllers/views should depend on a model interface
Some difficulties for GUI tools

slide-26
SLIDE 26


MVC

Applications

Lots of web frameworks follow MVC

Ruby on Rails, Spring MVC, Play, etc.

Some variants

Push: controllers send orders to views. Example: Ruby on Rails (RoR)

Pull: controllers receive orders from views. Example: Play

slide-27
SLIDE 27


MVC variants

PAC
Model-View-Presenter
Model-View-ViewModel
...

slide-28
SLIDE 28


PAC

PAC: Presentation-Abstraction-Control

Hierarchy of agents
Each agent contains 3 components

[Diagram: three PAC agents, each with Presentation, Abstraction and Control components.]

slide-29
SLIDE 29


PAC

Elements

Agents with

Presentation: visualization aspects
Abstraction: data model of an agent
Control: connects presentation and abstraction components and enables communication between agents

Hierarchical relationship between agents

Constraints

Each agent is in charge of some functionality
There is no direct communication between abstraction and presentation in each agent
Communication goes through the control component

slide-30
SLIDE 30


PAC

Advantages

Separation of concerns

Identifies functionalities

Support for changes and extensions

It is possible to modify an agent without affecting others

Multitask

Agents can reside in different threads, processes or machines

Challenges

Complexity of the system

Too many agents can generate a complex structure which can be difficult to maintain

Complexity of control components

Control components handle communication
The quality of the control components is important for the quality of the whole system

Performance

Communication overload between agents

slide-31
SLIDE 31


PAC

Applications

Network monitoring systems
Mobile robots
Drupal is based on PAC

Relationships

This pattern is related to MVC
MVC has no agent hierarchy
This pattern was re-discovered as Hierarchical MVC

slide-32
SLIDE 32


Shared data
Blackboard
Rule based

slide-33
SLIDE 33


Shared data

Independent components access the same state

Applications based on centralized data repositories

[Diagram: several components connected to a central shared data store.]

slide-34
SLIDE 34


Shared data

Elements

Shared data

Database or centralized repository

Components

Processors that interact with shared data

[Diagram: several components connected to a central shared data store.]

slide-35
SLIDE 35


Shared data

Constraints

Components interact with the global state
Components don't communicate with each other directly, only through the shared state
The shared repository handles data stability and consistency

slide-36
SLIDE 36


Shared data

Advantages

Independent components

They don't need to be aware of the existence of other components

Data consistency

Centralized global state
Single backup of all the system state

Challenges

Single point of failure

A failure in the central repository can affect the whole system
Distributing the central data can be difficult

Possible bottleneck

Inefficient communication
Problems for scalability

Synchronization to access shared memory

slide-37
SLIDE 37


Shared data

Applications

Lots of systems use this approach

This style is also known as: Shared memory, Repository, Shared data, etc.

Some variants

Blackboard
Rule based systems

slide-38
SLIDE 38


Blackboard

Complex problems which are difficult to solve

Knowledge sources solve parts of the problem
Each knowledge source adds partial solutions to the blackboard

[Diagram: several knowledge sources and a control component around a central blackboard.]

slide-39
SLIDE 39


Blackboard

Elements

Blackboard: central data repository
Knowledge source: solves part of the problem and adds partial results
Control: manages tasks and checks the work state

[Diagram: several knowledge sources and a control component around a central blackboard.]

slide-40
SLIDE 40


Blackboard

Constraints

The problem can be divided into parts
Each knowledge source solves a part of the problem
The blackboard contains partial solutions that improve progressively

[Diagram: several knowledge sources and a control component around a central blackboard.]
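A toy sketch of the blackboard elements, under stated assumptions: the blackboard is a plain dictionary, each knowledge source is a hypothetical pair of functions (a precondition and a contribution), and the control component simply picks the first source that can contribute. Real systems use much richer control strategies.

```python
from typing import Any, Callable, Dict, List, Tuple

Blackboard = Dict[str, Any]
# A knowledge source is a (can_contribute, contribute) pair of functions.
KnowledgeSource = Tuple[Callable[[Blackboard], bool], Callable[[Blackboard], None]]

def make_sources() -> List[KnowledgeSource]:
    tokenizer: KnowledgeSource = (
        lambda bb: "text" in bb and "tokens" not in bb,
        lambda bb: bb.update(tokens=str(bb["text"]).split()),
    )
    counter: KnowledgeSource = (
        lambda bb: "tokens" in bb and "count" not in bb,
        lambda bb: bb.update(count=len(bb["tokens"])),
    )
    # Listed out of order on purpose: the blackboard state drives execution.
    return [counter, tokenizer]

def control(blackboard: Blackboard, sources: List[KnowledgeSource], goal: str) -> bool:
    """Control: let sources add partial solutions until the goal appears."""
    while goal not in blackboard:
        ready = [source for source in sources if source[0](blackboard)]
        if not ready:
            return False  # no source can make progress
        ready[0][1](blackboard)
    return True

board: Blackboard = {"text": "a b c"}
solved = control(board, make_sources(), "count")
```

The sources never call each other; they only read and write the blackboard. The `False` branch reflects the challenge that there is no guarantee a solution will be found.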

slide-41
SLIDE 41


Blackboard

Advantages

Experimentability

Can be used for open problems
Facilitates strategy changes

Reusability

Knowledge sources can be reused

Fault tolerance

Challenges

Debugging

No guarantee that the right solution will be found
Difficult to establish a control strategy

Performance

It may need to review incorrect hypotheses

High development cost

Parallelism implementation
It is necessary to synchronize blackboard access

slide-42
SLIDE 42


Blackboard

Applications

Some speech recognition systems: HEARSAY-II
Pattern recognition
Weather forecasts
Games
Analysis of molecular structure: CRYSALIS

slide-43
SLIDE 43


Rule based systems

Variant of shared memory

Shared memory = Knowledge base

Contains rules and facts

[Diagram: user interface, inference engine and knowledge base (rules + facts).]

slide-44
SLIDE 44


Rule based systems

Elements:

Knowledge base: rules and facts about some domain
User interface: queries/modifies the knowledge base
Inference engine: answers queries from data and knowledge base

[Diagram: user interface, inference engine and knowledge base (rules + facts).]

slide-45
SLIDE 45


Rule based systems

Constraints:

Domain knowledge is captured in the knowledge base
Imperative access to the knowledge base is limited

It is based on rules like:

IF antecedents THEN consequent

This limits expressiveness with regard to imperative languages

[Diagram: user interface, inference engine and knowledge base (rules + facts).]
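The IF antecedents THEN consequent form can be illustrated with a tiny forward-chaining loop. This is a sketch, not how any particular rule engine works: the `Rule` encoding, the `infer` function and the bird facts are all hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (antecedents, consequent): IF all antecedents hold THEN consequent holds.
Rule = Tuple[FrozenSet[str], str]

def infer(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Naive inference engine: apply rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= known and consequent not in known:
                known.add(consequent)
                changed = True
    return known

# Knowledge base: rules + facts about a toy domain.
rules: List[Rule] = [
    (frozenset({"is_bird", "can_fly"}), "can_migrate"),
    (frozenset({"has_feathers"}), "is_bird"),
]
derived = infer({"has_feathers", "can_fly"}, rules)
```

Domain knowledge lives entirely in `rules`, so a domain expert could edit that list without touching `infer`, which is the separation of concerns the style aims for.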

slide-46
SLIDE 46


Rule based systems

Advantages

Maintainability

It may be easy to modify the knowledge base
Specially tailored to be modified by domain experts

Separation of concerns

Algorithm
Domain knowledge

Reusability

Challenges

Debugging
Performance
Rules creation and maintenance

Introspection
Automatic rule learning
Runtime update of rules

slide-47
SLIDE 47


Rule based systems

Applications

Expert systems
Production systems
Rules libraries in Java: JRules, Drools, JESS
Declarative, rule based languages: Prolog (logic programming)
BRMS (Business Rules Management Systems)

slide-48
SLIDE 48


Call-return
Client-Server
Event based architectures

Publish-Subscribe
Actor models

slide-49
SLIDE 49


Call-return

A component calls another component and waits for the answer

[Diagram: Component A calls Component B, which returns the answer.]

slide-50
SLIDE 50


Call-return

Elements

Component that does the call
Component that sends the answer

Constraints

Synchronous communication: the caller waits for the answer

[Diagram: Component A calls Component B, which returns the answer.]

slide-51
SLIDE 51


Call-return

Advantages

Easy to implement

Challenges

Problems for concurrent computation

If a component is blocked waiting for the answer, it may be holding resources that it does not need

Distributed environments

Little utilization of computational capabilities

slide-52
SLIDE 52


Client-Server

Variant of layers: 2 layers physically separated (2-tier)
Functionality is divided among several servers
Clients connect to services through a request/response interface

[Diagram: a client sends a request over the network to a server and receives a response.]

slide-53
SLIDE 53


Client-Server

Elements

Server: offers services through a query/answer protocol
Client: makes queries and processes answers
Network protocol: manages communication between clients and servers

[Diagram: a client sends a request over the network to a server and receives a response.]

slide-54
SLIDE 54


Client-Server

Constraints

Clients communicate with servers, not the other way round
Clients are independent from other clients
Servers don't have knowledge about clients
The network protocol establishes some communication guarantees

[Diagram: a client sends a request over the network to a server and receives a response.]

slide-55
SLIDE 55


Client-Server

Advantages

Distribution

Servers can be distributed

Low coupling

Separation of functionality between clients/servers
Independent development

Scalability

Availability

Functionality is available to all clients
But not all the servers need to offer all functionality

Challenges

Each server can be a single point of failure

Server attacks

Unpredictable performance

Dependency on the system and the network

Problems when servers belong to other organizations

How can quality of service be guaranteed?

slide-56
SLIDE 56


Client-Server

Variants

Stateless
Replicated server
With cache

slide-57
SLIDE 57


Client-Server stateless

Constraint

The server does not store information about clients
The same query implies the same answer

[Diagram: a client sends a query over the network to the server and receives an answer.]

slide-58
SLIDE 58


Client-Server stateless

Advantages

Scalability

Challenges

Application state management

The client must remember requests
Handle information between requests

slide-59
SLIDE 59


Replicated server

Constraint

Several servers offer the same service
They offer the client the appearance that there is only one server

[Diagram: a client sends a query over the network to an abstract server made of several replicated servers and receives an answer.]

slide-60
SLIDE 60


Replicated server

Advantages

Better answer times
Less latency
Fault tolerance

Challenges

Consistency management between replicated servers
Synchronization

slide-61
SLIDE 61


Client-server with cache

Cache = mediator between client/server

Stores copies of previous answers from the server

When a query is received, it returns the cached answer without asking the original server

[Diagram: a cache sits on the network between the client and the server; queries and answers pass through it.]

slide-62
SLIDE 62


Client-server with cache

Elements:

Intermediate cache nodes

Constraints

Some queries are directly answered by the cache node
The cache node has a policy for answer management

Expiration time
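A sketch of a cache node with an expiration-time policy, under assumptions: `CachingProxy` is a hypothetical class, the server is modelled as a plain function from query to answer, and the clock is injectable so the expiry behaviour can be exercised without waiting.

```python
import time
from typing import Callable, Dict, List, Tuple

class CachingProxy:
    """Hypothetical cache node placed between clients and a server."""

    def __init__(self, server: Callable[[str], str], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._server = server
        self._ttl = ttl            # expiration time, in seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def query(self, request: str) -> str:
        now = self._clock()
        entry = self._entries.get(request)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # answered directly by the cache node
        answer = self._server(request)  # miss or expired: ask the server
        self._entries[request] = (now, answer)
        return answer

# Usage with a fake server and a fake clock (both assumptions for the demo).
calls: List[str] = []
fake_now = [0.0]

def server(request: str) -> str:
    calls.append(request)
    return request.upper()

proxy = CachingProxy(server, ttl=10.0, clock=lambda: fake_now[0])
first_answer = proxy.query("a")    # miss: goes to the server
second_answer = proxy.query("a")   # hit: served from the cache
fake_now[0] = 11.0
third_answer = proxy.query("a")    # expired: goes to the server again
```

Choosing `ttl` is the expiration-policy trade-off: a long value saves more network traffic, a short one keeps answers closer to what the server would say now.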

slide-63
SLIDE 63


Client-server with cache

Advantages:

Less network overload

Lots of repeated requests can be stored in the cache

Less answer time

Cached answers arrive earlier

Challenges

Complexity of configuration
Expiration policy
Not appropriate for certain domains

When high fidelity of answers is needed
Example: real time systems

slide-64
SLIDE 64


Event based

EDA (Event-Driven-Architecture)

[Diagram: an event producer sends events to an event processor, which passes events on to an event consumer.]

slide-65
SLIDE 65


Event based

Elements:

Event: something that has happened (≠ request)
Event producer: event generator (sensors, systems, ...)
Event consumer: DB, applications, scorecards, ...
Event processor: transmission channel; filters and transforms events

[Diagram: an event producer sends events to an event processor, which passes events on to an event consumer.]

slide-66
SLIDE 66


Event based

Constraints:

Asynchronous communication

Producers generate events at any moment
Consumers can be notified of events at any moment

One-to-many relationship

An event can be sent to several consumers

slide-67
SLIDE 67


Event based

Advantages

Decoupling

Producer does not depend on consumer, nor vice versa.

Timelessness

Events are published without any need to wait for the termination of any cycle

Asynchronous

In order to publish an event there is no need to finish any process

Challenges

Non sequential execution

Possible lack of control

Difficult to debug

slide-68
SLIDE 68


Event based

Applications

Event processing networks
Event-Stream-Processing (ESP)
Complex-event-processing

Variants

Publish-subscribe
Actor models

slide-69
SLIDE 69


Publish-subscribe

Components subscribe to a channel to receive messages from other components

[Diagram: components attached to an event bus through publish ports and subscribe ports.]

slide-70
SLIDE 70


Publish-subscribe

Elements:

Component: component that subscribes to a channel
Publication port: it is registered to publish messages
Subscription port: it is registered to receive some kind of messages
Event bus (message channel): transmits messages to subscribers

[Diagram: components attached to an event bus through publish ports and subscribe ports.]

slide-71
SLIDE 71


Publish-subscribe

Constraints:

Separation between subscription/publication port

A component may have both ports

Non-direct communication

Asynchronous communication in general
Components delegate communication responsibility to the channel

[Diagram: components attached to an event bus through publish ports and subscribe ports.]
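A minimal in-process sketch of the publish and subscribe ports. `EventBus`, the topic names and the handlers are hypothetical. For brevity the bus delivers messages synchronously inside `publish`, whereas the style generally assumes asynchronous delivery, for example through a queue or a message broker.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]

class EventBus:
    """Hypothetical event bus (message channel)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Subscription port: register interest in one kind of message."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        """Publication port: the bus, not the publisher, reaches the subscribers."""
        for handler in list(self._subscribers.get(topic, [])):
            handler(message)

bus = EventBus()
audit_log: List[str] = []
alerts: List[str] = []
bus.subscribe("orders", audit_log.append)
bus.subscribe("orders", lambda message: alerts.append(message.upper()))
bus.publish("orders", "order created")
bus.publish("payments", "nobody is subscribed to this topic")
```

The publisher never names its consumers, so subscribers can be added or removed without changing it. The price is the extra level of indirection listed under the challenges.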

slide-72
SLIDE 72


Publish-subscribe

Advantages

Communication quality

Improves performance
Debugging

Low coupling between components

Consumers do not depend on publishers, nor vice versa

Challenges

It adds a new indirection level

Direct communication may be more efficient in some domains

Complex implementation

It may require COTS

slide-73
SLIDE 73


Actor models

Used for concurrent computation

Actors instead of objects
There is no shared state between actors
Asynchronous message passing

Theoretical developments since 1973 (Carl Hewitt)

slide-74
SLIDE 74


Actor models

Elements

Actor: computational entity with state

It communicates with other actors by sending messages
It processes messages one by one

Messages

Addresses: identify actors (mailing address)

slide-75
SLIDE 75


Actor models

Constraints

An actor can only:

Send messages to other actors

Messages are immutable

Create new actors
Modify how it will process the next message

Actors are decoupled

Receiver does not depend on sender

slide-76
SLIDE 76


Actor models

Constraints (2)

Local addresses

An actor can only send messages to known addresses
Because they were given to it or because it created them

Parallelism:

All actions are in parallel
No shared global state
Messages can arrive in any order
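A rough sketch of a single actor, assuming a thread plus a queue can stand in for an actor runtime such as Erlang or Akka. `CounterActor` and its message vocabulary ("increment", "get", "stop") are hypothetical, and a reply queue passed inside the message plays the role of the sender's address.

```python
import queue
import threading
from typing import Any

class CounterActor:
    """Hypothetical actor: private state, a mailbox, one message at a time."""

    def __init__(self) -> None:
        self._mailbox: "queue.Queue[Any]" = queue.Queue()
        self._count = 0  # private state, only touched by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message: Any) -> None:
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def _run(self) -> None:
        while True:
            message = self._mailbox.get()  # process messages one by one
            if message == "stop":
                break
            if message == "increment":
                self._count += 1
            elif isinstance(message, tuple) and message[0] == "get":
                # Reply by sending to the address that came with the message.
                message[1].put(self._count)

    def join(self) -> None:
        self._thread.join()

actor = CounterActor()
for _ in range(100):
    actor.send("increment")
reply_box: "queue.Queue[int]" = queue.Queue()
actor.send(("get", reply_box))
count = reply_box.get(timeout=5)
actor.send("stop")
actor.join()
```

No lock is needed around `_count` because only the actor's own thread ever reads or writes it. With several concurrent senders the arrival order of messages would no longer be predictable, which matches the last constraint above.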

slide-77
SLIDE 77


Actor models

Challenges

Message sending

How to handle arriving messages

Actor coordination
Non-consistent systems by definition

Advantages

Highly parallel
Transparency and scalability

Internal vs external addresses

Non-local actor models

Web Services
Multi-agent systems

slide-78
SLIDE 78


Actor models

Implementations

Erlang (programming language)
Akka (library)

Applications

Reactive systems
Examples: Ericsson, Facebook, Twitter

slide-79
SLIDE 79


Broker
Peer-to-peer
MapReduce
Lambda architecture
Kappa architecture

Network

slide-80
SLIDE 80


Broker

Intermediate node that manages communication between a client and a server

[Diagram: a client with its stub and a server with its skeleton communicate through a broker; a bridge can link brokers.]

slide-81
SLIDE 81


Broker

Elements

Broker: manages communication
Client: sends requests
Client proxy: stub
Server: returns answers
Server proxy: skeleton
Bridge: can connect brokers

[Diagram: a client with its stub and a server with its skeleton communicate through a broker; a bridge can link brokers.]

slide-82
SLIDE 82


Broker

Advantages

Separation of concerns

Delegates low level communication aspects to the broker
Separate maintenance

Reusability

Servers are independent from clients

Portability

Broker = low level aspects

Interoperability

Using bridges

Challenges

Performance

Adds an indirection layer

Can increase coupling between components

Broker = single point of failure

slide-83
SLIDE 83


Broker

Applications

CORBA and distributed systems
Android uses a variation of the Broker pattern

slide-84
SLIDE 84


Peer-to-Peer

Equal and autonomous nodes (peers) that communicate with each other.

[Diagram: six peers connected to one another through the network.]

slide-85
SLIDE 85


Peer-to-Peer

Elements

Computational nodes: peers

They contain their own state and control thread

Network protocol

Constraints

There is no main node
All peers are equal

slide-86
SLIDE 86


Peer-to-Peer

Advantages

Decentralized information and control

Fault tolerance

There is no single point of failure
A failure in one peer does not compromise the whole system

Challenges

Keeping the state of the system

Complexity of the protocol

Bandwidth Limitations

Network and protocol latency

Security

Detect malicious peers

slide-87
SLIDE 87


Peer-to-Peer

Popular applications

Napster, BitTorrent, Gnutella, ...

This architectural style is not only for sharing files

e-Commerce (B2B)
Collaborative systems
Sensor networks
Blockchain
...

Variants

Super-peers

slide-88
SLIDE 88


MapReduce

Proposed by Google

Published in 2004
Internal implementation by Google

Goal: big amounts of data

Lots of computational nodes
Fault tolerance
Write-once, read-many

Style composed of:

Master-slave
Batch

[Diagram: big data goes through a map phase and a reduce phase to produce the result.]

slide-89
SLIDE 89


MapReduce

Elements

Master node: Controls execution

Node table
It manages the replicated file system

Slave nodes

Execute mappers and reducers
Contain replicated data blocks

[Diagram: big data goes through a map phase and a reduce phase to produce the result.]

slide-90
SLIDE 90


MapReduce - Scheme

Inspired by functional programming

2 components: mapper and reducer

Data are divided for their processing
Each data item is associated with a key

Transforms [(key1,value1)] to [(key2,value2)]

[Diagram: MapReduce takes the input [(key1,value1)] and produces the output [(key2,value2)].]

slide-91
SLIDE 91


Step 1: mapper

mapper: (Key1, Value1) → [(Key2, Value2)]

[Diagram: three mappers take the inputs (c1, vi1), (c2, vi2), (c3, vi3) and emit the pairs (k1, v1), (k2, v2), (k1, v3), (k3, v4), (k1, v5), (k1, v6), (k3, v7).]

slide-92
SLIDE 92


Step 2: Merge and sort

System merges and sorts intermediate results according to the keys

[Diagram: merge and sort turns the pairs (k1, v1), (k2, v2), (k1, v3), (k3, v4), (k1, v5), (k1, v6), (k3, v7) into (k1, [v1, v3, v5, v6]), (k2, [v2]), (k3, [v4, v7]).]

slide-93
SLIDE 93


Step 3: Reducers

reducer: (Key2, [Value2]) → (Key2, Value2)

[Diagram: three reducers turn (k1, [v1, v3, v5, v6]), (k2, [v2]) and (k3, [v4, v7]) into (k1, vf1), (k2, vf2) and (k3, vf3).]

slide-94
SLIDE 94


MapReduce - general scheme

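[Diagram: general scheme. Mappers turn the input pairs into the intermediate pairs (k1, v1), (k2, v2), (k1, v3), (k3, v4), (k1, v5), (k1, v6), (k3, v7); merge and sort groups them into (k1, [v1, v3, v5, v6]), (k2, [v2]), (k3, [v4, v7]); reducers produce (k1, vf1), (k2, vf2), (k3, vf3).]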

slide-95
SLIDE 95


MapReduce - count words

[Diagram: the documents d1 = "a b", d2 = "a c a" and d3 = "a c" go through three mappers, which emit (a, 1), (b, 1), (a, 1), (c, 1), (a, 1), (a, 1), (c, 1); merge and sort groups them into (a, [1, 1, 1, 1]), (b, [1]), (c, [1, 1]); three reducers produce a = 4, b = 1, c = 2.]

// return each word with 1
mapper(d, ps) {
  for each p in ps:
    emit (p, 1)
}

// sum the list of numbers of each word
reducer(p, ns) {
  sum = 0
  for each n in ns {
    sum += n;
  }
  emit (p, sum)
}
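The word count above can be run on a single machine as a sketch of the three steps. Everything here is in-memory and sequential, so it only illustrates the data flow, not the distribution or fault tolerance. The function names `mapper`, `merge_and_sort`, `reducer` and `word_count` are chosen for this illustration; the documents are the ones from the slide.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterable, Iterator, List, Tuple

def mapper(doc_id: str, words: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Step 1: emit (word, 1) for each word of a document."""
    for word in words:
        yield (word, 1)

def merge_and_sort(pairs: Iterable[Tuple[str, int]]) -> List[Tuple[str, List[int]]]:
    """Step 2: group intermediate values by key and sort by key."""
    groups: DefaultDict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reducer(word: str, counts: List[int]) -> Tuple[str, int]:
    """Step 3: sum the list of numbers of each word."""
    return (word, sum(counts))

def word_count(documents: Dict[str, List[str]]) -> Dict[str, int]:
    intermediate = [pair
                    for doc_id, words in documents.items()
                    for pair in mapper(doc_id, words)]
    return dict(reducer(word, counts)
                for word, counts in merge_and_sort(intermediate))

documents = {"d1": ["a", "b"], "d2": ["a", "c", "a"], "d3": ["a", "c"]}
counts = word_count(documents)
```

In a real deployment each call to `mapper` and `reducer` could run on a different node, because a mapper only sees one document and a reducer only sees one key.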

slide-96
SLIDE 96


MapReduce - execution environment

Execution environment is in charge of:

Planning: each job is divided into tasks

Placement of data/code

Each node contains its data locally

Synchronization:

Reduce tasks must wait for the map phase to finish

Error and failure handling

High tolerance to failures of computational nodes

slide-97
SLIDE 97


MapReduce - File system

Google developed a distributed file system: GFS
Hadoop created HDFS

Files are divided into chunks

2 node types: namenode (master), datanodes (data servers)

Datanodes store different chunks

Block replication

Namenode contains metadata

Where each chunk is

Direct communication between clients and datanodes

slide-98
SLIDE 98


MapReduce - File system

Namenode

file1: (B1 – N1 N2, B2 – N1 N2 N3)
file2: (B3 – N2 N3, B4 – N1 N2)
file3: (B5 – N1 N3)

[Diagram: Client1 and Client2, the namenode (master) and the datanodes (slaves) N1, N2 and N3, which hold the replicated blocks B1 to B5.]

slide-99
SLIDE 99


MapReduce

Advantages

Distributed computations

Split input data

Replicated repository

Fault tolerant

Hardware/software heterogeneous

Large amount of data

Write-once. Read-many

Challenges

Dependency on master node
Non-interactivity
Data conversion to MapReduce

Adapt input data
Convert output data

slide-100
SLIDE 100


MapReduce: Applications

Lots of applications:

Google, 2007: 20 petabytes/day, around 100,000 MapReduce jobs/day
The PageRank algorithm can be implemented as MapReduce

Success stories:

Automatic translation, similarity, sorting, ...

Other companies: last.fm, Facebook, Yahoo!, Twitter, etc.

slide-101
SLIDE 101


MapReduce: Applications

Implementations

Google (internal)
Hadoop (open source)
…

Libraries

Hive (Hadoop): query language inspired by SQL
Pig (Hadoop): specific language that can define data flows
Cascading: API that can specify distributed data flows
FlumeJava (Google)
Dryad (Microsoft)

slide-102
SLIDE 102

Lambda architecture

Handle Big Data & real time analytics
Proposed by Nathan Marz, 2011
3 layers:

Batch layer: precomputes all data with MapReduce
  • Generates partial aggregate views
  • Recomputes from all data
Speed layer: real time, small window of data
  • Generates fast real time views
Serving layer: handles queries
  • Merges the different views
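The interplay of the three layers can be sketched with a simple event counter. This is a single-process illustration under assumptions: the `LambdaCounter` class and its method names are hypothetical, and a real batch layer would run MapReduce over a distributed store.

```python
from collections import Counter

class LambdaCounter:
    """Counts events per key with a batch view and a real time view."""
    def __init__(self):
        self.master = []             # batch layer: append-only store of all events
        self.batch_view = Counter()  # view precomputed from all data
        self.speed_view = Counter()  # events seen since the last batch run

    def new_event(self, key):
        self.master.append(key)      # every event reaches the batch layer
        self.speed_view[key] += 1    # and is counted at once by the speed layer

    def run_batch(self):
        self.batch_view = Counter(self.master)  # recompute from all data
        self.speed_view = Counter()             # that window is now covered

    def query(self, key):
        # serving layer: merge the batch view and the real time view
        return self.batch_view[key] + self.speed_view[key]

counter = LambdaCounter()
counter.new_event("page1")
counter.new_event("page1")
counter.run_batch()
counter.new_event("page1")
print(counter.query("page1"))
```

The query combines the count precomputed by the batch run with the events that arrived after it.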

slide-103
SLIDE 103

Lambda architecture

Combines real time with batch processing

[Diagram: Lambda architecture. Labels: New data stream; Batch layer (All data, HDFS, append only; Batch recompute; Precompute views with MapReduce); Speed layer (Process stream; Realtime increment; Increment views; Small window of the data); Serving layer (Partial aggregate 1, Partial aggregate 2, ..., Partial aggregate N; Realtime view; merge; Query)]

slide-104
SLIDE 104

Lambda architecture

Constraints

All data is stored in the batch layer
The batch layer precomputes views
The results of the speed layer may not be accurate
Serving layer combines precomputed views
The views can be simple DBs for querying

slide-105
SLIDE 105

Lambda architecture

Advantages

Scalability (Big data)
Real time
Decoupling
Fault tolerant
Keep all input data
Reprocessing

Challenges

Inherent complexity
Merging views can be inaccurate
  • Losing some events

slide-106
SLIDE 106

Lambda architecture

Applications

Spotify, Alibaba, …

Libraries

Apache Storm
Netflix Suro project

slide-107
SLIDE 107

Kappa architecture

Proposed by Jay Kreps (Apache Kafka), 2013
Handle Big data & Real time with logs
Simplifies Lambda architecture
  • Removes the batch layer
Based on a distributed ordered log
  • Replicated cluster
  • The log can be very large

slide-108
SLIDE 108

Kappa architecture

[Diagram: Kappa architecture. Labels: Data source 1, Data source 2, ..., Data source N; Append-only immutable log; Streaming layer (Stream Processing System); Serving layer (Serving Backend DB); App; queries]

slide-109
SLIDE 109

Kappa architecture

Constraints

The event log is append-only
The events in the log are immutable
Stream processing can request events at any position
  • To handle failures or to do recomputations
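These constraints can be sketched with an in-memory log that only supports appending and reading from an offset. The `EventLog` class and the `build_view` function are hypothetical names for illustration; they do not reproduce the Apache Kafka API.

```python
class EventLog:
    """Append-only log: events are never modified once written."""
    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1         # offset of the new event

    def read_from(self, offset):
        return tuple(self._events[offset:])  # tuple: readers cannot alter the log

def build_view(log, offset=0):
    """Stream processor: fold events into a view (balance per account)."""
    view = {}
    for account, amount in log.read_from(offset):
        view[account] = view.get(account, 0) + amount
    return view

log = EventLog()
log.append(("alice", 10))
log.append(("bob", 5))
log.append(("alice", -3))

view = build_view(log)     # first computation
rebuilt = build_view(log)  # after a failure, replay from position 0
print(view, rebuilt == view)
```

Since the log is immutable, replaying it from the same position always produces the same view, which is what makes recomputation after a failure possible.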

slide-110
SLIDE 110

Kappa architecture

Advantages

Scalable (big data)
Real time processing
Simpler than lambda
  • No batch layer

Challenges

Space requirements
  • Duplication of log and DB
  • Log compaction
Ordering of events
Delivery paradigms
  • At least once
  • At most once (it may be lost)
  • Exactly once
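One common way to cope with at-least-once delivery is to make the consumer idempotent, so that a redelivered event has no further effect. The sketch below simulates a lost acknowledgement; the class, the function and the `ack_lost_for` parameter are assumptions made for illustration.

```python
class IdempotentConsumer:
    """Applies each event once even if it is delivered several times."""
    def __init__(self):
        self.seen = set()
        self.total = 0

    def handle(self, event_id, amount):
        if event_id in self.seen:  # duplicate caused by a retry
            return False
        self.seen.add(event_id)
        self.total += amount
        return True

def deliver_at_least_once(consumer, events, ack_lost_for=()):
    """Redeliver an event whenever its acknowledgement is lost (simulated)."""
    deliveries = 0
    for event_id, amount in events:
        consumer.handle(event_id, amount)
        deliveries += 1
        if event_id in ack_lost_for:  # the sender saw no ack, so it retries
            consumer.handle(event_id, amount)
            deliveries += 1
    return deliveries

consumer = IdempotentConsumer()
deliveries = deliver_at_least_once(consumer, [(1, 10), (2, 5)], ack_lost_for={1})
print(deliveries, consumer.total)
```

Event 1 is delivered twice, but the consumer adds its amount only once.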

slide-111
SLIDE 111

Kappa architecture

Applications & libraries

Apache Kafka
Apache Samza
Spark Streaming
LinkedIn

slide-112
SLIDE 112

Plugins
Microkernel
Reflection
Interpreters and DSLs
Mobile code
  • Code on demand
  • Remote evaluation
  • Mobile agents
slide-113
SLIDE 113

Plugins

The system can be extended with plugins that add new functionality

[Diagram: base system containing a runtime engine, with three plugins attached]

slide-114
SLIDE 114

Plugins

Elements

Base system: system that allows plugins
Plugins: components that can be added/removed dynamically
Runtime engine: starts, locates, initializes, executes, and stops plugins

slide-115
SLIDE 115

Plugins

Constraints

Runtime engine manages plugins
System can add/remove plugins
Some plugins can depend on other plugins
  • The plugin must declare dependencies and the exported API
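A runtime engine that checks declared dependencies could look like the sketch below. The `Plugin` and `RuntimeEngine` classes are hypothetical and much simpler than a real component system such as OSGi.

```python
class Plugin:
    """A plugin declares its name, its dependencies and the API it exports."""
    def __init__(self, name, requires=(), exports=None):
        self.name = name
        self.requires = tuple(requires)
        self.exports = exports or {}

class RuntimeEngine:
    """Adds and removes plugins, checking the declared dependencies."""
    def __init__(self):
        self.plugins = {}

    def add(self, plugin):
        missing = [d for d in plugin.requires if d not in self.plugins]
        if missing:
            raise LookupError(f"{plugin.name} needs {missing}")
        self.plugins[plugin.name] = plugin

    def remove(self, name):
        dependents = [p.name for p in self.plugins.values() if name in p.requires]
        if dependents:
            raise LookupError(f"{name} is still required by {dependents}")
        del self.plugins[name]

    def call(self, plugin_name, function, *args):
        return self.plugins[plugin_name].exports[function](*args)

engine = RuntimeEngine()
engine.add(Plugin("text", exports={"upper": str.upper}))
engine.add(Plugin("shout", requires=["text"],
                  exports={"shout": lambda s: engine.call("text", "upper", s) + "!"}))
print(engine.call("shout", "shout", "hi"))
```

The engine refuses to add a plugin whose dependencies are missing and refuses to remove a plugin that another one still requires.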


slide-116
SLIDE 116

Plugins

Advantages

Extensibility

Application can get new functionalities in some ways that were not foreseen by the original developers

Customization

Application can have a small kernel that is extended on demand

Challenges

Consistency

Plugins must be added to the system in a sound way

Performance

Delay searching/configuring plugins

Security

Plugins made by third parties can compromise security

Plugin management and dependencies

slide-117
SLIDE 117

Plugins

Examples

Eclipse
Firefox

Technologies

Component systems: OSGi

slide-118
SLIDE 118

Microkernel

Identify minimal functionality in a microkernel
Extra functionality is added using internal servers
External server handles communication with other systems

[Diagram: a system formed by the microkernel, internal servers and an external server; a client connects through an adapter]

slide-119
SLIDE 119

Microkernel

Elements

Microkernel: minimal functionality
Internal server: extra functionality
External server: offers external API
Client: external application
Adapter: component that establishes communication with the external server

slide-120
SLIDE 120

Microkernel

Constraints:

Microkernel implements only minimal functionality
The rest of the functionality is implemented using internal servers
Communication with clients by external servers
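The division of responsibilities can be sketched as follows. All names (`Microkernel`, `ExternalServer`, the `add` and `echo` services) are hypothetical; here the microkernel only registers internal servers and routes requests, and the external server is the only entry point for clients.

```python
class Microkernel:
    """Minimal functionality: register internal servers and route requests."""
    def __init__(self):
        self._servers = {}

    def register(self, service, server):
        self._servers[service] = server

    def request(self, service, *args):
        if service not in self._servers:
            raise LookupError(f"no internal server for {service!r}")
        return self._servers[service](*args)

class ExternalServer:
    """Offers the API that clients use and delegates to the microkernel."""
    def __init__(self, kernel):
        self._kernel = kernel

    def handle(self, message):
        service, *args = message.split()
        return str(self._kernel.request(service, *args))

kernel = Microkernel()
kernel.register("add", lambda a, b: int(a) + int(b))     # internal server
kernel.register("echo", lambda *words: " ".join(words))  # internal server
api = ExternalServer(kernel)
print(api.handle("add 2 3"))
```

New functionality is added by registering another internal server, without changing the microkernel.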


slide-121
SLIDE 121

Microkernel

Advantages

Portability
  • It is only needed to port the kernel
Flexibility and extensibility
  • Adding new functionality with new internal servers
Security and reliability
  • Critical parts of the system are encapsulated
  • Errors in external parts don't affect the microkernel

Challenges

Performance
  • A monolithic system can be more efficient
Design complexity
  • Identify components in the microkernel
  • It may be difficult to separate parts into internal servers
Single point of failure
  • If the microkernel fails, the whole system may fail

slide-122
SLIDE 122

Microkernel

Applications

Operating systems
Games
Editors

slide-123
SLIDE 123

Reflection

It allows changing the structure and behavior of an application dynamically
Systems that can modify themselves

[Diagram: base level, metalevel and meta-object protocol]

slide-124
SLIDE 124

Reflection

Elements

Base level: implements application logic
Metalevel: aspects that can be modified
Metaobject protocol: interface that can modify the metalevel

slide-125
SLIDE 125

Reflection

Constraints

Base level uses metalevel aspects for its behavior
At runtime, it is possible to modify the metalevel using the metaobject protocol
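In Python, the built-in functions `dir`, `getattr` and `setattr` can play a role similar to a metaobject protocol. The sketch below is only an illustration of that idea; the `Greeter` class and the `describe` helper are hypothetical.

```python
class Greeter:
    """Base level: application logic."""
    def greet(self, name):
        return f"Hello, {name}"

def describe(obj):
    """Introspection: list the public methods of an object."""
    return sorted(n for n in dir(obj)
                  if callable(getattr(obj, n)) and not n.startswith("_"))

g = Greeter()
before = g.greet("Ana")

# Replace a method of the class while the program is running.
def polite_greet(self, name):
    return f"Good morning, {name}"

setattr(Greeter, "greet", polite_greet)
after = g.greet("Ana")  # the existing object picks up the new behavior

# Add a method that did not exist when the class was written.
setattr(Greeter, "bye", lambda self, name: f"Bye, {name}")
print(before, "|", after, "|", describe(g))
```

The object `g` changes its behavior without being recreated and without stopping the program.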


slide-126
SLIDE 126

Reflection

Advantages

Flexibility
  • Adapt to changing conditions
  • Change behavior of running system without changing source code or stopping execution

Challenges

Implementation
  • Not all languages enable meta-programming
  • More difficult to combine with static type systems
Performance
  • It may be necessary to do some optimizations to limit reflection
Security
  • Consistency maintenance

slide-127
SLIDE 127

Reflection

Applications

Most dynamic languages support reflection
  • Scheme, CLOS, Ruby, Python, ...
Intelligent systems
Self-modifying code

slide-128
SLIDE 128

Interpreters and DSLs

Include a domain specific language (DSL) that is interpreted by the system

[Diagram: user, application, DSL program, interpreter and context]

slide-129
SLIDE 129

Interpreters and DSLs

Elements

Interpreter: module that executes the program
Program: written in the DSL
  • DSL can be designed so the end user can write programs
Context: environment where the program is executed

slide-130
SLIDE 130

Interpreters and DSLs

Constraints

Interpreter runs the program interacting with the context
It is necessary to define a DSL
  • Syntax (grammar, parsing, ...)
  • Semantics (behavior)
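A very small interpreter shows how syntax, semantics and context fit together. The DSL below (commands `set`, `add` and `show`) is made up for this example, and the context is assumed to be a dictionary with the variables and the output.

```python
def interpret(program, context):
    """Run a program written in a tiny, made-up DSL against a context."""
    for number, line in enumerate(program.splitlines(), start=1):
        words = line.split()
        if not words:
            continue  # skip blank lines
        command, args = words[0], words[1:]
        if command == "set" and len(args) == 2:      # set <var> <number>
            context["vars"][args[0]] = int(args[1])
        elif command == "add" and len(args) == 2:    # add <var> <number>
            context["vars"][args[0]] += int(args[1])
        elif command == "show" and len(args) == 1:   # show <var>
            context["output"].append(context["vars"][args[0]])
        else:
            raise SyntaxError(f"line {number}: cannot parse {line!r}")
    return context

program = "set x 5\nadd x 3\nshow x"
context = interpret(program, {"vars": {}, "output": []})
print(context["output"])
```

The `if` conditions define the syntax, the bodies define the semantics, and all the effects of the program are kept inside the context.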


slide-131
SLIDE 131

Interpreters and DSLs

Advantages

Flexibility
  • Adapt application behavior to user needs
Usability
  • End users can write their own programs
Adaptability
  • Easy to adapt to unforeseen situations

Challenges

Design of the DSL
Complexity of implementation
  • Interpreter
  • Separation of context/interpreter
Performance
  • Programs may not be optimal
Security
  • Handle wrong programs

slide-132
SLIDE 132

Interpreters and DSLs

Variants:

Embedded DSLs

slide-133
SLIDE 133

Embedded DSLs

Domain specific languages that are embedded in general purpose host languages
This technique is popular in some languages like Haskell, Ruby, Scala, etc.
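The same idea can be tried in Python, where operator overloading lets a small query language reuse the host syntax. The `Field`, `Condition` and `select` names are hypothetical and only serve as an illustration.

```python
class Condition:
    """A test on a row; `&` and `|` combine conditions."""
    def __init__(self, test):
        self.test = test

    def __and__(self, other):
        return Condition(lambda row: self.test(row) and other.test(row))

    def __or__(self, other):
        return Condition(lambda row: self.test(row) or other.test(row))

class Field:
    """A column name; comparisons build conditions instead of booleans."""
    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Condition(lambda row: row[self.name] > value)

    def __eq__(self, value):
        return Condition(lambda row: row[self.name] == value)

def select(rows, condition):
    return [row for row in rows if condition.test(row)]

age, city = Field("age"), Field("city")
people = [
    {"name": "Ana", "age": 31, "city": "Oviedo"},
    {"name": "Luis", "age": 19, "city": "Oviedo"},
    {"name": "Eva", "age": 40, "city": "Gijon"},
]
adults_in_oviedo = select(people, (age > 30) & (city == "Oviedo"))
print([p["name"] for p in adults_in_oviedo])
```

The query `(age > 30) & (city == "Oviedo")` is ordinary Python, so it gets the parser, the libraries and the IDE support of the host language for free.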

slide-134
SLIDE 134

Embedded DSLs

Advantages:

Reuse of host language syntax
Access to libraries and IDEs of host language

Challenges

Separation between DSL and host language
End users may have too much expressivity

slide-135
SLIDE 135

Mobile code

Code that is transferred from one machine to another

System A sends a program to be run by system B
System B must contain an interpreter for the language in which the program is written

[Diagram: system A sends a program over the network to system B, which contains the interpreter]

slide-136
SLIDE 136

Mobile code

Elements

Interpreter: runs the code
Program: program that is transferred
Network: transfers the program

slide-137
SLIDE 137

Mobile code

Constraints

The program must be run in the receiver system
The network protocol transfers the program

slide-138
SLIDE 138

Mobile code

Advantages

Flexibility and adaptability to new environments
Parallelism

Challenges

Complexity of implementation
Security

slide-139
SLIDE 139

Mobile code

Variants

Code on demand
Remote evaluation
Mobile agents

slide-140
SLIDE 140

Code on demand

Code is downloaded and run by the client
Combination between mobile code and client-server
Example: ECMAScript

[Diagram: client and server connected by the network. Labels: Query, Answer, Program, Interpreter (in the client)]

slide-141
SLIDE 141

Code on demand

Elements

Client
Server
Code that is transferred from server to client

Constraints

Code resides on or is generated by the server
It is transferred to the client when it asks for it
It is run by the client
  • Client must have an interpreter for the corresponding language
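The following sketch simulates code on demand inside one process. The `Server` and `Client` classes and the `area` script are hypothetical; the network transfer is replaced by a method call, and the client's interpreter is Python's built-in `exec`, which must not be used with untrusted code.

```python
class Server:
    """Holds (or generates) the code and hands it out on request."""
    def __init__(self):
        self._scripts = {"area": "def run(w, h):\n    return w * h\n"}

    def get_code(self, name):
        return self._scripts[name]

class Client:
    """Downloads code and runs it with its own interpreter."""
    def __init__(self, server):
        self._server = server

    def run(self, name, *args):
        source = self._server.get_code(name)  # transferred when the client asks
        namespace = {}
        exec(source, namespace)               # unsafe with untrusted code
        return namespace["run"](*args)

client = Client(Server())
print(client.run("area", 3, 4))
```

The client gains the `area` functionality without having it installed in advance, which also shows why security is listed as a challenge of this style.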

slide-142
SLIDE 142

Code on demand

Advantages

Improves user experience
Extensibility: application can add new functionalities that were not foreseen
  • No need to install or download a whole application
  • Always Beta
Adaptability to client environment

Challenges

Security
Coherence
  • It may be difficult to ensure a homogeneous behavior in different types of clients
  • Client can even decide not to run the program

Reminder: Responsive design

slide-143
SLIDE 143

Code on demand

Applications:

RIA (Rich Internet Applications)
  • HTML5 standardizes a lot of APIs
  • Improves coherence between clients

Variants

AJAX
  • Initially: Asynchronous JavaScript and XML
  • The program that is running at the client side sends asynchronous requests to the server without stopping its execution

slide-144
SLIDE 144

Remote evaluation

System A sends a program to system B, which runs it and returns the results

[Diagram: system A and system B connected by the network. Labels: Query, Program, Interpreter (in system B), Answer (Result)]

slide-145
SLIDE 145

Remote evaluation

Elements

Sender: does the query including the program
Receiver: runs the program and returns the results

Constraints

Receiver runs the program
  • It must contain some interpreter of the program language or the program could be in machine code
Network protocol transfers program and results
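Remote evaluation can be sketched with a receiver that interprets small arithmetic programs. The `sender` and `receiver` functions are hypothetical names and the network is replaced by a function call; the receiver walks the syntax tree produced by Python's `ast` module and accepts only numbers and the four basic operators, as a simple guard against untrusted code.

```python
import ast
import operator

# Operators that the receiver is willing to run.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def _evaluate(node):
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError("operation not allowed")

def receiver(program):
    """System B: runs the received program and returns the result."""
    tree = ast.parse(program, mode="eval")
    return _evaluate(tree.body)

def sender(program, send=receiver):
    """System A: sends the program and waits for the answer."""
    return send(program)

print(sender("2 * (3 + 4)"))
```

Anything outside the allowed operations, such as a function call, is rejected by the receiver instead of being run.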

slide-146
SLIDE 146

Remote evaluation

Advantages

Exploits capabilities of third parties
  • Computational capabilities, memory, resources, etc.

Challenges

Security
  • Untrusted code
  • Virus = variant of this style
Configuration

slide-147
SLIDE 147

Remote evaluation

Example:

Volunteer computing
  • SETI@home
  • It was the basis for the BOINC system (Berkeley Open Infrastructure for Network Computing)
  • Other projects: Folding@home, Predictor@home, AQUA@home, etc.

slide-148
SLIDE 148

Mobile agents

Code and data can move from one machine to another to be run

The process takes its state from machine to machine
Code can move autonomously

[Diagram: system A and system B, each with an interpreter; the program moves from one system to the other]

slide-149
SLIDE 149

Mobile agents

Elements

Mobile agent: program that travels and is run from one machine to another autonomously
System: execution environment where the mobile agents are run
Network protocol: transfers state between agents

slide-150
SLIDE 150

Mobile agents

Constraints

Systems host and run mobile agents
Mobile agents can decide to move their execution from one system to another
They can communicate with other agents
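The sketch below simulates an agent that carries its state from system to system. The `SumAgent` and `System` classes and the `network` dictionary are hypothetical; the transfer is simulated by serializing the agent's state to JSON and rebuilding the agent at the destination.

```python
import json

class SumAgent:
    """Agent that visits systems and adds up their local data."""
    def __init__(self, route, total=0, visited=()):
        self.route = list(route)      # names of the systems still to visit
        self.total = total            # state carried between systems
        self.visited = list(visited)

    def run(self, system, network):
        self.total += sum(system.data)
        self.visited.append(system.name)
        if not self.route:
            return self               # journey finished
        target = self.route.pop(0)    # the agent decides to move on
        state = json.dumps(vars(self))  # serialize the state for the transfer
        return network[target].receive(state, network)

class System:
    """Execution environment that hosts agents and holds local data."""
    def __init__(self, name, data):
        self.name = name
        self.data = data

    def receive(self, state, network):
        agent = SumAgent(**json.loads(state))  # rebuild the agent with its state
        return agent.run(self, network)

network = {"A": System("A", [1, 2]), "B": System("B", [10]), "C": System("C", [5])}
final = SumAgent(route=["B", "C"]).run(network["A"], network)
print(final.total, final.visited)
```

Only the agent's state crosses the simulated network, and each system processes its own data locally.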


slide-151
SLIDE 151

Mobile agents

Advantages

It can reduce network traffic
  • Code blocks that are run are transmitted
Implicit parallelism
Fault tolerance to network failures
Agents can be conceptually simple
  • Agent = independent unit of execution
It is possible to create mobile agent systems
  • Emergent behaviour
Adaptability to environment changes
  • Reactive and learning systems

Challenges

Complexity of configuration
Security
  • Malicious or incorrect code

slide-152
SLIDE 152

Software Architecture

School of Computer Science University of Oviedo

Mobile agents

Challenges

Complexity of configuration Security

Malicious or incorrect code

Systema B

Program

Interpreter

System A

Program

Interpreter

slide-153
SLIDE 153

Mobile agents

Applications

Information retrieval
  • Web crawlers
Peer-to-peer systems
Telecommunications
Remote control and monitoring

Systems:

JADE (Java Agent DEvelopment framework)
IBM Aglets

slide-154
SLIDE 154
