SLIDE 1

Complementi di Piattaforme Abilitanti Distribuite

Distributed Enabling Platforms

1 MCSN – N. Tonellotto – Complements of Distributed Enabling Platforms

slide-2
SLIDE 2

Topics

  • State-of-the-art technologies for dealing with large-scale problems
  • Frontier research in many different fields today requires world-wide collaborations
  • Batch analysis of gazillion-bytes of experimental data

– Crawling, indexing, searching the Web
– Web 2.0 applications
– Online analysis of gazillion-bytes of usage data

  • Grid and Cloud Platforms

– Resource Management
– Information Management
– Data Management
– System Virtualization
– Cost Analysis
– Data Analysis
– Programming

SLIDE 3

Course Organization

  • 48 hours: ~32 lessons, ~16 laboratory
  • 36 hours: ~24 lessons, ~12 laboratory
  • Timetable

– Monday 14:00-16:00, Room 10B
– Wednesday 17:00-19:00, Room 10B

  • Highly interactive lectures
  • Laboratory

– Java programming skills required

  • Notes and references available online

– Updated in real time on the course wiki

  • Grading

– notes (20%)
– project (50%), to be agreed with the teacher
– oral session (30%)

SLIDE 4
  • Distributed…

– relating to a computer network in which at least some of the processing is done by the individual computers and information is shared by and often stored at the computers
  • Enabling…

– to make possible, practical, or easy

  • Platforms…

– the computer architecture and equipment used for a particular purpose

TO DO WHAT?

SLIDE 5

Large Scale Problems

  • In research

– Frontier research in many different fields today requires world-wide collaborations
– Online access to expensive scientific instrumentation
– Scientists and engineers will be able to perform their work without regard to physical location
– Simulations of world-scale mathematical models
– Batch analysis of gazillion-bytes of experimental data

  • In production

– Crawling, indexing, searching the Web
– Web 2.0 applications
– Mining information
– Highly interactive applications
– Online analysis of gazillion-bytes of usage data

SLIDE 6

Biology

SLIDE 7

Earth Science

SLIDE 8

Physics

SLIDE 9

Astronomy

SLIDE 10

Google

SLIDE 11

Big enough?

  • Large Hadron Collider:

– 10^19 bytes/year generated
– 10^21 bytes/year forecasted
– 10^3 scientists
– 10^2 institutions

  • Large Synoptic Survey Telescope (2016)

– 15 TB/night
– 6.8 PB/year

  • Google

– 10^19 bytes/day processed
– 0.1 sec query latency

  • Walmart

– 6000 stores
– 267 M items/day
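
To get a feel for these volumes, the yearly and daily totals can be turned into sustained rates. The sketch below is only a back-of-the-envelope calculation: it assumes the slide's figures are powers of ten (10^19 bytes and so on) and uses decimal units; the helper name `bytes_per_second` is made up for this illustration.

```python
# Back-of-the-envelope rates for the figures on this slide.
# Assumptions: the slide's numbers are powers of ten (e.g. 10^19 bytes)
# and units are decimal (1 TB = 10^12 bytes, 1 EB = 10^18 bytes).

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365

GB = 10**9
TB = 10**12
EB = 10**18


def bytes_per_second(total_bytes: int, period_seconds: int) -> float:
    """Average sustained rate needed to handle total_bytes within period_seconds."""
    return total_bytes / period_seconds


# LHC: 10^19 bytes spread over one year.
lhc_rate = bytes_per_second(10**19, DAYS_PER_YEAR * SECONDS_PER_DAY)

# Google: 10^19 bytes spread over one day.
google_rate = bytes_per_second(10**19, SECONDS_PER_DAY)

# Walmart: 267 million items per day across 6000 stores.
items_per_store = 267e6 / 6000

print(f"LHC:     {10**19 / EB:.0f} EB/year, about {lhc_rate / GB:.0f} GB/s sustained")
print(f"Google:  {10**19 / EB:.0f} EB/day, about {google_rate / TB:.1f} TB/s sustained")
print(f"Walmart: about {items_per_store:.0f} items per store per day")
```

No single machine sustains rates of this order, which is the motivation for the distributed platforms discussed in the course.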

SLIDE 12

Our Data Driven World

  • Science

– Databases for astronomy, genomics, natural languages, seismic modeling, …

  • Humanities

– Scanned books, historic documents, …

  • Commerce

– Corporate sales, stock market transactions, census, airline traffic, …

  • Entertainment

– Hollywood movies, Internet images, MP3 music, …

  • Medicine

– Patient records, drug composition, …

SLIDE 13

Computing and Communication Technologies Evolution: 1960-2010

[Figure: timeline from 1960 to 2010 with two tracks, Computing and Communication, along a Control axis running from Centralised to Decentralised. Labels recoverable from the figure: Sputnik, ARPANET, Email, Ethernet, TCP/IP, IETF, Internet Era, WWW Era, Mosaic, HTML, XML, W3C, Web Services; Mainframes, Minicomputers, Crays, MPPs, PCs, Workstations, PC Clusters, WS Clusters, PDAs, HTC, XEROX PARC worm, P2P, Grids; e-Science, Computing as Utility, e-Business, SocialNets.]

SLIDE 14

Performance, Capability, Value of ICT as defined by the three Laws of Computing

  • Moore’s Law.

– The number of transistors on a single chip doubles roughly every 18 months.

  • Gilder’s Law.

– Aggregate bandwidth triples roughly every year.

  • Metcalfe’s Law.

– The value of a network grows roughly as the square of the number of participants.

Source: Cambridge Energy Research Associates
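
The three laws can be compared numerically. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, the periods are the ones quoted above, and Metcalfe's law is taken in its commonly quoted form (value proportional to the square of the number of participants), which is an assumption of this sketch.

```python
# Growth factors implied by the three laws, using the periods quoted on the slide.
# All function names are illustrative only.


def moore_factor(years: float, doubling_months: float = 18.0) -> float:
    """Growth in transistor count after `years`, doubling every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


def gilder_factor(years: float) -> float:
    """Growth in aggregate bandwidth after `years`, tripling every year."""
    return 3.0 ** years


def metcalfe_value(participants: int) -> int:
    """Relative network value, assumed proportional to participants squared."""
    return participants ** 2


for years in (3, 6, 9):
    print(
        f"after {years} years: "
        f"transistors x{moore_factor(years):.0f}, "
        f"bandwidth x{gilder_factor(years):.0f}"
    )

# Doubling the participants quadruples the value under the n^2 assumption.
print(metcalfe_value(20) / metcalfe_value(10))
```

Under these assumptions bandwidth grows much faster than transistor count, which is one common argument for moving computation onto the network.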

SLIDE 15

Experiment

  • You must pool your computers to calculate 10^20 prime numbers. How do you proceed?

– You agree to collaborate
– You put your computers in a network
– You install the programs
– You run the programs
– You wait for results
– You publish your results on the Web

  • Is it really that simple?
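
The naive plan above can be sketched on a single machine. This is only an illustration under stated assumptions: threads stand in for the networked computers, a tiny range stands in for the slide's target, the task is simplified to counting primes in a range, and every name (`split_range`, `count_primes`, `distributed_count`) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers used in this sketch."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def split_range(start: int, stop: int, parts: int) -> list[tuple[int, int]]:
    """Split [start, stop) into `parts` contiguous half-open chunks of near-equal size."""
    size, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        hi = lo + size + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def count_primes(chunk: tuple[int, int]) -> int:
    """The 'program' each computer runs on its own chunk."""
    lo, hi = chunk
    return sum(1 for n in range(lo, hi) if is_prime(n))


def distributed_count(start: int, stop: int, workers: int) -> int:
    """Hand one chunk to each worker, wait for all results, then combine them."""
    chunks = split_range(start, stop, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_primes, chunks))
    return sum(partial_counts)


print(distributed_count(0, 10_000, workers=4))
```

The sketch assumes every worker is trusted, always available, and never fails, and that the chunks cost the same to process.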

SLIDE 16

What if…

  • I do not trust someone else’s computer?
  • I do not trust the application?
  • I want to use my laptop during lectures?
  • The application wants more computers?
  • I forget the IP address of some computers?
  • My disk disintegrates losing the data?
  • Someone pays and we must share money?
  • We are still waiting for the results after the class?

NOT SO SIMPLE!

SLIDE 17

Some issues

  • Security
  • Resource sharing
  • Dynamicity
  • Lack of information
  • Lack of global state
  • Fault tolerance
  • Accounting
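
As one concrete illustration of the fault tolerance item, a coordinator can resubmit work that a failed machine never returned. The sketch below is a simplified, single-process simulation and not a method from the course: `flaky_worker`, `WorkerFailure`, and `run_with_retries` are hypothetical names, failures are simulated with a random number generator, and tasks are assumed to be unique integers.

```python
import random


class WorkerFailure(Exception):
    """Raised when a simulated worker crashes before returning a result."""


def flaky_worker(task: int, rng: random.Random, failure_rate: float) -> int:
    """Simulated remote computation (squares the task) that fails at random."""
    if rng.random() < failure_rate:
        raise WorkerFailure(f"worker lost while processing task {task}")
    return task * task


def run_with_retries(tasks, max_attempts=10, failure_rate=0.3, seed=0):
    """Run every task, putting failed ones back in the queue until they succeed.

    Returns (results, attempts); raises RuntimeError if a task fails
    `max_attempts` times. Tasks must be unique and hashable.
    """
    rng = random.Random(seed)
    results = {}
    attempts = {}
    pending = list(tasks)
    while pending:
        task = pending.pop(0)
        attempts[task] = attempts.get(task, 0) + 1
        try:
            results[task] = flaky_worker(task, rng, failure_rate)
        except WorkerFailure:
            if attempts[task] >= max_attempts:
                raise RuntimeError(f"task {task} failed {max_attempts} times")
            pending.append(task)  # resubmit, as if to another machine
    return results, attempts


results, attempts = run_with_retries(range(5), max_attempts=1000)
print(results)
print("total attempts:", sum(attempts.values()))
```

Retrying is safe here only because the simulated task has no side effects; real platforms also have to detect failures in the first place, which ties back to the lack of global state.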

SLIDE 18

How to solve a problem?

  • Manual Computing
  • Personal Computing
  • Mobile Computing
  • Ubiquitous Computing
  • Pervasive Computing
  • Parallel Computing
  • Distributed Computing
  • High Performance Computing
  • Grid Computing
  • Cloud Computing
