Class of Computer Networks M
Global Data Batching
Luca Foschini
Academic year 2015/2016
University of Bologna, Dipartimento di Informatica – Scienza e Ingegneria (DISI)
Engineering Bologna Campus
Data processing in today's large clusters
- Excellent data parallelism
– Easy to find what to parallelize
– Example: web data crawled by Google that needs to be indexed; documents can be analyzed independently
– It's common to use 1000s of nodes for one program that processes large amounts of data
- Communication overhead is not very significant in the overall execution time
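A minimal sketch of the indexing example, assuming a toy in-memory corpus and a local thread pool standing in for cluster nodes. The names `index_document` and `build_index` are hypothetical, chosen only for illustration. Each document is analyzed on its own, and the only shared step is the final merge of partial results.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def index_document(item):
    """Analyze one document in isolation: return (doc_id, set of its words)."""
    doc_id, text = item
    return doc_id, set(text.lower().split())


def build_index(docs, workers=4):
    """Index every document independently, then merge the partial results."""
    index = defaultdict(set)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each task reads only its own document, so workers never coordinate.
        for doc_id, words in pool.map(index_document, docs.items()):
            # The merge is the only step that combines data across documents.
            for word in words:
                index[word].add(doc_id)
    return dict(index)


# Toy corpus (assumed data, for illustration only).
docs = {
    "d1": "large clusters process data",
    "d2": "data parallelism is easy",
}
index = build_index(docs)
```

Because no task depends on another, the result does not change with the number of workers; only the final merge moves data between them, which is consistent with communication being a small part of the overall execution time.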