Spark RDD Operations
Transformation and Actions
MapReduce vs. RDD
Both MapReduce and RDD can be modeled using the Bulk Synchronous Parallel (BSP) model.
[Figure: BSP model. Processor 1, Processor 2, …, Processor n each perform independent local processing, with communication phases between the processing phases.]
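The BSP pattern of independent local processing followed by communication can be sketched in plain Python, without Spark. This is only an illustration under stated assumptions: the word-count task, the function names (`local_count`, `communicate`, `local_sum`, `word_count`), and the toy partitioner are all hypothetical and do not come from the slides.

```python
from collections import Counter, defaultdict

def local_count(partition):
    """Independent local processing: count words inside one partition."""
    return Counter(word for line in partition for word in line.split())

def communicate(local_results, num_processors):
    """Communication phase: route every (word, count) pair to the processor that owns the word."""
    inboxes = [defaultdict(list) for _ in range(num_processors)]
    for counts in local_results:
        for word, count in counts.items():
            # Toy deterministic partitioner (an assumption, standing in for a hash partitioner).
            owner = sum(map(ord, word)) % num_processors
            inboxes[owner][word].append(count)
    return inboxes

def local_sum(inbox):
    """Second round of independent local processing: sum the counts received per word."""
    return {word: sum(counts) for word, counts in inbox.items()}

def word_count(partitions):
    n = len(partitions)
    step1 = [local_count(p) for p in partitions]   # local processing on each processor
    inboxes = communicate(step1, n)                # communication between processors
    step2 = [local_sum(inbox) for inbox in inboxes]  # local processing again
    merged = {}
    for part in step2:
        merged.update(part)  # each word lives on exactly one processor, so keys never collide
    return merged

result = word_count([["a b a"], ["b c"]])
```

In MapReduce terms, the first local phase plays the role of the map side and the second the reduce side; a chain of RDD operations would repeat the same alternation of local phases and communication.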
[Figure: a function f is applied during local processing on each machine; after a network transfer, f is applied on the driver machine to produce the final result.]
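The labels f, Local Processing, Network Transfer, and Driver Machine suggest a reduce-style action, where the same function f runs inside each partition and again on the driver. The following is a minimal pure-Python sketch of that flow, assuming this reading of the diagram; `rdd_reduce` is a hypothetical helper, not a Spark API.

```python
from functools import reduce

def rdd_reduce(partitions, f):
    """Simulate a reduce action over a list of partitions (plain Python lists)."""
    # Local processing: each non-empty partition is folded with f independently.
    partials = [reduce(f, p) for p in partitions if p]
    # Network transfer: only one partial result per partition reaches the driver.
    # Driver machine: the partial results are combined with the same f.
    return reduce(f, partials)

total = rdd_reduce([[1, 2, 3], [4, 5], [], [6]], lambda a, b: a + b)
largest = rdd_reduce([[1, 2, 3], [4, 5], [], [6]], max)
```

Because f is applied both within and across partitions, it should be associative and commutative; otherwise the result would depend on how the data happens to be partitioned.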
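[Figure: a function s is applied during local processing on each machine, starting from a value z; after a network transfer, a function c is applied on the driver machine to produce the final result.]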
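The labels s, c, and z plausibly correspond to the sequence operator, combine operator, and zero value of an aggregate-style action, although the extracted text does not confirm this. Under that assumption, a minimal pure-Python sketch looks as follows; `rdd_aggregate` is a hypothetical helper, not a Spark API.

```python
from functools import reduce

def rdd_aggregate(partitions, z, s, c):
    """Simulate an aggregate action: z is the zero value, s folds elements, c merges partials."""
    # Local processing: fold each partition with s, starting from the zero value z.
    partials = [reduce(s, p, z) for p in partitions]
    # Network transfer: one accumulator per partition is sent to the driver.
    # Driver machine: merge the accumulators with c, again starting from z.
    return reduce(c, partials, z)

# Example: compute (sum, count) in one pass, then derive the average.
zero = (0, 0)
seq_op = lambda acc, x: (acc[0] + x, acc[1] + 1)      # s: add one element to an accumulator
comb_op = lambda a, b: (a[0] + b[0], a[1] + b[1])     # c: merge two accumulators

sum_count = rdd_aggregate([[1, 2, 3], [4, 5], []], zero, seq_op, comb_op)
average = sum_count[0] / sum_count[1]
```

Unlike the reduce-style flow, the accumulator type here (a pair) differs from the element type (a number), which is why two separate functions are needed. The zero value is an immutable tuple in this sketch, so sharing it across partitions is safe.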