IR&DM ’13/’14
V.4 MapReduce
- 1. System Architecture
- 2. Programming Model
- 3. Hadoop
Based on MRS Chapter 4 and RU Chapter 2
Why MapReduce? Large clusters of commodity
Jeff Dean, Sanjay Ghemawat
[Figure: file /foo/bar with chunks 1df2, 2ef0, 3ef1; stored chunks 2ef0, 5ef0, 3ef1, 1df2, 3ef2, 5af1; flows labeled control and data]
[Figure: control flow: assign tasks, report progress]
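[Figure: data flow for two input documents]
Input documents:
  d123: a x b b a y
  d242: b y a x a b
Intermediate pairs: (a,d123), (x,d242), … (b,d123), (y,d242), …
Pairs grouped by key: (a,d123), (a,d242), … (x,d123), (x,d242), …
Output: (a,4) (b,4) … (x,2) (y,2) …
Diagram labels: 1 m (four times)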
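The example with documents d123 and d242 can be reproduced on a single machine. The following is only a minimal sketch in Python, not the framework's actual API: the names map_fn, shuffle, reduce_fn, and run_job are hypothetical, and the grouping step is simulated with an in-memory dictionary instead of being distributed across workers.

```python
from collections import defaultdict


def map_fn(did, content):
    # Emit one (word, did) pair per token, matching the pairs in the example.
    for word in content.split():
        yield word, did


def shuffle(pairs):
    # Group intermediate values by key and return the groups in key order.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_fn(word, dids):
    # The number of pairs for a word equals its total number of occurrences.
    return word, len(dids)


def run_job(documents):
    # Hypothetical driver: run all map calls, group by key, then reduce.
    intermediate = [pair
                    for did, content in documents.items()
                    for pair in map_fn(did, content)]
    return [reduce_fn(word, dids)
            for word, dids in shuffle(intermediate).items()]


documents = {"d123": "a x b b a y", "d242": "b y a x a b"}
counts = run_job(documents)
print(counts)  # [('a', 4), ('b', 4), ('x', 2), ('y', 2)]
```

The result agrees with the output pairs (a,4), (b,4), (x,2), (y,2) given in the example.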
map(long did, string content) {
  int pos = 0
  map<string, list<int>> positions = new map<string, list<int>>()
  for (string word : content.split()) {              // tokenize document content
    if (!positions.contains(word)) {
      positions.put(word, new list<int>())           // first occurrence of this word
    }
    positions.get(word).add(pos++)                   // aggregate word positions
  }
  for (string word : positions.keys()) {
    emit(word, new posting(did, positions.get(word)))  // emit posting
  }
}

reduce(string word, list<posting> postings) {
  postings.sort()        // sort postings (e.g., by did)
  emit(word, postings)   // emit posting list
}
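The inverted-index pseudocode can be tried out locally. The sketch below is one possible Python rendering under stated assumptions: a posting is modeled as a (did, positions) tuple, document identifiers are integers, and build_index is a hypothetical single-process driver standing in for the framework's grouping of map output by key.

```python
from collections import defaultdict


def map_fn(did, content):
    # Collect the positions of every word in this document.
    positions = defaultdict(list)
    for pos, word in enumerate(content.split()):
        positions[word].append(pos)
    # Emit one posting (did, positions) per distinct word.
    for word, word_positions in positions.items():
        yield word, (did, word_positions)


def reduce_fn(word, postings):
    # Sort postings by document identifier to form the posting list.
    return word, sorted(postings)


def build_index(documents):
    # Hypothetical driver: group map output by word, then reduce each group.
    grouped = defaultdict(list)
    for did, content in documents.items():
        for word, posting in map_fn(did, content):
            grouped[word].append(posting)
    return dict(reduce_fn(word, postings)
                for word, postings in grouped.items())


index = build_index({242: "b y a x a b", 123: "a x b b a y"})
print(index["a"])  # [(123, [0, 4]), (242, [2, 4])]
```

Using a defaultdict sidesteps the missing-key case that arises when a word is seen for the first time.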
Doug Cutting
http://googleblog.blogspot.com/2008/11/sorting-1pb-with-mapreduce.html
http://developer.yahoo.com/blogs/hadoop/posts/2009/05/hadoop_sorts_a_petabyte_in_162/