CSE 291D/234 Data Systems for Machine Learning
Arun Kumar
Topic 1: Classical ML Training at Scale
Chapters 2, 5, and 6 of MLSys book
Academic ML 101
Classical ML
Generalized Linear Models (GLMs); from statistics
Bayesian Networks;
https://www.kaggle.com/c/kaggle-survey-2019
[Figure: storage levels including Cache, annotated with access cycles: $10^5$ – $10^6$; $10^7$ – $10^8$; 100s]
[Figure labels: CU, ALU, Caches, Registers; files tmp.csv and tmp.py]
$\sum_{i=1}^{n}$
Y   X1   X2   X3
0   1b   1c   1d
1   2b   2c   2d
1   3b   3c   3d
0   4b   4c   4d
…   …    …    …
$\sum_{i=1}^{n}$
$\sum_{i=1}^{n}$
$\sum_{i=1}^{n}$
[Figure labels: P1 P2 P3 P4 P5 P6 (shown twice)]
$\sum_{i=1}^{n}$
Y   X1   X2   X3
0   1b   1c   1d
1   2b   2c   2d
1   3b   3c   3d
0   4b   4c   4d
…   …    …    …
0,1b,1c,1d
1,2b,2c,2d
1,3b,3c,3d
…
$\nabla L_1(w^{(k)})$
$\nabla L_2(w^{(k)})$
$\nabla L(w^{(k)}) = \nabla L_1(w^{(k)}) + \nabla L_2(w^{(k)}) + \dots$
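The decomposition above says the full gradient at iterate w(k) is the sum of partial gradients computed on separate chunks of the data. A minimal sketch of that idea follows. It assumes a logistic loss, a toy in-memory dataset, and three equal-sized partitions; none of these choices come from the slides, and the names `logistic_gradient`, `dataset`, and `partitions` are hypothetical.

```python
import math

def logistic_gradient(w, rows):
    """Sum of per-example logistic-loss gradients over rows of (y, x).

    The logistic loss is an assumption made for illustration only.
    """
    grad = [0.0] * len(w)
    for y, x in rows:
        score = sum(wj * xj for wj, xj in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-score))  # predicted P(y = 1 | x)
        for j, xj in enumerate(x):
            grad[j] += (p - y) * xj
    return grad

def add_vectors(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [ai + bi for ai, bi in zip(a, b)]

# Toy dataset: each row is (label y, feature vector x). Values are made up.
dataset = [
    (0, [1.0, 0.5, -1.0]),
    (1, [0.2, -0.3, 0.8]),
    (1, [-0.7, 1.5, 0.1]),
    (0, [0.9, -1.2, 0.4]),
    (1, [0.3, 0.3, -0.6]),
    (0, [-1.1, 0.7, 0.2]),
]
w_k = [0.1, -0.2, 0.05]  # current iterate w(k)

# Gradient computed in one pass over the whole dataset.
full_gradient = logistic_gradient(w_k, dataset)

# The same gradient computed partition by partition, then summed.
partitions = [dataset[0:2], dataset[2:4], dataset[4:6]]
partial_gradients = [logistic_gradient(w_k, part) for part in partitions]
summed_gradient = [0.0] * len(w_k)
for g in partial_gradients:
    summed_gradient = add_vectors(summed_gradient, g)

# The two agree up to floating-point rounding.
gradients_match = all(
    math.isclose(a, b, abs_tol=1e-12)
    for a, b in zip(full_gradient, summed_gradient)
)
```

Because addition is associative and commutative, each partial gradient can be computed wherever its partition lives, and only the small gradient vectors need to be brought together.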
https://docs.dask.org/en/latest/dataframe.html
$(y_i, x_i) \in B \subset D$
Example with SQL AVG:
Initialize: Start by setting up the “agg. state” in DRAM. (S, C): (partial sum, partial count)
Transition: RDBMS gives a tuple from the table; update agg. state. $(S, C) \leftarrow (S, C) + (v_i, 1)$
Merge: (Optional: in a parallel RDBMS, combine agg. states of workers.) $(S', C') \leftarrow \sum_{\text{worker } j} (S_j, C_j)$
Finalize: Post-process agg. state and return result. Return $S'/C'$
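The four steps for AVG (Initialize, Transition, Merge, Finalize) can be sketched as plain Python functions. This is only an illustration of the aggregation-state pattern, not any RDBMS's actual API: the function names, the `run_worker` helper, and the toy partitions are hypothetical, and returning `None` for an empty input is an assumption that mirrors SQL's NULL.

```python
def initialize():
    """Set up the aggregation state (S, C) = (partial sum, partial count)."""
    return (0.0, 0)

def transition(state, value):
    """Fold one tuple's value into the state: (S, C) <- (S, C) + (v_i, 1)."""
    s, c = state
    return (s + value, c + 1)

def merge(state_a, state_b):
    """Combine the aggregation states of two workers component-wise."""
    return (state_a[0] + state_b[0], state_a[1] + state_b[1])

def finalize(state):
    """Post-process the state and return the result S / C."""
    s, c = state
    return s / c if c else None  # assumption: empty input yields None

def run_worker(values):
    """Hypothetical helper: one worker scans its own partition."""
    state = initialize()
    for v in values:
        state = transition(state, v)
    return state

# Three workers, each holding a made-up partition of the column.
worker_partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
worker_states = [run_worker(p) for p in worker_partitions]

# Merge the per-worker states, then finalize once.
merged = initialize()
for worker_state in worker_states:
    merged = merge(merged, worker_state)
average = finalize(merged)
```

Keeping the sum and the count separate until Finalize is what makes Merge correct: averaging the per-worker averages would give the wrong answer whenever partitions differ in size.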
Initialize: Allocate memory for model $W^{(t)}$ and mini-batch gradient stats.
Transition: Given tuple with (y, x), compute gradient and update stats. If mini-batch limit hit, update model and reset stats.
Merge: (Optional: applies only to parallel RDBMS.) “Combine” model parameters from indep. workers.
Finalize: Return model parameters.
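The same four-step pattern can carry mini-batch SGD, with the model and the running gradient statistics as the aggregation state. The sketch below is one possible reading of those steps, under several assumptions that are not in the slides: a logistic loss, a fixed learning rate, a mini-batch size of 2, and plain weight averaging as the way to “combine” models in Merge. The names `SgdState`, `run_worker`, and the toy partitions are hypothetical.

```python
import math

class SgdState:
    """Aggregation state: model weights plus mini-batch gradient stats."""
    def __init__(self, num_features):
        self.weights = [0.0] * num_features
        self.grad_sum = [0.0] * num_features
        self.batch_count = 0

def initialize(num_features):
    """Allocate memory for the model and the mini-batch gradient stats."""
    return SgdState(num_features)

def transition(state, y, x, batch_size=2, learning_rate=0.1):
    """Fold one (y, x) tuple into the stats; update the model when the
    mini-batch limit is hit. Logistic loss is an assumption."""
    score = sum(w * xj for w, xj in zip(state.weights, x))
    p = 1.0 / (1.0 + math.exp(-score))
    for j, xj in enumerate(x):
        state.grad_sum[j] += (p - y) * xj
    state.batch_count += 1
    if state.batch_count == batch_size:
        for j in range(len(state.weights)):
            state.weights[j] -= learning_rate * state.grad_sum[j] / batch_size
        state.grad_sum = [0.0] * len(state.weights)  # reset stats
        state.batch_count = 0
    return state

def merge(state_a, state_b):
    """Combine two workers' models. Averaging the weights is an assumed
    choice; leftover partial mini-batch stats are dropped here."""
    merged = SgdState(len(state_a.weights))
    merged.weights = [(a + b) / 2.0
                      for a, b in zip(state_a.weights, state_b.weights)]
    return merged

def finalize(state):
    """Return the model parameters."""
    return list(state.weights)

def run_worker(rows):
    """Hypothetical helper: one worker scans its partition of (y, x) rows."""
    state = initialize(num_features=2)
    for y, x in rows:
        state = transition(state, y, x)
    return state

# Two made-up partitions, each exactly one mini-batch long.
partition_a = [(1, [1.0, 2.0]), (0, [-1.0, -0.5])]
partition_b = [(1, [0.5, 1.5]), (0, [-2.0, -1.0])]
model = finalize(merge(run_worker(partition_a), run_worker(partition_b)))
```

Unlike AVG, the result here depends on the order in which tuples arrive and on how Merge combines models, so the parallel answer is generally not identical to a single sequential pass.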
[Figure: a Master and Worker 1, Worker 2, …, Worker n; per-worker quantities indexed 1, 2, …, n are combined as a sum over i = 1 to n]
https://madlib.apache.org/docs/latest/index.html
https://www.usenix.org/system/files/conference/nsdi12/nsdi12-final138.pdf
https://papers.nips.cc/paper/3150-map-reduce-for-machine-learning-on-multicore.pdf
[Figure: Worker 1, Worker 2, …, Worker n and PS 1, PS 2, …, PS k]
http://vldb.org/pvldb/vol5/p716_yuchenglow_vldb2012.pdf
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/variable-importance.html
Y   X1   X2   X3   Gi   Hi
0   1b   1c   1d   …    …
1   2b   2c   2d   …    …
1   3b   3c   3d   …    …
0   4b   4c   4d   …    …
…   …    …    …    …    …
https://www.usenix.org/system/files/osdi18-moritz.pdf
https://adalabucsd.github.io/papers/2015_Orion_SIGMOD.pdf
https://adalabucsd.github.io/papers/2017_Morpheus_VLDB.pdf
http://www.cs.cmu.edu/~kijungs/etc/10-405.pdf
https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-mcsherry.pdf