Outline: Parallel application development (0024 Spring 2010) - PowerPoint PPT Presentation

  1. Outline (0024 Spring 2010, 24 :: 2)

  2. Parallel application development

  4. Lock data, not code

  5. Do you really need locks?
     - No shared data => no need for locks
       - Recall that CSP gives you a model to avoid locks
       - No free lunch
     - Lock-free data structures
       - Mutex-free by design
       - Growing number of class/data structures

  6. Detour: no shared data
     - What if we could write programs so that there are no side-effects?
       - Think about the simple finite impulse response filter for N inputs
       - Think of computing an expensive function for N numbers
       - Think of searching for a string in N documents
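As a minimal sketch of the "expensive function for N numbers" case, the Java snippet below applies a side-effect-free function to every input in parallel. The class name `PureParallelMap` and the function `expensive` are hypothetical stand-ins, not part of the lecture; the point is only that no lock appears because no task writes state that another task can read.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PureParallelMap {
    // Hypothetical stand-in for an expensive function: it reads only its
    // argument and writes only local variables, so it has no side-effects.
    static long expensive(int n) {
        long acc = 0;
        for (int i = 1; i <= n; i++) {
            acc += (long) i * i;
        }
        return acc;
    }

    // Applies expensive() to every input, possibly on several threads.
    // The result list keeps the order of the input list.
    static List<Long> applyToAll(List<Integer> inputs) {
        return inputs.parallelStream()
                     .map(PureParallelMap::expensive)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> inputs =
            IntStream.rangeClosed(1, 8).boxed().collect(Collectors.toList());
        System.out.println(applyToAll(inputs));
    }
}
```

Because each call depends only on its own argument, the runtime is free to split the input across threads in any way it likes.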

  7. MapReduce
     - Basic idea: Parallel computing framework for restricted parallel programming model
     - Useful to distribute work to a farm (cluster) of compute nodes
     - User specifies what needs to be done for each data item ("map") and how results are to be combined ("reduce")
     - Libraries take care of everything else
       - Parallelization
       - Fault Tolerance
       - Data Distribution
       - Load Balancing

  8. MapReduce
     - Map()
       - Process a key/value pair to generate intermediate key/value pairs
     - Reduce()
       - Merge all intermediate values associated with the same key
     - Names originated in the functional programming world … but slightly different semantics

  9. Example: Counting Words
     - Map()
       - Input <filename, file text>
       - Parses file and emits <word, count> pairs
       - E.g. <"hello", 1>
     - Reduce()
       - Sums all values for the same key
       - Emits <word, TotalCount>
       - E.g. <"hello", 5> <"hello", 1> <"hello", 2> <"hello", 7> => <"hello", 15>

  10. Example Use of MapReduce
      - Counting words in a large set of documents

          map(string key, string value)
            // key: document name
            // value: document contents
            for each word w in value
              EmitIntermediate(w, "1");

          reduce(string key, iterator values)
            // key: word
            // values: list of counts
            int result = 0;
            for each v in values
              result += ParseInt(v);
            Emit(AsString(result));
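The word-count pseudocode above can be tried out in plain Java. The sketch below is a single-process illustration only: `WordCountSketch`, `run`, and the in-memory grouping step are assumptions standing in for what a real MapReduce library would do across a cluster (distribution, fault tolerance, load balancing are all absent).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // map(): emits one <word, 1> pair for every word in the document text.
    static List<Map.Entry<String, Integer>> map(String documentName, String contents) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String w : contents.toLowerCase().split("\\W+")) {
            if (!w.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(w, 1));
            }
        }
        return pairs;
    }

    // reduce(): sums all counts emitted for one word.
    static int reduce(String word, List<Integer> counts) {
        int result = 0;
        for (int c : counts) {
            result += c;
        }
        return result;
    }

    // Sequential stand-in for the framework: run map() on every document,
    // group the intermediate pairs by key, then run reduce() once per key.
    static Map<String, Integer> run(Map<String, String> documents) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> doc : documents.entrySet()) {
            for (Map.Entry<String, Integer> pair : map(doc.getKey(), doc.getValue())) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            totals.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, String> documents = new HashMap<>();
        documents.put("a.txt", "hello world hello");
        documents.put("b.txt", "Hello there");
        System.out.println(run(documents)); // prints {hello=3, there=1, world=1}
    }
}
```

Note that `map` and `reduce` never touch shared state; only the grouping step in `run` does, which is the part a library would own.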

  11. Data Distribution
      - Input files are split into pieces
        - distributed file system
      - Intermediate files created from map tasks are written to local disk
      - Output files are written to distributed file system

  12. Assigning Tasks
      - Many copies of the user program are started
      - Tries to exploit data locality by running map tasks on the machines that hold the data
      - One instance becomes the master
      - Master finds idle machines and assigns tasks

  14. MapReduce

  15. Do you really need locks?
      - No shared data => no need for locks
        - Recall that CSP gives you a model to avoid locks
        - No free lunch
      - Lock-free data structures
        - Mutex-free by design
        - Growing number of class/data structures

  16. Why Locking Doesn't Scale
      - Not Robust
      - Relies on conventions
      - Hard to Use
        - Conservative
        - Deadlocks
        - Lost wake-ups
      - Not Composable

  17. Locks are not Robust
      - If a thread holding a lock is delayed … no one else can make progress

  18. Why Locking Doesn't Scale
      - Not Robust
      - Relies on conventions
      - Hard to Use
        - Conservative
        - Deadlocks
        - Lost wake-ups
      - Not Composable

  19. Locking Relies on Conventions
      - Relation between
        - Lock bit and object bits
        - Exists only in programmer's mind
      - Actual comment from Linux Kernel (hat tip: Bradley Kuszmaul)

          /*
           * When a locked buffer is visible to the I/O layer
           * BH_Launder is set. This means before unlocking
           * we must clear BH_Launder,mb() on alpha and then
           * clear BH_Lock, so no reader can see BH_Launder set
           * on an unlocked buffer and then risk to deadlock.
           */

  20. Why Locking Doesn't Scale
      - Not Robust
      - Relies on conventions
      - Hard to Use
        - Conservative
        - Deadlocks
        - Lost wake-ups
      - Not Composable

  21. Sadistic Homework
      - FIFO queue (diagram: enq(x) and enq(y) at opposite ends)
      - No interference if ends "far enough" apart

  22. Sadistic Homework
      - Double-ended queue (diagram: deq() at each end)
      - Interference OK if ends "close enough" together

  23. Sadistic Homework
      - Double-ended queue (diagram: deq() at each end)
      - Make sure suspended dequeuers awake as needed

  24. In Search of the Lost Wake-Up
      - Waiting thread doesn't realize when to wake up
      - It's a real problem in big systems
        - "Calling pthread_cond_signal() or pthread_cond_broadcast() when the thread does not hold the mutex lock associated with the condition can lead to lost wake-up bugs."
        - from Google™ search for "lost wake-up"
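For comparison, here is a minimal Java sketch of the usual defence against lost wake-ups: the waiter re-checks its condition in a loop, and both the check and the notification happen while holding the same monitor. The class `GuardedBuffer` and its `demo` helper are hypothetical names for illustration. Unlike the pthreads calls quoted above, Java's `notifyAll()` throws `IllegalMonitorStateException` if the caller does not hold the monitor, so this particular mistake is caught at run time.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class GuardedBuffer<T> {
    private final Deque<T> items = new ArrayDeque<>();

    // Adds an item and wakes all waiters while holding the monitor, so a
    // waiter cannot miss the signal between its check and its wait().
    public synchronized void put(T item) {
        items.addLast(item);
        notifyAll();
    }

    // Blocks until an item is available. The while loop re-checks the
    // condition after every wake-up, which also covers spurious wake-ups.
    public synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        return items.removeFirst();
    }

    // Small demonstration: one consumer thread takes what the caller puts.
    static String demo() {
        GuardedBuffer<String> buffer = new GuardedBuffer<>();
        String[] received = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                received[0] = buffer.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        buffer.put("item");
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The sketch works whether `put` runs before or after the consumer starts waiting, because the consumer tests `items.isEmpty()` under the lock before it ever calls `wait()`.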

  25. You Try It …
      - One lock?
        - Too Conservative
      - Locks at each end?
        - Deadlock, too complicated, etc
      - Waking blocked dequeuers?
        - Harder than it looks

  26. Actual Solution
      - Clean solution would be a publishable result
        - [Michael & Scott, PODC 96]
      - High performance fine-grained lock-based solutions are good for libraries…
        - not general consumption by programmers

  27. Why Locking Doesn't Scale
      - Not Robust
      - Relies on conventions
      - Hard to Use
        - Conservative
        - Deadlocks
        - Lost wake-ups
      - Not Composable

  28. Locks do not compose
      - Hash table: must lock T1 before adding item
        - lock T1; add(T1, item)
      - Move from T1 to T2: must lock T2 before deleting from T1
        - lock T1; lock T2; delete(T1, item); add(T2, item)
      - Exposing lock internals breaks abstraction
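The move-between-tables problem can be sketched in Java as follows. `LockedTable`, its `id` field, and the lock-ordering rule are all assumptions made for this illustration; the slide does not prescribe them. What the sketch is meant to show is that `move` only works because each table's lock is visible to outside code, and because every caller agrees on an acquisition order.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class LockedTable {
    private static final AtomicInteger NEXT_ID = new AtomicInteger();

    final int id = NEXT_ID.getAndIncrement();        // used only to order locks
    final ReentrantLock lock = new ReentrantLock();  // exposed: the abstraction leak
    private final Set<String> items = new HashSet<>();

    void add(String item) {
        lock.lock();
        try {
            items.add(item);
        } finally {
            lock.unlock();
        }
    }

    boolean delete(String item) {
        lock.lock();
        try {
            return items.remove(item);
        } finally {
            lock.unlock();
        }
    }

    boolean contains(String item) {
        lock.lock();
        try {
            return items.contains(item);
        } finally {
            lock.unlock();
        }
    }

    // Moves item so that no other thread sees it in neither or both tables.
    // Both locks are taken in ascending id order; without that convention two
    // opposite moves could deadlock.
    static boolean move(LockedTable from, LockedTable to, String item) {
        LockedTable first = from.id < to.id ? from : to;
        LockedTable second = first == from ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                if (!from.delete(item)) {
                    return false;
                }
                to.add(item);
                return true;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    public static void main(String[] args) {
        LockedTable t1 = new LockedTable();
        LockedTable t2 = new LockedTable();
        t1.add("item");
        System.out.println(move(t1, t2, "item")); // prints true
    }
}
```

The individually correct `add` and `delete` could not be combined into a correct `move` without reaching into both tables' locking, which is the composition problem the slide names.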

  29. Monitor Wait and Signal
      - Diagram: empty buffer, waiting thread ("zzz"), then "Yes!"
      - If buffer is empty, wait for item to show up

  30. Wait and Signal do not Compose
      - Diagram: two empty buffers, waiting thread ("zzz…")
      - Wait for either?

  31. The Transactional Manifesto
      - What we do now is inadequate to meet the multi-core challenge
      - Research Agenda
        - Replace locking with a transactional API
        - Design languages to support this model
        - Implement the run-time to be fast enough

  32. Transactions
      - Atomic
        - Commit: takes effect
        - Abort: effects rolled back
        - Usually retried
      - Linearizable
        - Appear to happen in one-at-a-time order
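Java has no built-in transactions, but the commit/abort/retry shape can be illustrated on a single value with a compare-and-set loop. This is an analogy under stated assumptions, not transactional memory: `OptimisticCell`, `update`, and `demo` are hypothetical names, and the sketch covers only one shared reference rather than arbitrary reads and writes.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

class OptimisticCell<T> {
    private final AtomicReference<T> current;

    OptimisticCell(T initial) {
        current = new AtomicReference<>(initial);
    }

    T get() {
        return current.get();
    }

    // Reads a snapshot, computes the new value privately, then tries to
    // publish it in one step. If another thread published first, the attempt
    // "aborts" (nothing became visible) and the loop retries on a new snapshot.
    T update(UnaryOperator<T> transaction) {
        while (true) {
            T snapshot = current.get();
            T proposed = transaction.apply(snapshot);
            if (current.compareAndSet(snapshot, proposed)) {
                return proposed; // commit: takes effect atomically
            }
        }
    }

    // Four threads each apply 1000 increments; every increment commits once.
    static int demo() {
        OptimisticCell<Integer> cell = new OptimisticCell<>(0);
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    cell.update(v -> v + 1);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return cell.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 4000
    }
}
```

Each successful `compareAndSet` is a point at which the update appears to happen, so the increments appear to run one at a time even though the threads overlap.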

  33. Atomic Blocks

          atomic {
            x.remove(3);
            y.add(3);
          }

          atomic {
            y = null;
          }

  34. Atomic Blocks
      - No data race

          atomic {
            x.remove(3);
            y.add(3);
          }

          atomic {
            y = null;
          }

  35. Sadistic Homework Revisited
      - Write sequential code

          public void LeftEnq(item x) {
            Qnode q = new Qnode(x);
            q.left = this.left;
            this.left.right = q;
            this.left = q;
          }

  36. Sadistic Homework Revisited

          public void LeftEnq(item x) {
            atomic {
              Qnode q = new Qnode(x);
              q.left = this.left;
              this.left.right = q;
              this.left = q;
            }
          }

  37. Sadistic Homework Revisited
      - Enclose in atomic block

          public void LeftEnq(item x) {
            atomic {
              Qnode q = new Qnode(x);
              q.left = this.left;
              this.left.right = q;
              this.left = q;
            }
          }

  38. Warning
      - Not always this simple
        - Conditional waits
        - Enhanced concurrency
        - Overlapping locks
      - But often it is
        - Works for sadistic homework

  39. Composition
      - Trivial or what?

          public void Transfer(Queue q1, q2) {
            atomic {
              Object x = q1.deq();
              q2.enq(x);
            }
          }

  40. Wake-ups: lost and found
      - Roll back transaction and restart when something changes

          public Object LeftDeq() {
            atomic {
              if (this.left == null)
                retry;
              …
            }
          }
