DataCollider: Effective Data-Race Detection for the Kernel


  1. DataCollider: Effective Data-Race Detection for the Kernel
     John Erickson, Madanlal Musuvathi, Sebastian Burckhardt, Kirk Olynyk
     Microsoft Windows and Microsoft Research
     {jerick, madanm, sburckha, kirko}@microsoft.com
     "Although threads seem to be a small step from sequential computation, in fact, they represent a huge step. They discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism."
     From “The Problem with Threads,” by Edward A. Lee, IEEE Computer, vol. 39, no. 5, May 2006

  2. Windows case study #1
     Thread A: RestartCtxtCallback(...) { pctxt->dwfCtxt |= CTXTF_NEED_CALLBACK; }
     Thread B: RunContext(...) { pctxt->dwfCtxt &= ~CTXTF_RUNNING; }
     • The OR’ing in of the CTXTF_NEED_CALLBACK flag can be swallowed by the AND’ing out of the CTXTF_RUNNING flag!
     • Results in system hang.

  3. Case study #1, assembled
     Thread A                      Thread B
     mov eax, [pctxt->dwfCtxt]     mov eax, [pctxt->dwfCtxt]
     and eax, NOT 10h              or eax, 20h
     mov [pctxt->dwfCtxt], eax     mov [pctxt->dwfCtxt], eax
     EAX (Thread A) = ??, EAX (Thread B) = ??, pctxt->dwfCtxt = 11h

  4. Case study #1, assembled: step 1, Thread A executes mov eax, [pctxt->dwfCtxt]
     EAX (Thread A) = 11h, EAX (Thread B) = ??, pctxt->dwfCtxt = 11h

  5. Case study #1, assembled: step 2, Thread A executes and eax, NOT 10h
     EAX (Thread A) = 01h, EAX (Thread B) = ??, pctxt->dwfCtxt = 11h

  6. Case study #1, assembled: /* CONTEXT SWITCH */ before Thread A can store its result
     EAX (Thread A) = 01h, EAX (Thread B) = ??, pctxt->dwfCtxt = 11h

  7. Case study #1, assembled: step 3, Thread B executes mov eax, [pctxt->dwfCtxt]
     EAX (Thread A) = 01h, EAX (Thread B) = 11h, pctxt->dwfCtxt = 11h

  8. Case study #1, assembled: step 4, Thread B executes or eax, 20h
     EAX (Thread A) = 01h, EAX (Thread B) = 31h, pctxt->dwfCtxt = 11h

  9. Case study #1, assembled: step 5, Thread B executes mov [pctxt->dwfCtxt], eax
     EAX (Thread A) = 01h, EAX (Thread B) = 31h, pctxt->dwfCtxt = 31h

  10. Case study #1, assembled: step 6, Thread A executes mov [pctxt->dwfCtxt], eax
     EAX (Thread A) = 01h, EAX (Thread B) = 31h, pctxt->dwfCtxt = 01h

  11. Case study #1, assembled: CTXTF_NEED_CALLBACK disappeared!
     ((pctxt->dwfCtxt & 0x20) == 0)
     EAX (Thread A) = 01h, EAX (Thread B) = 31h, pctxt->dwfCtxt = 01h
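The six steps traced on slides 4 to 11 can be replayed deterministically on a single thread. The sketch below is only an illustration: the function name `replay_lost_update` is hypothetical, the flag values are taken from the `10h` and `20h` constants on the slides, and two local variables stand in for each thread's private copy of EAX.

```c
#include <assert.h>
#include <stdint.h>

#define CTXTF_RUNNING       0x10u  /* bit cleared by the `and` thread */
#define CTXTF_NEED_CALLBACK 0x20u  /* bit set by the `or` thread */

/* Replays steps 1-6 from the slides in program order. This models the
 * interleaving only; it is not the Windows code itself. */
uint32_t replay_lost_update(uint32_t dwfCtxt)
{
    uint32_t eax_a, eax_b;

    /* As on the slides, the callback flag starts out clear. */
    assert((dwfCtxt & CTXTF_NEED_CALLBACK) == 0);

    eax_a = dwfCtxt;               /* 1: Thread A loads the flags      */
    eax_a &= ~CTXTF_RUNNING;       /* 2: Thread A clears RUNNING       */
    /* context switch: Thread A has not stored its result yet */
    eax_b = dwfCtxt;               /* 3: Thread B loads the same flags */
    eax_b |= CTXTF_NEED_CALLBACK;  /* 4: Thread B sets NEED_CALLBACK   */
    dwfCtxt = eax_b;               /* 5: Thread B stores its result    */
    dwfCtxt = eax_a;               /* 6: Thread A overwrites it        */
    return dwfCtxt;
}
```

Calling `replay_lost_update(0x11u)` returns `0x01u`: Thread A's store in step 6 discards the bit Thread B set in step 4.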

  12. Windows case study #1
     Thread A: RestartCtxtCallback(...) { pctxt->dwfCtxt |= CTXTF_NEED_CALLBACK; }
               assembly: or [ecx+40], 20h
     Thread B: RunContext(...) { pctxt->dwfCtxt &= ~CTXTF_RUNNING; }
               assembly: and [ecx+40], ~10h
     • Instructions appear atomic, but they are not!
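The slides in this excerpt do not show how the bug was fixed. One common remedy for this class of bug is to make each read-modify-write atomic. The sketch below uses C11 `<stdatomic.h>` rather than any Windows API; the `ctxt_t` type and the lower-case function names are hypothetical stand-ins for the code on the slide.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CTXTF_RUNNING       0x10u
#define CTXTF_NEED_CALLBACK 0x20u

/* Hypothetical stand-in for the context structure on the slides. */
typedef struct {
    _Atomic uint32_t dwfCtxt;
} ctxt_t;

/* Sets NEED_CALLBACK as one indivisible read-modify-write. */
void restart_ctxt_callback(ctxt_t *pctxt)
{
    assert(pctxt != NULL);
    atomic_fetch_or(&pctxt->dwfCtxt, CTXTF_NEED_CALLBACK);
}

/* Clears RUNNING as one indivisible read-modify-write. */
void run_context(ctxt_t *pctxt)
{
    assert(pctxt != NULL);
    atomic_fetch_and(&pctxt->dwfCtxt, ~CTXTF_RUNNING);
}
```

With atomic updates, neither operation can overwrite the other's bit, whichever order the two threads run in. On x86, compilers typically implement these calls with lock-prefixed `or` and `and` instructions.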

  13. Data race definition
     • By our definition, a data race is a pair of memory accesses that satisfy all of the following:
        • The accesses can happen concurrently
        • There is a non-zero overlap in the physical address ranges specified by the two accesses
        • At least one access modifies the contents of the memory location
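The three conditions in this definition can be written as a small predicate. This is a minimal sketch: the `mem_access_t` type and the function names are hypothetical, and whether two accesses can happen concurrently is passed in as a flag, since the definition does not say how that is decided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One memory access: the byte range [addr, addr + size) and its kind. */
typedef struct {
    uintptr_t addr;    /* first byte touched */
    size_t    size;    /* number of bytes, must be > 0 */
    bool      is_write;
} mem_access_t;

/* True when the two byte ranges share at least one byte. */
static bool ranges_overlap(const mem_access_t *a, const mem_access_t *b)
{
    return a->addr < b->addr + b->size && b->addr < a->addr + a->size;
}

/* Applies the three conditions from the slide. */
bool is_data_race(const mem_access_t *a, const mem_access_t *b,
                  bool can_run_concurrently)
{
    assert(a->size > 0 && b->size > 0);
    return can_run_concurrently
        && ranges_overlap(a, b)
        && (a->is_write || b->is_write);
}
```

For example, a 4-byte write at `0x1000` and a 2-byte read at `0x1002` satisfy the definition when they can run concurrently, while two reads of the same location never do.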

  14. Importance
     • Very hard to reproduce
        • Timings can be very tight
     • Hard to debug
        • Very easy to mistake for a hardware error (“bit flip”)
     • To support scalability, code is moving away from monolithic locks
        • Fine-grained locks
        • Lock-free approaches

  15. Previous Techniques
     • Happens-before and lockset algorithms have significant overhead
        • Intel Thread Checker has 200x overhead
        • Log all synchronizations
        • Instrument all memory accesses
     • High overhead can prevent usage in the field
        • Causes false failures due to timeouts

  16. Challenges
     • Prior schemes require complete knowledge and logging of all locking semantics
     • Locking semantics in kernel mode can be homegrown, complicated, and convoluted
        • e.g. DPCs, interrupts, affinities

  17. DataCollider: Goals

  18. DataCollider: Goals
     1. No false data races
        • Tradeoff between having false positives and reporting fewer data races

  19. False vs. Benign
     • False data race
        • A data race that cannot actually occur
     • Benign data race
        • A data race that can and does occur, but is intended to happen as part of normal program execution

  20. False vs. benign example
     Thread A                 Thread B
     MyLockAcquire();         MyLockAcquire();
     gReferenceCount++;       gReferenceCount++;
     MyLockRelease();         MyLockRelease();
     gStatisticsCount++;      gStatisticsCount++;

  23. DataCollider: Goals
     2. User-controlled overhead
        • Give user full control of overhead, from 0.0x up
        • Fast vs. more races found

  24. DataCollider: Goals
     3. Actionable data
        • Contextual information is key to analysis and debugging

  25. Insights

  26. Insights
     1. Instead of inferring if a data race could have occurred, let’s cause it to actually happen!
        • No locksets, no happens-before

  27. Insights
     2. Sample memory accesses
        • No binary instrumentation
        • No synchronization logging
        • No memory access logging
        • Use code and data breakpoints
        • Random selection for uniform coverage

  28. Intersection Metaphor

  29. Intersection Metaphor Memory Address = 0x1000

  30. Intersection Metaphor Memory Address = 0x1000 Hi, I’m Thread A!

  31. Intersection Metaphor Instruction stream Memory Address = 0x1000

  32. Intersection Metaphor Instruction stream I have the lock, so I get a green light. Memory Address = 0x1000

  33. Intersection Metaphor Instruction stream Memory Address = 0x1000

  34. Intersection Metaphor Memory Address = 0x1000 DataCollider

  35. Intersection Metaphor Memory Address = 0x1000 DataCollider

  36. Intersection Metaphor Please wait a moment, Thread A – we’re doing a routine check for data races. Memory Address = 0x1000 DataCollider

  37. Intersection Metaphor Memory Address = 0x1000 Value = 3 Data Breakpoint DataCollider

  38. Intersection Metaphor Memory Address = 0x1000 Value = 3 Data Breakpoint DataCollider

  39. Intersection Metaphor: Normal Case

  40. Intersection Metaphor: Normal Case Memory Address = 0x1000 Value = 3 Data Breakpoint DataCollider

  41. Intersection Metaphor: Normal Case Thread B Memory Address = 0x1000 Value = 3 Data Breakpoint DataCollider

  42. Intersection Metaphor: Normal Case I don’t have the lock, so I’ll have to wait. Memory Address = 0x1000 Value = 3 Data Breakpoint DataCollider

  43. Intersection Metaphor: Normal Case Nothing to see here. Let me remove this trap. Memory Address = 0x1000 Value = 3 Data Breakpoint DataCollider

  44. Intersection Metaphor: Normal Case Looks safe now. Sorry for the inconvenience. DataCollider

  45. Intersection Metaphor: Normal Case

  46. Intersection Metaphor: Data Race

  47. Intersection Metaphor: Data Race Memory Address = 0x1000 Value = 3 Data Breakpoint DataCollider

  48. Intersection Metaphor: Data Race Thread B Memory Address = 0x1000 Value = 3 Data Breakpoint DataCollider

  49. Intersection Metaphor: Data Race Locks are for wimps! Memory Address = 0x1000 Value = 3 Data Breakpoint DataCollider

  50. Intersection Metaphor: Data Race DataCollider

  51. Intersection Metaphor: Data Race

  52. Intersection Metaphor: Data Race DataCollider

  53. Intersection Metaphor: Data Race Looks safe now. Sorry for the inconvenience. DataCollider

  54. Intersection Metaphor: Data Race
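The metaphor can be modelled in user space as a trap that is armed at a sampled access, watched during a delay, and then removed. The sketch below is a model only and makes several assumptions: every name in it is hypothetical, a plain flag stands in for the hardware data breakpoint, other threads' accesses are reported by explicit calls, and the comparison against the recorded value is inferred from the "Value = 3" label on the slides rather than stated there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space model of one data breakpoint (hypothetical names). */
typedef struct {
    const volatile uint32_t *addr;  /* watched location */
    uint32_t saved_value;           /* value recorded when the trap was armed */
    bool sampled_is_write;          /* kind of the paused (sampled) access */
    bool armed;
    bool hit;                       /* set when a conflicting access is seen */
} trap_t;

/* Pause point: record the current value and start watching the address. */
void trap_arm(trap_t *t, const volatile uint32_t *addr, bool sampled_is_write)
{
    t->addr = addr;
    t->saved_value = *addr;
    t->sampled_is_write = sampled_is_write;
    t->armed = true;
    t->hit = false;
}

/* Models another thread touching memory while the trap is armed.
 * Only pairs with at least one write count as conflicts. */
void trap_observe(trap_t *t, const volatile uint32_t *addr, bool is_write)
{
    if (t->armed && t->addr == addr && (t->sampled_is_write || is_write))
        t->hit = true;
}

/* End of the delay: remove the trap and report whether a conflict was
 * seen, either through the trap or through a changed value (assumption). */
bool trap_disarm(trap_t *t)
{
    assert(t->armed);
    t->armed = false;
    return t->hit || *t->addr != t->saved_value;
}
```

In the normal case no conflicting access arrives while the trap is armed and `trap_disarm` returns false; in the data race case Thread B's unprotected write trips the trap and `trap_disarm` returns true.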

  55. Implementation

  56. Sampling memory accesses with code breakpoints; part 1
     Process
        1. Analyze target binary for memory access instructions.
        2. Hook the breakpoint handler.
        3. Set code breakpoints at a sampling of the memory access instructions.
        4. Begin execution.
     Advantages
        • Zero base-overhead: no code breakpoints means only the original code is running.
        • No annotations required, only symbols.
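Step 3 of the process, choosing which memory access instructions receive a code breakpoint, can be sketched as a partial Fisher-Yates shuffle over the addresses found in step 1. The function name `sample_sites` and the use of the C library's `rand()` are assumptions for illustration; the slides do not say how DataCollider draws its sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Moves a randomly chosen subset of k instruction addresses to the
 * front of sites[0..n) and returns k. Each site is picked with roughly
 * equal probability; rand() with modulo has a slight bias, which is
 * acceptable for a sketch. */
size_t sample_sites(uintptr_t *sites, size_t n, size_t k)
{
    assert(k <= n);
    for (size_t i = 0; i < k; i++) {
        /* Pick one of the not-yet-chosen sites in [i, n). */
        size_t j = i + (size_t)rand() % (n - i);
        uintptr_t tmp = sites[i];
        sites[i] = sites[j];
        sites[j] = tmp;
    }
    return k;
}
```

After the call, a caller would set code breakpoints at `sites[0]` through `sites[k-1]`. Choosing a smaller `k` lowers the overhead, in line with the user-controlled overhead goal.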
