

SLIDE 1

Looking Inside a Race Detector

SLIDE 2

kavya

@kavya719

SLIDE 3

data race detection

SLIDE 4

data races

“when two+ threads concurrently access a shared memory location, and at least one access is a write.”

[diagram: three interleavings of reads (R) and writes (W) of count by “g1” and “g2”; outcomes count = 1, count = 2, count = 2, labeled !concurrent, concurrent, concurrent]

    // Shared variable
    var count = 0

    func incrementCount() {
        if count == 0 {
            count++
        }
    }

    func main() {
        // Spawn two “threads”
        go incrementCount()
        go incrementCount()
    }

data race

SLIDE 5

data races

“when two+ threads concurrently access a shared memory location, and at least one access is a write.”

Thread 1: lock(l); count = 1; unlock(l)
Thread 2: lock(l); count = 2; unlock(l)

!data race

    // Shared variable
    var count = 0

    func incrementCount() {
        if count == 0 {
            count++
        }
    }

    func main() {
        // Spawn two “threads”
        go incrementCount()
        go incrementCount()
    }

data race

SLIDE 6
  • relevant
  • elusive
  • have undefined consequences
  • easy to introduce in languages like Go

“Panic messages from unexpected program crashes are often reported on the Go issue tracker. An overwhelming number of these panics are caused by data races, and an overwhelming number of those reports centre around Go’s built-in map type.” — Dave Cheney

SLIDE 7

Given that we want to write multithreaded programs, how can we protect our systems from the unknown consequences of these difficult-to-track-down data race bugs, in a manner that is reliable and scalable?

SLIDE 8

read by goroutine 7 at incrementCount()
created at main()

race detectors

SLIDE 9

…but how?

SLIDE 10
  • Go v1.1 (2013)
  • Integrated with the Go toolchain:
    > go run -race counter.go
  • Based on the C/C++ ThreadSanitizer dynamic race-detection library
  • As of August 2015: 1200+ races found in Google’s codebase, ~100 in the Go stdlib, 100+ in Chromium, plus more in LLVM, GCC, OpenSSL, WebRTC, Firefox

go race detector
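The -race flag is accepted by the other go subcommands as well; for example:

    > go test -race ./...
    > go build -race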

SLIDE 11

core concepts
internals
evaluation
wrap-up

SLIDE 12

core concepts

SLIDE 13

concurrency in go

The unit of concurrent execution: goroutines
  user-space threads; use as you would threads
  > go handle_request(r)

The Go memory model is specified in terms of goroutines:
  within a goroutine: reads and writes are ordered
  with multiple goroutines: shared data must be synchronized… else, data races!

SLIDE 14

The synchronization primitives:

  channels
    > ch <- value
  mutexes, condition variables, …
    > import “sync”
    > mu.Lock()
  atomics
    > import “sync/atomic”
    > atomic.AddUint64(&myInt, 1)
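As a concrete illustration, any one of these primitives can fix the racy counter from earlier. A minimal sketch, not from the talk (the sync.WaitGroup is added so that main waits for both goroutines): a sync.Mutex guards count.

    package main

    import (
        "fmt"
        "sync"
    )

    var (
        count = 0
        mu    sync.Mutex
    )

    func incrementCount() {
        mu.Lock()
        if count == 0 {
            count++
        }
        mu.Unlock()
    }

    func main() {
        var wg sync.WaitGroup
        wg.Add(2)
        go func() { defer wg.Done(); incrementCount() }()
        go func() { defer wg.Done(); incrementCount() }()
        wg.Wait()
        fmt.Println(count) // always 1: the lock orders the two accesses
    }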

SLIDE 15

“…goroutines concurrently access a shared memory location, and at least one access is a write.”

concurrent?

    var count = 0

    func incrementCount() {
        if count == 0 {
            count++
        }
    }

    func main() {
        go incrementCount()
        go incrementCount()
    }

[diagram: interleavings of R/W by “g1” and “g2”; outcomes count = 1, count = 2, count = 2, labeled !concurrent, concurrent, concurrent]

SLIDE 16

how can we determine “concurrent” memory accesses?

SLIDE 17

    var count = 0

    func incrementCount() {
        if count == 0 {
            count++
        }
    }

    func main() {
        incrementCount()
        incrementCount()
    }

not concurrent — same goroutine

SLIDE 18

not concurrent — the lock draws a “dependency edge”

    var count = 0
    var mu sync.Mutex // declaration implicit on the slide

    func incrementCount() {
        mu.Lock()
        if count == 0 {
            count++
        }
        mu.Unlock()
    }

    func main() {
        go incrementCount()
        go incrementCount()
    }

SLIDE 19

happens-before

orders events:
  memory accesses, i.e. reads and writes: a := b
  synchronization, via locks or lock-free sync: mu.Unlock(), ch <- a

X ≺ Y IF one of:
  • same goroutine
  • X, Y are a synchronization pair
  • X ≺ E ≺ Y across goroutines (transitivity)

IF X not ≺ Y and Y not ≺ X, concurrent!
SLIDE 20

[diagram: g1 performs lock (A) then unlock (B); g2 performs lock (C), then reads and a write (D)]

A ≺ B: same goroutine
B ≺ C: lock-unlock on the same object
A ≺ D: transitivity

SLIDE 21

concurrent?

    var count = 0

    func incrementCount() {
        if count == 0 {
            count++
        }
    }

    func main() {
        go incrementCount()
        go incrementCount()
    }

SLIDE 22

[diagram: g1 performs R (A) then W (B); g2 performs W (C) then R (D)]

A ≺ B and C ≺ D: same goroutine
but A not ≺ C and C not ≺ A: concurrent

SLIDE 23

how can we implement happens-before?

SLIDE 24

vector clocks

means to establish happens-before edges

[diagram: g1’s clock advances 1 → 2 → 3 → 4 through read(count) and unlock(mu); g2’s lock(mu) then merges the two clocks entry-wise: t1 = max(4, 0), t2 = max(0, 1), leaving g2 with (4, 1)]
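A minimal Go sketch of the mechanism, under assumed simplifications (two goroutines; the VClock type and its methods are hypothetical, not TSan’s code): each goroutine ticks its own entry, a lock acquire merges entry-wise with max, and comparing clocks entry-wise decides happens-before.

    // VClock is a hypothetical fixed-size vector clock: one entry per goroutine.
    type VClock [2]uint64

    // tick advances goroutine g's own entry when it performs an event.
    func (c *VClock) tick(g int) { c[g]++ }

    // merge takes the entry-wise max, as on the lock(mu) step above.
    func (c *VClock) merge(other VClock) {
        for i := range c {
            if other[i] > c[i] {
                c[i] = other[i]
            }
        }
    }

    // happensBefore reports x ≺ y: x ≤ y entry-wise, and x ≠ y.
    func happensBefore(x, y VClock) bool {
        le, lt := true, false
        for i := range x {
            if x[i] > y[i] {
                le = false
            }
            if x[i] < y[i] {
                lt = true
            }
        }
        return le && lt
    }

In these terms, the next two slides read: happensBefore((3, 0), (4, 2)) is true, while (2, 0) and (0, 1) are incomparable in either direction, i.e. concurrent.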

SLIDE 25

[diagram: g1 performs L (A), R, U (B) as its clock advances (0, 0) → (1, 0) → (3, 0) → (4, 0); g2 performs L, R (C), W (D) as its clock advances (0, 0) → (4, 1) → (4, 2)]

A ≺ D? (3, 0) < (4, 2)? So yes.

SLIDE 26

[diagram: g1 performs R (A, clock (1, 0)) then W (B, clock (2, 0)); g2 performs W (C, clock (0, 1)) then R (D, clock (0, 2))]

B ≺ C? (2, 0) < (0, 1)? No.
C ≺ B? No.
So, concurrent.

SLIDE 27

pure happens-before detection

Determines whether the accesses to a memory location can be ordered by happens-before, using vector clocks.

This is what the Go Race Detector does!

SLIDE 28

internals

SLIDE 29

go run -race

to implement happens-before detection, we need to:

  create vector clocks for goroutines
    … at goroutine creation
  update vector clocks based on memory-access and synchronization events
    … when these events occur
  compare vector clocks to detect happens-before relations
    … when a memory access occurs
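A toy skeleton of those three responsibilities, reusing the hypothetical VClock from the earlier sketch (the detector type and its method names are made up, not TSan’s API):

    // detector holds one vector clock per goroutine.
    type detector struct {
        clocks map[int]*VClock
    }

    // onGoCreate: at goroutine creation, the child starts with a copy of the
    // parent's clock, and each side then ticks its own entry.
    func (d *detector) onGoCreate(parent, child int) {
        c := *d.clocks[parent] // copy of the parent's clock
        d.clocks[parent].tick(parent)
        c.tick(child)
        d.clocks[child] = &c
    }

    // A memory-access hook would compare d.clocks[g] against previously
    // recorded accesses (see the shadow state slides below), then record this
    // access; a synchronization hook would merge clocks through the sync
    // object (see the SyncVar slide below).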

SLIDE 30

race detector state machine

[diagram: program events (spawn, lock, read) feed the race detector, which updates the race detector state and reports races]

SLIDE 31

do we have to modify our programs, then, to generate the events?

  memory accesses
  synchronization
  goroutine creation

nope.

SLIDE 32

    var count = 0

    func incrementCount() {
        if count == 0 {
            count++
        }
    }

    func main() {
        go incrementCount()
        go incrementCount()
    }

SLIDE 33
-race

    var count = 0

    func incrementCount() {
        raceread()
        if count == 0 {
            racewrite()
            count++
        }
        racefuncexit()
    }

    func main() {
        go incrementCount()
        go incrementCount()
    }

SLIDE 34

the gc compiler instruments memory accesses: it adds an instrumentation pass over the IR.

> go tool compile -race

    func compile(fn *Node) {
        ...
        order(fn)
        walk(fn)
        if instrumenting {
            instrument(Curfn)
        }
        ...
    }

SLIDE 35

This is awesome. We don’t have to modify our programs to track memory accesses.

What about synchronization events, and goroutine creation?

mutex.go:

    package sync

    import “internal/race”

    func (m *Mutex) Lock() {
        if race.Enabled {
            race.Acquire(…) // calls raceacquire(addr)
        }
        ...
    }

proc.go:

    package runtime

    func newproc1() {
        if race.Enabled {
            newg.racectx = racegostart(…)
        }
        ...
    }

SLIDE 36

runtime.raceread() calls into the ThreadSanitizer (TSan) library

a C++ race-detection library
(reached via an .asm file, because it’s calling into C++)

[diagram: program → TSan]

SLIDE 37

threadsanitizer

TSan implements the happens-before race detection:
  creates and updates vector clocks for goroutines -> ThreadState
  keeps track of memory-access and synchronization events -> Shadow State, Meta Map
  compares vector clocks to detect data races

SLIDE 38

go incrementCount() → newproc1() (proc.go):

    func newproc1() {
        if race.Enabled {
            newg.racectx = racegostart(…)
        }
        ...
    }

    struct ThreadState {
        ThreadClock clock;
    }

ThreadState contains a fixed-size vector clock (size == max # of threads).

count == 0 → raceread(…), by compiler instrumentation:
  1. data race with a previous access?
  2. store information about this access for future detections

SLIDE 39

shadow state

stores information about memory accesses.

8-byte shadow word for an access: TID | clock | pos | wr
  TID: accessor goroutine ID
  clock: scalar clock of the accessor, an optimized vector clock
  pos: offset and size within the 8-byte word
  wr: IsWrite bit

directly-mapped:
  application: 0x7f0000000000 – 0x7fffffffffff
  shadow: 0x180000000000 – 0x1fffffffffff
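To make the packing concrete, here is a sketch with made-up field widths (TSan’s real layout differs): shifts and masks squeeze all four fields into one 8-byte word.

    // shadowWord packs one access into 8 bytes (hypothetical layout):
    // 16 bits TID | 42 bits scalar clock | 3 bits offset | 2 bits size | 1 bit IsWrite.
    type shadowWord uint64

    func packShadow(tid uint16, clock uint64, off, size uint8, write bool) shadowWord {
        w := uint64(tid)<<48 | (clock&((1<<42)-1))<<6 | uint64(off&7)<<3 | uint64(size&3)<<1
        if write {
            w |= 1
        }
        return shadowWord(w)
    }

    func (w shadowWord) tid() uint16    { return uint16(w >> 48) }
    func (w shadowWord) isWrite() bool  { return w&1 == 1 }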

SLIDE 40

Optimization 1: N shadow cells per application word (8 bytes). When the shadow cells are filled, evict one at random.

[diagram: a gx read stored as (gx, clock_1, 0:2, wr=0) and a gy write stored as (gy, clock_2, 4:8, wr=1) occupy two shadow cells of the same application word]

SLIDE 41

Optimization 2: the shadow word (TID | clock | pos | wr) stores a scalar clock, not the full vector clock.

[diagram: on a gx access, only gx’s own entry (3) of its vector clock (3, 2) is stored as the shadow word’s clock]

SLIDE 42

g1: count == 0 → raceread(…), by compiler instrumentation
g1: count++ → racewrite(…)
g2: count == 0 → raceread(…), and check for race

[diagram: the shadow cells fill up: g1’s read (g1, 0:8); then g1’s write (g1, 1, 0:8, wr=1); then g2’s read (g2, 0:8)]

SLIDE 43

race detection

compare <accessor’s vector clock, new shadow word> with each existing shadow word:

  new shadow word: (g2, 0:8)
  existing shadow word: (g1, 1, 0:8, wr=1)

“…when two+ threads concurrently access a shared memory location, and at least one access is a write.”

SLIDE 44

race detection

compare <accessor’s vector clock, new shadow word> with each existing shadow word:

  do the access locations overlap? ✓
  are any of the accesses a write? ✓
  are the TIDs different? ✓
  are they concurrent (no happens-before)? ✓

g2’s vector clock: (0, 0); the existing shadow word’s clock: (1, ?)

  (g1, 1, 0:8, wr=1) vs (g2, 0:8)

SLIDE 45

race detection

compare (accessor’s ThreadState, new shadow word) with each existing shadow word:

  do the access locations overlap? ✓
  are any of the accesses a write? ✓
  are the TIDs different? ✓
  are they concurrent (no happens-before)? ✓

  (g1, 1, 0:8, wr=1) vs (g2, 0:8) → RACE!
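Schematically, the four checks combine like this; a sketch building on the earlier hypothetical VClock, with a made-up shadowAccess type standing in for an unpacked shadow word (not TSan’s code):

    // shadowAccess is an unpacked shadow word.
    type shadowAccess struct {
        tid     int
        clock   uint64 // accessor's scalar clock at the time of access
        lo, hi  uint8  // byte range within the 8-byte word
        isWrite bool
    }

    // racesWith reports whether a new access by goroutine tid, with vector
    // clock cur, covering bytes [lo, hi), conflicts with a recorded access.
    func racesWith(cur VClock, tid int, lo, hi uint8, isWrite bool, old shadowAccess) bool {
        switch {
        case hi <= old.lo || old.hi <= lo:
            return false // 1. the access locations don't overlap
        case !isWrite && !old.isWrite:
            return false // 2. two reads never race
        case tid == old.tid:
            return false // 3. same goroutine: ordered by program order
        case old.clock <= cur[old.tid]:
            return false // 4. the old access happens-before this one
        }
        return true // overlapping, conflicting, concurrent: RACE
    }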

SLIDE 46

synchronization events

TSan must track synchronization events.

[diagram: g1’s clock advances 1 → 2 → 3, ending in unlock(mu); g2’s lock(mu) then merges the clocks: g1’s entry = max(3, 0), g2’s entry = max(0, 1)]

SLIDE 47

sync vars

    mu := sync.Mutex{}

    struct SyncVar {
        SyncClock clock;
    }

SyncVars are stored in the meta map region; each contains a vector clock, the SyncClock.

[diagram: g1’s mu.Unlock() (at clock 3) stores its vector clock into the SyncClock; g2’s mu.Lock() (at clock 1) then sets its clock to max(its own clock, SyncClock)]
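In the same sketch terms, the unlock/lock pair behaves roughly like this (hypothetical helpers building on the earlier VClock sketch, not the runtime’s code):

    // syncVar models the per-mutex state kept in the meta map.
    type syncVar struct {
        clock VClock // the SyncClock
    }

    // release: on mu.Unlock(), goroutine g publishes its clock into the SyncClock.
    func release(s *syncVar, g int, c *VClock) {
        c.tick(g)
        s.clock.merge(*c)
    }

    // acquire: on mu.Lock(), goroutine g pulls in everything the releaser had seen.
    func acquire(s *syncVar, g int, c *VClock) {
        c.merge(s.clock)
        c.tick(g)
    }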

SLIDE 48

a note (or two)…

TSan tracks file descriptors, memory allocations, etc. too.
TSan can also track your custom sync primitives, via dynamic annotations!

SLIDE 49

evaluation

SLIDE 50

evaluation

“is it reliable?” “is it scalable?”

  • program slowdown: 5x–15x
  • memory usage: 5x–10x
  • no false positives (only reports “real” races, though they can be benign)
  • can miss races! detection depends on the execution trace

As of August 2015: 1200+ races found in Google’s codebase, ~100 in the Go stdlib, 100+ in Chromium, plus more in LLVM, GCC, OpenSSL, WebRTC, Firefox.

SLIDE 51

with go run -race:
  gc compiler instrumentation + the TSan runtime library for data race detection
  happens-before detection, using vector clocks

SLIDE 52

@kavya719

SLIDE 53

alternatives

I. Static detectors
analyze the program’s source code.

  • typically have to augment the source with race annotations (-)
  • a single detection pass is sufficient to determine all possible races (+)
  • too many false positives to be practical (-)

II. Lockset-based dynamic detectors
use an algorithm based on the locks held.

  • more performant than pure happens-before (+)
  • may not recognize synchronization via non-locks, like channels (would report them as races) (-)

SLIDE 54
III. Hybrid dynamic detectors
combine happens-before + locksets.
(TSan v1, but it was hella unscalable)

  • “best of both worlds” (+)
  • false positives (-)
  • complicated to implement (-)

SLIDE 55

requirements

I. Go specifics

  • Go v1.1+, gc compiler only (gccgo does not support it, as per: https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01828.html)
  • x86_64 required
  • Linux, OS X, Windows

II. TSan specifics

  • LLVM Clang 3.2, gcc 4.8
  • x86_64; requires ASLR, so compile/link with -fPIE, -pie
  • maps (using mmap, but does not reserve) virtual address space; tools like top/ulimit may not work as expected

SLIDE 56

fun facts

TSan maps (by mmap, but does not reserve) tons of virtual address space; tools like top/ulimit may not work as expected.

Due to the ASLR requirement, you need:
  > gdb -ex 'set disable-randomization off' --args ./a.out

Deadlock detection? Kernel TSan?

SLIDE 57

a fun concurrency example

goroutine 1:

    obj.UpdateMe()
    mu.Lock()
    flag = true
    mu.Unlock()

goroutine 2:

    mu.Lock()
    var f bool = flag
    mu.Unlock()
    if f {
        obj.UpdateMe()
    }

(One way to see why it’s fun: assuming flag starts out false, the two obj.UpdateMe() calls can never race. Either goroutine 2 takes the lock first, reads flag as false, and skips its call, or it takes the lock after goroutine 1’s unlock, and the unlock-lock pair orders goroutine 1’s call before its own.)