Linux kernel synchronization
Don Porter
CSE 506: Operating Systems

Logical Diagram
[Figure: layered system diagram. User level: binary formats, memory allocators, threads. Kernel: system calls, RCU, file system, networking, sync, memory management, CPU scheduler, device drivers. Hardware: interrupts, disk, net, consistency. Today's lecture: synchronization in the kernel.]

Warm-up
• What is synchronization?
  – Code on multiple CPUs coordinates its operations
• Examples:
  – Locking provides mutual exclusion while changing a pointer-based data structure
  – Threads might wait at a barrier for completion of a phase of computation
  – Coordinating which CPU handles an interrupt

Why Linux synchronization?
• A modern OS kernel is one of the most complicated parallel programs you can study
  – Other than perhaps a database
• Includes most common synchronization patterns
  – And a few interesting, uncommon ones

Historical perspective
• Why did OSes have to worry so much about synchronization back when most computers had only one CPU?

The old days: They didn't worry!
• Early/simple OSes (like JOS, pre-lab4): No need for synchronization
  – All kernel requests wait until completion, even disk requests
  – Heavily restrict when interrupts can be delivered (all traps use an interrupt gate)
  – No possibility for two CPUs to touch the same data

Slightly more recently
• Optimize kernel performance by blocking inside the kernel
• Example: Rather than wait on expensive disk I/O, block and schedule another process until it completes
  – Cost: A bit of implementation complexity
    • Need a lock to protect against concurrent updates to pages/inodes/etc. involved in the I/O
    • Could be accomplished with relatively coarse locks
    • Like the Big Kernel Lock (BKL)
  – Benefit: Better CPU utilization

A slippery slope
• We can enable interrupts during system calls
  – More complexity, lower latency
• We can block in more places that make sense
  – Better CPU usage, more complexity
• Concurrency was an optimization for really fancy OSes, until…

The forcing function
• Multi-processing
  – CPUs aren't getting faster, just smaller
  – So you can put more cores on a chip
• The only way software (including kernels) will get faster is to do more things at the same time

Performance Scalability
• How much more work can this software complete in a unit of time if I give it another CPU?
  – Same: No scalability; the extra CPU is wasted
  – 1 -> 2 CPUs doubles the work: Perfect scalability
• Most software isn't scalable
• Most scalable software isn't perfectly scalable
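The definition above can be turned into two small ratios. This is a minimal sketch, not taken from the slides: the function names speedup and efficiency are illustrative, and any timings fed to them are made-up inputs rather than measurements.

```c
/* Speedup: how many times faster the job runs on n CPUs than on one. */
static double speedup(double time_1cpu, double time_ncpu)
{
    return time_1cpu / time_ncpu;
}

/*
 * Efficiency: fraction of perfect scaling achieved.
 * 1.0 means perfect scalability; 1.0 / ncpus means the extra CPUs were wasted.
 */
static double efficiency(double time_1cpu, double time_ncpu, int ncpus)
{
    return speedup(time_1cpu, time_ncpu) / ncpus;
}
```

For example, a job that takes 10 s on one CPU and 5 s on two has a speedup of 2 and an efficiency of 1.0 (perfect); if it still takes 10 s on two CPUs, the efficiency drops to 0.5.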

Performance Scalability
[Figure: execution time (s) vs. number of CPUs (1 to 4) for three curves: perfect scalability, somewhat scalable, and not scalable. Ideal: time halves with 2x CPUs.]

Performance Scalability (more visually intuitive)
[Figure: performance, plotted as 1 / execution time (s), vs. number of CPUs (1 to 4) for three curves: perfect scalability, somewhat scalable, and not scalable. Slope = 1 means perfect scaling.]

Performance Scalability (a 3rd visual)
[Figure: execution time (s) * CPUs vs. number of CPUs (1 to 4) for three curves: perfect scalability, somewhat scalable, and not scalable. Slope = 0 means perfect scaling.]

Coarse vs. Fine-grained locking
• Coarse: A single lock for everything
  – Idea: Before I touch any shared data, grab the lock
  – Problem: Completely unrelated operations wait on each other
    • Adding CPUs doesn't improve performance

Fine-grained locking
• Fine-grained locking: Many "little" locks for individual data structures
  – Goal: Unrelated activities hold different locks
    • Hence, adding CPUs improves performance
  – Cost: Complexity of coordinating locks
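One way to picture the difference is a table of counters with one lock per bucket. This is a user-space sketch using POSIX mutexes as a stand-in for kernel locks; the table layout and the names table_init, table_add and table_get are hypothetical and do not come from the slides or from Linux.

```c
#include <pthread.h>
#include <stddef.h>

#define NBUCKETS 8

struct bucket {
    pthread_mutex_t lock;   /* protects only this bucket's count */
    long count;
};

static struct bucket table[NBUCKETS];

static void table_init(void)
{
    for (size_t i = 0; i < NBUCKETS; i++) {
        pthread_mutex_init(&table[i].lock, NULL);
        table[i].count = 0;
    }
}

/*
 * Fine-grained: take only the lock of the bucket the key maps to, so
 * updates to different buckets never wait on each other.
 * A coarse-grained version would take one global mutex here instead.
 */
static void table_add(unsigned key, long delta)
{
    struct bucket *b = &table[key % NBUCKETS];

    pthread_mutex_lock(&b->lock);
    b->count += delta;
    pthread_mutex_unlock(&b->lock);
}

static long table_get(unsigned key)
{
    struct bucket *b = &table[key % NBUCKETS];
    long value;

    pthread_mutex_lock(&b->lock);
    value = b->count;
    pthread_mutex_unlock(&b->lock);
    return value;
}
```

The complexity cost shows up as soon as one operation needs two buckets at once: the code then has to agree on a lock ordering to avoid deadlock, which the single global lock never needs.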

Current Reality
[Figure: performance vs. complexity for fine-grained locking and coarse-grained locking.]
• Unsavory trade-off between complexity and performance scalability

How do locks work?
• Two key ingredients:
  – A hardware-provided atomic instruction
    • Determines who wins under contention
  – A waiting strategy for the loser(s)

Atomic instructions
• A "normal" instruction can span many CPU cycles
  – Example: 'a = b + c' requires 2 loads and a store
  – These loads and stores can interleave with other CPUs' memory accesses
• An atomic instruction guarantees that the entire operation is not interleaved with any other CPU
  – x86: Certain instructions can have a 'lock' prefix
  – Intuition: This CPU 'locks' all of memory
  – Expensive! Never used automatically by a compiler; must be explicitly requested by the programmer
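In portable C, the programmer asks for an atomic operation explicitly through C11 <stdatomic.h>; on x86, compilers typically implement such read-modify-write operations with a lock-prefixed instruction. A small sketch contrasting the two kinds of increment (the variable and function names are illustrative only):

```c
#include <stdatomic.h>

static int plain_counter;          /* ordinary variable */
static atomic_int atomic_counter;  /* C11 atomic variable */

/* Load, add, store: another CPU can interleave between the steps. */
static void plain_inc(void)
{
    plain_counter++;
}

/* One indivisible read-modify-write, explicitly requested. */
static void atomic_inc(void)
{
    atomic_fetch_add(&atomic_counter, 1);
}
```

If several threads call plain_inc concurrently, increments can be lost; with atomic_inc every increment is counted.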

Atomic instruction examples
• Atomic increment/decrement (x++ or x--)
  – Used for reference counting
  – Some variants also return the value x was set to by this instruction (useful if another CPU immediately changes the value)
• Compare and swap
  – if (x == y) x = z;
  – Used for many lock-free data structures
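Both examples can be sketched with C11 atomics. This is not the kernel's own reference-counting API; struct object, obj_get, obj_put and cas are hypothetical names used only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct object {
    atomic_int refcount;
};

/* Take a reference. */
static void obj_get(struct object *o)
{
    atomic_fetch_add(&o->refcount, 1);
}

/*
 * Drop a reference. atomic_fetch_sub returns the value before the
 * subtraction, so seeing 1 means this caller dropped the last reference,
 * no matter what other CPUs do to the count afterwards.
 */
static bool obj_put(struct object *o)
{
    return atomic_fetch_sub(&o->refcount, 1) == 1;
}

/* Compare and swap: if (*x == expected) *x = desired; reports success. */
static bool cas(atomic_int *x, int expected, int desired)
{
    return atomic_compare_exchange_strong(x, &expected, desired);
}
```

Returning the result of the atomic operation itself, rather than re-reading the counter afterwards, is what makes obj_put safe when another CPU changes the value immediately.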

Atomic instructions + locks
• Most lock implementations have some sort of counter
• Say initialized to 1
• To acquire the lock, use an atomic decrement
  – If you set the value to 0, you win! Go ahead
  – If you get < 0, you lose. Wait
  – Atomic decrement ensures that only one CPU will decrement the value to zero
• To release, set the value back to 1
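The counter protocol above can be sketched as a single acquisition attempt plus a release, again with C11 atomics. The struct dec_lock type and its functions are hypothetical names, and what the loser does while waiting is deliberately left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct dec_lock {
    atomic_int counter;   /* 1 == free, <= 0 == held */
};

static void dec_lock_init(struct dec_lock *l)
{
    atomic_init(&l->counter, 1);
}

/* One attempt: returns true if this caller took the counter from 1 to 0. */
static bool dec_lock_try(struct dec_lock *l)
{
    /* atomic_fetch_sub returns the old value; the new value is old - 1. */
    int new_value = atomic_fetch_sub(&l->counter, 1) - 1;

    return new_value == 0;
}

/* Release: set the counter back to 1, discarding losers' decrements. */
static void dec_lock_release(struct dec_lock *l)
{
    atomic_store(&l->counter, 1);
}
```

Note that a failed attempt still decrements the counter, which is why the holder releases by storing 1 rather than incrementing.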

Waiting strategies
• Spinning: Just poll the atomic counter in a busy loop; when it becomes 1, try the atomic decrement again
• Blocking: Create a kernel wait queue and go to sleep, yielding the CPU to more useful work
  – Winner is responsible for waking up losers (in addition to setting the lock variable to 1)
  – The wait queue is the same mechanism used to wait on I/O
• Note: Moving to a wait queue takes you out of the scheduler's run queue
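The blocking strategy can be approximated in user space, assuming a POSIX condition variable as a stand-in for a kernel wait queue. This is only an analogy, not how Linux mutexes are implemented; struct sleep_lock and its functions are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>

struct sleep_lock {
    pthread_mutex_t guard;    /* protects 'held' */
    pthread_cond_t  waiters;  /* stand-in for a kernel wait queue */
    bool held;
};

static void sleep_lock_init(struct sleep_lock *l)
{
    pthread_mutex_init(&l->guard, NULL);
    pthread_cond_init(&l->waiters, NULL);
    l->held = false;
}

static void sleep_lock_acquire(struct sleep_lock *l)
{
    pthread_mutex_lock(&l->guard);
    while (l->held)
        pthread_cond_wait(&l->waiters, &l->guard);  /* sleep off the CPU */
    l->held = true;
    pthread_mutex_unlock(&l->guard);
}

static void sleep_lock_release(struct sleep_lock *l)
{
    pthread_mutex_lock(&l->guard);
    l->held = false;                   /* mark the lock free ...        */
    pthread_cond_signal(&l->waiters);  /* ... and wake one sleeping loser */
    pthread_mutex_unlock(&l->guard);
}
```

The release path does both jobs named on the slide: it marks the lock free and it wakes a waiter, because a sleeping loser cannot notice the change by itself.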

Which strategy to use?
• Main consideration: Expected time waiting for the lock vs. time to do 2 context switches
  – If the lock will be held a long time (like while waiting for disk I/O), blocking makes sense
  – If the lock is only held momentarily, spinning makes sense
• Other, subtle considerations we will discuss later

Linux lock types
• Blocking: mutex, semaphore
• Non-blocking: spinlocks, seqlocks, completions

Linux spinlock (simplified)
    1:  lock; decb slp->slock    // Locked decrement of lock var
        jns 3f                   // Jump if not sign (result is not negative, i.e. zero) to 3
    2:  pause                    // Low-power instruction, wakes on coherence event
        cmpb $0,slp->slock       // Read the lock value, compare to zero
        jle 2b                   // If less than or equal (to zero), goto 2
        jmp 1b                   // Else jump to 1 and try again
    3:                           // We win the lock

Rough C equivalent
    while (0 != atomic_dec(&lock->counter)) {
        do {
            // Pause the CPU until some coherence traffic
            // (a prerequisite for the counter changing),
            // saving power
        } while (lock->counter <= 0);
    }

Why 2 loops?
• Functionally, the outer loop is sufficient
• Problem: Attempts to write this variable invalidate it in all other caches
  – If many CPUs are waiting on this lock, the cache line will bounce between CPUs that are polling its value
    • This is VERY expensive and slows down EVERYTHING on the system
  – The inner loop read-shares this cache line, allowing all polling in parallel
• This pattern is called a Test&Test&Set lock (vs. Test&Set)
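The two-loop structure can be written as compilable C11, as a user-space sketch rather than the kernel's actual spinlock. struct spin and its functions are hypothetical names, and the pause hint is only indicated by a comment.

```c
#include <stdatomic.h>

struct spin {
    atomic_int counter;   /* 1 == free, <= 0 == held */
};

static void spin_init(struct spin *s)
{
    atomic_init(&s->counter, 1);
}

static void spin_acquire(struct spin *s)
{
    /* Outer loop ("set"): an atomic write, needs the cache line exclusively. */
    while (atomic_fetch_sub(&s->counter, 1) - 1 != 0) {
        /* Inner loop ("test"): plain reads, so waiters can share the line. */
        while (atomic_load_explicit(&s->counter, memory_order_relaxed) <= 0)
            ;  /* a real implementation would issue a pause hint here */
    }
}

static void spin_release(struct spin *s)
{
    atomic_store(&s->counter, 1);
}
```

Dropping the inner loop gives a plain Test&Set lock: it is still correct, but every waiting CPU then keeps writing to the same cache line.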

Test & Set Lock
[Figure: CPU 0 holds the lock while CPU 1 and CPU 2 each spin on while (0 != atomic_dec(&lock->counter)). Each atomic_dec forces a write-back and eviction of cache line 0x1000 from the other CPU's cache, over the memory bus to RAM.]
• Cache line "ping-pongs" back and forth

Test & Test & Set Lock
[Figure: CPU 0 holds the lock while CPU 1 and CPU 2 each spin on while (lock->counter <= 0), reading their own cached copies of line 0x1000. CPU 0 unlocks by writing 1.]
• Line shared in read mode until unlocked


Reader/writer locks
• Simple optimization: If I am just reading, we can let other readers access the data at the same time
  – Just no writers
• Writers require mutual exclusion
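The reader/writer rule can be captured in a single atomic state word. This is a minimal sketch of the idea using non-waiting "try" operations; it is not how Linux implements its reader/writer locks, and struct rw and its functions are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* state: 0 == free, n > 0 == n readers inside, -1 == one writer inside */
struct rw {
    atomic_int state;
};

/* Readers may enter as long as no writer holds the lock. */
static bool rw_read_trylock(struct rw *l)
{
    int s = atomic_load(&l->state);

    while (s >= 0) {
        /* On failure, s is refreshed with the current state and we retry. */
        if (atomic_compare_exchange_weak(&l->state, &s, s + 1))
            return true;
    }
    return false;  /* a writer is inside */
}

static void rw_read_unlock(struct rw *l)
{
    atomic_fetch_sub(&l->state, 1);
}

/* A writer needs mutual exclusion: no readers and no other writer. */
static bool rw_write_trylock(struct rw *l)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

static void rw_write_unlock(struct rw *l)
{
    atomic_store(&l->state, 0);
}
```

Two readers can hold the lock together, while a writer succeeds only when the state is exactly 0. A full lock would add a waiting strategy (spinning or blocking) around the failed attempts.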
