
Read-Copy-Update (RCU)
Josh Triplett, May 22, 2006


  1. Read-Copy-Update (RCU)
     Josh Triplett
     May 22, 2006

     Topics
     • The RCU API
     • How it works
     • How to use it
     • What happens if you don’t use it correctly
     • Example uses

     Recurring Example - Writer

         void write_thing(void)
         {
             struct thing *t, *old;

             t = kmalloc(sizeof(*t), GFP_KERNEL);
             spin_lock(&thing_lock);
             t->contents = some_value;
             old = global_thing;
             global_thing = t;
             spin_unlock(&thing_lock);
             kfree(old);
         }

     Recurring Example - Reader

         void read_thing(void)
         {
             spin_lock(&thing_lock);
             printk(KERN_INFO "thing: %d\n",
                    global_thing->contents);
             spin_unlock(&thing_lock);
         }

  2. The RCU API
     • rcu_read_lock / rcu_read_unlock
     • synchronize_rcu
     • call_rcu
     • rcu_barrier
     • _bh variants
     • rcu_assign_pointer
     • rcu_dereference

     rcu_read_lock / rcu_read_unlock - Description
     • Delimit an RCU read-side critical section
     • Allows writers to detect concurrent readers
     • Prevents “quiescent state”
     • Reclamation deferred until current readers complete
     • May run concurrently with other readers and with writers
     • No corresponding writer lock: use other synchronization

     rcu_read_lock / rcu_read_unlock - Usage

         void read_thing(void)
         {
             rcu_read_lock();
             printk(KERN_INFO "thing: %d\n",
                    global_thing->contents);
             rcu_read_unlock();
         }

     rcu_read_lock / rcu_read_unlock - Implementation

         #define rcu_read_lock()   preempt_disable()
         #define rcu_read_unlock() preempt_enable()

     • No overhead without CONFIG_PREEMPT
     • Low overhead with CONFIG_PREEMPT
     • Quiescent state: context switch
     • Readers may not block
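The link between "readers may not block" and "quiescent state: context switch" can be modelled in user space. The following is a toy, single-CPU sketch and not kernel code; toy_preempt_count and the toy_ functions are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one CPU's preemption-disable depth (hypothetical). */
static int toy_preempt_count;

/* Models rcu_read_lock() == preempt_disable(): bump the depth. */
void toy_rcu_read_lock(void)
{
    toy_preempt_count++;
}

/* Models rcu_read_unlock() == preempt_enable(): drop the depth. */
void toy_rcu_read_unlock(void)
{
    assert(toy_preempt_count > 0);   /* unlock without lock is a bug */
    toy_preempt_count--;
}

/*
 * A context switch (the quiescent state) can only happen while
 * preemption is enabled, that is, while no read-side section is open.
 */
bool toy_may_context_switch(void)
{
    return toy_preempt_count == 0;
}
```

In this model nested read-side sections simply stack: a context switch becomes possible again only after the outermost toy_rcu_read_unlock().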

  3. synchronize_rcu - Description
     • Guarantees that all current readers have finished
     • Block until quiescent state on all CPUs
     • Use after removing item for future readers
     • Use before freeing item concurrent readers could still access

     synchronize_rcu - Usage

         void write_thing(void)
         {
             struct thing *t, *old;

             t = kmalloc(sizeof(*t), GFP_KERNEL);
             spin_lock(&thing_lock);
             t->contents = some_value;
             old = global_thing;
             global_thing = t;
             spin_unlock(&thing_lock);
             synchronize_rcu();
             kfree(old);
         }

     synchronize_rcu - Toy implementation

         void synchronize_rcu(void)
         {
             int cpu;

             for_each_cpu(cpu)
                 run_on_only(cpu);
             run_on_all_cpus();
         }

     • Real, non-toy operating systems used this algorithm

     call_rcu - Description
     • Invoke callback when current readers have finished
     • Remove item from view of future readers first
     • Reclaim item in callback
     • Does not block
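The synchronize_rcu rule "block until quiescent state on all CPUs" can be sketched as bookkeeping over a set of CPUs that still owe a context switch. This is a single-threaded user-space model under stated assumptions: TOY_NR_CPUS, the bitmask, and all toy_ functions are hypothetical and do not correspond to kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4   /* hypothetical CPU count for the model */

/* Bit n set: CPU n has not yet passed through a quiescent state. */
static unsigned int toy_pending_cpus;

/* Begin a grace period: every CPU owes one quiescent state. */
void toy_grace_period_start(void)
{
    toy_pending_cpus = (1u << TOY_NR_CPUS) - 1;
}

/* Called when CPU `cpu` context-switches (its quiescent state). */
void toy_note_context_switch(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    toy_pending_cpus &= ~(1u << cpu);
}

/* The grace period ends once no CPU still owes a quiescent state. */
bool toy_grace_period_done(void)
{
    return toy_pending_cpus == 0;
}

/*
 * Mirrors the slide's toy synchronize_rcu(): visiting each CPU in turn
 * forces a context switch there, so the loop ends the grace period.
 */
void toy_synchronize_rcu(void)
{
    toy_grace_period_start();
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        toy_note_context_switch(cpu);
    assert(toy_grace_period_done());
}
```

The model makes the cost visible: the writer waits for the slowest CPU, which is why the slides also offer the non-blocking call_rcu.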

  4. call_rcu - Usage (Data structure)

         struct thing {
             int contents;
             struct rcu_head rcu;
         };

     call_rcu - Usage (Writer)

         void write_thing(void)
         {
             struct thing *t, *old;

             t = kmalloc(sizeof(*t), GFP_KERNEL);
             spin_lock(&thing_lock);
             t->contents = some_value;
             old = global_thing;
             global_thing = t;
             spin_unlock(&thing_lock);
             /* call_rcu takes a pointer to the embedded rcu_head */
             call_rcu(&old->rcu, reclaim_thing);
         }

     call_rcu - Usage (Callback)

         void reclaim_thing(struct rcu_head *r)
         {
             struct thing *t;

             t = container_of(r, struct thing, rcu);
             kfree(t);
         }

     • container_of gives structure pointer from member pointer

     call_rcu - Implementation
     • struct rcu_head contains list pointer
     • call_rcu queues rcu_head in per-CPU “next” list
     • “next” list moves to “current” list in quiescent state at start of grace period
     • “current” list moves to “done” list in quiescent state at end of grace period
     • Callbacks on “done” list get called and discarded
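The "next", "current" and "done" lists described above can be sketched for a single CPU in user space. This is a minimal model, not the kernel's implementation: struct toy_rcu_head, toy_container_of and the toy_ functions are hypothetical stand-ins, each call to toy_quiescent_state() is assumed to fall on a grace-period boundary, and callbacks are pushed in LIFO order for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-ins; the kernel's definitions differ. */
struct toy_rcu_head {
    struct toy_rcu_head *next;
    void (*func)(struct toy_rcu_head *head);
};

#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* One CPU's three callback lists, as described on the slide. */
static struct toy_rcu_head *toy_next_list, *toy_cur_list, *toy_done_list;

/* Queue a callback on the "next" list; never blocks. */
void toy_call_rcu(struct toy_rcu_head *head,
                  void (*func)(struct toy_rcu_head *head))
{
    head->func = func;
    head->next = toy_next_list;
    toy_next_list = head;
}

/*
 * Model one quiescent state at a grace-period boundary:
 * "current" becomes "done", "next" becomes "current",
 * then everything on "done" is invoked and discarded.
 */
void toy_quiescent_state(void)
{
    toy_done_list = toy_cur_list;
    toy_cur_list = toy_next_list;
    toy_next_list = NULL;

    while (toy_done_list) {
        struct toy_rcu_head *head = toy_done_list;

        toy_done_list = head->next;   /* unlink before the callback runs */
        head->func(head);
    }
}

struct toy_thing {
    int contents;
    struct toy_rcu_head rcu;
};

static int toy_reclaimed;   /* counts callbacks run, for the demo */

/* Callback: recover the enclosing object from its rcu member. */
void toy_reclaim_thing(struct toy_rcu_head *r)
{
    struct toy_thing *t = toy_container_of(r, struct toy_thing, rcu);

    assert(&t->rcu == r);
    toy_reclaimed++;   /* a real callback would free t here */
}
```

In this model a queued callback runs on the second toy_quiescent_state() call after it was queued: the first call moves it from "next" to "current", the second from "current" to "done".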

  5. synchronize_rcu - Real implementation

         void synchronize_rcu(void)
         {
             struct rcu_synchronize rcu;

             init_completion(&rcu.completion);
             call_rcu(&rcu.head, wakeme_after_rcu);
             wait_for_completion(&rcu.completion);
         }

         static void wakeme_after_rcu(struct rcu_head *head)
         {
             struct rcu_synchronize *rcu;

             rcu = container_of(head, struct rcu_synchronize, head);
             complete(&rcu->completion);
         }

     • rcu_synchronize contains rcu_head and completion
     • wait_for_completion blocks until complete called

     rcu_barrier
     • Blocks until all RCU callbacks on all CPUs have completed
     • Usage example: module unloading
     • Implementation: CPU count and wait_for_completion

     _bh variants
     • Used for “bottom half” handlers
     • Need shorter grace periods
     • Quiescent state: no bottom half running
     • Read-side critical sections:

         #define rcu_read_lock_bh()   local_bh_disable()
         #define rcu_read_unlock_bh() local_bh_enable()

     • call_rcu_bh: different queues

     rcu_assign_pointer - Description
     • Assign to an RCU-protected pointer
     • Use after initializing item
     • Makes item visible to readers
     • Includes appropriate memory barrier
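The rcu_barrier note "CPU count and wait_for_completion" can be read as a countdown: one marker callback per CPU, with the last one to run waking the waiter. The sketch below is one plausible user-space reading of that bullet, not the kernel's code; TOY_NR_CPUS, the flag standing in for a completion, and the toy_ functions are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4   /* hypothetical CPU count */

/* Marker callbacks still outstanding, and a flag standing in for a completion. */
static int toy_barrier_cpu_count;
static bool toy_barrier_completed;

/* Start a barrier: one marker callback is queued behind each CPU's work. */
void toy_rcu_barrier_begin(void)
{
    toy_barrier_cpu_count = TOY_NR_CPUS;
    toy_barrier_completed = false;
}

/*
 * Marker callback, run once per CPU after that CPU's earlier callbacks.
 * The last one to run signals the waiter (complete() in the kernel).
 */
void toy_rcu_barrier_callback(void)
{
    assert(toy_barrier_cpu_count > 0);
    if (--toy_barrier_cpu_count == 0)
        toy_barrier_completed = true;
}

/* The condition wait_for_completion() would block on in a real system. */
bool toy_rcu_barrier_done(void)
{
    return toy_barrier_completed;
}
```

This matches the module-unloading use case: the module's callback code must stay loaded until toy_rcu_barrier_done() would return true.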

  6. Without rcu_assign_pointer
     • Writes could get reordered
     • Reader could see:

         global_thing = t;
         t->contents = some_value;

     • Reader can read global_thing->contents in between
     • Reader gets random uninitialized contents

     rcu_assign_pointer - Usage

         void write_thing(void)
         {
             struct thing *t, *old;

             t = kmalloc(sizeof(*t), GFP_KERNEL);
             spin_lock(&thing_lock);
             t->contents = some_value;
             old = global_thing;
             rcu_assign_pointer(global_thing, t);
             spin_unlock(&thing_lock);
             synchronize_rcu();
             kfree(old);
         }

     rcu_assign_pointer - Implementation

         #define rcu_assign_pointer(p, v) \
         ({ \
             smp_wmb(); \
             (p) = (v); \
         })

     smp_wmb() provides a write memory barrier in SMP kernels.

     rcu_dereference - Description
     • Get a copy of an RCU-protected pointer to dereference
     • Use inside rcu_read_lock() / rcu_read_unlock()
     • Includes appropriate memory barrier
     • Prevents read reordering
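The publish/subscribe ordering that rcu_assign_pointer and rcu_dereference provide has a close user-space analogue in C11 atomics. The sketch below is an analogy under stated assumptions, not the kernel's implementation: it uses a release store and an acquire load from <stdatomic.h>, and struct toy_thing, toy_publish and toy_subscribe are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_thing {
    int contents;
};

/* The RCU-protected pointer of the slides; an atomic pointer here. */
static _Atomic(struct toy_thing *) toy_global_thing;

/*
 * Publish: initialise the item first, then store the pointer with
 * release ordering so the initialisation cannot be reordered after it.
 * This is the role smp_wmb() plays in rcu_assign_pointer().
 */
void toy_publish(struct toy_thing *t, int value)
{
    t->contents = value;
    atomic_store_explicit(&toy_global_thing, t, memory_order_release);
}

/*
 * Subscribe: an acquire load pairs with the release store above, so a
 * reader that sees the new pointer also sees the initialised contents.
 * Acquire is stronger than the dependency ordering rcu_dereference()
 * relies on; it is used here because it is simple and portable.
 */
struct toy_thing *toy_subscribe(void)
{
    return atomic_load_explicit(&toy_global_thing, memory_order_acquire);
}
```

A reader would call toy_subscribe() once and use the returned pointer, much as the slides recommend taking a local copy from rcu_dereference.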

  7. Without rcu_dereference
     • Reads could get reordered
     • Write memory barrier forces write of contents, then pointer
     • Reader can read new pointer, dereference, and find old contents
     • Only an issue on Alpha CPUs

     rcu_dereference - Usage

         void read_thing(void)
         {
             rcu_read_lock();
             printk(KERN_INFO "thing: %d\n",
                    rcu_dereference(global_thing)->contents);
             rcu_read_unlock();
         }

     rcu_dereference - Alternate Usage

         void read_thing(void)
         {
             struct thing *local_thing;

             rcu_read_lock();
             local_thing = rcu_dereference(global_thing);
             printk(KERN_INFO "thing: %d\n",
                    local_thing->contents);
             rcu_read_unlock();
         }

     • Useful if using local_thing repeatedly
     • Cannot use local_thing after rcu_read_unlock()

     rcu_dereference - Implementation

         #define rcu_dereference(p) \
         ({ \
             typeof(p) _________p1 = p; \
             smp_read_barrier_depends(); \
             (_________p1); \
         })

     • Uses GCC extension “statements as expressions”
     • Saves copy of pointer, calls smp_read_barrier_depends(), returns copy
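The "statements as expressions" shape of rcu_dereference can be tried in user space without any barrier. The macro below is a hypothetical illustration with the same structure (copy the argument once, then yield the copy); toy_load_once, toy_fetch and the other toy_ names are not kernel symbols, and the code assumes GCC or Clang because it relies on the ({ ... }) and __typeof__ extensions.

```c
#include <assert.h>

/*
 * Hypothetical macro shaped like the slide's rcu_dereference(), minus
 * the barrier: copy the argument once into a local, then yield the
 * copy as the value of the whole expression.
 */
#define toy_load_once(p) \
({ \
    __typeof__(p) toy_p1 = (p); \
    toy_p1; \
})

struct toy_thing {
    int contents;
};

static struct toy_thing toy_item = { .contents = 7 };
static int toy_fetch_calls;

/* Counts how often the pointer expression is evaluated. */
struct toy_thing *toy_fetch(void)
{
    toy_fetch_calls++;
    return &toy_item;
}

/* The macro result is an ordinary expression, so -> applies directly. */
int toy_read_contents(void)
{
    return toy_load_once(toy_fetch())->contents;
}
```

Because the argument is copied into a local exactly once, the pointer that gets dereferenced is the same value that was loaded, which is the property the real macro needs around its barrier.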

  8. rcu_dereference - Implementation (continued)
     • Allows use of rcu_dereference() in expressions
     • smp_read_barrier_depends() no-op except on SMP Alpha

     Final version of writer

         void write_thing(void)
         {
             struct thing *t, *old;

             t = kmalloc(sizeof(*t), GFP_KERNEL);
             spin_lock(&thing_lock);
             t->contents = some_value;
             old = global_thing;
             rcu_assign_pointer(global_thing, t);
             spin_unlock(&thing_lock);
             synchronize_rcu();
             kfree(old);
         }

     Final version of reader

         void read_thing(void)
         {
             rcu_read_lock();
             printk(KERN_INFO "thing: %d\n",
                    rcu_dereference(global_thing)->contents);
             rcu_read_unlock();
         }

     RCU API summary
     • rcu_read_lock / rcu_read_unlock
     • synchronize_rcu
     • call_rcu
     • rcu_barrier
     • _bh variants
     • rcu_assign_pointer
     • rcu_dereference
