Coccinelle: 10 Years of Automated Evolution in the Linux Kernel
Julia Lawall (Inria-Whisper team, Julia.Lawall@inria.fr) March 2, 2020
Our focus: The Linux kernel
Open source OS kernel, developed by Linus Torvalds.
Used in billions of smartphones (Android), battleships, stock exchanges, …
Some history
First release in 1991.
Recent evolution:
[Figures: Linux kernel size in million LOC per year, 2006 to 2020; number of contributors per year, 2006 to 2020; contributions vs. contributors (log scale).]
Key challenge
As software grows, how to ensure its continued maintenance?
Make functions and data structures:
– More efficient
– Easier to use correctly
– Better adapted to their usage context
But such changes have a cost:
– More time consuming
– More error prone
– Need to communicate new coding strategies to all developers
Developers may hesitate to make needed changes.
Example change: init_timer → setup_timer
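Initializing a timer requires three steps: basic initialization of the timer structure, setting the function to run when the timer expires, and setting the data to pass to that function.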
Original initialization strategy (present in Linux v1.2.0, 1995):
    init_timer(&ns_timer);
    ns_timer.data = 0UL;
    ns_timer.function = ns_poll;
Replacement initialization strategy (introduced in Linux v2.6.15, Jan. 2006):
    setup_timer(&ns_timer, ns_poll, 0UL);
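The relationship between the two strategies can be illustrated with a minimal user-space sketch. The `struct timer_list` below is a hypothetical stand-in, not the kernel's definition, and the `initialized` field and the `strategies_agree` helper are assumptions made only for illustration. Under those assumptions, the single `setup_timer` call leaves the timer in the same state as the three separate statements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's timer structure. */
struct timer_list {
    int initialized;                  /* assumed marker for "basic init done" */
    void (*function)(unsigned long);  /* callback run when the timer expires */
    unsigned long data;               /* argument passed to the callback */
};

/* Old style: basic initialization only; callers fill in the fields later. */
static void init_timer(struct timer_list *t)
{
    t->initialized = 1;
}

/* New style: one call performs all three steps. */
static void setup_timer(struct timer_list *t,
                        void (*fn)(unsigned long), unsigned long data)
{
    init_timer(t);
    t->function = fn;
    t->data = data;
}

/* Dummy callback, named after the one in the nicstar example. */
static void ns_poll(unsigned long arg)
{
    (void)arg;
}

/* Returns 1 if both strategies leave the timer in the same state. */
int strategies_agree(void)
{
    struct timer_list a, b;

    /* Original strategy: three separate statements. */
    init_timer(&a);
    a.data = 0UL;
    a.function = ns_poll;

    /* Replacement strategy: a single call. */
    setup_timer(&b, ns_poll, 0UL);

    return a.initialized == b.initialized &&
           a.function == b.function &&
           a.data == b.data;
}
```

With the old strategy, nothing forces a caller to write the two field assignments; with the single call, the function and data arguments cannot be omitted.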
Advantages:
Example change: init_timer → setup_timer
[Figure: number of init_timer and setup_timer call sites, from v2.6.15 (Jan 2006) to v4.14 (Nov 2017).]
Example bug: missing of_node_puts
Device node structures are reference counted: of_node_get takes a reference, and of_node_put releases it.
Iterators, e.g., for_each_child_of_node, put one value and get another.
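Why a jump out of such an iterator can leak a reference may be easier to see in a small user-space simulation. Everything below is a hypothetical stand-in: `struct node`, `node_get`, `node_put`, `next_child`, and the `for_each_child` macro only mimic the put-one-get-another behavior described above and are not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted node, standing in for a device node. */
struct node {
    int refcount;
    struct node *next;   /* next sibling */
};

struct node *node_get(struct node *n) { if (n) n->refcount++; return n; }
void node_put(struct node *n) { if (n) n->refcount--; }

/* One iterator step: get the next child, put the previous one. */
struct node *next_child(struct node *first, struct node *prev)
{
    struct node *next = prev ? prev->next : first;

    node_get(next);
    node_put(prev);
    return next;
}

/* Iterator macro in the style of for_each_child_of_node. */
#define for_each_child(first, child)                        \
    for (child = next_child(first, NULL); child != NULL;    \
         child = next_child(first, child))

/*
 * Walks a three-node list and breaks out at the second node.
 * Returns the number of references still held afterwards.
 */
int leaked_refs(int put_before_break)
{
    struct node c = {0, NULL}, b = {0, &c}, a = {0, &b};
    struct node *child;

    for_each_child(&a, child) {
        if (child == &b) {
            if (put_before_break)
                node_put(child);   /* release the reference held by the loop */
            break;                 /* skips the put done by the next step */
        }
    }
    return a.refcount + b.refcount + c.refcount;
}
```

In this sketch, `leaked_refs(0)` leaves one reference outstanding because the `break` bypasses the put that the next iteration step would have performed, while `leaked_refs(1)` leaves none.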
[Figure: number of jump sites out of device node iterators with of_node_put missing vs. present, from v2.6.17 (Jun 2006) to v5.5 (Jan 2020).]
Assessment
relationships between them.
– Grep insufficient to find the problem.
– Tedious and time-consuming to find all occurrences.
– Hard to anticipate; some variants may be overlooked.
– New code can be introduced using the old coding strategy.
What is Coccinelle?
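Coccinelle is a program matching and transformation tool for C code, driven by patch-like transformation rules (semantic patches).
The notation is designed to be familiar to the Linux kernel developer.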
Starting point: a patch
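--- a/drivers/atm/nicstar.c
+++ b/drivers/atm/nicstar.c
@@ -287,4 +287,2 @@
- init_timer(&ns_timer);
+ setup_timer(&ns_timer, ns_poll, 0UL);
  ns_timer.expires = jiffies + NS_POLL_PERIOD;
- ns_timer.data = 0UL;
- ns_timer.function = ns_poll;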
Semantic patches
Like patches, but independent of irrelevant details (line numbers, spacing, variable names, etc.)
Example: Creating an init_timer → setup_timer semantic patch
A patch: derived from drivers/atm/nicstar.c
- init_timer(&ns_timer);
+ setup_timer(&ns_timer, ns_poll, 0UL);
  ns_timer.expires = jiffies + NS_POLL_PERIOD;
- ns_timer.data = 0UL;
- ns_timer.function = ns_poll;
Remove irrelevant code:
- init_timer(&ns_timer);
+ setup_timer(&ns_timer, ns_poll, 0UL);
  ...
- ns_timer.data = 0UL;
- ns_timer.function = ns_poll;
Abstract over subterms:
@@
expression timer, fn_arg, data_arg;
@@
- init_timer(&timer);
+ setup_timer(&timer, fn_arg, data_arg);
  ...
- timer.data = data_arg;
- timer.function = fn_arg;
Generalize a little more:
@@
expression timer, fn_arg, data_arg;
@@
- init_timer(&timer);
+ setup_timer(&timer, fn_arg, data_arg);
  ...
- timer.data = data_arg;
  ...
- timer.function = fn_arg;
Results
Dataset: 598 Linux kernel init_timer files from different versions.
Untreated example: drivers/tty/n_gsm.c:
    init_timer(&dlci->t1);
    dlci->t1.function = gsm_dlci_t1;
    dlci->t1.data = (unsigned long)dlci;
Example: Creating an init_timer → setup_timer semantic patch
Extended semantic patch:
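@@
expression timer, fn_arg, data_arg;
@@
- init_timer(&timer);
+ setup_timer(&timer, fn_arg, data_arg);
  ...
- timer.data = data_arg;
  ...
- timer.function = fn_arg;

@@
expression timer, fn_arg, data_arg;
@@
- init_timer(&timer);
+ setup_timer(&timer, fn_arg, data_arg);
  ...
- timer.function = fn_arg;
  ...
- timer.data = data_arg;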
Covers 656/828 calls.
Remaining issues
Complete semantic patch
Semantic patch example
@@
expression root,e;
local idexpression child;
iterator name for_each_child_of_node;
@@
  for_each_child_of_node(root, child) {
    ... when != of_node_put(child)
        when != e = child
+   of_node_put(child);
?   break;
    ...
  }
  ... when != child
Used in the big v5.4 cleanup.
Assessment
relationships between them.
– ... connects related fragments over control-flow paths.
– Coccinelle finds and updates all relevant code automatically.
– Semantic patches are easily adapted to new variants.
– Semantic patches in commit logs document changes.
– Semantic patches can be collected in a library and checked during continuous integration.
Impact: Patches in the Linux kernel
Over 7700 Linux kernel commits up to Linux v5.5 (Jan 2020).
[Figure: number of commits per year, 2007 to 2019, by contributor category: Coccinelle developers, Outreachy interns, kernel maintainers, dedicated user, others.]
Impact: Cleanup vs. bug fix changes among maintainer patches using Coccinelle
[Figure: number of cleanups and bug fixes per year, 2008 to 2017.]
Impact: Maintainer use examples
remove that argument.
Impact: 0-day reports mentioning Coccinelle per year
[Figures: number of 0-day reports with patches and with message only, per year from 2013 to 2017, by category: api, free, iterators, locks, null, tests, misc.]
Conclusion
developer.
– Enables needed evolution, independent of the amount of affected code.
http://coccinelle.lip6.fr/
https://github.com/coccinelle/coccinelle
https://github.com/kanghj/coccinelle/tree/java