
Dingo: Taming Device Drivers



  1. Dingo: Taming Device Drivers
     Leonid Ryzhyk, Peter Chubb, Ihor Kuz, Gernot Heiser
     UNSW, NICTA, Open Kernel Labs (Australia)

  2. The problem with drivers
     • 70% of OS crashes are caused by device drivers [1]
     • Drivers contain 1.5x-7x the bugs per line of code compared to the rest of the kernel [2]
     [1] Ganapathi et al. Windows XP kernel crash analysis, 2006
     [2] Chou et al. An empirical study of operating system errors, 2001

  3. Previous approaches: dealing with faulty drivers
     Runtime isolation (Mach, L4, Nooks, MINIX, XFI, SafeDrive, etc.)
     • Performance overhead
     • Transparent recovery is hard
     Static analysis (SLAM, MC, Singularity, etc.)
     • Detects a limited subset of bugs

  4. The Dingo approach
     Can we develop drivers that contain fewer bugs in the first place?
     Localise complexity in driver development
     ● Many driver bugs are provoked by the complexity of the OS interface
     Reduce bugs by improving the design of this interface

  5. Dingo for Linux
     [Diagram labels: Dingo runtime, Dingo drivers, native Linux drivers]

  6. A study of driver bugs

  7. A study of Linux driver bugs

     Bus       Driver                                    #loc  #bugs
     USB       RTL8150 USB-to-Ethernet adapter            827     16
               EL1210a USB-to-Ethernet adapter            710      2
               KL5kusb101 USB-to-Ethernet adapter         925     15
               Generic USB network driver                1028     45
               USB hub                                   2234     67
               USB-to-serial converter                    989     50
               USB mass storage                           803     23
     Firewire  IEEE1394 Ethernet controller              1413     22
               SBP-2 transport protocol                  1713     46
     PCI       Mellanox InfiniHost InfiniBand adapter   11718    123
               BNX2 Ethernet adapter                     5412     51
               i810 frame buffer                         2920     16
               CMI8338 audio                             2660     22
     Total                                                       498

  8. A study of Linux driver bugs
     [Diagram labels: OS protocol, Driver, device protocol]

  9. A study of Linux driver bugs
     Device protocol violation examples:
     • Issuing a command to an uninitialised device
     • Writing an invalid register value
     • Incorrectly managing DMA descriptors
     [Diagram labels: OS protocol, Driver, device protocol]

  10. Device protocol violations
      [Pie chart: device protocol violations 38%]

  11. OS protocol violations
      Example: Mellanox InfiniHost controller driver
      [Diagram labels: OS protocol, Driver, device protocol; states READY, RESET]

          if (cur_state == IB_RESET && new_state == IB_RESET) {
              return 0;
          }

  12. OS protocol violations
      [Pie chart: device protocol violations 38%, OS protocol violations 20%]

  13-15. Concurrency errors
      [Bar chart, axis range 0 to 35, built up over three slides. Categories:
      race in config functions; race in hot unplug handler; deadlock in an
      atomic context; race in the data path; race in PM functions;
      uninitialised lock; imbalanced locks; other]

  16. Concurrency errors
      [Pie chart: device protocol violations 38%, OS protocol violations 20%, concurrency errors 19%]

  17. Generic errors
      [Pie chart: device protocol violations 38%, OS protocol violations 20%, concurrency errors 19%, generic errors 23%]

  18. Dealing with concurrency bugs

  19. Dealing with concurrency bugs
      [Diagram, threads model: request1, request2 and irq enter the driver]

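  20. Dealing with concurrency bugs
      [Diagram comparing two models. Threads: request1, request2 and irq enter
      the driver directly. Events: request1, request2 and irq pass through
      Dingo, which delivers them to the driver as evt1, evt2, evt3]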
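The event model on slide 20 can be illustrated with a small, single-threaded sketch in C. This is not Dingo's implementation: the queue, `evt_post`, `evt_dispatch_all` and `toy_driver` are hypothetical names invented for the illustration, and a real runtime would also have to protect the queue against concurrent producers. The sketch only shows the core idea that requests and interrupts are queued and then handed to the driver one at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds, named after the labels on the slide. */
typedef enum { EVT_REQUEST1, EVT_REQUEST2, EVT_IRQ } evt_kind;

#define QUEUE_CAP 8

/* Fixed-size FIFO standing in for the runtime's event queue (assumption). */
typedef struct {
    evt_kind items[QUEUE_CAP];
    size_t head;
    size_t count;
} evt_queue;

/* Toy driver state: records the order in which events were delivered. */
typedef struct {
    evt_kind seen[QUEUE_CAP];
    size_t handled;
} toy_driver;

/* Enqueue an event; returns 0 on success, -1 if the queue is full. */
int evt_post(evt_queue *q, evt_kind k)
{
    if (q->count == QUEUE_CAP)
        return -1;
    q->items[(q->head + q->count) % QUEUE_CAP] = k;
    q->count++;
    return 0;
}

/* Driver handler: runs to completion and never blocks. */
void toy_handle(toy_driver *d, evt_kind k)
{
    assert(d->handled < QUEUE_CAP); /* the sketch records at most QUEUE_CAP events */
    d->seen[d->handled++] = k;
}

/* Deliver queued events one at a time, in arrival order. */
void evt_dispatch_all(evt_queue *q, toy_driver *d)
{
    while (q->count > 0) {
        evt_kind k = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        toy_handle(d, k); /* the next event is not delivered until this returns */
    }
}
```

Because the handler is only ever called from the dispatch loop, two handler invocations cannot overlap in this sketch, so the driver state needs no locking of its own.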

  21. Writing non-blocking drivers

      Linux:

          int probe ()
          {
              ...
              write_config_reg ();
              msleep(20);             /* blocks the calling thread */
              read_status_reg ();
              ...
          }

      Dingo:

          void probe ()
          {
              ...
              write_config_reg ();
              timeout(20, probe2);    /* continue in probe2 after the timeout */
          }

          void probe2 ()
          {
              read_status_reg ();
              ...
          }
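The split of `probe` into `probe` and `probe2` on slide 21 can be exercised with a simulated timer. The sketch below is an assumption-laden illustration, not the Dingo API: `sim_timer`, `sim_timeout`, `sim_advance`, `toy_dev`, `toy_probe` and `toy_probe2` are all hypothetical, and time is advanced by hand rather than by a real clock.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*timer_cb)(void *ctx);

/* One-shot simulated timer: a stand-in for a runtime timeout service. */
typedef struct {
    unsigned now_ms;
    unsigned due_ms;
    timer_cb cb;
    void *ctx;
} sim_timer;

/* Hypothetical device state used only by this sketch. */
typedef struct {
    sim_timer *timer;
    int config_written;
    int status_read;
} toy_dev;

/* Arm the timer: cb(ctx) fires once delay_ms of simulated time have elapsed. */
void sim_timeout(sim_timer *t, unsigned delay_ms, timer_cb cb, void *ctx)
{
    t->due_ms = t->now_ms + delay_ms;
    t->cb = cb;
    t->ctx = ctx;
}

/* Advance simulated time and fire the callback if it is due. */
void sim_advance(sim_timer *t, unsigned ms)
{
    t->now_ms += ms;
    if (t->cb != NULL && t->now_ms >= t->due_ms) {
        timer_cb cb = t->cb;
        t->cb = NULL; /* one-shot: disarm before calling */
        cb(t->ctx);
    }
}

/* Second half of probe: runs only after the delay has elapsed. */
void toy_probe2(void *ctx)
{
    toy_dev *dev = ctx;
    assert(dev->config_written); /* the config write must come first */
    dev->status_read = 1;
}

/* First half of probe: writes config, arms the timer, returns immediately. */
void toy_probe(toy_dev *dev)
{
    dev->config_written = 1;
    sim_timeout(dev->timer, 20, toy_probe2, dev);
}
```

The cost of this style is visible even in the toy: the logic of one operation is spread over two functions, and any state that must survive the delay has to live in `toy_dev` rather than in local variables.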

  22. Writing non-blocking drivers

      Linux:

          int probe ()
          {
              ...
              write_config_reg ();
              msleep(20);
              read_status_reg ();
              ...
          }

      Dingo:

          void probe ()
          {
              simple_evt notif;
              ...
              write_config_reg ();
              CALL (timeout(20), notif);
              read_status_reg ();
              ...
          }

  23. Performance of the AX88772 USB-to-Ethernet adapter driver
      Evaluation platform: 4 x 2GHz Itanium II (SMT, 2 threads per core)
      [Plots, Linux vs. Dingo: CPU utilisation (%) and round-trip time (μsec)
      against number of connections (1 to 32)]

  24. Impact of serialisation on performance
      Special case: drivers for very-high-performance devices
      ● Examples: 10Gb Ethernet, InfiniBand
      ● For such drivers, serialisation affects performance on multiprocessors
      Solution: re-introduce multithreading at the data path
      ● Avoid concurrency bugs at the control path, while maintaining high performance at the data path

  25. Performance of the Mellanox InfiniBand adapter driver
      [Plots, Linux vs. Dingo (serialised) vs. Dingo (multithreaded): CPU
      utilisation (%) and throughput (Mb/s) against number of connections
      (2 to 32)]

  26. Dealing with OS protocol violations

  27. Modeling driver protocols with state machines
      ? - incoming call from the OS
      ! - outgoing call to the OS
      [State machine diagram. State labels: init, start, running, stop,
      unplugged. Transition labels: ?start, !startComplete, ?stop,
      !stopComplete, ?unplugged]

  28. Ethernet controller protocol fragment
      [State machine diagram. State labels: disabled, enable, enabled,
      disable, txq_running, txq_stalled, rx. Transition labels: ?enable,
      !enableComplete, ?disable, !disableComplete, ?transmit, !txStopQueue,
      !txStartQueue, ?receive, ?suspend, ...]

  29. Other features of the specification language
      ● Timeouts
      ● Protocol variables
      ● Dynamic protocol spawning
      ● etc.

  30. Ethernet controller protocol fragment
      [Same state machine diagram as slide 28]

  31. Runtime failure detection
      [Diagram labels: OS protocol, Driver]

  32. Runtime failure detection
      [Diagram labels: EthernetController protocol SM, OS protocol, Driver]
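One way to picture runtime failure detection is a monitor that tracks the protocol state machine and flags any call the current state does not allow. The C sketch below uses the state and call labels from slide 27, but the edges between states are an assumed reading, since the flattened slide does not preserve them. `proto_monitor` and `monitor_observe` are hypothetical names, and nothing here is taken from the Dingo runtime itself.

```c
#include <assert.h>

/* States and messages use the labels from slide 27. */
typedef enum {
    ST_INIT, ST_START, ST_RUNNING, ST_STOP, ST_UNPLUGGED, ST_COUNT
} proto_state;

typedef enum {
    IN_START,           /* ?start         (incoming call from the OS) */
    OUT_START_COMPLETE, /* !startComplete (outgoing call to the OS)   */
    IN_STOP,            /* ?stop          */
    OUT_STOP_COMPLETE,  /* !stopComplete  */
    IN_UNPLUGGED,       /* ?unplugged     */
    MSG_COUNT
} proto_msg;

#define BAD (-1)

/* ASSUMED transition table: the slide's exact edges are not recoverable. */
static const int next_state[ST_COUNT][MSG_COUNT] = {
    /* columns: ?start, !startComplete, ?stop, !stopComplete, ?unplugged */
    [ST_INIT]      = { ST_START, BAD,        BAD,     BAD,     BAD          },
    [ST_START]     = { BAD,      ST_RUNNING, BAD,     BAD,     ST_UNPLUGGED },
    [ST_RUNNING]   = { BAD,      BAD,        ST_STOP, BAD,     ST_UNPLUGGED },
    [ST_STOP]      = { BAD,      BAD,        BAD,     ST_INIT, ST_UNPLUGGED },
    [ST_UNPLUGGED] = { BAD,      BAD,        BAD,     ST_INIT, BAD          },
};

typedef struct {
    proto_state state;
    int violations;
} proto_monitor;

/* Check one observed call against the protocol.
 * Returns 1 if the call is allowed (and advances the state),
 * 0 if it is a protocol violation (state is left unchanged). */
int monitor_observe(proto_monitor *m, proto_msg msg)
{
    assert((int)m->state < ST_COUNT && (int)msg < MSG_COUNT);
    int next = next_state[m->state][msg];
    if (next == BAD) {
        m->violations++;
        return 0;
    }
    m->state = (proto_state)next;
    return 1;
}
```

In this sketch the monitor would sit between the OS and the driver and be fed every call in either direction. Violations on `?` messages point at the OS side, while violations on `!` messages point at the driver.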

  33. Evaluation

  34. Evaluation
      How effective is Dingo in reducing driver bugs?
      ● Evaluation methodology: artificially injected 61 bugs found in similar Linux drivers into Dingo drivers

  35. Evaluation
      How effective is Dingo in reducing driver bugs?
      ● Evaluation methodology: artificially injected 61 bugs found in similar Linux drivers into Dingo drivers
      [Pie chart; legend: bugs eliminated by design, reduced likelihood, unchanged likelihood; segment labels: 20%, 59%, 21%]

  36. Summary
      • 40% of driver bugs are caused by the complexity of the OS interface
      • Dingo reduces bugs through an improved design of this interface
      • These improvements are implemented in an existing operating system without sacrificing performance
