
Prototyping Architectural Support for Program Rollback Using FPGAs



  1. Prototyping Architectural Support for Program Rollback Using FPGAs
     Radu Teodorescu and Josep Torrellas
     http://iacoma.cs.uiuc.edu
     University of Illinois at Urbana-Champaign

  2. Motivation
     • Problem:
       • Software bugs: major cause of system failure
       • Production software is hard to debug
       • Continuous debugging is needed
     • Software-based dynamic monitoring tools
       • Can catch a wide range of bugs
       • Orders of magnitude slowdowns
     Radu Teodorescu - University of Illinois • Architectural Support for Program Rollback

  3. Motivation
     • Alternative solutions
       • Hardware support for debugging
       • Low overhead
       • Existing support is still modest
     • Our system:
       • Hardware-assisted, lightweight debugger
       • Monitoring, detection and recovery from bugs in production systems

  4. Contributions
     • We implemented a hardware prototype of a debugging-aware processor
     • We show that simple changes to a general purpose processor can provide powerful debugging primitives
     • We run experiments on buggy programs
     • Implementation technology: FPGA
       • Ideal platform for rapid prototyping
       • Validate design, measure hardware overheads, run realistic experiments

  5. Debugging Production Code
     • Applications run in multiple states:
       • Normal
       • Speculative (can be undone)
       • Re-execute
     • Transition between states is controlled by software
     [Figure: dynamic execution]
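
The three states above can be read as a small state machine. The following is a minimal sketch of one possible transition function, not the prototype's logic: the event names (`EV_ENTER_SPEC`, `EV_COMMIT`, `EV_ROLLBACK`) are hypothetical, and the rule that a re-executed section returns to Normal on commit is an assumption.

```c
#include <assert.h>

/* Execution states named on the slide. */
typedef enum { NORMAL, SPECULATIVE, REEXEC } pstate_t;

/* Software-visible events; these names are assumptions for the sketch. */
typedef enum { EV_ENTER_SPEC, EV_COMMIT, EV_ROLLBACK } event_t;

/* Return the next state; events that make no sense in a state are ignored. */
pstate_t next_state(pstate_t s, event_t e)
{
    switch (s) {
    case NORMAL:
        return (e == EV_ENTER_SPEC) ? SPECULATIVE : NORMAL;
    case SPECULATIVE:
        if (e == EV_COMMIT)   return NORMAL;
        if (e == EV_ROLLBACK) return REEXEC;
        return SPECULATIVE;
    case REEXEC:
        /* Assumption: a re-executed section ends with a commit. */
        return (e == EV_COMMIT) ? NORMAL : REEXEC;
    }
    return s;
}
```

For example, `next_state(SPECULATIVE, EV_ROLLBACK)` yields `REEXEC`, while a commit in the Normal state leaves the state unchanged.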

  6. Debugging Production Code
     • Original code: num=1; ... p=m[a[*x]]+&y; ... num++;
     • Instrumented code: num=1; enter_spec(); ... p=m[a[*x]]+&y; ... if(pstate()==REEXEC) { info_collect(); } exit_spec(flag); num++;
     • Dynamic execution: num=1 runs in the Normal state; the section from enter_spec() to exit_spec(flag) runs in the Speculative state; after a Rollback the section is replayed in the Re-execute state, where info_collect() runs; num++ runs in the Normal state
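
The instrumented pattern can be emulated in plain C to see the control flow. This is a software-only sketch under stated assumptions: `enter_spec`, `exit_spec`, `pstate` and `info_collect` are the names from the slide, but their bodies here are stand-ins. A `memcpy` snapshot replaces the hardware checkpoint, a loop replaces the hardware PC restore, `guarded_store` and the "negative value means bug" detector are hypothetical, and committing after the replay is an assumption.

```c
#include <assert.h>
#include <string.h>

enum { NORMAL, SPECULATIVE, REEXEC };   /* states from the slides */

#define MEM_WORDS 8

int mem[MEM_WORDS];       /* data the guarded section may write */
int saved[MEM_WORDS];     /* software stand-in for the hardware checkpoint */
int state = NORMAL;
int info_collected = 0;   /* number of info_collect() calls */
int rollbacks = 0;        /* number of rollbacks performed */

int  pstate(void)       { return state; }
void info_collect(void) { info_collected++; }

void enter_spec(void)
{
    memcpy(saved, mem, sizeof mem);
    state = SPECULATIVE;
}

/* Returns 1 when the section was rolled back and must run again, 0 on commit. */
int exit_spec(int flag)
{
    if (flag && state == SPECULATIVE) {
        memcpy(mem, saved, sizeof mem);   /* undo the speculative writes */
        state = REEXEC;
        rollbacks++;
        return 1;
    }
    state = NORMAL;                       /* commit */
    return 0;
}

/* Instrumented section: a negative value stands in for a detected bug. */
void guarded_store(int idx, int value)
{
    int flag;
    enter_spec();
    do {
        mem[idx] = value;
        flag = (value < 0);
        if (pstate() == REEXEC)
            info_collect();
    } while (exit_spec(flag));   /* the loop replaces the hardware PC restore */
}
```

Calling `guarded_store(3, -1)` runs the section once speculatively, rolls it back, replays it with `info_collect()` active, and then commits.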

  7. System Implementation

  8. Hardware Extensions
     • Undo program execution
     • Large code sections
     • Small overhead
     • Software control
     • Lightweight checkpointing
     • Hardware support needed:
       • Register checkpointing
       • Speculative data cache
     [Figure: CPU, Data Cache and Memory, with the checkpointed state]

  9. Register Checkpointing
     • Needed to allow restoration of processor state
     • Beginning of speculative execution
       • Register file is copied into a shadow register file
     • End of speculative execution
       • Commit: discard checkpoint
       • Rollback: restore registers & PC from checkpoint
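
The checkpoint, commit and rollback steps above can be sketched as a software model. This is an illustration only, not the prototype's VHDL: the `cpu_t` layout, the function names and `NUM_REGS` are assumptions, as is the choice to drop the checkpoint once a rollback has used it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 32   /* assumption: register count chosen for the sketch */

typedef struct {
    uint32_t regs[NUM_REGS];
    uint32_t pc;
    uint32_t shadow_regs[NUM_REGS];   /* shadow register file */
    uint32_t shadow_pc;
    int      has_checkpoint;
} cpu_t;

/* Beginning of speculative execution: copy registers and PC to the shadow copy. */
void begin_speculation(cpu_t *c)
{
    memcpy(c->shadow_regs, c->regs, sizeof c->regs);
    c->shadow_pc = c->pc;
    c->has_checkpoint = 1;
}

/* Commit: discard the checkpoint and keep the current state. */
void commit(cpu_t *c)
{
    c->has_checkpoint = 0;
}

/* Rollback: restore registers and PC. Returns 1 if restored, 0 if no checkpoint. */
int rollback(cpu_t *c)
{
    if (!c->has_checkpoint)
        return 0;
    memcpy(c->regs, c->shadow_regs, sizeof c->regs);
    c->pc = c->shadow_pc;
    c->has_checkpoint = 0;   /* assumption: the checkpoint is consumed */
    return 1;
}
```

A caller would invoke `begin_speculation` before the guarded section and then exactly one of `commit` or `rollback` at its end.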

  10. Speculative Data Cache
      • Holds both speculative and non-speculative data
      • Each line has a “speculative” bit
      • Cache walk: merging or invalidating lines
      • Speculative lines cannot be evicted
      [Figure: CPU and data cache during rollback, with example lines A and B; each data cache line holds SPEC, TAG, DIRTY and DATA fields]
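
A small model may make the speculative bit and the cache walk concrete. The sketch below assumes a tiny fully associative cache with one word per line. The line fields follow the slide (SPEC, TAG, DIRTY, DATA), while the function names, `NUM_LINES` and the victim policy are hypothetical. Memory is not modeled, so write-back of a dirty victim is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_LINES 4   /* assumption: tiny fully associative cache */

typedef struct {
    int      valid;
    int      spec;    /* the "speculative" bit */
    int      dirty;
    uint32_t tag;
    uint32_t data;
} line_t;

typedef struct { line_t line[NUM_LINES]; } cache_t;

/* Index of the valid line holding tag, or -1 on a miss. */
int lookup(const cache_t *c, uint32_t tag)
{
    for (int i = 0; i < NUM_LINES; i++)
        if (c->line[i].valid && c->line[i].tag == tag)
            return i;
    return -1;
}

/* Victim choice: an invalid line first, then any non-speculative line.
   Returns -1 when every line is speculative, since those cannot be evicted. */
int pick_victim(const cache_t *c)
{
    for (int i = 0; i < NUM_LINES; i++)
        if (!c->line[i].valid) return i;
    for (int i = 0; i < NUM_LINES; i++)
        if (!c->line[i].spec) return i;
    return -1;
}

/* Speculative store. Returns 0 on success, -1 on overflow. */
int spec_store(cache_t *c, uint32_t tag, uint32_t data)
{
    int i = lookup(c, tag);
    if (i < 0) i = pick_victim(c);
    if (i < 0) return -1;
    c->line[i].valid = 1;
    c->line[i].spec  = 1;
    c->line[i].dirty = 1;
    c->line[i].tag   = tag;
    c->line[i].data  = data;
    return 0;
}

/* Cache walk on commit: merge speculative lines by clearing the bit. */
void walk_commit(cache_t *c)
{
    for (int i = 0; i < NUM_LINES; i++)
        c->line[i].spec = 0;
}

/* Cache walk on rollback: invalidate every speculative line. */
void walk_rollback(cache_t *c)
{
    for (int i = 0; i < NUM_LINES; i++)
        if (c->line[i].spec) {
            c->line[i].valid = 0;
            c->line[i].spec  = 0;
        }
}
```

In this model a fifth speculative store into a cache of four speculative lines fails, which is the overflow case a real system has to handle.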

  11. Software Control
      • Give the compiler control over speculative execution
      • Control instructions:
        • Begin speculation
        • End speculation (commit or rollback)
      • We use SPARC's special access load
        • LDA [r0] code, r1

  12. Begin Speculative Execution
      [Figure: pipeline diagram (IF, ID, EX, MEM, WB) of a begin-speculation (BS) instruction with a stall, shown against the Normal, Speculative and Re-execute states. Signals: begin checkpoint, checkpoint done, release pipeline. Blocks: CPU, registers, data cache, Checkpointing Controller.]

  13. Limits
      • Size of the speculative window is affected by:
        • Cache size and associativity: cache overflow
        • I/O operations cannot be rolled back
      • In both cases exceptions are raised
        • Early commit
        • OS intervention: buffer speculative state or I/O instructions
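
The link between associativity and cache overflow can be illustrated with a short check. The sketch below assumes that every distinct line written speculatively must stay in its set until commit, and reports the first write that would need more ways than the set has. The function name, the return codes and the `MAX_SETS` and `MAX_WAYS` bounds are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SETS 64   /* assumption: bounds chosen for the sketch */
#define MAX_WAYS 4

/* Returns the index of the first write that overflows its set,
   -1 if the whole speculative footprint fits,
   -2 if the cache geometry is outside the sketch's bounds. */
int first_overflow(const uint32_t *addrs, size_t n,
                   uint32_t line_bytes, uint32_t num_sets, uint32_t ways)
{
    uint32_t tags[MAX_SETS][MAX_WAYS];
    uint32_t used[MAX_SETS] = {0};

    if (line_bytes == 0 || num_sets == 0 || num_sets > MAX_SETS ||
        ways == 0 || ways > MAX_WAYS)
        return -2;

    for (size_t i = 0; i < n; i++) {
        uint32_t line = addrs[i] / line_bytes;
        uint32_t set  = line % num_sets;
        uint32_t w;

        for (w = 0; w < used[set]; w++)
            if (tags[set][w] == line)
                break;
        if (w < used[set])
            continue;                 /* line is already speculative */
        if (used[set] == ways)
            return (int)i;            /* set is full of speculative lines */
        tags[set][used[set]++] = line;
    }
    return -1;
}
```

With 32-byte lines, 4 sets and 2 ways, the writes to addresses 0, 128 and 256 all map to set 0, so the third distinct line is the one that overflows.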

  14. Experiments and Results

  15. Processor Prototype
      • LEON2: SPARC V8 compliant processor
      • Single issue, 5-stage pipeline
      • Windowed register file
        • 2-32 sets, 16 registers
      • L1 instruction and data caches
        • 1-4 sets, up to 64KB/set
      • Synthesizable, open source VHDL code

  16. Experimental Infrastructure
      • System on a chip: PCI, Ethernet and serial interfaces
      • Development tools
        • RTL simulation: ModelSim
        • Synthesis: Xilinx ISE 6.1
      • Development board:
        • Xilinx Virtex II XC2V3000, 64 Mbytes SDRAM
      • Embedded Linux

  17. Deployment
      [Figure labels: Processor Netlist (JTAG), Binaries (PCI Communication Tool), Output Terminal (COM)]

  18. Hardware Overhead
      • Average overhead: 4.5%
      [Figure: bar chart of Configurable Logic Blocks (CLBs, 0 to 9000) against data cache size (4KB, 8KB, 16KB, 32KB, 64KB) for three configurations: base, base+reg_ckpt, base+reg_ckpt+spec_cache]

  19. Buggy Applications
      • Applications with known bugs
      • Manually instrument the code
      • Detection window contains:
        • bug location
        • bug manifestation
      • Determine if we can roll back the buggy code section
      • Test configuration: 32KB data cache, 4KB instruction cache
      [Figure: detection window spanning the bug location and the bug manifestation]

  20. Buggy Applications

      | Application     | Bug description                                                      | Successful rollback | Dynamic instructions |
      |-----------------|----------------------------------------------------------------------|---------------------|----------------------|
      | ncompress-4.2.4 | Input file name longer than 1024 bytes corrupts stack return address | Yes                 | 10653                |
      | polymorph-0.4.0 | Input file name longer than 2048 bytes corrupts stack return address | No                  | 103838               |
      | tar-1.13.25     | Unexpected loop bounds causes heap object overflow                   | Yes                 | 193                  |
      | man-1.5h1       | Wrong bounds checking causes static object corruption                | Yes                 | 54217                |
      | gzip-1.2.4      | Input file name longer than 1024 bytes overflows a global variable   | Yes                 | 17535                |

  21. Conclusions
      • We implemented a hardware prototype of a processor with software-controlled speculative execution
      • We show that simple changes to a general purpose processor can provide powerful debugging primitives
      • We obtained an estimate of the hardware overhead and ran experiments on buggy programs
      • We are looking at the integration of our hardware with compiler and operating system support

  22. Prototyping Architectural Support for Program Rollback Using FPGAs
      Radu Teodorescu and Josep Torrellas
      http://iacoma.cs.uiuc.edu
      University of Illinois at Urbana-Champaign
