
Yhc: The York Haskell Compiler, by Tom Shackell



  1. Yhc: The York Haskell Compiler By Tom Shackell

  2. What?
     ● Yhc is a rewrite of the back end of the nhc98 system.
     ● The back end of the compiler is replaced.
     ● The runtime system is replaced.
     ● The instruction set is different.
     ● The Prelude is heavily modified.

  3. Why?
     ● It was written to address some issues with the nhc98 back end.
     ● In particular: the high bit problem.
     ● Also as an experiment: can we make nhc98 more portable?

  4. The High Bit Problem

  5. Graph Reduction
     ● Lazy functional languages are usually implemented using graph reduction.
     ● Haskell expressions are represented by graphs.

         sum :: [Int] -> Int
         sum [] = 0
         sum (x:xs) = x + sum xs

     ● The expression 'sum [1,2]' might be represented by the graph:
       [diagram: a 'sum' node applied to the list 1 : 2 : [ ]]

  6-9. Reduction [diagram sequence: the graph for 'sum [1,2]' is reduced step by step; the value 3 appears next to the list 1 : 2 : [ ], and the final frame shows an IND node pointing to 3]

  10. Heap Node
      We can see there are 4 types of graph node: Constructor, Thunk, Blackholed Thunk and Indirection.
      In nhc and Yhc these graph nodes are represented with 4 types of heap node.
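The reduction steps and the four node kinds above can be sketched in a few lines of C. This is a minimal illustration, not the nhc or Yhc runtime: the `Node` layout, the fixed-size `arena`, and the helper names (`mk_sum`, `eval`, and so on) are all assumptions made for the example. A thunk for `sum xs` is marked as a blackhole while it is being evaluated and is then overwritten with an indirection to its result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds mirroring the four kinds named on the slide. */
typedef enum { CONSTRUCTOR, THUNK, BLACKHOLE, INDIRECTION } Kind;
/* Constructors needed by the example: an Int, the empty list, a cons cell. */
typedef enum { C_INT, C_NIL, C_CONS } Con;

typedef struct Node {
    Kind kind;
    Con con;             /* meaningful only when kind == CONSTRUCTOR */
    int value;           /* payload of C_INT */
    struct Node *a, *b;  /* C_CONS: head, tail; THUNK: a = list argument of sum;
                            INDIRECTION: a = target */
} Node;

/* A tiny bump allocator standing in for the heap (no garbage collection). */
static Node arena[64];
static size_t used = 0;

static Node *alloc(Kind kind, Con con, int value, Node *a, Node *b) {
    assert(used < sizeof arena / sizeof arena[0]);
    Node *n = &arena[used++];
    n->kind = kind; n->con = con; n->value = value; n->a = a; n->b = b;
    return n;
}

static Node *mk_int(int v)              { return alloc(CONSTRUCTOR, C_INT, v, NULL, NULL); }
static Node *mk_nil(void)               { return alloc(CONSTRUCTOR, C_NIL, 0, NULL, NULL); }
static Node *mk_cons(Node *x, Node *xs) { return alloc(CONSTRUCTOR, C_CONS, 0, x, xs); }
static Node *mk_sum(Node *list)         { return alloc(THUNK, C_INT, 0, list, NULL); }

/* Reduce a node to a constructor, updating thunks in place. */
static Node *eval(Node *n) {
    switch (n->kind) {
    case CONSTRUCTOR:
        return n;
    case INDIRECTION:
        return eval(n->a);                 /* follow the redirection pointer */
    case BLACKHOLE:
        assert(0 && "thunk demanded while already under evaluation");
        return NULL;
    case THUNK: {
        Node *list = n->a;
        n->kind = BLACKHOLE;               /* mark as under evaluation */
        Node *xs = eval(list);
        Node *result;
        if (xs->con == C_NIL)
            result = mk_int(0);            /* sum [] = 0 */
        else                               /* sum (x:xs) = x + sum xs */
            result = mk_int(eval(xs->a)->value + eval(mk_sum(xs->b))->value);
        n->kind = INDIRECTION;             /* overwrite the thunk with IND */
        n->a = result;
        return result;
    }
    }
    return NULL;
}
```

Building `mk_sum(mk_cons(mk_int(1), mk_cons(mk_int(2), mk_nil())))` and calling `eval` on it leaves the root as an indirection to an Int node, matching the last reduction frame.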

  11. Heap Nodes in nhc

      Node type          Word contents                  Tag bits
      Constructor        Constructor Information        10
      Thunk              Function Information Pointer   0 1
      Blackholed Thunk   Function Information Pointer   1 1
      Indirection        Redirection Pointer            00

  12. The “High Bit” problem
      ● nhc assumes that it can use the topmost bit of a pointer to store information.
      ● This is not always the case: many modern Linux-x86 kernels allocate memory at addresses too high to fit in 31 bits.

      Node type          Word contents                  Tag bits
      Constructor        Constructor Information        10
      Thunk              Function Information Pointer   0 1
      Blackholed Thunk   Function Information Pointer   1 1
      Indirection        Redirection Pointer            00
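A small C sketch of why borrowing the topmost bit is fragile. It simulates 32-bit heap words with `uint32_t`; the function names and the two sample addresses are made up for illustration, and the only thing taken from the slide is that a flag lives in the top bit of a word that also holds a pointer.

```c
#include <stdint.h>

#define HIGH_BIT  UINT32_C(0x80000000)  /* bit 31, borrowed as a flag */
#define ADDR_MASK UINT32_C(0x7FFFFFFF)  /* the 31 bits left for the address */

/* Pack an address and a one-bit flag into a single 32-bit word. */
static uint32_t pack(uint32_t addr, int flag) {
    return (addr & ADDR_MASK) | (flag ? HIGH_BIT : 0);
}

/* Recover the address; whatever was in bit 31 of the original is gone. */
static uint32_t unpack_addr(uint32_t word) {
    return word & ADDR_MASK;
}

/* True when the address survives packing with either flag value. */
static int round_trips(uint32_t addr) {
    return unpack_addr(pack(addr, 0)) == addr
        && unpack_addr(pack(addr, 1)) == addr;
}
```

Under these assumptions an address such as `0x08049000` round-trips, while one at or above `0x80000000` (for example `0xB7F00010`) comes back as a different address, because its own top bit was overwritten by the flag.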

  13. Heap Nodes in Yhc
      ● Yhc makes sure that all FInfo structures are 4 byte aligned. This frees up a bit at the bottom for Thunk nodes.
      ● It also represents constructors by using a pointer to the information about the constructor, rather than encoding the information into the heap word.

      Node type          Word contents                     Tag bits
      Constructor        Constructor Information Pointer   01
      Thunk              Function Information Pointer      0 1
      Blackholed Thunk   Function Information Pointer      1 1
      Indirection        Redirection Pointer               00
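The alignment trick can be sketched as follows. With every information structure at least 4-byte aligned, the two lowest bits of its address are always zero and can carry a tag, so no high bit is needed. The `Info` struct, the helper names, and the particular tag values below are assumptions chosen for the example; they are not taken from the Yhc sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag assignment for the two low bits of a heap word. */
enum { TAG_IND = 0, TAG_CON = 1, TAG_THUNK = 2, TAG_HOLE = 3 };
#define TAG_MASK ((uintptr_t)3)

/* Stand-in for a function or constructor information structure.
   Its int and pointer members give it at least 4-byte alignment on
   common platforms; tag_ptr checks this at run time. */
typedef struct { int arity; const char *name; } Info;

/* Combine an aligned pointer with a 2-bit tag. */
static uintptr_t tag_ptr(const void *p, unsigned tag) {
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_MASK) == 0);  /* low bits must be free */
    assert(tag <= 3);
    return bits | tag;
}

/* Read the tag back out of a heap word. */
static unsigned tag_of(uintptr_t word) {
    return (unsigned)(word & TAG_MASK);
}

/* Strip the tag to recover the original pointer, high bits untouched. */
static void *ptr_of(uintptr_t word) {
    return (void *)(word & ~TAG_MASK);
}
```

Blackholing a thunk in this scheme is just rewriting the tag, for example `tag_ptr(ptr_of(w), TAG_HOLE)`, and the pointer is valid wherever the kernel placed the structure.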

  14. Instruction Sets
      ● The instruction set for Yhc is much simpler than for nhc.
      ● Both are based on stack machines.
      ● However, nhc has instructions for directly manipulating both the heap and the stack, whereas Yhc only directly manipulates the stack.

  15. Instructions

          main :: IO ()
          main = putStrLn (show 42)

      nhc instructions          Yhc instructions
      main():                   main():
        HEAP_CVAL show            PUSH_INT 42
        HEAP_INT 42               MK_AP show
        PUSH_HEAP                 MK_AP putStrLn
        HEAP_CVAL putStrLn        RETURN_EVAL
        HEAP_OFF -3
        RETURN_EVAL
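To make the Yhc column concrete, here is a toy stack machine in C that runs the four-instruction sequence for `main`. It is a sketch under several assumptions: the opcode names come from the slide, but their exact semantics are guessed (`MK_AP` is taken to apply a one-argument function to the node on top of the stack, and `RETURN_EVAL` simply returns the top node instead of evaluating it), and the `Machine`, `Node` and `Instr` types are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { PUSH_INT, MK_AP, RETURN_EVAL } Op;
typedef struct { Op op; int n; const char *fn; } Instr;

typedef enum { N_INT, N_AP } NodeKind;
typedef struct Node {
    NodeKind kind;
    int value;          /* N_INT payload */
    const char *fn;     /* N_AP: name of the applied function */
    struct Node *arg;   /* N_AP: its single argument */
} Node;

#define HEAP_SIZE 16
#define STACK_SIZE 16
typedef struct {
    Node heap[HEAP_SIZE];    size_t hp;
    Node *stack[STACK_SIZE]; size_t sp;
} Machine;

static Node *alloc(Machine *m)         { assert(m->hp < HEAP_SIZE); return &m->heap[m->hp++]; }
static void push(Machine *m, Node *n)  { assert(m->sp < STACK_SIZE); m->stack[m->sp++] = n; }
static Node *pop(Machine *m)           { assert(m->sp > 0); return m->stack[--m->sp]; }

/* Run until RETURN_EVAL and hand back the node it would evaluate. */
static Node *run(Machine *m, const Instr *code) {
    for (size_t pc = 0;; pc++) {
        const Instr *i = &code[pc];
        switch (i->op) {
        case PUSH_INT: {                 /* build an Int node, push it */
            Node *n = alloc(m);
            n->kind = N_INT; n->value = i->n; n->fn = NULL; n->arg = NULL;
            push(m, n);
            break;
        }
        case MK_AP: {                    /* pop the argument, push the application */
            Node *arg = pop(m);
            Node *n = alloc(m);
            n->kind = N_AP; n->value = 0; n->fn = i->fn; n->arg = arg;
            push(m, n);
            break;
        }
        case RETURN_EVAL:
            return pop(m);
        }
    }
}

/* The Yhc instruction sequence for main = putStrLn (show 42). */
static const Instr main_code[] = {
    { PUSH_INT, 42, NULL },
    { MK_AP, 0, "show" },
    { MK_AP, 0, "putStrLn" },
    { RETURN_EVAL, 0, NULL },
};
```

Note that no instruction names a heap offset: each `MK_AP` finds its argument on the stack, which is the implicit referencing the comparison slide credits for the shorter code.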

  16-22. nhc instructions [animation: main() is executed one instruction at a time (HEAP_CVAL show, HEAP_INT 42, PUSH_HEAP, HEAP_CVAL putStrLn, HEAP_OFF -3, RETURN_EVAL) against Heap, Stack and Constants panels; show, 42 and putStrLn appear in turn]

  23-27. Yhc instructions [animation: main() is executed one instruction at a time (PUSH_INT 42, MK_AP show, MK_AP putStrLn, RETURN_EVAL) against Heap and Stack panels; 42, show and putStrLn appear in turn]

  28. Comparison
      ● Yhc uses fewer instructions to do the same thing.
      ● Because it doesn't need explicit movements between heap and stack.
      ● ... and because it can reference other nodes implicitly rather than using explicit heap offsets.
      ● Yhc instructions are also smaller.
      ● Because it has more 'specializations'.
      ● ... and again, because heap references are implicit.
      ● These two factors make Yhc about 20% faster than nhc.

  29. Improving Portability

  30. Bytecode in nhc
      ● nhc compiles Haskell functions into a bytecode for an abstract machine that manipulates graphs: the G-Machine.
      ● The bytecode is placed in a C source file, using an array of bytes. The C source file is then compiled and linked with the nhc interpreter to form an executable.

          unsigned char FN_Prelude_46sum[] = {
            NEEDHEAP_I32, HEAP_CVAL_I3, HEAP_ARG, 1, HEAP_CVAL_I4,
            HEAP_ARG, 1, HEAP_CVAL_I5, HEAP_OFF_N1, 3,
            HEAP_CADR_N1, 1, PUSH_HEAP, HEAP_CVAL_P1, 6,
            HEAP_OFF_N1, 8, HEAP_OFF_N1, 5, RETURN, ENDCODE
          };

  31. Portable?
      ● The C code is portable, isn't it?
      ● Yes, but:
      ● It creates a dependency on a C compiler.
      ● There are issues with the nuances of various C compilers.
      ● The bytecode can't be loaded dynamically.

  32. Improved Portability
      ● Yhc also compiles Haskell functions into bytecode instructions for a G-Machine.
      ● However, Yhc places the bytecode in a separate file, which is then loaded by the interpreter at runtime. This is similar to Java's class file system.
      ● More portable, but it means Yhc has to do its own linking.
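Loading bytecode at runtime, rather than compiling it into the executable, can be sketched like this. The code only reads a whole file into memory; the `Bytecode` type and function names are invented for the example, and nothing here reflects the actual Yhc file format or its linking step.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* A loaded bytecode image: raw bytes plus their length. */
typedef struct { unsigned char *bytes; size_t len; } Bytecode;

/* Read the whole file at `path` into memory.
   Returns 0 on success and -1 on any failure. */
static int load_bytecode(const char *path, Bytecode *out) {
    assert(path != NULL && out != NULL);
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
    long size = ftell(f);
    if (size < 0 || fseek(f, 0, SEEK_SET) != 0) { fclose(f); return -1; }
    unsigned char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (!buf) { fclose(f); return -1; }
    size_t got = fread(buf, 1, (size_t)size, f);
    fclose(f);
    if (got != (size_t)size) { free(buf); return -1; }
    out->bytes = buf;
    out->len = got;
    return 0;
}

/* Release an image obtained from load_bytecode. */
static void free_bytecode(Bytecode *bc) {
    free(bc->bytes);
    bc->bytes = NULL;
    bc->len = 0;
}
```

An interpreter built this way needs no C compiler on the target machine, but it must resolve references between loaded modules itself, which is the linking cost the slide mentions.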

  33. More Portable Still?
      ● Can we extend portability to include portability over a network?
      ● Then we could take a closure on one machine and have it run on another machine.
      ● Not implemented yet, but some interesting ideas.

  34-57. [animation: Computer A and Computer B. The closure 'calc data' is copied from one machine to the other. The receiving machine reports "Need calc" and is sent the bytecode for calc:

             calc(x):
               PUSH_ARG x
               PUSH_CONST subcalc
               MK_AP iter
               RETURN_EVAL

         Running it yields 'iter subcalc' and an IND node, which leads to "Need iter", "and so on ...". The closing frames show the value 42 labelled "Result", with IND nodes pointing to it in place of 'calc data'.]
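The "Need calc" exchange in the animation suggests fetching code on demand: a machine asked to run a closure requests the bytecode of any function it does not yet have. The sketch below simulates that inside one process. Since the slides say this is not implemented, everything here is hypothetical, including the `Machine` and `Func` types, the `need` function, and the use of a string as a stand-in for bytecode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FUNCS 8

/* A function's code; the string stands in for real bytecode. */
typedef struct { const char *name; const char *code; } Func;

/* A machine knows some functions and counts the code requests it served. */
typedef struct { Func funcs[MAX_FUNCS]; size_t count; size_t requests; } Machine;

static const Func *lookup(const Machine *m, const char *name) {
    for (size_t i = 0; i < m->count; i++)
        if (strcmp(m->funcs[i].name, name) == 0)
            return &m->funcs[i];
    return NULL;
}

static void install(Machine *m, Func f) {
    assert(m->count < MAX_FUNCS);
    m->funcs[m->count++] = f;
}

/* Get the code for `name` on `local`, asking `origin` only when it is
   missing locally. Returns NULL if neither machine has it. */
static const Func *need(Machine *local, Machine *origin, const char *name) {
    const Func *f = lookup(local, name);
    if (f) return f;                     /* already cached locally */
    const Func *remote = lookup(origin, name);
    if (!remote) return NULL;
    origin->requests++;                  /* one simulated network round trip */
    install(local, *remote);
    return lookup(local, name);
}
```

Fetching per function keeps the initial transfer small, at the price of a round trip the first time each function is demanded, which is one face of the granularity question raised under Challenges.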

  58. Challenges
      ● Needs concurrency to be useful.
      ● Complicates garbage collection.
      ● Level of granularity versus laziness.
      ● Possible architecture differences.

  59. Other Things!
      ● Other people have written various interpreters and back ends for Yhc bytecode: Java, Python, .NET
      ● ... and various related tools such as interactive interpreters.
      ● I'm also using Yhc to do my Hat G-Machine work.

  60. Questions?
