Understanding STG

后端 未结 4 484
不思量自难忘°
不思量自难忘° 2021-01-30 05:15

The design of GHC is based on something called STG, which stands for \"spineless, tagless G-machine\".

Now G-machine is apparently short for \"graph reduction machine\",

相关标签:
4条回答
  • 2021-01-30 05:57

    GHC wiki contains an introductory article about STG written by Max Bolingbroke:

    I know kung fu: learning STG by example

    The STG machine is an essential part of GHC, the world's leading Haskell compiler. It defines how the Haskell evaluation model should be efficiently implemented on standard hardware. Despite this key role, it is generally poorly understood amongst GHC users. This document aims to provide an overview of the STG machine in its modern, eval/apply-based, pointer-tagged incarnation by a series of simple examples showing how Haskell source code is compiled.

    0 讨论(0)
  • 2021-01-30 05:57

    You want to read SPJ's book about functional PL implementation:

    • http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/index.htm
    0 讨论(0)
  • 2021-01-30 05:57

    The answer by migle is exactly what spinlessness and taglessness of the STGM mean. Today it does not worth trying to understand the names of the two features because the names stem from the history of graph reduction technologies: from G-machine, Spineless G-machine, and Spineless and Tagless G-machine.

    The G-machine uses both spine and tags. A spine is a list of edges from the root node of a function application to the node of the function. For example, a function application of "f e1 e2 ... en" is represented as

    root = AP left_n en
    left_n = AP left_n-1 en-1 ...
    left_2 = AP left_1 e1
    left_1 = FUN f
    

    in G-machine, and so a spine is a list of edges consisting of left_n -> left_n-1 -> ... -> left_2 -> left_1. It is literally a spine of the function application!

    In the same function application, there are tags AP and FUN.

    In the next advanced G-machine so called the Spineless G-machine, there is no such spine by representing such a function application in a contiguous block whose first slot points to f, the second slot points to e1, ..., and the n+1-th slot points to en. In this representation, we do not need a spine. But the block starts a special tag designating the number of slots and so on.

    In the most advanced G-machine so called the Spineless Tagless G-machine, such a tag is replaced with a function pointer. To evaluate a function application is to jump to the code by the function pointer.

    It is unfortunate to find that Simone Peyton Jones's STGM paper does not give compilation/evaluation rules in some abstract level, and so it is very natural for people not to be easy to understand the essence of the STGM.

    0 讨论(0)
  • 2021-01-30 05:59

    You are right about the "Spineless", that is it, if I'm correct. It is basically described on the 1988 article by Burn, Peyton-Jones and Robson, "The Spineless G-Machine". I've read it, but it is not so fresh in my mind. Basically, on the G-Machine, all stack entries point to an application node except the one on the top, which points to the head of the expression. Those application nodes make access to the arguments indirect, and in some G-Machine descriptions, before applying a function the stack is rearranged, so that the last n nodes on the stack are made to point to the argument instead of the application node. If I am not mistaken, the "Spineless" part is about avoiding having these application nodes (which are called the spine of the graph) on the stack altogether, thus avoiding that re-arrangement before each reduction.

    As to the "Tagless" part, you are more correct now that you used to be, but... Using tags on nodes is a very, very old thing. Can you think on how a dynamically-typed language such as LISP was implemented? Every cell must have its value and a tag which says the type. If you want something you must examine the tag and act accordingly. In the case of Haskell, the evaluation state is more important than type, Haskell is statically typed.

    In the STG machine, tags are not used. Tags were replaced, maybe through inspiration of OO lanaguages, by a set of function pointers. When you want the value of a node which has not been computed, the function will compute it. When it is already computed, the function returns it. This allows for a lot of creativity in what this function can do without making client code any more complex.

    This "Tagless" part yes, is described in the "implementation of functional languages on stock hardware" article by SPJ.

    There is also objection to this "tagless" thing. Basically, this involves function pointers, which is an indirect jump in computer architecture terms. And indirect jumps are an obstacle to branch prediction and hence to pipelining in general. Because either the architecture considers there is a data dependency on the jump argument, halting the pipeline, or the architecture assumes it does not know the destination and halts the pipeline.

    0 讨论(0)
提交回复
热议问题