Static types, polymorphism and specialization

前端 未结 3 1960
别那么骄傲
别那么骄傲 2021-01-01 09:49

When I first learned Haskell, I very quickly came to love parametric polymorphism. It\'s a delightfully simple idea that works astonishingly well. The whole \"if it compiles

相关标签:
3条回答
  • 2021-01-01 10:30

    Whole program compilers take advantage of global access to type information to make very aggressive optimizations, as you describe above. Examples include JHC and MLton. GHC with inlining is partially "whole program" as well, for similar reasons. Other techniques that take advantage of global information include super compilation.

    Note that you can massively increase code size by specializing polymorphic functions at all the types they're used at -- this then needs heavy inlining to reduce code back to normal values. Managing this is a challenge.

    0 讨论(0)
  • 2021-01-01 10:43

    Haskell (with no extensions) permits polymorphic recursion, and this feature alone makes it impossible to statically specialize a program to a completely monomorphic one. Here is a program that will print an N-deep nested list, where N is a command-line parameter:

    import System
    
    foo :: Show a => Int -> a -> IO ()
    foo 0 x = print x
    foo n x = foo (n-1) [x]
    
    main = do [num_lists] <- getArgs
              foo (read num_lists) 0
    

    In the first call to foo, a has type Int. In the next recursive call, it has type [Int], then [[Int]], and so forth.

    If polymorphic recursion is prohibited, then I believe it's possible to statically specialize a program.

    0 讨论(0)
  • 2021-01-01 10:51

    Yep, I've thought about this too. Basically, the idea is that it seems like you could implement Haskell 98, but not some of the language extensions to it, using polymorphism-by-multiinstantiation instead of polymorphism-by-boxing.

    You can get some insight into this by trying to implement some Haskell features as C++ libraries (as you note, C++ does polymorphism-by-multiinstatiation). What you find is that you can do everything that Haskell can do, except that it's impossible to have polymorphic values, which includes references to polymorphic functions.

    What this looks like is that if you have

    template<typename T>
    void f(T);           // f :: a -> IO ()
    

    you can take the address of a particular instantiation to pass around as a function pointer at runtime:

    &f<int>
    

    but you cannot take the address of a template (&f). This makes sense: templates are a purely compile-time construct. It also makes sense that if you're doing polymorphism by multiinstantiation, you can have a pointer to any particular instantiation, but you cannot have a pointer to the polymorphic function itself, because at the machine code level, there isn't one.

    So where does Haskell use polymorphic values? At first glance it seems like a good rule of thumb of is "anywhere you have to write an explicit forall". So PolymorphicComponents, Rank2Types, RankNTypes, and ImpredicativeTypes are obvious no-nos. You can't translate this to C++:

    data MkList = MkList (forall a. a -> [a])
    singleton = MkList (\x -> [x])
    

    On the other hand, ExistentialQuantification is doable in at least some cases: it means having a non-template class with a template constructor (or more generally, a class whose constructor is templated on more things than the class itself).

    If in Haskell you have:

    data SomeShow = forall a. Show a => SomeShow a
    instance Show SomeShow where show (SomeShow a) = show a
    

    you can implement this in C++ as:

    // a function which takes a void*, casts it to the given type, and
    // calls the appropriate show() function (statically selected based
    // on overload resolution rules)
    template<typename T>
    String showVoid(void *x)
    {
        show(*(T*)x);
    }
    
    class SomeShow
    {
        private:
            void *m_data;
            String (*m_show)(void*); // m_show :: Any -> String
    
        public:
            template<typename T>
            SomeShow(T x)
                : m_data(new T(x)) // memory management issues here, but that's orthogonal
                , m_show(&showVoid<T>)
            {
            }
    
            String show()
            {
                // alternately we could declare the top-level show() as a friend and
                // put this there
                return m_show(m_data);
            }
    };
    
    // C++ doesn't have type classes per se, but it has overloading, which means
    // that interfaces are implicit: where in Haskell you would write a class and
    // instances, in C++ you just write a function with the same name for each type
    String show(SomeShow x)
    {
        return x.show();
    }
    

    In both languages you have a non-polymorphic type with a polymorphic constructor.

    So we have shown that there are some language extensions you can implement and some you can't, but what about the other side of the coin: is there anything in Haskell 98 that you can't implement? Judging by the fact that you need a language extension (ExplicitForAll) to even write a forall, you would think that the answer is no. And you would almost be right, but there's two wrinkles: type classes and polymorphic recursion. Type classes are typically implemented using dictionary passing: each instance declaration results in a record of functions, which are implicitly passed around wherever they're needed.

    So for Monad for example you would have:

    data MonadDict m = MonadDict { 
        return :: forall a. a -> m a,
        (>>=)  :: forall a b. m a -> (a -> m b) -> m b
    }
    

    Well would you look at those foralls! You can't write them explicitly, but in dictionary passing implementations, even in Haskell 98, classes with polymorphic methods result in records containing polymorphic functions. Which if you're trying to implement the whole thing using multiinstantion is obviously going to be a problem. You can almost get away without dictionary passing because, if you stick to Haskell 98, instances are almost always global and statically known. Each instance results in some polymorphic functions, but because which one to call is almost always known at compile time, you almost never need to pass references to them around at runtime (which is good, because you can't). The tradeoff is that you need to do whole-program compilation, because otherwise instances are no longer statically known: they might be in a different module. And the exception is polymorphic recursion, which practically requires you to build up a dictionary at runtime. See the other answer for more details on that. Polymorphic recursion kills the multiinstantiation approach even without type classes: see the comment about BTrees. (Also ExistentialQuantification *plus* classes with polymorphic methods is no longer doable, because you would have to again start storing pointers to polymorphic functions.)

    0 讨论(0)
提交回复
热议问题