How to hack GHCi (or Hugs) so that it prints Unicode chars unescaped?

情话喂你 2020-12-05 00:26

Look at the problem: Normally, in the interactive Haskell environment, non-Latin Unicode characters (that make a part of the results) are printed escaped, even if the locale

7 answers
  • 2020-12-05 01:22

    There has been some progress on this issue, thanks to bravit (Vitaly Bragilevsky):

    • work in progress: Даёшь кириллицу в GHCi! — 2 -- around the related ticket;
    • the result of the work: Даёшь кириллицу в GHCi! — 3 -- with the patch and another one for the docs by bravit (Vitaly Bragilevsky). These enhancements have been committed: 1 and 2.

    Probably incorporated into GHC 7.6.1. (Is it?..)

    How to make it print Cyrillic now:

    The parameter passed to GHCi should be a function that can print Cyrillic. No such function has been found on Hackage, so for now we have to write a simple wrapper:

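    module UPPrinter where
    import System.IO (stdout)
    import Text.PrettyPrint.Leijen (Pretty, hPutDoc, pretty)  -- from the wl-pprint package

    -- Print a value through its Pretty instance, then end the line.
    upprint :: Pretty a => a -> IO ()
    upprint a = (hPutDoc stdout . pretty) a >> putStrLn ""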
    

    And run ghci this way: ghci -interactive-print=UPPrinter.upprint UPPrinter

    Of course, this can be written down once and for all into .ghci.

    Practical problem: coming up with an alternative nice Show

    So, now there is a practical problem: what should we use in place of the standard Show, which escapes the characters we want to see?

    Using others' work: other pretty-printers

    Above, Text.PrettyPrint.Leijen is suggested, probably because it is known not to escape such symbols in strings.

    Our own Show based on Show -- attractive, but not practical

    What about writing our own Show, say ShowGhci, as was suggested in another answer here? Is it practical?

    To save the work of defining instances for an alternative Show class (like ShowGhci), one might be tempted to reuse the existing Show instances by default and redefine only the instances for String and Char. But that won't work: if you use showGhci = show, then for any compound data containing strings, show is "hard-compiled" to call the old show on the nested strings. What this situation asks for is the ability to pass a different dictionary implementing the same class interface to functions that use it (show would then pass it down to the nested shows). Are there any GHC extensions for this?

    Building on Show and redefining only the instances for Char and String is therefore not very practical if you want the result to be as "universal" (widely applicable) as Show.
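    The limitation described above can be seen in a small sketch. The class layout here (a Show superclass with showGhci defaulting to show, plus a hand-written String instance) is an assumption made for illustration, not the code from the other answer:

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical alternative to Show that falls back on the existing Show instances.
class Show a => ShowGhci a where
  showGhci :: a -> String
  showGhci = show

-- Only strings get special treatment: the text is printed as-is.
-- (Inner quotes are not escaped; this is a sketch.)
instance ShowGhci String where
  showGhci s = "\"" ++ s ++ "\""

-- Every other type reuses the default, i.e. plain 'show'.
instance ShowGhci a => ShowGhci (Maybe a)

-- A bare string goes through our instance and stays readable.
topLevel :: String
topLevel = showGhci "привет"

-- A nested string does not: the default calls 'show' on the whole value,
-- and 'show' for Maybe calls the ordinary Show instance for String.
nested :: String
nested = showGhci (Just "привет")
```

    Here topLevel keeps the Cyrillic letters, while nested is identical to show (Just "привет"), escapes included, because nothing tells the Show instance for Maybe to use showGhci on its contents.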

    Re-parsing show

    A more practical (and shorter) solution is given in another answer here: parse the output of show to detect chars and strings, and re-format them. Although it seems a bit ugly semantically, the solution is short and safe in most cases, namely when quotes are not used for other purposes in the show output. That should hold for standard types, because the idea of show is to produce more or less valid, parsable Haskell.
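    A minimal sketch of this re-parsing idea follows. It is not the code from the other answer; the names ushow and uprint are hypothetical, and the approach assumes that quotes in the show output only ever delimit Char and String literals:

```haskell
import Data.Char (isPrint, showLitChar)

-- Hypothetical helper: like 'show', but printable non-ASCII characters
-- inside Char and String literals are left unescaped.
ushow :: Show a => a -> String
ushow = go . show
  where
    go [] = []
    -- A double quote may start a string literal: let 'reads' decode it.
    go s@('"' : _)
      | [(str, rest)] <- reads s = '"' : unescaped '"' str ++ '"' : go rest
    -- A single quote may start a character literal.
    go s@('\'' : _)
      | [(c, rest)] <- reads s = '\'' : unescaped '\'' [c] ++ '\'' : go rest
    -- Anything else (or a quote that does not parse) is copied through.
    go (c : cs) = c : go cs

-- Re-escape only what has to stay escaped: the enclosing quote, the
-- backslash, and non-printable characters.
unescaped :: Char -> String -> String
unescaped quote = foldr step ""
  where
    step c acc
      | c == quote || c == '\\' = '\\' : c : acc
      | isPrint c = c : acc
      | otherwise = showLitChar c acc

-- Printing wrapper in the style of upprint above.
uprint :: Show a => a -> IO ()
uprint = putStrLn . ushow
```

    For plain ASCII values ushow should agree with show. If placed in a module of its own, uprint could presumably be passed to -interactive-print in the same way as upprint above.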

    Semantic types in your programs

    And one more remark.

    Actually, if we care about debugging in GHCi (and not simply about demonstrating Haskell with pretty output), the need to show non-ASCII letters must come from some inherent presence of these characters in your program (otherwise, for debugging, you could substitute Latin characters or not care much about being shown the codes). In other words, these characters or strings carry some MEANING from the point of view of the problem domain. (For example, I have recently been working on grammatical analysis of Russian, and the Russian words in an example dictionary were "inherently" present in my program. Its work would make sense only with these specific words, so I needed to read them when debugging.)

    But look: if the strings have some MEANING, then they are no longer plain strings; they are data of a meaningful type. The program would probably become even better and safer if you declared a special type for this kind of meaning.

    And then, hooray!, you simply define your instance of Show for this type. And you are OK with debugging your program in GHCi.

    As an example, in my program for grammatical analysis, I have done:

    import Data.String (IsString (..))  -- needed for the IsString instances

    newtype Vocable = Vocable2 { ortho :: String } deriving (Eq, Ord)
    instance IsString Vocable -- to simplify typing the values (with OverloadedStrings)
        where fromString = Vocable2 . fromString
    

    and

    newtype Lexeme = Lexeme2 { lemma :: String } deriving (Eq,Ord)
    instance IsString Lexeme -- to simplify typing the values (with OverloadedStrings)
        where fromString = Lexeme2 . fromString
    

    (the extra fromString here is because I might switch the internal representation from String to ByteString or whatever)

    Apart from being able to show them nicely, the code became safer, because I can no longer mix up different types of words when composing it.
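    The Show instance itself is not given above. A minimal sketch of what it might look like, assuming it is enough to print the orthography as bare text (this is an illustration, not the instance from my program):

```haskell
import Data.String (IsString (..))

newtype Vocable = Vocable2 { ortho :: String } deriving (Eq, Ord)

instance IsString Vocable where
  fromString = Vocable2

-- Assumed instance: show the word as-is, without quotes or escaping.
-- The output is no longer parsable Haskell, which is acceptable for debugging.
instance Show Vocable where
  show = ortho

-- A small example value: a list of vocables shows as readable text.
sample :: String
sample = show [Vocable2 "дом", Vocable2 "кот"]
```

    Because the default print in GHCi goes through show, values of this type, and standard containers holding them, are displayed with the Cyrillic letters intact and need no custom printer.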
