How to hack GHCi (or Hugs) so that it prints Unicode chars unescaped?

情话喂你 2020-12-05 00:26

The problem: normally, in the interactive Haskell environment, non-Latin Unicode characters that are part of a result are printed escaped, even if the locale

7 Answers
  • 2020-12-05 01:02

    Now that I know about GHCi's -interactive-print, I think it is a great feature. Many thanks for writing the question and answers! That said, the existing pretty printers I could find on the web mishandle some corner cases, and writing a good Unicode show turned out to be more complicated than it seems.

    Therefore, I decided to write a Haskell package, unicode-show, for this purpose; it (hopefully) prints corner-case strings and compound types well.

    I hope this package is useful to people who find this Q&A :)

  • 2020-12-05 01:04

    Things will change in the next version of GHCi, 7.6.1, which adds a new option called -interactive-print. The following is copied from the GHC manual, followed by the myPrint and myShow functions I wrote:

    2.4.8. Using a custom interactive printing function
    
    [New in version 7.6.1] By default, GHCi prints the result of expressions typed at the prompt using the function System.IO.print. Its type signature is Show a => a -> IO (), and it works by converting the value to String using show.
    
    This is not ideal in certain cases, like when the output is long, or contains strings with non-ascii characters.
    
    The -interactive-print flag allows you to specify any function of type C a => a -> IO (), for some constraint C, as the function for printing evaluated expressions. The function can reside in any loaded module or any registered package.
    
    As an example, suppose we have the following special printing module:
    
         module SpecPrinter where
         import System.IO
    
         sprint a = putStrLn $ show a ++ "!"
    
    The sprint function adds an exclamation mark at the end of any printed value. Running GHCi with the command:
    
         ghci -interactive-print=SpecPrinter.sprint SpecPrinter
    
    will start an interactive session where values will be printed using sprint:
    
         *SpecPrinter> [1,2,3]
         [1,2,3]!
         *SpecPrinter> 42
         42!
    
    A custom pretty printing function can be used, for example, to format tree-like and nested structures in a more readable way.
    
    The -interactive-print flag can also be used when running GHC in -e mode:
    
         % ghc -e "[1,2,3]" -interactive-print=SpecPrinter.sprint SpecPrinter
         [1,2,3]!
    
    
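    module MyPrint (myPrint, myShow) where
    -- preparing for the 7.6.1

    -- Like print, but string and character literals are shown unescaped.
    myPrint :: Show a => a -> IO ()
    myPrint = putStrLn . myShow
    
    -- Re-parse every string or character literal in the output of show
    -- and emit its contents verbatim.
    -- Caveat: reads decodes every escape, so newlines and quotes inside a
    -- literal are emitted raw as well.
    myShow :: Show a => a -> String
    myShow x = con (show x) where
      con :: String -> String
      con [] = []
      con li@(c:cs) | c == '\"' = '\"':str++"\""++(con rest)
                    | c == '\'' = '\'':char:'\'':(con rest')
                    | otherwise = c:con cs where
                      (str,rest):_ = reads li    -- li starts with a string literal
                      (char,rest'):_ = reads li  -- li starts with a character literal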
    

    And they work well:

    *MyPrint> myPrint "asf萨芬速读法"
    "asf萨芬速读法"
    *MyPrint> myPrint "asdffasdfd"
    "asdffasdfd"
    *MyPrint> myPrint "asdffa撒旦发"
    "asdffa撒旦发"
    *MyPrint> myPrint '此'
    '此'
    *MyPrint> myShow '此'
    "'\27492'"
    *MyPrint> myPrint '此'
    '此'
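
    The corner cases mentioned in this thread can be reduced by decoding only the decimal escapes that denote printable characters and leaving every other escape exactly as show produced it. The following is a minimal sketch of that idea, not the implementation of any existing package. The names ushowSketch and uprintSketch are made up for illustration, and the sketch assumes that backslashes in the output of show occur only inside string and character literals.

    ```haskell
    import Data.Char (chr, isDigit, isPrint)

    -- | Like 'show', but a decimal escape such as \1087 is replaced by the
    -- character it denotes when that character is printable.
    -- Hypothetical helper, written for illustration only.
    ushowSketch :: Show a => a -> String
    ushowSketch = go . show
      where
        go :: String -> String
        -- An escaped backslash is copied as-is, so "\\1087" is not decoded.
        go ('\\' : '\\' : rest) = '\\' : '\\' : go rest
        -- A backslash followed by digits is a decimal character escape.
        go ('\\' : rest@(d : _))
          | isDigit d =
              let (digits, rest') = span isDigit rest
                  n = read digits :: Integer
              in if n <= 0x10FFFF && isPrint (chr (fromInteger n))
                   then chr (fromInteger n) : go rest'
                   else '\\' : digits ++ go rest'
        -- Everything else (including \n, \", and the \& separator) is kept.
        go (c : rest) = c : go rest
        go [] = []

    -- | Replacement for 'print' with the type expected by -interactive-print.
    uprintSketch :: Show a => a -> IO ()
    uprintSketch = putStrLn . ushowSketch
    ```

    If these definitions are placed in a module, say UShowSketch (a hypothetical name), the printer could be selected with ghci -interactive-print=UShowSketch.uprintSketch UShowSketch. Unlike the reads-based version, control characters and embedded quotes stay escaped.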
    
  • 2020-12-05 01:04

    What would be ideal is a patch to ghci allowing the user to :set a function to use for displaying results other than show. No such feature currently exists. However, Don's suggestion for a :def macro (with or without the text package) isn't bad at all.

  • 2020-12-05 01:05

    Option 1 (bad):

    Modify this line of code:

    https://github.com/ghc/packages-base/blob/ba98712/GHC/Show.lhs#L356

    showLitChar c s | c > '\DEL' =  showChar '\\' (protectEsc isDec (shows (ord c)) s)
    

    And recompile ghc.

    Option 2 (lots of work):

    When GHCi type checks a parsed statement it ends up in tcRnStmt which relies on mkPlan (both in https://github.com/ghc/ghc/blob/master/compiler/typecheck/TcRnDriver.lhs). This attempts to type check several variants of the statement that was typed in including:

    let it = expr in print it >> return [coerce HVal it]
    

    Specifically:

    print_it  = L loc $ ExprStmt (nlHsApp (nlHsVar printName) (nlHsVar fresh_it))
                                          (HsVar thenIOName) placeHolderType
    

    All that might need to change here is printName (which binds to System.IO.print). If it instead bound to something like printGhci which was implemented like:

    class ShowGhci a where
        showGhci :: a -> String
        ...
    
    -- Bunch of instances?
    
    instance ShowGhci Char where
        ...  -- The instance we want to be different.
    
    printGhci :: ShowGhci a => a -> IO ()
    printGhci = putStrLn . showGhci
    

    Ghci could then change what is printed by bringing different instances into context.
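
    To make the idea concrete, here is a small self-contained sketch of what such a class could look like. It is only an illustration under stated assumptions: ShowGhci and printGhci are the hypothetical names from above and do not exist in GHC, the showGhciList method is an extra borrowed from the showList trick in Show, and the Char instance deliberately performs no escaping at all.

    ```haskell
    import Data.List (intercalate)

    -- Hypothetical class mirroring the idea above; not part of GHC.
    class ShowGhci a where
      showGhci :: a -> String
      -- Same trick as showList in Show: lets Char decide how [Char] is shown.
      showGhciList :: [a] -> String
      showGhciList xs = "[" ++ intercalate "," (map showGhci xs) ++ "]"

    instance ShowGhci Int where
      showGhci = show

    -- The instance we want to be different: characters are emitted raw.
    -- Note that this sketch does not escape anything, not even quotes.
    instance ShowGhci Char where
      showGhci c = ['\'', c, '\'']
      showGhciList s = "\"" ++ s ++ "\""

    instance ShowGhci a => ShowGhci [a] where
      showGhci = showGhciList

    printGhci :: ShowGhci a => a -> IO ()
    printGhci = putStrLn . showGhci
    ```

    With these definitions, printGhci "привет" writes the string in quotes without numeric escapes, while printGhci [1, 2, 3 :: Int] still looks like the usual list output.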

  • 2020-12-05 01:05

    You could switch to using the 'text' package for IO. E.g.

    Prelude> :set -XOverloadedStrings
    Prelude> Data.Text.IO.putStrLn "hello: привет"
    hello: привет
    

    The package is part of the standard Haskell distribution, the Haskell Platform, and provides an efficient packed, immutable Unicode text type with IO operations. Many encodings are supported.

    Using a .ghci file you could set -XOverloadedStrings to be on by default, and write a :def macro to introduce a :text command that shows a value via text only. That would work.
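
    A .ghci file along these lines might look as follows. This is a sketch rather than a tested configuration: it assumes the text package is visible to GHCi, and the macro only wraps its argument in Data.Text.IO.putStrLn, so the argument must itself be a Text expression (for example a string literal under OverloadedStrings).

    ```haskell
    -- ~/.ghci (sketch; assumes the text package is available to GHCi)
    :set -XOverloadedStrings
    -- :text EXPR runs Data.Text.IO.putStrLn (EXPR), where EXPR must be a Text value
    :def text \expr -> return ("Data.Text.IO.putStrLn (" ++ expr ++ ")")
    ```

    After that, typing :text "hello: привет" at the prompt should print the text without escapes.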

  • 2020-12-05 01:11

    One way to hack this is to wrap GHCi into a shell wrapper that reads its stdout and unescapes Unicode characters. This is not the Haskell way of course, but it does the job :)

    For example, this is a wrapper ghci-esc that uses sh and python3 (3 is important here):

    #!/bin/sh
    
    ghci "$@" | python3 -c '
    import sys
    import re
    
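    def tr(match):
        s = match.group(1)
        try:
            return chr(int(s))
        except (ValueError, OverflowError):
            # Not a valid code point: leave the digits untouched.
            return s
    
    for line in sys.stdin:
        # GHC emits decimal escapes of any length (for example \27492), so match one or more digits.
        sys.stdout.write(re.sub(r"\\([0-9]+)", tr, line))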
    '
    

    Usage of ghci-esc:

    $ ./ghci-esc
    GHCi, version 7.0.2: http://www.haskell.org/ghc/  :? for help
    > "hello"
    "hello"
    > "привет"
    "привет"
    > 'Я'
    'Я'
    > show 'Я'
    "'\Я'"
    > :q
    Leaving GHCi.
    

    Note that not all unescaping above is done correctly, but this is a fast way to show Unicode output to your audience.
