Automatic conversion between String and Data.Text in Haskell

感动是毒 2020-12-14 00:53

As Nikita Volkov mentioned in his question "Data.Text vs String", I also wondered why I have to deal with the different string implementations, type String = [Char]

2 Answers
  • 2020-12-14 01:13

    Almost Yes: Data.String.Conversions

    Haskell libraries make use of different types, so there are many situations in which there is no choice but to heavily use conversion, distasteful as it is - rewriting libraries doesn't count as a real choice.

    I see two concrete problems, either of which is potentially a significant obstacle to Haskell adoption:

    • coding ends up requiring specific implementation knowledge of the libraries you want to use. This is a big issue for a high-level language.

    • performance on simple tasks is bad, which is a big issue for a general-purpose language.

    Abstracting from the specific types

    In my experience, the first problem shows up as time spent guessing which package and module hold the right function for plumbing between libraries that basically operate on the same data.
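
    To make that plumbing concrete, here is a small sketch of the manual conversions, using only the text and bytestring packages. The function names on the left (stringToText and so on) are made up for illustration; the library calls on the right are the real ones, and note that each lives in a different module. UTF-8 is assumed as the encoding.

```haskell
import qualified Data.ByteString      as SB
import qualified Data.ByteString.Lazy as LB
import qualified Data.Text            as T
import qualified Data.Text.Encoding   as TE
import qualified Data.Text.Lazy       as TL

-- String <-> strict Text lives in Data.Text.
stringToText :: String -> T.Text
stringToText = T.pack

-- Strict Text -> lazy Text lives in Data.Text.Lazy.
textToLazyText :: T.Text -> TL.Text
textToLazyText = TL.fromStrict

-- Text -> bytes lives in Data.Text.Encoding (UTF-8 assumed here).
textToBytes :: T.Text -> SB.ByteString
textToBytes = TE.encodeUtf8

-- Strict bytes -> lazy bytes lives in Data.ByteString.Lazy.
bytesToLazyBytes :: SB.ByteString -> LB.ByteString
bytesToLazyBytes = LB.fromStrict

-- A String headed for an API that wants a lazy ByteString takes three hops.
stringToLazyBytes :: String -> LB.ByteString
stringToLazyBytes = bytesToLazyBytes . textToBytes . stringToText
```

    None of this is hard, but every hop is a different import to remember.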

    There is a really handy solution to that problem: the string-conversions package (module Data.String.Conversions), provided you are comfortable with UTF-8 as your default encoding.

    This package provides a single conversion function, cs, between a number of different types:

    • String
    • Data.ByteString.ByteString
    • Data.ByteString.Lazy.ByteString
    • Data.Text.Text
    • Data.Text.Lazy.Text

    So you just import Data.String.Conversions and use cs; the compiler picks the right conversion according to the input and output types.

    Example:

    import Data.Aeson              (decode)
    import Data.Text               (Text)
    import Data.ByteString.Lazy    (ByteString)
    import Data.String.Conversions (cs)
    
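    -- MyStructure stands for any type of yours with a FromJSON instance.
    -- cs turns the Text into the UTF-8 encoded lazy ByteString that decode expects.
    decodeTextStoredJson' :: Text -> Maybe MyStructure
    decodeTextStoredJson' x = decode (cs x)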
    

    NB: In GHCi you generally do not have a context that fixes the target type, so you direct the conversion by explicitly stating the type of the result, as you would for read:

    let z = cs x :: ByteString
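
    If the type-directed dispatch looks like magic, here is a minimal sketch of how such a function can be built with a two-parameter type class. This is not the library's source: the class name Convert and the method conv are made up for this illustration, and only a few instances are shown. It uses only the text and bytestring packages.

```haskell
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import qualified Data.ByteString      as SB
import qualified Data.ByteString.Lazy as LB
import qualified Data.Text            as T
import qualified Data.Text.Encoding   as TE

-- Hypothetical class: one instance per (source, target) pair.
class Convert a b where
  conv :: a -> b

instance Convert String T.Text where
  conv = T.pack

instance Convert T.Text String where
  conv = T.unpack

-- UTF-8 is assumed whenever text becomes bytes.
instance Convert T.Text SB.ByteString where
  conv = TE.encodeUtf8

instance Convert T.Text LB.ByteString where
  conv = LB.fromStrict . TE.encodeUtf8

-- The type signatures select the instance; "\233" is the character e-acute.
coffee :: T.Text
coffee = conv ("caf\233" :: String)

coffeeBytes :: LB.ByteString
coffeeBytes = conv coffee
```

    Both the source and the target type must be known for an instance to be chosen, which is why GHCi needs the explicit annotation.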
    

    Performance and the cry for a "true" solution

    I am not aware of any true solution as of yet, but we can already guess the direction:

    • it is legitimate to require conversion because the data does not change;
    • best performance is achieved by not converting data from one type to another for administrative purposes;
    • coercion is evil - coercive, even.

    So the direction must be to make these types not different, i.e. to reconcile them under (or over) an archetype from which they would all derive, allowing composition of functions that use different derivations without the need to convert.

    Note: I absolutely cannot evaluate the feasibility or potential drawbacks of this idea. There may be some very sound show-stoppers.

  • 2020-12-14 01:17

    No.

    Haskell doesn't have implicit coercions for technical, philosophical, and almost religious reasons.

    As a comment, converting between these representations isn't free, and most people don't like the idea of hidden and potentially expensive computations lurking around. Additionally, because a String is a lazy list and may be infinite, coercing one to a strict Text value might not terminate.
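
    A small sketch of the termination point, using only the text package. The names endless, prefix and safe are made up for the example.

```haskell
import qualified Data.Text as T

-- An infinite String is perfectly fine as long as it is consumed lazily.
endless :: String
endless = cycle "ab"

-- Only the first five characters are ever produced.
prefix :: String
prefix = take 5 endless

-- Packing a finite prefix is fine.
safe :: T.Text
safe = T.pack (take 5 endless)

-- By contrast, T.pack endless would never return: a strict Text has to
-- consume the whole list before it can exist.
```

    An implicit conversion would hide exactly this kind of difference behind an innocent-looking expression.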

    We can convert literals to Text automatically with the OverloadedStrings extension. It desugars a string literal "foo" to fromString "foo", and fromString for Text just calls pack.
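
    A minimal sketch of what that looks like in a source file; the three binding names are made up for the example, and all three values are equal.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.String (fromString)
import           Data.Text   (Text)
import qualified Data.Text   as T

-- With the extension on, the literal itself can be a Text.
viaLiteral :: Text
viaLiteral = "foo"

-- Roughly what the compiler turns the literal into.
viaFromString :: Text
viaFromString = fromString ("foo" :: String)

-- What fromString does for Text.
viaPack :: Text
viaPack = T.pack "foo"
```

    Note that this only helps with literals. A String value computed at run time still needs an explicit pack.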

    The better question might be why you're converting so much. Is there some reason you need to unpack Text values so often? If you are constantly changing them to Strings, it defeats the purpose a bit.
