Can I speed up YAML?

耶瑟儿~ 2020-12-29 05:22

I made a little test case to compare YAML and JSON speed:

import json
import yaml
from datetime import datetime
from random import randint

NB_ROW=1024

pri         


        
3 Answers
  •  囚心锁ツ
    2020-12-29 05:59

    You've probably noticed that Python's syntax for data structures is very similar to JSON's syntax.

    What's happening is that Python's json library encodes Python's built-in datatypes directly into text chunks, replacing ' with " and dropping a , here and there (to oversimplify a bit).

    On the other hand, pyyaml has to construct a whole representation graph before serialising it into a string.

    The same kind of stuff has to happen backwards when loading.
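To make the intermediate graph visible, here is a minimal sketch that stops PyYAML after the composing stage and prints the nodes it built. It assumes PyYAML is installed; the sample text and the `describe` helper are made up for illustration.

```python
import yaml  # PyYAML, assumed to be installed

# Hypothetical sample document, not taken from the question.
text = "name: demo\nsizes: [1, 2]\n"

# compose() returns the node graph instead of finished Python objects.
root = yaml.compose(text, Loader=yaml.SafeLoader)


def describe(node, depth=0):
    """Return one line per node: its class, its tag, and the value for scalars."""
    pad = "  " * depth
    if isinstance(node, yaml.ScalarNode):
        return [f"{pad}ScalarNode {node.tag} {node.value!r}"]
    lines = [f"{pad}{type(node).__name__} {node.tag}"]
    if isinstance(node, yaml.MappingNode):
        # A mapping node stores a list of (key node, value node) pairs.
        for key, value in node.value:
            lines += describe(key, depth + 1) + describe(value, depth + 1)
    else:
        # A sequence node stores a list of item nodes.
        for item in node.value:
            lines += describe(item, depth + 1)
    return lines


graph_lines = describe(root)
print("\n".join(graph_lines))
```

Every key, value and container becomes its own node object before any dict or list is created, which is work the json module does not have to do.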

    The only way to speed up yaml.load() would be to write a new Loader, but I doubt it would be a huge leap in performance, unless you're willing to write your own single-purpose, sort-of-YAML parser, taking the following comment into consideration:

    YAML builds a graph because it is a general-purpose serialisation format that is able to represent multiple references to the same object. If you know no object is repeated and only basic types appear, you can use a JSON serialiser; the output will still be valid YAML.
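A small sketch of both points, assuming PyYAML is installed: the same dict referenced twice is written with an anchor and an alias by YAML, while JSON simply writes it twice, and that JSON text can be read back by a YAML loader. The sample data is made up, and this only demonstrates the idea for simple data like this.

```python
import json
import yaml  # PyYAML, assumed to be installed

shared = {"x": 1}
doc = {"a": shared, "b": shared}  # the same dict object, referenced twice

# YAML records the shared reference with an anchor (&...) and an alias (*...).
as_yaml = yaml.safe_dump(doc)
restored = yaml.safe_load(as_yaml)
same_object = restored["a"] is restored["b"]

# JSON has no notion of references, so the dict is written out twice.
# For basic types like these, the JSON text also parses as YAML.
as_json = json.dumps(doc)
from_json_via_yaml = yaml.safe_load(as_json)

print(as_yaml)
print(same_object)
print(from_json_via_yaml == doc)
```

The shared identity survives the YAML round trip but not the JSON one, which is the trade-off behind the suggestion to use a JSON serialiser when references don't matter.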

    -- UPDATE

    What I said before remains true, but if you're running Linux there's a way to speed up YAML parsing. By default, Python's yaml module uses the pure-Python parser. You have to tell it that you want to use PyYAML's C parser.

    You can do it this way:

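    import yaml
    from yaml import CLoader as Loader, CDumper as Dumper

    # With a stream argument, yaml.dump() writes to fh and returns None.
    yaml.dump(dummy_data, fh, encoding='utf-8', default_flow_style=False, Dumper=Dumper)
    # fh must be reopened (or rewound) for reading before loading.
    data = yaml.load(fh, Loader=Loader)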
    

    In order to do so, you need the LibYAML development files installed (the libyaml-dev package on Debian/Ubuntu), for instance with apt-get:

    $ apt-get install libyaml-dev
    

    PyYAML itself must also be built with the LibYAML bindings, but that is already the case judging by your output.

    I can't test it right now because I'm running OS X and brew has some trouble installing the library, but the PyYAML documentation is pretty clear that performance will be much better.
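To check the difference on your own machine, something like the following sketch could be used. It assumes PyYAML is installed, falls back to the pure-Python loader when the C bindings are missing, and uses CSafeLoader rather than the CLoader shown above so that only plain data types are constructed. The sample rows and the `timed_load` helper are made up for illustration, and no particular speed-up is implied.

```python
import time
import yaml  # PyYAML, assumed to be installed

try:
    # Only importable when PyYAML was built with the LibYAML bindings.
    from yaml import CSafeLoader as FastLoader
except ImportError:
    from yaml import SafeLoader as FastLoader

# Hypothetical sample data, not the data from the question.
rows = [{"id": i, "name": f"row{i}"} for i in range(200)]
text = yaml.safe_dump(rows)


def timed_load(loader):
    """Parse the sample text with the given loader; return (data, seconds)."""
    start = time.perf_counter()
    result = yaml.load(text, Loader=loader)
    return result, time.perf_counter() - start


slow_data, slow_s = timed_load(yaml.SafeLoader)
fast_data, fast_s = timed_load(FastLoader)
print(f"SafeLoader: {slow_s:.4f}s, {FastLoader.__name__}: {fast_s:.4f}s")
```

If the second name printed is SafeLoader, the C bindings were not found and both timings come from the pure-Python parser.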
