Read and write YAML files without destroying anchors and aliases?

滥情空心 2020-12-01 16:15

I need to open a YAML file with aliases used inside it:

defaults: &defaults
  foo: bar
  zip: button

node:
  <<: *defaults
  foo: other

3 Answers
  • 2020-12-01 16:52

    Using << to indicate that an aliased mapping should be merged into the current mapping isn’t part of the core YAML spec, but it is part of the tag repository.
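
To make the merge behaviour concrete, here is a minimal sketch of what an ordinary load does with the YAML from the question. The `loader` fallback is my own assumption to cope with different Psych versions; it is not part of the original answer.

```ruby
require 'psych'

yaml = <<~YAML
  defaults: &defaults
    foo: bar
    zip: button

  node:
    <<: *defaults
    foo: other
YAML

# Psych 4+ refuses aliases in Psych.load by default, so prefer unsafe_load
# where it exists and fall back to load on older versions.
loader = Psych.respond_to?(:unsafe_load) ? :unsafe_load : :load
data = Psych.public_send(loader, yaml)

# "<<" has been consumed: node holds the merged keys, with foo overridden.
merged = data['node']
puts merged.key?('<<')                # is the merge key still present?
puts merged.equal?(data['defaults'])  # is node still the anchored hash?
```

Because the merged hash is a new object, nothing in the loaded data links `node` back to the `defaults` anchor any more.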

    Ruby’s current YAML library, Psych, provides the dump and load methods, which allow easy serialization and deserialization of Ruby objects and apply the implicit type conversions from the tag repository, including << to merge hashes. It also provides tools for lower-level YAML processing if you need it. Unfortunately, it doesn’t make it easy to selectively enable or disable specific parts of the tag repository: it’s an all-or-nothing affair. In particular, the handling of << is baked into the handling of hashes.
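
As a rough illustration of the lower-level tools mentioned above, the following sketch edits the parsed node tree directly and never converts it to Ruby objects. It assumes the YAML from the question; the variable names are mine, and this is only one possible approach.

```ruby
require 'psych'

yaml = <<~YAML
  defaults: &defaults
    foo: bar
    zip: button

  node:
    <<: *defaults
    foo: other
YAML

# parse_stream returns a Psych::Nodes::Stream. Nothing is converted to Ruby
# objects, so anchors, aliases and the "<<" key stay as they were written.
stream = Psych.parse_stream(yaml)
root = stream.children.first.root # top-level mapping of the first document

# Mapping children alternate key node, value node.
_, node_map = root.children.each_slice(2).find { |k, _| k.value == 'node' }
_, foo_value = node_map.children.each_slice(2).find { |k, _| k.value == 'foo' }
foo_value.value = 'yet another'

# Emit the modified tree; anchor names come from the nodes themselves.
output = stream.to_yaml
puts output
```

Comments and blank lines are not part of the node tree, so they are not preserved by this round trip.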

    One way to achieve what you want is to write your own subclass of Psych’s ToRuby class and override its revive_hash method so that it treats mapping keys of << as literals. This means overriding a private method in Psych, so you need to be a little careful:

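    require 'psych'

    # Subclass of Psych's ToRuby visitor that keeps "<<" as an ordinary key
    # instead of merging the aliased mapping into the current hash.
    class ToRubyNoMerge < Psych::Visitors::ToRuby
      # Overrides a private Psych method. The trailing splat absorbs the extra
      # argument that some Psych versions pass to revive_hash.
      def revive_hash(hash, o, *)
        @st[o.anchor] = hash if o.anchor

        # Mapping children alternate key, value, key, value, ...
        o.children.each_slice(2) do |k, v|
          hash[accept(k)] = accept(v)
        end
        hash
      end
    end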
    

    You would then use it like this:

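    tree = Psych.parse your_data
    # ToRuby.create supplies the scanner and class loader that newer Psych
    # versions require; on very old versions use ToRubyNoMerge.new instead.
    data = ToRubyNoMerge.create.accept tree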
    

    With the YAML from your example, data would then look something like

    {"defaults"=>{"foo"=>"bar", "zip"=>"button"},
     "node"=>{"<<"=>{"foo"=>"bar", "zip"=>"button"}, "foo"=>"other"}}
    

    Note the << as a literal key. Also, the hash under data["defaults"] is the very same object as the one under data["node"]["<<"], i.e. they have the same object_id. You can now manipulate the data as you like, and when you write it back out as YAML the anchor and alias will still be in place, although the anchor name will have changed:

    data['node']['foo'] = "yet another"
    puts Psych.dump data
    

    produces the output below. Older versions of Psych derive the anchor name from the object_id of the hash to keep it unique, as shown here; current versions use sequential numbers instead:

    ---
    defaults: &2151922820
      foo: bar
      zip: button
    node:
      <<: *2151922820
      foo: yet another
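
The shared-object claim above can be checked directly. This is a minimal sketch using the same idea as ToRubyNoMerge; the class name `KeepMergeKeys` and the use of `create` (for newer Psych versions) are my assumptions.

```ruby
require 'psych'

# Hypothetical name; same idea as the no-merge visitor above.
class KeepMergeKeys < Psych::Visitors::ToRuby
  # The splat absorbs the extra argument some Psych versions pass.
  def revive_hash(hash, o, *)
    @st[o.anchor] = hash if o.anchor
    o.children.each_slice(2) { |k, v| hash[accept(k)] = accept(v) }
    hash
  end
end

yaml = <<~YAML
  defaults: &defaults
    foo: bar
    zip: button

  node:
    <<: *defaults
    foo: other
YAML

data = KeepMergeKeys.create.accept(Psych.parse(yaml))

# The alias resolves to the very object registered for the anchor.
shared = data['node']['<<']
puts shared.equal?(data['defaults'])

# A change made through one reference is visible through the other.
data['defaults']['zip'] = 'zipper'
puts shared['zip']
```

Since both keys point at one hash, the dumper sees a repeated object and emits an anchor and an alias rather than two copies.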
    

    If you want control over the anchor names, you can provide your own Psych::Visitors::Emitter. Here’s a simple version based on your example, assuming there is only the one anchor:

    class MyEmitter < Psych::Visitors::Emitter
      def visit_Psych_Nodes_Mapping o
        o.anchor = 'defaults' if o.anchor
        super
      end
    
      def visit_Psych_Nodes_Alias o
        o.anchor = 'defaults' if o.anchor
        super
      end
    end
    

    When used with the modified data hash from above:

    # Create an AST based on the Ruby data structure.
    # (Newer Psych versions need YAMLTree.create; older ones used YAMLTree.new.)
    builder = Psych::Visitors::YAMLTree.create
    builder << data
    ast = builder.tree

    # Write out the tree using the custom emitter
    MyEmitter.new($stdout).accept ast
    

    the output is:

    ---
    defaults: &defaults
      foo: bar
      zip: button
    node:
      <<: *defaults
      foo: yet another
    

    (Update: another question asked how to do this with more than one anchor, where I came up with a possibly better way to keep anchor names when serializing.)

  • 2020-12-01 16:52

    Have you tried Psych? There is another question about Psych here.

  • 2020-12-01 16:57

    YAML has aliases and they can round-trip, but hash merging defeats that. Using << as a mapping key appears to be a non-standard extension to YAML (in both Ruby 1.8's Syck and Ruby 1.9's Psych).

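    require 'yaml'

    yaml = <<EOS
    defaults: &defaults
      foo: bar
      zip: button

    node: *defaults
    EOS

    # Psych 4+ (Ruby 3.1+) rejects aliases in YAML.load unless aliases: true
    # is given; drop the keyword on older versions.
    data = YAML.load yaml, aliases: true
    print data.to_yaml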
    

    prints (the anchor name and key order depend on the Ruby version and YAML engine)

    --- 
    defaults: &id001 
      zip: button
      foo: bar
    node: *id001
    

    but the << in your data merges the aliased hash into a new one which is no longer an alias.
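
A quick way to see why the plain alias survives: after loading, both keys refer to one hash object, which the dumper then writes as an anchor plus an alias. This is a sketch only; the `loader` fallback is my assumption to cope with different Psych versions.

```ruby
require 'yaml'

yaml = <<~YAML
  defaults: &defaults
    foo: bar
    zip: button

  node: *defaults
YAML

# Psych 4+ refuses aliases in YAML.load by default, so prefer unsafe_load
# where it exists and fall back to load on older versions.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
data = YAML.public_send(loader, yaml)

# Without "<<", the alias simply resolves to the anchored hash itself.
puts data['node'].equal?(data['defaults'])

dumped = YAML.dump(data)
puts dumped
```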
