How to merge YAML arrays?

I would like to merge arrays in YAML and load them via Ruby:

some_stuff: &some_stuff
 - a
 - b
 - c

combined_stuff:
  <<: *some_stuff
  - d
  -


                      
5 answers

情歌与酒, 2020-11-29 23:55

If the aim is to run a sequence of shell commands, you may be able to achieve this as follows:

# note: no dash before commands
some_stuff: &some_stuff |-
    a
    b
    c

combined_stuff:
  - *some_stuff
  - d
  - e
  - f


This is equivalent to:

some_stuff: "a\nb\nc"

combined_stuff:
  - "a\nb\nc"
  - d
  - e
  - f


I have been using this in my .gitlab-ci.yml (in reply to @rink.attendant.6's comment on the question).



A working example that we use so that requirements.txt can reference private repositories on GitLab:

.pip_git: &pip_git
- git config --global url."https://gitlab-ci-token:${CI_JOB_TOKEN}@gitlab.com".insteadOf "ssh://git@gitlab.com"
- mkdir -p ~/.ssh
- chmod 700 ~/.ssh
- echo "$SSH_KNOWN_HOSTS" > ~/.ssh/known_hosts
- chmod 644 ~/.ssh/known_hosts

test:
    image: python:3.7.3
    stage: test
    script:
        - *pip_git
        - pip install -q -r requirements_test.txt
        - python -m unittest discover tests

The same `*pip_git` alias can be reused in other jobs, e.g. one that builds an image.

Here requirements_test.txt contains, for example:

-e git+ssh://git@gitlab.com/example/example.git@v0.2.2#egg=example
                                                                        
                                                        
            
            
              
                
天命终不由人, 2020-11-29 23:55

You can merge mappings and then convert their keys into a list, under these conditions:

- you are using Jinja2 templating, and
- item order is not important


some_stuff: &some_stuff
 a:
 b:
 c:

combined_stuff:
  <<: *some_stuff
  d:
  e:
  f:

{{ combined_stuff | list }}
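
Outside of Jinja2, the same idea can be sketched in plain Python. This is only an illustration: it assumes the YAML above has already been loaded by a parser that applies merge keys, and the dict literals below are stand-ins for that loaded data (the variable names just mirror the YAML keys).

```python
# Stand-ins for what a merge-key-aware YAML loader would produce;
# keys written without a value load as None.
some_stuff = {"a": None, "b": None, "c": None}

# Rough equivalent of `<<: *some_stuff` followed by the keys d, e and f.
combined_stuff = {**some_stuff, "d": None, "e": None, "f": None}

# Like the Jinja2 `list` filter, list() on a mapping yields its keys.
combined_list = list(combined_stuff)

# Order should not be relied upon, so sort before printing.
print(sorted(combined_list))
```

Because the items are mapping keys, duplicates collapse into a single entry, which is a second reason this only suits cases where the list is really a set.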

                                                                        
                                                        
            
            
              
                
萌比男神i, 2020-11-30 00:03

If you only need to merge a single item into a list, you can do:

fruit:
  - &banana
    name: banana
    colour: yellow

food:
  - *banana
  - name: carrot
    colour: orange


which yields

fruit:
  - name: banana
    colour: yellow

food:
  - name: banana
    colour: yellow
  - name: carrot
    colour: orange

                                                                        
                                                        
            
            
              
                
醉梦人生, 2020-11-30 00:06

Update: 2019-07-01 14:06:12


Note: another answer to this question was substantially edited with an update on alternative approaches. 


That updated answer mentions an alternative to the workaround in this answer. It has been added to the See also section below.



Context

This post assumes the following context:


python 2.7
python YAML parser


Problem

lfender6445 wishes to merge two or more lists within a YAML file, and have those
merged lists appear as one singular list when parsed.

Solution (Workaround)

This can be done by anchoring each list where it is defined, as the value of
a mapping key, and then referring to those anchors with aliases inside a combined
list. There are caveats, however (see "Pitfalls" below).

In the example below there are three keys (list_one, list_two, list_three), each
holding an anchored list, and three aliases that refer to those lists.

When the YAML file is loaded in the program we get the list we want, but
it may require a little modification after loading (see "Pitfalls" below).

Example

Original YAML file

  list_one: &id001
   - a
   - b
   - c

  list_two: &id002
   - e
   - f
   - g

  list_three: &id003
   - h
   - i
   - j

  list_combined:
      - *id001
      - *id002
      - *id003


Result after YAML.safe_load

## list_combined
  [
    [
      "a",
      "b",
      "c"
    ],
    [
      "e",
      "f",
      "g"
    ],
    [
      "h",
      "i",
      "j"
    ]
  ]


Pitfalls


- this approach produces a nested list of lists, which may not be the exact desired output, but it can be post-processed with a flatten method
- the usual caveats for YAML anchors and aliases apply regarding uniqueness and declaration order


Conclusion

This approach allows creation of merged lists by use of the alias and anchor feature of YAML.

Although the output result is a nested list of lists, this can be easily transformed using the flatten method.
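
Python lists have no built-in flatten method, so for the Python context assumed in this answer the post-processing step could look like the sketch below. The literal assigned to list_combined is a stand-in for the loaded value shown above, not the output of a real parser call.

```python
from itertools import chain

# Stand-in for the value of list_combined after loading the YAML above.
list_combined = [["a", "b", "c"], ["e", "f", "g"], ["h", "i", "j"]]

# Flatten exactly one level of nesting.
flat = list(chain.from_iterable(list_combined))
print(flat)
```

Note that chain.from_iterable iterates over every top-level item, so a plain string placed directly in list_combined would be split into characters; check item types first if the list mixes aliases and scalars.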

See also

Updated alternative approach by @Anthon


See alternative approach


Examples of the flatten method


JavaScript flatten: Merge/flatten an array of arrays
Ruby flatten: http://ruby-doc.org/core-2.2.2/Array.html#method-i-flatten
Python flatten: https://softwareengineering.stackexchange.com/a/254676/23884

                                                                        
                                                        
            
            
              
                
南笙, 2020-11-30 00:17

This is not going to work:

- merge is only supported by the YAML specifications for mappings and not for sequences
- you are completely mixing things up by having a merge key << followed by the
  key/value separator : and a value that is an alias, and then continuing with a
  list at the same indentation level


This is not correct YAML:

combine_stuff:
  x: 1
  - a
  - b


So your example syntax would not even make sense as a YAML extension proposal.

If you want to do something like merging multiple arrays you might want to consider a syntax like:

combined_stuff:
  - <<: *s1, *s2
  - <<: *s3
  - d
  - e
  - f


where s1, s2, s3 are anchors on sequences (not shown) that you
want to merge into a new sequence, with d, e and f then
appended to it. But YAML resolves these kinds of structures depth
first, so there is no real context available while the merge key
is processed: there is no array/list available to which you
could attach the processed value (the anchored sequence).

You can take the approach as proposed by @dreftymac, but this has the huge disadvantage that you
somehow need to know which nested sequences to flatten (i.e. by knowing the "path" from the root
of the loaded data structure to the parent sequence), or that you recursively walk the loaded
data structure searching for nested arrays/lists and indiscriminately flatten all of them.
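
To make that disadvantage concrete, here is a small sketch of both options on already-loaded data. The helper names (flatten_once, flatten_at, flatten_everywhere) and the sample dict are hypothetical and only serve the illustration.

```python
import copy


def flatten_once(seq):
    """Flatten one level: list items are spliced in, other items are kept."""
    out = []
    for item in seq:
        if isinstance(item, list):
            out.extend(item)
        else:
            out.append(item)
    return out


def flatten_at(data, path):
    """Flatten only the list reached by following a non-empty `path` of keys/indices."""
    parent = data
    for key in path[:-1]:
        parent = parent[key]
    parent[path[-1]] = flatten_once(parent[path[-1]])


def flatten_everywhere(node):
    """Recursively flatten every nested list, whether that was intended or not."""
    if isinstance(node, dict):
        return {k: flatten_everywhere(v) for k, v in node.items()}
    if isinstance(node, list):
        return flatten_once([flatten_everywhere(item) for item in node])
    return node


# Hypothetical loaded data: "combined" should be flattened, "matrix" should not.
data = {"combined": [[1, 2], [3, 4], 5], "matrix": [[1, 0], [0, 1]]}

targeted = copy.deepcopy(data)
flatten_at(targeted, ["combined"])      # requires knowing the path
everything = flatten_everywhere(data)   # also destroys the matrix nesting

print(targeted)
print(everything)
```

The targeted variant leaves matrix intact but needs the path; the indiscriminate variant needs no path but turns the matrix into a flat list.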

A better solution IMO would be to use tags to load data structures
that do the flattening for you. This allows for clearly denoting what
needs to be flattened and what not and gives you full control over
whether this flattening is done during loading, or done during
access. Which one to choose is a matter of ease of implementation and
efficiency in time and storage space. This is the same trade-off that needs to be made
for implementing the merge key feature and 
there is no single solution that is always the best.

E.g. my ruamel.yaml library uses the brute force merge-dicts during
loading when using its safe-loader, which results in merged
dictionaries that are normal Python dicts. This merging has to be done
up-front, and duplicates data (space inefficient) but is fast in value
lookup. When using the round-trip-loader, you want to be able to dump
the merges unmerged, so they need to be kept separate. The dict like
datastructure loaded as a result of round-trip-loading, is space
efficient but slower in access, as it needs to try and lookup a key
not found in the dict itself in the merges (and this is not cached, so
it needs to be done every time).  Of course such considerations are
not very important for relatively small configuration files.
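
The trade-off can be illustrated with the standard library alone. The sketch below is not how ruamel.yaml implements either loader; it only contrasts an up-front merge into a plain dict with a lookup-time merge, using collections.ChainMap as a stand-in for the latter. The names base and own are made up for the example.

```python
from collections import ChainMap

base = {"a": 1, "b": 2}    # plays the role of the anchored mapping
own = {"b": 20, "c": 3}    # keys written next to the merge key

# Up-front merge: one plain dict, data is copied, lookups are direct.
eager = {**base, **own}

# Lookup-time merge: nothing is copied; a missing key is searched for
# in `base` on every access.
lazy = ChainMap(own, base)

print(eager["a"], lazy["a"])   # both resolve "a" via the merged-in mapping
print(eager["b"], lazy["b"])   # the mapping's own value wins in both cases

# The lazy form still knows which keys were written locally,
# which is what dumping "unmerged" requires.
print(lazy.maps[0])
```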



The following implements a merge-like scheme for lists in Python, using objects tagged !flatten
which, on the fly, recurse into items that are lists tagged !toflatten. Using these two tags
you can have a YAML file like:

l1: &x1 !toflatten
  - 1 
  - 2
l2: &x2
  - 3 
  - 4
m1: !flatten
  - *x1
  - *x2
  - [5, 6]
  - !toflatten [7, 8]


(the use of flow vs block style sequences is completely arbitrary and has no influence on the
loaded result).

When iterating over the items that are the value for key m1 this
"recurses" into the sequences tagged with toflatten, but displays
other lists (aliased or not) as a single item.

One possible way with Python code to achieve that is:

from pathlib import Path

import ruamel.yaml

yaml = ruamel.yaml.YAML()


@yaml.register_class
class Flatten(list):
    """Sequence tagged !flatten: iteration expands !toflatten items in place."""
    yaml_tag = u'!flatten'

    def __init__(self, *args):
        # the items are kept separately; the underlying list stays empty
        self.items = args

    @classmethod
    def from_yaml(cls, constructor, node):
        # deep=True makes sure nested sequences are fully constructed first
        return cls(*constructor.construct_sequence(node, deep=True))

    def __iter__(self):
        for item in self.items:
            if isinstance(item, ToFlatten):
                # splice the items of a !toflatten sequence into the stream
                for nested_item in item:
                    yield nested_item
            else:
                yield item


@yaml.register_class
class ToFlatten(list):
    """Sequence tagged !toflatten: marks a list that !flatten may expand."""
    yaml_tag = u'!toflatten'

    @classmethod
    def from_yaml(cls, constructor, node):
        return cls(constructor.construct_sequence(node, deep=True))


data = yaml.load(Path('input.yaml'))
for item in data['m1']:
    print(item)


which outputs:

1
2
[3, 4]
[5, 6]
7
8


As you can see, in the sequence that needs flattening you can either
use an alias to a tagged sequence or use a tagged sequence directly.
YAML doesn't allow you to do:

- !flatten *x2

i.e. tag an alias to an anchored sequence, as this would essentially
make it into a different data structure.

Using explicit tags is IMO better than having some magic going on, as
with YAML merge keys <<. If nothing else, you currently have to jump through
hoops if you happen to have a YAML file with a mapping that has a key
<< that you don't want to act as a merge key, e.g. when you make a
mapping of C operators to their descriptions in English (or some other natural language).
                                                                        
                                                        
            
            
              
                