Writing Python dictionaries into YAML documents using PyYAML


Solution 1

How about:

import yaml

class Bunch(yaml.YAMLObject):
    # Base class: stores keyword arguments as attributes; subclasses supply the tag.
    yaml_tag = u'!Bunch'
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
    def __repr__(self):
        return '{c}({a})'.format(
            c = self.__class__.__name__,
            a = ', '.join(
                ['='.join(map(str,item)) for item in self.__dict__.items()]))

# Create one Bunch subclass per tag name, each with its own yaml_tag.
tag_names = ['define', 'action']
namespace = {}
for name in tag_names:
    namespace[name] = type(name, (Bunch,), {'yaml_tag':u'!{n}'.format(n = name)})

definitions = {"one" : 1, "two" : 2, "three" : 3}
actions = {"run" : "yes", "print" : "no", "report" : "maybe"}
# dump_all writes one YAML document per object; explicit_start adds the '---' markers.
text = yaml.dump_all([namespace['define'](**definitions),
                      namespace['action'](**actions)],
                     default_flow_style = False,
                     explicit_start = True)
print(text)

which yields

--- !define
one: 1
three: 3
two: 2
--- !action
print: 'no'
report: maybe
run: 'yes'

And to load the YAML back into Python objects:

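# Recent PyYAML versions require an explicit Loader; yaml.Loader can construct
# arbitrary Python objects, so use it only on input you trust.
for item in yaml.load_all(text, Loader=yaml.Loader):
    print(item)
    # define(one=1, three=3, two=2)
    # action(print=no, report=maybe, run=yes)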

The subclasses of YAMLObject were used to create the application-specific tags.
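If the YAML has to be read with the safe loader, the same idea can be sketched as follows. This assumes a PyYAML version in which `YAMLObject` subclasses may set the `yaml_loader` class attribute to `yaml.SafeLoader`; the names `SafeBunch` and `make_tag_class` are hypothetical helpers, not part of PyYAML.

```python
import yaml

class SafeBunch(yaml.YAMLObject):
    # Hypothetical base class: registering with SafeLoader lets
    # yaml.safe_load_all construct instances of the tagged subclasses.
    yaml_tag = '!SafeBunch'
    yaml_loader = yaml.SafeLoader

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

def make_tag_class(name):
    # Build a subclass whose YAML tag is '!<name>' (hypothetical helper).
    return type(name, (SafeBunch,), {'yaml_tag': '!' + name})

definitions = {"one": 1, "two": 2, "three": 3}
actions = {"run": "yes", "print": "no", "report": "maybe"}

text = yaml.dump_all([make_tag_class('define')(**definitions),
                      make_tag_class('action')(**actions)],
                     default_flow_style=False,
                     explicit_start=True)
print(text)

# Only tags registered through the subclasses above are accepted here.
loaded = list(yaml.safe_load_all(text))
```

Each loaded item is an instance of the dynamically created class, with the mapping's keys available as attributes.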

Solution 2

Well, I'm still looking into automatic comments (couldn't find the docs for that right away) but this should do the trick:

import yaml

definitions = {"one" : 1, "two" : 2, "three" : 3}
actions = {"run" : "yes", "print" : "no", "report" : "maybe"}

output = yaml.dump(actions, default_flow_style=False, explicit_start=True)
output += yaml.dump(definitions, default_flow_style=False, explicit_start=True)

print(output)

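One word of caution: before Python 3.7 dictionaries were unordered, and PyYAML sorts mapping keys when dumping by default, so the key order of your resulting YAML may not match the order in which you wrote the dictionary. If you want order in the house, pass sort_keys=False to yaml.dump (PyYAML 5.1 and later), or on older Python versions look at OrderedDict.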
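On the ordering point, here is a minimal sketch, assuming Python 3.7+ (insertion-ordered dicts) and PyYAML 5.1 or later (where the `sort_keys` argument is available). It writes both dictionaries as two documents in a single `dump_all` call and reads them back.

```python
import yaml

definitions = {"one": 1, "two": 2, "three": 3}
actions = {"run": "yes", "print": "no", "report": "maybe"}

# sort_keys=False (assumed available: PyYAML >= 5.1) keeps insertion order;
# explicit_start=True writes a '---' marker before each document.
text = yaml.dump_all([definitions, actions],
                     default_flow_style=False,
                     explicit_start=True,
                     sort_keys=False)
print(text)

# Reading the stream back gives one plain dict per document.
documents = list(yaml.safe_load_all(text))
```

Note that this produces untagged documents; it does not emit the !define and !action tags asked for in the question.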

Author: Periodic Maintenance

Updated on June 04, 2022

Comments

  • Periodic Maintenance
    Periodic Maintenance almost 2 years

    I have two python dictionaries which I want to write to a single yaml file, with two documents:

    definitions = {"one" : 1, "two" : 2, "three" : 3}
    actions = {"run" : "yes", "print" : "no", "report" : "maybe"}
    

    The yaml file should look like:

    --- !define
    one: 1
    two: 2
    three: 3
    
    --- !action
    run: yes
    print: no
    report: maybe
    ...
    

    Using PyYAML, I did not find a clear way to do that. I'm sure there is a simple method, but digging into the PyYAML documentation only got me confused. Do I need a dumper, an emitter, or what? And what type of output does each of these produce? YAML text? YAML nodes? YAMLObject? Anyway, I would be grateful for any clarification.


    Following unutbu's answer below, here is the most concise version I could come up with:

    DeriveYAMLObjectWithTag is a function to create a new class, derived from YAMLObject with the required tag:

    import yaml

    def DeriveYAMLObjectWithTag(tag):
        def init_DeriveYAMLObjectWithTag(self, **kwargs):
            """ __init__ for the new class """
            self.__dict__.update(kwargs)

        # Build a YAMLObject subclass on the fly; its yaml_tag becomes '!<tag>'.
        new_class = type('YAMLObjectWithTag_'+tag,
                         (yaml.YAMLObject,),
                         {'yaml_tag' : '!{n}'.format(n = tag),
                          '__init__' : init_DeriveYAMLObjectWithTag})
        return new_class
    

    And here is how to use DeriveYAMLObjectWithTag to get the required Yaml:

    definitions = {"one" : 1, "two" : 2, "three" : 3, "four" : 4}
    actions = {"run" : "yes", "print" : "no", "report" : "maybe"}
    namespace = [DeriveYAMLObjectWithTag('define')(**definitions),
                 DeriveYAMLObjectWithTag('action')(**actions)]
    
    text = yaml.dump_all(namespace,
                         default_flow_style = False,
                         explicit_start = True)
    

    Thanks to all those who answered. It seems there's a lack of functionality in PyYAML, and this is the most elegant way to overcome it.

  • Periodic Maintenance
    Periodic Maintenance over 11 years
    @favoretti: Indeed, it's the explicit_start=True that divides the output into documents. explicit_start is not documented at all in PyYAMLDocumentation; it is only mentioned in an example.
  • Periodic Maintenance
    Periodic Maintenance over 11 years
    The idea of declaring a class for each specific tag works, but it will not scale up for a large project where there are many such tags. The more I use PyYAML, the more I realize how lame it is.
  • unutbu
    unutbu over 11 years
    I've added some code to show how classes could be defined programmatically.
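On the scaling concern raised in the comments, one possible alternative to a class per tag is a single dict subclass that carries its tag, with one representer and one multi-constructor. This is only a sketch: `Tagged`, `TagDumper`, `TagLoader`, `represent_tagged` and `construct_tagged` are hypothetical names, and `sort_keys` assumes PyYAML 5.1 or later.

```python
import yaml

class Tagged(dict):
    """A dict that remembers the YAML tag it should be written with (hypothetical)."""
    def __init__(self, tag, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.tag = tag

# Private subclasses so the registrations do not alter PyYAML's global loaders/dumpers.
class TagDumper(yaml.SafeDumper):
    pass

class TagLoader(yaml.SafeLoader):
    pass

def represent_tagged(dumper, data):
    # Emit the mapping under the local tag '!<tag>'.
    return dumper.represent_mapping('!' + data.tag, dict(data))

def construct_tagged(loader, tag_suffix, node):
    # Called for any tag starting with '!'; tag_suffix is the part after it.
    return Tagged(tag_suffix, loader.construct_mapping(node, deep=True))

TagDumper.add_representer(Tagged, represent_tagged)
TagLoader.add_multi_constructor('!', construct_tagged)

docs = [Tagged('define', {"one": 1, "two": 2, "three": 3}),
        Tagged('action', {"run": "yes", "print": "no", "report": "maybe"})]

text = yaml.dump_all(docs, Dumper=TagDumper,
                     default_flow_style=False,
                     explicit_start=True,
                     sort_keys=False)
print(text)

loaded = list(yaml.load_all(text, Loader=TagLoader))
```

With this approach a new tag needs no new class, only a different first argument to `Tagged`; the trade-off is that every tagged document is loaded as a dict rather than as an object with attributes.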