Controlling YAML Serialization Order in Python
Solution 1
Took me a few hours of digging through PyYAML docs and tickets, but I eventually discovered this comment that lays out some proof-of-concept code for serializing an OrderedDict as a normal YAML map (but maintaining the order).
Applied to my original code, the solution looks something like this:
>>> import yaml
>>> from collections import OrderedDict
>>> def dump_anydict_as_map(anydict):
...     yaml.add_representer(anydict, _represent_dictorder)
...
>>> def _represent_dictorder(self, data):
...     if isinstance(data, Document):
...         return self.represent_mapping('tag:yaml.org,2002:map', data.__getstate__().items())
...     else:
...         return self.represent_mapping('tag:yaml.org,2002:map', data.items())
...
>>> class Document(object):
...     def __init__(self, name):
...         self.name = name
...         self.otherstuff = 'blah'
...     def __getstate__(self):
...         d = OrderedDict()
...         d['name'] = self.name
...         d['otherstuff'] = self.otherstuff
...         return d
...
>>> dump_anydict_as_map(Document)
>>> doc = Document('obj-20111227')
>>> print yaml.dump(doc, indent=4)
!!python/object:__main__.Document
name: obj-20111227
otherstuff: blah
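For anyone on Python 3, here is a minimal sketch of the same representer idea applied directly to OrderedDict. The function name represent_ordered_mapping is my own and not part of PyYAML. The sketch relies on represent_mapping leaving a list of (key, value) pairs in the order given, since only dict-like inputs are candidates for key sorting.

```python
from collections import OrderedDict

import yaml


def represent_ordered_mapping(dumper, data):
    # Hypothetical helper name. Passing a list of (key, value) pairs
    # rather than a dict emits the pairs in the order they are listed.
    return dumper.represent_mapping('tag:yaml.org,2002:map', list(data.items()))


# Register the representer on the default Dumper used by yaml.dump().
yaml.add_representer(OrderedDict, represent_ordered_mapping)

doc = OrderedDict()
doc['name'] = 'obj-20111227'
doc['otherstuff'] = 'blah'

text = yaml.dump(doc, default_flow_style=False)
print(text)
```

The result is a plain YAML map with no Python-specific tag, so it should load back as an ordinary dict with yaml.safe_load.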
Solution 2
New Solution (as of 2020 and PyYAML 5.1)
You can dump a dictionary in its current order by simply using
yaml.dump(data, default_flow_style=False, sort_keys=False)
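A small sketch of the difference, assuming PyYAML 5.1 or later and Python 3.7+ (where plain dicts keep insertion order). The 'author' key is made up for illustration so that alphabetical order and insertion order differ.

```python
import yaml

# 'author' is a hypothetical extra key, added so sorting visibly changes the order.
data = {'name': 'obj-20111227', 'otherstuff': 'blah', 'author': 'Cerin'}

# Default behaviour: keys are sorted alphabetically.
sorted_text = yaml.dump(data, default_flow_style=False)

# With sort_keys=False the dictionary's own order is kept.
ordered_text = yaml.dump(data, default_flow_style=False, sort_keys=False)

print(sorted_text)
print(ordered_text)
```

The first dump lists author before name, while the second starts with name, matching the order in which the dictionary was built.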
Solution 3
I think the problem arises when you dump the data.
I looked into the PyYAML code and there is an optional argument called sort_keys; setting it to False seems to do the trick.
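As a rough sketch of how this could apply to the Document class from the question (again assuming PyYAML 5.1+ and Python 3.7+): the 'category' attribute is invented for the example, and __getstate__ is written here to put "name" first, which is one possible approach rather than the only one.

```python
import yaml


class Document(object):
    def __init__(self, name):
        self.category = 'report'  # hypothetical extra field, set before name
        self.name = name
        self.otherstuff = 'blah'

    def __getstate__(self):
        # Start the state with 'name', then add the remaining attributes
        # in their existing order ('name' keeps its first position).
        state = {'name': self.name}
        state.update(self.__dict__)
        return state


doc = Document('obj-20111227')

# sort_keys=False keeps the order produced by __getstate__.
text = yaml.dump(doc, default_flow_style=False, sort_keys=False)
print(text)
```

The first line of the output is still the !!python/object tag for the class; the keys below it follow the order of the state dictionary, with name at the top.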
Cerin
Updated on June 04, 2022
Comments
-
Cerin almost 2 years
How do you control the order in which PyYAML outputs key/value pairs when serializing a Python dictionary?
I'm using YAML as a simple serialization format in a Python script. My YAML-serialized objects represent a sort of "document", so for maximum user-friendliness, I'd like my object's "name" field to appear first in the file. Of course, since the value returned by my object's __getstate__ is a dictionary, and Python dictionaries are unordered, the "name" field will be serialized to a random location in the output, e.g.:
>>> import yaml
>>> class Document(object):
...     def __init__(self, name):
...         self.name = name
...         self.otherstuff = 'blah'
...     def __getstate__(self):
...         return self.__dict__.copy()
...
>>> doc = Document('obj-20111227')
>>> print yaml.dump(doc, indent=4)
!!python/object:__main__.Document
otherstuff: blah
name: obj-20111227
-
Cerin over 12 years
Like my post says, I know Python dictionaries are unordered. Unfortunately, there's a big difference in YAML readability between a dictionary and a list of tuples, so this won't work in my case.
-
Mattwmaster58 about 4 years
Python dictionaries are ordered as of 3.6
-
Voxel Minds almost 4 years
This answer is what I was looking for. If you set sort_keys to False, PyYAML will respect your dictionary ordering.
yaml.dump(data, file, sort_keys=False)
-
Ainz Titor over 3 years
Thank you so much, it's so great to know that such a simple option exists in the latest version. Just made my day!