How to convert a boto3 Dynamo DB item to a regular dictionary in Python?

Solution 1

To solve this, it helps to recognize that boto3 has two basic modes of operation: one that uses the low-level Client API, and one that uses higher-level abstractions like Table. The data structure shown in the question is an example of what the low-level API consumes and produces; the AWS CLI and the DynamoDB web service use the same format.

To answer your question: if you can work exclusively with high-level abstractions like Table when using boto3, things will be considerably easier, as the comments suggest. You sidestep the whole problem, because Python types are marshaled to and from the low-level data format for you.

However, there are times when it's not possible to use those high-level constructs exclusively. I ran into this problem when dealing with DynamoDB streams attached to Lambdas. The inputs to the Lambda are always in the low-level format, and that format is harder to work with, in my opinion.

After some digging I found that boto3 itself has some nifty features tucked away for doing conversions. These features are used implicitly in all of the internal conversions mentioned previously. To use them directly, import the TypeDeserializer/TypeSerializer classes and combine them with dict comprehensions like so:

import boto3

low_level_data = {
  "ACTIVE": {
    "BOOL": True
  },
  "CRC": {
    "N": "-1600155180"
  },
  "ID": {
    "S": "bewfv43843b"
  },
  "params": {
    "M": {
      "customer": {
        "S": "TEST"
      },
      "index": {
        "N": "1"
      }
    }
  },
  "THIS_STATUS": {
    "N": "10"
  },
  "TYPE": {
    "N": "22"
  }
}

# Import the classes explicitly. This avoids creating a resource (which may
# need a configured AWS region) just to make boto3.dynamodb available.
from boto3.dynamodb.types import TypeDeserializer, TypeSerializer

# To go from low-level format to Python
deserializer = TypeDeserializer()
python_data = {k: deserializer.deserialize(v) for k, v in low_level_data.items()}

# To go from Python to low-level format
serializer = TypeSerializer()
low_level_copy = {k: serializer.serialize(v) for k, v in python_data.items()}

assert low_level_data == low_level_copy

Solution 2

You can use the TypeDeserializer class:

from boto3.dynamodb.types import TypeDeserializer
deserializer = TypeDeserializer()

document = { "ACTIVE": { "BOOL": True }, "CRC": { "N": "-1600155180" }, "ID": { "S": "bewfv43843b" }, "params": { "M": { "customer": { "S": "TEST" }, "index": { "N": "1" } } }, "THIS_STATUS": { "N": "10" }, "TYPE": { "N": "22" } }
deserialized_document = {k: deserializer.deserialize(v) for k, v in document.items()}
print(deserialized_document)

Solution 3

There is a Python package called "dynamodb-json" that can help you achieve this. Its util mirrors the loads and dumps functions of the json module. I prefer it because it converts Decimal objects automatically.

You can find examples and how to install it by following this link - https://pypi.org/project/dynamodb-json/
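
If adding a dependency is not an option, Decimal values can also be converted after deserializing with plain boto3. The sketch below uses only the standard library; `replace_decimals` is a hypothetical helper name, and converting non-integer values to float is an assumption that trades precision for convenience.

```python
from decimal import Decimal


def replace_decimals(value):
    """Recursively replace Decimal values with int or float (hypothetical helper)."""
    if isinstance(value, dict):
        return {k: replace_decimals(v) for k, v in value.items()}
    if isinstance(value, list):
        return [replace_decimals(v) for v in value]
    if isinstance(value, set):
        return {replace_decimals(v) for v in value}
    if isinstance(value, Decimal):
        # Whole numbers become int; anything else becomes float,
        # which may lose precision for very long fractions.
        if value == value.to_integral_value():
            return int(value)
        return float(value)
    return value


deserialized = {"Message": "New item!", "Id": Decimal("101"), "price": Decimal("9.5")}
cleaned = replace_decimals(deserialized)
```

The result can then be passed to `json.dumps` without a custom encoder.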

Author by

manelmc

Updated on June 29, 2021

Comments

  • manelmc
    manelmc almost 3 years

    In Python, when an item is retrieved from DynamoDB using boto3, a structure like the following is obtained.

    {
      "ACTIVE": {
        "BOOL": true
      },
      "CRC": {
        "N": "-1600155180"
      },
      "ID": {
        "S": "bewfv43843b"
      },
      "params": {
        "M": {
          "customer": {
            "S": "TEST"
          },
          "index": {
            "N": "1"
          }
        }
      },
      "THIS_STATUS": {
        "N": "10"
      },
      "TYPE": {
        "N": "22"
      }
    }
    

    Also when inserting or scanning, dictionaries have to be converted in this fashion. I haven't been able to find a wrapper that takes care of such conversion. Since apparently boto3 does not support this, are there better alternatives than implementing code for it?

  • umbreonben
    umbreonben over 5 years
    Or much better, and Python2 compatible: python_data = deserializer.deserialize({'M':low_level_data})
  • Eric Platon
    Eric Platon over 5 years
    Note with boto3==1.9.79, I had to import the deserializer a different way: from boto3.dynamodb.types import TypeDeserializer. The module source code shows the deserializer is not exposed (anymore?) as @killthrush originally explained.
  • killthrush
    killthrush over 5 years
    Hmm... I tried a clean virtualenv with both 1.9.79 and 1.9.82 in the REPL and was not able to reproduce what @Eric Platon describes. The original code seemed to work for me both times. Are you doing something different?
  • amaurs
    amaurs over 4 years
    This works like a charm! I was trying to copy a DynamoDB table to another one and I had to use the low-level API plus the high-level one to do the batch writing. This saved me. Thanks!
  • Arvind
    Arvind over 4 years
    You sir are a life saver. This saved me from reimplementing the wheel.
  • Pierre-Francoys Brousseau
    Pierre-Francoys Brousseau about 4 years
    Note that this will not support 'B' (Binary type) if your "low_level" comes from json.loads, due to the data being a utf-8 string when it needs to be base64 bytes. I had to either pre-process and look for 'B', or simply monkey-patch deserializer._deserialize_b to b64decode for this case only.
  • Haktan Suren
    Haktan Suren almost 4 years
    I think this is still not supported for string set :(
  • Thirumal
    Thirumal over 3 years
    {'Message': 'New item!', 'Id': Decimal('101')} The Decimal data type is added to the value. How can I avoid this?
  • killthrush
    killthrush over 3 years
    @Thirumal - Decimals are used automatically to avoid loss of precision between DynamoDB's Number type and python floats. There's been a feature request open for years in boto3: github.com/boto/boto3/issues/369. Maybe one of those workarounds might help you? If you're storing integers, then I agree you really don't need Decimal here.
  • Sơn Lâm
    Sơn Lâm over 2 years
    It works for me. Thank you ;)
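
Regarding the comment above about the Binary type: when the low-level data comes from json.loads, binary values arrive as base64 text and can be decoded before deserialization. The following is a rough standard-library sketch of that pre-processing step; `decode_binary` is a hypothetical helper name, and it assumes every attribute is a dict with exactly one type key.

```python
import base64
import json


def decode_binary(attr):
    """Recursively decode base64 text under 'B' and 'BS' into bytes."""
    # Each low-level attribute is assumed to be a single {type: value} pair.
    ((type_key, value),) = attr.items()
    if type_key == "B":
        return {"B": base64.b64decode(value)}
    if type_key == "BS":
        return {"BS": [base64.b64decode(v) for v in value]}
    if type_key == "M":
        return {"M": {k: decode_binary(v) for k, v in value.items()}}
    if type_key == "L":
        return {"L": [decode_binary(v) for v in value]}
    return attr


raw = (
    '{"ID": {"S": "x"}, "blob": {"B": "aGVsbG8="}, '
    '"nested": {"M": {"inner": {"B": "d29ybGQ="}}}}'
)
item = {k: decode_binary(v) for k, v in json.loads(raw).items()}
```

After this step, each value in `item` can be handed to TypeDeserializer as in the answers above.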