How to convert string int JSON into real int with json.loads


Solution 1

As we established in the comments, nothing built in does this for you. I also read through the JSONDecoder documentation and some examples, and it does not appear to support what you want without processing the data twice.
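For context, here is a minimal sketch of why the parse_int argument from the question does not help: parse_int is only applied to unquoted JSON integer literals, never to quoted strings. The helper name tracking_int is made up for this illustration.

```python
import json

# Collect every piece of text that json hands to parse_int.
calls = []

def tracking_int(text):
    # Hypothetical helper: record the raw text, then parse it as usual.
    calls.append(text)
    return int(text)

doc = '{"quoted": "42", "literal": 42}'
result = json.loads(doc, parse_int=tracking_int)

print(result)  # {'quoted': '42', 'literal': 42}
print(calls)   # ['42'] (only the unquoted literal reached parse_int)
```

The quoted "42" is a JSON string, so the decoder returns it as a str and never consults parse_int.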

The best option, then, is something like this:

import json

class Decoder(json.JSONDecoder):
    def decode(self, s):
        # Python 2.x: result = super(Decoder, self).decode(s)
        result = super().decode(s)
        return self._decode(result)

    def _decode(self, o):
        # Python 2.x: test against basestring instead of str,
        # because json.loads returns unicode strings there
        if isinstance(o, str):
            try:
                return int(o)
            except ValueError:
                return o
        elif isinstance(o, dict):
            return {k: self._decode(v) for k, v in o.items()}
        elif isinstance(o, list):
            return [self._decode(v) for v in o]
        else:
            return o

This has the downside of processing the JSON object twice: once in the super().decode(s) call, and again when recursing through the entire structure to convert the values. Also note that it converts every string that int() accepts, not only the ones you meant as numbers; for example, "007" becomes 7. Be sure to account for this.

To use it, pass the class to json.loads via cls, e.g.:

>>> c = '{"value": "42"}'
>>> json.loads(c, cls=Decoder)
{'value': 42}
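If converting everything that int() accepts is too broad, one option is to convert only strings in a canonical integer form. The following is a sketch under that assumption; the pattern CANONICAL_INT and the helper names are hypothetical, and which strings count as "integers" is a choice you would have to make for your own data.

```python
import json
import re

# Hypothetical rule: optional minus sign, then "0" or digits with no leading zero.
CANONICAL_INT = re.compile(r"-?(0|[1-9][0-9]*)")

def to_int_if_canonical(value):
    """Return int(value) for canonical integer strings, else value unchanged."""
    if isinstance(value, str) and CANONICAL_INT.fullmatch(value):
        return int(value)
    return value

def convert(o):
    # Recurse through dicts and lists; convert leaf strings where allowed.
    if isinstance(o, dict):
        return {k: convert(v) for k, v in o.items()}
    if isinstance(o, list):
        return [convert(v) for v in o]
    return to_int_if_canonical(o)

raw = '{"count": "42", "zip": "02134", "padded": " 7 ", "neg": "-5"}'
data = convert(json.loads(raw))
print(data)  # {'count': 42, 'zip': '02134', 'padded': ' 7 ', 'neg': -5}
```

Plain int() would have turned "02134" into 2134 and " 7 " into 7, which is why the stricter check can matter for identifiers such as ZIP codes.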

Solution 2

In addition to Pierce's answer, I think you can use the object_hook parameter of json.loads instead of cls, so you don't need a second pass after decoding.

For example:

import json

def _decode(o):
    # Python 2: test against basestring instead of str,
    # because json.loads returns unicode strings there
    if isinstance(o, str):
        try:
            return int(o)
        except ValueError:
            return o
    elif isinstance(o, dict):
        return {k: _decode(v) for k, v in o.items()}
    elif isinstance(o, list):
        return [_decode(v) for v in o]
    else:
        return o

# Then you can do:
c = '{"value": "42"}'
json.loads(c, object_hook=_decode)

As @ZhanwenChen pointed out in a comment, an extra isinstance(o, unicode) check is only meaningful on Python 2. On Python 3 the name unicode does not exist, so the check must be left out, as in the code above.
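One detail worth knowing about object_hook: as far as I understand the json module, the hook is called once for every decoded JSON object, innermost first, and only for objects. The sketch below records the calls to show this; hooked_dicts and recording_hook are names invented for the illustration.

```python
import json

def hooked_dicts(text):
    """Decode text and return (result, list of dicts passed to object_hook)."""
    seen = []

    def recording_hook(d):
        # Record each dict as it is handed to the hook, then return it unchanged.
        seen.append(d)
        return d

    result = json.loads(text, object_hook=recording_hook)
    return result, seen

result, seen = hooked_dicts('{"outer": {"inner": "1"}}')
print(seen)  # [{'inner': '1'}, {'outer': {'inner': '1'}}]

# A top-level array of strings contains no JSON object, so the hook never runs.
result, seen = hooked_dicts('["1", "2"]')
print(result)  # ['1', '2']
print(seen)    # []
```

Two consequences follow for a recursive _decode used as a hook: nested dicts are visited again when the enclosing dict is processed (harmless, but not a strict single pass), and strings in a document whose top level is an array are left unconverted.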

Solution 3

For my solution I used object_hook, which is useful when you have nested JSON:

>>> import json
>>> json_data = '{"1": "one", "2": {"-3": "minus three", "4": "four"}}'
>>> py_dict = json.loads(json_data, object_hook=lambda d: {int(k) if k.lstrip('-').isdigit() else k: v for k, v in d.items()})

>>> py_dict
{1: 'one', 2: {-3: 'minus three', 4: 'four'}}

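This converts only the JSON keys to int. To convert values as well, apply the same test to them: int(v) if isinstance(v, str) and v.lstrip('-').isdigit() else v. The isinstance check is needed because values can also be dicts, lists, or numbers, which have no lstrip method.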
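A possible variant that handles keys and values together is sketched below. It swaps the lstrip('-').isdigit() test for a try/except around int(), because a string such as "--5" passes that test and then makes int() raise ValueError. The helper names as_int and int_keys_and_values are made up for this example.

```python
import json

def as_int(text):
    """Return int(text) if text is a str that int() accepts, else text unchanged."""
    if not isinstance(text, str):
        return text
    try:
        return int(text)
    except ValueError:
        return text

def int_keys_and_values(d):
    # Called by json.loads for each JSON object; nested objects arrive already converted.
    return {as_int(k): as_int(v) for k, v in d.items()}

json_data = '{"1": "one", "2": {"-3": "30", "4": "four"}, "--5": "x"}'
print(json.loads(json_data, object_hook=int_keys_and_values))
# {1: 'one', 2: {-3: 30, 4: 'four'}, '--5': 'x'}
```

Strings inside lists are not touched by this sketch, since the hook only looks at the direct keys and values of each object.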

Author: Léo

Updated on July 16, 2022

Comments

  • Léo, almost 2 years ago

    I'm trying to convert a string that represents a JSON object into a real object using json.loads, but it doesn't convert the integers:

    (in the initial string, integers are always strings)

    $> python
    Python 2.7.9 (default, Aug 29 2016, 16:00:38)
    [GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import json
    >>> c = '{"value": "42"}'
    >>> json_object = json.loads(c, parse_int=int)
    >>> json_object
    {u'value': u'42'}
    >>> json_object['value']
    u'42'
    >>>
    

    Instead of {u'value': u'42'}, I'd like to get {u'value': 42}. I know I can walk the whole object myself, but I'd rather not: doing it manually seems inefficient when the parse_int argument exists (https://docs.python.org/2/library/json.html#json.loads).

    Thanks to Pierce's suggestion, this works:

    Python 2.7.9 (default, Aug 29 2016, 16:00:38)
    [GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import json
    >>>
    >>> class Decoder(json.JSONDecoder):
    ...     def decode(self, s):
    ...         result = super(Decoder, self).decode(s)
    ...         return self._decode(result)
    ...     def _decode(self, o):
    ...         if isinstance(o, str) or isinstance(o, unicode):
    ...             try:
    ...                 return int(o)
    ...             except ValueError:
    ...                 try:
    ...                     return float(o)
    ...                 except ValueError:
    ...                     return o
    ...         elif isinstance(o, dict):
    ...             return {k: self._decode(v) for k, v in o.items()}
    ...         elif isinstance(o, list):
    ...             return [self._decode(v) for v in o]
    ...         else:
    ...             return o
    ...
    >>>
    >>> c = '{"value": "42", "test": "lolol", "abc": "43.4",  "dcf": 12, "xdf": 12.4}'
    >>> json.loads(c, cls=Decoder)
    {u'test': u'lolol', u'dcf': 12, u'abc': 43.4, u'value': 42, u'xdf': 12.4}
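One caveat with the float fallback used above: float() accepts more than plain decimals. The sketch below isolates the same int-then-float conversion in a hypothetical helper, int_or_float_or_str, to show which strings it would change.

```python
def int_or_float_or_str(text):
    """Try int(), then float(); return the original string if both fail."""
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            return text

for s in ["43.4", "1e3", "inf", "nan", "lolol"]:
    print(repr(s), "->", repr(int_or_float_or_str(s)))
# '43.4' -> 43.4
# '1e3' -> 1000.0
# 'inf' -> inf
# 'nan' -> nan
# 'lolol' -> 'lolol'
```

So a text field that happens to contain "inf" or "nan" would silently become a float, which may or may not be what you want.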