Validate JSON data using Python

Solution 1

If you haven't looked at the jsonschema library yet, it can be useful for validating data. JSON Schema is a way to describe the content of JSON. The library uses that format to validate data against a given schema.

I made a simple example based on its basic usage.

import json
from jsonschema import validate

# Describe what kind of json you expect.
schema = {
    "type" : "object",
    "properties" : {
        "description" : {"type" : "string"},
        "status" : {"type" : "boolean"},
        "value_a" : {"type" : "number"},
        "value_b" : {"type" : "number"},
    },
}

# Convert json to python object.
my_json = json.loads('{"description": "Hello world!", "status": true, "value_a": 1, "value_b": 3.14}')

# validate() raises an exception if the given JSON does not
# match what is described in the schema.
validate(instance=my_json, schema=schema)

# print for debug
print(my_json)
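
Note that the schema above does not mark any field as required, so a document with missing keys still passes. If you also need to check that fields are present and want to handle failures with try/except, a minimal sketch could look like this. The helper name `load_validated` and the reduced schema are assumptions made for illustration, not part of the library:

```python
import json
from jsonschema import validate, ValidationError

schema = {
    "type": "object",
    "properties": {
        "description": {"type": "string"},
        "status": {"type": "boolean"},
    },
    # Without "required", a document missing these keys would still pass.
    "required": ["description", "status"],
}

def load_validated(text):
    """Hypothetical helper: parse text and validate it; return a dict or None."""
    try:
        data = json.loads(text)
        validate(instance=data, schema=schema)
    except json.JSONDecodeError as ex:
        print("Invalid JSON:", ex)
        return None
    except ValidationError as ex:
        print("Schema violation:", ex.message)
        return None
    return data

print(load_validated('{"description": "Hello world!", "status": true}'))
print(load_validated('{"description": "Hello world!"}'))  # "status" is missing
```

Returning None on failure is just one choice; re-raising the exception may suit callers that need to know why validation failed.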

Solution 2

As you're using a JSON file, you can use this example:

import json

def validate(filename):
    """Return the parsed content of filename, or None if it is not valid JSON."""
    with open(filename) as file:
        try:
            data = json.load(file) # put JSON-data to a variable
        except json.decoder.JSONDecodeError:
            print("Invalid JSON") # in case json is invalid
            return None
        else:
            print("Valid JSON") # in case json is valid
            return data
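
The function above only confirms that the file parses as JSON. A rough sketch of going one step further with only the standard library, checking that required fields exist and have the expected types, might look like this. The field names, the `REQUIRED_FIELDS` table and the `parse_and_check` helper are all hypothetical:

```python
import json

# Hypothetical field names and types, chosen only for illustration.
REQUIRED_FIELDS = {"description": str, "status": bool, "value_a": (int, float)}

def parse_and_check(text):
    """Return the parsed dict, or raise ValueError describing the first problem."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as ex:
        raise ValueError(f"invalid JSON: {ex}") from ex
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if is_stray_bool or not isinstance(value, expected):
            raise ValueError(f"wrong type for field: {name}")
    return data

try:
    print(parse_and_check('{"description": "hi", "status": true, "value_a": 1}'))
    parse_and_check('{"description": "hi", "status": true}')  # value_a is missing
except ValueError as ex:
    print("Rejected:", ex)
```

This stops at the first problem it finds; for nested data or more detailed reports, a schema library is likely the better fit.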

Solution 3

I had the same problem and was dissatisfied with the existing solution of using jsonschema. It gives terrible error messages that are not at all user-friendly.

I wrote my own library to define schemas, which has a lot of additional functionality over jsonschema:

https://github.com/FlorianDietz/syntaxTrees

It gives very precise error messages, allows you to write code to customize the validation process, lets you define functions that operate on the validated JSON, and even creates HTML documentation for the schemas you define.

Schemas are defined as classes, similar to how Django defines models:

class MyExampleNode(syntaxTreesBasics.Node):
    field_1 = fields.Float(default=0)
    field_2 = fields.String()
    field_3 = fields.Value('my_example_node', null=True, default=None)
    class Meta:
        name = 'my_example_node'

Solution 4

The jsonschema module is good, but its documentation lacks complex examples. And the library does not report an error for a malformed schema like the one below; it silently ignores the parts it does not recognize!

Here is an example:

import jsonschema
from jsonschema import validate

# "level" lists its fields directly instead of nesting them
# under "properties", so they are never checked.
set_tl_schema = {
    "type": "object",
    "properties": {
        "level": {
            "value": {"type": "number"},
            "updatedAt": {"type": "number"},
        },
    },
}

x = {'level': {'updatedAt': '1970-01-01T00:00:00.000Z', 'value': 1}}

try:
    validate(instance=x, schema=set_tl_schema)
except jsonschema.exceptions.ValidationError as ex:
    print(ex)

The mistake was that level also needs its own properties field. But the validator never tells you that.
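
For comparison, here is a sketch of the same schema with the nesting corrected. The name `fixed_schema` is my own, and the added "type": "object" on level is an assumption about the intended shape. With this version the string in updatedAt should be rejected:

```python
from jsonschema import validate, ValidationError

# Same schema as above, but "level" now declares its own "properties".
fixed_schema = {
    "type": "object",
    "properties": {
        "level": {
            "type": "object",
            "properties": {
                "value": {"type": "number"},
                "updatedAt": {"type": "number"},
            },
        },
    },
}

x = {"level": {"updatedAt": "1970-01-01T00:00:00.000Z", "value": 1}}

try:
    validate(instance=x, schema=fixed_schema)
    error_message = None  # no violation found
except ValidationError as ex:
    error_message = ex.message  # short description of the first violation
    print(error_message)
```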

I found a very efficient and easy-to-use module:

https://pypi.org/project/json-checker/

>>> from json_checker import Checker

>>> current_data = {'first_key': 1, 'second_key': '2'}
>>> expected_schema = {'first_key': int, 'second_key': str}


>>> checker = Checker(expected_schema)
>>> result = checker.validate(current_data)
Author by

LennyDan

Updated on December 08, 2021

Comments

  • LennyDan
    LennyDan over 2 years

    I need to create a function that validates incoming json data and returns a python dict. It should check if all necessary fields are present in a json file and also validate the data types of those fields. I need to use try-catch. Could you provide some kind of snippets or examples that give me answers?

  • Apostolos
    Apostolos almost 4 years
    'JSONDecodeError' is not defined and produces an error itself! The correct name is 'json.decoder.JSONDecodeError' !
  • ATernative
    ATernative almost 4 years
    @Apostolos sorry, it appears that I made this mistake for some reason. Edited, so the answer will be correct for future viewers. However, my answer is not the one that really answers the original question.
  • skoriy
    skoriy about 3 years
    @ATernative, the question was about checking whether the necessary fields are present, rather than whether some string can be parsed as JSON.