pymongo TypeError: document must be an instance of dict, bson.son.SON, bson.raw_bson.RawBSONDocument

Check out the bulk insert example on MongoDB's website. Skip the json.dumps call, which turns your list of documents into a single JSON-formatted string, and pass odbcArray to insert_many directly:

mongoImp = dbo.insert_many(odbcArray)
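
To see why the string fails, here is a minimal sketch that uses only the standard library. It imitates the kind of check named in the traceback (an instance test against MutableMapping). The helper is_document and the sample rows are hypothetical, not pymongo's own code, and the snippet assumes Python 3.

```python
import json
from collections import OrderedDict
from collections.abc import MutableMapping


def is_document(obj):
    """Hypothetical stand-in for the per-document mapping check."""
    return isinstance(obj, MutableMapping)


# Made-up rows standing in for the pyodbc results.
odbcArray = [
    OrderedDict([("id", 1), ("title", "Mr.")]),
    OrderedDict([("id", 2), ("title", "Ms.")]),
]

# json.dumps returns one str, not a list of documents.
jArray = json.dumps(odbcArray)

# Iterating over a str yields single characters, so the first
# "document" seen by a per-item check is the character "[".
first_from_string = next(iter(jArray))

string_ok = all(is_document(item) for item in jArray)
list_ok = all(is_document(item) for item in odbcArray)

print(type(jArray).__name__, repr(first_from_string), string_ok)
print(type(odbcArray).__name__, list_ok)
```

The list of OrderedDict objects passes the check item by item, while the JSON string does not, which is consistent with the TypeError being raised for the very first element.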
Author: N Raghu

Updated on March 23, 2020

Comments

  • N Raghu
    N Raghu about 4 years

    I was trying to migrate data from SQL Server to MongoDB, but I got the TypeError below in the last phase, while importing the data into MongoDB.

    mongoImp = dbo.insert_many(jArray)
      File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/collection.py", line 710, in insert_many
        blk.ops = [doc for doc in gen()]
      File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/collection.py", line 702, in gen
        common.validate_is_document_type("document", document)
      File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/common.py", line 407, in validate_is_document_type
        "collections.MutableMapping" % (option,))
    TypeError: document must be an instance of dict, bson.son.SON, bson.raw_bson.RawBSONDocument, or a type that inherits from collections.MutableMapping
    

    I also checked type(jArray), which is str. I tried converting it to a list as well, but that did not work either.
    My Code:

    import pyodbc
    import json
    import collections
    import pymongo
    from bson import json_util
    
    odbcArray = []
    mongoConStr = '192.168.10.107:36006'
    sqlConStr = 'DRIVER={MSSQL-NC1311};SERVER=tcp:192.168.10.103,57967;DATABASE=AdventureWorks;UID=testuser;PWD=testuser'
    mongoConnect = pymongo.MongoClient(mongoConStr)
    sqlConnect = pyodbc.connect(sqlConStr)
    
    dbo = mongoConnect.eaedw.sqlData
    dbDocs = dbo.find()
    sqlCur = sqlConnect.cursor()
    sqlCur.execute("""
                SELECT TOP 2 BusinessEntityID,Title, Demographics, rowguid, ModifiedDate
                FROM Person.Person
                """)
    
    tuples = sqlCur.fetchall()
    
    for tuple in tuples:
        doc = collections.OrderedDict()
        doc['id'] = tuple.BusinessEntityID
        doc['title'] = tuple.Title
        doc['dgrap'] = tuple.Demographics
        doc['rowi'] = tuple.rowguid
        doc['mtime'] = tuple.ModifiedDate
        odbcArray.append(doc)
    
    jArray = json.dumps(odbcArray, default=json_util.default)
    mongoImp = dbo.insert_many(jArray)
    
    mongoConnect.close()
    sqlConnect.close()
    
  • N Raghu
    N Raghu about 7 years
    Actually, since odbcArray is not in JSON format, I thought it would not work, so I explicitly converted it into a JSON array. Anyway, thanks for the help; it works like a charm.
  • NealWalters
    NealWalters over 3 years
    I had a similar issue where I wrote a big JSON to S3, then read it back in later and sent it to Mongo after doing "dict_obj = json.loads(str_value)". I forgot that it contained multiple documents, so I just changed insert_one to insert_many to fix the issue.
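
The situation in NealWalters's comment, where json.loads may return either one document or several, could be handled roughly as follows. This is only a sketch: insert_parsed and FakeCollection are hypothetical names introduced here so the snippet runs without a MongoDB server. With a real pymongo collection, the same insert_one and insert_many methods would be called.

```python
import json


def insert_parsed(collection, parsed):
    """Hypothetical helper: choose insert_one or insert_many by parsed type."""
    if isinstance(parsed, list):
        # A JSON array parses to a list: several documents.
        return collection.insert_many(parsed)
    if isinstance(parsed, dict):
        # A JSON object parses to a dict: a single document.
        return collection.insert_one(parsed)
    raise TypeError(
        "expected a dict or a list of dicts, got %s" % type(parsed).__name__
    )


class FakeCollection:
    """Stand-in for a pymongo collection; it only records the calls made."""

    def __init__(self):
        self.calls = []

    def insert_one(self, document):
        self.calls.append(("insert_one", document))

    def insert_many(self, documents):
        self.calls.append(("insert_many", documents))


fake = FakeCollection()
insert_parsed(fake, json.loads('{"id": 1}'))
insert_parsed(fake, json.loads('[{"id": 1}, {"id": 2}]'))
print([name for name, _ in fake.calls])
```

Passing the raw JSON string itself raises the helper's TypeError instead of reaching the driver, which makes the mistake from the original question easier to spot.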