How to import JSON to Cosmos DB


Solution 1

I used the DocumentDB Data Migration Tool to import one .json file containing 4 documents, and it worked well for me.

My .json file:

[
    {
        "name":"jay1",
        "age":20
    },
    {
        "name":"jay2",
        "age":20
    },
    {
        "name":"jay3",
        "age":20
    },
    {
        "name":"jay4",
        "age":20
    }
]

And the documents were imported into the collection as separate documents.


The JSON file source importer option allows you to import one or more single document JSON files or JSON files that each contain an array of JSON documents. When adding folders that contain JSON files to import, you have the option of recursively searching for files in subfolders.

You could check whether you followed the above rules, which are stated in the official tutorial, when creating your .json file. (For example, are your documents in an array?)
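One quick way to verify the array rule before running the tool is a small validation script. A minimal Python sketch (the helper name is mine, not part of the tool):

```python
import json

def check_import_file(path):
    """Sanity-check that a .json file holds a top-level array, which
    lets the Data Migration Tool import each element as a separate
    document. Hypothetical helper, not part of the tool."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, list):
        raise TypeError("top-level value must be a JSON array of documents")
    return len(data)  # number of documents the tool would import
```

Running it against the four-document file above should return 4; a file containing a single bare object raises TypeError instead.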

Hope it helps you.


Update Answer:

I tried to import data that follows the format you mentioned in your reply into Cosmos DB.

My updated .json file:

[
    {
        "id":"001",
        "name":"jay1",
        "age":20,
        "array": {
            "a":1,
            "b":2,
            "c":{
                "d":3,
                "e": {
                    "f":4
                }
            }
        }
    },
    {
        "id":"002",
        "name":"jay2",
        "age":20,
        "array": {
            "a":1,
            "b":2,
            "c":{
                "d":3,
                "e": {
                    "f":4
                }
            }
        }
    },
    {
        "id":"003",
        "name":"jay3",
        "age":20,
        "array": {
            "a":1,
            "b":2,
            "c":{
                "d":3,
                "e": {
                    "f":4
                }
            }
        }
    }
]

And the documents were imported into the collection successfully.



Update Answer 2:

I followed the format of the document you offered and created 3 sample documents in my .json file. (I just updated the '_id' property to 'testX'.)

My .json file:

[
    {
        "_id": "test1", 
        "_rev": "3-99aef1458fe1a8f310c83156b9d06a69", 
        "delivery": {
            "application": "EnvTest", 
            "sender": {
                "id": "[email protected]", 
                "type": "user"
            }, 
            "recipients": [
                {
                    "type": "email", 
                    "recipient": "\"Artzer, Daniel J\" <[email protected]>", 
                    "sentTS": "2018-01-30T19:46:31.515Z", 
                    "id": "45281ab0-05f6-11e8-a86a-61a54dcb42aa"
                }
            ]
        }, 
        "payload": {
            "startTS": "2017-11-30T07:38:33-06:00", 
            "equipmentOrLineType": "gathering system", 
            "scheduledIndicator": "upset", 
            "field": "Gulf Plains", 
            "eventDateTime": "2017-11-30T07:31:39-06:00", 
            "numCopies": "1", 
            "region": "South", 
            "facility": "", 
            "equipment": "", 
            "causeForRelease": "", 
            "correctiveAction": "", 
            "hgbUsed": "no-hgb-available", 
            "esdActivated": "no", 
            "submitTS": "2017-11-30T13:38:33.940Z", 
            "lineSegment": [
                {
                    "lineName": "GPL-1", 
                    "Diameter": "12.725", 
                    "Length": "1000", 
                    "duration": "no", 
                    "volume": "3"
                }
            ], 
            "creator": "[email protected]", 
            "savedTS": "2017-11-30T13:38:33.955Z", 
            "trainingDoc": true, 
            "_id": "[email protected]_2017-11-30T13:38:33.955Z", 
            "syncTS": "2017-11-30T13:38:34.794Z"
        }
    },
    {
        "_id": "test2", 
        "_rev": "3-99aef1458fe1a8f310c83156b9d06a69", 
        "delivery": {
            "application": "EnvTest", 
            "sender": {
                "id": "[email protected]", 
                "type": "user"
            }, 
            "recipients": [
                {
                    "type": "email", 
                    "recipient": "\"Artzer, Daniel J\" <[email protected]>", 
                    "sentTS": "2018-01-30T19:46:31.515Z", 
                    "id": "45281ab0-05f6-11e8-a86a-61a54dcb42aa"
                }
            ]
        }, 
        "payload": {
            "startTS": "2017-11-30T07:38:33-06:00", 
            "equipmentOrLineType": "gathering system", 
            "scheduledIndicator": "upset", 
            "field": "Gulf Plains", 
            "eventDateTime": "2017-11-30T07:31:39-06:00", 
            "numCopies": "1", 
            "region": "South", 
            "facility": "", 
            "equipment": "", 
            "causeForRelease": "", 
            "correctiveAction": "", 
            "hgbUsed": "no-hgb-available", 
            "esdActivated": "no", 
            "submitTS": "2017-11-30T13:38:33.940Z", 
            "lineSegment": [
                {
                    "lineName": "GPL-1", 
                    "Diameter": "12.725", 
                    "Length": "1000", 
                    "duration": "no", 
                    "volume": "3"
                }
            ], 
            "creator": "[email protected]", 
            "savedTS": "2017-11-30T13:38:33.955Z", 
            "trainingDoc": true, 
            "_id": "[email protected]_2017-11-30T13:38:33.955Z", 
            "syncTS": "2017-11-30T13:38:34.794Z"
        }
    },
    {
        "_id": "test3", 
        "_rev": "3-99aef1458fe1a8f310c83156b9d06a69", 
        "delivery": {
            "application": "EnvTest", 
            "sender": {
                "id": "[email protected]", 
                "type": "user"
            }, 
            "recipients": [
                {
                    "type": "email", 
                    "recipient": "\"Artzer, Daniel J\" <[email protected]>", 
                    "sentTS": "2018-01-30T19:46:31.515Z", 
                    "id": "45281ab0-05f6-11e8-a86a-61a54dcb42aa"
                }
            ]
        }, 
        "payload": {
            "startTS": "2017-11-30T07:38:33-06:00", 
            "equipmentOrLineType": "gathering system", 
            "scheduledIndicator": "upset", 
            "field": "Gulf Plains", 
            "eventDateTime": "2017-11-30T07:31:39-06:00", 
            "numCopies": "1", 
            "region": "South", 
            "facility": "", 
            "equipment": "", 
            "causeForRelease": "", 
            "correctiveAction": "", 
            "hgbUsed": "no-hgb-available", 
            "esdActivated": "no", 
            "submitTS": "2017-11-30T13:38:33.940Z", 
            "lineSegment": [
                {
                    "lineName": "GPL-1", 
                    "Diameter": "12.725", 
                    "Length": "1000", 
                    "duration": "no", 
                    "volume": "3"
                }
            ], 
            "creator": "[email protected]", 
            "savedTS": "2017-11-30T13:38:33.955Z", 
            "trainingDoc": true, 
            "_id": "[email protected]_2017-11-30T13:38:33.955Z", 
            "syncTS": "2017-11-30T13:38:34.794Z"
        }
    }
]

The documents were imported into the db and no error occurred.


If there are any differences between your setup and mine, please let me know.
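If your export produced one document per file rather than a single array, a short script can wrap them into the array layout the tool expects. A minimal Python sketch with a hypothetical helper:

```python
import json

def wrap_documents(doc_paths, out_path):
    """Combine single-document .json files into one file holding a
    JSON array, the layout the Data Migration Tool imports as
    separate documents. Hypothetical helper script."""
    docs = []
    for path in doc_paths:
        with open(path, encoding="utf-8") as f:
            docs.append(json.load(f))
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(docs, f, indent=4)
    return len(docs)  # number of documents written into the array
```

Point the tool at the resulting file and each element of the array is imported as its own document.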

Solution 2

The other answer is from about a year ago; it uses dtui.exe.

dtui.exe is a GUI tool
dt.exe is a command-line tool

Both can be downloaded as of today from https://www.microsoft.com/en-us/download/details.aspx?id=46436

The solution below uses dt.exe, which automates the process so it can be repeated reliably in the future.

    "C:\path-to-your-dt-exe\dt.exe" ^
        /s:JsonFile ^
        /s.Files:"C:\path-to-your-json-files-of-one-collection-only\*.json" ^
        /t:DocumentDB ^
        /t.ConnectionString:"AccountEndpoint=https://localhost:8081/;AccountKey=<big account key here>;Database=<your db name here>" ^
        /t.Collection:<your collection name here> ^
        /t.PartitionKey:<your partition key here> ^
        /t.CollectionThroughput:2500

Note
- the database (<your db name here>) and collection (<your collection name here>) above must already exist
- replace the <...> placeholders with your own values, and run in cmd
- /s is the source and /t is the target; here the source is JSON files, and the target is the local DocumentDB (Cosmos DB) emulator
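To make the import repeatable from a script, the argument list above can be assembled and launched from Python. A minimal sketch that keeps the same placeholder values; the wrapper function is mine, the flags are the ones shown in the command:

```python
import subprocess

def build_dt_args(dt_path, json_glob, conn_str, collection,
                  partition_key, throughput=2500):
    """Assemble the dt.exe argument list so the import can be
    re-run from a script. Hypothetical wrapper around the
    command shown above."""
    return [
        dt_path,
        "/s:JsonFile",
        f"/s.Files:{json_glob}",
        "/t:DocumentDB",
        f"/t.ConnectionString:{conn_str}",
        f"/t.Collection:{collection}",
        f"/t.PartitionKey:{partition_key}",
        f"/t.CollectionThroughput:{throughput}",
    ]

args = build_dt_args(
    r"C:\path-to-your-dt-exe\dt.exe",
    r"C:\path-to-your-json-files-of-one-collection-only\*.json",
    "AccountEndpoint=https://localhost:8081/;AccountKey=<big account key here>;Database=<your db name here>",
    "<your collection name here>",
    "<your partition key here>",
)
# subprocess.run(args, check=True)  # uncomment on a machine with dt.exe installed
```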

Hope that helps.


Author by
CarCrazyBen

I have extensive SQL experience / knowledge, primarily in Oracle and less so in Teradata & SQL Server. My latest project involves NoSQL databases, JSON documents, and (attempting) to identify and configure relationships between those and legacy RDBMS systems.

Updated on June 04, 2022

Comments

  • CarCrazyBen, almost 2 years ago

    I have an existing CouchDB instance with several databases. Using Postman, I exported 10 documents to a .json file. Using DocumentDB Data Migration Tool I was able to import the contents of this file to Cosmos.

    However, it seems I have effectively imported a single document. Azure Data Explorer indicates I have one single document in my "collection".

    In CouchDB, the highest-level Identifier is doc._id. I have tried specifying "doc._id", "id", and "_id" as my ID Field, all to no avail. What am I doing wrong? How do I import each JSON document separately?


    Sample document:

    {
        "_id": "3147cb0e74449e1c28c6ded2b4a3fa45e0d65481-bd_RXMARTINEZ@emailaddress.com_2017-11-30T13:38:33.955Z", 
        "_rev": "3-99aef1458fe1a8f310c83156b9d06a69", 
        "delivery": {
            "application": "EnvTest", 
            "sender": {
                "id": "[email protected]", 
                "type": "user"
            }, 
            "recipients": [
                {
                    "type": "email", 
                    "recipient": "\"Artzer, Daniel J\" <[email protected]>", 
                    "sentTS": "2018-01-30T19:46:31.515Z", 
                    "id": "45281ab0-05f6-11e8-a86a-61a54dcb42aa"
                }
            ]
        }, 
        "payload": {
            "startTS": "2017-11-30T07:38:33-06:00", 
            "equipmentOrLineType": "gathering system", 
            "scheduledIndicator": "upset", 
            "field": "Gulf Plains", 
            "eventDateTime": "2017-11-30T07:31:39-06:00", 
            "numCopies": "1", 
            "region": "South", 
            "facility": "", 
            "equipment": "", 
            "causeForRelease": "", 
            "correctiveAction": "", 
            "hgbUsed": "no-hgb-available", 
            "esdActivated": "no", 
            "submitTS": "2017-11-30T13:38:33.940Z", 
            "lineSegment": [
                {
                    "lineName": "GPL-1", 
                    "Diameter": "12.725", 
                    "Length": "1000", 
                    "duration": "no", 
                    "volume": "3"
                }
            ], 
            "creator": "[email protected]", 
            "savedTS": "2017-11-30T13:38:33.955Z", 
            "trainingDoc": true, 
            "_id": "[email protected]_2017-11-30T13:38:33.955Z", 
            "syncTS": "2017-11-30T13:38:34.794Z"
        }
    }
    
  • CarCrazyBen, over 6 years ago
thank you. Perhaps my problem is that my JSON document is more complex than is permitted. I saved a file of multiple JSON documents, each of which contains 3 nested objects. Perhaps this format is not compatible in some way? I can provide a sample if that would be helpful.
  • Jay Gong, over 6 years ago
@CarCrazyBen It would be helpful if you offered a sample document you want to import. I'll test it on my side via the tool.
  • CarCrazyBen, over 6 years ago
here is a sample document. This may be a mess because I have to send it in 2 parts: { "_id": "3147cb0e74449e1c28c6ded2b4a3fa45e0d65481-bd_RXMARTINEZ@emailaddress.com_2017-11-30T13:38:33.955Z", "_rev": "3-99aef1458fe1a8f310c83156b9d06a69", "delivery": { "application": "EnvTest", "sender": {"id": "[email protected]","type": "user"}, "recipients": [ {"type": "email","recipient": "\"Artzer, Daniel J\" <[email protected]>", "sentTS": "2018-01-30T19:46:31.515Z","id": "45281ab0-05f6-11e8-a86a-61a54dcb42aa"}, ] },
  • CarCrazyBen, over 6 years ago
    "payload": { "startTS": "2017-11-30T07:38:33-06:00", "equipmentOrLineType": "gathering system", "scheduledIndicator": "upset", "field": "Gulf Plains", "eventDateTime": "2017-11-30T07:31:39-06:00", "numCopies": "1", "region": "South", "facility": "", "equipment": "", "causeForRelease": "", "correctiveAction": "", "hgbUsed": "no-hgb-available", "esdActivated": "no", "submitTS": "2017-11-30T13:38:33.940Z",
  • CarCrazyBen, over 6 years ago
    "lineSegment": [ {"lineName": "GPL-1", "Diameter": "12.725", "Length": "1000", "duration": "no", "volume": "3"} ], "creator": "[email protected]", "savedTS": "2017-11-30T13:38:33.955Z", "trainingDoc": true, "_id": "[email protected]_2017-11-30T13:38:33.955Z", "syncTS": "2017-11-30T13:38:34.794Z" } } Correction, 3 parts.
  • CarCrazyBen, over 6 years ago
    please see sample document I provided in Comments. Thank you for your help with this!
  • Jay Gong, over 6 years ago
@CarCrazyBen Well, I will try to put the pieces together. In fact, you could post the sample document in your question instead of in the comments; it would be clearer.
  • Jay Gong, over 6 years ago
@CarCrazyBen Hi, please see my Update Answer 2. If you have any concerns, just let me know. BTW, I already added your sample document to your question for you; it will be clearer.
  • CarCrazyBen, over 6 years ago
    I think I see the difference now. You enclosed the documents list in square brackets, [], while I did not. I will re-try the Data Migration Tool. Thanks again for your help!
  • Jay Gong, over 6 years ago
@CarCrazyBen Yes, you need to put multiple documents into an array [] if you want to import them. That's the key! If it works for you, you could mark this answer for others' reference on the forum. Thx.