How to import large JSON datasets to Cloud Firestore periodically?
You appear to have hit Cloud Firestore's limit of 500 writes per commit or transaction:
| Limit | Details |
|---|---|
| Maximum API request size | 10 MiB |
| Maximum number of writes that can be passed to a `Commit` operation or performed in a transaction | 500 |
| Maximum number of field transformations that can be performed on a single document in a `Commit` operation or in a transaction | 500 |
A better way to do this is to use batched writes.

A batched write can contain up to 500 operations, and each operation in the batch counts separately towards your Cloud Firestore usage. Within a write operation, field transforms like `serverTimestamp`, `arrayUnion`, and `increment` each count as an additional operation.
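Since each commit is capped at 500 operations, a large dataset has to be split into chunks and committed chunk by chunk. Here is a minimal sketch of that, assuming an initialized Admin SDK `firestore` instance and records that carry an `itemID` field, as in the uploader script below (the `uploadInBatches` name is just for illustration):

```js
// Commit records in chunks of up to 500 using batched writes.
async function uploadInBatches(firestore, collectionName, records) {
  const BATCH_LIMIT = 500; // hard limit per batched write or transaction
  for (let i = 0; i < records.length; i += BATCH_LIMIT) {
    const batch = firestore.batch();
    for (const obj of records.slice(i, i + BATCH_LIMIT)) {
      batch.set(firestore.collection(collectionName).doc(obj.itemID), obj);
    }
    // Each commit counts as up to 500 separate write operations.
    await batch.commit();
    console.log(`Committed ${Math.min(i + BATCH_LIMIT, records.length)} of ${records.length} records`);
  }
}
```

The uploader script below writes documents one at a time with `set()`; for 180k records you would swap those per-document calls for chunked commits like this.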
You can structure your directories something like this:
```
batch_uploader\
    json_files\
        json_1.json
        json_2.json
        json_3.json
        json_4.json
    uploader.js
    ....
```
uploader.js
```js
const admin = require("firebase-admin");
const path = require("path");
const fs = require("fs");

const serviceAccount = require("./service_key.json");

admin.initializeApp({
  credential: admin.credential.cert(serviceAccount),
  databaseURL: "YOUR_PROJECT_LINK"
});

const firestore = admin.firestore();

// Scan json_files/ and upload every record from every JSON file.
const directoryPath = path.join(__dirname, "json_files");

fs.readdir(directoryPath, function (err, files) {
  if (err) {
    return console.log("Unable to scan directory: " + err);
  }
  files.forEach(function (file) {
    const lastDotIndex = file.lastIndexOf(".");
    const menu = require("./json_files/" + file);
    menu.forEach(function (obj) {
      firestore
        // The file name without its extension becomes the collection name.
        .collection(file.substring(0, lastDotIndex))
        .doc(obj.itemID)
        .set(obj)
        .then(function () {
          console.log("Document written");
        })
        .catch(function (error) {
          console.error("Error adding document: ", error);
        });
    });
  });
});
```
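With the service key and JSON files in place, you can run the import once with `node uploader.js` from the `batch_uploader` directory (assuming Node.js and the `firebase-admin` package are installed).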
If you want to run the script periodically, you can schedule a function to run at specified times. To run the script every five minutes, for example, you can do something like this:
```js
const functions = require("firebase-functions");

exports.scheduledFunction = functions.pubsub
  .schedule("every 5 minutes")
  .onRun((context) => {
    console.log("This will be run every 5 minutes!");
    return null;
  });
```
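If you deploy the import itself as a scheduled Cloud Function, you can call the upload from `onRun`. A minimal sketch, assuming the `uploadInBatches` helper from above and a JSON file bundled with the function source (the file path and function name here are hypothetical):

```js
const functions = require("firebase-functions");
const admin = require("firebase-admin");

admin.initializeApp();

// Re-imports the bundled JSON file every five minutes.
exports.scheduledImport = functions.pubsub
  .schedule("every 5 minutes")
  .onRun(async (context) => {
    const records = require("./json_files/json_1.json"); // hypothetical path
    await uploadInBatches(admin.firestore(), "json_1", records);
    return null;
  });
```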
Emir Kutlugün

Updated on December 27, 2022

Comments
- Emir Kutlugün, over 1 year: I am trying to import a JSON file with 180k records. As you can see in the code, I can upload 500 records per run, but I need to upload all 180k records periodically.
What I am trying to achieve:

- Parse the JSON (DONE).
- Create a model from each JSON element (DONE).
- Upload the models to Cloud Firestore (DONE, but only 500 documents at a time).
The factory constructor of the model:
```dart
factory Academician.fromJson(Map<String, dynamic> json) => Academician(
      // Using this to parse data from JSON.
      // rating and reviewList do not exist in the JSON,
      // which is why I need a custom model.
      name: json["name"] ?? "",
      designation: json["designation"] ?? "",
      field: json["field"] ?? "",
      universityName: json["university"] ?? "",
      department: json["department"] ?? "",
      rating: 0,
      reviewList: [],
    );
```
Parsing and uploading to Cloud Firestore:
```dart
parseJsonFromAssets('json/akademik_kadro.json')
    .then((value) => value['Sayfa1'].forEach((value) {
          academicianList.add(Academician.fromJson(value));
        }))
    .then((value) {
      setState(() {
        for (int z = 0; z < 500; z++) {
          // 500 is the single-time upload limit to Cloud Firestore, I guess
          dbOperation.addToCollection(academicianList[z]);
        }
      });
    });
```
- Emir Kutlugün, over 3 years: So, I have to use Cloud Functions for this?
- Andrew, over 3 years: If you want to schedule the import, then yes.