Spring Data MongoRepository save(T) not working... sometimes
Solution 1
Problem solved.
A different async call from the JS client, hitting a different endpoint in the Java backend, was overwriting my updated document from another thread with the original values.
Both update operations called findById before saving. The problem was that they did so at the same time, so both retrieved the same original values. Each then updated its own fields and called save at the end, with the result that each thread effectively overwrote the other's changes.
Each call was logged with only its relevant modified fields, so I didn't realize that one of them was overwriting the changes of the other.
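The lost-update pattern itself can be reproduced without MongoDB at all. The sketch below (plain Java, with a Map standing in for the stored document; all names are hypothetical) shows two find-modify-save sequences interleaving so that the second full-document save silently discards the first one's change:

```java
import java.util.HashMap;
import java.util.Map;

public class LostUpdateDemo {

    /** Simulates two endpoints doing find -> modify -> save on the same document. */
    static Map<String, Object> race() {
        // The stored document, as a whole-document store like save() sees it.
        Map<String, Object> db = new HashMap<>();
        db.put("currentPage", 1);
        db.put("downloaded", false);

        // Both endpoints call findById at the same time: same original snapshot.
        Map<String, Object> snapshotA = new HashMap<>(db);
        Map<String, Object> snapshotB = new HashMap<>(db);

        // Endpoint A updates its field and saves the WHOLE document.
        snapshotA.put("currentPage", 2);
        db.clear();
        db.putAll(snapshotA);

        // Endpoint B, still holding the stale snapshot, saves ~500 ms later:
        // its full-document write restores the stale currentPage value.
        snapshotB.put("downloaded", true);
        db.clear();
        db.putAll(snapshotB);

        return db;
    }

    public static void main(String[] args) {
        // A's currentPage=2 update has been lost.
        System.out.println(race());
    }
}
```

The same interleaving happens whenever two read-modify-write cycles overlap, which is exactly what the two endpoints were doing.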
Once I set systemLog.verbosity: 3 in MongoDB's config file so that it logged all operations, it was clear that two different WRITE operations were happening almost simultaneously (~500 ms apart) but with different values.
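For reference, the relevant mongod configuration is roughly the following (YAML config format; the log path is a placeholder for whatever your setup uses):

```yaml
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log   # placeholder path
  verbosity: 3                        # log all operations, not just slow ones
```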
Then it was just a matter of moving the findById closer to the save and ensuring that the JS calls were made in order (by making one of the promises depend on the other).
In hindsight, this probably wouldn't have happened if I had used MongoOperations or MongoTemplate, which offer single update and findAndModify methods that also allow single-field operations, instead of MongoRepository, where I'm forced to do it in 3 steps (find, modify the returned entity, save) and to work with the full document.
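For comparison, the findAndModify route would look roughly like this (a sketch only, not the code I actually shipped; it assumes the same BookData entity, an injected mongoTemplate, and the currentPage field as an example):

```java
import org.springframework.data.mongodb.core.FindAndModifyOptions;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;

/** Sketch: atomically update one field and get the modified document back. */
public class BookDataFindAndModifySketch {

    private final MongoTemplate mongoTemplate;

    public BookDataFindAndModifySketch(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    public BookData setCurrentPage(String id, Integer currentPage) {
        Query query = new Query(Criteria.where("_id").is(id));
        Update update = new Update().set("currentPage", currentPage);
        // returnNew(true) makes findAndModify return the post-update document.
        return mongoTemplate.findAndModify(
                query, update, new FindAndModifyOptions().returnNew(true), BookData.class);
    }
}
```

Because the read and the write happen in a single server-side operation, there is no window for another thread to sneak in between them.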
EDIT: I didn't really like my first "move findById closer to save" approach, so in the end I did what felt right and implemented custom save methods that use MongoTemplate's finer-grained update API. Final code:
/* BookDataRepository.java */
/* MongoRepository provides entity-based default Spring Data methods */
/* BookDataRepositoryCustom provides field-level update methods */
public interface BookDataRepository extends MongoRepository<BookData, String>, BookDataRepositoryCustom {
    BookData findById(String id);
}

/* BookDataRepositoryCustom.java */
/* Interface for the custom methods */
public interface BookDataRepositoryCustom {
    int saveCurrentPage(String id, Integer currentPage);
}

/* BookDataRepositoryImpl.java */
/* Custom implementation using MongoTemplate. */
import javax.inject.Inject;

import com.mongodb.WriteResult;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;

@SuppressWarnings("unused")
public class BookDataRepositoryImpl implements BookDataRepositoryCustom {

    @Inject
    MongoTemplate mongoTemplate;

    @Override
    public int saveCurrentPage(String id, Integer currentPage) {
        Query query = new Query(Criteria.where("_id").is(id));
        Update update = new Update();
        update.set("currentPage", currentPage);
        WriteResult result = mongoTemplate.updateFirst(query, update, BookData.class);
        return result == null ? 0 : result.getN();
    }
}
// Old code: get entity from DB, update, save. 3 steps with plenty of room for interference.
// BookData bookData = bookDataRepository.findById(bookDataId);
// bookData.setCurrentPage(currentPage);
// bookDataRepository.save(bookData);
// New code: update single field. 1 step, 0 problems.
bookDataRepository.saveCurrentPage(bookDataId, currentPage);
By doing so, each endpoint can update as frequently as needed via MongoTemplate without ever worrying about overwriting unrelated fields, and I still keep the entity-based MongoRepository methods for things like new entity creation, findBy methods, annotated @Query methods, etc.
Solution 2
MongoDB is inherently a cache store, by which I mean, the contents are not guaranteed to be latest or necessarily correct. I haven't been able to find the configuration options for flush time (but they would be configured in the DB itself), but MongoDB has added features so that you can choose fast+dirty, or slow+clean. This "freshness" factor is most likely your issue if you are seeing this kind of problem. (Even if you aren't running distributed, there is a timing difference between request acknowledge, and request committed)
Here is a link to a post regarding "clean reads" (key point in the following quote):
http://www.dagolden.com/index.php/2633/no-more-dirty-reads-with-mongodb/
I encourage MongoDB users to place themselves (or at least, their application activities) into one of the following groups:
"I want low latency" – Dirty reads are OK as long as things are fast. Use w=1 and read concern 'local'. (These are the default settings.) "I want consistency" – Dirty reads are not OK, even at the cost of latency or slightly out of date data. Use w='majority' and read concern 'majority. use MongoDB v1.2.0;
my $mc = MongoDB->connect( $uri, { read_concern_level => 'majority', w => 'majority', } );
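The Perl snippet above translates to the Java driver roughly as follows (a sketch assuming the 3.x MongoClientOptions API; the URI is a placeholder):

```java
import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.MongoClientURI;
import com.mongodb.ReadConcern;
import com.mongodb.WriteConcern;

public class ConsistentClientSketch {

    public static MongoClient create() {
        // "I want consistency": acknowledge writes on a majority of nodes,
        // and only read majority-committed data.
        MongoClientOptions.Builder options = MongoClientOptions.builder()
                .writeConcern(WriteConcern.MAJORITY)
                .readConcern(ReadConcern.MAJORITY);
        return new MongoClient(new MongoClientURI("mongodb://localhost:27017", options));
    }
}
```

Note that ReadConcern requires driver 3.2+ and a server version that supports it.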
Further reading that may or may not be useful:
- https://docs.mongodb.com/manual/faq/concurrency/
- https://www.rainforestqa.com/blog/2012-11-05-mongodb-gotchas-and-how-to-avoid-them/ (Namely, the Journaling section)
Update
If running in a multi-threaded environment, make sure your threads aren't trampling over one another's updates. You can verify whether this is happening by raising the system or query logging level to 5. https://docs.mongodb.com/manual/reference/log-messages/#log-messages-configure-verbosity
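In the mongo shell, raising the verbosity for all components can be done like this (shell helpers; 0 is the default, 5 the maximum):

```javascript
// Log everything, not just slow operations (reset with db.setLogLevel(0)).
db.setLogLevel(5)

// Alternatively, via the server parameter:
db.adminCommand({ setParameter: 1, logLevel: 5 })
```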
walen
Updated on April 23, 2020

Comments
-
walen about 4 years
So there's this little Angular + Java + Spring Boot + MongoDB app I'm working with. It's getting quite a lot of action (read: code modifications) lately, but the data access classes have gone largely untouched AFAIK.
However, it seems like MongoRepository suddenly decided to stop persisting the changes I'm save()ing to DB. Inspecting mongod.log, this is what I see when the save() works:

2018-04-11T15:04:06.840+0200 I COMMAND [conn6] command pdfviewer.bookData command: find { find: "bookData", filter: { _id: "ID_1" }, limit: 1, singleBatch: true } planSummary: IDHACK keysExamined:1 docsExamined:1 idhack:1 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:1 nreturned:1 reslen:716 locks:{ Global: { acquireCount: { r: 4 } }, Database: { acquireCount: { r: 2 } }, Collection: { acquireCount: { r: 2 } } } protocol:op_query 102ms
2018-04-11T17:30:19.615+0200 I WRITE [conn7] update pdfviewer.bookData query: { _id: "ID_1" } update: { _class: "model.BookData", _id: "ID_1", config: { mode: "normal", offlineEnabled: true }, metadata: { title: "PDFdePrueba3pag copia 6 ", ...}, downloaded: false, currentPageNumber: 2, availablePages: 3, bookmarks: [], stats: { _id: "c919e517-3c68-462c-8396-d4ba391762e6", dateOpen: new Date(1523460575872), dateClose: new Date(1523460575951), timeZone: "+2", ... }, ... } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keyUpdates:0 writeConflicts:1 numYields:1 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 2 } } } 315ms
2018-04-11T17:30:19.615+0200 I COMMAND [conn7] command pdfviewer.$cmd command: update { update: "bookData", ordered: false, updates: [ { q: { _id: "ID_1" }, u: { _class: "model.BookData", _id: "ID_1", config: { mode: "normal", offlineEnabled: true }, metadata: { title: "PDFdePrueba3pag copia 6 ", ...}, downloaded: false, currentPageNumber: 2, availablePages: 3, bookmarks: [], stats: { _id: "c919e517-3c68-462c-8396-d4ba391762e6", dateOpen: new Date(1523460575872), dateClose: new Date(1523460575951), timeZone: "+2", ... }, ... }, upsert: true } ] } keyUpdates:0 writeConflicts:0 numYields:0 reslen:55 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 2 } } } protocol:op_query 316ms
And this is what I see when it doesn't:
2018-04-11T18:13:21.864+0200 I NETWORK [initandlisten] connection accepted from 127.0.0.1:64271 #1 (1 connection now open)
2018-04-11T18:18:51.425+0200 I NETWORK [initandlisten] connection accepted from 127.0.0.1:64329 #2 (2 connections now open)
2018-04-11T18:19:06.967+0200 I NETWORK [initandlisten] connection accepted from 127.0.0.1:64346 #3 (3 connections now open)
By doing a tail -f [1] on the log file while debugging, I've seen those connections appear right when my code calls findById() or save(), so it seems like the app can reach the DB. This is (more or less) the relevant Java code:

/* BookData.java */
@Document
public class BookData {
    @Id
    private String id;
    // Some more non-Id Strings...
    private Config config;
    private Metadata metadata;
    private Boolean downloaded;
    private Integer currentPageNumber;
    private int availablePages;
    private List<Bookmark> bookmarks;
    private StatsModel stats;
    @Transient
    private byte[] contents;

    public BookData() {}

    // getters and setters
}

/* BookDataRepository.java */
// MongoRepository comes from spring-boot-starter-parent-1.4.5.RELEASE
public interface BookDataRepository extends MongoRepository<BookData, String> {
    BookData findById(String id);
}

/* BookDataServiceImpl.java */
public BookData updateBookData(String id, BookData newData) {
    final BookData original = bookDataRepository.findById(id);
    if (original == null) {
        return null;
    }
    original.setCurrentPageNumber(
            Optional.ofNullable(newData.getCurrentPageNumber()).orElseGet(original::getCurrentPageNumber));
    // similar code for a couple other fields
    return bookDataRepository.save(original);
}
I've stepped through that part a hundred times while debugging and everything seems to be OK:

- findById(id) correctly returns the expected BookData original object: check ✓
- newData contains the expected values to be used for updating: check ✓
- right before calling save(original), original has been correctly modified using newData values: check ✓
- save() executes without errors: check ✓
- save() returns a new BookData with correctly updated values: to my own surprise, check ✓
- after save() returns, a db.bookData.find() query in Mongo Shell shows that the values have been updated: fail.
- after save() returns, the BookData object retrieved by new calls to findById() contains the updated values: fail (sometimes it does, sometimes it doesn't).
It just looks like MongoDB is waiting for some kind of flush(), but this is not a JPA repository where one could call saveAndFlush() instead. Any ideas why this might be happening?
EDIT: versions (as requested):
- Java 8
- Spring Boot 1.4.5
- MongoDB 3.2.6
- Windows 10
I also included BookData above.
-
walen about 6 years
[1] Actually a Get-content mongod.log -Tail 10 -Wait in PowerShell, but not really relevant.
-
Paizo about 6 years
I suggest you add some logs in updateBookData, since there is a null check and in some cases the original would be untouched. Logging the bean before the changes and right before the save could help catch app errors. Can you post your BookData bean?
-
Anatoly Shamov about 6 years
Have you customized WriteResultChecking or WriteConcern policies?
-
walen about 6 years
@Paizo Thanks for chiming in. Using step-by-step execution I confirmed that original is being modified as expected before the save (as noted in the question), so that is not the problem. I'll add BookData as requested.
-
walen about 6 years
@AnatolyShamov Nope, everything regarding Spring Data and MongoRepository is as vanilla as it can be.
-
Tarun Lalwani about 6 years
@LuisG., can you add information about the mongo server as well? Version, OS, etc.
-
walen about 6 years
@TarunLalwani Done.
-
Tarun Lalwani about 6 years
Can you try with MongoDB 3.4 or 3.6? docs.mongodb.com/manual/tutorial/install-mongodb-on-windows
-
Tezra about 6 years
@LuisG. Do you have a test case to reliably reproduce the issue?
-
walen about 6 years
@TarunLalwani Our production MongoDB instance is shared by several apps including mine, so upgrading to 3.4 or 3.6 would require non-negligible effort by other teams to test their apps against the new version etc. -- something I'd rather avoid for now, unless there's some documented problem with 3.2.x that could justify upgrading.
-
walen about 6 years
@Tezra Sadly, no. I cannot get even the actual app to fail consistently, so...
-
Tarun Lalwani about 6 years
Could it be that the Mongo client library version you are using has some problem? I thought this was a local instance, and that's why I suggested an upgrade.
-
walen about 6 years
@TarunLalwani Yes, for development I'm using a local instance. Both my local instance and the prod one are 3.2.6. What I mean is that if I upgrade my local instance, and then everything works... I still wouldn't know why it sometimes failed before, and I'd probably need a better explanation than that before I get DevOps and the other teams involved.
-
Tarun Lalwani about 6 years
It's hard to say, because a recent patch or something on your machine may have impacted the behavior. There could be a lot of combinations, and it could even be specific to your machine. Unless you correlate across different environments and configs, it's hard to say.
-
walen about 6 years
@Tezra As said in the question and in one of my comments above, I've confirmed via step-by-step execution that updateBookData is being executed every time, the original object is being modified before saving, and the save call even returns an updated object. But the changes are not persisted to the actual DB. Smells like caching to me, but I'm not aware of any caching options for MongoRepository.
-
Tezra about 6 years
@LuisG. Run some more tests, but add a 61-second sleep between save and return. If that works, then this is definitely a flush issue (or a locking issue).
-
s7vr about 6 years
Could you enable debug-level logging for both the Spring Mongo and MongoDB driver dependencies? When the save fails you will have more information. Try
logging.level.org.springframework.data.mongodb.core.MongoTemplate=DEBUG
logging.level.org.mongodb.driver.protocol.insert=DEBUG
logging.level.org.mongodb.driver.protocol.query=DEBUG
logging.level.org.mongodb.driver.protocol.command=DEBUG
logging.level.org.mongodb.driver.protocol.update=DEBUG
in application.properties and compare the logs. Hopefully this will help you narrow down the issue.
-
s7vr about 6 years
Logs like:
MongoTemplate : findOne using query: { "_id" : { "$oid" : "somevalue"}} in db.collection: db.somecollection
protocol.command: Sending command {find : BsonString{value='somecollection'}} to database [connectionId{localValue:3, serverValue:67}] to server 127.0.0.1:27017
protocol.command: Command completed
MongoTemplate: Calling update using query: { "_id" : { "$oid" : "somevalue"} , "version" : 4} and update: update json in collection: some collection
protocol.update: Updating in namespace db.somecollection on connection serverValue:67}]
protocol.update: Update completed
-
walen about 6 years
@Veeram I found the issue, see this comment. I'm torn between making one of those despised "EDIT: I solved it, it was because of X" edits, and writing some kind of answer about logging levels and how to detect whether some other thread is overwriting the changes.
-
s7vr about 6 years
Glad you solved it, but you should set up some logging to let you debug the issue when it happens. What happens when you enable the logging on the application side? It should capture the logs when another thread overwrites the changes.
-
walen about 6 years
@Veeram Both update operations were logged both at the JS client and at the Java server, but each log showed only the "relevant" modified fields, so I didn't realize the problem until I raised the DB log level.
-
Paizo about 6 years
So the solution to unmask the problem was to put a conditional breakpoint filtering on the bookId in repository.save(BookData bookData)? :D
-
walen about 6 years
@Paizo I thought the problem was on MongoDB's side, so my solution was to raise the verbosity level (see my answer), and that's how I discovered the two updates. If I had already suspected that "two updates instead of one" on the Java side was the root cause, then I would've probably done as you say, and indeed the problem would have surfaced too, since I would have seen two save calls coming from different endpoints. Hindsight is 20/20.
-
Tezra about 6 years
Since this question is resolved, you should accept an answer so that others know it has been resolved.
-
walen about 6 years
@Tezra Well, it is solved for me indeed, but (as you yourself said) other users who find this question might have a different problem. 11 people voted for the question, but only 2 voted for my (rather specific) answer, and only I voted for yours (even though I had already figured out the thread problem by the time you mentioned it): maybe none of the answers is good enough for the community? Hence why the question and its bounty will stay open till the grace period ends, in case anybody posts a more complete / general answer (or enhances a current one ;). Don't rush things that need not be rushed.
-
Tezra about 6 years
You accept based on what helped you personally. You can wait till the bounty expires, but accept on your own judgement, not the community's. Others with a similar issue (if it ever exists) can open their own question for that scenario. :3
-
walen about 6 years
Thanks for answering. I get that MongoDB's "eventual consistency" might lead to some dirty reads now and then when operations are just a few seconds apart. However, what I'm seeing is that changes are never persisted. I just did an update, waited for 15 minutes before querying the DB again, and the changes weren't anywhere to be seen. Surely 15 minutes should be enough for MongoDB to propagate changes in a non-sharded, single-instance config?
-
walen about 6 years
(BTW both bottom links are purple to me, checked them last week. The second one was very interesting, but none of what it says helped, since that "flush every 60 s, journal every 100 ms" doesn't seem to match the behavior I'm seeing. But thanks anyway!)
-
Tezra about 6 years
@LuisG. I imagine for most people who find this question, this will be their actual problem. In your case, most likely not. Of course, there are a lot of things it COULD be. If it's not a flush issue, do a property-by-property comparison of the pre- and post-save objects to verify they are the same (i.e., the ID didn't change or anything), and run a pre/post dbStats (docs.mongodb.com/manual/reference/command/dbStats) to see if ANYTHING changed in the DB. If you are multi-threaded, it is also possible another thread is trampling your changes.
-
walen about 6 years
"If you are multi threaded, it is also possible another thread is trampling your changes." I increased MongoDB's systemLog.verbosity to 3 and, after careful inspection, that seems to be the actual problem. Some other thread is updating the same document barely 500 ms after my update, but with different changes. So my changes are being overwritten almost instantly, and that's why I cannot see them in the next call to findById() nor by running db.bookData.find() in Mongo Shell. Oh well.
-
walen about 6 years
The question about why mongod.log stopped showing the updates is still open, though. Might have something to do with slowms in the logging configuration (as in, the queries I was seeing were slower than slowms, and that's the only reason I was seeing them at all, the default being not to show any queries unless verbosity is increased).
-
Tezra about 6 years
@LuisG. The default log configuration is that 100+ ms makes a "slow" query, and the logger will only log "slow" queries (explained here: docs.mongodb.com/manual/reference/method/db.setProfilingLevel). To log all queries, you need to change logLevel from 0 (min) to 5 (max) (docs.mongodb.com/manual/reference/log-messages/… (Set query verbosity)). Unless you changed the default settings, this is what you are seeing.
-
walen about 6 years
Yes. Two promises from the JS client invoked two different endpoints of the Java backend, which in turn made two findById calls at the same time. Each then made its own modifications and called save, the last one to do so effectively overriding the first.
-
Tezra about 6 years
@LuisG. Good. ^_^ I've updated the answer to include the multi-thread safety case for future readers, so they don't have to dig through the comments, and since that was the actual answer to your question. Glad I could be of help. :3
-
tinesoft almost 5 years
You could also make use of Optimistic Locking (see docs.spring.io/spring-data/mongodb/docs/current/reference/html/…) to prevent those concurrent updates. Basically, you'll need to add a 'version' field of type 'long' in your BookData entity, annotated with @Version, and Spring Data will take care of the rest for you!
-
Khanna111 over 2 years
If you would like to prevent dirty reads and non-repeatable reads, use transactions. If T1 reads a document D1, and then T2 also reads and updates it before T1 does, then T1 would get a write error.
-
walen over 2 years
@Khanna111 MongoDB transactions didn't exist back in April 2018. They were introduced in MongoDB 4.2 -- I was using MongoDB 3.2.6.
-
Khanna111 over 2 years
OK, I see. Thanks. I'll leave the comment up as a placeholder for a possible current solution, along with your comment on the version that introduced transactions.