JPA/Hibernate improve batch insert performance

13,088

Solution 1

I have managed to solve this problem by using Hibernate Sessions for each 'group' of inserts. The results are about a 7-fold reduction in time needed to save the data. Used to take approximately 2000ms to save one 'packet' and now it takes between 200ms and 300ms to do the same thing.

Just to repeat - this is valid for Play! Framework 1.2.3 - I am not sure whether, or how this applies to other frameworks or applications that utilize Hibernate.

    Session mySession = (Session) Pressure.em().getDelegate();

    for(int i = 0 ; i < data.size() ; i++){
        initializeFromJsonAndSave(data.get(i), mySession);
    }
    s.flush();
    s.clear();

The 'initializeFromJsonAndSave' method was changed so that, instead of calling the object's save() method, calls mySession.save(myNewObject).

Solution 2

Here are two good answers on the subject

Notice that with identity generator (it is the generator used by default with play) batch insert is disabled.

Share:
13,088
Iv4n
Author by

Iv4n

Updated on August 03, 2022

Comments

  • Iv4n
    Iv4n over 1 year

    I have a data model that has a ONE TO MANY relationship between ONE entity and 11 other entities. These 12 entities together represent one data packet. The problem I am having is to do with the number of inserts that occur on the 'many' side of these relationships. Some of them can have as many as 100 individual values so to save one whole data packet in the database it requires up to 500 inserts.

    I am using MySQL 5.5 with InnoDB tables. Now, from testing the database I see that it can easily do 15000 inserts per second when processing a batch insert (and even more with LOAD DATA, but that's not practical for this case).

    Is there some way to bunch up these individual 500 inserts into, say - 5 inserts with 100 VALUES (for the 5 linked entities that each has 100 values) using Hibernate?

    As Requested:

    @OneToMany(mappedBy="beat", cascade=CascadeType.ALL)
    @OrderBy("miliseconds ASC")
    public List<AmbientLight> lights;
    

    I should probably also mention one important piece of information - I am using Play! Framework 1.2.3

  • Iv4n
    Iv4n over 10 years
    I don't think Play Framework uses identity by default, does it?
  • Seb Cesbron
    Seb Cesbron over 10 years
    This limit the overhead due to hibernate dirty checking but under the hood you don't use jdbc batch api so this is not as fast as raw jdbc, check your sql logs to see what happens
  • Seb Cesbron
    Seb Cesbron over 10 years
    Yes it does, see play.db.jpa.Model class
  • Iv4n
    Iv4n over 10 years
    Do you have suggestions how to further improve this without veering too far outside the typical Play! Framework constraints?
  • Seb Cesbron
    Seb Cesbron over 10 years
    If the performance is ok for you you can stay with this solution. The other improvement you can have is to separate your commit into multiple transactions instead of flushing and clearing hibernate session. One easy way to do this with play is to use a Job. Your Job take for example 100 objects iterate over them and commit. In your controller you can launch several Jobs in parallel if the order does not matter and wait for all jobs to terminate in your controller