JPA/Hibernate bulk(batch) insert

56,702

Solution 1

If you're using the database to generate ids, then Hibernate has to execute a query to generate the primary key for each entity.

Solution 2

I have found it much more efficient to bypass hibernate for bulk inserts. You must do away with ORM (object relational mapping) but you can still leverage the connection associated with the current session and the transaction management.

While you temporarily lose the convenience of your ORM, the payoff is significant, especially if you have natively generated Ids since hibernate would normally perform one SELECT for each INSERT.

Session.doWork is very handy for facilitating this.

private MyParentObject saveMyParentObject(final MyParentObject parent, final List<MyChildObject> children)
{
    transaction = session.beginTransaction();
    try
    {
        session.save(parent); // NOTE: parent.parentId assigned and returned here

        session.doWork(new Work()
        {
            public void execute(Connection con) throws SQLException
            {
                // hand written insert SQL - can't use hibernate
                PreparedStatement st = con.prepareStatement("INSERT INTO my_child (parent_id, name, ...) values (?, ?, ...)");

                for (MyChildObject child : children)
                {
                    MyChildObject child = new MyChildObject();
                    child.setParentId(parent.getParentId()); // assign parent id for foreign key

                    // hibernate can't help, determine jdbc parameters manually
                    st.setLong(1, child.getParentId());
                    st.setString(2, child.getName());
                    ...
                    st.addBatch();
                }

                // NOTE: you may want to limit the size of the batch
                st.executeBatch();
            }
        });

        // if your parent has a OneToMany relationship with child(s), refresh will populate this 
        session.refresh(parent);
        transaction.commit();
        return parent;
    }
    catch(Throwable e)
    {
        transaction.rollback();
        throw new RuntimeException(e);
    }   
}

Solution 3

I have written a short blog which talks about batch insert gotchas and also has pointer to small project that has all the right configurations to get started with batch insert with Hibernate. See the details at http://sensiblerationalization.blogspot.com/2011/03/quick-tip-on-hibernate-batch-operation.html

Share:
56,702

Related videos on Youtube

Andriy Kopachevskyy
Author by

Andriy Kopachevskyy

Updated on October 29, 2020

Comments

  • Andriy Kopachevskyy
    Andriy Kopachevskyy over 3 years

    Here is simple example I've created after reading several topics about jpa bulk inserts, I have 2 persistent objects User, and Site. One user could have many site, so we have one to many relations here. Suppose I want to create user and create/link several sites to user account. Here is how code looks like, considering my willing to use bulk insert for Site objects.

    User user = new User("John Doe");
    
    user.getSites().add(new Site("google.com", user));
    user.getSites().add(new Site("yahoo.com", user));
    
    EntityTransaction tx = entityManager.getTransaction();
    tx.begin();
    entityManager.persist(user);
    tx.commit();
    

    But when I run this code (I'm using hibernate as jpa implementation provider) I see following sql output:

    Hibernate: insert into User (id, name) values (null, ?)
    Hibernate: call identity()
    Hibernate: insert into Site (id, url, user_id) values (null, ?, ?)
    Hibernate: call identity()
    Hibernate: insert into Site (id, url, user_id) values (null, ?, ?)
    Hibernate: call identity()
    

    So, I means "real" bulk insert not works or I am confused?

    Here is source code for this example project, this is maven project so you have only download and run mvn install to check output.

    UPDATED:

    After Ken Liu kindly advise, I've disabled Site object id auto generation:

        User user = new User("John Doe");
        user.getSites().add(new Site(1, "google.com", user));
        user.getSites().add(new Site(2, "yahoo.com", user));
        entityManager.setFlushMode(FlushModeType.COMMIT);
        EntityTransaction tx = entityManager.getTransaction();
        tx.begin();
        entityManager.persist(user);
        tx.commit();
    

    Now I have following line in debug output:

    DEBUG: org.hibernate.jdbc.AbstractBatcher - Executing batch size: 2

    It works!

  • Andriy Kopachevskyy
    Andriy Kopachevskyy about 14 years
    Oh, this is make sense! Thanks
  • Ken Liu
    Ken Liu about 14 years
    how are you generating your keys now? You will have to be sure that your keys are unique.
  • aioobe
    aioobe over 10 years
    Wouldn't it be possible to make Hibernate perform UPDATE sometbl SET counter=counter+1000 before inserting 1000 objects, and then just use the 1000 id's it just reserved?
  • rogue lad
    rogue lad over 8 years
    I am using this same technique as you provided. But there is still a problem; this method prepares a different statement for each insert as it can be shown in sql profiler. To increase the performance it needs to compile or prepare statement once then call that compiled statement for the rest of the inserts.
  • bish
    bish about 7 years
    @KenLiu Does this means you can't bulk insert multiple entries using oracle when the PK is generated via a sequence`?
  • Lluis Martinez
    Lluis Martinez about 6 years
    @Max from the DBMS point of view, the statement is the same and will pick up the same access plan. It would be diffent only if variable binding was not used (not the case).
  • Ran
    Ran over 3 years
    Saving about 50k records only took about 2 seconds! Amazing, thank you!