JPA / Hibernate is very slow when fetching data

13,780

In the comments you point out that your @Joincolumn are or FetchType.EAGER. This is very aggressive and your problem.

Eager means Hibernate will JOIN FETCH all said columns in your query, parse and load all the records to dispatch the data in new instances of your entities. You may know that, if you join table A and B, the result will be a huge A.*, B.* with A x B records and A repeated many times. This is what's going on with Eager. That can escalate very quickly, especially with many Eager columns.

Switching to Lazy tells Hibernate to not load the data. It just prepares a Proxy object which will call a separate SELECT only when required (if your transaction is still open).

You can always force a FETCH manually in your HQL with JOIN FETCH table. This is the preferable way.

Share:
13,780
Sayak Banerjee
Author by

Sayak Banerjee

Updated on June 20, 2022

Comments

  • Sayak Banerjee
    Sayak Banerjee almost 2 years

    I have a DAO that uses JPA/Hibernate to fetch data from SQL Server 2008 R2. In my specific usecase, i am doing a simple SELECT query that returns about 100000 records, but takes more than 35 minutes to do so.

    I created a basic JDBC connection by manually loading the sql server driver and the same query returned 100k records in 15 seconds. So something is very wrong in my configuration.

    Here is my springDataContext.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:jpa="http://www.springframework.org/schema/data/jpa"
           xmlns:context="http://www.springframework.org/schema/context"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:tx="http://www.springframework.org/schema/tx"
           xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
           http://www.springframework.org/schema/data/jpa http://www.springframework.org/schema/data/jpa/spring-jpa-1.3.xsd
           http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-3.1.xsd
           http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.1.xsd">
    
        <bean id="jpaVendorAdapter" class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter"/>
    
        <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
            <property name="dataSource" ref="myDataSource"/>
            <property name="jpaVendorAdapter" ref="jpaVendorAdapter"/>
            <property name="persistenceXmlLocation" value="classpath:persistence.xml"/>
            <property name="jpaProperties">
                <props>
                    <prop key="hibernate.dialect">org.hibernate.dialect.SQLServer2008Dialect</prop>
                    <prop key="hibernate.ejb.naming_strategy">org.hibernate.cfg.ImprovedNamingStrategy</prop>
                    <prop key="hibernate.format_sql">true</prop>
                    <prop key="hibernate.hbm2ddl.auto">none</prop>
                    <prop key="hibernate.use_sql_comments">false</prop>
                    <prop key="hibernate.show_sql">${hibernate.show_sql:false}</prop>
                    <prop key="jadira.usertype.autoRegisterUserTypes">true</prop>
                    <prop key="jadira.usertype.javaZone">America/Chicago</prop>
                    <prop key="jadira.usertype.databaseZone">America/Chicago</prop>
                    <prop key="jadira.usertype.useJdbc42Apis">false</prop>
                </props>
            </property>
        </bean>
    
        <bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
            <property name="entityManagerFactory" ref="entityManagerFactory"/>
        </bean>
    
        <tx:annotation-driven/>
    
        <jpa:repositories base-package="com.mycompany.foo"/>
    
        <context:component-scan base-package="com.mycompany.foo.impl" />
    </beans>
    

    The bean myDataSource is provided by whatever app consumes the DAO, which is defined like so:

    <bean id="myDataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <property name="driverClassName" value="com.microsoft.sqlserver.jdbc.SQLServerDriver"/>
        <property name="url" value="jdbc:sqlserver://mysqlhost.mycompany.com:1433;database=MYDB"/>
        <property name="username" value="username"/>
        <property name="password" value="chucknorris"/>
    </bean>
    

    I have a complex query wherein I am setting fetchSize:

    package com.mycompany.foo.impl;
    
    import com.mycompany.foo.RecurringLoanPaymentAccountsDao;
    import org.hibernate.Query;
    import org.hibernate.ScrollMode;
    import org.hibernate.ScrollableResults;
    import org.hibernate.Session;
    import org.joda.time.DateTime;
    
    import javax.inject.Named;
    import javax.persistence.EntityManager;
    import javax.persistence.PersistenceContext;
    
    @Named
    public class FooDaoImpl implements FooDao {
        @PersistenceContext
        private EntityManager entityManager;
    
        public ScrollableResults getData(int chunkSize, DateTime tomorrow, DateTime anotherDate) {
            Session session = entityManager.unwrap(Session.class);
    
            Query query = session.createQuery(
                "from MyAccountTable as account " +
                // bunch of left joins (I am using @JoinColumns in my models)
                "where account.some_date >= :tomorrow " +
                "and account.some_other_date < :anotherDate"
                // <snip snip>
            );
    
            query.setParameter("tomorrow", tomorrow)
                 .setParameter("anotherDate", anotherDate)
                 .setFetchSize(chunkSize);
    
            return query.scroll(ScrollMode.FORWARD_ONLY);
        }
    }
    

    I also switched to vanilla JPA and did jpaQuery.getResultList(), but that was equally slow.

    I can provide other relevant code if required. Any clue on what I'm doing wrong here?

    Update 1: Add schema details

    Unfortunately, I work for a bank, so I cannot share the exact code. But let me try to represent the relevant bits.

    Here is my main table that I am querying:

    @Entity
    @Table(name = "MY_TABLE")
    public class MyTable {
        @EmbeddedId
        private Id id = new Id();
    
        @Column(name = "SOME_COL")
        private String someColumn;
    
        // other columns
    
        @OneToMany(fetch = FetchType.LAZY)
        @JoinColumns({
            @JoinColumn(name = "ACCT_NBR", referencedColumnName = "ACCT_NBR", insertable = false, updatable = false),
            @JoinColumn(name = "CUST_NBR", referencedColumnName = "CUST_NBR", insertable = false, updatable = false)
        })
        private List<ForeignTable> foreignTable;
    
        // getter and setter for properties
    
        public static class Id implements Serializable {
            @Column(name = "ACCT_NBR")
            private short accountNumber;
    
            @Column(name = "CUST_NBR")
            private short customerNumber;
        }
    }
    

    Basically, every table has ACCT_NBR and CUST_NBR columns (including the foreign table) which are unique when clustered together. So my join condition includes both of these.

    The model for the foreign table looks exactly the same (with and embedded ID like the main table above), of course with its own set of columns in addition to the ACCT_NBR and CUST_NBR.

    Now i only care about 2 other columns in the foreign table apart from the ID columns mentioned above: TYPE_ID and ACCT_BALANCE.

    TYPE_ID is what I wish to use in my LEFT JOIN condition, so that the final SQL looks like this:

    LEFT JOIN FOREIGN_TABLE FRN
    ON MAIN.ACCT_NBR = FRN.ACCT_NBR
    AND MAIN.CUST_NBR = FRN.CUST_NBR
    AND FRN.TYPE_ID = <some_static_id>
    

    I want the LEFT JOIN to yield NULL data when there is no match for this specific TYPE_ID.

    And in my SELECT claus, I select FOREIGN_TABLE.ACCT_BALANCE, which of course will be null if the left join above has no matching row.

    This method is invoked from a spring batch application's Tasklet, and I have a feeling that it gave a null pointer exception once the tasklet finished processing and therefore, the readonly transaction was closed.

    • Guillaume F.
      Guillaume F. over 8 years
      Can you make sure your statement is readonly ? Hibernate is very bad at handling a lots of object in its session. Also, check if your JDBC connection has cursors enabled, by default JDBC copies everything in memory instead of using the database's cursor
    • Guillaume F.
      Guillaume F. over 8 years
      You should maybe consider JTDS as an alternative driver, too. The one provided by Microsoft has a few violent bugs, and is a lot slower then JTDS.
    • Sayak Banerjee
      Sayak Banerjee over 8 years
      I have a @Transactional(readOnly = true) annotation on the method that calls this DAO method. How can I mark this statement as read-only? Do I need to mark the entity as read only by removing the setters?
    • Guillaume F.
      Guillaume F. over 8 years
      @Transactional(readOnly = true) is correct. Hibernate shouldn't do dirty checks on your session objects. That will speed-up the query a lot.
    • Sayak Banerjee
      Sayak Banerjee over 8 years
      Well, unfortunately, I already have that in there. And it still is painfully slow. Let me look at JTDS. Being a .net developer, all this is very daunting for me :)
    • Guillaume F.
      Guillaume F. over 8 years
      Are the joined columns on FetchType.Lazy or Eager ? Eager is not recommended unless you really know what you're doing.
    • Sayak Banerjee
      Sayak Banerjee over 8 years
      It is a @OneToMany with fetchtype.Eager. Let me try our lazy as well. If that doesn't make a difference, I'll look into JTDS.
    • Sayak Banerjee
      Sayak Banerjee over 8 years
      @GuillaumeF. whoa, changing to lazy made a HUGE difference!! I am going to run an end to end test and confirm.
  • Sayak Banerjee
    Sayak Banerjee over 8 years
    Now i have a different issue, I get a NullPointerException when I try to access the joined entity. And i cannot use JOIN FETCH either as I have a "with" claus (since I have to join the foreign table with one of the columns = some static value). Any ideas?
  • Guillaume F.
    Guillaume F. over 8 years
    It really depends on your schema. I am blind without seeing the actual code. Accessing a Lazy getter, if the transaction is open, should call a SELECT. A NullPointerException is unlikely to be the direct cause. Can you post more code in your question as an addendum?
  • Sayak Banerjee
    Sayak Banerjee over 8 years
    I've added some details. I apologize that I cannot post the exact code as my company has rules against it. I've tried my best to explain the schema, but please let me know if I missed out something.
  • Guillaume F.
    Guillaume F. over 8 years
    You can use LEFT JOIN FETCH FOREIGN_TABLE FRN but you might get the same problem as with Eager (It depends how many Eager Columns you had before). || If that foreign join returns only one record, it should be fast. Just make sure your foreign table has a composite index with the three reference columns. || Otherwise solving the Lazy loading is a bit more difficult (especially with Spring Batch).
  • Sayak Banerjee
    Sayak Banerjee over 8 years
    Problem is, to have that TYPE_ID = <static_int>, I was using WITH in my HQL, which i can no longer use with LEFT JOIN FETCH. Let me try to figure out if I can somehow keep the transaction active. Otherwise, I'd be back to square one :(
  • Guillaume F.
    Guillaume F. over 8 years
    Not really to square one, you identified the problem why your query is slow. You just don't have the right solution yet.
  • Sayak Banerjee
    Sayak Banerjee over 8 years
    I have eliminated the join altogether. Thanks a lot for your help on this issue! Have a great new year ahead :)