How to persist LARGE BLOBs (>100MB) in Oracle using Hibernate

Solution 1

I was having the same problems as you when attempting to map the column using the "blob" type. Here is a link to a post I made on the Hibernate forum: https://forum.hibernate.org/viewtopic.php?p=2452481#p2452481

Hibernate 3.6.9
Oracle Driver 11.2.0.2.0
Oracle Database 11.2.0.2.0

To fix the problem I used a custom UserType for the Blob, with java.sql.Blob as the return type.

Here are the key method implementations of this UserType:

public Object nullSafeGet(ResultSet rs, String[] names, Object owner)
      throws HibernateException, SQLException {
   // Return the Blob locator itself (or null) instead of materializing its
   // bytes, so the content is only streamed when the caller reads it.
   return rs.getBlob(names[0]);
}

public void nullSafeSet(PreparedStatement st, Object value, int index)
     throws HibernateException, SQLException {
   if (value == null) {
      st.setNull(index, sqlTypes()[0]);
   }
   else {
      InputStream in = null;
      OutputStream out = null;
      // Create an Oracle temporary BLOB (oracle.sql.BLOB) that lives for the
      // duration of the session, and stream the value into it instead of
      // materializing a byte[] in memory.
      BLOB tempBlob = BLOB.createTemporary(st.getConnection(), true, BLOB.DURATION_SESSION);
      tempBlob.open(BLOB.MODE_READWRITE);
      try {
         out = tempBlob.getBinaryOutputStream();
         Blob valueAsBlob = (Blob) value;
         in = valueAsBlob.getBinaryStream();
         // Copy the source stream into the temporary BLOB chunk by chunk.
         StreamUtil.toOutput(in, out);
         out.flush();
      } finally {
         StreamUtil.close(out);
         StreamUtil.close(in);
      }
      // Close the LOB before binding; the locator stays valid for the session.
      tempBlob.close();
      st.setBlob(index, tempBlob);
   }
}
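
For context, here is a minimal sketch of how the rest of such a UserType might look. This is not from the original post: the class name BlobUserType is illustrative, and only the two methods above plus Hibernate 3.6's org.hibernate.usertype.UserType contract are assumed.

import java.io.Serializable;
import java.sql.Blob;
import java.sql.Types;

import org.hibernate.HibernateException;
import org.hibernate.usertype.UserType;

// Illustrative sketch only; pair it with the nullSafeGet/nullSafeSet shown above.
public class BlobUserType implements UserType {

   public int[] sqlTypes() {
      return new int[] { Types.BLOB };
   }

   public Class returnedClass() {
      return Blob.class;
   }

   public boolean equals(Object x, Object y) throws HibernateException {
      return x == y;
   }

   public int hashCode(Object x) throws HibernateException {
      return x == null ? 0 : x.hashCode();
   }

   public boolean isMutable() {
      return false;
   }

   public Object deepCopy(Object value) throws HibernateException {
      // Blob locators are passed through as-is; copying would defeat streaming.
      return value;
   }

   public Serializable disassemble(Object value) throws HibernateException {
      throw new UnsupportedOperationException("Blobs are not cacheable");
   }

   public Object assemble(Serializable cached, Object owner) throws HibernateException {
      throw new UnsupportedOperationException("Blobs are not cacheable");
   }

   public Object replace(Object original, Object target, Object owner) throws HibernateException {
      return original;
   }

   // ... plus the nullSafeGet(...) and nullSafeSet(...) implementations above
}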

Solution 2

Personally, I store files of up to 200MB in Oracle BLOB columns using Hibernate, so I can assure you it works. So...

You should try a newer version of the Oracle JDBC driver. This behavior of using byte arrays instead of streams has changed somewhat over time, and the drivers are backward compatible. I'm not sure whether that will fix your problem, but it works for me. Additionally, you should switch to org.hibernate.dialect.Oracle10gDialect - which retires the use of the oracle.jdbc.driver package in favor of oracle.jdbc - and it might also help.
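
If you are unsure which driver version actually ends up on the classpath, a quick check through the standard JDBC metadata API can help; the connection URL and credentials below are the ones from the question and may need adjusting for your environment:

import java.sql.Connection;
import java.sql.DriverManager;

// Prints the name and version of the JDBC driver actually loaded at runtime.
public class DriverVersionCheck {
   public static void main(String[] args) throws Exception {
      Connection conn = DriverManager.getConnection(
            "jdbc:oracle:thin:@localhost:1521:devdb", "blobcrud", "blobcrud");
      try {
         System.out.println(conn.getMetaData().getDriverName());
         System.out.println(conn.getMetaData().getDriverVersion());
      } finally {
         conn.close();
      }
   }
}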

Solution 3

I just discovered this question when I was having the same problem with Oracle and Hibernate. The issue is in Hibernate's blob handling: it seems to copy the blob into memory depending on the Dialect in use. I guess they do so because some databases/drivers require it. For Oracle, though, this behaviour does not seem to be necessary.

The fix is pretty simple, just create a custom OracleDialect containing this code:

public class Oracle10DialectWithoutInputStreamToInsertBlob extends Oracle10gDialect {
    @Override
    public boolean useInputStreamToInsertBlob() {
        // Stops Hibernate from copying the blob into memory before the
        // insert; for Oracle this copy does not seem to be required.
        return false;
    }
}

Next you need to configure your session factory to use this dialect. I've tested it with the ojdbc6-11.2.0.1.0 driver against Oracle 11g and confirmed that it fixes the memory-consumption issue.
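
With a Spring LocalSessionFactoryBean like the one shown in the question, that amounts to swapping a single property; the com.example package name below is a placeholder for wherever you put the class:

<prop key="hibernate.dialect">com.example.Oracle10DialectWithoutInputStreamToInsertBlob</prop>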

If any of you try this with another Oracle database and/or a different Oracle driver, I would love to hear whether it works for you. If it works with several configurations, I'll send a pull request to the Hibernate team.

Solution 4

It's not the best solution, but you can allow Java to use more memory with the -Xmx parameter.
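
For example, to raise the maximum heap to 1GB (the value and jar name are only illustrative):

java -Xmx1024m -jar yourapp.jar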

Edit: You should try to analyse the problem in more depth, for example with JConsole, which lets you watch the memory load.

Even with Postgres you might get near the heap size limit, but not cross it, because the loaded driver takes a bit less memory.

With default settings your heap size limit is about half your physical memory. Try how much bigger a blob you can save into Postgres.

Comments

  • paoloyx almost 2 years

    I'm struggling to find a way to insert LARGE images (>100MB, mostly TIFF format) into my Oracle database, using BLOB columns.

    I've searched thoroughly across the web and even on StackOverflow, without finding an answer to this problem.
    First I describe the problem, then a short section on the relevant code (Java classes/configuration), and finally a third section with the JUnit test I've written to test image persistence (I receive the error during its execution).

    Edit: I've added a section, at the end of the question, where I describe some tests and analysis using JConsole.

    The problem

    I receive a java.lang.OutOfMemoryError: Java heap space when using Hibernate to persist very large images/documents:

    java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2786)
    at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:133)
    at org.hibernate.type.descriptor.java.DataHelper.extractBytes(DataHelper.java:190)
    at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:123)
    at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:47)
    at org.hibernate.type.descriptor.sql.BlobTypeDescriptor$4$1.doBind(BlobTypeDescriptor.java:101)
    at org.hibernate.type.descriptor.sql.BasicBinder.bind(BasicBinder.java:91)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:283)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:278)
    at org.hibernate.type.AbstractSingleColumnStandardBasicType.nullSafeSet(AbstractSingleColumnStandardBasicType.java:89)
    at org.hibernate.persister.entity.AbstractEntityPersister.dehydrate(AbstractEntityPersister.java:2184)
    at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2430)
    at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2874)
    at org.hibernate.action.EntityInsertAction.execute(EntityInsertAction.java:79)
    at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:273)
    at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:265)
    at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:184)
    at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
    at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:51)
    at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1216)
    at it.paoloyx.blobcrud.manager.DocumentManagerTest.testInsertDocumentVersion(DocumentManagerTest.java:929)
    

    The code (domain objects, repository classes, configuration)

    Here is the stack of technologies I'm using (from the DB up to the business-logic tier). I use JDK 6.

    • Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
    • ojdbc6.jar (for 11.2.0.3 release)
    • Hibernate 4.0.1 Final
    • Spring 3.1.GA RELEASE

    I have two domain classes, mapped in a one-to-many fashion. A DocumentVersion has many DocumentData, each of which can represent different binary content for the same DocumentVersion.

    Relevant extract from DocumentVersion class:

    @Entity
    @Table(name = "DOCUMENT_VERSION")
    public class DocumentVersion implements Serializable {
    
    private static final long serialVersionUID = 1L;
    private Long id;
    private Set<DocumentData> otherDocumentContents = new HashSet<DocumentData>(0);
    
    
    @Id
    @GeneratedValue(strategy = GenerationType.TABLE)
    @Column(name = "DOV_ID", nullable = false)
    public Long getId() {
        return id;
    }
    
    @OneToMany
    @Cascade({ CascadeType.SAVE_UPDATE })
    @JoinColumn(name = "DOD_DOCUMENT_VERSION")
    public Set<DocumentData> getOtherDocumentContents() {
        return otherDocumentContents;
    }
    

    Relevant extract from DocumentData class:

    @Entity
    @Table(name = "DOCUMENT_DATA")
    public class DocumentData {
    
    private Long id;
    
    /**
     * The binary content (java.sql.Blob)
     */
    private Blob binaryContent;
    
    @Id
    @GeneratedValue(strategy = GenerationType.TABLE)
    @Column(name = "DOD_ID", nullable = false)
    public Long getId() {
        return id;
    }
    
    @Lob
    @Column(name = "DOD_CONTENT")
    public Blob getBinaryContent() {
        return binaryContent;
    }
    

    Here are my Spring and Hibernate configuration main parameters:

    <bean id="sessionFactory"
        class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
        <property name="dataSource" ref="dataSource" />
        <property name="packagesToScan" value="it.paoloyx.blobcrud.model" />
        <property name="hibernateProperties">
            <props>
                <prop key="hibernate.dialect">org.hibernate.dialect.Oracle10gDialect</prop>
                <prop key="hibernate.hbm2ddl.auto">create</prop>
                <prop key="hibernate.jdbc.batch_size">0</prop>
                <prop key="hibernate.jdbc.use_streams_for_binary">true</prop>
            </props>
        </property>
    </bean>
    <bean class="org.springframework.orm.hibernate4.HibernateTransactionManager"
        id="transactionManager">
        <property name="sessionFactory" ref="sessionFactory" />
    </bean>
    <tx:annotation-driven transaction-manager="transactionManager" />
    

    My datasource definition:

    <bean class="org.apache.commons.dbcp.BasicDataSource"
        destroy-method="close" id="dataSource">
        <property name="driverClassName" value="${database.driverClassName}" />
        <property name="url" value="${database.url}" />
        <property name="username" value="${database.username}" />
        <property name="password" value="${database.password}" />
        <property name="testOnBorrow" value="true" />
        <property name="testOnReturn" value="true" />
        <property name="testWhileIdle" value="true" />
        <property name="timeBetweenEvictionRunsMillis" value="1800000" />
        <property name="numTestsPerEvictionRun" value="3" />
        <property name="minEvictableIdleTimeMillis" value="1800000" />
        <property name="validationQuery" value="${database.validationQuery}" />
    </bean>
    

    where properties are taken from here:

    database.driverClassName=oracle.jdbc.OracleDriver
    database.url=jdbc:oracle:thin:@localhost:1521:devdb
    database.username=blobcrud
    database.password=blobcrud
    database.validationQuery=SELECT 1 from dual
    

    I've got a service class that delegates to a repository class:

    @Transactional
    public class DocumentManagerImpl implements DocumentManager {
    
    DocumentVersionDao documentVersionDao;
    
    public void setDocumentVersionDao(DocumentVersionDao documentVersionDao) {
        this.documentVersionDao = documentVersionDao;
    }
    

    and now the relevant extracts from repository classes:

    public class DocumentVersionDaoHibernate implements DocumentVersionDao {
    
    @Autowired
    private SessionFactory sessionFactory;
    
    @Override
    public DocumentVersion saveOrUpdate(DocumentVersion record) {
        this.sessionFactory.getCurrentSession().saveOrUpdate(record);
        return record;
    }
    

    The JUnit test that causes the error

    If I run the following unit test, I get the aforementioned error (java.lang.OutOfMemoryError: Java heap space):

    @RunWith(SpringJUnit4ClassRunner.class)
    @ContextConfiguration(locations = { "classpath*:META-INF/spring/applicationContext*.xml" })
    @Transactional
    public class DocumentManagerTest {
    
    @Autowired
    protected DocumentVersionDao documentVersionDao;
    
    @Autowired
    protected SessionFactory sessionFactory;
    
    @Test
    public void testInsertDocumentVersion() throws SQLException {
    
        // Original mock document content
        DocumentData dod = new DocumentData();
        // image.tiff is approx. 120MB
        File veryBigFile = new File("/Users/paoloyx/Desktop/image.tiff");
        try {
            Session session = this.sessionFactory.getCurrentSession();
            InputStream inStream = FileUtils.openInputStream(veryBigFile);
            Blob blob = Hibernate.getLobCreator(session).createBlob(inStream, veryBigFile.length());
            dod.setBinaryContent(blob);
        } catch (IOException e) {
            e.printStackTrace();
            dod.setBinaryContent(null);
        }
    
    // Save a document version linked to previous document contents
    DocumentVersion dov = new DocumentVersion();
    dov.getOtherDocumentContents().add(dod);
    documentVersionDao.saveOrUpdate(dov);
    this.sessionFactory.getCurrentSession().flush();
    // Keep the generated id for the retrieval step below
    Long insertedId = dov.getId();

    // Clear session, then try retrieval
    this.sessionFactory.getCurrentSession().clear();
    DocumentVersion dbDov = documentVersionDao.findByPK(insertedId);
    Assert.assertNotNull("The document version returned for id " + insertedId + " is null", dbDov);
    Assert.assertNotNull("The retrieved document version has no additional contents attached", dbDov.getOtherDocumentContents());
    Assert.assertEquals("The number of secondary contents does not match what was saved", 1, dbDov.getOtherDocumentContents().size());
    }
    

    The same code works against a PostgreSQL 9 installation: the image is written to the database. Debugging my code, I found that the PostgreSQL JDBC driver writes to the database using a buffered output stream... while the Oracle OJDBC driver tries to allocate, all at once, the whole byte[] representing the image.
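
    For comparison, the plain-JDBC path (which, as discussed in the comments below, does not hit the error) would look roughly like this. This is a sketch under assumptions: the table and column names follow the mappings above, connection is an open java.sql.Connection, and id generation is simplified:

    // Sketch: stream a large file into a BLOB column through plain JDBC,
    // letting the driver read from the InputStream instead of building a byte[].
    File file = new File("/Users/paoloyx/Desktop/image.tiff");
    InputStream in = new FileInputStream(file);
    PreparedStatement ps = connection.prepareStatement(
          "INSERT INTO DOCUMENT_DATA (DOD_ID, DOD_CONTENT) VALUES (?, ?)");
    try {
        ps.setLong(1, 1L); // simplified; the entity uses a table generator
        ps.setBinaryStream(2, in, (int) file.length());
        ps.executeUpdate();
    } finally {
        ps.close();
        in.close();
    }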

    From the error stack:

    java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2786)
    at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:133)
    

    Is the error due to this behavior? Can anyone give me some insight into this problem?

    Thanks everyone.

    Memory Tests with JConsole

    Thanks to the suggestions received on my question, I've run some simple tests to show the memory usage of my code with two different JDBC drivers, one for PostgreSQL and one for Oracle. Test setup:

    1. The test has been conducted using the JUnit test described in the previous section.
    2. JVM heap size has been set to 512MB, using the parameter -Xmx512m
    3. For Oracle database, I've used ojdbc6.jar driver
    4. For Postgres database, I've used 9.0-801.jdbc3 driver (via Maven)
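
    As a hypothetical way to reproduce this setup, one can cap the test JVM's heap through Maven Surefire's argLine and attach JConsole to the running test process (the argLine property and the pid lookup via jps are assumptions about the build environment):

    # run the single test with a 512MB heap cap
    mvn test -Dtest=DocumentManagerTest -DargLine="-Xmx512m"
    # find the test JVM's pid, then attach JConsole to it
    jps
    jconsole <pid>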

    First test, with a file of approx 150MB

    In this first test, both Oracle and Postgres passed (this is BIG news). The file is about one third of the available JVM heap size. Here is a picture of the JVM memory consumption:

    [Screenshot: Testing Oracle, 512MB heap size, 150MB file]

    [Screenshot: Testing PostgreSQL, 512MB heap size, 150MB file]

    Second test, with a file of approx 485MB

    In this second test, only Postgres passed; Oracle failed. The file is sized very near the available JVM heap space. Here is a picture of the JVM memory consumption:

    [Screenshot: Testing Oracle, 512MB heap size, 485MB file]

    [Screenshot: Testing PostgreSQL, 512MB heap size, 485MB file]

    Analysis of the tests:

    It seems that the PostgreSQL driver keeps memory usage below a certain threshold, while the Oracle driver behaves very differently.

    I honestly can't explain why the Oracle JDBC driver leads to the error (the same java.lang.OutOfMemoryError: Java heap space) when using a file sized near the available heap space.

    Is there anyone who can give me more insight? Thanks a lot for your help :)

  • Łukasz Rżanek about 12 years
    Additional information: take a look at the hierarchy of OracleDialect and Oracle10gDialect. Those implementations are different!
  • paoloyx about 12 years
    Well, Łukasz, thank you for your response. I've tried using Oracle10gDialect and pointing to 'oracle.jdbc.OracleDriver' in my datasource definition (as you can see in my edited question, I've added a small section of my datasource definition). However, I still get this frustrating error... could it be a Spring-related problem? Should I try using plain Hibernate and see if I get the same error? I'm afraid it's an Oracle-driver-related problem... are you using Spring/Hibernate when you store files up to 200MB in your database? Thanks
  • paoloyx about 12 years
    Ah, by the way, I'm now using the latest version of the Oracle JDBC drivers (ojdbc6.jar)
  • Łukasz Rżanek about 12 years
    No, I'm using plain old DAOs with Hibernate, not Spring. Spring might be an issue here, but I doubt it's the driver - it fully supports streaming blobs as well as chunking. Hibernate, on the other hand, might be an issue here, and I will try to take a look at it.
  • Łukasz Rżanek about 12 years
    Would it be possible for you to post or send me the full Hibernate startup log for this context? I would like to see how it's doing all the mappings and so on. Great if it could be at ALL level...
  • paoloyx about 12 years
    Hi Łukasz, I feel you're right, it shouldn't be an Oracle driver problem after all... using plain JDBC I can write my large images to blob fields. It must be something related to Hibernate, or to how Hibernate interacts with the Oracle driver... Anyway, here is my complete Hibernate log, with the logger set at ALL/TRACE level: dl.dropbox.com/u/824575/hibernate_all_trace.log Thanks a lot for your precious help :)
  • paoloyx about 12 years
    Yes, I'm using Hibernate 4... :)
  • paoloyx about 12 years
    Thanks, Hurda, but I'm using this service class within a web application (i.e. multiple users can concurrently persist large images to the DB). Extending the JVM memory, unfortunately, is not an option for me :(
  • Łukasz Rżanek about 12 years
    For sure it's not the driver - I've been running Oracle's training for years now, and uploading large blobs has been one of the exercises for at least 3-4 years. My bet is either on Hibernate or Spring, but 90% it's Hibernate messing things up. I will take a look at the log files and try to dig up my own code from an old and forgotten SVN...
  • Hurda about 12 years
    But how big is your memory? You have to store the whole BLOB in memory to work with it. For some reason it looks like it makes a copy of that part of memory - it's possible that it just needs twice the size it would take in the best scenario. Is it still a problem to use twice the memory?
  • paoloyx about 12 years
    Well, in this case I haven't tweaked or extended the JVM memory using the -Xmx parameter, as I don't think it's the correct solution, at least for this particular case. At the moment I can't predict the number of users/concurrent file uploads that my system will have to support... and even if I gave the JVM more memory, I might have to handle a higher number of users, and then I would enter an "endless" round of memory adjustments. BTW... using PostgreSQL I can store my images, it works like a charm. So it must be something in the interaction between Oracle and Hibernate.
  • Hurda about 12 years
    @paoloyx you make a lot of assumptions here; if you have not tried to experiment with -Xmx, you can run into the problem that you run out of memory with Postgres too, when two simultaneous requests to store images hit the server. Try to use JConsole to connect to the server process and see how much of the heap is being used.
  • paoloyx about 12 years
    Ok, thanks a lot for your suggestion, I'll surely try it. Even the JConsole tip for monitoring heap space will be useful. Cheers
  • Łukasz Rżanek about 12 years
    OK, I invested a little time in this. It surely seems like a Hibernate bug. The Oracle driver is OK and is being used in the correct way, but for some reason Hibernate is creating byte arrays in order to do chunking. Why it is working for me and not here - I have no clue. But I got my old code and I will try to investigate further. Overall: Spring seems to be OK, the Oracle driver as well; Hibernate - not really.
  • paoloyx about 12 years
    Thanks very much. Your help and effort are really precious. I'm not a Hibernate expert, but it seems that the problem is related to the way Hibernate talks to the Oracle JDBC driver. I've added a section where I show some tests (and JVM memory usage).
  • paoloyx about 12 years
    Hi Hurda, following your advice I've added a section where I show some tests (and JVM memory usage). Thanks for your precious help. :) The Postgres and Oracle JDBC drivers behave VERY differently, looking at the output from JConsole...
  • Admin about 12 years
    This is a Hibernate issue; straight JDBC works just fine! Hibernate is full of memory-related edge cases that are broken in terrible ways!
  • paoloyx about 12 years
    Thank you, I'm reading your post on the Hibernate forums... I'll try your suggestions and let you know!
  • Atticus over 10 years
    Is this solution working? Could you give some feedback please?