Slick 3.0 bulk insert or update (upsert)


Solution 1

There are several ways to make this code faster. Each one should be faster than the one before it, but each is also progressively less idiomatic Slick:

  • Run insertOrUpdateAll instead of insertOrUpdate if you are on slick-pg 0.16.1+ (slick-pg targets PostgreSQL, so this option does not apply to a MySQL setup):

    await(run(TableQuery[FooTable].insertOrUpdateAll(rows))).sum
    
  • Combine your DBIO actions into a single action and run it once, rather than awaiting each upsert before you start the next:

    val toBeInserted = rows.map { row => TableQuery[FooTable].insertOrUpdate(row) }
    val inOneGo = DBIO.sequence(toBeInserted)
    val dbioFuture = run(inOneGo)
    // Optionally, you can add a `.transactionally`
    // and / or `.withPinnedSession` here to pin all of these upserts
    // to the same transaction / connection
    // which *may* get you a little more speed:
    // val dbioFuture = run(inOneGo.transactionally)
    val rowsInserted = await(dbioFuture).sum
    
  • Drop down to the JDBC level and run your upsert all in one go (idea via this answer):

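    val SQL = """INSERT INTO table (a,b,c) VALUES (?, ?, ?)
    ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);"""
    
    // executeBatch returns one update count per batched row, as an Array[Int]
    SimpleDBIO[Array[Int]] { session =>
      val statement = session.connection.prepareStatement(SQL)
      try {
        // foreach, not map: addBatch is called only for its side effect
        rows.foreach { row =>
          statement.setInt(1, row.a)
          statement.setInt(2, row.b)
          statement.setInt(3, row.c)
          statement.addBatch()
        }
        statement.executeBatch()
      } finally statement.close()
    }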
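
The JDBC approach can also be pushed one step further by sending a single multi-row statement, which is the shape of query the question asks for. The following is only a sketch under assumptions: the object name `MultiRowUpsert` and its two helpers are hypothetical, the table name `table` and the columns `a`, `b`, `c` are the placeholders from the question, and rows are modelled as plain `(Int, Int, Int)` tuples rather than a Slick row type.

```scala
object MultiRowUpsert {
  // Placeholder tuple for one row of (a, b, c).
  private val RowPlaceholder = "(?, ?, ?)"

  /** Builds one INSERT ... ON DUPLICATE KEY UPDATE statement covering `rowCount` rows. */
  def sql(rowCount: Int): String = {
    require(rowCount > 0, "at least one row is needed")
    val values = Seq.fill(rowCount)(RowPlaceholder).mkString(", ")
    s"INSERT INTO table (a, b, c) VALUES $values " +
      "ON DUPLICATE KEY UPDATE c = VALUES(a) + VALUES(b)"
  }

  /** Flattens rows into the positional parameter order expected by `sql`. */
  def params(rows: Seq[(Int, Int, Int)]): Seq[Int] =
    rows.flatMap { case (a, b, c) => Seq(a, b, c) }
}
```

Inside a SimpleDBIO block, the statement from `MultiRowUpsert.sql(rows.size)` could be prepared once, each value from `MultiRowUpsert.params(rows)` bound with `setInt` at its 1-based position, and the whole thing sent with a single `executeUpdate()`. For very large inputs it is probably wise to split the rows into groups first, since the server limits the size of a single statement.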
    

Solution 2

As the Slick examples show, you can use the ++= function to insert rows using the JDBC batch insert feature. For instance:

val foos = TableQuery[FooTable]
val rows: Seq[Foo] = ...
foos ++= rows // here slick will use batch insert

You can also control the size of each batch by grouping the rows sequence:

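val batchSize = 1000
// ++= only builds an action; nothing is inserted until the action is run
val batches = rows.grouped(batchSize).map(group => foos ++= group).toVector
run(DBIO.sequence(batches)) // `run` as in the question's code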
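
To make the effect of grouping concrete, here is a small standalone sketch that uses only the Scala standard library. The names `Batching` and `batches` are hypothetical and introduced purely for illustration; no Slick types are involved.

```scala
object Batching {
  /** Splits `rows` into consecutive batches of at most `batchSize` elements. */
  def batches[A](rows: Seq[A], batchSize: Int): Vector[Seq[A]] = {
    require(batchSize > 0, "batchSize must be positive")
    // grouped keeps the original order; only the last batch may be smaller
    rows.grouped(batchSize).toVector
  }
}
```

With 2500 rows and a batch size of 1000, `Batching.batches(1 to 2500, 1000)` yields three batches of 1000, 1000 and 500 rows, so the final group is simply whatever is left over.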
Author by

opus111

Updated on June 07, 2022

Comments

  • opus111 almost 2 years

    What is the correct way to do a bulk insertOrUpdate in Slick 3.0?

    I am using MySQL where the appropriate query would be

    INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
    ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
    

    MySQL bulk INSERT or UPDATE

    Here is my current code, which is very slow :-(

    // FIXME -- this is slow but will stop repeats, an insertOrUpdate
    // functions for a list would be much better
    val rowsInserted = rows.map {
      row => await(run(TableQuery[FooTable].insertOrUpdate(row)))
    }.sum
    

    What I am looking for is the equivalent of

    def insertOrUpdate(values: Iterable[U]): DriverAction[MultiInsertResult, NoStream, Effect.Write]