What are key differences between sbt-pack and sbt-assembly?


Solution 1

(Disclaimer: I maintain sbt-assembly)

sbt-assembly

sbt-assembly creates a fat JAR: a single JAR file containing all class files from your code and libraries. Over time it has also gained ways of resolving conflicts when multiple JARs provide the same file path (such as a config or README file). It involves unzipping all the library JARs, so it's a bit slow, but the results are heavily cached.
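
As a rough illustration of the conflict resolution mentioned above, a merge strategy in build.sbt might look like the sketch below. It uses the `in` key syntax of the sbt-assembly 0.14.x line; the file names matched here are only examples and not taken from any particular project.

```scala
// build.sbt (sketch, sbt-assembly 0.14.x syntax)
// Decide what to do when several JARs contain the same path.
assemblyMergeStrategy in assembly := {
  // Example: concatenate config files that several libraries ship.
  case "application.conf" => MergeStrategy.concat
  // Example: drop README files, which are not needed at runtime.
  case PathList(ps @ _*) if ps.last.startsWith("README") => MergeStrategy.discard
  // Everything else falls back to the plugin's default strategy.
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```

Falling back to the previous strategy in the last case keeps the plugin's defaults for every path that is not explicitly handled.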

sbt-pack

sbt-pack keeps all the library JARs intact, copies them into the target/pack directory (as opposed to the Ivy cache where they would normally live), and makes a shell script for you to run them.

sbt-native-packager

sbt-native-packager is similar to sbt-pack, but it was started by sbt committer Josh Suereth and is now maintained by the highly capable Nepomuk Seiler (also known as muuki88). The plugin supports a number of formats, such as Windows msi files and Debian deb files. A recent addition is support for Docker images.
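
A minimal sketch of enabling it, assuming the plugin has already been added in project/plugins.sbt (the plugin coordinates and version are deliberately left out here) and assuming a hypothetical subproject named app:

```scala
// build.sbt (sketch; "app" is a hypothetical subproject name)
lazy val app = (project in file("app")).
  // JavaAppPackaging lays out bin/ start scripts and lib/ JARs, much like sbt-pack.
  // DockerPlugin adds tasks for building a Docker image from that layout.
  enablePlugins(JavaAppPackaging, DockerPlugin)
```

With these archetypes enabled, the stage task writes the script-plus-JARs layout under target/universal/stage, and docker:publishLocal builds an image on the local Docker daemon.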

All are viable means of creating deployment images. In certain cases like deploying your application to a web framework etc., it might make things easier if you're dealing with one file as opposed to a dozen.

Honorable mentions: sbt-proguard and sbt-onejar.

Solution 2

Although Eugene Yokota's explanation is complete, I would like to walk through the mentioned plugins, together with the built-in package command, in terms of how they are used and what results they generate.

Directory settings and build.sbt

lazy val commonSettings = Seq(
  organization := "stackOverFlow",
  scalaVersion := "2.11.12",
  version := "1.0",
)

lazy val app  = (project in file ("app")).
  enablePlugins(PackPlugin).
  settings(commonSettings)

The build.sbt file above declares a project called app that includes all the source files in the app directory. To enable the pack plugin, enablePlugins(PackPlugin) must be included in the build definition.

Also, I've put the lines below in the project/plugins.sbt file to make the plugins available in the project:

addSbtPlugin("org.xerial.sbt" % "sbt-pack" % "0.9.3")
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

The package command is built into sbt, so you don't have to declare a plugin for it with addSbtPlugin. The sbt-pack and sbt-assembly plugins, however, are not included in sbt by default, so you have to state that you want to use them. addSbtPlugin is the way to tell sbt "I want to use plugins xxx and yyy in my project".

I also implemented two contrived Scala files in ./app/src/main/scala:

AppBar.scala

class AppBar {
  def printDescription() = println(AppBar.getDescription)
}

object AppBar {
  private val getDescription: String = "Hello World, I am AppBar"

  def main (args: Array[String]): Unit = {
    val appBar = new AppBar
    appBar.printDescription()
  }
}

AppFoo.scala

class AppFoo {
  def printDescription() = println(AppFoo.getDescription)
}

object AppFoo {
  private val getDescription: String = "Hello World, I am AppFoo"

  def main (args: Array[String]): Unit = {
    val appFoo = new AppFoo
    appFoo.printDescription()
  }
}

sbt package

This is a very basic command built into sbt that helps you distribute your project as a jar file. The jar file generated by the package command is located at projectDirectory/target/scala-2.11/app_2.11-1.0.jar (the scalaVersion and version setting keys from build.sbt are used to generate the jar file name).

When you look inside the jar, you can see the class files that sbt generated by compiling the sources in app/src/main/scala. It also includes a MANIFEST file.

$ vi projectDirectory/target/scala-2.11/app_2.11-1.0.jar

META-INF/MANIFEST.MF
AppBar$.class
AppBar.class
AppFoo.class
AppFoo$.class

Note that it only includes the class files generated from the Scala files located in the app/src/main/scala directory. The jar file generated by the package command does not include any Scala library classes, such as the collections (e.g., collection.mutable.Map.class). Therefore, to execute the program you also need the Scala library on the classpath, because the generated jar file contains only the classes compiled from the sources that I implemented. That is why the jar file contains just AppBar.class, AppBar$.class for the companion object, and so on.

sbt-assembly

As Eugene Yokota mentioned, sbt-assembly also helps you distribute your project as a jar file; however, the generated jar file includes not only the class files compiled from your source code but also all the libraries needed to run the program. For example, to execute the main function defined in the AppFoo object, you need the Scala library. The same goes for external libraries, which you add to your project through the libraryDependencies key.

libraryDependencies ++= Seq("org.json4s" %% "json4s-jackson" % "3.5.3")

For example, you can include the json4s libraries in your project, and the jar files needed to support json4s will also be added to the final jar file generated by sbt-assembly. In other words, when you invoke assembly in sbt, it generates one jar file containing everything required to execute your program, so you don't need any other dependency to run it.

When you run the assembly command in your sbt shell, it generates one jar file in your target directory. In this case, you will find app-assembly-1.0.jar in the app/target/scala-2.11 directory. When you look inside the jar file, you can see that it contains lots of classes.
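
If you do not want everything bundled, the contents of the fat jar can be trimmed. The following is only a sketch, using sbt-assembly 0.14.x syntax and the commonSettings value from the build.sbt above. It marks json4s as "provided" and leaves out the Scala library; whether that suits you depends on what is already available where the jar will run.

```scala
// build.sbt (sketch, sbt-assembly 0.14.x syntax)
lazy val app = (project in file("app")).
  enablePlugins(PackPlugin).
  settings(commonSettings). // the Seq defined earlier in this answer
  settings(
    // "provided" dependencies are compiled against but left out of the fat jar.
    libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.5.3" % "provided",
    // Leave the scala/ classes out of the fat jar as well.
    assemblyOption in assembly :=
      (assemblyOption in assembly).value.copy(includeScala = false)
  )
```

A jar built this way is smaller, but it can only run where the Scala library and json4s are already on the classpath.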

$ vi projectDirectory/target/scala-2.11/app-assembly-1.0.jar
META-INF/MANIFEST.MF
scala/
scala/annotation/
scala/annotation/meta/
scala/annotation/unchecked/
scala/beans/
scala/collection/
scala/collection/concurrent/
scala/collection/convert/
scala/collection/generic/
scala/collection/immutable/
scala/collection/mutable/
scala/collection/parallel/
scala/collection/parallel/immutable/
scala/collection/parallel/mutable/
scala/collection/script/
scala/compat/
scala/concurrent/
scala/concurrent/duration/
scala/concurrent/forkjoin/
scala/concurrent/impl/
scala/concurrent/util/
scala/io/
scala/math/
scala/ref/
scala/reflect/
scala/reflect/macros/
scala/reflect/macros/internal/
scala/runtime/
scala/sys/
scala/sys/process/
scala/text/
scala/util/
scala/util/control/
scala/util/hashing/
scala/util/matching/
AppBar$.class
AppBar.class
AppFoo$.class
AppFoo.class
......

As mentioned before, the jar file generated by assembly contains all the dependencies needed to execute your program, such as the Scala library and external libraries, so you would expect to be able to invoke the main functions defined in the AppFoo and AppBar objects directly.

jaehyuk@ubuntu:~/work/sbt/app/target/scala-2.11$ java -cp './*' AppFoo
Hello World, I am AppFoo
jaehyuk@ubuntu:~/work/sbt/app/target/scala-2.11$ java -cp './*' AppBar
Hello World, I am AppBar

Yeah~ you can execute the main function using the generated jar file.
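
The runs above name the main class on the command line with -cp. Since the project has two main objects, sbt presumably does not pick one for the manifest by itself. As a sketch (sbt-assembly 0.14.x syntax, reusing commonSettings from the build.sbt above), you could choose the entry point explicitly:

```scala
// build.sbt (sketch, sbt-assembly 0.14.x syntax)
lazy val app = (project in file("app")).
  enablePlugins(PackPlugin).
  settings(commonSettings). // the Seq defined earlier in this answer
  settings(
    // Record AppFoo as Main-Class in the fat jar's manifest.
    mainClass in assembly := Some("AppFoo"),
    // Optional: set the output file name explicitly.
    assemblyJarName in assembly := "app-assembly-1.0.jar"
  )
```

With a Main-Class entry in the manifest, the jar should be runnable with java -jar app-assembly-1.0.jar, while AppBar can still be started with the -cp form shown above.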

sbt-pack

sbt-pack is almost the same as sbt-assembly in that it collects all the libraries your project depends on as jar files required to run your program. However, sbt-pack doesn't merge the dependencies into one jar file; instead, it keeps multiple jar files, one per library dependency plus one for your own classes (e.g., AppFoo.class).

Interestingly, it also automatically generates scripts for invoking each main function defined in your Scala source files, as well as a Makefile to install the program. Let's take a look at the pack directory created after you run the pack command in your sbt shell.

jaehyuk@ubuntu:~/work/sbt/app/target/pack$ ls
bin  lib  Makefile  VERSION
jaehyuk@ubuntu:~/work/sbt/app/target/pack$ ls bin/
app-bar  app-bar.bat  app-foo  app-foo.bat
jaehyuk@ubuntu:~/work/sbt/app/target/pack$ ls lib/
app_2.11-1.0.jar  sbt_2.12-0.1.0-SNAPSHOT.jar  scala-library-2.11.12.jar
jaehyuk@ubuntu:~/work/sbt/app/target/pack$ 

As shown above, two directories and two files are created. bin contains the launch scripts (each one runs the main method of one of the objects defined in your Scala files), lib contains all the jar files required to run your program, and the Makefile can be used to install your program and its dependent libraries on your system.

For details, please refer to the GitHub page of each plugin.
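
The script names above were derived automatically from the main objects. If you would rather control them yourself, sbt-pack has a packMain setting that maps script names to main classes. A minimal sketch, reusing commonSettings from the build.sbt above and assuming the mapping simply mirrors the generated names:

```scala
// build.sbt (sketch)
lazy val app = (project in file("app")).
  enablePlugins(PackPlugin).
  settings(commonSettings). // the Seq defined earlier in this answer
  settings(
    // script name in target/pack/bin -> fully qualified main class
    packMain := Map(
      "app-foo" -> "AppFoo",
      "app-bar" -> "AppBar"
    )
  )
```

After running pack, target/pack/bin/app-foo should start AppFoo with every jar in lib on the classpath.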

Jacek Laskowski
Author by


Updated on July 09, 2022

Comments

  • Jacek Laskowski
    Jacek Laskowski almost 2 years

    I've just stumbled upon the sbt-pack plugin. The development stream seems steady. It's surprising to me as I believed that the only plugin for (quoting sbt-pack's headline) "creating distributable Scala packages." is sbt-assembly (among the other features).

    What are the key differences between the plugins? When should I use one over the other?

  • pommedeterresautee
    pommedeterresautee about 10 years
    Hi, is there a solution more size efficient? I am using sbt-assembly but my little app using Akka and few other lib is 20 Mb! Not cool :-(
  • Alexey Romanov
    Alexey Romanov about 10 years
    @pommedeterresautee As mentioned, sbt-proguard may work for you.
  • matanster
    matanster about 8 years
    @EugeneYokota how does sbt package relate to these then?