Zeppelin throws java.lang.OutOfMemoryError: Java heap space

Can you try increasing the driver memory via SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.sh:

export SPARK_SUBMIT_OPTIONS="--driver-java-options -Xmx20g"

This thread may help: http://apache-zeppelin-users-incubating-mailing-list.75479.x6.nabble.com/Can-not-configure-driver-memory-size-td1513.html
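
The stack trace in the question fails inside IOUtils.toString, which runs in the driver JVM rather than on an executor, so spark.executor.memory alone would not be expected to help. As a rough sketch of the same idea, spark-submit also accepts a --driver-memory flag that can be passed through SPARK_SUBMIT_OPTIONS. The 4g value is only an assumed figure to size against the machine, and whether a given Zeppelin or sandbox build honors it should be checked after restarting Zeppelin:

```shell
# conf/zeppelin-env.sh
# Assumed value: adjust 4g to what the host can actually spare.
# --driver-memory is a standard spark-submit option; it sizes the driver heap,
# which is where IOUtils.toString builds the whole file as one String.
export SPARK_SUBMIT_OPTIONS="--driver-memory 4g"
```

Zeppelin only passes SPARK_SUBMIT_OPTIONS to spark-submit when SPARK_HOME is set, and the Spark interpreter has to be restarted for the change to take effect.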

Author: Kiran

Updated on June 15, 2022

Comments

  • Kiran
    Kiran almost 2 years

    I am trying to use Zeppelin with the following code:

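    import java.net.URL
    import java.nio.charset.Charset
    import org.apache.commons.io.IOUtils

    // Downloads the whole file into a single String on the driver, then splits it into lines
    val dataText = sc.parallelize(IOUtils.toString(new URL("http://XXX.XX.XXX.121:8090/my_data.txt"), Charset.forName("utf8")).split("\n"))

    case class Data(id: String, time: Long, value1: Double, value2: Int, mode: Int)
    // Split each tab-separated line, skip the header row, and map the fields onto Data
    val dat = dataText.map(s => s.split("\t")).filter(s => s(0) != "Header:").map(
        s => Data(s(0),
                s(1).toLong,
                s(2).toDouble,
                s(3).toInt,
                s(4).toInt
            )
    ).toDF()
    dat.registerTempTable("mydatatable")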
    

    This keeps throwing the following error:

    java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2367)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:535)
        at java.lang.StringBuilder.append(StringBuilder.java:204)
        at org.apache.commons.io.output.StringBuilderWriter.write(StringBuilderWriter.java:138)
        at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2002)
        at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1980)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1957)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1907)
        at org.apache.commons.io.IOUtils.toString(IOUtils.java:778)
        at org.apache.commons.io.IOUtils.toString(IOUtils.java:896)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:43)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:45)
        at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:47)
        at $iwC$$iwC$$iwC$$iwC.<init>(<console>:49)
        at $iwC$$iwC$$iwC.<init>(<console>:51)
        at $iwC$$iwC.<init>(<console>:53)
        at $iwC.<init>(<console>:55)
        at <init>(<console>:57)
        at .<init>(<console>:61)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
    

    I have already set the following in zeppelin-env.sh:

    export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.3.0.0-2557 -Dspark.executor.memory=4g"

    Any idea what I may be missing? The file I am parsing, my_data.txt, is about 200 MB.

    BTW, I am using the Hortonworks Sandbox, if that matters.

    EDIT 1: Here is my zeppelin-env.sh:

    export HADOOP_CONF_DIR=/etc/hadoop/conf
    export ZEPPELIN_PORT=9995
    export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.3.0.0-2557 -Dspark.executor.memory=4g"
    export SPARK_SUBMIT_OPTIONS="--driver-java-options -Xmx4g"
    export ZEPPELIN_INT_MEM="-Xmx4g"
    export SPARK_HOME=/usr/hdp/2.3.0.0-2557/spark
    

    Regards, Kiran