Spark/PySpark: An error occurred while trying to connect to the Java server (127.0.0.1:39543)


If you are using this machine type in Google Cloud:

custom (8 vCPUs, 200 GB)

then you are significantly oversubscribing memory, even ignoring the fact that spark.executor.memory has no effect in local mode.

spark.executor.memory (like spark.driver.memory) accounts only for the JVM heap and doesn't cover:

  • Memory used by the PySpark (Python) worker processes.
  • Memory used by the PySpark (Python) driver process.
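
To make the oversubscription concrete, here is a rough budget sketch. MemTotal and the 180G heap come from the question; the per-worker, Python driver, and OS figures are illustrative assumptions rather than measured values, and the helper name memory_budget_gib is hypothetical.

```python
def memory_budget_gib(total_ram_gib, jvm_heap_gib, n_python_workers,
                      per_worker_gib, python_driver_gib, os_reserve_gib):
    """Return (planned_gib, headroom_gib) for a single-machine PySpark setup."""
    planned = (jvm_heap_gib
               + n_python_workers * per_worker_gib
               + python_driver_gib
               + os_reserve_gib)
    return planned, total_ram_gib - planned


# MemTotal from the question's /proc/meminfo, converted from kB to GiB.
total_ram_gib = 206348252 / (1024.0 ** 2)

planned, headroom = memory_budget_gib(
    total_ram_gib=total_ram_gib,
    jvm_heap_gib=180,      # spark.driver.memory from the question
    n_python_workers=8,    # local[*] on 8 vCPUs
    per_worker_gib=2,      # assumption: depends entirely on the workload
    python_driver_gib=4,   # assumption: notebook kernel plus collected results
    os_reserve_gib=4,      # assumption: kernel, page cache, other services
)

print("planned {:.1f} GiB, headroom {:.1f} GiB".format(planned, headroom))
if headroom < 0:
    print("oversubscribed: lower spark.driver.memory or add RAM")
```

With these assumed figures the plan already exceeds physical RAM before counting any JVM off-heap usage, which is consistent with the "Cannot allocate memory" failure in the log.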

Even within the JVM, only part of the heap can be used for data processing (see Memory Management Overview), so setting spark.driver.maxResultSize equal to the total assigned memory does not make sense.
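
As a rough illustration of that last point: Spark's tuning guide describes the unified execution/storage region as (heap - 300 MiB) * spark.memory.fraction, with a default fraction of 0.6 in Spark 2.x and later. The helper name below is hypothetical, and the heap the JVM actually reports is usually a little smaller than the configured value, so treat the result as an upper estimate.

```python
RESERVED_MIB = 300       # fixed reservation described in Spark's tuning guide
MEMORY_FRACTION = 0.6    # default spark.memory.fraction (Spark 2.x and later)


def unified_memory_gib(heap_gib, memory_fraction=MEMORY_FRACTION):
    """Estimate the heap share available for execution and storage, in GiB."""
    heap_mib = heap_gib * 1024.0
    return (heap_mib - RESERVED_MIB) * memory_fraction / 1024.0


heap_gib = 180  # spark.driver.memory from the question
usable_gib = unified_memory_gib(heap_gib)
print("about {:.0f} GiB of a {} GiB heap is available for data processing".format(
    usable_gib, heap_gib))
```

Under these defaults, well under two thirds of the configured heap is available for execution and storage, so a spark.driver.maxResultSize of 180G can never be honoured in practice.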

Author: qwertz

Updated on March 05, 2020

Comments

  • qwertz
    qwertz about 4 years

    Good afternoon,

    Over the last two days I have had many problems connecting to the Java server. It is a little unusual because the error does not always occur, only sometimes.

    I am using PySpark together with Jupyter Notebook. Everything runs on a VM instance in Google Cloud, with this machine type:

    custom (8 vCPUs, 200 GB) 
    

    These are the other settings:

    import pyspark

    conf = pyspark.SparkConf().setAppName("App")
    conf = (conf.setMaster('local[*]')
            .set('spark.executor.memory', '180G')
            .set('spark.driver.memory', '180G')
            .set('spark.driver.maxResultSize', '180G'))
    
    sc = pyspark.SparkContext(conf=conf)
    sq = pyspark.sql.SQLContext(sc)
    

    I trained a random forest model and made predictions:

    model = rf.fit(train)
    predictions = model.transform(test)
    

    Afterwards I created the ROC curve and computed the AUC value.

    Then I wanted to see the confusion matrix:

    confusion_mat = metrics.confusionMatrix().toArray()
    print(confusion_mat)
    

    And now the error occurs:

    Traceback (most recent call last):
      File "/usr/lib/python2.7/SocketServer.py", line 290, in _handle_request_noblock
        self.process_request(request, client_address)
      File "/usr/lib/python2.7/SocketServer.py", line 318, in process_request
        self.finish_request(request, client_address)
      File "/usr/lib/python2.7/SocketServer.py", line 331, in finish_request
        self.RequestHandlerClass(request, client_address, self)
      File "/usr/lib/python2.7/SocketServer.py", line 652, in __init__
        self.handle()
      File "/usr/local/lib/python2.7/dist-packages/pyspark/accumulators.py", line 235, in handle
        num_updates = read_int(self.rfile)
      File "/usr/local/lib/python2.7/dist-packages/pyspark/serializers.py", line 577, in read_int
        raise EOFError
    EOFError
    ERROR:root:Exception while sending command.
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/py4j/java_gateway.py", line 883, in send_command
        response = connection.send_command(command)
      File "/usr/local/lib/python2.7/dist-packages/py4j/java_gateway.py", line 1040, in send_command
        "Error while receiving", e, proto.ERROR_ON_RECEIVE)
    Py4JNetworkError: Error while receiving
    ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:39543)
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/py4j/java_gateway.py", line 963, in start
        self.socket.connect((self.address, self.port))
      File "/usr/lib/python2.7/socket.py", line 228, in meth
        return getattr(self._sock,name)(*args)
    error: [Errno 111] Connection refused
    

    Here is the output from the console:

    OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f4998300000, 603979776, 0) failed; error='Cannot allocate memory' (errno=12)
    #
    # There is insufficient memory for the Java Runtime Environment to continue.
    # Native memory allocation (mmap) failed to map 603979776 bytes for committing reserved memory.
    

    Logfile:

    #
    # There is insufficient memory for the Java Runtime Environment to continue.
    # Native memory allocation (mmap) failed to map 603979776 bytes for committing reserved memory.
    # Possible reasons:
    #   The system is out of physical RAM or swap space
    #   In 32 bit mode, the process size limit was hit
    # Possible solutions:
    #   Reduce memory load on the system
    #   Increase physical memory or swap space
    #   Check if swap backing store is full
    #   Use 64 bit Java on a 64 bit OS
    #   Decrease Java heap size (-Xmx/-Xms)
    #   Decrease number of Java threads
    #   Decrease Java thread stack sizes (-Xss)
    #   Set larger code cache with -XX:ReservedCodeCacheSize=
    # This output file may be truncated or incomplete.
    #
    #  Out of Memory Error (os_linux.cpp:2643), pid=2377, tid=0x00007f1c94fac700
    #
    # JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
    # Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 )
    # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
    #
    
    ---------------  S Y S T E M  ---------------
    
    OS:DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=16.04
    DISTRIB_CODENAME=xenial
    DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"
    
    uname:Linux 4.13.0-1008-gcp #11-Ubuntu SMP Thu Jan 25 11:08:44 UTC 2018 x86_64
    libc:glibc 2.23 NPTL 2.23 
    rlimit: STACK 8192k, CORE 0k, NPROC 805983, NOFILE 1048576, AS infinity
    load average:7.69 4.51 3.57
    
    /proc/meminfo:
    MemTotal:       206348252 kB
    MemFree:         1298460 kB
    MemAvailable:     250308 kB
    Buffers:            6812 kB
    Cached:           438232 kB
    SwapCached:            0 kB
    Active:         203906416 kB
    Inactive:         339540 kB
    Active(anon):   203804300 kB
    Inactive(anon):     8392 kB
    Active(file):     102116 kB
    Inactive(file):   331148 kB
    Unevictable:        3652 kB
    Mlocked:            3652 kB
    SwapTotal:             0 kB
    SwapFree:              0 kB
    Dirty:              4688 kB
    Writeback:             0 kB
    AnonPages:      203805168 kB
    Mapped:            23076 kB
    Shmem:              8776 kB
    Slab:             114476 kB
    SReclaimable:      50640 kB
    SUnreclaim:        63836 kB
    KernelStack:        4752 kB
    PageTables:       404292 kB
    NFS_Unstable:          0 kB
    Bounce:                0 kB
    WritebackTmp:          0 kB
    CommitLimit:    103174124 kB
    Committed_AS:   205956256 kB
    VmallocTotal:   34359738367 kB
    VmallocUsed:           0 kB
    VmallocChunk:          0 kB
    HardwareCorrupted:     0 kB
    AnonHugePages:         0 kB
    ShmemHugePages:        0 kB
    ShmemPmdMapped:        0 kB
    CmaTotal:              0 kB
    CmaFree:               0 kB
    HugePages_Total:       0
    HugePages_Free:        0
    HugePages_Rsvd:        0
    HugePages_Surp:        0
    Hugepagesize:       2048 kB
    DirectMap4k:       71628 kB
    DirectMap2M:     4122624 kB
    DirectMap1G:    207618048 kB
    
    
    CPU:total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 85 stepping 3, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx
    

    Does anyone have any idea what the problem might be and how I can solve it? I am desperate. :(

    // I think the Java Runtime Environment does not have enough memory to continue... But what can I do?

    Thank you very much!

  • qwertz
    qwertz about 6 years
    Thanks for pointing that out. So what should I do now? Downsize the Google Cloud machine type to, say, 8 vCPUs and 52 GB? What would be a sensible setting that does not oversubscribe memory?