Garbage collection time very high in spark application causing program halt

20,401

You should fine-tune the GC configuration in your application. As a starting point you can look into the following JVM options:

-XX:+UseG1GC 
-XX:ConcGCThread=<XX> //Replace <XX> with appropriate configuration

Also, the following options might come in handy to look into GC details while fine-tuning:

-XX:+PrintFlagsFinal 
-XX:+PrintReferenceGC 
-verbose:gc 
-XX:+PrintGCDetails 
-XX:+PrintGCTimeStamps 
-XX:+PrintAdaptiveSizePolicy
-XX:+UnlockDiagnosticVMOptions 
-XX:+G1SummarizeConcMark  

For more details, check out this blog : https://databricks.com/blog/2015/05/28/tuning-java-garbage-collection-for-spark-applications.html

Share:
20,401
CRM
Author by

CRM

Learner

Updated on January 07, 2020

Comments

  • CRM
    CRM over 4 years

    I am running a spark application in local mode. I am using spark 1.5.2 with scala 2.10.4.

    When I use large datasets as input, I keep getting the following messages in the log.

    16/01/04 14:44:12 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (1 time so far)
    16/01/04 14:44:19 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (1 time so far)
    16/01/04 14:45:11 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (2 times so far)
    16/01/04 14:45:29 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (2 times so far)
    16/01/04 14:46:12 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (3 times so far)
    16/01/04 14:46:32 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (3 times so far)
    16/01/04 14:47:20 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (4 times so far)
    16/01/04 14:47:40 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 595.2 MB to disk (4 times so far)
    16/01/04 14:48:32 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 555.5 MB to disk (5 times so far)
    16/01/04 14:49:00 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (5 times so far)
    16/01/04 14:49:43 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 581.1 MB to disk (6 times so far)
    16/01/04 14:50:08 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (6 times so far)
    16/01/04 14:50:50 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (7 times so far)
    16/01/04 14:51:34 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (7 times so far)
    16/01/04 14:52:00 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (8 times so far)
    16/01/04 14:52:48 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (8 times so far)
    16/01/04 14:53:18 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (9 times so far)
    16/01/04 14:53:44 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (9 times so far)
    16/01/04 14:54:34 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (10 times so far)
    16/01/04 14:54:51 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (10 times so far)
    16/01/04 14:55:47 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 586.8 MB to disk (11 times so far)
    16/01/04 14:56:01 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (11 times so far)
    16/01/04 14:56:50 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (12 times so far)
    16/01/04 14:57:06 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.7 MB to disk (12 times so far)
    16/01/04 14:57:52 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 597.8 MB to disk (13 times so far)
    16/01/04 14:58:06 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (13 times so far)
    16/01/04 14:59:10 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (14 times so far)
    16/01/04 14:59:12 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 556.9 MB to disk (14 times so far)
    16/01/04 15:00:09 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 608.0 MB to disk (15 times so far)
    16/01/04 15:00:21 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.2 MB to disk (15 times so far)
    16/01/04 15:01:06 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (16 times so far)
    16/01/04 15:01:17 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 600.3 MB to disk (16 times so far)
    16/01/04 15:02:55 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (17 times so far)
    16/01/04 15:03:08 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 591.4 MB to disk (17 times so far)
    16/01/04 15:06:30 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 590.1 MB to disk (18 times so far)
    16/01/04 15:07:22 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (18 times so far)
    16/01/04 15:09:36 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (19 times so far)
    16/01/04 15:10:36 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (19 times so far)
    16/01/04 15:11:30 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (20 times so far)
    16/01/04 15:11:53 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (20 times so far)
    16/01/04 15:13:03 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 628.5 MB to disk (21 times so far)
    16/01/04 15:13:10 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (21 times so far)
    16/01/04 15:14:14 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 566.0 MB to disk (22 times so far)
    16/01/04 15:14:31 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (22 times so far)
    16/01/04 15:15:17 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (23 times so far)
    16/01/04 15:16:02 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (23 times so far)
    16/01/04 15:16:27 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 568.4 MB to disk (24 times so far)
    16/01/04 15:17:11 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 594.0 MB to disk (24 times so far)
    16/01/04 15:17:21 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 581.1 MB to disk (25 times so far)
    16/01/04 15:18:23 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (25 times so far)
    16/01/04 15:18:29 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (26 times so far)
    16/01/04 15:19:28 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 602.9 MB to disk (26 times so far)
    16/01/04 15:19:47 INFO collection.ExternalSorter: Thread 88 spilling in-memory map of 553.6 MB to disk (27 times so far)
    16/01/04 15:20:38 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 580.0 MB to disk (27 times so far)
    

    Finally program halts showing GC overhead limit exceeded error.

    ERROR executor.Executor: Exception in task 1.0 in stage 8.0 (TID 13)
    java.lang.OutOfMemoryError: GC overhead limit exceeded
    

    I have given maximum available memory in --driver-memory option.

    spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
    

    produces the following result(partial).

    14.283: [GC [PSYoungGen: 2079622K->217460K(2240000K)] 2079702K->217540K(7360000K), 0.2618140 secs] [Times: user=0.79 sys=0.14, real=0.26 secs]
    15.547: [GC [PSYoungGen: 2137460K->319987K(2240000K)] 2137540K->417388K(7360000K), 0.4224430 secs] [Times: user=1.43 sys=0.20, real=0.42 secs]
    17.011: [GC [PSYoungGen: 2239987K->319987K(2240000K)] 2337388K->478956K(7360000K), 0.4046090 secs] [Times: user=1.41 sys=0.13, real=0.40 secs]
    18.419: [GC [PSYoungGen: 2239987K->319988K(1763840K)] 2398956K->528884K(6883840K), 0.3581830 secs] [Times: user=1.35 sys=0.04, real=0.36 secs]
    19.355: [GC [PSYoungGen: 1763828K->558048K(2001920K)] 1972724K->1054180K(7121920K), 0.8325280 secs] [Times: user=2.87 sys=0.21, real=0.84 secs]
    20.689: [GC [PSYoungGen: 2001888K->558048K(1412096K)] 2498020K->1768356K(6532096K), 1.3117840 secs] [Times: user=4.62 sys=0.49, real=1.31 secs]
    22.424: [GC [PSYoungGen: 1412064K->852960K(1707008K)] 2622372K->2369128K(6827008K), 0.9998700 secs] [Times: user=3.71 sys=0.20, real=1.00 secs]
    23.876: [GC [PSYoungGen: 1706976K->448320K(1707008K)] 3223144K->2433368K(6827008K), 0.8896110 secs] [Times: user=3.13 sys=0.35, real=0.89 secs]
    25.384: [GC [PSYoungGen: 1302336K->46112K(1707008K)] 3287384K->2481212K(6827008K), 0.7484000 secs] [Times: user=2.02 sys=0.33, real=0.75 secs]
    26.638: [GC [PSYoungGen: 900128K->23776K(1707008K)] 3335228K->2505212K(6827008K), 0.0983860 secs] [Times: user=0.35 sys=0.03, real=0.10 secs]
    27.198: [GC [PSYoungGen: 877792K->31584K(1707008K)] 3359228K->2536964K(6827008K), 0.1347550 secs] [Times: user=0.48 sys=0.02, real=0.14 secs]
    27.873: [GC [PSYoungGen: 885600K->23872K(1707008K)] 3390980K->2560996K(6827008K), 0.1403900 secs] [Times: user=0.44 sys=0.02, real=0.14 secs]
    28.553: [GC [PSYoungGen: 877888K->23776K(1707008K)] 3415012K->2584924K(6827008K), 0.1400410 secs] [Times: user=0.44 sys=0.03, real=0.14 secs]
    29.259: [GC [PSYoungGen: 877792K->23840K(1707008K)] 3438940K->2608932K(6827008K), 0.1459830 secs] [Times: user=0.42 sys=0.02, real=0.14 secs]
    29.892: [GC [PSYoungGen: 877856K->23968K(1707008K)] 3462948K->2633076K(6827008K), 0.1455850 secs] [Times: user=0.44 sys=0.01, real=0.14 secs]
    30.540: [GC [PSYoungGen: 877984K->24128K(1707008K)] 3487092K->2657396K(6827008K), 0.1357490 secs] [Times: user=0.43 sys=0.02, real=0.13 secs]
    31.184: [GC [PSYoungGen: 878144K->33568K(1707008K)] 3511412K->2691116K(6827008K), 0.1540460 secs] [Times: user=0.45 sys=0.02, real=0.15 secs]
    31.894: [GC [PSYoungGen: 887584K->24032K(1707008K)] 3545132K->2715340K(6827008K), 0.1585390 secs] [Times: user=0.46 sys=0.02, real=0.16 secs]
    32.589: [GC [PSYoungGen: 878048K->23648K(1707008K)] 3569356K->2739100K(6827008K), 0.1436490 secs] [Times: user=0.49 sys=0.02, real=0.15 secs]
    33.252: [GC [PSYoungGen: 877664K->23872K(1707008K)] 3593116K->2763132K(6827008K), 0.1563300 secs] [Times: user=0.44 sys=0.02, real=0.15 secs]
    33.961: [GC [PSYoungGen: 877888K->24928K(1707008K)] 3617148K->2788244K(6827008K), 0.1603540 secs] [Times: user=0.47 sys=0.02, real=0.16 secs]
    34.659: [GC [PSYoungGen: 878944K->48160K(1765888K)] 3642260K->2836620K(6885888K), 0.1520400 secs] [Times: user=0.44 sys=0.02, real=0.16 secs]
    35.563: [GC [PSYoungGen: 968736K->55680K(1714688K)] 3757196K->2892556K(6834688K), 0.1839930 secs] [Times: user=0.51 sys=0.03, real=0.18 secs]
    16/01/07 14:14:06 INFO collection.ExternalSorter: Thread 77 spilling in-memory map of 251.0 MB to disk (1 time so far)
    36.584: [GC [PSYoungGen: 976256K->36800K(1861120K)] 3813132K->2929588K(6981120K), 0.1855900 secs] [Times: user=0.47 sys=0.04, real=0.19 secs]
    37.510: [GC [PSYoungGen: 1154496K->31872K(1816576K)] 4047284K->2961316K(6936576K), 0.1417060 secs] [Times: user=0.52 sys=0.02, real=0.14 secs]
    38.185: [GC [PSYoungGen: 1149568K->275840K(1892352K)] 4079012K->3236228K(7012352K), 0.3095680 secs] [Times: user=1.18 sys=0.02, real=0.31 secs]
    39.208: [GC [PSYoungGen: 1510784K->32832K(1897472K)] 4471172K->3270092K(7017472K), 0.4793580 secs] [Times: user=1.47 sys=0.20, real=0.48 secs]
    40.382: [GC [PSYoungGen: 1267776K->29376K(1968640K)] 4505036K->3299732K(7088640K), 0.1475130 secs] [Times: user=0.54 sys=0.03, real=0.14 secs]
    41.289: [GC [PSYoungGen: 1371840K->33120K(1933824K)] 4642196K->3333140K(7053824K), 0.1600760 secs] [Times: user=0.60 sys=0.01, real=0.16 secs]
    42.200: [GC [PSYoungGen: 1375584K->41440K(2037760K)] 4675604K->3374892K(7157760K), 0.1626620 secs] [Times: user=0.60 sys=0.03, real=0.17 secs]
    43.299: [GC [PSYoungGen: 1522656K->42380K(2003456K)] 4856108K->3417537K(7123456K), 0.1908380 secs] [Times: user=0.70 sys=0.04, real=0.19 secs]
    16/01/07 14:14:14 INFO collection.ExternalSorter: Thread 90 spilling in-memory map of 292.5 MB to disk (1 time so far)
    16/01/07 14:14:14 INFO collection.ExternalSorter: Thread 78 spilling in-memory map of 292.5 MB to disk (1 time so far)
    16/01/07 14:14:15 INFO collection.ExternalSorter: Thread 89 spilling in-memory map of 296.0 MB to disk (1 time so far)
    44.820: [GC [PSYoungGen: 1523596K->41984K(2101760K)] 4898753K->3459769K(7221760K), 0.2204760 secs] [Times: user=0.76 sys=0.04, real=0.22 secs]
    47.428: [GC [PSYoungGen: 1654784K->8352K(2071040K)] 5072571K->3466699K(7191040K), 0.1032460 secs] [Times: user=0.35 sys=0.03, real=0.10 secs]
    49.528: [GC [PSYoungGen: 1621152K->17536K(2150400K)] 5079499K->3480019K(7270400K), 0.0703050 secs] [Times: user=0.27 sys=0.00, real=0.07 secs]
    51.109: [GC [PSYoungGen: 1733760K->26630K(2125824K)] 5196243K->3506130K(7245824K), 0.0819420 secs] [Times: user=0.30 sys=0.01, real=0.08 secs]
    52.975: [GC [PSYoungGen: 1742854K->247786K(2106368K)] 5222354K->3752084K(7226368K), 0.2441990 secs] [Times: user=0.94 sys=0.02, real=0.24 secs]
    56.163: [GC [PSYoungGen: 1970154K->146112K(2141184K)] 5474452K->3894756K(7261184K), 0.4435860 secs] [Times: user=1.57 sys=0.17, real=0.44 secs]
    59.494: [GC [PSYoungGen: 1868480K->39296K(2174976K)] 5617124K->3934524K(7294976K), 0.2607190 secs] [Times: user=0.92 sys=0.11, real=0.26 secs]
    

    When I see the details of Stages in Spark-UI, GC time looks to be high. How do I handle this problem?

    gc limit