Spark: How to set spark.yarn.executor.memoryOverhead property in spark-submit
Solution 1
Please find an example below. The values can also be set via SparkConf.
Example:
./bin/spark-submit \
  --class [your class] \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 17 \
  --conf spark.yarn.executor.memoryOverhead=4096 \
  --executor-memory 35G \
  --conf spark.yarn.driver.memoryOverhead=4096 \
  --driver-memory 35G \
  --executor-cores 5 \
  --driver-cores 5 \
  --conf spark.default.parallelism=170 \
  /path/to/examples.jar
Here --executor-memory and --driver-memory set the heap size per executor process and for the driver process, while the memoryOverhead settings reserve additional off-heap memory on top of each heap.
Solution 2
spark.yarn.executor.memoryOverhead
has now been deprecated:
WARN spark.SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
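On the command line, the renamed key can be passed with --conf in the same way. A sketch (memory sizes and the jar path are placeholders carried over from Solution 1):

```shell
./bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.executor.memoryOverhead=4096 \
  --conf spark.driver.memoryOverhead=4096 \
  --executor-memory 35G \
  --driver-memory 35G \
  /path/to/examples.jar
```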
You can programmatically set spark.executor.memoryOverhead
by passing it as a config:
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master('yarn')
    .appName('StackOverflow')
    .config('spark.driver.memory', '35g')
    .config('spark.executor.cores', 5)
    .config('spark.executor.memory', '35g')
    .config('spark.dynamicAllocation.enabled', True)
    .config('spark.dynamicAllocation.maxExecutors', 25)
    .config('spark.executor.memoryOverhead', '4096')
    .getOrCreate()
)
sc = spark.sparkContext
Author: Micah Pearce
Updated on June 30, 2022

Comments

Micah Pearce, about 2 years ago:
In Spark 2.0, how do you set spark.yarn.executor.memoryOverhead when you run spark-submit?
I know that for settings like spark.executor.cores you can pass --executor-cores 2. Is it the same pattern for this property, e.g. --yarn-executor-memoryOverhead 4096?