How to break a line into multiple lines in PySpark
Solution 1
You can use backslashes and parentheses.
from pyspark.sql import SparkSession

# Each trailing backslash continues the statement on the next line.
spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .config("spark.some.config.option", "some-value") \
    .getOrCreate()
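Since parentheses are mentioned as well, here is a minimal sketch of the same builder call using implicit line continuation inside parentheses, so no backslashes are needed:
# Wrapping the chain in parentheses lets Python continue the
# expression across lines without trailing backslashes.
spark = (
    SparkSession.builder
    .appName("Python Spark SQL basic example")
    .config("spark.some.config.option", "some-value")
    .getOrCreate()
)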
Edit: and an example from a spark-submit job:
./bin/spark-submit \
--master <yarn> \
--deploy-mode <cluster> \
--num-executors <2> \
--executor-cores <2> \
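The command above still ends with a trailing backslash, so more arguments are expected to follow. A hedged sketch of one complete invocation, assuming concrete values for the flags and a hypothetical application script named your_app.py:
# Complete spark-submit call; the final line has no trailing
# backslash, which is what ends the continued command.
./bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 2 \
  --executor-cores 2 \
  your_app.py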
Solution 2
There is no need to add a blank space before the backslash in PySpark:
from pyspark import SparkConf, SparkContext

conf = SparkConf()
# Each set() returns the same SparkConf, so the calls chain directly.
conf.setAppName('appName')\
    .set("spark.executor.memory", "10g")\
    .set("spark.executor.cores", 5)
sc = SparkContext(conf=conf)
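The same idea applies to the configuration chain from the question below; as a sketch, the parenthesized form of that chain (reusing the settings shown there) needs no backslashes at all:
# Implicit continuation inside parentheses; no backslashes needed.
conf = (
    SparkConf()
    .setAppName('Learnfit_Recommender')
    .set("spark.executor.memory", "10g")
    .set("spark.executor.cores", 5)
    .set("spark.executor.instances", 50)
    .set("spark.yarn.executor.memoryOverhead", 1024)
)
sc = SparkContext(conf=conf)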

Author by
Baktaawar
Updated on June 15, 2022
Comments
-
Baktaawar almost 2 years ago
I know that in Python one can use a backslash or even parentheses to break a line into multiple lines.
But somehow in PySpark, when I do this, the next line shows up in red, which suggests something might be wrong.
(conf.setAppName('Learnfit_Recommender')
 .set("spark.executor.memory", "10g")
 .set("spark.executor.cores", 5)
 .set("spark.executor.instances", 50)
 .set("spark.yarn.executor.memoryOverhead", 1024)
)
EDIT 1: I changed the parentheses to backslashes. As you can see in the image, a few of the '.' characters appear in red and even the sc variable is marked in red.
Is this the correct way to break lines in PySpark?
-
gold_cy over 7 years ago: It looks like you have a | character, unless that's your cursor. I use PySpark in the Jupyter Notebook as well, but why are you building it? You can simply append the Spark path to your bash profile. Also, it seems redundant to write conf = conf since you already specified it in your first line.
-
Baktaawar over 7 years ago: No, that's the cursor. I don't want to add it to my bash profile yet, since I am currently testing the settings. Once the right settings are found, I can add them to my bash profile.