How to break lines into multiple lines in Pyspark

Solution 1

You can use backslashes and parentheses:

from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .config("spark.some.config.option", "some-value") \
    .getOrCreate()
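
The same chain can also be wrapped in parentheses instead of using backslashes; a minimal sketch, reusing the app name and config key from the example above:

spark = (
    SparkSession.builder
    .appName("Python Spark SQL basic example")
    .config("spark.some.config.option", "some-value")
    .getOrCreate()
)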

Edit: and an example from a spark-submit job:

./bin/spark-submit \
  --master <yarn> \
  --deploy-mode <cluster> \
  --num-executors <2> \
  --executor-cores <2> \
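
The trailing backslash above is where the application itself would follow. A complete invocation might look like the sketch below; the resource numbers are arbitrary and the script path assumes the pi.py example that ships with Spark:

./bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 2 \
  --executor-cores 2 \
  examples/src/main/python/pi.py 100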

Solution 2

There is no need to add a blank space before the backslash in PySpark:

from pyspark import SparkConf, SparkContext

conf = SparkConf()
conf.setAppName('appName')\
.set("spark.executor.memory","10g")\
.set("spark.executor.cores",5)
sc = SparkContext(conf=conf)
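
To confirm that the chained settings actually took effect, one option (a small sketch, assuming the sc created above) is to read the configuration back from the running context:

# Print every key/value pair the running context was started with
for key, value in sc.getConf().getAll():
    print(key, value)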

Comments

  • Baktaawar, almost 2 years ago:

    I know that in Python one can use a backslash or even parentheses to break a line into multiple lines.

    But somehow in PySpark, when I do this, the next line shows up in red, which suggests something might be wrong.

    (conf.setAppName('Learnfit_Recommender')
     .set("spark.executor.memory", "10g")
     .set("spark.executor.cores",5)
     .set("spark.executor.instances",50)
     .set("spark.yarn.executor.memoryOverhead",1024)
    )
    

    EDIT 1: I changed the parentheses to backslashes. As you can see in the image, a few of the '.' characters are shown in red and even the sc variable is marked as red.

    [screenshot of the notebook cell showing the red highlighting]

    Is this the correct way to break lines in PySpark?

  • gold_cy, over 7 years ago:
    It looks like you have a | character, unless that's your cursor. I use PySpark in the Jupyter Notebook as well, but why are you building it? You can simply append the Spark path to your bash profile. It also seems redundant to write conf = conf since you already specified it in your first line.
  • Baktaawar, over 7 years ago:
    No, that's the cursor. I don't want to add it to my bash profile yet, since I am currently testing the settings. Once the right settings are found, I can add them to the bash profile.