Parquet file to CSV conversion


Solution 1

val df = spark.read.parquet("infile.parquet")

df.write.csv("outfile.csv")

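Both "infile.parquet" and "outfile.csv" are resolved against the default file system, which is HDFS on a typical cluster. Note that "outfile.csv" is created as a directory containing one or more part files, not as a single CSV file.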
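If a header row and a single output file are wanted, something like the following sketch may help. It is not part of the original answer: the scratch paths, the column names `id` and `label`, and the two sample rows are made up so the example runs without existing data, and `coalesce(1)` is only sensible when the data fits comfortably in one partition.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Reuses the spark-shell session if one exists; otherwise starts a local one.
val spark = SparkSession.builder()
  .appName("parquet-to-csv")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical scratch locations (file:// URIs) so the sketch needs no real input.
val base = Files.createTempDirectory("parquet-to-csv").toUri.toString.stripSuffix("/")
val parquetPath = s"$base/infile.parquet"
val csvPath = s"$base/outfile.csv"

// Stand-in for the real Parquet input: two made-up rows.
Seq((1, "a"), (2, "b")).toDF("id", "label").write.parquet(parquetPath)

// The actual conversion: read Parquet, write CSV.
val df = spark.read.parquet(parquetPath)
df.coalesce(1)                 // one partition, so one part file
  .write
  .option("header", "true")    // write column names as the first line
  .mode("overwrite")           // replace the output directory if it exists
  .csv(csvPath)

// Read the CSV back to check that the rows and header survived.
val roundTrip = spark.read.option("header", "true").csv(csvPath)
roundTrip.show()
```

Even with `coalesce(1)`, `csvPath` is still a directory; the data ends up in a single `part-*.csv` file inside it.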

Solution 2

This worked for me with Spark 2.1.0. First start the Spark shell, for example:

./bin/spark-shell

Then run:

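val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// read.parquet replaces parquetFile, which has been deprecated since Spark 1.4
val df = sqlContext.read.parquet("parquet-file.parquet")
df.printSchema()
// "directory" is the output directory that will hold the CSV part files
df.write.format("csv").save("directory")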

This creates one or more CSV part files inside the output directory named "directory".
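To see what "CSV files in a directory" looks like in practice, here is a small sketch that writes a DataFrame with `format("csv")` and then lists the part files. It is an illustration under assumptions, not the answerer's code: the local scratch directory, the use of `spark.range` as sample data, and `repartition(2)` are all chosen just for the demonstration, and the listing uses `java.io.File`, so it only applies when the output is on the local file system.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Reuses the spark-shell session if one exists; otherwise starts a local one.
val spark = SparkSession.builder()
  .appName("csv-part-files")
  .master("local[*]")
  .getOrCreate()

// Hypothetical local output directory; it must not exist before save() runs.
val outDir = Files.createTempDirectory("csv-out").resolve("directory")

// Ten made-up rows spread over two partitions.
spark.range(0, 10)
  .repartition(2)
  .write
  .format("csv")
  .option("header", "true")
  .save(outDir.toUri.toString)

// Collect the names of the CSV part files Spark produced (skips _SUCCESS and .crc files).
val partFiles = outDir.toFile.listFiles()
  .map(_.getName)
  .filter(_.endsWith(".csv"))
  .sorted
partFiles.foreach(println)
```

Spark typically writes one part file per non-empty partition, which is why the number of files depends on how the DataFrame is partitioned rather than on the output name.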

Authored by Avneet

Updated on June 04, 2022

Comments

  • Avneet, almost 2 years ago

    I want to convert my Parquet file to CSV. Is there a way to do this? I can only find examples that convert CSV to Parquet, not the other way around.