Write spark dataframe to file using python and '|' delimiter
Solution 1
You can write to CSV, choosing | as the delimiter:
df.write.option("sep","|").option("header","true").csv(filename)
This would not be 100% identical to the formatted table output, but it would be close.
Alternatively, you can collect to the driver and write the file yourself, e.g.:
myprint(df.collect())
or
myprint(df.take(100))
df.collect and df.take return a list of Row objects.
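Once the rows are on the driver, the standard-library csv module can write them with a | delimiter. A minimal sketch: here `rows` stands in for the output of df.collect() (Row objects behave like tuples), and the header and output filename are illustrative assumptions.

```python
import csv

# Stand-in for df.collect(); Row objects unpack like tuples.
rows = [("row1", 1, 14, 17), ("row2", 3, 12, 2343)]
header = ["Summary", "col1", "col2", "col3"]

# Write pipe-delimited lines to a single local file on the driver.
with open("output.txt", "w", newline="") as f:
    writer = csv.writer(f, delimiter="|")
    writer.writerow(header)
    writer.writerows(rows)
```

Note this collects everything into driver memory, so it only suits small results.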
Lastly, you can collect to the driver using toPandas() and use pandas tools to write the file.
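A sketch of that last route, assuming pandas is installed: `pdf` stands in for the result of df.toPandas() (a small DataFrame is built directly here), and the output filename is an assumption.

```python
import pandas as pd

# Stand-in for df.toPandas(); toPandas() pulls all data to the driver.
pdf = pd.DataFrame(
    {"Summary": ["row1", "row2"], "col1": [1, 3], "col2": [14, 12], "col3": [17, 2343]}
)

# pandas writes a single pipe-delimited file, unlike Spark's per-partition output.
pdf.to_csv("output_pandas.txt", sep="|", index=False)
```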
Solution 2
In Spark 2.0+, you can use the built-in CSV writer. The delimiter is , by default, and you can set it to |:
df.write \
.format('csv') \
.options(delimiter='|') \
.save('target/location')
Author by
Brian Waters
Updated on January 27, 2020
Comments
- Brian Waters, over 4 years ago:
I have constructed a Spark dataframe from a query. What I wish to do is print the dataframe to a text file with all information delimited by '|', like the following:
+-------+----+----+----+
|Summary|col1|col2|col3|
+-------+----+----+----+
|row1   |1   |14  |17  |
|row2   |3   |12  |2343|
+-------+----+----+----+
How can I do this?