Spark DataFrame: How to specify schema when writing as Avro
After applying the patch in https://github.com/databricks/spark-avro/pull/222/, I was able to specify a schema on write as follows:
df.write.option("forceSchema", myCustomSchemaString).avro("/path/to/outputDir")
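For context, here is a minimal sketch of what the full write might look like. It assumes a spark-avro build that includes the patch from the pull request above, since the "forceSchema" option comes from that patch. The sample DataFrame, the Person schema, and the object name are hypothetical, and whether the patched writer accepts this exact field-type mapping is also an assumption.

```scala
import org.apache.spark.sql.SparkSession
import com.databricks.spark.avro._ // adds the .avro(...) method to DataFrameWriter

// Hypothetical driver object, for illustration only.
object WriteAvroWithCustomSchema {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("avro-force-schema")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; column names match the Avro field names below.
    val df = Seq(("alice", 30), ("bob", 25)).toDF("name", "age")

    // Hypothetical Avro schema (JSON) to use in place of the auto-generated one.
    val myCustomSchemaString =
      """{
        |  "type": "record",
        |  "name": "Person",
        |  "namespace": "com.example",
        |  "fields": [
        |    {"name": "name", "type": "string"},
        |    {"name": "age", "type": "int"}
        |  ]
        |}""".stripMargin

    // "forceSchema" is only recognized by a spark-avro build with the patch applied.
    df.write.option("forceSchema", myCustomSchemaString).avro("/path/to/outputDir")

    spark.stop()
  }
}
```

The schema is passed as a JSON string, so it can also be loaded from an .avsc file and handed to the option unchanged.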
Author: erwaman
Updated on June 23, 2022
Comments
-
erwaman almost 2 years
I want to write a DataFrame in Avro format using a provided Avro schema rather than Spark's auto-generated schema. How can I tell Spark to use my custom schema on write?
-
erwaman about 6 years
The "avroSchema" option has no effect for .write. It's only used for .read by the DefaultSource. See github.com/databricks/spark-avro/blob/….
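To illustrate the read-side behavior described in this comment, here is a rough sketch of passing "avroSchema" when loading Avro files with spark-avro. The input path, the schema contents, and the object name are hypothetical; only the option name and its read-only effect come from the comment above.

```scala
import org.apache.spark.sql.SparkSession
import com.databricks.spark.avro._ // adds the .avro(...) method to DataFrameReader

// Hypothetical driver object, for illustration only.
object ReadAvroWithSchema {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("avro-read-schema")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical schema (JSON) to apply when reading.
    val readerSchema =
      """{
        |  "type": "record",
        |  "name": "Person",
        |  "namespace": "com.example",
        |  "fields": [
        |    {"name": "name", "type": "string"},
        |    {"name": "age", "type": "int"}
        |  ]
        |}""".stripMargin

    // "avroSchema" is honored on read; the same option on df.write is ignored.
    val df = spark.read.option("avroSchema", readerSchema).avro("/path/to/inputDir")
    df.printSchema()

    spark.stop()
  }
}
```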