Spark DataFrame: How to specify schema when writing as Avro

After applying the patch in https://github.com/databricks/spark-avro/pull/222/, I was able to specify a schema on write as follows:

df.write.option("forceSchema", myCustomSchemaString).avro("/path/to/outputDir")
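
For context, myCustomSchemaString in the line above holds an Avro schema in JSON form. The following is a minimal sketch of a complete job, not the patch author's reference usage. It assumes a spark-avro build with the linked patch applied is on the classpath (the "forceSchema" option comes from that patch, not from the stock library). The record name User, its namespace, and the fields id and name are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.SparkSession
import com.databricks.spark.avro._ // adds the .avro(...) method to DataFrameWriter

object WriteAvroWithForcedSchema {
  // Hypothetical Avro schema; record and field names are assumptions for this sketch.
  val myCustomSchemaString: String =
    """{
      |  "type": "record",
      |  "name": "User",
      |  "namespace": "com.example",
      |  "fields": [
      |    {"name": "id", "type": "long"},
      |    {"name": "name", "type": ["null", "string"], "default": null}
      |  ]
      |}""".stripMargin

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("avro-forced-schema")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Column names and types are chosen to line up with the schema above.
    val df = Seq((1L, "alice"), (2L, "bob")).toDF("id", "name")

    // "forceSchema" is only recognized when the patch is applied.
    df.write.option("forceSchema", myCustomSchemaString).avro("/path/to/outputDir")

    spark.stop()
  }
}
```

Whether a mismatch between the DataFrame columns and the supplied schema fails at write time or is silently coerced depends on the patched writer, so it is worth reading the output back and checking the embedded schema.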
Author: erwaman

Updated on June 23, 2022

Comments

  • erwaman, almost 2 years ago

    I want to write a DataFrame in Avro format using a provided Avro schema rather than Spark's auto-generated schema. How can I tell Spark to use my custom schema on write?

  • erwaman, about 6 years ago
    The "avroSchema" option has no effect for .write. It's only used for .read by the DefaultSource. See github.com/databricks/spark-avro/blob/….
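
The comment above concerns the databricks/spark-avro package. As a point of comparison, the Avro data source that ships with Apache Spark itself (the separate spark-avro module, available since Spark 2.4) documents "avroSchema" as applying to writes as well as reads. A hedged sketch, assuming that module is on the classpath and using a hypothetical Event record:

```scala
import org.apache.spark.sql.SparkSession

object WriteAvroBuiltIn {
  // Hypothetical schema; the record and field names are assumptions.
  val schemaJson: String =
    """{"type": "record", "name": "Event", "fields": [{"name": "id", "type": "long"}]}"""

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("avro-builtin-schema")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(1L, 2L, 3L).toDF("id")

    // Built-in data source: format("avro") instead of the implicit .avro(...) method.
    df.write.format("avro").option("avroSchema", schemaJson).save("/path/to/outputDir")

    spark.stop()
  }
}
```

This does not change the behavior described in the comment for the older package, where the patch is still needed.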