Generate an Avro Schema from a Java Object

Solution 1

Take a look at the Java reflection API.

Getting a schema looks like this, where T stands for the class to reflect on (for example, MyPojo.class):

Schema schema = ReflectData.get().getSchema(T);

See Doug's example on another question for working code.

Credit for this answer belongs to Sean Busby.
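
To make the idea behind reflection-based schema generation concrete, here is a minimal conceptual sketch that uses only the JDK. It is not Avro's implementation and not a substitute for ReflectData. The class ReflectSchemaSketch, its methods, and the demo POJO are hypothetical names invented for illustration. It handles only a few primitive types and nested records, and assumes the class graph has no cycles.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch only: shows how field metadata can be turned into an
// Avro-style record schema. Use ReflectData for real schemas.
class ReflectSchemaSketch {

    // Hypothetical POJO used for the demonstration.
    static class ExportData {
        private String body;
        private int count;
        private double offset;
        private transient String cache; // skipped by the sketch
    }

    // Maps a small set of Java types to Avro primitive type names.
    static String avroType(Class<?> type) {
        if (type == String.class) return "\"string\"";
        if (type == int.class || type == Integer.class) return "\"int\"";
        if (type == long.class || type == Long.class) return "\"long\"";
        if (type == float.class || type == Float.class) return "\"float\"";
        if (type == double.class || type == Double.class) return "\"double\"";
        if (type == boolean.class || type == Boolean.class) return "\"boolean\"";
        // Anything else is treated as a nested record (no cycle detection here).
        return recordSchema(type);
    }

    // Builds a JSON record schema from the declared, non-static, non-transient fields.
    static String recordSchema(Class<?> type) {
        List<String> fields = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || f.isSynthetic()) {
                continue;
            }
            fields.add("{\"name\":\"" + f.getName() + "\",\"type\":" + avroType(f.getType()) + "}");
        }
        return "{\"type\":\"record\",\"name\":\"" + type.getSimpleName()
                + "\",\"fields\":[" + String.join(",", fields) + "]}";
    }

    public static void main(String[] args) {
        System.out.println(recordSchema(ExportData.class));
    }
}
```

The point of the sketch is that no hand-written JSON is needed: the field names and types are already available from the class itself, which is the information ReflectData works from.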

Solution 2

Here's how to generate an Avro schema from a POJO definition with Jackson's Avro data format module (jackson-dataformat-avro):

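import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.avro.AvroFactory;
import com.fasterxml.jackson.dataformat.avro.AvroSchema;
import com.fasterxml.jackson.dataformat.avro.schema.AvroSchemaGenerator;

// RootType is the POJO class to describe
ObjectMapper mapper = new ObjectMapper(new AvroFactory());
AvroSchemaGenerator gen = new AvroSchemaGenerator();
mapper.acceptJsonFormatVisitor(RootType.class, gen);
AvroSchema schemaWrapper = gen.getGeneratedSchema();
org.apache.avro.Schema avroSchema = schemaWrapper.getAvroSchema();
String asJson = avroSchema.toString(true); // pretty-printed JSON schema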

Solution 3

Example

POJO class

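public class ExportData implements Serializable {
    private String body;
    public ExportData() {}                                // no-arg constructor, used by Avro reflection when reading
    public ExportData(String body) { this.body = body; }  // used in the Serialize step
    // ... getters and setters
}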

Serialize

File file = new File(fileName);
Schema schema = ReflectData.get().getSchema(ExportData.class);
DatumWriter<ExportData> writer = new ReflectDatumWriter<>(ExportData.class);
// try-with-resources closes the file even if append() throws
try (DataFileWriter<ExportData> dataFileWriter = new DataFileWriter<>(writer)) {
    dataFileWriter.create(schema, file);
    // resultSet and Row come from the surrounding query code (not shown)
    for (Row row : resultSet) {
        String rec = row.getString(0);
        dataFileWriter.append(new ExportData(rec));
    }
}

Deserialize

File file = new File(avroFilePath);
DatumReader<ExportData> datumReader = new ReflectDatumReader<>(ExportData.class);
// try-with-resources closes the underlying file when reading is done
try (DataFileReader<ExportData> dataFileReader = new DataFileReader<>(file, datumReader)) {
    ExportData record = null;
    while (dataFileReader.hasNext()) {
        // passing the previous record lets Avro reuse the object
        record = dataFileReader.next(record);
        // process record
    }
}

Author: Richard Le

Updated on July 09, 2022

Comments

  • Richard Le almost 2 years

    Apache Avro provides a compact, fast binary data format with rich data structures for serialization. However, it requires the user to define a schema (in JSON) for each object that needs to be serialized.

    In some cases this is not possible (e.g. the class of that Java object has members whose types are classes from external libraries). Hence, I wonder whether there is a tool that can read the information from an object's .class file and generate the Avro schema for it (like Gson uses an object's .class information to convert it to a JSON string).

  • Remis Haroon - رامز over 2 years
    This works well with non-nullable columns, but I have some fields that are nullable. Is there a way to make those fields nullable in the Avro schema? Otherwise it throws an exception: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.NullPointerException: in models.RawData in double null of double in field offset of models.RawData
  • Admin about 2 years
    For nullable fields, use the AllowNull subclass: Schema schema = ReflectData.AllowNull.get().getSchema(T);
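
The AllowNull suggestion in the last comment can be sketched as follows, together with Avro's @Nullable annotation for marking individual fields. This assumes the Avro library is on the classpath. The RawData class here is a hypothetical stand-in for the commenter's models.RawData, and NullableSchemaSketch is a name invented for the example.

```java
import org.apache.avro.Schema;
import org.apache.avro.reflect.Nullable;
import org.apache.avro.reflect.ReflectData;

// Hypothetical stand-in for the commenter's models.RawData class.
class RawData {
    @Nullable Double offset; // annotated: nullable with either ReflectData variant
    String label;            // not annotated: nullable only with AllowNull
}

class NullableSchemaSketch {

    // Default reflection: only fields annotated with @Nullable accept null.
    static Schema strictSchema() {
        return ReflectData.get().getSchema(RawData.class);
    }

    // AllowNull variant: non-primitive fields become unions that include null.
    static Schema allowNullSchema() {
        return ReflectData.AllowNull.get().getSchema(RawData.class);
    }

    public static void main(String[] args) {
        System.out.println(strictSchema().toString(true));
        System.out.println(allowNullSchema().toString(true));
    }
}
```

When writing data with a nullable schema, the writer should be built from the same ReflectData instance and schema, for example new ReflectDatumWriter<>(schema, ReflectData.AllowNull.get()), so that the data and the schema agree. Note that a primitive double field can never hold null; the field has to be declared as Double for either approach to help.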