How to generate fields of type String instead of CharSequence using Avro?
Solution 1
If you want all your string fields to be instances of java.lang.String,
then you only have to configure the compiler:
java -jar /path/to/avro-tools-1.7.7.jar compile -string schema
or, if you are using the Maven plugin:
<plugin>
<groupId>org.apache.avro</groupId>
<artifactId>avro-maven-plugin</artifactId>
<version>1.7.7</version>
<configuration>
<stringType>String</stringType>
</configuration>
[...]
</plugin>
If you want one specific field to be of type java.lang.String, then you can't: the compiler does not support it. You can use "java-class" with the reflect API, but the compiler ignores it.
If you want to learn more, you can set a breakpoint in SpecificCompiler at line 372 (Avro 1.7.7). You can see that, before the call to addStringType(), the schema has the required information in its props field. If you pass this schema to SpecificCompiler.javaType(), it will do what you want. But addStringType() then replaces your schema with a static one. I will most likely ask about this on the mailing list, since I don't see the point.
Solution 2
You can set it at the field level: change the field's type to an object that includes "type": "string" and "avro.java.string": "String".
See the example below:
{
"type": "record",
"name": "test",
"fields": [
{
"name": "name",
"type": {
"type": "string",
"avro.java.string": "String"
}
}
]
}
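As a rough illustration of why the generated type matters to calling code, the sketch below uses only the JDK. It is not Avro's generated code: the class name and helper methods are hypothetical, and a StringBuilder stands in for whatever non-String CharSequence a generated field might hold at runtime (with Avro this is often an org.apache.avro.util.Utf8, which is an assumption about your setup).

```java
public class CharSequenceVsString {

    // Hypothetical stand-in for reading a field declared as CharSequence.
    // StringBuilder is used only because it is a non-String CharSequence in the JDK.
    static CharSequence readField() {
        return new StringBuilder("2022-06-03");
    }

    // String.equals() is false for any argument that is not itself a String,
    // even when the characters are identical.
    static boolean naiveEquals(CharSequence value, String expected) {
        return expected.equals(value);
    }

    // String.contentEquals() compares the characters, so it works for any CharSequence.
    static boolean safeEquals(CharSequence value, String expected) {
        return expected.contentEquals(value);
    }

    // Converting explicitly is the usual workaround when the field type cannot be changed.
    static String toJavaString(CharSequence value) {
        return value == null ? null : value.toString();
    }

    public static void main(String[] args) {
        CharSequence startTime = readField();
        System.out.println(naiveEquals(startTime, "2022-06-03")); // false
        System.out.println(safeEquals(startTime, "2022-06-03"));  // true
        System.out.println(toJavaString(startTime).equals("2022-06-03")); // true
    }
}
```

If the fields are generated as java.lang.String, through either solution above, the plain equals() comparison behaves as expected and the explicit conversion is unnecessary.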
Shekhar
Updated on June 03, 2022
Shekhar almost 2 years
I wrote an Avro schema in which some of the fields need to be of type String, but Avro has generated those fields with type CharSequence. I am not able to find any way to tell Avro to make those fields of type String.
I tried to use
"fields": [ { "name":"startTime", "type":"string", "avro.java.stringImpl":"String" }, { "name":"endTime", "type":"string", "avro.java.string":"String" } ]
but for both fields Avro generates fields of type CharSequence. Is there any other way to make those fields of type String?