How to use Apache Avro to serialize the JSON document and then write it into Cassandra?
Since you already use Jackson, you could try the Jackson dataformat module that adds support for Avro-encoded data.
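As a rough illustration of that suggestion, here is a minimal sketch that assumes Jackson 2.x with the `jackson-dataformat-avro` module on the classpath. The `Score` class and the `JacksonAvroSketch` name are hypothetical stand-ins for the classes already mapped with Jackson annotations; the schema is generated from the POJO rather than written by hand.

```java
import java.io.IOException;

import com.fasterxml.jackson.dataformat.avro.AvroMapper;
import com.fasterxml.jackson.dataformat.avro.AvroSchema;

final class JacksonAvroSketch {

    // Hypothetical POJO standing in for a class already mapped by Jackson.
    public static class Score {
        public String priceScore;
        public String confidenceScore;

        public Score() { }  // Jackson needs a no-arg constructor to deserialize

        public Score(String priceScore, String confidenceScore) {
            this.priceScore = priceScore;
            this.confidenceScore = confidenceScore;
        }
    }

    static final AvroMapper MAPPER = new AvroMapper();

    // Serialize the POJO to Avro binary using a schema derived from the class.
    static byte[] toAvro(Score score) throws IOException {
        AvroSchema schema = MAPPER.schemaFor(Score.class);
        return MAPPER.writer(schema).writeValueAsBytes(score);
    }

    // Read the Avro binary back into the POJO with the same schema.
    static Score fromAvro(byte[] bytes) throws IOException {
        AvroSchema schema = MAPPER.schemaFor(Score.class);
        return MAPPER.readerFor(Score.class).with(schema).readValue(bytes);
    }

    public static void main(String[] args) throws IOException {
        byte[] avro = toAvro(new Score("0.4", "0.2"));
        Score back = fromAvro(avro);
        System.out.println(avro.length + " bytes, priceScore=" + back.priceScore);
    }
}
```

Avro binary data carries no field names, so the reader needs the same (or a compatible) schema that the writer used. A loosely typed `Map<String, Object>` property bag, like the one in the question, may need an explicit schema instead of a generated one.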
Updated on June 09, 2022
I have been reading a lot about Apache Avro these days, and I am inclined to use it instead of JSON. Currently, we serialize the JSON document using Jackson and then write that serialized JSON document into Cassandra for each row key/user id. We then have a REST service that reads the whole JSON document using the row key, deserializes it, and uses it further.
We write into Cassandra like this:
user-id column-name serialize-json-document-value
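If the value column holds binary data instead of a JSON string, the serialized bytes usually travel as a `java.nio.ByteBuffer`. That is an assumption about the client in use, since the question does not name one. The sketch below uses only the JDK, and `BlobCodec` is a hypothetical helper name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of any Cassandra driver.
final class BlobCodec {

    // Wrap serialized bytes so they can be bound as a binary column value.
    static ByteBuffer toBlob(byte[] serialized) {
        return ByteBuffer.wrap(serialized);
    }

    // Copy out only the readable bytes, leaving the caller's buffer untouched.
    // Calling blob.array() directly can return extra bytes when the buffer
    // is a window into a larger backing array.
    static byte[] fromBlob(ByteBuffer blob) {
        ByteBuffer view = blob.duplicate();
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] payload = "{\"lmd\":1379214255197}".getBytes(StandardCharsets.UTF_8);
        ByteBuffer blob = toBlob(payload);
        byte[] back = fromBlob(blob);
        System.out.println(Arrays.equals(payload, back));
    }
}
```

The same two helpers work whether the payload is Jackson's JSON bytes or Avro binary, so the storage layout (user id, column name, value) does not have to change.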
Below is an example of the JSON document that we are writing into Cassandra. This JSON document is for a particular row key/user id.
    {
      "lv" : [ {
        "v" : {
          "site-id" : 0,
          "categories" : {
            "321" : {
              "price_score" : "0.2",
              "confidence_score" : "0.5"
            },
            "123" : {
              "price_score" : "0.4",
              "confidence_score" : "0.2"
            }
          },
          "price-score" : 0.5,
          "confidence-score" : 0.2
        }
      } ],
      "lmd" : 1379214255197
    }
Now we are thinking of using Apache Avro so that we can compact this JSON document by serializing it with Avro and then storing it in Cassandra. I have a couple of questions about this:
- Is it possible to serialize the above JSON document using Apache Avro and then write it into Cassandra? If yes, how can I do that? Can anyone provide a simple example?
- We also need to deserialize it when reading it back from Cassandra in our REST service. Is that possible as well?
Below is my simple code, which serializes the JSON document and prints it to the console.
    public static void main(String[] args) {
        final long lmd = System.currentTimeMillis();

        Map<String, Object> props = new HashMap<String, Object>();
        props.put("site-id", 0);
        props.put("price-score", 0.5);
        props.put("confidence-score", 0.2);

        Map<String, Category> categories = new HashMap<String, Category>();
        categories.put("123", new Category("0.4", "0.2"));
        categories.put("321", new Category("0.2", "0.5"));
        props.put("categories", categories);

        AttributeValue av = new AttributeValue();
        av.setProperties(props);

        Attribute attr = new Attribute();
        attr.instantiateNewListValue();
        attr.getListValue().add(av);
        attr.setLastModifiedDate(lmd);

        // serialize it
        try {
            String jsonStr = JsonMapperFactory.get().writeValueAsString(attr);
            // then write into Cassandra
            System.out.println(jsonStr);
        } catch (JsonGenerationException e) {
            e.printStackTrace();
        } catch (JsonMappingException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
The serialized JSON document will look something like this:
{"lv":[{"v":{"site-id":0,"categories":{"321":{"price_score":"0.2","confidence_score":"0.5"},"123":{"price_score":"0.4","confidence_score":"0.2"}},"price-score":0.5,"confidence-score":0.2}}],"lmd":1379214255197}
The AttributeValue and Attribute classes use Jackson annotations.
One important note: the properties inside the above JSON document change depending on the column name. We have different properties for different column names; some column names will have two properties, some will have five. So each JSON document carries the properties and values that our metadata defines for its column.
I hope the question is clear enough. Can anyone provide a simple example of how to achieve this using Apache Avro? I am just starting with Apache Avro, so I am running into a lot of problems.
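One possible shape for such an example is a round trip through Avro's generic API, sketched here assuming the Apache Avro Java library is on the classpath. The schema and the `AvroRoundTrip` class name are hypothetical and not taken from the question. Avro field names may contain only letters, digits, and underscores, so keys such as "site-id" cannot be record fields. The sketch therefore models `v` as an Avro map whose values are a union, which also lets the set of properties vary per column name.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

final class AvroRoundTrip {

    // Hypothetical schema: "v" is a map so that property names can vary.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Attribute\",\"fields\":["
      + "{\"name\":\"lv\",\"type\":{\"type\":\"array\",\"items\":"
      + "{\"type\":\"record\",\"name\":\"AttributeValue\",\"fields\":["
      + "{\"name\":\"v\",\"type\":{\"type\":\"map\",\"values\":"
      + "[\"int\",\"double\",\"string\","
      + "{\"type\":\"map\",\"values\":{\"type\":\"map\",\"values\":\"string\"}}]"
      + "}}"
      + "]}"
      + "}},"
      + "{\"name\":\"lmd\",\"type\":\"long\"}"
      + "]}");

    // Build a record equivalent to the example JSON document.
    static GenericRecord example(long lmd) {
        Map<String, String> scores123 = new HashMap<>();
        scores123.put("price_score", "0.4");
        scores123.put("confidence_score", "0.2");
        Map<String, String> scores321 = new HashMap<>();
        scores321.put("price_score", "0.2");
        scores321.put("confidence_score", "0.5");

        Map<String, Map<String, String>> categories = new HashMap<>();
        categories.put("123", scores123);
        categories.put("321", scores321);

        Map<String, Object> props = new HashMap<>();
        props.put("site-id", 0);
        props.put("price-score", 0.5);
        props.put("confidence-score", 0.2);
        props.put("categories", categories);

        Schema valueSchema = SCHEMA.getField("lv").schema().getElementType();
        GenericRecord value = new GenericData.Record(valueSchema);
        value.put("v", props);

        GenericRecord attr = new GenericData.Record(SCHEMA);
        attr.put("lv", Collections.singletonList(value));
        attr.put("lmd", lmd);
        return attr;
    }

    // Encode to Avro binary; these are the bytes that would go into Cassandra.
    static byte[] serialize(GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(SCHEMA).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Decode bytes read back from Cassandra, using the same schema.
    static GenericRecord deserialize(byte[] bytes) throws IOException {
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        return new GenericDatumReader<GenericRecord>(SCHEMA).read(null, decoder);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = serialize(example(System.currentTimeMillis()));
        System.out.println(bytes.length + " bytes");
        System.out.println(deserialize(bytes));
    }
}
```

After decoding, map keys and string values come back as Avro `Utf8` objects rather than `java.lang.String`, so call `toString()` on them before comparing. The binary form does not embed the schema, so the REST service needs access to the same schema the writer used.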