Convert RDD to JSON Object

First, I reproduced the scenario you described with the following code:

val sampleArray = Array(
  ("FRUIT", List("Apple", "Banana", "Mango")),
  ("VEGETABLE", List("Potato", "Tomato")))

val sampleRdd = sc.parallelize(sampleArray) // RDD[(String, List[String])]
sampleRdd.foreach(println) // Print each (category, items) pair (visible on the console in local mode)

Next, I used the json4s Scala library to convert this RDD into the JSON structure you requested:

import org.json4s.native.JsonMethods._
import org.json4s.JsonDSL.WithDouble._

// collect() brings the whole RDD to the driver, so this assumes the data is small
val json = "categories" -> sampleRdd.collect().toList.map {
  case (name, nodes) =>
    ("name", name) ~
    ("nodes", nodes.map { node =>
      ("name", node)
    })
}

println(compact(render(json))) // Print the rendered JSON on a single line

The result is:

{"categories":[{"name":"FRUIT","nodes":[{"name":"Apple"},{"name":"Banana"},{"name":"Mango"}]},{"name":"VEGETABLE","nodes":[{"name":"Potato"},{"name":"Tomato"}]}]}
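
The result above does not yet carry the constant "isInTopList": false field that the question asks for on every node. A minimal sketch of one way to add it with the same json4s DSL follows. The helper name toCategoriesJson and the value collected are hypothetical, the plain Seq stands in for sampleRdd.collect(), and the dependency line (including its version) is an assumption that only matters if you run this with scala-cli instead of spark-shell:

```scala
//> using dep "org.json4s::json4s-native:4.0.6" // scala-cli only; version is an assumption

import org.json4s.JsonDSL._
import org.json4s.native.JsonMethods._

// Hypothetical helper: turns collected (category, items) pairs into compact
// JSON, attaching the constant isInTopList flag to every node.
def toCategoriesJson(data: Seq[(String, List[String])]): String = {
  val json = "categories" -> data.toList.map { case (category, items) =>
    ("name" -> category) ~
    ("nodes" -> items.map { item =>
      ("name" -> item) ~ ("isInTopList" -> false)
    })
  }
  compact(render(json))
}

// Stand-in for sampleRdd.collect(); assumes the data fits on the driver
val collected = Seq(
  ("FRUIT", List("Apple", "Banana", "Mango")),
  ("VEGETABLE", List("Potato", "Tomato")))

val rendered = toCategoriesJson(collected)
```

Printing rendered should show each node as {"name":...,"isInTopList":false}. If you prefer indented output like the layout in the question, pretty(render(json)) from the same JsonMethods import can be used in place of compact(render(json)).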

Author: Vamsi

Updated on June 28, 2022

Comments

  • Vamsi, almost 2 years ago

    I have an RDD of type RDD[(String, List[String])].

    Example:

    (FRUIT, List(Apple,Banana,Mango))
    (VEGETABLE, List(Potato,Tomato))
    

    I want to convert the above output to a JSON object like the one below:

    {
      "categories": [
        {
          "name": "FRUIT",
          "nodes": [
            {
              "name": "Apple",
              "isInTopList": false
            },
            {
              "name": "Banana",
              "isInTopList": false
            },
            {
              "name": "Mango",
              "isInTopList": false
            }
          ]
        },
        {
          "name": "VEGETABLE",
          "nodes": [
            {
              "name": "POTATO",
              "isInTopList": false
            },
            {
              "name": "TOMATO",
              "isInTopList": false
            }
          ]
        }
      ]
    }
    

    Please suggest the best possible way to do it.

    NOTE: "isInTopList": false is constant and must be present on every item in the JSON object.