Using jq to extract values and format in CSV

161,636

Solution 1

jq has a filter, @csv, for converting an array to a CSV string. This filter takes into account most of the complexities associated with the CSV format, beginning with commas embedded in fields. (jq 1.5 has a similar filter, @tsv, for generating tab-separated-value files.)

Of course, if the headers and values are all guaranteed to be free of commas and double quotation marks, then there may be no need to use the @csv filter. Otherwise, it would probably be better to use it.

For example, if the 'Company Name' were 'Smith, Smith and Smith', and if the other values were as shown below, invoking jq with the "-r" option would produce valid CSV:

$ jq -r '.data | map(.displayName), map(.value) | @csv' so.json2csv.json
"First Name","Last Name","Position","Company Name","Country"
"John (""Johnnie"")","Doe","Director, Planning and Posterity","Smith, Smith and Smith","Transylvania"

Solution 2

I prefer to make each record a row in my CSV.

jq '.data | map([.displayName, .rank, .value] | join(", ")) | join("\n")'

Solution 3

Given just this file, you can do something like:

<testfile jq -r '.data | map(.displayName), map(.value) | join(", ")'

The . operator selects a field from an object/hash. Thus, we start with .data, which returns the array with the data in it. We then map over the array twice, first selecting the displayName, then selecting the value, giving us two arrays with just the values of those keys. For each array, we join the elements with ", " forming two lines. The -r argument tells jq to not quote the resulting strings.

If your actual file is longer (ie, has entries for more than one person), you will likely need something a bit more complicated.

Solution 4

I've found jq hard to wrap my head around. Here's some Ruby:

ruby -rjson -rcsv -e '
  data = JSON.parse(File.read "file.json")
  data["data"].collect {|item| [item["displayName"], item["value"]]}
              .transpose
              .each {|row| puts row.to_csv}
'
First Name,Last Name,Position,Company Name,Country
VALUE,VALUE,VALUE,VALUE,VALUE

The ruby JSON parser barfed about the trailing comma before the close bracket.

Solution 5

Since you tagged this python and assuming name of json file is x.json

import os, json
with open('x.json') as f:
    x  = json.load(f)
    print '{}{}{}'.format(', '.join(y['displayName'] for y in x['data']), os.linesep,
             ', '.join(y['value'] for y in x['data']))
First Name, Last Name, Position, Company Name, Country
VALUE, VALUE, VALUE, VALUE, VALUE
Share:
161,636

Related videos on Youtube

Alex Bitek
Author by

Alex Bitek

Life is like riding a bicycle. To keep your balance, you must keep moving!

Updated on September 18, 2022

Comments

  • Alex Bitek
    Alex Bitek over 1 year

    I have the below JSON file:

    {
    "data": [
        {
            "displayName": "First Name",
            "rank": 1,
            "value": "VALUE"
        },
        {
            "displayName": "Last Name",
            "rank": 2,
            "value": "VALUE"
        },
        {
            "displayName": "Position",
            "rank": 3,
            "value": "VALUE"
        },
        {
            "displayName": "Company Name",
            "rank": 4,
            "value": "VALUE"
        },
        {
            "displayName": "Country",
            "rank": 5,
            "value": "VALUE"
        },
    ]
    }
    

    I would like to have a CSV file in this format:

    First Name, Last Name, Position, Company Name, Country
    VALUE, VALUE, VALUE, VALUE, VALUE, VALUE
    

    Is this possible by using only jq? I don't have any programming skills.

    • Admin
      Admin over 9 years
      I provided an answer below, but I'm now looking closer at your question and I can't help wondering - where is the 6th VALUE supposed to come from?
    • Admin
      Admin almost 7 years
    • Admin
      Admin almost 5 years
    • Admin
      Admin over 2 years
      The OP output csv is just one record row (plus a header), but the input data is very strangely transposed into separate records for each of the 5 columns (not row). Normally, you would a json array with 1 object of 5 keys each to create the desired CSV.
  • herve
    herve almost 8 years
    It is not working for me. In a related topic, the answer stackoverflow.com/questions/32960857/… is both working and very well explained!
  • Cos
    Cos over 7 years
    What if .value is a number? I receive the error "string and number cannot be added"
  • matheeeny
    matheeeny over 7 years
    @Cos something like .value|tostring instead of .value in the above example
  • ciscogambo
    ciscogambo over 7 years
    @Cos, I found parenthesis are required. (.value|tostring)
  • flickerfly
    flickerfly about 7 years
    I was able to 'jq somestuff | map(.) | @csv', very handy! Thanks
  • Rag
    Rag over 5 years
    Your example is going to put all of the display names on the first line and all of the values on the second line, instead of having one line per record.
  • Clay
    Clay over 4 years
    Also, use jq -r to strip the quotes
  • karatedog
    karatedog about 2 years
    This will list all .displayName fields comma separated and after that it will list all the .value fields in a comma separated list. The aim is to have a line that has <.displayname>, <.value> for every line
  • Admin
    Admin almost 2 years
    @Rag This is exactly what the user in the question wants to do.
  • Admin
    Admin almost 2 years
    ... but this was not what the user in the question asked for.
  • Admin
    Admin almost 2 years
    @Kusalananda True, but due to the broad question title, many people are visiting this page to find what this answer achieves.