looping through JSON array in shell script

shell scripting json jq

Solution 1

For the use case in the question, @JigglyNaga's answer is probably the better fit, but for more complicated tasks you can also loop over the list items by index, using keys:

from file:

for k in $(jq '.children.values | keys | .[]' file); do
    ...
done

or from string:

for k in $(jq '.children.values | keys | .[]' <<< "$MYJSONSTRING"); do
    ...
done

So e.g. you might use:

for k in $(jq '.children.values | keys | .[]' file); do
    value=$(jq -r ".children.values[$k]" file);
    name=$(jq -r '.path.name' <<< "$value");
    type=$(jq -r '.type' <<< "$value");
    size=$(jq -r '.size' <<< "$value");
    printf '%s\t%s\t%s\n' "$name" "$type" "$size";
done | column -t -s$'\t'

If the values contain no newlines, a single jq call per iteration is enough, which makes the loop much faster:

for k in $(jq '.children.values | keys | .[]' file); do
    IFS=$'\n' read -r -d '' name type size \
        <<< "$(jq -r ".children.values[$k] | .path.name,.type,.size" file)"
    printf '%s\t%s\t%s\n' "$name" "$type" "$size";
done | column -t -s$'\t'
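
Going one step further, jq can be called once for the whole file, with the shell looping over its output. The following is only a sketch, not part of the original answer: it assumes a jq with @tsv support (1.5 or later), and list_children and the sample data are made-up names for illustration. A missing size is replaced by "-" so that no field is empty, because read would otherwise merge adjacent tabs. @tsv escapes tabs and newlines inside values as \t and \n, so each record stays on one line.

```shell
#!/bin/sh
# A literal tab, built portably (works in plain sh as well as bash).
TAB=$(printf '\t')

# list_children is a hypothetical helper name, not part of jq.
# It reads the JSON document on stdin and runs jq exactly once.
list_children() {
    jq -r '.children.values[]
           | [.path.name, .type, (.size // "-")]
           | @tsv' |
    while IFS="$TAB" read -r name type size; do
        # name, type and size are ordinary shell variables here,
        # so any further shell processing could go in this loop body.
        printf '%s\t%s\t%s\n' "$name" "$type" "$size"
    done
}

# Made-up sample data, shaped like the document in the question.
sample='{"children":{"values":[
  {"path":{"name":"README.md"},"type":"FILE","size":237},
  {"path":{"name":"src"},"type":"DIRECTORY"}]}}'

printf '%s\n' "$sample" | list_children
```

With the real data, the call would be list_children < file, and the result can still be piped to column -t as above.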

Solution 2

Extracting the members

jq -c '.children.values[]|[.path.components[0],.type,.size]'

.children.values[] outputs every member of the array .values.
| pipes the previous result through the next filter, rather like a shell pipe
[...,...,...] makes all the terms inside appear in a single array
The -c option produces "compact" format, i.e. one JSON value (here, one array) per line

Result:

[".gitignore","FILE",224]
["Jenkinsfile","FILE",1396]
["README.md","FILE",237]
...

Formatting the result

If you want to output a neatly-aligned table, that's a task better handled by other tools, such as column or paste.

jq -c '.children.values[]|[.path.components[0],.type,.size]' | column -t -s'[],"'

-t tells column to guess the number of columns based on the input
-s... specifies the delimiter character(s)

Result:

.gitignore   FILE       224
Jenkinsfile  FILE       1396
README.md    FILE       237

This relies on the characters [, ], , and " not appearing in your filenames, which is not a safe assumption.

paste can also arrange multiple inputs side-by-side. For this, we can remove the JSON structures altogether, and output raw lines (hat-tip to @muru):

jq -r '.children.values[]|.path.components[0],.type,.size' | paste - - -

paste - - - means 3 columns, all read from the same source. This time, the only assumption is that the filenames don't contain newlines.
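
To see how paste - - - groups its input, here is a tiny sketch that needs no jq at all. The six input lines are made up; they stand in for jq's raw output (name, type, size, repeated). Note that jq -r prints a missing .size as null, which keeps every record at exactly three lines.

```shell
#!/bin/sh
# Each "-" makes paste read one more line from stdin for the current row,
# so three dashes turn every three input lines into one tab-separated row.
printf '%s\n' .gitignore FILE 224 src DIRECTORY null | paste - - -
```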

Solution 3

jq can render its output into a variety of formats: see https://stedolan.github.io/jq/manual/#Formatstringsandescaping

For tab-separated output:

$ jq -r '.children.values[] | [.path.name, .type, .size] | @tsv' file.json
.gitignore  FILE    224
Jenkinsfile FILE    1396
README.md   FILE    237
pom.xml FILE    2548
src DIRECTORY
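
Since jq can do arithmetic, a unit conversion can be done inside the same filter instead of in the shell. The sketch below is an illustration only: sizes_in_kib and the sample data are hypothetical, and it assumes a jq that provides round (1.5 or later). Sizes are converted to KiB and rounded to two decimals; entries without a size get "-".

```shell
#!/bin/sh
# sizes_in_kib is a hypothetical helper name. It reads JSON on stdin and
# prints name, type and size in KiB, tab-separated.
sizes_in_kib() {
    jq -r '.children.values[]
           | [.path.name, .type,
              (if .size == null then "-"
               else (.size / 1024 * 100 | round / 100) end)]
           | @tsv'
}

# Made-up sample data, shaped like the document in the question.
printf '%s\n' '{"children":{"values":[
  {"path":{"name":"a.bin"},"type":"FILE","size":2048},
  {"path":{"name":"src"},"type":"DIRECTORY"}]}}' | sizes_in_kib
```

For megabytes, the divisor would be (1024 * 1024) instead of 1024.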

Solution 4

A one-liner based on jtc and xargs:

bash $ jtc -x'<values>l[+0]<size>l[-1]' -y'<name>l' -y'<type>l' -y'<size>l' your.json | xargs -n3
.gitignore FILE 224
Jenkinsfile FILE 1396
README.md FILE 237
pom.xml FILE 2548
bash $

Note: the JSON in your file is irregular (the size key is not present in every record). The first argument, -x, is built to exclude such records: it processes only those records where size is present.
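
For comparison only, the same "skip records without size" idea can be sketched in jq. This is not the jtc command above rewritten by its author; files_with_size and the sample data are hypothetical names used for illustration.

```shell
#!/bin/sh
# files_with_size is a hypothetical helper name. It reads JSON on stdin and
# prints only those entries that actually carry a "size" key.
files_with_size() {
    jq -r '.children.values[]
           | select(has("size"))
           | [.path.name, .type, .size]
           | @tsv'
}

# Made-up sample data: the directory entry has no size and is skipped.
sample_sizes='{"children":{"values":[
  {"path":{"name":"pom.xml"},"type":"FILE","size":2548},
  {"path":{"name":"src"},"type":"DIRECTORY"}]}}'

printf '%s\n' "$sample_sizes" | files_with_size
```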

Solution 5

Solution with ramda-cli:

% curl ... | ramda -o tsv '.children.values' 'map flat' 'map props ["path.name", "type", "size"]'
.gitignore      FILE    224
Jenkinsfile     FILE    1396
README.md       FILE    237
pom.xml FILE    2548
src     DIRECTORY

First we traverse into the list of values, then map over the list with flat to convert each entry, which is a deep object structure, into a shallow one with keys separated by dots.

Then, we can map over the list again, and pick the wanted properties based on their paths, represented with strings.

Finally, -o tsv takes care of converting the resulting list of lists into tsv format.

To debug or further understand what is happening, you may check what each argument does by removing them one by one from the end of the command and observing the difference in output at each step. They are simply operations (or functions) applied on the data one at a time from left to right.


Author by

Sugatur Deekshith S N

Updated on September 18, 2022

Comments

Sugatur Deekshith S N over 1 year

Below is the output of a curl command (file information about a branch). I need a script or command to print the file name, file type and size.

I have tried with jq, but was only able to fetch a single value (jq '.values[].size').

{
  "path": {
    "components": [],
    "name": "",
    "toString": ""
  },
  "revision": "master",
  "children": {
    "size": 5,
    "limit": 500,
    "isLastPage": true,
    "values": [
      {
        "path": {
          "components": [
            ".gitignore"
          ],
          "parent": "",
          "name": ".gitignore",
          "extension": "gitignore",
          "toString": ".gitignore"
        },
        "contentId": "c9e472ef4e603480cdd85012b01bd5f4eddc86c6",
        "type": "FILE",
        "size": 224
      },
      {
        "path": {
          "components": [
            "Jenkinsfile"
          ],
          "parent": "",
          "name": "Jenkinsfile",
          "toString": "Jenkinsfile"
        },
        "contentId": "e878a88eed6b19b2eb0852c39bfd290151b865a4",
        "type": "FILE",
        "size": 1396
      },
      {
        "path": {
          "components": [
            "README.md"
          ],
          "parent": "",
          "name": "README.md",
          "extension": "md",
          "toString": "README.md"
        },
        "contentId": "05782ad495bfe11e00a77c30ea3ce17c7fa39606",
        "type": "FILE",
        "size": 237
      },
      {
        "path": {
          "components": [
            "pom.xml"
          ],
          "parent": "",
          "name": "pom.xml",
          "extension": "xml",
          "toString": "pom.xml"
        },
        "contentId": "9cd4887f8fc8c2ecc69ca08508b0f5d7b019dafd",
        "type": "FILE",
        "size": 2548
      },
      {
        "path": {
          "components": [
            "src"
          ],
          "parent": "",
          "name": "src",
          "toString": "src"
        },
        "node": "395c71003030308d1e4148b7786e9f331c269bdf",
        "type": "DIRECTORY"
      }
    ],
    "start": 0
  }
}

The expected output should be something like this:

.gitignore    FILE     224

Jenkinsfile   FILE     1396

pLumo over 5 years

Add " to -s and you have the exact output that the OP wants.
muru over 5 years

Or if entries won't have newlines in them: jq -r '.children.values[] | .path.components[0],.type,.size' | paste - - -
Sugatur Deekshith S N over 5 years

@JigglyNaga Thanks for the detailed answer, but I have one more question: if I want to store the size value in a variable, how would I do that? I want to convert those size values to MB.
pLumo over 5 years

then you can use my answer ;-)
VocalFan over 5 years

@RoVo Good point, it's only one more unlikely character.
muru over 5 years

@SugaturDeekshithSN jq can do division (.size/1024, or .size/(1024*1024) ...)
VocalFan over 5 years

@muru Thanks, "no newlines" is more likely to be safe; although paste's single tab doesn't line up some of the later, longer entries.
Sugatur Deekshith S N over 5 years

Is file a keyword, or should the JSON array be present in a file?
Sugatur Deekshith S N over 5 years

@muru In this case that's fine, but assume I need to do somewhat more complex calculations; in that case, I think we should store the value in a variable.
VocalFan over 5 years

@SugaturDeekshithSN Please could you update the question to show what you want to see? The current "expected output" can be achieved with the commands I've written so far.
pLumo over 5 years

Updated the answer to clarify this.