CloudWatch InSights: how to extract/query all JSON array elements at once as a list

8,580

So the solution for my particular case was simple enough since the array in question contained only strings. I just parsed the content of the array inside the [ and ] as a single string. That works for an array of strings or numbers or booleans. It would not be so pretty if I wanted to extract the IDs of an array of objects.

In any case, here is a sample query parsing out the strings in the array:

fields @timestamp, id, method # you don't need to put the 'policyNumbers' up here - it is added automatically
| parse @message '"policyNumbers":[*]' as policyNumbers
#| filter policyNumbers like '234w' # Uncomment to show only entries that mention a specific policy

That will parse the following line:

{"timestamp":"2020-07-21T12:03:46.970Z","id":222,"method": "getRelatedPolicies","dataAccess":{"policyNumbers":["123q", "234w", "345e", "456r"]}}}

With id being 222, method being getRelatedPolicies, and policyNumbers having a value of "123q", "234w", "345e", "456r"

Share:
8,580
Michael Oryl
Author by

Michael Oryl

I'm the Director of Web Development for a life insurance company. In my previous life I was the founder and editor-in-chief of MobileBurn.com. For work I focus mostly on JavaScript projects using Node.js, Angular, and Node. I also dabble in iOS/Swift when needed, though I claim no proficiency there. I used to spend a lot of time with Java/Groovy, but I don't work in that space any longer. For fun, I like working on Arduino and similar micro-controllers for robotics projects as well as generic "maker" stuff. #SOreadytohelp

Updated on September 18, 2022

Comments

  • Michael Oryl
    Michael Oryl almost 2 years

    My company has started using JSON logging in order to better support CloudWatch InSights queries on AWS. The queries are quite easy to work with, except when we are dealing with array data.

    For example, if we have a log entries like the following:

    {
      "id": 123,
      "method": "getRelatedPolicies",
      "policiesRetrieved": [
        "333g",
        "444q"
      ]
    }
    {
      "id": 222,
      "method": "getRelatedPolicies",
      "policiesRetrieved": [
        "123q",
        "234w",
        "345e",
        "456r"
      ]
    }
    {
      "id": 432,
      "method": "getRelatedPolicies",
      "policiesRetrieved": [
        "345e"
      ]
    }
    

    They flatten out in CloudWatch Insights like the following:

      id                    123,
      method                getRelatedPolicies
      policiesRetrieved.0   333g
      policiesRetrieved.1   444q
    
    
      id                    222,
      method                getRelatedPolicies
      policiesRetrieved.0   123q
      policiesRetrieved.1   234w
      policiesRetrieved.2   345e
      policiesRetrieved.3   456r
    
    
      id                    432,
      method                getRelatedPolicies
      policiesRetrieved.0   345e
    

    But what can I do to search for any log entry where the policiesRetrieved array contains the value 345e? There could be any number of entries in the array, so I can't just start adding filter lines like or policiesRetrieved.0 = "345e" or policiesRetrieved.1 = "345e"....

    If I could collapse all of the values into a delimited string, then I could search for a match in the string PLUS I could also easily make use of that list if the user exported the data into CSV or some other non-AWS format for further analysis.

    Can I somehow parse the array values into a string? I've looked through all of the available helper functions available in the queries, and nothing struck me as viable.

    Any solutions would be appreciated.