Regex for parsing single key: values out of JSON in Javascript
I would strongly discourage you from doing this. JSON is not a regular language as clearly stated here: https://cstheory.stackexchange.com/questions/3987/is-json-a-regular-language
To quote from the above post:
For example, consider an array of arrays of arrays:
[ [ [ 1, 2], [2, 3] ] , [ [ 3, 4], [ 4, 5] ] ]
Clearly you couldn't parse that with true regular expressions.
I'd recommend converting your JSON to an object (JSON.parse) & implementing a find function to traverse the structure.
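As a rough sketch of that recommendation (the helper name findKey and the sample document are assumptions made up for illustration, not part of any library), a recursive search over the parsed object could look something like this:

```javascript
// Hypothetical helper: depth-first search for the first property named `key`.
// Returns undefined when no such property exists anywhere in the structure.
function findKey(value, key) {
  // Primitives and null have no properties to search.
  if (value === null || typeof value !== "object") {
    return undefined;
  }
  // Check the current object first (array indexes are not treated as keys).
  if (!Array.isArray(value) && Object.prototype.hasOwnProperty.call(value, key)) {
    return value[key];
  }
  // Otherwise recurse into every child value (array elements or property values).
  for (const child of Object.values(value)) {
    const found = findKey(child, key);
    if (found !== undefined) {
      return found;
    }
  }
  return undefined;
}

// Sample document, shortened from the one in the question.
const doc = JSON.parse(
  '{ "Name": "Humpty", "Posts": [ { "Title": "How I fell", ' +
  '"Comments": [ { "User": "Fairy God Mother" } ] } ] }'
);

console.log(findKey(doc, "Name"));
console.log(findKey(doc, "Comments"));
```

This returns only the first match in depth-first order; a real search tool would probably want to collect every match or accept a path instead of a bare key.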
Other than that, you can take a look at the guts of Douglas Crockford's json2.js parse method. Perhaps an altered version would allow you to search through the JSON string & return just the particular object you were looking for without converting the entire structure to an object. This is only useful if you never retrieve any other data from your JSON. If you do, you might as well have converted the whole thing to begin with.
EDIT
Just to further show how regex breaks down, here's a regex that attempts to parse JSON.
If you plug it into http://regexpal.com/ with "Dot Matches All" checked, you'll find that it can match some elements nicely, like:
Regex
"Comments"[ :]+((?=\[)\[[^]]*\]|(?=\{)\{[^\}]*\}|\"[^"]*\")
JSON Matched
"Comments": [ { "User":"Fairy God Mother", "Comment": "Ha, can't say I didn't see it coming" } ]
Regex
"Name"[ :]+((?=\[)\[[^]]*\]|(?=\{)\{[^\}]*\}|\"[^"]*\")
JSON Matched
"Name": "Humpty"
However, as soon as you start querying for the higher structures like "Posts", which has nested arrays, you'll find that you cannot correctly return the structure, since the regex has no way of knowing which "]" is the one that closes the structure.
Regex
"Posts"[ :]+((?=\[)\[[^]]*\]|(?=\{)\{[^\}]*\}|\"[^"]*\")
JSON Matched
"Posts": [ { "Title": "How I fell", "Comments": [ { "User":"Fairy God Mother", "Comment": "Ha, can't say I didn't see it coming" } ]
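The same failure can be reproduced in JavaScript. This is only a sketch: the pattern below is a simplified, JavaScript-friendly variant of the array branch of the regex above (it uses [^\]] for "anything but a closing bracket"), and the sample string is shortened from the question's JSON.

```javascript
const json =
  '{ "Name": "Humpty", "Posts": [ { "Title": "How I fell", ' +
  '"Comments": [ { "User": "Fairy God Mother" } ] } ] }';

// Simplified array-only variant of the pattern above (an assumption, not the
// exact regex from the answer): match from "[" up to the FIRST "]".
const postsPattern = /"Posts"[ :]+(\[[^\]]*\])/;
const captured = json.match(postsPattern)[1];

// The capture stops at the "]" that closes "Comments", not the one that
// closes "Posts", so the fragment is not valid JSON on its own.
let parseError = null;
try {
  JSON.parse(captured);
} catch (e) {
  parseError = e;
}

console.log(captured);
console.log(parseError instanceof SyntaxError);
```

The captured text ends where the inner "Comments" array closes, which is why parsing the fragment fails.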
AshHeskes
Updated on July 09, 2022
Comments
-
AshHeskes almost 2 years
I'm trying to see if it's possible to look up individual keys in a JSON string in Javascript and return each key's value with regex. Sort of like building a JSON search tool.
Imagine the following JSON
"{ "Name": "Humpty", "Age": "18", "Siblings" : ["Dracula", "Snow White", "Merlin"], "Posts": [ { "Title": "How I fell", "Comments": [ { "User":"Fairy God Mother", "Comment": "Ha, can't say I didn't see it coming" } ] } ] }"
I want to be able to search through the JSON string and only pull out individual properties.
Let's assume it's a function already. It would look something like:
function getPropFromJSON(prop, JSONString){
    // Obviously this regex will only match Keys that have
    // String Values.
    var exp = new RegExp("\""+prop+"\"\:[^\,\}]*");
    return JSONString.match(exp)[0].replace("\""+prop+"\":","");
}
It would return the substring of the value for the key, e.g.
getPropFromJSON("Comments", JSONString) > "[ { "User":"Fairy God Mother", "Comment": "Ha, can't say I didn't see it coming" } ]"
If you're wondering why I want to do this instead of using JSON.parse(), I'm building a JSON document store around localStorage. localStorage only supports key/value pairs, so I'm storing a JSON string of the entire document under a unique key. I want to be able to run a query on the documents, ideally without the overhead of running JSON.parse() on the entire collection of documents and then recursing over the keys and nested keys to find a match.
I'm not the best at regex, so I don't know how to do this, or if it's even possible with regex alone. This is only an experiment to find out if it's possible. Any other ideas for a solution would be appreciated.
-
AshHeskes over 12 years
I had a look at the json2.js parse method earlier. It doesn't really do any kind of parsing. It just does a lot of replacing of bad/dangerous/escaped characters/content/scripts so the JSON is clean, then passes the clean string to eval(). I think you're right about using regex alone. I'm going to try a combination of JS and regex. For my use case, though, I disagree with converting the whole thing and traversing it. It would be far too intensive on large collections || documents, not to mention searching and matching on multiple properties.
-
Brandon Boone over 12 years
Fair enough. Only other thing I could recommend (and I'm not an expert in this field) is to use a format that is relational data friendly. I'm assuming Ms-Sql, MySql, & Oracle have optimal ways of storing the data so reading, writing, comparing, & joining data is super fast (and I doubt it's stored as JSON). Just a thought.
-
JAAulde over 12 years
You should follow the advice in this answer and avoid doing this via any method other than properly deserializing the JSON and searching through the resulting structure.
-
Paul almost 11 years
If you put a fixed, finite limit on the nesting depth of your JSON, it becomes a regular language. However, the regex would be very ugly unless your limit is only 1 or 2.
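To illustrate the bounded-depth point, here is a minimal sketch that builds a regex for bracketed arrays nested up to a fixed depth. The function name nestedArrayPattern is made up for this example, and the pattern deliberately ignores brackets that appear inside string values, so it is nowhere near a full JSON matcher.

```javascript
// Hypothetical helper: returns regex source matching "[...]" with at most
// `maxDepth` levels of nested square brackets. Assumes no "[" or "]"
// characters occur inside string values.
function nestedArrayPattern(maxDepth) {
  // Depth 1: any run of non-bracket characters between the brackets.
  let inner = "[^\\[\\]]*";
  // Each extra level allows either a non-bracket character or a complete
  // bracketed group of the previous depth.
  for (let i = 1; i < maxDepth; i++) {
    inner = "(?:[^\\[\\]]|\\[" + inner + "\\])*";
  }
  return "\\[" + inner + "\\]";
}

// The three-level array quoted in the answer.
const sample = "[ [ [ 1, 2], [2, 3] ] , [ [ 3, 4], [ 4, 5] ] ]";

const depth3 = new RegExp("^" + nestedArrayPattern(3) + "$");
const depth2 = new RegExp("^" + nestedArrayPattern(2) + "$");

console.log(depth3.test(sample)); // true: three levels are allowed
console.log(depth2.test(sample)); // false: the sample nests one level too deep
console.log(nestedArrayPattern(3)); // shows how quickly the pattern grows
```

Each extra level wraps the whole previous pattern again, which is the ugliness Paul describes.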