regular expression to extract JSON string from text

The following regex should work:

\{\s*"mTitle"\s*:\s*(.+?)\s*,\s*"mPoster"\s*:\s*(.+?)\s*,\s*"mYear"\s*:\s*(.+?)\s*,\s*"mDate"\s*:\s*(.+?)\s*\}

The main difference from your regex is the .+? part, which breaks down as follows:

  • Match any character (.)
  • One or more times (+)
  • As little as possible (?)

The ? after the + is essential here. Without it, the first .+ (in \{\s*"mTitle"\s*:\s*(.+)) would be greedy: it would consume as much text as possible and run on to the last "mPoster" that still lets the rest of the pattern match, rather than stopping at the first "mPoster" that follows, which is what you want.
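The difference between the greedy and the lazy quantifier can be seen in a minimal Python sketch. The two-record string below is a shortened, made-up stand-in for the data in the question, not the original text:

```python
import re

# Hypothetical, shortened input: two records in a row.
text = '{"mTitle":"a","mPoster":"p1"},{"mTitle":"b","mPoster":"p2"}'

# Lazy: the group stops at the first ',"mPoster"' that follows.
lazy = re.search(r'"mTitle":(.+?),"mPoster"', text)

# Greedy: the group runs on to the last ',"mPoster"' in the input.
greedy = re.search(r'"mTitle":(.+),"mPoster"', text)

print(lazy.group(1))    # "a"
print(greedy.group(1))  # "a","mPoster":"p1"},{"mTitle":"b"
```

The greedy group swallows the end of the first record and the start of the second, which is why the lazy form is needed when several records appear in the same text.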

Note that this is just a whitespace-tolerant version of \{"mTitle":(.+?),"mPoster":(.+?),"mYear":(.+?),"mDate":(.+?)\}: each added \s* matches the optional whitespace that JSON allows between tokens.
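As a rough illustration, the whitespace-tolerant pattern could be applied in Python as sketched below. The sample text is a shortened, hypothetical version of the question's data, and it assumes the input contains no real line breaks inside a record (by default . does not match a newline; re.DOTALL would be needed otherwise). Each full match is then handed to json.loads, since parsing was the asker's goal:

```python
import json
import re

# The whitespace-tolerant pattern, split over two lines for readability.
PATTERN = re.compile(
    r'\{\s*"mTitle"\s*:\s*(.+?)\s*,\s*"mPoster"\s*:\s*(.+?)\s*,'
    r'\s*"mYear"\s*:\s*(.+?)\s*,\s*"mDate"\s*:\s*(.+?)\s*\}'
)

# Hypothetical, shortened sample shaped like the text in the question.
text = (
    '{"846":{"mTitle":"\\u0430","mPoster":{"small":"a.jpg","big":"b.jpg"},'
    '"mYear":"2013","mDate":"2014-01-01"},'
    '"848":{"mTitle":"\\u041f","mPoster":{"small":"c.jpg","big":"d.jpg"},'
    '"mYear":"2013","mDate":"2013-12-19"}}'
)

# group(0) is the whole matched object; groups 1-4 are the raw field values.
records = [json.loads(m.group(0)) for m in PATTERN.finditer(text)]

for record in records:
    print(record["mYear"], record["mDate"], record["mPoster"]["small"])
```

This relies on the four keys always appearing in the same order and on no field value containing text that looks like the next key, so it should be treated as a sketch for data of this exact shape rather than a general JSON extractor.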

Author: isaaijan

Updated on August 14, 2022

Comments

  • isaaijan over 1 year

    I'm looking for a regex to extract JSON strings from text. The text below contains JSON objects with the keys mTitle, mPoster, mYear, and mDate:

    {"999999999":"138138138","020202020202":{"846":{"mTitle":"\u0430","mPoster":{"
    small":"\/upload\/ms\/b_248.jpg","middle":"600.jpg","big":"400.jpg"},"mYear"
    :"2013","mDate":"2014-01-01"},"847":{"mTitle":"\u043a","mPoster":"small":"\/upload\/ms\/241.jpg","middle":"600.jpg","big":"
    138.jpg"},"mYear":"2013","mDate":"2013-12-26"},"848":{"mTitle":"\u041f","mPoster":{"small":"\/upload\/movies\/2
    40.jpg","middle":"138.jpg","big":"131.jpg"},"mYear":"2013","mDate":"2013-12-19"}}}
    

    To parse the JSON, I first need to extract each JSON string from the text. Could you help me get only those JSON strings out of the text?

    I've tried this regular expression with no success:

    {"mTitle":(\w|\W)*"mDate":(\w|\W)*}