extract string between double quotes

Solution 1

I am not sure how you want to input the string. This has the effect you want to achieve, but it might need to be modified according to how the string is entered:

aa() { echo "$3" ; } ; aa "abcd efgh" "ijkl mnop" "qrst uvwxyz"

Edit: So, if the string is in a variable (the inner double quotes have to be escaped when it is defined):

AA="\"abcd efgh\" \"ijkl mnop\" \"qrst uvwxyz\""
echo "$AA"

then:

FIRST=$(echo "$AA" | awk -F \" '{print $2}')
SECOND=$(echo "$AA" | awk -F \" '{print $4}')
THIRD=$(echo "$AA" | awk -F \" '{print $6}')
echo "$FIRST : $SECOND : $THIRD"
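The even field numbers are easier to see when awk prints every field it finds. The following is only a minimal sketch (the loop and the `fields` variable are illustrative additions, not part of the answer above). With `"` as the separator, the empty text before the first quote becomes field 1, so the quoted strings land in fields 2, 4 and 6, and the spaces between them in fields 3 and 5.

```shell
AA='"abcd efgh" "ijkl mnop" "qrst uvwxyz"'

# Print every field together with its index, so the numbering is visible.
fields=$(printf '%s\n' "$AA" |
    awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')
printf '%s\n' "$fields"
```

Six separators split the line into seven fields; the first and the last are empty because the line starts and ends with a quote.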

as jasonwryan pointed out above. You said you wanted to use sed, but that makes it unnecessarily complex:

FIRST=$(echo "$AA" | sed 's/^"\([^"]*\)".*/\1/')
SECOND=$(echo "$AA" | sed 's/^"[^"]*" "\([^"]*\)".*/\1/')
THIRD=$(echo "$AA" | sed 's/^"[^"]*" "[^"]*" "\([^"]*\)".*/\1/')
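Since the question also asked about grep, here is one possible sketch, assuming a grep that supports the `-o` option (GNU and BSD grep do, but it is not required by POSIX). The pipeline is an illustration, not something taken from the answer: grep lists the quoted strings one per line, and the later stages pick one and strip its quotes.

```shell
AA='"abcd efgh" "ijkl mnop" "qrst uvwxyz"'

# grep -o prints each match (a quoted string, quotes included) on its own line;
# sed -n '2p' keeps only the second of those lines;
# tr -d '"' removes the surrounding double quotes.
SECOND=$(printf '%s\n' "$AA" | grep -o '"[^"]*"' | sed -n '2p' | tr -d '"')
echo "$SECOND"
```

Changing `2p` to `1p` or `3p` selects the first or third string instead.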

Edit2: It is actually possible to do this without sed, awk, perl, etc., using only bash and its "read" builtin, like this (the echos are for debugging):

#!/bin/bash

aa() {
    echo '$1'="$1"
    # Split on double quotes; aaa, bbb, ccc and ddd catch the text outside the quoted strings
    IFS=\" read -r aaa FIRST bbb SECOND ccc THIRD ddd <<< "$1"
    echo "FIRST=$FIRST : SECOND=$SECOND : THIRD=$THIRD"
}

AA="\"abcd efgh\" \"ijkl mnop\" \"qrst uvwxyz\""
echo '$AA'="$AA"
aa "$AA"
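If the position should be a parameter ("give me the second string"), the same read-based idea can be wrapped in a function. This is only a sketch and assumes bash, since it relies on arrays, `local` and the `<<<` here-string; the function name `nth_quoted` and the array name `parts` are made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: print the n-th double-quoted substring of a string.
# Usage: nth_quoted N STRING
nth_quoted() {
    local n=$1 str=$2 parts
    # Split on double quotes into an array; parts[0] is the (empty) text
    # before the first quote, so the n-th quoted string sits at index 2n-1.
    IFS='"' read -r -a parts <<< "$str"
    printf '%s\n' "${parts[2*n-1]}"
}

AA='"abcd efgh" "ijkl mnop" "qrst uvwxyz"'
SECOND=$(nth_quoted 2 "$AA")
echo "SECOND=$SECOND"
```

Asking for a position past the last quoted string prints an empty line, so a caller that cares should check for an empty result.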

Solution 2

You said this in a comment:

above string is in variable. I need to extract each section(first,second,third) and store it different variables

So let's just split it.

IFS=\"                  #set the shell's field separator
set -f                  #don't try to glob
set -- $var             #split on $IFS
var1=$2 var2=$4 var3=$6 #yay
set +f                  #turn globbing back on
unset IFS               #restore something like a sane default

That won't handle strings which might contain backslash-escaped quotes, though. I hope that's not a problem, because I don't like doing those.
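One way to avoid having to restore the shell state afterwards is to do the splitting inside a subshell. The sketch below is an assumption-laden variant, not the answer's own code: `quoted_field` is a hypothetical name, and its body is written with `( )` instead of `{ }` so that it always runs in a subshell. It has the same limitation regarding backslash-escaped quotes.

```shell
# Hypothetical helper: print the n-th double-quoted substring of a string.
# The body is wrapped in ( ) rather than { }, so it runs in a subshell and
# the changes to IFS, globbing and the positional parameters never leak out.
quoted_field() (
    n=$1
    str=$2
    IFS='"'        # split on double quotes
    set -f         # no globbing while the string is unquoted
    set -- $str    # the n-th quoted string becomes positional parameter 2n
    [ "$#" -ge "$((2 * n))" ] || exit 1   # fewer than n quoted strings
    shift "$((2 * n - 1))"                # drop everything before it
    printf '%s\n' "$1"
)

var='"abcd efgh" "ijkl * mnop" "qrst uvwxyz"'
var2=$(quoted_field 2 "$var")
echo "$var2"
```

The `*` in the sample string is there to show that `set -f` keeps it from being expanded to file names. The function returns a non-zero status when the string holds fewer than n quoted parts.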

Solution 3

Building on (and somewhat simplifying) ludvik02's sed answer:

AA='"abcd efgh" "ijkl mnop" "qrst uvwxyz"'
AA1=$(echo "$AA" | sed -r 's/^([^"]*"){1}([^"]*).*/\2/')
AA2=$(echo "$AA" | sed -r 's/^([^"]*"){3}([^"]*).*/\2/')
AA3=$(echo "$AA" | sed -r 's/^([^"]*"){5}([^"]*).*/\2/')
(Note that                        this ↑                        is different on every line.)

The -r option to sed enables extended regular expressions.  We need that to use {n}, which means n occurrences of the preceding regex.  ([^"]*") is a compound (group) regex that matches any number of characters other than ", terminated by a ".  The example input string can match this regex up to six times (because it has six " characters); with ...s inserted to distinguish the gaps between the strings, those occurrences are

"abcd efgh" ... "ijkl mnop" ... "qrst uvwxyz"
↑↑--------↑↑----↑↑--------↑↑----↑↑----------↑
1    2       3       4       5        6

Matching any odd number (e.g., three) of occurrences of that consumes everything up through and including the nth (e.g., third) " character, i.e., the " character at the beginning of the ((n+1)/2)th (e.g., second) quoted string.  Then ([^"]*) matches (and groups) everything up to (but not including) the (n+1)th " character, i.e., the " character at the end of the second quoted string.  So this group (the second group) matches the second quoted string.  Finally, .* consumes the rest of the input string.  So then we replace the entire input line with \2, the value of the second group:

"abcd efgh" ... "ijkl mnop" ... "qrst uvwxyz"
↑↑--------↑↑----↑⇑=======⇑⇈≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡⇈
1    2       3
↑---------------↑⇑=======⇑⇈≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡⇈
        \1          \2          (unnamed)
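Because only the repeat count changes between the three commands, the pattern can be parameterized. The following is a sketch under a few assumptions: the function name `nth_quoted_sed` is invented for illustration, and it uses `-E`, which GNU sed accepts as a synonym for `-r` and which BSD sed also understands. It additionally uses `-n` with the `p` flag so that nothing is printed when there are fewer than n quoted strings.

```shell
# Hypothetical helper: print the n-th double-quoted substring using sed.
# Usage: nth_quoted_sed N STRING
nth_quoted_sed() {
    # The n-th quoted string starts right after the (2n-1)-th quote character.
    k=$((2 * $1 - 1))
    # Inside the double-quoted sed script, \" is a literal " and \\2 is \2.
    printf '%s\n' "$2" | sed -n -E "s/^([^\"]*\"){$k}([^\"]*).*/\\2/p"
}

AA='"abcd efgh" "ijkl mnop" "qrst uvwxyz"'
AA2=$(nth_quoted_sed 2 "$AA")
echo "$AA2"
```

For n = 1, 2, 3 the count k is 1, 3, 5, matching the three commands above.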

Author: siva

Updated on September 18, 2022

Comments

  • siva
    siva over 1 year

    I have a requirement to extract a string delimited by quotes. My string is as follows.

    "abcd efgh" "ijkl mnop" "qrst uvwxyz"
    

    Can you help me get the string between the second pair of double quotes (ijkl mnop) using the sed or grep command?

    In other words, if I ask for the string in the first quotes, I want the first string; if I ask for the second, it should give me the string between the second pair of double quotes; and similarly for the third.

    • Admin
      Admin almost 9 years
      awk -F\" '{print $4}'...
    • Admin
      Admin almost 9 years
      cut -d\" -f4 is too short without this stuff.
    • Admin
      Admin almost 9 years
      Thanks, jason, for replying to my question. Can you please help me do this with the sed or grep command instead of awk?
  • siva
    siva almost 9 years
    above string is in variable. I need to extract each section(first,second,third) and store it different variables.
  • siva
    siva almost 9 years
    This is really helpful. However, my string does not have the format you have pointed out. My shell script has parameters. All my parameters come in as one single ($1) string in the format that I have given. Now I want to extract each string out of that one big string. Can you please help me with the format that I have mentioned?
  • ludvik02
    ludvik02 almost 9 years
    I think that it should work. I have only escaped quotes to get them into one variable. Just replace $AA with $1 inside the function.
  • siva
    siva almost 9 years
    I really appreciate your help. it worked.
  • ludvik02
    ludvik02 almost 9 years
    @siva : For sake of completeness, I have added another edit with solution that uses bash builtin functions only.
  • siva
    siva almost 9 years
    Thank you for the solution. Your first solution worked for me.
  • siva
    siva almost 9 years
    Thanks for providing the solution with sed. I really appreciate your help.
  • 123
    123 almost 9 years
    BOOOO no explanation, here on "super user" we need explanation.
  • G-Man Says 'Reinstate Monica'
    G-Man Says 'Reinstate Monica' almost 9 years
    Touché … (42 minutes later) … Ta-da!
  • Andrew Falanga
    Andrew Falanga about 7 years
    I have a question. Why, when changing the splitting character to ", are the fields suddenly dereferenced by $2, $4, etc. instead of their relative position? For example, echo "\"key\" : \"the value for key\"" | awk -F \" -e '{print $2, $4}' will print "key the value for key". Why isn't it {print $1, $2}? I haven't, as yet, found a suitable explanation in the man page.