Extracting substring from environment variable

Solution 1

You can use parameter expansion, which is available in any POSIX-compliant shell.

$ export FOO=http://unix.stackexchange.com/questions/ask
$ tmp="${FOO#*//}" # remove http://
$ echo "${tmp%%/*}" # remove everything from the first / onward
unix.stackexchange.com
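
The two expansions above can be wrapped in a small function. This is only a sketch: the name url_host is made up for illustration, and the extra steps (query string, user info, port) are assumptions about what a URL might contain, not part of the original answer. It does not handle IPv6 literals such as [::1].

```shell
# url_host is a hypothetical helper name, not part of the original answer.
# POSIX sh has no "local", so the variable "host" is global here.
url_host() {
    host=${1#*://}      # drop the scheme, e.g. "http://"
    host=${host%%/*}    # drop the path: everything from the first "/"
    host=${host%%\?*}   # drop a query string that directly follows the host
    host=${host##*@}    # drop "user:password@" if present
    host=${host%%:*}    # drop ":port" if present (this breaks IPv6 literals)
    printf '%s\n' "$host"
}

url_host 'http://unix.stackexchange.com/questions/ask'   # unix.stackexchange.com
url_host 'https://user@example.org:8080/path'            # example.org
```

Because only parameter expansion is used, no external process is started, which matters if this runs in a loop.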

A more reliable, but uglier, method is to use an actual URL parser. Here is an example in Python 3:

$ echo "$FOO" | python3 -c 'import sys; from urllib.parse import urlparse; print(urlparse(sys.stdin.read().strip()).netloc)'
unix.stackexchange.com

Solution 2

If all the URLs follow this pattern, here is a short and ugly hack:

echo "$FOO" | cut -d / -f 3
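
To see why field 3 is the host, and where the hack stops working, here is a small sketch. The function name host_by_cut is made up for illustration. Splitting on / turns http://host/path into the fields "http:", an empty field between the two slashes, then the host.

```shell
# host_by_cut is a hypothetical name; it just wraps the cut call above.
host_by_cut() {
    printf '%s\n' "$1" | cut -d / -f 3
}

# Fields: 1="http:"  2=""  3="unix.stackexchange.com"  4="questions"  5="ask"
host_by_cut 'http://unix.stackexchange.com/questions/ask'   # unix.stackexchange.com

# Without a scheme the host is field 1, so field 3 is a path component instead.
host_by_cut 'unix.stackexchange.com/questions/ask'          # ask
```

So the hack is fine as long as every URL really starts with scheme://.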

Solution 3

There are many ways to do this. Here are a few:

export _URL='http://unix.stackexchange.com/questions/ask'

echo "$_URL" | sed -ne 'y|/|\n|;s/.*\n\n//;P'

expr "$_URL" : 'http://\([^/]*\)'

echo "$_URL" | perl -lpe '($_) = m|^http://\K[^/]+|g'

perl -le 'print+(split m{/}, $ENV{_URL})[2]'

(set -f; IFS=/; set -- $_URL; echo "$3";)
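
The last one-liner is the least obvious, so here it is again as a commented function. This is a sketch for sh/bash-style shells; the name url_host_ifs is made up for illustration. zsh does not split unquoted parameters by default, so there it would need sh emulation or an explicit ${=1}.

```shell
# url_host_ifs is a hypothetical name. The body is a subshell ( ... ),
# so the changes to IFS, globbing and the positional parameters do not leak out.
url_host_ifs() (
    set -f       # disable globbing, so "*" or "?" in the URL is not expanded
    IFS=/        # split on "/" only
    set -- $1    # unquoted on purpose: fields become $1="http:", $2="", $3=host, ...
    printf '%s\n' "$3"
)

url_host_ifs 'http://unix.stackexchange.com/questions/ask'   # unix.stackexchange.com
```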

Solution 4

This can also be done with regex capture groups:

$ a="http://unix.stackexchange.com/questions/ask"
$ perl -pe 's|(.*//)(.*?)(/.*)|$2|' <<<"$a"
unix.stackexchange.com
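
If Perl is not available, the same capture-group idea can be sketched with POSIX sed. Note that this swaps in sed for perl. The function name url_host_sed and the scheme pattern are assumptions made for illustration.

```shell
# url_host_sed is a hypothetical name. The pattern is a POSIX basic regex:
#   ^[a-zA-Z][a-zA-Z0-9+.-]*://   a scheme followed by "://"
#   \([^/]*\)                     capture group 1: everything up to the next "/"
#   .*$                           the rest of the line, which is discarded
# -n together with the trailing "p" prints only lines where the substitution matched.
url_host_sed() {
    printf '%s\n' "$1" | sed -n 's|^[a-zA-Z][a-zA-Z0-9+.-]*://\([^/]*\).*$|\1|p'
}

url_host_sed 'http://unix.stackexchange.com/questions/ask'   # unix.stackexchange.com
```

Input without a scheme does not match, so nothing is printed, which makes malformed values easy to detect.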

Author: Toothrot

Updated on September 18, 2022

Comments

  • Toothrot, over 1 year ago

    In a bash or zsh script, how might I extract the host from a URL, e.g. unix.stackexchange.com from http://unix.stackexchange.com/questions/ask, if the latter is in an environment variable?

  • George Vasiliou, about 7 years ago

    Nice alternatives, +1. One note on the sed solution: the s command needs a final slash to close its empty replacement, i.e. echo "$_URL" | sed -ne 'y|/|\n|;s/.*\n\n//;P' or, even better, echo "$_URL" | sed -ne 'y|/|\n|;s|.*\n\n||;P'