Text manipulation: Extract everything inside brackets
Solution 1
With the help of awk:
$ awk -F'[][]' '{print $2}' < input
Dinero
Dia
Perro
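The awk command above prints only $2, the first bracketed word on each line. If a line can hold several bracketed words, a loop over the even-numbered fields picks up all of them. A minimal sketch, assuming a made-up sample file (the temporary file and the variable name all_words are illustrative only):

```shell
# Hypothetical sample: two bracketed words on the first line,
# one on the second, none on the third.
sample=$(mktemp)
printf '%s\n' \
  '%#&#%# [Dinero] / Money / !#@%$@ [Dia] / Day /' \
  '$%&$^#@ [Perro] / Dog /' \
  'no brackets here' > "$sample"

# With -F'[][]' both [ and ] are field separators, so the
# even-numbered fields are the bracket contents.
all_words=$(awk -F'[][]' '{ for (i = 2; i <= NF; i += 2) print $i }' "$sample")
printf '%s\n' "$all_words"

rm -f "$sample"
```

Unlike the plain print $2 version, this prints nothing for a line that has no brackets, rather than an empty line.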
Using grep:
grep -oP '\[\K[^\]]+' input
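\K tells the regex engine to drop everything matched so far from the reported match, so the opening [ must be present but is not printed. It is not strictly a look-around, but here it has the same effect as a positive look-behind assertion.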
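To see the effect of \K, it can be compared with an explicit look-behind on the same input. This is only a sketch: it assumes GNU grep built with PCRE support (the -P option), and the sample line and variable names are made up for illustration.

```shell
# Hypothetical sample line with two bracketed words.
line='%#&#%# [Dinero] / Money / !#@%$@ [Dia] / Day /'

# \K: match the [ but leave it out of the reported match.
with_k=$(printf '%s\n' "$line" | grep -oP '\[\K[^\]]+')

# Explicit positive look-behind for a single [ character.
with_lookbehind=$(printf '%s\n' "$line" | grep -oP '(?<=\[)[^\]]+')

printf '%s\n' "$with_k"
printf '%s\n' "$with_lookbehind"
```

Because -o prints every match on its own line, both forms should also cope with several bracketed words on one line.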
If your grep lacks the -P option, you can do the same with perl:
perl -lne '/\[\K[^\]]+/ and print $&' input
Add perl's -i option to edit the file in place, so that the file itself ends up containing only the extracted words.
Or you can simply use cut, as suggested by @juliepelletier:
cut -d"[" -f2 < input | cut -d"]" -f1
Solution 2
sed 's/^.*\[//;s/\].*$//' /path/to/input > /path/to/output
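Note that the sed command above passes lines without brackets through unchanged, and on a line with several bracketed words the greedy ^.*\[ keeps only the last one. If the first word per line is wanted and bracket-free lines should be skipped, one possible variant uses a capture group with -n and the p flag. This is a sketch on made-up sample data, not part of the original answer:

```shell
# Hypothetical sample, including one line without brackets.
sample=$(mktemp)
printf '%s\n' \
  '%#&#%# [Dinero] / Money /' \
  '!#@%$@ [Dia] / Day /' \
  'no brackets here' \
  '$%&$^#@ [Perro] / Dog /' > "$sample"

# Capture the text between the first [ and the next ];
# -n plus the p flag prints only lines where the substitution happened.
first_words=$(sed -n 's/^[^[]*\[\([^]]*\)\].*$/\1/p' "$sample")
printf '%s\n' "$first_words"

rm -f "$sample"
```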
Author: Billy
Updated on September 18, 2022
Comments
-
Billy over 1 year
I have a text file where every line is in a similar format to this:
%#&#%# [Dinero] / Money /
!#@%$@ [Dia] / Day /
$%&$^#@ [Perro] / Dog /
I am looking to extract the words inside the brackets, i.e. Ola, Dinero, Perro, etc., and save them all to a new text file, one per line. Essentially, I want to delete all words, letters, special characters, and anything else outside the brackets, including the brackets themselves.
-
Julie Pelletier almost 8 years
For the sake of simplicity, I was thinking of answering:
cat input|cut -d\[ -f2|cut -d\] -f1
-
Rahul almost 8 years
@juliepelletier Yes, I have added your suggestion to the answer. Thanks.
-
VLAZ almost 8 years
Huh, I've not actually used the \K flag. Seems useful. Given the requirements, I'd have simply done
grep -o -e '\[.*\]' file.txt | sed -e 's/\[//' -e 's/\]//'
to find and extract all words with the brackets surrounding them and then strip away the brackets. And yes, that's the slightly verbose variant; it could also be done more concisely with the same effect. It just better describes the flow I'd have put the data through. I keep meaning to pick up awk but I can never find the time.
-
Matthew Finlay almost 8 years
I think you need < input, not > input
-
Rahul almost 8 years
@MatthewFinlay I have not used > input anywhere in my post.
-
Matthew Finlay almost 8 years
@Rahul $ awk -F'[][]' '{print $2}' > input
-
Rahul almost 8 years
@MatthewFinlay Ah, silly typo. Anyway, thanks for pointing it out.