Extract xml tag value using awk command
Solution 1
You can use awk as shown below. However, this is NOT a robust solution: it will fail if the XML is not laid out as expected, e.g. if there are multiple elements on the same line.
$ dt=$(awk -F '[<>]' '/IntrBkSttlmDt/{print $3}' file)
$ echo $dt
1967-08-13
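To see why `$3` holds the value, and why the command breaks when a line carries more than one element, it can help to print every field awk produces. The following is only a minimal sketch: the `show_fields` helper and the two-element sample line are made up for illustration and are not part of the original answer.

```shell
# Hypothetical helper: print each field awk sees when splitting on '<' or '>'.
show_fields() {
  printf '%s\n' "$1" |
    awk -F '[<>]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

# One element per line: field 2 is the tag name, field 3 the text between the tags.
good=$(show_fields '<IntrBkSttlmDt>1967-08-13</IntrBkSttlmDt>')
printf '%s\n' "$good"

# Two elements on one line (illustrative input): field 3 now belongs
# to the first element, so the original command returns the wrong value.
bad=$(printf '%s\n' '<MsgId>A</MsgId><IntrBkSttlmDt>1967-08-13</IntrBkSttlmDt>' |
  awk -F '[<>]' '/IntrBkSttlmDt/{print $3}')
printf '%s\n' "$bad"
```

In the first case the line splits into five fields, with empty fields before the first `<` and after the last `>`. In the second case the same pattern still matches the line, but `$3` is the text of `MsgId` rather than the date.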
I suggest you use a proper XML processing tool, like xmllint.
$ dt=$(xmllint --shell file <<< "cat //IntrBkSttlmDt/text()" | grep -v "^/ >")
$ echo $dt
1967-08-13
Solution 2
The following gawk command uses a regex record separator (RS) to match the XML tags. Anything that starts with a <, is followed by at least one character other than >, and is terminated by a > is considered to be a tag. Gawk stores the text matched by RS in the RT variable. The text preceding each tag becomes the record, which gawk assigns to $0.
gawk 'BEGIN { RS="<[^>]+>" } { print RT, $0 }' myfile
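Building on the RS/RT idea, one possible way to get individual values into shell variables is to print a `name=value` line whenever a closing tag follows non-blank text, then read those lines back in the shell. This is a sketch, not a real XML parser: it needs GNU awk (RT is a gawk extension), it does not handle comments, CDATA sections or entities, and the temporary file and the variable names `pairs`, `msg_id` and `dt` are assumptions chosen for illustration.

```shell
# Sample input: a trimmed copy of the XML from the question.
xml=$(mktemp)
cat > "$xml" <<'EOF'
<root> <GrpHdr> <MsgId>A</MsgId> <TtlIntrBkSttlmAmt Ccy="EUR">0.0</TtlIntrBkSttlmAmt>
<IntrBkSttlmDt>1967-08-13</IntrBkSttlmDt> </GrpHdr> </root>
EOF

# RT holds the tag that ended the record, $0 the text before that tag.
# Emit name=value only when a closing tag follows non-blank text.
pairs=$(gawk 'BEGIN { RS = "<[^>]+>" }
  RT ~ /^<\// && $0 ~ /[^[:space:]]/ {
    print substr(RT, 3, length(RT) - 3) "=" $0
  }' "$xml")
printf '%s\n' "$pairs"

# Copy the values of interest into separate shell variables.
while IFS='=' read -r name value; do
  case $name in
    MsgId) msg_id=$value ;;
    IntrBkSttlmDt) dt=$value ;;
  esac
done <<EOF
$pairs
EOF
echo "$msg_id $dt"
rm -f "$xml"
```

Because the tags themselves act as record separators, this variant does not depend on line breaks and also copes with several elements on one line. Container elements such as `GrpHdr` are skipped because the text in front of their closing tag is only whitespace.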
Author: user1929905
Updated on July 09, 2022
Comments
user1929905 (almost 2 years):
I have an XML document like the one below:
<root>
  <FIToFICstmrDrctDbt>
    <GrpHdr>
      <MsgId>A</MsgId>
      <CreDtTm>2001-12-17T09:30:47</CreDtTm>
      <NbOfTxs>0</NbOfTxs>
      <TtlIntrBkSttlmAmt Ccy="EUR">0.0</TtlIntrBkSttlmAmt>
      <IntrBkSttlmDt>1967-08-13</IntrBkSttlmDt>
      <SttlmInf>
        <SttlmMtd>CLRG</SttlmMtd>
        <ClrSys>
          <Prtry>xx</Prtry>
        </ClrSys>
      </SttlmInf>
      <InstgAgt>
        <FinInstnId>
          <BIC>AAAAAAAAAAA</BIC>
        </FinInstnId>
      </InstgAgt>
    </GrpHdr>
  </FIToFICstmrDrctDbt>
</root>
I need to extract the value of each tag into a separate variable using awk. How can I do this?