Extracting value of xml tag in PostgreSQL
Use the xpath() function:
WITH x(col) AS (SELECT '<?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status></response>'::xml)
SELECT xpath('./status/text()', col) AS status
FROM x
/text() strips the surrounding <status> tag.
xpath() returns an array of xml, with a single element in this case:
status
xml[]
-------
{ERROR_MISSING_DATA}
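A note on newer versions, offered as a sketch rather than a tested recipe: since PostgreSQL 11, relative XPath expressions are evaluated from the document node instead of the root element, so './status/text()' may come back as an empty array for the sample document. A path that names the root element (here <response>, taken from the sample data) should return the same one-element array on older and newer versions alike:

```sql
-- Same sample document as above; only the XPath expression differs.
-- The absolute path names the root element <response> explicitly,
-- so it does not depend on where relative paths start.
WITH x(col) AS (
   SELECT '<?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status></response>'::xml
   )
SELECT xpath('/response/status/text()', col) AS status
FROM   x;
```

The same substitution applies to the table queries that follow if './status/text()' returns nothing on your server.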
Applied to your table
In response to your question update, this can simply be:
SELECT id, xpath('./status/text()', response::xml) AS status
FROM tbl;
If you are certain there is only a single status tag per row, you can simply extract the first item from the array:
SELECT id, (xpath('./status/text()', response::xml))[1] AS status
FROM tbl;
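The extracted item is still of type xml, and xml has no equality operator, so comparing it to a string needs one more cast. A minimal sketch, assuming the table tbl and column response used above and the root element <response> from the sample row (the path names the root explicitly, which works regardless of how a given version resolves relative paths):

```sql
-- Cast the first array element to text so it can be compared and sorted.
-- 'SUCCESS' is just the example value mentioned in the question.
SELECT id
     , (xpath('/response/status/text()', response::xml))[1]::text AS status
FROM   tbl
WHERE  (xpath('/response/status/text()', response::xml))[1]::text = 'SUCCESS';
```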
If there can be multiple status items:
SELECT id, unnest(xpath('./status/text()', response::xml)) AS status
FROM tbl;
This gets you 1-n rows per id.
Cast to xml
Since you defined your columns to be of type text (instead of xml), you need to cast to xml explicitly. The function xpath() expects its 2nd parameter to be of type xml. An untyped string constant is coerced to xml automatically, but a text column is not.
This works without explicit cast:
SELECT xpath('./status/text()'
,'<?xml version="1.0" ?><response><status>SUCCESS</status></response>')
A CTE like in my first example needs a type for every column in the "common table expression". If I had not cast to a specific type, the type unknown would have been used, which is not the same thing as an untyped string constant. There is no direct conversion implemented between unknown and xml. You'd have to cast to text first: unknown_type_col::text::xml. Better to cast to xml right away.
This has been tightened with PostgreSQL 9.1 (I think). Older versions were more permissive.
Either way, with any of these methods the string has to be valid xml or the cast (implicit or explicit) will raise an exception.
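If some rows might hold malformed XML, one way to avoid the exception is to test the string before casting. A minimal sketch, assuming PostgreSQL 9.1 or later (where xml_is_well_formed_document() is available), the table tbl and column response from above, and the root element <response> from the sample row:

```sql
-- Rows whose response is not a well-formed XML document yield NULL
-- instead of raising an exception, because the cast to xml is only
-- evaluated in the CASE branch that passed the check.
SELECT id
     , CASE WHEN xml_is_well_formed_document(response)
            THEN (xpath('/response/status/text()', response::xml))[1]::text
       END AS status
FROM   tbl;
```

The check sits in a CASE expression rather than a WHERE clause so that the cast is tied to the test within a single expression.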
Comments
-
ronak almost 2 years
Below is the column response from my Postgres table. I want to extract the status from all the rows in my Postgres database. The status can vary in length (SUCCESS, for example), so I do not want to use the substring function. Is there a way to do it?
<?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status><responseType>COUNTRY_MISSING</responseType><country_info>USA</country_info><phone_country_code>1234</phone_country_code></response>
so my table structure is like this
   Column    |            Type             |                      Modifiers
-------------+-----------------------------+-----------------------------------------------------
 id          | bigint                      | not null default nextval('events_id_seq'::regclass)
 hostname    | text                        | not null
 time        | timestamp without time zone | not null
 trn_type    | text                        |
 db_ret_code | text                        |
 request     | text                        |
 response    | text                        |
 wait_time   | text                        |
I want to extract the status from each and every request. How do I do this?
Below is a sample row. Assume the table name is abc_events.
id       | 1870667
hostname | abcd.local
time     | 2013-04-16 00:00:23.861
trn_type | A
request  | <?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status><responseType>COUNTRY_MISSING</responseType><country_info>USA</country_info><phone_country_code>1234</phone_country_code></response>
response | <?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status><responseType>COUNTRY_MISSING</responseType><country_info>USA</country_info><phone_country_code>1234</phone_country_code></response>
-
Phrogz about 11 years
Do you need the ::xml? I was just doing SELECT xpath('...', '<raw>xml</raw>'); and it seems to work.
-
ronak about 11 years
I edited my question. Essentially what I want is to extract the value of a tag from the column that has the XML request/response.
-
ronak about 11 years
I followed it but I am getting this error:
LINE 1: select unnest(xpath('./status/text()', request)) from abc_events
                      ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
It is pointing to the xpath function.
-
Erwin Brandstetter about 11 years
@Phrogz: I added a chapter on the topic of casting, since my initial comment wasn't completely correct. A cast is actually needed with a CTE in this case ...
-
Erwin Brandstetter about 11 years
@ronak: I added a bit to my answer. Note the addendum about casting to xml. Also note I had the wrong cast at first. It must be ::xml.
-
ronak about 11 years
Thanks for the help Erwin. This helped me a lot.
-
Erwin Brandstetter about 11 years
@ronak: Cool. :) For more advanced acrobatics with xpath() consider this related answer.
-
Aamir over 10 years
But what if there are multiple tags in a column? How can I extract them? Suppose the XML data in a column is like: <status>abc</status><response>ERROR_MISSING_DATA</response>
-
Peter Krauss over 6 years
Hi, a simple "cast XML to text" must use //text() ... So array_to_string( xpath('path//text()', xcontent)::text[] , '') to obtain all text from, e.g., the TXT of an HTML document.
-
Surya over 5 years
<?xml version="1.0" encoding="UTF-8"?><BookList xmlns="azkhaban.com/DEM-ON-TOR-20040511#" xmlns:xsd="hogwarts.org/2001/XMLSchema" xmlns:xsi="p07.org/1989/…>
When the XML is something like this, how can I get the xpath? The URLs may not be the same for every entry in the table, so I cannot keep the URL as part of the xpath, right?