List can not be serialized error when using Xpath with lxml etree
Solution 1
articles = root.xpath('//article[contains(text(), "stuff")]')
for article in articles:
    # Serialize each matching element individually, not the whole list
    print(etree.tostring(article, encoding="unicode", pretty_print=True))
root.xpath returns a Python list, so e (the variable in the question's code) is a list. etree.tostring converts lxml _Elements to strings; it does not convert lists of _Elements to strings. So use a for-loop to print the _Elements inside the list as strings.
Solution 2
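You can also use the built-in str.join method, as long as each element is serialized to a string first:
e = root.xpath('//article[contains(text(), "stuff")]')
# join() accepts only strings, so serialize each element before joining
joined_string = "".join(etree.tostring(el, encoding="unicode") for el in e)
print(joined_string)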
Solution 3
Here is a complete, runnable example that also uses join, serializing each element in a list comprehension before joining:
from lxml import etree
root = etree.fromstring('''<stuff>
<article date="2014-05-18 17:14:44" title="Some stuff">stuff in text
<tags>Hello and stuff</tags>
</article>
<article date="whatever" title="Some stuff">no s_t_u_f_f in text
<tags>Hello and stuff</tags>
</article>
<article date="whatever" title="whatever">More stuff in text
<tags>Hello and stuff</tags>
</article>
</stuff>''')
articles = root.xpath('//article[contains(text(), "stuff")]')
print("".join([etree.tostring(article, encoding="unicode", pretty_print=True) for article in articles]))
(For encoding="unicode" see e.g. http://makble.com/python-why-lxml-etree-tostring-method-returns-bytes)
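As a rough illustration of why encoding="unicode" matters under Python 3, the sketch below compares the two return types. The one-element document and the names as_bytes and as_text are assumptions chosen for the example.

```python
from lxml import etree

# Made-up single element, used only for illustration
element = etree.fromstring("<tags>Hello and stuff</tags>")

# By default tostring() returns bytes, which cannot be passed to "".join()
as_bytes = etree.tostring(element)

# With encoding="unicode" it returns a text string (str) instead
as_text = etree.tostring(element, encoding="unicode")

print(type(as_bytes).__name__)
print(type(as_text).__name__)
print(as_text)
```

Without encoding="unicode", joining the results with "" fails on Python 3 because the items are bytes; joining with b"" or decoding each item first would be alternatives.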
James
Updated on June 14, 2022
Comments
-
James about 2 years
I am trying to search for a string within an XML document, and then print out the entire element, or elements, that contain that string. This is my code so far:
post = open('postf.txt', 'r')
postf = str(post.read())
root = etree.fromstring(postf)
e = root.xpath('//article[contains(text(), "stuff")]')
print etree.tostring(e, pretty_print=True)
This is the XML that is being searched from postf.txt
<stuff>
  <article date="2014-05-18 17:14:44" title="Some stuff">More testing debug [done]
    <tags>Hello and stuff </tags>
  </article>
</stuff>
And finally, this is my error:
File "cliassis-1.2.py", line 107, in command
    print etree.tostring(e, pretty_print=True)
File "lxml.etree.pyx", line 3165, in lxml.etree.tostring (src\lxml\lxml.etree.c:69414)
TypeError: Type 'list' cannot be serialized.
What I want this to do is search for all elements containing the string I searched for, and then print out the tags. So if I have "test and stuff", and I search for 'test', I want it to print out "test and stuff".
-
James about 10 years
This worked perfectly, and the explanation made me understand why it wasn't working. Thank you. :D