"List cannot be serialized" error when using XPath with lxml etree

Solution 1

articles = root.xpath('//article[contains(text(), "stuff")]')

for article in articles:
    print(etree.tostring(article, pretty_print=True, encoding="unicode"))

root.xpath returns a Python list, so e in the question (called articles here) is a list. etree.tostring converts a single lxml _Element to a string; it does not accept a list of _Elements. So use a for-loop to serialize and print each _Element in the list.

Solution 2

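You can also use the built-in str.join method, but join only accepts strings, so each element has to be serialized with etree.tostring first:

e = root.xpath('//article[contains(text(), "stuff")]')
# Serialize each element, then join the resulting strings.
joined_string = "".join(etree.tostring(el, encoding="unicode") for el in e)
print(joined_string)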

Solution 3

Here is a complete, runnable solution. It also uses join, but first serializes each element with etree.tostring in a list comprehension:

from lxml import etree

root = etree.fromstring('''<stuff>

<article date="2014-05-18 17:14:44" title="Some stuff">stuff in text
<tags>Hello and stuff</tags>
</article>

<article date="whatever" title="Some stuff">no s_t_u_f_f in text
<tags>Hello and stuff</tags>
</article>

<article date="whatever" title="whatever">More stuff in text
<tags>Hello and stuff</tags>
</article>

</stuff>''')
articles = root.xpath('//article[contains(text(), "stuff")]')

print("".join([etree.tostring(article, encoding="unicode", pretty_print=True) for article in articles]))

(For encoding="unicode" see e.g. http://makble.com/python-why-lxml-etree-tostring-method-returns-bytes)
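As a small illustration of why encoding="unicode" matters, the sketch below (assuming Python 3, with a one-element document made up for the example) compares the default return type of etree.tostring with the one produced when encoding="unicode" is passed.

```python
from lxml import etree

# Hypothetical one-element document, invented for this illustration.
element = etree.fromstring("<article>stuff</article>")

as_bytes = etree.tostring(element)                     # default: a bytes object
as_text = etree.tostring(element, encoding="unicode")  # a str object

print(type(as_bytes).__name__, as_bytes)
print(type(as_text).__name__, as_text)
```

Because "".join(...) on a str separator only accepts str items, the bytes form would need to be decoded before joining, while the encoding="unicode" form can be joined directly.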

Author: James

Updated on June 14, 2022

Comments

  • James
    James about 2 years

    I am trying to search for a string within an XML document, and then print out the entire element, or elements, that contain that string. This is my code so far:

    post = open('postf.txt', 'r')
    postf = str(post.read())
    
    root = etree.fromstring(postf)
    
    e = root.xpath('//article[contains(text(), "stuff")]')
    
    print etree.tostring(e, pretty_print=True)
    

    This is the XML that is being searched from postf.txt

    <stuff>
    
    <article date="2014-05-18 17:14:44" title="Some stuff">More testing
    debug
    [done]
    <tags>Hello and stuff
    </tags></article>
    
    </stuff>
    

    And finally, this is my error:

      File "cliassis-1.2.py", line 107, in command
        print etree.tostring(e, pretty_print=True)
      File "lxml.etree.pyx", line 3165, in lxml.etree.tostring (src\lxml\lxml.etree.c:69414)
    TypeError: Type 'list' cannot be serialized.
    

    What I want this to do is search for all elements containing the string I searched for, and then print out the tags. So if I have "test and stuff", and I search for 'test', I want it to print out "test and stuff".

  • James
    James about 10 years
    This worked perfectly, and the explanation made me understand why it wasn't working. Thank you. :D