xml.etree.ElementTree.ParseError -- exception handling not catching errors

There are some workarounds, like defining custom entities, suggested at:

But if you are able to switch to lxml, its XMLParser() can run in "recover" mode, which would "ignore" the undefined entities instead of raising an error:

import lxml.etree as ET

# recover=True makes the parser try to continue past broken XML
parser = ET.XMLParser(recover=True)
tree = ET.parse('cic.fam_lat.xml', parser=parser)
root = tree.getroot()

for name in root.iter('name'):
    print(root.tag, name.text)

(worked for me - got the tag names and texts printed)
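
As for why the original try/except never fires: the ParseError is raised by the parse call itself, which in the question runs before the try block, so there is nothing left to catch by the time the loop starts. In addition, `import xml.etree.ElementTree as ET` binds only the name `ET`, so the exception should be referenced as `ET.ParseError`, and the `while True` loop has no exit. Below is a minimal sketch using the standard library; the sample string and the helper name `parse_or_none` are made up for illustration. Note that catching the error only lets you report it: the standard-library parser stops at the first undefined entity and returns no tree, so it cannot skip entities the way the recover mode does.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: &nbsp; is not declared anywhere, so parsing fails.
SAMPLE = "<root><name>Cicero&nbsp;</name></root>"


def parse_or_none(xml_text):
    """Return the root element, or None if the text is not well-formed."""
    try:
        # The parse call must be inside the try block, because this is
        # where ParseError is raised.
        return ET.fromstring(xml_text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the failure.
        print("parse failed:", err, "at", err.position)
        return None


root = parse_or_none(SAMPLE)
if root is not None:
    for name in root.iter("name"):
        print(root.tag, name.text)
```

The same pattern applies to `ET.parse('cic.fam_lat.xml')`: wrap that call, not the iteration over the already-parsed tree.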

Author by

Daniel

Updated on June 04, 2022

Comments

  • Daniel
    Daniel almost 2 years

    I'm trying to parse an XML document that contains a number of undefined entities, which cause a ParseError when I run my code. The code is as follows:

    import xml.etree.ElementTree as ET
    
    tree = ET.parse('cic.fam_lat.xml')
    root = tree.getroot()
    
    while True:
        try:
            for name in root.iter('name'):
                print(root.tag, name.text)
        except xml.etree.ElementTree.ParseError:
            pass
    
    for name in root.iter('name'):
        print(name.text)
    

    An example of said error is as follows (there are a number of undefined entities, and they all raise the same error): [image: error description]

    I just want to ignore them rather than go in and edit out each one. How should I edit my exception handling to catch these error instances? (i.e., what am I doing wrong?)