Find and Replace Values in XML using Python
Solution 1
The basics:
from xml.etree import ElementTree as et
tree = et.parse(datafile)
tree.find('idinfo/timeperd/timeinfo/rngdates/begdate').text = '1/1/2011'
tree.find('idinfo/timeperd/timeinfo/rngdates/enddate').text = '1/1/2011'
tree.write(datafile)
You can shorten the path if the tag name is unique. This syntax finds the first node at any depth level in the tree.
tree.find('.//begdate').text = '1/1/2011'
tree.find('.//enddate').text = '1/1/2011'
Also, read the documentation, esp. the XPath support for locating nodes.
Solution 2
If you just want to replace the bits enclosed with %
, then this isn't really an XML problem. You can easily do it with regex:
import re
xmlstring = open('myxmldocument.xml', 'r').read()
substitutions = {'SITEDESCR': 'myvalue', ...}
pattern = re.compile(r'%([^%]+)%')
xmlstring = re.sub(pattern, lambda m: substitutions[m.group(1)], xmlstring)
Mike
Updated on July 09, 2022Comments
-
Mike almost 2 years
I am looking to edit XML files using python. I want to find and replace keywords in the tags. In the past, a co-worker had set up template XML files and used a "find and replace" program to replace these key words. I want to use python to find and replace these key words with values. I have been teaching myself the Elementtree module, but I am having trouble trying to do a find and replace. I have attached a snid-bit of my XML file. You will seen some variables surrounded by % (ie %SITEDESCR%) These are the words I want to replace and then save the XML to a new file. Any help or suggestions would be great.
Thanks, Mike
<metadata> <idinfo> <citation> <citeinfo> <origin>My Company</origin> <pubdate>05/04/2009</pubdate> <title>POLYGONS</title> <geoform>vector digital data</geoform> <onlink>\\C$\ArcGISDevelopment\Geodatabase\PDA_STD_05_25_2009.gdb</onlink> </citeinfo> </citation> <descript> <abstract>This dataset represents the mapped polygons developed from the field data for the %SITEDESCR%.</abstract> <purpose>This dataset was created to accompany some stuff.</purpose> </descript> <timeperd> <timeinfo> <rngdates> <begdate>%begdate%</begdate> <begtime>unknown</begtime> <enddate>%enddate%</enddate> <endtime>unknown</endtime> </rngdates> </timeinfo> <current>ground condition</current> </timeperd>
-
Mike almost 13 yearsI tested this in a standalone script and this works well. I will add this to my python library for future reference. Thanks for responding.
-
Mike almost 13 yearsHey, thanks Mark. This is exactly what I was looking for. This works with my exsiting python program.
-
Rodney Richardson over 6 yearsThis can be fragile - XML files are not just text. Whitespace is generally not important in XML, so a change to the input file can result in identical XML which your code doesn't recognise.
-
Rostislav Matl over 6 yearsthere is no whitespace in the %something% placeholders, I'd only add encode special XML characters like especially < and > if necessary
-
Aleister Tanek Javas Mraz almost 5 yearsI suppose I just don't see how this solution would work when the value of the XML node is unknown.
-
Rostislav Matl almost 5 yearsAs long as 'uknown' is a not taken for a valid placeholder, it will be left as it is. It's just a string.