Find and Replace Values in XML using Python

92,507

Solution 1

The basics:

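from xml.etree import ElementTree as et
# datafile is the path to the XML file
tree = et.parse(datafile)
# find() returns the first element matching the path, relative to the root
tree.find('idinfo/timeperd/timeinfo/rngdates/begdate').text = '1/1/2011'
tree.find('idinfo/timeperd/timeinfo/rngdates/enddate').text = '1/1/2011'
# This overwrites the input file; pass a different path to save a copy
tree.write(datafile)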

You can shorten the path if the tag name is unique. This syntax finds the first node at any depth level in the tree.

tree.find('.//begdate').text = '1/1/2011'
tree.find('.//enddate').text = '1/1/2011'
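
One thing to watch with this pattern: find() returns None when nothing matches, so the assignment to .text then fails with an AttributeError. A minimal sketch of a guard follows; the helper name set_text and the sample document are hypothetical, not part of the original answer.

```python
from xml.etree import ElementTree as et

def set_text(tree, path, value):
    """Set the text of the first element matching path; raise if none matches."""
    node = tree.find(path)
    if node is None:
        # find() signals "no match" with None rather than an exception
        raise KeyError(f"no element matches {path!r}")
    node.text = value
    return node

# Small in-memory document standing in for the real file
tree = et.ElementTree(et.fromstring(
    "<metadata><rngdates><begdate>%begdate%</begdate></rngdates></metadata>"
))
set_text(tree, './/begdate', '1/1/2011')
```

With a guard like this, a misspelled tag name produces an error that names the path instead of a bare AttributeError.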

Also, read the documentation, especially the section on XPath support for locating nodes.
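
If the goal is to fill in every %NAME% placeholder rather than set specific tags, the same module can walk the whole tree. The following is a rough sketch under that assumption; fill_placeholders, PLACEHOLDER and the sample XML are illustrative names, and only element text is handled (not attributes or tail text).

```python
import re
from xml.etree import ElementTree as et

PLACEHOLDER = re.compile(r'%([^%]+)%')

def fill_placeholders(root, values):
    """Replace %NAME% placeholders in the text of every element under root."""
    def substitute(text):
        # Names missing from values are left untouched instead of raising KeyError
        return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), text)

    # iter() yields root itself and then every descendant element
    for elem in root.iter():
        if elem.text:
            elem.text = substitute(elem.text)
    return root

sample = (
    "<rngdates>"
    "<begdate>%begdate%</begdate>"
    "<begtime>unknown</begtime>"
    "<enddate>%enddate%</enddate>"
    "</rngdates>"
)
root = fill_placeholders(
    et.fromstring(sample),
    {"begdate": "1/1/2011", "enddate": "1/1/2011"},
)
result = et.tostring(root, encoding="unicode")
```

Because the values are assigned to .text, ElementTree escapes characters such as & and < when the tree is serialized, so the replacement values do not need to be escaped by hand.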

Solution 2

If you just want to replace the bits enclosed with %, then this isn't really an XML problem. You can easily do it with regex:

import re
with open('myxmldocument.xml', 'r') as f:
    xmlstring = f.read()
# Maps each placeholder name (without the % signs) to its replacement
substitutions = {'SITEDESCR': 'myvalue'}  # ... remaining placeholders
pattern = re.compile(r'%([^%]+)%')
xmlstring = pattern.sub(lambda m: substitutions[m.group(1)], xmlstring)
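
Two caveats apply to the plain-text approach: a placeholder missing from the dictionary raises KeyError, and a value containing &, < or > would produce invalid XML. A possible variation that addresses both is sketched below, assuming unknown placeholders should simply be left in place; the function name substitute and the sample values are made up for illustration.

```python
import re
from xml.sax.saxutils import escape

PLACEHOLDER = re.compile(r'%([^%]+)%')

def substitute(xmlstring, substitutions):
    """Fill %NAME% placeholders, escaping values and skipping unknown names."""
    def replace(match):
        name = match.group(1)
        if name not in substitutions:
            # Leave the placeholder as it is so it can be spotted later
            return match.group(0)
        # escape() converts &, < and > to XML entities
        return escape(substitutions[name])
    return PLACEHOLDER.sub(replace, xmlstring)

line = "<abstract>Field data for the %SITEDESCR%.</abstract><begdate>%begdate%</begdate>"
filled = substitute(line, {"SITEDESCR": "Smith & Sons <north> site"})
```

Here %SITEDESCR% is replaced with an escaped value while %begdate% stays untouched because no value was supplied for it. Note that the pattern treats any text between two % signs as a placeholder, so literal percent signs elsewhere in the file could be matched by accident.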
Author by

Mike

Updated on July 09, 2022

Comments

  • Mike
    Mike almost 2 years

    I am looking to edit XML files using Python. I want to find and replace keywords in the tags. In the past, a co-worker had set up template XML files and used a "find and replace" program to replace these keywords. I want to use Python to find and replace these keywords with values. I have been teaching myself the ElementTree module, but I am having trouble trying to do a find and replace. I have attached a snippet of my XML file. You will see some variables surrounded by % (i.e. %SITEDESCR%). These are the words I want to replace and then save the XML to a new file. Any help or suggestions would be great.

    Thanks, Mike

    <metadata>
    <idinfo>
    <citation>
    <citeinfo>
     <origin>My Company</origin>
     <pubdate>05/04/2009</pubdate>
     <title>POLYGONS</title>
     <geoform>vector digital data</geoform>
     <onlink>\\C$\ArcGISDevelopment\Geodatabase\PDA_STD_05_25_2009.gdb</onlink>
    </citeinfo>
    </citation>
     <descript>
     <abstract>This dataset represents the mapped polygons developed from the field data for the %SITEDESCR%.</abstract>
     <purpose>This dataset was created to accompany some stuff.</purpose>
     </descript>
    <timeperd>
    <timeinfo>
    <rngdates>
     <begdate>%begdate%</begdate>
     <begtime>unknown</begtime>
     <enddate>%enddate%</enddate>
     <endtime>unknown</endtime>
     </rngdates>
     </timeinfo>
     <current>ground condition</current>
     </timeperd>
    
  • Mike
    Mike almost 13 years
    I tested this in a standalone script and it works well. I will add this to my Python library for future reference. Thanks for responding.
  • Mike
    Mike almost 13 years
    Hey, thanks Mark. This is exactly what I was looking for. This works with my existing Python program.
  • Rodney Richardson
    Rodney Richardson over 6 years
    This can be fragile - XML files are not just text. Whitespace is generally not important in XML, so a change to the input file can result in identical XML which your code doesn't recognise.
  • Rostislav Matl
    Rostislav Matl over 6 years
    There is no whitespace in the %something% placeholders. I'd only add encoding of special XML characters, especially < and >, if necessary.
  • Aleister Tanek Javas Mraz
    Aleister Tanek Javas Mraz almost 5 years
    I suppose I just don't see how this solution would work when the value of the XML node is unknown.
  • Rostislav Matl
    Rostislav Matl almost 5 years
    As long as 'unknown' is not taken for a valid placeholder, it will be left as it is. It's just a string.