Modify xml values file using python

13,420

Solution 1

Use lxml. This example uses lxml.etree and would actually fail on your example xml, as it has some unclosed tags in it. If you have the same problems with the real-world data that you are going to parse, use lxml.html, that can handle broken xml (instructions added to the code as comments).

In [14]: import lxml.etree as et  # for broken xml add an import:
                                  # import lxml.html as lh

In [15]: doc = et.fromstring(xmlstr)  # for broken xml replace this line with:
                                      # doc = lh.fromstring(xmlstr)

                                      # if you read xml from a file:
                                      # doc = et.parse('file_path')

In [16]: for elem in doc.xpath('.//temp-config'):
    ...:     elem.text = 'Prod'
    ...:     

In [17]: print et.tostring(doc,pretty_print=True)
<config>
  <logging/>
  <test-mode>false</test-mode>
  <test name="test02">
    <mail/>
    <test-system>0</test-system>
    <system id="0" name="suite1" type="regression">
      <temp-config>Prod</temp-config>
      <rpm>0.5</rpm>
      <cycles>3</cycles>
    </system>
    <system id="1" name="suite2" type="regression">
      <temp-config>Prod</temp-config>
      <rpm>0.5</rpm>
      <cycles>3</cycles>
    </system>
    <system id="2" name="suite3" type="regression">
      <temp-config>Prod</temp-config>
      <rpm>0.5</rpm>
      <cycles>3</cycles>
    </system>
    <system id="3" name="suite4" type="regression">
      <temp-config>Prod</temp-config>
      <rpm>0.5</rpm>
      <cycles>3</cycles>
    </system>
    <system id="4" name="suite5" type="regression">
      <temp-config>Prod</temp-config>
      <rpm>0.5</rpm>
      <cycles>3</cycles>
    </system>
  </test>
</config>

Note: As pointed out by others, you have some less powerful alternatives in the standard library. For simple tasks as this they may be well suited, however, if your xml files are broken, parsing them with standard library tools equals wasting your time.

Solution 2

ElementTree is a great choice -- pure Python and included in the standard library, so it's the most portable option. However, I always go straight to lxml -- it has the same API, it's just faster and it can do a lot more (since it is effectively a wrapper around libxml2).

from lxml import etree
tree = etree.parse(path_to_my_xml)

for elem in tree.findall('.//test'):
    assert elem.attrib['name'] == 'test02'
    elem.attrib['name'] == 'test03'

for elem in tree.findall('.//temp-config'):
    assert elem.text == 'QA'
    elem.text = 'Prod'

with open(path_to_output_file, 'w') as file_handle:
    file_handle.write(etree.tostring(tree, pretty_print=True, encoding='utf8'))
Share:
13,420
DigitalDyn
Author by

DigitalDyn

Updated on July 20, 2022

Comments

  • DigitalDyn
    DigitalDyn almost 2 years

    I'm very new to python and I need to modify the

    <test name="test02"></xmpp> to <test name="test03"></xmpp> 
    
    <temp-config>QA</temp-config> to <temp-config>Prod</temp-config> 
    

    for all 5 occurrences using python. Not sure what lib to use. Any help in this is really appreciated.

    <config>
    <logging></logging>
    <test-mode>false</test-mode>
    <test name="test02"></xmpp>
    <mail></mail>
    <test-system>0</test-system>
    <system id="0" name="suite1" type="regression">
        <temp-config>QA</temp-config>
        <rpm>0.5</rpm>
        <cycles>3</cycles>
    </system>
    <system id="1" name="suite2" type="regression">
        <temp-config>QA</temp-config>
        <rpm>0.5</rpm>
        <cycles>3</cycles>
    </system>
    <system id="2" name="suite3" type="regression">
        <temp-config>QA</temp-config>
        <rpm>0.5</rpm>
        <cycles>3</cycles>
    </system>
    <system id="3" name="suite4" type="regression">
        <temp-config>QA</temp-config>
        <rpm>0.5</rpm>
        <cycles>3</cycles>
    </system>
    <system id="4" name="suite5" type="regression">
        <temp-config>QA</temp-config>
        <rpm>0.5</rpm>
        <cycles>3</cycles>
    </system>
    </config>