XML Data Handling in Python

Hi all,

Welcome to the “XML data handling using Python” tutorial. In this tutorial, I will explain how to handle XML data using Python.

OK, let’s start…

Method 1: Using xmltodict library

XML to JSON output

This example shows how to convert XML data to JSON-formatted data using the xmltodict library.

  • First, install the xmltodict library in your project using the following command.
pip install xmltodict
  • Open the Python Console and run the following example.
import xmltodict
import json


xmlData = """<table name="ShippedItems">
<column name="ItemNumber" value="primaryKey"/>
<column name="Weight"/>
<column name="Dimension"/>
<column name="InsuaranceAmt"/>
<column name="Destination"/>
<column name="FinalDeliveryDate"/>
<column name="UniqueID" value="foreignKey"/>
</table>
"""
print(json.dumps(xmltodict.parse(xmlData)))

You can see the following output:

{"table": {"@name": "ShippedItems", "column": [{"@name": "ItemNumber", "@value": "primaryKey"}, {"@name": "Weight"}, {"@name": "Dimension"}, {"@name": "InsuaranceAmt"}, {"@name": "Destination"}, {"@name": "FinalDeliveryDate"}, {"@name": "UniqueID", "@value": "foreignKey"}]}}
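To make the shape of that output concrete, here is a small sketch that loads the JSON back with the standard json module and reads values from it. The variable names (json_output, columns, names, key_columns) are illustrative, and the string is a shortened copy of the output above. It relies only on what the output shows: attribute keys carry an "@" prefix, and the repeated column elements arrive as a list.

```python
import json

# Shortened copy of the JSON printed above (assumption: first three columns only).
json_output = (
    '{"table": {"@name": "ShippedItems", "column": ['
    '{"@name": "ItemNumber", "@value": "primaryKey"}, '
    '{"@name": "Weight"}, {"@name": "Dimension"}]}}'
)

data = json.loads(json_output)

# XML attributes appear as keys prefixed with "@".
table_name = data["table"]["@name"]

# Repeated <column> elements are grouped into a list.
columns = data["table"]["column"]
names = [column["@name"] for column in columns]

# Not every column has a "value" attribute, so read optional keys with .get().
key_columns = [
    column["@name"] for column in columns if column.get("@value") == "primaryKey"
]

print(table_name)
print(names)
print(key_columns)
```

Using .get() for "@value" avoids a KeyError on columns such as Weight that have no value attribute.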

JSON to XML

This example shows how to convert JSON-formatted data to XML using the xmltodict library.

Open the Python Console and run the following example.

import xmltodict
import json

mydict = {"relation": {"member": [{"@name": "ShippedItems", "@rel": "Shipped_Via"}, {"@name": "TranspotationEvent", "@rel": "Shipped_Via"}], "cardinality": {"card": [{"@name": "ShippedItems", "@value": "many"}, {"@name": "TranspotationEvent", "@value": "many"}]}}}
print(xmltodict.unparse(mydict, pretty=True))

You can see the following output:

<?xml version="1.0" encoding="utf-8"?>
<relation>
<member name="ShippedItems" rel="Shipped_Via"></member>
<member name="TranspotationEvent" rel="Shipped_Via"></member>
<cardinality>
<card name="ShippedItems" value="many"></card>
<card name="TranspotationEvent" value="many"></card>
</cardinality>
</relation>
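One way to sanity-check XML like this is to parse it back with the standard library and read the attributes. The sketch below uses a copy of the output above as a string; the variable names (xml_output, members, cards) are illustrative and not part of xmltodict.

```python
import xml.etree.ElementTree as ET

# Copy of the XML printed above. It is encoded to bytes because the text
# carries its own encoding declaration.
xml_output = """<?xml version="1.0" encoding="utf-8"?>
<relation>
<member name="ShippedItems" rel="Shipped_Via"></member>
<member name="TranspotationEvent" rel="Shipped_Via"></member>
<cardinality>
<card name="ShippedItems" value="many"></card>
<card name="TranspotationEvent" value="many"></card>
</cardinality>
</relation>
""".encode("utf-8")

root = ET.fromstring(xml_output)

# Dictionary keys that started with "@" are now XML attributes.
members = [member.get("name") for member in root.findall("member")]

# iter() searches the whole tree, so it finds <card> inside <cardinality>.
cards = {card.get("name"): card.get("value") for card in root.iter("card")}

print(root.tag)
print(members)
print(cards)
```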

Method 2: Using the ElementTree class

Step 1: Create a “data.xml” file with the following data and save it inside your project folder.

<database name="UPS">
<table name="TranspotationEvent">
<column name="ScheduleNumber" value="primaryKey"/>
<column name="Type"/>
<column name="DeliveryRout"/>
</table>
<table name="RetailCenter">
<column name="UniqueID" value="primaryKey"/>
<column name="Type"/>
<column name="Address"/>
</table>
<table name="Shipped_Via">
<column name="ItemNumber" value="primaryKey"/>
<column name="ScheduleNumber" value="primaryKey"/>
</table>
</database>

Step 2: Create an “xml-handle.py” file.

  • Import the xml.etree.ElementTree module. Its ElementTree class wraps an element structure and converts it to and from XML.
  • This example uses xml.etree.ElementTree (ET for short).
import xml.etree.ElementTree as ET

Load the external XML file into an element tree.

tree = ET.parse('data.xml')

Get the root element of the tree and assign it to the root variable.

root = tree.getroot()
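If the XML is already in a string rather than a file, ET.fromstring() can be used instead of ET.parse(). The sketch below is a self-contained variant under that assumption; xml_text is a shortened stand-in for data.xml with a single table.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for data.xml (assumption: one table only).
xml_text = """<database name="UPS">
<table name="RetailCenter">
<column name="UniqueID" value="primaryKey"/>
<column name="Type"/>
<column name="Address"/>
</table>
</database>"""

# fromstring() returns the root element directly, so no getroot() call is needed.
root = ET.fromstring(xml_text)

print(root.tag)
print(root.attrib)
print(len(root))  # number of direct child elements
```

ET.parse() returns an ElementTree object that wraps the whole document, while ET.fromstring() hands back the root Element itself.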

Let’s try some examples.

  • Print root tag
print(root.tag)
output:
database
  • Print the attribute of the root tag
print(root.attrib)
output:
{'name': 'UPS'}
  • Print child nodes and their attributes
for child in root:
    print(child.tag, child.attrib)
output:
table {'name': 'TranspotationEvent'}
table {'name': 'RetailCenter'}
table {'name': 'Shipped_Via'}
  • Print nested child nodes
for child in root:
    tempRoot = child
    for tempChild in tempRoot:
        print(tempChild.tag, tempChild.attrib)
output:
column {'value': 'primaryKey', 'name': 'ScheduleNumber'}
column {'name': 'Type'}
column {'name': 'DeliveryRout'}
column {'value': 'primaryKey', 'name': 'UniqueID'}
column {'name': 'Type'}
column {'name': 'Address'}
column {'value': 'primaryKey', 'name': 'ItemNumber'}
column {'value': 'primaryKey', 'name': 'ScheduleNumber'}

Or iterate recursively over specific child nodes.

for neighbor in root.iter('column'):
    print(neighbor.attrib)
output:
{'value': 'primaryKey', 'name': 'ScheduleNumber'}
{'name': 'Type'}
{'name': 'DeliveryRout'}
{'value': 'primaryKey', 'name': 'UniqueID'}
{'name': 'Type'}
{'name': 'Address'}
{'value': 'primaryKey', 'name': 'ItemNumber'}
{'value': 'primaryKey', 'name': 'ScheduleNumber'}
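The same iter() call can be combined with a condition on an attribute. As a sketch, the snippet below collects only the columns marked as primary keys; xml_text is a shortened stand-in for data.xml, and primary_keys is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for data.xml (assumption: two of the three tables).
xml_text = """<database name="UPS">
<table name="TranspotationEvent">
<column name="ScheduleNumber" value="primaryKey"/>
<column name="Type"/>
</table>
<table name="Shipped_Via">
<column name="ItemNumber" value="primaryKey"/>
<column name="ScheduleNumber" value="primaryKey"/>
</table>
</database>"""
root = ET.fromstring(xml_text)

# get() returns None when the attribute is missing, so columns without
# a "value" attribute are skipped.
primary_keys = [
    column.get("name")
    for column in root.iter("column")
    if column.get("value") == "primaryKey"
]

print(primary_keys)
```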
  • Print column names
for table in root.findall('table'):
    for column in table.findall('column'):
        name = column.get('name')
        print(name)
output:
ScheduleNumber
Type
DeliveryRout
UniqueID
Type
Address
ItemNumber
ScheduleNumber
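Printing is fine for exploration, but the same findall() and get() calls can also build a data structure. The sketch below groups column names by table and looks up one table with an attribute predicate. xml_text is a shortened stand-in for data.xml, and columns_by_table and retail are illustrative names.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for data.xml (assumption: two of the three tables).
xml_text = """<database name="UPS">
<table name="RetailCenter">
<column name="UniqueID" value="primaryKey"/>
<column name="Type"/>
<column name="Address"/>
</table>
<table name="Shipped_Via">
<column name="ItemNumber" value="primaryKey"/>
<column name="ScheduleNumber" value="primaryKey"/>
</table>
</database>"""
root = ET.fromstring(xml_text)

# Map each table name to the list of its column names.
columns_by_table = {
    table.get("name"): [column.get("name") for column in table.findall("column")]
    for table in root.findall("table")
}

# ElementTree's limited XPath support allows selecting by attribute value.
retail = root.find("table[@name='RetailCenter']")

print(columns_by_table)
print(len(retail))  # number of columns in RetailCenter
```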

Well done! In this tutorial, you have learnt how to handle XML data in Python.

Good Luck!

Bye.

References:

https://docs.python.org/2/library/xml.etree.elementtree.html

https://pypi.org/project/xmltodict/


Senior Software Engineer at Spades | LinkedIn: https://www.linkedin.com/in/sashini-hettiarachchi/
