
How to convert an XML file to a Pandas DataFrame?

In this article, we’ll look at how to parse XML with Python’s built-in xml.etree.ElementTree module and load the result into a Pandas DataFrame.

How to convert an XML file to a Pandas DataFrame?

To convert an XML file to a Pandas DataFrame, we can use the xml.etree.ElementTree module.

For instance, we write:

import pandas as pd
import xml.etree.ElementTree as ET

xml_str = '''<?xml version="1.0" encoding="utf-8"?>
<response>
  <head>
    <code>   200  </code>
  </head>
  <body>
    <data id="0" name="All Categories" t="2018052600" tg="1" type="category"/>
    <data id="13" name="RealEstate.com.au [H]" t="2018052600" tg="1" type="publication"/>
  </body>
</response>
'''

etree = ET.fromstring(xml_str)
dfcols = ['id', 'name']

# Collect one [id, name] row per <data> element.
# (DataFrame.append was removed in pandas 2.0, so we build the rows first.)
rows = [[i.get('id'), i.get('name')] for i in etree.iter(tag='data')]
df = pd.DataFrame(rows, columns=dfcols)

h = df.head()
print(h)

We have an XML string assigned to xml_str.

We parse it by passing it to ET.fromstring, which returns the root element.

Next, we define the columns of the DataFrame in dfcols.

Then we loop through the data elements returned by etree.iter(tag='data') and call get on each one to read its id and name attribute values into a row. get returns each attribute value as a string.

We pass the rows and the column names to the DataFrame constructor to create the DataFrame. We build it in one step rather than calling df.append inside the loop, because DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0.

Finally, we get the first 5 rows with df.head.

As a result, print outputs:

   id                   name
0   0         All Categories
1  13  RealEstate.com.au [H]
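
The example above parses an XML string. For XML stored in a file on disk, we can call ET.parse instead of ET.fromstring. The following is a minimal sketch of that approach. The helper name xml_file_to_df and the file name categories.xml are made up for illustration, and the sample is written to a temporary file only so that the code runs on its own. It also keeps every attribute of each data element, not only id and name.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

import pandas as pd

# Sample document, written to a temporary file below so the sketch is self-contained.
XML_TEXT = """<?xml version="1.0" encoding="utf-8"?>
<response>
  <body>
    <data id="0" name="All Categories" type="category"/>
    <data id="13" name="RealEstate.com.au [H]" type="publication"/>
  </body>
</response>
"""


def xml_file_to_df(path, tag="data"):
    """Hypothetical helper: one row per <tag> element, one column per attribute."""
    root = ET.parse(path).getroot()  # ET.parse reads from a file path or file object
    return pd.DataFrame([dict(elem.attrib) for elem in root.iter(tag)])


with tempfile.TemporaryDirectory() as tmp:
    xml_path = Path(tmp) / "categories.xml"  # hypothetical file name
    xml_path.write_text(XML_TEXT, encoding="utf-8")
    file_df = xml_file_to_df(xml_path)

print(file_df)
```

All values come back as strings, so a column such as id needs an explicit conversion, for example with pd.to_numeric, if we want numbers.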

Conclusion

To convert an XML file to a Pandas DataFrame, we can use the xml.etree.ElementTree module.
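
As an alternative, pandas 1.3 and later also provide pd.read_xml, which can build the DataFrame without a manual loop. The following is a sketch under a few assumptions: pandas 1.3 or newer is installed, parser='etree' is chosen so that the optional lxml package is not required, and the xpath value './/data' is our choice for selecting the data elements. The string is wrapped in StringIO because newer pandas versions expect a path or file-like object.

```python
from io import StringIO

import pandas as pd

xml_str = '''<?xml version="1.0" encoding="utf-8"?>
<response>
  <head>
    <code>   200  </code>
  </head>
  <body>
    <data id="0" name="All Categories" t="2018052600" tg="1" type="category"/>
    <data id="13" name="RealEstate.com.au [H]" t="2018052600" tg="1" type="publication"/>
  </body>
</response>
'''

# Each element matched by xpath becomes a row; its attributes become columns.
# parser="etree" uses the standard library parser instead of lxml.
xml_df = pd.read_xml(StringIO(xml_str), xpath=".//data", parser="etree")

# Keep only the two columns used in this article.
print(xml_df[["id", "name"]])
```

Unlike the ElementTree loop, read_xml tries to infer column types, so id may be returned as a number instead of a string.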
