
A couple of months ago, I decided that it was time for me to finally grow out of my R comfort zone and start studying Python. I began my Python journey by reading the book Python for Data Analysis by Wes McKinney (creator of pandas, the Python equivalent of the tidyverse), and having finished it I wanted to put what I had learned into practice through an applied data analysis.

And since I love listening to music, why not analyze my own music collection? I told myself. So here I am, sharing with you the insights and data visualizations I obtained from my music library, along with the code I used in this analysis.

This post covers:

- How to import the iTunes Library XML file into Python.
- How to parse the contents of that file into a pandas DataFrame.
- How I used pandas, matplotlib/seaborn, and regular expressions to answer questions on that DataFrame, such as: Which are my most-listened songs, albums, and artists? or Which genres predominate in each decade of my music?
- And last but not least, how I expanded the analysis with R by using reticulate to pass pandas DataFrames to an R session, and then creating a nice table/playlist with the best songs of each era with the gt R package.

Also, note that if you use iTunes (or Apple Music), you can do this too. All you need to do is locate the ‘iTunes Library.xml’ file and then run the Python code in this post against that file (although you will probably need some minor modifications).

Importing and parsing the iTunes Library XML file

First things first, we import the required Python libraries:

    import os
    import pandas as pd
    import requests
    import xml.etree.ElementTree as ET

To allow you to run the code of this post, I put my XML file in a public URL. This Python snippet downloads the file from that URL the first time it is run (if the file has already been downloaded, it won’t do anything):

    # path holds the public URL of the XML file
    if not os.path.exists('iTunes.xml'):
        r = requests.get(path, allow_redirects=True)
        open('iTunes.xml', 'wb').write(r.content)

Once we have the iTunes.xml file in our working directory, we can load it into an “Element Tree” object through the function ET.parse(). As its name says, this data structure resembles a tree, which makes sense since XML is a hierarchical data format. We can extract the “root” of this “tree” by using the method getroot(). Then, we can use that root as a starting point for exploring the whole tree.

    tree = ET.parse('iTunes.xml')
    root = tree.getroot()

At this point, I think it helps to take a look at the first lines of the XML to get an idea of what it actually looks like.

Nodes in XML files, such as root, should have a tag and a dictionary of attributes. We can get these with the following code:

    root.tag     # 'plist'
    root.attrib
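To make the download-and-parse steps of this post easier to try without my actual file, here is a minimal, self-contained sketch. Several things in it are my assumptions for illustration rather than the code used in the analysis: download_once and plist_pairs are hypothetical helper names, SAMPLE is a made-up stand-in that only mimics the key/value layout of an iTunes library file, and the download uses urllib.request from the standard library instead of requests so that nothing needs to be installed.

```python
import os
import urllib.request
import xml.etree.ElementTree as ET


def download_once(url, dest):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the file was already there.
    Hypothetical helper; uses urllib instead of requests to stay stdlib-only.
    """
    if os.path.exists(dest):
        return False
    urllib.request.urlretrieve(url, dest)
    return True


# Made-up stand-in for the first lines of an iTunes library file.
SAMPLE = """<plist version="1.0">
<dict>
    <key>Major Version</key><integer>1</integer>
    <key>Minor Version</key><integer>1</integer>
    <key>Tracks</key>
    <dict>
        <key>101</key>
        <dict>
            <key>Name</key><string>Song A</string>
            <key>Artist</key><string>Artist X</string>
        </dict>
    </dict>
</dict>
</plist>"""


def plist_pairs(dict_elem):
    """Pair each <key> child of a plist <dict> with the element that follows it."""
    children = list(dict_elem)
    return {k.text: v for k, v in zip(children[0::2], children[1::2])}


# ET.fromstring parses a string; ET.parse('iTunes.xml').getroot() is the file equivalent.
root = ET.fromstring(SAMPLE)
print(root.tag)                       # plist
print(root.attrib)                    # {'version': '1.0'}
print([child.tag for child in root])  # ['dict']

# The single <dict> under the root alternates <key> elements and their values.
fields = plist_pairs(root[0])
print(list(fields))                   # ['Major Version', 'Minor Version', 'Tracks']

# Each entry under "Tracks" is itself a <dict> of key/value pairs.
for track_id, track in plist_pairs(fields["Tracks"]).items():
    info = {name: value.text for name, value in plist_pairs(track).items()}
    print(track_id, info)             # 101 {'Name': 'Song A', 'Artist': 'Artist X'}
```

With a real library, you would call download_once(path, 'iTunes.xml') first and then replace ET.fromstring(SAMPLE) with ET.parse('iTunes.xml').getroot(). Pairing each key with the element that follows it is only one way of walking the tree, but it maps naturally onto the rows and columns of a DataFrame.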
