With my Ph.D. program starting this fall, I expect I’ll be doing a lot more programming. I used to program a lot as an undergraduate, but, well, that was a long time ago.

I’ve been teaching myself Python, so I was excited when I learned a colleague was looking for a way to convert an .xml file to a .csv file. There was just one specific variable they were looking to export into .csv format, so the code is specific to that.

Since I’ll probably be coding a lot more, I figured I’d post this bit of code here.


import csv
import os
from xml.etree import ElementTree

infile = input("Name of xml file: ")  # ask user for file to convert

# create name of output file: same as input file, replacing .xml with .csv
out = os.path.splitext(infile)[0] + ".csv"

# parse input file
tree = ElementTree.parse(infile)

# identify data to export to .csv
out_data = []
out_data.append("beta")    # header column: variable we're interested in
out_data.append("source")  # header column: name of file being converted

# iterate through .xml file, looking only at the tag holding the variable we're interested in
for node in tree.iter("{http://www.dmg.org/PMML-4_1}PCell"):
    beta = node.attrib.get("beta")  # grab data from variable we're interested in
    out_data.append(beta)    # add data to output
    out_data.append(infile)  # add name of converted file to output

# write .csv file; csv.writer puts the commas and line breaks in the correct places
with open(out, "w", newline="") as out_file:  # file is closed automatically
    csv_writer = csv.writer(out_file)
    # we're outputting two columns of data, so write the list two items at a time
    for i in range(0, len(out_data), 2):
        csv_writer.writerow(out_data[i:i + 2])

print("wrote %s" % out)
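
To try the same idea without a real PMML file, here is a minimal sketch that wraps the steps in two small functions. The function names `pcell_betas_to_rows` and `rows_to_csv_text`, the file name `sample.xml`, and the two-cell sample document are all assumptions made up for illustration; only the `PCell` tag and its `beta` attribute come from the script above.

```python
import csv
import io
from xml.etree import ElementTree

# Namespace-qualified tag used in the script above
PCELL_TAG = "{http://www.dmg.org/PMML-4_1}PCell"

# Hypothetical sample input: a tiny document with two PCell elements
SAMPLE_XML = """<PMML xmlns="http://www.dmg.org/PMML-4_1">
  <ParamMatrix>
    <PCell parameterName="p0" beta="1.5"/>
    <PCell parameterName="p1" beta="-0.25"/>
  </ParamMatrix>
</PMML>"""


def pcell_betas_to_rows(xml_text, source):
    """Return a header row plus one [beta, source] row per PCell element."""
    root = ElementTree.fromstring(xml_text)
    rows = [["beta", "source"]]
    for node in root.iter(PCELL_TAG):
        rows.append([node.attrib.get("beta"), source])
    return rows


def rows_to_csv_text(rows):
    """Format the rows as CSV text using csv.writer."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = pcell_betas_to_rows(SAMPLE_XML, "sample.xml")
csv_text = rows_to_csv_text(rows)
print(csv_text)
```

Building the output as a list of rows, rather than one flat list, means `writerows` can handle the commas and line breaks in a single call.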


One thought on “XML to CSV”

  1. Mike Murnane

    You will find that there are a lot of Python-isms that can simplify code. For example, since out_data is a list, you can write it out with commas between the elements using join:

    out_file.write(', '.join(out_data) + '\n')

    It's a very expressive language. And then there are dictionaries, which let you provide a key that can hook up to values that can be lists, lists of lists, or other dictionaries.

    Have fun – and enjoy!
    “Uncle” Mike

