Using Linux command line tools to parse vSphere Export List output

I live on the command line most of the time. Often I want to grab a list of machines from vSphere and do something with them. In this instance, I wanted to write a script to delete old orphaned VM files for machines that existed on an NFS array but weren’t present in vSphere. In the vSphere client, it is simple enough to bring up a list of the machines vSphere does know about (navigate to the Datastores view, select the datastore, then open the Virtual Machines tab). The File -> Export -> Export List command then saves this list to a file. Several export formats are available, but CSV is by far the simplest (I tried all of them!). After copying the output to a Linux machine, I ran grep against it with an obvious match and got nothing:

# head -n 1 /opt/machines.csv
ÿþNAME,STATE,STATUS,HOST,PROVISIONED SPACE,USED SPACE,HOST CPU - MHZ,HOST MEM - MB,GUEST MEM - %,SHARES VALUE,LIMIT - IOPS,DATASTORE % SHARES,NOTES,ALARM ACTIONS,PNC.CUSTSPEC,PNC.DEPLOYED,PNC.GROUPID,PNC.SOURCE
# grep STATE /opt/machines.csv
# <no result>

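It was only after trying lots of different things that we realised the file is UTF-16 encoded, and the standard GNU tools can’t deal with it. In UTF-16 each ASCII character takes two bytes, one of them NUL, so the byte sequence STATE that grep looks for never appears in the file: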

# file /opt/machines.csv
/opt/machines.csv: Little-endian UTF-16 Unicode text, with CRLF, CR, LF line terminators
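
To see for yourself why the match fails, here is a minimal sketch that builds a tiny UTF-16 file and dumps its first bytes. The sample contents and the `sample` and `result` variable names are made up for illustration, not taken from the real export, and the temporary file comes from `mktemp`.

```shell
#!/bin/sh
# Hypothetical two-line sample: a UTF-16LE byte order mark followed by
# UTF-16LE text with CRLF line endings, roughly like the vSphere export.
sample=$(mktemp)
{
    printf '\377\376'
    printf 'NAME,STATE\r\nvm01,Powered On\r\n' | iconv -f ASCII -t UTF-16LE
} > "$sample"

# Dump the first 10 bytes: the BOM (377 376), then each letter of NAME
# followed by a NUL byte (\0).
od -c -N 10 "$sample"

# Because of those NUL bytes, the contiguous bytes "STATE" are not in the
# file, so grep exits with a non-zero status.
if grep -q STATE "$sample"; then
    result="match"
else
    result="no match"
fi
echo "$result"

rm -f "$sample"
```

The od output also explains the two odd characters at the start of the head output above: they are the byte order mark, shown as text by the terminal.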

Using iconv to convert it to ASCII makes it possible to use grep, sed and friends:

# iconv -f utf-16 -t ascii /opt/machines.csv > /opt/fixed-machines.csv

# grep STATE /opt/fixed-machines.csv
NAME,STATE,STATUS,HOST,PROVISIONED SPACE,USED SPACE,HOST CPU - MHZ,HOST MEM - MB,GUEST MEM - %,SHARES VALUE,LIMIT - IOPS,DATASTORE % SHARES,NOTES,ALARM ACTIONS,PNC.CUSTSPEC,PNC.DEPLOYED,PNC.GROUPID,PNC.SOURCE
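
With the encoding sorted out, pulling the machine names out for a script is a short pipeline. This is only a sketch under a few assumptions: the sample data and the `export_file` and `vm_names` variable names are hypothetical, the NAME column is first (as in the header above), and no machine name contains a comma or a quote, since `cut` does not understand CSV quoting. It targets UTF-8 rather than ASCII because iconv stops with an error when it meets a character the target encoding cannot represent, which could happen in a free-text field such as NOTES.

```shell
#!/bin/sh
# Hypothetical stand-in for the real export: BOM + UTF-16LE with CRLF endings.
export_file=$(mktemp)
{
    printf '\377\376'
    printf 'NAME,STATE,STATUS\r\nvm01,Powered On,Normal\r\nvm02,Powered Off,Normal\r\n' |
        iconv -f ASCII -t UTF-16LE
} > "$export_file"

# Convert to UTF-8, strip carriage returns, skip the header row and keep
# only the first comma-separated column (the machine name).
vm_names=$(iconv -f UTF-16 -t UTF-8 "$export_file" |
    tr -d '\r' |
    tail -n +2 |
    cut -d, -f1)

printf '%s\n' "$vm_names"

rm -f "$export_file"
```

Stripping the carriage returns matters because the export uses Windows-style line endings; without `tr`, the last field on every line would carry a trailing CR.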

Tiny little fix, but it had been annoying me for months. Big thanks go to Matt Ponsford for debugging this with me.
