Convert XLSX to CSV in Python

XLSX is the file format used by Microsoft Excel workbooks, while CSV (comma-separated values) is a plain-text format that stores tabular data one row per line.

This article shows how to convert XLSX into CSV in Python using two methods:

  • Method 1: Using the pandas package, and
  • Method 2: Using the openpyxl and csv modules.

We will use the employees.xlsx Excel workbook, which has two worksheets: names and roles. See the figure below.

The objective is to learn how to use the two methods stated above to convert any or all of the sheets in the XLSX file into CSV.

Method 1: Using the pandas Package

This method involves reading the XLSX file into a pandas DataFrame using the pandas.read_excel() function and then writing the DataFrame to a CSV file using DataFrame.to_csv().

For this method, you may need to install the pandas and openpyxl packages (pandas uses openpyxl to read XLSX files) using pip as follows:

pip install openpyxl

pip install pandas

Let’s see an example.
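The sketch below is a minimal illustration of this approach, not a definitive implementation. The function name first_sheet_to_csv and its parameters are hypothetical names chosen for this example; only pandas.read_excel() and DataFrame.to_csv() come from the text above.

```python
import pandas as pd


def first_sheet_to_csv(xlsx_path, csv_path):
    """Convert the first worksheet of an XLSX file to a CSV file."""
    # read_excel() loads the first worksheet by default (sheet_name=0).
    df = pd.read_excel(xlsx_path)
    # index=False keeps the DataFrame's row index out of the CSV.
    df.to_csv(csv_path, index=False)
    return df
```

For the workbook used in this article, a call such as first_sheet_to_csv("employees.xlsx", "names.csv") would write the names sheet, assuming it is the first sheet in the file.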

By default, pandas.read_excel() loads only the first worksheet. To load and convert a different worksheet, pass its name or zero-based position through the sheet_name parameter.
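As a sketch of selecting a worksheet, the example below passes the sheet through the sheet_name parameter of pandas.read_excel(). The function name sheet_to_csv and its argument names are assumptions made for illustration.

```python
import pandas as pd


def sheet_to_csv(xlsx_path, sheet, csv_path):
    """Convert one chosen worksheet of an XLSX file to a CSV file."""
    # sheet_name accepts a sheet title (str) or a zero-based position (int).
    df = pd.read_excel(xlsx_path, sheet_name=sheet)
    df.to_csv(csv_path, index=False)
    return df
```

With the example workbook, sheet_to_csv("employees.xlsx", "roles", "roles.csv") would convert the roles sheet; passing 1 instead of "roles" would select the second sheet by position.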

Lastly, you can implement a for-loop to convert each sheet into a CSV. We can do that as follows.
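One possible way to write that loop is sketched below. It relies on sheet_name=None, which makes pandas.read_excel() return a dictionary mapping each sheet title to its DataFrame. The function name all_sheets_to_csv, the out_dir parameter, and the choice to name each CSV after its sheet are assumptions of this example.

```python
from pathlib import Path

import pandas as pd


def all_sheets_to_csv(xlsx_path, out_dir="."):
    """Write every worksheet of an XLSX file to <out_dir>/<sheet title>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # sheet_name=None loads all sheets into a dict: {sheet title: DataFrame}.
    sheets = pd.read_excel(xlsx_path, sheet_name=None)
    written = []
    for title, df in sheets.items():
        csv_path = out_dir / f"{title}.csv"
        df.to_csv(csv_path, index=False)
        written.append(csv_path)
    return written
```

For employees.xlsx this would produce names.csv and roles.csv in the chosen directory.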

Method 2: Using openpyxl and csv Modules

This method involves opening the XLSX file with openpyxl and writing its content into a CSV file row by row with Python's built-in csv module. If the openpyxl package is not installed, you can install it with pip by running the following command.

pip install openpyxl

The following code snippet shows how to convert the first worksheet (or any other sheet) in the XLSX file into CSV.
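A minimal sketch of the row-by-row approach, under a few assumptions: the function name sheet_to_csv_openpyxl and its parameters are hypothetical, the workbook is opened in read-only mode to keep memory use low, and data_only=True is used so that cells containing formulas yield their last cached values rather than the formula text.

```python
import csv

from openpyxl import load_workbook


def sheet_to_csv_openpyxl(xlsx_path, csv_path, sheet=None):
    """Write one worksheet to CSV; the first worksheet if no title is given."""
    wb = load_workbook(xlsx_path, read_only=True, data_only=True)
    try:
        # Look the sheet up by title, or fall back to the first worksheet.
        ws = wb[sheet] if sheet is not None else wb.worksheets[0]
        # newline="" lets the csv module control line endings itself.
        with open(csv_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            # values_only=True yields plain tuples of cell values per row.
            for row in ws.iter_rows(values_only=True):
                writer.writerow(row)
    finally:
        wb.close()
```

Calling sheet_to_csv_openpyxl("employees.xlsx", "names.csv") would convert the first sheet, while adding sheet="roles" would convert the roles sheet instead. Empty cells come back as None, which csv.writer writes as empty fields.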

As in Method 1, we can convert all the sheets in the XLSX file into CSV through a for-loop, as shown below.
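The loop could look like the sketch below, which iterates over the workbook's worksheets and writes one CSV per sheet. The function name all_sheets_to_csv_openpyxl, the out_dir parameter, and the file naming scheme are assumptions made for this example.

```python
import csv
from pathlib import Path

from openpyxl import load_workbook


def all_sheets_to_csv_openpyxl(xlsx_path, out_dir="."):
    """Write every worksheet of an XLSX file to <out_dir>/<sheet title>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    wb = load_workbook(xlsx_path, read_only=True, data_only=True)
    written = []
    try:
        for ws in wb.worksheets:
            csv_path = out_dir / f"{ws.title}.csv"
            with open(csv_path, "w", newline="", encoding="utf-8") as f:
                # writerows() consumes the row tuples one at a time.
                csv.writer(f).writerows(ws.iter_rows(values_only=True))
            written.append(csv_path)
    finally:
        wb.close()
    return written
```

For employees.xlsx this would write names.csv and roles.csv and return their paths.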

Conclusion

This article discussed two methods of converting XLSX to CSV in Python: using pandas and openpyxl.

You can choose one of the methods based on the task at hand or the data size.

If you are dealing with many data manipulation tasks, you can go for pandas because it is a great tool for that purpose. Otherwise, if you need to read and write Excel files and maintain Excel format, you should use openpyxl.

Note also that the pandas method may be slightly faster than the openpyxl method when converting a large XLSX file into CSV.