Convert Excel Spreadsheet to JSON in Python

To convert an Excel spreadsheet into JSON, first, you have to import the necessary modules: one for reading an Excel file and the second one for converting this data into the JSON format.

For this task, let’s use a simple table with countries. Each of them has a name, capital, and official language.

First, let’s install a module that will let us read Excel files: openpyxl.

1 2	from openpyxl import load_workbook from openpyxl.utils import get_column_letter

The first line allows us to load a file with an Excel workbook. The second one will retrieve a column letter from a column number.

You also have to import JSON, so we can dump the data to the output or a file.

1	import json

The next lines are going to return the last non-empty column and row inside our spreadsheet.

1 2	last_column = len(list(ws.columns)) last_row = len(list(ws.rows))

The next part is a nested loop. We have to use it because we have to travel through rows and columns inside a worksheet.

for row in range(1, last_row + 1):

my_dict = {}

for column in range(1, last_column + 1):

column_letter = get_column_letter(column)

if row > 1:

my_dict[ws[column_letter + str(1)].value] = ws[column_letter + str(row)].value

my_list.append(my_dict)

The range starts from 1 (0 by default) because the first row and column have the number 1, not 0. In this case, we also have to add 1 to the second range parameter.

For each row, there is the new dictionary my_dict created.

Next, we have to check whether the row number is at least 2, otherwise, the header would be also written as a value.

For each column in a row (in other words, for each cell in the row) the value is added to a dictionary.

Before moving to the new row, the dictionary is appended to the end of the list, creating a list of dictionaries.

In the end, the data is dumped and written into a file.

data = json.dumps(my_list, sort_keys=True, indent=4)

with open('D:/data.json', 'w', encoding='utf-8') as f:

f.write(data)

The whole code looks like this:

from openpyxl import load_workbook

from openpyxl.utils import get_column_letter

import json

wb = load_workbook(filename='D:/countries.xlsx')

ws = wb.active

my_list = []

last_column = len(list(ws.columns))

last_row = len(list(ws.rows))