Convert Excel Spreadsheet to JSON in Python

To convert an Excel spreadsheet into JSON, first, you have to import the necessary modules: one for reading an Excel file and the second one for converting this data into the JSON format.

For this task, let’s use a simple table with countries. Each of them has a name, capital, and official language.

First, let’s install a module that will let us read Excel files: openpyxl.

from openpyxl import load_workbook
from openpyxl.utils import get_column_letter

The first line allows us to load a file with Excel workbook. The second one will retrieve a column letter from a column number.

You also have to import JSON, so we can dump the data to the output or a file.

import json

The next lines are going to return the last non-empty column and row inside our spreadsheet.

last_column = len(list(ws.columns))
last_row = len(list(ws.rows))

The next part is a nested loop. We have to use it because we have to travel through rows and column inside a worksheet.

for row in range(1, last_row + 1):
    my_dict = {}
    for column in range(1, last_column + 1):
        column_letter = get_column_letter(column)
        if row > 1:
            my_dict[ws[column_letter + str(1)].value] = ws[column_letter + str(row)].value
    my_list.append(my_dict)

The range starts from 1 (0 by default) because the first row and column have the number 1, not 0. In this case, we also have to add 1 to the second range parameter.

For each row, there is the new dictionary my_dict created.

Next, we have to check whether the row number is at least 2, otherwise, the header would be also written as a value.

For each column in a row (in other words, for each cell in the row) the value is added to a dictionary.

Before moving to the new row, the dictionary is appended to the end of the list, creating a list of dictionaries.

In the end, the data is dumped and written to a file.

data = json.dumps(my_list, sort_keys=True, indent=4)
with open('D:/data.json', 'w', encoding='utf-8') as f:
    f.write(data)

The whole code looks like this:

from openpyxl import load_workbook
from openpyxl.utils import get_column_letter

import json

wb = load_workbook(filename='D:/countries.xlsx')
ws = wb.active

my_list = []

last_column = len(list(ws.columns))
last_row = len(list(ws.rows))

for row in range(1, last_row + 1):
    my_dict = {}
    for column in range(1, last_column + 1):
        column_letter = get_column_letter(column)
        if row > 1:
            my_dict[ws[column_letter + str(1)].value] = ws[column_letter + str(row)].value
    my_list.append(my_dict)

data = json.dumps(my_list, sort_keys=True, indent=4)
with open('D:/data.json', 'w', encoding='utf-8') as f:
    f.write(data)

If you open the data.json file in a text editor, you are going to have Excel data written as a JSON file.

[
    {},
    {
        "Capital": "Andorra La Vella",
        "Country": "Andorra",
        "Language": "Catalan"
    },
    {
        "Capital": "Monaco",
        "Country": "Monaco",
        "Language": "French"
    },
    {
        "Capital": "Vaduz",
        "Country": "Liechtenstein",
        "Language": "German"
    },
    {
        "Capital": "Warsaw",
        "Country": "Poland",
        "Language": "Polish"
    }
]