An Excel file can be converted into JSON using any of the following methods:
- Method 1: Using pandas and json packages, or
- Method 2: Using openpyxl and json packages.
In most cases, Method 1 is sufficient, but you may need to use the second method if you want to process each row of the Excel before converting it into JSON.
In the code examples, we will use an Excel file named employees.xlsx with two worksheets – names and roles.
The roles sheet contains the data shown in the following Figure. We will use it in the last example.
Method 1: Using pandas and json
As a pre-requisite, ensure that pandas and openpyxl (needed for pandas to work) are installed. If not, install them using pip by executing these commands:
pip install pandas
pip install openpyxl
After that, this method works in the following steps:
- Read the Excel file using pd.excel() function,
- Convert the resulting DataFrame into JSON string,
- (Optional) Save the JSON data in a file.
Here is the code example.
1 2 3 4 5 6 7 8 |
import pandas as pd # Load the "names" sheet in the file "employees.xlsx". df = pd.read_excel("employees.xlsx", sheet_name="names") # Convert DataFrame into JSON using pd.DataFrame.to_json() function json_data = df.to_json(orient="records") # Save JSON data into a JSON file if you need to. with open("output_file.json", "w") as f: f.write(json_data) |
Output:
If you want to be able to prettify the resulting JSON file using the indent attribute on json.dump() function, you can use the following approach.
1 2 3 4 5 6 7 8 9 |
import pandas as pd import json # Load the Excel file into a DataFrame df = pd.read_excel("employees.xlsx", sheet_name="names") # Convert DataFrame into a list of dictionaries dict_data = df.to_dict(orient="records") # Write the result into a JSON file with the indent specified with open("output_file2.json", "w") as infile: json.dump(dict_data, infile, indent=3) |
Output (truncated):
The output shows that the second approach yields a more readable JSON file than the first.
Method 2: Using openpyxl and json
This method is suitable if you want to process the Excel file row by row before converting the result into JSON.
The method gets the work done in three steps:
- Load the Excel file using openpyxl.load_workbook() function,
- Iterate through the Excel rows, process the data, and store them in a list,
- Convert the resulting list of data into JSON, then save the JSON data in a file.
The following code contains a Python function you can reuse to convert an Excel file into JSON.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
import openpyxl import json def Excel2JSON(inpath, sheet_name=None): # Load the Excel file using openpyxl workbook = openpyxl.load_workbook(inpath) if sheet_name is None: # If the sheet to be opened is not specified, open the first one # Get the names of all the sheets in the workbook sheet_names = workbook.sheetnames # Open the first sheet sheet = workbook[sheet_names[0]] else: # else convert the specified sheet. sheet = workbook[sheet_name] # Get the header row. header_row = [cell.value for cell in sheet[1]] data = [] # Loop through the rows of the Excel file. Setting min_row=2 # is used to skip the first row of the file for record in sheet.iter_rows(min_row=2, values_only=True): # Zip the header row with the current record and convert the result into the dictionary. data_point = dict(zip(header_row, record)) # Append data_point into the data list. data.append(data_point) # Write the contents of "data" into a JSON file named after the worksheet with open(sheet.title + ".json", "w") as infile: json.dump(data, infile, indent=3) # convert the list of data into JSON data using json.dumps() json_data = json.dumps(data) return json_data # Calling the function with no sheet specified. The first one will be converted. input = "employees.xlsx" json_data1 = Excel2JSON(inpath=input) print(json_data1) # Calling the function for the second time with the worksheet specified input = "employees.xlsx" json_data2 = Excel2JSON(inpath=input, sheet_name="roles") print(json_data2) |
Output
[{"Id": 1.0, "Name": "Allan", "Year Employed": 2007.0}, …, {"Id": 5.0, "Name": "Alice", "Year Employed": 2014.0}] [{"id": 1.0, "Role": "CEO", "Department": "Management"}, …, {"id": 5.0, "Role": "Software developer", "Department": "Technology"}]
Conclusion
This guide discussed two methods for converting Excel into JSON in Python. The most common approach of using pandas is discussed as method 1, and the second method outlines how to use openpyxl to perform the conversion. The second method is the best choice if you want to process each Excel file row before writing results into JSON.