Python can process JSON data natively through the built-in json module. The module handles JSON held in a string as well as JSON saved in a file.
If the JSON data is in a string, searching through it involves two steps: converting the string into a Python object using json.loads(), then searching the resulting Python object.
If the JSON data is saved in a file, only the first step differs: the file is opened and its contents are parsed with json.load(), after which the search proceeds in the same way.
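The two entry points can be sketched side by side. This is only a minimal illustration: the file name sample.json and the temporary directory are assumptions, created here so the snippet runs on its own.

```python
import json
import os
import tempfile

json_text = '{"ID": "KZ568", "Name": "Smith", "Address": null}'

# Case 1: the JSON is already in a string, so json.loads() parses it directly.
from_string = json.loads(json_text)

# Case 2: the JSON lives in a file, so open the file and parse it with json.load().
# "sample.json" is a hypothetical file written here only for the demonstration.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(json_text)
    with open(path, encoding="utf-8") as f:
        from_file = json.load(f)

# Both routes produce the same Python dict; JSON null becomes None.
print(from_string == from_file)  # True
print(from_string["Address"])    # None
```

Either way, the result is an ordinary Python object, so the search code does not need to know where the JSON came from.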
Searching for Data in a JSON String
As noted earlier, we first convert the JSON string into a Python object and then search that object for the value we want.
Example 1: Search key and value in simple JSON data
import json

# Define the JSON data as a Python string
json_data = """{
    "ID": "KZ568",
    "Name": "Smith",
    "Address": null
}"""

# Convert the JSON string into a Python object (a dict)
data = json.loads(json_data)

# Search for a key in the parsed dict (not in the raw string)
if "Location" in data:
    print("Key found")
else:
    print("Key not found")

# Check for a given ID value
if data.get("ID") == "KZ568":  # or data["ID"]
    print("ID value found.")
else:
    print("ID value not found")
Output:
Key not found
ID value found.
The simple example above checks if a given key exists and if the JSON data contains a given value for a given key.
We can access dictionary values in two ways: using <dict>.get(<key>) or <dict>[<key>]. The former returns the value if <key> exists and None otherwise, whereas the latter raises a KeyError if <key> does not exist.
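The difference between the two access styles can be seen in a short sketch. It also shows one caveat worth keeping in mind: .get() returns None both for a missing key and for a key whose JSON value is null, so the in operator is the more reliable existence check. The variable names below are chosen only for illustration.

```python
import json

data = json.loads('{"ID": "KZ568", "Name": "Smith", "Address": null}')

# .get() returns None for a missing key, or a default if one is supplied.
missing = data.get("Location")
with_default = data.get("Location", "unknown")

# Square brackets raise KeyError for a missing key.
try:
    data["Location"]
    raised = False
except KeyError:
    raised = True

# "Address" exists but holds null, so .get("Address") is also None.
# The `in` operator tells the two cases apart.
address_present = "Address" in data
location_present = "Location" in data

print(missing, with_default, raised)      # None unknown True
print(address_present, location_present)  # True False
```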
Example 2: Searching for a value in a JSON string (3 methods discussed)
The following code initializes a JSON string and converts it into a Python list.
import json

address_data = """[
    {
        "ID": "KZ568",
        "Name": "Mark",
        "Address": null
    },
    {
        "ID": "MT456",
        "Name": "Vincent",
        "Address": "Ohio"
    },
    {
        "ID": "AM444",
        "Name": "Paul",
        "Address": "Minnesota"
    }
]"""

# Convert the JSON string into a Python object.
# This will be a list of dictionaries.
data1 = json.loads(address_data)
At this point, data1 is a Python object – a list of 3 dictionaries. We can then search for values as we would on an ordinary Python list of dictionaries.
Method A: Searching through iteration
In this method, we need to iterate through the elements of data1 and search for a value in each dictionary.
# Iterate through the list, looking for
# an item matching the criteria
for item in data1:
    if item["ID"] == "AM444":  # or item.get("ID")
        print(item)
Output:
{'ID': 'AM444', 'Name': 'Paul', 'Address': 'Minnesota'}
Method B: By list comprehension
This method works like Method A. We are just collapsing the for-loop into a list comprehension.
search_result = [item for item in data1 if item["ID"] == "AM444"]
print(search_result)
Output:
[{'ID': 'AM444', 'Name': 'Paul', 'Address': 'Minnesota'}]
Method C: Using lambda and filter function
The built-in filter(func, iterable) calls func on each item of the iterable and returns an iterator over the items for which func returns a true value. In our case, we define func with a lambda expression as follows:
search_result = list(filter(lambda x: x["ID"] == "AM444", data1))
print(search_result)
Output:
[{'ID': 'AM444', 'Name': 'Paul', 'Address': 'Minnesota'}]
We had to convert the result into a list because the filter() function returns an iterator, not a list.
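One practical consequence of filter() returning an iterator is that the result can be consumed only once. The sketch below illustrates this with a shortened version of the data; the names records, first_pass and second_pass are used here purely for illustration.

```python
import json

records = json.loads('[{"ID": "KZ568"}, {"ID": "MT456"}, {"ID": "AM444"}]')

# filter() does no work yet; it returns a lazy iterator.
matches = filter(lambda x: x["ID"] == "AM444", records)

first_pass = list(matches)   # consumes the iterator
second_pass = list(matches)  # nothing is left to yield

print(first_pass)   # [{'ID': 'AM444'}]
print(second_pass)  # []
```

Converting to a list once and reusing that list avoids this surprise when the matches are needed more than once.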
Example 3: Searching for data in a Nested JSON
The following code snippet parses a JSON string into a Python list of dictionaries.
import json

# Defining nested JSON data
nested_address_data = """[
    {
        "ID": "KZ568",
        "Name": "Smith",
        "Address": null
    },
    {
        "ID": "MT456",
        "Name": "Alice",
        "Address": {"State": "Ohio", "County": "Delaware"}
    },
    {
        "ID": "AM444",
        "Name": "Bob",
        "Address": {"State": "Minnesota", "County": "Dakota"}
    }
]"""

# Parsing the JSON data into a Python object
nested_address_data = json.loads(nested_address_data)
print(nested_address_data)
Once the JSON data is loaded, we can search for values using any of the 3 methods discussed earlier.
# The search: get the item with County == "Delaware"

# for-loop iteration
for item in nested_address_data:
    # Skip the case where Address is null
    if item.get("Address") is None:
        continue
    # Check the condition for the rest of the items;
    # you can also use item["Address"]["County"] == "Delaware"
    if item.get("Address").get("County") == "Delaware":
        print(item)

# Using a search function and the filter function
search_result1 = list(filter(
    lambda x: x.get("Address") is not None
    and x["Address"].get("County") == "Delaware",
    nested_address_data,
))
print(search_result1)

# Search for a County that does not exist
search_result2 = list(filter(
    lambda x: x.get("Address") is not None
    and x["Address"].get("County") == "Washington",
    nested_address_data,
))
print(search_result2)

# Using list comprehension
search_result = [
    item for item in nested_address_data
    if item["Address"] is not None and item["Address"]["County"] == "Dakota"
]
print(search_result)
Output:
{'ID': 'MT456', 'Name': 'Alice', 'Address': {'State': 'Ohio', 'County': 'Delaware'}}
[{'ID': 'MT456', 'Name': 'Alice', 'Address': {'State': 'Ohio', 'County': 'Delaware'}}]
[]
[{'ID': 'AM444', 'Name': 'Bob', 'Address': {'State': 'Minnesota', 'County': 'Dakota'}}]
In all three methods, we needed to search for "County" inside the value of the "Address" key. The first item in the JSON data has "Address": null, which json.loads() converts to None. If we evaluated item["Address"]["County"] for that item, we would get a TypeError, because None cannot be indexed. This is why each method checks for None before looking up "County".
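The failure described above, and one way to guard against it, can be sketched as follows. The helper get_county is hypothetical and not part of the json module, and it assumes that "Address" is either null or an object, as in this example's data.

```python
import json

records = json.loads("""[
    {"ID": "KZ568", "Name": "Smith", "Address": null},
    {"ID": "MT456", "Name": "Alice", "Address": {"State": "Ohio", "County": "Delaware"}}
]""")

# Indexing into None fails with TypeError.
try:
    records[0]["Address"]["County"]
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__


def get_county(item):
    """Hypothetical helper: return the county, or None if there is no address."""
    address = item.get("Address") or {}  # null or missing key -> empty dict
    return address.get("County")


counties = [get_county(item) for item in records]

print(error_name)  # TypeError
print(counties)    # [None, 'Delaware']
```

Putting the None check in one small function keeps the loop, the filter() call, or the list comprehension free of repeated guard conditions.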
Searching for Data in a JSON file
The following data is saved in the address_data.json file. We will use it in the following example.
[
    {
        "ID": "KZ568",
        "Name": "Mark",
        "Address": null
    },
    {
        "ID": "MT456",
        "Name": "Vincent",
        "Address": "Ohio"
    },
    {
        "ID": "AM444",
        "Name": "Paul",
        "Address": "Minnesota"
    }
]
Loading the JSON file
import json

# Load JSON data from a file. The with statement closes the file
# automatically once the block ends.
with open("address_data.json") as file:
    data = json.load(file)

# A one-line alternative that leaves closing the file to the interpreter:
# data = json.load(open("address_data.json"))

print(data)
Output:
[{'ID': 'KZ568', 'Name': 'Mark', 'Address': None}, {'ID': 'MT456', 'Name': 'Vincent', 'Address': 'Ohio'}, {'ID': 'AM444', 'Name': 'Paul', 'Address': 'Minnesota'}]
Once the data is loaded, you can search through it using any of the methods discussed earlier.
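As a rough sketch, loading and searching can be combined into one function. The helper search_json_file is hypothetical, and it assumes the file holds a top-level list of objects like address_data.json above. The snippet writes its own copy of the data to a temporary directory so that it runs without any existing file.

```python
import json
import os
import tempfile


def search_json_file(path, key, value):
    """Hypothetical helper: return every top-level record whose `key` equals `value`."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    return [item for item in records if item.get(key) == value]


# Same records as address_data.json; None is written out as JSON null.
sample = [
    {"ID": "KZ568", "Name": "Mark", "Address": None},
    {"ID": "MT456", "Name": "Vincent", "Address": "Ohio"},
    {"ID": "AM444", "Name": "Paul", "Address": "Minnesota"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "address_data.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    found = search_json_file(path, "ID", "MT456")
    not_found = search_json_file(path, "ID", "ZZ000")

print(found)      # [{'ID': 'MT456', 'Name': 'Vincent', 'Address': 'Ohio'}]
print(not_found)  # []
```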
Conclusion
As you might have already noticed, there is no single method for searching through JSON data. The core idea is to understand how the methods discussed in this article work and to adapt the code to fit your specific needs.