Fix Requests Max Retries Exceeded With Url in Python

The “Max retries exceeded with url” error sometimes thrown by the requests library in Python falls under two classes of errors: requests.exceptions.ConnectionError (the most common) and requests.exceptions.SSLError. In this article, we will discuss the causes of the error, how to reproduce it, and, most importantly, how to solve it.

Causes of requests Max Retries Exceeded With Url

The error occurs when the requests library cannot successfully send requests to the given site. This can happen for several reasons. Here are the most common:

  1. Wrong URL – A typo maybe (go to Solution 1),
  2. Failure to verify SSL certificate (Solution 2),
  3. Using requests with no or unstable internet connection (Solution 3), and
  4. Sending too many requests or server too busy (Solution 4)

Wrong URL – A typo?

There is a chance that the URL you requested was incorrect, for example, because of a typo. Suppose we want to send a GET request to “https://www.example.com” (a valid URL), but instead we issue “https://www.example.cojkm” (.cojkm in the domain extension instead of .com).
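We can reproduce this with a simple GET request to the misspelled URL:

import requests

# Deliberately misspelled top-level domain: .cojkm instead of .com
requests.get("https://www.example.cojkm")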

Output:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.example.cojkm', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fd5b5d33100>: Failed to establish a new connection: [Errno 111] Connection refused'))

Failure to verify the SSL certificate

The requests library, by default, implements SSL certificate verification to ensure you are making a secure connection. If the certificates can’t be verified, you end up with an error like this:

requests.exceptions.SSLError: [Errno 1] _ssl.c:503: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
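One way to reproduce this kind of failure is to request a host that intentionally serves a bad certificate; expired.badssl.com is a public test host for exactly this purpose:

import requests

# expired.badssl.com intentionally serves an expired SSL certificate
requests.get("https://expired.badssl.com/")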

Using requests with no or unstable internet connection

The requests package sends and receives data via the web; therefore, an available and stable internet connection is needed. If your connection is down or unstable, requests will throw an error like this:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.example.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f7f5fadb100>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

Sending too many requests / server overload

Some websites block connections when too many requests are made too quickly. A related problem occurs when the server is overloaded, managing a large number of connections at the same time. In such cases, requests.get() throws an error like this:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.srgfesrsergserg.com', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000008EC69AAA90>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))

Solutions to requests Max Retries Exceeded With Url Error

In this section, we will cover solutions to the “Max retries exceeded with url” error for each of the causes above.

Solution 1: Double-check the URL

Ensure that you have a correct and valid URL. Consider the valid URL mentioned earlier: “https://www.example.com”. The “Max retries exceeded with url” error occurs mostly when incorrect edits are made around www and the top-level domain name (e.g., .com).

A different error, requests.exceptions.InvalidSchema, arises when the scheme/protocol (https) is incorrectly edited. If the second-level domain (in our case, “example”) is wrongly edited, we will be directed to a different website altogether, and if that site does not exist, we get a 404 response.

Other wrong URLs that lead to “Max retries exceeded with url” are “wwt.example.com” and “https://www.example.com ” (a white space after .com).

Solution 2: Solving SSLError

As mentioned, this error is caused by an untrusted SSL certificate. The quickest fix is to pass the argument verify=False to requests.get(). This tells requests to send the request without verifying the SSL certificate.
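For example (using example.com as a placeholder URL):

import requests

# Skip SSL certificate verification (insecure; see the warning below)
response = requests.get("https://www.example.com", verify=False)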

Please be aware that the certificate won’t be verified; therefore, your application will be exposed to security threats like man-in-the-middle attacks. It is best to avoid this method for scripts used at the production level.

Solution 3: Solving the “Max retries exceeded with url” error related to an unstable connection

This solution fits cases where you have intermittent connection outages. Here, we want requests to retry a request several times before throwing an error. There are two options:

  • Pass a timeout argument to requests.get(), or
  • Retry connections on connection-related errors

Solution 3a: Pass a timeout argument to requests.get()

If the server is overloaded, we can use a timeout to wait longer for a response. This will increase the chance of a request finishing successfully.
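A minimal sketch, using example.com as a placeholder:

import requests

# Allow up to 7 seconds to connect and to read the response
response = requests.get("https://www.example.com", timeout=7)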

The code above will wait 7 seconds for the requests package to connect to the site and read the source.

Alternatively, you can pass the timeout as a 2-element tuple, where the first element is the connection timeout (the time allowed to establish a connection to the server) and the second is the read timeout (the time allowed for the client to read data from the server).
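For example:

import requests

# 3 seconds to connect, 7 seconds to read the response
response = requests.get("https://www.example.com", timeout=(3, 7))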

When the above line is used, a connection must be established within 3 seconds and the data read within 7 seconds; otherwise, requests raises a Timeout error.

Solution 3b: Retry connections on connection-related errors

The requests library uses the Retry utility from urllib3 (urllib3.util.Retry) to retry connections. We will use the following code to send requests (explained below).
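Here is a minimal sketch of such a helper; mounting an HTTPAdapter with a Retry strategy on a requests Session is a standard way to attach urllib3.util.Retry to requests (the name send_request and its defaults follow the description below):

import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

def send_request(url, connect=4, backoff_factor=0.9):
    # Retry connection-related failures, sleeping between attempts,
    # and also retry on the listed HTTP status codes
    retry_strategy = Retry(
        connect=connect,
        backoff_factor=backoff_factor,
        status_forcelist=[504, 503, 502, 400, 429],
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session = requests.Session()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session.get(url)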

We have used the following parameters of the urllib3.util.Retry class:

  • connect – the number of connection-related retries. By default, send_request() will make 4 retries plus 1 (the original request, which happens immediately).
  • backoff_factor – determines the delays between retries. The sleeping time is computed with the formula {backoff_factor} * (2 ^ ({retry_number} - 1)). We will work through an example for this argument when calling the function.
  • status_forcelist – retry only for requests that resulted in the 504, 503, 502, 400, and 429 status codes (read more about status codes at https://en.wikipedia.org/wiki/List_of_HTTP_status_codes).

Let’s now call our function and time the execution.
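A sketch of that call, timed with the time module:

import time

start = time.time()
response = send_request("https://www.example.com")
print(response.status_code)
print("Run time: ", time.time() - start)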

Output:

200
Run time:  0.8597214221954346

The connection completed successfully (status 200), taking 0.86 seconds to finish. To see the backoff in action, let’s send a request to a server that does not exist, catch the exception when it occurs, and compute the execution time.
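Something like this, pointing at localhost, where nothing is listening on port 80 (the /6000 path matches the output below):

import time
import requests

start = time.time()
try:
    # Every attempt fails because nothing is serving on localhost:80
    send_request("http://localhost/6000")
except requests.exceptions.RequestException as error:
    print("Error Name: ", type(error).__name__)
    print("Error Message: ", error)
print("Run time: ", time.time() - start)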

Output:

Error Name:  ConnectionError
Error Message:  HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /6000 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f06a5862a00>: Failed to establish a new connection: [Errno 111] Connection refused'))
Run time:  12.61784315109253

After 4 retries (plus 1 original request) with a backoff_factor=0.9, the execution time was 12.6 seconds. Let’s use the formula we saw earlier to compute sleeping time.

sleeping_time = {backoff_factor} * (2 ^ ({retry_number} - 1))

There are 5 requests in total:

  • First request (which is made immediately) – 0 seconds of sleeping,
  • First retry (also sent immediately, after the failure of the first request) – 0 seconds of sleeping,
  • Second retry -> 0.9*(2^(2-1)) = 0.9*2 = 1.8 seconds of sleeping,
  • Third retry -> 0.9*(2^(3-1)) = 0.9*4 = 3.6 seconds of sleeping, and
  • Fourth retry -> 0.9*(2^(4-1)) = 0.9*8 = 7.2 seconds of sleeping.

That is a total of 12.6 seconds of sleeping time imposed by urllib3.util.Retry. The actual execution time was 12.61784315109253 seconds; the remaining 0.018 seconds are attributable to DNS resolution and general processing latency.

Solution 4: Using headers when sending requests

Some websites block web crawlers. They can tell that a bot is sending requests based on the headers passed. For example, let’s run the following code with verbose logging turned on to see what happens behind the scenes.
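One way to get that verbosity is http.client’s debug level, which prints the raw request and reply lines that requests exchanges with the server:

import http.client
import requests

# Print the raw HTTP request and response lines
http.client.HTTPConnection.debuglevel = 1

requests.get("https://www.example.com")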

Output (truncated):

send: b'GET / HTTP/1.1\r\nHost: www.example.com\r\nUser-Agent: python-requests/2.28.1\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'

In that log, you can see that the User-Agent is python-requests/2.28.1 and not a real browser. With such identification, you might get blocked and end up with the “Max retries exceeded with url” error. To avoid this, we need to pass our actual browser’s identification as the user-agent. You can go to http://myhttpheader.com/ to find your headers; in my case, the user-agent is “Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0”. Let’s now use that user agent instead.
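For example:

import requests

# Identify as a real browser instead of python-requests
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) "
                  "Gecko/20100101 Firefox/91.0"
}
requests.get("https://www.example.com", headers=headers)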

Output (truncated):

send: b'GET / HTTP/1.1\r\nHost: www.example.com\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'

Conclusion

The “Max retries exceeded with url” error can be caused by an invalid URL, server overload, failed SSL verification, an unstable internet connection, or sending too many requests to a server. In this article, we discussed solutions to all of these problems, with examples.

The key is always to understand the kind of error you have and then pick the appropriate solution.