ReadTimeout: HTTPSConnectionPool(host='', port=443): Read timed out. (read timeout=10)
Solution 1
Increasing the timeout fixed it for me; I set it to 120 seconds right away. It turned out the server's response arrives in about 40 seconds.
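As a minimal sketch of this (the 120-second value follows the answer; the function name, url, and headers are placeholders, not anything from the original script):

```python
import requests

def fetch_page(url, headers=None, timeout=120):
    """Fetch a page with a generous read timeout.

    The server in question was observed to answer within ~40 s,
    so 120 s leaves plenty of headroom before ReadTimeout is raised.
    """
    return requests.get(url, headers=headers, timeout=timeout)
```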
Solution 2
Why do you have the timeout parameter in there at all? I would just eliminate it. You get that error because you set timeout=10, which means: if no response arrives from the server within 10 seconds, raise an error. So it's not necessarily the server cutting you off. If no timeout is specified explicitly, requests does not time out (at least on your end).
page_one = requests.get(url, headers=headers)  # <-- don't use the timeout parameter
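Alternatively, if you prefer to keep a timeout (dropping it entirely can leave the script hanging forever), you can catch the exception instead of letting it crash the run. A minimal sketch, with fetch_with_timeout as a hypothetical helper name:

```python
import requests

def fetch_with_timeout(url, headers=None, timeout=10):
    """Keep the timeout but handle it gracefully.

    Returns None when the server is too slow to answer, so the
    caller can skip or retry the URL instead of crashing.
    """
    try:
        return requests.get(url, headers=headers, timeout=timeout)
    except requests.exceptions.ReadTimeout:
        return None
```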
Solution 3
This exception can occur because of the timeout or because of available memory:
- The response from the server takes longer than the specified timeout. To solve this, set a higher timeout.
- The file you are trying to read is large and the socket buffer is not big enough to handle it. Try increasing the buffer size based on your machine's capacity.
import socket
from urllib3.connection import HTTPConnection

# Enlarge the send/receive socket buffers for every connection
# that urllib3 (and therefore requests) opens from now on.
HTTPConnection.default_socket_options = (
    HTTPConnection.default_socket_options + [
        (socket.SOL_SOCKET, socket.SO_SNDBUF, 1000000),  # ~1 MB in bytes
        (socket.SOL_SOCKET, socket.SO_RCVBUF, 1000000),
    ])
JB_
Student of software development. Back-end [C#, PHP], Front-end [HTML, CSS and JS]
Updated on May 28, 2021

Comments
- JB_ almost 3 years
I'm doing a webscraping on a site and sometimes when running the script I get this error:
ReadTimeout: HTTPSConnectionPool(host='...', port=443): Read timed out. (read timeout=10)
My code:
url = 'mysite.com'
all_links_page = []

page_one = requests.get(url, headers=getHeaders(), timeout=10)
sleep(2)

if page_one.status_code == requests.codes.ok:
    soup_one = BeautifulSoup(page_one.content.decode('utf-8'), 'lxml')
    page_links_one = soup_one.select("ul.product_list")
    for links_one in page_links_one:
        for li in links_one.select("li"):
            all_links_page.append(li.a.get("href").strip())
The answers I found were not satisfactory.
- JB_ over 4 years: Am I using this parameter to avoid being blocked by the site, or am I wrong?
- wishmaster over 4 years: I believe it is always better to set a timeout; a server can keep a request hanging for quite a while, especially if it suspects a bot. Storing the link and requesting it later, or using a proxy to ask again, might solve it.
- chitown88 almost 2 years: @wishmaster that's a good point. Probably better to increase the timeout parameter here, then.
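The store-and-retry idea from the comments can be sketched as a small helper. This is only an illustration: fetch_with_retries is a hypothetical name, and the timeout, attempt count, and backoff delays are assumptions, not values from the original script:

```python
import time
import requests

def fetch_with_retries(url, headers=None, timeout=30, attempts=3):
    """Retry a slow page a few times with a growing delay.

    If the server stalls (e.g. because it suspects a bot), wait and
    ask again instead of failing on the first ReadTimeout.
    """
    for attempt in range(attempts):
        try:
            return requests.get(url, headers=headers, timeout=timeout)
        except requests.exceptions.ReadTimeout:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            time.sleep(2 ** attempt)  # back off: 1 s, 2 s, 4 s, ...
```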