[Errno 11002] Temporary failure in name resolution after using URLExtract #163

Open
jackjyq opened this issue Mar 11, 2024 · 1 comment

jackjyq commented Mar 11, 2024

After running URLExtract, the requests module raises [Errno 11002] Temporary failure in name resolution.

See the code below:

import requests
from urlextract import URLExtract


def call_extract_url():
    extractor = URLExtract()
    urls = extractor.find_urls(
        "https://www.baidu.com", check_dns=True, get_indices=False
    )
    print(f"call_extract_url() returns {urls}")


def call_request() -> None:
    r = requests.get("https://qr.1688.com/s/Q7XG2SzD", timeout=30)
    print(f"call_request() returns {r.status_code}")


call_request()
call_extract_url()
call_request()

The results:

call_request() returns 200
call_extract_url() returns ['https://www.baidu.com']
Traceback (most recent call last):
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\dns\resolver.py", line 1874, in _getaddrinfo
    answers = _resolver.resolve_name(host, family)
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\dns\resolver.py", line 1440, in resolve_name
    v6 = self.resolve(
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\dns\resolver.py", line 1321, in resolve
    timeout = self._compute_timeout(start, lifetime, resolution.errors)
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\dns\resolver.py", line 1075, in _compute_timeout
    raise LifetimeTimeout(timeout=duration, errors=errors)
dns.resolver.LifetimeTimeout: The resolution lifetime expired after 2.002 seconds: Server Do53:172.24.248.17@53 answered The DNS operation timed out. 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\urllib3\connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\urllib3\util\connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\dns\resolver.py", line 1883, in _getaddrinfo
    raise socket.gaierror(socket.EAI_AGAIN, "Temporary failure in name resolution")
socket.gaierror: [Errno 11002] Temporary failure in name resolution

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\urllib3\connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\urllib3\connectionpool.py", line 491, in _make_request
    raise new_e
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\urllib3\connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\urllib3\connectionpool.py", line 1096, in _validate_conn
    conn.connect()
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\urllib3\connection.py", line 611, in connect
    self.sock = sock = self._new_conn()
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\urllib3\connection.py", line 210, in _new_conn
    raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPSConnection object at 0x000001ADB8C93220>: Failed to resolve 'qr.1688.com' ([Errno 11002] Temporary failure in name resolution)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\requests\adapters.py", line 486, in send
    resp = conn.urlopen(
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\urllib3\connectionpool.py", line 844, in urlopen
    retries = retries.increment(
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\urllib3\util\retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='qr.1688.com', port=443): Max retries exceeded with url: /s/Q7XG2SzD (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x000001ADB8C93220>: Failed to resolve 'qr.1688.com' ([Errno 11002] Temporary failure in name resolution)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\Git\chatwoot-connector\scratch.py", line 20, in <module>
    call_request()
  File "d:\Git\chatwoot-connector\scratch.py", line 14, in call_request
    r = requests.get("https://qr.1688.com/s/Q7XG2SzD", timeout=30)
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\requests\api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\requests\api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\requests\sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\requests\sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "D:\Git\chatwoot-connector\venv\lib\site-packages\requests\adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='qr.1688.com', port=443): Max retries exceeded with url: /s/Q7XG2SzD (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x000001ADB8C93220>: Failed to resolve 'qr.1688.com' ([Errno 11002] Temporary failure in name resolution)"))

As the output shows, call_request() succeeds before call_extract_url() is called and fails afterwards.
I think this is caused by dns_cache_install(): when I comment out the dns_cache_install() call, the second call_request() succeeds.
Could this side effect be removed from URLExtract?
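
In the traceback above, socket.getaddrinfo is served by dns\resolver.py (_getaddrinfo) rather than by the standard library, which suggests the resolver functions on the socket module were replaced. Until this is addressed in URLExtract, a possible workaround is to snapshot those functions and put them back afterwards. The sketch below is only an illustration under that assumption: `getaddrinfo_is_patched` and `preserve_system_resolver` are hypothetical helper names, not part of urlextract or dnspython, and the list of function names is what dnspython's override_system_resolver() is understood to replace.

```python
import contextlib
import socket

# Resolver functions on the socket module that dnspython's
# override_system_resolver() is understood to replace (assumption).
_RESOLVER_FUNCS = (
    "getaddrinfo",
    "getnameinfo",
    "getfqdn",
    "gethostbyname",
    "gethostbyname_ex",
    "gethostbyaddr",
)


def getaddrinfo_is_patched() -> bool:
    """Return True if socket.getaddrinfo is not the standard library's own."""
    module = getattr(socket.getaddrinfo, "__module__", None)
    return module not in ("socket", "_socket")


@contextlib.contextmanager
def preserve_system_resolver():
    """Hypothetical helper: restore the socket resolver functions on exit."""
    # Snapshot whatever is installed right now.
    saved = {name: getattr(socket, name) for name in _RESOLVER_FUNCS}
    try:
        yield
    finally:
        # Put the snapshot back, undoing any patch made inside the block.
        for name, func in saved.items():
            setattr(socket, name, func)
```

Wrapping the code that triggers the patch, for example `with preserve_system_resolver(): call_extract_url()`, should leave later requests.get() calls on the system resolver. If the patch turns out to be installed when urlextract is imported, the import would need to be inside the block as well. Restoring the functions also means lookups made outside the block no longer go through URLExtract's DNS cache.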

Environment

  • Windows 11 23H2
  • Local DNS setting: 223.5.5.5, 223.6.6.6
  • Python 3.10
  • urlextract==1.9.0
  • requests==2.31.0

lipoja (Owner) commented Mar 11, 2024

Thank you for reporting it.
I would like to ask @jayvdb if he has time to take a look at this, since he implemented the DNS check.
