A Python-based tool that scrapes websites to collect email addresses. Given a starting URL, this tool will recursively follow links found on the page and extract email addresses from all visited pages.
- Recursive scraping: Follows links on the web pages to visit multiple pages for a thorough search.
- Email extraction: Uses regular expressions to find and collect email addresses.
- Easy to use: Just enter a URL, and the tool will start scraping.
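The recursive scraping and email extraction described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the names `EMAIL_RE`, `extract_emails`, `extract_links`, `crawl`, and the `max_pages` limit are assumptions. To stay self-contained it uses the standard library's `html.parser` in place of BeautifulSoup, and takes the page-fetching function as a parameter so it can run without network access.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Deliberately simple pattern (an assumption); it will miss some valid addresses.
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_emails(text):
    """Return the set of email-like strings found in text."""
    return set(EMAIL_RE.findall(text))


def extract_links(html, base_url):
    """Return absolute http(s) URLs for every link in the page."""
    parser = LinkCollector()
    parser.feed(html)
    links = set()
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)
        # Skip mailto:, javascript:, and other non-web schemes.
        if urlsplit(absolute).scheme in ("http", "https"):
            links.add(absolute)
    return links


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl from start_url; fetch(url) must return HTML text."""
    queue = deque([start_url])
    seen = {start_url}
    emails = set()
    processed = 0
    while queue and processed < max_pages:
        url = queue.popleft()
        processed += 1
        try:
            html = fetch(url)
        except Exception:
            continue  # Skip pages that cannot be fetched.
        emails |= extract_emails(html)
        for link in sorted(extract_links(html, url)):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return emails
```

With the `requests` library installed, `fetch` could be something like `lambda url: requests.get(url, timeout=10).text`. The `seen` set keeps the crawler from visiting the same URL twice, and `max_pages` bounds the crawl on sites with many links.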
- Python 3.x
- requests: For making HTTP requests.
- beautifulsoup4: For parsing and navigating HTML.
- lxml: An XML/HTML parser for BeautifulSoup.
git clone https://github.com/AdrianTomin/email-scraper.git
cd email-scraper
python -m venv venv
source venv/bin/activate
# On Windows: venv\Scripts\activate
The required libraries are listed in requirements.txt. You can install them using pip:
pip install -r requirements.txt
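If requirements.txt is missing, first install requests, beautifulsoup4, and lxml with pip inside the activated virtual environment, then generate the file from the installed packages: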
pip freeze > requirements.txt
After installing the dependencies, you can run the tool by executing the following command:
python email_scraper.py
[+] Enter url to scan: https://example.com
[1] Processing https://example.com
[2] Processing https://example.com/contact
Found emails:
info@example.com
support@example.com