Best Practices for Scaling Your Web Scraping Projects in 2025

We invite you to explore our blog for more details.

Setting Up Your Coding Environment

Before building the application, you’ll need to set up a basic Python environment. Follow these steps to get started:

1. Install Python 3 on your system.
2. Install the required dependencies by running:

   ```bash
   python -m pip install -r requirements.txt
   ```

3. To make the webhook publicly accessible to Crawlbase servers for demonstration purposes, install and configure ngrok.

Obtaining API Credentials

  1. Sign up for a Crawlbase account and log in.
  2. Upon registration, you’ll receive 5,000 free requests to get started.
  3. Navigate to your Account Docs and copy your Crawling API token (Normal or JavaScript requests).
  4. Create a new Crawler to start configuring your crawl tasks.

Running the Example Scripts

Before running the examples, ensure that you replace all instances of the following placeholders:

1. `<Normal or Javascript requests token>` - Replace this with your Crawling API requests token.
2. `<Crawler name>` - Replace this with the name of your newly created crawler. You can create or view crawlers from your Crawlbase dashboard.
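As a rough sketch of what a crawl request looks like (the repo's `crawl.py` may differ), the snippet below builds the Crawler push URL from the two placeholders above. The endpoint and query-parameter names (`token`, `crawler`, `url`) follow the Crawling API pattern and should be checked against the Crawlbase docs.

```python
# Hypothetical sketch of a crawl request builder; the actual crawl.py in this
# repo may be structured differently. The API base URL and parameter names
# are assumptions based on the Crawlbase Crawling API.
from urllib.parse import urlencode

API_BASE = "https://api.crawlbase.com/"

def build_push_url(token: str, crawler: str, target_url: str) -> str:
    """Build the GET URL that pushes target_url to the named Crawler."""
    query = urlencode({"token": token, "crawler": crawler, "url": target_url})
    return f"{API_BASE}?{query}"

# Example usage with the placeholders from this README:
# push_url = build_push_url(
#     "<Normal or Javascript requests token>",
#     "<Crawler name>",
#     "https://example.com/",
# )
# A GET request to push_url then queues the page for crawling.
```

Keeping the URL construction in one helper makes it easy to swap in a different token type (Normal vs. JavaScript) without touching the request logic.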

Example Scripts

1. Start the ngrok tunnel:

   ```bash
   ngrok http 5768
   ```

2. Set the callback URL:

   Copy the forwarding URL provided by ngrok and paste it into the Callback URL field of your Crawler settings. Example: `https://xxxx-xxx-xxx-xxx-xx.ngrok-free.app/webhook`

3. Run the webhook HTTP server:

   ```bash
   python webhook_http_server.py
   ```

4. Send a crawl request (in a separate terminal):

   ```bash
   python crawl.py
   ```
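To illustrate what the server in step 3 needs to do, here is a minimal sketch of a webhook receiver (the repo's `webhook_http_server.py` may differ). It assumes Crawlbase delivers crawled pages as HTTP POST bodies to the `/webhook` path, possibly gzip-compressed; verify the delivery format against the Crawling API docs.

```python
# Hypothetical webhook receiver sketch using only the standard library.
# Port 5768 matches the ngrok command above; the gzip handling is an
# assumption about how the crawled payload may arrive.
import gzip
from http.server import BaseHTTPRequestHandler, HTTPServer

PORT = 5768

def decode_body(raw: bytes) -> str:
    """Decompress a gzip payload if present, then decode as UTF-8."""
    if raw[:2] == b"\x1f\x8b":  # gzip magic bytes
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/webhook":
            self.send_response(404)
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        body = decode_body(self.rfile.read(length))
        print(f"Received {len(body)} characters of crawled content")
        self.send_response(200)  # acknowledge so the Crawler treats it as delivered
        self.end_headers()

def run(port: int = PORT):
    """Serve forever on the given port; point the ngrok tunnel at it."""
    HTTPServer(("", port), WebhookHandler).serve_forever()

# Start with: run()
```

Returning a 200 promptly matters here: if the webhook is slow or errors out, the Crawler will typically retry delivery, so heavy processing should be deferred to a queue rather than done inside `do_POST`.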

🛡 Disclaimer: This repository is for educational purposes only. Please make sure you comply with the Terms of Service of any website you scrape. Use this responsibly and only where permitted.


Copyright 2025 Crawlbase