Official Python SDK for the ScrapeGraph API - Smart web scraping powered by AI.
pip install scrapegraph-py
- AI-powered web scraping
- Both sync and async clients
- Structured output with Pydantic schemas
- Detailed logging
- Automatic retries
- Secure authentication
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
Note: You can set the SGAI_API_KEY environment variable and initialize the client without parameters: client = Client()
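The lookup order described in the note (explicit argument first, then the environment) can be sketched with the standard library alone. This is an illustration of the pattern, not the SDK's actual code: `resolve_api_key` is a hypothetical helper, and only the `SGAI_API_KEY` variable name comes from the note.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str] = None) -> str:
    """Return the explicit key if given, else fall back to SGAI_API_KEY.

    Hypothetical helper for illustration; not part of scrapegraph_py.
    """
    key = explicit_key or os.environ.get("SGAI_API_KEY")
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise ValueError(
            "No API key found: pass one explicitly or set SGAI_API_KEY"
        )
    return key
```

Keeping the key in the environment avoids hard-coding credentials in source files that may end up in version control.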
Scrapes any webpage using AI to extract specific information.
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
# Basic usage
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)
print(response)
Output Schema (Optional)
from pydantic import BaseModel, Field
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")

response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and description",
    output_schema=WebsiteData
)
Converts any webpage into clean, formatted markdown.
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
response = client.markdownify(
    website_url="https://example.com"
)
print(response)
Extracts information from HTML content using AI.
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
html_content = """
<html>
<body>
<h1>Company Name</h1>
<p>We are a technology company focused on AI solutions.</p>
<div class="contact">
<p>Email: contact@example.com</p>
</div>
</body>
</html>
"""
response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content
)
print(response)
All endpoints support async operations:
import asyncio
from scrapegraph_py import AsyncClient
async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())
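Because the async client's methods are awaited, several requests can be issued concurrently with `asyncio.gather`. The sketch below shows only the concurrency pattern: `fake_smartscraper` and `scrape_all` are hypothetical names, and the stand-in sleeps instead of calling the API so the example runs offline. With the real SDK you would presumably await `client.smartscraper(...)` inside the `async with AsyncClient() as client:` block instead.

```python
import asyncio
from typing import Dict, List


async def fake_smartscraper(website_url: str, user_prompt: str) -> Dict[str, str]:
    """Hypothetical stand-in for an awaitable scraping call (no network)."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return {"url": website_url, "prompt": user_prompt}


async def scrape_all(urls: List[str], prompt: str) -> List[Dict[str, str]]:
    # Create one coroutine per URL and wait for all of them together.
    # gather() returns results in the same order as its arguments.
    calls = [fake_smartscraper(url, prompt) for url in urls]
    return await asyncio.gather(*calls)


if __name__ == "__main__":
    results = asyncio.run(
        scrape_all(
            ["https://example.com", "https://example.org"],
            "Extract the main content",
        )
    )
    for item in results:
        print(item)
```

Running the calls through `gather` means the total wait is roughly that of the slowest request rather than the sum of all of them.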
For detailed documentation, visit docs.scrapegraphai.com
For information about setting up the development environment and contributing to the project, see our Contributing Guide.
- Email: support@scrapegraphai.com
- GitHub Issues: Create an issue
- Feature Requests: Request a feature
- API Feedback: You can also submit feedback programmatically using the feedback endpoint:
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
client.submit_feedback(
    request_id="your-request-id",
    rating=5,
    feedback_text="Great results!"
)
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ by ScrapeGraph AI