🌐 ScrapeGraph Python SDK

Official Python SDK for the ScrapeGraph API - Smart web scraping powered by AI.

📦 Installation

pip install scrapegraph-py

🚀 Features

🤖 AI-powered web scraping
🔄 Both sync and async clients
📊 Structured output with Pydantic schemas
🔍 Detailed logging
⚡ Automatic retries
🔐 Secure authentication

🎯 Quick Start

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

Note

You can set the SGAI_API_KEY environment variable and initialize the client without parameters: client = Client()
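The lookup described in the note can be sketched as follows. This is for illustration only: `resolve_api_key` is a hypothetical helper, not part of the SDK, and it assumes an explicitly passed key should take precedence over the `SGAI_API_KEY` environment variable.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str] = None) -> str:
    """Return the explicit key if given, else the SGAI_API_KEY environment variable.

    Hypothetical helper: the precedence order here is an assumption, not SDK behavior.
    """
    key = explicit_key or os.environ.get("SGAI_API_KEY")
    if not key:
        raise ValueError("No API key: pass one explicitly or set SGAI_API_KEY")
    return key


if __name__ == "__main__":
    # Placeholder value so the sketch runs without a real key.
    os.environ.setdefault("SGAI_API_KEY", "your-api-key-here")
    print(resolve_api_key())
```

Failing early with a clear error can be easier to debug than a rejected request later on.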

📚 Available Endpoints

🔍 SmartScraper

Scrapes any webpage using AI to extract specific information.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

# Basic usage
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)

print(response)

Output Schema (Optional)

from pydantic import BaseModel, Field
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")

response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and description",
    output_schema=WebsiteData
)

📝 Markdownify

Converts any webpage into clean, formatted markdown.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

response = client.markdownify(
    website_url="https://example.com"
)

print(response)

💻 LocalScraper

Extracts information from HTML content using AI.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

html_content = """
<html>
    <body>
        <h1>Company Name</h1>
        <p>We are a technology company focused on AI solutions.</p>
        <div class="contact">
            <p>Email: contact@example.com</p>
        </div>
    </body>
</html>
"""

response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content
)

print(response)
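In practice the HTML often comes from a saved file rather than an inline string. A minimal sketch of that step, using only the standard library: `load_html` is a hypothetical helper, and the UTF-8 encoding is an assumption about how the file was saved.

```python
import tempfile
from pathlib import Path


def load_html(path: str) -> str:
    """Read an HTML file as UTF-8 text (hypothetical helper, not part of the SDK)."""
    return Path(path).read_text(encoding="utf-8")


if __name__ == "__main__":
    # Write a small sample page to a temporary file so the sketch is self-contained.
    with tempfile.TemporaryDirectory() as tmp_dir:
        sample_path = Path(tmp_dir) / "page.html"
        sample_path.write_text(
            "<html><body><h1>Company Name</h1></body></html>", encoding="utf-8"
        )
        html_content = load_html(str(sample_path))
        print(html_content)
        # html_content could then be passed as website_html to client.localscraper(...)
```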

⚡ Async Support

All endpoints support async operations:

import asyncio
from scrapegraph_py import AsyncClient

async def main():
    # No api_key is passed here, so SGAI_API_KEY must be set in the environment
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())
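One common reason to use the async client is to scrape several pages concurrently. The sketch below shows that pattern with `asyncio.gather`. It is not taken from the SDK: `scrape_many` is a hypothetical helper, and `fake_scrape` is a stand-in for a real `AsyncClient` call so the example runs without network access or an API key.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def scrape_many(
    scrape: Callable[[str], Awaitable[Dict]],
    urls: List[str],
) -> List[Dict]:
    """Run one scrape coroutine per URL concurrently; results keep the input order."""
    return await asyncio.gather(*(scrape(url) for url in urls))


async def fake_scrape(url: str) -> Dict:
    """Hypothetical stand-in for an AsyncClient call; returns a dummy payload."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"url": url}


if __name__ == "__main__":
    urls = ["https://example.com", "https://example.org"]
    results = asyncio.run(scrape_many(fake_scrape, urls))
    print(results)
```

With a real client, the `scrape` argument could be a small wrapper such as `lambda url: client.smartscraper(website_url=url, user_prompt="Extract the main content")`, called inside the `async with AsyncClient() as client:` block.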

📖 Documentation

For detailed documentation, visit docs.scrapegraphai.com

🛠️ Development

For information about setting up the development environment and contributing to the project, see our Contributing Guide.

💬 Support & Feedback

📧 Email: support@scrapegraphai.com
💻 GitHub Issues: Create an issue
🌟 Feature Requests: Request a feature

⭐ API Feedback: You can also submit feedback programmatically using the feedback endpoint:

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

client.submit_feedback(
    request_id="your-request-id",
    rating=5,
    feedback_text="Great results!"
)

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Links

Website
Documentation
GitHub

Made with ❤️ by ScrapeGraph AI
