-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Add e2e tests for playwright templates and pip based templates #1136
base: master
Are you sure you want to change the base?
Conversation
Example test run: https://github.com/apify/crawlee-python/actions/runs/14246372360 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds end-to-end tests for both Playwright and pip-based templates while addressing an issue with dependency installations and aligning the maximum allowed crawl requests to documentation examples.
- Added new e2e test parametrizations and markers for various template types and package managers.
- Refactored the function patching the crawlee version to support multiple package managers and updated related tests accordingly.
- Adjusted Dockerfile template and crawler configuration files to use a lower max_requests_per_crawl value.
Reviewed Changes
Copilot reviewed 7 out of 10 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
tests/e2e/project_template/utils.py | Updated function to conditionally patch projects based on package manager and modified Dockerfile/requirements handling. |
tests/e2e/project_template/test_static_crawlers_templates.py | Updated test parametrizations and function calls to reflect the new patching function and limits. |
tests/e2e/conftest.py | Added pytest markers configuration for new test parameters. |
src/crawlee/project_template/templates/main_playwright.py | Lowered max_requests_per_crawl to align with documentation and test expectations. |
src/crawlee/project_template/templates/main_parsel.py | Lowered max_requests_per_crawl to align with documentation and test expectations. |
src/crawlee/project_template/templates/main_beautifulsoup.py | Lowered max_requests_per_crawl to align with documentation and test expectations. |
.github/workflows/templates_e2e_tests.yaml | Updated the workflow matrix and commands to incorporate the new test parameters and optimize runner utilization. |
Files not reviewed (3)
- Makefile: Language not supported
- src/crawlee/project_template/{{cookiecutter.project_name}}/Dockerfile: Language not supported
- src/crawlee/project_template/{{cookiecutter.project_name}}/requirements.txt: Language not supported
Comments suppressed due to low confidence (1)
tests/e2e/project_template/utils.py:25
- Using [0] on the result of re.findall may raise an IndexError if no match is found. Consider using a safer retrieval method such as using next(iter(...), '') or verifying that the list is non-empty.
crawlee_extras = re.findall(r'crawlee(\[.*\])', requirements)[0] or ''
More parallelized run with properly ordered tests to lower overall tests time: https://github.com/apify/crawlee-python/actions/runs/14260627276 6.5 minutes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a few minor comments...
secrets: | ||
APIFY_TEST_USER_API_TOKEN: | ||
description: API token of the Python testing user on Apify | ||
required: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this secret no longer required? The template tests are not running on Apify?
@@ -1,5 +1,5 @@ | |||
# % if cookiecutter.crawler_type == 'playwright-camoufox' | |||
camoufox | |||
camoufox[geoip]>=0.4.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather major version upper bound?
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 empty lines? :)
Description
max_requests_per_crawl
to 10 in all templates to be aligned with doc examples and to make the tests faster.Issues