✨ Add shutdown event and save per page option
What's Changed
- ✨ Add shutdown event and save per page option by @roniemartinez in #102
Other
- ⬆️ Bump playwright from 1.20.0 to 1.20.1 by @dependabot in #101
- ⬆️ Bump mypy from 0.941 to 0.942 by @dependabot in #104
- ⬆️ Bump mkdocs-material from 8.2.6 to 8.2.7 by @dependabot in #105
Full Changelog: 0.11.0...0.12.0
✨ Save data on each page
You can now save data after each page is scraped. To use this, pass is_per_page=True to the @save decorator on your save function and run the scraper with the --save-per-page option.
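import json

from dude import save

@save("jsonl", is_per_page=True)
def save_jsonl(data, output) -> bool:
    # jsonl_file is assumed to be a file handle opened elsewhere in the script.
    global jsonl_file
    jsonl_file.writelines((json.dumps(item) + "\n" for item in data))
    return True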
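To see why a line-oriented format fits per-page saving, here is a minimal standalone sketch that calls a save function once per page and appends to a single buffer. It deliberately leaves out the @save decorator so it runs on its own; append_jsonl, pages, and the sample records are hypothetical names for illustration, not part of the library.

```python
import io
import json


def append_jsonl(data, file_handle) -> bool:
    """Append one page's items to an already-open file, one JSON object per line."""
    file_handle.writelines(json.dumps(item) + "\n" for item in data)
    return True


# Hypothetical scraped data: one list of items per page.
pages = [
    [{"title": "a"}, {"title": "b"}],
    [{"title": "c"}],
]

buffer = io.StringIO()  # stands in for a file that is opened once and kept open
for page_items in pages:
    append_jsonl(page_items, buffer)  # called once per page

# Each line parses independently, so earlier pages stay valid if a later one fails.
records = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

Because every call only appends complete lines, the output written so far remains usable even if scraping stops partway through.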
✨ Shutdown event
The shutdown event is called before the application terminates. It is useful for releasing resources such as file handles and database connections, or for any other cleanup that must run before exit.
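import shutil

from dude import shutdown

@shutdown()
def zip_all():
    # SAVE_DIR is assumed to be defined elsewhere in the script.
    global SAVE_DIR
    shutil.make_archive("images-and-pdfs", "zip", SAVE_DIR)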
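As a standalone illustration of the archiving step only, the sketch below runs shutil.make_archive against a temporary directory. It does not use the @shutdown decorator, and zip_directory, save_dir, and the sample files are hypothetical stand-ins chosen for this example.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_directory(source_dir: Path, archive_base: Path) -> Path:
    """Zip the contents of source_dir into <archive_base>.zip and return its path."""
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=str(source_dir)))


with tempfile.TemporaryDirectory() as tmp:
    save_dir = Path(tmp) / "downloads"  # hypothetical stand-in for SAVE_DIR
    save_dir.mkdir()
    (save_dir / "page1.pdf").write_bytes(b"%PDF-")
    (save_dir / "logo.png").write_bytes(b"\x89PNG")

    # Write the archive outside save_dir so it does not end up inside itself.
    archive_path = zip_directory(save_dir, Path(tmp) / "images-and-pdfs")
    with zipfile.ZipFile(archive_path) as archive:
        archived_names = sorted(archive.namelist())
```

shutil.make_archive appends the .zip extension itself, which is why the base name is passed without one.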