
✨ Add shutdown event and save per page option

@roniemartinez roniemartinez released this 25 Mar 20:49
· 396 commits to master since this release
0.12.0
35f900b

What's Changed

Other

Full Changelog: 0.11.0...0.12.0

✨ Save data on each page

You can now save data after each page is scraped. To use this, pass is_per_page=True to the @save decorator and run the scraper with --save-per-page.

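import json

from dude import save

# jsonl_file is assumed to be a writable file object opened elsewhere at module level.
@save("jsonl", is_per_page=True)
def save_jsonl(data, output) -> bool:
    jsonl_file.writelines(json.dumps(item) + "\n" for item in data)
    return True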
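
Because a per-page save function runs once for every scraped page, it has to append to its output rather than overwrite it. The following is a minimal sketch of that pattern using only the standard library. The helper name `append_jsonl` and its parameters are hypothetical and are not part of dude's API.

```python
import json
from pathlib import Path
from typing import Dict, Iterable


def append_jsonl(path: Path, items: Iterable[Dict]) -> int:
    """Append each item as one JSON line and return the number of lines written."""
    count = 0
    # Mode "a" keeps the lines written for earlier pages.
    with path.open("a", encoding="utf-8") as handle:
        for item in items:
            handle.write(json.dumps(item) + "\n")
            count += 1
    return count
```

A function decorated with @save("jsonl", is_per_page=True) could delegate to a helper like this and return True on success.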

✨ Shutdown event

The shutdown event is called before the application terminates. This is useful for releasing resources such as file handles and database connections, or for any other cleanup that must happen before exit.

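import shutil

from dude import shutdown

# SAVE_DIR is assumed to be defined elsewhere as the directory holding the downloaded files.
@shutdown()
def zip_all():
    shutil.make_archive("images-and-pdfs", "zip", SAVE_DIR)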
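
A shutdown handler often has to release a resource before post-processing its output. The sketch below uses only the standard library to close an open file handle and then archive a directory. The function name `close_and_archive` and its parameters are hypothetical, and registering it with @shutdown() is left to the caller.

```python
import shutil
from pathlib import Path
from typing import IO


def close_and_archive(handle: IO[str], save_dir: Path, archive_base: str) -> str:
    """Close the handle, zip save_dir, and return the path of the archive."""
    # Closing first flushes buffered writes so the archive holds complete data.
    if not handle.closed:
        handle.close()
    # make_archive appends ".zip" to archive_base and returns the final path.
    return shutil.make_archive(archive_base, "zip", root_dir=str(save_dir))
```

Closing the handle before archiving matters when the file being written lives inside the directory that gets zipped.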

✨ How dude runs internally

[Diagram: events]