f0lie/reddit-ven-anigha-archive

 
 

This repo archives all comments by Venerable Anīgha on Reddit, then generates readable files from them.

The sqlite database contains all comments by Ven. Anīgha as well as the original questions. This database is populated by fetch_comments.py.
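As a rough illustration of the layout described above, the sketch below stores each comment next to the question it answers. The table name, the column names, and the open_db and save_comment helpers are all hypothetical; the actual schema used by fetch_comments.py may differ.

```python
import sqlite3

# Hypothetical schema (an assumption, not the repo's real one):
# one row per comment, with the original question stored alongside it.
SCHEMA = """
CREATE TABLE IF NOT EXISTS comments (
    comment_id    TEXT PRIMARY KEY,
    created_utc   INTEGER NOT NULL,
    question_body TEXT,
    comment_body  TEXT NOT NULL
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_comment(conn, comment_id, created_utc, question_body, comment_body):
    """Insert a comment, replacing any earlier copy with the same id."""
    # INSERT OR REPLACE keeps repeated fetches idempotent.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO comments VALUES (?, ?, ?, ?)",
            (comment_id, created_utc, question_body, comment_body),
        )
```

Keying on the comment id means a weekly re-fetch can overwrite edited comments without creating duplicate rows.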

generate_markdown.py then creates markdown, pdf, and epub files, one set per year. The resulting files are in the *_files/ folders.
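The per-year split can be sketched as follows. This is a minimal illustration, not the logic of generate_markdown.py: the group_by_year and render_year functions, the tuple layout, and the heading structure are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def group_by_year(rows):
    """Group (created_utc, question, comment) tuples by UTC calendar year."""
    by_year = defaultdict(list)
    for created_utc, question, comment in rows:
        year = datetime.fromtimestamp(created_utc, tz=timezone.utc).year
        by_year[year].append((created_utc, question, comment))
    return dict(by_year)


def render_year(year, items):
    """Render one year's comments as a markdown string, oldest first."""
    parts = [f"# {year}", ""]
    for created_utc, question, comment in sorted(items, key=lambda item: item[0]):
        date = datetime.fromtimestamp(created_utc, tz=timezone.utc).date().isoformat()
        # Prefix every line of the question with "> " so it renders as a quote.
        quoted = "\n".join("> " + line for line in (question or "").splitlines())
        parts += [f"## {date}", "", quoted, "", comment, ""]
    return "\n".join(parts)
```

Each rendered string could then be written to a yearly markdown file and passed to a converter for the pdf and epub outputs.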

Both the database update and the output file generation run every Sunday at midnight UTC.

Most of the time, comments from Reddit have quote markers in the correct place. However, some quotes are missing their markers, so a paragraph can look as if Ven. Anīgha wrote it when it is actually a quote from the other person. This issue is quite rare, and it is usually very clear from the text when Ven. Anīgha is speaking.
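To make the quote-marker issue concrete: Reddit markdown marks a quoted line with a leading ">". The sketch below separates marked quotes from the commenter's own lines; split_quotes is a hypothetical helper for illustration and is not part of this repo.

```python
def split_quotes(body):
    """Split a comment body into (quoted_lines, own_lines).

    A line counts as quoted only if it starts with ">", so a quote that
    lost its marker ends up in own_lines. That is the rare case
    described above.
    """
    quoted, own = [], []
    for line in body.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(">"):
            quoted.append(stripped[1:].strip())
        elif stripped:
            own.append(stripped)
    return quoted, own
```

Because the marker is the only signal, there is no reliable way to recover an unmarked quote automatically, so such passages have to be recognized by reading the text.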

Todo

  • automatic pdf generation
  • automatic epub generation
  • Add date of generation to the files.
  • (maybe) include the parent comments within threads as opposed to only the original question.
  • Fix questions that were deleted and show up as [removed] in the database.
  • Fix issue where emojis are missing from PDFs.
  • Make external links in the PDFs more noticeable. Links work but are hard to distinguish from normal text because they don't have a different style.

Packages for local PDF generation

Install pandoc and the PDF generation tooling for your OS by following this link.

Install the latest Source Serif font from this repo. Download the *_Desktop.zip archive, not *_WOFF.zip, which is intended for web use.

For Windows, unpack the zip file, select all the *.ttf fonts, right click, then click on "install" to add the fonts to the OS.

Installing fonts on other operating systems should be very similar.

Running the GitHub workflow locally

Install act to run the GitHub workflow locally. This is very useful for debugging the workflow.

Run act -j update_comments --secret-file .secrets --artifact-server-path $PWD/.artifacts.

.secrets is an env file that should contain all the secrets used to run the GitHub workflow.

.artifacts is a folder that stores the db between runs.

Known issue: the local run stalls at the final push step, and the cause is not yet known.

About

Archives comments by /u/Bhikkhu_Anigha from Reddit
