Describe the bug
Hi, just a couple of small issues I ran into while reading the docs for loading pdf data:
1. The link for "Create a pdf dataset" points to https://huggingface.co/docs/datasets/main/en/pdf_dataset instead of https://huggingface.co/docs/datasets/main/en/document_dataset and hence gives a 404 error.
2. At the top of the page, it's mentioned that to work with pdf datasets we need to have the `pdfplumber` package installed, but the link to its installation guide points to the `pytorch/vision` installation instructions instead of `pdfplumber`'s guide.

I love the work on enabling pdf dataset support and these small tweaks would help everyone navigate the docs better. Thanks!
Steps to reproduce the bug
The issue is on the Load Document Data page of the datasets docs.
Expected behavior
For the first issue, I went through the source .mdx code of the datasets docs and found that the link is pointing to `./pdf_dataset` instead of `./document_dataset`.
For the second issue, I went through the same .mdx source and found that the link points to the `pytorch/vision` installation instructions instead of `pdfplumber`'s guide.
Just replacing these two links should fix the bugs.
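A check like the following might catch this kind of broken relative link before it reaches the published docs. It is only a minimal sketch and not part of the datasets tooling: the function name `find_broken_relative_links`, the flat directory of `.mdx` files, and the `./name` link style are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches Markdown links whose target starts with "./",
# e.g. [Create a pdf dataset](./pdf_dataset). An optional "#anchor" is ignored.
# Assumption: the docs use this relative-link style.
RELATIVE_LINK = re.compile(r"\[([^\]]+)\]\((\./[^)#\s]+)(#[^)\s]*)?\)")


def find_broken_relative_links(docs_dir):
    """Return (file name, link text, target) for each relative link with no matching .mdx file."""
    docs_dir = Path(docs_dir)
    broken = []
    for page in sorted(docs_dir.glob("*.mdx")):
        text = page.read_text(encoding="utf-8")
        for match in RELATIVE_LINK.finditer(text):
            label, target = match.group(1), match.group(2)
            # "./document_dataset" is expected to resolve to "document_dataset.mdx"
            # next to the page that links to it (hypothetical flat layout).
            if not (page.parent / (target[2:] + ".mdx")).exists():
                broken.append((page.name, label, target))
    return broken
```

Run against a local checkout of the docs source, this would list every `./name` link whose target page is missing. It only covers relative links, so an external link that points to the wrong project (such as the `pdfplumber` one) would still need a manual review.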
Environment info
datasets v3.5.0 (main at the time of writing)