I then highlighted the text אני סתם כותב under the first heading and שמש גשם under the second heading. I used Okular (the KDE PDF viewer) for the annotations: pdfannots.pdf
In this case the headings are extracted correctly because they come as a string from the PDF metadata. The problem is that pdfminer's text extraction routines don't support right-to-left text: pdfminer/pdfminer.six#515
pdfannots makes some similar left-to-right assumptions of its own, which affect things like the relative order in which two annotations are reported when they appear on the same line of text. I could probably fix those, but the bigger issue is the one linked above.
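To illustrate the symptom, here is a minimal sketch of a post-processing workaround, assuming the extracted string arrives in visual (left-to-right) order and contains only right-to-left text. The helper names `is_mostly_rtl` and `visual_to_logical` are hypothetical and are not part of pdfannots or pdfminer. This is not a full implementation of the Unicode bidirectional algorithm, so lines that mix Hebrew with digits or Latin text would still come out wrong.

```python
import unicodedata


def is_mostly_rtl(text: str) -> bool:
    """Return True if strong right-to-left characters outnumber left-to-right ones."""
    rtl = 0
    ltr = 0
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew is "R", Arabic letters are "AL"
            rtl += 1
        elif bidi_class == "L":
            ltr += 1
    return rtl > ltr


def visual_to_logical(line: str) -> str:
    """Reverse a line that looks right-to-left; leave other lines unchanged.

    Assumption: the whole line is a single RTL run. Mixed-direction
    lines need a real bidi implementation instead.
    """
    return line[::-1] if is_mostly_rtl(line) else line


# Simulate the reversed output reported in this issue.
logical = "שמש גשם"
visual = logical[::-1]
print(visual_to_logical(visual) == logical)
```

Reversing whole lines only papers over the problem; the proper fix belongs in the text extraction layer, as tracked in the pdfminer issue linked above.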
This tool is terrific, thank you.
Highlighted and underlined Hebrew text is displayed backwards. Interestingly, the title blurb preceding the highlighted text is not backwards.