Conditionally restore ToMan functionality #36
The ToMan formatter would search for the groff version it has available but based on CPAN RT #120229, we need to determine this much earlier to decide whether to pass on to ToMan or not. This does not make any behavioral changes (other than temporarily removing a debug message).

* Two helper functions from Pod::Perldoc::BaseTo were moved into Perldoc.pm: _get_path_components() and _find_executable_in_path().
* Perldoc.pm now has a method inspect_execs() which tries to find all available executables of a given program. It uses data from a new helper function, _exec_data(). Currently only nroff is supported.
* From ToMan we removed the code for detecting nroff and instead incorporated them in _exec_data() or a new function _find_executable() -- both in Perldoc.pm.
* Perldoc.pm's _inspect_execs() now calls _find_executable(), _find_executable_in_path(), and _get_path_components().

ToTerm can now be switched to ToMan and everything still works as it did before, but we are not doing this yet. The next commit will start addressing the logic for picking ToMan or ToTerm.
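For readers who have not looked at the helpers, here is a minimal sketch of what "find all available executables of a given program" might look like. The function name `find_executables_in_path` and its signature are hypothetical and are not the code in Perldoc.pm; it only assumes `File::Spec->path`, which returns the directories listed in `$ENV{PATH}`.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not the actual Perldoc.pm code): return every
# executable file called $name found in @dirs. When no directories are
# given, fall back to the directories listed in $ENV{PATH}.
sub find_executables_in_path {
    my ($name, @dirs) = @_;
    @dirs = File::Spec->path unless @dirs;

    my @found;
    for my $dir (@dirs) {
        next unless defined $dir && length $dir;
        my $candidate = File::Spec->catfile($dir, $name);
        # -x _ reuses the stat buffer filled by the preceding -f test
        push @found, $candidate if -f $candidate && -x _;
    }
    return @found;
}

# List every nroff on this machine's PATH (possibly none).
print "$_\n" for find_executables_in_path('nroff');
```

Returning all matches rather than the first one is what lets a caller inspect each candidate's version later.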
This is just too much to read or figure out at once. Instead of every possible condition (including unless !(...), which is ridiculous), it's simply the obvious conditions: a set of OSes or a check on term. If I understand this logic correctly (and I had to write some tests to verify), it means that we are not using ToTerm for Windows, DOS, AmigaOS, or for the recognized terms listed ("*dumb*", "*emacs*", "*none*", and "*unknown*"). The fallback is ToText. I guess that makes sense.
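Spelled out, the simplified condition could look roughly like the sketch below. The function name `default_formatter_for` is hypothetical, the `$^O` strings (`MSWin32`, `dos`, `amigaos`) are my assumption about how those platforms are identified, and the case-insensitive substring match on the terminal name is also an assumption.

```perl
use strict;
use warnings;

# Hypothetical sketch of the simplified check: ToText on a fixed set of
# operating systems or for "dumb"-style terminals, ToTerm otherwise.
sub default_formatter_for {
    my ($os, $term) = @_;

    # Assumed $^O values for Windows, DOS and AmigaOS.
    return 'ToText' if $os =~ /\A(?:MSWin32|dos|amigaos)\z/i;

    # Substring match, mirroring "*dumb*", "*emacs*", "*none*", "*unknown*".
    return 'ToText' if defined $term && $term =~ /dumb|emacs|none|unknown/i;

    return 'ToTerm';
}

print default_formatter_for($^O, $ENV{TERM}), "\n";
```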
The formatter class will need to know which pagers have been found so it could determine whether ToTerm is a good option.
The idea is:

* If we have an updated version of groff, we can just use ToMan.
* If not, we can use ToTerm on the condition that less is updated enough and supports the "-R" flag that will be added to it by ToTerm.
* Otherwise, do not apply anything, using the first default set of ToText.
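A rough sketch of that decision, to make the order of the checks concrete. Everything here is illustrative: `pick_formatter`, `version_at_least`, and both minimum versions are hypothetical placeholders, not values taken from the actual code.

```perl
use strict;
use warnings;

# Placeholder thresholds; the real minimum versions are not stated here.
my $MIN_GROFF = '1.20.1';
my $MIN_LESS  = '346';

# Compare dotted numeric version strings component by component.
sub version_at_least {
    my ($found, $min) = @_;
    return 0 unless defined $found;
    my @f = split /\./, $found;
    my @m = split /\./, $min;
    while (@f || @m) {
        my $x = shift(@f) // 0;
        my $y = shift(@m) // 0;
        return 1 if $x > $y;
        return 0 if $x < $y;
    }
    return 1;    # all components equal
}

# Hypothetical decision function: ToMan first, then ToTerm, then ToText.
sub pick_formatter {
    my (%found) = @_;    # groff => version or undef, less => version or undef
    return 'ToMan'  if version_at_least($found{groff}, $MIN_GROFF);
    return 'ToTerm' if version_at_least($found{less},  $MIN_LESS);
    return 'ToText';
}

print pick_formatter(groff => '1.19.2', less => '487'), "\n";
```

With an old groff and a recent less, as in the example call, the sketch falls through to ToTerm.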
I would also like to see this default to ToMan if the nroffer is mandoc; on OpenBSD it works just fine.
This was delayed due to comments from Zefram. It's on my list of things to do, but I need to address his comments first.
This is resolving an issue raised by Zefram: The use of `$less_bin --version` is dubious, because, per ->pagers_guessing, $less_bin may contain shell redirection characters, such that "--version" wouldn't necessarily function as a command-line argument to less(1). You may need to parse pager strings in more detail. Instead we're removing any shell redirect character and anything that runs afterwards.
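To illustrate the cleanup, here is a minimal sketch that reduces a pager string to the binary alone before `--version` is appended. The name `pager_binary` and both regexes are my own illustration and not the regex used in the commit; treating a leading file-descriptor digit (as in `2>`) as part of the redirection is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a pager command string to just its binary.
sub pager_binary {
    my ($pager) = @_;
    (my $bin = $pager) =~ s/\s*\d*[<>|].*\z//s;   # drop redirection or pipe and everything after it
    $bin =~ s/\s+-.*\z//s;                        # drop option arguments such as "-isrR"
    $bin =~ s/\A\s+|\s+\z//g;                     # trim surrounding whitespace
    return $bin;
}

for my $pager ('less <', '/usr/bin/less -isrR', 'less -R 2>/dev/null') {
    printf "%-24s => %s\n", "'$pager'", pager_binary($pager);
}
```

The cleaned binary is only used for the version probe; the original pager string would still be the one used for paging.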
We had assumed that the first `less` pager is of a sufficient version but that is not necessarily the case. Instead, we now go through all `less` pagers and test each one separately. When we find one of the right version, we stop. If the user doesn't like this, they should set their first pager to the right `less` pager.
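As a sketch of the loop described above: walk the pager list, probe only the entries that look like `less`, and stop at the first one that is new enough. The function name `first_suitable_less`, the probe callback, and the version numbers are hypothetical; the real code runs the binary, which is replaced here by a lookup table so the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical sketch: return (index, pager) for the first "less" pager
# whose probed version is at least $min_version, or an empty list.
sub first_suitable_less {
    my ($pagers, $min_version, $probe) = @_;
    for my $i (0 .. $#{$pagers}) {
        my $pager = $pagers->[$i];
        next unless $pager =~ /less/;
        my $version = $probe->($pager);    # undef when no version was detected
        next unless defined $version && $version >= $min_version;
        return ($i, $pager);
    }
    return;
}

# Fake probe standing in for running "<binary> --version".
my %fake_versions = ('/old/less' => 300, '/new/less' => 487);
my $probe = sub { $fake_versions{ $_[0] } };

my ($index, $pager) = first_suitable_less(['more', '/old/less', '/new/less'], 346, $probe);
print defined $index ? "$index $pager\n" : "no suitable less found\n";
```

Returning the index as well as the binary makes it possible to tell repeated `less` entries apart.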
I've just taken over the maintenance of Pod::Perldoc, so I'll look into this. I know it's years old, and it might take me a bit to catch up.
My "to-do list" includes this ticket, which I really want to address. Do you want to have a quick chat about it?
Here is a summary list of things that must be reviewed before this PR is merged.
I considered it "fine enough," using the detection from adding Zefram's comment on this being similar to the
I need to find the execution file and then call only that with
Reducing it to "which are available" (by going through them and calling
(Determined below to keep
I can change the guessing function to explicitly override whatever pager is available by those configurations and then, as a secondary measure, try to reduce the list to existing pagers, and if the first one is Zefram was okay with it (or at least agreed it makes sense).
I had left it as is since Zefram's feedback on this being part of current
We need to make sure From Zefram:
We have a solution for If so, this means the logic is:
Zefram approves this. |
Okay, let's see how this works out.
Many of the future improvements require making sure we get the right pager. To be able to make the adjustments successfully, we need to set a baseline with this test. It tests environment variables, the "-m" option, and OS options. It already exposes a few options that might be worth creating tickets for.
We are trying to determine the formatter class (ToTerm, ToMan, etc.) during `init()` but we only do the pager guessing (to figure out which pagers we have) much later, when we want to run the paging. Moving the pager guessing to the init point allows us to use the pager availability and version to determine the formatter. We also need to improve the detection of the `less(1)` version, and to do that, we need to collect all possible pagers first, which is done in `pagers_guessing()`.
This change checks that the code detects the `less` version without getting confused by arguments or redirections. It tests every OS setup available, and checks whether it receives the correct binary (after cleaning up redirection and arguments), as well as the index of that binary, just so we know it's not the same binary each time (if "less" is repeated, for example). This regex has another character ("\s") that the original code doesn't, to prove that it's correct to add it. I also temporarily added a redirect to one of the "less" entries, and it showed that the redirection regex works (though it broke other tests, so I removed it before committing).
The `less` binary might include arguments (for example, we support `/usr/bin/less -isrR` for Cygwin), so it's better to clean up the arguments before calling `--version`. Arguably, this isn't a bug, but it's better to separate the arguments from the `--version` call.
Changes:
Considering we already clean up the The idea of finding the execution file is not likely, considering we're trying to support multiple operating systems and don't know which one supports |
Is this something that's ready to merge or are you planning more work?
I'm about to go off grid until the second week of January, but I've made Sawyer a co-maintainer in PAUSE and he's a collaborator on this repo. Feel free to make appropriate decisions in my absence. I suggest releasing a dev release if you want people to try it (but they can also just install from the repo). If you tag the repo (see the latest Whatever happens can be fixed if it goes wrong. Be bold. We only really need to be bulletproof when we ask perl5-porters to include it in perl release. |
I've only addressed one issue so far. There are a few others, so I'm definitely planning more work before we can merge.
I originally copied it over for convenience. It shouldn't have been committed and it's not being called.
As Zefram pointed out in a ticket, the `/less/` regexp match here doesn't mean we didn't get a different pager with the string "less" in its name. By checking that we received a version, we're also checking a regexp match on the version call to return the result for "less" pager (because the response to `--version` includes "less VERSION_NUMBER"). This means that now pagers that have the string "less" in their name would not be tripped by it (unless they answer to `--version` with "less VERSION_NUMBER") and `less` binaries that have the string "less" but not only "less" would still work (like "lessng" if one would exist). Arguably, we might want to try the `--version` check on all pagers instead of ones that match `/less/` in their name, but... I'm not sure we should.
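A small sketch of the version check described above: the output of the `--version` call only counts if it contains "less VERSION_NUMBER". The function name `less_version_from` and the sample strings are illustrative assumptions, and the regex is deliberately left unanchored.

```perl
use strict;
use warnings;

# Hypothetical sketch: extract the version from "--version" output,
# returning undef unless it contains "less VERSION_NUMBER".
sub less_version_from {
    my ($output) = @_;
    return unless defined $output;
    return $output =~ /less\s+(\d+(?:\.\d+)*)/ ? $1 : undef;
}

# Sample outputs (illustrative strings, not captured from real binaries).
for my $sample ("less 487 (GNU regex)\n", "useless pager v1\n") {
    my $version = less_version_from($sample);
    print defined $version ? "version $version\n" : "not less\n";
}
```

A pager that merely has "less" in its name fails this check because its output does not produce a version match.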
We now detect |
Does that less version match need to be anchored?
We could, but I think we should be more permissive on this front. We're also not being very strict on the binary. We check for any binary with the word "less" in it. We're okay with someone having a binary called |
On:
While
Note the words "[...] before trying to find a pager on its own." This means we explicitly state that we will continue to try one of our own if we fail, unlike I think we need to decide here between breaking our expected behavior so far and whatever expected behavior someone has if they assumed we had the same logic as While I was in favor of the former at first, I'm now in favor of the latter. I.e., keep it as is. |
Since perldoc isn't man, we've been doing it differently for at least a decade, and it's documented to do that, I say we stick with what we are doing now to find a pager.
Is there anything more this needs? Can we let people play with it to get feedback?
[Each commit here has been done separately with - hopefully - a good commit message. It is easier to review them separately than together.]
This Pull Request introduces ToMan back and provides conditions for when to use ToMan and when to use ToTerm, falling back to ToText (as before).

The conditions are as follows (or should be):

* If `groff` is new enough, use ToMan. This was the previous system that was moved to ToTerm but worked well for non-macOS operating systems. ToTerm works well for those Macs but fails on any non-macOS system. macOS carries an older version of `groff`.
* If `groff` is too old (macOS) and our pager is `less`, use ToTerm. ToTerm will continue to (conditionally) set the `-R` flag to make `less` work well.
* If our pager is not `less`, use ToText. This is the fallback and was set before our code for deciding whether to use ToMan or ToTerm.

The way this PR works is by moving the detection of `groff` to the main module (and hopefully refactoring it to allow other stuff in the future) and then making the decision in line 593 and below in Perldoc.pm.

This should resolve RT#120229.