Deprecate mark, unmark, ismarked, reset #58034

Open
jakobnissen opened this issue Apr 7, 2025 · 1 comment
Labels
io Involving the I/O subsystem: libuv, read, write, etc.

jakobnissen commented Apr 7, 2025

This is part of a larger discussion of IO, but I'm making a separate issue so we can refer here, and to make sure this is being tracked.

I propose we deprecate these functions, and remove references to these functions from docstrings.

What are they good for? As far as I can tell, their main uses are:

  1. To allow seeking in streams that are otherwise un-seekable. However, this is semantically meaningless, and in practice it seems to be used not for any documented, sensible API, but to implement things that should probably be implemented some other way.
  2. In e.g. TranscodingStreams: to mark a position in an IO's internal buffer so that this position is not deleted from the buffer. This ability is something we use extensively in various BioJulia packages.
  3. They can be used to seek to a position without storing the position in a variable. However, I think that's pointless. Just store the position in a variable.
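
To make point 3 concrete, here is a minimal sketch on an IOBuffer (a seekable stream, chosen only because it is easy to run). It compares mark/reset with storing the position in a variable; the variable names are illustrative.

```julia
io = IOBuffer("hello world")
read(io, 6)                      # skip "hello "

# Variant 1: mark/reset keeps the position inside the IO object.
mark(io)
a = String(read(io, 5))
reset(io)                        # seeks back to the mark and clears it
@assert !ismarked(io)

# Variant 2: keep the position in an ordinary variable.
p = position(io)
b = String(read(io, 5))
seek(io, p)

@assert a == b == "world"
@assert position(io) == p == 6
```

Both variants read the same five bytes and end at the same position. This only covers seekable streams; it says nothing about use cases 1 and 2.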

These functions currently have problems, and I think we can achieve what they do using better abstractions:

  1. Their implementation assumes that the IO has a property called mark, and that this returns a 1-based integer, with -1 as the sentinel value. This is not great.
  2. They assume the IO has a way to expose its internal buffer, but we don't have this API... yet. And, since these functions have generic definitions, if someone adds a .mark field to their IO, then their code for manipulating this buffer will, by default, be wrong.
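
The following sketch illustrates the field-name problem. It assumes the generic fallbacks in Base behave roughly as `mark(io::IO) = (io.mark = position(io))` and `ismarked(io::IO) = io.mark >= 0`; `CountingIO` and its fields are hypothetical and exist only for this example.

```julia
# Hypothetical IO type whose `mark` field has a meaning unrelated to Base.mark.
mutable struct CountingIO <: IO
    data::Vector{UInt8}
    pos::Int     # 0-based read position
    mark::Int    # the author's own bookkeeping, e.g. a count of flagged records
end

Base.position(io::CountingIO) = io.pos
Base.seek(io::CountingIO, p::Integer) = (io.pos = p; io)

io = CountingIO(collect(codeunits("abcdef")), 4, 0)

# The fallback reports the stream as marked although mark(io) was never called,
# because the unrelated field happens to be non-negative.
@assert ismarked(io)

# Calling the fallback silently overwrites the unrelated field.
mark(io)
@assert io.mark == 4
```

Nothing in the type definition opts into marking; the behavior comes purely from the field being named `mark`.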

We can achieve the same thing using my proposed buffered IO interface in #57982. This API gives users access to the IO's internal buffer. If a user then decides they want to hold only the bytes from position p onward, they can simply call consume(io, p-1), then call fillbuffer(io) until they get the bytes they want, and operate on those.
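
A toy sketch of that consume/fillbuffer pattern follows. It is not the interface from #57982: `ToyReader`, the `chunk` keyword, and the return values are assumptions made up for illustration, and `consume` and `fillbuffer` are defined locally rather than taken from any library.

```julia
# Toy buffered reader, for illustration only.
mutable struct ToyReader
    source::IOBuffer          # underlying stream
    buffer::Vector{UInt8}     # bytes read from the source but not yet consumed
end
ToyReader(data) = ToyReader(IOBuffer(data), UInt8[])

# Hypothetical: append up to `chunk` more bytes to the buffer and
# return how many bytes were added (0 at end of stream).
function fillbuffer(r::ToyReader; chunk::Int = 4)
    bytes = read(r.source, chunk)
    append!(r.buffer, bytes)
    return length(bytes)
end

# Hypothetical: drop the first `n` buffered bytes.
function consume(r::ToyReader, n::Int)
    n <= length(r.buffer) || throw(ArgumentError("cannot consume more than is buffered"))
    deleteat!(r.buffer, 1:n)
    return nothing
end

r = ToyReader("abcdefghij")
fillbuffer(r)                 # buffer now holds "abcd"
p = 3                         # keep everything from buffer position p onward
consume(r, p - 1)             # buffer now holds "cd"

# Fill until six bytes are available or the source is exhausted.
while length(r.buffer) < 6 && fillbuffer(r) > 0
    # nothing to do here; the loop condition does the filling
end

kept = String(copy(r.buffer))
@assert kept == "cdefgh"
```

The bytes from position p stay in the buffer for as long as the caller does not consume them, which is the guarantee that mark provides in use case 2.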

But maybe I'm missing something about why these functions exist.

See also: #40500

jakobnissen commented Apr 7, 2025

From asking on Slack, I learned:

  • These functions are fairly popular according to code search, so they probably shouldn't just be deprecated. Furthermore, several structs unfortunately contain a .mark field, so the fallback definition can't just be removed.
  • For some implementations, it may be easier to implement mark etc. than position, since the latter requires a notion of absolute position, which may not be available.

Furthermore, @StefanKarpinski suggested an alternative interface, namely

mark(io) do inner
    # stuff
    reset(inner) # rewind
    # redo stuff
end

Here, inner contains the same stream of data as io, but may be of a different type that supports marking. E.g. it could just add a buffer and wrap io. When the do block exits, the mark is removed.
We could even provide a generic wrapper type in a package which new IO types could opt into using as inner to automatically get marking.
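
A rough sketch of how such a wrapper could look is below. Everything in it is hypothetical: `Replayable` and `with_mark` are made-up names, and `with_mark` is used instead of adding a method to `mark` so that the example does not extend a Base function for types it does not own. The wrapper records every byte it reads so that `reset` can replay them.

```julia
# Hypothetical wrapper that records the bytes read through it.
mutable struct Replayable{T<:IO} <: IO
    io::T
    seen::Vector{UInt8}   # bytes read from `io` since the wrapper was created
    pos::Int              # how many bytes of `seen` have been handed out
end

Base.eof(r::Replayable) = r.pos == length(r.seen) && eof(r.io)

function Base.read(r::Replayable, ::Type{UInt8})
    if r.pos == length(r.seen)
        push!(r.seen, read(r.io, UInt8))   # nothing left to replay, pull a new byte
    end
    r.pos += 1
    return r.seen[r.pos]
end

# Rewind to the point where the wrapper was created.
Base.reset(r::Replayable) = (r.pos = 0; nothing)

# Hypothetical stand-in for the suggested `mark(io) do inner ... end` form.
function with_mark(f, io::IO)
    inner = Replayable(io, UInt8[], 0)
    return f(inner)       # the recorded bytes are dropped when `inner` goes away
end

io = IOBuffer("abcdef")
result = with_mark(io) do inner
    a = String(read(inner, 3))
    reset(inner)                      # rewind
    b = String(read(inner, 3))        # redo
    (a, b)
end
@assert result == ("abc", "abc")
```

This toy never calls `position` or `seek` on the wrapped stream, so it would also work for streams that are not seekable. It ignores one real design question: bytes that were rewound but not re-read before the block exits are lost to the outer io.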

Therefore, rather than a straight-up deprecation, we could document these new methods, state that authors should implement all of them together, and recommend them over the default mark methods.
