Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistency of closed events #236

Closed
paullouisageneau opened this issue Mar 30, 2024 · 16 comments
Closed

Consistency of closed events #236

paullouisageneau opened this issue Mar 30, 2024 · 16 comments

Comments

@paullouisageneau
Copy link
Contributor

paullouisageneau commented Mar 30, 2024

Following the discussion in #223, it appears there is currently an inconsistency in the way closed states events are handled.

PeerConnectionWrapper has an explicit destroy() method which calls doResetCallbacks(), which means the onStateChange callback is triggered with closed state after calling close(). However, since callbacks are not reset, it creates memory leak issues similar to #223 when not using the API through the polyfill.

DataChannelWrapper does not have such a mechanism, it resets callabcks immediately in doClose() so onClosed is not triggered after close(). However, this does not create memory leaks.

MediaTrackWrapper acts like DataChannelWrapper, it resets callbacks immediately and onClosed is not triggered after close().

Is this expected behavior? Does it really make sense to have this mechanism in PeerConnectionWrapper while other places don't have it?

@murat-dogan
Copy link
Owner

After this comment #223 (comment) I thought the same thing.
Then I thought about why we had this and remembered this issue #87

I am totally OK with resetting all callbacks on close, but this will be a breaking change.
@pimterry do you have a comment here?

@pimterry
Copy link
Contributor

pimterry commented Apr 3, 2024

Hmm, I'm not sure I fully understand the details here (or remember the full context from when I opened #87) but in general I was running into issues where I assumed that closed would always fire on close (which does seem pretty reasonable) and I think this created inconsistent behaviour between local & remote closure that was quite inconvenient. It makes it difficult to handle closure in one single consistent place.

I'm mainly tracking this on entire connections, but to be honest I would've expected the same from channels & tracks too.

Is there no practical way to fire the closed event, and then clean up the callbacks on nextTick or similar? Of course they do need cleaning up, but I don't understand why it's not possible to run the close callback before doing so.

@murat-dogan
Copy link
Owner

Introducing the same procedures on channels is possible.
But the point that we came to is actually on Chrome if you call the close function you will not receive a callback. We want to follow this because it will make the procedure much easier and the specs say that.

Since this is a breaking change I want to also ask you because of that issue.
Also, we will not need to call the destroy function anymore.

@pimterry
Copy link
Contributor

pimterry commented Apr 3, 2024

But the point that we came to is actually on Chrome if you call the close function you will not receive a callback.

That's true, although I'd be very interested to know why the spec was written that way.

The opposite is true for datachannels in browsers though, which do always fire a close event when closed, both locally or remotely, and also always fire a close event if the overall connection itself is closed (even if that happens locally).

It's up to you of course, but I think node-datachannel has quite different use cases to browser environments and I wouldn't tie yourselves to the browser API design - I expect many users are in server-like environments, tracking many different WebRTC connections over longer time spans along with other state, and so there's much more value in being able to manage and clean up other resources when they close. Having a single reliable way to be informed when the resources are closed is very useful.

@murat-dogan
Copy link
Owner

I think for polyfills we should follow specs and for main part ,since this is a binding, we should follow libdatachannel.

@paullouisageneau
What is expected on libdatachannel for peer, channel and track if you call close function?

@paullouisageneau
Copy link
Contributor Author

paullouisageneau commented Apr 4, 2024

@murat-dogan libdatachannel always triggers the onClosed() callback after close() for data channels and tracks, and it also triggers the state callback with closed state for peer connections (which the browser WebRTC API doesn't do). So if we want to make it consistent, onClosed should be called after close() in node-datachannel too.

There might be a way to do that without the explicit destroy mechanism, by resetting the close callback itself when onClosed is called. I originally made the change myself to reset callbacks on close() because people complained that callbacks like onOpen() or onMessage() were triggered after close() (as callbacks might still be scheduled to run later on the event loop). Maybe we should only allow the onClosed callback after close.

@murat-dogan
Copy link
Owner

murat-dogan commented Apr 5, 2024

With this PR #239 I introduced a cleanup function, that will be called after calling JS cb.
With the help of this function, we can destroy resources & reset callbacks.

With this PR the procedure for peer connection, data channel, track and WebSocket will be like;

  • User calls close function (or peer side closed)
  • onClosed callback will be called
  • After returning onClosed callback, all resources and callbacks will be reset.

Also for polyfills, onClosed callback will be called and I think this will not be harmful.

@murat-dogan
Copy link
Owner

I am also investigating why we needed a destroy function.
I found also this #131

@paullouisageneau
Copy link
Contributor Author

@murat-dogan Great, I like your idea to add a cleanup function!

However, I'm afraid #239 might reintroduce the issue I originally fixed in #106.

The original issue was reported by @pimterry in #103: A peer connection emits a track, libdatachannel triggers onTrack, so the onTrack JS callback is scheduled later on the event loop. From a task currently on the event loop, the user calls close(). libdatachannel triggers onStateChange with state Closed, so the onStateChange JS callback is scheduled on the event loop after onTrack. As a result, the onTrack JS callback is first run on a seemingly closed peer connection, then only the onStateChange one, then the callbacks are reset.

I don't think this is desirable behavior. Maybe some specific callbacks like onLocalDescription, onLocalCandidate, onDataChannel, onTrack, and onOpen (for channels) should cancel if the object is closed?

@murat-dogan
Copy link
Owner

It will be beneficial to have a test case.
@pimterry is it achievable, like in test/jest-tests folder?

I pushed new commıts @paullouisageneau
Could you please also check?

@pimterry
Copy link
Contributor

pimterry commented Apr 9, 2024

@pimterry is it achievable, like in test/jest-tests folder?

Sorry, I don't really have any spare time to look at this right now! I've got quite a bit of work going on and I haven't touched the RTC bits for a little while. From the original issue tehre, it might be quite difficult to write a normal test for anyway, since there's some sort of internal race condition going on.

I think this kind of thing is probably best covered by some kind of stress test, that just opens & then closes a lots of connections in a loop for a while, with a couple of tracks & data channels on each (and calling some method on each one as it opens to check it's OK - e.g. just logging the channel label). Testing like that might be useful to flush out other issues too. In #104 I added assertions so I think most race condition cases should now immediately fail in that kind of environment if there are any issues.

@murat-dogan
Copy link
Owner

I added a multiple call here test/jest-tests/multiple-run.test.js on PR

@murat-dogan
Copy link
Owner

murat-dogan commented Apr 15, 2024

I have released new version with latest changes

https://github.com/murat-dogan/node-datachannel/releases/tag/v0.7.0

If you still think there is a problem please write here or we can close the issue

@paullouisageneau
Copy link
Contributor Author

@murat-dogan Sorry, I'm very busy at the moment, thank you for addressing the issue!

I'm only concerned about scenarios where cleanup is not done, for instance I think it's not called if the user does not set a closed callback.

Fixing this can be a bit tricky because doCleanup() must be called from the event loop, so possibly it requires scheduling an empty callback so the cleanup function is run.

@murat-dogan
Copy link
Owner

Thanks for the hint @paullouisageneau , good point.
I will check it.

@murat-dogan
Copy link
Owner

I will just released V0.9.0 with latest changes.
I guess now we can close this issue.
Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants