Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ICD]Trigger resubscription when receiving check-in and subscription has not yet become abnormal #37831

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Next Next commit
Trigger resubscription when receiving check-in and subscription has n…
…ot yet become abnoaml in client
  • Loading branch information
yunhanw-google committed Mar 11, 2025
commit 9f7bed765e255e4695b92937c8d02c85f436e4d9
13 changes: 13 additions & 0 deletions src/app/ReadClient.cpp
Original file line number Diff line number Diff line change
@@ -484,6 +484,7 @@ CHIP_ERROR ReadClient::GenerateDataVersionFilterList(DataVersionFilterIBs::Build

void ReadClient::OnActiveModeNotification()
{
// Note: this API only works when issuing subscription via SendAutoResubscribeRequest.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So hold on. When not doing auto-resubscribe, do we want to ignore this call, or still close ourselves even though it's fatal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the whole OnActiveModeNotification is designed for auto-resubscription stuff, if they are not using OnActiveModeNotification, it is really application consumer's responsibility to decide how to manipulate close and retry logic when receiving check-in message, and application consumer need implement their CheckInDelegate and not call InteracitonModelEngine::OnActiveModeNotification.

I think we should ignore this call when not using auto-resubscription.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just update code to reflect this change.

VerifyOrDie(mpImEngine->InActiveReadClientList(this));
// When we reach here, the subscription definitely exceeded the liveness timeout. Just continue the unfinished resubscription
// logic in `OnLivenessTimeoutCallback`.
@@ -493,6 +494,18 @@ void ReadClient::OnActiveModeNotification()
return;
}

// If the server sends out check-in message, and there is no reschedule subscription yet in client side at the same time, it means
// current client does not realize subscription has gone, and we should forcibly timeout current subscription, and schedule a new one.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the thing is: per spec, the check-in target and the thing whose subscription the server checks for might be different nodes. And in practice this is actually going to happen, I can guarantee that.

So it's entirely possible for this subscription to be just fine and yet for a check-in message to arrive because some other subscription from a different node is not fine.

I assume we're already surfacing check-in messages to the SDK API consumer? If so, they would need to retrigger subscriptions as needed. Or we need some way to configure this behavior, at the very least, because the right thing to do here depends on information this code does not have.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we have already surfacing check-in messages to the SDK API consumer, it is consumer's reponsibility to retrigger subscription as need, they can mimic OnActiveModeNotification behavior to configure their behavior.

Copy link
Contributor Author

@yunhanw-google yunhanw-google Mar 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the thing is: per spec, the check-in target and the thing whose subscription the server checks for might be different nodes. And in practice this is actually going to happen, I can guarantee that.

If subscription from node A is and good, register node for check-in is for nodeB, check-in message would never arrive at nodeA, right? since server send check-in message based upon registered node.
Unless we allow that multiple nodes reside in same client device?

if yes, maybe we need add one more check between ICDClientInfo::check_in_node against the node id extracted from readClient in InteractionModelEngine:: OnActiveModeNotification? it seems we cannot get mLocalNodeId from SecureSession?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just updated code to reflect the above change.

Copy link
Contributor

@bzbarsky-apple bzbarsky-apple Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If subscription from node A is and good, register node for check-in is for nodeB, check-in message would never arrive at nodeA, right?

Wrong scenario. The correct scenario is that node A is check-in target and has a subscription but that is NOT the subscription check-in is tracking. Node B is the one the check-in is tracking and that one is dead. Node A will receive the check-in, because Node B is not subscribed, and the right action is to prod Node B (or maybe Node C, depending) in the app, not tear down A's perfectly OK subscription.

Point being: ReadClient does not have the information to make this decision; the application does.

if (!mIsResubscriptionScheduled)
{
// Closing will ultimately trigger ScheduleResubscription with the aReestablishCASE argument set to true, effectively
// rendering the session defunct.
Close(CHIP_ERROR_TIMEOUT);
return;
}

// If the server sends a check-in message and a subscription is already scheduled, it indicates a client-side subscription timeout or failure.
// Cancel the scheduled subscription and initiate a new one immediately.
TriggerResubscribeIfScheduled("check-in message");
}