Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Subscription Resumption counter is not reset after subscription resumption was established successfully #37738

Closed
dvdm-qorvo opened this issue Feb 24, 2025 · 0 comments · Fixed by #37740
Labels
bug Something isn't working needs triage

Comments

@dvdm-qorvo
Copy link
Contributor

dvdm-qorvo commented Feb 24, 2025

Reproduction steps

Reproduction steps
Performed the test with QPG light switch device (enabled persisted subscriptions and subcription resumption timeout)

  1. Commission the device with chiptool and setup a subscription on switch events. (keepSubscription = false, is-urgent=true)
  2. Check that the reports are coming in as expected when the switch device is triggered
  3. Move the switch device out-or-range

[00:01:30.796][P][ZCL] ThreadDiagnosticsDelegate: OnConnectionStatusChanged
[00:01:30.802][P][ZCL] ThreadDiagnosticsDelegate: OnNetworkFaultChanged

  1. Trigger the switch, so the report can't be sent out after retries, and the subscription goes into subscription resumption mode

[00:01:35.812][P][ZCL] SwitchServer: OnMultiPressComplete
[00:01:35.819][P][EVL] Dropped 1 event from buffer with priority 1 and event number 0x000000000000000F due to overflow: event priority_level: 1
[00:01:35.859][P][EM] <<< [E:60143i S:14786 M:148604311] (S) Msg TX from 0000000000000021 to 2:00000000075BCD15 [4364] [UDP:[fd5b:46be:80af:1:5135:417b:6c5e:aa9e]:5541] --- Type 0001:05 (IM:ReportData) (B:82)
[00:01:35.861][P][EM] ??1 [E:60143i S:14786 M:148604311] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 2553ms from now [State:Idle II:500 AI:300 AT:4000]
[00:01:38.414][P][EM] <<1 [E:60143i S:14786 M:148604311] (S) Msg Retransmission to 2:00000000075BCD15
[00:01:38.415][P][EM] ??2 [E:60143i S:14786 M:148604311] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 2576ms from now [State:Idle II:500 AI:300 AT:4000]
[00:01:40.991][P][EM] <<2 [E:60143i S:14786 M:148604311] (S) Msg Retransmission to 2:00000000075BCD15
[00:01:40.992][P][EM] ??3 [E:60143i S:14786 M:148604311] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 2912ms from now [State:Idle II:500 AI:300 AT:4000]
[00:01:43.904][P][EM] <<3 [E:60143i S:14786 M:148604311] (S) Msg Retransmission to 2:00000000075BCD15
[00:01:43.905][P][EM] ??4 [E:60143i S:14786 M:148604311] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 3601ms from now [State:Idle II:500 AI:300 AT:4000]
[00:01:47.506][P][EM] <<4 [E:60143i S:14786 M:148604311] (S) Msg Retransmission to 2:00000000075BCD15
[00:01:47.507][P][EM] ??5 [E:60143i S:14786 M:148604311] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 4520ms from now [State:Idle II:500 AI:300 AT:4000]
[00:01:52.027][E][EM] <<5 [E:60143i S:14786 M:148604311] (S) Msg Retransmission to 2:00000000075BCD15 failure (max retries:4)
[00:03:37.397][P][SC] SecureSession[0x20013e60, LSID:14786]: State change 'kActive' --> 'kDefunct'
[00:03:37.397][E][DMG] Time out! failed to receive status response from Exchange: 60143i
[00:03:37.406][P][DMG] Subscription id 0x7fe5b2cd from node <00000000075BCD15, 2> torn down
[00:03:37.424][P][IM] Schedule subscription resumption when failing to establish session, Retries: 1

  1. After 5 minutes the device, tries again to resume the subscription:

[00:08:37.433][P][DIS] Resolving 6411EC2658B54364:00000000075BCD15 ...
[00:08:37.436][P][DIS] Lookup started for 6411EC2658B54364-00000000075BCD15
[00:08:37.633][P][DIS] Checking node lookup status for 6411EC2658B54364-00000000075BCD15 after 200 ms
[00:09:22.433][P][DIS] Checking node lookup status for 6411EC2658B54364-00000000075BCD15 after 45000 ms
[00:09:22.433][E][DIS] OperationalSessionSetup[2:00000000075BCD15]: operational discovery failed: 32
[00:09:22.442][E][DMG] Failed to establish CASE for subscription-resumption with error '32'
[00:09:22.461][P][IM] Schedule subscription resumption when failing to establish session, Retries: 2

  1. Now bring the device back in range

[00:09:29.101][P][DL] _OnPlatformEvent default: event->Type = 32779
[00:09:29.101][P][ZCL] ThreadDiagnosticsDelegate: OnConnectionStatusChanged
[00:09:29.114][P][DL] _OnPlatformEvent default: event->Type = 32769
[00:09:29.114][P][SVR] Joining Multicast groups

  1. After 10 minutes the device tries subscription resumption again and will now be successful:

[00:19:22.470][P][DIS] Resolving 6411EC2658B54364:00000000075BCD15 ...
[00:19:22.473][P][DIS] Lookup started for 6411EC2658B54364-00000000075BCD15
[00:19:22.670][P][DIS] Checking node lookup status for 6411EC2658B54364-00000000075BCD15 after 200 ms
[00:19:23.982][P][DIS] Node ID resolved for 6411EC2658B54364-00000000075BCD15
[00:19:23.983][P][DIS] UDP:[fec0:0:0:0:6aec:8aff:fe06:6ed3]:5541: new best score: 1 (for 6411EC2658B54364-00000000075BCD15)
[00:19:23.990][P][DIS] Checking node lookup status for 6411EC2658B54364-00000000075BCD15 after 1520 ms
[00:19:24.012][P][SC] Initiating session on local FabricIndex 2 from 0x0000000000000021 -> 0x00000000075BCD15
[00:19:24.045][P][EM] <<< [E:60144i S:0 M:163752279] (U) Msg TX from 86B183D129EFD486 to 0:0000000000000000 [0000] [UDP:[fec0:0:0:0:6aec:8aff:fe06:6ed3]:5541] --- Type 0000:30 (SecureChannel:CASE_Sigma1) (B:234)
[00:19:24.047][P][EM] ??1 [E:60144i S:0 M:163752279] (U) Msg Retransmission to 0:0000000000000000 scheduled for 2655ms from now [State:Idle II:500 AI:300 AT:4000]
[00:19:24.054][P][SC] Sent Sigma1 msg to <00000000075BCD15, 2> [II:7000ms AI:2500ms AT:5000ms]
[00:19:24.488][P][EM] >>> [E:60144i S:0 M:56111122 (Ack:163752279)] (U) Msg RX from 0:0000000000000000 [0000] to 86B183D129EFD486 --- Type 0000:33 (SecureChannel:CASE_Sigma2Resume) (B:100)
[00:19:24.490][P][EM] <<< [E:60144i S:0 M:163752280 (Ack:56111122)] (U) Msg TX from 86B183D129EFD486 to 0:0000000000000000 [0000] [UDP:[fd5b:46be:80af:1:5135:417b:6c5e:aa9e]:5541] --- Type 0000:10 (SecureChannel:StandaloneAck) (B:26)
[00:19:24.557][P][EM] <<< [E:60144i S:0 M:163752281 (Ack:56111122)] (U) Msg TX from 86B183D129EFD486 to 0:0000000000000000 [0000] [UDP:[fd5b:46be:80af:1:5135:417b:6c5e:aa9e]:5541] --- Type 0000:40 (SecureChannel:StatusReport) (B:34)
[00:19:24.559][P][EM] ??1 [E:60144i S:0 M:163752281] (U) Msg Retransmission to 0:0000000000000000 scheduled for 2355ms from now [State:Active II:500 AI:300 AT:4000]
[00:19:24.581][P][SC] SecureSession[0x20013d88, LSID:14789]: State change 'kEstablishing' --> 'kActive'
[00:19:24.594][P][DMG] Registered a ReadHandler that will schedule a report between system Timestamp: 0x000000000011C532 and system Timestamp 0x00000000002D3C72.
[00:19:24.741][E][DL] Long dispatch time: 253 ms, for event type 3
[00:19:24.742][P][DL] _OnPlatformEvent default: event->Type = 32792
[00:19:24.781][P][EM] <<< [E:60145i S:14789 M:187656167] (S) Msg TX from 0000000000000021 to 2:00000000075BCD15 [4364] [UDP:[fd5b:46be:80af:1:5135:417b:6c5e:aa9e]:5541] --- Type 0001:05 (IM:ReportData) (B:440)
[00:19:24.784][P][EM] ??1 [E:60145i S:14789 M:187656167] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 2384ms from now [State:Active II:500 AI:300 AT:4000]
[00:19:24.983][P][EM] >>> [E:60144i S:0 M:56111123 (Ack:163752281)] (U) Msg RX from 0:0000000000000000 [0000] to 86B183D129EFD486 --- Type 0000:10 (SecureChannel:StandaloneAck) (B:26)
[00:19:25.485][P][EM] >>> [E:60145i S:14789 M:257009999 (Ack:187656167)] (S) Msg RX from 2:00000000075BCD15 [4364] to 0000000000000021 --- Type 0001:01 (IM:StatusResponse) (B:42)
[00:19:25.486][P][IM] Received status response, status is 0x00

  1. Now bring the device again out-of-range and trigger the switch event again

[00:19:41.924][P][ZCL] ThreadDiagnosticsDelegate: OnConnectionStatusChanged
[00:19:41.932][P][EVL] Dropped 1 event from buffer with priority 1 and event number 0x0000000000000020 due to overflow: event priority_level: 1
[00:19:41.938][P][ZCL] ThreadDiagnosticsDelegate: OnNetworkFaultChanged
[00:19:41.951][P][EVL] Dropped 1 event from buffer with priority 1 and event number 0x0000000000000021 due to overflow: event priority_level: 1
[00:19:41.959][P][EVL] Dropped 1 event from buffer with priority 1 and event number 0x0000000000000022 due to overflow: event priority_level: 1
[00:19:41.976][P][DL] _OnPlatformEvent default: event->Type = 32769
[00:19:43.458][P][ZCL] SwitchServer: OnInitialPress
[00:19:43.460][P][EVL] Dropped 1 event from buffer with priority 1 and event number 0x0000000000000023 due to overflow: event priority_level: 1
[00:19:43.559][P][ZCL] SwitchServer: OnShortRelease
[00:19:44.058][P][-] GenericSwitchMultiPressComplete (1)
[00:19:44.058][P][ZCL] SwitchServer: OnMultiPressComplete
[00:19:44.065][P][EVL] Dropped 1 event from buffer with priority 1 and event number 0x0000000000000024 due to overflow: event priority_level: 1
[00:19:44.071][P][EVL] Dropped 1 event from buffer with priority 1 and event number 0x0000000000000025 due to overflow: event priority_level: 1
[00:19:44.117][P][EM] <<< [E:60150i S:14789 M:187656177] (S) Msg TX from 0000000000000021 to 2:00000000075BCD15 [4364] [UDP:[fd5b:46be:80af:1:5135:417b:6c5e:aa9e]:5541] --- Type 0001:05 (IM:ReportData) (B:82)
[00:19:44.119][P][EM] ??1 [E:60150i S:14789 M:187656177] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 2604ms from now [State:Idle II:500 AI:300 AT:4000]
[00:19:46.723][P][EM] <<1 [E:60150i S:14789 M:187656177] (S) Msg Retransmission to 2:00000000075BCD15
[00:19:46.724][P][EM] ??2 [E:60150i S:14789 M:187656177] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 2601ms from now [State:Idle II:500 AI:300 AT:4000]
[00:19:49.325][P][EM] <<2 [E:60150i S:14789 M:187656177] (S) Msg Retransmission to 2:00000000075BCD15
[00:19:49.326][P][EM] ??3 [E:60150i S:14789 M:187656177] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 2898ms from now [State:Idle II:500 AI:300 AT:4000]
[00:19:51.315][P][-] Battery: FULL
[00:19:51.315][P][-] Battery: 3.280V, 100(%)
[00:19:52.224][P][EM] <<3 [E:60150i S:14789 M:187656177] (S) Msg Retransmission to 2:00000000075BCD15
[00:19:52.225][P][EM] ??4 [E:60150i S:14789 M:187656177] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 3508ms from now [State:Idle II:500 AI:300 AT:4000]
[00:19:55.733][P][EM] <<4 [E:60150i S:14789 M:187656177] (S) Msg Retransmission to 2:00000000075BCD15
[00:19:55.734][P][EM] ??5 [E:60150i S:14789 M:187656177] (S) Msg Retransmission to 2:00000000075BCD15 scheduled for 4260ms from now [State:Idle II:500 AI:300 AT:4000]
[00:19:59.994][E][EM] <<5 [E:60150i S:14789 M:187656177] (S) Msg Retransmission to 2:00000000075BCD15 failure (max retries:4)
[00:21:45.655][P][SC] SecureSession[0x20013d88, LSID:14789]: State change 'kActive' --> 'kDefunct'
[00:21:45.655][E][DMG] Time out! failed to receive status response from Exchange: 60150i
[00:21:45.664][P][DMG] Subscription id 0x7fe5b2cd from node <00000000075BCD15, 2> torn down
[00:21:45.682][P][IM] Schedule subscription resumption when failing to establish session, Retries: 3

  1. You can see that the number of retries is not reset after the successful subscription resumption. This can potentially lead to very high subscription timeouts from the start.

Logs:

subscription_resumption_counter_not_reset.txt

Bug prevalence

Always

GitHub hash of the SDK that was being used

master

Platform

other

Platform Version(s)

No response

Anything else?

No response

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs triage
Projects
None yet
1 participant