Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.2] Command Handler and SetGroupRequest() #32030

Open
jonsmirl opened this issue Feb 9, 2024 · 20 comments
Open

[1.2] Command Handler and SetGroupRequest() #32030

jonsmirl opened this issue Feb 9, 2024 · 20 comments

Comments

@jonsmirl
Copy link
Contributor

jonsmirl commented Feb 9, 2024

Reproduction steps

1ada682

@kkasperczyk-no A bridge I am working on makes CommandHandler objects outside of CHIP and then submits them into CHIP. In order to control the processing of the command objects I need to be able to use your new SetGroupRequest() which is marked Private, could you please set it to be Public?

Bug prevalence

always

GitHub hash of the SDK that was being used

d38a649

Platform

core

Platform Version(s)

1.2

Type

Core SDK Interopability Issue

Anything else?

No response

@jonsmirl jonsmirl added bug Something isn't working needs triage labels Feb 9, 2024
@tcarmelveilleux
Copy link
Contributor

@jonsmirl The SetGroupRequest is only valid when called from ProcessInvokeRequest, based on the incoming exchange context. Is there a reason it's being bypassed? If this is made public, it could be called at times where the other necessary state invariants are not managed properly, since the SetGroupRequest state is actually just caching to avoid needing to cross too many layers after init.

@tcarmelveilleux tcarmelveilleux added feature request and removed bug Something isn't working labels Feb 13, 2024
@jonsmirl
Copy link
Contributor Author

CHIP does not implement loopback addresses. So when I send group commands which need to utilize the state engines in app/cluster I also have to execute the same command locally using emberAf... That results in code like this all over my app. Since these are locally generated commands, I need to mark them Group to prevent CHIP from trying to send a reply over a non-existent session.

    LevelControl::Commands::MoveWithOnOff::DecodableType commandDataDT;
    commandDataDT.moveMode = (position == UP ? chip::app::Clusters::LevelControl::MoveModeEnum::kUp : chip::app::Clusters::LevelControl::MoveModeEnum::kDown);
    commandDataDT.rate.SetNonNull(MOVE_RATE);
    app::ConcreteCommandPath commandPath = {epLightID, LevelControl::Id, LevelControl::Commands::MoveWithOnOff::Id};
    app::CommandHandler commandObj(nullptr);
    commandObj.SetGroupRequest(true);

    LevelControl::Commands::MoveWithOnOff::Type commandDataT;
    commandDataT.moveMode = (position == UP ? chip::app::Clusters::LevelControl::MoveModeEnum::kUp : chip::app::Clusters::LevelControl::MoveModeEnum::kDown);
    commandDataT.rate.SetNonNull(MOVE_RATE);
    commandDataT.optionsMask = static_cast<chip::BitMask<chip::app::Clusters::LevelControl::LevelControlOptions>>(0U);
    commandDataT.optionsOverride = static_cast<chip::BitMask<chip::app::Clusters::LevelControl::LevelControlOptions>>(0U);

    client::command_handle_t cmd_handle;
    uint16_t local_endpoint_id = epLightID;
    cmd_handle.cluster_id = LevelControl::Id;
    cmd_handle.command_id = LevelControl::Commands::MoveWithOnOff::Id;
    cmd_handle.command_data = &commandDataT;

    // Press moves Position from 0 (idle) to 1 (press)
    lock::chip_stack_lock(portMAX_DELAY);
    emberAfLevelControlClusterMoveWithOnOffCallback(&commandObj, commandPath, commandDataDT);
    chip::app::Clusters::Switch::Attributes::CurrentPosition::Set(epLightID, position);
    // LongPress event takes newPosition as event data
    switch_cluster::event::send_long_press(epLightID, position);

    client::cluster_update(local_endpoint_id, &cmd_handle);
    lock::chip_stack_unlock();

@jonsmirl
Copy link
Contributor Author

@tcarmelveilleux Is there another way to implement loopback on group commands? The problem is those state engines in app/cluster, they have to be executed so that everything stays in sync. In general it would be very useful for all bindings to support loopback if you ask for it with a loopback address in the binding.

I have brought this issue up many times before:
#11071
#21626
#21633

@kkasperczyk-no
Copy link
Contributor

If this is made public, it could be called at times where the other necessary state invariants are not managed properly, since the SetGroupRequest state is actually just caching to avoid needing to cross too many layers after init.

I agree with Tennessee, making it public does not seem to be safe. This means you could change the
mGroupRequest at any time from outside of the module, so the process of processing command could be disturbed.

@jonsmirl
Copy link
Contributor Author

To explain the problem clearly: Consider a light dimmer which implements both light switch and light bulb internally. Now group it with another external light bulb into a two bulb group. Flick the switch -- the light switch sends a group command. CHIP does not implement loopback so only the remote bulb gets the group command.

What do you do about the local bulb? you can't just totally bypass CHIP (I tried doing that) because of the state machines inside app/clusters - things like MOVE and Colorloop. The only way to make this work is to call emberAf... on the local machine which triggers the state machine. And the only way to be able to call emberAf... is to set the Group flag. If the Group flag is not set, it tries to respond on a non-existent session.

@jonsmirl
Copy link
Contributor Author

Of course if CHIP allowed loopback, then I wouldn't need to call emberAf... and I wouldn't need access to the Group flag.

@jonsmirl
Copy link
Contributor Author

If you don't want to expose SetGroup() could we flip the default case and have a SetDirect() flag instead? If the default case for emberAf... is to not send a response that works for me and I don't need to change the flag.

This is where I have a problem, AddStatus() has to do the initial return and not go into FallibleAddStatus() which will fail 100% since I don't have a session.

void CommandHandler::AddStatus(const ConcreteCommandPath & aCommandPath, const Protocols::InteractionModel::Status aStatus,
                               const char * context)
{
    // Return early in case of requests targeted to a group, since they should not add a response.
    VerifyOrReturn(!IsGroupRequest());
    VerifyOrDie(FallibleAddStatus(aCommandPath, aStatus, context) == CHIP_NO_ERROR);
}

@jonsmirl
Copy link
Contributor Author

Alternatively you can eliminate the Group flag altogether by checking if there is a session attached to the command. No session == Group, Session == Direct.

@kkasperczyk-no
Copy link
Contributor

@bzbarsky-apple or @tehampson could you comment on that?

@kkasperczyk-no
Copy link
Contributor

kkasperczyk-no commented Feb 16, 2024

@jonsmirl I think I'm starting to understand your problem. Some time ago I developed bridge application and I tried to reproduce your scenario, indeed it doesn't work for me. This seems to be a serious problem for the bridge use case.

To sum up, let's assume I have a mixed network of Matter and non-Matter light bulbs + non-Matter light switch:
image

I'm able to control Matter light bulb using non-Matter light switch, but I can't control non-Matter light bulb with non-Matter light switch (at least using Matter), because from the bridge perspective it looks like sending message to itself.

@kkasperczyk-no
Copy link
Contributor

But I can see it fails on the IP address resolution level, so it could be that it's an issues with mDNS not the IM.

@jonsmirl
Copy link
Contributor Author

Simpler case, assume both the switch and the bulb are on the same bridge.

Now you are getting it, bridges HAVE to do loopback and CHIP doesn't allow it. I am using command objects to construct that loopback internally in my app.

@Damian-Nordic
Copy link
Contributor

Damian-Nordic commented Feb 16, 2024

@jonsmirl Do you think your issue could be solved at OS level using IP_MULTICAST_LOOP (https://man7.org/linux/man-pages/man7/ip.7.html)? I mean we might still want to implement multicast loop at Matter level if the system support for IP_MULTICAST_LOOP is poor but just want to confirm that we're on the same page.

@jonsmirl
Copy link
Contributor Author

You can do loopback direct bindings too.

@jonsmirl
Copy link
Contributor Author

I also seem to recall trying IP_MULTICAST_LOOP about a year ago and CHIP rejected the packets. But my memory of testing it is vague.

@Damian-Nordic
Copy link
Contributor

Well, I obviously haven't tested that so I'm not sure what else needs fixing but our unit tests use loopback endpoints to test higher layers so in theory it should be possible to implement the multicast loop at lower layers than IM (networking stack in the best case, or Matter's UDPEndpoint as a last resort).
While this might be suboptimal in some cases (for unicast bindings you would need CASE even though the traffic doesn't go outside the device), it might simplify the protocol design as well. So it's just an idea to consider :)

@jonsmirl
Copy link
Contributor Author

jonsmirl commented Feb 16, 2024

I implemented my command loopback code over a year ago and it has been working fine, the recent change to SetGroup() broke it (you hid my ability to set the group flag). I will try turning on IP_MULTICAST_LOOP again and see if it works now.

@tehampson
Copy link
Contributor

Hi @jonsmirl ,

Would you be able to have a call Tuesday next week (I am OOO today and Monday) to chat about what it is you are trying to do? I am planning a little bit of a refactor for CommandHandler so that it is less reliant on an exchange context to improve it's unit testablbility, so it would be very beneficial for me to understand what it is that you are trying to do as perhaps what I am thinking doing in that refactor I might be able to address your concern if I understood it some more.

@jonsmirl
Copy link
Contributor Author

There is another wrinkle.... My light switches implement both Dimmable Light and Dimmer Switch. They are done that way so that they can be bound into groups. For example, a switch at the top and bottom of a staircase which have to stay synchronized.

I have used this method of calling emberAf.... to internally bind the Dimmable Light and Dimmer Switch without being commissioned. I can't just wire the button to the bulb because of the state machines in app/cluster. When you press a button on my dimmer it sends a MOVE and when you let go of it, it sends a STOP. The core problem is the MOVE command. When the switch triggers a MOVE all of the light bulbs need to be running the exact same state machine so that they all dim together.

So how do I make the light switches work out-of-the-box? If I use a LOOPBACK group binding, then they have to be commissioned via my app (which would make a one-node group) before they will work. Users aren't going to like that; currently you can install the dimmer and use it with Google/Apple while ignoring my app. This alternative method of calling emberAf... is much cleaner.

I guess I could make a copy of the lighting state machine inside my app and run it when in the uncommissioned state, but then I am carrying around two copies of the same code. Things would be far simpler if I can make emberAf... calls from my app.

I see two simple options which would let me continue calling emberAf.... 1) Get rid of the Group flag altogether. If the command object has a session, it is unicast, if it doesn't it is group. 2) Flip the flag - make Group the default call and then flag Unicast.

The general problem is that I a need to be a way to run those state machines in app/cluster/* locally without involving the rest of CHIP.

I work from home so I am always around, you can contact me directly at jonsmirl@gmail.com

@tehampson
Copy link
Contributor

Sorry for the delay, but now that CommandHandler refactor #32467 has occurred, if you provide a CommandHandler that you create the various AddStatus should be a no-op now, thus eliminating the need for SetGroupRequest() workaround

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants