MAC Agent gets unhealthy on adding Elastic Defend integration and endpoint service fails to install. #4302

Closed
amolnater-qasource opened this issue Feb 21, 2024 · 16 comments
Labels
bug Something isn't working impact:critical Immediate priority; high value or cost to the product. QA:Validated Validated by the QA Team Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments

@amolnater-qasource

Kibana Build details:

VERSION: 8.13.0
BUILD: 71738
COMMIT: b036a9705a55f6c81d065011ad8c991cbc3101d9
Artifact Link: https://staging.elastic.co/8.13.0-4304292e/downloads/beats/elastic-agent/elastic-agent-8.13.0-darwin-x86_64.tar.gz

Host OS: MAC

Preconditions:

  1. 8.13.0-BC1 Kibana cloud environment should be available.

Steps to reproduce:

  1. Install MAC agent with Elastic Defend.
  2. Observe that the agent becomes unhealthy with the Elastic Defend integration and that the logs show the endpoint service failing to install.

Screenshot:
image

image

Expected Result:
Agent should remain healthy on adding Elastic Defend integration and should install endpoint service successfully.

Logs:
elastic-agent-diagnostics-2024-02-21T05-58-10Z-00.zip

What's working fine:

  • No issue is observed on Windows and Linux agent with Elastic Defend integration.
    image
@amolnater-qasource amolnater-qasource added bug Something isn't working Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team impact:critical Immediate priority; high value or cost to the product. labels Feb 21, 2024
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@amolnater-qasource
Author

@manishgupta-qasource Please review.

@manishgupta-qasource

Secondary review for this ticket is Done

@cmacknz
Member

cmacknz commented Feb 21, 2024

I see we fail running the endpoint command:

{"log.level":"info","@timestamp":"2024-02-21T05:38:28.866Z","log.logger":"component.runtime.endpoint-default.service_runtime","log.origin":{"file.name":"runtime/service.go","file.line":315},"message":"after check if endpoint service is installed, err: exit status 135","ecs.version":"1.6.0"}

err = s.check(ctx)
s.log.Infof("after check if %s service is installed, err: %v", name, err)

Where the check step is implemented as the verify command:

operations: &operations
  check:
    args:
      - "verify"
      - "--log"
      - "stderr"

@nfritts what would cause endpoint-security verify --log stderr to exit with status 135 immediately on an x86 Mac? There are no endpoint logs to look at here because we were never able to install it.

@nfritts

nfritts commented Feb 26, 2024

I don't have time to look further, but I'm pretty sure 135 is:

    /// @brief Process is running in emulation on non-native CPU architecture
    Error_RunningInEmulation = -135,

Any chance it's running the ARM binary for some reason? Or running the x86 binaries on an ARM Mac?

@amolnater-qasource
Author

Hi @nfritts

Thank you for looking into this.
Yes, we were installing the x86_64 agent on an ARM Mac.

Screenshot:
image

However, we never had any issues up to 8.12.2 when installing the x86_64 agent on an ARM Mac.
Please let us know if it is now expected that the x86_64 agent cannot be installed on an ARM Mac.

8.12.2-(Working fine):
elastic-agent-diagnostics-2024-03-04T09-33-23Z-00.zip

8.13.0 Logs-Issue:
elastic-agent-diagnostics-2024-03-04T06-29-10Z-00.zip

Further, we confirm that with the aarch64 artifact the issue is not reproducible and the host is visible under the Endpoints tab.

Build details:
VERSION: 8.13.0 BC3
BUILD: 71857
COMMIT: 82f46148c91eec93ac7382147936028db2eb8883
Artifact Link: https://staging.elastic.co/8.13.0-c16d8e3a/summary-8.13.0.html#elastic-agent-package

Thanks!

@nfritts

nfritts commented Mar 4, 2024

We need to figure out if this is a bug or not.

For 8.13, we added code to detect whether we are running an x86 binary on ARM, and updated the package spec to prevent doing so on Windows.

Mac is slightly different since we build a Mac binary that can run on both and ship it in both versions of Agent. This might be an unexpected "bug" that is part of that change, but honestly it might be a good thing. Running the x86 endpoint is going to be less performant on ARM processors, and we may want to avoid it.

@ricardoungureanu what do you think? Do we want the endpoint to run in the native CPU architecture?

@nfritts

nfritts commented Mar 4, 2024

Confirmed.

This is expected behavior; it is new in 8.13.

@amolnater-qasource
Author

Thank you for the confirmation @nfritts

As MAC-ARM64 artifacts are working fine on an ARM machine, we are closing this issue and marking it as QA:Validated.

Thanks!!

@amolnater-qasource amolnater-qasource added the QA:Validated Validated by the QA Team label Mar 5, 2024
@EchoGangster

Hello,

I am trying to understand what the fix was for this issue. I did a fresh 8.13.3 install from scratch. Any time I add the Elastic Defend integration on an Apple machine with ARM architecture, I get the same error as above.
Screenshot 2024-05-06 at 12 12 11

@cmacknz
Member

cmacknz commented May 6, 2024

Exit status 135 means endpoint is running on an emulated platform (for example, an x86 binary on an ARM processor). Detection of this condition was added in 8.13, and I don't believe endpoint supports this mode of operation anymore.

CC @nfritts

@EchoGangster

I installed the Darwin agent. I took a look at the elastic-agent.spec.yml file and it doesn't mention Apple in the file. Not sure if that is an issue. Might try and pull down an earlier version to see what the parameters are.

@cmacknz
Member

cmacknz commented May 6, 2024

The usual cause for this error would be running the darwin-x86 executable on an M1/M2/M3 ARM Mac or running the darwin-aarch64 executable on an x86 Mac.
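
To check which situation applies on a given Mac, one option is the `sysctl.proc_translated` key, which reports whether a process is being translated by Rosetta 2. The Go sketch below is an illustration only and is not how endpoint performs its own detection. The function names are hypothetical, and it assumes the child `sysctl` process inherits the translation state of its parent.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"strings"
)

// parseProcTranslated interprets the output of
// `sysctl -n sysctl.proc_translated`: "1" means translated by Rosetta 2,
// "0" means running natively. Hypothetical helper for illustration.
func parseProcTranslated(out string) (bool, error) {
	switch strings.TrimSpace(out) {
	case "1":
		return true, nil
	case "0":
		return false, nil
	default:
		return false, fmt.Errorf("unexpected sysctl output %q", out)
	}
}

// runningUnderRosetta reports whether this process appears to be translated.
// Any failure (non-macOS host, missing key, unparsable output) is treated
// as "not translated".
func runningUnderRosetta() bool {
	if runtime.GOOS != "darwin" {
		return false
	}
	out, err := exec.Command("sysctl", "-n", "sysctl.proc_translated").Output()
	if err != nil {
		return false
	}
	translated, err := parseProcTranslated(string(out))
	return err == nil && translated
}

func main() {
	// GOARCH is the architecture the binary was built for, not the hardware.
	fmt.Printf("built for %s, translated=%v\n", runtime.GOARCH, runningUnderRosetta())
}
```

A binary built for `amd64` that reports `translated=true` matches the first case described above: an x86 executable on an ARM Mac.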

@ZainULArfeen

image
image

I have the same problem. Is there a fix available for this, or do I have to downgrade to 8.12?

@EchoGangster

I fixed my issue by uninstalling the macOS agent that the Fleet command had installed and downloading the ARM agent from Elastic. Then, to install the downloaded ARM agent, I used the same commands from the Fleet install.

I do think Elastic should offer the ARM agent as an option for Apple ARM machines when building the Fleet-managed install.
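
As a rough illustration of the workaround, the Go sketch below picks a macOS artifact name from the CPU architecture. The x86_64 file name follows the artifact link at the top of this issue. The aarch64 file name is assumed by analogy, and the function name `darwinArtifact` is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// darwinArtifact maps a Go architecture name to the matching macOS agent
// archive name. The naming pattern is inferred from the x86_64 artifact
// linked in this issue; the aarch64 variant is an assumption.
func darwinArtifact(version, goarch string) (string, error) {
	var arch string
	switch goarch {
	case "amd64":
		arch = "x86_64"
	case "arm64":
		arch = "aarch64"
	default:
		return "", fmt.Errorf("unsupported architecture %q", goarch)
	}
	return fmt.Sprintf("elastic-agent-%s-darwin-%s.tar.gz", version, arch), nil
}

func main() {
	name, err := darwinArtifact("8.13.0", runtime.GOARCH)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```

Note that `runtime.GOARCH` reflects how this helper itself was compiled, so an `amd64` build running under Rosetta on an ARM Mac would still select the x86_64 archive. Check the hardware with `uname -m` from a native shell when in doubt.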

@ZainULArfeen

@EchoGangster thanks bud, it worked

7 participants