Change the default gRPC port to 0 when in a container #6585
Conversation
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
Looks good.
The day sockets/npipe can be used for all installation types will be a glorious day!
LGTM 👍 this change will help a lot with k8s deployments
* Override the container command's gRPC port to 0 by default.
* Test that two containers don't have port collisions on the same host.
* Move container override next to regular defaults.
* Add changelog.
* Fix application.New call in unit test.
* Silence lint warning.

(cherry picked from commit a61ad8c)

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
What does this PR do?
Changes the default gRPC port to 0 when the agent runs in a container. At least one user's Elastic Agent rollout is experiencing collisions with another application on the same port (surely nobody else would ever pick 6789 as a default port...), and I thought this would be straightforward to fix quickly. I was wrong: I expected ~2 hours and spent 3 days. The relevant locations:
- `elastic-agent/_meta/config/elastic-agent.docker.yml.tmpl`, lines 88 to 89 in 9b8a25f
- `elastic-agent/deploy/helm/elastic-agent/templates/agent/k8s/_secret.tpl`, line 5 in 9b8a25f
- The `run()` function that is the entrypoint of the agent already has a hook for overriding configuration: `elastic-agent/internal/pkg/agent/cmd/run.go`, lines 426 to 428 in 4199196
- `elastic-agent/internal/pkg/agent/cmd/container.go`, line 769 in 4199196
- `elastic-agent/internal/pkg/agent/cmd/run.go`, line 415 in 9b8a25f
- `elastic-agent/internal/pkg/agent/application/application.go`, line 99 in 4199196
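To make the shape of the change concrete, here is a minimal sketch of the idea behind the container override. The `Config` type, field names, and `applyContainerOverrides` function are all illustrative, not the actual elastic-agent types; the real change hooks into the configuration override point in `run()` mentioned above.

```go
package main

import "fmt"

// Config loosely mirrors the agent configuration relevant here; the
// field name is illustrative, not the real elastic-agent config type.
type Config struct {
	GRPCPort uint16
}

// applyContainerOverrides sketches the PR's behavior: when running in
// container mode, default the gRPC port to 0 so the OS picks a free
// ephemeral port, unless the user explicitly configured a port.
func applyContainerOverrides(cfg *Config, inContainer, userSetPort bool) {
	if inContainer && !userSetPort {
		cfg.GRPCPort = 0 // port 0 = let the kernel choose a free port
	}
}

func main() {
	cfg := &Config{GRPCPort: 6789} // the old fixed default
	applyContainerOverrides(cfg, true, false)
	fmt.Println(cfg.GRPCPort)
}
```

The key design point is that the override only replaces the *default*: an explicitly configured port is left untouched, so existing deployments that pinned a port keep working.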
Note: We should eventually stop using local TCP at all for the control protocol between sub-processes and switch to unix sockets / named pipes. The capability for this was added in #4249, but it was left disabled because endpoint-security doesn't support it yet (the upstream gRPC C++ client doesn't support it on Windows). We have now removed endpoint-security from our containers, which would allow us to switch to unix sockets there, but this change requires an elastic-agent-client package update and we need to test that every client we have has it first. That was more testing effort than I wanted to take on now, but I will create a follow-up issue to do this.
Why is it important?
This automatically avoids port collisions between Elastic Agents running with `hostNetwork: true` on Kubernetes (as our DaemonSet does by default) and other applications, or other Elastic Agents, on the same host.

Disruptive User Impact
None. But in case I'm wrong about this, I made it possible to choose a specific port with an environment variable.
How to test this PR locally