-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add stop functionality for output config changes #34049
Add stop functionality for output config changes #34049
Conversation
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
Does it eventually recover from this? Or does it just stay in the failed state? |
x-pack/libbeat/management/config.go
Outdated
Blacklist ConfigBlacklistSettings `config:"blacklist" yaml:"blacklist"` | ||
Enabled bool `config:"enabled" yaml:"enabled"` | ||
Blacklist ConfigBlacklistSettings `config:"blacklist" yaml:"blacklist"` | ||
OutputRestart bool `config:"output_restart" yaml:"output_restart"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OutputRestart bool `config:"output_restart" yaml:"output_restart"` | |
RestartOnOutputChange bool `config:"restart_on_output_change" yaml:"output_restart"` |
nit: Let's use the exact same name that was used in the 8.5 spec files so it is clear we just changed the implementation when comparing the two: https://github.com/elastic/elastic-agent/blob/f3d39629cc740ce7cf868a228da9e2dec1a83dec/internal/spec/filebeat.yml#L12
@@ -338,6 +346,16 @@ func (cm *BeatV2Manager) handleOutputReload(unit *client.Unit) { | |||
} | |||
cm.logger.Debugf("Got Output unit config '%s'", rawConfig.GetId()) | |||
|
|||
// if needed, stop the beat, let agent restart |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are sure this only triggers when the actual output configuration changes? And it won't restart for changes in the unit that are not the actual output configuration, like the log level? https://github.com/elastic/elastic-agent-client/blob/15881a8e64ef95f32f688c933319cd926ff500b0/elastic-agent-client.proto#L280
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, didn't think of that! Should probably add some kind of check to diff the actual config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I believe we should implement the change suggested by @cmacknz, I believe it also makes the overall code a bit more readable.
Aside that I added minor suggestions.
@@ -87,12 +91,16 @@ func NewV2AgentManager(config *conf.C, registry *reload.Registry, _ uuid.UUID) ( | |||
// NewV2AgentManagerWithClient actually creates the manager instance used by the rest of the beats. | |||
func NewV2AgentManagerWithClient(config *Config, registry *reload.Registry, agentClient client.V2) (lbmanagement.Manager, error) { | |||
log := logp.NewLogger(lbmanagement.DebugK) | |||
if config.OutputRestart { | |||
log.Infof("Output reload is enabled, the beat will restart as needed on change of output config") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: No need to use Infof
here, just Info
as you're not formatting anything.
log.Infof("Output reload is enabled, the beat will restart as needed on change of output config") | |
log.Info("Output reload is enabled, the beat will restart as needed on change of output config") |
@@ -338,6 +346,16 @@ func (cm *BeatV2Manager) handleOutputReload(unit *client.Unit) { | |||
} | |||
cm.logger.Debugf("Got Output unit config '%s'", rawConfig.GetId()) | |||
|
|||
// if needed, stop the beat, let agent restart | |||
if cm.stopOnOutputReload && cm.outputIsConfigured { | |||
cm.logger.Infof("beat will now stop for output reload") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same thing here
cm.logger.Infof("beat will now stop for output reload") | |
cm.logger.Info("beat will now stop for output reload") |
Kudos, SonarCloud Quality Gate passed! |
for anyone reading: holding off on this until #34066 is merged. |
Spoke with Alex about this. The FAILED state lasts a second or two while the agent detects the process is stopped and restarts. If this happens right before the next checkin with fleet server we may have the agent reported unhealthy unnecessarily, both in that the behaviour causing the failed state is intentional and that the status won't clear until the next checkin up to 5 minutes later. To avoid this we need to debounce the process state locally as described in elastic/elastic-agent#1946 |
These changes have been merged into #34066, closing. |
What does this PR do?
Fixes elastic/elastic-agent#1913
This adds a reloader component, set by a
output_restart
config flag, that will stop the beat when it receives a config change to the output.This will require some other component changes before it works properly
FAILED
.-E management.output_restart=true
to the beat specfile.Testing this is easy. Just add the aforementioned line to the specfile, run the beat, then change the config. If there's a change to the output, the beats will shut down, agent will restart with the updated config.
Why is it important?
We still want output changes to force a reload of the beat.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.