[windows] capture early start-up errors #4627
Comments
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
For Windows, Lee suggested the best way to do this would be to have agent write to the Windows Event Log.
Does this run the risk of us creating a logging loop? Agent logs to the Event Log, Winlogbeat reads the Event Log, Winlogbeat crashes, Agent logs the crash to the Event Log, Winlogbeat reads the crash, and so on.
We could narrow the scope of when we use the Event Log to only the period before we have our JSON logger set up and know the location those log files should be written to on disk. This covers a point in time where no subprocesses (like Winlogbeat) are running, so it wouldn't run the risk of a logging loop. Winlogbeat could read a record of previous agent crashes, but by the time Winlogbeat is running, agent is no longer writing to the Event Log. We have similar problems with the monitoring filestream instance, which needs some special handling and processors. I don't think we want to deal with any of that for regular uses of Winlogbeat.
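The scoping idea above can be sketched in Go as a small gate that forwards messages to a fallback sink only until the regular logger is ready. This is a minimal illustration, not the agent's actual code: `EarlySink`, `memorySink`, `StartupReporter`, and `LoggerReady` are hypothetical names, and the in-memory sink stands in for a real Event Log writer (on Windows that might be something like the `golang.org/x/sys/windows/svc/eventlog` package) so the sketch runs on any OS.

```go
package main

import (
	"fmt"
	"sync"
)

// EarlySink is a hypothetical stand-in for a Windows Event Log writer.
type EarlySink interface {
	Error(msg string) error
}

// memorySink keeps messages in memory so the sketch runs on any OS.
type memorySink struct {
	msgs []string
}

func (s *memorySink) Error(msg string) error {
	s.msgs = append(s.msgs, msg)
	return nil
}

// StartupReporter forwards messages to the early sink only until
// LoggerReady is called. All names here are assumptions for illustration.
type StartupReporter struct {
	mu    sync.Mutex
	sink  EarlySink
	ready bool
}

func NewStartupReporter(sink EarlySink) *StartupReporter {
	return &StartupReporter{sink: sink}
}

// LoggerReady marks the point where the regular logger takes over.
func (r *StartupReporter) LoggerReady() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ready = true
}

// Report writes msg to the early sink and returns true if the write
// succeeded. Once the regular logger is ready it writes nothing and
// returns false.
func (r *StartupReporter) Report(msg string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.ready {
		return false
	}
	return r.sink.Error(msg) == nil
}

func main() {
	sink := &memorySink{}
	r := NewStartupReporter(sink)

	r.Report("failed to load configuration") // before logger setup: recorded
	r.LoggerReady()
	r.Report("component crashed") // after logger setup: not recorded

	fmt.Println(len(sink.msgs), sink.msgs)
}
```

Because nothing reaches the sink after `LoggerReady`, a subprocess such as Winlogbeat that starts later could only ever read records from earlier failed start-ups, which is the property the comment above relies on to avoid a loop.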
I think we can make it even more limited in scope. If we just add writing an EventLog at
Then we only log to the EventLog if |
I hit this before with Windows and added this in the unprivileged work for Windows - https://github.com/elastic/elastic-agent/blame/main/internal/pkg/agent/cmd/run.go#L145 It doesn't cover all cases where it could fail, but it does a much better job than it did before. Logging to the Windows Event Log in the worst-case scenario would be nice to have.
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
Describe the enhancement:
Capture standard error of `elastic-agent` when run as a Windows service. This isn't necessary under Linux since systemd will capture the standard error.

Describe a specific use case for the enhancement or feature:
There are several steps that happen before internal logging is started. If `elastic-agent` fails to start before internal logging is started, the only place the error can be found is on the standard error of the `elastic-agent` process. When `elastic-agent` is run as a service under Windows, the standard error is not captured. This is a problem because `elastic-agent` can fail to start and there is no record of the failure. Adding this enhancement will allow us to "see" the error.

What is the definition of done?
When `elastic-agent` fails as a service under Windows, the standard error of the process can be retrieved.
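As a rough illustration of the definition of done, the sketch below writes a start-up error to both standard error and a second, retrievable destination using `io.MultiWriter`. It is only one possible approach and not what the maintainers settled on. The names `runStartup`, `memWriter`, and `errExample` are made up for this example, and the in-memory writer stands in for whatever persistent destination (a file, the Event Log) a real service would use.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
)

// errExample is a made-up start-up failure used only for illustration.
var errExample = errors.New("failed to read configuration")

// memWriter is an in-memory io.Writer standing in for a file or other
// persistent destination.
type memWriter struct {
	data []byte
}

func (w *memWriter) Write(p []byte) (int, error) {
	w.data = append(w.data, p...)
	return len(p), nil
}

// runStartup calls start and, if it fails, writes the error to both
// stderr and capture. It returns the process exit code.
func runStartup(start func() error, stderr, capture io.Writer) int {
	if err := start(); err != nil {
		fmt.Fprintf(io.MultiWriter(stderr, capture), "startup error: %v\n", err)
		return 1
	}
	return 0
}

func main() {
	capture := &memWriter{}
	code := runStartup(func() error { return errExample }, os.Stderr, capture)
	fmt.Printf("exit code %d, captured %q\n", code, string(capture.data))
}
```

Passing the writers in as parameters keeps the sketch independent of the platform. On Windows the `capture` argument is where a service-specific destination would be plugged in.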