Can the runner reconnect its gRPC connection when the connection to the controller is lost? #1541

Open
creatorKoo opened this issue Mar 4, 2025 · 0 comments

Currently, it seems that when a runner pod loses its connection to the controller, it does not retry and just stops.
Would it be possible to modify this so that it retries when the connection is lost?
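
To illustrate the behavior being requested, here is a minimal sketch of a reconnect loop with exponential backoff. It is not the project's actual code: `run_with_reconnect`, `session`, and every parameter name are hypothetical, and the built-in `ConnectionError` stands in for whatever error the real gRPC client raises when the stream to the controller drops.

```python
import random
import time
from typing import Callable, Optional


def run_with_reconnect(
    session: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run session(), retrying with exponential backoff on ConnectionError.

    session is a hypothetical callable that opens the connection to the
    controller and blocks until the work is done or the connection is lost.
    Returns the number of attempts used, or raises ConnectionError once
    max_attempts have failed.
    """
    last_error: Optional[ConnectionError] = None
    for attempt in range(1, max_attempts + 1):
        try:
            session()
            return attempt
        except ConnectionError as exc:
            last_error = exc
            if attempt == max_attempts:
                break
            # Double the delay on each failure, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 10% jitter so many runners do not reconnect in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise ConnectionError(f"gave up after {max_attempts} attempts") from last_error
```

With something like this, a short controller outருage would cost the runner a few delayed retries instead of ending it. The attempt limit and delays would need tuning to how long a controller restart usually takes.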

For example, if the controller is suddenly evicted due to a node drain, all runners lose their connection and stop.
In my case, the termination wait time is quite long, so all runners remain in a terminating state for over 20 minutes.

Alternatively, would it be better to prevent the node from being drained while a runner is still running, until the controller has finished communicating with that runner?
Am I misunderstanding something?

I’d like to hear your opinion.
