-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: ensure containerd-related directories removed on failed bootstrap/join-cluster
.
#863
fix: ensure containerd-related directories removed on failed bootstrap/join-cluster
.
#863
Conversation
773eb93
to
fa16a12
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, we should probably wait and adjust for #817
@berkayoz note that it just hit me that because of the new While I believe it's better to leave that test for NIGHTLY, I've manually run the test on a 22.04 VM and I have. I needed to add this small fix (I had forgotten to actually call Can vouch that the added integration test seems to work when run manually now BUT it does sometimes exhibit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @aznashwan - a couple of comments.
3d69d81
to
edb3fa8
Compare
The mentioned PR has been merged, so this should be fine now to go in as well, after the conflicts have been resolved. It seems I have been doing something similar as well. Though, just cleaning up the containerd-related directories is not going to be enough, we should probably also shutdown the Kubernetes services, otherwise, on the next bootstrap / join-cluster, we're going to see that the ports are in use. |
edb3fa8
to
80cb072
Compare
bootstrap/join-cluster
.bootstrap/join-cluster
.
bootstrap/join-cluster
.bootstrap/join-cluster
.
a2145d2
to
57f5b04
Compare
@bschimke95 @berkayoz related to this latest defensive commit I had to add to the cleanup @claudiubelu literally accidentally deleted his root dir using this cleanup code after I re-based my patch over his recently-merged configurable containerd paths one (thank you Claudiu, this was the definition of taking one for the team!) I think it's really bad that we have our cleanup logic split between the snap's remove hook (on I would strongly suggest we move ALL cleanup logic to a dedicated routine which we re-use internally and expose via a dedicated |
57f5b04
to
3a2ca37
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, agreed on turning this into a cleanup
command we can call on both the snap hook and in bootstrap/join cleanup. We can pick that up as a separate task and backport it to 1.32.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, minor nit and Berkays comment
please also add some description to this PR. IMHO the title is not sufficient anymore. |
3a2ca37
to
17ed765
Compare
Made the commit messages very explicit for posterity; and pasted their descriptions in the PR. |
…ap/join-cluster` `k8sd` automatically sets up some directories with the appropriate ownership/permissions to be used by containerd in the early stages of the `bootstrap` and `join-cluster` commands. In the classic (non-strict) version of the k8s-snap, these containerd directories are system-wide (e.g. `/etc/containerd`, `/run/containerd`, etc). Should any of the other setup steps fail after the containerd directories were set up, the directories would still remain on disk and thus lead to a 'partial installation' of on the host system. This patch ensures that `k8s` will automatically remove any containerd-related directories which were created in the event of the `bootstrap` / `join-cluster` commands failing. Signed-off-by: Nashwan Azhari <nashwan.azhari@canonical.com>
The containerd Base Dir is the special path all other containerd-related paths on the snap are derived from. Under classic confinement and default settings, this path defaults to the host's root (`/`), and thus extreme care must be taken to not accidentally include it in k8sd's cleanup routine or the k8s-snap's remove hook. Signed-off-by: Nashwan Azhari <nashwan.azhari@canonical.com>
17ed765
to
235bc36
Compare
fix: ensure containerd-related directories removed on failed
bootstrap/join-cluster
k8sd
automatically sets up some directories with the appropriateownership/permissions to be used by containerd in the early stages
of the
bootstrap
andjoin-cluster
commands.In the classic (non-strict) version of the k8s-snap, these
containerd directories are system-wide (e.g.
/etc/containerd
,/run/containerd
, etc).Should any of the other setup steps fail after the containerd
directories were set up, the directories would still remain on
disk and thus lead to a 'partial installation' of on the host system.
This patch ensures that
k8s
will automatically remove anycontainerd-related directories which were created in the event of
the
bootstrap
/join-cluster
commands failing.fix: ensure containerd Base Dir lockfile is never accidentally deleted.
The containerd Base Dir is the special path all other containerd-related
paths on the snap are derived from.
Under classic confinement and default settings, this path defaults
to the host's root (
/
), and thus extreme care must be taken tonot accidentally include it in k8sd's cleanup routine or the
k8s-snap's remove hook.