Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cnvm: Delete old snapshots on startup #3143

Merged
merged 8 commits into from
Mar 28, 2025

Conversation

orestisfl
Copy link
Contributor

@orestisfl orestisfl commented Mar 25, 2025

Summary of your changes

Change CNVM to add a background job cleaning up old snapshots.

The logic here is:

  1. Background routine is started. Since the amount of leftover snapshots can be quite high (e.g. 40k in our account), 3 cleanup workers are used
  2. IterOwnedSnapshots() is called which goes over all snapshots, returning an iterator. Snapshots are selected if:
    1. They are more than 48 hours old
    2. They have a tag with key "Name" and value starting with "elastic-vulnerability"
    3. They are "self-owned"
  3. Cleanup returns. On context cancelled, no extra grace period is added so the process doesn't block restarts/shutdown.

Screenshot/Data

Deleted 1178 snapshots from 2025-03-25T16:17:08.384Z to 2025-03-25T16:20:19.504Z

Related Issues

Closes #3105
Closes https://github.com/elastic/sdh-security-team/issues/1168

Checklist

  • I have added tests that prove my fix is effective or that my feature works
  • I have added the necessary README/documentation (if appropriate)

Introducing a new rule?

No

@elastic elastic deleted a comment from mergify bot Mar 25, 2025
orestisfl added a commit to orestisfl/cloudbeat that referenced this pull request Mar 25, 2025
Pulling change out of elastic#3143 to ease reviewing.

This is the result of running
```shell
hermit upgrade mockery
just validate-mocks
```

Reason is that old mockery complains about go 1.22's syntax
```go
for i := range 10 {
    ...
}
```
@orestisfl orestisfl mentioned this pull request Mar 25, 2025
orestisfl added a commit that referenced this pull request Mar 25, 2025
Pulling change out of #3143 to ease reviewing.

This is the result of running
```shell
hermit upgrade mockery
just validate-mocks
```

Reason is that old mockery complains about go 1.22's syntax
```go
for i := range 10 {
    ...
}
```
mergify bot pushed a commit that referenced this pull request Mar 25, 2025
Pulling change out of #3143 to ease reviewing.

This is the result of running
```shell
hermit upgrade mockery
just validate-mocks
```

Reason is that old mockery complains about go 1.22's syntax
```go
for i := range 10 {
    ...
}
```

(cherry picked from commit 7d838d1)
orestisfl added a commit that referenced this pull request Mar 25, 2025
Upgrade mockery (#3144)

Pulling change out of #3143 to ease reviewing.

This is the result of running
```shell
hermit upgrade mockery
just validate-mocks
```

Reason is that old mockery complains about go 1.22's syntax
```go
for i := range 10 {
    ...
}
```

(cherry picked from commit 7d838d1)

Co-authored-by: Orestis Floros <orestis.floros@elastic.co>
@orestisfl orestisfl linked an issue Mar 26, 2025 that may be closed by this pull request
2 tasks
@orestisfl orestisfl marked this pull request as ready for review March 28, 2025 12:15
@orestisfl orestisfl requested a review from a team as a code owner March 28, 2025 12:15
Comment on lines +375 to +379
for _, tag := range snap.Tags {
if aws.ToString(tag.Key) == "Name" {
return strings.HasPrefix(aws.ToString(tag.Value), snapshotPrefix)
}
}
Copy link
Member

@romulets romulets Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are already querying already the snapshots here:

                    Filters: []types.Filter{
					{
						Name:   aws.String("tag:Name"),
						Values: []string{fmt.Sprintf("%s-*", snapshotPrefix)},
					},
				},

Do we need this piece of filtering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to filter for time anyway so the name check is there just as a sanity check.

Comment on lines +148 to +150
type contextualChan[T any] struct {
ch chan T
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[ just sharing thoughts - not suggesting to change it ]
I wonder which one of "wrapper struct vs just generic functions", is leaner in that case.

eg instead of a wrapper just having something like that

func SendToChannel[T any](ctx context.Context, ch chan<- T, t T) bool
func ReadFromChannel[T any](ctx context.Context, ch <-chan T) (T, bool)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to completely avoid exposing the channel to prevent misusing it. Both can be valid approaches if direct access to the channel is needed.

@orestisfl orestisfl merged commit 8860b9d into elastic:main Mar 28, 2025
9 checks passed
@orestisfl orestisfl deleted the cnvm-delete-old-snapshots branch March 28, 2025 14:01
mergify bot pushed a commit that referenced this pull request Mar 28, 2025
### Summary of your changes

Change CNVM to add a background job cleaning up old snapshots.

The logic here is:
1. Background routine is started. Since the amount of leftover snapshots can be quite high (e.g. 40k in our account),  3 cleanup workers are used
2. `IterOwnedSnapshots()` is called which goes over all snapshots, returning an iterator. Snapshots are selected if:
    1. They are more than 48 hours old
    2. They have a tag with key "Name" and value starting with "elastic-vulnerability"
    3. They are "self-owned"
3. Cleanup returns. On context cancelled, no extra grace period is added so the process doesn't block restarts/shutdown.

### Screenshot/Data

Deleted 1178 snapshots from `2025-03-25T16:17:08.384Z` to `2025-03-25T16:20:19.504Z`

### Related Issues

Closes #3105
Closes https://github.com/elastic/sdh-security-team/issues/1168

### Checklist
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] I have added the necessary README/documentation (if appropriate)

#### Introducing a new rule?
No

(cherry picked from commit 8860b9d)
orestisfl added a commit that referenced this pull request Mar 28, 2025
cnvm: Delete old snapshots on startup (#3143)

### Summary of your changes

Change CNVM to add a background job cleaning up old snapshots.

The logic here is:
1. Background routine is started. Since the amount of leftover snapshots can be quite high (e.g. 40k in our account),  3 cleanup workers are used
2. `IterOwnedSnapshots()` is called which goes over all snapshots, returning an iterator. Snapshots are selected if:
    1. They are more than 48 hours old
    2. They have a tag with key "Name" and value starting with "elastic-vulnerability"
    3. They are "self-owned"
3. Cleanup returns. On context cancelled, no extra grace period is added so the process doesn't block restarts/shutdown.

### Screenshot/Data

Deleted 1178 snapshots from `2025-03-25T16:17:08.384Z` to `2025-03-25T16:20:19.504Z`

### Related Issues

Closes #3105
Closes https://github.com/elastic/sdh-security-team/issues/1168

### Checklist
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] I have added the necessary README/documentation (if appropriate)

#### Introducing a new rule?
No

(cherry picked from commit 8860b9d)

Co-authored-by: Orestis Floros <orestis.floros@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CNVM: Clean-up old snapshots on startup
3 participants