Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tasks: Drop cockpit-tasks as main entry point, move tasks secrets from /secrets to /run/secrets/tasks, drop setup-tasks, add local mock PR test #585

Merged
merged 10 commits into from
Mar 5, 2024

Conversation

martinpitt
Copy link
Member

@martinpitt martinpitt commented Mar 4, 2024

This is various general cleanup and making the tasks container a bit more flexible. I like the new "fully local almost end-to-end PR test" which makes hacking on our infra easy to validate (the new job runner infra).

@martinpitt martinpitt changed the title tasks: Drop cockpit-tasks as main entry point tasks: Drop cockpit-tasks as main entry point, move tasks secrets from /secrets to /run/secrets/tasks Mar 4, 2024
@martinpitt martinpitt force-pushed the entry-point branch 5 times, most recently from 4bd643d to 7d1a636 Compare March 5, 2024 05:02
@martinpitt martinpitt changed the title tasks: Drop cockpit-tasks as main entry point, move tasks secrets from /secrets to /run/secrets/tasks tasks: Drop cockpit-tasks as main entry point, move tasks secrets from /secrets to /run/secrets/tasks, add local mock PR Mar 5, 2024
@martinpitt martinpitt marked this pull request as ready for review March 5, 2024 05:32
@martinpitt martinpitt changed the title tasks: Drop cockpit-tasks as main entry point, move tasks secrets from /secrets to /run/secrets/tasks, add local mock PR tasks: Drop cockpit-tasks as main entry point, move tasks secrets from /secrets to /run/secrets/tasks, add local mock PR test Mar 5, 2024
@@ -22,7 +22,7 @@ tasks-shell:
$(DOCKER) run -ti --rm \
--shm-size=1024m \
--volume=$(CURDIR)/tasks:/usr/local/bin \
--volume=$(TASK_SECRETS):/secrets:ro \
--volume=$(TASK_SECRETS):/run/secrets/tasks/:ro \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...(drive by) this entire chunk seems pretty DRY-unfriendly vs. install-service...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I know :/

@@ -62,7 +62,7 @@ ExecStart=/usr/bin/podman run --name=cockpit-tasks-%i --hostname=${CONTAINER_HOS
--device=/dev/kvm --network=cockpit-tasks-%i \
--memory=24g --pids-limit=16384 --shm-size=1024m ${TMPVOL:-} \
--volume=\${TEST_CACHE}/images:/cache/images:rw \
--volume=\${TEST_SECRETS}/tasks:/secrets:ro \
--volume=\${TEST_SECRETS}/tasks:/run/secrets/tasks:ro \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could think a step ahead here and:

  • add /run/secrets/github-token as the 'authoritative' location for the token file
  • add /run/secrets/s3-keys as the location for the s3-keys directory

...and adjust the symlinks to point to those things. That's where we'll want to have those two things inside of containers spawned by job-runner, under control of the "secrets" attribute in the job definition.

Of course, we'd also continue to have the volume mount for secrets/webhook and secrets/tasks since we still use those at present. We might want to start thinking about what to do more generally with "external config" such as the npm-registry certificate, though... it's not really a secret.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objection to the two extra volumes (and doing the same in run-local.sh). Do you want to do that in this PR?

I'm not sure if we can commit the npm registry internal RH mirror cert to a public place -- it's not really secret, but it applies to RH internal infrastructure only. TBH I wouldn't change that right now without at least some more checking.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see the github-token and s3-keys bit in this PR. No time like the present!

The npm registry cert is a bit more interesting and it dovetails into the question about what to do with the npm config bits that current live in setup-tasks (which needs to die, I'm increasingly convinced)...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't yet do this, instead used the env variables. Let's stabilize this version first, then I can take another look.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. With the envvars, I care a lot less.

If stdin is a terminal, i.e. when running this interactively, wait for
the user to press Enter before cleaning up everything. That makes
complex tests easier to debug than having to repeat the steps in
`--interactive` mode.
@martinpitt martinpitt changed the title tasks: Drop cockpit-tasks as main entry point, move tasks secrets from /secrets to /run/secrets/tasks, add local mock PR test tasks: Drop cockpit-tasks as main entry point, move tasks secrets from /secrets to /run/secrets/tasks, drop setup-tasks, add local mock PR test Mar 5, 2024
@martinpitt martinpitt marked this pull request as draft March 5, 2024 10:44
@martinpitt
Copy link
Member Author

@allisonkarlitskaya Reworked, and I also tested the webhook in an interactive deployment with tasks/run-local.sh -i -t ~/.config/cockpit-dev/github-token. Then podman logs cockpituous-webhook shows that it properly scans our projects, and bots/inspect-queue --amqp localhost:5671 in the tasks container shows the scanned results.

At this point I'd like at least a cursory re-review. When you are by and large happy, I can try to deploy this to prod to see what I missed 😉

Copy link
Member

@allisonkarlitskaya allisonkarlitskaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks amazing. The usual minor gripes, but to be honest, if you want to land it as-is, go right ahead.

This makes them symmetric with /run/secrets/webhook. Also move the symlink
setup from `Containerfile` to `setup-tasks` - that way all the secrets setup is
in one place, and it's also more fungible (e.g. it's simple to bind-mount an
updated `setup-tasks` script without rebuilding the image).
Copy link
Member

@allisonkarlitskaya allisonkarlitskaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tweaks.

# This intentionally does not capture subdirs like tasks/s3-keys/. As OpenShift secret volumes don't have subdirectories,
# these need to be created as a separate volume if and when we ever need that.
for f in $(find -maxdepth 1 -type f -o -type l); do
printf ' %s: %s\n' "${f#./}" "$(base64 --wrap=0 $f)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to suggest "that thing you can do with shell parameter expansion that drops prefixes" as another option, but I was too lazy to look it up ;)

Create a static /etc/npmrc on our tasks bot hosts which is suitable for
machines in the internal Red Hat network. (The role already assumed that
before, so that's not a regression - and easy enough to conditionalize
if we need to in the future).

This gets rid of the `npm config` calls in `setup-tasks`.
We don't need ~/.git-credentials any more. We have moved most bots like
`po-refresh` or `npm-update` to GitHub workflows some time ago, and
`image-refresh` calls `git config credential` on its own.

We also haven't had an OpenShift /dev/kvm task runner, only the degraded
"statistics only" on CentOS CI. We don't need to create an user account
for that. If we ever do, this bit should move to a custom entry point
wrapper in the OpenShift YAML resource.
The bots GitHub API recently added `$COCKPIT_GITHUB_TOKEN_FILE` [1], which
allows us to not create the ~/.config/cockpit-dev/github-token symlink any
more. It already supports `$COCKPIT_S3_KEY_DIR`. So set these variables in
the tasks container.

Adjust the OpenShift secrets build to not consider subdirectories. This
was only relevant for the s3-keys/ subdirectory -- but we don't
currently need them on our CentOS CI deployment. If we ever do, we'll
create a separate volume out of that directory. This gets rid of the
`--` subdirectory encoding.

These were the last things in `setup-tasks`, so remove it.

[1] cockpit-project/bots#6016
Similarly to [1], avoid the complicated setup of credentials in `$HOME`,
and just read the webhook token file from an env var.

The amqp certs are already hardcoded to /run/secrets/webhook, and the
GitHub token was addressed in the previous commit.

[1] cockpit-project/bots#6016
We already overwrite the entry point in the webhook container and on
OpenShift, and we are going to for the impending job-runner/job split.

So drop it from the image, and instead set it explicitly in the systemd
unit.

Rework run-local.sh accordingly: Replace the async "wait for
initialization" loop with a synchronous bots clone, and only run the
`cockpit-tasks` main loop for `test_pr()` (so that we keep covering that
script) -- the other tests don't need it. This paves the way for adding
a test which covers `run-queue` with custom parameters (against a mock
GitHub API), which we will introduce in the next commit.
This is an "almost end to end" test which runs fully locally and requires no
GitHub interaction or token. Add a little mock GitHub server (utilizing bots'
`task.test_mock_server`) to respond to the required GitHub APIs for handling an
"opened PR" webhook event.

This is useful for playing around with our AMQP queue/job execution engine
locally (when using the mock GH server in interactive mode), or
validating changes to run-queue/AMQP structure.
@martinpitt martinpitt marked this pull request as ready for review March 5, 2024 13:01
@martinpitt
Copy link
Member Author

I rolled this out to production, and spotted a few typos in the bits that run-local.sh can't reach (mostly OpenShift). I retried some tests in cockpit-project/cockpit#20117 after the rollout, and they seem happy both on e2e and rhos.

@martinpitt martinpitt merged commit 57514bd into main Mar 5, 2024
3 checks passed
@martinpitt martinpitt deleted the entry-point branch March 5, 2024 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants