-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
job-runner: testing runs isolated to their own container #6017
Conversation
job-runner
Dismissed
await run_job(job, ctx) | ||
|
||
if __name__ == '__main__': | ||
asyncio.run(main()) |
Check warning
Code scanning / CodeQL
Use of the return value of a procedure Warning
main
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lolwat
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
dc240da
to
a50e606
Compare
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
Random note: while hacking on this, despite all kinds of exceptions and irregular exits for all kinds of reasons (including many programming errors), I haven't seen a single leaked container in over two weeks (when I got the |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017. We can drop the CONTAINER_HOST environment variable we added in the last commit — we can provide that information via job-runner.toml instead.
This comment was marked as outdated.
This comment was marked as outdated.
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First round of review. We need to talk 😁
test_pr() failed. The inner bots tests succeeded (and the statuses bubble is green here), but the output check failed:
That used to be done by the old s3-streamer:
While we can of course change the string format, the node where |
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
I adjusted the expected test output in cockpit-project/cockpituous#587 and added a FIXUP commit here which adds back 'Running on:". The integration test passes now. |
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
f649627
to
5d35d6f
Compare
72caf52
to
0e1ed24
Compare
This failure looks very relevant:
Fortunately it already happens in test_mock_pr(), and it effortlessly reproduces in This is because we are using podman-remote in that test (and will in prod), so that temp dir will indeed not exist. |
f2edd19
to
7c23a58
Compare
7c23a58
to
da49455
Compare
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
This still needs to happen for production, but allows us to start testing cockpit-project/bots#6017.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I ignored the design complexity aspect now, and looked at some implementation questions/bugs.
da49455
to
85f5244
Compare
This will allow us to enable job-runner in cockpit-project/bots#6017. Do this both for our production bots as well as our run-local.sh integration tests.
This will allow us to enable job-runner in cockpit-project/bots#6017. Do this both for our production bots as well as our run-local.sh integration tests.
85f5244
to
543a95e
Compare
This will allow us to enable job-runner in cockpit-project/bots#6017. Do this both for our production bots as well as our run-local.sh integration tests.
366b7bf
to
f58bff5
Compare
This will allow us to enable job-runner in cockpit-project/bots#6017. Do this both for our production bots as well as our run-local.sh integration tests.
Introduce job-runner, a new asyncio python application designed to replace a good chunk of our backend CI infra: - `tests-invoke` - `make-checkout` - `s3-streamer` - (eventually, maybe) `run-queue` and dramatically simplify others: - `issue-scan` - `tests-scan` while making it easier to play with our testing infrastructure locally. The main new feature vs. the existing stuff is that each testing job is run in its own one-time-use container. The image for that container is under control of the revision being tested, allowing for gating of task container upgrades (avoiding the usual wave of broken pixels, linting changes, etc.). Instead of running our jobs from the giant blobs of shell pasted together in `issue-scan` or `tests-scan` we consume them as the new "Job" json format, already introduced in #5932. The new approach also makes a conscious effort to reduce the amount of exposure of secrets to test runs (although we have some more work to do here). This commit introduces job-runner (along with its helper script, `checkout-and-run`) along with tests. Nothing uses this yet.
If we have a 'job' attribute in our payload, ignore the command blob and dispatch the JSON via `job-runner`.
f58bff5
to
043aaf0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do this! ⭐ 🚀 🏁
Introduce job-runner, a new asyncio python application designed to replace a good chunk of our backend CI infra:
tests-invoke
make-checkout
s3-streamer
run-queue
and dramatically simplify others:
issue-scan
tests-scan
while making it easier to play with our testing infrastructure locally.
The main new feature vs. the existing stuff is that each testing job is run in its own one-time-use container. The image for that container is under control of the revision being tested, allowing for gating of task container upgrades (avoiding the usual wave of broken pixels, linting changes, etc.).
Instead of running our jobs from the giant blobs of shell pasted together in
issue-scan
ortests-scan
we consume them as the new "Job" json format, already introduced in #5932.The new approach also makes a conscious effort to reduce the amount of exposure of secrets to test runs (although we have some more work to do here).
TODO:
cidfile crash after successful runs https://paste.centos.org/view/01fcdfc8 withnot reproducible, ignorepodman-remote
; this breaks attachments, thus a blocker