tasks: Bind the host's podman API socket #583

Merged: 1 commit into main from podman-remote, Mar 5, 2024

Member

@martinpitt martinpitt commented Mar 4, 2024

tasks: Bind the host's podman API socket

This paves the way for spawning per-job tasks containers from the
container (via job-runner).

Getting the permissions right is unfortunately annoyingly complicated, as the
host's socket has 660 permissions, but the tasks container runs as uid 1111.
Ideally we could use something like

-v "${XDG_RUNTIME_DIR:-/run}/podman/podman.sock:/podman.sock:idmap=gids=$(id -g)-1111-1"

but that fails with "invalid mappings", and is generally not well documented.
--mount=type=bind,[...],idmap --uidmap [...] does not work either.

So resort to file permissions for the host's podman.sock:

  • For local developers in run-local.sh, make the socket world
    writable. This does no harm on a developer machine: its
    directory (/run/user/<uid>) is not accessible to any other user.

  • For production, set the socket group to 1111. That doesn't matter
    much: the secrets are all already chmod'ed to the container user, and
    these machines don't do anything else. setfacl would be more
    targeted, but it isn't installed in Fedora IoT/CoreOS.
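
As a rough sketch of the two permission strategies above (the function names are hypothetical and not taken from the PR's scripts; `stat -c` assumes GNU coreutils), the changes could look something like this:

```shell
# Hypothetical helpers, not taken from the PR's scripts.

# Local development: let any uid (including 1111) talk to the socket.
open_socket_local() {
    chmod o+rw "$1"
}

# Production: hand the socket's group to gid 1111; the existing 660
# mode then grants the container user read/write access.
open_socket_production() {
    chgrp 1111 "$1"
}

# Succeed only if the directory is mode 700, i.e. closed to other users.
dir_is_private() {
    [ "$(stat -c %a "$1")" = "700" ]
}
```

For the local case one would presumably call `dir_is_private "$XDG_RUNTIME_DIR"` before `open_socket_local`, since the world-writable socket is only harmless while its parent directory stays private. `open_socket_production` needs root, or an owner who is a member of gid 1111.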


I rolled this out to one production machine:

ansible -i inventory -m include_role -a name=tasks-systemd -e instances=1 -l rhos-01-1 openstack_tasks

Et voilà:

❱❱❱ ssh rhos-01-1 sudo podman exec -it cockpit-tasks-1 podman-remote ps
Warning: Permanently added '10.0.203.194' (ED25519) to the list of known hosts.
CONTAINER ID  IMAGE                         COMMAND               CREATED        STATUS        PORTS                 NAMES
51ef06c0f09c  quay.io/minio/minio:latest    server /data          4 days ago     Up 4 days     0.0.0.0:80->9000/tcp  cockpit-s3
ea0bd1fbcc99  quay.io/cockpit/tasks:latest  /usr/local/bin/co...  5 minutes ago  Up 5 minutes                        cockpit-tasks-1

Once this lands, I'll apply it to all our production machines.

Member Author

martinpitt commented Mar 4, 2024

Ugh, running the test on GitHub is difficult. First of all, the setfacl did succeed, but the container still gets an EPERM. If I hack this with chmod 666, then it fails with

Error: unable to connect to Podman socket: server API version is too old. Client "4.0.0" server "3.4.4"

containers/podman#17362 doesn't make me very hopeful, but there is somebody who backported the Debian bookworm packages to Ubuntu 22.04: https://launchpad.net/~quarckster/+archive/ubuntu/containers . This works very well indeed!

I don't want to waste too much time on the EPERM on GitHub, so I just hacked it.
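
For illustration only: the version mismatch above could be caught before running the test with a small comparison helper. This is a sketch with a hypothetical function name, it relies on GNU `sort -V`, and it is not how podman itself decides compatibility.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# With the numbers from the error message, this check fails:
#   version_at_least 3.4.4 4.0.0
```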

@martinpitt martinpitt force-pushed the podman-remote branch 6 times, most recently from 36ebb34 to e6091cd March 4, 2024 08:22
@martinpitt martinpitt marked this pull request as ready for review March 4, 2024 08:39
@martinpitt martinpitt self-assigned this Mar 4, 2024
@martinpitt
Member Author

Rebased after #585

@@ -61,8 +65,11 @@ ExecStart=/usr/bin/podman run --name=cockpit-tasks-%i --hostname=${CONTAINER_HOS
--volume=\${TEST_SECRETS}/webhook:/run/secrets/webhook:ro \
--volume=${IMAGE_STORES}:/work/.config/cockpit-dev/image-stores:ro \
--volume=/etc/npmrc:/etc/npmrc:ro \
--volume=%t/podman/podman.sock:/podman.sock:rw \
Member

Why not use the "normal" location on the inside, and save ourselves the envvar?

Member Author

@martinpitt martinpitt Mar 5, 2024

Heh, I spent about two hours on that. There's no $XDG_RUNTIME_DIR in the tasks container, so podman defaults to some /tmp/podman-1111/whatever path which doesn't feel very predictable. Plus, if you actually bind-mount it there, it then wants a /work/rundir/podman/... directory. This is by far the easiest and most robust version that I found.

Member

I mean /run/podman/podman.sock. I guess that's only the default when you're root, though?

Member Author

But we aren't, we are uid 1111 😁
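
To make the thread above concrete: with the socket bind-mounted at the fixed path `/podman.sock`, the client inside the container has to be told where to find it. A minimal sketch, assuming podman's documented `CONTAINER_HOST` environment variable is the "envvar" being discussed (the excerpt does not show the actual setting, and the helper name is hypothetical):

```shell
# Hypothetical helper: turn an absolute socket path into a unix:// URL.
podman_socket_url() {
    printf 'unix://%s\n' "$1"
}

# Usage inside the tasks container (assumes the bind mount at /podman.sock):
#   CONTAINER_HOST=$(podman_socket_url /podman.sock) podman-remote ps
```

A fixed path like this avoids depending on `$XDG_RUNTIME_DIR`, which is unset in the tasks container.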

@martinpitt
Member Author

@allisonkarlitskaya Simplified as discussed, fixed the rebase error, and I redeployed it on rhos-01-1. I re-validated that I can access podman there.

Member

@allisonkarlitskaya allisonkarlitskaya left a comment

Better, thanks!

@allisonkarlitskaya allisonkarlitskaya merged commit 4e413ff into main Mar 5, 2024
3 checks passed
@allisonkarlitskaya allisonkarlitskaya deleted the podman-remote branch March 5, 2024 14:34