eaccess
exception when an CLI operation is performed on a node in a booting container
#11954
Replies: 3 comments 4 replies
-
My guess is that the Erlang VM that runs the We might spend time addressing this issue but the quick solution is, of course, to not do this. Here is an example of how I wait for RabbitMQ to start in a docker container: https://github.com/rabbitmq/rabbitmq-dotnet-client/blob/main/.ci/ubuntu/gha-setup.sh#L84-L109 |
Beta Was this translation helpful? Give feedback.
-
@jeremyestein I don't see how the error in the logs has anything to do with Anyhow,
is very specific. The shared secret (Erlang cookie) is usually mounted as a volume and on container boot, that does not happen instantly. There is nothing RabbitMQ-specific about it. I don't have a better solution than "wait before you assume that the container is ready for any operations". |
Beta Was this translation helpful? Give feedback.
-
@jeremyestein please see this script which demonstrates a fix for the issue you report: https://github.com/lukebakken/rabbitmq-server-11954/blob/main/repro.sh#L7 If the I must admit I am a bit baffled by this behavior, because |
Beta Was this translation helpful? Give feedback.
-
Describe the bug
rabbitmq version: 3.13.6 (docker image e42cf20fe44e) (crash also present in 4.0.0-beta.3-management 45deda312190)
Can reproduce on docker on Linux (Docker version 23.0.0, build e92dd87) and macOS (Docker version 25.0.3, build 4debf41)
I discovered this when using
rabbitmq-diagnostics -q check_running
as a container healthcheck, although any invocation of that command will do it.If it's run very quickly after container startup then the server crashes. It's possible to run it too quickly or too slowly to reproduce the crash - I'd say about 2-5 seconds is the sweet spot.
Reproduction steps
docker pull rabbitmq:management
docker run --name jes_test_rabbit rabbitmq:management
docker exec jes_test_rabbit rabbitmq-diagnostics --help
The rabbitmq server crashes basically every time if you get the timing right.
Docker container logs are attached.
rabbitmq_crash_20240808_STDERR.log
rabbitmq_crash_20240808_STDOUT.log
In the stderr output, there is a statement:
Crash dump is being written to: erl_crash.dump
, however I can't find this file anywhere on the stopped container.Expected behavior
The server should continue to boot normally.
In particular,
rabbitmq-diagnostics -q check_running
should return an error status to say the server is not yet running, rather than crashing the server.Additional context
No response
Beta Was this translation helpful? Give feedback.
All reactions