[BUG] agent crashes on startup #35521

Open
josh-coderpad opened this issue Mar 26, 2025 · 1 comment
Comments

@josh-coderpad

Agent Environment
7.63.3

Describe what happened:
The Agent crashes in a loop on startup and then stops.

2025-03-20 13:10:14 UTC | CORE | INFO | (comp/core/tagger/collectors/workloadmeta_main.go:154 in stream) | workloadmeta tagger collector started
panic: runtime error: index out of range [2] with length 2
goroutine 355 [running]:
github.com/DataDog/datadog-agent/comp/core/workloadmeta/collectors/internal/docker.layersFromDockerHistoryAndInspect({_, _, _}, {{0xc000db0cd0, 0x47}, {0x>
        /omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/comp/core/workloadmeta/collectors/internal/docker/docker.go:684 +0x4f5
github.com/DataDog/datadog-agent/comp/core/workloadmeta/collectors/internal/docker.(*collector).getImageMetadata(0xc000a09cc0, {0x5b5f438, 0xc000aa4d20}, >
        /omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/comp/core/workloadmeta/collectors/internal/docker/docker.go:649 +0x436
github.com/DataDog/datadog-agent/comp/core/workloadmeta/collectors/internal/docker.(*collector).generateEventsFromImageList(0xc000a09cc0, {0x5b5f438, 0xc0>
        /omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/comp/core/workloadmeta/collectors/internal/docker/docker.go:222 +0x128
github.com/DataDog/datadog-agent/comp/core/workloadmeta/collectors/internal/docker.(*collector).Start(0xc000a09cc0, {0x5b5f438, 0xc000aa4d20}, {0x5baff60,>
        /omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/comp/core/workloadmeta/collectors/internal/docker/docker.go:115 +0x205
github.com/DataDog/datadog-agent/comp/core/workloadmeta/impl.(*workloadmeta).startCandidates(0xc0012c5ad0, {0x5b5f438, 0xc000aa4d20})
        /omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/comp/core/workloadmeta/impl/store.go:565 +0x13b
github.com/DataDog/datadog-agent/comp/core/workloadmeta/impl.(*workloadmeta).startCandidatesWithRetry.func1()
        /omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/comp/core/workloadmeta/impl/store.go:552 +0xec
github.com/cenkalti/backoff.RetryNotify(0xc0015d7f58, {0x5b25840, 0xc000a03620}, 0x0)
        /pkg/mod/github.com/cenkalti/backoff@v2.2.1+incompatible/retry.go:37 +0x168
github.com/cenkalti/backoff.Retry(...)
        /pkg/mod/github.com/cenkalti/backoff@v2.2.1+incompatible/retry.go:24
github.com/DataDog/datadog-agent/comp/core/workloadmeta/impl.(*workloadmeta).startCandidatesWithRetry(0xc0012c5ad0, {0x5b5f438, 0xc000aa4d20})
        /omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/comp/core/workloadmeta/impl/store.go:545 +0xaa
github.com/DataDog/datadog-agent/comp/core/workloadmeta/impl.(*workloadmeta).start.func3()
        /omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/comp/core/workloadmeta/impl/store.go:95 +0x28
created by github.com/DataDog/datadog-agent/comp/core/workloadmeta/impl.(*workloadmeta).start in goroutine 352
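For readers unfamiliar with this class of panic: the trace ends in layersFromDockerHistoryAndInspect, and in Go the message "index out of range [2] with length 2" means a slice of length 2 was indexed at position 2. The agent's actual code at docker.go:684 is not shown in this report, so the following is only a minimal sketch of how such a mismatch could arise and be guarded against, assuming history rows are paired with layer digests by position. The types historyEntry and layer and the function pairLayers are hypothetical names made up for illustration, not the agent's API.

```go
package main

import "fmt"

// historyEntry is a hypothetical, simplified stand-in for one row of
// `docker history`; it is not the agent's real type.
type historyEntry struct {
	CreatedBy  string
	EmptyLayer bool // true for metadata-only steps such as ENV or CMD
}

// layer is a hypothetical result type pairing a history row with a digest.
type layer struct {
	CreatedBy string
	Digest    string // left empty when no digest could be matched
}

// pairLayers walks the history oldest-first and assigns the next digest to
// each entry expected to own a filesystem layer. The bounds check keeps a
// history/inspect mismatch from panicking; an unchecked digests[next]
// would panic as soon as next reaches len(digests).
func pairLayers(history []historyEntry, digests []string) []layer {
	out := make([]layer, 0, len(history))
	next := 0
	for _, h := range history {
		l := layer{CreatedBy: h.CreatedBy}
		if !h.EmptyLayer {
			if next < len(digests) {
				l.Digest = digests[next]
			}
			next++
		}
		out = append(out, l)
	}
	return out
}

func main() {
	// Made-up data: three history rows claim a layer, but only two
	// digests are available.
	history := []historyEntry{
		{CreatedBy: "ADD rootfs.tar /"},
		{CreatedBy: "ENV FOO=bar", EmptyLayer: true},
		{CreatedBy: "RUN apt-get update"},
		{CreatedBy: "COPY app /app"},
	}
	digests := []string{"sha256:aaa", "sha256:bbb"}
	for _, l := range pairLayers(history, digests) {
		fmt.Printf("%-20s %q\n", l.CreatedBy, l.Digest)
	}
}
```

Whether the real mismatch comes from the history side or the inspect side cannot be determined from the trace alone.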

Describe what you expected:
Agent starts without crashing

Steps to reproduce the issue:
In /etc/datadog-agent/datadog.yaml:

# Managed by Ansible

site: datadoghq.com

api_key: <a valid key>

ac_exclude:
- image:.*
ac_include:
- image:us-central1-docker.pkg.dev/our-project/images/an-image-prefix
apm_config:
    enabled: false
## adding this resolves the issue
## disabling conf.d/docker.d/* and conf.d/container.d/* integrations had no effect
# autoconfig_exclude_features:
# - docker
logs_enabled: false
process_config:
    container_collection: false
    process_collection: false
    process_discovery:
        enabled: false

Additional environment details (Operating System, Cloud provider, etc):

PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"

On a GCP/GCE VM

Some things that weren't it:

  • We've been on this version for a few days, even on newly created machines
  • We haven't updated the images matched by ac_include (given the error, I thought we might have). Some of them do have only 1 layer, though
@sblumenthal
Member

Thank you for the detailed bug report.

We are trying to reproduce this internally. To help us, please reach out to Datadog support (if you have not already) with the output of docker history <image> and docker inspect <image>, and someone from our engineering team will look into it.
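Before sending those outputs, it may be useful to compare the two counts yourself. Below is a rough sketch, not an official diagnostic: it assumes the inspect output is the usual JSON array with a RootFS.Layers list, and that the history was captured as one JSON object per line (for example with docker history --format '{{json .}}' <image>). The helper names countInspectLayers and countHistoryEntries are hypothetical, and the sample strings in main are made up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// inspectImage holds only the fields of `docker inspect <image>` used here.
type inspectImage struct {
	RootFS struct {
		Layers []string `json:"Layers"`
	} `json:"RootFS"`
}

// countInspectLayers returns the number of RootFS layers of the first
// image in a `docker inspect` JSON array.
func countInspectLayers(inspectJSON string) (int, error) {
	var imgs []inspectImage
	if err := json.Unmarshal([]byte(inspectJSON), &imgs); err != nil {
		return 0, err
	}
	if len(imgs) == 0 {
		return 0, errors.New("no image in inspect output")
	}
	return len(imgs[0].RootFS.Layers), nil
}

// countHistoryEntries counts non-blank lines, assuming the history was
// captured as one JSON object per line.
func countHistoryEntries(historyJSONLines string) int {
	n := 0
	for _, line := range strings.Split(historyJSONLines, "\n") {
		if strings.TrimSpace(line) != "" {
			n++
		}
	}
	return n
}

func main() {
	// Made-up sample data standing in for real command output.
	inspect := `[{"RootFS":{"Type":"layers","Layers":["sha256:aaa","sha256:bbb"]}}]`
	history := "{\"CreatedBy\":\"COPY app /app\"}\n" +
		"{\"CreatedBy\":\"ENV FOO=bar\"}\n" +
		"{\"CreatedBy\":\"ADD rootfs.tar /\"}\n"

	layers, err := countInspectLayers(inspect)
	if err != nil {
		fmt.Println("could not parse inspect output:", err)
		return
	}
	fmt.Printf("inspect layers: %d, history entries: %d\n",
		layers, countHistoryEntries(history))
}
```

History usually lists more entries than there are filesystem layers, because metadata-only instructions such as ENV or CMD add a history row without adding a layer, so a difference between the two numbers is not by itself a sign of a broken image.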
