Snapshotter runs out of memory when resolving a large layer with prefetch enabled #369

Open
rdpsin opened this issue Jul 13, 2021 · 5 comments
Labels: bug (Something isn't working)
@rdpsin (Contributor) commented Jul 13, 2021

Hello,

I've recently run into an issue where the snapshotter runs out of memory when resolving a large layer. The image in question is the pgbench 14 image. Unfortunately, I'm unable to provide the exact image as it is in a private repository.

/etc/containerd-stargz-grpc/config.toml

noprefetch = false
max_concurrency = 10

Available System Memory: ~16GB

The manifest of the stargz image is:

{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:7986847b2cf6aa99385f53ac46e6ce24a7d105b4700ceeaec822d5d1e4a33beb",
    "size": 10967
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:17e5262e6adea25ecd965bfadab9e4735db1b9161f6f0fdedaba95dba6efdb09",
      "size": 29205376,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:61da3fc30f957a30b0924c67519e4dafe43c03271f9af747dd522d09d6178bc4"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:f8de078082becc7c11422956ab091ce2bebc45e3af8194609f89bcdc56bf49e3",
      "size": 4289511,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:2b2ff5fc711663e115defe39b8a63414ea736f8c31f9b7ecb6f3a3597063819a"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:15269d890b80b9750fc80a11351ee383986e41131c9cb9789f085642474d4e69",
      "size": 4587,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:721b0fa2a89238e4c85776cbf8b81335b5cfaa2e16b23645c912a7224e4479c1"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:ce293a065f68d58a8f89d8290b6353c649ec252b841be6798ede920dda17916b",
      "size": 1359793,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:187f8b203f65405d6cbefc957943c4bc3b7711c41ae742d7e805198c09b76a5e"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:71e9c401f1c33156dd006a8d22ec75151b547af8115c2a185cd9b3bd98685c53",
      "size": 8291211,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:b8ef402b13b7f305891f935ead8bbfeca2fedb25f2e3619c593ad6da8bd98e18"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:bda9a0ad26a39ee7d0e40070c7d323599a6432d9723a4bd5b780c2117943d744",
      "size": 303693,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:89997b8d71ddc7d6aca396698d2387b351e1eb632fd111e8c4bf5e35cafeca2b"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:a529b1a9e4f4f489559154b573032b3664e634ec22b8fb365bfdd0c44ed32611",
      "size": 608,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:e6902c5f6a98615674fa19380a3c5f80d6ed152bdd0beb44e71633029ddae6da"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:1878d783adddfcfee7cb9d3e8a3658223b35f192e8943086301a05f4fc132097",
      "size": 3673,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:7a06728d288635b43365489d1dff679105028042d9189eb90c07b8c37aa9ee68"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:cc14c639217217b4eeb780a814097e351f98bc62ba4502c39916ef7bd21db509",
      "size": 94964217,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:95f9e517724962c5dfeb84e429ffc47a1a017f4ab465af590b7c7e8a68b3c6bf"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:1a67a26b3dd177e51ed96eea8e9f5e49da81ad180a93b7b396930aed05df6271",
      "size": 17968,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:414580f03d043ef486de8dab5a336016b31c7cef9e8d650413eed0a01bff216a"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:99ca52bd3e74d60aa212a33474f9b2e83de408542e635362e4f83c1d4749759a",
      "size": 699,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:166e6697dcc9b97be9cb9cf04003fa3f980e53690b90906de70adb1c7597f6b9"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:75f05b17f7429b7c6f7668faeb8b8e1393e0e5fb110cee54e6a31f0868451554",
      "size": 920,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:9c48dc0c6e4b523f82eb463b2e90821056d3e3d3f03bc87ce351543c638f7dc7"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:2b60598ac3476ce9240a7acc753a12efa2d053e80f03bdaea9a88ee8b785480e",
      "size": 4437,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:73ca3c282d4811f60504b834398e3288260ca33b3b5c44a1ac12e3633cb27a1f"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:f403d2e1f998ff05e6ec992f1025c7adff229240201099cef49fd7014ae72a2f",
      "size": 633,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:ab181951ded947e6d5f5a41d362676b2f17b6ecbb8e677413b500eb9cf4541ce"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:09c35e8efcc4ec49fad18bf1cdf237fad3a683de32e612094b4d896ea3bf98b1",
      "size": 873803975,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:d77826b8a4faa831ee407ad228aee000913d30c7aacb89e6dbcb8bd8087cdbc8"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:f7b84947ee25f10c9cc5329f5dfe379956eb18fb77e0ffeefe14cc0c78ae26b2",
      "size": 908,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:e0e6a43dfde918f1791ad896f8eafa868ea0a3d902fcc0a994c81271cabe93eb"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:33a7916abd4f28d17331c8101b37865c8ba1f2aa68475081f428a8d66a7fbb92",
      "size": 923,
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:b199705c89dff961241797310c65cae490512c810a0ee6f5e58f7eeab5a1377e"
      }
    }
  ]
}

The issue only arises when prefetch is enabled (noprefetch = false).
The culprit seems to be the 833M layer (digest: sha256:09c35e8efcc4ec49fad18bf1cdf237fad3a683de32e612094b4d896ea3bf98b1).

The working theory is that this is caused by the buffers backing the memory-cached chunks being put into a buffer pool after the cache is written to the filesystem.

Ref (the eviction callback in the cache code):

dCache.OnEvicted = func(key string, value interface{}) { ... }

We found that a large number (~2900) of 4 MiB chunks were being created and placed into the buffer pool after being written out to the filesystem.
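The retention pattern can be sketched as follows. This is a hypothetical, simplified Go reproduction, not the snapshotter's actual code: bufPool and onEvicted are illustrative names standing in for the pool and the dCache.OnEvicted callback referenced above.

```go
package main

import (
	"fmt"
	"sync"
)

const (
	chunkSize = 4 << 20 // 4 MiB chunks, as observed in this issue
	numChunks = 2900    // roughly the number of chunks seen for the large layer
)

// sync.Pool places no bound on how many buffers it retains between GC cycles.
var bufPool = sync.Pool{
	New: func() interface{} { return make([]byte, chunkSize) },
}

// onEvicted mimics the dCache.OnEvicted pattern: when a memory-cached chunk
// is evicted (after being flushed to disk), its backing buffer is returned
// to the pool instead of becoming garbage.
func onEvicted(key string, value interface{}) {
	bufPool.Put(value.([]byte))
}

func main() {
	for i := 0; i < numChunks; i++ {
		onEvicted(fmt.Sprintf("chunk-%d", i), make([]byte, chunkSize))
	}
	// Worst case, the pool retains all of these buffers until the next GC:
	fmt.Printf("potential retained bytes: %d\n", numChunks*chunkSize)
}
```

For ~2900 chunks of 4 MiB each, the pool can transiently hold on the order of 11 GiB, which is consistent with a ~16 GB machine running out of memory.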

Our theory seems to be validated when we pass cache.Direct() to the Cache function in:

if err := lr.Cache(reader.WithFilter(func(e *estargz.TOCEntry) bool { ...

Using cache.Direct(), memory usage no longer spikes (it remains steady) and cumulative memory usage is roughly halved.

[Graph: memory usage with and without memcaching; the memcaching series is only shown until the snapshotter crashes]

I would like to ask the containerd/stargz-snapshotter community: what would be the ideal way to fix this? Thanks.

@rdpsin rdpsin closed this as completed Jul 13, 2021
@rdpsin rdpsin reopened this Jul 13, 2021
@ktock ktock added the bug Something isn't working label Jul 14, 2021
@ktock (Member) commented Jul 14, 2021

Thank you for reporting this. This seems to be a bug. We might need to limit the size of the LRU cache and the buffer pool.

@rdpsin (Contributor, Author) commented Jul 14, 2021

How would one go about doing that?

@ktock (Member) commented Jul 15, 2021

As a quick workaround, let's add a config.toml option to forcefully enable Direct mode for the cache. Then I'll work on improving the memory consumption of the cache (maybe we need something like a size-aware sync.Pool implementation).
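One possible shape for such a size-aware pool is a channel-backed buffer pool with a hard cap on retained buffers. This is only a sketch of the idea under assumed names (boundedPool is not an actual stargz-snapshotter type): reuse up to a fixed number of buffers and let the rest be garbage collected, so worst-case retained memory is bounded.

```go
package main

import "fmt"

const chunkSize = 4 << 20 // 4 MiB, matching the chunk size in this issue

// boundedPool retains at most cap(ch) buffers; excess Puts are dropped
// so the GC can reclaim them, bounding retained memory at
// cap(ch) * chunkSize bytes.
type boundedPool struct {
	ch chan []byte
}

func newBoundedPool(maxBuffers int) *boundedPool {
	return &boundedPool{ch: make(chan []byte, maxBuffers)}
}

func (p *boundedPool) Get() []byte {
	select {
	case b := <-p.ch:
		return b // reuse a pooled buffer
	default:
		return make([]byte, chunkSize) // pool empty: allocate
	}
}

func (p *boundedPool) Put(b []byte) {
	select {
	case p.ch <- b:
	default: // pool full: drop the buffer instead of retaining it
	}
}

func main() {
	p := newBoundedPool(64) // retain at most 64 * 4 MiB = 256 MiB
	for i := 0; i < 2900; i++ {
		p.Put(make([]byte, chunkSize))
	}
	fmt.Printf("buffers retained: %d\n", len(p.ch))
}
```

Unlike sync.Pool, this keeps a predictable ceiling even when thousands of chunks are evicted in a burst, at the cost of some extra allocations once the cap is exceeded.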

@ktock (Member) commented Jul 16, 2021

@rdpsin Please try v0.7.0 https://github.com/containerd/stargz-snapshotter/releases/tag/v0.7.0
You can enable direct mode of the cache to avoid this issue.

[directory_cache]
direct = true

We'll work on the improvement of the memory consumption of the cache.

@ktock ktock self-assigned this Jul 19, 2021
@ktock (Member) commented Nov 5, 2021

FYI: Specifying metadata_store = "db" in config.toml enables storing filesystem metadata on disk (since v0.10.0).
This can reduce the memory consumption of containerd-stargz-grpc (#415 (comment)).
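Putting the two mitigations from this thread together, a config.toml for a memory-constrained host might look like this (assuming v0.10.0 or later; both keys come from the comments above):

```toml
# Store filesystem metadata on disk instead of in memory (v0.10.0+).
metadata_store = "db"

# Bypass the in-memory chunk cache and write directly to disk (v0.7.0+).
[directory_cache]
direct = true
```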
