Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for GPU Generation Hitting Rid Allocation Limit #735

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

mschwarz1
Copy link

I was running into an issue where after a couple of minutes of flying around with gpu generation on where I would start hitting the rid allocation limit. I took a look at the code and noticed the generator_uniform_set variable was being created but never cleared as far as I could tell. So I switched it to a member variable like _generator_pipeline_rid and freed it along with that and it seems to fix the problem.

I didn't see any resource allocation issues after ~15 minutes of flying around with this change where as before I was hitting the issue after ~3 minutes.

@Zylann
Copy link
Owner

Zylann commented Mar 7, 2025

Thanks for spotting it!

Back when I first tested GPU stuff with Godot when they just came out, I started the dev project for making shaders in the module's repo, and freeing uniform sets was commented out. I don't remember why, but I notice that I consistently never freed uniform sets. Forgot the reason as well.
But according to docs it should be freed, so there are more places to fix.

@@ -151,7 +151,7 @@ void GenerateBlockGPUTask::prepare(GPUTaskContext &ctx) {
// Note, this internally locks RenderingDeviceVulkan's class mutex. Which means it could perhaps be used outside
// of the compute list (which already locks the class mutex until it ends). Thankfully, it uses a recursive
// Mutex (instead of BinaryMutex)
const RID generator_uniform_set =
_generator_uniform_set =
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in a for loop. You will still leak RIDs in different setups.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh yeah tunnel vision. I'll have to fix that. I'm surprised the change is making a much of a difference at all given that added context.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-box situations occur in particular cases (notably edited terrain) that you might not have encountered.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. The other places using uniform_set_create are in the render_detail_texture_gpu_task which I also wouldn't have seen. So I'll try turning that on and doing some edits in the next pass.

@mschwarz1
Copy link
Author

Thanks for spotting it!

Back when I first tested GPU stuff with Godot when they just came out, I started the dev project for making shaders in the module's repo, and freeing uniform sets was commented out. I don't remember why, but I notice that I consistently never freed uniform sets. Forgot the reason as well. But according to docs it should be freed, so there are more places to fix.

It's funny you mention that because I have another script where I was also trying to free the uniform set because I was following the documentation but it resulted in saying I was freeing something invalid which I wasn't expecting. In this case it's not complaining about anything so I assume it's a valid thing to clear . Point being I'm not really sure what the deal is with uniform sets either.

@Zylann
Copy link
Owner

Zylann commented Mar 7, 2025

I asked devs, and apparently uniform sets are supposed to be "freed by their dependencies" automatically, which hopefully means if one of their contents is freed, then the uniform set is freed. Hence the absence of freeing in the code. This is still unclear to me because none of this is documented...

Considering uniform sets are then supposed to be automatically freed... I'm also told the issue could be something else.
Are you able to reproduce this consistently in a minimal test project? Does it occur in both Godot 4.4 and 4.3?

Note: the PR still lacks a lot of areas to free uniform sets. I wrote the full fix locally (and pushed a branch to Git because I will soon pack my desktop computer and want stuff saved) but not going to apply it just yet, as apparently it needs more investigation about the root cause

Zylann added a commit that referenced this pull request Mar 7, 2025
@mschwarz1
Copy link
Author

mschwarz1 commented Mar 7, 2025

minimaltest.zip

Attached is a minimal reproduction project. All you need to do is play the one scene and wait 2 minutes or so. Things to note. I'm running a pretty much 4.4 double precision build of godot. (I really hope this isn't another random double build specific behavior being weird.) I don't notice anything because of the double precision but if you start seeing the mesh looking strange it might just be because the camera is getting far out. You should see the issues around 21km though so I don't think you'd run into anything super noticeable until you hit the allocation issue and then you'll really notice some weirdness along with the printed errors.

Edit: Better minimal test version here. It looks a bit worse but it triggers it a bit faster.
minimaltestedited.zip
The original one I added wasn't very good because I forgot that I increased the max limit on the LOD distance so when run on my version of godot the LOD distance is 256 and triggers faster. That said I wasn't able to reproduce this in your 4.3 single precision release build or at least it's significantly faster on my 4.4 build with the edited project. I'll try on the 4.4.

Second Edit:
I was able to reproduce the issue on your 4.4 release after a minute of running with the edited minimal test version.

@mschwarz1
Copy link
Author

mschwarz1 commented Mar 7, 2025

An aside, I tried out having detail normal maps enabled with the gpu generation on my other project where I ran into this originally after making the for loop update and I didn't seem to hit any issues. Not sure why. Maybe I didn't let it go long enough or those uniform_sets are being cleaned up automatically? Did you try out your branch with detail normal maps and get any double free errors/warnings?

@Zylann
Copy link
Owner

Zylann commented Mar 7, 2025

So far my tests just show that without my branch, moving forwards periodically reaches this line: https://github.com/godotengine/godot/blob/7459a0361d7cea65e190232ae60c6fcc275c73f5/core/templates/rid_owner.h#L127
And every time, it is to allocate a RID for one of those uniform sets. chunk_count is bigger each time, so if I keep going it would eventually reach the limit.
But with my branch, that never happens (because I free the uniform sets manually).

Something new though:
One of the uniforms in the set is changing each time. It is a storage buffer. However, turns out I never free this storage buffer either: I pool them. Which means that technically, none of the RIDs referenced by the uniform buffer get freed. And that's expected (some because they are pooled, others because they indeed dont need to change across tasks). Which means Godot's auto-freeing behavior will not trigger. And that we really need to free those uniform buffers manually then.

@mschwarz1
Copy link
Author

The pooling would explain why I didn't see anything with the normal maps assuming those uniform sets aren't part of a pooled resource.

See my edits to the comment with my minimal reproduction project if you didn't already. I don't think you'll need the new version if you're just tracking the chunk_count directly though.

@Zylann
Copy link
Owner

Zylann commented Mar 8, 2025

The pooling would explain why I didn't see anything with the normal maps assuming those uniform sets aren't part of a pooled resource.

Not exactly. Uniform sets are never pooled. It is the pooling of storage buffers that contributes to make the GPU chunk generation not auto-free uniform sets. Because then Godot sees that all resources referenced by the uniform remain alive so it assumes the uniform set is still in use (but it isnt).
Normal maps might actually have the same issue, I think they use pooled buffers too. However, I think textures are not pooled though, which might be the reason they dont leak here.
But you have to also enable GPU usage for them in terrain properties (neither of your test projects have this).

@mschwarz1
Copy link
Author

Right. We're on the same page. I was saying if whatever resource the uniform set belongs to isn't pooled (the textures in this case based on what you're saying) it would explain why I didn't see any problems. And no I didn't include it in the minimal examples mainly because I didn't want to recreate the shader since I didn't see anything in the other project but I probably should've and turned it on just to cover all the bases. Sorry about that if that wasn't clear.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants