[VM] Dynamic RAM allocation going haywire. #3958

haldi4803 opened this issue Mar 19, 2025 · 6 comments
Labels
board/ova Open Virtual Appliance (Virtual Machine) bug hypervisor/kvm KVM related issues

haldi4803 commented Mar 19, 2025

Describe the issue you are experiencing

I'm running HAOS as a VM on TrueNAS Scale. While the HAOS machine was shut down, I changed its RAM from 6GB to 12GB in the settings.
Now I'm stuck at 95% RAM usage and 90% swap usage and have absolutely no idea why.
Glances doesn't show anything abnormal, and htop doesn't show where the memory is going either.
Obviously I tried rebooting the VM, but that didn't change anything.

  ~ free -h
               total        used        free      shared  buff/cache   available
Mem:            11Gi        11Gi       109Mi       4.0Mi       540Mi       526Mi
Swap:          3.9Gi       3.5Gi       363Mi

Running top -b -o +%MEM | head -n 22 in the terminal on the host does not show any process using too much RAM.
Running free -h says everything is full.

What other ways do I have to debug or log RAM usage?
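
One further way to look at this from inside the guest is to compare the kernel's own counters in /proc/meminfo: memory that is neither free nor attributed to processes, page cache, or kernel structures hints that something outside normal accounting (for example a balloon driver) is holding it. A minimal Python sketch, assuming a Linux guest; the helper names and the selection of fields summed here are illustrative assumptions, not an official accounting formula:

```python
import re

# Fields treated as "explained" memory. This selection is an assumption made
# for illustration; the kernel does not define an exact accounting identity.
ACCOUNTED_FIELDS = (
    "MemFree", "Buffers", "Cached", "AnonPages",
    "Slab", "KernelStack", "PageTables",
)


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict (values in kB)."""
    info = {}
    for line in text.splitlines():
        match = re.match(r"(\S+):\s+(\d+)", line)
        if match:
            info[match.group(1)] = int(match.group(2))
    return info


def unaccounted_kb(info):
    """Return MemTotal minus the fields listed in ACCOUNTED_FIELDS, in kB."""
    return info["MemTotal"] - sum(info.get(name, 0) for name in ACCOUNTED_FIELDS)


def read_unaccounted_kb(path="/proc/meminfo"):
    """Read the live counters from the given path and return the gap in kB."""
    with open(path) as handle:
        return unaccounted_kb(parse_meminfo(handle.read()))
```

Calling read_unaccounted_kb() periodically and logging the result would show whether the gap grows after the VM's memory allocation changes. A large gap alone does not prove ballooning; it only narrows down where to look.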

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

15.0

Did the problem occur after upgrading the Operating System?

No

Hardware details

TrueNAS Scale: ElectricEel-24.10.2
64GB RAM, 12GB given to VM.
5 cores / 10 threads given to VM.

Steps to reproduce the issue

No Clue... seriously.

Anything in the Supervisor logs that might be useful for us?

Nope, nothing RAM-related.
Or should there be?

Anything in the Host logs that might be useful for us?

Nope, nothing RAM-related.
Or should there be?

System information

System Information

version core-2025.3.3
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.13.2
os_name Linux
os_version 6.12.18-haos
arch x86_64
timezone Europe/Zurich
config_dir /config
Home Assistant Community Store
GitHub API ok
GitHub Content ok
GitHub Web ok
HACS Data ok
GitHub API Calls Remaining 5000
Installed Version 2.0.5
Stage running
Available Repositories 1604
Downloaded Repositories 14
Home Assistant Cloud
logged_in false
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 15.0
update_channel stable
supervisor_version supervisor-2025.03.3
agent_version 1.7.2
docker_version 28.0.1
disk_total 33.7 GB
disk_used 24.4 GB
healthy true
supported true
host_connectivity true
supervisor_connectivity true
ntp_synchronized true
virtualization kvm
board ova
supervisor_api ok
version_api ok
installed_addons Whisper (2.4.0), Piper (1.5.2), ESPHome Device Builder (2025.2.2), openWakeWord (1.10.0), File editor (5.8.0), Advanced SSH & Web Terminal (20.0.2), Studio Code Server (5.18.3), Glances (0.21.1), ESPHome (dev) (dev), InfluxDB (5.0.2)
Dashboards
dashboards 2
resources 9
views 6
mode storage
Network Configuration
adapters lo (disabled), enp0s3 (enabled, default, auto), docker0 (disabled), hassio (disabled), vethd7d7598 (disabled), vetha3bda91 (disabled), veth2705565 (disabled), veth891784e (disabled), veth541da13 (disabled), vethf3731d4 (disabled), veth4623118 (disabled), vethec3950d (disabled), vethce4c7ef (disabled), veth1e53068 (disabled), veth1afdad4 (disabled), veth4ed9ef2 (disabled)
ipv4_addresses lo (127.0.0.1/8), enp0s3 (192.168.1.10/24), docker0 (172.30.232.1/23), hassio (172.30.32.1/23), vethd7d7598 (), vetha3bda91 (), veth2705565 (), veth891784e (), veth541da13 (), vethf3731d4 (), veth4623118 (), vethec3950d (), vethce4c7ef (), veth1e53068 (), veth1afdad4 (), veth4ed9ef2 ()
ipv6_addresses lo (::1/128), enp0s3 (fddf:3940:83b2::dc7/128, 2001:4060:c00b:dd30::dc7/128, 2001:4060:c00b:dd30:5696:ac91:d59:be47/64, fddf:3940:83b2:0:18d5:ab4b:45b9:2e08/64, fe80::aa8a:2217:6ead:feb2/64), docker0 (fe80::d0e9:1fff:fe7a:36e/64), hassio (fe80::875:2bff:feb2:466f/64), vethd7d7598 (fe80::cc6f:caff:fe80:f940/64), vetha3bda91 (fe80::387a:2fff:fe50:e303/64), veth2705565 (fe80::c4c3:92ff:fe32:6c19/64), veth891784e (fe80::c093:78ff:fe54:f2fd/64), veth541da13 (fe80::c70:7aff:fee8:ef76/64), vethf3731d4 (fe80::888e:99ff:fe96:c291/64), veth4623118 (fe80::4c02:4ff:fe40:cdd1/64), vethec3950d (fe80::fce1:2dff:fe5b:8c81/64), vethce4c7ef (fe80::c4fa:27ff:fe64:1317/64), veth1e53068 (fe80::8411:7aff:fe31:34cf/64), veth1afdad4 (fe80::5491:9eff:fe97:b8f9/64), veth4ed9ef2 (fe80::148b:15ff:fe44:9c17/64)
announce_addresses 192.168.1.10, fddf:3940:83b2::dc7, 2001:4060:c00b:dd30::dc7, 2001:4060:c00b:dd30:5696:ac91:d59:be47, fddf:3940:83b2:0:18d5:ab4b:45b9:2e08, fe80::aa8a:2217:6ead:feb2
Recorder
oldest_recorder_run 8 March 2025 at 08:37
current_recorder_run 19 March 2025 at 19:45
estimated_db_size 251.07 MiB
database_engine sqlite
database_version 3.48.0

Additional information

Switching back to 6GB RAM does solve the issue...

But WTF happened here?

Increasing the RAM again, this time to 20GB, again shows 90% full.

@haldi4803 haldi4803 added the bug label Mar 19, 2025
@Impact123

It might be related to ballooning. The total column only showing 11G also hints at that. Can you also share the output of free -h from your TrueNAS host while this is happening?
I'm curious why you allocate so much to HAOS.

@WladyTee

My personal experience is on an Intel NUC (no VM, only HAOS) with 4GB.
Since OS 15, I've been experiencing multiple spontaneous reboots, database corruption, and slow operation with several integration timeouts.

It seems there are some concurrent tasks that make the system crash.
I'm investigating now. At first I suspected a hardware issue, but I still have to check and restore the previous backup.

I hope it helps.

sairon (Member) commented Mar 20, 2025

Impact123 wrote: "It might be related to ballooning."

That might be a good lead. I saw something similar in Proxmox when "Minimum memory" differed from the configured "Memory": once the VM started to use more memory and the allocation changed, it went haywire. With a fixed allocation, the problem went away.
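
To check the ballooning theory from inside the guest, the kernel's balloon counters can be inspected. Linux kernels built with CONFIG_MEMORY_BALLOON expose balloon_inflate and balloon_deflate in /proc/vmstat; both count pages. Whether the HAOS kernel enables that option is an assumption here, so the sketch below returns None when the counters are missing. The function names are hypothetical:

```python
from typing import Optional


def net_balloon_pages(vmstat_text: str) -> Optional[int]:
    """Return inflated minus deflated pages, or None if no balloon counters exist."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # /proc/vmstat lines have the form "<name> <integer>"
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    if "balloon_inflate" not in counters:
        return None
    return counters["balloon_inflate"] - counters.get("balloon_deflate", 0)


def pages_to_mib(pages: int, page_size: int = 4096) -> float:
    """Convert a page count to MiB, assuming 4 KiB pages (typical on x86-64)."""
    return pages * page_size / (1024 * 1024)
```

Feeding it the contents of /proc/vmstat, for example net_balloon_pages(open("/proc/vmstat").read()), gives a rough estimate of how much memory the balloon currently holds. A value close to the "missing" memory reported by free -h would support the ballooning explanation.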

@sairon sairon added board/ova Open Virtual Appliance (Virtual Machine) hypervisor/kvm KVM related issues labels Mar 20, 2025
@haldi4803 (Author)

Impact123 wrote: "I'm curious why you allocate so much to HAOS."

I was running Whisper with the tiny-int8 model and it couldn't get a single sentence right. That's why I thought I'd use a bigger model, but then the RAM was not enough.

Ballooning might be a good explanation!
I had set a minimum of 4GB and increased only the maximum.

@haldi4803 (Author)

Yeah.
Would you look at that?
I've set RAM fixed to 12 GiB and it works!
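
For anyone wanting to verify this kind of setting programmatically: on libvirt-based hypervisors, the domain XML carries a memory element (the maximum) and an optional currentMemory element (the actual allocation at boot, which defaults to the maximum when omitted). A minimal sketch that checks whether the two match, that is, whether the allocation is fixed. How TrueNAS maps its "Minimum Memory" setting onto these elements is an assumption here, the function names are hypothetical, and only the KiB, MiB, and GiB units are handled:

```python
import xml.etree.ElementTree as ET

# Only a subset of the units libvirt accepts; anything else raises KeyError.
_UNIT_TO_KIB = {"KiB": 1, "MiB": 1024, "GiB": 1024 ** 2}


def _to_kib(element):
    """Convert a <memory>-style element to KiB (libvirt's default unit is KiB)."""
    unit = element.get("unit", "KiB")
    return int(element.text.strip()) * _UNIT_TO_KIB[unit]


def memory_limits_kib(domain_xml):
    """Return (current, maximum) memory in KiB from libvirt domain XML."""
    root = ET.fromstring(domain_xml)
    maximum = _to_kib(root.find("memory"))
    current_element = root.find("currentMemory")
    # An omitted <currentMemory> defaults to the <memory> value.
    current = _to_kib(current_element) if current_element is not None else maximum
    return current, maximum


def is_fixed_allocation(domain_xml):
    """True when the boot-time allocation equals the maximum (no ballooning range)."""
    current, maximum = memory_limits_kib(domain_xml)
    return current == maximum
```

With a 4 GiB minimum and a 12 GiB maximum, as in the configuration that misbehaved, is_fixed_allocation would return False; with both at 12 GiB it returns True.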

(takes off pirate hat) Let's change the title accordingly...

@haldi4803 haldi4803 changed the title WHY IS THE RAM GONE??? [VM] Dynamic RAM allocation going haywire. Mar 20, 2025