-
Does anyone know why the size of our C++ DLLs (CUDA) has suddenly increased to nearly 700 MB, compared to around 60 MB previously? I've also noticed some new files, including ggml-cuda.dll, which is over 600 MB. Could this be due to incorrect dynamic linking, instead of statically linking only the necessary code? Thank you in advance for your help!
-
When was the last time you built llama.cpp? The CUDA build has been large for some time now. It would probably be smaller if you built it only for your device's architecture rather than for a generic set of architectures. You may be able to make it even smaller by building with the compression option.
-
I believe I've identified the issue. I now have a smaller ggml-cuda.dll (51 MB for the Windows Release build with one architecture and 157 MB with two architectures). The issue seems to stem from the -arch=native option. NVCC doesn't support this option, but it appears the code requires it for some reason. I had previously removed it, but I've now added it back. Additionally, the architecture(s) need to be defined manually (e.g., CMAKE_CUDA_ARCHITECTURES="61;89") while ensuring that -arch=native remains present in the CMake script.
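For anyone landing here later, a rough sketch of the configure step described above. It assumes a recent llama.cpp checkout where the CUDA backend is switched on with GGML_CUDA (older checkouts used a differently named option) and a build directory called build, which is only an illustrative name. The architecture list is the one from this post, so replace it with the compute capabilities of your own GPUs.

```shell
# Configure with an explicit architecture list instead of a generic build.
# GGML_CUDA and the "build" directory are assumptions about your checkout;
# "61;89" is the example list from this thread.
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="61;89"

# Build the Release configuration (the one the DLL sizes above refer to).
cmake --build build --config Release
```

Each entry in the list adds another set of compiled kernels to ggml-cuda.dll, which would be consistent with the jump from 51 MB to 157 MB when going from one architecture to two.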