
C++ performance and optimization #1767

Open · wustus opened this issue Jan 14, 2025 · 0 comments

wustus commented Jan 14, 2025

I initially implemented a neural network in C++, following Nielsen's book 'Neural Networks and Deep Learning' and using Eigen as the linear algebra backend.
After reimplementing the same network for different datasets, I reached one whose input was larger than before (a 3072×1 vector) and where one epoch took ~30 seconds to train, so I wanted to look into GPU training. After some research I found MLX and swapped the Eigen matrices out for MLX arrays (sketched below).
Doing this for a smaller project (KMNIST, 28×28 grayscale images, the same format as MNIST), I get extremely slow code compared to Eigen.
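Roughly, the swap looks like this; a minimal sketch with illustrative names (`W`, `b`, `a` and the function names are not the actual code from the repo). Eigen executes each expression eagerly on the CPU, while MLX only records a lazy graph until something forces evaluation:

```cpp
#include <Eigen/Dense>
#include "mlx/mlx.h"

namespace mx = mlx::core;

// Eigen version: runs eagerly on the CPU as soon as it is called.
Eigen::VectorXf feedforward_eigen(const Eigen::MatrixXf& W,
                                  const Eigen::VectorXf& b,
                                  const Eigen::VectorXf& a) {
  Eigen::VectorXf z = W * a + b;
  // Coefficient-wise sigmoid: 1 / (1 + exp(-z)).
  return (1.0f + (-z.array()).exp()).inverse().matrix();
}

// MLX version: only builds a lazy computation graph; nothing executes
// until mx::eval() is called on the result (or something derived from it).
mx::array feedforward_mlx(const mx::array& W,
                          const mx::array& b,
                          const mx::array& a) {
  return mx::sigmoid(mx::matmul(W, a) + b);
}
```

For the KMNIST case, `a` would be a 784×1 activation vector (28×28 flattened).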

My current code can be found here: https://github.com/wustus/kuzushiji

I'm trying to chain my operations and only evaluate every few batches (first sketch below).
When I tried to refactor, for example, the sigmoid_prime method and compile it for a speedup, I got a segmentation fault (second sketch below).
I know I'm probably better off using MLX's built-in SGD optimizer and, I suppose, Module to create the actual network, but I'd like to use MLX as a plain linear algebra library to learn as much as I can about the underlying techniques.
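The chaining pattern I mean is roughly this; again a minimal sketch, where `train_epoch`, `eval_every`, and the batch container are illustrative rather than the actual code:

```cpp
#include <utility>
#include <vector>
#include "mlx/mlx.h"

namespace mx = mlx::core;

void train_epoch(std::vector<mx::array>& params,
                 const std::vector<std::pair<mx::array, mx::array>>& batches) {
  const int eval_every = 4;  // illustrative interval
  int step = 0;
  for (const auto& [x, y] : batches) {
    // ... build the forward/backward graph from (x, y) and update
    //     `params` lazily; no kernels run here ...
    if (++step % eval_every == 0) {
      mx::eval(params);  // force the accumulated graph to actually run
    }
  }
  mx::eval(params);  // flush whatever work is left at the end of the epoch
}
```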
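And the sigmoid_prime refactor that segfaults is shaped roughly like this, assuming the mx::compile overload that maps a std::vector<mx::array> to a std::vector<mx::array> (names and shapes are illustrative, not the exact code from the repo):

```cpp
#include <vector>
#include "mlx/mlx.h"

namespace mx = mlx::core;

// sigma'(z) = sigma(z) * (1 - sigma(z))
std::vector<mx::array> sigmoid_prime(const std::vector<mx::array>& inputs) {
  auto s = mx::sigmoid(inputs[0]);
  return {s * (mx::ones_like(s) - s)};
}

int main() {
  auto compiled = mx::compile(sigmoid_prime);
  auto z = mx::random::normal({30, 1});  // illustrative shape
  auto out = compiled({z})[0];
  mx::eval({out});
}
```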
