A JavaScript implementation of Llama 3 using node-mlx, with code adapted from mlx-examples.
Quantized models can only run on Macs with Apple Silicon.
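Before downloading a quantized model, it may help to check whether the current machine is an Apple Silicon Mac. The sketch below is not part of this package or of node-mlx: `canRunQuantized` is a hypothetical helper that relies only on Node.js's built-in `process.platform` and `process.arch`. It assumes Node.js is running natively; a Node.js binary running under Rosetta reports `x64` and would be treated as unsupported.

```javascript
// Hypothetical helper for illustration; not provided by llama3 or node-mlx.
// Parameters default to the values reported by the running Node.js process.
function canRunQuantized(platform = process.platform, arch = process.arch) {
  // A native Node.js build on an Apple Silicon Mac reports
  // platform 'darwin' and arch 'arm64'.
  return platform === 'darwin' && arch === 'arm64';
}

if (canRunQuantized()) {
  console.log('Apple Silicon detected: quantized weights should be usable.');
} else {
  console.log('Not an Apple Silicon Mac: choose non-quantized weights instead.');
}
```

Passing the platform and architecture as arguments keeps the check easy to test on any machine.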
Download weights (more can be found at mlx-community):
npm install -g @frost-beta/huggingface
huggingface download --to weights mlx-community/Meta-Llama-3-8B-Instruct-8bit
Start chatting:
npm install -g llama3
llama3-chat ./weights
Or do text generation:
llama3-generate ./weights 'Write a short story'
This project serves as a demo of node-mlx, and the code is intended to be kept as simple as possible.
For a general-purpose LLM module, please visit llm.js.