# Releases

## v0.22.0

### Highlights
- Export and import MLX functions to a file (example, bigger example)
  - Functions can be exported from Python and run in C++, and vice versa
### Core
- Add `slice` and `slice_update` variants which take arrays for the starting locations
- Add an example for using MLX in C++ with CMake
- Fused attention for generation now supports boolean masking (benchmark)
- Allow an array offset for `mx.fast.rope`
- Add `mx.finfo`
- Allow negative strides without resorting to copying for `slice` and `as_strided`
- Add `Flatten`, `Unflatten`, and `ExpandDims` primitives
- Enable the compilation of lambdas in C++
- Add many more primitives for shapeless compilation (full list)
- Fix a performance regression in `qvm`
- Introduce separate types for `Shape` and `Strides`, and switch strides from `uint64` to `int64`
- Reduce copies in the fused-attention kernel
- Recompile a function when the stream changes
- Several steps to improve the Linux / x86_64 experience (#1625, #1627, #1635)
- Several steps to improve/enable the Windows experience (#1628, #1660, #1662, #1661, #1672, #1663, #1664, ...)
- Update to a newer Metal-cpp
- Throw when exceeding the maximum possible number of buffers
- Add `mx.kron`
- `mx.distributed.send` now implements the identity function instead of returning an empty array
- Better error reporting for `mx.compile` on the CPU and for unrecoverable errors
### NN
- Add optional bias correction in Adam/AdamW
- Enable mixed quantization via `nn.quantize`
- Remove reshapes from `nn.QuantizedEmbedding`
### Bug fixes
- Fix a `qmv`/`qvm` bug for batch sizes 2-5
- Fix some leaks and races (#1629)
- Fix transformer post-norm in `mlx.nn`
- Fix some `mx.fast` fallbacks
- Fix the hashing of string constants in `compile`
- Fix small sorts in Metal
- Fix a memory leak of non-evaled arrays with siblings
- Fix the `concatenate`/`slice_update` vjp in the edge case where the inputs have different types