---
license: apache-2.0
language:
- en
tags:
- mamba
- mlx
- cartesia
---

# Model Card for mamba2-130m-8bit-mlx

This is an [MLX](https://ml-explore.github.io/mlx)-compatible version of the [mamba2-130m](https://huggingface.co/state-spaces/mamba2-130m) model, quantized to 8 bits. It uses the [EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer. For more details, see our [blog post](https://cartesia.ai/blog/on-device).

## Usage

### Installation

This model requires the `cartesia-metal` and `cartesia-mlx` packages.

Installation requires Xcode, which can be downloaded from https://developer.apple.com/xcode/. Accept the license agreement with:

```shell
sudo xcodebuild -license
```

Install the required dependencies: the exact version of `nanobind`, followed by `cartesia-metal`, and finally `cartesia-mlx`, with the following commands:

```shell
pip install nanobind@git+https://github.com/wjakob/nanobind.git@2f04eac452a6d9142dedb957701bdb20125561e4
pip install git+https://github.com/cartesia-ai/edge.git#subdirectory=cartesia-metal
pip install cartesia-mlx
```

Note: This package has been tested on macOS Sonoma 14.1 with the M3 chip.

### Generation example

```python
import mlx.core as mx
import cartesia_mlx as cmx

model = cmx.from_pretrained("cartesia-ai/mamba2-130m-8bit-mlx")
model.set_dtype(mx.float32)

prompt = "Rene Descartes was"

print(prompt, end="", flush=True)
for text in model.generate(
    prompt,
    max_tokens=500,
    eval_every_n=5,
    verbose=True,
    top_p=0.99,
    temperature=0.85,
):
    print(text, end="", flush=True)
```

## About Cartesia

At [Cartesia](https://cartesia.ai/), we're building real-time multimodal intelligence for every device.