--- base_model: GreenBitAI/Llama-3.1-Nemotron-70B-Instruct-layer-mix-bpw-4.0-test license: apache-2.0 tags: - mlx --- # GreenBitAI/Llama-3.1-Nemotron-70B-Instruct-layer-mix-bpw-4.0-mlx This quantized low-bit model [GreenBitAI/Llama-3.1-Nemotron-70B-Instruct-layer-mix-bpw-4.0-mlx](https://huggingface.co/GreenBitAI/Llama-3.1-Nemotron-70B-Instruct-layer-mix-bpw-4.0-mlx) was converted to MLX format from [`GreenBitAI/Llama-3.1-Nemotron-70B-Instruct-layer-mix-bpw-4.0-test`](https://huggingface.co/GreenBitAI/Llama-3.1-Nemotron-70B-Instruct-layer-mix-bpw-4.0-test) using gbx-lm version **0.3.4**. Refer to the [original model card](https://huggingface.co/GreenBitAI/Llama-3.1-Nemotron-70B-Instruct-layer-mix-bpw-4.0-test) for more details on the model. ## Use with mlx ```bash pip install gbx-lm ``` ```python from gbx_lm import load, generate model, tokenizer = load("GreenBitAI/Llama-3.1-Nemotron-70B-Instruct-layer-mix-bpw-4.0-mlx") response = generate(model, tokenizer, prompt="hello", verbose=True) ```