mamba2-8b-3t-4k-hf

This repository contains a Hugging Face Transformers-compatible conversion of nvidia/mamba2-8b-3t-4k.

Notes

  • Source checkpoint format: Megatron-LM
  • Target format: Hugging Face Transformers
  • Loaded via Mamba2ForCausalLM
  • Original SentencePiece tokenizer file is preserved in this repo
  • The tokenizer is a T5Tokenizer wrapper chosen for practical compatibility. It is not a byte-for-byte clone of Megatron's GPTSentencePiece tokenizer, so its output is not guaranteed to match Megatron's exactly
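
Because the tokenizer is a wrapper rather than an exact clone, it can be useful to spot-check it against the preserved SentencePiece file. The following is a minimal sketch, not part of the conversion itself: the helper name find_tokenizer_mismatches, the sample strings, and the local path "tokenizer.model" are all assumptions made for illustration (the actual tokenizer filename in the repo is not stated here), and it assumes the sentencepiece package is installed.

```python
from typing import Callable, Iterable, List, Tuple

Encoder = Callable[[str], List[int]]


def find_tokenizer_mismatches(
    encode_a: Encoder, encode_b: Encoder, texts: Iterable[str]
) -> List[Tuple[str, List[int], List[int]]]:
    """Return (text, ids_a, ids_b) for every text the two encoders tokenize differently."""
    mismatches = []
    for text in texts:
        ids_a, ids_b = list(encode_a(text)), list(encode_b(text))
        if ids_a != ids_b:
            mismatches.append((text, ids_a, ids_b))
    return mismatches


def compare_with_raw_sentencepiece(spm_path: str = "tokenizer.model") -> None:
    """Compare the HF tokenizer with the raw SentencePiece model on a few samples.

    ASSUMPTION: spm_path is a local copy of the SentencePiece file from the repo;
    the default filename is hypothetical.
    """
    # Heavy imports stay local so the helper above has no dependencies.
    import sentencepiece as spm
    from transformers import AutoTokenizer

    hf_tok = AutoTokenizer.from_pretrained("ib-ssm/mamba2-8b-3t-4k-hf")
    sp = spm.SentencePieceProcessor(model_file=spm_path)

    samples = ["Hello, world!", "  leading spaces", "def f(x):\n    return x"]
    diffs = find_tokenizer_mismatches(
        # Special tokens are disabled so only the raw pieces are compared.
        lambda t: hf_tok.encode(t, add_special_tokens=False),
        lambda t: sp.encode(t),
        samples,
    )
    for text, hf_ids, sp_ids in diffs:
        print(repr(text), hf_ids, sp_ids)
    print(f"{len(diffs)} of {len(samples)} samples differ")
```

Calling compare_with_raw_sentencepiece() with the path to the downloaded file prints any samples whose token IDs differ. An empty result on a handful of strings does not prove equivalence; it only fails to find a difference.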

Loading

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "ib-ssm/mamba2-8b-3t-4k-hf"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",  # use the dtype recorded for the checkpoint
    device_map="auto",   # automatic device placement; requires the accelerate package
)
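
Once the model and tokenizer are loaded, text can be generated with the standard generate API. The sketch below is illustrative only: the helper name generate_text, the prompt, and the greedy decoding settings are assumptions, not recommendations from the original checkpoint. Passing add_special_tokens=False is also an assumption: a stock T5Tokenizer appends an end-of-sequence token, which is usually unwanted at the end of a causal prompt, and whether this repo's wrapper does so is not stated.

```python
def generate_text(model, tokenizer, prompt: str, max_new_tokens: int = 32) -> str:
    """Greedy-decode a continuation of `prompt` and return the decoded text."""
    # Tokenize to PyTorch tensors and move them to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
    inputs = inputs.to(model.device)
    # Only input_ids are passed; do_sample=False selects greedy decoding.
    output_ids = model.generate(
        inputs["input_ids"],
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    # The output contains the prompt tokens followed by the new tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def demo() -> None:
    """Load the converted checkpoint and print one short completion."""
    # Imported here so generate_text itself has no hard dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "ib-ssm/mamba2-8b-3t-4k-hf"
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, torch_dtype="auto", device_map="auto"
    )
    print(generate_text(model, tokenizer, "The capital of France is"))
```

Calling demo() downloads the full checkpoint, so it is best run on a machine with enough accelerator memory for an 8B-parameter model.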

Weights: Safetensors, 8B params, BF16 tensors.