---
library_name: transformers
license: apache-2.0
---

# BiMamba

This repository wraps a bidirectional Mamba module in Hugging Face-compatible APIs/classes.

To use BiMamba as a drop-in replacement for other Hugging Face models, you can use the following code:

```python
"""Sample code for initializing BiMamba from the template HF hub model."""
import torch
from transformers import AutoConfig, AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

model_name_or_path = "yairschiff/bimamba-template"
config_overrides = {
    "d_model": 128,  # TODO: Change this as desired
    "n_layer": 2,  # TODO: Change this as desired
    "pad_token_id": tokenizer.pad_token_id,
    "vocab_size": tokenizer.vocab_size,
    "pad_vocab_size_multiple": 1,
    # TODO: See configuration_bimamba for all config options
}
config = AutoConfig.from_pretrained(
    model_name_or_path,
    **config_overrides,
    trust_remote_code=True
)
model = AutoModelForMaskedLM.from_config(
    config=config,
    trust_remote_code=True
)

# Test the model with a forward pass on a sample input
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

inputs = ["A sample sentence for model testing."]
tokenized = tokenizer(inputs, return_tensors="pt")
model_out = model(tokenized["input_ids"].to(device))
```

## Model Card Contact

Yair Schiff (yzs2@cornell.edu)
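The `pad_vocab_size_multiple` override is set to 1 above, which disables vocabulary padding. Mamba-style models commonly round the vocabulary size up to the nearest multiple of a small power of two (e.g. 8) so the embedding and output-projection matrix dimensions are friendlier to GPU kernels. The sketch below illustrates that rounding convention; it is an assumption based on the standard Mamba reference implementation, not code taken from this repository (`padded_vocab_size` is a hypothetical helper name):

```python
def padded_vocab_size(vocab_size: int, multiple: int) -> int:
    """Round vocab_size up to the nearest multiple (no-op when multiple == 1)."""
    if multiple > 1 and vocab_size % multiple != 0:
        vocab_size += multiple - (vocab_size % multiple)
    return vocab_size

# BERT's vocabulary of 30522 tokens padded to a multiple of 8 becomes 30528;
# with multiple=1, as in the config overrides above, it is left unchanged.
print(padded_vocab_size(30522, 8))  # -> 30528
print(padded_vocab_size(30522, 1))  # -> 30522
```

If you do pad the vocabulary, make sure `vocab_size` in the config reflects the tokenizer's true size and let the model apply the padding, so token IDs from the tokenizer remain valid indices.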