---
license: wtfpl
datasets:
- HuggingFaceH4/CodeAlpaca_20K
pipeline_tag: text-generation
thumbnail: https://huggingface.co/mrm8488/mamba-coder/resolve/main/mamba-coder-no-bg.png
language:
- en
- code
---
# Mamba-Coder
## MAMBA (2.8B) 🐍 fine-tuned on CodeAlpaca_20K for code generation
<div style="text-align:center;width:250px;height:250px;">
<img src="https://huggingface.co/mrm8488/mamba-coder/resolve/main/mamba-coder-no-bg.png" alt="mamba-coder logo">
</div>
## Base model info
Mamba is a new state space model architecture showing promising performance on information-dense data such as language modeling, where previous subquadratic models fall short of Transformers.
It is based on the line of progress on [structured state space models](https://github.com/state-spaces/s4),
with an efficient hardware-aware design and implementation in the spirit of [FlashAttention](https://github.com/Dao-AILab/flash-attention).
## Dataset info
[CodeAlpaca_20K](https://huggingface.co/datasets/HuggingFaceH4/CodeAlpaca_20K) contains 20K instruction-following examples that were used to fine-tune the Code Alpaca model.
## Usage
```sh
pip install torch==2.1.0 transformers==4.35.0 causal-conv1d==1.0.0 mamba-ssm==1.0.1
```
```py
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# The chat template is borrowed from the Zephyr tokenizer
CHAT_TEMPLATE_ID = "HuggingFaceH4/zephyr-7b-beta"

device = "cuda:0" if torch.cuda.is_available() else "cpu"
model_name = "mrm8488/mamba-coder"
eos_token = "<|endoftext|>"

# Load the tokenizer, then set the EOS/pad tokens and the chat template
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.eos_token = eos_token
tokenizer.pad_token = tokenizer.eos_token
tokenizer.chat_template = AutoTokenizer.from_pretrained(CHAT_TEMPLATE_ID).chat_template

# Load the fine-tuned Mamba weights in half precision
model = MambaLMHeadModel.from_pretrained(
    model_name, device=device, dtype=torch.float16
)

# Build a single-turn conversation and tokenize it with the chat template
messages = []
prompt = "Write a bash script to remove .tmp files"
messages.append(dict(role="user", content=prompt))

input_ids = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(device)

# Generate until EOS or until max_length tokens (prompt included) is reached
out = model.generate(
    input_ids=input_ids,
    max_length=2000,
    temperature=0.9,
    top_p=0.7,
    eos_token_id=tokenizer.eos_token_id,
)

# Keep only the text after the last assistant tag and drop the EOS token
decoded = tokenizer.batch_decode(out)
assistant_message = (
    decoded[0].split("<|assistant|>\n")[-1].replace(eos_token, "")
)
print(assistant_message)
```
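The post-processing at the end of the snippet above can be factored into a small helper, which is handy when decoding several prompts. This is a minimal sketch: `extract_assistant_reply` is a hypothetical name that is not part of `transformers` or `mamba_ssm`, the `sample` string is made-up output used only for illustration, and the final `.strip()` is an addition that the original snippet does not perform.

```py
def extract_assistant_reply(
    decoded: str,
    assistant_tag: str = "<|assistant|>\n",
    eos_token: str = "<|endoftext|>",
) -> str:
    """Return the text after the last assistant tag, without EOS tokens."""
    # Everything after the last assistant tag is the model's answer
    reply = decoded.split(assistant_tag)[-1]
    # Remove EOS markers and surrounding whitespace (strip is an assumption)
    return reply.replace(eos_token, "").strip()


# Hypothetical decoded output, shaped like the chat format used above
sample = (
    "<|user|>\nWrite a bash script to remove .tmp files<|endoftext|>\n"
    "<|assistant|>\nfind . -name '*.tmp' -delete<|endoftext|>"
)
print(extract_assistant_reply(sample))
```

With the variables from the usage snippet, the call would be `extract_assistant_reply(decoded[0])`. If the assistant tag is missing, the helper returns the whole string with the EOS tokens removed.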
## Gradio Demo
```sh
git clone https://github.com/mrm8488/mamba-chat.git
cd mamba-chat
pip install -r requirements.txt
pip install -q gradio==4.8.0
python app.py \
    --model mrm8488/mamba-coder \
    --share
```
## Evaluations
Coming soon!
## Citation
```bibtex
@misc {manuel_romero_2024,
author = { {Manuel Romero} },
title = { mamba-coder (Revision 214a13a) },
year = 2024,
url = { https://huggingface.co/mrm8488/mamba-coder },
doi = { 10.57967/hf/1673 },
publisher = { Hugging Face }
}
```
## Acknowledgments
Thanks to [mamba-chat](https://github.com/havenhq/mamba-chat/tree/main), which heavily inspired this work.