clibrain
/

mamba-2.8b-chat-no_robots

Text Generation

Inference Endpoints

Model card Files Files and versions Metrics Training metrics Community

mamba-2.8b-chat-no_robots / README.md

mrm8488's picture

Update README.md

606fa26 10 months ago

|

No virus

2.96 kB

	---
	license: wtfpl
	datasets:
	- HuggingFaceH4/no_robots
	pipeline_tag: text-generation
	---

	# MAMBA (2.8B) 🐍 fine-tuned on H4/no_robots dataset for chat / instruction

	Model Card is still WIP!


	## Base model info

	Mamba is a new state space model architecture showing promising performance on information-dense data such as language modeling, where previous subquadratic models fall short of Transformers.
	It is based on the line of progress on [structured state space models](https://github.com/state-spaces/s4),
	with an efficient hardware-aware design and implementation in the spirit of [FlashAttention](https://github.com/Dao-AILab/flash-attention).

	## Dataset info

	_Look Ma, an instruction dataset that wasn't generated by GPTs!_

	### Dataset Description

	- Repository: https://github.com/huggingface/alignment-handbook
	- Paper:
	- Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
	- Point of Contact: Lewis Tunstall

	#### Dataset Summary

	No Robots is a high-quality dataset of 10,000 instructions and demonstrations created by skilled human annotators. This data can be used for supervised fine-tuning (SFT) to make language models follow instructions better. No Robots was modelled after the instruction dataset described in OpenAI's [InstructGPT paper](https://huggingface.co/papers/2203.02155), and is comprised mostly of single-turn instructions across the following categories:

	\| Category \| Count \|
	\|:-----------\|--------:\|
	\| Generation \| 4560 \|
	\| Open QA \| 1240 \|
	\| Brainstorm \| 1120 \|
	\| Chat \| 850 \|
	\| Rewrite \| 660 \|
	\| Summarize \| 420 \|
	\| Coding \| 350 \|
	\| Classify \| 350 \|
	\| Closed QA \| 260 \|
	\| Extract \| 190 \|


	## Usage

	```py
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

	CHAT_TEMPLATE_ID = "HuggingFaceH4/zephyr-7b-beta"

	device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

	eos_token = "<\|endoftext\|>"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	tokenizer.eos_token = eos_token
	tokenizer.pad_token = tokenizer.eos_token
	tokenizer.chat_template = AutoTokenizer.from_pretrained(CHAT_TEMPLATE_ID).chat_template

	model = MambaLMHeadModel.from_pretrained(
	model_name, device=device, dtype=torch.float16)

	history_dict: list[dict[str, str]] = []
	prompt = "Tell me 5 sites to visit in Spain"
	history_dict.append(dict(role="user", content=prompt))

	input_ids = tokenizer.apply_chat_template(
	history_dict, return_tensors="pt", add_generation_prompt=True
	).to(device)

	out = model.generate(
	input_ids=input_ids,
	max_length=2000,
	temperature=0.9,
	top_p=0.7,
	eos_token_id=tokenizer.eos_token_id,
	)

	decoded = tokenizer.batch_decode(out)
	assistant_message = (
	decoded[0].split("<\|assistant\|>\n")[-1].replace(eos, "")
	)

	print(assistant_message)
	```

	## Evaluations

	Coming soon!