---
license: apache-2.0
language:
- en
datasets:
- HuggingFaceFW/fineweb-edu
- allenai/dolma
- Skylion007/openwebtext
- NeuML/wikipedia-20250123
pipeline_tag: text-generation
---
# Qwen2-96M
Qwen2-96M is a small language model based on the Qwen2 architecture, trained from scratch on English datasets with a context length of 8,192 tokens. With only 96 million parameters, it is intended as a lightweight base model that can be fine-tuned for specific tasks.
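For a quick sanity check, the parameter count and configured context window can be read off the loaded checkpoint. This is only a sketch; it assumes the config exposes the standard `max_position_embeddings` field used by Qwen2-style models.

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_path = "Felladrin/Qwen2-96M"
config = AutoConfig.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Total parameters (~96M) and configured context length (8192).
print(sum(p.numel() for p in model.parameters()))
print(config.max_position_embeddings)
```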
Due to its compact size, the model has significant limitations in reasoning, factual knowledge, and general capabilities compared to larger models. It may produce incorrect, irrelevant, or nonsensical outputs. Additionally, as it was trained on internet text data, it may contain biases and potentially generate inappropriate content.
## Usage
```sh
pip install transformers==4.49.0 torch==2.6.0
```
```python
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer, TextStreamer
import torch

model_path = "Felladrin/Qwen2-96M"
prompt = "I've been thinking about"

# Load the tokenizer and model, moving the model to GPU if one is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).to(device)

# Stream generated tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer)

generate = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device,
    streamer=streamer,
)

generate(
    prompt,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    max_length=tokenizer.model_max_length,
    truncation=True,
    do_sample=True,
    repetition_penalty=1.05,
)
```
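Beyond inference, the card describes this checkpoint as a base model meant to be fine-tuned. The snippet below is a minimal sketch of a causal-LM fine-tune with the Hugging Face `Trainer`; the training file (`train.txt`), hyperparameters, and output directory are illustrative placeholders, not settings recommended by the model author.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_path = "Felladrin/Qwen2-96M"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Make sure padding is defined before batching.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Placeholder corpus: a plain-text file with one document per line.
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# Causal LM objective: labels are the input ids (shifted inside the model), no masking.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="qwen2-96m-finetuned",
        per_device_train_batch_size=8,
        num_train_epochs=1,
        learning_rate=5e-5,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```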