Model Card for Soniox-7B-v1.0

Soniox 7B is a large language model that supports English and code with an 8K context window. It is built on top of Mistral 7B and enhanced with additional pre-training and fine-tuning for strong problem-solving capabilities. It matches GPT-4 performance on some benchmarks. The model is released under the Apache 2.0 license. For more details, please read our blog post.

Usage in Transformers

The model is available in transformers and can be used as follows:

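import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model weights in half precision, together with the matching tokenizer.
model_path = "soniox/Soniox-7B-v1.0"
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Move the model to the GPU.
device = "cuda"
model.to(device)

# A conversation is a list of alternating user and assistant turns,
# ending with the user message the model should answer.
messages = [
    {"role": "user", "content": "12 plus 21?"},
    {"role": "assistant", "content": "33."},
    {"role": "user", "content": "Five minus one?"},
]
# Format the conversation with the model's chat template and tokenize it.
tok_prompt = tokenizer.apply_chat_template(messages, return_tensors="pt")

# Sample up to 1000 new tokens; the output contains the prompt followed by the reply.
model_inputs = tok_prompt.to(device)
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])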
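
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the printed text above includes the whole conversation. The following is a minimal sketch of how the reply alone could be separated out. It uses plain Python lists so that it runs without a GPU. The helper name `new_tokens_only` and the token IDs are made up for illustration and are not part of the Soniox or transformers API.

```python
from typing import List


def new_tokens_only(prompt_ids: List[int], output_ids: List[int]) -> List[int]:
    """Return the part of output_ids that follows the prompt.

    Assumes output_ids begins with prompt_ids, which is how decoder-only
    models return sequences from generate().
    """
    n = len(prompt_ids)
    if output_ids[:n] != prompt_ids:
        raise ValueError("output does not start with the prompt")
    return output_ids[n:]


# Made-up token IDs, for illustration only.
prompt = [1, 733, 16289, 28793]
output = prompt + [28781, 28723, 2]

reply = new_tokens_only(prompt, output)
print(reply)
```

With the tensors from the snippet above, the equivalent slice would be `generated_ids[:, model_inputs.shape[1]:]`. Passing `skip_special_tokens=True` to `tokenizer.batch_decode` should then drop the end-of-sequence marker from the decoded reply.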

Inference deployment

Refer to our documentation for inference with vLLM and other deployment options.

Model size: 7.24B params (Safetensors). Tensor type: BF16.