legmlai/openhermes-fr
How to use Tonic/petite-elle-L-aime-3-sft with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="Tonic/petite-elle-L-aime-3-sft")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Tonic/petite-elle-L-aime-3-sft")
model = AutoModelForCausalLM.from_pretrained("Tonic/petite-elle-L-aime-3-sft")
messages = [
{"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use Tonic/petite-elle-L-aime-3-sft with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Tonic/petite-elle-L-aime-3-sft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Tonic/petite-elle-L-aime-3-sft",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, use Docker:
docker model run hf.co/Tonic/petite-elle-L-aime-3-sft
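The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the vLLM server is listening on `localhost:8000` as in the commands above; the helper names `build_chat_request` and `ask` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="Tonic/petite-elle-L-aime-3-sft",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat endpoint (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the reply text; requires a running server."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example call (needs the server to be up):
# print(ask("What is the capital of France?"))
```

The same helper should work against the SGLang server described next by passing `base_url="http://localhost:30000"`, since both expose the same endpoint path.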
How to use Tonic/petite-elle-L-aime-3-sft with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "Tonic/petite-elle-L-aime-3-sft" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Tonic/petite-elle-L-aime-3-sft",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Tonic/petite-elle-L-aime-3-sft" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Tonic/petite-elle-L-aime-3-sft",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use Tonic/petite-elle-L-aime-3-sft with Docker Model Runner:
docker model run hf.co/Tonic/petite-elle-L-aime-3-sft
| Model | Average | BBH-fr | GPQA-fr | IFEval-fr | MUSR-fr | MATH Lvl5-fr | MMMLU-fr |
|---|---|---|---|---|---|---|---|
| HuggingFaceTB/SmolLM3-3B | 20.83% | 24.30% | 11.18% | 12.76% | 6.64% | 24.96% | 45.11% |
| Tonic/petite-elle-L-aime-3-sft | 18.99% | 19.06% | 10.29% | 11.73% | 6.74% | 21.06% | 45.05% |
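The Average column appears to be the unweighted mean of the six benchmark scores. This is an inference from the numbers in the table, not something the card states. A quick check, with the scores copied from the rows above:

```python
# Benchmark scores (percent) copied from the table above, in column order:
# BBH-fr, GPQA-fr, IFEval-fr, MUSR-fr, MATH Lvl5-fr, MMMLU-fr
scores = {
    "HuggingFaceTB/SmolLM3-3B": [24.30, 11.18, 12.76, 6.64, 24.96, 45.11],
    "Tonic/petite-elle-L-aime-3-sft": [19.06, 10.29, 11.73, 6.74, 21.06, 45.05],
}
# Averages as reported in the table
reported = {
    "HuggingFaceTB/SmolLM3-3B": 20.83,
    "Tonic/petite-elle-L-aime-3-sft": 18.99,
}


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


for name, values in scores.items():
    print(f"{name}: computed {mean(values):.3f}%, reported {reported[name]:.2f}%")
```

Both computed means agree with the reported averages to within rounding.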
petite-elle-L-aime-3-sft's Performance Profile

petite-elle-L-aime-3-sft outperforms SmolLM3 on the following tasks:

- Compréhension des sports (BBH): 57.6% accuracy, 3.6 percentage points above SmolLM3.
- Hyperbate (BBH): 53.6% vs 52.8%.
- Prealgebra (Math): 48.7% accuracy, its strongest math result, 0.9 percentage points above SmolLM3.
- Diamond (GPQA): slightly better at 34.0% vs 33.5%.
- Murder Mysteries (MUSR): competitive at 50.4% vs 49.6%.
- Suivi objets mélangés, 5 objets (BBH): 18.0% vs 16.0% accuracy.
- petite-elle-L-aime-3-sft may have better spatial reasoning for moderate-complexity tasks.
- petite-elle-L-aime-3-sft demonstrates superior understanding of sports-related content.

| Task Category | Specific Task | petite-elle-L-aime-3-sft | SmolLM3 | Difference (SmolLM3 minus petite-elle-L-aime-3-sft) | Winner |
|---|---|---|---|---|---|
| MMLU (French) | Overall Accuracy | 50.55% | 50.60% | +0.05% | SmolLM3 |
| BBH (French) | Compréhension de la date | 39.2% | 52.4% | +13.2% | SmolLM3 |
| | Compréhension des sports | 57.6% | 54.0% | -3.6% | petite-elle-L-aime-3-sft |
| | Comptage d'objets | 48.0% | 50.4% | +2.4% | SmolLM3 |
| | Déduction logique (3 objets) | 60.4% | 68.0% | +7.6% | SmolLM3 |
| | Déduction logique (5 objets) | 39.6% | 46.8% | +7.2% | SmolLM3 |
| | Déduction logique (7 objets) | 28.0% | 39.6% | +11.6% | SmolLM3 |
| | Désambiguïsation QA | 34.8% | 56.0% | +21.2% | SmolLM3 |
| | Expressions booléennes | 44.8% | 53.6% | +8.8% | SmolLM3 |
| | Formes géométriques | 35.6% | 34.4% | -1.2% | petite-elle-L-aime-3-sft |
| | Hyperbate | 53.6% | 52.8% | -0.8% | petite-elle-L-aime-3-sft |
| | Jugement causal | 57.2% | 56.1% | -1.1% | petite-elle-L-aime-3-sft |
| | Naviguer | 58.4% | 64.0% | +5.6% | SmolLM3 |
| | Pingouins sur une table | 47.9% | 50.7% | +2.8% | SmolLM3 |
| | Raisonnement sur les objets colorés | 41.6% | 48.8% | +7.2% | SmolLM3 |
| | Recommandation de film | 39.2% | 58.8% | +19.6% | SmolLM3 |
| | Sarcasmes | 59.0% | 62.9% | +3.9% | SmolLM3 |
| | Sophismes formels | 52.8% | 54.0% | +1.2% | SmolLM3 |
| | Suivi objets mélangés (3 objets) | 33.6% | 34.4% | +0.8% | SmolLM3 |
| | Suivi objets mélangés (5 objets) | 18.0% | 16.0% | -2.0% | petite-elle-L-aime-3-sft |
| | Suivi objets mélangés (7 objets) | 12.8% | 15.6% | +2.8% | SmolLM3 |
| | Séquences temporelles | 40.8% | 44.4% | +3.6% | SmolLM3 |
| | Toile de mensonges | 51.2% | 51.2% | 0.0% | Tie |
| GPQA (French) | Diamond | 34.0% | 33.5% | -0.5% | petite-elle-L-aime-3-sft |
| | Extended | 34.1% | 34.3% | +0.2% | SmolLM3 |
| | Main | 30.1% | 32.4% | +2.3% | SmolLM3 |
| Math (French) | Algebra | 35.7% | 44.9% | +9.2% | SmolLM3 |
| | Counting & Probability | 6.1% | 11.7% | +5.6% | SmolLM3 |
| | Geometry | 11.7% | 11.7% | 0.0% | Tie |
| | Number Theory | 17.1% | 23.0% | +5.9% | SmolLM3 |
| | Prealgebra | 48.7% | 47.8% | -0.9% | petite-elle-L-aime-3-sft |
| | Precalculus | 7.1% | 10.7% | +3.6% | SmolLM3 |
| IFEval (French) | Prompt Level Strict | 2.5% | 2.5% | 0.0% | Tie |
| | Instruction Level Strict | 20.9% | 23.0% | +2.1% | SmolLM3 |
| | Prompt Level Loose | 3.7% | 3.5% | -0.2% | petite-elle-L-aime-3-sft |
| | Instruction Level Loose | 20.7% | 24.4% | +3.7% | SmolLM3 |
| MUSR (French) | Murder Mysteries | 50.4% | 49.6% | -0.8% | petite-elle-L-aime-3-sft |
| | Object Placements | 35.5% | 35.9% | +0.4% | SmolLM3 |
| | Team Allocation | 20.8% | 23.6% | +2.8% | SmolLM3 |
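In the table above, the Difference column is SmolLM3's score minus petite-elle-L-aime-3-sft's score, so a negative value means petite-elle-L-aime-3-sft wins the row. The sketch below reproduces that convention for a few rows copied from the table; the function name `compare` is illustrative only.

```python
def compare(petite, smollm3):
    """Return (difference, winner), where difference = SmolLM3 - petite-elle in points."""
    diff = round(smollm3 - petite, 1)
    if diff > 0:
        winner = "SmolLM3"
    elif diff < 0:
        winner = "petite-elle-L-aime-3-sft"
    else:
        winner = "Tie"
    return diff, winner


# (petite-elle score, SmolLM3 score) pairs taken from the table
rows = {
    "Compréhension des sports": (57.6, 54.0),
    "Désambiguïsation QA": (34.8, 56.0),
    "Toile de mensonges": (51.2, 51.2),
    "Prealgebra": (48.7, 47.8),
}

for task, (petite, smollm3) in rows.items():
    diff, winner = compare(petite, smollm3)
    print(f"{task}: {diff:+.1f} -> {winner}")
```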
This model is a fine-tuned version of SmolLM3-3B.
pip install torch transformers accelerate
pip install torchao # For quantized models
from transformers import pipeline
question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
# device="cuda" requires a GPU; use device="cpu" otherwise
generator = pipeline("text-generation", model="Tonic/petite-elle-L-aime-3-sft", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the main model
model = AutoModelForCausalLM.from_pretrained(
"Tonic/petite-elle-L-aime-3-sft",
device_map="auto",
torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("Tonic/petite-elle-L-aime-3-sft")
# Generate text
input_text = "What are we having for dinner?"
# The tokenizer returns input_ids and attention_mask; move both to the model's device
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
This repository also includes quantized versions of the model for improved efficiency:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
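# Load int4 quantized model (CPU optimized).
# The quantized files live in the int4/ folder of the same repository,
# so they are selected with subfolder= rather than appended to the repo id.
model = AutoModelForCausalLM.from_pretrained(
    "Tonic/petite-elle-L-aime-3-sft",
    subfolder="int4",
    device_map="cpu",
    torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("Tonic/petite-elle-L-aime-3-sft", subfolder="int4")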
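As a rough illustration of the size reduction, the sketch below compares the weight file sizes reported in this repository's file listing. It only reflects download size on disk, not runtime memory or speed, which the card does not report.

```python
# File sizes in GB as given in the repository file listing
full_precision_shards = [4.97, 1.18]  # model-00001/00002-of-00002.safetensors
int4_weights = 2.63                   # int4/pytorch_model.bin

full_size = sum(full_precision_shards)
ratio = full_size / int4_weights

print(f"full precision: {full_size:.2f} GB")
print(f"int4:           {int4_weights:.2f} GB")
print(f"size ratio:     {ratio:.2f}x")
```

By these figures the int4 weights are a little under half the size of the full-precision shards.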
This model was trained with custom monitoring:
exp_20250727_172526
Track Tonic Spaces
The model was fine-tuned on:
The model was evaluated using:
Track Tonic
If you use this model in your research, please cite:
@misc{Petite_Elle_L_Aime_3_SFT,
title={{Petite Elle L'Aime 3}},
author={{Joseph Pollack}},
year={2025},
url={https://huggingface.co/Tonic/petite-elle-L-aime-3-sft}
}
@misc{vonwerra2022trl,
title = {{TRL: Transformer Reinforcement Learning}},
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
year = 2020,
journal = {GitHub repository},
publisher = {GitHub},
howpublished = {\url{https://github.com/huggingface/trl}}
}
This model is licensed under the Apache 2.0 License.
petite-elle-L-aime-3-sft/
├── .gitattributes # 1.96 kB, Git configuration for file handling
├── README.md # 7.38 kB, Project overview and instructions
├── chat_template.jinja # 5.6 kB, Chat template
├── config.json # 1.92 kB, Model configuration
├── generation_config.json # 177 Bytes, Generation settings
├── model-00001-of-00002.safetensors # 4.97 GB, Model weights (part 1)
├── model-00002-of-00002.safetensors # 1.18 GB, Model weights (part 2)
├── model.safetensors.index.json # 26.9 kB, Model index
├── pytorch_model.bin # Main model weights
├── special_tokens_map.json # 289 Bytes, Special tokens
├── tokenizer.json # 17.2 MB, Tokenizer data
├── tokenizer_config.json # 50.4 kB, Tokenizer configuration
├── train_results.json # 182 Bytes, Training results
├── training_args.bin # 6.16 kB, Training arguments
├── training_results/ # Training results directory
├── checkpoint-4000/ # Training checkpoint at step 4000
├── checkpoint-5000/ # Training checkpoint at step 5000
├── checkpoint-6000/ # Training checkpoint at step 6000
├── checkpoint-7000/ # Training checkpoint at step 7000
├── checkpoint-8000/ # Training checkpoint at step 8000
│ ├── chat_template.jinja # 5.6 kB, Chat template
│ ├── config.json # 1.92 kB, Checkpoint configuration
│ ├── generation_config.json # 177 Bytes, Generation settings
│ ├── model-00001-of-00002.safetensors # 4.97 GB, Model weights (part 1)
│ ├── model-00002-of-00002.safetensors # 1.18 GB, Model weights (part 2)
│ ├── model.safetensors.index.json # 26.9 kB, Model index
│ ├── optimizer.pt # 12.3 GB, Optimizer state
│ ├── rng_state.pth # 14.6 kB, Random number generator state
│ ├── scheduler.pt # 1.47 kB, Scheduler state
│ ├── special_tokens_map.json # 289 Bytes, Special tokens
│ ├── tokenizer.json # 17.2 MB, Tokenizer data
│ ├── tokenizer_config.json # 50.4 kB, Tokenizer configuration
│ ├── trainer_state.json # 84.5 kB, Trainer state
│ └── training_args.bin # Training arguments
└── int4/ # Quantized model for CPU
    ├── README.md # 1.7 kB, Quantized model documentation
    ├── config.json # 2.67 kB, Quantized model configuration
    ├── generation_config.json # 177 Bytes, Generation settings
    ├── pytorch_model.bin # 2.63 GB, Quantized model weights
    ├── special_tokens_map.json # 289 Bytes, Special tokens
    ├── tokenizer.json # 17.2 MB, Tokenizer data
    └── tokenizer_config.json # Tokenizer configuration
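Because the repository also holds training checkpoints (including a 12.3 GB optimizer state), a full download fetches far more than inference needs. The following is a minimal sketch of a selective download with `huggingface_hub.snapshot_download` and its `ignore_patterns` argument. The pattern list and the helper names are assumptions chosen for this layout, and `is_skipped` only approximates the filtering locally with `fnmatch`.

```python
from fnmatch import fnmatch

REPO_ID = "Tonic/petite-elle-L-aime-3-sft"

# Assumed patterns for files that plain inference does not need
IGNORE_PATTERNS = ["*checkpoint-*", "training_results/*", "int4/*"]


def is_skipped(path):
    """Approximate locally which repository paths the patterns would exclude."""
    return any(fnmatch(path, pattern) for pattern in IGNORE_PATTERNS)


def download_for_inference():
    """Download the repository minus the ignored files; needs network access."""
    # Imported lazily so the pattern check above works without the package installed
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, ignore_patterns=IGNORE_PATTERNS)


for path in ["config.json", "checkpoint-8000/optimizer.pt", "int4/pytorch_model.bin"]:
    print(path, "-> skipped" if is_skipped(path) else "-> downloaded")
```

To fetch only the quantized variant instead, the same call accepts `allow_patterns=["int4/*"]`.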