reaperdoesntknow/DeepReasoning_1R
Tags: Text Generation, Transformers, TensorBoard, Safetensors, HumanLLMs/Human-Like-DPO-Dataset, qwen2, conversational, text-generation-inference
README.md exists but content is empty.
Downloads last month: 6
Safetensors
Model size: 494M params
Tensor type: FP16
Model tree for reaperdoesntknow/DeepReasoning_1R
Base model: Qwen/Qwen2.5-0.5B
Finetuned: Qwen/Qwen2.5-0.5B-Instruct
Finetuned (300): this model
Dataset used to train reaperdoesntknow/DeepReasoning_1R
HumanLLMs/Human-Like-DPO-Dataset (Updated Jan 12 • 10.9k • 6.15k • 214)