metadata
base_model: Solshine/reflection-llama-3.1-8B-Solshine-trainround1-16bit
language:
- en
license: llama3.1
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- sft
- reflection
datasets:
- Harshkmr/orca-math-word-reflection
Uploaded model
- Developed by: Solshine
- License: Llama 3.1 License
- Finetuned from model: Solshine/reflection-llama-3.1-8B-Solshine-trainround1-16bit
Inspired by and featuring the Reflection Tuning technique pioneered by Matt Shumer (and possibly developed earlier by the team at Anthropic).
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.