# Llama-3.2-3B-countdown-RFT

metadata

library_name: transformers
pipeline_tag: text-generation
base_model:
  - meta-llama/Llama-3.2-3B

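This model is part of UFT (Unifying Supervised and Reinforcement Fine-Tuning); see the reference below.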
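The metadata names `transformers` as the library and `text-generation` as the pipeline tag, so loading the model can be sketched roughly as below. This is a minimal sketch under assumptions: the repository owner is not given in this card, so `MODEL_ID` uses a placeholder namespace that must be replaced, and the helper name `generate`, the `max_new_tokens` default, and the example prompt are illustrative choices rather than documented settings.

```python
# Hypothetical repo id: the owner namespace is not stated in this card,
# so replace "<owner>" with the actual account before running.
MODEL_ID = "<owner>/Llama-3.2-3B-countdown-RFT"


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Return the generated text for `prompt` (the prompt itself is included)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    # "text-generation" matches the pipeline_tag in the card metadata.
    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    # The pipeline returns a list with one dict per generated sequence.
    return outputs[0]["generated_text"]


# Example call (downloads the weights; the prompt format is an assumption):
# print(generate("Using the numbers 3, 5 and 7, reach 26."))
```

The prompt format used during training is not documented here, so outputs from a free-form prompt like the one in the comment may differ from the results reported in the paper.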

## References

* [UFT: Unifying Supervised and Reinforcement Fine-Tuning](https://arxiv.org/abs/2505.16984)