Update README.md

Clean up model card, tags, and license

Files changed (1)

README.md

	@@ -1 +1,14 @@
- This is a model released from the preprint: [SimPO: Simple Preference Optimization with a Reference-Free Reward](https://arxiv.org/abs/2405.14734) Please refer to our [repository](https://github.com/princeton-nlp/SimPO) for more details.

+---
+license: llama3
+library_name: transformers
+pipeline_tag: text-generation
+tags:
+- SimPO
+language:
+- en
+base_model:
+- meta-llama/Meta-Llama-3-8B-Instruct
+---
+This model was released with the preprint *[SimPO: Simple Preference Optimization with a Reference-Free Reward](https://arxiv.org/abs/2405.14734)*. SimPO is an offline preference optimization algorithm designed to improve the training of large language models (LLMs) on preference datasets.
+SimPO aligns the reward function with the generation likelihood, eliminating the need for a reference model and incorporating a target reward margin to boost performance.
+Please refer to our [GitHub repo](https://github.com/princeton-nlp/SimPO) for more details.
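
For readers of the new model card text, the reference-free reward and target margin it mentions can be sketched for a single preference pair as follows. This is a minimal illustration based on the objective described in the SimPO preprint, not the training code from the repository. The function name `simpo_loss`, its argument names, and the default values of `beta` and `gamma` are assumptions chosen for the example and are not the settings used to train this model.

```python
import math


def simpo_loss(logp_chosen, logp_rejected, len_chosen, len_rejected,
               beta=2.0, gamma=1.0):
    """Sketch of the SimPO loss for one (chosen, rejected) response pair.

    logp_* are summed token log-probabilities of each response under the
    policy being trained; len_* are the response lengths in tokens.
    beta and gamma defaults are illustrative placeholders only.
    """
    # Implicit reward: length-normalized log-likelihood scaled by beta.
    # No reference model appears anywhere in this computation.
    reward_chosen = beta * logp_chosen / len_chosen
    reward_rejected = beta * logp_rejected / len_rejected

    # The chosen reward should exceed the rejected one by at least gamma.
    margin = reward_chosen - reward_rejected - gamma

    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

Under these assumptions the loss shrinks as the policy assigns a higher average per-token log-probability to the chosen response than to the rejected one, and it stays positive until that gap comfortably exceeds `gamma / beta`.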