lqtrung1998 committed on
Commit 414d3a8 (1 parent: 7ee426b)

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -10,10 +10,10 @@ Repo: https://github.com/lqtrung1998/mwp_ReFT (under [Apache2.0 License](https:/
  We introduce REinforced Fine-tuning (ReFT), a method that enhances the generalizability of learning LLMs for reasoning.
 
  This repository contains:
- - A Supervised Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-hf-SFT-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-hf-SFT-GSM8k)
- - A Warmup Supervised Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-hf-SFT-warmup-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-hf-SFT-warmup-GSM8k)
- - A REinforced Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-hf-ReFT-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-hf-ReFT-GSM8k)
- - A Rerank model that can score the fine-tuned model output: [lqtrung1998/galactica-6.7b-hf-ReFT-Rerank-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-hf-ReFT-Rerank-GSM8k)
+ - A Supervised Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-SFT-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-SFT-GSM8k)
+ - A Warmup Supervised Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-SFT-warmup-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-SFT-warmup-GSM8k)
+ - A REinforced Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-ReFT-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-ReFT-GSM8k)
+ - A Rerank model that can score the fine-tuned model output: [lqtrung1998/galactica-6.7b-ReFT-Rerank-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-ReFT-Rerank-GSM8k)
 
  Note: Our models are tuned based on Galactica, thus, licenses applicable to Galactica, such as non-commercial CC BY-NC 4.0 license also hold on these models.
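
The four changed lines differ from the originals only in the repo IDs, which lose their `-hf` segment. A minimal sketch of that rename, assuming this is the only transformation the commit applies (the helper names `drop_hf_infix` and `model_url` are illustrative and do not come from the repository):

```python
# Old repo IDs, copied from the removed lines of the diff.
OLD_REPO_IDS = [
    "lqtrung1998/galactica-6.7b-hf-SFT-GSM8k",
    "lqtrung1998/galactica-6.7b-hf-SFT-warmup-GSM8k",
    "lqtrung1998/galactica-6.7b-hf-ReFT-GSM8k",
    "lqtrung1998/galactica-6.7b-hf-ReFT-Rerank-GSM8k",
]


def drop_hf_infix(repo_id: str) -> str:
    """Remove the first '-hf-' segment from a repo ID (hypothetical helper)."""
    return repo_id.replace("-hf-", "-", 1)


def model_url(repo_id: str) -> str:
    """Build the Hugging Face model page URL in the form used by the README links."""
    return f"https://huggingface.co/{repo_id}"


# Map each old ID to the ID that appears on the added lines.
RENAMED = {old: drop_hf_infix(old) for old in OLD_REPO_IDS}

for old, new in RENAMED.items():
    print(f"{old} -> {new} ({model_url(new)})")
```

Whether the old `-hf` repo IDs still resolve is not stated in the diff; the sketch only mirrors the textual change to the README.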