# Rationalyst

This model is a fine-tuned version of [LLaMa-3-Instruct-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct). It was introduced in [RATIONALYST: Pre-training Process-Supervision for Improving Reasoning](https://arxiv.org/pdf/2410.01044). The code for rationale extraction, model training, and inference can be found [here](https://github.com/JHU-CLSP/reasoning_world_model).
## Model description

To use it, simply input a question and a partial reasoning trajectory, and the model generates the implicit rationale that guides the next reasoning step.
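As a rough sketch of that interface, the model can be queried with the `transformers` library. The placeholder checkpoint id, prompt layout, and generation settings below are assumptions for illustration, not a documented API:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Dongwei/Rationalyst"  # placeholder: substitute this repo's actual checkpoint id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

question = "Natalia sold clips to 48 friends in April, and half as many in May. How many clips did she sell altogether?"
trajectory = "In April she sold 48 clips. In May she sold 48 / 2 = 24 clips."

# Question plus partial reasoning trajectory in; the model should return the
# implicit rationale guiding the next reasoning step.
prompt = f"Question: {question}\nReasoning so far: {trajectory}\nRationale:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```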
## Training data

Rationalyst is trained on 65k implicit rationales extracted from The Pile and 14k implicit rationales extracted from GSM8K and ECQA. The data can be found [here](https://huggingface.co/datasets/Dongwei/reasoning_world_model).
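The released rationales can be inspected with the `datasets` library. A minimal sketch, assuming a default `train` split; check the dataset viewer for the actual splits and column names:

```python
from datasets import load_dataset

# Load the released implicit-rationale data; the "train" split is an assumption.
ds = load_dataset("Dongwei/reasoning_world_model", split="train")

print(ds)     # prints the actual column names and row count
print(ds[0])  # one extracted rationale example
```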
## Evaluation results

When used to evaluate on downstream tasks, this model achieves the following results:
| Task | GSM8K | MATH | ECQA | HellaSwag | ProofWriter | ARC | MMLU-Pro |
|:--------:|:-----:|:----:|:----:|:---------:|:-----------:|:----:|:--------:|
| Accuracy | 81.6 | 32.5 | 75.2 | 60.3 | 90.7 | 80.7 | 45.3 |