
Rationalyst (with rationales extracted from reasoning datasets)

This model is a fine-tuned version of LLaMa-3-Instruct-8B. It was introduced in RATIONALYST: Supervising Reasoning via Self-Supervised Rationale Extraction. The code for rationale extraction, model training, and inference can be found here.

Model description

Implicit rationales are often embedded in unlabelled text, reflecting the natural thought processes behind speech and writing. RATIONALYST is a self-supervised approach that extracts and filters these implicit rationales from unlabelled text and uses them to supervise reasoning.
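
As a rough illustration of how a generated rationale can supervise reasoning, one simple heuristic (an assumption for illustration, not necessarily the exact procedure from the paper) is to prefer the candidate next step that is most probable once the rationale is appended to the context:

```python
# Illustrative heuristic only: score candidate next steps by how likely they are
# once the generated rationale is added to the context. `generate_rationale` and
# `step_log_prob` are hypothetical callables (e.g., wrappers around this model
# and an agent model); they are not part of the released code.
def pick_next_step(question, trajectory, candidates, generate_rationale, step_log_prob):
    rationale = generate_rationale(question, trajectory)
    context = f"{question}\n{trajectory}\n{rationale}\n"
    # Choose the candidate step best supported by the rationale-augmented context.
    return max(candidates, key=lambda step: step_log_prob(context, step))
```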

How to use

To use it, input a question together with a partial reasoning trajectory; the model will output a rationale to supervise the next reasoning step, as in the sketch below.
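
The following is a minimal usage sketch with Hugging Face Transformers. The repository id and the prompt format are assumptions for illustration; check the linked code repository for the exact prompt template used during fine-tuning.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "JHU-CLSP/Rationalyst"  # hypothetical id; replace with this model's actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Input: a question plus the partial reasoning trajectory produced so far.
# The prompt layout below is an assumed format, not the confirmed training template.
prompt = (
    "Question: Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?\n"
    "Reasoning so far: In May, Natalia sold 48 / 2 = 24 clips.\n"
    "Rationale for the next step:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
# Output: the rationale that supervises the next reasoning step.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```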

Training data

This Rationalyst model was trained on 17,566 rationales extracted from GSM8K and 19,669 rationales extracted from ECQA. The data used can be found here.

Evaluation results

When used to supervise reasoning on downstream tasks, this model achieves the following results:

| Task | GSM8K | MATH | ECQA | HellaSwag | ProofWriter | ARC | MMLU-Pro |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Accuracy (%) | 76.2 | 32.5 | 76.2 | 59.4 | 90.1 | 79.3 | 32.1 |