base_model: tiiuae/Falcon3-10B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: RA_Reasoner
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: HuggingFaceH4/ifeval
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 55.92
name: strict accuracy
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: BBH
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 43.07
name: normalized accuracy
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: hendrycks/competition_math
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 20.09
name: exact match
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 10.85
name: acc_norm
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 7.51
name: acc_norm
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 36.67
name: accuracy
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
name: Open LLM Leaderboard
Uploaded Model
Developed by: Daemontatox
License: Apache 2.0
Finetuned from model: tiiuae/Falcon3-10B-Instruct
This model was fine-tuned from the Falcon-10B-Instruct model. It was trained 2x faster with Unsloth and Hugging Face's TRL library.
This model is intended for text generation tasks, with a focus on reasoning capabilities and instruction following, similar to capabilities demonstrated by the ChatGPT-O1-Mini model.
Training Details
This model was fine-tuned with Unsloth and TRL, resulting in significant speed improvements during the training process. Details on specific fine-tuning data, parameters and methods will be added soon. The fine-tuning process has prioritized improving the model's reasoning abilities on various benchmarks.
Intended Use
This model is intended for research and development purposes related to text generation, instruction following, and complex reasoning tasks. It is suitable for applications that require a model capable of handling multi-step logical problems and understanding nuanced instructions.
Focus on Reasoning: The fine-tuning has been geared towards enhancing the model's ability to tackle reasoning challenges and logic-based tasks.
Open LLM Leaderboard Evaluation Results
Detailed results can be found here! Summarized results can be found here!
Metric | % Value |
---|---|
Avg. | 29.02 |
IFEval (0-Shot) | 55.92 |
BBH (3-Shot) | 43.07 |
MATH Lvl 5 (4-Shot) | 20.09 |
GPQA (0-shot) | 10.85 |
MuSR (0-shot) | 7.51 |
MMLU-PRO (5-shot) | 36.67 |