---
language:
- en
- fr
- es
- pt
tags:
- falcon3
license: other
license_name: falcon-llm-license
license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html
library_name: transformers
---
# Falcon3-7B-Base
The **Falcon3** family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters.
This repository contains **Falcon3-7B-Base**. It achieves state-of-the-art results (at the time of release) on reasoning, language understanding, instruction following, code, and mathematics tasks.
Falcon3-7B-Base supports 4 languages (English, French, Spanish, Portuguese) and a context length of up to 32K.
⚠️ **This is a raw, pretrained model, which should be further fine-tuned for most use cases.**
## Model Details
- Architecture
  - Transformer-based causal decoder-only architecture
  - 28 decoder blocks
  - Grouped query attention (GQA) for faster inference: 12 query heads and 4 KV heads
  - Wider head dimension: 256
  - High RoPE base value to support long-context understanding: 1000042
  - 32K context length
  - 131K vocabulary size
- Pretrained on 14 teratokens of web, code, STEM, high-quality, and multilingual data using 1024 H100 GPUs
- Supports EN, FR, ES, PT
- Developed by [Technology Innovation Institute](https://www.tii.ae)
- License: TII Falcon-LLM License 2.0
- Model Release Date: December 2024
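
As a rough illustration of why the grouped query attention layout above helps at long context, the sketch below estimates the key/value cache size for one sequence using the listed figures. It assumes that "32K" means 32 × 1024 tokens and that the cache is held in bfloat16 (2 bytes per value); neither is stated in this card. The `kv_cache_bytes` helper is hypothetical, and the 12-KV-head case is a made-up comparison, not a real model variant.

```python
# Figures taken from the architecture list above.
NUM_LAYERS = 28
NUM_QUERY_HEADS = 12
NUM_KV_HEADS = 4
HEAD_DIM = 256

# Assumptions (not stated in the card): "32K" is 32 * 1024 tokens,
# and cached values are stored in bfloat16 (2 bytes each).
CONTEXT_LENGTH = 32 * 1024
BYTES_PER_VALUE = 2


def kv_cache_bytes(num_kv_heads: int, seq_len: int) -> int:
    """Bytes needed to cache keys and values for a single sequence."""
    # The leading 2 accounts for storing both a key and a value per head.
    per_token = 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM * BYTES_PER_VALUE
    return per_token * seq_len


gqa_bytes = kv_cache_bytes(NUM_KV_HEADS, CONTEXT_LENGTH)
# Hypothetical comparison: one KV head per query head (plain multi-head attention).
mha_bytes = kv_cache_bytes(NUM_QUERY_HEADS, CONTEXT_LENGTH)

print(f"query heads per KV head: {NUM_QUERY_HEADS // NUM_KV_HEADS}")   # 3
print(f"GQA KV cache at full context: {gqa_bytes / 2**30:.1f} GiB")    # 3.5 GiB
print(f"MHA KV cache at full context: {mha_bytes / 2**30:.1f} GiB")    # 10.5 GiB
```

Under these assumptions, sharing each KV head across three query heads cuts the cache to a third of what the same model would need with one KV head per query head.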
## Getting started
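```python
import torch
from transformers import pipeline

# Load the model in bfloat16 and place it automatically on the available devices.
pipe = pipeline(
    "text-generation",
    model="tiiuae/Falcon3-7B-Base",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A base model continues the prompt; it is not tuned to follow instructions.
response = pipe("Question: How many hours in one day? Answer: ")
print(response[0]["generated_text"])
```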
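Because this is a base model, a prompt that shows a few solved examples often steers the completion better than a bare question. The sketch below builds such a prompt in the same `Question: ... Answer:` style as the snippet above. The `build_few_shot_prompt` helper and the example pairs are illustrative assumptions, not part of the Falcon3 release.

```python
from typing import Sequence, Tuple


def build_few_shot_prompt(examples: Sequence[Tuple[str, str]], question: str) -> str:
    """Format solved examples plus a new question in 'Question: ... Answer:' style."""
    blocks = [f"Question: {q} Answer: {a}" for q, a in examples]
    # End with an open "Answer:" so the model continues with the answer.
    blocks.append(f"Question: {question} Answer:")
    return "\n".join(blocks)


# Illustrative examples only; replace them with pairs from your own task.
examples = [
    ("How many days are in one week?", "7"),
    ("How many minutes are in one hour?", "60"),
]
prompt = build_few_shot_prompt(examples, "How many hours in one day?")
print(prompt)
```

The resulting string can be passed to the pipeline in place of the single question, for example `pipe(prompt, max_new_tokens=8)`, where the token limit is an arbitrary choice that keeps the model from continuing with further questions.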
# Benchmarks
We report in the following table our internal pipeline benchmarks.
- We use [lm-evaluation harness](https://github.com/EleutherAI/lm-evaluation-harness).
- We report **raw scores**.
- We use the same batch size across all models.

| Category | Benchmark | Llama3.1-8B | Qwen2-7B | Qwen2.5-7B | gemma-2-9b | Falcon3-7B-Base |
|---|---|---|---|---|---|---|
| General | MMLU (5-shot) | 65.2 | 70.4 | 74.2 | - | 67.5 |
| | MMLU-PRO (5-shot) | 32.7 | 42.1 | 43.5 | - | 39.2 |
| | IFEval | 12.0 | 30.6 | 33.9 | - | 34.3 |
| Math | GSM8K (5-shot) | 49.4 | 77.9 | 82.9 | - | 76.2 |
| | MATH (4-shot) | 4.1 | 17.5 | 15.5 | - | 18.0 |
| Reasoning | Arc Challenge (25-shot) | 53.4 | 57.4 | 59.0 | - | 59.6 |
| | GPQA (0-shot) | 31.0 | 31.9 | 33.0 | - | 35.5 |
| | MUSR (0-shot) | 38.0 | 44.1 | 44.2 | - | 47.3 |
| | BBH (3-shot) | 46.5 | 53.3 | 54.0 | - | 51.0 |
| CommonSense Understanding | PIQA (0-shot) | 80.3 | 79.8 | 78.7 | - | 77.7 |
| | SciQ (0-shot) | 96.3 | 95.9 | 96.6 | - | 95.3 |
| | Winogrande (0-shot) | 74.0 | 72.1 | 72.9 | - | 71.0 |
| | OpenbookQA (0-shot) | 33.4 | 35.2 | 33.6 | - | 31.4 |
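
For readers who want to run a comparable evaluation, the sketch below shows one way to call the lm-evaluation harness from Python. The exact settings of the internal pipeline are not given in this card, so the harness task name (`gsm8k`), the batch size of 8, and the `build_eval_kwargs` and `run_benchmark` helpers are all assumptions; scores obtained this way may differ from those reported above.

```python
from typing import Any, Dict


def build_eval_kwargs(task: str, num_fewshot: int, batch_size: int = 8) -> Dict[str, Any]:
    """Keyword arguments for lm_eval.simple_evaluate for a single benchmark."""
    return {
        "model": "hf",  # Hugging Face transformers backend
        "model_args": "pretrained=tiiuae/Falcon3-7B-Base,dtype=bfloat16",
        "tasks": [task],
        "num_fewshot": num_fewshot,
        "batch_size": batch_size,  # assumed value; keep it identical across models
    }


def run_benchmark(task: str, num_fewshot: int) -> Dict[str, Any]:
    """Download the model and evaluate it (needs a GPU and `pip install lm-eval`)."""
    import lm_eval  # imported lazily so build_eval_kwargs works without the package

    results = lm_eval.simple_evaluate(**build_eval_kwargs(task, num_fewshot))
    return results["results"][task]


# The shot count follows the table; the task name is assumed to match the harness.
kwargs = build_eval_kwargs("gsm8k", num_fewshot=5)
print(kwargs)
```

Calling `run_benchmark("gsm8k", num_fewshot=5)` would then run the evaluation itself; it is left out of the snippet because it downloads the full model.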
# Citation
If the Falcon3 family of models was helpful to your work, feel free to cite us.
```
@misc{Falcon3,
    title = {Falcon 3 family of Open Foundation Models},
    author = {TII Team},
    month = {December},
    year = {2024}
}
```