🐐 FinGEITje 7B

---
license: cc-by-nc-4.0
library_name: peft
tags:
- alignment-handbook
- generated_from_trainer
- trl
- sft
- geitje
- fingeitje
- dutch
- nl
- finance
base_model: BramVanroy/GEITje-7B-ultra
datasets:
- snoels/FinGEITje-sft
model-index:
- name: snoels/FinGEITje-7B-sft
  results: []
language:
- nl
pipeline_tag: text-generation
inference: false
---

<p align="center" style="margin:0;padding:0">
<img src="https://huggingface.co/snoels/FinGEITje-7B-sft/resolve/main/fingeitje-banner.png" alt="FinGEITje Banner" width="1000"/>
</p>

<div style="margin:auto; text-align:center">
  <h1 style="margin-bottom: 0; font-size: 2em;">🐐 FinGEITje 7B</h1>
  <em style="font-size: 1em;">A large open Dutch Financial language model.</em>
</div>

This model is a fine-tuned version of [BramVanroy/GEITje-7B-ultra](https://huggingface.co/BramVanroy/GEITje-7B-ultra) on the [snoels/FinGEITje-sft](https://huggingface.co/datasets/snoels/FinGEITje-sft) dataset.

## 📖 Model Description

FinGEITje 7B is a large open Dutch financial language model with 7 billion parameters, based on Mistral 7B. It has been further trained on Dutch financial texts, enhancing its proficiency in the Dutch language and its knowledge of financial topics. As a result, FinGEITje provides more accurate and relevant responses in the domain of finance.

## 📊 Training and Evaluation Data

### Training Data

FinGEITje 7B was fine-tuned on the [snoels/FinGEITje-sft](https://huggingface.co/datasets/snoels/FinGEITje-sft) dataset, which consists of translated and processed Dutch financial texts. This dataset includes a wide range of financial topics and instruction tuning data.

#### Data Processing Steps

1. **Translation**: Original instruction tuning datasets were translated into Dutch using a specialized translation service to maintain the integrity of financial terminology.
2. **Post-processing**: The translated data underwent post-processing to correct any translation inconsistencies and to format it according to the original dataset structure.
3. **Formatting**: The data was formatted to match the style and requirements of instruction tuning datasets, ensuring compatibility with the fine-tuning process.
4. **Filtering**: A Dutch language check and predefined validation checks were applied to filter out any low-quality or irrelevant data.

### Evaluation Data

The model was evaluated using:

- **[snoels/FinDutchBench](https://huggingface.co/datasets/snoels/FinDutchBench)**: A Dutch financial benchmark dataset designed to assess the model's performance on various financial tasks. 

## ⚙️ Training Procedure

FinGEITje was trained following the methodology described in the [Alignment Handbook](https://github.com/huggingface/alignment-handbook).

### Training Configuration

- The training configuration is based on the recipe outlined in the alignment handbook and can be found in the [config_qlora.yaml](https://github.com/snoels/fingeit/blob/master/src/training/sft/config_qlora.yaml) file.
- The model was further trained using **QLoRA** (Quantized LoRA) for efficient fine-tuning with reduced computational resources.

### Training Hyperparameters

The following hyperparameters were used during training:

- **Learning Rate**: 0.0002
- **Train Batch Size**: 4
- **Evaluation Batch Size**: 8
- **Seed**: 42
- **Distributed Type**: Multi-GPU
- **Gradient Accumulation Steps**: 2
- **Total Train Batch Size**: 8
- **Optimizer**: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- **LR Scheduler Type**: Cosine
- **Warmup Ratio**: 0.1
- **Number of Epochs**: 1

### Training Results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
|     0.406     |  1.0  | 3922 |      0.3928     |

### Evaluation Package

The evaluation package includes a set of metrics defined per task, grouped per dataset to evaluate the model's performance across different financial domains. The evaluation notebooks are available:

- **[Evaluation in Dutch](https://github.com/snoels/fingeit/blob/master/notebooks/evaluation_nl.ipynb)**: Assesses the model's performance on the Dutch financial benchmark dataset.
- **[Evaluation in English](https://github.com/snoels/fingeit/blob/master/notebooks/evaluation_en.ipynb)**: Evaluates the model's performance on English financial benchmarks for comparison purposes.

### Framework Versions

- **PEFT**: 0.7.1
- **Transformers**: 4.39.0.dev0
- **PyTorch**: 2.1.2
- **Datasets**: 2.14.6
- **Tokenizers**: 0.15.2

## 🛠️ How to Use

FinGEITje 7B can be utilized using the Hugging Face Transformers library along with PEFT to load the LoRA adapters efficiently.

### Installation

Ensure you have the necessary libraries installed:

```bash
pip install torch transformers peft accelerate
```

### Loading the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("BramVanroy/GEITje-7B-ultra", use_fast=False)

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained("BramVanroy/GEITje-7B-ultra", device_map='auto')

# Load the FinGEITje model with PEFT adapters
model = PeftModel.from_pretrained(base_model, "snoels/FinGEITje-7B-sft", device_map='auto')
```

### Generating Text

```python
# Prepare the input
input_text = "Wat zijn de laatste trends in de Nederlandse banksector?"
input_ids = tokenizer.encode(input_text, return_tensors='pt').to(model.device)

# Generate a response
outputs = model.generate(input_ids, max_length=200, num_return_sequences=1)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)
```

## 🚧 Limitations and Future Work

While FinGEITje 7B demonstrates significant improvements in understanding and generating Dutch financial content, certain limitations exist:

- **Data Cutoff**: The model's knowledge is limited to the data it was trained on and may not include the most recent developments in the financial sector.
- **Accuracy Concerns**: The model may generate incorrect or outdated information. Users should verify critical information with reliable sources.
- **Biases**: Potential biases in the training data may affect the neutrality and fairness of the model's responses.
- **Language Scope**: Primarily designed for Dutch; performance in other languages is not optimized.
- **Ethical Use**: Users should ensure that the model's outputs comply with ethical standards and do not promote misinformation or harmful content.

### Future Work

- **Data Updates**: Incorporate more recent and diverse financial datasets to keep the model up-to-date.
- **Bias Mitigation**: Implement techniques to identify and reduce biases in the model's outputs.
- **Performance Enhancement**: Fine-tune on more specialized financial topics and complex financial tasks.
- **Multilingual Expansion**: Extend support to other languages relevant to the financial sector in the Netherlands and Europe.

## 🙏 Acknowledgements

We would like to thank:

- **Rijgersberg** ([GitHub](https://github.com/Rijgersberg)) for creating [GEITje](https://github.com/Rijgersberg/GEITje), one of the first Dutch foundation models, and for contributing significantly to the development of Dutch language models.
- **Bram Vanroy** ([GitHub](https://github.com/BramVanroy)) for creating [GEITje-7B-ultra](https://huggingface.co/BramVanroy/GEITje-7B-ultra), an open-source Dutch chat model, and for sharing training, translation, and evaluation resources.
- **Contributors of the [Alignment Handbook](https://github.com/huggingface/alignment-handbook)** for providing valuable resources that guided the development and training process of FinGEITje.
- **Silverfin** for their collaboration in this research. Silverfin, a Belgian scale-up focused on building an accountancy cloud service, provided valuable insights and resources that were instrumental in the development of FinGEITje. More about their work can be found at [Silverfin](https://silverfin.com/).
  
## 📝 Citation
[Link to the paper](https://arxiv.org/abs/2410.12835) 

If you use FinGEITje in your work, please cite:

```bibtex
@article{FinGEITje2024,
  title={A Dutch Financial Large Language Model},
  author={Noels, Sander and De Blaere, Jorne and De Bie, Tijl},
  journal={arXiv preprint arXiv:2410.12835},
  year={2024},
  url={https://arxiv.org/abs/2410.12835}
}
```

## 📜 License

This model is licensed under the [Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/) license.

## 📧 Contact

For any inquiries or questions, please contact [Sander Noels](mailto:sander.noels@ugent.be).