---
license: mit
language: de
---

# german-financial-statements-bert

This model is a fine-tuned version of [bert-base-german-cased](https://huggingface.co/bert-base-german-cased) trained on German financial statements. It achieves the following results on the evaluation set:

- Loss: 1.2025
- Accuracy: 0.7376
- Perplexity: 3.3285

## Model description

Annual financial statements in Germany are published in the Federal Gazette and are freely accessible. These documents describe the entrepreneurial and, in particular, the financial situation of a company with respect to a reporting period. The german-financial-statements-bert model aims to provide a BERT model specifically for this domain.

## Training and evaluation data

The model was trained on 100,000 natural-language sentences from annual financial statements. 50,000 of these sentences were taken unfiltered and at random from 5,500 different financial statement documents; the other 50,000 were also drawn at random from 5,500 different financial statement documents, but filtered so that only sentences referring to a financial entity were selected. Specifically, each sentence in this second half contains an indicator of a reference to a financial entity (EUR, Euro, TEUR, €, T€). The evaluation was carried out on 20,000 sentences of the same origin and distribution.

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0

### Framework versions

- Transformers 4.17.0
- Pytorch 1.10.0+cu111
- Datasets 1.18.3
- Tokenizers 0.11.6
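For masked-language-model evaluation, perplexity is simply the exponential of the cross-entropy loss; the reported perplexity above matches exp(loss) up to rounding. A minimal sketch of that relation, using the values from this card:

```python
import math

# Masked-LM evaluation reports a cross-entropy loss; perplexity = exp(loss).
# The value below is the evaluation loss reported on this model card.
eval_loss = 1.2025

perplexity = math.exp(eval_loss)
# Agrees with the reported 3.3285 up to rounding of the logged loss.
print(f"{perplexity:.4f}")
```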
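The filtered half of the training data keeps only sentences containing one of the financial-entity indicators listed above (EUR, Euro, TEUR, €, T€). The exact matching logic used to build the dataset is not published; the word-boundary regex below is an illustrative assumption of how such a filter could look:

```python
import re

# Indicators listed on this card for a reference to a financial entity.
# Word boundaries (\b) guard the alphabetic tokens so e.g. "Europa" does
# not match "Euro"; the matching details are an assumption, not the
# dataset's published pipeline.
FINANCIAL_ENTITY = re.compile(r"\bEUR\b|\bEuro\b|\bTEUR\b|€|T€")

def mentions_financial_entity(sentence: str) -> bool:
    """Return True if the sentence references a financial entity."""
    return FINANCIAL_ENTITY.search(sentence) is not None

# Example: keep only sentences with a financial-entity indicator.
sentences = [
    "Der Umsatz stieg auf 1.200 TEUR.",
    "Die Gesellschaft hat ihren Sitz in Berlin.",
]
filtered = [s for s in sentences if mentions_financial_entity(s)]
```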