---
language: de
license: cc-by-4.0
tags:
- named-entity-recognition, legal, ner
datasets:
- german-ler
metrics:
- precision
- recall
- f1
model-index:
- name: elenanereiss/bert-german-ler
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: german-ler
      type: german-ler
      args: german-ler
    metrics:
    - name: F1
      type: f1
      value: 0.9546215361725869
    - name: Precision
      type: precision
      value: 0.9449558173784978
    - name: Recall
      type: recall
      value: 0.9644870349492672
pipeline_tag: token-classification
widget:
- text: "Herr W. verstieß gegen § 36 Abs. 7 IfSG."
---

# bert-german-ler

## Model description

This model is a fine-tuned version of [bert-base-german-cased](https://huggingface.co/bert-base-german-cased) on the [German LER Dataset](https://huggingface.co/datasets/elenanereiss/german-ler).

Fine-tuning was done with [T-NER](https://github.com/asahi417/tner)'s hyperparameter search (see the repository for details). Scores on the evaluation and test sets are listed under Results.

## Intended uses & limitations

to do

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 12
- eval_batch_size: 16
- max_seq_length: 512
- num_epochs: 3

## Results

```
eval_loss = 0.020239440724253654
eval_accuracy_score = 0.9953227664227791
eval_precision = 0.9212203128016991
eval_recall = 0.9458762886597938
eval_f1 = 0.9333855032769246
eval_runtime = 111.4147
eval_samples_per_second = 59.875
eval_steps_per_second = 3.743
epoch = 3.0
```

```
test_loss = 0.011871221475303173
test_accuracy_score = 0.9969460436964865
test_precision = 0.9449558173784978
test_recall = 0.9644870349492672
test_f1 = 0.9546215361725869
test_runtime = 111.5143
test_samples_per_second = 59.849
test_steps_per_second = 3.748
```

### Usage

to do

### Reference

```
@misc{https://doi.org/10.48550/arxiv.2003.13016,
  doi = {10.48550/ARXIV.2003.13016},
  url = {https://arxiv.org/abs/2003.13016},
  author = {Leitner, Elena and Rehm, Georg and Moreno-Schneider, Julián},
  keywords = {Computation and Language (cs.CL), Information Retrieval (cs.IR), FOS: Computer and information sciences},
  title = {A Dataset of German Legal Documents for Named Entity Recognition},
  publisher = {arXiv},
  year = {2020},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```
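
The Usage section of this card is still a placeholder, so the following is only a minimal sketch of how the model might be called for token classification, not the author's documented method. It assumes that `transformers` and a backend such as PyTorch are installed and that the weights can be downloaded from the Hub under the id `elenanereiss/bert-german-ler` (the name given in the card's metadata). The helper names `format_entities` and `tag_text` are hypothetical and exist only for this illustration; the example sentence is the one from the card's widget.

```python
from typing import Iterable, List


def format_entities(predictions: Iterable[dict]) -> List[str]:
    """Render aggregated predictions as 'LABEL: text (score)' strings.

    Expects dicts with the keys 'entity_group', 'word' and 'score', which is
    the shape the token-classification pipeline returns when an
    aggregation strategy is set.
    """
    return [
        f"{p['entity_group']}: {p['word']} ({float(p['score']):.2f})"
        for p in predictions
    ]


def tag_text(text: str) -> List[str]:
    """Run the model on one text and return formatted entity strings."""
    # Imported lazily so format_entities stays usable without transformers.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model="elenanereiss/bert-german-ler",  # assumed Hub id, from the card metadata
        aggregation_strategy="simple",  # merge word pieces into entity spans
    )
    return format_entities(ner(text))


if __name__ == "__main__":
    # Example sentence taken from the widget entry of this card.
    for line in tag_text("Herr W. verstieß gegen § 36 Abs. 7 IfSG."):
        print(line)
```

Setting `aggregation_strategy="simple"` groups sub-word tokens into whole entity spans, which is usually easier to read than per-token labels. Drop the argument to inspect the raw token-level predictions instead.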