niclasgriesshaber
commited on
Updated README.md
Browse files
README.md
CHANGED
@@ -1,3 +1,40 @@
|
|
1 |
-
---
|
2 |
-
|
3 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
base_model: unsloth/gemma-2-2b-it-bnb-4bit
|
3 |
+
language:
|
4 |
+
- en
|
5 |
+
- de
|
6 |
+
license: apache-2.0
|
7 |
+
tags:
|
8 |
+
- text-generation-inference
|
9 |
+
- transformers
|
10 |
+
- unsloth
|
11 |
+
- llama
|
12 |
+
- trl
|
13 |
+
- machine-translation
|
14 |
+
- historical-language
|
15 |
+
- early-modern-german
|
16 |
+
- legal-texts
|
17 |
+
- economic-history
|
18 |
+
- open-source
|
19 |
+
---
|
20 |
+
|
21 |
+
# English to Early Modern Bohemian German Translation Model
|
22 |
+
|
23 |
+
## Overview
|
24 |
+
|
25 |
+
This model translates from English to Early Modern Bohemian German (EMBG). It was fine-tuned using LoRA on a unique historical dataset of 3,873 paragraph-level translation pairs sourced from legal court records. The dataset was meticulously transcribed and translated by the Chichele Professor of Economic History, **Sheilagh Ogilvie**, from All Souls College, University of Oxford.
|
26 |
+
|
27 |
+
### Key Features
|
28 |
+
|
29 |
+
- **Base Model**: `unsloth/gemma-2-2b-it-bnb-4bit`
|
30 |
+
- **Fine-Tuning**: Performed using [LoRA](https://arxiv.org/abs/2106.09685) and [Unsloth](https://github.com/unslothai/unsloth), leveraging Hugging Face's [Transformers](https://github.com/huggingface/transformers) and [TRL](https://github.com/huggingface/trl) libraries.
|
31 |
+
- **Languages Supported**:
|
32 |
+
- Source: English
|
33 |
+
- Target: Early Modern Bohemian German (EMBG)
|
34 |
+
- **Dataset**: Legal court records, manually transcribed and translated over five years. The dataset will be published in an upcoming [ACL](https://acl2024.org) paper.
|
35 |
+
|
36 |
+
### Use Cases
|
37 |
+
|
38 |
+
- Research in economic history and legal studies.
|
39 |
+
- Exploration of historical dialects and their nuances.
|
40 |
+
- Applications in language revitalisation and historical text analysis.
|