---
language:
- de
license: mit
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- translation
- german
- dialect
- swabian
- qlora
- dpo
datasets:
- custom
base_model: meta-llama/Llama-3.1-8B
model-index:
- name: swabian-german-translator
  results:
  - task:
      type: translation
      name: German-Swabian Translation
    metrics:
    - type: loss
      value: 0.8
      name: Training Loss
metadata:
  author: [Your Name]
  framework: pytorch
  fine_tuning_type:
  - dpo
  - qlora
  training_data: Custom dataset based on the Schwäbisch-Schwätza dictionary
  training_processes:
  - sft
  - dpo
---

# Swabian-German Translation Model (DPO-Enhanced)

This model fine-tunes Llama 3.1 8B for bidirectional translation between Standard German and the Swabian dialect, enhanced through Direct Preference Optimization (DPO).

## Model Details

- Base Model: Llama 3.1 8B
- Training Method: Two-stage fine-tuning (SFT + DPO)
- Training Data: 12,000+ word-pair translations with contextual sentences
- Hardware Requirements: Compatible with single-GPU setups (thanks to QLoRA)

## Intended Use

- Translating between Standard German and the Swabian dialect
- Understanding and preserving regional linguistic variations
- Educational purposes for language learners

## Usage

### Basic Translation

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "your-username/swabian-translator-dpo"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # half precision keeps the 8B model within ~16 GB
    device_map="auto",
)

# Translate between Swabian and Standard German
def translate(text, direction="to_german"):
    if direction == "to_german":
        prompt = f"Übersetze ins Hochdeutsche: {text}"
    else:
        prompt = f"Übersetze ins Schwäbische: {text}"

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=100)
    # Decode only the newly generated tokens, not the echoed prompt
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example usage
swabian_text = "Du hosch ja a blaus Mol am Arm!"
german_translation = translate(swabian_text, "to_german")
print(german_translation)  # Expected: "Du hast ja einen Bluterguss am Arm!"
```

### Translation Examples

Swabian to German:

```
Input:  "I han koi Zeit"
Output: "Ich habe keine Zeit"

Input:  "Des goht et"
Output: "Das geht nicht"

Input:  "Wo bisch du her komma?"
Output: "Woher kommst du?"
```

German to Swabian:

```
Input:  "Ich verstehe das nicht"
Output: "I versteh des et"

Input:  "Das schmeckt sehr gut"
Output: "Des schmeckt arg guat"
```

## Model Architecture & Training

### Training Process

1. **Initial Dataset Preparation**
   - Base dataset: 12,000+ word pairs from the Schwäbisch-Schwätza dictionary
   - Context enhancement using LLM-generated sentences
   - Manual verification and cleanup
2. **SFT (Supervised Fine-Tuning)** (sketched below)
   - QLoRA implementation for efficient training
   - 2 epochs on the complete dataset
   - Loss convergence at ~0.8
3. **DPO (Direct Preference Optimization)** (sketched below)
   - 300 carefully curated preference pairs
   - 3 epochs of preference learning
   - Focus on natural and accurate translations

### Technical Implementation

- Quantized training using QLoRA
- 4-bit precision for efficient resource usage
- Training framework: UnslothAI
- Single-GPU training (~16 GB VRAM required)
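To make the SFT stage concrete, here is a minimal sketch of a QLoRA setup using the Hugging Face `peft`/`trl` stack. This is an illustration rather than the actual training code: training was done with UnslothAI, and the dataset file name, LoRA rank, and target modules below are assumptions (only the 4-bit quantization and the 2 epochs come from this card). Exact `trl` signatures also vary between versions.

```python
# Illustrative QLoRA SFT sketch; the real training used UnslothAI.
# Assumptions: dataset file name, LoRA rank/alpha/targets, output dir.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base = "meta-llama/Llama-3.1-8B"

# 4-bit NF4 quantization keeps the 8B base model within a single ~16 GB GPU
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA adapters on the attention projections (rank and targets are assumptions)
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Hypothetical JSONL file; each row has a "text" field with prompt + translation
data = load_dataset("json", data_files="swabian_sft.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=data,
    peft_config=lora,
    args=SFTConfig(output_dir="sft-out", num_train_epochs=2),  # 2 epochs per this card
)
trainer.train()
```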
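The DPO stage can be sketched the same way with `trl`'s `DPOTrainer`. Again this is hedged: the preference file name and `beta` value are assumptions, while the 300 preference pairs and 3 epochs come from this card. Because the model carries PEFT adapters, `trl` can recover the frozen reference policy by disabling the adapters, so no explicit `ref_model` copy is needed.

```python
# Illustrative DPO sketch on top of the SFT adapters (file name and beta
# are assumptions; the actual training used UnslothAI).
from datasets import load_dataset
from trl import DPOConfig, DPOTrainer

# Each row holds "prompt", "chosen" (preferred translation), and "rejected"
prefs = load_dataset("json", data_files="swabian_prefs.jsonl", split="train")

trainer = DPOTrainer(
    model=model,       # the SFT-adapted model from the sketch above
    ref_model=None,    # reference logits come from the adapter-disabled base
    args=DPOConfig(output_dir="dpo-out", num_train_epochs=3, beta=0.1),
    train_dataset=prefs,
    processing_class=tokenizer,
)
trainer.train()
```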
## Limitations and Considerations

1. **Dialect Variations**
   - Swabian varies significantly by region
   - The model focuses on common/standard Swabian expressions
   - May not capture all local variations
2. **Translation Quality**
   - Best performance on common phrases and expressions
   - May struggle with very colloquial or context-dependent translations
   - Not recommended for official or legal translations
3. **Technical Limitations**
   - Input length limited to 512 tokens
   - Generation speed is affected by quantization
   - Memory requirements: ~8 GB RAM minimum

## Community and Contributions

We welcome community contributions to improve the model:

- Additional training data
- Regional variant documentation
- Bug reports and fixes
- Performance improvements

Please submit issues or pull requests through the Hugging Face repository.

## Citation and Attribution

```bibtex
@misc{swabian-german-translator,
  author       = {[Your Name]},
  title        = {Swabian-German Translation Model},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {Hugging Face Model Hub}
}
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Acknowledgments

- Original dictionary data: [schwäbisch-schwätza.de](http://xn--schwbisch-schwtza-tqbk.de/)
- UnslothAI for the training framework
- Llama 3.1 8B base model