---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- chat
base_model: Qwen/Qwen2-7B
---

# Model Summary

Qwen2-7B-Instruct-Better-Translation is a fine-tuned language model based on Qwen2-7B-Instruct, specifically optimized for English-to-Chinese translation. The model was fine-tuned using Direct Preference Optimization (DPO) with a custom dataset that prioritizes fluent, idiomatic translations (chosen) over literal translations (rejected).

- Developer: sevenone
- License: Apache 2.0
- Base Model: Qwen2-7B-Instruct
- Model Size: 7B
- Context Length: 131,072 tokens (inherited from Qwen2-7B-Instruct)

For more details, please refer to our [GitHub](https://github.com/sevenyearsonelife/Better_translation).

# 1. Introduction

Qwen2-7B-Instruct-Better-Translation is designed to provide high-quality English-to-Chinese translations, with a particular focus on producing natural, idiomatic translations instead of literal, word-for-word ones. The fine-tuning process used a preference dataset in which the chosen translations were idiomatic and the rejected translations were more literal. This model is ideal for users who need accurate and fluent translations of complex or nuanced English text.

# 2. Training Details

The model was fine-tuned using Direct Preference Optimization (DPO), a method that trains the model to prefer certain outputs over others based on labeled preferences. The training dataset consisted of English source sentences, each paired with translations labeled as either "chosen" (idiomatic) or "rejected" (literal).

- Training Framework: Hugging Face Transformers
- Optimizer: AdamW
- Training Method: LoRA with Direct Preference Optimization
- Training Data: Custom preference dataset for English-to-Chinese translation
- Preference Type: Favoring idiomatic translations (chosen) over literal translations (rejected)
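To give intuition for the objective used above, the per-pair DPO loss can be sketched in a few lines of plain Python. This is an illustrative sketch, not the actual training code: the function name and the log-probability values are made up, and in practice the loss is computed in a framework such as TRL over whole batches of sequence log-probabilities.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair: -log(sigmoid(beta * margin)),
    where the margin compares the policy's log-ratio on the chosen vs.
    rejected translation against a frozen reference model."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

# Toy numbers: the policy has shifted probability mass toward the chosen
# (idiomatic) translation and away from the rejected (literal) one
# relative to the reference model, so the loss drops below log(2).
loss = dpo_loss(-10.0, -14.0, -12.0, -12.0)
```

When the policy and reference agree exactly, the margin is zero and the loss equals log 2; pushing the chosen translation up and the rejected one down drives the loss toward zero, which is what biases the model toward idiomatic output.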
# 3. Requirements

To use this model, please make sure you have `transformers>=4.37.0` installed to avoid compatibility issues.

# 4. Usage

You can load the model and translate English to Chinese as shown in the following code snippet:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sevenone/Qwen2-7B-Instruct-Better-Translation"
device = "cuda"  # the device to move the tokenized inputs onto

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

prompt = "Translate the following sentence to Chinese: 'Artificial intelligence is transforming industries worldwide.'"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]

# Apply the chat template expected by Qwen2-Instruct models
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

# Generate the translation, then strip the prompt tokens from the output
generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512,
)
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

# 5. Citation

If sevenone/Qwen2-7B-Instruct-Better-Translation is helpful in your work, please cite it as:

```
@misc{sevenone_2024,
  author    = {sevenone},
  title     = {Qwen2-7B-Instruct-Better-Translation},
  year      = {2024},
  url       = {https://huggingface.co/sevenone/Qwen2-7B-Instruct-Better-Translation},
  publisher = {Hugging Face}
}
```