---
license: gpl-3.0
datasets:
- wmt/wmt19
language:
- en
- zh
base_model: Mxode/NanoLM-365M-Base
pipeline_tag: translation
tags:
- text-generation-inference
---
# NanoTranslator-immersive_translate-365M

English | [简体中文](README_zh-CN.md)

## Introduction

NanoTranslator-immersive_translate-365M is a model designed for **Chinese-English bilingual** translation. It is based on [NanoLM-365M-Base](https://huggingface.co/Mxode/NanoLM-365M-Base) and was trained on 6M samples from the [WMT19](https://huggingface.co/datasets/wmt/wmt19) dataset.

The model was trained on the [Immersive Translate](https://immersivetranslate.com/) prompt format and can be served through an OpenAI-format API using tools such as vLLM and LMDeploy.

## How to use

The example below calls the model with transformers. The prompt follows the Immersive Translate format for best results.

```python
import torch
from typing import Literal
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = 'Mxode/NanoTranslator-immersive_translate-365M'

model = AutoModelForCausalLM.from_pretrained(model_path).to('cuda:0', torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_path)

def translate(
    text: str,
    to: Literal["chinese", "english"] = "chinese",
    **kwargs
):
    generation_args = dict(
        max_new_tokens = kwargs.pop("max_new_tokens", 512),
        do_sample = kwargs.pop("do_sample", True),
        temperature = kwargs.pop("temperature", 0.35),
        top_p = kwargs.pop("top_p", 0.8),
        top_k = kwargs.pop("top_k", 40),
        **kwargs
    )

    prompt = """Translate the following source text to {to}. Output translation directly without any additional text.
    Source Text: {text}

    Translated Text:"""

    messages = [
        {"role": "system", "content": "You are a professional, authentic machine translation engine."},
        {"role": "user", "content": prompt.format(to=to, text=text)}
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([inputs], return_tensors="pt").to(model.device)

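    # Pass the attention mask explicitly so generate() does not have to infer it.
    generated_ids = model.generate(
        model_inputs.input_ids,
        attention_mask=model_inputs.attention_mask,
        **generation_args
    )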
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response

text = "After a long day at work, I love to unwind by cooking a nice dinner and watching my favorite TV series. It really helps me relax and recharge for the next day."
response = translate(text=text, to='chinese')
print(f'Translation: {response}')

# Example output (sampling is enabled, so results may vary):
# Translation: 工作了一天,我喜欢吃一顿美味的晚餐,看我最喜欢的电视剧,这样做有助于我放松,补充能量。
```
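
The introduction mentions serving the model through an OpenAI-format interface. The following is a minimal sketch of what a client for such a server might look like, not a tested deployment recipe. It assumes a server (for example one started with `vllm serve Mxode/NanoTranslator-immersive_translate-365M`) is listening at `http://localhost:8000/v1`; that URL and the helper names `build_payload` and `translate_remote` are illustrative assumptions. The prompt and sampling values are copied from the transformers example above; `top_k` is left out because it is not part of the standard OpenAI request schema.

```python
import json
import urllib.request

# Prompt template copied from the transformers example above, whitespace included.
PROMPT = """Translate the following source text to {to}. Output translation directly without any additional text.
    Source Text: {text}

    Translated Text:"""

SYSTEM = "You are a professional, authentic machine translation engine."


def build_payload(text, to="chinese", model="Mxode/NanoTranslator-immersive_translate-365M"):
    """Build a chat-completions request body in the OpenAI format."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": PROMPT.format(to=to, text=text)},
        ],
        "temperature": 0.35,
        "top_p": 0.8,
        "max_tokens": 512,
    }


def translate_remote(text, to="chinese", base_url="http://localhost:8000/v1"):
    """POST the payload to an OpenAI-format server and return the reply text.

    base_url is an assumption; point it at wherever the server is running.
    """
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(text, to)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `translate_remote("Good morning.", to="chinese")` would return the translated string. Keeping the payload construction in its own function makes it easy to inspect the exact prompt being sent before any network call is made.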