☯️

Dao-9B

Intro

Dao-9B is the smaller, open-source version of the translation model powering the Omni Translator, a state-of-the-art literary translation tool. It is less capable than the full model because it is based on an older methodology and older data, but Dao-9B is still a powerful translation model that can be run locally and performs especially well on Chinese webnovels.

No comparison is available for this model, but you can find the comparison page for the full model (against several other translation tools) here.

Quick Start

To get started quickly, you can explore the starter Colab notebook here.

We do not provide an inference service for this model. You can instead try out the more powerful full model on the Omni Translator.

Usage

To get the best translation quality from this model, we recommend following the steps below:

  1. Preprocessing: Prepare the text to be translated. Make sure that \n\n is used to separate paragraphs. Normalize the text via Unicode normalization (NFKC) and remove any extra spaces.
  2. Chunking: Break the text into chunks of approximately 350 characters. This is to ensure that the model can handle the text efficiently.
  3. Term Extraction: Extract terms from the text to be translated. This is especially useful for translating novels, where the same terms are used repeatedly across the chapters.
  4. Translation: Translate the text using the model. Provide the terms extracted in step 3 and the previous chunk of text to the model to improve the translation quality.

We provide starter code that demonstrates most of the above steps in the starter Colab notebook.
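
As a rough illustration of steps 1 and 2, the sketch below normalizes the text with NFKC, collapses extra spaces, separates paragraphs with \n\n, and then greedily packs whole paragraphs into chunks of about 350 characters. The function names, the assumption that every non-empty line is one paragraph, and the greedy packing strategy are illustrative choices, not necessarily what the starter notebook does.

```python
import re
import unicodedata


def preprocess(text: str) -> str:
    """NFKC-normalize, remove extra spaces, and separate paragraphs with a blank line."""
    text = unicodedata.normalize("NFKC", text)
    paragraphs = []
    # Assumption: each non-empty line of the raw text is one paragraph.
    for line in text.splitlines():
        # Collapse runs of spaces/tabs and trim both ends of the line.
        line = re.sub(r"[ \t]+", " ", line).strip()
        if line:
            paragraphs.append(line)
    return "\n\n".join(paragraphs)


def chunk_paragraphs(text: str, target: int = 350) -> list[str]:
    """Greedily pack whole paragraphs into chunks of roughly `target` characters.

    A single paragraph longer than `target` is kept whole rather than split.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > target:
            # Adding this paragraph would overshoot: close the current chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Calling `chunk_paragraphs(preprocess(raw_text))` yields a list of chunks that can be fed to the term extraction and translation prompts one at a time. Packing whole paragraphs keeps sentence boundaries intact, at the cost of chunk sizes that only approximate the 350-character target.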

Extracting Terms

Use the following prompt template to extract terms from the text:

<context>
{context}
</context>

<passage>
{input}
</passage>

Given the above passage, please list out the terminologies and namings present that may be reused in the translations of future passages. Output the terminologies in the format of Raw, English. Use the CSV format.
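
A minimal sketch of filling in this template and parsing the result is shown below. The helper names are hypothetical, and the parser assumes the model returns one `Raw, English` pair per line, possibly preceded by a header row; the actual output may differ, so treat the parsing as a starting point.

```python
import csv
import io

# The term extraction template from this model card, as a Python format string.
TERM_PROMPT = (
    "<context>\n{context}\n</context>\n\n"
    "<passage>\n{input}\n</passage>\n\n"
    "Given the above passage, please list out the terminologies and namings "
    "present that may be reused in the translations of future passages. "
    "Output the terminologies in the format of Raw, English. Use the CSV format."
)


def build_term_prompt(passage: str, context: str = "") -> str:
    """Fill the term extraction template with a passage and optional context."""
    return TERM_PROMPT.format(context=context, input=passage)


def parse_terms(model_output: str) -> dict[str, str]:
    """Parse CSV-style model output into a {raw: english} glossary.

    Assumption: one "Raw, English" pair per line. Malformed rows and an
    optional header row are skipped; the first translation of a term wins.
    """
    terms: dict[str, str] = {}
    reader = csv.reader(io.StringIO(model_output.strip()), skipinitialspace=True)
    for row in reader:
        if len(row) != 2:
            continue
        raw, english = row[0].strip(), row[1].strip()
        if not raw or not english:
            continue
        if (raw.lower(), english.lower()) == ("raw", "english"):
            continue  # header row
        terms.setdefault(raw, english)
    return terms
```

Merging the glossary from each chunk into one running dictionary (for example with `glossary.update(parse_terms(output))`) is one way to keep names consistent across chapters.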

Performing Translation

Use the following prompt template to perform translation:

<context>
{context}
</context>

<passage>
{input}
</passage>

<terms>
{terms}
</terms>

You are a professional translator. Given the above passage, please translate the passage to English.
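
The sketch below shows one way to drive this template over a list of chunks. It makes several assumptions: `generate` is a hypothetical callable that wraps whatever inference backend you use and maps a prompt string to the model's reply, the context is the previous source chunk (empty for the first chunk), and terms are written one `Raw, English` pair per line to mirror the extraction format.

```python
from typing import Callable, Optional

# The translation template from this model card, as a Python format string.
TRANSLATE_PROMPT = (
    "<context>\n{context}\n</context>\n\n"
    "<passage>\n{input}\n</passage>\n\n"
    "<terms>\n{terms}\n</terms>\n\n"
    "You are a professional translator. Given the above passage, "
    "please translate the passage to English."
)


def format_terms(terms: dict[str, str]) -> str:
    """Render the glossary as one "Raw, English" pair per line (assumed format)."""
    return "\n".join(f"{raw}, {english}" for raw, english in terms.items())


def build_translation_prompt(
    passage: str, context: str = "", terms: Optional[dict[str, str]] = None
) -> str:
    """Fill the translation template with a passage, context, and glossary."""
    return TRANSLATE_PROMPT.format(
        context=context, input=passage, terms=format_terms(terms or {})
    )


def translate_chunks(
    chunks: list[str],
    terms: dict[str, str],
    generate: Callable[[str], str],
) -> list[str]:
    """Translate chunks in order, passing the previous chunk as context."""
    translations = []
    previous = ""  # the first chunk has no context
    for chunk in chunks:
        prompt = build_translation_prompt(chunk, context=previous, terms=terms)
        translations.append(generate(prompt).strip())
        previous = chunk  # assumption: context is the previous *source* chunk
    return translations
```

Keeping `generate` as a plain callable leaves the choice of runtime open, so the same loop works with a local model or any other backend that accepts a prompt and returns text.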

Limitations

  • This model is uncensored and may generate content that is inappropriate for some audiences. Please use with caution.
  • This model is trained mainly on Chinese -> English data, and may not perform well on other language pairs.

Model size: 8.83B params (Safetensors)
Tensor type: BF16