
Colab notebook, no environment needed

A simple GPU Colab notebook for testing; no local environment configuration or installation is required. In short, it works once you switch the runtime to GPU mode, though the readability when translating something like a whole novel is still not great (awkward).
https://colab.research.google.com/drive/19rQG4ryrue-0g8KH4ATT0_o2-8tHLcIT?usp=sharing

Release Notes

  • This model is fine-tuned from mt5-base; the training method and dataset follow those of larryvrh/mt5-translation-ja_zh, and the result is somewhat smaller than that original model.

  • Trained on a trimmed and fused CCMatrix-v1-Ja_Zh dataset at a 1e-4 learning rate for 1 epoch with no weight decay, arriving at a validation loss of about 1.5, which is pretty decent for this behemoth tokenizer (a hedged training sketch follows at the end of this list).

  • Training took about 26 hours on a modified 2080 Ti 22 GB card, but size-wise this model is safe to train on much smaller (and newer, faster) cards.

  • Reason for making this model
    larryvrh's original model is quite good, but it has some issues, including:

    • long-sentence repetition, and it does not recognize line breaks
    • a dirty mix-up of numbers and punctuation
    • it translates to or from English only "sometimes", and sometimes fails to translate at all
    • it is a bit too big for smaller cards
      These are generally all-parameter problems that a full-parameter fine-tune can only partially fix; stacking a LoRA on top left them just as crooked, and fine-tuning the entire original model is too delicate to control. So I prefer to build a base model that does not have these issues to begin with; hence this one.
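
Below is a minimal sketch of the training run described above, assuming the Hugging Face Seq2SeqTrainer API; the exact script is not published, so the dataset id, column names, batch size, and every setting not named in the notes are assumptions rather than the actual values.

# Minimal training sketch, NOT the exact script used.
# Assumptions: dataset id/split, column names ("ja"/"zh"), and batch size.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base")

# assumed dataset id and split; the notes only say "a trimmed and fused CCMatrix-v1-Ja_Zh"
raw = load_dataset("larryvrh/CCMatrix-v1-Ja_Zh", split="train")

def preprocess(examples):
    # prepend the language tag used at inference time; column names are guesses
    inputs = [f"<-ja2zh-> {t}" for t in examples["ja"]]
    batch = tokenizer(inputs, max_length=256, truncation=True)
    labels = tokenizer(text_target=examples["zh"], max_length=256, truncation=True)
    batch["labels"] = labels["input_ids"]
    return batch

train = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-translation-ja_zh",
    learning_rate=1e-4,             # as stated above
    num_train_epochs=1,             # one epoch
    weight_decay=0.0,               # no weight decay
    per_device_train_batch_size=8,  # assumption; adjust to available VRAM
    fp16=False,                     # mT5 is known to be unstable in half precision
    save_strategy="epoch",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

fp16 is left off because mT5 checkpoints tend to produce NaNs in half precision, which also matches the F32 tensors this repo ships.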

A simple backend application

Not yet stably debugged; use with caution.

Usage guide: a more precise example of using it

from transformers import pipeline

model_name = "iryneko571/mt5-base-translation-ja_zh"
# the tokenizer ships with the model, so passing tokenizer=model_name is optional:
# pipe = pipeline("translation",model=model_name,tokenizer=model_name,repetition_penalty=1.4,batch_size=1,max_length=256)
pipe = pipeline(
    "translation",
    model=model_name,
    repetition_penalty=1.4,  # discourages the long-sentence looping described above
    batch_size=1,
    max_length=256,
)

def translate_batch(batch, language='<-ja2zh->'):
    # batch is a list of strings; prepend the language tag the model expects
    prompts = [f'{language} {line}' for line in batch]
    translated = pipe(prompts)
    # each pipeline result is a dict with a 'translation_text' key
    return [item['translation_text'] for item in translated]

inputs = ['今日はいい天気ですね。']  # example input; any list of Japanese strings works

print(translate_batch(inputs))
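
For longer passages, a simple pattern is to split the text into lines and push them through translate_batch as one batch. The snippet below is an illustrative addition (the sample text is made up), not part of the original example.

# Translate a multi-line passage line by line.
text = '今日はいい天気ですね。\n散歩に行きましょう。'  # example passage
lines = [line for line in text.splitlines() if line.strip()]
print('\n'.join(translate_batch(lines)))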

Roadmap

  • want some LoRAs?
  • build the platform out better

How to find me

Discord Server:
https://discord.gg/JmjPmJjA
If you need any help, want a test server or to try the latest version, or just want to chat, come take a look at the channel. (Is posting this here even allowed?)
