---
language:
  - tr
  - en
tags:
  - mt5
  - t5
  - text-generation-inference
  - turkish
widget:
  - text: >-
      Bu hafta hasta olduğum için <extra_id_0> gittim. Midem ağrıyordu ondan
      dolayı şu an <extra_id_1>.
    example_title: Turkish Example 1
  - text: Bu gece kar yağacakmış. Yarın yollarda <extra_id_0> olabilir.
    example_title: Turkish Example 2
  - text: I bought two tickets for an NBA match. Do you like <extra_id_0>?
    example_title: English Example 1
---

## Model Card

This model is a pruned version of google/mt5-base that supports only Turkish and English; please see the google/mt5-base model card for details on the base model. For the pruning methodology, see the Russian counterpart of mT5-base, cointegrated/rut5-base.
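
The pruning script itself is not part of this card. The sketch below only illustrates the general idea, assuming the methodology described for cointegrated/rut5-base: keep the token ids actually needed for Turkish and English and shrink the shared embedding matrix and LM head to those rows. The `kept_ids` selection and the omitted tokenizer surgery are illustrative assumptions, not the exact procedure used for this checkpoint.

```python
import torch
from transformers import MT5ForConditionalGeneration, T5Tokenizer

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")
tokenizer = T5Tokenizer.from_pretrained("google/mt5-base")

# Hypothetical ids to keep: special tokens plus ids observed in a
# Turkish/English corpus (the frequency-counting step is omitted here).
kept_ids = sorted(set(tokenizer.all_special_ids) | set(range(1000, 2000)))

# Shrink the shared embedding matrix to the kept rows.
old_emb = model.get_input_embeddings().weight.data
new_emb = torch.nn.Embedding(len(kept_ids), old_emb.shape[1])
new_emb.weight.data = old_emb[kept_ids]
model.set_input_embeddings(new_emb)

# mT5 keeps a separate (untied) LM head, so shrink it the same way.
old_head = model.lm_head.weight.data
model.lm_head = torch.nn.Linear(old_head.shape[1], len(kept_ids), bias=False)
model.lm_head.weight.data = old_head[kept_ids]

model.config.vocab_size = len(kept_ids)
# The SentencePiece vocabulary must be reduced to the same ids as well,
# which requires editing the tokenizer's protobuf model (omitted here).
```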

## Usage

First, import the required libraries:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer
import torch
```

Then load the model and tokenizer:

```python
model = T5ForConditionalGeneration.from_pretrained('bonur/t5-base-tr')
tokenizer = T5Tokenizer.from_pretrained('bonur/t5-base-tr')
```

To run inference on a given text:

```python
inputs = tokenizer("Bu hafta hasta olduğum için <extra_id_0> gittim.", return_tensors='pt')
with torch.no_grad():
    hypotheses = model.generate(
        **inputs,
        do_sample=True, top_p=0.95,
        num_return_sequences=2,
        repetition_penalty=2.75,
        max_length=32,
    )
for h in hypotheses:
    print(tokenizer.decode(h))
```
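
The decoded hypotheses still contain the sentinel tokens. If you only want the text predicted for `<extra_id_0>`, a small helper like the following can extract it (this helper, including the name `span_for_sentinel`, is an illustrative addition rather than part of the original card):

```python
def span_for_sentinel(decoded, sentinel="<extra_id_0>", stop="<extra_id_1>"):
    # The text generated for a sentinel sits between that sentinel and the
    # next one (or the end of the sequence if no further sentinel appears).
    after = decoded.split(sentinel, 1)[-1]
    return after.split(stop, 1)[0].replace("</s>", "").strip()

for h in hypotheses:
    print(span_for_sentinel(tokenizer.decode(h)))
```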

You can tune the generation parameters for better results. The model is also ready to be fine-tuned on bilingual downstream tasks in Turkish and English, as sketched below.
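
A minimal fine-tuning sketch with a manual training loop follows; `train_pairs` and the translation-style task prefix are placeholder assumptions and should be replaced by your own dataset and task formulation:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained('bonur/t5-base-tr')
tokenizer = T5Tokenizer.from_pretrained('bonur/t5-base-tr')

# Illustrative placeholder data; replace with your own bilingual dataset.
train_pairs = [
    ("translate Turkish to English: Bu gece kar yağacakmış.",
     "It is going to snow tonight."),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for src, tgt in train_pairs:
    batch = tokenizer(src, return_tensors='pt')
    labels = tokenizer(tgt, return_tensors='pt').input_ids
    loss = model(**batch, labels=labels).loss  # T5 computes the LM loss from labels
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```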