Czech Poetry TinyLlama
TinyLlama fine-tuned on Czech poetry from the corpusCzechVerse GitHub project by the
Institute of Czech Literature, Czech Academy of Sciences.
https://github.com/versotym/corpusCzechVerse
Usage
Use it like any other causal language model:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
tokenizer = AutoTokenizer.from_pretrained("jinymusim/TinyLlama-Czech-Poet")
model = AutoModelForCausalLM.from_pretrained("jinymusim/TinyLlama-Czech-Poet")
# Prompt: start the poem with an author tag
poet_start = '<|AUTHOR|> Adámek, Bohumil'
poet_start = poet_start.strip()
tokenized_poet_start = tokenizer.encode(poet_start, return_tensors='pt')
# Generate a continuation of the prompt
out = model.generate(tokenized_poet_start,
                     max_length=256,
                     do_sample=True,
                     top_k=50,
                     early_stopping=True,
                     pad_token_id=tokenizer.pad_token_id,
                     eos_token_id=tokenizer.eos_token_id)
# Decode the generated poem
decoded_cont = tokenizer.decode(out[0], skip_special_tokens=True)
print(decoded_cont)
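The prompt above conditions generation on an author only. As a minimal sketch, assuming the model also accepts the title and year tags from its output layout as part of the prompt, and assuming the fields are separated by newlines (neither is confirmed by this card), a small helper can assemble such a prompt. `build_prompt` is a hypothetical name, not part of the model or of transformers.

```python
from typing import Optional


def build_prompt(author: str,
                 title: Optional[str] = None,
                 year: Optional[int] = None) -> str:
    """Assemble a conditioning prompt from header tags (hypothetical helper).

    Assumption: header fields are newline-separated and appear in the
    order AUTHOR, TITLE, YEAR.
    """
    lines = [f"<|AUTHOR|> {author.strip()}"]
    if title is not None:
        lines.append(f"<|TITLE|> {title.strip()}")
        # YEAR follows TITLE in the layout, so it is only added after a title.
        if year is not None:
            lines.append(f"<|YEAR|> {year}")
    return "\n".join(lines)
```

The returned string can be passed to `tokenizer.encode` in place of `poet_start`. Whether extra header fields improve the output depends on how the model was trained, so treat this as something to experiment with.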
Structure of outputs
Outputs are structured in the following way:
<|AUTHOR|> AUTHOR
<|TITLE|> TITLE
<|YEAR|> YEAR
<|STROPHE_START|>
<|METER|> METER
<|RHYME|> RHYME SCHEMA
STROPHE
<|STROPHE_END|>
<|STROPHE_START|>
<|METER|> METER
<|RHYME|> RHYME SCHEMA
STROPHE
<|STROPHE_END|>
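The tagged layout above can be turned into a structured record for downstream use. The following is a rough sketch rather than an official parser: `parse_poem` is a hypothetical helper, and it assumes each tag sits on its own line and that the tags survive decoding. If the tags are registered as special tokens, decoding with `skip_special_tokens=True` may strip them, in which case `skip_special_tokens=False` would be needed.

```python
import re
from typing import Dict

# Header fields: the tag and the rest of its line.
HEADER_RE = re.compile(r"<\|(AUTHOR|TITLE|YEAR)\|>[ \t]*([^\n<]*)")
# A strophe runs from STROPHE_START to STROPHE_END, to the next
# STROPHE_START, or to the end of the text if generation was cut off.
STROPHE_RE = re.compile(
    r"<\|STROPHE_START\|>(.*?)"
    r"(?:<\|STROPHE_END\|>|(?=<\|STROPHE_START\|>)|\Z)",
    re.DOTALL,
)
# Per-strophe fields.
FIELD_RE = re.compile(r"<\|(METER|RHYME)\|>[ \t]*([^\n<]*)")


def parse_poem(text: str) -> Dict[str, object]:
    """Split decoded model output into header fields and strophes."""
    poem: Dict[str, object] = {
        "author": None, "title": None, "year": None, "strophes": [],
    }
    for tag, value in HEADER_RE.findall(text):
        poem[tag.lower()] = value.strip()
    for body in STROPHE_RE.findall(text):
        fields = {t.lower(): v.strip() for t, v in FIELD_RE.findall(body)}
        # Whatever remains after removing the tagged fields is verse text.
        verses = [ln.strip()
                  for ln in FIELD_RE.sub("", body).splitlines()
                  if ln.strip()]
        if fields or verses:  # skip a dangling, empty strophe marker
            poem["strophes"].append({
                "meter": fields.get("meter"),
                "rhyme": fields.get("rhyme"),
                "verses": verses,
            })
    return poem
```

Calling `parse_poem(decoded_cont)` returns a dictionary with `author`, `title`, `year`, and a list of strophes, each holding its meter, rhyme scheme, and verse lines. Because `max_length` can cut generation short, the final strophe may be incomplete.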
Model tree for jinymusim/TinyLlama-Czech-Poet
Base model
BUT-FIT/CSTinyLlama-1.2B