
PULI GPTrio (7.67 billion parameters)

For further details, read our paper. To test our instruct model, see our demo site.

  • Hungarian-English-Chinese trilingual GPT-NeoX model (7.67 billion parameters)
  • Trained with EleutherAI's GPT-NeoX codebase (GitHub)
  • Checkpoint: 410 000 steps

Dataset

  • Hungarian: 41.5 billion words (314 GB)
  • English: 61.9 billion words (391 GB)
  • GitHub: 6 million documents (33 GB)
  • Chinese: 98.7 billion Chinese characters (340 GB)
    • (12 billion non-Chinese tokens)
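
The sizes listed above can be added up to get a rough picture of the language mix. The following is a minimal sketch, assuming the four sizes are disjoint and can simply be summed; the variable names are illustrative and the shares are by size on disk, not by token count.

```python
# Corpus sizes in GB, copied from the dataset list above.
corpus_gb = {
    "Hungarian": 314,
    "English": 391,
    "GitHub": 33,
    "Chinese": 340,
}

# Assumption: the four subcorpora do not overlap, so sizes are additive.
total_gb = sum(corpus_gb.values())

# Share of each subcorpus by size on disk (not by token count).
shares = {name: gb / total_gb for name, gb in corpus_gb.items()}

for name, share in shares.items():
    print(f"{name}: {corpus_gb[name]} GB ({share:.1%})")
print(f"Total: {total_gb} GB")
```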

Limitations

  • max_seq_length = 2048
  • float16
  • vocab size: 150 016
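
These limits translate into two practical numbers: how much memory the weights need, and how many tokens can be generated after a given prompt. The sketch below is a back-of-the-envelope estimate only. The helper names are hypothetical, the memory figure covers the weights alone (no activations or key-value cache), and it assumes the prompt and the generated tokens share the 2048-token window.

```python
N_PARAMS = 7.67e9        # parameter count from the model card
BYTES_PER_PARAM = 2      # float16 stores each parameter in 2 bytes
MAX_SEQ_LENGTH = 2048    # context window from the list above


def weight_memory_gb(n_params=N_PARAMS, bytes_per_param=BYTES_PER_PARAM):
    """Approximate size of the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


def max_new_tokens(prompt_len, max_seq_length=MAX_SEQ_LENGTH):
    """Tokens that can still be generated after a prompt of prompt_len tokens."""
    # Assumption: prompt and continuation must fit in one context window.
    return max(0, max_seq_length - prompt_len)


print(f"float16 weights: about {weight_memory_gb():.2f} GB")
print(f"room after a 100-token prompt: {max_new_tokens(100)} tokens")
```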

Citation

If you use this model, please cite the following paper:

@inproceedings {yang-puli-gptrio,
    title = {Mono- and multilingual GPT-3 models for Hungarian},
    booktitle = {Text, Speech, and Dialogue},
    year = {2023},
    publisher = {Springer Nature Switzerland},
    series = {Lecture Notes in Computer Science},
    address = {Plzeň, Czech Republic},
    author = {Yang, Zijian Győző and Laki, László János and Váradi, Tamás and Prószéky, Gábor},
    pages = {94--104},
    isbn = {978-3-031-40498-6}
}

Usage

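from transformers import GPTNeoXForCausalLM, AutoTokenizer

# Load the model weights and the matching tokenizer from the Hugging Face Hub.
model = GPTNeoXForCausalLM.from_pretrained("NYTK/PULI-GPTrio")
tokenizer = AutoTokenizer.from_pretrained("NYTK/PULI-GPTrio")

# Hungarian prompt: "I will tell a story about language technology."
prompt = "Elmesélek egy történetet a nyelvtechnológiáról."
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Sample a continuation; max_length counts the prompt tokens as well.
gen_tokens = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.9,
    max_length=100,
)

# Decode the first (and only) generated sequence back to text.
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)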

Usage with pipeline

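from transformers import pipeline, GPTNeoXForCausalLM, AutoTokenizer

# Load the model and tokenizer, then wrap them in a text-generation pipeline.
model = GPTNeoXForCausalLM.from_pretrained("NYTK/PULI-GPTrio")
tokenizer = AutoTokenizer.from_pretrained("NYTK/PULI-GPTrio")

# Hungarian prompt: "I will tell a story about language technology."
prompt = "Elmesélek egy történetet a nyelvtechnológiáról."
generator = pipeline(task="text-generation", model=model, tokenizer=tokenizer)

# The pipeline returns a list of dicts; print the text of the first result.
print(generator(prompt)[0]["generated_text"])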