---
language:
- en
- ru
license: apache-2.0
tags:
- gpt
- NLG
---
|
|
|
# YaLM 100B
|
|
|
https://github.com/yandex/YaLM-100B
|
|
|
**YaLM 100B** is a GPT-like neural network for generating and processing text. It can be used freely by developers and researchers from all over the world.
|
|
|
The model has 100 billion parameters. It took 65 days to train on a cluster of 800 A100 graphics cards, using 1.7 TB of online texts, books, and countless other sources in both English and Russian.
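For a sense of scale, here is a rough back-of-the-envelope estimate (an illustrative sketch, not a figure from the repository) of the memory needed just to hold 100 billion parameters in half precision:

```python
# Rough memory estimate for storing YaLM 100B's weights.
# Assumes fp16/bf16 (2 bytes per parameter); activations, optimizer
# state, and framework overhead are not counted.
PARAMS = 100_000_000_000   # 100 billion parameters
BYTES_PER_PARAM = 2        # half precision

total_bytes = PARAMS * BYTES_PER_PARAM
print(f"{total_bytes / 1e9:.0f} GB just for the weights")      # 200 GB

# Spread across a hypothetical node of 8 x 80 GB A100s:
per_gpu_gb = total_bytes / 1e9 / 8
print(f"{per_gpu_gb:.0f} GB per GPU on an 8-GPU node")         # 25 GB
```

Even before activations and inference buffers, the weights alone exceed any single GPU's memory, which is why models of this size are sharded across multiple devices.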
|
|
|
Training details and best practices on acceleration and stabilization can be found in the **[Medium](https://medium.com/p/d1df53d0e9a6)** (English) and **[Habr](https://habr.com/ru/company/yandex/blog/672396/)** (Russian) articles.
|
|