---
language:
- en
- ru
license: apache-2.0
tags:
- gpt
- NLG
---
|
|
|
# YaLM 100B
|
|
|
https://github.com/yandex/YaLM-100B
|
|
|
**YaLM 100B** is a GPT-like neural network for generating and processing text. It can be used freely by developers and researchers from all over the world.
|
|
|
The model has 100 billion parameters. It took 65 days to train on a cluster of 800 A100 graphics cards, using 1.7 TB of online texts, books, and countless other sources in both English and Russian.
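For a sense of scale, here is a rough back-of-the-envelope estimate (an illustrative sketch, not a figure from the repository) of the memory needed just to hold 100 billion parameters in half precision:

```python
# Rough memory estimate for storing YaLM 100B's weights.
# Assumes fp16/bf16 (2 bytes per parameter); activations, optimizer
# state, and framework overhead are not counted.
PARAMS = 100_000_000_000   # 100 billion parameters
BYTES_PER_PARAM = 2        # half precision

total_bytes = PARAMS * BYTES_PER_PARAM
print(f"{total_bytes / 1e9:.0f} GB just for the weights")      # 200 GB

# Spread across a hypothetical node of 8 x 80 GB A100s:
per_gpu_gb = total_bytes / 1e9 / 8
print(f"{per_gpu_gb:.0f} GB per GPU on an 8-GPU node")         # 25 GB
```

Even before activations and inference buffers, the weights alone exceed any single GPU's memory, which is why models of this size are sharded across multiple devices.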
|
|
|
Training details and best practices on acceleration and stabilization can be found in the **[Medium](https://medium.com/p/d1df53d0e9a6)** (English) and **[Habr](https://habr.com/ru/company/yandex/blog/672396/)** (Russian) articles.
|
|