nicholasKluge/TeenyTinyLlama-460m


nicholasKluge committed on Jan 17

Commit 1cba1a4 • 1 Parent(s): 7130df2

Update README.md

Files changed (1)

README.md +3 -2

README.md CHANGED

@@ -1,7 +1,7 @@
 ---
 license: apache-2.0
 datasets:
-- nicholasKluge/portuguese-corpus-v3
+- nicholasKluge/Pt-Corpus-Instruct
 language:
 - pt
 metrics:
@@ -40,12 +40,13 @@ co2_eq_emissions:
 Given the lack of available monolingual foundational models in non-English languages and the fact that some of the most used and downloaded models by the community are those small enough to allow individual researchers and hobbyists to use them in low-resource environments, we developed the TeenyTinyLlama: _a pair of small foundational models trained in Brazilian Portuguese._
 TeenyTinyLlama is a compact language model based on the Llama 2 architecture ([TinyLlama implementation](https://huggingface.co/TinyLlama)). This model is designed to deliver efficient natural language processing capabilities while being resource-conscious These models were trained by leveraging [scaling laws](https://arxiv.org/abs/2203.15556) to determine the optimal number of tokens per parameter while incorporating [preference pre-training](https://arxiv.org/abs/2112.00861).
 ## Details
 - **Architecture:** a Transformer-based model pre-trained via causal language modeling
 - **Size:** 468,239,360 parameters
 - **Context length:** 2048 tokens
-- **Dataset:** [Portuguese-Corpus-v3](https://huggingface.co/datasets/nicholasKluge/portuguese-corpus-v3) (6.2B tokens)
+- **Dataset:** [Pt-Corpus Instruct](https://huggingface.co/datasets/nicholasKluge/Pt-Corpus-Instruct) (6.2B tokens)
 - **Language:** Portuguese
 - **Number of steps:** 1,200,000
 - **GPU:** 1 NVIDIA A100-SXM4-40GB
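
Reviewer note: the README says scaling laws were used to pick the number of tokens per parameter, but it does not state the resulting ratio. The sketch below only works through the arithmetic implied by the figures in the Details list (468,239,360 parameters, a 6.2B-token dataset). The value of roughly 20 tokens per parameter is an assumption here, a rule of thumb commonly read off the linked scaling-laws paper, and is not a number taken from this README. The function names are hypothetical and exist only for this illustration.

```python
# Figures copied from the "Details" list in the README diff above.
N_PARAMETERS = 468_239_360   # "Size"
DATASET_TOKENS = 6.2e9       # "Dataset ... (6.2B tokens)", rounded in the source

# ASSUMPTION: ~20 tokens per parameter is a commonly quoted rule of thumb,
# not a value stated anywhere in this README.
ASSUMED_TOKENS_PER_PARAM = 20


def tokens_per_parameter(tokens: float, parameters: int) -> float:
    """Unique dataset tokens available per model parameter."""
    return tokens / parameters


def passes_needed(target_ratio: float, tokens: float, parameters: int) -> float:
    """Passes over the dataset needed to reach target_ratio tokens per parameter."""
    return target_ratio * parameters / tokens


ratio = tokens_per_parameter(DATASET_TOKENS, N_PARAMETERS)
passes = passes_needed(ASSUMED_TOKENS_PER_PARAM, DATASET_TOKENS, N_PARAMETERS)

print(f"Dataset tokens per parameter: {ratio:.2f}")
print(f"Passes over the dataset for {ASSUMED_TOKENS_PER_PARAM} tokens/parameter: {passes:.2f}")
```

Under these assumptions, a single pass over the dataset supplies fewer tokens per parameter than the assumed rule of thumb, so reaching it would take more than one pass. Whether the actual training run did this cannot be determined from the diff, since the batch size behind the 1,200,000 steps is not given.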