Tucano is a series of decoder-transformers based on the Llama 2 architecture, natively pre-trained in Portuguese.
Tucano (university organization)
AI & ML interests: Advancing Neural Text Generation for Portuguese
Tucano: Advancing Neural Text Generation for Portuguese
To stimulate the future of open development of neural text generation in Portuguese, we present both GigaVerbo, a concatenation of deduplicated Portuguese text corpora amounting to 200 billion tokens, and Tucano, a series of decoder-transformers natively pre-trained in Portuguese. All byproducts of our study, including the source code used for training and evaluation, are openly released on GitHub and Hugging Face.
Read our preprint on arXiv.
News
- [29/11/2024] Tucano is mentioned on Deutsche Welle: "Cientistas criam maior banco de dados em português para IA".
- [27/11/2024] Video presentation of Tucano at the C4AI (USP), available on YouTube.
- [12/11/2024] "Tucano: Advancing Neural Text Generation for Portuguese" is published as a preprint on arXiv, with all models and datasets released on Hugging Face.
Community Contributions 🤝
- Demo on how to run inference on Tucano.
- Demo on how to create a simple Chat UI for Tucano using Gradio.
- Tucano OpenVINO is a port of Tucano-2b4-Instruct optimized for inference with Intel OpenVINO.
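As a rough illustration of what running inference on Tucano can look like, here is a minimal sketch using the Hugging Face `transformers` text-generation pipeline. It is not the community demo itself: the helper names (`sampling_settings`, `generate_text`) and the sampling values are assumptions chosen for illustration, and the model ID is simply the smallest checkpoint listed on this page.

```python
# Sketch: text generation with a Tucano checkpoint via Hugging Face transformers.
# Assumes `pip install transformers torch`. Helper names and sampling values
# are illustrative assumptions, not settings recommended by the authors.

DEFAULT_MODEL_ID = "TucanoBR/Tucano-160m"  # smallest model listed on this page


def sampling_settings(max_new_tokens: int = 64, temperature: float = 0.7) -> dict:
    """Return generation keyword arguments (illustrative, untuned values)."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": 0.9,
    }


def generate_text(prompt: str, model_id: str = DEFAULT_MODEL_ID, **overrides) -> str:
    """Generate a continuation of `prompt`; downloads the model on first use."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    settings = {**sampling_settings(), **overrides}
    outputs = generator(prompt, **settings)
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]
```

Calling `generate_text("A floresta da Amazônia é conhecida por sua")` would fetch the checkpoint from the Hub and return the prompt followed by a sampled continuation. The larger checkpoints can be selected through `model_id`.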
Models (10)
- TucanoBR/Tucano-160m: Text Generation, 272 downloads, 1 like
- TucanoBR/Tucano-630m: Text Generation, 120 downloads, 1 like
- TucanoBR/Tucano-1b1: Text Generation, 493 downloads
- TucanoBR/Tucano-2b4: Text Generation, 284 downloads, 3 likes
- TucanoBR/Tucano-1b1-Instruct: Text Generation, 387 downloads, 1 like
- TucanoBR/Tucano-2b4-Instruct: Text Generation, 510 downloads, 2 likes
- TucanoBR/XGBRegressor-text-filter
- TucanoBR/BERTimbau-large-text-filter: Text Classification, 15 downloads
- TucanoBR/XGBClassifier-text-filter
- TucanoBR/BERTimbau-base-text-filter: Text Classification, 30 downloads
Datasets (6)
- TucanoBR/GigaVerbo: 145M rows, 1.95k downloads, 11 likes
- TucanoBR/GigaVerbo-Text-Filter: 110k rows, 87 downloads
- TucanoBR/Tucano-SFT: 680k rows, 98 downloads
- TucanoBR/alpaca-eval-pt: 805 rows, 50 downloads
- TucanoBR/lambada-pt: 5.15k rows, 49 downloads, 2 likes
- TucanoBR/wikipedia-PT: 1.1M rows, 74 downloads
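Because GigaVerbo is large, one plausible way to inspect it without downloading everything is to stream it with the Hugging Face `datasets` library. The sketch below is written under assumptions: the split name `"train"` and the helper names (`stream_dataset`, `take_rows`) are hypothetical, and no column names are assumed.

```python
# Sketch: peek at the first rows of a Hub dataset without a full download.
# Assumes `pip install datasets`. The "train" split name is an assumption.
from itertools import islice
from typing import Any, Dict, Iterable, List


def take_rows(rows: Iterable[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    """Materialize at most the first `n` rows of a (possibly streaming) dataset."""
    return list(islice(rows, n))


def stream_dataset(repo_id: str = "TucanoBR/GigaVerbo", split: str = "train"):
    """Open a dataset in streaming mode; rows are fetched lazily on iteration."""
    # Imported lazily so this module loads even without datasets installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split, streaming=True)
```

For example, `take_rows(stream_dataset(), 3)` would return the first three rows as dictionaries. The same helpers apply to the smaller datasets listed above by changing `repo_id`.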