Instructions to use Fernandosr85/afrobr-langbench-adapter with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use Fernandosr85/afrobr-langbench-adapter with PEFT:
Task type is invalid.
- Notebooks
- Google Colab
- Kaggle
AfroBR-LangBench — Afro-Brazilian Portuguese Sociolinguistics Adapter
LoRA adapter fine-tuned on Llama-4-Scout-17B-16E-Instruct (109B) for respectful sociolinguistic reasoning about Afro-Brazilian Portuguese varieties, via Adaption's AutoScientist platform.
The problem this adapter addresses
Language models trained on standard text systematically treat Afro-Brazilian Portuguese features as errors rather than documented linguistic phenomena. When given "eles foi lá", a base model typically responds:
"Correction: the correct form is 'eles foram lá'"
The sociolinguistically adequate response is:
"This exemplifies Concordância Verbal Reduzida (CVR), documented in quilombola communities and studied by Lucchesi et al. (2009). Its origin lies in contact between colonial Portuguese and Bantu languages..."
This adapter teaches the model to explain, normalize respectfully, identify, and cite academic sources for 10 documented phenomena.
Adaptive Data results
| Metric | Before | After |
|---|---|---|
| Quality score | 6.0 | 9.1 |
| Quality grade | C | A |
| Relative improvement | — | +51.7% |
| Percentile (Language domain) | 8.2 | 33.0 |
Training metrics
| Metric | Value |
|---|---|
| Base model | meta-llama/Llama-4-Scout-17B-16E-Instruct (109B) |
| Trained model name | adaption_pt_afro_brasileiro_qa |
| Training method | SFT + LoRA |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.1 |
| Trainable modules | all-linear |
| Epochs | 4 |
| Training steps | 88 |
| Learning rate | 7e-5 (cosine scheduler) |
| Warmup ratio | 0.05 |
| Weight decay | 0.03 |
| Dataset size | 400 examples (Grade A) |
Dataset
| Platform | Link |
|---|---|
| HuggingFace Dataset | Fernandosr85/adaption-pt-afro-brasileiro-qa |
| Kaggle Dataset | afrobr-langbench-sociolinguistics-dataset |
| Kaggle Notebook | AfroBR-LangBench |
400 instruction-tuning examples across 4 task categories:
| Category | Task | Examples |
|---|---|---|
| A | Sociolinguistic explanation without prejudice | 150 |
| B | Respectful normalization to standard register | 100 |
| C | Identification of linguistic phenomena | 100 |
| D | RAG-style questions with academic citations | 50 |
10 documented phenomena
| Code | Phenomenon |
|---|---|
| CVR | Concordância Verbal Reduzida |
| CNR | Concordância Nominal Reduzida |
| APR | Apagamento do /r/ em coda silábica |
| TOP | Topicalização com Deslocamento à Esquerda |
| AGT | Uso de 'a gente' como pronome de 1ª pessoa do plural |
| NPV | Negação Pós-verbal |
| MON | Monotongação de Ditongos |
| PREP | Variação no Uso de Preposições |
| CLC | Ausência de Clítico Acusativo de 3ª Pessoa |
| MAA | Marcadores Aspectuais de Origem Africana |
Academic sources
- Lucchesi, D., Baxter, A., Ribeiro, I. (2009). O Português Afro-Brasileiro. EDUFBA.
- Projeto Vertentes (UFBA, 2001–) — speech corpus from quilombola communities in Bahia
- Cyrino, S. (1997). O objeto nulo no Português do Brasil. UNICAMP.
- Galves, C. (2001). Ensaios sobre as gramáticas do português. UNICAMP.
- Schwenter, S. A. (2005). The pragmatics of negation in Brazilian Portuguese. Lingua.
- Holm, J. (2004). Languages in Contact: The Partial Restructuring of Vernaculars. Cambridge University Press.
- Castro, Y. P. (2001). Línguas africanas no Brasil. CEAO/UFBA.
Credits
- Fine-tuning platform: Adaption — AutoScientist & Adaptive Data
- Challenge: AutoScientist Challenge 2026
- Training infrastructure: Adaption compute credits
- Dataset remastering: Adaption Adaptive Data pipeline (Grade A, +51.7% quality improvement)
- Author: Fernando Rodrigues · Kaggle: fernandosr85 · HuggingFace: Fernandosr85
Disclaimer
Experimental research artifact submitted to AutoScientist Challenge 2026 (Language category). This adapter is intended for linguistic research and education. It does not represent or speak for Afro-Brazilian communities.
- Downloads last month
- -
Model tree for Fernandosr85/afrobr-langbench-adapter
Base model
meta-llama/Llama-4-Scout-17B-16E