Libraries

How to use guaran-ia/coreguapa-quality-lm with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="guaran-ia/coreguapa-quality-lm")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("guaran-ia/coreguapa-quality-lm")
model = AutoModelForCausalLM.from_pretrained("guaran-ia/coreguapa-quality-lm")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use guaran-ia/coreguapa-quality-lm with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "guaran-ia/coreguapa-quality-lm"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "guaran-ia/coreguapa-quality-lm",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/guaran-ia/coreguapa-quality-lm

SGLang

How to use guaran-ia/coreguapa-quality-lm with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "guaran-ia/coreguapa-quality-lm" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "guaran-ia/coreguapa-quality-lm",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "guaran-ia/coreguapa-quality-lm" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "guaran-ia/coreguapa-quality-lm",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use guaran-ia/coreguapa-quality-lm with Docker Model Runner:
```
docker model run hf.co/guaran-ia/coreguapa-quality-lm
```

Model Card: Coreguapa Quality LM

This model is intended for validating the quality of Guaraní text. It was trained on the Coreguapa corpus, a restricted dataset manually compiled and curated by the Paraguayan Secretariat of Linguistic Policies. The corpus contains high-quality documents, primarily copyrighted materials.

Although the model uses a transformer architecture (Gemma 2), it was not developed as a generative tool. Its primary use is to compute the perplexity of Guaraní documents: a lower perplexity suggests that a text is more predictable to the model and therefore more similar to the high-quality reference corpus.
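As a rough illustration of the perplexity scoring described above, the sketch below computes perplexity as the exponential of the mean negative log-likelihood per token. The helper names `perplexity_from_logprobs` and `document_perplexity`, and the truncation to 2048 tokens, are assumptions made for this example; they are not taken from the project's own inference code.

```python
import math


def perplexity_from_logprobs(token_logprobs):
    """Perplexity = exp(-(1/N) * sum of log p(token_i | previous tokens))."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


def document_perplexity(text, model, tokenizer, max_length=2048):
    """Score one document with a causal LM loaded via Transformers.

    Assumption: `model` is an AutoModelForCausalLM and `tokenizer` is the
    matching AutoTokenizer; longer inputs are truncated to `max_length`.
    """
    import torch  # lazy import so the pure-Python helper above has no dependencies

    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_length)
    enc = enc.to(model.device)
    with torch.no_grad():
        # With labels supplied, the model returns the mean cross-entropy
        # (natural log) over the predicted tokens.
        out = model(**enc, labels=enc["input_ids"])
    return math.exp(out.loss.item())


# Sanity check: a model that gives every token probability 1/4
# should have a perplexity of about 4.
uniform = [math.log(0.25)] * 10
print(round(perplexity_from_logprobs(uniform), 6))
```

Documents can then be ranked by `document_perplexity`, with lower scores treated as closer to the reference corpus. Any cut-off value would have to be chosen empirically, since the card does not specify one.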

Summary

Model type: Gemma2ForCausalLM
Base model: princeton-nlp/gemma-2-9b-it-SimPO
Fine-tuning method: Full fine-tuning (all model weights updated)
Dataset: Guaraní corpus derived from data/coreguapa_identified_all.jsonl
Primary task: Causal language modeling / text generation
Training target: output/full_cpt_202605261711

Model Details

Architecture: Gemma2ForCausalLM
Number of layers: 42
Hidden size: 3584
Attention heads: 16
Feedforward intermediate size: 14336
Vocabulary size: 256000
Maximum context length: 8192 tokens
Precision: float16
Tokenizer: saved in this folder via tokenizer.json and tokenizer_config.json
Generation config: saved in generation_config.json
Prompt template: chat_template.jinja

Training Details

Training script: src/train.py
Training configuration:
- config/common_config.yaml
- config/full_config.yaml
Batch size: 1
Gradient accumulation: 1
Learning rate: 2e-5
Weight decay: 0.01
Warmup steps: 100
Optimizer: paged_adamw_8bit
Scheduler: linear
Epochs: 2
Precision mode: bf16 where available
Gradient checkpointing: enabled
Dataset preprocessing: src/preprocess_data.py
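For readers who want to reproduce a similar setup, the hyperparameters listed above map roughly onto Hugging Face `TrainingArguments` as shown below. This is a sketch, not the contents of src/train.py or the YAML configs, which are not shown here. The names `TRAINING_KWARGS`, `effective_batch_size`, and `build_training_arguments` are hypothetical, and reading "Batch size: 1" as a per-device batch size is an assumption, as is the bf16 detection logic.

```python
# Values copied from the list above; keys are transformers.TrainingArguments
# keyword names.
TRAINING_KWARGS = {
    "output_dir": "output/full_cpt_202605261711",
    "per_device_train_batch_size": 1,  # assumption: listed batch size is per device
    "gradient_accumulation_steps": 1,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "warmup_steps": 100,
    "optim": "paged_adamw_8bit",  # needs the bitsandbytes package
    "lr_scheduler_type": "linear",
    "num_train_epochs": 2,
    "gradient_checkpointing": True,
}


def effective_batch_size(kwargs, num_devices=1):
    """Sequences contributing to each optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_arguments():
    """Build TrainingArguments, enabling bf16 only where the GPU supports it."""
    import torch
    from transformers import TrainingArguments

    use_bf16 = torch.cuda.is_available() and torch.cuda.is_bf16_supported()
    return TrainingArguments(bf16=use_bf16, **TRAINING_KWARGS)


print(effective_batch_size(TRAINING_KWARGS))  # prints 1
```

With a batch size of 1 and no gradient accumulation, each optimizer step sees a single sequence per device, which keeps memory use low for full fine-tuning of a 9B-parameter model.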

Dataset and Preprocessing

Raw source file: private
Processed dataset directory: private
Split strategy: train / validation / test via src/preprocess_data.py
Sequence length used for tokenization: 2048
Tokenizer source: princeton-nlp/gemma-2-9b-it-SimPO
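The card does not say how documents longer than 2048 tokens are handled. One common approach in continued pre-training is to split each tokenized document into consecutive blocks of at most 2048 token IDs. It is shown here purely as an assumption about what src/preprocess_data.py might do. The function name `split_into_blocks` is hypothetical.

```python
def split_into_blocks(token_ids, block_size=2048):
    """Split a list of token IDs into consecutive blocks of at most block_size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [token_ids[i:i + block_size] for i in range(0, len(token_ids), block_size)]


# Toy example: fake token IDs standing in for real tokenizer output.
ids = list(range(5000))
blocks = split_into_blocks(ids)
print([len(b) for b in blocks])  # [2048, 2048, 904]
```

The alternative is simple truncation, which keeps only the first block and discards the rest of each document.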

Usage

Use the model with Hugging Face Transformers as follows:

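from transformers import AutoModelForCausalLM, AutoTokenizer

# Local checkpoint directory produced by training; when loading from the
# Hub, use "guaran-ia/coreguapa-quality-lm" instead.
model_path = 'output/full_cpt_202605261711'
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

prompt = 'Your input text here.'
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
# Generate up to 128 new tokens; the decoded string includes the prompt.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))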

For inference logic used in this project, see src/inference.py.

Evaluation

Evaluation scripts in the repository include:

src/evaluate_base_vs_cpt.py: compares the base, LoRA, and fully fine-tuned models
src/inference.py: generates predictions from saved checkpoints
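The evaluation scripts themselves are not reproduced here. As a hedged sketch of how a comparison between a base and a fine-tuned model could be summarized, the snippet below aggregates per-document losses into a corpus-level perplexity, weighting each document by its token count. The function name `corpus_perplexity` is hypothetical, and the numbers in the example are placeholders, not results from this project.

```python
import math


def corpus_perplexity(doc_stats):
    """Corpus-level perplexity from (mean_nll_per_token, num_tokens) pairs.

    Each document is weighted by its token count so that long documents
    are not under-represented relative to short ones.
    """
    total_nll = 0.0
    total_tokens = 0
    for mean_nll, num_tokens in doc_stats:
        total_nll += mean_nll * num_tokens
        total_tokens += num_tokens
    if total_tokens == 0:
        raise ValueError("no tokens to score")
    return math.exp(total_nll / total_tokens)


# Placeholder statistics for two hypothetical models on the same test set.
base_stats = [(3.0, 100), (2.0, 300)]
tuned_stats = [(2.0, 100), (1.0, 300)]
print(corpus_perplexity(base_stats) > corpus_perplexity(tuned_stats))  # True
```

Averaging per-document perplexities directly would give a different number, so it helps to state which aggregation is used when reporting results.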

Limitations and Notes

The training data is drawn from a private Guaraní corpus.
The model may reflect biases present in the source corpus.
License metadata is provided in this folder.

License

This model checkpoint and accompanying files are released under the GNU General Public License v3 (GPLv3). See the LICENSE file in this directory for the full license text.

Caveats

This file is generated from the available project configuration and model metadata.
If you need exact license or authorship details, consult the repository maintainers or project documentation.

Safetensors

Model size: 9B params
Tensor type: F16