
Llama-3.1-8B-Instruct-Galician

This model is a continued-pretraining version of meta-llama/Llama-3.1-8B-Instruct, trained on the CorpusNós dataset.

Model Description

How to Get Started with the Model

import transformers
import torch

model_id = "irlab-udc/Llama-3.1-8B-Instruct-Galician"

# Load the model in bfloat16 and let device_map="auto" place it on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Chat-style input: a system prompt followed by the user's question (in Galician).
messages = [
    {"role": "system", "content": "You are a conversational AI that always responds in Galician."},
    {"role": "user", "content": "Cal é a principal vantaxe de usar Scrum?"},
]

outputs = pipeline(messages, max_new_tokens=512)

# The generated conversation ends with the assistant's reply; print its text.
print(outputs[0]["generated_text"][-1]["content"])
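The last line of the snippet indexes into the pipeline's return value. As a minimal sketch of that structure, assuming the chat-style output the snippet relies on (a list with one dict whose "generated_text" holds the full message list, ending with the assistant turn), the helpers below build the input messages and pull out the reply. `build_messages` and `extract_reply` are hypothetical names introduced here for illustration; they are not part of transformers.

```python
from typing import Dict, List

SYSTEM_PROMPT = "You are a conversational AI that always responds in Galician."


def build_messages(question: str, system_prompt: str = SYSTEM_PROMPT) -> List[Dict[str, str]]:
    """Build a system + user chat in the same format as the snippet above."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def extract_reply(outputs: List[dict]) -> str:
    """Return the text of the final message of the first generated conversation.

    Assumes the output layout used above: outputs[0]["generated_text"] is a
    list of messages whose last entry is the assistant's reply.
    """
    last_message = outputs[0]["generated_text"][-1]
    if last_message.get("role") != "assistant":
        raise ValueError("expected the last message to come from the assistant")
    return last_message["content"]


# Mock output (not real model output) that mimics the assumed layout.
question = "Cal é a principal vantaxe de usar Scrum?"
mock_outputs = [
    {
        "generated_text": build_messages(question)
        + [{"role": "assistant", "content": "(resposta do modelo)"}]
    }
]
print(extract_reply(mock_outputs))
```

With a real pipeline object, `extract_reply(pipeline(build_messages(question), max_new_tokens=512))` would play the role of the final `print` line in the snippet.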


Training Details


Training Data

The model was trained on the CorpusNós dataset.

Training Hyperparameters

| Parameter | Value |
|---|---|
| learning_rate | 0.0001 |
| train_batch_size | 32 |
| eval_batch_size | 1 |
| seed | 42 |
| distributed_type | multi-GPU |
| num_devices | 4 |
| gradient_accumulation_steps | 2 |
| total_train_batch_size | 256 |
| total_eval_batch_size | 4 |
| optimizer | Adam with betas=(0.9, 0.999), epsilon=1e-08 |
| lr_scheduler_type | cosine |
| lr_scheduler_warmup_ratio | 0.1 |
| num_epochs | 1.0 |
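The total batch sizes follow from the per-device values: per-device batch size times number of devices times gradient accumulation steps. The sketch below reproduces that arithmetic and also illustrates one common reading of "cosine" with a warmup ratio of 0.1: linear warmup to the peak learning rate, then cosine decay to zero. The linear warmup shape, the decay to zero, and the 1000-step horizon in the example are assumptions for illustration; the card does not specify them, and the function names are hypothetical.

```python
import math


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Sequences consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


def lr_at(step: int, total_steps: int, peak_lr: float = 1e-4, warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step`, assuming linear warmup followed by cosine decay to 0."""
    warmup_steps = int(warmup_ratio * total_steps)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_train = effective_batch_size(per_device=32, num_devices=4, grad_accum_steps=2)
total_eval = effective_batch_size(per_device=1, num_devices=4)
print(total_train, total_eval)  # 256 4

# Illustrative horizon of 1000 steps (an assumption, not the real run length).
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, total_steps=1000))
```

Gradient accumulation only applies to training, which is why the evaluation total is just the per-device size times the device count.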

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 2.0606 | 0.1682 | 900 | 2.0613 |
| 1.9898 | 0.3363 | 1800 | 1.9929 |
| 1.9847 | 0.5045 | 2700 | 1.9613 |
| 1.9577 | 0.6726 | 3600 | 1.9445 |
| 1.9287 | 0.8408 | 4500 | 1.9368 |
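Two quantities can be read off the table with a little arithmetic: the approximate number of optimizer steps in one epoch (step divided by epoch fraction) and the validation perplexity (the exponential of the loss). This is a rough sketch, assuming the reported losses are mean per-token cross-entropy in nats; the resulting perplexities are tied to the Llama 3.1 tokenizer and are not comparable across tokenizers.

```python
import math

# (training loss, epoch fraction, step, validation loss), copied from the table.
rows = [
    (2.0606, 0.1682, 900, 2.0613),
    (1.9898, 0.3363, 1800, 1.9929),
    (1.9847, 0.5045, 2700, 1.9613),
    (1.9577, 0.6726, 3600, 1.9445),
    (1.9287, 0.8408, 4500, 1.9368),
]

# Each row gives an independent estimate of steps per epoch; average them.
estimates = [step / epoch for _, epoch, step, _ in rows]
steps_per_epoch = sum(estimates) / len(estimates)
print(f"estimated steps per epoch: {steps_per_epoch:.0f}")

# Perplexity = exp(mean cross-entropy), under the assumption stated above.
perplexities = [math.exp(val_loss) for *_, val_loss in rows]
for (_, _, step, val_loss), ppl in zip(rows, perplexities):
    print(f"step {step}: validation loss {val_loss:.4f} -> perplexity {ppl:.2f}")
```

The epoch fractions are rounded to four decimals, so the per-row estimates of steps per epoch differ slightly; the average is only an approximation of the true run length.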

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: 4x NVIDIA A100 SXM4 80 GB (TDP of 400W)
  • Hours used: 60
  • Cloud Provider: Private infrastructure
  • Carbon Emitted: 10.37 kg CO₂ eq.
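The figures above can be related by simple arithmetic: energy is power times time, and emissions are energy times the carbon intensity of the grid. The sketch below is a back-of-the-envelope check, assuming all four GPUs draw their full 400 W TDP for the whole 60 hours and ignoring CPU, cooling, and datacenter overhead. The carbon intensity is not stated in the card; the value printed here is only the one implied by the reported total under those assumptions.

```python
NUM_GPUS = 4
TDP_WATTS = 400          # per-GPU TDP from the hardware line
HOURS_USED = 60
REPORTED_KG_CO2EQ = 10.37

# Energy upper bound under the full-TDP assumption, in kWh.
energy_kwh = NUM_GPUS * TDP_WATTS * HOURS_USED / 1000
print(f"energy: {energy_kwh:.1f} kWh")

# Carbon intensity implied by the reported emissions (kg CO2 eq. per kWh).
implied_intensity = REPORTED_KG_CO2EQ / energy_kwh
print(f"implied carbon intensity: {implied_intensity:.3f} kg CO2 eq./kWh")
```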

Citation

Coming soon
