Model Card for HiTZ/Latxa-Qwen3.5-2B
Latxa-Qwen3.5-2B is a Basque-adapted multimodal and multilingual instruct model built on top of Qwen3.5-2B, a powerful vision-language LLM capable of understanding and generating text and processing images. This model has been adapted by the HiTZ Research Center for improved performance on Basque, Galician and Catalan languages and interactive instruction following.
In addition to Basque, the model has been trained on Catalan and Galician data.
Model Details
Model Description
Latxa Vision models are a family of Vision-Language Models based on Qwen3.5. The models were adapted to different languages following the adaptation method of Sainz et al. (2025).
- Developed by: HiTZ Research Center & IXA Research group (University of the Basque Country UPV/EHU)
- Funded by: IKER-GAITU and ALIA projects (Basque and Spanish Governments)
- Model type: Vision-Language Instruct Model
- Language(s) (NLP): Basque, Galician, Catalan, Spanish, English and more.
- License: Apache 2.0
- Finetuned from model: Qwen3.5-2B
Getting Started
Use the code below to get started with the model.
from transformers import pipeline

# Load the image-text-to-text pipeline
pipe = pipeline("image-text-to-text", model="HiTZ/Latxa-Qwen3.5-2B")

# A message's content can mix several types (here: one image and one text prompt)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png"},
            {"type": "text", "text": "What do we see in this image?"},
        ]
    }
]

# Run generation and print the raw pipeline output
output = pipe(messages)
print(output)
We recommend using the following sets of sampling parameters for generation:
- Thinking mode for text tasks:
temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
- Thinking mode for VL or precise coding (e.g. WebDev) tasks:
temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
- Non-thinking mode for text tasks:
temperature=1.0, top_p=1.00, top_k=20, min_p=0.0, presence_penalty=2.0, repetition_penalty=1.0
- Non-thinking mode for VL tasks:
temperature=0.7, top_p=0.80, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
Please note that support for sampling parameters varies across inference frameworks.
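The recommended presets can be kept in a small lookup table so that the matching set is picked per request. The sketch below is only illustrative: `SAMPLING_PRESETS` and `sampling_params` are hypothetical names, not part of any library, and the optional `supported` filter is an assumed way to cope with frameworks that reject some keys (for instance `presence_penalty` or `min_p`).

```python
# Hypothetical helper: these names are illustrative, not a library API.
SAMPLING_PRESETS = {
    # (thinking mode enabled?, task kind) -> recommended sampling parameters
    (True, "text"): dict(temperature=1.0, top_p=0.95, top_k=20, min_p=0.0,
                         presence_penalty=1.5, repetition_penalty=1.0),
    # Also recommended for precise coding tasks in thinking mode.
    (True, "vl"): dict(temperature=0.6, top_p=0.95, top_k=20, min_p=0.0,
                       presence_penalty=0.0, repetition_penalty=1.0),
    (False, "text"): dict(temperature=1.0, top_p=1.00, top_k=20, min_p=0.0,
                          presence_penalty=2.0, repetition_penalty=1.0),
    (False, "vl"): dict(temperature=0.7, top_p=0.80, top_k=20, min_p=0.0,
                        presence_penalty=1.5, repetition_penalty=1.0),
}


def sampling_params(thinking, task, supported=None):
    """Return a copy of the preset for (thinking, task).

    If `supported` is given, keep only the keys that the target
    inference framework is known to accept.
    """
    if task not in ("text", "vl"):
        raise ValueError("task must be 'text' or 'vl'")
    preset = dict(SAMPLING_PRESETS[(bool(thinking), task)])
    if supported is not None:
        preset = {k: v for k, v in preset.items() if k in supported}
    return preset


# Example: thinking mode on a vision-language task.
params = sampling_params(thinking=True, task="vl")
print(params)
```

The resulting dictionary can then be passed on to whichever generation call or request body the chosen framework expects, after dropping any keys it does not support.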
Uses
Latxa models are intended to be used with Basque data; for any other language the performance is not guaranteed.
The multi variant was additionally adapted for Galician and Catalan.
Direct Use
Latxa Instruct models are trained to follow instructions or to work as chat assistants.
Out-of-Scope Use
The model is not intended for malicious activities, such as harming others or violating human rights. Any downstream application must comply with current laws and regulations. Irresponsible usage in production environments without proper risk assessment and mitigation is also discouraged.
Bias, Risks, and Limitations
In an effort to reduce potentially disturbing or harmful content, Latxa has been trained on carefully selected and processed data, which comes mainly from local media, national/regional newspapers, encyclopedias and blogs (see Latxa Corpus v2). Still, the model is based on Qwen3.5 and can potentially carry the same biases, risks and limitations.
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
Training Details
Training Data
For training details, please refer to our paper: Instructing Large Language Models for Low-Resource Languages: A Systematic Study for Basque
Evaluation
We evaluated the models using 5-shot settings on multiple-choice and generative tasks.
| Task | Q3-VL-2B | Q3-VL-2B eu | Q3-VL-2B multi | Q3.5-2B | Q3.5-2B multi |
|---|---|---|---|---|---|
| Arc Challenge | 36.95 | 51.28 (+14.33) | 55.20 (+18.25) | 45.05 | 59.72 (+14.67) |
| Arc Easy | 43.27 | 65.99 (+22.72) | 69.95 (+26.68) | 54.76 | 73.61 (+18.85) |
| BeleBele | 46.00 | 65.44 (+19.44) | 60.67 (+14.67) | 56.33 | 66.89 (+10.56) |
| BertaQA global | 46.03 | 53.43 (+7.40) | 56.81 (+10.78) | 49.50 | 60.36 (+10.86) |
| BertaQA local | 37.27 | 42.51 (+5.24) | 44.46 (+7.19) | 37.01 | 52.45 (+15.44) |
| BL2MP | 49.11 | 87.94 (+38.83) | 89.22 (+40.11) | 60.89 | 90.06 (+29.17) |
| Eus Exams | 33.81 | 42.44 (+8.63) | 42.81 (+9.00) | 36.86 | 43.30 (+6.44) |
| Eus Proficiency | 25.69 | 36.45 (+10.76) | 36.58 (+10.89) | 27.80 | 43.39 (+15.59) |
| Eus Reading | 25.85 | 47.73 (+21.45) | 41.76 (+15.91) | 40.91 | 45.17 (+4.26) |
| Eus Trivia | 35.04 | 40.41 (+5.37) | 42.04 (+7.00) | 38.13 | 52.59 (+14.46) |
| MGSM CoT | 13.10 | 33.20 (+20.10) | 34.00 (+20.90) | 21.60 | 38.80 (+17.20) |
| MMLU | 34.07 | 43.33 (+9.26) | 45.93 (+11.86) | 42.22 | 47.40 (+5.18) |
| OpenBook QA | 30.20 | 50.40 (+20.20) | 54.60 (+24.40) | 45.40 | 57.00 (+11.60) |
| PIQA | 53.70 | 55.17 (+1.47) | 54.08 (+0.38) | 52.94 | 59.53 (+6.59) |
| SIQA | 38.18 | 48.26 (+10.08) | 50.31 (+12.13) | 42.94 | 50.87 (+7.93) |
| X-StoryCloze | 50.50 | 56.98 (+6.48) | 57.05 (+6.55) | 52.08 | 58.30 (+6.22) |
| AVG EU | 38.77 | 51.31 (+12.54) | 52.22 (+13.45) | 44.03 | 56.22 (+12.19) |
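The values in parentheses appear to be absolute point differences with respect to the corresponding base model in the same family (Q3-VL-2B or Q3.5-2B). A short sketch of that arithmetic, using the AVG EU row; the variable and function names are illustrative only.

```python
# Scores copied from the "AVG EU" row of the table above.
base_q3_vl, q3_vl_eu, q3_vl_multi = 38.77, 51.31, 52.22
base_q3_5, q3_5_multi = 44.03, 56.22


def gain(adapted, base):
    """Absolute improvement in points, rounded to two decimals."""
    return round(adapted - base, 2)


# Each adapted model is compared against the base model of its own family.
print(f"Q3-VL-2B eu:    +{gain(q3_vl_eu, base_q3_vl):.2f}")
print(f"Q3-VL-2B multi: +{gain(q3_vl_multi, base_q3_vl):.2f}")
print(f"Q3.5-2B multi:  +{gain(q3_5_multi, base_q3_5):.2f}")
```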
DISCLAIMER
These models are still under development. Results are reported only for Basque tasks; results for the remaining languages will be released in the near future.
Citation
@inproceedings{sainz-etal-2025-instructing,
title = "Instructing Large Language Models for Low-Resource Languages: A Systematic Study for {B}asque",
author = "Sainz, Oscar and
Perez, Naiara and
Etxaniz, Julen and
Fernandez de Landa, Joseba and
Aldabe, Itziar and
Garc{\'i}a-Ferrero, Iker and
Zabala, Aimar and
Azurmendi, Ekhi and
Rigau, German and
Agirre, Eneko and
Artetxe, Mikel and
Soroa, Aitor",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.emnlp-main.1484/",
doi = "10.18653/v1/2025.emnlp-main.1484",
pages = "29124--29148",
ISBN = "979-8-89176-332-6",
abstract = "Instructing language models with user intent requires large instruction datasets, which are only available for a limited set of languages. In this paper, we explore alternatives to conventional instruction adaptation pipelines in low-resource scenarios. We assume a realistic scenario for low-resource languages, where only the following are available: corpora in the target language, existing open-weight multilingual base and instructed backbone LLMs, and synthetically generated instructions sampled from the instructed backbone. We present a comprehensive set of experiments for Basque that systematically study different combinations of these components evaluated on benchmarks and human preferences from 1,680 participants. Our conclusions show that target language corpora are essential, with synthetic instructions yielding robust models, and, most importantly, that using as backbone an instruction-tuned model outperforms using a base non-instructed model. Scaling up to Llama 3.1 Instruct 70B as backbone, our model comes near frontier models of much larger sizes for Basque, without using any Basque instructions. We release code, models, instruction datasets, and human preferences to support full reproducibility in future research on low-resource language adaptation."
}
Acknowledgements
This work has been partially supported by the Basque Government (Research group funding IT1570-22 and IKER-GAITU project), the Spanish Ministry for Digital Transformation and of Civil Service, and the EU-funded NextGenerationEU Recovery, Transformation and Resilience Plan (ALIA project). The models were trained on the Leonardo supercomputer at CINECA under the EuroHPC Joint Undertaking, project EHPC-EXT-2024E01-042.