Qwen2.5-7B-EU-Funding-Expert

A fine-tuned version of Qwen/Qwen2.5-7B-Instruct specialized in European Union research funding programmes. This model has been trained on 435K+ question-answer pairs derived from the official CORDIS (Community Research and Development Information Service) database.

What it knows

The model covers six EU framework programmes, from FP4 through Horizon Europe:

Programme	Period	Projects
FP4	1994–1998	~13K
FP5	1998–2002	~17K
FP6	2002–2006	~10K
FP7	2007–2013	~26K
Horizon 2020	2014–2020	~36K
Horizon Europe	2021–2027	~24K

It can answer questions about:

🔬 Project details: objectives, methodology, expected impacts
💰 Funding information: total costs, EC contributions, funding schemes
📅 Timelines: start/end dates, project duration
🏢 Organizations: coordinators, participants, country information
🧬 Scientific domains: EuroSciVoc classifications, research topics
📊 Programme-level statistics: funding distribution, topic clusters
🔄 Cross-programme comparisons: funding trends across FP4→Horizon Europe

Usage

With PEFT (recommended)

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "RCaz/Qwen2.5-7B-EU-Funding-Expert")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

messages = [
    {"role": "system", "content": "You are an expert assistant specializing in European Union research funding programmes."},
    {"role": "user", "content": "What were the main funding priorities under Horizon 2020?"}
]

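# Render the chat messages into the prompt format the model expects.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))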
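
To ask several questions with the same system prompt, the message construction can be factored into a small helper. The sketch below is illustrative only: `build_messages`, `SYSTEM_PROMPT`, and `EXAMPLE_PROMPTS` are hypothetical names that are not part of this repository, and the two questions are taken from this card's example prompts.

```python
# Hypothetical helper names; only the system prompt text and the
# questions come from this model card.
SYSTEM_PROMPT = (
    "You are an expert assistant specializing in European Union "
    "research funding programmes."
)

EXAMPLE_PROMPTS = [
    "Tell me about the ITER project and its EU funding.",
    "Compare the total budgets of FP7 and Horizon 2020.",
]


def build_messages(question, system_prompt=SYSTEM_PROMPT):
    """Return a chat-style message list for a single user question."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question.strip()},
    ]


# One message list per question, each ready for tokenizer.apply_chat_template.
batch = [build_messages(q) for q in EXAMPLE_PROMPTS]
for messages in batch:
    print(messages[1]["content"])
```

Each element of `batch` has the same shape as the `messages` list in the usage snippet, so it can be passed to `tokenizer.apply_chat_template` unchanged.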

Example prompts

"Tell me about the ITER project and its EU funding."
"How much did the EU invest in quantum computing research under Horizon 2020?"
"Compare the total budgets of FP7 and Horizon 2020."
"Which organizations coordinated the most EU-funded AI projects?"
"What scientific domains received the highest funding in Horizon Europe?"

Training details

Parameter	Value
Base model	Qwen/Qwen2.5-7B-Instruct (7.62B params)
Method	LoRA SFT (Supervised Fine-Tuning)
LoRA rank	64
LoRA alpha	16
LoRA target	all-linear layers
LoRA dropout	0.05
Trainable params	~410M (5.4% of total)
Dataset	RCaz/eu-funding-cordis-qa
Training samples	413,499
Validation samples	21,764
Max sequence length	2048
Packing	BFD (Best-Fit Decreasing)
Avg tokens/sample	~285
Precision	bf16
Optimizer	AdamW (fused)
Learning rate	2e-4 (cosine schedule)
Batch size	1 × 8 grad accum = 8 effective
Hardware	NVIDIA L4 (24GB)
Flash Attention	2.0
Gradient checkpointing	✓
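
As a rough sanity check on the table, the figures can be combined into a back-of-the-envelope estimate of the token budget and optimizer steps per epoch. This is a sketch under stated assumptions, not a record of the actual run: it assumes the standard LoRA scaling of alpha / rank (no rank-stabilized variant) and that packing fills every 2048-token sequence completely, which real packing only approximates.

```python
import math

# Values copied from the training details table.
train_samples = 413_499
avg_tokens_per_sample = 285   # approximate, as reported
max_seq_len = 2048
micro_batch_size = 1
grad_accum_steps = 8
lora_rank = 64
lora_alpha = 16

# Assumption: standard LoRA, which scales the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank

# Sequences consumed per optimizer step.
effective_batch = micro_batch_size * grad_accum_steps

# Approximate number of training tokens in one pass over the data.
total_tokens = train_samples * avg_tokens_per_sample

# Assumption: perfect packing, so this is a lower bound on packed sequences.
packed_sequences = math.ceil(total_tokens / max_seq_len)
steps_per_epoch = math.ceil(packed_sequences / effective_batch)

print(f"LoRA scaling factor: {lora_scaling}")
print(f"Effective batch size: {effective_batch}")
print(f"Approx. tokens per epoch: {total_tokens:,}")
print(f"Packed sequences (lower bound): {packed_sequences:,}")
print(f"Optimizer steps per epoch (lower bound): {steps_per_epoch:,}")
```

Because the average token count is approximate and packing is never perfectly dense, the real step count would be somewhat higher than this estimate.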

Dataset

The training data was generated from official CORDIS CSV exports covering 126K+ EU-funded research projects. Eight types of Q&A conversations were created:

Project overview — What is the project about?
Funding details — How much funding did it receive?
Timeline — When did it start/end?
Organizations — Who coordinates/participates?
Scientific domains — What fields does it cover?
Topics — What EU topics/calls is it associated with?
Programme-level — Statistics and trends within a programme
Cross-programme — Comparisons across framework programmes

All conversations follow the ChatML format with a system prompt establishing EU funding expertise.
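
To make the format concrete, the sketch below renders one "Funding details" conversation as ChatML text. It is an illustration, not the actual generation script: the row keys (`acronym`, `totalCost`, `ecMaxContribution`), the question and answer templates, and the project values are placeholders chosen for the example.

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts as ChatML text."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


def funding_conversation(row, system_prompt):
    """Build a funding-details conversation from one project row.

    The row keys and the wording of the templates are hypothetical.
    """
    question = f"How much funding did the {row['acronym']} project receive?"
    answer = (
        f"The {row['acronym']} project had a total cost of EUR {row['totalCost']:,.2f}, "
        f"of which the EC contributed EUR {row['ecMaxContribution']:,.2f}."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},
    ]


# Placeholder project, not a real CORDIS record.
row = {"acronym": "DEMO", "totalCost": 1500000.0, "ecMaxContribution": 1200000.0}
system_prompt = (
    "You are an expert assistant specializing in European Union "
    "research funding programmes."
)

conversation = funding_conversation(row, system_prompt)
rendered = to_chatml(conversation)
print(rendered)
```

In practice the tokenizer's chat template produces this layout, so hand-rendering is mainly useful for inspecting what a training sample looks like.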

Limitations

Knowledge is based on CORDIS data snapshots and may not reflect the latest project updates.
The model is specialized for EU funding and may perform worse than the base model on general-knowledge tasks.
Financial figures and project details are at best as accurate as the source CORDIS data.
The model may occasionally hallucinate details, particularly for queries about specific projects.
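
Given these limitations, financial figures in a generated answer are worth checking against the source record. The sketch below shows one possible check, assuming amounts are written as `EUR 1,234,567.89` or `€1234567`. The function names, the regular expression, and the 1% tolerance are assumptions made for this example and are not part of the model or dataset tooling.

```python
import re

# Matches amounts prefixed by "EUR" or "€", with optional thousands
# separators and an optional decimal part.
EURO_PATTERN = re.compile(r"(?:EUR|€)\s?([0-9][0-9,]*(?:\.[0-9]+)?)")


def extract_euro_amounts(text):
    """Return every euro amount found in the text as a float."""
    return [float(m.replace(",", "")) for m in EURO_PATTERN.findall(text)]


def figure_is_supported(answer, reference_amount, rel_tol=0.01):
    """Check whether any amount in the answer is within rel_tol of the reference."""
    return any(
        abs(amount - reference_amount) <= rel_tol * reference_amount
        for amount in extract_euro_amounts(answer)
    )


# Placeholder answer and reference value, not real CORDIS data.
answer = "The EC contribution was €2,500,000."
print(figure_is_supported(answer, 2_500_000.0))
```

A check like this only catches mismatched numbers. It does not verify that the amount is attributed to the right project or field.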

Citation

If you use this model, please cite:

@misc{rcaz2026eufunding,
  title={Qwen2.5-7B-EU-Funding-Expert: A Fine-tuned LLM for European Research Funding},
  author={RCaz},
  year={2026},
  url={https://huggingface.co/RCaz/Qwen2.5-7B-EU-Funding-Expert}
}

Acknowledgments

Base model by Qwen Team
Fine-tuned using TRL and PEFT
