top-papers/Qwen3-VL-8B-Instruct-scireason

This repository contains the fine-tuned SciReason VLM artifacts produced by the DataSphere SFT + GRPO pipeline.

Contents

Root files: final GRPO adapter and processor files copied from outputs/hf_top_papers_qwen3vl_8b_grpo_lora for convenient loading.
artifacts/sft_lora/: SFT LoRA adapter directory copied from outputs/hf_top_papers_qwen3vl_8b_sft_lora.
artifacts/grpo_lora/: complete final GRPO output directory copied from outputs/hf_top_papers_qwen3vl_8b_grpo_lora.
artifacts/archives/: compressed .tar.gz archives produced by the job.
artifacts/data/: generated train/eval JSONL files and dataset summary.
artifacts/reports/: budget, final summary, upload manifest and runtime reports.
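
To fetch only part of the layout above rather than the whole repository, something like the following may work. It is a minimal sketch that assumes the `huggingface_hub` package is installed. The helper names `artifact_patterns` and `download_artifacts` are illustrative and not part of the original pipeline, and no file names inside the subdirectories are assumed.

```python
REPO_ID = "top-papers/Qwen3-VL-8B-Instruct-scireason"

# Subdirectories of artifacts/ as listed in this README.
ARTIFACT_DIRS = ("sft_lora", "grpo_lora", "archives", "data", "reports")


def artifact_patterns(*names):
    """Build glob patterns for the requested artifacts/ subdirectories."""
    unknown = [n for n in names if n not in ARTIFACT_DIRS]
    if unknown:
        raise ValueError(f"unknown artifact directories: {unknown}")
    return [f"artifacts/{n}/*" for n in names]


def download_artifacts(*names):
    """Download only the selected subdirectories; return the local snapshot path."""
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=artifact_patterns(*names),
    )
```

For example, `download_artifacts("data", "reports")` would pull the JSONL files and reports while skipping the adapter weights and archives.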

Training metadata

Base model: Qwen/Qwen3-VL-8B-Instruct
Dataset: top-papers/top-papers-graph-experts-data
Output prefix: hf_top_papers_qwen3vl_8b
Uploaded at UTC: 2026-06-22T07:38:23Z

Loading note

The root of this repository is prepared as the final GRPO adapter directory. For LoRA/PEFT loading, use the same base model listed above and load this repository as the adapter. The complete SFT and GRPO directories are also preserved under artifacts/ for auditability and reproducibility.
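
The loading steps described above can be sketched as follows. This is a hedged example, not the pipeline's own code: it assumes recent `transformers` and `peft` releases that support Qwen3-VL, that `artifacts/sft_lora/` holds a loadable PEFT adapter at its top level, and that PEFT forwards the `subfolder` argument to the Hub download. The function names `adapter_location` and `load_scireason` are hypothetical.

```python
BASE_MODEL = "Qwen/Qwen3-VL-8B-Instruct"
ADAPTER_REPO = "top-papers/Qwen3-VL-8B-Instruct-scireason"


def adapter_location(stage="grpo"):
    """Return (repo_id, subfolder) for the adapter of a given training stage."""
    if stage == "grpo":
        # The final GRPO adapter sits at the repository root.
        return ADAPTER_REPO, None
    if stage == "sft":
        # Assumption: the SFT adapter files sit directly in this directory.
        return ADAPTER_REPO, "artifacts/sft_lora"
    raise ValueError(f"unknown stage: {stage!r}")


def load_scireason(stage="grpo"):
    """Load the base model, attach the LoRA adapter, and return (model, processor)."""
    # Heavy dependencies are imported lazily, only when loading is requested.
    from peft import PeftModel
    from transformers import AutoModelForImageTextToText, AutoProcessor

    repo_id, subfolder = adapter_location(stage)
    base = AutoModelForImageTextToText.from_pretrained(BASE_MODEL, torch_dtype="auto")

    # If your PEFT version does not accept `subfolder`, download the directory
    # first and pass its local path instead.
    extra = {"subfolder": subfolder} if subfolder else {}
    model = PeftModel.from_pretrained(base, repo_id, **extra)

    # Processor files are stored at the repository root alongside the GRPO adapter.
    processor = AutoProcessor.from_pretrained(repo_id)
    return model, processor
```

Calling `load_scireason()` targets the final GRPO adapter, while `load_scireason("sft")` targets the intermediate SFT adapter for comparison. Device placement is left to the caller.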
