top-papers/Qwen3-VL-8B-Instruct-scireason

This repository contains the fine-tuned SciReason VLM artifacts produced by the DataSphere SFT + GRPO pipeline.

Contents

  • Root files: final GRPO adapter and processor files copied from outputs/hf_top_papers_qwen3vl_8b_grpo_lora for convenient loading.
  • artifacts/sft_lora/: SFT LoRA adapter directory copied from outputs/hf_top_papers_qwen3vl_8b_sft_lora.
  • artifacts/grpo_lora/: complete final GRPO output directory copied from outputs/hf_top_papers_qwen3vl_8b_grpo_lora.
  • artifacts/archives/: compressed .tar.gz archives produced by the job.
  • artifacts/data/: generated train/eval JSONL files and dataset summary.
  • artifacts/reports/: budget, final summary, upload manifest and runtime reports.
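To fetch only one of the artifacts/ subdirectories listed above rather than the whole repository, something like the following may work. This is a minimal sketch that assumes the huggingface_hub package is installed; the helper names artifact_pattern and download_artifact are hypothetical and not part of this repository.

```python
import os

# Repository id as given in the title of this model card.
REPO_ID = "top-papers/Qwen3-VL-8B-Instruct-scireason"

# Subdirectories of artifacts/ listed in the Contents section.
ARTIFACT_DIRS = ("sft_lora", "grpo_lora", "archives", "data", "reports")


def artifact_pattern(name):
    """Return the glob pattern selecting one artifacts/ subdirectory (hypothetical helper)."""
    if name not in ARTIFACT_DIRS:
        raise ValueError(f"unknown artifact directory: {name!r}")
    return f"artifacts/{name}/*"


def download_artifact(name):
    """Download a single artifacts/ subdirectory and return its local path."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    local_root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[artifact_pattern(name)],
    )
    return os.path.join(local_root, "artifacts", name)
```

For example, download_artifact("sft_lora") would return a local directory holding the SFT LoRA adapter, which can then be passed to PEFT in place of a repository id.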

Training metadata

  • Base model: Qwen/Qwen3-VL-8B-Instruct
  • Dataset: top-papers/top-papers-graph-experts-data
  • Output prefix: hf_top_papers_qwen3vl_8b
  • Uploaded at UTC: 2026-06-22T07:38:23Z

Loading note

The repository root holds the final GRPO adapter, so the repository can be loaded directly as a LoRA/PEFT adapter on top of the base model listed above. The complete SFT and GRPO directories are also preserved under artifacts/ for auditability and reproducibility.
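The loading step described above might look like the following. This is a sketch under assumptions rather than a tested recipe from the authors: it assumes recent transformers and peft releases that support Qwen3-VL, and the function name load_scireason is hypothetical.

```python
# Identifiers taken from this model card.
BASE_MODEL_ID = "Qwen/Qwen3-VL-8B-Instruct"
ADAPTER_REPO_ID = "top-papers/Qwen3-VL-8B-Instruct-scireason"


def load_scireason():
    """Load the base model, attach the GRPO LoRA adapter, and return (model, processor)."""
    # Imported lazily: both packages are heavy and only needed when loading.
    from peft import PeftModel
    from transformers import AutoModelForImageTextToText, AutoProcessor

    # Load the same base model the adapter was trained on.
    base = AutoModelForImageTextToText.from_pretrained(
        BASE_MODEL_ID,
        torch_dtype="auto",
    )
    # The repository root is the final GRPO adapter directory.
    model = PeftModel.from_pretrained(base, ADAPTER_REPO_ID)
    # Processor files are also copied to the repository root.
    processor = AutoProcessor.from_pretrained(ADAPTER_REPO_ID)
    return model.eval(), processor
```

Calling model, processor = load_scireason() downloads the base weights on first use, so it needs network access and enough memory for an 8B-parameter model.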
