Expert QA Pipeline (EMNLP 2026 Industry)
Collection
SFT checkpoints from a 2x2 factorial ablation of a Korean speech-to-SFT pipeline (medical + finance). 9 LLMs, 2.4B-70B. EMNLP 2026 Industry. • 182 items • Updated
SFT checkpoint from the EMNLP 2026 Industry Track submission A Factorial Ablation of a Speech-to-SFT Pipeline: Differential Effects on Data Quality and Downstream Transfer.
| Field | Value |
|---|---|
| Pipeline condition | Exp 0 (baseline (no refinement)) |
| Domain | finance |
| Seed | n/a (single seed 42) |
| Base model | LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct |
| Training | Full FT (ZeRO-3) |
| Upstream STT | In-house STT (paper main pipeline) |
| License | CC BY-NC 4.0 (research and non-commercial use only) |
Intended use: research and non-commercial use only, matching the consent scope of the source audio.
Companion repository (code, configs, prompts, sample QA): https://github.com/flitto/speech-to-sft-ablation-paper
Base model
LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct