qwen3-8b-tablellama

Replication of TableLlama, trained from Qwen3-8B on the corresponding instruction-tuning corpus.

Released alongside the EACL 2026 Findings paper "What Really Matters for Table LLMs? A Meta-Evaluation of Model and Data Effects" (Deng et al., 2026) as an additional artefact extending the paper's experiments. The paper's main grid of 3 base models × 4 training datasets covers Mistral-v0.3, OLMo, and Phi-3-small at the 7B scale; this model adds another base-model variant trained on the same corpus.

Training

Base model: Qwen/Qwen3-8B
Training corpus: tablellama_train.json from dnaihao/Table-Instructs
Method: Full SFT via LLaMA-Factory
Learning rate: 5e-7

Full hyperparameter sweep, ablations, and per-benchmark numbers are reported in the paper.

Evaluation

This model was not part of the per-benchmark evaluation reported in the paper; it is released as an additional artefact for the community. The eval setup we used on the paper's main models is available at github.com/dnaihao/table-sft-eacl-2026, and the same scripts can be adapted for this checkpoint.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dnaihao/qwen3-8b-tablellama")
model = AutoModelForCausalLM.from_pretrained(
    "dnaihao/qwen3-8b-tablellama",
    torch_dtype="auto",
    device_map="auto",
)
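
Once the model and tokenizer are loaded, a table question can be sent through `generate`. The sketch below is illustrative only: the Markdown table serialization, the `Table:` / `Question:` / `Answer:` prompt layout, and the helper names `table_to_markdown`, `build_prompt`, and `answer_question` are all assumptions, not the format used during training. Check the training corpus and the eval scripts for the exact prompt format before relying on the outputs.

```python
from typing import Sequence


def table_to_markdown(header: Sequence[str], rows: Sequence[Sequence[str]]) -> str:
    """Serialize a table as a Markdown grid.

    ASSUMPTION: this layout is for illustration; it is not necessarily
    the serialization the model saw during fine-tuning.
    """
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


def build_prompt(header: Sequence[str], rows: Sequence[Sequence[str]], question: str) -> str:
    """Combine the serialized table and the question (hypothetical layout)."""
    table = table_to_markdown(header, rows)
    return f"Table:\n{table}\n\nQuestion: {question}\nAnswer:"


def answer_question(model, tokenizer, prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy-decode a continuation and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so that only the model's continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as above, a call would look like `answer_question(model, tokenizer, build_prompt(["City", "Country"], [["Oslo", "Norway"]], "Which country is Oslo in?"))`. Greedy decoding (`do_sample=False`) is chosen here only to make runs repeatable; it is not a documented setting for this checkpoint.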

License

This model inherits the license of its base model (Qwen/Qwen3-8B: apache-2.0).

Citation

@inproceedings{deng-etal-2026-really,
    title = "What Really Matters for Table {LLM}s? A Meta-Evaluation of Model and Data Effects",
    author = "Deng, Naihao  and Zhang, Sheng  and Zhu, Henghui  and Chang, Shuaichen  and Zhang, Jiani  and Li, Alexander Hanbo  and Hang, Chung-Wei  and Kobayashi, Hideo  and Hu, Yiqun  and Ng, Patrick",
    booktitle = "Findings of the Association for Computational Linguistics: EACL 2026",
    year = "2026",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2026.findings-eacl.195/",
    doi = "10.18653/v1/2026.findings-eacl.195"
}