Instructions to use AaronHuang160/daxplain-8b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use AaronHuang160/daxplain-8b with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct") model = PeftModel.from_pretrained(base_model, "AaronHuang160/daxplain-8b") - Notebooks
- Google Colab
- Kaggle
DAXplain-8B
A QLoRA adapter on Llama 3.1 8B Instruct that explains Power BI / Tabular DAX measures in plain English, in the voice of a senior BI consultant: what the measure computes, how it computes it, and one practical watch-out.
- Adapter (this repo): rank-16 LoRA, ~168 MB. Load on top of the base model.
- Dataset:
AaronHuang160/dax-explanations - Code / reproducibility: github.com/AaronHuang-git/LoRA-Workflow-Finetuning-A-B-Test (
modal run train.py).
What it does
Given a DAX measure such as:
Sales YTD = TOTALYTD ( [Total Sales], 'Date'[Date] )
it returns a concise, structured explanation ending in a real-world watch-out (e.g. "the year boundary defaults to December 31 — fiscal years need the optional year-end argument").
Intended use
- A practitioner aid for documenting and understanding DAX in a semantic model.
- Few-shot / batch explanation generation for BI documentation.
Not intended as a substitute for professional validation of DAX logic. Output is model-generated and can be wrong — especially on very long measures or recently released DAX functions. Always verify against the actual model behaviour.
Training data
Finetuned on 611 training rows from
AaronHuang160/dax-explanations:
~711 synthetic DAX measures across six categories (simple aggregation, time
intelligence, ratios, iterators, ranking, filter-context), each paired with a
Claude-Sonnet-4.6-distilled explanation. 178 rows (25%) were human-reviewed and 30
were hand-edited by a BI practitioner — including factual corrections (e.g. RANKX's
default SKIP ranking). 100 rows were held out for evaluation.
How it was trained
QLoRA via trl SFTTrainer on a single Modal A10G GPU (~11 min):
| Setting | Value |
|---|---|
| Base | meta-llama/Llama-3.1-8B-Instruct, 4-bit (nf4) |
| LoRA rank / alpha / dropout | 16 / 32 / 0.05 |
| Target modules | all attention + MLP projections |
| LR / schedule | 2e-4, cosine, 3% warmup |
| Epochs / effective batch | 2 / 16 |
| Final train loss | 1.09 (from 2.14) |
Evaluation
Base Llama 3.1 8B Instruct vs DAXplain-8B on 100 held-out measures, scored by Claude Sonnet 4.6 (blind to source) on a 1–5 rubric.
| Metric | Base Llama 3.1 8B | DAXplain-8B | Δ |
|---|---|---|---|
| Accuracy | 3.11 | 2.91 | −0.20 |
| Conciseness | 2.51 | 2.17 | −0.34 |
| BI-fluency | 2.29 | 3.21 | +0.92 |
| Overall | 2.64 | 2.76 | +0.13 |
The headline effect is a large gain in BI-fluency / readability (+0.92) and a net-positive overall score: the base model tends to produce verbose, markdown-bulleted tutorials that truncate mid-list, while DAXplain-8B produces tight prose ending in a single actionable watch-out. This comes with a small, honestly-reported tradeoff in raw accuracy (−0.20) and conciseness (−0.34) — the expected profile of a style finetune, which changes how the model explains, not what it knows. Scores are deliberately strict (the judge rarely awards 5s), so read them relatively, not absolutely.
Worked examples
Min Order Value = MINX ( VALUES ( Sales[OrderNumber] ), CALCULATE ( SUMX ( Sales, Sales[Quantity] * Sales[UnitPrice] ) ) )
Base 3/2/2 → Finetuned 4/3/4
- Base: "Let's break down the given DAX measure step by step. Measure Name: … Purpose: … 1. MINX: … 2. VALUES: …" (long numbered walkthrough, truncates mid-list)
- Finetuned: "This measure finds the single lowest total order value across all orders visible in the current filter context… It works by using VALUES to get each distinct OrderNumber, then iterating with MINX; inside each iteration CALCULATE shifts the row context into a filter context so SUMX can sum Quantity × UnitPrice for that one order. The key watch-out is that if multiple rows share the same OrderNumber they're summed together — so you're looking at aggregated order values, not line items…"
Top 5 Sales Reps Profit = CALCULATE ( [Total Profit], TOPN ( 5, ALL ( Sales[SalesRepName] ), [Total Profit], DESC ) )
Base 3/2/2 → Finetuned 4/3/4
- Base: "Let's break down the given DAX measure step by step. What does it do? … 1.
CALCULATE([Total Profit]): … 2.TOPN(5,...): …" (numbered, truncates) - Finetuned: "This measure returns the total profit contributed by only the top five sales reps… TOPN scans all reps after stripping the SalesRepName filter via ALL, ranks by Total Profit, and selects five; CALCULATE applies that set back. The key watch-out is that ties at the fifth position make TOPN return more than five rows, inflating the total — add a tie-break if exactly five matters."
See eval/results.md in the GitHub repo for the full table and more examples.
Limitations
- Trained on a generic star-schema vocabulary; table/column names are illustrative.
- Synthetic measures skew toward common patterns; rare or very long measures are weaker.
- A style finetune — it does not add DAX knowledge the base model lacks, and can inherit or amplify base-model factual errors.
- English only.
License
Adapter released under the Llama 3.1 Community License (inherited from the base model). Training data is CC-BY-4.0.
Reproducibility
modal run train.py # trains and pushes this adapter
modal run eval.py --action all --n 100 # reproduces the eval table above
- Downloads last month
- 10
Model tree for AaronHuang160/daxplain-8b
Base model
meta-llama/Llama-3.1-8B