massaindustries committed on
Commit 7cf07c0 · verified · 1 Parent(s): d02f685

Upload README.md with huggingface_hub

Files changed (1): README.md (+78 −3)
README.md CHANGED
@@ -1,3 +1,78 @@
- ---
- license: cc-by-nc-4.0
- ---
+ ---
+ license: apache-2.0
+ base_model: Qwen/Qwen3.5-0.8B
+ tags:
+ - peft
+ - lora
+ - complexity-classification
+ - llm-routing
+ - query-difficulty
+ - brick
+ datasets:
+ - regolo/brick-complexity-extractor
+ library_name: peft
+ pipeline_tag: text-classification
+ language:
+ - en
+ ---
+
+ # Brick Complexity Extractor
+
+ LoRA fine-tune of [Qwen/Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) for query complexity classification (easy / medium / hard).
+
+ Used in the **Brick** LLM routing system to decide which model tier should handle a query.
+
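As a loose illustration of the routing idea — the tier names, the confidence threshold, and the `route` helper below are all hypothetical, not part of the actual Brick system:

```python
# Hypothetical routing sketch: tier names, threshold, and this helper are
# illustrative only, not the Brick system's real configuration.
TIERS = {"easy": "small-tier", "medium": "mid-tier", "hard": "large-tier"}

def route(label: str, confidence: float,
          fallback: str = "large-tier", min_confidence: float = 0.5) -> str:
    """Map a predicted complexity class to a model tier, falling back
    to the largest tier when the classifier's confidence is low."""
    return TIERS[label] if confidence >= min_confidence else fallback

print(route("easy", 0.93))  # small-tier
print(route("hard", 0.42))  # large-tier (low-confidence fallback)
```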
+ ## Training
+
+ - **Base model**: Qwen3.5-0.8B
+ - **Method**: LoRA (r=16, alpha=32, dropout=0.05)
+ - **Dataset**: [regolo/brick-complexity-extractor](https://huggingface.co/datasets/regolo/brick-complexity-extractor) — 65K samples labeled by Qwen3.5-122B as an LLM judge
+ - **Epochs**: 3, **LR**: 2e-4 (cosine), **Batch**: 32
+ - **Hardware**: NVIDIA H200 141GB, bf16
+
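The cosine schedule can be sketched as a function of training progress — a minimal sketch assuming decay from the 2e-4 peak to zero with no warmup, which the card does not specify:

```python
import math

PEAK_LR = 2e-4  # peak learning rate from the training config above

def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay: peak_lr at step 0, ~0 at total_steps (no warmup shown)."""
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))

print(cosine_lr(0, 1000))    # 0.0002 (peak)
print(cosine_lr(500, 1000))  # ~0.0001 (half the peak at mid-training)
```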
+ ## Evaluation (test set, 3841 samples)
+
+ | Class | Precision | Recall | F1 |
+ |-------|-----------|--------|----|
+ | easy | 81.3% | 80.4% | 80.8% |
+ | medium | 77.6% | 80.8% | 79.2% |
+ | hard | 72.7% | 65.1% | 68.7% |
+ | **accuracy** | | | **78.1%** |
+ | **macro avg** | 77.2% | 75.4% | 76.2% |
+
+ Average confidence: 91.7%
+
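The macro averages in the table are the unweighted means of the three per-class scores, which can be checked directly:

```python
# Per-class scores from the evaluation table above (percent)
precision = {"easy": 81.3, "medium": 77.6, "hard": 72.7}
recall    = {"easy": 80.4, "medium": 80.8, "hard": 65.1}
f1        = {"easy": 80.8, "medium": 79.2, "hard": 68.7}

def macro_avg(scores: dict[str, float]) -> float:
    """Unweighted mean over classes, rounded to one decimal place."""
    return round(sum(scores.values()) / len(scores), 1)

print(macro_avg(precision), macro_avg(recall), macro_avg(f1))  # 77.2 75.4 76.2
```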
+ ## Usage
+
+ ```python
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+ import torch.nn.functional as F
+
+ # Load the base model, attach the LoRA adapter, and switch to eval mode
+ base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-0.8B", torch_dtype=torch.bfloat16, trust_remote_code=True)
+ model = PeftModel.from_pretrained(base, "regolo/brick-complexity-extractor").eval().cuda()
+ tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-0.8B", trust_remote_code=True)
+
+ # Classification via logit extraction: compare the next-token logits
+ # of the three label tokens instead of generating text
+ LABELS = ["easy", "medium", "hard"]
+ label_ids = {l: tokenizer.encode(l, add_special_tokens=False)[0] for l in LABELS}
+
+ messages = [
+     {"role": "system", "content": "<system prompt from training_metadata.json>"},
+     {"role": "user", "content": "Classify: Design a lock-free concurrent skip-list with MVCC"},
+ ]
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")
+
+ with torch.no_grad():
+     logits = model(**inputs).logits[0, -1, :]  # logits for the next token
+
+ # Softmax over the three label-token logits only
+ probs = F.softmax(torch.stack([logits[label_ids[l]] for l in LABELS]).float(), dim=0)
+ label = LABELS[probs.argmax()]
+ confidence = probs.max().item()
+ print(f"{label} ({confidence:.2%})")  # hard (94.12%)
+ ```
+
+ ## License
+
+ Apache 2.0