crypto-qwen25-coder32-r5-v9

PEFT LoRA adapter fine-tuned for crypto pump prediction: binary Yes/No detection of a +15% move within 7 days.

V9 results: PASSED gate

| Metric | v9 | v8 |
|---|---|---|
| Raw MCC | +0.2162 (CI [+0.1880, +0.2424]) | +0.0000 |
| Threshold-tuned MCC | +0.1937 | - |
| Platt-calibrated MCC | +0.1342 | - |
| AUC-ROC | 0.693 | (saturation collapsed) |
| AUC-PR | 0.277 | - |
| Accuracy | 77.6% | - |
| F1 | 0.339 | - |
| Saturation@95 | 0.00% | ~100% |
| ECE | 0.233 (raw) / 0.021 (Platt) | - |
| Brier | 0.173 | - |
| n_test | 8000 (5898 coin-holdout unseen) | - |
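
For readers unfamiliar with the metrics above, here is a minimal pure-Python sketch of three of them. The definitions are assumptions, not the author's evaluation code: MCC uses the standard confusion-matrix formula (returning 0 when any marginal is empty), Saturation@95 is taken to mean the share of predictions whose probability is within 0.05 of 0 or 1, and ECE uses 10 equal-width bins over the predicted Yes probability. The function names are hypothetical.

```python
import math

def mcc(labels, preds):
    """Matthews correlation coefficient for 0/1 labels and 0/1 predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: a degenerate confusion matrix (e.g. one class never predicted) scores 0.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def saturation_at(probs, level=0.95):
    """Assumed definition: fraction of probabilities at or beyond `level` confidence."""
    return sum(1 for p in probs if max(p, 1.0 - p) >= level) / len(probs)

def ece(labels, probs, n_bins=10):
    """Expected calibration error over equal-width bins of the Yes probability."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    total = len(probs)
    err = 0.0
    for b in bins:
        if not b:
            continue
        mean_p = sum(p for _, p in b) / len(b)
        frac_pos = sum(y for y, _ in b) / len(b)
        err += len(b) / total * abs(mean_p - frac_pos)
    return err
```

Under this convention, a model that predicts the same class for every sample gets an MCC of exactly 0, which would be consistent with the +0.0000 reported for v8 if its outputs were fully saturated.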

Improvement: raw MCC rises from +0.0000 in v8 to +0.2162 in v9.

Why v9 beats v8

  1. Calibrated CE loss (label_smoothing=0.05, pos_weight=6, conf_penalty=0.01): no probability saturation
  2. PEFT merge_and_unload() before eval: fixes F38 multi-GPU eval bug
  3. dataset_v9_v2 with coin-holdout (15% of coins never in train)
  4. Post-hoc Platt + threshold tuning recovers signal
  5. Natural 14.3% Yes balance (+15% threshold) vs v8 oversampled 35%
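
To illustrate item 1, the sketch below shows one plausible way to combine the three loss terms for a single example. It is not the author's training code: the card does not say how the terms are combined, so this assumes binary label smoothing toward 0.5, a pos_weight applied to the positive term (as in PyTorch's `BCEWithLogitsLoss`), and a confidence penalty that subtracts a multiple of the prediction entropy. The function name `calibrated_bce` is hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def calibrated_bce(logit: float, label: int,
                   label_smoothing: float = 0.05,
                   pos_weight: float = 6.0,
                   conf_penalty: float = 0.01) -> float:
    """Per-example loss: smoothed, positively weighted BCE minus an entropy bonus."""
    # Clip to keep the logarithms finite for extreme logits.
    p = min(max(sigmoid(logit), 1e-7), 1.0 - 1e-7)
    # Assumption: smooth the hard 0/1 label toward 0.5.
    target = label * (1.0 - label_smoothing) + 0.5 * label_smoothing
    # Assumption: pos_weight scales only the positive-class term.
    ce = -(pos_weight * target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    # Assumption: the confidence penalty rewards higher-entropy (less saturated) outputs.
    entropy = -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))
    return ce - conf_penalty * entropy
```

With these defaults, a correct but extreme prediction (logit 20 for a Yes label) costs more than a correct moderate one (logit 3), which is the mechanism that discourages saturated probabilities.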

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the base model in bfloat16, sharded across the available devices.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the LoRA adapter, then merge it into the base weights before evaluation.
model = PeftModel.from_pretrained(base, "majid2230/crypto-qwen25-coder32-r5-v9")
model = model.merge_and_unload()
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")

For best results, apply Platt scaling (a=0.895354882854696, b=-1.3339554070728084) and tune the decision threshold.

Recipe (locked v9)

epochs=3 lora_r=64 LR=1.5e-5 warmup=0.05 max_length=768
label_smoothing=0.05 pos_weight=6.0 conf_penalty=0.01 patience=2

Part of R5 v9 cohort: https://huggingface.co/majid2230

