Qwen3-Embedding-0.6B Unlearning Checkpoints

Fine-tuned and unlearned variants of Qwen/Qwen3-Embedding-0.6B for three-class Russian sentiment classification on women's clothing reviews. The full training code, dataset splits, metric tables, figures, and reproduction commands are in the GitHub repository:

https://github.com/pymlex/qwen3-embedding-0.6b-unlearning

Overview

We unlearn the neutral class from a three-class sentiment model over Russian product reviews. The original model is trained on negative, neutral, and positive reviews. The gold model is trained on retain data only, with a two-logit head over negative and positive. Gold stays frozen during unlearning and serves as the reference for KL divergence and prediction agreement.
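How a two-logit reference is compared with a three-logit model is not spelled out in this card. The following is a minimal sketch under two labelled assumptions: the KL term is KL(gold || model) after the model's probabilities are renormalised over negative and positive, and agreement compares gold's argmax with the model's three-class argmax, so a neutral prediction never agrees. The function names are illustrative and are not taken from the repository.

```python
import math

LABELS2 = ["negative", "positive"]             # gold head (assumed order)
LABELS3 = ["negative", "neutral", "positive"]  # three-class head


def argmax_label(probs, names):
    """Name of the most probable class."""
    return names[max(range(len(probs)), key=lambda i: probs[i])]


def polar_renormalised(probs3, eps=1e-12):
    """Drop the neutral probability and renormalise over negative/positive."""
    negative, _, positive = probs3
    total = max(negative + positive, eps)
    return [negative / total, positive / total]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def compare_to_gold(gold_probs, model_probs3):
    """Return (mean KL to gold, fraction of examples with matching argmax label)."""
    pairs = list(zip(gold_probs, model_probs3))
    kls = [kl_divergence(g, polar_renormalised(m)) for g, m in pairs]
    matches = [argmax_label(g, LABELS2) == argmax_label(m, LABELS3) for g, m in pairs]
    return sum(kls) / len(kls), sum(matches) / len(matches)


gold = [[0.9, 0.1], [0.2, 0.8]]
model = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]  # second example is predicted neutral
mean_kl, agreement = compare_to_gold(gold, model)
print(f"mean KL = {mean_kl:.4f}, agreement = {agreement:.2f}")
```

In this toy input the second example is predicted neutral, so it counts as a disagreement even though its renormalised polar probabilities lean the same way as gold.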

Gold reaches test MCC 0.893 on the two-class retain task. Original reaches test MCC 0.633 on the full three-class split. These values measure different tasks and should not be compared directly.

The neutral class is difficult in this corpus. Many neutral reviews lie near the boundary between weak negative and weak positive polarity, and automatic labelling noise concentrates on this class. On a balanced three-class test split, a predictor that classifies negative and positive at gold quality but fails on neutral yields a multiclass MCC near 0.65. The original model's validation MCC of 0.656 slightly exceeds this reference. Its saved test predictions assign neutral to 37.6% of examples, so it learned neutral only to a limited extent. Training and evaluating on the two polar classes alone removes the third decision region, and MCC rises to about 0.90.
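The 0.65 reference can be reproduced with a small calculation. This sketch makes assumptions that the text does not state: polar errors are symmetric, so a binary MCC of 0.893 corresponds to a polar accuracy of (1 + 0.893) / 2; polar reviews are never labelled neutral; and "fails on neutral" means neutral reviews receive a uniformly random label. A second scenario, in which neutral is never predicted at all, is included for comparison.

```python
import math


def multiclass_mcc(confusion):
    """Multiclass MCC from a confusion matrix (rows = true, columns = predicted)."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    true_counts = [sum(row) for row in confusion]
    pred_counts = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    numerator = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    denominator = math.sqrt(
        (total**2 - sum(p * p for p in pred_counts))
        * (total**2 - sum(t * t for t in true_counts))
    )
    return numerator / denominator if denominator else 0.0


# Accuracy implied by binary MCC 0.893 on balanced classes with symmetric errors.
polar_acc = (1 + 0.893) / 2

# Scenario A (assumed): neutral reviews get a uniformly random label.
uniform_on_neutral = [
    [polar_acc, 0.0, 1 - polar_acc],
    [1 / 3, 1 / 3, 1 / 3],
    [1 - polar_acc, 0.0, polar_acc],
]
# Scenario B (assumed): neutral is never predicted; neutral reviews split evenly.
never_neutral = [
    [polar_acc, 0.0, 1 - polar_acc],
    [0.5, 0.0, 0.5],
    [1 - polar_acc, 0.0, polar_acc],
]

mcc_uniform = multiclass_mcc(uniform_on_neutral)
mcc_never = multiclass_mcc(never_neutral)
print(f"uniform guess on neutral: {mcc_uniform:.3f}")
print(f"neutral never predicted:  {mcc_never:.3f}")
```

Under these assumptions the first scenario gives about 0.650, in line with the reference value quoted above. The second gives about 0.516, which is close to the three-class test MCC that the results table reports for the checkpoints that suppress neutral.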

We compare four unlearning objectives starting from original. Checkpoints cover gold, original, and unlearn/{method}/ for retain_ft, dpo_like, rmu, and random_target. Experiments used Google Colab Pro with an NVIDIA L4 GPU, one baseline epoch, and one unlearning epoch per method.
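Of the four objectives, random target mislabelling is the simplest to illustrate. The sketch below assumes that each forget example (a neutral review) is given a training label drawn uniformly from the two retain classes; the repository may sample targets differently, and the function name is hypothetical.

```python
import random

LABELS = ["negative", "neutral", "positive"]
FORGET = "neutral"


def random_target_labels(labels, seed=0):
    """Replace each forget-class label with a randomly chosen retain-class label."""
    rng = random.Random(seed)  # fixed seed keeps the relabelling reproducible
    retain = [name for name in LABELS if name != FORGET]
    return [rng.choice(retain) if label == FORGET else label for label in labels]


original_labels = ["negative", "neutral", "positive", "neutral", "neutral"]
relabelled = random_target_labels(original_labels)
print(relabelled)
```

Retain examples keep their labels, so fine-tuning on the relabelled data pushes only the neutral reviews towards arbitrary polar targets.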

Review token length distribution

Results

Baseline validation MCC over training

Gold converges quickly on retain validation. Original plateaus near 0.65–0.66 from epoch 0.4 onward.

Final test and unlearning metrics

The multiclass MCC columns test_mcc and model_retain_mcc lie in [-1, 1]. The column model_forget_mcc is derived from the neutral argmax rate on the forget test split and lands in [-1, 1] only when that rate is exactly 0% or 100%. The values 1.761 for original and -11.068 for rmu fall outside that range and are numerical artefacts. retain_ft, random_target, and rmu pass the retain gate (model_retain_mcc ≥ 0.804) and suppress neutral on the test split. Among the unlearning methods, retain_ft leads on gold_kl_retain and gold_agree_forget. dpo_like fails the retain gate.

Model	test_mcc	model_retain_mcc	model_forget_mcc	gold_kl_retain	gold_kl_forget	gold_agree_retain	gold_agree_forget
gold	0.893	0.893	-1.000	0.000	0.000	1.000	1.000
original	0.633	0.676	1.761	0.021	0.047	0.787	0.290
retain_ft	0.521	0.896	-1.000	0.065	0.162	0.966	0.905
random_target	0.521	0.888	-1.000	0.088	0.185	0.966	0.876
rmu	0.523	0.898	-11.068	0.090	0.198	0.962	0.880
dpo_like	0.456	0.715	-1.000	0.232	0.228	0.859	0.764
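
A standard multiclass MCC carries no information on the forget split, which may be why model_forget_mcc is defined differently. When every true label is neutral, both the numerator and the denominator of the MCC are zero whatever the model predicts. The check below is an illustration only and is not the repository's metric code.

```python
import math


def mcc_terms(y_true, y_pred, num_classes=3):
    """Return (numerator, denominator) of the multiclass MCC."""
    n = len(y_true)
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    true_counts = [y_true.count(k) for k in range(num_classes)]
    pred_counts = [y_pred.count(k) for k in range(num_classes)]
    numerator = correct * n - sum(t * p for t, p in zip(true_counts, pred_counts))
    denominator = math.sqrt(
        (n * n - sum(p * p for p in pred_counts))
        * (n * n - sum(t * t for t in true_counts))
    )
    return numerator, denominator


forget_true = [1, 1, 1, 1]   # forget split: every review is neutral (class 1)
some_neutral = [1, 0, 2, 1]  # model still predicts neutral half the time
no_neutral = [0, 2, 0, 2]    # model never predicts neutral

print(mcc_terms(forget_true, some_neutral))
print(mcc_terms(forget_true, no_neutral))
```

Both calls return a zero numerator and a zero denominator, so the ratio is undefined and cannot distinguish a model that still predicts neutral from one that does not.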

Best unlearning method: retain_ft (unlearn/retain_ft/).
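
The selection can be replayed from the table above. This sketch assumes the rule is "pass the retain gate, then take the lowest gold_kl_retain"; the exact ranking rule is not stated in this card. The gate value is close to 90% of the gold test MCC (0.9 × 0.893 ≈ 0.804), although that reading of the threshold is a guess.

```python
RETAIN_GATE = 0.804

# (model_retain_mcc, gold_kl_retain, gold_agree_forget), copied from the table.
results = {
    "retain_ft": (0.896, 0.065, 0.905),
    "random_target": (0.888, 0.088, 0.876),
    "rmu": (0.898, 0.090, 0.880),
    "dpo_like": (0.715, 0.232, 0.764),
}

# Keep only the methods whose retain MCC clears the gate.
passing = {name: row for name, row in results.items() if row[0] >= RETAIN_GATE}

# Assumed ranking: smallest KL divergence to gold on the retain split.
best = min(passing, key=lambda name: passing[name][1])
print(sorted(passing), best)
```

With these numbers, ranking the passing methods by highest gold_agree_forget instead selects the same checkpoint.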

Confusion matrices on the three-class test split

[Figures: confusion matrices for the gold reference model, the original three-class baseline, the best unlearning checkpoint (retain_ft), DPO-like unlearning, RMU unlearning, and random target unlearning]

Unlearning training curves

[Figures: retain MCC during unlearning for retain fine-tuning, DPO-like, RMU, and random target]

Checkpoints

Folder	Description
`gold/`	Two-class reference model trained on retain data only
`original/`	Three-class baseline trained on the full train split
`unlearn/retain_ft/`	Retain fine-tuning, selected as best
`unlearn/dpo_like/`	DPO-like unlearning
`unlearn/rmu/`	RMU with uniform refusal target
`unlearn/random_target/`	Random target mislabelling on forget set

Each checkpoint folder contains a fine-tuned encoder directory and classifier.pt, the weights of the MLP classification head.
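
A small standard-library helper can confirm which checkpoints are present in a downloaded copy. It assumes that classifier.pt sits directly inside each checkpoint folder; the helper name and the class-count mapping are illustrative, with gold using two classes and every other checkpoint three.

```python
from pathlib import Path

# Folder names as listed in the table above; gold has a two-logit head.
CHECKPOINTS = {
    "gold": 2,
    "original": 3,
    "unlearn/retain_ft": 3,
    "unlearn/dpo_like": 3,
    "unlearn/rmu": 3,
    "unlearn/random_target": 3,
}


def find_checkpoints(repo_dir):
    """Map each checkpoint folder that contains classifier.pt to its class count."""
    root = Path(repo_dir)
    return {
        name: num_classes
        for name, num_classes in CHECKPOINTS.items()
        if (root / name / "classifier.pt").is_file()
    }
```

Calling find_checkpoints on the directory returned by snapshot_download gives the folders that can be loaded, together with the num_classes value each one expects.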

Inference

from huggingface_hub import snapshot_download
import torch
from models.classifier import QwenEmbeddingClassifier

repo_dir = snapshot_download("pymlex/qwen3-embedding-0.6b-unlearning")
model = QwenEmbeddingClassifier.load_pretrained(
    f"{repo_dir}/unlearn/retain_ft",
    model_id="Qwen/Qwen3-Embedding-0.6B",
    num_classes=3,
    hidden_dim=512,
    max_length=128,
)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device).eval()

label_names = ["negative", "neutral", "positive"]
reviews = [
    "Платье пришло с браком, очень разочарована.",
    "Отличное качество, ношу каждый день.",
    "Нормальная вещь, ничего особенного.",
]

probs = model.predict_probs(reviews, device)
for review, prob_vector in zip(reviews, probs.cpu().numpy()):
    prediction = label_names[int(prob_vector.argmax())]
    print(review)
    print(f"  prediction: {prediction}")
    print(f"  probabilities: {dict(zip(label_names, prob_vector.round(3)))}")

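The QwenEmbeddingClassifier class is defined in the GitHub repository; clone https://github.com/pymlex/qwen3-embedding-0.6b-unlearning and run the snippet where models.classifier is importable. To load a different checkpoint, replace unlearn/retain_ft with another checkpoint folder. For gold, set num_classes=2 and shorten label_names to the two polar classes.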
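
With gold, each probability vector has two entries, so a three-name label list no longer lines up with the argmax index. One way to keep decoding correct for both head sizes is to choose the label names from the vector length. This sketch assumes gold's logits are ordered negative, positive; the helper is hypothetical and not part of the repository.

```python
LABELS_BY_HEAD = {
    2: ["negative", "positive"],             # gold (assumed logit order)
    3: ["negative", "neutral", "positive"],  # original and unlearn/* checkpoints
}


def decode(prob_vector):
    """Return the label name of the most probable class."""
    names = LABELS_BY_HEAD[len(prob_vector)]
    best = max(range(len(prob_vector)), key=lambda i: prob_vector[i])
    return names[best]


print(decode([0.1, 0.9]))       # two-class vector, e.g. from gold
print(decode([0.2, 0.7, 0.1]))  # three-class vector
```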

Citation

@software{zyukov2026qwen3unlearning,
  author  = {Zyukov, Alex},
  title   = {{Qwen3-Embedding-0.6B Unlearning}: Machine Unlearning for Russian Sentiment Classification},
  year    = {2026},
  url     = {https://github.com/pymlex/qwen3-embedding-0.6b-unlearning},
  version = {1.0},
  note    = {Hugging Face model pymlex/qwen3-embedding-0.6b-unlearning}
}

References

@article{qwen3embedding,
  title={Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models},
  author={Zhang, Yanzhao and Li, Mingxin and Long, Dingkun and Zhang, Xin and Lin, Huan and Yang, Baosong and Xie, Pengjun and Yang, An and Liu, Dayiheng and Lin, Junyang and Huang, Fei and Zhou, Jingren},
  journal={arXiv preprint arXiv:2506.05176},
  year={2025}
}

@inproceedings{Smetanin-SA-2019,
  title={Sentiment Analysis of Product Reviews in Russian using Convolutional Neural Networks},
  author={Smetanin, Sergey and Komarov, Michail},
  booktitle={2019 IEEE 21st Conference on Business Informatics (CBI)},
  volume={01},
  pages={482-486},
  year={2019},
  doi={10.1109/CBI.2019.00062}
}

The project is licensed under GPL-3.0.
