NiklasKoch committed
Commit 167c794 · verified · 1 Parent(s): a72eee9

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +159 -0
  2. adapter_config.json +51 -0
  3. adapter_model.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,159 @@
+ ---
+ base_model: Qwen/Qwen3-Embedding-0.6B
+ library_name: peft
+ tags:
+ - text-classification
+ - reddit
+ - conversation-analysis
+ - constructive-dialogue
+ - qwen
+ - lora
+ - transformers
+ language:
+ - en
+ datasets:
+ - reddit
+ pipeline_tag: text-classification
+ ---
+
+ # Qwen Reddit Constructive Conversation Classifier
+
+ A fine-tuned Qwen3 Embedding model for classifying conversations in Reddit discussions as constructive or non-constructive.
+
+ ## Model Description
+
+ This model is a QLoRA (Quantized LoRA) fine-tuned version of `Qwen/Qwen3-Embedding-0.6B`, trained to identify constructive conversations in Reddit threads using self-training on Reddit discussion data.
+
+ - **Model Type**: Text Classification (Binary)
+ - **Base Model**: Qwen/Qwen3-Embedding-0.6B
+ - **Training Method**: QLoRA with self-training (see the sketch below)
+ - **Task**: Binary classification of conversation constructiveness
+ - **Language**: English
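+
+ The card does not spell out the self-training loop, but the term conventionally means iteratively pseudo-labeling unlabeled threads with the current model and retraining on the confident predictions. A generic sketch, where `train_fn`, `predict_fn`, the confidence threshold, and the round count are all placeholders rather than the author's actual pipeline:
+
+ ```python
+ def self_train(train_fn, predict_fn, labeled, unlabeled, rounds=3, threshold=0.9):
+     """Generic self-training loop.
+
+     train_fn(labeled) -> model; predict_fn(model, text) -> (label, confidence).
+     Both are caller-supplied; rounds and threshold are assumed values.
+     """
+     model = None
+     for _ in range(rounds):
+         model = train_fn(labeled)  # supervised fine-tuning on current labels
+         pseudo = []
+         for text in unlabeled:
+             label, confidence = predict_fn(model, text)
+             if confidence >= threshold:  # keep only confident pseudo-labels
+                 pseudo.append((text, label))
+         labeled = labeled + pseudo  # grow the training set for the next round
+     return model
+ ```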
+
+ ## Intended Uses
+
+ ### Primary Use Case
+ - Classifying Reddit discussions as constructive or non-constructive
+ - Content moderation assistance
+ - Conversation quality analysis
+ - Social media research
+
+ ### Direct Use
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ from peft import PeftModel
+ import torch
+
+ # Load base model and tokenizer
+ base_model_name = "Qwen/Qwen3-Embedding-0.6B"
+ tokenizer = AutoTokenizer.from_pretrained(base_model_name)
+ model = AutoModelForSequenceClassification.from_pretrained(
+     base_model_name,
+     num_labels=2
+ )
+
+ # Load the fine-tuned adapters
+ model = PeftModel.from_pretrained(model, "NiklasKoch/qwen-discussion-classifier")
+ model.eval()
+
+ # Classify text
+ def classify_text(text):
+     inputs = tokenizer(
+         text,
+         return_tensors="pt",
+         truncation=True,
+         padding=True,
+         max_length=4096
+     )
+
+     # Move inputs to same device as model (important for GPU usage)
+     inputs = {k: v.to(next(model.parameters()).device) for k, v in inputs.items()}
+
+     with torch.no_grad():
+         outputs = model(**inputs)
+         predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
+
+     # 0 = non-constructive, 1 = constructive
+     predicted_class = torch.argmax(predictions, dim=-1).item()
+     confidence = predictions[0][predicted_class].item()
+
+     return {
+         'class': 'constructive' if predicted_class == 1 else 'non-constructive',
+         'confidence': confidence,
+         'scores': {
+             'non-constructive': predictions[0][0].item(),
+             'constructive': predictions[0][1].item()
+         }
+     }
+
+ # Example usage
+ text = "[author0] LEGO: What do you think you're doing?!? [author1] I don't get it did he reveal bionicle reboot or smthn? [author2] Not really, he did announce something but was super vague, seems like a sort of passion project we wants to do with the community, he even said it might not even be bionicle. [author1] So is that image fan made or is it one of his passion projects [author2] Those pictures are real and on his insta, he did a stream talking about it I\u2019m sure you can find somewhere, search up Fabre bionicle stream 2020 or something. [author1] OK thanks"
+ result = classify_text(text)
+ print(result)
+ ```
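+
+ For scoring many threads at once, batched inference is usually faster than calling `classify_text` per thread. A minimal sketch reusing the `tokenizer` and `model` loaded above; the batch size is an arbitrary assumption:
+
+ ```python
+ def classify_batch(texts, batch_size=16):
+     """Classify a list of threads in batches; returns one label per input."""
+     device = next(model.parameters()).device
+     labels = []
+     for i in range(0, len(texts), batch_size):
+         batch = texts[i:i + batch_size]
+         inputs = tokenizer(batch, return_tensors="pt", truncation=True,
+                            padding=True, max_length=4096).to(device)
+         with torch.no_grad():
+             logits = model(**inputs).logits
+         preds = logits.argmax(dim=-1).tolist()
+         labels.extend('constructive' if p == 1 else 'non-constructive'
+                       for p in preds)
+     return labels
+ ```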
+
+ ## Training Details
+
+ ### Training Data
+ - **Source**: https://archive.org/download/pushshift_reddit_200506_to_202212/
+ - **Size**: ~1.4 million Reddit threads, filtered for English and a minimum of two authors per thread (see the filtering sketch below)
+ - **Labels**: Binary (constructive/non-constructive conversations)
+ - **Additional Data**: YNACC and IAC datasets for initial supervised training
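+
+ The card does not name the filtering tooling; a rough sketch of what the language and author filters could look like, assuming a `langdetect` dependency and threads represented as lists of comment dicts:
+
+ ```python
+ from langdetect import detect, LangDetectException
+
+ def keep_thread(comments):
+     """Keep threads that are English and have at least two distinct authors.
+
+     `comments` is an assumed structure: a list of dicts with
+     'author' and 'body' keys.
+     """
+     if len({c["author"] for c in comments}) < 2:
+         return False
+     text = " ".join(c["body"] for c in comments)
+     try:
+         return detect(text) == "en"
+     except LangDetectException:  # raised on empty or undetectable text
+         return False
+ ```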
+
+ ### Training Procedure
+ - **Training Method**: Self-training
+ - **Quantization**: 4-bit QLoRA
+ - **LoRA Config**:
+   - `r`: 16
+   - `lora_alpha`: 32
+   - `lora_dropout`: 0.1
+   - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
+ - **Loss Function**: Focal loss with class weighting (see the sketch after this list)
+ - **Max Sequence Length**: 4096 tokens
+ - **Batch Size**: 64
+ - **Learning Rate**: 2e-6
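+
+ A minimal sketch of how this configuration could be reproduced with `transformers` and `peft`. The `r`, `lora_alpha`, `lora_dropout`, and target modules match the adapter_config.json below; the 4-bit quantization details (`nf4`, bfloat16 compute) and the focal-loss `gamma` are assumptions this card does not state:
+
+ ```python
+ import torch
+ import torch.nn.functional as F
+ from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
+ from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
+
+ # 4-bit base model for QLoRA (quant type and compute dtype are assumptions)
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+ model = AutoModelForSequenceClassification.from_pretrained(
+     "Qwen/Qwen3-Embedding-0.6B",
+     num_labels=2,
+     quantization_config=bnb_config,
+ )
+ model = prepare_model_for_kbit_training(model)
+
+ # LoRA hyperparameters as documented above and in adapter_config.json
+ lora_config = LoraConfig(
+     task_type="SEQ_CLS",
+     r=16,
+     lora_alpha=32,
+     lora_dropout=0.1,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                     "gate_proj", "up_proj", "down_proj"],
+ )
+ model = get_peft_model(model, lora_config)
+
+ def focal_loss(logits, labels, class_weights=None, gamma=2.0):
+     """Focal loss with class weighting; gamma=2.0 is an assumed default."""
+     ce = F.cross_entropy(logits, labels, weight=class_weights, reduction="none")
+     pt = torch.exp(-ce)  # model's probability for the true class
+     return ((1.0 - pt) ** gamma * ce).mean()
+ ```
+
+ With `task_type="SEQ_CLS"`, PEFT keeps the classification head trainable and saves it with the adapters, which is why `score`/`classifier` entries appear under `modules_to_save` in the adapter config below.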
+
+ ### Training Hardware
+ - 48 hours on 4x NVIDIA A100 40 GB GPUs
+
+ ## Performance
+
+ ### Evaluation Results
+
+ | Dataset | Accuracy | F1-Score |
+ |---------|----------|----------|
+ | YNACC   | 0.70     | 0.69     |
+ | IAC     | 0.78     | 0.86     |
+ | Reddit  | 0.64     | 0.74     |
+
+ ## Limitations and Bias
+
+ - **Language**: English only
+ - **Bias**: May reflect biases present in Reddit discussions and training data
+
+ ## Ethical Considerations
+
+ - Human oversight is recommended for important moderation decisions
+
+ ## Technical Specifications
+
+ - **Model Architecture**: Qwen3 Embedding + classification head
+ - **Parameters**: ~600M base + LoRA adapters + classification head (see the counting sketch below)
+ - **Precision**: 4-bit quantized base model with full-precision adapters
+ - **Framework**: PyTorch, Transformers, PEFT (recent versions; loading may emit harmless warnings about unrecognized adapter configuration parameters)
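+
+ A quick way to see how the parameter count splits between the frozen base and the LoRA adapters, reusing the `model` from the Direct Use snippet (the `"lora_"` name filter relies on PEFT's parameter naming):
+
+ ```python
+ # Count total vs. LoRA adapter parameters on the loaded PeftModel
+ total = sum(p.numel() for p in model.parameters())
+ adapter = sum(p.numel() for n, p in model.named_parameters() if "lora_" in n)
+ print(f"total parameters: {total / 1e6:.1f}M")
+ print(f"LoRA adapter parameters: {adapter / 1e6:.1f}M")
+ ```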
+
+ ## Model Card Authors
+
+ Niklas Koch, Georg August University of Göttingen
+
+ ## Model Card Contact
+
+ niklas.koch01@stud.uni-goettingen.de
adapter_config.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "alpha_pattern": {},
+   "auto_mapping": null,
+   "base_model_name_or_path": "Qwen/Qwen3-Embedding-0.6B",
+   "bias": "none",
+   "corda_config": null,
+   "eva_config": null,
+   "exclude_modules": null,
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 32,
+   "lora_bias": false,
+   "lora_dropout": 0.1,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": [
+     "score",
+     "classifier",
+     "score",
+     "classifier",
+     "score",
+     "classifier",
+     "score",
+     "classifier",
+     "score"
+   ],
+   "peft_type": "LORA",
+   "qalora_group_size": 16,
+   "r": 16,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "v_proj",
+     "down_proj",
+     "q_proj",
+     "o_proj",
+     "up_proj",
+     "gate_proj",
+     "k_proj"
+   ],
+   "task_type": "SEQ_CLS",
+   "trainable_token_indices": null,
+   "use_dora": false,
+   "use_qalora": false,
+   "use_rslora": false
+ }
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f279c68eeb3827c617cfd4d4c8b104a1612c68421a2b69fff34c95026ff7ec8a
+ size 40430464