Upload folder using huggingface_hub

Files changed (3)

README.md +159 -0
adapter_config.json +51 -0
adapter_model.safetensors +3 -0

README.md ADDED

	@@ -0,0 +1,159 @@

+---
+base_model: Qwen/Qwen3-Embedding-0.6B
+library_name: peft
+tags:
+- text-classification
+- reddit
+- conversation-analysis
+- constructive-dialogue
+- qwen
+- lora
+- transformers
+language:
+- en
+datasets:
+- reddit
+pipeline_tag: text-classification
+---
+# Qwen Reddit Constructive Conversation Classifier
+A fine-tuned Qwen 3 Embedding model that classifies Reddit conversations as constructive or non-constructive.
+## Model Description
+This model is a QLoRA (quantized LoRA) fine-tune of `Qwen/Qwen3-Embedding-0.6B`, trained to identify constructive conversations in Reddit threads. It was first trained with supervision on the YNACC and IAC datasets and then self-trained on Reddit discussion data.
+- **Model Type**: Text Classification (Binary)
+- **Base Model**: Qwen/Qwen3-Embedding-0.6B
+- **Training Method**: QLoRA with self-training
+- **Task**: Binary classification of conversation constructiveness
+- **Language**: English
+## Intended Uses
+### Primary Use Cases
+- Classifying Reddit discussions as constructive or non-constructive
+- Content moderation assistance
+- Conversation quality analysis
+- Social media research
+### Direct Use
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+from peft import PeftModel
+import torch
+# Load base model and tokenizer
+base_model_name = "Qwen/Qwen3-Embedding-0.6B"
+tokenizer = AutoTokenizer.from_pretrained(base_model_name)
+model = AutoModelForSequenceClassification.from_pretrained(
+    base_model_name,
+    num_labels=2
+)
+# Load the fine-tuned adapters
+model = PeftModel.from_pretrained(model, "NiklasKoch/qwen-discussion-classifier")
+model.eval()
+# Classify text
+def classify_text(text):
+    inputs = tokenizer(
+        text,
+        return_tensors="pt",
+        truncation=True,
+        padding=True,
+        max_length=4096
+    )
+    # Move inputs to same device as model (important for GPU usage)
+    inputs = {k: v.to(next(model.parameters()).device) for k, v in inputs.items()}
+    with torch.no_grad():
+        outputs = model(**inputs)
+        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
+    # 0 = non-constructive, 1 = constructive
+    predicted_class = torch.argmax(predictions, dim=-1).item()
+    confidence = predictions[0][predicted_class].item()
+    return {
+        'class': 'constructive' if predicted_class == 1 else 'non-constructive',
+        'confidence': confidence,
+        'scores': {
+            'non-constructive': predictions[0][0].item(),
+            'constructive': predictions[0][1].item()
+        }
+    }
+# Example usage
+text = "[author0] LEGO: What do you think you're doing?!? [author1] I don't get it did he reveal bionicle reboot or smthn? [author2] Not really, he did announce something but was super vague, seems like a sort of passion project we wants to do with the community, he even said it might not even be bionicle. [author1] So is that image fan made or is it one of his passion projects [author2] Those pictures are real and on his insta, he did a stream talking about it I\u2019m sure you can find somewhere, search up Fabre bionicle stream 2020 or something. [author1] OK thanks"
+result = classify_text(text)
+print(result)
+```
+## Training Details
+### Training Data
+- **Source**: https://archive.org/download/pushshift_reddit_200506_to_202212/
+- **Size**: ~1.4 million Reddit threads, filtered to English-language threads with at least 2 authors each.
+- **Labels**: Binary (constructive/non-constructive conversations)
+- **Additional Data**: YNACC and IAC datasets for initial supervised training
+### Training Procedure
+- **Training Method**: Self-Training
+- **Quantization**: 4-bit QLoRA
+- **LoRA Config**:
+  - `r`: 16
+  - `lora_alpha`: 32
+  - `lora_dropout`: 0.1
+  - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
+- **Loss Function**: Focal Loss with class weighting
+- **Max Sequence Length**: 4096 tokens
+- **Batch Size**: 64
+- **Learning Rate**: 2e-6
+### Training Hardware
+- 48 hours on 4x NVIDIA A100 40GB GPUs
+## Performance
+### Evaluation Results
+```
+YNACC:
+Accuracy: 0.70
+F1-Score: 0.69
+IAC:
+Accuracy: 0.78
+F1-Score: 0.86
+Reddit:
+Accuracy: 0.64
+F1-Score: 0.74
+```
+## Limitations and Bias
+- **Language**: English only
+- **Bias**: May reflect biases present in Reddit discussions and training data
+## Ethical Considerations
+- Human oversight is recommended for important moderation decisions
+## Technical Specifications
+- **Model Architecture**: Qwen 3 Embedding + Classification Head
+- **Parameters**: ~600M base + LoRA adapters + classification head
+- **Precision**: 4-bit quantized base model with full-precision adapters
+- **Framework**: PyTorch, Transformers, PEFT (any recent version; you may see harmless warnings about configuration parameters)
+## Model Card Authors
+Niklas Koch, Georg August University of Göttingen
+## Model Card Contact
+niklas.koch01@stud.uni-goettingen.de
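
The README lists "Focal Loss with class weighting" as the loss function but does not give its parameters. As a rough illustration only, the sketch below computes a class-weighted focal loss for a single example in plain Python, without any framework. The function name `focal_loss`, the focusing parameter `gamma=2.0`, and the example class weights are assumptions made for illustration, not values taken from this commit.

```python
import math


def focal_loss(probs, target, class_weights, gamma=2.0):
    """Class-weighted focal loss for a single example.

    probs         -- predicted class probabilities (expected to sum to 1)
    target        -- index of the true class
    class_weights -- per-class weights (hypothetical values, not from the model card)
    gamma         -- focusing parameter; gamma=0 reduces to weighted cross-entropy
    """
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical weights: index 0 = non-constructive, index 1 = constructive
weights = [1.0, 2.0]
easy = focal_loss([0.1, 0.9], target=1, class_weights=weights)  # confident and correct
hard = focal_loss([0.7, 0.3], target=1, class_weights=weights)  # mostly wrong
print(f"easy={easy:.4f} hard={hard:.4f}")
```

Examples the model already classifies confidently contribute very little to the loss, so training effort shifts toward the harder threads, while the class weights compensate for label imbalance.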

adapter_config.json ADDED

	@@ -0,0 +1,51 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "Qwen/Qwen3-Embedding-0.6B",
+  "bias": "none",
+  "corda_config": null,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 32,
+  "lora_bias": false,
+  "lora_dropout": 0.1,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": [
+    "score",
+    "classifier",
+    "score",
+    "classifier",
+    "score",
+    "classifier",
+    "score",
+    "classifier",
+    "score"
+  ],
+  "peft_type": "LORA",
+  "qalora_group_size": 16,
+  "r": 16,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "v_proj",
+    "down_proj",
+    "q_proj",
+    "o_proj",
+    "up_proj",
+    "gate_proj",
+    "k_proj"
+  ],
+  "task_type": "SEQ_CLS",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
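
The `modules_to_save` list above repeats `score` and `classifier` several times. Assuming the repetition is an artifact of how the config was saved rather than something intentional, a small sketch like the following could normalize the list before re-uploading. The helper name `dedupe_modules_to_save` is hypothetical, and the code works on the parsed JSON only, without calling PEFT.

```python
import json


def dedupe_modules_to_save(config):
    """Return a copy of an adapter config dict with duplicate
    `modules_to_save` entries removed, keeping first-seen order."""
    cleaned = dict(config)
    modules = config.get("modules_to_save")
    if modules is not None:
        # dict.fromkeys preserves insertion order and drops repeats
        cleaned["modules_to_save"] = list(dict.fromkeys(modules))
    return cleaned


# Abbreviated stand-in for adapter_config.json (only a few keys shown)
raw = json.loads("""
{
  "peft_type": "LORA",
  "modules_to_save": ["score", "classifier", "score", "classifier", "score"],
  "r": 16
}
""")
cleaned = dedupe_modules_to_save(raw)
print(json.dumps(cleaned["modules_to_save"]))
```

All other keys are copied through unchanged, so the result can be written back with `json.dump` if the cleanup is wanted.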

adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f279c68eeb3827c617cfd4d4c8b104a1612c68421a2b69fff34c95026ff7ec8a
+size 40430464
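
The three lines above are a Git LFS pointer, not the adapter weights themselves. As a sketch, a downloaded `adapter_model.safetensors` could be checked against the pointer's `oid` and `size` as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative, and the code assumes the file contents fit in memory.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Check that raw bytes have the size and hash recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:f279c68eeb3827c617cfd4d4c8b104a1612c68421a2b69fff34c95026ff7ec8a\n"
    "size 40430464\n"
)
print(pointer["algo"], pointer["size"])
```

To verify a local copy, read the file in binary mode and pass the bytes to `matches_pointer` together with the parsed pointer.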