---
license: mit
datasets:
- argilla/distilabel-intel-orca-dpo-pairs
- jondurbin/truthy-dpo-v0.1
- argilla/distilabel-math-preference-dpo
- argilla/distilabel-capybara-dpo-7k-binarized
language:
- en
library_name: adapter-transformers
base_model: Technoculture/MT7Bi-sft
---
# Technoculture/MedMerge-6-7b-alpha-dpo
## Open LLM Leaderboard
![image/png](https://cdn-uploads.huggingface.co/production/uploads/63486df1f8f01fcc4b23e97d/ZhdVcETriQf5WFiDhXb5q.png)
| Model Name | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| ----------------------- | -------- | --------- | ------ | ---------- | ---------- | -------- |
| Orca-2-7b | **78.4** | 76.1 | 53.7 | **52.4** | **74.2** | **47.2** |
| LLAMA-2-7b | 43.2 | **77.1** | 44.4 | 38.7 | 69.5 | 16 |
| MT7Bi-sft | 54.1 | 75.11 | - | 43.08 | 72.14 | 15.54 |
| MedMerge-6-7b | 29.52 | 41.04 | - | 37.53 | 59.35 | 0.91 |
| MedMerge-6-7b-alpha-dpo | 54.27 | 75.6 | 52.65 | 43.94 | 71.03 | 26.16 |
## Training Details
- **GPU:** Nvidia A100 Tensor Core GPU
- **Total Batches:** 4266
- **Epochs:** 3
- **Duration:** 3 hours, 57 minutes
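For reference, a DPO run with these settings typically takes the shape below. This is an illustrative sketch using TRL's `DPOTrainer`, not the authors' actual script (which is in the linked notebook); every hyperparameter except the epoch count is an assumption, and the API shown is recent TRL (`DPOConfig`, `processing_class`).

```python
# Illustrative sketch only -- not the training code used for this model.
# Epoch count matches the card; all other values are assumptions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "Technoculture/MT7Bi-sft"  # base model from the card metadata
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# In practice the four datasets listed in the mixture table are subsampled
# and concatenated; one dataset is shown here for brevity.
train_dataset = load_dataset("jondurbin/truthy-dpo-v0.1", split="train")

args = DPOConfig(
    output_dir="medmerge-6-7b-alpha-dpo",
    num_train_epochs=3,             # from the card
    per_device_train_batch_size=4,  # assumption; not stated in the card
    beta=0.1,                       # DPO KL penalty; TRL default, assumption
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```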
## DPO Training Dataset Mixture
| Dataset Name | Original Size (Rows) | Ratio | Size After Ratio (Rows) |
|----------------------------------------------------|---------------|-------|------------------|
| argilla/distilabel-math-preference-dpo | 2.4k | 1.0 | 2.4k |
| argilla/distilabel-intel-orca-dpo-pairs | 12.9k | 0.5 | 6.45k |
| jondurbin/truthy-dpo-v0.1 | 1.04k | 1.0 | 1.04k |
| argilla/distilabel-capybara-dpo-7k-binarized | 7.5k | 0.2 | 1.5k |
**Total size:** 11.38k rows
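The size arithmetic behind the mixture table can be checked in a few lines of Python. The row counts below are the rounded figures from the table, so the computed total comes out at ~11.39k rather than the 11.38k the card reports from exact dataset sizes:

```python
# Reproduce the per-dataset sizes after applying each sampling ratio.
# Row counts are the rounded figures from the table above.
mixture = {
    "argilla/distilabel-math-preference-dpo":       (2400, 1.0),
    "argilla/distilabel-intel-orca-dpo-pairs":      (12900, 0.5),
    "jondurbin/truthy-dpo-v0.1":                    (1040, 1.0),
    "argilla/distilabel-capybara-dpo-7k-binarized": (7500, 0.2),
}
sizes = {name: int(rows * ratio) for name, (rows, ratio) in mixture.items()}
total = sum(sizes.values())
print(sizes)
print(f"total: {total} rows")  # 11390
```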
## Training Loss Plot
![image/png](https://cdn-uploads.huggingface.co/production/uploads/658bed1c8ff537204fbd92a3/wEkGQGRVK000d0q6FkXE9.png)
## Training Loss Smoothed Plot
![image/png](https://cdn-uploads.huggingface.co/production/uploads/658bed1c8ff537204fbd92a3/CDk_JCsteIwGAG_DyHRDE.png)
### For full details of this DPO training, please see our notebook.
<a target="_blank" href="https://colab.research.google.com/github/dkshjn/Technoculture/blob/main/MedMerge_6_7b_alpha_dpo.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>