charlesdedampierre committed on
Commit
3202359
1 Parent(s): 6eaddc4

Update README.md

Files changed (1)
  1. README.md +4 -3
README.md CHANGED
@@ -4,7 +4,10 @@ license: apache-2.0

 ## Model description

- TopicNeuralHermes 2.5 Mistral 7B is a Mistral-based fine-tuned model, continuing from OpenHermes 2.5.
+
+ TopicNeuralHermes 2.5 Mistral 7B is a refined model developed through fine-tuning on a specific subset of data, selected via topic modeling techniques using [Bunkatopics](https://github.com/charlesdedampierre/BunkaTopics).
+
+ It continues from OpenHermes 2.5.

 The model was trained on a refined DPO dataset. The objective was to train the model on a small portion of the DPO data. To achieve this, we compared two datasets used to train the reward model: the rejected Llama answers and the accepted ChatGPT answers from the [DPO dataset](mlabonne/chatml_dpo_pairs).
 We then conducted topic modeling on both datasets, keeping only the topics that existed in the accepted dataset but not in the rejected one.
@@ -14,8 +17,6 @@ This method allows for quicker convergence with significantly less data (around

 Special thanks to [mlabonne](https://huggingface.co/mlabonne) for creating the [colab notebook](https://colab.research.google.com/drive/15iFBr1xWgztXvhrj5I9fBv20c7CFOPBE?usp=sharing#scrollTo=YpdkZsMNylvp) that facilitated the DPO Strategy.

- We used [Bunkatopics](https://github.com/charlesdedampierre/BunkaTopics) to implement the topic modeling methods.
-

 ## Topic Analysis
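For readers of this commit, the filtering step the updated README describes can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: TF-IDF plus KMeans stands in for Bunkatopics, the `chosen`/`rejected` column names on `mlabonne/chatml_dpo_pairs` are assumptions, and matching topics by their top-term labels is a crude proxy for the comparison the author performed.

```python
from datasets import load_dataset
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

def topic_labels(docs, n_topics=20, top_k=5):
    """Cluster documents and label each cluster by its top TF-IDF terms."""
    vec = TfidfVectorizer(max_features=5000, stop_words="english")
    X = vec.fit_transform(docs)
    km = KMeans(n_clusters=n_topics, n_init=10, random_state=0).fit(X)
    terms = np.array(vec.get_feature_names_out())
    names = {
        t: " | ".join(terms[km.cluster_centers_[t].argsort()[::-1][:top_k]])
        for t in range(n_topics)
    }
    return km.labels_, names

dpo = load_dataset("mlabonne/chatml_dpo_pairs", split="train")
accepted, rejected = dpo["chosen"], dpo["rejected"]  # assumed column names

acc_assign, acc_names = topic_labels(accepted)
_, rej_names = topic_labels(rejected)

# Keep the topics that appear in the accepted answers but not in the rejected ones.
kept = {t for t, name in acc_names.items() if name not in set(rej_names.values())}

# Retain only the pairs whose accepted answer falls under a kept topic.
keep_idx = [i for i, t in enumerate(acc_assign) if int(t) in kept]
filtered = dpo.select(keep_idx)
print(f"Kept {len(filtered)} of {len(dpo)} DPO pairs")
```

In practice one would use Bunkatopics' own topic names and document-topic assignments (see the BunkaTopics README for its API) rather than this label-string comparison, and then run DPO training on the filtered subset.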