charlesdedampierre committed on
Commit b05e402
1 Parent(s): 79cb7a6

Update README.md

Files changed (1)
  1. README.md +107 -6
README.md CHANGED
@@ -4,12 +4,113 @@ license: apache-2.0
 
 ## Model description
 
- TopicNeuralHermes 2.5 Mistral 7B is a Mistral-based fine-tuned model, as a continuuaion of OpenHermes 2.5.
 
- The model was trained on a refined DPO dataset. We compared the rejected and accepted in hte DPO datastes adn tried to find the reasons behind acceptance or rejection.
- We used Topic Modeling methods (hence TopicNeuralHermes) on both datasets and only kept the topics that existed in the ChatGPT responses and not in the LLama repsonses. Our hypothesis
- is that those topics encapsulate the main differences between the two ways of answering. This method can help converge quicker and with way less data (around 1/6 of the initial dataset)
 
- Bug thanks to https://huggingface.co/mlabonne for the notebbok he created that helped carry out the DPO Strategy.
 
- We use [Bunkatopics](https://github.com/charlesdedampierre/BunkaTopics) to carry out the Topic Modeling methods.
+ TopicNeuralHermes 2.5 Mistral 7B is a Mistral-based fine-tuned model, continuing from OpenHermes 2.5.
 
+ The model was trained on a refined DPO dataset. The objective was to train the model on a small portion of the DPO data. To achieve this, we compared the two sets of answers used to train the reward model: the rejected Llama answers and the accepted ChatGPT answers from the [DPO dataset](https://huggingface.co/datasets/mlabonne/chatml_dpo_pairs).
+ We then conducted topic modeling on both datasets, keeping only the topics that existed in the accepted dataset but not in the rejected one.
+ Our hypothesis is that these topics encapsulate the main differences between the two answering styles.
 
+ This method allows for quicker convergence with significantly less data (around 1/6 of the initial dataset).
 
+ Special thanks to [mlabonne](https://huggingface.co/mlabonne) for creating the [colab notebook](https://colab.research.google.com/drive/15iFBr1xWgztXvhrj5I9fBv20c7CFOPBE?usp=sharing#scrollTo=YpdkZsMNylvp) that facilitated the DPO strategy.
+
+ We used [Bunkatopics](https://github.com/charlesdedampierre/BunkaTopics) to implement the topic modeling methods.
+
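+ As an illustration, topic extraction with Bunkatopics looks roughly like the following (a minimal sketch based on the project's quick-start; class and method names are taken from its README and may differ across versions):
+
+ ```python
+ from bunkatopics import Bunka
+
+ # In practice, docs would be the accepted (ChatGPT) or rejected (Llama)
+ # answers from the DPO dataset, as a list of strings.
+ docs = ["The capital of France is Paris...", "Birds build nests to..."]
+
+ bunka = Bunka()
+ bunka.fit(docs)                           # embed and cluster the documents
+ topics = bunka.get_topics(n_clusters=30)  # 30 topics per dataset, as used below
+ ```
+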
+
+ ## Topic Analysis
+
+ We applied the topic modeling method to both datasets, extracting 30 topics from each.
+ These topics were characterized by their 10 most specific unigrams or bigrams.
+ We then compared the two sets of topics (30 from each dataset) and retained those topics from the accepted dataset that shared fewer than 2 terms with any topic in the rejected dataset.
+
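+ A minimal sketch of this retention rule (hypothetical helper; each topic is represented as the set of its 10 most specific terms):
+
+ ```python
+ def retained_topics(accepted_topics, rejected_topics):
+     """Keep an accepted-side topic only if it shares fewer than
+     2 terms with every rejected-side topic."""
+     return [
+         topic for topic in accepted_topics
+         if all(len(topic & rejected) < 2 for rejected in rejected_topics)
+     ]
+
+ # Toy example: the first topic shares no terms with the rejected topic
+ # and is kept; the second shares 2 terms and is dropped.
+ accepted = [{"feelings", "teaching", "schools"}, {"question", "answer", "geography"}]
+ rejected = [{"question", "answer", "capital city"}]
+ print(retained_topics(accepted, rejected))
+ ```
+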
+ We found the following 13 distinctive topics, each described by 10 terms:
+
+ **Emotional Dynamics**: feelings, Quinn, Austin, minority women, teaching, schools, individual, personality, backgrounds, triggers.
+
+ **Global Knowledge Queries**: question, information, geography, news articles, Step, answer, capital city, pipeline system, country, analogy.
+
+ **Digital Interactions and Queries**: questions, question, PersonX, modem, answers, effect relationship, Quora, browser, answer, e-commerce.
+
+ **Business and Cybersecurity**: email, businesses, initiatives, innovation, advertising papers, spam, breaches, antivirus, payments, prospects.
+
+ **Lifestyle and Wellness**: sleep, exercise, gifts, shopping, Casey, stores, stress, headaches, options, mood.
+
+ **Wildlife Ecology**: birds, prey, animals, species, infection, nest, eggs, bacteria, insects, kitty condo.
+
+ **Environmental Science and Climate**: temperature, gases, greenhouse, emissions, perturbation, sulfur, dioxide, climate change, water, heat.
+
+ **Maritime and Mechanical Engineering**: ship, bowling, propulsion, beam width, Filing cabinet, LED, lane, containment area, lawnmower, rotors.
+
+ **Cultural and Social Dynamics**: Lindsey, museum, Kate, Rachel, Jason, Alex, Erin, conversation, Laura, exhibits.
+
+ **Political Media Analysis**: media platforms, election, politics, teenagers, elections, White House, Barack Obama, nation, Confederate, depression.
+
+ **International Relations and Policy**: cooperation, EU, nations, alliance, NATO, European Union, member states, policy, monarch, Brexit.
+
+ **Astrophysics and Physical Sciences**: electrons, km, Moon, acceleration, orbit, friction, current, asteroid, electron, collector emitter.
+
+ **Film Critique and Analysis**: movie review, film, reviewer, sentiment, critic, flaws, DVD, plot, opinion, originality.
+
+ While these topics are not domain-specific, they do not appear in the rejected dataset. Further research is needed to understand why these topics are prominent in the accepted dataset.
+
+ ## Usage
+
+ You can run this model using LM Studio or any other frontend.
+
+ You can also run this model using the following code:
+
+ ```python
+ import transformers
+ from transformers import AutoTokenizer
+
+ # This model's Hub repo id (assumed; adjust if the repository name differs)
+ new_model = "charlesdedampierre/TopicNeuralHermes-2.5-Mistral-7B"
+
+ # Format prompt
+ message = [
+     {"role": "system", "content": "You are a helpful assistant chatbot."},
+     {"role": "user", "content": "What is a Large Language Model?"}
+ ]
+ tokenizer = AutoTokenizer.from_pretrained(new_model)
+ prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)
+
+ # Create pipeline
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=new_model,
+     tokenizer=tokenizer
+ )
+
+ # Generate text
+ sequences = pipeline(
+     prompt,
+     do_sample=True,
+     temperature=0.7,
+     top_p=0.9,
+     num_return_sequences=1,
+     max_length=200,
+ )
+ print(sequences[0]['generated_text'])
+ ```
+
+ ## Training hyperparameters
+
+ The following settings were used for the DPO fine-tuning (a sketch of how they fit together follows the lists):
+
+ **LoRA**:
+ - r=16
+ - lora_alpha=16
+ - lora_dropout=0.05
+ - bias="none"
+ - task_type="CAUSAL_LM"
+ - target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
+
+ **Training arguments**:
+ - per_device_train_batch_size=4
+ - gradient_accumulation_steps=4
+ - gradient_checkpointing=True
+ - learning_rate=5e-5
+ - lr_scheduler_type="cosine"
+ - max_steps=200
+ - optim="paged_adamw_32bit"
+ - warmup_steps=100
+
+ **DPOTrainer**:
+ - beta=0.1
+ - max_prompt_length=1024
+ - max_length=1536
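+
+ A minimal sketch of how these settings wire together (assumptions: the base model is teknium/OpenHermes-2.5-Mistral-7B, trl's DPOTrainer API as of late 2023, and the training set is the topic-filtered subset of the DPO pairs):
+
+ ```python
+ from datasets import load_dataset
+ from peft import LoraConfig
+ from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
+ from trl import DPOTrainer
+
+ base = "teknium/OpenHermes-2.5-Mistral-7B"
+ model = AutoModelForCausalLM.from_pretrained(base)
+ tokenizer = AutoTokenizer.from_pretrained(base)
+
+ # Full DPO pairs; the actual training used the topic-filtered subset (~1/6 of this).
+ dpo_dataset = load_dataset("mlabonne/chatml_dpo_pairs")["train"]
+
+ peft_config = LoraConfig(
+     r=16,
+     lora_alpha=16,
+     lora_dropout=0.05,
+     bias="none",
+     task_type="CAUSAL_LM",
+     target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj'],
+ )
+
+ training_args = TrainingArguments(
+     per_device_train_batch_size=4,
+     gradient_accumulation_steps=4,
+     gradient_checkpointing=True,
+     learning_rate=5e-5,
+     lr_scheduler_type="cosine",
+     max_steps=200,
+     optim="paged_adamw_32bit",
+     warmup_steps=100,
+     output_dir="./results",  # assumed output path
+ )
+
+ trainer = DPOTrainer(
+     model,
+     ref_model=None,             # with a peft_config, trl derives the frozen reference model
+     args=training_args,
+     train_dataset=dpo_dataset,  # prompt/chosen/rejected pairs
+     tokenizer=tokenizer,
+     peft_config=peft_config,
+     beta=0.1,
+     max_prompt_length=1024,
+     max_length=1536,
+ )
+ trainer.train()
+ ```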