Crystalcareai committed on
Commit 0ec1735 · verified · 1 Parent(s): 706595d

Update README.md

  1. README.md +2 -5
README.md CHANGED
@@ -12,10 +12,7 @@ license_link: https://ai.google.dev/gemma/terms
 
 
 
-Note: If you wish to use GemMoE while it is in beta, you must install transformers from my branch using:
-
-```bash
-pip install git+https://github.com/Crystalcareai/transformers.git@GemMoE#egg=transformers
+Note: If you wish to use GemMoE while it is in beta, you must flag trust_remote_code=True in your training config.
 ```
 
 
@@ -40,7 +37,7 @@ Adapting Mergekit to support Gemma was no small feat, and I had to make signific
 
 I want to extend my gratitude to Maxime Labonne for his incredibly useful LLM course and colab notebooks, which helped me level up my fine-tuning skills. Jon Durbin's bagel GitHub repository was a crash course in what makes good data, and it played a crucial role in informing my data selection process. The transparency and example set by Teknium inspired me to turn my AI side hustle into a full-time gig, and for that, I am deeply grateful.
 
-Locutusque's datasets served as a prime example of how to aggregate data like a pro. Justin Lin from Alibaba research supported my previous project, Qwen1.5 - 8x7b, which laid the foundation for GemMoE. I also want to thank Deepmind for releasing Gemma and acknowledge the hard work and dedication of everyone who contributed to its development.
+Locutusque's datasets were a prime example of aggregating data like a pro. Justin Lin from Alibaba research supported my previous project, Qwen1.5 - 8x7b, which laid the foundation for GemMoE. I also want to thank Deepmind for releasing Gemma and acknowledge the hard work and dedication of everyone who contributed to its development.
 
 The Deepseek team's DeepseekMoE paper was a game-changer for me, providing critical insights into what makes an MoE as good as possible. I am also incredibly grateful to the entire Perplexity team, whose answer engine accelerated my education and understanding of AI by a factor of five (source: vibes).
 
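
The new note replaces the custom transformers branch with a `trust_remote_code=True` flag. The README does not show where the flag goes, so the following is only a minimal sketch of one way it might be passed when loading with the stock `transformers` Auto classes. `MODEL_ID`, `build_load_kwargs`, and `load_gemmoe` are hypothetical names introduced for illustration, and the repository id is a placeholder, not the actual GemMoE checkpoint name.

```python
from typing import Any, Dict, Optional

# Hypothetical placeholder; substitute the actual GemMoE repository id.
MODEL_ID = "your-namespace/your-gemmoe-checkpoint"


def build_load_kwargs(extra: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return from_pretrained keyword arguments with remote code enabled."""
    kwargs: Dict[str, Any] = dict(extra or {})
    # The flag the README note asks for: allow the modeling code shipped
    # in the model repository to be executed locally.
    kwargs["trust_remote_code"] = True
    return kwargs


def load_gemmoe(model_id: str = MODEL_ID):
    """Load tokenizer and model (assumes transformers is installed)."""
    # Imported lazily so build_load_kwargs works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    kwargs = build_load_kwargs()
    tokenizer = AutoTokenizer.from_pretrained(model_id, **kwargs)
    model = AutoModelForCausalLM.from_pretrained(model_id, **kwargs)
    return tokenizer, model
```

Because `trust_remote_code=True` runs Python code downloaded from the model repository, it is worth reviewing that code before enabling the flag. Training frameworks that wrap `transformers` usually expose an equivalent option in their config files, but the exact key depends on the framework.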