tags:
- mistral
- trl
base_model: LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III
datasets:
- gretelai/synthetic_text_to_sql
- HuggingFaceTB/cosmopedia
- teknium/OpenHermes-2.5
- Open-Orca/SlimOrca
- Open-Orca/OpenOrca
- cognitivecomputations/dolphin-coder
- databricks/databricks-dolly-15k
- yahma/alpaca-cleaned
- uonlp/CulturaX
- mwitiderrick/SwahiliPlatypus
- swahili
- Rogendo/English-Swahili-Sentence-Pairs
- ise-uiuc/Magicoder-Evol-Instruct-110K
- meta-math/MetaMathQA
- abacusai/ARC_DPO_FewShot
- abacusai/MetaMath_DPO_FewShot
- abacusai/HellaSwag_DPO_FewShot
- HaltiaAI/Her-The-Movie-Samantha-and-Theodore-Dataset
metrics:
- accuracy
- bertscore
- bleu
- brier_score
- cer
- character
- charcut_mt
- chrf
- code_eval
y-Gene:
- LeroyDyer/Mixtral_AI_DeepMind
- LeroyDyer/Mixtral_AI_CyberUltron_DPO
- LeroyDyer/Mixtral_AI_Chat_2.0
- LeroyDyer/Mixtral_AI_DeepMedicalMind
- LeroyDyer/Mixtral_AI_Samantha
x-Gene:
- LeroyDyer/Mixtral_AI_Chat_2.0
- LeroyDyer/Mixtral_BioMedical
- LeroyDyer/Mixtral_AI_Medic
- LeroyDyer/Mixtral_Cyber_BioMedic
- LeroyDyer/Mixtral_AI_DeepMedicalMind
Variant:
- LeroyDyer/MetaMath_LLM
- LeroyDyer/TruthfulQA_LLM
- LeroyDyer/HellaSwag_LLM
- LeroyDyer/Mixtral_AI_DeepMedicalMind
---


# ::: DEEP MIND PROJECT :::

Here we begin the models for Deep Mind:

This model was created from the first trained models: DeepMind! These models contain:

## Thoughts and processes

## SelfRAG

## Agent generation

## Chain of thought

## Deep thinking and memory recall


Training prompt version: working GREAT!

It checks itself when discussing complex questions: given a question it does not know the answer to, it tries to discuss the problem with itself to find a result (sometimes unsuccessfully).

It generates mini agents to perform small tasks such as entity recognition, step-by-step definitions, writing pseudo-codebases, generating use cases, performing calculations, and analyzing content.

It thinks... sometimes sarcasm, sometimes reflection, sometimes random thoughts...

It has personalities: by holding various long discussions with ChatGPT in persona, it was able to generate role-play conversation data, which was added to its conversational chat Q/A, along with a dataset from the Samantha TV show... and HER!... so it is a personal assistant and very friendly.

It has mainly been trained on coding datasets and medical information: from experiments to research, to patient/doctor exchanges, to diagnosis, to problem solving.

It has been trained to act as a counsellor and assist with psychological problems: empathetic discussion.

This one has its own thoughts despite the prompt given (if you allow the thought prompt, it will display the thoughts).

This is a highly focused model.


### Methodology:

Many functions, such as defining words and NLP tasks, were also added via datasets using very complex data structures and prompts.

These prompts are removed after training, and standard Alpaca training is applied on top (this enables the previously highly over-fit tasks to become embedded underneath the newer layer). A sketch of the standard Alpaca template follows.
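For reference, this is the widely used Alpaca instruction template; since "all variations of the alpaca prompt" were trained, treat this as one representative variant rather than the exact training script:

```python
# Standard Alpaca prompt format (one common variant).
ALPACA_PROMPT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
{response}"""

print(ALPACA_PROMPT.format(
    instruction="Define the word 'entropy'.",
    input="",
    response="",
))
```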
It is important to change the LoRA configuration to include the embedding layers of the model, as well as fine-tuning on top of previous training. Usually I use a factor of 8 when sizing my LoRAs, but for this one I chose a factor of 9 (9-18/18/36), which trained so smoothly that I was able to train many different datasets in a single sitting, to below 0.9 loss on all variations of the Alpaca prompt. After testing there was absolutely zero loss of previous knowledge, with some responses enhanced and comparative responses provided for others. A hedged configuration sketch is shown below.
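A minimal PEFT sketch of a LoRA that also trains the embedding layers, reading "factor of 9 (9-18/18/36)" as r=9 with lora_alpha=18; every value here is an illustrative assumption, not the author's recorded settings:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III"
)

lora_config = LoraConfig(
    r=9,                # assumed from "factor of 9"
    lora_alpha=18,      # assumed from "(9-18/18/36)"
    lora_dropout=0.05,  # illustrative default
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    # Also train the embedding and output layers so new prompt formats
    # can embed beneath later fine-tunes.
    modules_to_save=["embed_tokens", "lm_head"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```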
I personally use a top-k of 1000. This gives the model many choices (this is the candidate pool of results). I put my top-p at 0.68 (68%), so it will select from that percentage of the probability mass, which enables my temperature to be 1: the selected slice of next-token probabilities is re-normalized, giving the lower-probability tokens a scaled chance of being selected. These settings are sketched below.
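A minimal sketch of those sampling settings with the transformers generate API; the model id points at this card's base model, and the prompt and token budget are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LeroyDyer/Mixtral_AI_CyberTron_DeepMind_III"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("What is entropy?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=True,    # keep a degree of randomness in the response
    top_k=1000,        # wide candidate pool
    top_p=0.68,        # sample from 68% of the probability mass
    temperature=1.0,   # no extra sharpening or flattening
    max_new_tokens=256,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```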
It is important to have a degree of randomness in the response, or you will ask the same question and get the same answer! We need varied answers to some queries and focused answers for others. How do we do this? Duplicates! Raise the probability of some information by repetition, as this is how a human learns truth: truth is that which has been repeated so many times it cannot be disputed! Hence some information is absolute while other information is transient and constantly updating. A toy oversampling sketch follows.
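A toy sketch of biasing training data by repetition with the datasets library; the split into "absolute" and "transient" records here is hypothetical:

```python
from datasets import Dataset, concatenate_datasets

# Repeat "absolute" facts several times so the model sees them more
# often; transient data appears once and can be refreshed later.
absolute = Dataset.from_dict({"text": ["Water boils at 100 C at sea level."]})
transient = Dataset.from_dict({"text": ["Today's top story is ..."]})

train_data = concatenate_datasets(
    [absolute] * 5 + [transient]  # 5x repetition raises its probability
).shuffle(seed=42)
```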
As a predictive model it needs the ability to calculate, predict, and classify, as well as recall exact information. Hence, when utilizing a RAG setup, the conversation history is the data to be fine-tuned into the model as frequent data, along with producing multiple similar queries to ask the RAG system for Q/A pairs, which are also updated into the model. A hypothetical sketch of that harvesting loop is below.
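A hypothetical sketch of harvesting Q/A pairs from a RAG system for later fine-tuning; rag_answer() and paraphrase() are stand-ins for whatever RAG pipeline and query rewriter you actually run, not functions from this repo:

```python
def harvest_qa_pairs(question: str, rag_answer, paraphrase, n_variants: int = 3):
    """Ask the RAG system the question plus several similar queries,
    returning instruction/response pairs to append to the fine-tuning set."""
    pairs = []
    for query in [question] + paraphrase(question, n_variants):
        pairs.append({"instruction": query, "response": rag_answer(query)})
    return pairs  # fine-tune these into the model as frequent data
```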
As we are in this development period, we are currently focused on the BRAIN.

# Uploaded model