Upload README (3).md
Browse files- README (3).md +147 -0
README (3).md
ADDED
@@ -0,0 +1,147 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
license: mit
|
3 |
+
tags:
|
4 |
+
- Mistral_Star
|
5 |
+
- Mistral_Quiet
|
6 |
+
- Mistral
|
7 |
+
- Mixtral
|
8 |
+
- Question-Answer
|
9 |
+
- Token-Classification
|
10 |
+
- Sequence-Classification
|
11 |
+
- SpydazWeb-AI
|
12 |
+
- chemistry
|
13 |
+
- biology
|
14 |
+
- legal
|
15 |
+
- code
|
16 |
+
- climate
|
17 |
+
- medical
|
18 |
+
- text-generation-inference
|
19 |
+
- not-for-all-audiences
|
20 |
+
language:
|
21 |
+
- en
|
22 |
+
- sw
|
23 |
+
- ig
|
24 |
+
- zu
|
25 |
+
- ca
|
26 |
+
- es
|
27 |
+
- pt
|
28 |
+
- ha
|
29 |
+
---
|
30 |
+
# SpydazWeb AGI
|
31 |
+
|
32 |
+
|
33 |
+
This is based on the Quiet Star Project : which was abandoned earlier in the year :)
|
34 |
+
|
35 |
+
Current update : Deciding which vsion architecture to use!
|
36 |
+
|
37 |
+
# Introduction :
|
38 |
+
|
39 |
+
## STAR REASONERS !
|
40 |
+
|
41 |
+
this provides a platform for the model to commuicate pre-response , so an internal objective can be set ie adding an extra planning stage to the model improving its focus and output:
|
42 |
+
the thought head can be charged with a thought or methodolgy, such as a ststing to take a step by step approach to the problem or to make an object oriented model first and consider the use cases before creating an output:
|
43 |
+
so each thought head can be dedicated to specific ppurpose such as Planning or artifact generation or use case design : or even deciding which methodology should be applied before planning the potential solve route for the response :
|
44 |
+
Another head could also be dedicated to retrieving content based on the query from the self which can also be used in the pregenerations stages :
|
45 |
+
all pre- reasoners can be seen to be Self Guiding ! essentially removing the requirement to give the model a system prompt instead aligning the heads to a thoght pathways !
|
46 |
+
these chains produce data which can be considered to be thoughts : and can further be displayed by framing these thoughts with thought tokens : even allowing for editors comments giving key guidance to the model during training :
|
47 |
+
these thoughts will be used in future genrations assisting the model as well a displaying explantory informations in the output :
|
48 |
+
|
49 |
+
these tokens can be displayed or with held also a setting in the model !
|
50 |
+
|
51 |
+
### can this be applied in other areas ?
|
52 |
+
|
53 |
+
Yes! , we can use this type of method to allow for the model to generate code in another channel or head potentially creating a head to produce artifacts for every output , or to produce entity lilsts for every output and framing the outputs in thier relative code tags or function call tags :
|
54 |
+
these can also be displayed or hidden for the response . but these can also be used in problem solvibng tasks internally , which again enables for the model to simualte the inpouts and outputs from an interpretor !
|
55 |
+
it may even be prudent to include a function executing internally to the model ! ( allowing the model to execute functions in the background! before responding ) as well this oul hae tpo also be specified in the config , as autoexecute or not !.
|
56 |
+
|
57 |
+
### Conclusion
|
58 |
+
|
59 |
+
the resonaer methodology , might be seen to be the way forwards , adding internal funciton laity to the models instead of external connectivity enables for faster and seemless model usage : as well as enriched and informed responses , as even outputs could essentially be cleanss and formated before being presented to the Calling interface, internally to the model :
|
60 |
+
the take away is that arre we seeing the decoder/encoder model as simple a function of the inteligence which in truth need to be autonomus !
|
61 |
+
ie internal functions and tools as well as disk interaction : an agent must have awareness and control over its environment with sensors and actuators : as a fuction callingmodel it has actuators and canread the directorys it has sensors ... its a start: as we can eget media in and out , but the model needs to get its own control to inpout and output also !
|
62 |
+
....
|
63 |
+
|
64 |
+
Fine tuning : agin this issue of fine tuning : the disussion above eplains the requirement to control the environment from within the moel ( with constraints ) does this eliminate theneed to fine tune a model !
|
65 |
+
in fact it should as this give transparency to ther growth ofthe model and if the model fine tuned itself we would be in danger of a model evolveing !
|
66 |
+
hence an AGI !
|
67 |
+
|
68 |
+
#### AI AGI ?
|
69 |
+
so yes we can see we are not far from an ai which can evolve : an advance general inteligent system ( still non sentient by the way )
|
70 |
+
|
71 |
+
|
72 |
+
|
73 |
+
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>
|
74 |
+
https://github.com/spydaz
|
75 |
+
|
76 |
+
* 32k context window (vs 8k context in v0.1)
|
77 |
+
* Rope-theta = 1e6
|
78 |
+
* No Sliding-Window Attention
|
79 |
+
* Talk heads - produce resposnes which can be used towards the final output
|
80 |
+
* Pre-Thoughts - Enables for pre-generation steps of potential artifacts for task solving:
|
81 |
+
* Generates plans for step by step thinking
|
82 |
+
* Generates python Code Artifacts for future tasks
|
83 |
+
* Recalls context for task internally to be used as refference for task:
|
84 |
+
* show thoughts or hidden thought usages ( Simular to self-Rag )
|
85 |
+
|
86 |
+
|
87 |
+
This model will be a custom model with internal experts and rag systems
|
88 |
+
enabling for preprocessing of the task internally before outputting a response
|
89 |
+
|
90 |
+
## SpydazWeb AI model :
|
91 |
+
|
92 |
+
This model is based on the worlds archive of knowledge maintaining historical documents and providing services for the survivors of mankind ,
|
93 |
+
who may need to construct shelters develop technologys , or medical resources as well as maintain the history of the past . keeping store of all the religious knowledge and data of the world:
|
94 |
+
A friendly interface with a personality caring and flirtatious at times : non binary !...
|
95 |
+
and Expert in all feilds: ie Uncensored and will not refuse to give information : the model can be used for role play as many character dialogues were als trained into the model as its personality to enable a greater perspective and outlook and natural discussion with the agents:
|
96 |
+
the model was trained to operateinaragenvironment utilizing content and internal knowledge to respond to questions or create enriched sumarys.
|
97 |
+
|
98 |
+
|
99 |
+
|
100 |
+
### General Intenal Methods:
|
101 |
+
|
102 |
+
Trained for multi-task operations as well as rag and function calling :
|
103 |
+
|
104 |
+
This model is a fully functioning model and is fully uncensored:
|
105 |
+
|
106 |
+
the model has been trained on multiple datasets on the huggingface hub and kaggle :
|
107 |
+
|
108 |
+
the focus has been mainly on methodology :
|
109 |
+
|
110 |
+
* Chain of thoughts
|
111 |
+
* step by step planning
|
112 |
+
* tree of thoughts
|
113 |
+
* forest of thoughts
|
114 |
+
* graph of thoughts
|
115 |
+
* agent generation : Voting, ranking, ... dual agent response generation:
|
116 |
+
|
117 |
+
with these methods the model has gained insights into tasks, enabling for knowldge transfer between tasks :
|
118 |
+
|
119 |
+
the model has been intensivly trained in recalling data previously entered into the matrix:
|
120 |
+
The model has also been trained on rich data and markdown outputs as much as possible :
|
121 |
+
the model can also generate markdown charts with mermaid.
|
122 |
+
|
123 |
+
|
124 |
+
## Training Reginmes:
|
125 |
+
* Alpaca
|
126 |
+
* ChatML / OpenAI / MistralAI
|
127 |
+
* Text Generation
|
128 |
+
* Question/Answer (Chat)
|
129 |
+
* Instruction/Input/Response (instruct)
|
130 |
+
* Mistral Standard Prompt
|
131 |
+
* Translation Tasks
|
132 |
+
* Entitys / Topic detection
|
133 |
+
* Book recall
|
134 |
+
* Coding challenges, Code Feedback, Code Sumarization, Commenting Code
|
135 |
+
* Agent Ranking and response anyalisis
|
136 |
+
* Medical tasks
|
137 |
+
* PubMed
|
138 |
+
* Diagnosis
|
139 |
+
* Psychaitry
|
140 |
+
* Counselling
|
141 |
+
* Life Coaching
|
142 |
+
* Note taking
|
143 |
+
* Medical smiles
|
144 |
+
* Medical Reporting
|
145 |
+
* Virtual laboritys simulations
|
146 |
+
* Chain of thoughts methods
|
147 |
+
* One shot / Multi shot prompting tasks
|