File size: 7,910 Bytes
1a03412 6f2461c 1a03412 4fb6805 1a03412 e9735e5 8b458a7 13fb130 1a03412 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 |
---
license: mit
tags:
- Mistral_Star
- Mistral_Quiet
- Mistral
- Mixtral
- Question-Answer
- Token-Classification
- Sequence-Classification
- SpydazWeb-AI
- chemistry
- biology
- legal
- code
- climate
- medical
- text-generation-inference
language:
- en
- sw
- ig
- zu
- ca
- es
- pt
- ha
pipeline_tag: text-generation
---
# SpydazWeb AGI
This is based on the Quiet Star Reasoning Project : which was abandoned earlier in the year :)
Current update : UNDER TEST ! / Self Extend still to be applied : this is just an Early Release of the StaR model ! cleaned up tensors but !
problem with cos cache ?? --- Will fix it tomorow or this week here : Im workingonit!
Also cannot get to do the trainingbecause it need large memory and the A100 colab will not seem to work ! ( accelerate issue )
(I cannot make it a gguf ?HOW!) - Unless maybe i hack the transformers library maybe ?(reframe the model as themistral and replace the existing file ( thats how they had doen it in the past , perhaps i will have to stay as a remte code model ?))
# Introduction :
## STAR REASONERS !
this provides a platform for the model to commuicate pre-response , so an internal objective can be set ie adding an extra planning stage to the model improving its focus and output:
the thought head can be charged with a thought or methodolgy, such as a ststing to take a step by step approach to the problem or to make an object oriented model first and consider the use cases before creating an output:
so each thought head can be dedicated to specific ppurpose such as Planning or artifact generation or use case design : or even deciding which methodology should be applied before planning the potential solve route for the response :
Another head could also be dedicated to retrieving content based on the query from the self which can also be used in the pregenerations stages :
all pre- reasoners can be seen to be Self Guiding ! essentially removing the requirement to give the model a system prompt instead aligning the heads to a thoght pathways !
these chains produce data which can be considered to be thoughts : and can further be displayed by framing these thoughts with thought tokens : even allowing for editors comments giving key guidance to the model during training :
these thoughts will be used in future genrations assisting the model as well a displaying explantory informations in the output :
these tokens can be displayed or with held also a setting in the model !
### can this be applied in other areas ?
Yes! , we can use this type of method to allow for the model to generate code in another channel or head potentially creating a head to produce artifacts for every output , or to produce entity lilsts for every output and framing the outputs in thier relative code tags or function call tags :
these can also be displayed or hidden for the response . but these can also be used in problem solvibng tasks internally , which again enables for the model to simualte the inpouts and outputs from an interpretor !
it may even be prudent to include a function executing internally to the model ! ( allowing the model to execute functions in the background! before responding ) as well this oul hae tpo also be specified in the config , as autoexecute or not !.
### Conclusion
the resonaer methodology , might be seen to be the way forwards , adding internal funciton laity to the models instead of external connectivity enables for faster and seemless model usage : as well as enriched and informed responses , as even outputs could essentially be cleanss and formated before being presented to the Calling interface, internally to the model :
the take away is that arre we seeing the decoder/encoder model as simple a function of the inteligence which in truth need to be autonomus !
ie internal functions and tools as well as disk interaction : an agent must have awareness and control over its environment with sensors and actuators : as a fuction callingmodel it has actuators and canread the directorys it has sensors ... its a start: as we can eget media in and out , but the model needs to get its own control to inpout and output also !
....
Fine tuning : agin this issue of fine tuning : the disussion above eplains the requirement to control the environment from within the moel ( with constraints ) does this eliminate theneed to fine tune a model !
in fact it should as this give transparency to ther growth ofthe model and if the model fine tuned itself we would be in danger of a model evolveing !
hence an AGI !
#### AI AGI ?
so yes we can see we are not far from an ai which can evolve : an advance general inteligent system ( still non sentient by the way )
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>
https://github.com/spydaz
* 32k context window (vs 8k context in v0.1)
* Rope-theta = 1e6
* No Sliding-Window Attention
* Talk heads - produce resposnes which can be used towards the final output
* Pre-Thoughts - Enables for pre-generation steps of potential artifacts for task solving:
* Generates plans for step by step thinking
* Generates python Code Artifacts for future tasks
* Recalls context for task internally to be used as refference for task:
* show thoughts or hidden thought usages ( Simular to self-Rag )
This model will be a custom model with internal experts and rag systems
enabling for preprocessing of the task internally before outputting a response
## SpydazWeb AI model :
This model is based on the worlds archive of knowledge maintaining historical documents and providing services for the survivors of mankind ,
who may need to construct shelters develop technologys , or medical resources as well as maintain the history of the past . keeping store of all the religious knowledge and data of the world:
A friendly interface with a personality caring and flirtatious at times : non binary !...
and Expert in all feilds: ie Uncensored and will not refuse to give information : the model can be used for role play as many character dialogues were als trained into the model as its personality to enable a greater perspective and outlook and natural discussion with the agents:
the model was trained to operateinaragenvironment utilizing content and internal knowledge to respond to questions or create enriched sumarys.
### General Intenal Methods:
Trained for multi-task operations as well as rag and function calling :
This model is a fully functioning model and is fully uncensored:
the model has been trained on multiple datasets on the huggingface hub and kaggle :
the focus has been mainly on methodology :
* Chain of thoughts
* step by step planning
* tree of thoughts
* forest of thoughts
* graph of thoughts
* agent generation : Voting, ranking, ... dual agent response generation:
with these methods the model has gained insights into tasks, enabling for knowldge transfer between tasks :
the model has been intensivly trained in recalling data previously entered into the matrix:
The model has also been trained on rich data and markdown outputs as much as possible :
the model can also generate markdown charts with mermaid.
## Training Reginmes:
* Alpaca
* ChatML / OpenAI / MistralAI
* Text Generation
* Question/Answer (Chat)
* Instruction/Input/Response (instruct)
* Mistral Standard Prompt
* Translation Tasks
* Entitys / Topic detection
* Book recall
* Coding challenges, Code Feedback, Code Sumarization, Commenting Code
* Agent Ranking and response anyalisis
* Medical tasks
* PubMed
* Diagnosis
* Psychaitry
* Counselling
* Life Coaching
* Note taking
* Medical smiles
* Medical Reporting
* Virtual laboritys simulations
* Chain of thoughts methods
* One shot / Multi shot prompting tasks |