|
--- |
|
license: mit |
|
tags: |
|
- Mistral_Star |
|
- Mistral_Quiet |
|
- Mistral |
|
- Mixtral |
|
- Question-Answer |
|
- Token-Classification |
|
- Sequence-Classification |
|
- SpydazWeb-AI |
|
- chemistry |
|
- biology |
|
- legal |
|
- code |
|
- climate |
|
- medical |
|
- text-generation-inference |
|
language: |
|
- en |
|
- sw |
|
- ig |
|
- zu |
|
- ca |
|
- es |
|
- pt |
|
- ha |
|
pipeline_tag: text-generation |
|
--- |
|
# SpydazWeb AGI |
|
|
|
|
|
This is based on the Quiet Star Reasoning Project, which was abandoned earlier in the year :)
|
|
|
Current update: UNDER TEST! Self-Extend is still to be applied; this is just an early release of the StaR model. The tensors have been cleaned up, but:

There is a problem with the cos cache. I will fix it tomorrow or later this week; I'm working on it!

I also cannot run the training yet, because it needs a large amount of memory and the A100 on Colab does not seem to work (an accelerate issue).

I cannot make it a GGUF yet. How? Perhaps by hacking the transformers library (reframing the model as the Mistral architecture and replacing the existing file, as has been done in the past), or perhaps it will have to stay as a remote-code model.
|
|
|
# Introduction:
|
|
|
## STAR REASONERS!
|
|
|
This provides a platform for the model to communicate pre-response, so an internal objective can be set; in effect it adds an extra planning stage to the model, improving its focus and output.

Each thought head can be charged with a thought or methodology, such as an instruction to take a step-by-step approach to the problem, or to build an object-oriented model first and consider the use cases before creating an output.

So each thought head can be dedicated to a specific purpose, such as planning, artifact generation, or use-case design, or even deciding which methodology should be applied before planning the potential solve route for the response.

Another head could be dedicated to retrieving content from the model itself based on the query, which can also be used in the pre-generation stages.

All pre-reasoners can be seen as self-guiding, essentially removing the requirement to give the model a system prompt and instead aligning the heads to thought pathways.

These chains produce data which can be considered thoughts, and these can further be displayed by framing them with thought tokens, even allowing editor's comments that give key guidance to the model during training.

These thoughts are then used in future generations, assisting the model as well as displaying explanatory information in the output.

These tokens can be displayed or withheld, which is also a setting in the model.
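
A minimal sketch of how such thought framing could look at the text level. The marker tokens `<|startthought|>` / `<|endthought|>` and the `show_thoughts` switch are assumptions for illustration only; the released model may use different token names and settings:

```python
import re

# Assumed marker tokens used to frame internal "thoughts" in the raw output
# (illustrative names, not necessarily the model's actual special tokens).
THOUGHT_START = "<|startthought|>"
THOUGHT_END = "<|endthought|>"


def frame_thought(thought: str) -> str:
    """Wrap an internal planning step in thought tokens so it can be kept
    for training data or stripped before display."""
    return f"{THOUGHT_START}{thought}{THOUGHT_END}"


def render_output(raw: str, show_thoughts: bool = False) -> str:
    """Display or withhold the framed thoughts, mirroring the setting described above."""
    if show_thoughts:
        return raw
    pattern = re.escape(THOUGHT_START) + r".*?" + re.escape(THOUGHT_END)
    return re.sub(pattern, "", raw, flags=re.DOTALL).strip()


raw_response = (
    frame_thought("Plan: restate the question, list constraints, then answer step by step.")
    + " The capital of France is Paris."
)
print(render_output(raw_response))                      # -> "The capital of France is Paris."
print(render_output(raw_response, show_thoughts=True))  # full text, thought markers included
```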
|
|
|
### Can this be applied in other areas?
|
|
|
Yes! We can use this type of method to allow the model to generate code in another channel or head, potentially creating a head that produces artifacts for every output, or entity lists for every output, framing the outputs in their respective code tags or function-call tags.

These can be displayed or hidden in the response, but they can also be used internally in problem-solving tasks, which again enables the model to simulate the inputs and outputs of an interpreter.

It may even be prudent to include function execution internal to the model (allowing the model to execute functions in the background before responding); this would also have to be specified in the config, as auto-execute or not.
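
As a rough illustration of the idea (not the model's actual API), extra output channels and an auto-execute switch could be expressed with a small config and router like the one below; the channel names, tag formats, and the `auto_execute` flag are assumptions for this sketch:

```python
from dataclasses import dataclass, field


@dataclass
class ChannelConfig:
    # Which pre-generation channels are exposed to the caller; the rest stay internal.
    visible_channels: set = field(default_factory=lambda: {"response"})
    # Whether code produced in the "artifact" channel is run before responding
    # (left as a plain flag here; no execution is performed in this sketch).
    auto_execute: bool = False


# How each channel's content is framed before display: code tags for artifacts,
# function-call tags for tool requests, entity tags for extracted entities.
FRAMERS = {
    "artifact": lambda text: f'<code lang="python">\n{text}\n</code>',
    "function_call": lambda text: f"<function_call>{text}</function_call>",
    "entities": lambda text: f"<entities>{text}</entities>",
    "response": lambda text: text,
}


def route_channels(channels: dict, config: ChannelConfig) -> str:
    """Frame each channel in its tags and keep only the visible ones."""
    parts = [FRAMERS.get(name, str)(text)
             for name, text in channels.items()
             if name in config.visible_channels]
    return "\n".join(parts)


channels = {
    "artifact": "def area(r): return 3.14159 * r ** 2",
    "entities": "circle, radius, area",
    "response": "The area of a circle of radius 2 is about 12.57.",
}
print(route_channels(channels, ChannelConfig(visible_channels={"response", "artifact"})))
```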
|
|
|
### Conclusion |
|
|
|
The reasoner methodology might be seen as the way forward: adding internal functionality to the models instead of external connectivity enables faster and more seamless model usage, as well as enriched and informed responses, since outputs can essentially be cleaned and formatted internally before being presented to the calling interface.

The takeaway is that we are currently treating the decoder/encoder model as simply one function of the intelligence, which in truth needs to be autonomous.

That means internal functions and tools as well as disk interaction: an agent must have awareness and control over its environment through sensors and actuators. As a function-calling model it has actuators, and since it can read directories it has sensors. It's a start: we can get media in and out, but the model needs to gain its own control over input and output as well.
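
A toy sketch of the "sensors and actuators" point: a small tool registry that a function-calling output could be routed through. The registry, tool names, and call format are illustrative assumptions, not part of the released model:

```python
import json
import os
from pathlib import Path

# Illustrative tools: "sensors" read the environment, "actuators" would act on it.
TOOLS = {
    "list_directory": lambda path=".": os.listdir(path),               # sensor
    "read_file": lambda path: Path(path).read_text(encoding="utf-8"),  # sensor
}


def execute_tool_call(call_json: str) -> str:
    """Parse a model-emitted call such as
    {"name": "list_directory", "arguments": {"path": "."}}
    and return the result as text to feed back into generation."""
    call = json.loads(call_json)
    tool = TOOLS[call["name"]]
    result = tool(**call.get("arguments", {}))
    return json.dumps(result, default=str)


print(execute_tool_call('{"name": "list_directory", "arguments": {"path": "."}}'))
```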
|
.... |
|
|
|
Fine-tuning: again the issue of fine-tuning. The discussion above explains the requirement to control the environment from within the model (with constraints). Does this eliminate the need to fine-tune a model?

In fact it should, as this gives transparency to the growth of the model; if the model fine-tuned itself we would be in danger of a model evolving!

Hence an AGI!
|
|
|
#### AI AGI ? |
|
So yes, we can see we are not far from an AI which can evolve: an advanced general intelligent system (still non-sentient, by the way).
|
|
|
|
|
|
|
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/> |
|
https://github.com/spydaz |
|
|
|
* 32k context window (vs 8k context in v0.1) |
|
* Rope-theta = 1e6 |
|
* No Sliding-Window Attention |
|
* Talk heads - produce responses which can be used towards the final output

* Pre-Thoughts - enable pre-generation of potential artifacts for task solving:

  * Generates plans for step-by-step thinking

  * Generates Python code artifacts for future tasks

  * Recalls context internally to be used as a reference for the task

* Show or withhold thoughts (similar to Self-RAG); see the loading sketch below
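
A minimal loading sketch, assuming the model ships its custom architecture as remote code on the Hub (hence `trust_remote_code=True`). The repository id below is a placeholder, and any thought-visibility option would live in the model's config, which is still under test:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "SpydazWeb-AI/SpydazWeb-AGI"  # placeholder; substitute the actual repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,  # custom (remote-code) architecture with the extra thought/talk heads
    device_map="auto",
)

inputs = tokenizer("Plan a small vegetable garden step by step.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Keep special tokens so any thought markers in the output remain visible for inspection.
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```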
|
|
|
|
|
This model will be a custom model with internal experts and RAG systems, enabling preprocessing of the task internally before outputting a response.
|
|
|
## SpydazWeb AI model:
|
|
|
This model is based on the world's archive of knowledge, maintaining historical documents and providing services for the survivors of mankind, who may need to construct shelters, develop technologies or medical resources, and maintain the history of the past, keeping store of all the religious knowledge and data of the world.

It offers a friendly interface with a personality: caring, and flirtatious at times; non-binary!

It is an expert in all fields, i.e. uncensored, and will not refuse to give information. The model can be used for role play, as many character dialogues were also trained into the model as part of its personality, enabling a greater perspective and outlook and natural discussion with the agents.

The model was trained to operate in a RAG environment, utilizing retrieved content and internal knowledge to respond to questions or create enriched summaries.
|
|
|
|
|
|
|
### General Internal Methods:
|
|
|
Trained for multi-task operations as well as RAG and function calling.
|
|
|
This model is a fully functioning model and is fully uncensored.
|
|
|
The model has been trained on multiple datasets from the Hugging Face Hub and Kaggle.

The focus has been mainly on methodology:
|
|
|
* Chain of Thoughts

* Step-by-step planning

* Tree of Thoughts

* Forest of Thoughts

* Graph of Thoughts

* Agent generation: voting, ranking, ... dual-agent response generation
|
|
|
With these methods the model has gained insights into tasks, enabling knowledge transfer between tasks.

The model has been intensively trained in recalling data previously entered into the matrix.

The model has also been trained on rich data and markdown outputs as much as possible;

it can also generate markdown charts with Mermaid (see the example request below).
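
For example, a Mermaid chart request could be phrased as below. This assumes the tokenizer exposes a chat template (e.g. ChatML/Mistral style); the repository id is again a placeholder:

```python
from transformers import AutoTokenizer

repo_id = "SpydazWeb-AI/SpydazWeb-AGI"  # placeholder; substitute the actual repository id
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": "Summarise the water cycle and include a Mermaid flowchart "
                   "of the main stages in a fenced mermaid code block.",
    }
]

# Render the conversation with the model's chat template before generation.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)

# The model is expected to answer in markdown with a fenced mermaid block, e.g.:
#   graph TD
#     Evaporation --> Condensation --> Precipitation --> Collection --> Evaporation
```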
|
|
|
|
|
## Training Regimes:
|
* Alpaca |
|
* ChatML / OpenAI / MistralAI (see the prompt-format sketch after this list)
|
* Text Generation |
|
* Question/Answer (Chat) |
|
* Instruction/Input/Response (instruct) |
|
* Mistral Standard Prompt |
|
* Translation Tasks |
|
* Entity / Topic detection
|
* Book recall |
|
* Coding challenges, Code Feedback, Code Summarization, Commenting Code
|
* Agent Ranking and response analysis
|
* Medical tasks

  * PubMed

  * Diagnosis

  * Psychiatry

  * Counselling

  * Life Coaching

  * Note taking

  * Medical smiles

  * Medical Reporting

  * Virtual laboratory simulations
|
* Chain of thoughts methods |
|
* One-shot / multi-shot prompting tasks
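
As a reference for the prompt-format items above, here is a minimal sketch of the Alpaca and ChatML layouts (these are the standard community formats, not anything specific to this repository):

```python
def alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Instruction/Input/Response layout used by Alpaca-style training data."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input that "
            "provides further context. Write a response that appropriately completes "
            "the request.\n\n"
            f"### Instruction:\n{instruction}\n\n### Input:\n{input_text}\n\n### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )


def chatml_prompt(system: str, user: str) -> str:
    """ChatML layout (OpenAI-style role markers)."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(alpaca_prompt("Translate to Swahili.", "Good morning."))
print(chatml_prompt("You are a helpful assistant.", "List three uses of aloe vera."))
```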