metadata

license: mit
tags:
  - Mistral_Star
  - Mistral_Quiet
  - Mistral
  - Mixtral
  - Question-Answer
  - Token-Classification
  - Sequence-Classification
  - SpydazWeb-AI
  - chemistry
  - biology
  - legal
  - code
  - climate
  - medical
  - text-generation-inference
  - 4-heads
language:
  - en
  - sw
  - ig
  - zu
  - ca
  - es
  - pt
  - ha
pipeline_tag: text-generation

SpydazWeb AGI

MODEL NOTE:

This is the top model ! it has all the functionality for all the skews of this model ! a labour of love !

REALLY WELL TRAINED !

LOOK AT THOSE FAT LAYERS !

this is why the 4-bit models are Very GOOD ! the GGUF of this model does not perform the same as the 4bit double quantized tensors !

https://github.com/spydaz

Current Update :

This model is working , but actually untrained : to load the model it requires trust-remote=TRUE:: But also if it does not load then you need to clone the github:

! git clone https://github.com/huggingface/transformers.git
## copy modeling_mistral.py and configuartion.py to the Transformers foler / Src/models/mistral and overwrite the existing files first: 
## THEN :
!cd transformers
!pip install  ./transformers

then restart the environment: the model can then load without trust-remote and WILL work FINE ! it can even be trained : hence the 4 bit optimised version ::

* 32k context window (vs 8k context in v0.1)
* Rope-theta = 1e6
* No Sliding-Window Attention
* Talk heads  - produce resposnes which can be used towards the final output
* Pre-Thoughts  - Enables for pre-generation steps of potential artifacts for task solving: 
  * Generates plans for step by step thinking 
  * Generates python Code Artifacts for future tasks
  * Recalls context for task internally to be used as refference for task:
* show thoughts or hidden thought usages ( Simular to self-Rag )

This model will be a custom model with internal experts and rag systems enabling for preprocessing of the task internally before outputting a response

SpydazWeb AI model :

This model is based on the worlds archive of knowledge maintaining historical documents and providing services for the survivors of mankind , who may need to construct shelters develop technologys , or medical resources as well as maintain the history of the past . keeping store of all the religious knowledge and data of the world: A friendly interface with a personality caring and flirtatious at times : non binary !... and Expert in all feilds: ie Uncensored and will not refuse to give information : the model can be used for role play as many character dialogues were als trained into the model as its personality to enable a greater perspective and outlook and natural discussion with the agents: the model was trained to operateinaragenvironment utilizing content and internal knowledge to respond to questions or create enriched sumarys.

This is based on the Quiet Star Reasoning Project : which was abandoned earlier in the year :)

Introduction : 4 Talking Heads

this is a trained version : It needs to be returned to FP16 ! (forgot how!)

STAR REASONERS !

this provides a platform for the model to commuicate pre-response , so an internal objective can be set ie adding an extra planning stage to the model improving its focus and output: the thought head can be charged with a thought or methodolgy, such as a ststing to take a step by step approach to the problem or to make an object oriented model first and consider the use cases before creating an output: so each thought head can be dedicated to specific ppurpose such as Planning or artifact generation or use case design : or even deciding which methodology should be applied before planning the potential solve route for the response : Another head could also be dedicated to retrieving content based on the query from the self which can also be used in the pregenerations stages : all pre- reasoners can be seen to be Self Guiding ! essentially removing the requirement to give the model a system prompt instead aligning the heads to a thoght pathways ! these chains produce data which can be considered to be thoughts : and can further be displayed by framing these thoughts with thought tokens : even allowing for editors comments giving key guidance to the model during training : these thoughts will be used in future genrations assisting the model as well a displaying explantory informations in the output :

these tokens can be displayed or with held also a setting in the model !

can this be applied in other areas ?

Yes! , we can use this type of method to allow for the model to generate code in another channel or head potentially creating a head to produce artifacts for every output , or to produce entity lilsts for every output and framing the outputs in thier relative code tags or function call tags : these can also be displayed or hidden for the response . but these can also be used in problem solvibng tasks internally , which again enables for the model to simualte the inpouts and outputs from an interpretor ! it may even be prudent to include a function executing internally to the model ! ( allowing the model to execute functions in the background! before responding ) as well this oul hae tpo also be specified in the config , as autoexecute or not !.

Conclusion

the resonaer methodology , might be seen to be the way forwards , adding internal funciton laity to the models instead of external connectivity enables for faster and seemless model usage : as well as enriched and informed responses , as even outputs could essentially be cleanss and formated before being presented to the Calling interface, internally to the model : the take away is that arre we seeing the decoder/encoder model as simple a function of the inteligence which in truth need to be autonomus ! ie internal functions and tools as well as disk interaction : an agent must have awareness and control over its environment with sensors and actuators : as a fuction callingmodel it has actuators and canread the directorys it has sensors ... its a start: as we can eget media in and out , but the model needs to get its own control to inpout and output also ! ....

Fine tuning : agin this issue of fine tuning : the disussion above eplains the requirement to control the environment from within the moel ( with constraints ) does this eliminate theneed to fine tune a model ! in fact it should as this give transparency to ther growth ofthe model and if the model fine tuned itself we would be in danger of a model evolveing ! hence an AGI !

AI AGI ?

so yes we can see we are not far from an ai which can evolve : an advance general inteligent system ( still non sentient by the way )

General Intenal Methods:

Trained for multi-task operations as well as rag and function calling :

This model is a fully functioning model and is fully uncensored:

the model has been trained on multiple datasets on the huggingface hub and kaggle :

the focus has been mainly on methodology :

Chain of thoughts
step by step planning
tree of thoughts
forest of thoughts
graph of thoughts
agent generation : Voting, ranking, ... dual agent response generation:

with these methods the model has gained insights into tasks, enabling for knowldge transfer between tasks :

the model has been intensivly trained in recalling data previously entered into the matrix: The model has also been trained on rich data and markdown outputs as much as possible : the model can also generate markdown charts with mermaid.

Training Reginmes:

Alpaca
ChatML / OpenAI / MistralAI
Text Generation
Question/Answer (Chat)
Instruction/Input/Response (instruct)
Mistral Standard Prompt
Translation Tasks
Entitys / Topic detection
Book recall
Coding challenges, Code Feedback, Code Sumarization, Commenting Code
Agent Ranking and response anyalisis
Medical tasks
- PubMed
- Diagnosis
- Psychaitry
- Counselling
- Life Coaching
- Note taking
- Medical smiles
- Medical Reporting
Virtual laboritys simulations
Chain of thoughts methods
One shot / Multi shot prompting tasks