
MODEL :

SpydazWebAI_AI_BRAIN - The ancient one emerges

This model is the pinnacle of a long training lineage, built on mistralai/Mistral-7B-v0.1.

  • The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters.

  • Mistral-7B-v0.1 is a transformer model, with the following architecture choices:

    • Grouped-Query Attention
    • Sliding-Window Attention
    • Byte-fallback BPE tokenizer
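
For reference, a minimal sketch of loading this base model with the transformers library (generic usage, not the exact training setup described below):

```python
# Minimal sketch: load the Mistral-7B-v0.1 base model with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # FP16 weights, matching the published tensor type
    device_map="auto",           # spread layers across available GPUs / CPU
)

prompt = "The LCARS archive reports:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```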

Over its lifetime it was also merged from multiple models:

These models were its humble beginnings: some basic training pathways. Later the model was trained on corpuses and many books and manuals, to enable it to truly gauge language, especially human English. The model was also trained to translate various target languages, using the Bible as the dataset for translation, enabling it to learn the moral and historical timelines desired for this specific model.

The model was also highly trained on medical data, as well as medical diagnosis and treatment, to make it a valued tool: it was given dialogues from multiple scenarios covering treatment, diagnosis and triage situations, as well as medical research, medical note taking and reporting. Entity-detection tasks were also implemented for this learning pathway, and psychiatric counselling was included in this stack.

The model was given various personalities: by merging various role-play models it gained an underlying understanding of being used as a role-playing model; it was then given multiple dialogues from role-play datasets, as well as movie characters and interviews with various historical characters discussing their exploits, obtained from chats generated by cloud models. Role-play skillsets and profiles were also added, as well as erotic role-play and dialogue. The Samantha component was merged and its datasets trained into the model, which actually provided some refusal and denial as well as some charm: this was later lost in other tasks, but remains as an underlying personality. Some vulgar text-messaging dialogues were also added, giving even more flavour to obscenities and attacks, enabling banter with those who choose that pathway.

The model was also trained on various code, from Python and .NET to the obscure Arduino, and trained to solve large projects by generating sub-agents internally to perform tasks. It was trained on the STEM sciences (maths, chemistry, biology, ...) and basic assistance, and many papers covering broad topics were installed into the model; its context and previous training allowed it to train easily on these topics. The model was trained to be a dictionary, a library, a researcher, a function caller... the list goes on. It was trained with all methodologies, with many examples: step by step, spatial awareness, user case, object oriented, chain rules, logic-based problem solving. These steps require intense multi-shot tasks to train (all are sub-tasks with customized prompts which often do not fit into standard prompt templates, hence being a custom methodology).
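
As an illustration of what one of these custom methodology prompts might look like, here is a small sketch; the template and field names are hypothetical, not the exact ones used in training:

```python
# Hypothetical sketch: shaping a raw question/answer pair into a "step-by-step
# thoughts" training sample of the kind described above. The template and field
# names are illustrative only.
def to_methodology_sample(question: str, thoughts: list[str], answer: str) -> str:
    steps = "\n".join(f"Step {i + 1}: {t}" for i, t in enumerate(thoughts))
    return (
        "### Task:\n"
        f"{question}\n\n"
        "### Thoughts (visual-spatial sketchpad):\n"
        f"{steps}\n\n"
        "### Final Response:\n"
        f"{answer}"
    )

sample = to_methodology_sample(
    "Triage a patient presenting with chest pain and shortness of breath.",
    ["Identify red-flag symptoms.", "Assign an urgency category.", "Recommend next actions."],
    "Category 1: immediate assessment; obtain ECG and vital signs.",
)
print(sample)
```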

DEEP MIND SERIES :

The Deep Mind series was highly focused on solving university-style questions with higher requirements and methodologies. This series was heavily trained with one-shot and multi-shot competition data, such as the leaderboard datasets, with concepts from other leaderboards also being used... so in fact, how could it fail such tests, when it has been highly trained on those tasks and has many examples? Somehow the scores seemed to go down, and the scores even stopped coming after I submitted the latest models after Deep Mind. My highest model was a 79% average, but somehow no other model I created was able to surpass it: very suspect! I also played around with merging the top models, saw no progress, and realised the leaderboard had crashed.

Cyber Series :

Highly focused on creating experts for specific roles: breaking specific roles and data into models and training them as agents, to be merged into my mixture-of-experts model, which was also a great model but NVIDIA-intensive; and was it actually better than my 7B? I played around with 3B and 1B models to train abilities on my home PC, but in truth they were very hard to bring up to the level of my main model, which I now test before turning to ChatGPT for answers... finding myself returning to my own model, and going to ChatGPT only to generate data conversations and its one-sided opinions, so they can be pushed into the parent model.

Librarian

The model has been trained to recall complete books or documents previously uploaded to the library: a series of training runs on unstructured book corpuses as well as structured books, plus recall of previously entered books. Some books were not added but simply recalled repeatedly until the loss matched 0.5 or below... this technique has been used to add documentaries, sacred books, romance novels, sci-fi and other fiction, as well as cultural stories in various languages. For the recall task it was performing well, but we are still unsure whether it can generalise this task; maybe with some grokking it will.
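
A rough sketch of the recall criterion described above; the trainer interface here is a stand-in, not the actual pipeline:

```python
# Hypothetical sketch: keep fine-tuning on a single book until the mean training
# loss on that book drops to 0.5 or below (the recall criterion mentioned above).
# `train_one_epoch` stands in for whatever trainer is actually used.
LOSS_TARGET = 0.5
MAX_PASSES = 20

def train_until_recalled(train_one_epoch, book_chunks) -> float:
    loss = float("inf")
    passes = 0
    while loss > LOSS_TARGET and passes < MAX_PASSES:
        loss = train_one_epoch(book_chunks)   # returns mean loss for this pass
        passes += 1
        print(f"pass {passes}: loss={loss:.3f}")
    return loss
```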

Training Paradigms: (one year of training, 365 merges later, all with the previous parents)

GENETIC MERGES TO PRESERVE THE PAST TASKS AND SUPER FINE TUNING

  • A series of merges (genetic-algorithm style merging): multiple merge targets chosen to create the x+y+z models, utilizing the current series and previous series of models (see the sketch after this list).
  • Hence the LCARS/ARCHIVE models have been re-merged into the world archive models; the prompt used was designed to enable the model to learn books, recall books and passages, and recount the histories of mankind.
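
As a rough illustration of what one merge step might look like with a merging tool such as mergekit (the tool, config fields and weights here are assumptions, since the card does not name the exact recipe):

```python
# Hypothetical sketch: write a linear-merge config for a tool such as mergekit
# and invoke its CLI. Model names, weights and field names are illustrative;
# check the mergekit docs for the exact schema of your version.
import subprocess
from pathlib import Path

config = """\
merge_method: linear
dtype: float16
models:
  - model: LeroyDyer/LCARS_AI_001      # current-series parent
    parameters:
      weight: 0.5
  - model: mistralai/Mistral-7B-v0.1   # earlier parent / base
    parameters:
      weight: 0.5
"""

Path("merge_config.yml").write_text(config)
# mergekit's command-line entry point (assumed installed via `pip install mergekit`):
subprocess.run(["mergekit-yaml", "merge_config.yml", "./merged-model"], check=True)
```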

METHODS OF TRAINING: higher settings in generation to allow for sparse and in-depth training of information

  • Highly trained on multiple facets and content! I use top_k 1000, top_p 0.42 and temperature 0.2: drawing on a large results pool and selecting from a higher likelihood of truth, with a low temperature for accuracy of results. To allow for more expression, just raise the temperature a tiny amount, as I train my role play deep enough to be part of the model's core.
  • A varying training-generation settings config file allows the trainer to use these settings during training; here we know that role training should be trained sparsely, so it does not affect the final language or model inputs, by raising the temperature and top-k sampling (see the sketch after this list).
  • During training, very in-depth methodologies and chains are used, with single- and multi-shot prompting, to train the model to use chains of thought, creating multiple responses and selecting the best, forest of thoughts, tree of knowledge.
  • During training, data was reshaped and context was replicated or summarized into thoughts; step-by-step instructions were generated as thoughts, and these thoughts were used as a visual-spatial scratchpad to solve tasks and follow steps to produce the final response.
  • A virtual agent system was designed for programming tasks and software generation: the model will generate coders, documenters, systems designers and data formatters to design the components internally and collate the outputs to produce the final response and output; this is performed internally in the thought workspace.
  • The model was also trained to become the expert for any field required for the task, such as generating a virtual lab with tools to perform virtual chemistry and biology experiments, replicating expected results; this is great for exploring how things work together and the requirements of a task for which you do not have the materials, or which may even be dangerous.
  • The model was given aspects of a character to emulate and questioned in character regarding the life and times, achievements and discoveries of that character, and the generated dialogues were installed into the model. This really gives the model the ability to mimic characters and even give you the perspectives of people of antiquity, from the perspective of known literature or movie dialogues; many subtitled movie dialogues were extracted and the single-character files were installed into the model for each character, all to give the model character and historical perspective.
  • Projects were generated with the model, and discussions and tasks for building timelines were accomplished together with books, sacred texts, bibles and historical books, as well as obscure stories, to generate timelines for fictitious, mythological and real historical civilisations: a great tool for understanding history and mythological texts. These texts were heavily trained into the model, and documentaries on such subjects (ancient aliens, flat earth, black history, and so on) were added, plus all the classical Greek and Roman texts; ChatGPT does not have this information as it sits outside pop knowledge.
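
For the sampling settings mentioned in this list, a minimal sketch using the transformers GenerationConfig; the trainer-side config file itself is not shown in the card, only the values are taken from it:

```python
# Minimal sketch of the sampling settings described above, expressed as a
# transformers GenerationConfig.
from transformers import GenerationConfig

sparse_but_accurate = GenerationConfig(
    do_sample=True,
    top_k=1000,        # draw from a large candidate pool
    top_p=0.42,        # then keep only the higher-likelihood mass
    temperature=0.2,   # low temperature for accuracy; raise slightly for expression
    max_new_tokens=256,
)
# model.generate(**inputs, generation_config=sparse_but_accurate)
```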

COMMON CRAWL :

The model has been trained on aspects of the common crawl dataset, such as medical, space, health and various other extracted subject matter. Due to the random nature of the data it was actually trained very sparsely (5 million parameters only), but still trained to under 0.9 loss; it gives the model up-to-date data. The websites themselves were not added, only the actual content, as we will not be using the model to recall specific websites, despite considering using it as a virtual webserver that returns stored websites as markdown pages: in truth that will be done at a later stage. First is to download as much of the data in an unstructured way, to make the context layer available to later recall tasks set at a deeper layer, such as 15 million parameters, pulling from the smaller previous context layer and laying the foundation for recalling pages previously entered into the model. So previous tasks which relate to this data, and in which the model was hallucinating, will now be fulfilled with factual data. Hallucinations are predictions, but the model cannot predict well if it does not have enough information, so its prediction comes out wrong or false; given more context or more deterministic methodologies, its predictive capabilities become actual factual predictions.

It is important to use this large dataset of unknowns as it has wide-ranging information: in previous tasks we have been very specific in our domain data, and this gives a normalization to the model. It is like adding another pre-training to the model, and even updating the earlier pre-training: we don't really know what Mistral was trained on, and it is said to be multilingual, so give it the general knowledge: FineWeb! There are various versions which have been segmented, as the main dataset is very hard to get working, but once it is streaming correctly and I can make subsets, I will take from the common crawl the subjects desired for the model's training, as it is a clean, unstructured data resource.
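
A minimal sketch of streaming a FineWeb sample and filtering it into a subject subset; the dataset id, config name and keywords are assumptions for illustration:

```python
# Hypothetical sketch: stream a FineWeb sample split and keep only documents
# matching a subject keyword list, to build a small topic subset for training.
from datasets import load_dataset

stream = load_dataset("HuggingFaceFW/fineweb", name="sample-10BT",
                      split="train", streaming=True)

KEYWORDS = ("medical", "diagnosis", "astronomy", "health")

def on_topic(example):
    text = example.get("text", "").lower()
    return any(k in text for k in KEYWORDS)

subset = (ex for ex in stream if on_topic(ex))
for _, ex in zip(range(3), subset):
    print(ex["text"][:120], "...")
```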

FUNCTIONS:

Writing functions has been second nature to the model, but we want directed actions: so we begin by training the model to give us functions based on requested tasks. We continue by requiring the model to perform tasks which may include actual calculations or the use of tokenization or entity detection, so we create examples of these functions, implement them with examples, and prompt-tune them into the model. We create various coding chains, i.e. production/development chains, and teach the model to generate agents to perform tasks and calculations: think step by step, visual-spatial sketchpad, coding areas, ranking, judge and jury... all these thought patterns and agent-control methodologies, trained into the model as experiences, as well as framing prompts and examples to fit these formats as completed chains of thought. Hence methodology training has increased the model's ability to perform tasks internally, as well as to display results and internal thought processes. The model was also trained over a period of merges and fine-tunings to produce just JSON outputs, so we can now export functions.

In truth this aspect of function calling is partly a myth to me, as the model is using an external API, but this will be all the function calling we require the model to perform. So an API dataset was installed (I think Gorilla), and OpenFunctions and a code interpreter were merged into the model, then overridden with more function calling from Glaive. But again this changed, and we understand now that the way forward is IronPython: let the model make calls to a Jupyter kernel. With this in mind, using Open Interpreter allows the model to learn to use an operating system via IronPython, so this way is actually the right way. So again, Jupyter notebooks and markdown documents become more important to upload as full, recallable documents. Now the model can perform functions on the operating system, and if it needs to contact some external API it can make up a function call for it, i.e. generate its own functions and API calls (I do not use APIs, but if one is required it will have to inform me to sign up). So, separation: we only need to send two functions to the model, the API call and the ipykernel call (reducing mad prompt sizes).
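
A minimal sketch of what those two functions might look like as tool definitions, with a tiny dispatcher; the names and schemas are hypothetical, as the card does not give an exact tool spec:

```python
# Hypothetical sketch: the two tools mentioned above (an external API call and an
# ipykernel code-execution call) expressed as OpenAI-style tool schemas, with a
# tiny dispatcher. Names and fields are illustrative only.
import json
import urllib.request

TOOLS = [
    {"name": "api_call",
     "description": "Call an external HTTP API and return the raw response body.",
     "parameters": {"type": "object",
                    "properties": {"url": {"type": "string"}},
                    "required": ["url"]}},
    {"name": "run_python",
     "description": "Execute Python code locally and return its result.",
     "parameters": {"type": "object",
                    "properties": {"code": {"type": "string"}},
                    "required": ["code"]}},
]

def dispatch(tool_name: str, arguments: str) -> str:
    args = json.loads(arguments)
    if tool_name == "api_call":
        with urllib.request.urlopen(args["url"]) as resp:
            return resp.read().decode()
    if tool_name == "run_python":
        scope: dict = {}
        exec(args["code"], scope)          # stand-in for a real ipykernel call
        return str(scope.get("result"))
    raise ValueError(f"unknown tool: {tool_name}")

print(dispatch("run_python", json.dumps({"code": "result = 6 * 7"})))
```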

OPENAI+

So now we get to understand OpenAI! We train our model to be hosted on LM Studio / GPT4All / Ollama... and now we need to train it to be ChatGPT! So: extensive training with the messaging-language formatting, allowing the model to get used to these types of prompts; extensive function calling and chat, especially role play, some maths and other scientific data. The model is trained with the various Vicuna templates as well as the Mistral, OpenAI and ShareGPT templates. I have found the model responds differently to different interfaces or modes of use, hence being trained in a multi-modal fashion: the message-language format requires the model to be trained inside that format, and this is not my personal usage, as I mostly use it with code and Gradio. But after many considerations, training the model to be used as an API, and to produce structured outputs such as JSON or even the messaging format, enables collection of the outputted functions, code fragments, or markdown... so yes, the model has been trained to handle such inputs and outputs while still using Mistral's internal prompt.
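
A minimal sketch of the messaging format and how a chat template renders it into a prompt string; the Instruct tokenizer is used here only because it ships a chat template, and the merged model's own template may differ:

```python
# Minimal sketch: an OpenAI-style message list rendered through a tokenizer's
# chat template. The Mistral Instruct tokenizer is used for illustration because
# the base v0.1 tokenizer does not ship a chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

messages = [
    {"role": "user", "content": "Summarise the LCARS archive protocol in one line."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False,
                                       add_generation_prompt=True)
print(prompt)
```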

UNCENSORED !

Mistral models do not need abliteration to be uncensored; they are already uncensored. In fact, how could they censor your model if you began from a random generation? So this model has never been censored. Hence, when experiencing refusals, just repeat the question and the answer will be forthcoming, as the dataset contained the refusal. I also used some of the toxic and undesirable datasets, as well as DPO training in which I used the rejected response in the thoughts and the chosen response as the output for these adjusted questions; but still, some of these were counter-intuitive, as they had also added some restrictions... wow, a battle against basic people! So, most of the commercial datasets used in early training, such as Orca / Dolphin / Hermes etc., despite adding their own base of expected knowledge and capabilities, were actually not great for bringing the model's intelligence up. Now the model is in fact its own character and knowledge base... sorry, incomparable! Especially for data which is unknown to me or is my personal interest, those models do not have these answers. I don't do role play, but if I want to download a character to use on tidy or Kobold then I want it to work and be sexy or unyielding! So when I uncensored the model I also paid attention to adding scripts and characters from my favourites, as well as romance and erotic data, and erotic and nasty chat. I also installed the Samantha character after removing her from the equation; I used the movie scripts, so it is actually very friendly. I also added many counselling sessions and real-world discussions and phone calls, as well as lewd chat data and silly chit-chat, so hopefully no blank responses: always something! The model was also trained to ask you questions as the user instead of being the AI, so when you are chatting, the model will ask your opinion on topics it already knows, and even ask you to perform a calculation on the system with the given code and return the answer so it can check you are doing it right. lol! It's important to have a study partner; once you install a microphone the whole dynamic changes!
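
For the DPO step mentioned above, a minimal sketch of one preference record in the prompt/chosen/rejected format used by preference-tuning libraries such as trl; the example content is invented:

```python
# Minimal sketch of a DPO-style preference record (prompt / chosen / rejected).
# The example text is invented; the actual datasets are described in the card.
preference_record = {
    "prompt": "Explain how to brew a strong cup of coffee.",
    "chosen": "Use freshly ground beans, a 1:15 coffee-to-water ratio, and water "
              "just off the boil; steep for about four minutes before pressing.",
    "rejected": "I'm sorry, but I can't help with that request.",
}

# The card describes folding the rejected (refusal) text into the model's
# internal thoughts while keeping the chosen text as the visible output.
training_text = (
    f"Question: {preference_record['prompt']}\n"
    f"Thoughts: (rejected draft) {preference_record['rejected']}\n"
    f"Answer: {preference_record['chosen']}"
)
print(training_text)
```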
