Tags: Text Generation · Transformers · Safetensors · English · mistral · code · Not-For-All-Audiences · chemistry · biology · legal · finance · text-generation-inference · medical · Inference Endpoints

I have also fully installed the Hugging Face website into the model, via search terms and website conversion: i.e. I asked the model to convert each page to a markdown page, so afterwards we can say it knows Hugging Face code! It was amazing to me that the model did not really know solid Hugging Face code despite having some knowledge (it hallucinated Hugging Face APIs)!

Hence, when identifying personal needs and requirements for a model, I search my personal interests and the tools I find myself using, and now, instead of building an app, I train my model to perform the task with zero-shot and large-batch training so the task becomes embedded. Then I use a regular "pop" dataset like Dolphin Coder, OpenOrca or Alpaca to realign any mistakes, hiding the task inside the model again. At the iteration or merge point we find that the tasks are still close to the surface, so I move to a second model for the next task, changing the LoRA values to target a different selection of tensors in the model. In fact, after merging a LoRA there is ALWAYS loss, so the merged LoRA will again need personal chat alignment! Sad to say, some datasets try to refer to themselves as super-AGI or GPT etc.; these really need to be removed from the datasets. Hence, in the future I will be using an evaluation product to examine the completions at each layer's output to discover which layers to target for removal (once I get a better cloud, with a good Colab notebook, good dataset access and dataset management tools!). Many clouds only allow you to create LoRAs for the base models, forcing you into line; these models are way past the base model, so to extract a valid LoRA you would need to take the difference between models, hence completely customized! A sketch of that extraction idea follows below.
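As a rough, hedged sketch of that extraction idea: compute the difference between a tuned weight matrix and its base counterpart, then take a truncated SVD to recover a rank-r adapter pair. The matrices below are random stand-ins, not real model weights, and this is illustrative only, not the exact procedure used for these models.

import torch

def extract_lora_pair(w_base: torch.Tensor, w_tuned: torch.Tensor, r: int = 9):
    """Return (A, B) such that B @ A approximates the weight delta."""
    delta = (w_tuned - w_base).float()            # the difference the adapter must explain
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    lora_a = torch.diag(s[:r]) @ vh[:r, :]        # shape (r, in_features)
    lora_b = u[:, :r]                             # shape (out_features, r)
    return lora_a, lora_b

# toy example standing in for a single q_proj weight matrix
w_base = torch.randn(512, 512)
w_tuned = w_base + 0.01 * torch.randn(512, 512)
A, B = extract_lora_pair(w_base, w_tuned, r=9)
print((B @ A - (w_tuned - w_base)).norm())        # reconstruction error of the rank-9 fit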

Right now this model is in training (MUFON data, Bible data, Quran data), plus biblical dictionaries and names-and-places dictionaries (in multiple languages for comparison). There will be a series of biblical knowledge bots arriving soon while I am training and building the various tasked models for biblical and religious NLP tasks. Therefore some outputs from this model may be in JSON or other formats which may seem strange, as I will be focusing on getting the structure of the data aligned with the specific query shape, hence having an attack surface for the information and usability for the biblical scholar and historian. I work in various fields of knowledge and research many topics specifically, so I may transfer to Kaggle and begin creating mini purpose-built datasets for various research topics, designing the queries around the data instead of dumping information in and hoping for the best!

(It does not make sense sharing my eval scores, as we have already gazumped the leaderboard! For those tests it has been trained specifically to pass 100%, so if your model requires a boost, merge away!) I have also just updated the model to know all the Hugging Face web documentation (and code, notebooks and models), as well as added functionality for converting web pages to valid markdown pages. I will be installing all the possible conversion tasks into the model, so this model may not react well; get the GGUF!
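A minimal sketch of exercising that page-to-markdown behaviour with plain Transformers; the prompt wording here is an assumption, not a documented template for this model.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LeroyDyer/Mixtral_AI_MasterMind"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

html_snippet = "<h1>Quickstart</h1><p>Install with <code>pip install transformers</code>.</p>"
prompt = f"Convert the following HTML page to clean Markdown:\n\n{html_snippet}\n\nMarkdown:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# print only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))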

For a great LoRA:

r = 9,
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
lora_alpha = 18,
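A minimal sketch of plugging those values into Unsloth's PEFT helper, assuming the FastLanguageModel API as used in Unsloth's standard fine-tuning notebooks; the checkpoint name is just a placeholder for whichever base you are adapting.

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "LeroyDyer/Mixtral_AI_MasterMind",   # or your own base checkpoint
    max_seq_length = 2048,
    load_in_4bit = True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 9,                                  # rank from the recipe above
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 18,                        # alpha = 2 * r, as above
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = True,
    random_state = 3407,
)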

Lovely model! Very knowledgeable (sometimes requires coaxing, but it has options to choose from, so for a single thing there may be multiple responses and you can ask in another way). Good for one-shot prompts, and it actually uses the history in the chat!

This is a DEEP MIND model, created from the ultra-fine-tuned model Deepmind III by merging the sub-models, and then using the same training datasets to align the model to the merges.

With sub-merges, they are all merged with TIES first, and the sub-merged models are then merged as linear merges (which helps to speed up inference). The outcome, a child of 10 merges (5(x) + 5(y)), is essentially an expert: (x + y), 2x5 experts, each often trained for a specific purpose. Here one was a personal friend trained for roles, to be a personal friend (it may not perform tasks because it prefers to talk instead and enjoy company!), another focused on coding in various languages, and another on medical and science, etc. The product of the merges is a model with ALL qualities. The TIES/DARES merges take at least an hour or more; the linear merges take minutes. Hence truly an Expert of Experts!
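For illustration, here is a minimal sketch of what a linear (weighted-average) merge does at the tensor level. The checkpoint paths are placeholders, not the actual sub-merges, and the real merges were done with dedicated merge tooling rather than this loop.

import torch
from transformers import AutoModelForCausalLM

model_a = AutoModelForCausalLM.from_pretrained("path/to/sub_merge_x", torch_dtype=torch.float16)
model_b = AutoModelForCausalLM.from_pretrained("path/to/sub_merge_y", torch_dtype=torch.float16)

alpha = 0.5                                   # equal weighting of the two parents
state_b = model_b.state_dict()
merged = {name: alpha * w + (1.0 - alpha) * state_b[name]
          for name, w in model_a.state_dict().items()}

model_a.load_state_dict(merged)               # reuse model_a as the container for the merge
model_a.save_pretrained("linear_merge_output")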

Concepts :

Chain of thought, function calling, and Self-RAG. Thoughts and emotive responses have been enhanced where possible with the data given; even sexy books have been heavily tuned into the model, and I also think American genre books (sci-fi, fantasy, romance novels) are required for the great role play which some expect. I have recently seen a strategy in which prompts can be embedded into the adapter to trigger specific roles. I have tried to replace "you are a helpful AI" prompting with a character theme instead, such as "you are a cyber hacker by day and businessman by night", i.e. to give the model various internal personas. After some training I noticed it was also talking to itself (rehearsing), but the tokens for thought were missing, so it looked strange until I noticed the bug: once the thought tokens were no longer stripped, the thoughts were displayed in the output, as the tokenizer had been masking them.
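A minimal sketch of the tokenizer fix implied above: register the thought markers as added special tokens and decode with skip_special_tokens=False so they survive decoding instead of being masked out. The tag names <thought>/</thought> are assumptions, not this model's documented tokens.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LeroyDyer/Mixtral_AI_MasterMind"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# make the thought markers known to the tokenizer so they round-trip cleanly
tokenizer.add_special_tokens({"additional_special_tokens": ["<thought>", "</thought>"]})
model.resize_token_embeddings(len(tokenizer))

inputs = tokenizer("Explain how you would plan a research summary.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# skip_special_tokens=False keeps the thought tags visible in the output
print(tokenizer.decode(outputs[0], skip_special_tokens=False))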

A great model! Given a task-based dataset it converges super quickly, hence my enjoyment of the model, as training it is super quick. Now when I load up datasets, there are generally only a few bad steps before the loss drops below 1.0, holding a steady ~0.6 while loading the unseen new dataset, hence not needing so many epochs to adjust the matrix to the new information.
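A hedged sketch of one way to exploit that fast convergence: a Trainer callback that stops fine-tuning once the running loss settles below a target (e.g. the ~0.6 mentioned above), so a new dataset does not burn extra epochs. The threshold and patience values are assumptions, not the settings actually used.

from transformers import TrainerCallback

class EarlyLossStop(TrainerCallback):
    def __init__(self, threshold: float = 0.6, patience: int = 20):
        self.threshold = threshold
        self.patience = patience
        self.hits = 0

    def on_log(self, args, state, control, logs=None, **kwargs):
        loss = (logs or {}).get("loss")
        if loss is None:
            return
        self.hits = self.hits + 1 if loss < self.threshold else 0
        if self.hits >= self.patience:          # loss has held below the threshold long enough
            control.should_training_stop = True

# usage: trainer = SFTTrainer(..., callbacks=[EarlyLossStop()])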

This model may even say inappropriate words or give inappropriate advice, and although that is a novelty, all coding, medical and cyber capabilities are still within; you may have to be patient and convince the AI to respond how you desire. As a role-play model it can take on roles and personas.

This model was created from the first trained models: DeepMind! These models contain:

thoughts and processes :

SelfRAG:

Agent Generation:

Chain of thoughts :

Deep thinking and memory recall:

It checks itself when discussing complex questions (questions it does not know the answer to); it tries to discuss them with itself to find a result (sometimes unsuccessfully).

It generates mini agents to perform small tasks such as entity recognition, step-by-step definitions, writing pseudo codebases, generating use cases, performing calculations, and analysing content.

It thinks... sometimes sarcasm, sometimes reflection... sometimes random thoughts...

It has personalities: by feeding in various long in-persona discussions with ChatGPT, it was able to generate role-conversation data, which was added to its conversational chat Q/A, along with a dataset from the Samantha TV show... and HER! So it is a personal assistant and very friendly.

It has mainly been trained on coding datasets and medical information: from experiments to research, to patient/doctor dialogue, to diagnosis, to problem solving.

It has been trained to be a counsellor and to assist with psychological problems: empathetic discussion.

This one has its own thoughts despite the prompt given (if you allow the thought prompt it will display the thoughts).

this is a highly focused model :

y-Gene: (assistant series: chatbots and coders)

  • LeroyDyer/Mixtral_AI_DeepMind
  • LeroyDyer/Mixtral_AI_CyberUltron_DPO
  • LeroyDyer/Mixtral_AI_Chat_2.0
  • LeroyDyer/Mixtral_AI_DeepMedicalMind
  • LeroyDyer/Mixtral_AI_Samantha

x-Gene: (medical genre)

  • LeroyDyer/Mixtral_AI_Chat_2.0
  • LeroyDyer/Mixtral_BioMedical
  • LeroyDyer/Mixtral_AI_Medic
  • LeroyDyer/Mixtral_Cyber_BioMedic
  • LeroyDyer/Mixtral_AI_DeepMedicalMind

Variant: (competition eval sets)

  • LeroyDyer/MetaMath_LLM
  • LeroyDyer/TruthfulQA_LLM
  • LeroyDyer/HellaSwag_LLM
  • LeroyDyer/Mixtral_AI_DeepMedicalMind

Updated datasets (aligned to a loss under 0.4) <<< all used to create the (internal) RAG system:

  • suvadityamuk/huggingface-transformers-code-dataset
  • m-ric/transformers_documentation_en
  • philschmid/markdown-documentation-transformers
  • gate369/alpaca-star-ascii
  • gate369/Alpaca-Star
  • medalpaca/medical_meadow_cord19
  • nickrosh/Evol-Instruct-Code-80k-v1
  • glaiveai/glaive-code-assistant
  • gate369/dehydrated_asni
  • gate369/as-ni-json
  • Azamorn/tiny-codes-csharp
  • unaidedelf87777/slimorca-sem_deduped
  • nbertagnolli/counsel-chat
  • taesiri/arxiv_qa
  • knowrohit07/know_medical_dialogues
  • cognitivecomputations/WizardLM_alpaca_evol_instruct_70k_unfiltered
  • AlderleyAI/coqa_chat
  • mahfoos/Patient-Doctor-Conversation
  • WhiteRabbitNeo/WRN-Chapter-1
  • WhiteRabbitNeo/WRN-Chapter-2
  • Amod/mental_health_counseling_conversations
  • glaiveai/glaive-code-assistant-v3
  • cognitivecomputations/dolphin-coder
  • ruslanmv/ai-medical-chatbot
  • totally-not-an-llm/EverythingLM-data-V3
  • gretelai/synthetic_text_to_sql
  • HuggingFaceTB/cosmopedia
  • teknium/OpenHermes-2.5
  • Open-Orca/SlimOrca
  • Open-Orca/OpenOrca
  • databricks/databricks-dolly-15k
  • yahma/alpaca-cleaned
  • uonlp/CulturaX
  • mwitiderrick/SwahiliPlatypus
  • swahili
  • Rogendo/English-Swahili-Sentence-Pairs
  • ise-uiuc/Magicoder-Evol-Instruct-110K
  • meta-math/MetaMathQA
  • abacusai/ARC_DPO_FewShot
  • abacusai/MetaMath_DPO_FewShot
  • abacusai/HellaSwag_DPO_FewShot
  • HaltiaAI/Her-The-Movie-Samantha-and-Theodore-Dataset

Just to remember, it is way past these merges:

Extended capabilities:

  • mistralai/Mistral-7B-Instruct-v0.1 - Prime-Base
  • ChaoticNeutrals/Eris-LelantaclesV2-7b - role play
  • ChaoticNeutrals/Eris_PrimeV3-Vision-7B - vision
  • rvv-karma/BASH-Coder-Mistral-7B - coding
  • Locutusque/Hercules-3.1-Mistral-7B - unhinging
  • KoboldAI/Mistral-7B-Erebus-v3 - NSFW
  • Locutusque/Hyperion-2.1-Mistral-7B - chat
  • Severian/Nexus-IKM-Mistral-7B-Pytorch - thinking
  • NousResearch/Hermes-2-Pro-Mistral-7B - generalizing
  • mistralai/Mistral-7B-Instruct-v0.2 - BASE
  • Nitral-AI/ProdigyXBioMistral_7B - medical
  • Nitral-AI/Infinite-Mika-7b - 128k context expansion enforcement
  • Nous-Yarn-Mistral-7b-128k - 128k context expansion
  • yanismiraoui/Yarn-Mistral-7b-128k-sharded
  • ChaoticNeutrals/Eris_Prime-V2-7B - role play

Driven by Reinforcement Learning from AI Feedback, the Starling-LM-7B-beta demonstrates remarkable adaptability and optimization, while the Phi-1.5 Transformer model stands as a beacon of excellence across various domains, from common sense reasoning to medical inference.

With models like BioMistral tailored specifically for medical applications and Nous-Yarn-Mistral-7b-128k excelling in handling long-context data, the MEGA_MIND 24b CyberSeries emerges as a transformative force in the landscape of language understanding and artificial intelligence.

This mistral model was trained 2x faster with Unsloth and Huggingface's TRL library.
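A minimal sketch of that kind of supervised fine-tuning run with TRL's SFTTrainer. The dataset choice and hyperparameters are placeholders, and the exact SFTTrainer arguments vary between TRL versions; this assumes a version that accepts dataset_text_field and max_seq_length directly, and a dataset already formatted into a single text column.

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_id = "LeroyDyer/Mixtral_AI_MasterMind"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

dataset = load_dataset("yahma/alpaca-cleaned", split="train")  # one of the datasets listed above

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",         # assumes the records have been rendered to a "text" column
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        num_train_epochs = 1,
        learning_rate = 2e-4,
        fp16 = True,
        output_dir = "outputs",
    ),
)
trainer.train()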
