
Model card for JOSIExHercules-3.1-Mistral-7B_only_spetial_tokens

This is my token-customized Locutusque/Hercules-3.1-Mistral-7B model.

Original Model

This is based on the Locutusque/Hercules-3.1-Mistral-7B model with added custom special tokens. It will most likely serve as the base for my next model, trained on my own dataset.

Prompt Format:

<|im_start|>system
{message}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>call
{function call message}<|im_end|>
<|im_start|>function
{function response message}<|im_end|>
<|im_start|>assistant
{assistant message}</s>
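
For illustration, here is a minimal Python sketch of how messages could be rendered into this format. The build_prompt helper and the example messages are illustrative only and not part of the released code.

```python
# Minimal sketch: render chat messages into the <|im_start|>/<|im_end|> format
# shown above. The helper name and message structure are illustrative only.

def build_prompt(messages, add_generation_prompt=True):
    """messages is a list of {"role": ..., "content": ...} dicts, where role is
    one of: system, user, call, function, assistant."""
    prompt = ""
    for msg in messages:
        # Per the template, assistant turns close with </s>, all others with <|im_end|>.
        end = "</s>" if msg["role"] == "assistant" else "<|im_end|>"
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}{end}\n"
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        prompt += "<|im_start|>assistant\n"
    return prompt


print(build_prompt([
    {"role": "system", "content": "You are J.O.S.I.E., a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]))
```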

Training Data

Hercules-3.1-Mistral-7B is fine-tuned from the following sources:

  • cognitivecomputations/dolphin
  • Evol Instruct 70K & 140K
  • teknium/GPT4-LLM-Cleaned
  • jondurbin/airoboros-3.2
  • AlekseyKorshuk/camel-chatml
  • CollectiveCognition/chats-data-2023-09-22
  • Nebulous/lmsys-chat-1m-smortmodelsonly
  • glaiveai/glaive-code-assistant-v2
  • glaiveai/glaive-code-assistant
  • glaiveai/glaive-function-calling-v2
  • garage-bAInd/Open-Platypus
  • meta-math/MetaMathQA
  • teknium/GPTeacher-General-Instruct
  • GPTeacher roleplay datasets
  • BI55/MedText
  • pubmed_qa labeled subset
  • Unnatural Instructions
  • M4-ai/LDJnr_combined_inout_format
  • CollectiveCognition/chats-data-2023-09-27
  • CollectiveCognition/chats-data-2023-10-16
  • NobodyExistsOnTheInternet/sharegptPIPPA
  • yuekai/openchat_sharegpt_v3_vicuna_format
  • ise-uiuc/Magicoder-Evol-Instruct-110K
  • sablo/oasst2_curated

Newly Added Special Tokens

  • <|functions|>
  • <|system|>
  • <|gökdeniz|>
  • <|user|>
  • <|josie|>
  • <|assistant|>
  • <|function_call|>
  • <|function_response|>
  • <|image|>
  • <|long_term_memory|>
  • <|short_term_memory|>
  • <|home_state|>
  • <|current_states|>
  • <|context|>

New BOS and EOS Tokens

BOS = '<|startoftext|>'
EOS = '<|endoftext|>'
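
Below is a minimal, hedged sketch of how special tokens like these could be added to the base tokenizer and model using the standard transformers API; it is illustrative and not the exact script used to produce this checkpoint. The enlarged 32016-entry embedding shown in the architecture section below reflects the expanded vocabulary.

```python
# Minimal sketch (not the exact procedure used for this checkpoint): add the
# special tokens listed above to the base tokenizer and grow the embeddings.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "Locutusque/Hercules-3.1-Mistral-7B"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

new_special_tokens = [
    "<|functions|>", "<|system|>", "<|gökdeniz|>", "<|user|>", "<|josie|>",
    "<|assistant|>", "<|function_call|>", "<|function_response|>", "<|image|>",
    "<|long_term_memory|>", "<|short_term_memory|>", "<|home_state|>",
    "<|current_states|>", "<|context|>",
]
tokenizer.add_special_tokens({
    "bos_token": "<|startoftext|>",
    "eos_token": "<|endoftext|>",
    "additional_special_tokens": new_special_tokens,
})
# The embedding matrix must grow to cover the new vocabulary entries.
model.resize_token_embeddings(len(tokenizer))
```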

Original Instruction Prompt Format:

<s>[INST] What is your favourite condiment? [/INST] Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>...
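
For comparison with the ChatML-style format above, a small sketch that assembles this original [INST] format; the helper is illustrative only.

```python
# Illustrative helper (not part of the released code) that reproduces the
# original Mistral instruct format shown above.
def build_mistral_instruct(user_message, assistant_reply=None):
    prompt = f"<s>[INST] {user_message} [/INST]"
    if assistant_reply is not None:
        # Completed turns append the assistant reply and close with </s>.
        prompt += f" {assistant_reply}</s>"
    return prompt


print(build_mistral_instruct("What is your favourite condiment?"))
```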

Model Architecture:

MistralForCausalLM(
  (model): MistralModel(
    (embed_tokens): Embedding(32016, 4096)
    (layers): ModuleList(
      (0-31): 32 x MistralDecoderLayer(
        (self_attn): MistralSdpaAttention(
          (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear(in_features=4096, out_features=1024, bias=False)
          (v_proj): Linear(in_features=4096, out_features=1024, bias=False)
          (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): MistralRotaryEmbedding()
        )
        (mlp): MistralMLP(
          (gate_proj): Linear(in_features=4096, out_features=14336, bias=False)
          (up_proj): Linear(in_features=4096, out_features=14336, bias=False)
          (down_proj): Linear(in_features=14336, out_features=4096, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): MistralRMSNorm()
        (post_attention_layernorm): MistralRMSNorm()
      )
    )
    (norm): MistralRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=32016, bias=False)
)
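
A minimal usage sketch, assuming the checkpoint is loadable with the standard transformers API under the repo id shown on this page; adjust the repo id to wherever the weights are actually hosted.

```python
# Minimal loading sketch (assumes the repo id below and the standard
# transformers API).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Isaak-Carter/J.O.S.I.E.-x-Hercules-3.1-Mistral-7B-only-spetial-tokens"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto")

print(model)                           # module tree as shown above
print(model.get_input_embeddings())    # Embedding(32016, 4096) after the resize

# A prompt built with the build_prompt helper from the sketch above can be
# tokenized and passed to model.generate() for inference.
```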

Quants

ExLlamaV2 quants of the base Hercules-3.1-Mistral-7B by bartowski: https://huggingface.co/bartowski/Hercules-3.1-Mistral-7B-exl2

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                               Value
Avg.                                 62.09
AI2 Reasoning Challenge (25-Shot)    61.18
HellaSwag (10-Shot)                  83.55
MMLU (5-Shot)                        63.65
TruthfulQA (0-shot)                  42.83
Winogrande (5-shot)                  79.01
GSM8k (5-shot)                       42.30