Model card of JOSIExHercules-3.1-Mistral-7B_only_spetial_tokens
This is my Token customized Locutusque/Hercules-3.1-Mistral-7B model
This is based on Locutusque/Hercules-3.1-Mistral-7B model with added custom special Tokens. This wil most likely be my next Model, trained on my own Dataset.
<|im_start|>system\n{message}<|im_end|>\n<|im_start|>user\n{user message}<|im_end|>\n<|im_start|>call\n{function call message}<|im_end|>\n<|im_start|>function\n{function response message}<|im_end|>\n<|im_start|>assistant\n{assistant message}</s>
Training Data
Hercules-3.1-Mistral-7B is fine-tuned from the following sources:
cognitivecomputations/dolphin
Evol Instruct 70K & 140K
teknium/GPT4-LLM-Cleaned
jondurbin/airoboros-3.2
AlekseyKorshuk/camel-chatml
CollectiveCognition/chats-data-2023-09-22
Nebulous/lmsys-chat-1m-smortmodelsonly
glaiveai/glaive-code-assistant-v2
glaiveai/glaive-code-assistant
glaiveai/glaive-function-calling-v2
garage-bAInd/Open-Platypus
meta-math/MetaMathQA
teknium/GPTeacher-General-Instruct
GPTeacher roleplay datasets
BI55/MedText
pubmed_qa labeled subset
Unnatural Instructions
M4-ai/LDJnr_combined_inout_format
CollectiveCognition/chats-data-2023-09-27
CollectiveCognition/chats-data-2023-10-16
NobodyExistsOnTheInternet/sharegptPIPPA
yuekai/openchat_sharegpt_v3_vicuna_format
ise-uiuc/Magicoder-Evol-Instruct-110K
sablo/oasst2_curated
<|im_start|>system
{message}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>call
{function call message}<|im_end|>
<|im_start|>function
{function response message}<|im_end|>
<|im_start|>assistant
{assistant message}</s>
New added Special Tokens
'<|functions|>',
'<|system|>',
'<|gökdeniz|>',
'<|user|>',
'<|josie|>',
'<|assistant|>',
'<|function_call|>',
'<|function_response|>',
'<|image|>',
'<|long_term_memory|>',
'<|short_term_memory|>',
'<|home_state|>',
'<|current_states|>',
'<|context|>'
New BOS and EOS Tokens
BOS = '<|startoftext|>'
EOS = '<|endoftext|>'
Origional Instruction Prompt Format:
<s>[INST] What is your favourite condiment? [/INST] Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>...
Model Architecture:
MistralForCausalLM(
(model): MistralModel(
(embed_tokens): Embedding(32016, 4096)
(layers): ModuleList(
(0-31): 32 x MistralDecoderLayer(
(self_attn): MistralSdpaAttention(
(q_proj): Linear(in_features=4096, out_features=4096, bias=False)
(k_proj): Linear(in_features=4096, out_features=1024, bias=False)
(v_proj): Linear(in_features=4096, out_features=1024, bias=False)
(o_proj): Linear(in_features=4096, out_features=4096, bias=False)
(rotary_emb): MistralRotaryEmbedding()
)
(mlp): MistralMLP(
(gate_proj): Linear(in_features=4096, out_features=14336, bias=False)
(up_proj): Linear(in_features=4096, out_features=14336, bias=False)
(down_proj): Linear(in_features=14336, out_features=4096, bias=False)
(act_fn): SiLU()
)
(input_layernorm): MistralRMSNorm()
(post_attention_layernorm): MistralRMSNorm()
)
)
(norm): MistralRMSNorm()
)
(lm_head): Linear(in_features=4096, out_features=32016, bias=False)
)
Quants
ExLlamaV2 by bartowski https://huggingface.co/bartowski/Hercules-3.1-Mistral-7B-exl2
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 62.09 |
AI2 Reasoning Challenge (25-Shot) | 61.18 |
HellaSwag (10-Shot) | 83.55 |
MMLU (5-Shot) | 63.65 |
TruthfulQA (0-shot) | 42.83 |
Winogrande (5-shot) | 79.01 |
GSM8k (5-shot) | 42.30 |
- Downloads last month
- 9
Finetuned from
Dataset used to train Isaak-Carter/J.O.S.I.E.-x-Hercules-3.1-Mistral-7B-only-spetial-tokens
Evaluation results
- normalized accuracy on AI2 Reasoning Challenge (25-Shot)test set Open LLM Leaderboard61.180
- normalized accuracy on HellaSwag (10-Shot)validation set Open LLM Leaderboard83.550
- accuracy on MMLU (5-Shot)test set Open LLM Leaderboard63.650
- mc2 on TruthfulQA (0-shot)validation set Open LLM Leaderboard42.830
- accuracy on Winogrande (5-shot)validation set Open LLM Leaderboard79.010
- accuracy on GSM8k (5-shot)test set Open LLM Leaderboard42.300