TeeZee/NEBULA-23.8B-v1.0


NEBULA-23.8B-v1.0

Technical notes

108 layers, built with the DUS (depth up-scaling) procedure: Mistral (32) -> SOLAR (48) -> GALAXY (72) -> NEBULA (108)
23.8B parameters
model created as an extension of the depth up-scaling procedure used by Upstage for SOLAR
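
The layer counts and the 23.8B figure can be sanity-checked with a bit of arithmetic. The sketch below assumes every stage repeats the SOLAR recipe (two copies of the model, the last m layers dropped from the first copy and the first m from the second) and that NEBULA keeps the published Mistral-7B dimensions (hidden size 4096, MLP size 14336, 8 KV heads of size 128, 32000-token vocab, untied embeddings). Only SOLAR's m = 8 is documented by Upstage; the m values for GALAXY and NEBULA are inferred from the layer counts on this card, not taken from the actual merge configs.

```python
def dus_layers(n_layers: int, dropped: int) -> int:
    """Layers after one depth up-scaling step: two copies, `dropped` removed from each."""
    return 2 * (n_layers - dropped)

# (stage name, layers dropped per copy). SOLAR's value comes from the SOLAR paper;
# the GALAXY and NEBULA values are ASSUMED so that the counts match this card.
stages = [("SOLAR", 8), ("GALAXY", 12), ("NEBULA", 18)]

layers = 32  # Mistral-7B starting point
chain = []
for name, dropped in stages:
    layers = dus_layers(layers, dropped)
    chain.append((name, layers))

def param_count(n_layers: int, hidden: int = 4096, mlp: int = 14336,
                kv_dim: int = 1024, vocab: int = 32000) -> int:
    """Rough parameter count for a Mistral-style decoder (ASSUMED dimensions)."""
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim  # q, o and k, v projections
    ffn = 3 * hidden * mlp                            # gate, up, down projections
    norms = 2 * hidden                                # two RMSNorm weights per layer
    embeddings = 2 * vocab * hidden                   # input embeddings + output head
    return n_layers * (attn + ffn + norms) + embeddings + hidden  # + final norm

total = param_count(layers)
print(chain)
print(f"{total / 1e9:.2f}B parameters")
```

Under these assumptions each step drops a quarter of the layers per copy, so the depth grows by 1.5x per stage, and the estimate lands close to the 23.8B reported above.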

Results

model can and will produce NSFW content
GSM8k evaluation often seems to be broken; HellaSwag, Winogrande and TruthfulQA scores show that it's a smart model
RP and ERP work surprisingly well, and I haven't encountered any GPTisms yet
memory footprint comparable to 20B and 23B models based on Llama
follows character cards very well
NSFW output feels fresh compared to existing models
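
For a rough idea of the memory footprint, the weights alone can be estimated from the parameter count. This is a back-of-the-envelope sketch, assuming 23.8B parameters and counting weights only; KV cache, activations and the per-block overhead of real quantization formats come on top, so actual usage will be higher.

```python
PARAMS = 23.8e9  # parameter count from this card

def weight_gib(params: float, bits_per_param: float) -> float:
    """Size of the raw weights in GiB for a given precision."""
    return params * bits_per_param / 8 / 2**30

# Bit widths are nominal; quantized formats usually store extra scales per block.
for label, bits in [("BF16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_gib(PARAMS, bits):.1f} GiB")
```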

Finetuning for RP

SFT using MinervaAI/Aesir-Preview dataset, 10 epochs
DPO using athirdpath/DPO_Pairs-Roleplay-Alpaca-NSFW dataset, 1 epoch
SFT using 1xAda6000, 10h
DPO using 1x3090, 30h
Jupyter notebooks and mergekit configs are available for anyone wanting to reproduce or reuse the scripts; just drop me a message
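
Until you have the real configs, here is a sketch of what a mergekit passthrough config for one depth up-scaling step might look like, built as a Python dict. Everything specific in it is an assumption: the model path `path/to/GALAXY` is a placeholder, the 18-layer drop is inferred from the 72 -> 108 layer counts, and the layer ranges are treated as end-exclusive. It is not the config actually used for NEBULA.

```python
import json

def dus_passthrough_config(model: str, n_layers: int, dropped: int,
                           dtype: str = "bfloat16") -> dict:
    """Two overlapping slices of the same model, stacked with a passthrough merge."""
    return {
        "slices": [
            # first copy: everything except the last `dropped` layers
            {"sources": [{"model": model, "layer_range": [0, n_layers - dropped]}]},
            # second copy: everything except the first `dropped` layers
            {"sources": [{"model": model, "layer_range": [dropped, n_layers]}]},
        ],
        "merge_method": "passthrough",
        "dtype": dtype,
    }

# HYPOTHETICAL inputs: placeholder path, inferred drop of 18 layers per copy.
config = dus_passthrough_config("path/to/GALAXY", n_layers=72, dropped=18)

merged_layers = sum(
    src["layer_range"][1] - src["layer_range"][0]
    for s in config["slices"]
    for src in s["sources"]
)
print(json.dumps(config, indent=2))
print("layers after merge:", merged_layers)
```

The dict would still need to be written out as YAML before being passed to mergekit.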

Prompt template

Alpaca
chat template is embedded in the tokenizer config and should load automatically
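
If your frontend does not pick up the embedded template, the prompt can be built by hand. The sketch below uses the widely used standard Alpaca wording (no input field); the template embedded in the tokenizer config may differ in details such as the system line or whitespace, so treat this as an approximation. `build_alpaca_prompt` is an illustrative helper, not part of any library.

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a single instruction in the standard Alpaca format."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())

prompt = build_alpaca_prompt("Introduce yourself in character as a ship's navigator.")
print(prompt)
```

With the transformers library, calling `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` on a list of role/content messages should render the embedded template instead.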

Context size

4096
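
Long roleplay chats will outgrow 4096 tokens, so the oldest turns have to be dropped at some point. A minimal sketch of that trimming follows, assuming you supply your own token counter; `trim_history`, `rough_count` and the 512-token reply reserve are illustrative choices, and the word-based counter is only a stand-in for the model's real tokenizer.

```python
from typing import Callable, List

CONTEXT_SIZE = 4096  # context size from this card

def trim_history(turns: List[str], count_tokens: Callable[[str], int],
                 reserve_for_reply: int = 512) -> List[str]:
    """Keep the most recent turns that fit in the context, leaving room for the reply."""
    budget = CONTEXT_SIZE - reserve_for_reply
    kept: List[str] = []
    used = 0
    for turn in reversed(turns):       # walk from newest to oldest
        cost = count_tokens(turn)
        if used + cost > budget:
            break                      # this turn and everything older is dropped
        kept.append(turn)
        used += cost
    return list(reversed(kept))        # restore chronological order

def rough_count(text: str) -> int:
    """Crude stand-in: one token per whitespace-separated word."""
    return len(text.split())

history = ["word " * 2000, "word " * 2000, "word " * 1000]
trimmed = trim_history(history, rough_count)
print(len(trimmed), "of", len(history), "turns kept")
```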

All comments are greatly appreciated. Download, test, and if you appreciate my work, consider buying me my fuel:

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	59.94
AI2 Reasoning Challenge (25-Shot)	66.72
HellaSwag (10-Shot)	86.98
MMLU (5-Shot)	65.40
TruthfulQA (0-shot)	57.60
Winogrande (5-shot)	82.95
GSM8k (5-shot)	0.00
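
Since the GSM8k evaluation seems to be broken (see Results), the 0.00 score pulls the average down. The snippet below just redoes the arithmetic on the numbers from the table, once over all six benchmarks and once without GSM8k; the second figure is only an illustration and not an official leaderboard metric.

```python
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 66.72,
    "HellaSwag (10-Shot)": 86.98,
    "MMLU (5-Shot)": 65.40,
    "TruthfulQA (0-shot)": 57.60,
    "Winogrande (5-shot)": 82.95,
    "GSM8k (5-shot)": 0.00,
}

# Leaderboard average: plain mean over all six benchmarks.
avg_all = sum(scores.values()) / len(scores)

# Same mean with the (apparently broken) GSM8k result left out.
others = [v for name, v in scores.items() if not name.startswith("GSM8k")]
avg_without_gsm8k = sum(others) / len(others)

print(f"all six benchmarks: {avg_all:.2f}")
print(f"without GSM8k:      {avg_without_gsm8k:.2f}")
```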


Safetensors

Model size: 23.8B params
Tensor type: BF16


Evaluation results

Benchmark	Metric	Split	Value
AI2 Reasoning Challenge (25-Shot)	normalized accuracy	test	66.720
HellaSwag (10-Shot)	normalized accuracy	validation	86.980
MMLU (5-Shot)	accuracy	test	65.400
TruthfulQA (0-shot)	mc2	validation	57.600
Winogrande (5-shot)	accuracy	validation	82.950
GSM8k (5-shot)	accuracy	test	0.000

Source for all rows: Open LLM Leaderboard
