Text Generation
Transformers
Safetensors
English
llama
Not-For-All-Audiences
conversational
Inference Endpoints
text-generation-inference

GALAXY-16B-v1.0

Technical notes

  • 72 layers, built with the DUS procedure: Mistral (32) -> SOLAR (48) -> GALAXY (72)
  • 16B parameters
  • model created as an extension of the depth up-scaling (DUS) procedure used for SOLAR by Upstage
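
The layer counts above can be checked with a small sketch of the depth up-scaling arithmetic. It assumes the two-copy scheme described for SOLAR (duplicate the base model, drop the last m layers from the first copy and the first m layers from the second, then concatenate), which gives 2 * (n - m) layers. The value m = 8 for the 32 -> 48 step matches SOLAR; m = 12 for the 48 -> 72 step is only an assumption inferred from the layer counts, not a confirmed detail of how GALAXY was built. The function name `dus_layer_indices` is hypothetical.

```python
def dus_layer_indices(n_layers: int, m_removed: int) -> list[int]:
    """Return the base-model layer indices that make up the up-scaled stack.

    Assumed scheme: copy A keeps layers [0, n - m), copy B keeps layers [m, n),
    and the result is A followed by B, for 2 * (n - m) layers in total.
    """
    if not 0 <= m_removed < n_layers:
        raise ValueError("m_removed must satisfy 0 <= m_removed < n_layers")
    first_copy = list(range(0, n_layers - m_removed))   # last m layers dropped
    second_copy = list(range(m_removed, n_layers))      # first m layers dropped
    return first_copy + second_copy


# Mistral (32 layers) -> SOLAR (48 layers), with m = 8
solar = dus_layer_indices(32, 8)

# SOLAR (48 layers) -> GALAXY (72 layers); m = 12 is an assumption
galaxy = dus_layer_indices(48, 12)

print(len(solar), len(galaxy))
```

The indices refer to layers of the model being up-scaled at each step, so the second call treats the 48-layer stack as its own base model.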

Results

  • model can and will produce NSFW content
  • waiting for eval results