Hello, is this 1.5B model trained from scratch, or is it distilled like LLaMA 3.2?
#7 opened about 2 months ago by adol01
Recommended context length for SFT?
#6 opened 4 months ago by brando
Why is there no model.safetensors.index.json file?
1 comment
#5 opened 4 months ago by Infernaught
[AUTOMATED] Model Memory Requirements
#3 opened 5 months ago by model-sizer-bot
lm_eval results are weird
5 comments
#2 opened 6 months ago by xianf
Upload ONNX weights
#1 opened 6 months ago by Xenova