What does 120B really mean?
#1 by BigDeeper - opened
Are there more layers, more weights per layer?
120 billion parameters; the original model is here: https://huggingface.co/mlabonne/Meta-Llama-3-120B-Instruct. It is a self-merge of Llama-3 70Bs.
That's a non-answer.
Are there more layers, more weights per layer?
More layers. A self-merge like this stacks duplicated layer ranges from the 70B on top of each other, so the depth increases while the per-layer width (hidden size, attention heads) stays the same as the 70B's; there are not more weights per layer.
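A minimal way to check this yourself, assuming you have `transformers` installed and access to both repos (the Meta base repo is gated, so a token may be needed), is to compare the two model configs; the merged model should report a higher layer count but the same hidden size:

```python
# Compare layer count and per-layer width of the 70B base and the 120B self-merge.
# Repo names are taken from the link above plus Meta's official 70B Instruct repo.
from transformers import AutoConfig

base = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3-70B-Instruct")
merged = AutoConfig.from_pretrained("mlabonne/Meta-Llama-3-120B-Instruct")

for name, cfg in [("70B base", base), ("120B merge", merged)]:
    print(
        f"{name}: layers={cfg.num_hidden_layers}, "
        f"hidden_size={cfg.hidden_size}, heads={cfg.num_attention_heads}"
    )
```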