
Nemotron-4-340B-Base-hf

Converted checkpoint of nvidia/Nemotron-4-340B-Base, produced from the v1.2 .nemo checkpoint on NGC.

This checkpoint runs in vLLM with the changes from this PR: https://github.com/vllm-project/vllm/pull/6611. Support in transformers is still pending.
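For illustration, here is a minimal sketch of offline generation through vLLM's Python API, assuming a vLLM build that already includes the PR above. The repository ID `mgoin/Nemotron-4-340B-Base-hf`, the tensor-parallel size of 8, the prompt, and the helper names `build_engine_kwargs` and `generate` are assumptions made for this example, not settings confirmed by the card.

```python
# Sketch only: assumes a vLLM build that contains the Nemotron support
# from the PR linked above, and enough GPU memory for the BF16 weights.

MODEL_ID = "mgoin/Nemotron-4-340B-Base-hf"  # assumed repository ID


def build_engine_kwargs(tensor_parallel_size: int = 8) -> dict:
    """Collect the keyword arguments passed to vllm.LLM.

    tensor_parallel_size=8 is an assumed value (one 8-GPU node);
    adjust it to the hardware actually available.
    """
    return {
        "model": MODEL_ID,
        "dtype": "bfloat16",
        "tensor_parallel_size": tensor_parallel_size,
    }


def generate(prompt: str, max_tokens: int = 64) -> str:
    """Load the model and greedily complete a single prompt."""
    # Imported lazily so this module can be read without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(**build_engine_kwargs())
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text
```

Since this is a base model rather than an instruction-tuned one, a plain-text continuation prompt such as `generate("The capital of France is")` is the natural way to call it.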

Evaluations

Please see the FP8 checkpoint for evaluations, since I have only done single-node inference.
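As a rough guide to why precision matters for single-node inference, the sketch below estimates the memory taken by the weights alone. It uses the 341B parameter count listed for this checkpoint and assumes 2 bytes per parameter for BF16 and 1 byte for FP8; the KV cache, activations, and runtime overhead are not included, and the helper name `weight_footprint_gb` is hypothetical.

```python
# Back-of-the-envelope estimate of weight memory only.
# KV cache, activations, and framework overhead are NOT counted.

NUM_PARAMS = 341_000_000_000  # 341B params, as listed for this checkpoint

# Assumed storage cost per parameter for each tensor type.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_footprint_gb(num_params: int, dtype: str) -> float:
    """Return the weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("bf16", "fp8"):
    print(f"{dtype}: {weight_footprint_gb(NUM_PARAMS, dtype):.0f} GB")
```

Comparing these figures with the total GPU memory of a node gives a first indication of whether a given precision can be served on it.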

Safetensors: model size 341B params, tensor type BF16