root committed update README.md (commit 43e4235, parent 2cb277a)

README.md CHANGED

@@ -23,11 +23,11 @@ Results in [ChatRAG Bench](https://huggingface.co/datasets/nvidia/ChatRAG-Bench)
 
 
 ![Example Image](overview.png)
-| | ChatQA-2-70B | GPT-4-Turbo-2024-04-09 | Qwen2-72B-Instruct | Llama3.1-70B-Instruct |
+<!-- | | ChatQA-2-70B | GPT-4-Turbo-2024-04-09 | Qwen2-72B-Instruct | Llama3.1-70B-Instruct |
 | -- |:--:|:--:|:--:|:--:|
 | Ultra-long (>100k) | 41.04 | 33.16 | 39.77 | 39.81 |
 | Long (32k) | 48.15 | 51.93 | 49.94 | 49.92 |
-| Short (<4k) | 56.30 | 54.72 | 54.06 | 52.12 |
+| Short (<4k) | 56.30 | 54.72 | 54.06 | 52.12 | -->
 
 Note that ChatQA-2 is built on the Llama-3 base model.
 
@@ -65,7 +65,7 @@ Assistant:
 <pre>
 This is a chat between a user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions based on the context. The assistant should also indicate when the answer cannot be found in the context.
 </pre>
-**Note that our ChatQA-
+**Note that our ChatQA-2 models are optimized for conversational QA over a provided context, e.g., documents or retrieved passages.**
 
 ## How to use
 
@@ -75,7 +75,7 @@ This can be applied to the scenario where the whole document can be fitted into
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 
-model_id = "nvidia/Llama3-ChatQA-
+model_id = "nvidia/Llama3-ChatQA-2-8B"
 
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
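
The snippet in the last hunk only loads the tokenizer and model. For orientation, here is a minimal sketch of how a prompt in the format shown in the `<pre>` block above could be assembled; the `get_formatted_input` name and the exact separators are assumptions for illustration, not taken from this diff:

```python
# Hypothetical helper (name assumed): build a ChatQA-style prompt of
# "System line + grounding context + User:/Assistant: turns".
def get_formatted_input(messages, context):
    system = (
        "System: This is a chat between a user and an artificial intelligence "
        "assistant. The assistant gives helpful, detailed, and polite answers to "
        "the user's questions based on the context. The assistant should also "
        "indicate when the answer cannot be found in the context."
    )
    conversation = "\n\n".join(
        ("User: " if m["role"] == "user" else "Assistant: ") + m["content"]
        for m in messages
    ) + "\n\nAssistant:"  # trailing cue so the model writes the next answer
    return system + "\n\n" + context + "\n\n" + conversation
```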
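With the tokenizer and model from the hunk above and the helper sketched before it, generation would then follow roughly the pattern below; prepending `bos_token` and stopping on the Llama-3 `<|eot_id|>` token are assumptions based on common Llama-3 usage, and the message, document, and `max_new_tokens=128` are placeholders:

```python
# Placeholder conversation and grounding document.
messages = [{"role": "user", "content": "What does the document say about context length?"}]
document = "..."  # the full document or retrieved passages go here

formatted_input = get_formatted_input(messages, document)
inputs = tokenizer(tokenizer.bos_token + formatted_input, return_tensors="pt").to(model.device)

# Stop on either the eos token or Llama-3's end-of-turn token.
terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]
outputs = model.generate(**inputs, max_new_tokens=128, eos_token_id=terminators)

# Decode only the newly generated tokens, not the prompt.
response = outputs[0][inputs.input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```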