|
--- |
|
base_model: meta-llama/Meta-Llama-3.1-70B-Instruct |
|
language: |
|
- en |
|
- de |
|
- fr |
|
- it |
|
- pt |
|
- hi |
|
- es |
|
- th |
|
library_name: transformers |
|
license: llama3.1 |
|
tags: |
|
- facebook |
|
- meta |
|
- pytorch |
|
- llama |
|
- llama-3 |
|
model-index: |
|
- name: Meta-Llama-3.1-70B-Instruct-NF4 |
|
results: [] |
|
--- |
|
|
|
# Model Card for Meta-Llama-3.1-70B-Instruct-NF4
|
|
|
|
|
|
This is a 4-bit (NF4) quantized version of `Llama 3.1 70B Instruct`, quantized with `bitsandbytes` and `accelerate`. An illustrative quantization setup is sketched after the details below.
|
|
|
- **Developed by:** Farid Saud @ DSRS |
|
- **License:** llama3.1 |
|
- **Base Model:** meta-llama/Meta-Llama-3.1-70B-Instruct |
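
For reference, NF4 quantization with `bitsandbytes` is typically configured along these lines. The exact parameters used for this checkpoint are not stated here, so treat this as an illustrative sketch rather than the precise recipe:

```python
# Illustrative NF4 quantization setup (assumed parameters, not the exact recipe)
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # run compute in bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-70B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",                      # spread shards across available GPUs
)
```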
|
|
|
## Use this model |
|
|
|
|
|
Use a pipeline as a high-level helper: |
|
```python |
|
# Use a pipeline as a high-level helper |
|
from transformers import pipeline |
|
|
|
messages = [ |
|
{"role": "user", "content": "Who are you?"}, |
|
] |
|
# device_map="auto" spreads the quantized weights across available GPUs
pipe = pipeline("text-generation", model="fsaudm/Meta-Llama-3.1-70B-Instruct-NF4", device_map="auto")
|
pipe(messages) |
|
``` |
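
The pipeline returns a list of generations; with chat-style input, recent `transformers` versions include the assistant reply in `generated_text`. A minimal way to pass generation settings and inspect the output:

```python
# Inspect the generated output (exact format depends on the transformers version)
out = pipe(messages, max_new_tokens=64)
print(out[0]["generated_text"])
```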
|
|
|
|
|
|
|
Load the model and tokenizer directly:
|
```python |
|
# Load model directly |
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
|
tokenizer = AutoTokenizer.from_pretrained("fsaudm/Meta-Llama-3.1-70B-Instruct-NF4") |
|
# the saved quantization config is picked up automatically; device_map="auto" places shards on available GPUs
model = AutoModelForCausalLM.from_pretrained("fsaudm/Meta-Llama-3.1-70B-Instruct-NF4", device_map="auto")
|
``` |
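
Once loaded, generation follows the usual chat-template flow. A minimal sketch (the prompt and decoding settings are placeholders):

```python
# Minimal generation sketch using the model's chat template
prompt_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Who are you?"}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(prompt_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][prompt_ids.shape[-1]:], skip_special_tokens=True))
```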
|
|
|
Information about the base model can be found in the original [meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct) model card.
|
|
|
|
|
|
|
|