---
language:
- en
pipeline_tag: text-generation
tags:
- facebook
- meta
- llama
- llama-3
- llava
license: other
license_name: llama3
license_link: LICENSE
---

## Disclaimer

These models are research experiments and may generate incorrect or harmful content. Their outputs should not be taken as factual, nor as representative of my views, the views of the models' creators, or those of any other individual.

Neither the creator(s) of these models nor I are responsible for any harm or damage caused by the models' outputs.

I did not train these models and had no say in their creation; I merely converted them from the sources linked below. To report issues or concerns, please contact the model maker via the links provided in this README.

## Conversions

I have used llama.cpp to convert and quantize each of the models available in this repository. Currently, I have quantized:

- `meta` Llama 3 [8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B). Q4_K_M and Q5_K_M.
- `meta` Llama 3 [8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct). Q4_K_M and Q5_K_M.
- `xtuner` Llava Llama 3 [Llava-Llama-3-8B-v1_1](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1). Q4_K_M and Q5_K_M.

**Important information related to each model can be found in the links above.**
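For reference, the conversions above roughly follow llama.cpp's standard GGUF workflow. The commands below are an illustrative sketch, not the exact invocations used for this repository; the input and output paths are placeholders, and script names may differ between llama.cpp versions.

```shell
# Convert the Hugging Face checkpoint to a GGUF file (FP16 intermediate).
# convert-hf-to-gguf.py ships with llama.cpp; the paths here are placeholders.
python convert-hf-to-gguf.py ./Meta-Llama-3-8B \
  --outtype f16 \
  --outfile Meta-Llama-3-8B-f16.gguf

# Quantize the FP16 GGUF into the K-quant formats offered in this repository.
./quantize Meta-Llama-3-8B-f16.gguf Meta-Llama-3-8B-Q4_K_M.gguf Q4_K_M
./quantize Meta-Llama-3-8B-f16.gguf Meta-Llama-3-8B-Q5_K_M.gguf Q5_K_M
```

The resulting `.gguf` files can then be loaded by any llama.cpp-compatible runtime, for example `./main -m Meta-Llama-3-8B-Q4_K_M.gguf -p "Hello"`.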

**Model Architecture** Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

<table>
  <tr>
   <td>
   </td>
   <td><strong>Training Data</strong>
   </td>
   <td><strong>Params</strong>
   </td>
   <td><strong>Context length</strong>
   </td>
   <td><strong>GQA</strong>
   </td>
   <td><strong>Token count</strong>
   </td>
   <td><strong>Knowledge cutoff</strong>
   </td>
  </tr>
  <tr>
   <td rowspan="2" >Llama 3
   </td>
   <td rowspan="2" >A new mix of publicly available online data.
   </td>
   <td>8B
   </td>
   <td>8k
   </td>
   <td>Yes
   </td>
   <td rowspan="2" >15T+
   </td>
   <td>March, 2023
   </td>
  </tr>
  <tr>
   <td>70B
   </td>
   <td>8k
   </td>
   <td>Yes
   </td>
   <td>December, 2023
   </td>
  </tr>
</table>

**Llama 3 family of models**. Token counts refer to pretraining data only. Both the 8B and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability.

**Model Release Date** April 18, 2024.

**License** A custom commercial license is available at: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license).


## Intended Use

**Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.

**Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English.\*\*

\*\*Note: Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy.