realshyfox committed
Commit • f41a811
1 Parent(s): 5315762
Update README.md
README.md CHANGED
@@ -5,21 +5,21 @@ tags: []

 # Model Card for Meta-Llama-3-8B

-Meta-Llama-3-8B is an advanced language model developed by Meta, part of the Llama 3 family, optimized for text generation and natural language understanding tasks.
+Meta-Llama-3-8B is an advanced language model developed by Meta, part of the Llama 3 family, optimized for text generation and natural language understanding tasks.
+This model leverages transformer architecture and is available in pre-trained and instruction-tuned variants.

 ## Model Details

 ### Model Description

-This is the model card for the Meta-Llama-3-8B, a part of the Llama 3 model family which includes models with 8 billion and 70 billion parameters.
+This is the model card for the Meta-Llama-3-8B, a part of the Llama 3 model family which includes models with 8 billion and 70 billion parameters.
+The model is pre-trained on a diverse dataset of publicly available text and is designed for both research and commercial use, particularly for applications requiring natural language understanding and generation.

 - **Developed by:** Meta
-- **Funded by [optional]:** Not specified
-- **Shared by [optional]:** Not specified
 - **Model type:** Auto-regressive language model
 - **Language(s) (NLP):** English
 - **License:** Meta's custom commercial license
-- **Finetuned from model
+- **Finetuned from model:** Not applicable

 ### Model Sources [optional]

@@ -96,11 +96,13 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]

 - **Hardware Type:** NVIDIA A100 GPUs
 - **Hours used:** Not specified
-- **Cloud Provider:** Not
+- **Cloud Provider:** Not used
 - **Compute Region:** Not specified
 - **Carbon Emitted:** Not specified

-## Technical Specifications
+## Technical Specifications
+
+Llama 3 sharded model for an easier inference and fine tuning process on lower to mid end processing systems.

 ### Model Architecture and Objective

@@ -116,7 +118,7 @@ Training utilized a cluster of NVIDIA A100 GPUs.

 The model is compatible with PyTorch and Hugging Face's transformers library.

-## Citation
+## Citation

 **BibTeX:**

@@ -134,18 +136,18 @@ The model is compatible with PyTorch and Hugging Face's transformers library.

 Meta AI. (2024). Meta Llama 3: An Open-Source Large Language Model. Meta AI Blog. Retrieved from https://ai.meta.com/blog/meta-llama-3/

-## Glossary
+## Glossary

 - **Auto-regressive model:** A type of model that generates sequences by predicting the next element based on previous elements.
 - **Transformer architecture:** A neural network architecture designed for handling sequential data, particularly for tasks in NLP.

-## More Information
+## More Information

-For more details, visit the [Meta Llama website](https://llama.meta.com).
+For more details about Meta, visit the [Meta Llama website](https://llama.meta.com).

-## Model Card Authors
+## Model Card Authors

-
+realshyfox

 ## Model Card Contact

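The glossary entry in the README diff defines an auto-regressive model as one that generates a sequence by predicting the next element from the previous ones. As a rough illustration only, the sketch below runs that generation loop with a toy bigram table in plain Python. It is not Llama 3 and not part of this commit: the names `train_bigram` and `generate` and the tiny corpus are hypothetical, and a real language model conditions on the whole prefix rather than just the last token.

```python
import random


def train_bigram(tokens):
    """Record, for each token, the tokens that follow it in the training text."""
    table = {}
    for prev, nxt in zip(tokens, tokens[1:]):
        table.setdefault(prev, []).append(nxt)
    return table


def generate(table, start, max_new_tokens, seed=0):
    """Auto-regressive loop: each new token is sampled given the output so far.

    This toy model looks only at the last token (a simplifying assumption).
    """
    rng = random.Random(seed)
    out = [start]
    for _ in range(max_new_tokens):
        candidates = table.get(out[-1])
        if not candidates:  # no known continuation, so stop early
            break
        out.append(rng.choice(candidates))
    return out


# Hypothetical toy corpus, used only for illustration.
corpus = "the model predicts the next token and the next token extends the sequence".split()
table = train_bigram(corpus)
sample = generate(table, "the", 8)
print(" ".join(sample))
```

Each iteration appends one token and feeds the extended sequence back into the next prediction, which is the property the glossary entry describes.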