realshyfox/sharded-Llama-3-8B

@@ -5,21 +5,21 @@ tags: []
 # Model Card for Meta-Llama-3-8B
-Meta-Llama-3-8B is an advanced language model developed by Meta, part of the Llama 3 family, optimized for text generation and natural language understanding tasks. This model leverages transformer architecture and is available in pre-trained and instruction-tuned variants.
 ## Model Details
 ### Model Description
-This is the model card for the Meta-Llama-3-8B, a part of the Llama 3 model family which includes models with 8 billion and 70 billion parameters. The model is pre-trained on a diverse dataset of publicly available text and is designed for both research and commercial use, particularly for applications requiring natural language understanding and generation.
 - **Developed by:** Meta
-- **Funded by [optional]:** Not specified
-- **Shared by [optional]:** Not specified
 - **Model type:** Auto-regressive language model
 - **Language(s) (NLP):** English
 - **License:** Meta's custom commercial license
-- **Finetuned from model [optional]:** Not applicable
 ### Model Sources [optional]
@@ -96,11 +96,13 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 - **Hardware Type:** NVIDIA A100 GPUs
 - **Hours used:** Not specified
-- **Cloud Provider:** Not specified
 - **Compute Region:** Not specified
 - **Carbon Emitted:** Not specified
-## Technical Specifications [optional]
 ### Model Architecture and Objective
@@ -116,7 +118,7 @@ Training utilized a cluster of NVIDIA A100 GPUs.
 The model is compatible with PyTorch and Hugging Face's transformers library.
-## Citation [optional]
 **BibTeX:**
@@ -134,18 +136,18 @@ The model is compatible with PyTorch and Hugging Face's transformers library.
 Meta AI. (2024). Meta Llama 3: An Open-Source Large Language Model. Meta AI Blog. Retrieved from https://ai.meta.com/blog/meta-llama-3/
-## Glossary [optional]
 - **Auto-regressive model:** A type of model that generates sequences by predicting the next element based on previous elements.
 - **Transformer architecture:** A neural network architecture designed for handling sequential data, particularly for tasks in NLP.
-## More Information [optional]
-For more details, visit the [Meta Llama website](https://llama.meta.com).
-## Model Card Authors [optional]
-Meta AI Team
 ## Model Card Contact

 # Model Card for Meta-Llama-3-8B
+Meta-Llama-3-8B is an advanced language model developed by Meta, part of the Llama 3 family, optimized for text generation and natural language understanding tasks.
+This model leverages transformer architecture and is available in pre-trained and instruction-tuned variants.
 ## Model Details
 ### Model Description
+This is the model card for the Meta-Llama-3-8B, a part of the Llama 3 model family which includes models with 8 billion and 70 billion parameters.
+The model is pre-trained on a diverse dataset of publicly available text and is designed for both research and commercial use, particularly for applications requiring natural language understanding and generation.
 - **Developed by:** Meta
 - **Model type:** Auto-regressive language model
 - **Language(s) (NLP):** English
 - **License:** Meta's custom commercial license
+- **Finetuned from model:** Not applicable
 ### Model Sources [optional]
 - **Hardware Type:** NVIDIA A100 GPUs
 - **Hours used:** Not specified
+- **Cloud Provider:** Not used
 - **Compute Region:** Not specified
 - **Carbon Emitted:** Not specified
+## Technical Specifications
+This is a sharded copy of Llama 3, intended to make inference and fine-tuning easier on low- to mid-range hardware.
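To illustrate what sharding means here, the following is a minimal sketch of how a checkpoint's tensors might be grouped into size-limited shards with an index that maps each tensor to its file. It is a simplified illustration, not the code used to produce this repository: the function names `plan_shards` and `build_index`, the tensor names, and the byte sizes are all hypothetical, and the file-name pattern only mimics the one commonly seen in sharded Hugging Face checkpoints.

```python
from typing import Dict, List


def plan_shards(tensor_bytes: Dict[str, int], max_shard_bytes: int) -> List[Dict[str, int]]:
    """Greedily group tensors into shards of at most max_shard_bytes.

    A single tensor larger than the limit still gets a shard of its own.
    """
    shards: List[Dict[str, int]] = []
    current: Dict[str, int] = {}
    current_size = 0
    for name, size in tensor_bytes.items():
        # Start a new shard when adding this tensor would exceed the limit.
        if current and current_size + size > max_shard_bytes:
            shards.append(current)
            current, current_size = {}, 0
        current[name] = size
        current_size += size
    if current:
        shards.append(current)
    return shards


def build_index(shards: List[Dict[str, int]]) -> Dict[str, str]:
    """Map every tensor name to the (illustrative) shard file that holds it."""
    total = len(shards)
    return {
        name: f"model-{i:05d}-of-{total:05d}.safetensors"
        for i, shard in enumerate(shards, start=1)
        for name in shard
    }


# Hypothetical tensor names and sizes, chosen only to keep the example small.
toy_weights = {"embed": 400, "layer0": 300, "layer1": 300, "head": 400}
shards = plan_shards(toy_weights, max_shard_bytes=700)
index = build_index(shards)

for name, filename in index.items():
    print(f"{name} -> {filename}")
```

Because each shard stays under the size limit, a loader can read one file at a time instead of holding the whole checkpoint in memory at once, which is the property that helps on smaller machines.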
 ### Model Architecture and Objective
 The model is compatible with PyTorch and Hugging Face's transformers library.
+## Citation
 **BibTeX:**
 Meta AI. (2024). Meta Llama 3: An Open-Source Large Language Model. Meta AI Blog. Retrieved from https://ai.meta.com/blog/meta-llama-3/
+## Glossary
 - **Auto-regressive model:** A type of model that generates sequences by predicting the next element based on previous elements.
 - **Transformer architecture:** A neural network architecture designed for handling sequential data, particularly for tasks in NLP.
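The auto-regressive idea in the glossary can be shown with a toy example. The sketch below assumes a tiny bigram counter rather than a transformer: it predicts each next token from the last token only, whereas Llama 3 conditions on the whole preceding sequence. The function names `fit_bigrams` and `generate` and the sample corpus are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def fit_bigrams(tokens: List[str]) -> Dict[str, Counter]:
    """Count how often each token follows each other token."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts: Dict[str, Counter], start: str, max_new_tokens: int) -> List[str]:
    """Generate tokens one at a time, feeding each prediction back in as context."""
    out = [start]
    for _ in range(max_new_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break  # no known continuation for the last token
        # Greedy choice: take the most frequent follower.
        out.append(followers.most_common(1)[0][0])
    return out


corpus = "the cat sat on the cat".split()
model = fit_bigrams(corpus)
generated = generate(model, "the", max_new_tokens=4)
print(" ".join(generated))  # the cat sat on the
```

The loop is the auto-regressive part: every new token is appended to the output and becomes the input for the next prediction.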
+## More Information
+For more details about Meta, visit the [Meta Llama website](https://llama.meta.com).
+## Model Card Authors
+realshyfox
 ## Model Card Contact