aberrio committed on
Commit
11efa75
1 Parent(s): d2a18ef

Update README.md

  1. README.md +29 -22
README.md CHANGED
@@ -21,39 +21,46 @@ tags:
  - gguf
  ---
 
- # LLaMA 3.2 1B Instruct
+ ## LLaMA 3.2 1B Instruct
 
- ## 1. **Model Title**
+ LLaMA 3.2 1B Instruct is a multilingual instruction-tuned language model with 1.23 billion parameters. Designed for diverse multilingual dialogue and summarization tasks, it offers effective performance on a range of NLP benchmarks.
+
+ ### Model Information
  - **Name**: LLaMA 3.2 1B Instruct
  - **Parameter Size**: 1B (1.23B)
-
- ## 2. **Quantization Information**
- - **Available Formats**:
- - **ggml-model-q8_0.gguf**: 8-bit quantization for resource efficiency and good performance.
- - **ggml-model-f16.gguf**: Half-precision (16-bit) floating-point format for enhanced precision.
- - **Quantization Library**: llama.cpp
- - **Use Cases**: Recommended for tasks such as multilingual dialogue, text generation, and summarization.
-
- ## 3. **Model Brief**
- LLaMA 3.2 1B Instruct is a multilingual instruction-tuned language model, optimized for various dialogue tasks. It has been trained on a diverse set of publicly available data and performs well on common NLP benchmarks. The model architecture leverages improved transformer optimizations, making it effective for both text-only and code tasks.
-
- - **Purpose**: Multilingual dialogue generation and summarization.
  - **Model Family**: LLaMA 3.2
  - **Architecture**: Auto-regressive Transformer with Grouped-Query Attention (GQA)
+ - **Purpose**: Multilingual dialogue generation, text generation, and summarization.
  - **Training Data**: A mix of publicly available multilingual data, covering up to 9T tokens.
  - **Supported Languages**: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
  - **Release Date**: September 25, 2024
  - **Context Length**: 128k tokens
  - **Knowledge Cutoff**: December 2023
 
- ## 4. **Core Library Information**
- - **Library**: llama.cpp
- - *[Repository Link](https://github.com/ggerganov/llama.cpp)*
+ ### Quantized Model Files
+ - **Available Formats**:
+ - **ggml-model-q8_0.gguf**: 8-bit quantization for resource efficiency and good performance.
+ - **ggml-model-f16.gguf**: Half-precision (16-bit) floating-point format for enhanced precision.
+ - **Quantization Library**: llama.cpp
+ - **Use Cases**: Multilingual dialogue, summarization, and text generation.
+
+ ### Core Library
+ LLaMA 3.2 1B Instruct can be deployed using `llama.cpp` or `transformers`, with a focus on streamlined integration into the Hugging Face ecosystem.
+
+ - **Primary Framework**: `llama.cpp`
+ - **Alternate Frameworks**:
+ - `transformers` for Hugging Face model support.
+ - `vLLM` for optimized inference and low-latency deployments.
+
+ **Library and Model Links**:
  - **Model Base**: [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)
+ - **Models**: [meta-llama/llama-stack](https://github.com/meta-llama/llama-stack)
+ - **Inference Support**: [meta-llama/llama](https://github.com/meta-llama/llama)
+ - **Quantization**: [ggerganov/llama.cpp](https://github.com/ggerganov/llama.cpp)
 
- ## 5. **Safety and Responsible Use**
- LLaMA 3.2 1B is designed with safety in mind but still carries inherent risks due to its generative nature. It may produce biased, harmful, or unpredictable responses, especially for less-tested languages or sensitive prompts.
+ ### Safety and Responsible Use
+ LLaMA 3.2 1B has been designed with safety in mind but may produce biased, harmful, or unpredictable outputs, especially for less-covered languages or specific prompts.
 
- - **Testing and Risk Assessment**: Initial testing has focused on English outputs, and coverage for other languages is ongoing.
- - **Limitations**: As with most LLMs, LLaMA 3.2 may not fully adhere to user instructions or safety guidelines and might exhibit unexpected behavior.
- - **Responsible Use Guidelines**: For deployment, thorough testing is advised to align outputs with application-specific safety requirements. Refer to the [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/) for more details.
+ - **Testing and Risk Assessment**: Initial testing has primarily focused on English; coverage for other languages is ongoing.
+ - **Limitations**: LLaMA 3.2 may not fully adhere to user instructions or safety guidelines, and may exhibit unexpected behaviors.
+ - **Responsible Use Guidelines**: Refer to the [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/) for more details.
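
The README lists two GGUF files, `ggml-model-q8_0.gguf` and `ggml-model-f16.gguf`, along with a parameter count of 1.23B. As a rough illustration of what the two formats mean for download size, the sketch below estimates the tensor payload of each file. It rests on two assumptions that are not stated in the README: that every weight is stored in the named format, and that Q8_0 uses the ggml block layout of 32 int8 weights plus one 16-bit scale. Real files also carry metadata and may keep some tensors at a different precision, so treat the numbers as approximate.

```python
# Rough GGUF size estimate for the two formats named in the README.
# Assumption: every weight is stored in the named format; real files also
# contain metadata and may keep some tensors at a different precision.

PARAMS = 1.23e9  # parameter count as stated in the README

# Assumed Q8_0 block layout: 32 int8 weights plus one 2-byte (fp16) scale.
Q8_0_BLOCK_WEIGHTS = 32
Q8_0_BLOCK_BYTES = 32 + 2

BITS_PER_WEIGHT = {
    # F16 stores each weight in 16 bits.
    "ggml-model-f16.gguf": 16.0,
    # Q8_0: 34 bytes per 32 weights, i.e. 8.5 bits per weight.
    "ggml-model-q8_0.gguf": Q8_0_BLOCK_BYTES * 8 / Q8_0_BLOCK_WEIGHTS,
}


def estimated_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Return the approximate tensor payload in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for name, bpw in BITS_PER_WEIGHT.items():
        size = estimated_size_gb(PARAMS, bpw)
        print(f"{name}: ~{size:.2f} GB at {bpw} bits/weight")
```

Under these assumptions the F16 file comes out at about 2.46 GB and the Q8_0 file at about 1.31 GB, which is roughly the trade-off the README describes as "enhanced precision" versus "resource efficiency".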