duyntnet committed
Commit b03f63a
1 Parent(s): 2919236

Upload README.md with huggingface_hub

Files changed (1): README.md (+75 lines)

README.md ADDED
---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- Llama-3.2-1B-Instruct
---
Quantizations of https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct

### Inference Clients/UIs
* [llama.cpp](https://github.com/ggerganov/llama.cpp)
* [KoboldCPP](https://github.com/LostRuins/koboldcpp)
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
* [ollama](https://github.com/ollama/ollama)
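
Any of these clients can load the `.gguf` files from this repo. As a quick scripted alternative, the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) bindings for llama.cpp can do the same from Python; a minimal sketch, where the quant filename is a placeholder for whichever `.gguf` file you download from this repo:

```python
# Minimal sketch using the llama-cpp-python bindings (pip install llama-cpp-python).
# The filename below is an assumption -- substitute any .gguf quant from this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3.2-1B-Instruct-Q4_K_M.gguf",  # hypothetical quant name
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers to GPU if a GPU build is installed
)
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Who are you?"}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```

Lower-bit quants reduce memory use at some cost in output quality.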

---

# From original readme

The Meta Llama 3.2 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text-only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open-source and closed chat models on common industry benchmarks.

**Model Developer:** Meta

**Model Architecture:** Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

## How to use

This repository contains two versions of Llama-3.2-1B-Instruct, for use with `transformers` and with the original `llama` codebase.

### Use with transformers

Starting with `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function.

Make sure to update your transformers installation via `pip install --upgrade transformers`.
```python
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-3.2-1B-Instruct"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
```
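
Since the paragraph above also mentions the Auto classes, here is a minimal sketch of the same chat via `AutoModelForCausalLM` and `generate()` (this variant is not in the original readme; the settings are illustrative):

```python
# Sketch of the Auto-classes route mentioned above; generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
# Build the chat-formatted prompt and generate a reply.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```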

Note: You can also find detailed recipes on how to use the model locally, with `torch.compile()`, assisted generation, quantization, and more at [`huggingface-llama-recipes`](https://github.com/huggingface/huggingface-llama-recipes).

### Use with `llama`

Please follow the instructions in the [repository](https://github.com/meta-llama/llama).

To download the original checkpoints, see the example command below using `huggingface-cli`:

```
huggingface-cli download meta-llama/Llama-3.2-1B-Instruct --include "original/*" --local-dir Llama-3.2-1B-Instruct
```
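
The GGUF files in this repository can be fetched programmatically in a similar way with `huggingface_hub`; a sketch where both the repo ID and the filename are assumptions to be replaced with the actual values shown on this page:

```python
# Sketch: download a single GGUF quant with huggingface_hub.
# repo_id and filename are assumptions -- use the actual values from this repo page.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="duyntnet/Llama-3.2-1B-Instruct-imatrix-GGUF",  # hypothetical repo ID
    filename="Llama-3.2-1B-Instruct-Q4_K_M.gguf",           # hypothetical quant file
)
print(path)  # local cache path of the downloaded file
```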