Commit 321e67e by mlabonne (parent: c48c94e): Create README.md

Files changed (1): README.md (added, +51 lines)

---
license: apache-2.0
datasets:
- timdettmers/openassistant-guanaco
pipeline_tag: text-generation
---
# Llama-2-13b-guanaco

📝 [Article](https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32) |
💻 [Colab](https://colab.research.google.com/drive/1PEQyJO1-f6j0S_XJ8DV50NkpzasXkrzd?usp=sharing) |
📄 [Script](https://gist.github.com/mlabonne/b5718e1b229ce6553564e3f56df72c5c)

<center><img src="https://i.imgur.com/C2x7n2a.png" width="300"></center>

This is a `llama-2-13b-chat-hf` model fine-tuned using QLoRA (4-bit precision) on the [`mlabonne/guanaco-llama2`](https://huggingface.co/datasets/mlabonne/guanaco-llama2) dataset.

## 🔧 Training

It was trained in a Google Colab notebook with a T4 GPU and high RAM.
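
The exact QLoRA recipe (LoRA configuration, hyperparameters) is in the Colab notebook and script linked above. As a rough, hedged sketch of that kind of setup, the snippet below uses `transformers`, `peft`, `bitsandbytes`, and `datasets`; the LoRA rank/alpha, sequence length, batch size, and output paths are illustrative assumptions rather than the values actually used, and it assumes the dataset exposes a single pre-formatted `text` column.

```python
# pip install -q transformers datasets peft bitsandbytes accelerate

import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "meta-llama/Llama-2-13b-chat-hf"  # gated base checkpoint (requires access)
dataset = load_dataset("mlabonne/guanaco-llama2", split="train")

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# Load the frozen base model in 4-bit NF4 so the 13B weights fit in limited VRAM
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach trainable LoRA adapters on top of the quantized model
# (rank, alpha, and dropout below are placeholders, not the card's exact values)
model = get_peft_model(
    model,
    LoraConfig(r=64, lora_alpha=16, lora_dropout=0.1, task_type="CAUSAL_LM"),
)

# Each dataset row holds one conversation already formatted with Llama 2's chat template
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama-2-13b-guanaco",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=1,
        fp16=True,
        logging_steps=25,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Save only the LoRA adapter weights; they can later be merged into the base model
trainer.model.save_pretrained("llama-2-13b-guanaco-adapter")
```

Training a quantized base model with LoRA adapters keeps only the adapter weights trainable, which is what makes a 13B fine-tune feasible on a single T4; merging the saved adapter back into the base model produces a standalone checkpoint like the one published here.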

## 💻 Usage

```python
# pip install transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/llama-2-13b-guanaco"
prompt = "What is a large language model?"

# Build a float16 text-generation pipeline, spreading the weights across available devices
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Wrap the prompt in Llama 2's [INST] chat template before generating
sequences = pipeline(
    f'<s>[INST] {prompt} [/INST]',
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
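
The pipeline above loads the weights in `float16`, which needs roughly 26 GB of VRAM for a 13B model. On smaller GPUs, one option (not part of the original card) is to load the same checkpoint in 4-bit with `bitsandbytes`; the sketch below reuses the model id and generation settings from the snippet above.

```python
# pip install transformers accelerate bitsandbytes

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mlabonne/llama-2-13b-guanaco"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quantize the weights to 4-bit NF4 at load time to shrink the memory footprint
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)

prompt = "What is a large language model?"
# The tokenizer prepends <s> itself, so only the [INST] ... [/INST] wrapper is added here
inputs = tokenizer(f"[INST] {prompt} [/INST]", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, do_sample=True, top_k=10, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Quantizing to 4-bit NF4 cuts the weight footprint to roughly a quarter of float16, at some cost in output quality and generation speed.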