---
license: apache-2.0
pipeline_tag: text-generation
datasets:
- mlabonne/guanaco-llama2-1k
---

# bosbos-2-7b

<center><img src="https://www.geeky-gadgets.com/wp-content/uploads/2023/08/Llama-2-unrestricted-local-install.webp" width="300"></center>

This is a `llama-2-7b-chat-hf` model fine-tuned with QLoRA (4-bit precision) on the [`mlabonne/guanaco-llama2`](https://huggingface.co/datasets/mlabonne/guanaco-llama2) dataset.

## 🔧 Training

It was trained in a Google Colab notebook on a single T4 GPU with high RAM. A sketch of a comparable QLoRA setup is shown below.
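
The training notebook itself is not part of this repo. For reference, here is a minimal sketch of a QLoRA run on this dataset, assuming `peft` and `trl`'s `SFTTrainer`; every hyperparameter is an illustrative placeholder, not the value actually used for this model:

```python
# Minimal QLoRA fine-tuning sketch. Hyperparameters are illustrative, not the
# exact values used to train bosbos-2-7b.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

base_model = "meta-llama/Llama-2-7b-chat-hf"
dataset = load_dataset("mlabonne/guanaco-llama2-1k", split="train")

# 4-bit NF4 quantization, as described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map={"": 0}
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter config (rank and alpha are assumptions)
peft_config = LoraConfig(r=64, lora_alpha=16, lora_dropout=0.1, task_type="CAUSAL_LM")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=TrainingArguments(
        output_dir="./results",
        num_train_epochs=1,
        per_device_train_batch_size=4,
        fp16=True,
    ),
)
trainer.train()
```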

## 💻 Usage

```python
# pip install transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "bosbos/bosbos_chat"
prompt = "What is 'prediction' in French?"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Wrap the user message in the Llama-2 chat format: <s>[INST] ... [/INST]
sequences = pipeline(
    f"<s>[INST] {prompt} [/INST]",
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
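
The `[INST] ... [/INST]` wrapper above is the standard Llama-2 chat format. For multi-turn use, earlier turns are concatenated into the prompt; here is a small sketch, assuming this model keeps the base `llama-2-7b-chat-hf` template (the helper below is hypothetical, not part of this repo, and omits the optional `<<SYS>>` system prompt):

```python
# Hypothetical helper illustrating the standard Llama-2 multi-turn chat format.
def format_llama2_chat(turns):
    """turns: list of (user_message, assistant_reply); reply is None for the pending turn."""
    prompt = ""
    for user, assistant in turns:
        prompt += f"<s>[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant} </s>"
    return prompt

history = [
    ("What is 'prediction' in French?", "It is 'prédiction'."),
    ("Use it in a sentence.", None),
]
print(format_llama2_chat(history))
# <s>[INST] What is 'prediction' in French? [/INST] It is 'prédiction'. </s><s>[INST] Use it in a sentence. [/INST]
```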

Alternatively, load the model in 4-bit with `bitsandbytes`:
```python
# pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    pipeline,
)

################################################################################
# bitsandbytes parameters
################################################################################

# Activate 4-bit precision base model loading
use_4bit = True

# Compute dtype for 4-bit base models
bnb_4bit_compute_dtype = "float16"

# Quantization type (fp4 or nf4)
bnb_4bit_quant_type = "nf4"

# Activate nested quantization for 4-bit base models (double quantization)
use_nested_quant = False

# Load the entire model on GPU 0
device_map = {"": 0}

model_name = "bosbos/bosbos_chat"

# Load tokenizer and model with the QLoRA configuration
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map=device_map,
)
model.config.use_cache = False
model.config.pretraining_tp = 1

# Load the LLaMA tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"  # Fix weird overflow issue with fp16 training

# Run a text-generation pipeline with the model
prompt = "What is 'prediction' in French?"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]["generated_text"])
```

Output:

> "Prédiction" is a noun that refers to the act of making a forecast or an estimate of something that will happen in the future. It can also refer to the result of such a forecast or estimate.
>
> For example:
> * "La prédiction de la météo est que il va pleuvoir demain." (The weather forecast is that it will rain tomorrow.)
> * "La prédiction de la course de chevaux est que le favori va gagner." (The prediction of the horse race is that the favorite will win.)
>
> In English, the word "prediction" is often used in a similar way, but it can also refer to a statement or a prophecy about something that has already happened or is happening.
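
One note on the snippets above: `max_length` bounds prompt plus completion, so a long prompt leaves little room for the answer. If you only want to cap the reply, `max_new_tokens` is the usual alternative; a small variation (sampling settings here are illustrative):

```python
# Variation: bound only the generated tokens, not prompt + completion.
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer)
result = pipe(
    f"<s>[INST] {prompt} [/INST]",
    max_new_tokens=150,  # caps the reply length regardless of prompt length
    do_sample=True,
    top_k=10,
)
print(result[0]["generated_text"])
```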