Update README.md
---
library_name: peft
language:
- pt
---
## Training procedure

The following `bitsandbytes` quantization config was used during training:
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float16
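To double-check which base checkpoint the adapter expects before downloading the full weights, the stored PEFT config can be inspected first. A minimal sketch, assuming only that `MatNLP/Sectrum` exposes the standard PEFT adapter config:

```python
from peft import PeftConfig

# The adapter config records the base checkpoint it was trained against
config = PeftConfig.from_pretrained("MatNLP/Sectrum")
print(config.base_model_name_or_path)  # NousResearch/Llama-2-7b-chat-hf
```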

## How to use the model

```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

# Quantization settings
use_4bit = True
bnb_4bit_compute_dtype = "float16"
bnb_4bit_quant_type = "nf4"
use_nested_quant = False

# Load the tokenizer and model with the QLoRA configuration
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)

# Load the base model in 4-bit and apply the Sectrum adapter on top
config = PeftConfig.from_pretrained("MatNLP/Sectrum")
base_model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf", quantization_config=bnb_config
)
model = PeftModel.from_pretrained(base_model, "MatNLP/Sectrum")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf", trust_remote_code=True
)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# Prepare the prompt ("How do I protect my e-mail?")
prompt = "Como proteger meu e-mail?"

# Build the text-generation pipeline
pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=200,
)

# Run the pipeline, wrapping the prompt in the Llama-2 chat format
resultado = pipe(f"<s>[INST] {prompt} [/INST]")

# Keep only the answer: the text between [/INST] and the </s> end-of-sequence token
print(resultado[0]['generated_text'].split("[/INST]")[1].split('</s>')[0])
```
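The original listing carried a commented-out `TextStreamer` line. A minimal sketch of that streaming variant, assuming a `transformers` version recent enough that the text-generation pipeline forwards `streamer` to `generate()`:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True)
pipe(f"<s>[INST] {prompt} [/INST]", streamer=streamer)
```

With streaming, the answer appears incrementally on stdout, so the string-splitting step above is unnecessary.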