---
library_name: peft
---

## Model Usage

```python
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Load the base model's tokenizer and pad with the EOS token.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen25-7b-mono", trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

# Load the base model together with the fine-tuned LoRA adapter on GPU 0.
device_map = {"": 0}
model = AutoPeftModelForCausalLM.from_pretrained(
    "0xk1h0/codegen2.5-7b-py150k-r20-QLoRA",
    device_map=device_map,
    torch_dtype=torch.bfloat16,
)

text = """
# Generate AES MODE encrypt python function.
"""

inputs = tokenizer(text, return_tensors="pt").to("cuda")
outputs = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=256,
    do_sample=True,
    temperature=0.4,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training procedure

The following `bitsandbytes` quantization config was used during training:
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float16

### Framework versions

- PEFT 0.5.0
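
### Quantization config sketch

For reference, the settings listed above correspond to a `BitsAndBytesConfig` roughly like the one below. The original training script is not included in this repository, so the base-model ID and the surrounding loading code are assumptions (the base model is taken from the usage example).

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization matching the values listed under "Training procedure".
# llm_int8_* fields are left at their defaults, which already match the listing.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float16,
)

# Assumption: the base model is Salesforce/codegen25-7b-mono, as in the usage
# example above; the actual training script may have loaded it differently.
base_model = AutoModelForCausalLM.from_pretrained(
    "Salesforce/codegen25-7b-mono",
    quantization_config=bnb_config,
    device_map={"": 0},
    trust_remote_code=True,
)
```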
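
### Adapter setup sketch

The repository name ("r20-QLoRA") suggests a LoRA adapter of rank 20 trained on the quantized base model. A minimal sketch of that setup with PEFT is shown below; the alpha, dropout, and target-module values are illustrative guesses, not values taken from the actual training run.

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Prepare the 4-bit base model for k-bit training (layer norm casting,
# gradient checkpointing compatibility, etc.).
base_model = prepare_model_for_kbit_training(base_model)

# Assumption: rank 20 from the repository name; other hyperparameters are
# placeholders chosen for illustration only.
lora_config = LoraConfig(
    r=20,
    lora_alpha=40,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj"],  # assumption: adjust to the base model's attention projection names
)

peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()
```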