---
license: apache-2.0
datasets:
- databricks/databricks-dolly-15k
language:
- en
---

## Model Introduction

- Base model: LLaMA2-7B
- Fine-tuning method: QLoRA
- Dataset: databricks/databricks-dolly-15k
- Hardware: a single RTX 4090
- Goal: instruction-tune the model (a sketch of a typical QLoRA training setup appears at the end of this card)

## Usage

- Load the data

```python
from datasets import load_dataset
from random import randrange

# Load the dataset from the Hub and pick a random sample
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")
sample = dataset[randrange(len(dataset))]
```

- Run the model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name_or_path = "snowfly/llama2-7b-QLoRA-dolly"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    trust_remote_code=True,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    load_in_4bit=True,  # load the weights in 4-bit so the model fits on a single consumer GPU
)
model = model.eval()

# Ask the model to reverse-engineer an instruction from a sampled response
prompt = f"""### Instruction:
Use the Input below to create an instruction, which could have been used to generate the input using an LLM.

### Input:
{sample['response']}

### Response:
"""

input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()
outputs = model.generate(input_ids=input_ids, max_new_tokens=100, do_sample=True, top_p=0.9, temperature=0.9)

print(f"Prompt:\n{sample['response']}\n")
print(f"Generated instruction:\n{tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0][len(prompt):]}")
print(f"Ground truth:\n{sample['instruction']}")
```
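
## Fine-tuning Sketch

This card does not publish the training script or hyperparameters, so the following is a minimal, illustrative sketch of how a QLoRA fine-tune of LLaMA2-7B on dolly-15k is typically set up with `peft` and `bitsandbytes`. The base checkpoint name, LoRA rank, alpha, dropout, and target modules below are assumptions for illustration, not the values used to produce this checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the frozen base weights: the "Q" in QLoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)

# Trainable low-rank adapters on the attention projections;
# r, lora_alpha, lora_dropout, and target_modules are illustrative choices
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

With a setup like this, only the LoRA adapter parameters receive gradient updates while the 4-bit base weights stay frozen, which is what keeps a 7B fine-tune within the 24 GB memory of a single RTX 4090.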