snowfly/llama2-7b-QLoRA-dolly

snowfly committed on Jun 14

Commit 1ee85c5 • 1 Parent(s): 1e98455

Create README.md

Files changed (1)

README.md +55 -0

README.md ADDED

	@@ -0,0 +1,55 @@

+---
+license: apache-2.0
+datasets:
+- databricks/databricks-dolly-15k
+language:
+- en
+---
+## Model Introduction
+- Base model: LLaMA2-7B
+- Fine-tuning method: QLoRA
+- Dataset: databricks/databricks-dolly-15k
+- Goal: instruction-tune the model
+## Usage
+- Load the data
+```
+from datasets import load_dataset
+from random import randrange
+# Load the dataset from the Hub and pick one random sample
+dataset = load_dataset("databricks/databricks-dolly-15k", split="train")
+sample = dataset[randrange(len(dataset))]
+```
+- Use the model
+```
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+model_name_or_path = "snowfly/llama2-7b-QLoRA-dolly"
+tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_name_or_path, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path=model_name_or_path,
+                                  trust_remote_code=True,
+                                  low_cpu_mem_usage=True,
+                                  torch_dtype=torch.float16,
+                                  load_in_4bit=True)
+model = model.eval()
+prompt = f"""### Instruction:
+Use the Input below to create an instruction, which could have been used to generate the input using an LLM.
+### Input:
+{sample['response']}
+### Response:
+"""
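+# Tokenize the prompt and move it to the device the model was loaded on
+input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.to(model.device)
+with torch.no_grad():
+    outputs = model.generate(input_ids=input_ids, max_new_tokens=100, do_sample=True, top_p=0.9, temperature=0.9)
+# Decode only the newly generated tokens; slicing the decoded text by len(prompt) can misalign
+generated = tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)
+print(f"Prompt:\n{sample['response']}\n")
+print(f"Generated instruction:\n{generated}")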
+print(f"Ground truth:\n{sample['instruction']}")
+```
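
The prompt in the README above follows a fixed `### Instruction / ### Input / ### Response` layout. As a minimal sketch (not part of the commit), that layout could be factored into a small helper so it can be reused and checked without loading the model. The names `INSTRUCTION` and `build_prompt` are hypothetical and introduced here only for illustration; the sample record is made up and merely mimics the `instruction` and `response` fields the README reads from databricks-dolly-15k.

```python
# Hypothetical helper, not from the README: it reproduces the README's prompt layout.
INSTRUCTION = (
    "Use the Input below to create an instruction, "
    "which could have been used to generate the input using an LLM."
)


def build_prompt(response: str) -> str:
    """Return the prompt text, ending right after the '### Response:' header."""
    return (
        f"### Instruction:\n{INSTRUCTION}\n"
        f"### Input:\n{response}\n"
        f"### Response:\n"
    )


if __name__ == "__main__":
    # Made-up record with the same field names the README uses.
    sample = {
        "instruction": "Name a primary color.",
        "response": "Red is a primary color.",
    }
    print(build_prompt(sample["response"]))
```

The returned string can be passed to the tokenizer in place of the inline f-string; keeping the trailing `### Response:` header last means the model's continuation is the generated instruction.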