Kohsaku/gemma-2-9b-finetune-2

Kohsaku committed on Dec 9, 2024

Commit 1855d93 · verified · 1 Parent(s): 1e7836a

first commit

Files changed (1)

README.md +45 -3

@@ -6,17 +6,59 @@ tags:
 - unsloth
 - gemma2
 - trl
-license: apache-2.0
 language:
-- en
 ---
 # Uploaded  model
 - **Developed by:** Kohsaku
-- **License:** apache-2.0
 - **Finetuned from model :** google/gemma-2-9b
 This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 - unsloth
 - gemma2
 - trl
+license: gemma
 language:
+- en
+datasets:
+- llm-jp/magpie-sft-v1.0
 ---
 # Uploaded  model
 - **Developed by:** Kohsaku
+- **License:** Gemma 2 License
 - **Finetuned from model :** google/gemma-2-9b
 This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+# Sample Use
+``` python
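+from unsloth import FastLanguageModel
+import os
+import torch
+
+# Assumption: the Hugging Face access token is stored in the HF_TOKEN environment variable.
+HF_TOKEN = os.environ.get("HF_TOKEN")
+
+model_name = "Kohsaku/gemma-2-9b-finetune-2"
+#@title For README verification
+max_seq_length = 1024
+dtype = None  # None lets Unsloth pick the dtype automatically
+load_in_4bit = True  # load the weights with 4-bit quantization
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name = model_name,
+    max_seq_length = max_seq_length,
+    dtype = dtype,
+    load_in_4bit = load_in_4bit,
+    token = HF_TOKEN,
+)
+FastLanguageModel.for_inference(model)  # switch the model to inference mode
+text = "自然言語処理とは何か"  # "What is natural language processing?"
+tokenized_input = tokenizer.encode(text, add_special_tokens=True, return_tensors="pt").to(model.device)
+# Create the attention_mask
+# attention_mask = torch.ones(tokenized_input.shape, device=model.device)
+with torch.no_grad():
+    output = model.generate(
+        tokenized_input,
+        max_new_tokens = 1024,
+        use_cache = True,
+        do_sample = False,  # greedy decoding
+        repetition_penalty = 1.2,
+    )[0]
+print(tokenizer.decode(output))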
+```
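
The commented-out `attention_mask` line in the sample builds an all-ones mask, which is only appropriate for a single prompt with no padding. When several prompts are padded to a common length, the mask is normally 0 at the padding positions. The following is a minimal sketch of that idea in plain Python; the helper `build_attention_mask`, the token IDs, and the pad ID of 0 are hypothetical values chosen for illustration and do not come from the model's tokenizer.

```python
from typing import List


def build_attention_mask(batch: List[List[int]], pad_id: int) -> List[List[int]]:
    """Return 1 for real tokens and 0 for padding positions."""
    return [[0 if tok == pad_id else 1 for tok in seq] for seq in batch]


# Hypothetical token IDs; 0 stands in for the pad token here.
batch = [
    [2, 15, 27, 9],  # full-length prompt, no padding
    [2, 31, 0, 0],   # shorter prompt padded on the right
]
mask = build_attention_mask(batch, pad_id=0)
print(mask)
```

In practice the tokenizer usually returns this mask itself when it is called on a batch with padding enabled, so a hand-built mask like this is mainly useful for checking what the model receives.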