filipealmeida committed · Commit cd3f6da · Parent(s): 9411452

Upload folder using huggingface_hub
Files changed:
- .gitattributes +3 -0
- README.md +72 -1
- ggml-model-Q4_0.gguf +3 -0
- ggml-model-Q8_0.gguf +3 -0
- ggml-model-f16.gguf +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+ggml-model-Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+ggml-model-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+ggml-model-f16.gguf filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -1,3 +1,74 @@
 ---
-license: cc-by-3.0
+license: cc-by-sa-3.0
+datasets:
+- VMware/open-instruct-v1-oasst-dolly-hhrlhf
+language:
+- en
+library_name: transformers
+pipeline_tag: text-generation
 ---
+
+# Open LLama 13B Open Instruct
+- Model creator: [VMware](https://huggingface.co/VMware)
+- Original model: [Open LLama 13B Open Instruct](https://huggingface.co/VMware/open-llama-13b-open-instruct)
+
+## Description
+
+This repo contains the GGUF model files for [Open LLama 13B Open Instruct](https://huggingface.co/VMware/open-llama-13b-open-instruct).
+
+These files are compatible with [llama.cpp](https://github.com/ggerganov/llama.cpp).
+
+# VMware/open-llama-13B-open-instruct
+Instruction-tuned version of the fully trained Open LLama 13B model. The model is open for <b>COMMERCIAL USE</b>. <br>
+
+<b>NOTE</b>: The model was trained using the Alpaca prompt template. \
+<b>NOTE</b>: The fast tokenizer produces incorrect encodings; set `use_fast=False` when instantiating the tokenizer. \
+<b>NOTE</b>: The model may struggle with code, as the tokenizer merges multiple spaces.
+
+## License
+- <b>Commercially viable</b>
+- Instruction dataset, [VMware/open-instruct-v1-oasst-dolly-hhrlhf](https://huggingface.co/datasets/VMware/open-instruct-v1-oasst-dolly-hhrlhf), is under cc-by-sa-3.0
+- Language model, [openlm-research/open_llama_13b](https://huggingface.co/openlm-research/open_llama_13b), is under apache-2.0
+
+## Nomenclature
+
+- Model: Open-llama
+- Model size: 13B parameters
+- Dataset: Open-instruct-v1 (oasst, dolly, hhrlhf)
+
+## Use in Transformers
+
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_name = 'VMware/open-llama-13b-open-instruct'
+
+# The fast tokenizer mis-encodes text for this model, so load the slow tokenizer.
+tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
+model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map='sequential')
+
+# Alpaca prompt template used during instruction tuning.
+prompt_template = "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"
+
+prompt = 'Explain in simple terms how the attention mechanism of a transformer model works'
+
+input_text = prompt_template.format(instruction=prompt)
+input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+
+# Generate, then strip the prompt tokens before decoding the response.
+output = model.generate(input_ids, max_length=512)
+input_length = input_ids.shape[1]
+output = output[:, input_length:]
+print(tokenizer.decode(output[0]))
+```
+
+## Finetuning details
+The finetuning scripts will be available in our [RAIL GitHub Repository](https://github.com/vmware-labs/research-and-development-artificial-intelligence-lab/tree/main/instruction-tuning).
+
+## Evaluation
+
+<B>TODO</B>
ggml-model-Q4_0.gguf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c505951413c201868d893f1c91f576aa260e2e9baa5f7a497c0aa6688b22c7be
+size 7365869152
ggml-model-Q8_0.gguf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7635749b68b4d708869ad7603d0eb3415d385a4f17d7fb6c22009c25f9408a3a
+size 13831353952
ggml-model-f16.gguf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a2f684e3832feb7d66e9564c3ffeb6849b0d8c970ef847297733eced1e35cdab
+size 26033337888
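The three .gguf entries above are Git LFS pointer files (spec version, sha256 oid, and byte size); the binaries themselves live in LFS storage and are resolved when the repo is cloned or downloaded. A hedged sketch of fetching a single file with huggingface_hub, where the repo id is a placeholder to be replaced with this repository's actual id:

```python
# Hedged sketch: download one GGUF file without cloning the whole repo.
# "your-namespace/your-gguf-repo" is a placeholder repo id, not taken from this commit.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="your-namespace/your-gguf-repo",
    filename="ggml-model-Q4_0.gguf",
)
print(local_path)  # cached local path, ready to pass to llama.cpp
```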