VMware/open-llama-13B-open-instruct

An instruction-tuned version of the fully trained Open LLaMA 13B model. The model is open for COMMERCIAL USE.

NOTE: The model was trained using the Alpaca prompt template.
NOTE: The fast tokenizer produces incorrect encodings. Set use_fast=False when instantiating the tokenizer.
NOTE: The model might struggle with code because the tokenizer merges multiple consecutive spaces.
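To see how the space-merging note affects a particular snippet, one option is to round-trip the text through the tokenizer and compare the runs of consecutive spaces before and after. The sketch below is only an illustration: the helper names `long_space_runs` and `indentation_preserved` are hypothetical and not part of this model or of Transformers, and the check assumes a tokenizer object that exposes the usual `encode` and `decode` methods.

```python
import re


def long_space_runs(text):
    """Return the length of every run of two or more consecutive spaces."""
    return [len(run) for run in re.findall(r" {2,}", text)]


def indentation_preserved(tokenizer, snippet):
    """Return True if encode -> decode keeps every multi-space run intact.

    `tokenizer` is assumed to behave like a Hugging Face tokenizer,
    i.e. to provide encode(text, add_special_tokens=...) and
    decode(ids, skip_special_tokens=...).
    """
    ids = tokenizer.encode(snippet, add_special_tokens=False)
    decoded = tokenizer.decode(ids, skip_special_tokens=True)
    return long_space_runs(decoded) == long_space_runs(snippet)


# Hypothetical usage with the tokenizer loaded as shown in "Use in Transformers":
# indentation_preserved(tokenizer, "def f():\n    return 1")
```

If the check returns False for a code snippet, indentation-sensitive prompts or outputs (Python in particular) are likely to be affected by the merging.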

License

Nomenclature

  • Model: Open-llama
  • Model Size: 13B parameters
  • Dataset: Open-instruct-v1 (oasst, dolly, hhrlhf)

Use in Transformers

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'VMware/open-llama-13b-open-instruct'

# The fast tokenizer encodes incorrectly for this model, so force the slow one.
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)

# Load the weights in half precision; device_map='sequential' fills the available GPUs in order.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map='sequential')

# Alpaca prompt template that the model was trained with.
prompt_template = "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"

prompt = 'Explain in simple terms how the attention mechanism of a transformer model works'

input_text = prompt_template.format(instruction=prompt)
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

# max_length counts the prompt tokens plus the generated tokens.
output_ids = model.generate(input_ids, max_length=512)

# Drop the prompt tokens so that only the generated continuation is decoded.
input_length = input_ids.shape[1]
output = tokenizer.decode(output_ids[0, input_length:])

print(output)
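The decoded text from the example above may still carry the end-of-sequence marker, and an instruction-tuned model can occasionally run on into a new "### Instruction:" block. A minimal post-processing sketch follows; the helper `clean_response` is hypothetical, and the default `eos_token="</s>"` is an assumption based on the usual LLaMA tokenizer convention (check `tokenizer.eos_token` for the actual value).

```python
def clean_response(decoded, eos_token="</s>", stop_marker="### Instruction:"):
    """Trim a decoded generation down to the first response.

    Cuts the text at the first end-of-sequence marker, then at the first
    follow-on instruction header, and strips surrounding whitespace.
    Both markers are assumptions and can be overridden by the caller.
    """
    text = decoded.split(eos_token, 1)[0]
    text = text.split(stop_marker, 1)[0]
    return text.strip()


# Hypothetical usage with the `output` string from the example above:
# print(clean_response(output))
```

Doing the trimming on the decoded string keeps the generation call unchanged, at the cost of generating tokens that are then discarded.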

Finetuning details

The finetuning scripts will be available in our RAIL GitHub repository.

Evaluation

TODO
