
Hyperparameters

  • 3 epochs
  • learning rate 1e-4 -> 1e-5 with cosine decay
  • batch size 128
  • max sequence length 2048
  • AdamW(weight decay=0.01, b1=0.9, b2=0.99, grad_clip=1.0)
  • no warmup
  • BF16
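
The learning-rate entry above can be made concrete with a small sketch. It assumes a standard half-cosine decay from 1e-4 at the first step to 1e-5 at the last step, with no warmup; the exact scheduler used in training is not stated in this card, and `total_steps` is a hypothetical parameter (it depends on the dataset size, 3 epochs, and batch size 128).

```python
import math


def cosine_lr(step, total_steps, lr_max=1e-4, lr_min=1e-5):
    """Cosine decay from lr_max (step 0) to lr_min (step total_steps), no warmup.

    Assumption: a plain half-cosine schedule; the training script may differ.
    """
    # Clamp progress to [0, 1] so out-of-range steps stay within the endpoints.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 250, 500, 750, 1000):
        print(s, cosine_lr(s, total))
```

Under this assumption, the rate starts at 1e-4, passes through the midpoint of the two values halfway through training, and ends at 1e-5.
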
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("heegyu/WizardVicuna-pythia-1.4b-deduped")
model = AutoModelForCausalLM.from_pretrained("heegyu/WizardVicuna-pythia-1.4b-deduped")

# Prompt format: "Human: <message>\n\nAssistant: "
inputs = tokenizer(["Human: Hi\n\nAssistant: "], return_tensors="pt")
# Generate up to 16 new tokens after the prompt
outputs = model.generate(**inputs, max_new_tokens=16)
# Keep special tokens so the end-of-text marker is visible in the output
print(tokenizer.batch_decode(outputs, skip_special_tokens=False))

output: ['Human: Hi\n\nAssistant: Hello! How can I assist you today?<|endoftext|>']
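
The decoded output above contains the prompt followed by the reply and the `<|endoftext|>` marker. A minimal sketch of helpers for building a single-turn prompt and extracting only the reply follows. The names `build_prompt` and `extract_reply` are hypothetical and not part of this model or of transformers; only the single-turn format shown in the example is taken from this card.

```python
EOS_TOKEN = "<|endoftext|>"  # end-of-text marker as it appears in the sample output


def build_prompt(user_message):
    """Build a single-turn prompt in the format used by the example above."""
    return f"Human: {user_message}\n\nAssistant: "


def extract_reply(decoded, prompt):
    """Return only the assistant reply from a decoded generation.

    Drops the leading prompt (if present) and everything from the first
    end-of-text marker onward.
    """
    reply = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    return reply.split(EOS_TOKEN, 1)[0].strip()


if __name__ == "__main__":
    prompt = build_prompt("Hi")
    decoded = "Human: Hi\n\nAssistant: Hello! How can I assist you today?<|endoftext|>"
    print(extract_reply(decoded, prompt))
```

The string returned by `build_prompt` can be passed to the tokenizer in place of the hard-coded prompt, and `extract_reply` can be applied to each element returned by `tokenizer.batch_decode`.
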

