DeciDPObyBB - a 7b DeciLM Finetune using DPO

Built by fine-tuning DeciLM-7B-instruct on the Intel Orca DPO Pairs dataset

created by bhaiyabot

built for research and learning purposes!
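
For readers new to DPO, here is a minimal sketch of the per-pair loss described in the DPO paper cited below. It is not the training code used for this model: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions. Each input is the summed log-probability of a full response under either the policy being trained or the frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_logratio - rejected_logratio))).

    All names and the default beta are illustrative assumptions,
    not values taken from this model's training run.
    """
    # How much more the policy likes each response than the reference does
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The policy favors the chosen answer more than the reference does -> lower loss
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
# The policy is identical to the reference -> loss equals log(2)
neutral = dpo_loss(-2.0, -2.0, -2.0, -2.0)
print(improved < neutral)
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model, which is what a preference dataset of chosen/rejected pairs is meant to drive.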

usage:

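import torch
from transformers import AutoTokenizer, pipeline

new_model = "rohansolo/DeciDPObyBB"  # this repository
user_input = "How do I make pancakes?"  # placeholder prompt; replace with your own

message = [
    {"role": "system", "content": "You are a very helpful assistant chatbot that thinks step by step"},
    {"role": "user", "content": user_input}
]
tokenizer = AutoTokenizer.from_pretrained(new_model)
prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

# The repository contains custom model code, so trust_remote_code=True is needed
generator = pipeline(
    "text-generation",
    model=new_model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

sequences = generator(
    prompt,
    do_sample=True,
    temperature=1.0,
    num_beams=5,
    max_length=1000,
    pad_token_id=tokenizer.eos_token_id,
)
print(sequences[0]['generated_text'])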

@misc{DeciFoundationModels,
title = {DeciLM-7B-instruct},
author = {DeciAI Research Team},
year = {2023},
url = {https://huggingface.co/Deci/DeciLM-7B-instruct},
}

@misc{rafailov2023direct,
      title={Direct Preference Optimization: Your Language Model is Secretly a Reward Model}, 
      author={Rafael Rafailov and Archit Sharma and Eric Mitchell and Stefano Ermon and Christopher D. Manning and Chelsea Finn},
      year={2023},
      eprint={2305.18290},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

more details to come soon

Model size: 7.04B params
Tensor type: FP16 (Safetensors)