vistagi
/

Mixtral-8x7b-v0.1-dpo

Mixtral-8x7b-v0.1-dpo / README.md

Create README.md

9eb8bc4 verified 8 months ago

	---
	license: apache-2.0
	datasets:
	- HuggingFaceH4/ultrafeedback_binarized
	language:
	- en
	---

	# Introduction
	This model, vistagi/Mixtral-8x7b-v0.1-sft, was trained on the Ultrachat-200K dataset by supervised fine-tuning, with Mixtral-8x7b-v0.1 as the base model. Training used LoRA in bfloat16 precision.
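
To give a rough sense of what LoRA training in bfloat16 implies, the sketch below counts the adapter parameters LoRA would add to the attention projections. It is illustrative only: the rank (`ASSUMED_RANK`) and the adapted modules (`ASSUMED_TARGETS`) are assumptions, since this card does not state them, and the layer shapes assume the published Mixtral-8x7B configuration (hidden size 4096, 8 key/value heads of dimension 128, 32 layers). All helper names are hypothetical.

```python
# Illustrative arithmetic only. The rank and target modules are ASSUMPTIONS;
# the model card does not state the actual LoRA hyperparameters.

HIDDEN = 4096          # Mixtral-8x7B hidden size
KV_DIM = 8 * 128       # 8 key/value heads, head dimension 128
NUM_LAYERS = 32        # decoder layers

# (d_in, d_out) of each attention projection
ATTN_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}

ASSUMED_RANK = 16                        # hypothetical LoRA rank
ASSUMED_TARGETS = ("q_proj", "v_proj")   # hypothetical adapted modules


def lora_param_count(d_in, d_out, rank):
    """LoRA freezes the d_out x d_in weight and trains two small factors:
    A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def total_trainable(rank=ASSUMED_RANK, targets=ASSUMED_TARGETS):
    """Trainable adapter parameters summed over all decoder layers."""
    per_layer = sum(lora_param_count(*ATTN_SHAPES[name], rank) for name in targets)
    return per_layer * NUM_LAYERS


def bf16_bytes(num_params):
    """bfloat16 stores each parameter in 2 bytes."""
    return 2 * num_params


if __name__ == "__main__":
    n = total_trainable()
    print(f"trainable LoRA parameters: {n:,}")
    print(f"adapter size in bfloat16:  {bf16_bytes(n) / 2**20:.1f} MiB")
```

Under these assumed settings the adapters hold a few million parameters, a tiny fraction of the base model, which is why LoRA makes fine-tuning a model of this size practical.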

	## Details
	Libraries used:
	- torch
	- deepspeed
	- pytorch lightning
	- transformers
	- peft