Edit model card

CausalLM 14B-DPO-alpha - GGUF


This repo contains GGUF format model files for CausalLM's 14B-DPO-alpha.

About GGUF

!! introduction to GUFF is copied from TheBloke's model card !!

GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.

Here is an incomplate list of clients and libraries that are known to support GGUF:

  • llama.cpp. The source project for GGUF. Offers a CLI and a server option.
  • text-generation-webui, the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
  • KoboldCpp, a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for story telling.
  • LM Studio, an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration.
  • LoLLMS Web UI, a great web UI with many interesting and unique features, including a full model library for easy model selection.
  • Faraday.dev, an attractive and easy to use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
  • ctransformers, a Python library with GPU accel, LangChain support, and OpenAI-compatible AI server.
  • llama-cpp-python, a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
  • candle, a Rust ML framework with a focus on performance, including GPU support, and ease of use.

Prompt template: ChatML



The license for the original model is listed as "wtfpl", but subject to the "Meta Llama 2 License Terms".

Original model card: CausalLM's CausalLM 14B-DPO-alpha

For details, please refer to the version without DPO training: CausalLM/14B.

Model MT-Bench
GPT-4 8.99
GPT-3.5-Turbo 7.94
Zephyr-7b-β (Overfitting) 7.34
Zephyr-7b-α 6.88
CausalLM/14B-DPO-α 7.618868
CausalLM/7B-DPO-α 7.038125

It should be noted that this is not a version that continues training on CausalLM/14B & 7B, but rather an optimized version that has undergone DPO training concurrently on a previous training branch, and some detailed parameters may have changed. You will still need to download the full model.

The beta branch will soon be released, employing some aggressive approaches that might be detrimental in certain tasks, in order to achieve better alignment with human preferences, aiming to meet or exceed the GPT-3.5 benchmarks. Stay tuned.

Disclaimer: Please note that the model was trained on unfiltered internet data. Since we do not have the capacity to vet all of it, there may be a substantial amount of objectionable content, pornography, violence, and offensive language present that we are unable to remove. Therefore, you will still need to complete your own checks on the model's safety and filter keywords in the output. Due to computational resource constraints, we are presently unable to implement RLHF for the model's ethics and safety, nor training on SFT samples that refuse to answer certain questions for restrictive fine-tuning.


需要注意的是,这并不是在 CausalLM/14B & 7B 上继续训练的版本,而是在之前的训练分支上同时进行了 DPO 训练的优化版本,一些细节参数可能发生了变化。 您仍然需要下载完整模型。



Downloads last month
Unable to determine this model's library. Check the docs .

Datasets used to train tastypear/CausalLM-14B-DPO-alpha-GGUF