sarath-shekkizhar committed on
Commit
602f82f
1 Parent(s): ba61d13

adding model card

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -1,9 +1,3 @@
-# TenyxChat: Language Model Alignment using Tenyx Fine-tuning
-
-Introducing TenyxChat, a series of ChatGPT-like models trained to function as useful assistants through preference tuning, using Tenyx's recently released advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our first chat model in the series, TenyxChat-7B-v1, is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
-
-We fine-tune [Openchat-3.5](https://arxiv.org/pdf/2309.11235.pdf) with our proprietary approach ([blog](https://www.tenyx.com/post/forgetting-and-toxicity-in-llms-a-deep-dive-on-fine-tuning-methods), [service](https://www.tenyx.com/fine-tuning)), which yields an increase in [MT-Bench](https://arxiv.org/abs/2306.05685) score without a drop in the model's performance on other benchmarks. Our approach aims to mitigate forgetting in LLMs in a computationally efficient manner, thereby enabling continual fine-tuning without altering the pre-trained output distribution. TenyxChat-7B-v1 was trained on eight A100s (80GB) for two hours, with a training setup obtained from HuggingFaceH4 ([GitHub](https://github.com/huggingface/alignment-handbook)).
-
 ---
 model_type: Fine-tuned 7B model for chat.
 license: {apache-2.0}
@@ -11,6 +5,12 @@ base_model: {openchat/openchat_3.5}
 demo: [Hugging Face Spaces](https://huggingface.co/spaces/tenyx/TenyxChat-7B-v1)
 ---
 
+# TenyxChat: Language Model Alignment using Tenyx Fine-tuning
+
+Introducing TenyxChat, a series of ChatGPT-like models trained to function as useful assistants through preference tuning, using Tenyx's recently released advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our first chat model in the series, TenyxChat-7B-v1, is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
+
+We fine-tune [Openchat-3.5](https://arxiv.org/pdf/2309.11235.pdf) with our proprietary approach ([blog](https://www.tenyx.com/post/forgetting-and-toxicity-in-llms-a-deep-dive-on-fine-tuning-methods), [service](https://www.tenyx.com/fine-tuning)), which yields an increase in [MT-Bench](https://arxiv.org/abs/2306.05685) score without a drop in the model's performance on other benchmarks. Our approach aims to mitigate forgetting in LLMs in a computationally efficient manner, thereby enabling continual fine-tuning without altering the pre-trained output distribution. TenyxChat-7B-v1 was trained on eight A100s (80GB) for two hours, with a training setup obtained from HuggingFaceH4 ([GitHub](https://github.com/huggingface/alignment-handbook)).
+
 
 ## Usage
 
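
The `## Usage` section is truncated in this diff view. As a stopgap, here is a minimal inference sketch: the Hub model id `tenyx/TenyxChat-7B-v1` and the OpenChat-3.5-style `GPT4 Correct User:` prompt format are assumptions inferred from the card's base model, not confirmed by this commit.

```python
# Hypothetical usage sketch for TenyxChat-7B-v1. The model id and the
# OpenChat-3.5-style chat template are assumptions, not confirmed here.

def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt in OpenChat-3.5's chat template."""
    return (
        f"GPT4 Correct User: {user_message}<|end_of_turn|>"
        "GPT4 Correct Assistant:"
    )

def generate(user_message: str, model_id: str = "tenyx/TenyxChat-7B-v1") -> str:
    """Run single-turn inference. transformers is imported lazily so the
    prompt helper above works even without the dependency installed."""
    from transformers import pipeline  # requires `pip install transformers`
    chat = pipeline("text-generation", model=model_id)
    out = chat(build_prompt(user_message), max_new_tokens=256)
    return out[0]["generated_text"]
```

For multi-turn chat, prior turns would be concatenated in the same `GPT4 Correct User: …<|end_of_turn|>GPT4 Correct Assistant: …<|end_of_turn|>` pattern before the final assistant prefix.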