---
license: mit
base_model: TheBloke/zephyr-7B-alpha-GPTQ
tags:
- generated_from_trainer
- gptq
- peft
model-index:
- name: thesa
  results: []
datasets:
- loaiabdalslam/counselchat
language:
- en
pipeline_tag: text-generation
---

# Thesa: A Therapy Chatbot 👩🏻‍⚕️

Thesa is an experimental therapy chatbot trained on mental health data and fine-tuned from the Zephyr GPTQ model, which uses quantization to reduce computational and storage costs.

## Model description

- Model type: A fine-tuned version of Zephyr 7B Alpha - GPTQ on various mental health datasets
- Language(s): English
- License: MIT
- Fine-tuned from: [TheBloke/zephyr-7B-alpha-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-alpha-GPTQ)
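
The snippet below is a minimal inference sketch, not an official usage recipe: the repo id `johnhandleyd/thesa` is inferred from the model-index name, and it assumes the GPTQ weights load through transformers' built-in GPTQ integration (which needs the optimum and auto-gptq packages listed under framework versions).

```python
# Minimal inference sketch. Assumptions: repo id "johnhandleyd/thesa"
# (inferred from the model-index name) and weights loadable through
# transformers' GPTQ integration (optimum + auto-gptq installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "johnhandleyd/thesa"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Zephyr-style models usually expect a chat template; a bare prompt is
# used here only to keep the sketch short.
prompt = "I've been feeling anxious lately. What can I do?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the repo stores a PEFT adapter rather than merged weights (the `peft` tag suggests it might), `AutoPeftModelForCausalLM` from the peft library would be the loading path instead.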

## Intended uses & limitations

This model is purely experimental and should not be used as a substitute for a mental health professional.

## Training evaluation

Training loss:

<img src="imgs/loss_27.2.24.png" alt="loss" width="550"/>

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 0.0002
- warmup_ratio: 0.1
- train_batch_size: 8
- eval_batch_size: 8
- gradient_accumulation_steps: 1
- seed: 35
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
- mixed_precision_training: Native AMP
- fp16: True
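
As a rough illustration, these settings map onto transformers `TrainingArguments` as below; the output directory is an assumption, and the Adam betas and epsilon listed above are the Trainer defaults.

```python
# Illustrative only: the hyperparameters above expressed as transformers
# TrainingArguments. Values come from the list; output_dir is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="thesa-output",        # assumed
    learning_rate=2e-4,
    warmup_ratio=0.1,                 # lr_scheduler_warmup_ratio
    lr_scheduler_type="cosine",
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=1,
    seed=35,
    num_train_epochs=10,
    fp16=True,                        # Native AMP mixed precision
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults.
)
```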

Learning rate over time (a warmup ratio of 0.1 was used during training):

<img src="imgs/lr_27.2.24.png" alt="lr" width="550"/>

### Framework versions

- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1
- Accelerate 0.27.2
- PEFT 0.8.2
- Auto-GPTQ 0.6.0
- TRL 0.7.11
- Optimum 1.17.1
- Bitsandbytes 0.42.0