fsndzomga committed on
Commit 623c5e8
1 Parent(s): 1b7e3e7

Update README.md

Files changed (1)
  1. README.md +3 -15
README.md CHANGED
@@ -13,18 +13,13 @@ license: apache-2.0
  This model is a fine-tuned version of the `CohereForAI/aya-23-8B` base model. It has been fine-tuned using a private dataset of prompt-response pairs that has been curated over the past two years. The fine-tuning process aimed to improve the model's ability to generate relevant and accurate responses in various conversational contexts.
 
  - **Developed by:** Franck Stéphane NDZOMGA
- - **Funded by [optional]:** [More Information Needed]
+ - **Funded by [optional]:** FS NDZOMGA
  - **Shared by [optional]:** Franck Stéphane NDZOMGA
  - **Model type:** Causal Language Model with LoRA Adapters
- - **Language(s) (NLP):** Primarily English (add other languages if applicable)
+ - **Language(s) (NLP):** Arabic, Chinese (simplified & traditional), Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish, Ukrainian, and Vietnamese
  - **License:** Apache-2.0
  - **Finetuned from model:** CohereForAI/aya-23-8B
 
- ### Model Sources [optional]
-
- - **Repository:** [Include the repository link here if publicly available]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
 
  ## Uses
 
@@ -87,7 +82,7 @@ The model was fine-tuned using a private dataset of prompt-response pairs curate
  #### Training Hyperparameters
 
  - **Precision:** Mixed precision (fp16)
- - **Number of epochs:** [Specify the number of epochs]
+ - **Number of epochs:** 1
  - **Batch size:** 1 (gradient accumulation steps: 16 to handle memory issues)
  - **Learning rate:** 5e-5
  - **Warmup steps:** 100
@@ -99,13 +94,6 @@ The model was fine-tuned using a private dataset of prompt-response pairs curate
  - **Remove unused columns:** False
  - **Mixed Precision:** Disabled (fp16=False) to avoid conflicts
 
- ### Speeds, Sizes, Times [optional]
-
- - **Training started:** [Date]
- - **Training completed:** [Date]
- - **Average training speed:** [Specify if available]
- - **Model size:** [Specify if available]
-
  ### Additional Information from Training Code
 
  - The training utilized the PEFT (Parameter Efficient Fine-Tuning) library, specifically leveraging the LoRA (Low-Rank Adaptation) method to fine-tune the `CohereForAI/aya-23-8B` model.
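
The hyperparameters and PEFT/LoRA setup described in the card map onto a fairly standard `transformers` + `peft` training loop. The commit does not include the training script, so the following is only a minimal sketch under assumptions: the LoRA rank, alpha, dropout, and target modules are placeholders, and the two toy prompt-response pairs stand in for the private dataset.

```python
# Minimal sketch of the kind of PEFT/LoRA fine-tune described by the card.
# LoRA rank/alpha/dropout/target_modules and the toy dataset are assumptions,
# not values taken from this commit.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "CohereForAI/aya-23-8B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Wrap the base model with LoRA adapters (placeholder hyperparameters).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Toy stand-in for the private prompt-response dataset.
texts = [
    "Prompt: What is LoRA?\nResponse: A parameter-efficient fine-tuning method.",
    "Prompt: Say hello.\nResponse: Hello! How can I help you today?",
]
train_dataset = Dataset.from_dict({"text": texts}).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

# Arguments mirroring the card: 1 epoch, batch size 1 with 16 gradient
# accumulation steps, learning rate 5e-5, 100 warmup steps, fp16 disabled,
# unused columns kept.
args = TrainingArguments(
    output_dir="aya-23-8b-lora",
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=5e-5,
    warmup_steps=100,
    fp16=False,
    remove_unused_columns=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

With `per_device_train_batch_size=1` and `gradient_accumulation_steps=16`, the effective batch size is 16, which matches the card's note about accumulating gradients to handle memory constraints.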
 
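Since the card describes the artifact as LoRA adapters on top of `CohereForAI/aya-23-8B`, inference typically means loading the base model and attaching the adapters with `peft`. A hedged sketch follows; the adapter repository id is a placeholder, as the commit does not name the final repo.

```python
# Hedged sketch: attach the fine-tuned LoRA adapters to the base model for inference.
# The adapter repo id below is a placeholder; the commit does not state the final path.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "CohereForAI/aya-23-8B"
adapter_id = "your-username/aya-23-8b-lora-adapters"  # placeholder, not from the card

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Generate a short completion for a conversational prompt.
inputs = tokenizer(
    "Explain what a LoRA adapter is in one sentence.", return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```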