Jungwonchang committed
Commit 680febb
1 Parent(s): c67c06c

Update README.md

Files changed (1)
  1. README.md +6 -17
README.md CHANGED
@@ -10,7 +10,7 @@ language:
 ---
 
 # Model Card for Model ID
- Korean Chatbot based on Alibaba's QWEN
+ Korean Chatbot based on Alibaba's [QWEN](https://github.com/QwenLM/Qwen)
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6232fdee38869c4ca8fd49e2/CBQ0cdD54Sd7-rbNt-Mkb.png)
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1fmcq1YZaIYg-cuCS4aadomutLmzSyEYI#scrollTo=6c1edcdc-158d-4043-a7c7-1d145ebf2cd1)
 (keep in mind that basic colab runtime with T4 GPU will lead to OOM error. Fine-tuned version of Qwen-14b-Chat-Int4 will not have this issue)
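For anyone following the Colab note above, here is a minimal inference sketch for running the Int4 chat model on a 16 GB T4. It is not part of this commit: it uses the stock Qwen `chat()` API rather than the README's own `qwen_chat_single_turn` helper, and it loads the base `Qwen/Qwen-14B-Chat-Int4` checkpoint as a stand-in, since the fine-tuned model's repo id is not shown here.

```python
# Hedged sketch, not from the commit: load a Qwen Int4 chat checkpoint on a single GPU.
# The Int4 (GPTQ) checkpoints additionally require auto-gptq and optimum to be installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-14B-Chat-Int4"  # stand-in; replace with the fine-tuned repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # keep the quantized weights on the available GPU
    trust_remote_code=True,  # Qwen ships custom modeling code on the Hub
).eval()

# Qwen's remote code exposes a chat() helper for single-turn queries.
query = "안녕하세요, 자기소개 해주세요."  # Korean: "Hello, please introduce yourself."
response, history = model.chat(tokenizer, query, history=None)
print(response)
```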
@@ -190,21 +190,10 @@ response = qwen_chat_single_turn(model, tokenizer, device, query=query,
 ### Training Procedure
 
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
- #### Preprocessing [optional]
-
- [More Information Needed]
-
-
- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
+ The model was fine-tuned using LoRA (Low-Rank Adaptation), which allows for efficient training of large language models by updating only a small set of parameters.
+ The fine-tuning process was conducted on a single node with 2 GPUs, utilizing distributed training to improve training efficiency and speed.
+ The LoRA rank was set to 32, as I had only limited time to access the GPUs.
 
 ## Evaluation
 
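The training-procedure note added above only states that LoRA was used with rank 32 on a single node with 2 GPUs; the training script itself is not part of this change. As a rough illustration, a matching adapter configuration with Hugging Face `peft` could look like the sketch below, where the alpha, dropout, and target-module values are assumptions for illustration rather than values taken from the commit.

```python
# Hedged sketch of a LoRA setup consistent with the note above (rank 32, 1 node, 2 GPUs).
# Only the rank and GPU count come from the README change; everything else is assumed.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=32,                       # LoRA rank stated in the README change
    lora_alpha=64,              # assumption, not from the commit
    lora_dropout=0.05,          # assumption, not from the commit
    target_modules=["c_attn"],  # Qwen's fused attention projection (assumed target)
)

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-14B-Chat-Int4", device_map="auto", trust_remote_code=True
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```

A single-node, 2-GPU run would then typically be launched with `torchrun --nproc_per_node 2 train.py ...`, where `train.py` stands in for whatever fine-tuning script was actually used.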
 
@@ -296,8 +285,8 @@ Jungwon Chang
 
 ## Model Card Contact
 
- [More Information Needed]
-
+ cjw1994cool@gmail.com
+ cjw1994cool@korea.ac.kr
 
 ## Training procedure
 
 