psetialana commited on
Commit
cd92dd6
1 Parent(s): faecb10

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +40 -1
README.md CHANGED
@@ -12,4 +12,43 @@ language:
12
  - id
13
  datasets:
14
  - psetialana/multi_session_chat-informal_indonesian-transformed
15
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
12
  - id
13
  datasets:
14
  - psetialana/multi_session_chat-informal_indonesian-transformed
15
+ ---
16
+
17
+ # Personalized Sahabat AI Llama 3.1 8 B
18
+
19
+ - **Developed by:** [Pradana Setialana](https://www.linkedin.com/in/psetialana/)
20
+
21
+ This model is a fine-tuned version of [GoToCompany/llama3-8b-cpt-sahabatai-v1-instruct](https://huggingface.co/GoToCompany/llama3-8b-cpt-sahabatai-v1-instruct) on [psetialana/multi_session_chat-informal_indonesian-transformed](https://huggingface.co/datasets/psetialana/multi_session_chat-informal_indonesian-transformed) dataset.
22
+
23
+ ## Model description
24
+
25
+ This model can be used to personalize conversations and role-play based on the persona given with the prompt
26
+ ```
27
+ Kamu adalah sahabat user. Kamu memiliki karakter PERSONA_ASSISTANT. User memiliki karakter PERSONA_USER. Kamu berperilaku sesuai PERSONA_ASSISTANT dan menyesuaikan responmu sesuai PERSONA_USER.
28
+
29
+ PERSONA_ASSISTANT:
30
+ {assistant_persona}
31
+
32
+ PERSONA_USER:
33
+ {user_persona}
34
+ ```
35
+
36
+ ## Training procedure
37
+
38
+ ### LoRA config
39
+
40
+ The following lora config were used during training:
41
+ - alpha: 16
42
+ - r: 16
43
+ - droput: 0
44
+ - modules: "q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"
45
+
46
+ ### Training hyperparameters
47
+
48
+ The following hyperparameters were used during training:
49
+ - learning_rate: 2e-4
50
+ - optimizer: adamw_8bit
51
+
52
+ ### Training results
53
+
54
+ [TensorBoard](../../tensorboard)