kirankunapuli committed on
Commit 989c563
1 Parent(s): d7957ee

Update README.md

Files changed (1)
  1. README.md +72 -4
README.md CHANGED
@@ -1,6 +1,7 @@
---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
@@ -9,14 +10,81 @@ tags:
- llama
- trl
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
---

- # Uploaded model

- - **Developed by:** kirankunapuli
- **License:** apache-2.0
- - **Finetuned from model :** TinyLlama/TinyLlama-1.1B-Chat-v1.0

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

---
language:
- en
+ - hi
license: apache-2.0
tags:
- text-generation-inference

- llama
- trl
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+ datasets:
+ - yahma/alpaca-cleaned
+ - ravithejads/samvaad-hi-filtered
+ - HydraIndicLM/hindi_alpaca_dolly_67k
---

+ # TinyLlama-1.1B-Hinglish-LORA-v1.0 model

+ - **Developed by:** [Kiran Kunapuli](https://www.linkedin.com/in/kirankunapuli/)
- **License:** apache-2.0
+ - **Finetuned from model:** TinyLlama/TinyLlama-1.1B-Chat-v1.0
+ - **Model config:**
+ ```python
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r = 64,
+     target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
+                       "gate_proj", "up_proj", "down_proj",],
+     lora_alpha = 128,
+     lora_dropout = 0,
+     bias = "none",
+     use_gradient_checkpointing = True,
+     random_state = 42,
+     use_rslora = True,
+     loftq_config = None,
+ )
+ ```
+ - **Training parameters:**
+ ```python
+ trainer = SFTTrainer(
+     model = model,
+     tokenizer = tokenizer,
+     train_dataset = dataset,
+     dataset_text_field = "text",
+     max_seq_length = max_seq_length,
+     dataset_num_proc = 2,
+     packing = True,
+     args = TrainingArguments(
+         per_device_train_batch_size = 12,
+         gradient_accumulation_steps = 16,
+         warmup_ratio = 0.1,
+         num_train_epochs = 1,
+         learning_rate = 2e-4,
+         fp16 = not torch.cuda.is_bf16_supported(),
+         bf16 = torch.cuda.is_bf16_supported(),
+         logging_steps = 1,
+         optim = "paged_adamw_32bit",
+         weight_decay = 0.001,
+         lr_scheduler_type = "cosine",
+         seed = 42,
+         output_dir = "outputs",
+         report_to = "wandb",
+     ),
+ )
+ ```
+ - **Training details:**
+ ```
+ ==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
+    \\   /|    Num examples = 15,464 | Num Epochs = 1
+ O^O/ \_/ \    Batch size per device = 12 | Gradient Accumulation steps = 16
+ \        /    Total batch size = 192 | Total steps = 80
+  "-____-"     Number of trainable parameters = 50,462,720
+
+ GPU = NVIDIA GeForce RTX 3090. Max memory = 24.0 GB.
+ Total time taken for 1 epoch - 2h:35m:28s
+ 9443.5288 seconds used for training.
+ 157.39 minutes used for training.
+ Peak reserved memory = 17.641 GB.
+ Peak reserved memory for training = 15.344 GB.
+ Peak reserved memory % of max memory = 73.504 %.
+ Peak reserved memory for training % of max memory = 63.933 %.
+ ```
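
The totals in the training log above follow from the configuration: the total batch size is per_device_train_batch_size × gradient_accumulation_steps × num_gpus, and with 15,464 packed examples one epoch comes to roughly 80 optimizer steps. Likewise, `use_rslora = True` switches the LoRA scaling factor from `lora_alpha / r` to `lora_alpha / sqrt(r)`. A quick sanity check of these numbers (illustrative only, not part of the original card):

```python
import math

# Values taken from the config blocks above.
per_device_batch = 12
grad_accum = 16
num_gpus = 1
num_examples = 15_464  # packed examples reported by Unsloth

total_batch = per_device_batch * grad_accum * num_gpus
steps_per_epoch = num_examples // total_batch  # floor division; exact count depends on dataloader rounding

print(total_batch)      # 192 -> matches "Total batch size = 192"
print(steps_per_epoch)  # 80  -> matches "Total steps = 80"

# With use_rslora=True, PEFT scales LoRA updates by lora_alpha / sqrt(r)
# instead of the standard lora_alpha / r.
lora_alpha, r = 128, 64
print(lora_alpha / r)             # 2.0  (standard LoRA scaling)
print(lora_alpha / math.sqrt(r))  # 16.0 (rank-stabilized scaling)
```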

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
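
To try the adapter, the sketch below loads the TinyLlama base model and attaches the LoRA weights with PEFT. The adapter repo id and the Alpaca-style prompt are assumptions (the card does not state an inference recipe), so adjust both to match the actual repository and prompt template.

```python
# Minimal inference sketch (illustrative; not taken from the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "kirankunapuli/TinyLlama-1.1B-Hinglish-LORA-v1.0"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter

# Alpaca-style prompt (assumed format); "Mujhe ek chhoti si kahani sunao" = "Tell me a short story".
prompt = "### Instruction:\nMujhe ek chhoti si kahani sunao.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```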

+ **[NOTE]** TinyLlama's internal maximum sequence length is 2048. We use RoPE Scaling to extend it to 4096 with Unsloth!
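
The loading call itself is not shown in the card; a minimal sketch of how the extended 4096-token context would typically be requested with Unsloth, assuming 4-bit loading of the same base model:

```python
# Illustrative: requesting a context longer than TinyLlama's native 2048 tokens
# triggers Unsloth's RoPE scaling at load time.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    max_seq_length = 4096,  # native limit is 2048; extended via RoPE scaling
    dtype = None,           # auto-detect (bfloat16 on Ampere+, else float16)
    load_in_4bit = True,    # assumption: QLoRA-style 4-bit loading
)
```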
+
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
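
The card shows the LoRA and trainer configuration but not the launch or save step. Under the setup above, the run would typically be completed along these lines (output directory and repo id are assumptions):

```python
# Illustrative completion of the run configured above (not shown in the card):
# launch training, then persist the LoRA adapter weights.
trainer_stats = trainer.train()

model.save_pretrained("TinyLlama-1.1B-Hinglish-LORA-v1.0")      # saves only the adapter
tokenizer.save_pretrained("TinyLlama-1.1B-Hinglish-LORA-v1.0")
# model.push_to_hub("kirankunapuli/TinyLlama-1.1B-Hinglish-LORA-v1.0")  # optional; repo id assumed
```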