---
base_model: habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1
datasets:
- databricks/databricks-dolly-15k
inference: false
language:
- en
license: apache-2.0
model_creator: habanoz
model_name: TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1
pipeline_tag: text-generation
quantized_by: afrideva
tags:
- gguf
- ggml
- quantized
- q2_k
- q3_k_m
- q4_k_m
- q5_k_m
- q6_k
- q8_0
---
# habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1-GGUF

Quantized GGUF model files for [TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1](https://huggingface.co/habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1) from [habanoz](https://huggingface.co/habanoz).


| Name | Quant method | Size |
| ---- | ---- | ---- |
| [tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.fp16.gguf](https://huggingface.co/afrideva/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1-GGUF/resolve/main/tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.fp16.gguf) | fp16 | 2.20 GB  |
| [tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q2_k.gguf](https://huggingface.co/afrideva/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1-GGUF/resolve/main/tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q2_k.gguf) | q2_k | 483.12 MB  |
| [tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q3_k_m.gguf](https://huggingface.co/afrideva/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1-GGUF/resolve/main/tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q3_k_m.gguf) | q3_k_m | 550.82 MB  |
| [tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q4_k_m.gguf](https://huggingface.co/afrideva/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1-GGUF/resolve/main/tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q4_k_m.gguf) | q4_k_m | 668.79 MB  |
| [tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q5_k_m.gguf](https://huggingface.co/afrideva/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1-GGUF/resolve/main/tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q5_k_m.gguf) | q5_k_m | 783.02 MB  |
| [tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q6_k.gguf](https://huggingface.co/afrideva/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1-GGUF/resolve/main/tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q6_k.gguf) | q6_k | 904.39 MB  |
| [tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q8_0.gguf](https://huggingface.co/afrideva/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1-GGUF/resolve/main/tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q8_0.gguf) | q8_0 | 1.17 GB  |
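To run one of these files locally, a minimal sketch using `llama-cpp-python` (an assumption; any GGUF-compatible runtime such as llama.cpp works) might look like this. The context size, sampling parameters, and prompt template are illustrative assumptions, not taken from the model card:

```python
# Minimal sketch: download and run the q4_k_m file with llama-cpp-python.
# Assumes `pip install llama-cpp-python huggingface-hub`.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch the quantized file from this repo (cached locally after first run).
model_path = hf_hub_download(
    repo_id="afrideva/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1-GGUF",
    filename="tinyllama-1.1b-2t-lr-2e-4-3ep-dolly-15k-instruct-v1.q4_k_m.gguf",
)

llm = Llama(model_path=model_path, n_ctx=2048)  # context window is an assumption

# Prompt template below is a guess at an instruction format;
# the model card does not specify one.
output = llm(
    "### Instruction:\nWhat is the capital of France?\n\n### Response:\n",
    max_tokens=128,
)
print(output["choices"][0]["text"])
```

Smaller quants (q2_k, q3_k_m) trade answer quality for memory; q8_0 and fp16 are closest to the original weights.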



## Original Model Card:
TinyLlama/TinyLlama-1.1B-intermediate-step-955k-token-2T fine-tuned on the databricks/databricks-dolly-15k dataset.

Training took 1 hour on an `ml.g5.xlarge` instance.


```python
hyperparameters = {
  "num_train_epochs": 3,                    # number of training epochs
  "per_device_train_batch_size": 6,         # batch size for training
  "gradient_accumulation_steps": 2,         # number of update steps to accumulate
  "gradient_checkpointing": True,           # save memory at the cost of a slower backward pass
  "bf16": True,                             # use bfloat16 precision
  "tf32": True,                             # use tf32 precision
  "learning_rate": 2e-4,                    # learning rate
  "max_grad_norm": 0.3,                     # maximum norm for gradient clipping
  "warmup_ratio": 0.03,                     # warmup ratio
  "lr_scheduler_type": "constant",          # learning rate scheduler
  "save_strategy": "epoch",                 # save a checkpoint every epoch
  "logging_steps": 10,                      # log every 10 steps
  "merge_adapters": True,                   # whether to merge LoRA adapters into the model (needs more memory)
  "use_flash_attn": True,                   # whether to use Flash Attention
}

```
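The `ml.g5.xlarge` instance type and keys like `merge_adapters` suggest the fine-tuning ran as a SageMaker training job, where such a dict is passed straight to the estimator. A hedged sketch of that pattern follows; the entry-point script, source directory, and framework versions are assumptions for illustration, not from the model card:

```python
# Hypothetical launch sketch for a SageMaker Hugging Face training job.
# Only the hyperparameters dict above and the ml.g5.xlarge instance type
# come from the model card; everything else is assumed.
import sagemaker
from sagemaker.huggingface import HuggingFace

huggingface_estimator = HuggingFace(
    entry_point="run_clm.py",        # assumed training script name
    source_dir="./scripts",          # assumed location of the script
    instance_type="ml.g5.xlarge",    # instance named in the model card
    instance_count=1,
    role=sagemaker.get_execution_role(),
    transformers_version="4.28",     # assumed framework versions
    pytorch_version="2.0",
    py_version="py310",
    hyperparameters=hyperparameters, # the dict shown above
)

huggingface_estimator.fit()          # starts the training job
```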