---
language:
- en
license: apache-2.0
datasets:
- databricks/databricks-dolly-15k
pipeline_tag: text-generation
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-955k-token-2T
model-index:
- name: TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 30.55
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 53.7
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 26.07
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 35.85
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 58.09
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 0.0
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1
      name: Open LLM Leaderboard
---

This model is TinyLlama/TinyLlama-1.1B-intermediate-step-955k-token-2T fine-tuned on the databricks/databricks-dolly-15k dataset.
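
A minimal loading sketch with the `transformers` library is shown below. The repository id is taken from the leaderboard links in this card; the helper name `generate` and the default of 128 new tokens are illustrative assumptions. This card does not document a prompt template, so the prompt is passed through unchanged.

```python
# Repository id as it appears in the leaderboard links of this card.
MODEL_ID = "habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return the decoded output for `prompt`.

    Hypothetical helper for illustration; the prompt format used during
    fine-tuning is not documented here, so no template is applied.
    """
    # Imported lazily so that defining the helper does not require
    # transformers or torch to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The decoded text includes the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Explain what a learning rate is.")` downloads the weights on first use, so expect the first call to be slow.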

Training took 1 hour on an `ml.g5.xlarge` instance, using the following hyperparameters:

```python
hyperparameters = {
  'num_train_epochs': 3,                            # number of training epochs
  'per_device_train_batch_size': 6,                 # batch size per device during training
  'gradient_accumulation_steps': 2,                 # number of steps to accumulate gradients before each update
  'gradient_checkpointing': True,                   # saves memory at the cost of a slower backward pass
  'bf16': True,                                     # use bfloat16 precision
  'tf32': True,                                     # use tf32 precision
  'learning_rate': 2e-4,                            # learning rate
  'max_grad_norm': 0.3,                             # maximum gradient norm (for gradient clipping)
  'warmup_ratio': 0.03,                             # warmup ratio
  'lr_scheduler_type': 'constant',                  # learning rate scheduler
  'save_strategy': 'epoch',                         # save a checkpoint at the end of each epoch
  'logging_steps': 10,                              # log every 10 steps
  'merge_adapters': True,                           # whether to merge LoRA adapters into the model (needs more memory)
  'use_flash_attn': True,                           # whether to use Flash Attention
}
```
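
The batch-size settings above combine into an effective batch size per optimizer update. The sketch below works through that arithmetic. It assumes a single GPU (an `ml.g5.xlarge` instance has one), and the example count of 15,000 is only a round illustrative figure based on the dataset name; it also assumes one example per sequence with no packing. The helper `optimizer_steps` is hypothetical, and its result is a rough estimate rather than the step count of the actual run.

```python
import math

# Values copied from the hyperparameters above.
PER_DEVICE_TRAIN_BATCH_SIZE = 6
GRADIENT_ACCUMULATION_STEPS = 2
NUM_TRAIN_EPOCHS = 3

# Assumption: training ran on a single GPU.
NUM_DEVICES = 1

# Number of examples that contribute to each optimizer update.
effective_batch_size = (
    PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES
)


def optimizer_steps(num_examples: int) -> int:
    """Estimate the total optimizer updates over all epochs (hypothetical helper)."""
    steps_per_epoch = math.ceil(num_examples / effective_batch_size)
    return steps_per_epoch * NUM_TRAIN_EPOCHS


print(effective_batch_size)     # 6 * 2 * 1 = 12
print(optimizer_steps(15_000))  # ceil(15000 / 12) * 3 = 3750
```
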
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_habanoz__TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1).

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |34.04|
|AI2 Reasoning Challenge (25-Shot)|30.55|
|HellaSwag (10-Shot)              |53.70|
|MMLU (5-Shot)                    |26.07|
|TruthfulQA (0-shot)              |35.85|
|Winogrande (5-shot)              |58.09|
|GSM8k (5-shot)                   | 0.00|
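
The `Avg.` row is consistent with an unweighted mean of the six benchmark scores. A short check, using only the values from the table above:

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 30.55,
    "HellaSwag (10-Shot)": 53.70,
    "MMLU (5-Shot)": 26.07,
    "TruthfulQA (0-shot)": 35.85,
    "Winogrande (5-shot)": 58.09,
    "GSM8k (5-shot)": 0.00,
}

# Unweighted mean over all six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 34.04
```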