---
license: other
datasets:
- ehartford/dolphin
- shahules786/orca-chat
- togethercomputer/RedPajama-Data-1T
- atom-in-the-universe/fanfics-10k-50k
---

- **At least Huggingface Transformers [4.31.0](https://pypi.org/project/transformers/4.31.0/) is required to load this model!**
- base model: [meta-llama/Llama-2-13b](https://huggingface.co/meta-llama/Llama-2-13b)
- License: [Llama 2 Community License Agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
- [wandb](https://wandb.ai/open-assistant/public-sft/runs/2jfazjt9)
- checkpoint: 3319 steps
- dtype: fp16

## Long context (RoPE Scaling)

This model was fine-tuned with a context size of 8192 tokens using linear scaling of RoPE embeddings. Support for this feature was added to the Huggingface transformers library only recently; HF Transformers >=4.31.0 is therefore required to load this model (`pip install transformers>=4.31.0`). A loading sketch is given at the end of this card.

## Orca-Chat/Dolphin Datasets

## Model Configuration

```
llama2_13b_orca_8k:
  rng_seed: 0xe1291f1a
  use_custom_sampler: true
  sort_by_length: false
  dtype: fp16
  log_dir: "llama2_log_13b_orca_8k"
  learning_rate: 1e-5
  model_name: /mnt/data/llama2/Llama-2-13b-hf/
  output_dir: llama2_13b_orca_8k
  deepspeed_config: configs/zero_config_pretrain.json
  weight_decay: 0.0
  max_length: 8192
  warmup_steps: 100
  use_flash_attention: true
  gradient_checkpointing: true
  gradient_accumulation_steps: 8
  per_device_train_batch_size: 2
  per_device_eval_batch_size: 1
  residual_dropout: 0.0
  eval_steps: 200
  save_steps: 1000  # (total steps: 3319)
  num_train_epochs: 1
  save_total_limit: 4
  superhot: true
  superhot_config:
    type: linear
    scale: 2
  datasets:
    # Dataset Composition:
    #   Train (sampled):
    #     orca-chat:  100.00% (188842)
    #     fanfics:    100.00% (47760)
    #     red_pajama:  25.00% (188262)
    #   Valid:
    #     orca-chat:  5000 (71.43%)
    #     fanfics:    1000 (14.29%)
    #     red_pajama: 1000 (14.29%)
    - orca-chat:
        max_val_set: 5000
    - fanfics:
        max_chunk_size: 65535
        max_val_set: 1000
    - red_pajama:
        fraction: 0.25
        max_val_set: 1000
        max_chunk_size: 65535
  peft_model: false
```

# License

- Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
- Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the [Acceptable Use Policy](https://ai.meta.com/llama/use-policy) for the Llama Materials.
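## Loading Sketch

A minimal sketch of loading the model with the linear RoPE scaling described above. The hub id below is an assumption (this card does not name the published repository); substitute the actual model id. Requires transformers >= 4.31.0, which added the `rope_scaling` option for Llama models.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenAssistant/llama2-13b-orca-8k-3319"  # assumed hub id, adjust as needed

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Linear RoPE scaling matching the training setup above
# (superhot_config: type=linear, scale=2): position indices are divided
# by the factor, stretching the 4096-token base context to 8192 tokens.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the checkpoint is fp16
    rope_scaling={"type": "linear", "factor": 2.0},
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```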
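For intuition, linear scaling simply compresses position indices by the scale factor, so the rotary angles the scaled model computes at position 8192 equal those the base model computes at position 4096. A small self-contained illustration (not the training code; `rope_angle` is a hypothetical helper written for this card):

```python
import math

def rope_angle(position: int, dim_pair: int, head_dim: int = 128,
               base: float = 10000.0, scale: float = 1.0) -> float:
    """Rotation angle for one rotary dimension pair at a given position.

    With linear scaling, the position index is divided by `scale`
    before the angle is computed.
    """
    inv_freq = 1.0 / (base ** (2 * dim_pair / head_dim))
    return (position / scale) * inv_freq

# With scale=2, position 8192 reproduces the angle the unscaled
# model computes at position 4096:
assert math.isclose(rope_angle(8192, dim_pair=3, scale=2.0),
                    rope_angle(4096, dim_pair=3, scale=1.0))
```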