# togetherchat-dev-7b-v2
This model is a fine-tuned version of togethercomputer/LLaMA-2-7B-32K, trained on 25,000 entries for 3 epochs.
## Model description

The model can be used for text-to-code generation and as a base for further fine-tuning. A Colab notebook example (runnable on a free T4 GPU) is coming soon!
Datasets used:
- evol-codealpaca-80k - 10000 entries
- codealpaca-20k - 10000 entries
- open-platypus - 5000 entries
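For the text-to-code use case above, a minimal inference sketch follows. It assumes the standard `transformers` API; the Alpaca-style prompt template is an assumption based on the instruction datasets listed, not a format documented by the authors.

```python
# Minimal inference sketch for flytech/togetherchat-dev-7b-v2.
# The prompt template below is an ASSUMPTION (Alpaca-style, matching the
# codealpaca-derived training data); check the dataset cards for specifics.

MODEL_ID = "flytech/togetherchat-dev-7b-v2"


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in a simple instruction/response template."""
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    # Heavy dependencies are imported lazily so build_prompt stays usable
    # without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Write a Python function that reverses a string."))
```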
## Intended uses & limitations

Please remember that the model may (and will) produce inaccurate information; fine-tune it for your specific task.
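One common way to do that task-specific fine-tuning is with LoRA adapters via the `peft` library. This is a hypothetical sketch, not the authors' recipe; the prompt template and LoRA settings are illustrative assumptions.

```python
# Hypothetical LoRA fine-tuning sketch (NOT the authors' recipe).
# Requires: transformers, peft.


def format_example(instruction: str, response: str) -> str:
    """Join an instruction/response pair into one training string.
    The template is an assumption, mirroring Alpaca-style datasets."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"


def make_lora_model(base_id: str = "flytech/togetherchat-dev-7b-v2"):
    # Heavy dependencies are imported lazily so format_example stays usable
    # without transformers/peft installed.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained(base_id)
    lora = LoraConfig(
        r=16,                                 # low-rank update size (illustrative)
        lora_alpha=32,
        target_modules=["q_proj", "v_proj"],  # LLaMA attention projections
        task_type="CAUSAL_LM",
    )
    return get_peft_model(model, lora)
```

The wrapped model can then be passed to a standard `transformers` `Trainer` over your formatted dataset.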
## Training and evaluation data
See 'Metrics'
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 10
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 40
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
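The hyperparameters above are internally consistent, and the implied step counts can be checked with a little arithmetic. The warmup-step formula (ratio times total optimizer steps) is the usual convention and an assumption here:

```python
# Sanity-check the effective batch size and step counts implied by the
# hyperparameters above. warmup_steps = ratio * total steps is an assumption
# (the standard convention for lr_scheduler_warmup_ratio).
TRAIN_BATCH_SIZE = 10
GRAD_ACCUM_STEPS = 4
NUM_ENTRIES = 25_000
NUM_EPOCHS = 3
WARMUP_RATIO = 0.1

effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS  # matches total_train_batch_size: 40
steps_per_epoch = NUM_ENTRIES // effective_batch       # 625 optimizer steps per epoch
total_steps = steps_per_epoch * NUM_EPOCHS             # 1875 steps over 3 epochs
warmup_steps = int(total_steps * WARMUP_RATIO)         # 187 warmup steps

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
```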
## Training results

## Framework versions
- Transformers 4.33.1
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3
## Model tree for flytech/togetherchat-dev-7b-v2

Base model: togethercomputer/LLaMA-2-7B-32K