togetherchat-dev-7b-v2

This model is a fine-tuned version of togethercomputer/LLaMA-2-7B-32K on 25000 entries for 3 epochs.

Model description

The model can be used for text-to-code generation and as a starting point for further fine-tuning. A Colab notebook example (running on a free T4 GPU) is coming soon.
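As a rough illustration of text-to-code generation, the sketch below loads a causal language model with the standard Transformers API and generates a completion for a plain-text instruction. It is not the author's notebook. The `generate_code` helper is hypothetical, `model_id` is a placeholder you must supply yourself (the card gives only the model name, not its full Hub path), and passing the raw instruction without a prompt template is an assumption, since the card does not document a prompt format.

```python
def generate_code(instruction, model_id, max_new_tokens=256):
    """Generate text for an instruction with a causal LM (hypothetical helper).

    model_id is a placeholder: pass the full Hub path or a local directory
    of the checkpoint. The card does not document a prompt template, so the
    instruction is passed to the model unchanged (an assumption).
    """
    # Imported lazily so that defining the helper does not require a GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Half precision keeps a 7B model at roughly 13-14 GB of weights on GPU.
    dtype = torch.float16 if device == "cuda" else torch.float32

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
    model.to(device)
    model.eval()

    inputs = tokenizer(instruction, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding for reproducible output
        )

    # Drop the prompt tokens and decode only the newly generated part.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

A call would look like `generate_code("Write a Python function that reverses a string.", model_id="<path-to-this-checkpoint>")`. Expect the output quality to depend heavily on how the instruction is phrased.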

Datasets used:

  • evol-codealpaca-80k - 10000 entries
  • codealpaca-20k - 10000 entries
  • open-platypus - 5000 entries

Intended uses & limitations

Please remember that the model may (and will) produce inaccurate information; you need to fine-tune it for your specific task.

Training and evaluation data

See 'Metrics'

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 10
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 40
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
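The values above are related: the total train batch size is the per-device batch size multiplied by the gradient accumulation steps. The sketch below reproduces that arithmetic and estimates the number of optimizer steps and warmup steps for a linear schedule. Two inputs are assumptions not stated in the card: training on a single device, and using all 25000 entries for training (the card mentions an eval batch size, so some entries may have been held out). The `linear_schedule_lr` helper is illustrative and only mirrors the usual "linear warmup, then linear decay to zero" shape.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 10
gradient_accumulation_steps = 4
num_epochs = 3
learning_rate = 1e-4
warmup_ratio = 0.1

# Assumptions (not stated in the card).
num_devices = 1
num_train_entries = 25_000

# Effective batch size per optimizer step; matches total_train_batch_size: 40.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

steps_per_epoch = math.ceil(num_train_entries / total_train_batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_schedule_lr(step):
    """Learning rate at a given optimizer step (illustrative helper)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return learning_rate * step / warmup_steps
    # Linear decay from the peak learning rate down to 0.
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / (total_steps - warmup_steps)


print(total_train_batch_size, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the run takes 625 optimizer steps per epoch and 1875 in total, with the learning rate peaking at 0.0001 after 188 warmup steps.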

Training results

Framework versions

  • Transformers 4.33.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3