---
library_name: transformers
tags:
  - PEFT
  - mistral
  - sft
  - TensorBoard
  - Safetensors
  - trl
  - generated_from_trainer
  - 4-bit precision
license: mit
datasets:
  - yahma/alpaca-cleaned
language:
  - en
pipeline_tag: question-answering
---

# Model Card for Zephyr-7b-QnA

This model is fine-tuned for document question answering. It starts from the [zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) base checkpoint and is trained on the [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset.
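Below is a minimal inference sketch, not a confirmed usage recipe: the adapter repo id (`Feluda/Zephyr-7b-QnA`) is an illustrative assumption, as is loading the weights as a PEFT adapter on top of the GPTQ base above (which additionally requires `optimum` and `auto-gptq`). Swap in the actual repo ids. The prompt follows the Alpaca format used by yahma/alpaca-cleaned.

```python
# Inference sketch. The repo ids below are assumptions for illustration,
# not confirmed by this card; replace them with the real ones.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "TheBloke/zephyr-7B-beta-GPTQ"  # assumed base checkpoint
ADAPTER_ID = "Feluda/Zephyr-7b-QnA"       # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(BASE_ID, device_map="auto")
model = PeftModel.from_pretrained(base, ADAPTER_ID)  # attach fine-tuned PEFT weights

# Alpaca-style prompt, matching the format of yahma/alpaca-cleaned.
prompt = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\nAnswer the question using only the given document.\n\n"
    "### Input:\nDocument: <document text>\nQuestion: <your question>\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```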

## Model Details

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- gradient_accumulation_steps: 1
- warmup_steps: 5
- max_steps: 20
- learning_rate: 2e-4
- fp16: not torch.cuda.is_bf16_supported()
- bf16: torch.cuda.is_bf16_supported()
- logging_steps: 1
- optim: adamw_8bit
- weight_decay: 0.01
- lr_scheduler_type: linear
- seed: 3407
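For context, here is a sketch of how the values above map onto a `transformers.TrainingArguments` object (as consumed by, e.g., trl's `SFTTrainer`). `output_dir` and `per_device_train_batch_size` are assumptions, since the card does not list them; all other values mirror the list.

```python
import torch
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",                     # assumption: not stated in this card
    per_device_train_batch_size=2,            # assumption: not stated in this card
    gradient_accumulation_steps=1,
    warmup_steps=5,
    max_steps=20,                             # note: a very short run (20 optimizer steps)
    learning_rate=2e-4,
    fp16=not torch.cuda.is_bf16_supported(),  # fp16 fallback on GPUs without bf16
    bf16=torch.cuda.is_bf16_supported(),      # prefer bf16 on Ampere and newer GPUs
    logging_steps=1,
    optim="adamw_8bit",                       # 8-bit AdamW (bitsandbytes)
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
)
```

The fp16/bf16 pair simply selects bf16 on GPUs that support it and falls back to fp16 elsewhere, so the same script runs on both older and newer hardware.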

### Framework versions

- PEFT 0.7.1
- Transformers 4.36.0
- PyTorch 2.0.0
- Datasets 2.16.1
- Tokenizers 0.15.0