ML1 Previews
This repository contains the preview checkpoints for the ML1 model (see the Reddit Post).
Watch training live here: https://api.wandb.ai/links/nickmitchko/t5d47kzr
Checkpoints
Model | Training progress (% of 1 epoch) | Link |
---|---|---|
ML1-34b | 15% | Directory |
ML1-34b | 50% | ~ |
ML1-34b | 100% | ~ |
ML1-mistral-7b | 50% | ~ |
ML1-mistral-7b | 100% | ~ |
ML1-70b | 15% | ~ |
ML1-70b | 50% | ~ |
ML1-70b | 100% | ~ |
Model Description
The goal is to develop a series of models that achieve superior performance given high-quality data. To that end, I plan to experiment with the excellent dataset produced by /u/docsoc1. Huge shout-out to them! If you'd like to view that dataset, the link is below.
Dataset: emrgnt-cmplxty/sciphi-textbooks-are-all-you-need
Prompt Format
The model is trained using the Alpaca prompt format. Please see here or below for that format:
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Response:
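For clarity, here is a minimal sketch in Python that fills this template; the `ALPACA_TEMPLATE` constant and `build_prompt` helper are illustrative names, not part of this repository.

```python
# Minimal sketch: build a prompt in the Alpaca format described above.
# ALPACA_TEMPLATE and build_prompt are illustrative names, not repo code.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Return a prompt string in the Alpaca format expected by the model."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

print(build_prompt("Summarize the key ideas behind LoRA fine-tuning."))
```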
Architecture
nmitchko/ML1-34b-previews is a repository of LoRA checkpoints fine-tuned on synthesized textbook-style data in the style of Phi 1/1.5. It is based on codellama-34b-hf, a 34-billion-parameter model.
The primary goal of this model is to test various fine-tuning methods around high-quality data. It was trained with LoRA, specifically multi-GPU QLoRA, to reduce the memory footprint. See Training Parameters for more info. This LoRA supports 4-bit and 8-bit modes.
Requirements
- bitsandbytes>=0.41.0
- peft@main
- transformers@main
Steps to load this model (a minimal sketch follows the list):
- Load base model (codellama-34b-hf) using transformers
- Download a checkpoint folder (checkpoint-1)
- Apply LoRA using peft
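A minimal sketch of those steps, assuming the Hugging Face id codellama/CodeLlama-34b-hf for the base model and a locally downloaded checkpoint-1 folder from this repository:

```python
# Sketch: load the 4-bit base model and apply a downloaded LoRA checkpoint.
# The model id and local checkpoint path are assumptions; adjust to your setup.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "codellama/CodeLlama-34b-hf"   # assumed Hugging Face id of the base model
adapter_path = "checkpoint-1"            # assumed local checkpoint folder

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # 8-bit also supported
    device_map="auto",
)

# Apply the LoRA adapter weights on top of the quantized base model.
model = PeftModel.from_pretrained(base_model, adapter_path)
model.eval()
```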
Training Parameters
The model is currently training on emrgnt-cmplxty/sciphi-textbooks-are-all-you-need, which contains synthesized textbook-style data.
Item | Amount | Units |
---|---|---|
LoRA Rank | 64 | ~ |
LoRA Alpha | 16 | ~ |
Learning Rate | 1e-4 | ~ |
Dropout | 5 | % |
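As a rough sketch, the LoRA values in the table map onto a peft LoraConfig like the one below; target_modules is an assumption (typical Llama attention projections), and the 1e-4 learning rate belongs to the optimizer/trainer rather than the LoRA config.

```python
# Sketch of a peft LoraConfig matching the table above.
# target_modules is an assumption, not confirmed by this repository.
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,               # LoRA rank
    lora_alpha=16,      # LoRA alpha
    lora_dropout=0.05,  # 5% dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    bias="none",
    task_type="CAUSAL_LM",
)
# The 1e-4 learning rate is passed to the optimizer/Trainer, not to LoraConfig.
```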
Training procedure
The following bitsandbytes quantization config was used during training:
- quant_method: QuantizationMethod.BITS_AND_BYTES
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
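For reference, the settings above correspond to a transformers BitsAndBytesConfig along these lines (a sketch; the llm_int8_* values shown are the library defaults):

```python
# Sketch: BitsAndBytesConfig equivalent to the training-time settings above.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_8bit=False,
    load_in_4bit=True,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
```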
Framework versions
- PEFT 0.6.0.dev0