Vipitis/santacoder-finetuned-the-stack-glsl


SantaCoder finetuned on the GLSL subset of The-Stack-dedup for 1000 steps with a batch size of 2 and the full sequence length of 2048. The adapted finetuning script can be found here.
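To get a feel for the training data, the GLSL subset can be inspected directly. The following is a minimal sketch, assuming the `datasets` library is installed and that the `--subset` value from the finetuning command corresponds to the `data_dir` argument of `load_dataset`. The helper name `stream_glsl_samples` is hypothetical, and access to the dataset may require accepting its terms on the Hub and authenticating first.

```python
from itertools import islice

# Values taken from the finetuning command in this card.
DATASET_NAME = "bigcode/the-stack-dedup"  # --dataset_name
SUBSET_DIR = "data/glsl"                  # --subset (assumed to map to data_dir)
TEXT_COLUMN = "content"                   # --data_column
SPLIT = "train"                           # --split


def stream_glsl_samples(limit=3):
    """Return the first `limit` GLSL source strings from the subset.

    Hypothetical helper for illustration; it is not part of the
    finetuning script.
    """
    # Imported lazily so this file can be loaded without `datasets` installed.
    from datasets import load_dataset

    # streaming=True iterates over the data without downloading the
    # whole subset up front.
    dataset = load_dataset(
        DATASET_NAME,
        data_dir=SUBSET_DIR,
        split=SPLIT,
        streaming=True,
    )
    return [row[TEXT_COLUMN] for row in islice(dataset, limit)]
```

Calling `stream_glsl_samples()` returns a short list of shader source strings, which is enough to check what the model saw during finetuning.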

Finetuning parameters

python3 train.py --model_path "bigcode/santacoder" \
--dataset_name "bigcode/the-stack-dedup" \
--subset "data/glsl" \
--data_column "content" \
--split "train" \
--seq_length 2048 \
--max_steps 1000 \
--batch_size 2 \
--gradient_accumulation_steps 4 \
--learning_rate 5e-5 \
--num_warmup_steps 100 \
--eval_freq 100 \
--save_freq 100 \
--log_freq 1 \
--output_dir "checkpoint_dir" \
--no_fp16
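The parameters above imply an effective batch larger than the per-step batch size of 2, because gradients are accumulated over 4 steps. The arithmetic below is a rough sketch of the resulting token budget. It assumes a single device (`num_devices` is an assumption, not stated in this card) and that every sequence is filled to the full 2048 tokens.

```python
# Values copied from the finetuning command.
seq_length = 2048
batch_size = 2
gradient_accumulation_steps = 4
max_steps = 1000
num_warmup_steps = 100
save_freq = 100

# Assumption: training ran on one device. With N devices, multiply by N.
num_devices = 1

# Sequences contributing to each optimizer update.
sequences_per_step = batch_size * gradient_accumulation_steps * num_devices

# Upper bound on tokens per update, if every sequence is full length.
tokens_per_step = sequences_per_step * seq_length

# Upper bound on tokens seen over the whole run.
total_tokens = tokens_per_step * max_steps

# Share of the run spent in learning-rate warmup.
warmup_fraction = num_warmup_steps / max_steps

# Number of checkpoints written to the output directory.
num_checkpoints = max_steps // save_freq

print(sequences_per_step)  # 8
print(tokens_per_step)     # 16384
print(total_tokens)        # 16384000
print(warmup_fraction)     # 0.1
print(num_checkpoints)     # 10
```

Under these assumptions the run covers at most about 16.4M tokens, which is small next to pretraining scale and fits the exploratory purpose of the model.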

The main purpose of this model is to explore whether finetuning improves performance on ShaderEval. The finetuned model reached 0.380 on 300 samples.
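For trying the model on a shader prompt, the following is a minimal sketch using the `transformers` library with greedy decoding, the decoding mode named in the evaluation results. The helper `complete_glsl` and the example prompt are hypothetical. Passing `trust_remote_code=True` is an assumption based on the base SantaCoder model shipping custom modeling code; it may be unnecessary depending on the checkpoint and library version.

```python
MODEL_ID = "Vipitis/santacoder-finetuned-the-stack-glsl"

# Hypothetical prompt: a comment plus a function header to complete.
EXAMPLE_PROMPT = (
    "// Returns the luminance of an RGB color.\n"
    "float luminance(vec3 color) {\n"
)


def complete_glsl(prompt, max_new_tokens=64):
    """Greedily complete a GLSL prompt (illustrative helper)."""
    # Imported lazily so this file can be loaded without `transformers`.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: the checkpoint may rely on custom modeling code.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding
    )
    # The decoded text contains the prompt followed by the completion.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete_glsl(EXAMPLE_PROMPT)` downloads the weights on first use and returns the prompt with the generated function body appended.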

The license is carried over from the base model; the finetuning dataset holds the same license.


Model size: 1.23B params (Safetensors)

Tensor type: F32, U8


Model tree for Vipitis/santacoder-finetuned-the-stack-glsl

Base model: bigcode/santacoder

Finetunes of the base model: 15, including this model

Dataset used to train Vipitis/santacoder-finetuned-the-stack-glsl: bigcode/the-stack-dedup


Collection including Vipitis/santacoder-finetuned-the-stack-glsl

models to evaluate: collecting models I want to evaluate on shadereval-task2 (https://github.com/bigcode-project/bigcode-evaluation-harness/pull/173) at fp16.

Evaluation results

Shadertoys-fine, 300 samples, greedy decoding (self-reported): 0.380
