|
--- |
|
license: apache-2.0 |
|
base_model: distilroberta-base |
|
tags: |
|
- generated_from_trainer |
|
model-index:
- name: distilroberta-base-fineweb-edu-llama3-annotations-2048-vN
  results: []
|
--- |
|
|
|
|
|
|
[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pszemraj/eduscore-regression/runs/8e2uvp5t) |
|
# distilroberta-base-fineweb-edu-llama3-annotations-2048-vN |
|
|
|
This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base), trained as a regression model for scoring the educational quality of text. Based on the model name, the training data appears to be FineWeb-Edu annotations generated with Llama 3; see "Training and evaluation data" below.
|
It achieves the following results on the evaluation set:

- Loss: 0.2197

- MSE: 0.2197

Loss and MSE coincide because the model is trained with a mean-squared-error objective (single-label regression).
|
|
|
## Model description |
|
|
|
A `distilroberta-base` encoder fine-tuned with a single-output regression head that assigns a continuous educational-quality score to input text (the linked W&B project is `eduscore-regression`). This card was generated automatically; details not recorded by the Trainer are inferred and marked as such.
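
A minimal inference sketch, assuming a single-label regression head (`num_labels=1`, consistent with the MSE metric above); the repo id is an assumption inferred from the model name and the W&B namespace.

```python
# Minimal inference sketch. Assumptions: the model exposes a single-output
# regression head (num_labels=1), and the repo id below (inferred from the
# model name and the W&B project owner) is correct.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "pszemraj/distilroberta-base-fineweb-edu-llama3-annotations-2048-vN"  # assumed

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

text = "Photosynthesis converts light energy into chemical energy in plants."
inputs = tokenizer(text, truncation=True, return_tensors="pt")
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()  # continuous quality score
print(f"predicted educational-quality score: {score:.3f}")
```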
|
|
|
## Intended uses & limitations |
|
|
|
Intended for scoring the educational quality of web text, e.g. as a lightweight filter for pretraining-data curation. Limitations: the output is a continuous regression score, not a calibrated probability; quality judgments inherit whatever biases the (LLM-generated) annotations carry; and performance on text very different from web documents is untested. If downstream tooling expects the integer 0-5 scale used by FineWeb-Edu-style annotations (an assumption based on the model name), the raw prediction can be clamped and rounded, as sketched below.
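
A small post-processing helper, under the 0-5 scale assumption:

```python
# Optional post-processing: map the continuous prediction onto an integer
# 0-5 scale (the FineWeb-Edu annotation range -- an assumption inferred from
# the model name rather than stated on this card).
def to_int_score(score: float) -> int:
    return max(0, min(5, round(score)))

print(to_int_score(2.7))  # -> 3
```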
|
|
|
## Training and evaluation data |
|
|
|
Not recorded by the Trainer. Based on the model name, training likely used educational-quality annotations of FineWeb-Edu samples generated with Llama 3, with inputs truncated to 2048 tokens; treat this as an inference from the name rather than confirmed provenance.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):
|
- learning_rate: 1e-05 |
|
- train_batch_size: 16 |
|
- eval_batch_size: 16 |
|
- seed: 90085 |
|
- gradient_accumulation_steps: 8 |
|
- total_train_batch_size: 128 |
|
- optimizer: Adam with betas=(0.9, 0.98) and epsilon=1e-09
|
- lr_scheduler_type: linear |
|
- lr_scheduler_warmup_ratio: 0.05 |
|
- num_epochs: 1.0 |
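
A hedged reconstruction of the corresponding `TrainingArguments`; the output directory and the 100-step evaluation cadence are inferred (the latter from the results table below), and the surrounding dataset/model setup is omitted.

```python
# Hedged reconstruction of the TrainingArguments implied by the list above.
# output_dir and the 100-step eval cadence are inferred; everything else
# mirrors the reported hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilroberta-base-fineweb-edu-llama3-annotations-2048-vN",
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=8,   # effective train batch size: 16 * 8 = 128
    seed=90085,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-9,
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    num_train_epochs=1.0,
    eval_strategy="steps",           # "evaluation_strategy" before transformers 4.41
    eval_steps=100,
)
```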
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | MSE |
|
|:-------------:|:------:|:----:|:---------------:|:------:| |
|
| 0.5276 | 0.0288 | 100 | 0.5012 | 0.5012 | |
|
| 0.3307 | 0.0576 | 200 | 0.3467 | 0.3467 | |
|
| 0.2994 | 0.0865 | 300 | 0.2948 | 0.2948 | |
|
| 0.2813 | 0.1153 | 400 | 0.2799 | 0.2799 | |
|
| 0.2707 | 0.1441 | 500 | 0.3017 | 0.3017 | |
|
| 0.2506 | 0.1729 | 600 | 0.2699 | 0.2699 | |
|
| 0.2584 | 0.2018 | 700 | 0.2633 | 0.2633 | |
|
| 0.2603 | 0.2306 | 800 | 0.2434 | 0.2434 | |
|
| 0.2973 | 0.2594 | 900 | 0.2394 | 0.2394 | |
|
| 0.2541 | 0.2882 | 1000 | 0.2356 | 0.2356 | |
|
| 0.2837 | 0.3171 | 1100 | 0.2437 | 0.2437 | |
|
| 0.242 | 0.3459 | 1200 | 0.2379 | 0.2379 | |
|
| 0.2379 | 0.3747 | 1300 | 0.2270 | 0.2270 | |
|
| 0.23 | 0.4035 | 1400 | 0.2357 | 0.2357 | |
|
| 0.2345 | 0.4324 | 1500 | 0.2417 | 0.2417 | |
|
| 0.2574 | 0.4612 | 1600 | 0.2556 | 0.2556 | |
|
| 0.264 | 0.4900 | 1700 | 0.2452 | 0.2452 | |
|
| 0.2596 | 0.5188 | 1800 | 0.2215 | 0.2215 | |
|
| 0.244 | 0.5477 | 1900 | 0.2269 | 0.2269 | |
|
| 0.2225 | 0.5765 | 2000 | 0.2342 | 0.2342 | |
|
| 0.2475 | 0.6053 | 2100 | 0.2403 | 0.2403 | |
|
| 0.253 | 0.6341 | 2200 | 0.2326 | 0.2326 | |
|
| 0.2435 | 0.6630 | 2300 | 0.2161 | 0.2161 | |
|
| 0.2865 | 0.6918 | 2400 | 0.2265 | 0.2265 | |
|
| 0.2351 | 0.7206 | 2500 | 0.2343 | 0.2343 | |
|
| 0.2582 | 0.7494 | 2600 | 0.2342 | 0.2342 | |
|
| 0.2167 | 0.7783 | 2700 | 0.2337 | 0.2337 | |
|
| 0.2495 | 0.8071 | 2800 | 0.2273 | 0.2273 | |
|
| 0.2364 | 0.8359 | 2900 | 0.2298 | 0.2298 | |
|
| 0.2236 | 0.8647 | 3000 | 0.2170 | 0.2170 | |
|
| 0.231 | 0.8936 | 3100 | 0.2234 | 0.2234 | |
|
| 0.2474 | 0.9224 | 3200 | 0.2227 | 0.2227 | |
|
| 0.2333 | 0.9512 | 3300 | 0.2241 | 0.2241 | |
|
| 0.2265 | 0.9800 | 3400 | 0.2197 | 0.2197 | |
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.42.3 |
|
- Pytorch 2.3.1+cu121 |
|
- Datasets 2.20.0 |
|
- Tokenizers 0.19.1 |
|
|