This folder contains the training code for a text-generation model that uses the Transformer architecture as its backbone. I trained a 330-million-parameter model from scratch, starting from random weight initialization instead of taking any existing trained weights. The training data was collected from FineWeb, FineWeb-Edu, and some other high-quality datasets (research papers, Wikipedia). I also evaluated the model at specific training steps.

The evaluation details are shared below.

Evaluation Results (470K Steps)

Step	Benchmark	Score
470K	HellaSwag	0.4057
470K	PIQA	0.6692
470K	OpenBookQA	0.3300
470K	WinoGrande	0.6717
470K	Social IQa	0.1950
470K	CommonsenseQA	0.2023
470K	ARC Easy	0.5211
470K	ARC Challenge	0.2876
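
For a quick single-number summary of the table above, here is a minimal sketch that ranks the benchmarks and computes an unweighted mean of the eight scores. The unweighted mean is my own choice of summary, not a metric reported with the model, and the variable names are only illustrative.

```python
from statistics import mean

# Scores copied from the 470K-step results table above.
scores = {
    "HellaSwag": 0.4057,
    "PIQA": 0.6692,
    "OpenBookQA": 0.3300,
    "WinoGrande": 0.6717,
    "Social IQa": 0.1950,
    "CommonsenseQA": 0.2023,
    "ARC Easy": 0.5211,
    "ARC Challenge": 0.2876,
}

# Unweighted mean across benchmarks (an assumed summary, not an official metric).
macro_average = mean(scores.values())

# Benchmarks ordered from highest to lowest score.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for name, score in ranked:
    print(f"{name:<15}{score:.4f}")
print(f"{'Mean':<15}{macro_average:.4f}")
```

Because the benchmarks differ in the number of answer options, the mean is only a rough indicator and the per-benchmark scores remain the more useful view.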

I have also shared some text samples generated by the model (inference) in the screenshot below. I know this model is not up to the mark compared with the existing open-source models available in the industry. It is only partially pretrained, and anyone who wants to retrain it or resume training from it is free to do so; it's completely open. Thank you ;)
