This folder contains a text-generation model built on the Transformer architecture. I trained a 330-million-parameter model from scratch, starting from random weight initialization rather than from any existing pretrained weights. The training data was collected from FineWeb, FineWeb-Edu, and some other high-quality datasets (research papers, Wikipedia). I also evaluated the model at specific training steps.
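To give a feel for what a parameter count of this size means, here is a rough back-of-the-envelope sketch for a decoder-only Transformer. The layer count, hidden size, and vocabulary size below are hypothetical values chosen for illustration; they are not this model's actual configuration, which is not stated here. The estimate also ignores biases, layer norms, and positional embeddings.

```python
def estimate_params(n_layers, d_model, vocab_size, tie_embeddings=True):
    """Rough parameter count for a decoder-only Transformer.

    Assumes a standard block: 4 attention projections (Q, K, V, output)
    of size d_model x d_model, and an MLP with 4x expansion.
    Biases, layer norms, and positional embeddings are ignored.
    """
    attention = 4 * d_model * d_model   # Q, K, V, and output projections
    mlp = 8 * d_model * d_model         # d -> 4d and 4d -> d
    blocks = n_layers * (attention + mlp)
    embeddings = vocab_size * d_model   # token embedding matrix
    if not tie_embeddings:
        embeddings *= 2                 # separate output projection
    return blocks + embeddings


# Hypothetical shape, NOT the actual configuration of this model.
total = estimate_params(n_layers=24, d_model=1024, vocab_size=32000)
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumed values the estimate lands in the low 300-million range, which is the same order as the model described here.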

The evaluation details are shared below.

Evaluation Results (470K Steps)

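| Step | Benchmark     | Score  |
|------|---------------|--------|
| 470K | HellaSwag     | 0.4057 |
| 470K | PIQA          | 0.6692 |
| 470K | OpenBookQA    | 0.3300 |
| 470K | WinoGrande    | 0.6717 |
| 470K | Social IQa    | 0.1950 |
| 470K | CommonsenseQA | 0.2023 |
| 470K | ARC Easy      | 0.5211 |
| 470K | ARC Challenge | 0.2876 |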
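One way to read these scores is to compare each against the accuracy of random guessing. The sketch below does that, using the scores from the results above. The chance rates are my assumptions, derived from the usual number of answer choices per benchmark (for example 2 for PIQA and WinoGrande, 3 for Social IQa, 5 for CommonsenseQA); they are not part of the evaluation output.

```python
# Scores reported at 470K steps.
SCORES = {
    "HellaSwag": 0.4057,
    "PIQA": 0.6692,
    "OpenBookQA": 0.3300,
    "WinoGrande": 0.6717,
    "Social IQa": 0.1950,
    "CommonsenseQA": 0.2023,
    "ARC Easy": 0.5211,
    "ARC Challenge": 0.2876,
}

# Assumed random-guess accuracy: 1 / (typical number of answer choices).
CHANCE = {
    "HellaSwag": 1 / 4,
    "PIQA": 1 / 2,
    "OpenBookQA": 1 / 4,
    "WinoGrande": 1 / 2,
    "Social IQa": 1 / 3,
    "CommonsenseQA": 1 / 5,
    "ARC Easy": 1 / 4,
    "ARC Challenge": 1 / 4,
}


def margin_over_chance(scores, chance):
    """Return score minus assumed chance rate for each benchmark."""
    return {name: round(scores[name] - chance[name], 4) for name in scores}


margins = margin_over_chance(SCORES, CHANCE)
mean_score = sum(SCORES.values()) / len(SCORES)

for name, margin in margins.items():
    print(f"{name:<14} score={SCORES[name]:.4f}  margin={margin:+.4f}")
print(f"Mean score: {mean_score:.4f}")
```

Under these assumed chance rates, the Social IQa score falls below random guessing, so that evaluation setup may be worth re-checking.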

I have also shared some sample text generated by the model (inference) in the screenshot below. I know this model is not on par with the existing open-source models available in the industry. It is only partially pretrained, and anyone who wants to retrain it or resume training from these weights is welcome to do so; it's completely open. Thank you ;)
