
pythia-finewebedu

  • Generates half-intelligible English sentences using a small GPT-like model.
  • Outputs one sentence at a time.

This model is a fine-tuned version of EleutherAI/pythia-160m-deduped on the FineWebSentences dataset. It achieves the following results on the evaluation set:

  • Loss: 4.7702
  • Accuracy: 0.2402
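
For intuition, the reported loss can be converted to perplexity, assuming it is the usual mean per-token cross-entropy:

import math

# exp(cross-entropy loss) gives the perplexity: roughly 118 here
print(math.exp(4.7702))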

Model description

To generate 10 random sentences starting from an empty string on a CUDA device:

from transformers import pipeline, set_seed

# Load the fine-tuned model onto the GPU
generator = pipeline('text-generation', model='agentlans/pythia-finewebedu', device='cuda')

# Fix the random seed for reproducible samples
set_seed(1234)

# Sample 10 sequences of up to 100 tokens each, starting from an empty prompt
results = generator("", max_length=100, num_return_sequences=10, do_sample=True)

for x in results:
    print(x['generated_text'])

Output:

They are also, you need to get great results at her school.
According to him the term of the Newer, as an entity of the country.
- To provide less information to help prevent and respond appropriately, it also seems to take action.
He was an important historical project that he was going to have a history, but the fact that he lived in the US and then he can move back to where he left.
By the use of the ESLP and INGELTS OF THE TRAIL ORD and REPORTANCE OR:
However, the system and the Internet have not been built.
To bridge your teeth with your teeth of the plaque build up with the new teeth and tartar attachments to the tissues, as those without an orthoker.
This is more difficult than other to learn the basics of the workbooks, where a few thousand notes the same idea that the author can be seen on the work of the project.)
This study was that by one of the six states, in the middle of a union that he had to marry or union union.
- A-Pangana and Pitta, P.A. L. T.C.

Intended uses & limitations

  • For generating short lines of English text
  • Could be useful for:
    • data augmentation
    • creative inspiration
    • entertainment
    • CAPTCHA
  • Can be further fine-tuned on other data (a minimal sketch follows after this list), such as:
    • prompts
    • famous quotes
    • news headlines
    • blog post titles
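
A minimal fine-tuning sketch using the standard transformers Trainer API. The dataset file name, sequence length, and training settings below are placeholders, not the recipe used for this model:

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Placeholder dataset: one line of text per example (e.g. prompts, quotes, headlines)
dataset = load_dataset("text", data_files={"train": "my_sentences.txt"})

tokenizer = AutoTokenizer.from_pretrained("agentlans/pythia-finewebedu")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("agentlans/pythia-finewebedu")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="pythia-finewebedu-ft",
                           num_train_epochs=3,
                           per_device_train_batch_size=8,
                           learning_rate=5e-5),
    train_dataset=tokenized,
    # Causal LM collator: pads batches and copies input_ids to labels
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()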

Limitations include:

  • Not guaranteed to make sensible, coherent, or grammatically correct sentences
  • No regard for accuracy or truthfulness whatsoever
    • It's a bunch of words from a probability model, what do you expect?

Training and evaluation data

Sentences from HuggingFaceFW/fineweb-edu
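
The exact extraction procedure is not documented here; the following is a rough sketch of one way to pull individual sentences from fineweb-edu. The "sample-10BT" config and NLTK's sentence tokenizer are assumptions, not necessarily what was actually used:

import nltk
from datasets import load_dataset
from nltk.tokenize import sent_tokenize

nltk.download("punkt")

# Stream a small slice of fineweb-edu and split each document into sentences
docs = load_dataset("HuggingFaceFW/fineweb-edu", name="sample-10BT",
                    split="train", streaming=True)
sentences = []
for doc in docs.take(100):
    sentences.extend(sent_tokenize(doc["text"]))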

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
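
These settings correspond roughly to the following TrainingArguments. This is a sketch for reference; the actual training script is not included in this card:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="pythia-finewebedu",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
)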

Training results

No signs of overfitting were observed. As expected, the evaluation loss was lower with Pythia-160m than with Pythia-70m.

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.3.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2