arxiv-summarization

This model is a fine-tuned version of google/flan-t5-small, trained on the arxiv subset of the armanc/scientific_papers dataset. It is optimized for summarizing scientific abstracts.

Model Details

  • Base Model: google/flan-t5-small
  • Training Data: arXiv research papers (article → abstract)
  • Fine-Tuned Task: Text Summarization
  • Use Case: Generate short summaries of long research papers
  • License: Apache 2.0

How to Use

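from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model = T5ForConditionalGeneration.from_pretrained("Talina06/arxiv-summarization")
tokenizer = T5Tokenizer.from_pretrained("Talina06/arxiv-summarization")

# Prefix the input with the task instruction.
text = "Summarize: Deep learning is being used to advance medical research, particularly in cancer detection."
# Truncate inputs that exceed the tokenizer's maximum length.
inputs = tokenizer(text, return_tensors="pt", truncation=True)
# Set an explicit output budget; 64 new tokens is an illustrative value.
summary_ids = model.generate(**inputs, max_new_tokens=64)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print("Generated Summary:", summary)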
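
The single-input snippet above can be extended to several abstracts at once. The following is a minimal sketch, not a documented interface of this model: the helper names `build_prompts` and `summarize_batch` are hypothetical, and the defaults for `max_input_tokens`, `max_new_tokens`, and `num_beams` are assumptions chosen for illustration. The only detail taken from this card is the "Summarize: " prefix.

```python
from typing import List

PREFIX = "Summarize: "  # task prefix copied from the usage example in this card


def build_prompts(abstracts: List[str]) -> List[str]:
    """Collapse runs of whitespace and prepend the task prefix to each abstract."""
    return [PREFIX + " ".join(a.split()) for a in abstracts]


def summarize_batch(abstracts, model, tokenizer,
                    max_input_tokens=512, max_new_tokens=64, num_beams=4):
    """Summarize several abstracts with a single generate() call.

    The numeric defaults are illustrative assumptions, not values
    documented for this checkpoint.
    """
    prompts = build_prompts(abstracts)
    # Pad to the longest prompt in the batch and truncate anything longer
    # than max_input_tokens.
    inputs = tokenizer(prompts, return_tensors="pt", padding=True,
                       truncation=True, max_length=max_input_tokens)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens,
                                num_beams=num_beams)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as shown earlier, `summarize_batch(["first abstract ...", "second abstract ..."], model, tokenizer)` returns one summary string per input, in the same order.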

Training Details

  • Training Data: 100k+ arXiv research papers
  • Training Framework: Hugging Face Transformers
  • Hyperparameters:
    • Learning Rate: 5e-5
    • Batch Size: 8
    • Epochs: 10
  • Hardware Used: TPU & GPU
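
To make the hyperparameters above concrete, here is a rough sketch of how they might map onto Hugging Face `Seq2SeqTrainingArguments`, together with the step count they imply. This is not the author's training script. It assumes the batch size of 8 is per device, a single device, no gradient accumulation, and exactly 100,000 training examples (the card only says "100k+"). The output directory name is hypothetical.

```python
import math

# Values taken from the list above; the mapping of "Batch Size" to
# per_device_train_batch_size is an assumption.
HYPERPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "num_train_epochs": 10,
}


def estimate_steps(num_examples: int, batch_size: int, epochs: int):
    """Return (steps per epoch, total optimizer steps) for one device
    with no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


def build_training_args(output_dir: str = "flan-t5-small-arxiv"):
    """Build trainer arguments; output_dir is a hypothetical name."""
    # Imported lazily so the arithmetic above runs without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # decode with generate() during evaluation
        **HYPERPARAMS,
    )


print(estimate_steps(100_000, 8, 10))  # (12500, 125000)
```

Under these assumptions, one epoch is 12,500 optimizer steps and the full run is 125,000 steps. A larger dataset or a multi-device setup would change both numbers.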

Limitations

  • ❌ May struggle with very technical papers (e.g., complex math formulas).
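
One possible workaround for formula-heavy inputs is to mask LaTeX math before summarizing. The sketch below is a heuristic under stated assumptions, not something this card confirms will improve results: the function `mask_math` and the `[MATH]` placeholder are hypothetical, and the placeholder should be chosen to match however formulas appear in the training data.

```python
import re

# Matches display math ($$...$$) first, then inline math ($...$).
# Heuristic only: escaped dollars (\$) and currency amounts such as
# "$5 and $6" are not handled and may be masked by mistake.
_MATH = re.compile(r"\$\$.+?\$\$|\$[^$\n]+?\$", re.DOTALL)


def mask_math(text: str, placeholder: str = "[MATH]") -> str:
    """Replace LaTeX math spans with a placeholder token."""
    return _MATH.sub(placeholder, text)


print(mask_math("Energy $E = mc^2$ is conserved."))
# Energy [MATH] is conserved.
```

The masked text can then be passed to the model in place of the raw abstract. Comparing summaries with and without masking on a few papers is the simplest way to tell whether it helps.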

Example Summaries

| Original Abstract | Generated Summary |
|---|---|
| "Deep learning has transformed many fields... We propose a new CNN for cancer detection..." | "A CNN model is proposed for cancer detection using deep learning." |
| "Quantum computing has shown potential for cryptographic applications..." | "Quantum computing can be used in cryptography." |