
Model Card for Academ

Academ is a fine-tuned BART model for summarizing academic lectures.

To find out how the model was fine-tuned, you can check the notebook on Kaggle: https://www.kaggle.com/code/yousefr/college-lectures-summarization-bart-unsupervised/

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card was initially generated automatically.

  • Developed by: Yousef Gamaleldin
  • Model type: Summarization
  • Language(s) (NLP): English
  • License: MIT
  • Finetuned from model: facebook/bart-large-cnn (BART Large CNN)

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import BartForConditionalGeneration, AutoTokenizer

device = 'cuda' if torch.cuda.is_available() else 'cpu'

model = BartForConditionalGeneration.from_pretrained('yousefg/Academ-0.5').to(device)
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-cnn')

def get_summary(input_ids, attention_mask, context_length):
    # Summarize the transcript in chunks of `context_length` tokens,
    # then decode the concatenated summary token IDs as one text.
    summary_ids = []
    for i in range(0, input_ids.shape[1], context_length):
        # Python slicing clamps at the end of the sequence, so the
        # final (shorter) chunk needs no special casing.
        input_slice = input_ids[:, i:i + context_length]
        attention_mask_slice = attention_mask[:, i:i + context_length]

        summary = model.generate(
            input_slice,
            attention_mask=attention_mask_slice,
            max_new_tokens=1654,
            min_new_tokens=250,
            do_sample=True,
            renormalize_logits=True,
        )
        summary_ids.extend(summary[0].tolist())

    return tokenizer.decode(summary_ids, skip_special_tokens=True)

texts = "..."  # the lecture transcript to summarize
batch = tokenizer(texts, truncation=False)

input_ids = torch.tensor(batch['input_ids']).unsqueeze(0).to(device)
attention_mask = torch.tensor(batch['attention_mask']).unsqueeze(0).to(device)

summary = get_summary(input_ids, attention_mask, 1654)
print(summary)

Training Details

Training used a custom loss function that pushes the model toward an optimal summary length, chosen as 35% of the input length.
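The card does not spell out this loss; the Kaggle notebook linked above has the actual code. As a rough illustration only, one way to express such a length objective is to add a penalty on the deviation from the 35% target to the usual cross-entropy term. The function name, the alpha weight, and the L1 form of the penalty below are assumptions, not the notebook's implementation.

import torch

def length_regularized_loss(ce_loss, summary_lengths, input_lengths, alpha=1.0):
    # Hypothetical sketch: penalize summaries whose length drifts away
    # from the 35% target, normalized by the input length.
    target = 0.35 * input_lengths.float()
    length_penalty = torch.abs(summary_lengths.float() - target) / input_lengths.float()
    return ce_loss + alpha * length_penalty.mean()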

Training Hyperparameters

  • Training regime: bf16 non-mixed precision
  • Learning Rate: 0.001
  • Weight Decay: 0.01
  • Epochs: 4
  • Batch Size: 16
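For orientation, the sketch below shows one way these settings could map onto 🤗 training code; the notebook may well use a custom training loop instead, and the output path is made up.

import torch
from transformers import BartForConditionalGeneration, Seq2SeqTrainingArguments

# Non-mixed bf16 means the model weights themselves are bfloat16,
# as opposed to the Trainer's mixed-precision bf16=True flag.
model = BartForConditionalGeneration.from_pretrained(
    'facebook/bart-large-cnn', torch_dtype=torch.bfloat16
)

training_args = Seq2SeqTrainingArguments(
    output_dir='academ',                 # hypothetical path
    learning_rate=1e-3,
    weight_decay=0.01,
    num_train_epochs=4,
    per_device_train_batch_size=16,
)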

Evaluation

Evaluation is based on ROUGE-1, modified to discount padding tokens.
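The exact metric code is not reproduced here. A minimal sketch of what ROUGE-1 with padding discounted could look like, computed as unigram recall over token IDs, follows; the function name and the recall-only formulation are assumptions.

from collections import Counter

def rouge1_recall_no_pad(pred_ids, ref_ids, pad_token_id):
    # Drop padding tokens before counting unigram overlap.
    pred = Counter(t for t in pred_ids if t != pad_token_id)
    ref = Counter(t for t in ref_ids if t != pad_token_id)
    overlap = sum((pred & ref).values())
    return overlap / max(sum(ref.values()), 1)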

Testing Data

The model's test dataset had 289 lectures, mainly from MIT OpenCourseWare.

Results

The model achieved a ROUGE-1 score of 96% on the test dataset and 93% on the evaluation dataset.

Summary

Academ is a summarization model trained on 2307 lectures, mainly from MIT OpenCourseWare. The model has a maximum sequence length of 1654 tokens, 630 more than the original model's 1024.
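Extending the context window like this typically means enlarging BART's learned position-embedding table. The snippet below is a sketch of that idea, not the notebook's actual code; the random initialization of the new rows is an assumption, while the +2 accounts for the offset rows BART's implementation reserves.

import torch
from transformers import BartForConditionalGeneration

model = BartForConditionalGeneration.from_pretrained('facebook/bart-large-cnn')

def extend_positions(embed, new_max, std=0.02):
    old = embed.weight.data
    extended = torch.empty(new_max + 2, old.shape[1], dtype=old.dtype)
    torch.nn.init.normal_(extended, std=std)   # new rows: random init (assumption)
    extended[:old.shape[0]] = old              # old rows: copied from the checkpoint
    embed.weight.data = extended

extend_positions(model.model.encoder.embed_positions, 1654)
extend_positions(model.model.decoder.embed_positions, 1654)
model.config.max_position_embeddings = 1654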
