  • Model Name: billsum-BART-base-cnn

Description:

This model is based on BART (Bidirectional and Auto-Regressive Transformers), originally introduced in the paper "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension" by Lewis et al. Starting from a BART-base checkpoint already fine-tuned on the CNN/Daily Mail dataset, it was further fine-tuned for text summarization on the BillSum dataset, which consists of summaries of US Congressional and California state bills.

Model Architecture:

BART is a transformer-based encoder-decoder (seq2seq) model with a bidirectional encoder and an autoregressive decoder. It excels in text generation tasks such as summarization and translation and is effective for comprehension tasks like text classification and question answering.
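To make the encoder-decoder flow concrete, here is a minimal sketch using the Hugging Face transformers API with this card's checkpoint; the generation settings (num_beams=4, max_length=256) are illustrative choices, not values taken from this card:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned seq2seq (encoder-decoder) checkpoint.
tokenizer = AutoTokenizer.from_pretrained("ayoubkirouane/billsum-BART-base-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("ayoubkirouane/billsum-BART-base-cnn")

# The bidirectional encoder reads the whole input at once; the
# autoregressive decoder then emits the summary token by token.
inputs = tokenizer("Text of a bill ...", return_tensors="pt",
                   truncation=True, max_length=1024)
output_ids = model.generate(**inputs, max_length=256, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))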

Dataset Used:

The model has been fine-tuned on the BillSum dataset, which includes the following features:

  • text: The bill text.
  • summary: A summary of the bill.
  • title: The title of the bill (available for US bills only).
  • text_len: The number of characters in the text.
  • sum_len: The number of characters in the summary.

The data was collected from various sources, including the United States Government Publishing Office (GPO) and the California legislature's website.
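As a sketch, the dataset can be loaded and inspected with the Hugging Face datasets library. The split names below follow the Hub's billsum dataset, where "ca_test" holds the California bills; the text_len and sum_len features may need to be computed from the text if the Hub version does not expose them as columns:

from datasets import load_dataset

# Load the US Congressional training split; "ca_test" holds California bills.
billsum = load_dataset("billsum", split="train")

example = billsum[0]
print(example["title"])            # bill title (US bills only)
print(example["summary"][:200])    # start of the human-written summary
print(len(example["text"]))        # character count, i.e. the text_len feature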

Uses:

  • Text Summarization: This model can be used to generate concise summaries of longer text documents, making it suitable for applications like news article summarization, document summarization, and more.

Limitations:

  • Data Dependency: The model's performance heavily relies on the quality and diversity of the training data. Fine-tuning on specific datasets may lead to biases or limitations inherent to those datasets.
  • Length Constraints: Like many sequence-to-sequence models, BART has a fixed maximum input length (1,024 tokens for BART-base). Longer inputs are truncated, which can yield incomplete summaries; see the sketch after this list.
  • Domain Specificity: While fine-tuned on bill summaries, the model may not generalize well to other domains without further fine-tuning.
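A minimal sketch of the truncation issue, assuming the tokenizer from this card's checkpoint (the long_bill string is a placeholder, not real data):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ayoubkirouane/billsum-BART-base-cnn")

long_bill = "Section 1. " * 2000   # placeholder standing in for a very long bill

# With truncation enabled, everything past 1,024 tokens is silently dropped,
# so the model never sees the tail of the bill.
encoded = tokenizer(long_bill, truncation=True, max_length=1024,
                    return_tensors="pt")
print(encoded["input_ids"].shape)  # torch.Size([1, 1024])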

Ethical Considerations:

  • Bias: Models like BART can inherit biases present in their training data. Care should be taken to evaluate and mitigate biases in generated content, especially when dealing with legal or legislative documents.
  • Privacy: When summarizing text, ensure that sensitive or private information is not inadvertently disclosed in the summaries.
  • Accessibility: Consider making model outputs accessible to individuals with disabilities, such as providing summaries in accessible formats.

Usage:

from transformers import pipeline

# Create a summarization pipeline with the fine-tuned model
pipe = pipeline("summarization", model="ayoubkirouane/billsum-BART-base-cnn")

# Input text for summarization
input_text = """
Shields a business entity from civil liability relating to any injury or death occurring at a facility of that entity in connection with a use of such facility by a nonprofit organization if: (1) the use occurs outside the scope of business of the business entity; (2) such injury or death occurs during a period that such facility is used by such organization; and (3) the business entity authorized the use of such facility by the organization. Makes this Act inapplicable to an injury or death that results from an act or omission of a business entity that constitutes gross negligence or intentional misconduct, including misconduct that: (1) constitutes a hate crime or a crime of violence or act of international terrorism for which the defendant has been convicted in any court; or (2) involves a sexual offense for which the defendant has been convicted in any court or misconduct for which the defendant has been found to have violated a Federal or State civil rights law. Preempts State laws to the extent that such laws are inconsistent with this Act, except State law that provides additional protection from liability. Specifies that this Act shall not be construed to supersede any Federal or State health or safety law. Makes this Act inapplicable to any civil action in a State court against a business entity in which all parties are citizens of the State if such State, citing this Act's authority and containing no other provision, enacts a statute declaring the State's election that this Act shall not apply to such action in the State.
"""

# Generate the summary
summary = pipe(input_text, max_length=1024)

# Print the generated summary
print(summary[0]['summary_text'])