|
--- |
|
language: "en" |
|
datasets: |
|
- spotify-podcast-dataset |
|
tags: |
|
- bert |
|
- classification |
|
- pytorch |
|
pipeline_tag: text-classification
|
widget: |
|
- text: "__START__ [SEP] This is the first podcast on natural language processing applied to spoken language." |
|
- text: "This is the first podcast on natural language processing applied to spoken language. [SEP] You can find us on https://twitter.com/PodcastExampleClassifier." |
|
- text: "You can find us on https://twitter.com/PodcastExampleClassifier. [SEP] You can also subscribe to our newsletter https://newsletter.com/PodcastExampleClassifier." |
|
--- |
|
|
|
**General Information** |
|
|
|
This is a `bert-base-cased` binary classification model, fine-tuned to classify whether a given sentence contains advertising content. It leverages the previous sentence as context to make more accurate predictions.

The model is used in the paper "Leveraging Multimodal Content for Podcast Summarization," published at ACM SAC 2022.
|
|
|
**Usage:** |
|
|
|
```python |
|
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("morenolq/spotify-podcast-advertising-classification")
tokenizer = AutoTokenizer.from_pretrained("morenolq/spotify-podcast-advertising-classification")

desc_sentences = ["Sentence 1", "Sentence 2", "Sentence 3"]
for i, s in enumerate(desc_sentences):
    # The first sentence has no preceding sentence, so the special
    # __START__ token is used as context.
    context = "__START__" if i == 0 else desc_sentences[i - 1]
    out = tokenizer(context, s,
                    padding="max_length",
                    max_length=256,
                    truncation=True,
                    return_attention_mask=True,
                    return_tensors="pt")
    outputs = model(**out)
    print(f"{s}, {outputs.logits}")
|
``` |
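The model returns raw logits rather than probabilities. A minimal sketch of converting them, shown with a pure-Python softmax for illustration (in practice `torch.softmax(outputs.logits, dim=-1)` does the same); the logit values below are made up, and the assumption that label `1` denotes advertising content follows the usual positive-class convention:

```python
import math

# Hypothetical logits for one sentence: index 0 = non-advertising, 1 = advertising.
logits = [-1.2, 2.3]

# Numerically stable softmax over the two classes.
m = max(logits)
exps = [math.exp(x - m) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

predicted = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted)
```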
|
|
|
The manually annotated data used for model fine-tuning is available [here](https://github.com/MorenoLaQuatra/MATeR/blob/main/description_sentences_classification.tsv).
|
|
|
Below is the classification report of the model evaluation on the test split:
|
|
|
``` |
|
precision recall f1-score support |
|
|
|
0 0.95 0.93 0.94 256 |
|
1 0.88 0.91 0.89 140 |
|
|
|
accuracy 0.92 396 |
|
macro avg 0.91 0.92 0.92 396 |
|
weighted avg 0.92 0.92 0.92 396 |
|
``` |
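As a sanity check, the macro and weighted averages in the report can be reproduced from the per-class F1 scores and supports (values copied from the table above; both round to the reported 0.92):

```python
# Per-class F1 and support, copied from the classification report.
f1 = {0: 0.94, 1: 0.89}
support = {0: 256, 1: 140}

# Macro average: unweighted mean over the two classes.
macro_f1 = (f1[0] + f1[1]) / 2  # ~0.915

# Weighted average: mean weighted by class support.
total = support[0] + support[1]
weighted_f1 = (f1[0] * support[0] + f1[1] * support[1]) / total  # ~0.922

print(macro_f1, weighted_f1)
```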
|
|
|
If you find it useful, please cite the following paper: |
|
|
|
```bibtex |
|
@inproceedings{10.1145/3477314.3507106, |
|
author = {Vaiani, Lorenzo and La Quatra, Moreno and Cagliero, Luca and Garza, Paolo}, |
|
title = {Leveraging Multimodal Content for Podcast Summarization}, |
|
year = {2022}, |
|
isbn = {9781450387132}, |
|
publisher = {Association for Computing Machinery}, |
|
address = {New York, NY, USA}, |
|
url = {https://doi.org/10.1145/3477314.3507106}, |
|
doi = {10.1145/3477314.3507106}, |
|
booktitle = {Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing}, |
|
pages = {863–870}, |
|
numpages = {8}, |
|
keywords = {multimodal learning, multimodal features fusion, extractive summarization, deep learning, podcast summarization}, |
|
location = {Virtual Event}, |
|
series = {SAC '22} |
|
} |
|
``` |