---
language: "en"
datasets:
- spotify-podcast-dataset
tags:
- bert
- classification
- pytorch
pipeline_tag: text-classification
widget:
- text: "__START__ [SEP] This is the first podcast on natural language processing applied to spoken language."
- text: "This is the first podcast on natural language processing applied to spoken language. [SEP] You can find us on https://twitter.com/PodcastExampleClassifier."
- text: "You can find us on https://twitter.com/PodcastExampleClassifier. [SEP] You can also subscribe to our newsletter https://newsletter.com/PodcastExampleClassifier."
---

**General Information**

This is a `bert-base-cased` binary classification model, fine-tuned to classify whether a given sentence contains advertising content. It leverages the previous sentence as context to make more accurate predictions.
The model is used in the paper "Leveraging Multimodal Content for Podcast Summarization," published at ACM SAC 2022.
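The context-pairing scheme described above can be sketched in plain Python (a minimal, model-independent illustration; `build_context_pairs` is a hypothetical helper, not part of the released code):

```python
def build_context_pairs(sentences, start_token="__START__"):
    """Pair each sentence with its preceding sentence as context.

    The first sentence has no predecessor, so it is paired with the
    special start token used during fine-tuning.
    """
    pairs = []
    for i, sentence in enumerate(sentences):
        context = start_token if i == 0 else sentences[i - 1]
        pairs.append((context, sentence))
    return pairs


sentences = [
    "This is the first podcast on NLP.",
    "You can find us on Twitter.",
]
print(build_context_pairs(sentences))
# [('__START__', 'This is the first podcast on NLP.'), ('This is the first podcast on NLP.', 'You can find us on Twitter.')]
```

Each `(context, sentence)` pair corresponds to one sentence-pair input of the model, matching the `context [SEP] sentence` format shown in the widget examples.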

**Usage:**

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("morenolq/spotify-podcast-advertising-classification")
tokenizer = AutoTokenizer.from_pretrained("morenolq/spotify-podcast-advertising-classification")

desc_sentences = ["Sentence 1", "Sentence 2", "Sentence 3"]
for i, s in enumerate(desc_sentences):
    # The first sentence has no predecessor: use the special __START__ context token.
    context = "__START__" if i == 0 else desc_sentences[i - 1]
    inputs = tokenizer(
        context,
        s,
        padding="max_length",
        max_length=256,
        truncation=True,
        return_attention_mask=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        outputs = model(**inputs)
    # Take the predicted class index from the logits.
    prediction = outputs.logits.argmax(dim=-1).item()
    print(f"{s} -> {prediction}")
```

The manually annotated data used for model fine-tuning are available [here](https://github.com/MorenoLaQuatra/MATeR/blob/main/description_sentences_classification.tsv).

The classification report of the model evaluation on the test split is reported below:

```
              precision    recall  f1-score   support

           0       0.95      0.93      0.94       256
           1       0.88      0.91      0.89       140

    accuracy                           0.92       396
   macro avg       0.91      0.92      0.92       396
weighted avg       0.92      0.92      0.92       396
```
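For reference, the per-class precision, recall, and F1 values in the report follow the standard definitions; a minimal pure-Python sketch (with illustrative dummy labels, not the actual test-split predictions):

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Illustrative labels only (1 = positive class, 0 = negative class).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
p, r, f = precision_recall_f1(y_true, y_pred)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
# precision=0.75 recall=0.75 f1=0.75
```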

If you find this model useful, please cite the following paper:

```bibtex
@inproceedings{10.1145/3477314.3507106,
    author = {Vaiani, Lorenzo and La Quatra, Moreno and Cagliero, Luca and Garza, Paolo},
    title = {Leveraging Multimodal Content for Podcast Summarization},
    year = {2022},
    isbn = {9781450387132},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3477314.3507106},
    doi = {10.1145/3477314.3507106},
    booktitle = {Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing},
    pages = {863–870},
    numpages = {8},
    keywords = {multimodal learning, multimodal features fusion, extractive summarization, deep learning, podcast summarization},
    location = {Virtual Event},
    series = {SAC '22}
}
```