Moreno La Quatra committed on
Commit
db5798c
1 Parent(s): f17c017

Create README.md

  1. README.md +26 -0
README.md ADDED
@@ -0,0 +1,26 @@
+ **General Information**
+
+ This is a BERT-based (base) classification model that classifies a given sentence as containing advertising content or not.
+ The model was used in the paper 'Leveraging multimodal content for podcast summarization', published at ACM SAC 2022.
+
+ **Usage:**
+
+ ```python
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+ model = AutoModelForSequenceClassification.from_pretrained('morenolq/spotify-podcast-advertising-classification')
+ tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
+
+ desc_sentences = ["Sentence 1", "Sentence 2", "Sentence 3"]
+ for i, s in enumerate(desc_sentences):
+     if i == 0:  # the first sentence has no predecessor
+         context = "__START__"
+     else:
+         context = desc_sentences[i - 1]  # the previous sentence is the context
+     out = tokenizer(context, s, padding="max_length",  # encode the (context, sentence) pair
+                     max_length=256,
+                     truncation=True,
+                     return_attention_mask=True,
+                     return_tensors='pt')
+     outputs = model(**out)
+     print(f"{s},{outputs}")
+ ```
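
The usage snippet in this diff pairs each sentence with the one before it and prints the raw model output. As a minimal sketch of those two steps in isolation, the block below builds the (context, sentence) pairs and converts a pair of logits into probabilities. The helper names `build_context_pairs` and `softmax` are hypothetical and not part of the committed README, and the logits are made-up values rather than real model output.

```python
import math

START_CONTEXT = "__START__"  # placeholder context the README uses for the first sentence


def build_context_pairs(sentences, start_context=START_CONTEXT):
    """Pair each sentence with the sentence before it (hypothetical helper)."""
    pairs = []
    for i, sentence in enumerate(sentences):
        # The first sentence has no predecessor, so it gets the start marker.
        context = start_context if i == 0 else sentences[i - 1]
        pairs.append((context, sentence))
    return pairs


def softmax(logits):
    """Convert a list of raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


pairs = build_context_pairs(["Sentence 1", "Sentence 2", "Sentence 3"])
probs = softmax([0.0, 0.0])  # made-up logits, not real model output
```

With the real model, the logits for one sentence could presumably be obtained with `outputs.logits[0].tolist()` and passed to `softmax`. The README does not state which class index corresponds to advertising content, so check `model.config.id2label` before interpreting the probabilities.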