antypasd committed on
Commit
0a6fc5a
1 Parent(s): 39e277b
Files changed (2)
  1. README.md +46 -60
  2. tf_model.h5 +3 -0
README.md CHANGED
@@ -1,60 +1,46 @@
- # tweet-topic-21-single
-
- This is a roBERTa-base model trained on ~124M tweets from January 2018 to December 2021 (see [here](https://huggingface.co/cardiffnlp/twitter-roberta-base-2021-124m)), and finetuned for single-label topic classification on a corpus of 6,997 tweets.
- The original roBERTa-base model can be found [here](https://huggingface.co/cardiffnlp/twitter-roberta-base-2021-124m) and the original reference paper is [TweetEval](https://github.com/cardiffnlp/tweeteval). This model is suitable for English.
-
- - Reference Paper: [TimeLMs paper](https://arxiv.org/abs/2202.03829).
- - Git Repo: [TimeLMs official repository](https://github.com/cardiffnlp/timelms).
-
- <b>Labels</b>:
- - 0 -> arts_&_culture;
- - 1 -> business_&_entrepreneurs;
- - 2 -> pop_culture;
- - 3 -> daily_life;
- - 4 -> sports_&_gaming;
- - 5 -> science_&_technology
-
-
- ## Full classification example
-
- ```python
- from transformers import AutoModelForSequenceClassification
- from transformers import AutoTokenizer
- import numpy as np
- from scipy.special import softmax
-
-
- MODEL = f"antypasd/tweet-topic-21-single"
- tokenizer = AutoTokenizer.from_pretrained(MODEL)
-
- # PT
- model = AutoModelForSequenceClassification.from_pretrained(MODEL)
- class_mapping = model.config.id2label
-
- text = "Tesla stock is on the rise!"
- encoded_input = tokenizer(text, return_tensors='pt')
- output = model(**encoded_input)
-
- output = model(**encoded_input)
- scores = output[0][0].detach().numpy()
- scores = softmax(scores)
-
- ranking = np.argsort(scores)
- ranking = ranking[::-1]
- for i in range(scores.shape[0]):
-     l = class_mapping[ranking[i]]
-     s = scores[ranking[i]]
-     print(f"{i+1}) {l} {np.round(float(s), 4)}")
-
- ```
-
- Output:
-
- ```
- 1) business_&_entrepreneurs 0.8361
- 2) science_&_technology 0.0904
- 3) pop_culture 0.0288
- 4) daily_life 0.0178
- 5) arts_&_culture 0.0137
- 6) sports_&_gaming 0.0133
- ```
 
+ ---
+ tags:
+ - generated_from_keras_callback
+ model-index:
+ - name: tf version
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information Keras had access to. You should
+ probably proofread and complete it, then remove this comment. -->
+
+ # tf version
+
+ This model is a fine-tuned version of [antypasd/tweet-topic-21-single](https://huggingface.co/antypasd/tweet-topic-21-single) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - optimizer: None
+ - training_precision: float32
+
+ ### Training results
+
+
+
+ ### Framework versions
+
+ - Transformers 4.19.2
+ - TensorFlow 2.8.2
+ - Tokenizers 0.12.1
 
tf_model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d646e555dec74776f36f0727310eea5e2dff51ec0655a6dc5c28474c64f0d960
+ size 498890624
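
The three lines added for `tf_model.h5` are a Git LFS pointer, not the weights themselves: they record the pointer spec version, the SHA-256 digest of the real file, and its size in bytes. As a minimal sketch of how such a pointer could be read, the snippet below parses the text shown in the hunk above. The helper name `parse_lfs_pointer` and the returned dictionary keys are hypothetical, chosen for illustration only; they are not part of Git LFS or the Hugging Face tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash-algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the tf_model.h5 hunk in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d646e555dec74776f36f0727310eea5e2dff51ec0655a6dc5c28474c64f0d960
size 498890624
"""

info = parse_lfs_pointer(POINTER)
size_mib = info["size_bytes"] / (1024 ** 2)
print(f"{info['hash_algorithm']} digest, {len(info['digest'])} hex chars, {size_mib:.1f} MiB")
```

The digest and size could then be compared against a downloaded copy of the file to confirm that it matches what the commit recorded.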