aychang committed
Commit 70316ea
1 Parent(s): e0cbe9e

Push distilbert base cased trec model

README.md ADDED

---
language:
- en
thumbnail:
tags:
- text-classification
license: mit
datasets:
- trec
metrics:
---

# TREC 6-class Task: distilbert-base-cased

## Model description

A simple DistilBERT base (cased) model fine-tuned on the "trec" dataset for coarse-grained (6-class) question classification.

## Intended uses & limitations

#### How to use

##### Transformers

```python
# Load model and tokenizer
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "aychang/distilbert-base-cased-trec-coarse"

model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Use pipeline
from transformers import pipeline

nlp = pipeline("text-classification", model=model_name, tokenizer=model_name)

results = nlp(["Where did the queen go?", "Why did the Queen hire 1000 ML Engineers?"])
```
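
The pipeline returns one `{'label': ..., 'score': ...}` dict per input text; the label strings are the six TREC coarse classes (DESC, ENTY, ABBR, HUM, NUM, LOC) defined in the model's `config.json`.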

##### AdaptNLP

```python
from adaptnlp import EasySequenceClassifier

model_name = "aychang/distilbert-base-cased-trec-coarse"
texts = ["Where did the queen go?", "Why did the Queen hire 1000 ML Engineers?"]

classifier = EasySequenceClassifier()
results = classifier.tag_text(text=texts, model_name_or_path=model_name, mini_batch_size=2)
```

#### Limitations and bias

This is a minimal language model fine-tuned on a single benchmark dataset; it only distinguishes the six coarse TREC question categories and inherits any biases of the underlying distilbert-base-cased model and the TREC data.

## Training data

TREC: https://huggingface.co/datasets/trec
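
As an illustration (not the author's exact preprocessing), the dataset can be loaded and tokenized roughly as follows; the column name `coarse_label` is an assumption that matches current versions of the `trec` dataset and may differ in older ones.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# TREC question classification dataset (train/test splits)
dataset = load_dataset("trec")

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")

def tokenize(batch):
    # Truncate to the model's maximum length; padding is handled
    # dynamically by the Trainer's data collator later on
    return tokenizer(batch["text"], truncation=True)

encoded = dataset.map(tokenize, batched=True)

# The Trainer expects the label column to be called "labels"
encoded = encoded.rename_column("coarse_label", "labels")
```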

## Training procedure

Preprocessing, hardware, and the `TrainingArguments` used for fine-tuning are summarized below (see the tokenization sketch under Training data for preprocessing).

#### Hardware

One NVIDIA V100 GPU.

#### Hyperparameters and Training Args
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='./models',
    overwrite_output_dir=False,
    num_train_epochs=2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_steps=500,
    weight_decay=0.01,
    evaluation_strategy="steps",
    logging_dir='./logs',
    fp16=False,
    eval_steps=500,
    save_steps=300000
)
```
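
For context, a minimal sketch of how these arguments could be wired into a `Trainer` run; `encoded`, `tokenizer`, and `compute_metrics` are the hypothetical names from the other sketches in this card, not identifiers from the original training script.

```python
from transformers import AutoModelForSequenceClassification, Trainer

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-cased", num_labels=6
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["test"],
    tokenizer=tokenizer,              # enables dynamic padding via the default collator
    compute_metrics=compute_metrics,  # see the sketch under "Eval results"
)

trainer.train()
trainer.evaluate()
```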

## Eval results

```
{'epoch': 2.0,
 'eval_accuracy': 0.97,
 'eval_f1': array([0.98220641, 0.91620112, 1.        , 0.97709924, 0.98678414,
        0.97560976]),
 'eval_loss': 0.14275787770748138,
 'eval_precision': array([0.96503497, 0.96470588, 1.        , 0.96969697, 0.98245614,
        0.96385542]),
 'eval_recall': array([1.        , 0.87234043, 1.        , 0.98461538, 0.99115044,
        0.98765432]),
 'eval_runtime': 0.9731,
 'eval_samples_per_second': 513.798}
```
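
The per-class precision/recall/F1 arrays are presumably ordered by class id, so they line up with the `id2label` map in `config.json` (0=DESC, 1=ENTY, 2=ABBR, 3=HUM, 4=NUM, 5=LOC). A plausible `compute_metrics` that yields this kind of dict (a sketch, not necessarily the function actually used) is:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds)
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,  # per-class arrays, ordered by class id:
        "recall": recall,        # 0=DESC, 1=ENTY, 2=ABBR, 3=HUM, 4=NUM, 5=LOC
        "f1": f1,
    }
```
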
config.json ADDED

{
  "_name_or_path": "distilbert-base-cased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForSequenceClassification"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "id2label": {
    "0": "DESC",
    "1": "ENTY",
    "2": "ABBR",
    "3": "HUM",
    "4": "NUM",
    "5": "LOC"
  },
  "initializer_range": 0.02,
  "label2id": {
    "ABBR": 2,
    "DESC": 0,
    "ENTY": 1,
    "HUM": 3,
    "LOC": 5,
    "NUM": 4
  },
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "output_past": true,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.2.2",
  "vocab_size": 28996
}
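
To illustrate how the `id2label` map above determines the predicted label names, a minimal manual-inference sketch (doing by hand what the pipeline in the README does internally):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "aychang/distilbert-base-cased-trec-coarse"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

inputs = tokenizer("Where did the queen go?", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# config.id2label maps class ids (0-5) to DESC/ENTY/ABBR/HUM/NUM/LOC
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])
```
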
pytorch_model.bin ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:8da6e0ad32e33759f59921cee09d4ed9608684f50ffb3feced0cf256f6a3f7d7
size 263186580
special_tokens_map.json ADDED

{"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
tokenizer_config.json ADDED

{"do_lower_case": false, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "model_max_length": 512, "name_or_path": "distilbert-base-cased"}
training_args.bin ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:12163274fe7b72d50302e6c69fcf07252ae189f29eba56f1c44bbe67d84d02de
size 1967
vocab.txt ADDED
The diff for this file is too large to render. See raw diff