julien-c HF staff committed on
Commit f53f1a8
1 Parent(s): 9bcd743

Migrate model card from transformers-repo


Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/mrm8488/longformer-base-4096-finetuned-squadv2/README.md

Files changed (1):
1. README.md +72 -0
README.md ADDED

---
language: en
datasets:
- squad_v2
---

# Longformer-base-4096 fine-tuned on SQuAD v2

[Longformer-base-4096 model](https://huggingface.co/allenai/longformer-base-4096) fine-tuned on [SQuAD v2](https://rajpurkar.github.io/SQuAD-explorer/) for the **Q&A** downstream task.

## Longformer-base-4096

[Longformer](https://arxiv.org/abs/2004.05150) is a transformer model for long documents.

`longformer-base-4096` is a BERT-like model started from the RoBERTa checkpoint and pretrained for MLM on long documents. It supports sequences of length up to 4,096 tokens.

Longformer uses a combination of sliding-window (local) attention and global attention. Global attention is user-configured based on the task so that the model can learn task-specific representations.
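
For this QA checkpoint, the `forward` pass sets global attention on the question tokens automatically (see the comments in the *Model in Action* snippet below). As a minimal, illustrative sketch (not part of the original card) of how global attention can be configured by hand on the base model, the `global_attention_mask` argument marks the tokens that attend globally:

```python
import torch
from transformers import LongformerModel, LongformerTokenizerFast

# Illustrative only: the base (not fine-tuned) checkpoint
tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

inputs = tokenizer("A long document " * 500, return_tensors="pt")

# Sliding-window (local) attention everywhere by default ...
global_attention_mask = torch.zeros_like(inputs["input_ids"])
# ... plus global attention on the first (<s>) token, a common choice for classification-style tasks
global_attention_mask[:, 0] = 1

outputs = model(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```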

## Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓

Dataset ID: ```squad_v2``` from [HuggingFace/NLP](https://github.com/huggingface/nlp)

| Dataset  | Split | # samples |
| -------- | ----- | --------- |
| squad_v2 | train | 130319    |
| squad_v2 | valid | 11873     |
26
+
27
+ How to load it from [nlp](https://github.com/huggingface/nlp)
28
+
29
+ ```python
30
+ train_dataset = nlp.load_dataset('squad_v2', split=nlp.Split.TRAIN)
31
+ valid_dataset = nlp.load_dataset('squad_v2', split=nlp.Split.VALIDATION)
32
+ ```
33
+ Check out more about this dataset and others in [NLP Viewer](https://huggingface.co/nlp/viewer/)
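
The `nlp` library has since been renamed to [`datasets`](https://github.com/huggingface/datasets). As a brief, equivalent sketch with the current library (not part of the original card):

```python
from datasets import load_dataset

# Downloads and caches both splits of SQuAD v2
squad_v2 = load_dataset("squad_v2")
train_dataset = squad_v2["train"]
valid_dataset = squad_v2["validation"]
```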

## Model fine-tuning 🏋️‍

The training script is a slightly modified version of [this one](https://colab.research.google.com/drive/1zEl5D-DdkBKva-DdreVOmN0hrAfzKG1o?usp=sharing).
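
The original script is not reproduced in this card. As a rough, hypothetical sketch (not the author's actual setup) of what fine-tuning `allenai/longformer-base-4096` on `squad_v2` with the `Trainer` API could look like; the preprocessing and every hyperparameter below are illustrative assumptions:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    default_data_collator,
)

base_ckpt = "allenai/longformer-base-4096"  # assumption: same base checkpoint as the card
tokenizer = AutoTokenizer.from_pretrained(base_ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(base_ckpt)

squad_v2 = load_dataset("squad_v2")

def add_answer_positions(examples):
    # Tokenize (question, context) pairs; truncate only the context if it exceeds the window
    enc = tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",
        max_length=4096,
        padding="max_length",
        return_offsets_mapping=True,
    )
    start_positions, end_positions = [], []
    for i, answers in enumerate(examples["answers"]):
        offsets = enc["offset_mapping"][i]
        sequence_ids = enc.sequence_ids(i)
        if not answers["answer_start"]:
            # Unanswerable question (SQuAD v2): point both labels at the <s> token
            start_positions.append(0)
            end_positions.append(0)
            continue
        start_char = answers["answer_start"][0]
        end_char = start_char + len(answers["text"][0])
        token_start = token_end = 0
        for idx, (s, e) in enumerate(offsets):
            if sequence_ids[idx] != 1:
                continue  # skip question and special tokens
            if s <= start_char < e:
                token_start = idx
            if s < end_char <= e:
                token_end = idx
        start_positions.append(token_start)
        end_positions.append(token_end)
    enc["start_positions"] = start_positions
    enc["end_positions"] = end_positions
    enc.pop("offset_mapping")
    return enc

columns = squad_v2["train"].column_names
train_ds = squad_v2["train"].map(add_answer_positions, batched=True, remove_columns=columns)
valid_ds = squad_v2["validation"].map(add_answer_positions, batched=True, remove_columns=columns)

args = TrainingArguments(
    output_dir="longformer-base-4096-finetuned-squadv2",
    per_device_train_batch_size=1,   # long sequences are memory hungry
    gradient_accumulation_steps=8,
    learning_rate=3e-5,
    num_train_epochs=2,
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=valid_ds,
    data_collator=default_data_collator,
).train()
```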

## Model in Action 🚀

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("mrm8488/longformer-base-4096-finetuned-squadv2")
model = AutoModelForQuestionAnswering.from_pretrained("mrm8488/longformer-base-4096-finetuned-squadv2")

text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done?"
encoding = tokenizer(question, text, return_tensors="pt")
input_ids = encoding["input_ids"]

# default is local attention everywhere
# the forward method will automatically set global attention on question tokens
attention_mask = encoding["attention_mask"]

with torch.no_grad():
    outputs = model(input_ids, attention_mask=attention_mask)
start_scores, end_scores = outputs.start_logits, outputs.end_logits

all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())

# the answer spans from the highest-scoring start token to the highest-scoring end token
answer_tokens = all_tokens[torch.argmax(start_scores): torch.argmax(end_scores) + 1]
answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))

# output => democratized NLP
```

If, for the same context, we ask something that is not there, the output for **no answer** will be ```<s>```.
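
The model can also be queried through the `question-answering` pipeline (not part of the original card); passing `handle_impossible_answer=True` lets the pipeline return an empty answer when the context contains none:

```python
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="mrm8488/longformer-base-4096-finetuned-squadv2",
    tokenizer="mrm8488/longformer-base-4096-finetuned-squadv2",
)

context = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."

# Answerable question
print(qa(question="What has Huggingface done?", context=context))
# e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'democratized NLP'}

# Question the context cannot answer: allow an empty answer instead of a forced span
print(qa(question="Who founded Google?", context=context, handle_impossible_answer=True))
```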

> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) | [LinkedIn](https://www.linkedin.com/in/manuel-romero-cs/)

> Made with <span style="color: #e25555;">&hearts;</span> in Spain