--- license: apache-2.0 language: en --- # BART (large-sized model) ## Model description BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation) but also works well for comprehension tasks (e.g. text classification, question answering). Weights shared here are effectively from facebook/bart-large but with added noise for BOS embedding to assist the finetuning. ## Intended uses & limitations There have been quite a few issues related to finetuning BART for text generation, and this repo implements solution discussed in [#15559](https://github.com/huggingface/transformers/issues/15559). Particularly adding some noise to pre-trained model's BOS embedding. This seems to solve the problem of endless BOS generation for a finetuned BART model. You can use the raw model for text infilling. However, the model is mostly meant to be fine-tuned on a supervised dataset. See the [model hub](https://huggingface.co/models?search=bart) to look for fine-tuned versions on a task that interests you. ### How to use Here is how to use this model in PyTorch: ```python from transformers import BartTokenizer, BartModel tokenizer = BartTokenizer.from_pretrained('vedu/bart-large-perturbed') model = BartModel.from_pretrained('vedu/bart-large-perturbed') inputs = tokenizer("Hello, my dog is cute", return_tensors="pt") outputs = model(**inputs) last_hidden_states = outputs.last_hidden_state ``` ### BibTeX entry and citation info ```bibtex @article{DBLP:journals/corr/abs-1910-13461, author = {Mike Lewis and Yinhan Liu and Naman Goyal and Marjan Ghazvininejad and Abdelrahman Mohamed and Omer Levy and Veselin Stoyanov and Luke Zettlemoyer}, title = {{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension}, journal = {CoRR}, volume = {abs/1910.13461}, year = {2019}, url = {http://arxiv.org/abs/1910.13461}, eprinttype = {arXiv}, eprint = {1910.13461}, timestamp = {Thu, 31 Oct 2019 14:02:26 +0100}, biburl = {https://dblp.org/rec/journals/corr/abs-1910-13461.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} } ```