Back to all models
fill-mask mask_token: [MASK]
Query this model
🔥 This model is currently loaded and running on the Inference API. ⚠️ This model could not be loaded by the inference API. ⚠️ This model can be loaded on the Inference API on-demand.
JSON Output
API endpoint
								$ curl -X POST \
Share Copied link to clipboard

Monthly model downloads

DeepPavlov/bert-base-cased-conversational DeepPavlov/bert-base-cased-conversational
last 30 days



Contributed by

DeepPavlov DeepPavlov MIPT university
6 models

How to use this model directly from the 🤗/transformers library:

Copy to clipboard
from transformers import AutoTokenizer, AutoModelWithLMHead tokenizer = AutoTokenizer.from_pretrained("DeepPavlov/bert-base-cased-conversational") model = AutoModelWithLMHead.from_pretrained("DeepPavlov/bert-base-cased-conversational")


Conversational BERT (English, cased, 12‑layer, 768‑hidden, 12‑heads, 110M parameters) was trained on the English part of Twitter, Reddit, DailyDialogues[1], OpenSubtitles[2], Debates[3], Blogs[4], Facebook News Comments. We used this training data to build the vocabulary of English subtokens and took English cased version of BERT‑base as an initialization for English Conversational BERT.

[1]: Yanran Li, Hui Su, Xiaoyu Shen, Wenjie Li, Ziqiang Cao, and Shuzi Niu. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset. IJCNLP 2017.

[2]: P. Lison and J. Tiedemann, 2016, OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016)

[3]: Justine Zhang, Ravi Kumar, Sujith Ravi, Cristian Danescu-Niculescu-Mizil. Proceedings of NAACL, 2016.

[4]: J. Schler, M. Koppel, S. Argamon and J. Pennebaker (2006). Effects of Age and Gender on Blogging in Proceedings of 2006 AAAI Spring Symposium on Computational Approaches for Analyzing Weblogs.