ptaszynski/bert-base-polish-cyberbullying

Text Classification


ptaszynski committed on Dec 25, 2023

Commit 5663327 · 1 Parent(s): 0bc3258

Update README.md

Files changed (1)

README.md +21 -13

README.md CHANGED

@@ -1,16 +1,12 @@
 ---
-language: pl
-license: cc-by-sa-4.0
 datasets:
-- Polish subset of Open Subtitles
-- Polish subset of ParaCrawl
-- Polish Parliamentary Corpus
-- Polish Wikipedia - Feb 2020
-- Expert-annotated Dataset for Automatic Cyberbullying Detection in Polish Laguage
 ---
 # Polbert-CB - Polish BERT trained for Automatic Cyberbullying Detection
@@ -65,11 +61,23 @@ Original dataset:
 Improved dataset:
 ```
-TBA
 ```
 ## References
 * https://github.com/google-research/bert
 * https://github.com/ptaszynski/cyberbullying-Polish
-* https://huggingface.co/datasets/poleval2019_cyberbullying

 ---
+license: cc-by-4.0
 datasets:
+- ptaszynski/PolishCyberbullyingDataset
+language:
+- pl
+tags:
+- cyberbullying
+- hate-speech
 ---
 # Polbert-CB - Polish BERT trained for Automatic Cyberbullying Detection
 Improved dataset:
+The improved dataset used for training this model was released as follows.
+[Expert-annotated dataset to study cyberbullying in Polish language](https://huggingface.co/datasets/ptaszynski/PolishCyberbullyingDataset)
 ```
+@article{ptaszynski2023expert,
+  title={Expert-Annotated Dataset to Study Cyberbullying in Polish Language},
+  author={Ptaszynski, Michal and Pieciukiewicz, Agata and Dybala, Pawel and Skrzek, Pawel and Soliwoda, Kamil and Fortuna, Marcin and Leliwa, Gniewosz and Wroczynski, Michal},
+  journal={Data},
+  volume={9},
+  number={1},
+  pages={1},
+  year={2023},
+  publisher={MDPI}
+}
 ```
 ## References
 * https://github.com/google-research/bert
 * https://github.com/ptaszynski/cyberbullying-Polish
+* https://huggingface.co/datasets/poleval2019_cyberbullying